
Joining data frames

  

[Command summary table not reproduced]

These two commands perform join operations. The first makes a full join; the second picks only the first pair of keys that match.
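The difference between the two join styles can be sketched in Python. This is only an illustration and not the NDA implementation: it assumes that a full join pairs every record with every matching record on the other side, and that the second style keeps only the first match for each key. The function names and sample records are hypothetical.

```python
def full_join(left, right):
    """Pair every left record with every right record that has the same key."""
    return [(key, lval, rval)
            for key, lval in left
            for rkey, rval in right
            if rkey == key]


def first_match_join(left, right):
    """Pair each left record with only the first right record sharing its key."""
    first = {}
    for rkey, rval in right:
        first.setdefault(rkey, rval)  # keep the earliest value seen for each key
    return [(key, lval, first[key]) for key, lval in left if key in first]


# Hypothetical (key, value) records; key 1 occurs twice on both sides.
orders = [(1, "bread"), (2, "milk"), (1, "tea")]
groups = [(1, "A"), (1, "B"), (2, "C")]

full = full_join(orders, groups)
first_only = first_match_join(orders, groups)
```

Under these assumptions, `full` holds one row per matching pair, while `first_only` holds at most one row per record in `orders`.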

Example: The commands are useful when the data has a relational form in which clear keys for the data records can be found. The analysis can then proceed as follows. Suppose we have data that tells who has bought which product, and we want to profile the customers, who are identified by their codes. First, the data records are summarized per code, for instance by computing average values. The dimension of the data is thereby reduced within each key. The results are then joined back to the original data, where they can be used (see the baseball example).

...
NDA> ls -fr orderData
custNro
custSize
custCover
# summarize the data related to the customer number (custNro)
NDA> select keyFr -f orderData.custNro
NDA> uniq -d keyFr -cout custUniqs
NDA> select custFlds -f orderData.custSize orderData.custCover
NDA> clstat -d custFlds -c custUniqs -dout custData -avg
...
# SOM analysis -> groups of neurons -> binarized into data2
...
# bind the customer key to data2 by the key operation
NDA> select custKeyOrg -f orderData.custNro
NDA> clkey -d custKeyOrg -c custUniqs -dout custData
NDA> select custKey -f custData.custNro
...
# join binarized groups to the original data
NDA> cd ..
NDA> join -k1 keyFr -d1 orderData -k2 custKey 
     -d custData -dout data2
...
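
The summarize-then-join-back idea behind the listing above can be sketched in Python. This is a minimal illustration under stated assumptions, not a translation of the NDA commands: the field names custNro, custSize and custCover come from the listing, but the sample values and the helper functions average_by_key and join_back are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (custNro, custSize, custCover)
order_data = [
    (101, 2.0, 10.0),
    (102, 4.0, 30.0),
    (101, 4.0, 20.0),
]


def average_by_key(records):
    """Return {custNro: (average custSize, average custCover)}."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for key, size, cover in records:
        acc = sums[key]
        acc[0] += size
        acc[1] += cover
        acc[2] += 1
    return {key: (s / n, c / n) for key, (s, c, n) in sums.items()}


def join_back(records, summary):
    """Append the per-customer summary to every original record with that key."""
    return [(key, size, cover) + summary[key] for key, size, cover in records]


cust_data = average_by_key(order_data)   # one summary row per customer
joined = join_back(order_data, cust_data)  # summary attached to each order
```

Each row of `joined` carries the original fields followed by the averages of its customer, so per-customer results are available on every order record.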



Erkki Hakkinen
Thu Sep 24 11:51:34 EET DST 1998