
Set-relational operations for classes

      

The set operations in this section compare either two single classes or two complete classifications whose classes share the same names.

clinsec    Evaluate the intersection of two classes
  -cl1 <data1>       source class 1
  -cl2 <data2>       source class 2
  -clout <dataout>   target class

clunion    Evaluate the union of two classes
  -cl1 <data1>       source class 1
  -cl2 <data2>       source class 2
  -clout <dataout>   target class

cldiff     Evaluate the difference of two classes
  -cl1 <data1>       source class 1
  -cl2 <data2>       source class 2
  -clout <dataout>   target class

These three commands apply the usual set operations (intersection, union and difference) to two classes of classified data. The intersection contains the data record indices included in both classes, the union contains all indices included in at least one of the classes, and the difference contains the indices included in just one of the classes but not in both.
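
The semantics of the three operations can be sketched with Python sets. This is only an illustration and not NDA's implementation: the index values are made up, and the difference is taken to be symmetric, as the description above states.

```python
# Two hypothetical classes, each a set of data record indices
cl1 = {0, 1, 2, 3}
cl2 = {2, 3, 4}

insec = cl1 & cl2   # indices in both classes (cf. clinsec)
union = cl1 | cl2   # indices in at least one class (cf. clunion)
diff = cl1 ^ cl2    # indices in exactly one class (cf. cldiff, assumed symmetric)

print(sorted(insec))  # [2, 3]
print(sorted(union))  # [0, 1, 2, 3, 4]
print(sorted(diff))   # [0, 1, 4]
```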

Example: For a thorough example, see the example of classification comparisons. Here is just a basic example with a small data set.

...
# Data file with 8 numbers (1, 2, 3, 5, 8, 6, 7 and 4)
NDA> getdata k
x                               # Field name
1                               # Data type (integer)
1                               # Data values
2
3
5
8
6
7
4
# Select two classes containing the indices of values larger than
# or equal to 3 (first) and of values between 5 and 7 (sec)
NDA> selcl -cout cli -clout first -expr 'k.x' >= 3;
NDA> selcl -cout cli -clout sec -expr 'k.x' >= 5 and 'k.x' <= 7;
# Evaluate difference of classes
NDA> cldiff -cl1 cli.first -cl2 cli.sec -clout cli.result
NDA> save cli
...

# This results in cli.cld containing the following data lines:
...
class_info
201 1 first 6 2 3 4 5 6 7 
201 1 sec 3 3 5 6 
201 1 result 3 7 4 2 
# Class first: 6 indices (2, 3, 4, 5, 6, 7); class sec: 3 indices (3, 5, 6);
# difference (result): 3 indices (2, 4, 7)
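
The result of this example can be cross-checked outside NDA. The following Python sketch mirrors the two selections and the difference, assuming zero-based record indices (which is consistent with the listed output). The helper name `select_indices` is hypothetical and not part of NDA.

```python
values = [1, 2, 3, 5, 8, 6, 7, 4]  # data values of field k.x


def select_indices(data, predicate):
    """Return the set of zero-based indices whose value satisfies predicate."""
    return {i for i, v in enumerate(data) if predicate(v)}


first = select_indices(values, lambda v: v >= 3)     # values >= 3
sec = select_indices(values, lambda v: 5 <= v <= 7)  # values between 5 and 7
result = first ^ sec  # indices in exactly one of the two classes

print(sorted(first))   # [2, 3, 4, 5, 6, 7]
print(sorted(sec))     # [3, 5, 6]
print(sorted(result))  # [2, 4, 7]
```

Because sec is a subset of first here, the symmetric difference and the plain difference (first minus sec) coincide, so this data set alone does not distinguish the two.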

cldinsec   Evaluate the intersection of two classifications
  -c1 <cldata1>       source classified data 1
  -c2 <cldata2>       source classified data 2
  -cout <cldataout>   target classified data

cldunion   Evaluate the union of two classifications
  -c1 <cldata1>       source classified data 1
  -c2 <cldata2>       source classified data 2
  -cout <cldataout>   target classified data

clddiff    Evaluate the difference of two classifications
  -c1 <cldata1>       source classified data 1
  -c2 <cldata2>       source classified data 2
  -cout <cldataout>   target classified data

These commands perform class-by-class set operations on entire classifications. Each class name in the classified data <cldata1> is compared with the class names in <cldata2>. When a match is found, a new class with the same name is created in the target classified data; it contains the intersection, union or difference of the two matching classes.
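
The name matching can be sketched in Python by modelling a classification as a dictionary from class name to a set of record indices. This is a minimal illustration under assumptions: the function `combine` and the sample data are hypothetical, classes without a matching name are simply left out of the result (the text does not say how NDA treats them), and the difference is again taken to be symmetric.

```python
def combine(cld1, cld2, op):
    """Apply op to every pair of classes that share a name in both classifications."""
    return {name: op(cld1[name], cld2[name]) for name in cld1 if name in cld2}


# Hypothetical classifications: class name -> set of record indices
cld1 = {"GP": {0, 1, 2}, "GN": {3, 4}, "TP": {5}}
cld2 = {"GP": {1, 2, 6}, "GN": {4}, "TN": {7}}

insec = combine(cld1, cld2, set.intersection)         # cf. cldinsec
union = combine(cld1, cld2, set.union)                # cf. cldunion
diff = combine(cld1, cld2, set.symmetric_difference)  # cf. clddiff

print(insec)  # only GP and GN appear; TP and TN have no match
```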

Example: Union and intersection can be used to compare the classification results of two different layers of a TS-SOM.

...
# Data loaded into d - Selection of fields, training of a SOM
NDA> select datas -f d.GP d.GN d.TP d.TN
NDA> somtr -d datas -sout som -cout cldata -l 4
 Trained layer: 0
 Trained layer: 1
 Trained layer: 2
 Trained layer: 3
# Average value calculation for each neuron
NDA> clstat -c cldata -d datas -dout st -avg
NDA> ls -fr st
 st.GP_avg
 st.GN_avg
 st.TP_avg
 st.TN_avg
# TS-SOM layer info is needed for selection of data records
NDA> somlayer -s som -fout sl
NDA> selcl -cout g2 -clout GP -expr 'sl'=2 and 'st.GP_avg'>=0.7;
NDA> selcl -cout g2 -clout GN -expr 'sl'=2 and 'st.GN_avg'>=0.7;
NDA> selcl -cout g2 -clout TP -expr 'sl'=2 and 'st.TP_avg'>=0.7;
NDA> selcl -cout g2 -clout TN -expr 'sl'=2 and 'st.TN_avg'>=0.7;
NDA> selcl -cout g3 -clout GP -expr 'sl'=3 and 'st.GP_avg'>=0.7;
NDA> selcl -cout g3 -clout GN -expr 'sl'=3 and 'st.GN_avg'>=0.7;
NDA> selcl -cout g3 -clout TP -expr 'sl'=3 and 'st.TP_avg'>=0.7;
NDA> selcl -cout g3 -clout TN -expr 'sl'=3 and 'st.TN_avg'>=0.7;
# Selected groups of records need to be converted into cldatas
NDA> mergecld -c1 g2 -c2 cldata -cout clo2
NDA> mergecld -c1 g3 -c2 cldata -cout clo3
# Compute intersection and union of indices in classifications
NDA> cldinsec -c1 clo2 -c2 clo3 -cout clo_insec
NDA> cldunion -c1 clo2 -c2 clo3 -cout clo_union
# Calculate the index counts of each class into a frame
NDA> clhits -c clo_insec -fout ins
NDA> clhits -c clo_union -fout uni
NDA> select clp -f ins uni
NDA> ls -fr clp
 clp.ins
 clp.uni
# Calculate the ratio of the set sizes (intersection / union)
# for each class (GP, GN, TP and TN)
NDA> expr -dout props -fout prop -expr 'clp.ins' / 'clp.uni';
NDA> getdata props
prop
2
0.870521
0.940236
0.810526
0.972215
...
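
The ratio computed in the last step (size of the intersection divided by the size of the union) is commonly known as the Jaccard index; values close to 1 mean that the two classifications agree on almost all records of a class. The same calculation can be sketched in Python. The function name `jaccard_by_class` and the sample classifications are hypothetical and unrelated to the data used above.

```python
def jaccard_by_class(cld1, cld2):
    """Ratio |intersection| / |union| for each class name present in both."""
    ratios = {}
    for name in cld1:
        if name in cld2:
            union = cld1[name] | cld2[name]
            if union:  # skip empty unions to avoid division by zero
                ratios[name] = len(cld1[name] & cld2[name]) / len(union)
    return ratios


# Hypothetical classifications of two layers: class name -> record indices
layer2 = {"GP": {0, 1, 2, 3}, "GN": {4, 5}}
layer3 = {"GP": {1, 2, 3}, "GN": {4, 5}}

ratios = jaccard_by_class(layer2, layer3)
print(ratios)  # {'GP': 0.75, 'GN': 1.0}
```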



Anssi Lensu
Wed Oct 6 12:57:48 EET DST 1999