These set operations can be used to compare single classes or complete classifications having the same class names.
These three commands perform the standard set operations (intersection, union and difference) on two classes of classified data. The intersection contains the data record indices included in both classes, the union contains all indices that occur in either of the two classes, and the difference contains the indices that occur in just one of the classes but not in both of them.
Example: The following is a basic example with a small data set. For a more thorough example, see the example of classification comparisons.
...
#
# Data file with 8 numbers (1, 2, 3, 5, 8, 6, 7 and 4)
#
NDA> getdata koje x
1 1 2 3 5 8 6 7 4
#
# Select two classes containing indices to values larger than or equal to
# 3 (first) and values between 5 and 7 (second)
#
NDA> selcl -cout cli -clout first -expr 'koje.x' >= 3;
NDA> selcl -cout cli -clout second -expr 'koje.x' >= 5 and 'koje.x' <= 7;
#
# Evaluate difference of classes
#
NDA> cldiff -cl1 cli.first -cl2 cli.second -clout cli.result
NDA> save cli
...
#
# This results in cli.cld containing following data lines
#
...
class_info
201 1 first 6 2 3 4 5 6 7
201 1 second 3 3 5 6
201 1 res 3 7 4 2
#
# Class first: 6 indices (234567), class second 3 indices (356),
# Difference: 3 indices (247)
#
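The same computation can be sketched outside NDA with Python's built-in sets. This is only an illustration of the set semantics, not NDA's implementation. It assumes 0-based record indices (which matches the indices in the listing above) and reads "difference" as the symmetric difference described in the text; for this data the result is the same as removing the second class from the first, because the second class is a subset of the first. The helper name class_indices is hypothetical.

```python
# Illustrative sketch only; not part of NDA.
data = [1, 2, 3, 5, 8, 6, 7, 4]


def class_indices(values, predicate):
    """Return the set of 0-based record indices whose value satisfies predicate."""
    return {i for i, v in enumerate(values) if predicate(v)}


# Class "first": values >= 3; class "second": values between 5 and 7.
first = class_indices(data, lambda v: v >= 3)
second = class_indices(data, lambda v: 5 <= v <= 7)

intersection = first & second   # indices in both classes
union = first | second          # indices in either class
difference = first ^ second     # assumption: indices in exactly one class

print("first:       ", sorted(first))
print("second:      ", sorted(second))
print("intersection:", sorted(intersection))
print("union:       ", sorted(union))
print("difference:  ", sorted(difference))
```

The difference printed here contains the indices 2, 4 and 7, the same three indices as the class produced by cldiff in the session above.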
These commands perform class-by-class set operations on entire classifications. Each class name in classified data 1 is compared to the class names in classified data 2. If a match is found, a new class with the same name is created in the target classified data, containing the intersection, union or difference of the two matching classes.
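The name matching can be sketched in Python as follows. This is a hedged illustration, not NDA code: a classification is modelled as a dict from class name to a set of record indices, the function combine_classifications and the sample data are hypothetical, classes without a matching name are assumed to be left out of the result, and the difference is again assumed to be symmetric.

```python
# Illustrative sketch only; not part of NDA.
import operator


def combine_classifications(cld1, cld2, op):
    """Apply op to every pair of classes that share a name in cld1 and cld2.

    Assumption: classes whose name occurs in only one classification
    are not created in the result.
    """
    return {
        name: op(indices, cld2[name])
        for name, indices in cld1.items()
        if name in cld2
    }


# Made-up classifications; "C" and "D" have no counterpart.
cld1 = {"A": {0, 1, 2}, "B": {3, 4}, "C": {5}}
cld2 = {"A": {1, 2, 6}, "B": {4, 7}, "D": {8}}

insec = combine_classifications(cld1, cld2, operator.and_)
union = combine_classifications(cld1, cld2, operator.or_)
diff = combine_classifications(cld1, cld2, operator.xor)  # assumed symmetric

for label, result in (("intersection", insec), ("union", union), ("difference", diff)):
    print(label, {name: sorted(ix) for name, ix in result.items()})
```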
Example: To compare the classification results of two different layers of TS-SOM, union and intersection can be used.
...
#
# Data loaded into d - Selection of fields, training of a TS-SOM
#
NDA> select datas -f d.GPos d.GNeg d.TPos d.TNeg
NDA> somtr -d datas -sout som -cout cldata -l 4
train layer: 0
train layer: 1
train layer: 2
train layer: 3
#
# Average value calculation for each neuron
#
NDA> clstat -c cldata -d datas -dout st -avg
NDA> ls -fr st
st.GPos_avg
st.GNeg_avg
st.TPos_avg
st.TNeg_avg
#
# TS-SOM layer information is needed for selection of data records
#
NDA> somlayer -s som -fout soml
NDA> selcl -cout grp2 -clout GP -expr 'soml' = 2 and 'st.GPos_avg' >= 0.7;
NDA> selcl -cout grp2 -clout GN -expr 'soml' = 2 and 'st.GNeg_avg' >= 0.7;
NDA> selcl -cout grp2 -clout TP -expr 'soml' = 2 and 'st.TPos_avg' >= 0.7;
NDA> selcl -cout grp2 -clout TN -expr 'soml' = 2 and 'st.TNeg_avg' >= 0.7;
NDA> selcl -cout grp3 -clout GP -expr 'soml' = 3 and 'st.GPos_avg' >= 0.7;
NDA> selcl -cout grp3 -clout GN -expr 'soml' = 3 and 'st.GNeg_avg' >= 0.7;
NDA> selcl -cout grp3 -clout TP -expr 'soml' = 3 and 'st.TPos_avg' >= 0.7;
NDA> selcl -cout grp3 -clout TN -expr 'soml' = 3 and 'st.TNeg_avg' >= 0.7;
#
# Selected groups of records need to be converted to classified datas
#
NDA> mergecld -c1 grp2 -c2 cldata -cout clo2
NDA> mergecld -c1 grp3 -c2 cldata -cout clo3
#
# Perform intersection and union of indices in classifications
#
NDA> cldinsec -c1 clo2 -c2 clo3 -cout clo_insec
NDA> cldunion -c1 clo2 -c2 clo3 -cout clo_union
#
# Calculate the index counts of each class and select them to same class
#
NDA> clhits -c clo_insec -fout ins
NDA> clhits -c clo_union -fout uni
NDA> select clprop -f ins uni
NDA> ls -fr clprop
clprop.ins
clprop.uni
#
# Calculate the ratio of the set sizes (intersection / union)
# for each class (GP, GN, TP and TN)
#
NDA> expr -dout props -fout prop -expr 'clprop.ins' / 'clprop.uni';
NDA> getdata props prop
2 0.870521 0.940236 0.810526 0.972215
...
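The final ratio (intersection size divided by union size) is commonly known as the Jaccard index. A minimal Python sketch of that last step is given below, under stated assumptions: classifications are modelled as dicts from class name to a set of record indices, the function overlap_ratios and the sample index sets are made up for illustration, and classes with an empty union are skipped to avoid division by zero. It does not reproduce the figures from the session above.

```python
# Illustrative sketch only; not part of NDA.


def overlap_ratios(cld1, cld2):
    """Return |intersection| / |union| for every class name shared by both inputs."""
    ratios = {}
    for name, indices in cld1.items():
        if name not in cld2:
            continue  # assumption: unmatched classes are ignored
        union = indices | cld2[name]
        if not union:
            continue  # assumption: skip empty classes instead of dividing by zero
        ratios[name] = len(indices & cld2[name]) / len(union)
    return ratios


# Made-up index sets standing in for the classes selected on two layers.
layer2 = {"GP": {0, 1, 2, 3}, "GN": {4, 5}}
layer3 = {"GP": {1, 2, 3}, "GN": {4, 5, 6, 7}}

ratios = overlap_ratios(layer2, layer3)
for name, value in ratios.items():
    print(name, value)
```

A ratio close to 1 means the two layers select nearly the same records for that class, while a ratio close to 0 means the selections barely overlap.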