Operations for computing class statistics follow the same principles as those for field statistics. An operation takes classified data as a parameter, collects all the data records pointed to by its classes, and computes statistical values for them. The result is a data frame in which each class has one data record and each field stores one statistical value across all classes. The principle of these operations is illustrated in the following figure:
clstat | Macro command for computing statistics for classes |
-c <cldata> | classified data for defining classes |
-d <srcdata> | source data frame |
-dout <trgdata> | target data for statistics in classes |
[-all] | compute all the statistics (other flags) |
[-sum] | compute sums |
[-avg] | compute averages |
[-med] | compute medians |
[-var] | compute variances |
[-adev] | compute average standard deviation |
[-min] | find minimum values |
[-max] | find maximum values |
[-quar] | compute the first and the third quartiles (25%, 75%) |
[-hits] | create a variable for the number of items in each class |
[-name] | the name of the class as a string |
[-id] | the identifier of the class as an integer |
This macro command computes statistics for classes of classified data <cldata> from <srcdata>. The procedures called by this command are documented in their own sections below.
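The idea of collecting the records of each class and reducing them to one result record can be sketched in Python. This is only an illustration of the principle, not the NDA implementation: the function name `class_statistics`, the representation of classes as lists of record indices, and every field suffix other than `_avg` are assumptions made for the sketch. Variance and quartiles are left out because the exact definitions used by NDA are not stated here.

```python
import statistics

def class_statistics(records, classes):
    """Compute per-class statistics.

    records: list of dicts mapping field name -> numeric value (hypothetical layout)
    classes: dict mapping class name -> list of record indices (hypothetical layout)
    Returns one result record (dict) per class.
    """
    result = {}
    for name, indices in classes.items():
        members = [records[i] for i in indices]
        row = {"hits": len(members)}  # number of items in the class
        if members:
            for field in members[0]:
                values = [m[field] for m in members]
                # Suffixes other than "_avg" are assumed naming, not taken from NDA.
                row[field + "_sum"] = sum(values)
                row[field + "_avg"] = statistics.fmean(values)
                row[field + "_med"] = statistics.median(values)
                row[field + "_min"] = min(values)
                row[field + "_max"] = max(values)
        result[name] = row
    return result

# Invented toy data: three records, two classes.
records = [{"x": 1.0, "y": 10.0}, {"x": 3.0, "y": 20.0}, {"x": 5.0, "y": 60.0}]
classes = {"a": [0, 1], "b": [2]}
stats = class_statistics(records, classes)
```

Each key of `stats` corresponds to one data record of the target frame, and each entry of a row corresponds to one field.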
Example (ex4.9): In the first example, the Boston data is classified by the unique values of the field chas, which has two possible values: the apartment is located near the river (1) or not (0). The average values of the other fields are then computed for each class, and the number of items in each class is counted.
...
NDA> select key1 -f boston.chas
NDA> uniq -d key1 -cout cld1
NDA> clstat -d boston -c cld1 -dout sta -avg -hits
NDA> ls -fr sta -f hits crim_avg zn_avg ...
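The two steps of this example, forming one class per unique key value and then computing averages and hit counts per class, might look as follows in Python. The helper names `classify_by_unique` and `class_averages` are hypothetical, and the three records are invented toy values, not rows of the Boston data. Only the field names `hits`, `crim_avg` and `zn_avg` are taken from the listing above.

```python
from collections import defaultdict

def classify_by_unique(records, key_field):
    """Group record indices by the unique values of key_field."""
    classes = defaultdict(list)
    for index, record in enumerate(records):
        classes[record[key_field]].append(index)
    return dict(classes)

def class_averages(records, classes, fields):
    """Return one row per class with a hit count and the average of each field."""
    table = {}
    for name, indices in classes.items():
        row = {"hits": len(indices)}
        for field in fields:
            total = sum(records[i][field] for i in indices)
            row[field + "_avg"] = total / len(indices)
        table[name] = row
    return table

# Invented toy records; the values are not taken from the Boston data.
toy = [
    {"chas": 0, "crim": 0.25, "zn": 0.0},
    {"chas": 1, "crim": 1.0, "zn": 20.0},
    {"chas": 0, "crim": 0.75, "zn": 10.0},
]
classes = classify_by_unique(toy, "chas")
sta = class_averages(toy, classes, ["crim", "zn"])
```

Here `classes` plays the role of the classified data and `sta` the role of the target frame, with one row for each value of chas.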
Example (ex4.10): This second example demonstrates computing statistics for SOM neurons. One advantage of using statistical values instead of the neuron weights is that they have clear interpretations, such as the mean, minimum and maximum of the data records assigned to each neuron.
...
NDA> somtr -d predata -sout som1 -l 4
...
NDA> somcl -d predata -s som1 -cout cld1
NDA> clstat -d boston -c cld1 -dout sta -all
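Note that in this example the classes are formed from the preprocessed data, while the statistics are computed from the original data. A rough Python sketch of that arrangement is given below. It assumes that each record is assigned to the neuron with the smallest Euclidean distance, which is a common SOM convention but is not confirmed here for somcl. The codebook is given by hand rather than trained, and all names and values are invented for illustration.

```python
import math

def nearest_neuron(record, codebook):
    """Index of the codebook vector closest to record (Euclidean distance, assumed)."""
    return min(range(len(codebook)), key=lambda j: math.dist(record, codebook[j]))

def neuron_statistics(predata, srcdata, codebook):
    """Assign records to neurons using predata, then summarise the matching srcdata rows."""
    members = {j: [] for j in range(len(codebook))}
    for pre_row, src_row in zip(predata, srcdata):
        members[nearest_neuron(pre_row, codebook)].append(src_row)
    stats = {}
    for j, rows in members.items():
        if not rows:
            stats[j] = {"hits": 0}  # empty neuron: no statistics available
            continue
        columns = list(zip(*rows))  # one tuple of values per field
        stats[j] = {
            "hits": len(rows),
            "avg": [sum(c) / len(c) for c in columns],
            "min": [min(c) for c in columns],
            "max": [max(c) for c in columns],
        }
    return stats

# Invented values: two hand-made neurons, three records.
codebook = [(0.0, 0.0), (1.0, 1.0)]
predata = [(0.1, 0.1), (0.2, 0.0), (0.9, 1.0)]     # scaled records used for classification
srcdata = [(1.0, 10.0), (3.0, 30.0), (9.0, 50.0)]  # original records used for statistics
stats = neuron_statistics(predata, srcdata, codebook)
```

Because the statistics come from the original records, the per-neuron averages, minima and maxima are expressed in the original units of each field.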