
Class statistics

 

Operations for computing statistics in classes all follow the same principle. An operation takes classified data as a parameter, collects all the data records pointed to by each class, and computes statistical values for them. The result is a data frame in which each class has one data record and each field stores one statistical value for all the classes. The principle of these operations is illustrated in the following figure:

[Figure: principle of the class statistics operations]

[Table: description of the macro command and its parameters]

This macro command computes statistics for classes in the classified data <cldata> from the data <srcdata>. The procedures called by this command are documented in their own sections below.

Example (ex4.9): In the first example, the Boston data is classified by the unique values of the field "chas", which has two values indicating whether or not the apartment is on the riverside. The average values of the other fields are then computed for each class, together with the number of items in each class.

...
NDA> select key1 -f boston.chas
NDA> uniq -d key1 -cout cld1
NDA> clstat -d boston -c cld1 -dout sta -avg -hits
NDA> ls -fr sta -f
hits
crim_avg
zn_avg
...
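
The effect of this example can be sketched outside NDA. The following Python fragment is only an illustration under stated assumptions, not NDA's implementation: the function name class_statistics, the toy records and their values are hypothetical, and only the output field names (hits, crim_avg, zn_avg) follow the listing above. A class is represented as a list of record indices, which mirrors the idea that classes point to data records.

```python
from statistics import mean


def class_statistics(records, classes, fields):
    """Compute the hit count and per-field averages for each class.

    records: list of dicts, one dict per data record
    classes: dict mapping a class name to a list of record indices
    fields:  names of the fields to average
    Returns a dict with one statistics record per class.
    """
    result = {}
    for name, indices in classes.items():
        row = {"hits": len(indices)}
        for field in fields:
            values = [records[i][field] for i in indices]
            # An empty class has no average.
            row[field + "_avg"] = mean(values) if values else None
        result[name] = row
    return result


# Hypothetical toy records (not taken from the Boston data).
records = [
    {"chas": 0, "crim": 0.1, "zn": 18.0},
    {"chas": 1, "crim": 0.3, "zn": 0.0},
    {"chas": 0, "crim": 0.5, "zn": 12.0},
]

# Classify by the unique values of "chas": one class per distinct value.
classes = {}
for index, record in enumerate(records):
    classes.setdefault(record["chas"], []).append(index)

stats = class_statistics(records, classes, ["crim", "zn"])
for name, row in stats.items():
    print(name, row)
```

Each entry of stats corresponds to one data record of the resulting data frame, and each key of a row corresponds to one field.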

Example (ex4.10): The second example demonstrates computing statistics for neurons. One advantage of statistical values compared with weights is that they have clear interpretations, such as the mean, minimum and maximum values of the data records in the neurons.

...
NDA> somtr -d predata -sout som1 -l 4
NDA> somcl -d predata -s som1 -cout cld1
NDA> clstat -d boston -c cld1 -dout sta -all
...
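
The difference between weights and statistics can be illustrated with a small Python sketch. It is not NDA's implementation and rests on several assumptions: records are assigned to the neuron with the nearest weight vector (squared Euclidean distance), the scaled and original data hold the same records in the same order, and all names and values (nearest_neuron, neuron_statistics, the toy data) are hypothetical. The classification uses the scaled records, while the statistics are taken from the original values and so stay in the original units.

```python
def nearest_neuron(vector, weights):
    """Return the index of the weight vector closest to the given vector."""
    def squared_distance(weight):
        return sum((a - b) ** 2 for a, b in zip(vector, weight))
    return min(range(len(weights)), key=lambda k: squared_distance(weights[k]))


def neuron_statistics(scaled, original, weights):
    """Group records by nearest neuron, then summarize the original values.

    scaled:   list of scaled record vectors used for classification
    original: list of original (unscaled) values, in the same record order
    weights:  list of neuron weight vectors in the scaled space
    """
    members = {k: [] for k in range(len(weights))}
    for index, vector in enumerate(scaled):
        members[nearest_neuron(vector, weights)].append(index)

    stats = {}
    for k, indices in members.items():
        values = [original[i] for i in indices]
        stats[k] = {
            "hits": len(values),
            # A neuron without records has no statistics.
            "min": min(values) if values else None,
            "max": max(values) if values else None,
            "avg": sum(values) / len(values) if values else None,
        }
    return stats


# Hypothetical one-field data, scaled to the range [0, 1], and two neurons.
original = [10.0, 12.0, 48.0, 50.0]
scaled = [[0.0], [0.05], [0.95], [1.0]]
weights = [[0.1], [0.9]]

stats = neuron_statistics(scaled, original, weights)
for k, row in stats.items():
    print(k, row)
```

In this sketch the weights 0.1 and 0.9 live in the scaled space and are hard to read directly, whereas the minimum, maximum and average of each neuron are expressed in the units of the original data.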





Erkki Hakkinen
Thu Sep 24 11:51:34 EET DST 1998