
Training a TS-SOM

 

somtr Train a SOM
-d <data> name of the training data frame
-sout <som> name of TS-SOM structure to be created
[-cout <cldata>] SOM classification to be created
[-w <wmat>] specify a different name for the weight matrix (default <som>_W)
[-l <layers>] the number of layers (default 3)
[-md <missing-data>] a code value to indicate missing data
[-D <dimension>] dimension of the TS-SOM (default 2)
[-t <type>] type of topology (default 0)
[-wght <weighting>] weighting of neighbors (default 0.5)
[-c <stop-crit>] stopping criterion (default 0.001)
[-m <max-iter>] maximum number of iterations (default 20)
[-L] do not use a lookup table (used by default)
[-f <corr-layers>] number of corrected layers (default 3)
[-r <train-rule>] training rule (default 0)

The SOM training creates a TS-SOM structure and organizes it. The result includes the structure of the network, stored under the given name <som>. In addition, the weight matrix of the TS-SOM is stored in a data frame named <som>_W, as described in the figure above. If -cout <cldata> is specified, somtr also creates a data frame containing the best-matching unit (BMU) classification for each data record.
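For orientation, the -l and -D parameters together determine the size of each layer: in the standard TS-SOM layout, every layer is a full D-dimensional SOM whose side length doubles from one layer to the next, starting from a single root neuron. A small sketch of this assumption (the function name is illustrative, not part of NDA):

```python
def tssom_layer_sizes(layers=3, dim=2):
    """Number of neurons on each TS-SOM layer, assuming the standard
    layout: layer l is a dim-dimensional grid with 2**l nodes per
    side, so layer 0 holds a single root neuron."""
    return [(2 ** l) ** dim for l in range(layers)]

print(tssom_layer_sizes(layers=4, dim=2))  # [1, 4, 16, 64]
```

With the defaults (-l 3, -D 2) this gives layers of 1, 4 and 16 neurons.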

The somtr command has many parameters, of which the first three are the most commonly used. If you are unsure how to use the other parameters, their default values will usually give the best results. A few of the parameters are described in more detail below:

-d <data>
Data frame containing training data
-sout <som>
Name for the resulting TS-SOM structure and weight matrix.
-l <layers>
The number of layers in the TS-SOM. The default value is 3
-D <dim>
The dimension of the TS-SOM, i.e. the dimension of the SOM at each layer. The default value is 2
-t <type>
Type of topology: 0 = lattice, 1 = ring, 2 = tree-structured vector quantizer
-wght <weighting>
Factor for weighting the neighbors of neurons during the training process
-c <stop-crit>
The stopping criterion, defined in terms of the quantization error
-m <max-iter>
The maximum number of epochs when training one layer of the TS-SOM
-L
The lookup table (references from data vectors to their BMUs) is used by default. Setting this flag disables it
-f <corr-layers>
The number of layers for which the lookup tables are corrected. A larger value gives better results but slows down the training
-r <train-rule>
The training rule: 0 = vector quantization (VQ), 1 = spreading. The VQ rule tries to follow the distribution of the data, while the spreading rule tries to spread the neurons over the data points as completely as possible
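To make the roles of -wght, -c and -m concrete, here is a hedged Python sketch of batch training for a single SOM layer under the VQ rule (-r 0). It is a simplified illustration, not NDA's actual algorithm: a 1-D neuron chain stands in for the lattice, the neighbor_weight, stop_crit and max_iter arguments play the roles of -wght, -c and -m, and training stops when the relative drop in quantization error falls below the criterion or the epoch limit is reached.

```python
import numpy as np

def train_som_layer(data, weights, neighbor_weight=0.5,
                    stop_crit=0.001, max_iter=20):
    """Batch VQ training of one SOM layer on a 1-D neuron chain,
    so neurons i-1 and i+1 are the lattice neighbors of neuron i.
    The argument names mirror somtr parameters but are illustrative,
    not NDA's implementation."""
    n_neurons = len(weights)
    prev_error = np.inf
    for _ in range(max_iter):
        # Best-matching unit (BMU) of every data vector.
        dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
        bmus = dists.argmin(axis=1)
        # Mean quantization error of the current codebook.
        error = dists[np.arange(len(data)), bmus].mean()
        # Stop when the relative improvement drops below the criterion.
        if prev_error - error < stop_crit * prev_error:
            break
        prev_error = error
        # Batch update: each neuron moves to the weighted mean of the
        # data mapped to itself (weight 1) and to its chain neighbors
        # (weight neighbor_weight).
        new_weights = weights.copy()
        for i in range(n_neurons):
            num = np.zeros(weights.shape[1])
            den = 0.0
            for j in range(max(0, i - 1), min(n_neurons, i + 2)):
                w = 1.0 if j == i else neighbor_weight
                mask = bmus == j
                num += w * data[mask].sum(axis=0)
                den += w * mask.sum()
            if den > 0:
                new_weights[i] = num / den
        weights = new_weights
    return weights
```

A larger neighbor_weight pulls neighboring neurons closer together (a smoother map); a smaller value lets each neuron follow only its own data, approaching plain k-means-style VQ.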

Example (ex5.1): Training data is created by preprocessing, and a SOM is trained with it. In addition, the SOM is used for classification (see the command somcl in Section 5.1.2).

...
NDA> prepro -d boston -dout predata -e -n
NDA> somtr -d predata -sout som1 -l 4
...
NDA> somcl -d predata -s som1 -cout cld1
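Conceptually, the classification step performed by somcl assigns each data record the index of its best-matching unit in the trained map. A minimal sketch of that idea (not NDA's implementation):

```python
import numpy as np

def classify_by_bmu(data, weights):
    """For each data vector, return the index of the nearest
    neuron (its BMU) in the weight matrix."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

The resulting index vector corresponds to the classified data frame that -cout (or somcl) produces.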



Anssi Lensu
Wed Oct 6 12:57:48 EET DST 1999