Generating data according to the grid of the TS-SOM


The operation generates new data points according to the grid of the TS-SOM. It creates a data point for each neuron of the given TS-SOM <som>.

The basic idea is to generate values from the grid indexes of the neurons. Only one index varies, and it is selected by the parameter <dim>; the generated values are therefore constant along all other dimensions. The generated values are scaled into the given range [<mindef>, <maxdef>].
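The idea can be illustrated with a minimal Python sketch. It is not the NDA implementation: the function name grid_values, the linear scaling, and the neuron ordering (dimension 0 varying fastest within one layer) are all assumptions made for illustration.

```python
def grid_values(side, n_dims, dim, lo, hi):
    """Return one value per neuron of a grid layer with side**n_dims neurons.

    The value depends only on the neuron's index along `dim`, so it is
    constant along every other dimension. Linear scaling into [lo, hi]
    is an assumption; the manual does not state the exact formula.
    """
    values = []
    for flat in range(side ** n_dims):
        # Decode the index along `dim` from the flat neuron number
        # (assumed ordering: dimension 0 varies fastest).
        idx = (flat // side ** dim) % side
        # Map index 0 to lo and index side-1 to hi.
        frac = idx / (side - 1) if side > 1 else 0.0
        values.append(lo + frac * (hi - lo))
    return values


# A 3x3 layer: values run along dimension 0 and repeat along dimension 1.
print(grid_values(3, 2, 0, 0.0, 10.0))
```

Calling the same function with dim set to 1 yields values that stay constant within each row of three neurons and change from row to row.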

The scaling function <func> defines how the new values are generated. The interpretation of the parameters <mindef> and <maxdef> depends on the function as follows:

<func> = "vec":: The definitions (for instance, field statistics) are read from two data fields named by the parameters <mindef> and <maxdef>. One new field is generated for each data record in these fields, so the two fields must have the same length.
<func> = "abs":: The definitions <mindef> and <maxdef> are given as absolute values. If the parameter <src_data> is omitted, one new data field is generated. Otherwise, a new field is created for each field in the data frame <src_data>.
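The difference between the two scaling functions can be sketched as follows. This is a hypothetical Python model, not NDA code: the names generate_fields and scale_index, the list-based representation of data fields, and the linear scaling are assumptions.

```python
def scale_index(idx, side, lo, hi):
    """Linearly map a grid index 0..side-1 into [lo, hi] (assumed formula)."""
    frac = idx / (side - 1) if side > 1 else 0.0
    return lo + frac * (hi - lo)


def generate_fields(indexes, side, func, mindef, maxdef, src_fields=None):
    """Return a list of generated fields, each a list with one value per neuron.

    indexes    -- index of each neuron along the chosen dimension
    func       -- "vec": mindef/maxdef are equally long lists of limits
                  "abs": mindef/maxdef are single absolute values
    src_fields -- optional field names of a source data frame ("abs" only)
    """
    if func == "vec":
        if len(mindef) != len(maxdef):
            raise ValueError("mindef and maxdef must have the same length")
        # One new field per record in the min/max fields.
        ranges = list(zip(mindef, maxdef))
    elif func == "abs":
        # One field by default, or one per field of the source data frame.
        count = len(src_fields) if src_fields else 1
        ranges = [(mindef, maxdef)] * count
    else:
        raise ValueError("unknown scaling function: " + repr(func))
    return [[scale_index(i, side, lo, hi) for i in indexes] for lo, hi in ranges]


# "vec": two records in the min/max fields give two generated fields.
print(generate_fields([0, 1, 2], 3, "vec", [0.0, 10.0], [1.0, 20.0]))
```

In this model, "vec" takes its limits from data (for example, field minima and maxima), while "abs" applies the same fixed limits to every generated field.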

There are three ways to name the new data fields. If only one data field is created, its name can be set with the parameter <field_out>. If the parameter <src_data> is given, the new data fields are named after the fields in that data frame. If both parameters are omitted, the new fields are named automatically ("f0", "f1", ...).
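The naming rules can be summarized in a short Python sketch. The function name_fields is hypothetical, and the precedence applied when both <field_out> and <src_data> are given is an assumption, since the text does not specify it.

```python
def name_fields(count, field_out=None, src_fields=None):
    """Choose names for `count` generated fields.

    field_out  -- explicit name, usable only when a single field is created
    src_fields -- field names of the source data frame, if one was given
    """
    # Assumed precedence: an explicit name wins for a single field.
    if count == 1 and field_out is not None:
        return [field_out]
    # Otherwise reuse the names of the source data frame's fields.
    if src_fields is not None:
        return list(src_fields[:count])
    # Fall back to automatic names f0, f1, and so on.
    return ["f" + str(i) for i in range(count)]


print(name_fields(2))
```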

The data points can be stored in two ways. If the output data frame <data_out> is given, the generated data points are stored there. Otherwise, the new data fields are added to the current directory.

Example (ex5.17): In this example, an MLP network is trained on the Boston data ("zn", "indus", "rate"). Then an empty TS-SOM is created as the basis for visualization. Two variables ("zn" and "indus") are generated for the dimensions x and y with the command somgrid, and the trained network is used to predict the variable "rate" for each node of the grid.

...
# Train the MLP network with rprop
NDA> select src -f boston.zn boston.indus
NDA> prepro -d src -dout src2 -e
NDA> select trg -f boston.rate
NDA> prepro -d trg -dout trg2 -e
NDA> rprop -d src2 -dout trg2 -net 2 6 1 -full -types t t
   -em 300 -bs 0.01 -mup 1.1 -mdm 0.8 -wout wei -ef virhe
#
# Build TS-SOM and generate values based on statistics
NDA> somtr build -sout s1 -l 6 -D 2
NDA> select x -f src2.zn
NDA> select y -f src2.indus
NDA> fldstat -d x -dout xsta -min -max
NDA> fldstat -d y -dout ysta -min -max
NDA> somgrid -s s1 -min xsta.min -max xsta.max -d x
   -dout datain -sca vec -dim 0
NDA> somgrid -s s1 -min ysta.min -max ysta.max -d y
   -dout datain -sca vec -dim 1
#
# Predict "rate"
NDA> fbp -d datain -win wei -dout trgout
#
# Create Graphics
NDA> mkgrp /win1 -s /s1
...


Erkki Hakkinen
Thu Sep 24 11:51:34 EET DST 1998