Compacting self-similar data

compact               Remove repeated data records to reveal trends
  -d <datain>         source data
  -dout <dataout>     target data
  [-fout <recids>]    IDs of the original data records that were copied to the target

This function removes consecutive repeated data records while preserving the order of the remaining records. If a data record is followed by one or more identical records, those copies are not copied into the target data frame; only the first record of each run is kept. The ID numbers of the records in the original data frame that were copied to <dataout> are stored in <recids>.
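The behaviour described above can be illustrated outside NDA with a minimal Python sketch. It is not the NDA implementation: the function name `compact`, the tuple representation of records, and the zero-based record IDs are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def compact(records: Sequence[Sequence[float]]) -> Tuple[List[Sequence[float]], List[int]]:
    """Drop every record that is identical to its immediate predecessor.

    Returns the kept records and the (zero-based, assumed) IDs they had
    in the input. Record order is preserved.
    """
    kept: List[Sequence[float]] = []
    ids: List[int] = []
    for i, rec in enumerate(records):
        # Keep the record only if it differs from the last kept one.
        if not kept or tuple(rec) != tuple(kept[-1]):
            kept.append(rec)
            ids.append(i)
    return kept, ids


# Hypothetical binary state data: three runs of identical records.
states = [(0, 0, 1), (0, 0, 1), (0, 1, 1), (0, 1, 1), (0, 1, 1), (0, 0, 1)]
realstates, recids = compact(states)
print(realstates)  # [(0, 0, 1), (0, 1, 1), (0, 0, 1)]
print(recids)      # [0, 2, 5]
```

Note that the state (0, 0, 1) appears twice in the result: only adjacent copies are dropped, so a state that recurs later in the sequence is kept again.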

Example: A typical use of this command is to find unique instances of data without disturbing the order of events.

NDA> load process.dat -n data
# Select binary parameters representing process state
NDA> select statedata -f data.statep1 data.statep2 data.statep3
NDA> compact -d statedata -dout realstates -fout recids
# "realstates" could be used to train a SOM and visualize
# trajectories
# "recids" could be used to calculate statistics of other
# parameters for each distinct state
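As a rough illustration of the last comment, the sketch below computes the mean of another parameter over each run of identical states. It is written in Python rather than NDA commands and rests on assumptions: record IDs are zero-based, each ID in `recids` marks the start of a run that lasts until the next ID, and `run_means` and `temperature` are hypothetical names.

```python
from statistics import mean
from typing import List, Sequence


def run_means(recids: Sequence[int], values: Sequence[float]) -> List[float]:
    """Mean of `values` over each run that starts at an ID in `recids`.

    A run extends from its starting ID up to (not including) the next
    starting ID; the last run extends to the end of the data.
    """
    bounds = list(recids) + [len(values)]
    return [mean(values[start:end]) for start, end in zip(bounds, bounds[1:])]


# Hypothetical example: IDs of kept records and another measured parameter.
recids = [0, 2, 5]
temperature = [20.0, 22.0, 30.0, 31.0, 32.0, 21.0]
print(run_means(recids, temperature))  # [21.0, 31.0, 21.0]
```

Because a state can recur later in the sequence, statistics per distinct state would additionally require merging the runs that share the same state values.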

Anssi Lensu
Thu May 17 15:00:44 EET DST 2001
