In this study, computationally intelligent methods have been developed for profiling documents that contain textual data.
The main goal has been to locate and identify groups of similar documents.
- The number of documents is usually large
- The vocabulary may be huge
- Textual data may contain several languages and spelling errors
- Finnish text contains at least compound words and case endings, and possibly several dialects as well
- Neural Data Analysis Environment
- Self-Organizing Maps (TS-SOM)
- Multi-stage analysis model
- Word similarity detection and grouping
- Contexts are formed according to the use of words in sentences
- Features representing whole documents are used to train a concluding SOM
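
The stages listed above can be sketched roughly as follows. This is only a minimal illustration under several assumptions, not the method used in the study: a plain one-dimensional SOM stands in for the TS-SOM, each toy document is treated as a single sentence, word contexts are simple neighbour counts, and document features are histograms of word-map units. All function names and parameters (`word_contexts`, `to_vectors`, `train_som`, `best_unit`, `document_features`, `window`, `n_units`) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def word_contexts(sentences, window=1):
    """Count, for each word, the words seen within `window` positions of it."""
    contexts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    contexts[word][tokens[j]] += 1
    return contexts


def to_vectors(contexts):
    """Turn context counts into unit-length vectors over a sorted vocabulary."""
    vocab = sorted(contexts)
    vectors = {}
    for word, counts in contexts.items():
        vec = [float(counts.get(v, 0)) for v in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors[word] = [x / norm for x in vec]
    return vocab, vectors


def best_unit(units, x):
    """Index of the unit closest to x in squared Euclidean distance."""
    return min(range(len(units)),
               key=lambda k: sum((u - v) ** 2 for u, v in zip(units[k], x)))


def train_som(data, n_units=4, epochs=50, seed=0):
    """Train a flat 1-D SOM (assumption: not tree-structured like TS-SOM)."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        rate = 0.5 * frac + 0.01               # learning rate
        radius = max(n_units / 2.0 * frac, 0.5)  # neighbourhood width
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_unit(units, x)
            for k, unit in enumerate(units):
                h = math.exp(-((k - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    unit[d] += rate * h * (x[d] - unit[d])
    return units


def document_features(doc, vectors, units):
    """Normalised histogram of the word-map units hit by the document's words."""
    hist = [0.0] * len(units)
    for token in doc.lower().split():
        if token in vectors:
            hist[best_unit(units, vectors[token])] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


# Toy corpus (made up for illustration); each document is one sentence.
docs = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as markets closed",
    "markets rose as stocks rallied",
]
contexts = word_contexts(docs)
vocab, vectors = to_vectors(contexts)
word_som = train_som(list(vectors.values()), n_units=4)   # word grouping stage
features = [document_features(d, vectors, word_som) for d in docs]
doc_som = train_som(features, n_units=2)                  # concluding SOM
groups = [best_unit(doc_som, f) for f in features]
print(groups)
```

Documents that land on the same unit of the concluding map are treated as one group. With a corpus this small the grouping is not meaningful; the point is only the data flow from word contexts to word map to document features to the final map.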
Anssi Lensu
Tue Oct 27 15:29:16 EET 1998