
Introduction

Analysis of multivariate data, such as Gallup questionnaires, can be difficult when the number of data fields is large. For instance, direct groupings of human opinions are often meaningless if all available data fields are used, because of the large number of possible value combinations (see [6]). In addition, the number of data records is usually rather small because of the work needed to collect and process the questionnaires. Unless the questionnaire or the requested answers are trivial, no one-stage method is available that gives satisfactory grouping results. For example, the use of the Self-Organizing Map (SOM) [2, 3] might only discriminate between well-filled and badly filled questionnaires, since the large number of groups, each containing only a few data records, prevents a clear interpretation of the clustering.
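The combinatorial argument can be illustrated with a small calculation. The following is only a sketch under assumed numbers: the questionnaire size (20 questions with 5 answer options each), the sample size of 500 records, and the function name count_value_combinations are hypothetical and are not taken from the study.

```python
from math import prod


def count_value_combinations(options_per_field):
    """Return the number of distinct answer patterns over the given fields."""
    return prod(options_per_field)


# Hypothetical questionnaire: 20 questions, 5 answer options each.
all_fields = [5] * 20
n_records = 500  # assumed number of returned questionnaires

# Using every field at once.
total = count_value_combinations(all_fields)

# Using only one assumed category of 5 questions.
per_category = count_value_combinations(all_fields[:5])

print(f"all fields:   {total} combinations for {n_records} records")
print(f"one category: {per_category} combinations for {n_records} records")
```

Under these assumptions the full set of fields yields 5^20 possible answer patterns, so almost every record would form a group of its own, whereas a category of five questions yields 5^5 = 3125 patterns, which is smaller by many orders of magnitude.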

Better results can be obtained if the questions are divided into a few small categories that are analyzed separately, with the results then combined into an overall view. This kind of procedure also allows prior knowledge to be added to the analysis process. The use of several SOMs is partially adopted from [1], where the main idea was to create linked graphical representations of multi-categorical data.



Anssi Lensu
Tue Nov 3 11:38:53 EET 1998