Group membership calculation

Ungrouped data records $x_i$ can be interpreted in relation to known groups $G_j$ by calculating memberships $\mu_{ij}$, which are functions of the distance $d_{ij} = d(x_i, G_j)$ between record and group. The distance to a group can be calculated using the closest member of the group (single linkage), $d_{ij} = \min_{y \in G_j} \lVert x_i - y \rVert$, or the center of the group (group-average linkage), $d_{ij} = \lVert x_i - c_j \rVert$, in which $c_j = \frac{1}{|G_j|} \sum_{y \in G_j} y$.
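The two distance definitions can be sketched in a few lines of Python. This is a minimal illustration that assumes Euclidean distance and records stored as coordinate tuples; the function names are hypothetical and do not come from the original work.

```python
import math


def single_linkage_distance(record, group):
    """Distance from a record to the closest member of the group."""
    return min(math.dist(record, member) for member in group)


def group_center(group):
    """Coordinate-wise mean of the group members (assumed definition of the center)."""
    dims = len(group[0])
    return tuple(sum(member[k] for member in group) / len(group) for k in range(dims))


def group_average_distance(record, group):
    """Distance from a record to the center of the group."""
    return math.dist(record, group_center(group))


# Hypothetical example data: a square group and one ungrouped record.
group = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
record = (5.0, 2.0)

d_single = single_linkage_distance(record, group)   # closest member is (2, 2)
d_average = group_average_distance(record, group)   # center is (1, 1)
```

In this example the single-linkage distance is measured to the corner (2, 2), while the group-average distance is measured to the center (1, 1), so the latter is larger.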

Figure 2: Different linkages for distance calculation to a single group: a) single, b) group-average. c) A problematic distribution for group-average linkage.

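Group-average linkage is usually computationally more efficient: the group center is computed once in O(m), and each of the n data records then needs a single distance evaluation, giving O(n+m). Single linkage compares every record with every group member, giving O(mn). Here n is the number of data records and m the number of records in the group. Some problematic data sets, however, should not be evaluated with group-average linkage (see figure 2 c).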
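The following sketch shows one way such a problem can arise. The ring-shaped group is an assumption made for illustration and is not necessarily the distribution drawn in figure 2 c; Euclidean distance is also assumed.

```python
import math


def ring(radius, count):
    """Hypothetical group: `count` points spaced evenly on a circle around the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / count),
         radius * math.sin(2 * math.pi * k / count))
        for k in range(count)
    ]


group = ring(1.0, 12)

# Center of the group: coordinate-wise mean of its members.
center = tuple(sum(p[k] for p in group) / len(group) for k in range(2))

# A record in the empty middle of the ring, far from every actual member.
record = (0.0, 0.0)

d_single = min(math.dist(record, p) for p in group)  # about 1.0 (the ring radius)
d_center = math.dist(record, center)                 # about 0.0
```

The record lies a full radius away from every group member, yet its distance to the group center is practically zero, so group-average linkage would treat it as the most typical member of the group.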

The actual group memberships $\mu_{ij}$ can be calculated as a function of the distance $d_{ij}$ between record and group, $\mu_{ij} = f(d_{ij})$. For examples, see figure 3.
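As a rough sketch, two commonly used shapes for a distance-based membership function are given below. Both are assumptions chosen for illustration and are not necessarily the functions plotted in figure 3 b; the names `linear_membership`, `gaussian_membership`, `d_max` and `width` are hypothetical.

```python
import math


def linear_membership(distance, d_max):
    """Membership falls linearly from 1 at distance 0 to 0 at d_max and beyond."""
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    return max(0.0, 1.0 - distance / d_max)


def gaussian_membership(distance, width):
    """Membership decays smoothly with distance and never reaches exactly 0."""
    if width <= 0:
        raise ValueError("width must be positive")
    return math.exp(-((distance / width) ** 2))


# Hypothetical distances from one record to two groups.
distances = {"group A": 1.0, "group B": 3.0}
memberships = {name: linear_membership(d, d_max=4.0) for name, d in distances.items()}
```

With `d_max=4.0`, the record receives membership 0.75 in group A and 0.25 in group B. A linear function gives a hard cut-off at the maximum distance, whereas the Gaussian shape leaves a small nonzero membership at any distance.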

Figure 3: a) Group membership distributions of two groups in a two-dimensional case. b) Examples of group membership functions. Dashed lines indicate the identified group boundary; max marks the maximum distance found.



Anssi Lensu
Tue Nov 3 12:18:16 EET 1998