Ungrouped data records can be interpreted in relation to known groups by calculating group memberships, which are functions of the distance between a record and a group. The distance to a group can be calculated using either the closest member of the group (single linkage) or the center of the group (group-average linkage).
Figure 3: Different linkages for calculating the distance to a group: a) single and b) group-average. c) A problematic distribution for group-average linkage.
Group-average linkage is computationally more efficient than single linkage (O(n+m) compared to O(mn), where n is the number of data records and m is the number of records in the group), because the group center is computed once and then reused for every record. However, some data distributions should not be evaluated with group-average linkage, namely those in which the center of the group represents its members poorly (see figure 3 c).
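The two distance calculations can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes Euclidean distance and records stored as coordinate tuples, and the function names and the ring-shaped example group are hypothetical. Whether a ring resembles the distribution in figure 3 c is likewise an assumption.

```python
import math


def single_linkage_distance(record, group):
    """Distance from a record to the closest member of the group.

    Costs O(m) per record, hence O(mn) for n records.
    """
    return min(math.dist(record, member) for member in group)


def group_center(group):
    """Component-wise mean of the group members, computed once in O(m)."""
    dims = len(group[0])
    return tuple(sum(member[i] for member in group) / len(group)
                 for i in range(dims))


def center_distance(record, center):
    """Distance from a record to a precomputed group center, O(1) per record."""
    return math.dist(record, center)


# Hypothetical ring-shaped group: its center falls in the empty middle.
ring = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
center = group_center(ring)
record = (0.0, 0.0)

print(single_linkage_distance(record, ring))  # distance to nearest member
print(center_distance(record, center))        # distance to the group center
```

For the record in the middle of the ring, the center-based distance is zero although no group member is nearby, which is the kind of disagreement that makes such distributions unsuitable for center-based evaluation.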
The actual group memberships can then be calculated as a function of this distance. For examples, see figure 4.
Figure 4: a) Group membership distributions of two groups in a two-dimensional case. b) Examples of group membership functions. Dashed lines indicate the identified group boundary; max marks the located maximum distance.
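As a rough illustration of how a distance can be turned into a membership value, the sketch below defines two candidate membership functions. Both are assumptions chosen for illustration, not the functions shown in figure 4 b: a linear function that falls to zero at a maximum distance, and an exponential function that decays smoothly. The names linear_membership, exponential_membership and d_max are hypothetical.

```python
import math


def linear_membership(d, d_max):
    """Membership falls linearly from 1 at d = 0 to 0 at d = d_max.

    Assumed form; records beyond d_max get zero membership.
    """
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    return max(0.0, 1.0 - d / d_max)


def exponential_membership(d, d_max):
    """Membership decays smoothly with distance and never reaches zero.

    Assumed form; d_max acts only as a length scale here.
    """
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    return math.exp(-d / d_max)


# Compare both functions at a few distances for a hypothetical d_max of 2.0.
for d in (0.0, 1.0, 2.0, 3.0):
    print(d, linear_membership(d, 2.0), exponential_membership(d, 2.0))
```

The linear form gives a hard group boundary at d_max, while the exponential form assigns a small nonzero membership to every record; which behavior is preferable depends on how the group boundary is meant to be interpreted.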