1. Limited time only! Sign up for a free 30min personal tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Homework Help: Does Average Linkage satisfy the properties of metric space?

  1. Mar 14, 2013 #1
    1. The problem statement, all variables and given/known data

    A dissimilarity measure d(x, y) for two data points x and y typically satisfy the following three properties:

    1. [itex]d(x, y) ≥ 0[/itex] and [itex]d(x, y) = 0[/itex] if and only if [itex]x = y[/itex]
    2. [itex]d(x, y) = d(y, x)[/itex]
    3. [itex]d(x,z) ≤ d(x, y) + d(y,z)[/itex]

    The following method has been proposed for measuring the dissimilarity[read: distance] between two sets of data points A = {xa1, xa2, . . . , xam} and B = {xb1, xb2, . . . , xbn}

    Average Linkage(UPGMA):

    [tex]d(A,B) = {\frac{Ʃ_{x∈A} Ʃ_{y∈B} d(x, y)}{|A||B|}} [/tex]

    Show that the linkage method satisfies each of the properties or provide a counter example (a visual/diagram representation of any counter example is sufficient if appropriate).

    2. Specific Problem
    I am having difficulty proving that the Average Linkage method does/does not satisfy the third property of metric space, the triangle inequality.

    The Euclidean distance formula, the Absolute distance formula(Manhattan) or any other distance formula may be used to show this.
  2. jcsd
  3. Mar 14, 2013 #2


    User Avatar
    Science Advisor
    Homework Helper

    I don't think it even satisfies the first property that d(A,A)=0 if A has more than one different data point, does it?
    Last edited: Mar 14, 2013
  4. Mar 14, 2013 #3
    Yes, I think you might be correct about property 1. I sort of glazed over it as I had already proven that property for Single Linkage and Complete Linkage. I didn't consider the case of d(A,A)=0 where A has more than one data point.
  5. Mar 15, 2013 #4
    I think property one holds for average linkage.

    e.g. A = { (0,1) (0,0) }

    average d(A,A) = 0.5

    Obviously this is greater than 0 in the case where x = y.

    However the property states that d(x,y) can be 0 if and only if x = y.

    This is still true because d(x,y) will only equal to 0 when x = y with the added constraint that x will only contain a single data point.

    e.g. B = {(0,0)}

    average d(B,B) = 0
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted