Generalization of mean, median

  1. Aug 28, 2007 #1
    I recently learned that if you minimize these functions with respect to "a", you get the mean and the median respectively:

    [tex]\sum (y_i - a)^2[/tex]

    [tex]\sum |y_i - a|[/tex]

    What would you get if you minimized an expression like [tex]\sum |y_i - a|^n [/tex] for various n's? Do the resulting expressions have any use, or are they just a slightly different, more complicated mean?
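
    A sketch of why this works, and of the condition for general n (here N is the number of data points, a symbol not used above, and n > 1 is assumed in the last line so that the sum is differentiable in a). Setting the derivative with respect to a to zero:

    [tex]\frac{d}{da}\sum (y_i - a)^2 = -2\sum (y_i - a) = 0 \quad\Rightarrow\quad a = \frac{1}{N}\sum y_i[/tex]

    [tex]\frac{d}{da}\sum |y_i - a| = -\sum \operatorname{sgn}(y_i - a) = 0 \quad (a \neq y_i)[/tex]

    The second condition says that as many data points lie above a as below it, which is the defining property of a median. (When N is odd the minimum sits at the middle data point, where the derivative jumps from negative to positive instead of vanishing.) For general n > 1 the same step gives

    [tex]\sum \operatorname{sgn}(y_i - a)\,|y_i - a|^{n-1} = 0[/tex]

    which in general has to be solved numerically for a.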
     
  3. Aug 28, 2007 #2

    EnumaElish

    Science Advisor
    Homework Helper

    Higher powers of (y - a) may not have real solutions; the roots may all be imaginary.

    What is an example of min sum |y - a|^n that does not produce the identical result with min sum |y - a|^2? (Remember, the median has to be an element of the data set.)
     
    Last edited: Aug 28, 2007
  4. Aug 28, 2007 #3

    CRGreathouse

    Science Advisor
    Homework Helper

    The multiset {0, 0, 3} would work. The sum is minimized at a = 1.327480002... for n = 4, but at a = 1 for n = 2.
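
    These numbers can be checked with a short sketch. For 0 <= a <= 3 the n = 4 sum is 2a^4 + (3 - a)^4, and setting its derivative to zero gives 8a^3 = 4(3 - a)^3, i.e. a = 3 / (1 + 2^(1/3)). The variable names below (cost, a2, a4, a4_exact) are chosen for illustration only, and fminbnd is just one convenient way to do the one-dimensional minimization; its result is only as accurate as its default tolerance.

    ```matlab
    % Numerical check of the minimizer of sum(|y - a|.^n) for the multiset {0, 0, 3}.
    y = [0 0 3];
    cost = @(a, n) sum(abs(y - a).^n);      % objective for a scalar a

    a2 = fminbnd(@(a) cost(a, 2), 0, 3);    % n = 2: should agree with mean(y)
    a4 = fminbnd(@(a) cost(a, 4), 0, 3);    % n = 4

    % Closed form for n = 4 from 8a^3 = 4(3 - a)^3
    a4_exact = 3 / (1 + 2^(1/3));

    fprintf('n = 2: a = %.6f (mean = %.6f)\n', a2, mean(y));
    fprintf('n = 4: a = %.6f (closed form = %.6f)\n', a4, a4_exact);
    ```

    Because the sum is convex in a for n >= 1, the minimum found on [0, 3] is the global one.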
     
  5. Aug 28, 2007 #4

    EnumaElish

    Science Advisor
    Homework Helper

    But 1.327480002... is not in the set. (Neither is 1.) By convention, the "4th-order median" is identical to the median (=0).
     
  6. Aug 28, 2007 #5

    CRGreathouse

    Science Advisor
    Homework Helper

    Why would a need to be in the multiset? It's a fourth-order mean.
     
  7. Aug 28, 2007 #6
    I looked at this problem in MATLAB (code is below), and it does seem like there exists a "mean" for each n that lies within the range of values "a" is defined on.


    Code (Text):
    clear
    hold off

    % determine distribution over [0,1]
    x = rand(100,1);
    t = 0.0:(1.0/99.0):1.0;
    y = exp(-10*(x).^2);
    y2 = exp(-10*(t).^2);
    plot(t,y2)

    a = 0.0:0.001:1.0;

    for k = 1:size(a,2)
        for n = 1:10
            S(k,n) = sum( ( abs(y-a(k)) ).^n );
        end
    end

    figure
    hold all
    minPts = zeros(1,10);
    for n = 1:10
        % plot
        plot(a,log(S(:,n)))
        % determine minimums: the second output of min is the index of the
        % first minimum, so this stays scalar even when several grid points
        % tie (find(S(:,n) == min(S(:,n))) can return more than one index)
        [~, minPts(n)] = min(S(:,n));
    end
    title('log plot of summary functions');

    figure
    hold all
    for n = 1:10
        plot(a,S(:,n))
    end
    title('summary functions')


    % a(minPts)
     
  8. Aug 28, 2007 #7

    EnumaElish

    Science Advisor
    Homework Helper

    My post about imaginary root was in reference to the mean, not the median. ({1,2,3,4,5} - a)^n does not have a real root for every n.
     
    Last edited: Aug 28, 2007
  9. Aug 28, 2007 #8

    EnumaElish

    Science Advisor
    Homework Helper

    My post in reference to your multiset was about the median, not the mean. The main question of the OP, I think, was also the median.
     
  10. Aug 31, 2007 #9

    Chris Hillman

    Science Advisor

    Choquet theory, anyone?

    With some qualifications (in the case of median), you can easily generalize this to hold for finite-dimensional vector spaces. (If you like that, and have a graduate level background in functional analysis, check out Choquet theory, in which we average over infinite dimensional simplices and find beautiful geometric interpretations of important concepts in ergodic theory.)

    What memorable formal property of the median is most easily spotted in the vector space setting?

    Speaking of mean and median, in the one-dimensional case, what can you say about situations in which the mean exceeds the median and conversely? Can these situations arise plausibly when grading quizzes?

    (A phrase from "A Prairie Home Companion" always makes me smile: "where all the children are above average". Outside Lake Wobegon we can't do quite that well, but with a sufficiently unlikely distribution :wink: we can come close!)
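
    A small hypothetical illustration of the quiz question: the scores below are invented for this example, not taken from any real class. One very low score pulls the mean below the median, so most of the class ends up above average.

    ```matlab
    % Hypothetical quiz scores out of 10 (invented data): one low outlier.
    scores = [2 10 10 10 10];

    m         = mean(scores);        % 8.4
    med       = median(scores);      % 10
    fracAbove = mean(scores > m);    % fraction of scores above the mean: 0.8

    fprintf('mean = %.1f, median = %.1f, fraction above mean = %.1f\n', ...
            m, med, fracAbove);
    ```

    Flipping the skew (one very high score among low ones) gives the converse case, with the mean above the median.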

    Well, first of all, what can you say about the "resulting expressions"?
     
    Last edited: Aug 31, 2007
  11. Aug 31, 2007 #10

    D H

    Staff Emeritus
    Science Advisor

    I suppose you could call the value of a that minimizes [tex]\sum |y_i - a|^n [/tex] the [tex]\mathcal L^n[/tex] mean. The question is, does such a mean have any practical value? The median and mean are both quite useful statistical measures. The [tex]\mathcal L^{\infty}[/tex] mean might be of some use (just a supposition; I can't think of any use off the top of my head). For example, the [tex]\mathcal L^{\infty}[/tex] mean of the set {0, 0, 3} is 1.5.
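
    A rough sketch of where the 1.5 comes from: as n grows, the largest deviation dominates the sum, so the minimizer drifts toward the value that minimizes max |y_i - a|, which is the midrange (min + max) / 2. The names ns, aMin, aExact and midrange are illustrative, and the closed form a = 3 / (1 + 2^(1/(n-1))) holds only for this particular multiset (it follows from 2a^(n-1) = (3 - a)^(n-1)).

    ```matlab
    % Minimizer of sum(|y - a|.^n) for {0, 0, 3} as n increases.
    y  = [0 0 3];
    ns = [2 4 8 16 32];
    aMin = zeros(size(ns));
    for k = 1:numel(ns)
        n = ns(k);
        % numerical minimizer on [min(y), max(y)]
        aMin(k) = fminbnd(@(a) sum(abs(y - a).^n), min(y), max(y));
    end

    aExact   = 3 ./ (1 + 2.^(1 ./ (ns - 1)));   % closed form, this multiset only
    midrange = (min(y) + max(y)) / 2;            % 1.5, the n -> infinity limit

    disp([ns; aMin; aExact]);                    % rows: n, numerical, closed form
    fprintf('midrange = %.1f\n', midrange);
    ```

    The minimizers increase with n and approach the midrange from below, starting from the ordinary mean at n = 2.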
     
    Last edited: Aug 31, 2007