What Do You Get by Minimizing \sum |y_i - a|^n for Various n's?

  • Context: Undergrad 
  • Thread starter: JoAuSc
  • Tags: Mean, Median

Discussion Overview

The discussion revolves around the minimization of the expression \(\sum |y_i - a|^n\) for various values of \(n\) and its implications in relation to statistical measures such as the mean and median. Participants explore whether these minimizations yield useful results or if they simply represent more complex forms of means.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants note that minimizing \(\sum (y_i - a)^2\) yields the mean, while minimizing \(\sum |y_i - a|\) yields the median.
  • There is a suggestion that higher powers of \(|y - a|\) may not have real solutions, potentially leading to imaginary results.
  • One participant provides an example using the multiset {0, 0, 3}: for \(n = 4\) the minimizer is \(a \approx 1.327\), whereas for \(n = 2\) it is the mean, \(a = 1\). Neither value is an element of the set.
  • Another participant questions the necessity of \(a\) being an element of the multiset, arguing that it is a fourth-order mean.
  • MATLAB code is shared by a participant to illustrate the existence of a "mean" for each \(n\) within the defined range of \(a\), although they note issues with finding minimums in their code.
  • There is a distinction made between the mean and median in the context of the original post, with some participants emphasizing the median aspect of the discussion.
  • One participant introduces Choquet theory, suggesting its relevance to the discussion of means and medians in vector spaces.
  • Another participant proposes the term \(\mathcal{L}^n\) mean for the minimizer of \(\sum |y_i - a|^n\) and questions its practical value compared to traditional statistical measures.

Areas of Agreement / Disagreement

Participants express differing views on the implications of minimizing \(\sum |y_i - a|^n\), with no consensus reached on whether these expressions yield practical value or are merely variations of means. The discussion includes multiple competing perspectives on the necessity of \(a\) being part of the multiset and the nature of the resulting values for different \(n\).

Contextual Notes

Some limitations are noted regarding the existence of real solutions for higher powers and the conditions under which the median is defined. The discussion also highlights unresolved mathematical steps related to the minimization process.

JoAuSc
I recently learned that if you minimize these functions with respect to "a", you get the mean and the median respectively:

\sum (y_i - a)^2

\sum |y_i - a|

What would you get if you minimized an expression like \sum |y_i - a|^n for various n's? Do the resulting expressions have any use, or are they just a slightly different, more complicated mean?
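For reference, here is a short sketch of why the two sums above lead to the mean and the median, together with the stationarity condition for general n > 1. It assumes N data points y_1, ..., y_N (N is notation introduced here, not part of the original post).

```latex
\frac{d}{da}\sum_{i=1}^{N}(y_i-a)^2 = -2\sum_{i=1}^{N}(y_i-a) = 0
\;\Longrightarrow\; a = \frac{1}{N}\sum_{i=1}^{N} y_i ,
\qquad
\frac{d}{da}\sum_{i=1}^{N}|y_i-a| = -\sum_{i=1}^{N}\operatorname{sgn}(y_i-a)
\quad (a \neq y_i \text{ for all } i),
\qquad
\frac{d}{da}\sum_{i=1}^{N}|y_i-a|^n = -n\sum_{i=1}^{N}|y_i-a|^{n-1}\operatorname{sgn}(y_i-a)
\quad (n > 1).
```

The second expression changes sign where as many y_i lie above a as below it, which is the defining property of a median; for even N, any a between the two middle values minimizes the sum. For n > 1 the sum is convex in a, so the last expression vanishes at a real minimizer between the smallest and largest y_i.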
 
Higher powers of (y-a) may not have real solutions; they may all be imaginary.

What is an example of min sum |y - a|^n that does not produce the identical result with min sum |y - a|^2? (Remember, the median has to be an element of the data set.)
 
EnumaElish said:
What is an example of min sum |y - a|^n that does not produce the identical result with min sum |y - a|^2?

The multiset {0, 0, 3} would work. The sum is minimized at a = 1.327480002... for n = 4, but at a = 1 for n = 2.
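A quick numerical check of this example, as a minimal sketch. For the multiset {0, 0, 3} and n = 4 the objective is 2a^4 + (3 - a)^4, and setting its derivative to zero gives a = 3/(1 + 2^(1/3)). The helper names `lp_cost` and `argmin_convex` are made up for illustration, and the ternary search relies on the objective being convex in a, which holds for n >= 1.

```python
def lp_cost(data, a, n):
    """Sum of |y - a|**n over the data."""
    return sum(abs(y - a) ** n for y in data)


def argmin_convex(data, n, iters=200):
    """Ternary search for a minimizer of lp_cost on [min(data), max(data)].

    Assumes n >= 1, so the cost is convex in a and a minimizer lies
    between the smallest and largest data values.
    """
    lo, hi = min(data), max(data)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if lp_cost(data, m1, n) < lp_cost(data, m2, n):
            hi = m2  # minimizer cannot lie to the right of m2
        else:
            lo = m1  # minimizer cannot lie to the left of m1
    return (lo + hi) / 2


data = [0, 0, 3]
a2 = argmin_convex(data, 2)            # ordinary mean
a4 = argmin_convex(data, 4)            # "fourth-order mean"
closed_form = 3 / (1 + 2 ** (1 / 3))   # from 8a^3 = 4(3 - a)^3

print(a2, a4, closed_form)
```

The search result for n = 4 should agree with the closed form to several decimal places, and neither it nor the n = 2 result needs to be an element of the multiset.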
 
CRGreathouse said:
The multiset {0, 0, 3} would work. The sum is minimized at a = 1.327480002... for n = 4, but at a = 1 for n = 2.
But 1.327480002... is not in the set. (Neither is 1.) By convention, the "4th-order median" is identical to the median (=0).
 
EnumaElish said:
But 1.327480002... is not in the set. (Neither is 1.) By convention, the "4th-order median" is identical to the median (=0).

Why would a need to be in the multiset? It's a fourth-order mean.
 
I looked at this problem in MATLAB (code is below), and it does seem like there exists a "mean" for each n that lies within the range of values "a" is defined on.


Code:
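clear
hold off

% draw 100 points x uniformly from [0,1] and map them through a
% Gaussian-shaped curve to get the data y; y2 is the same curve on a grid
x = rand(100,1);
t = 0.0:(1.0/99.0):1.0;
y = exp(-10*(x).^2);
y2 = exp(-10*(t).^2);
plot(t,y2)

% candidate values of a
a = 0.0:0.001:1.0;

% S(k,n) = sum_i |y_i - a(k)|^n
S = zeros(numel(a),10);
for k = 1:numel(a)
    for n = 1:10
        S(k,n) = sum( ( abs(y-a(k)) ).^n );
    end
end

figure
hold all
minPts = zeros(1,10);
for n = 1:10
    % plot
    plot(a,log(S(:,n)))
    % determine minimums: find(S(:,n) == min(S(:,n))) can return several
    % indices when grid points tie, which breaks the scalar assignment;
    % the second output of min is always a single index (the first minimum)
    [~, minPts(n)] = min(S(:,n));
end
title('log plot of summary functions');

figure
hold all
for n = 1:10
    plot(a,S(:,n))
end
title('summary functions')

% grid values of a that minimize S for n = 1..10
a(minPts)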
 
JoAuSc said:
I looked at this problem in MATLAB (code is below), and it does seem like there exists a "mean" for each n that lies within the range of values "a" is defined on.
My post about imaginary root was in reference to the mean, not the median. ({1,2,3,4,5} - a)^n does not have a real root for every n.
 
CRGreathouse said:
Why would a need to be in the multiset? It's a fourth-order mean.
My post in reference to your multiset was about the median, not the mean. The main question of the OP, I think, was also the median.
 
Choquet theory, anyone?

JoAuSc said:
I recently learned that if you minimize these functions with respect to "a", you get the mean and the median respectively:
\sum (y_i - a)^2
\sum |y_i - a|

With some qualifications (in the case of median), you can easily generalize this to hold for finite-dimensional vector spaces. (If you like that, and have a graduate level background in functional analysis, check out Choquet theory, in which we average over infinite dimensional simplices and find beautiful geometric interpretations of important concepts in ergodic theory.)

What memorable formal property of the median is most easily spotted in the vector space setting?

Speaking of mean and median, in the one-dimensional case, what can you say about situations in which the mean exceeds the median and conversely? Can these situations arise plausibly when grading quizzes?

(A phrase from "A Prairie Home Companion" always makes me smile: "Where all the children are above average". Outside Lake Wobegon we can't do quite that well, but with a sufficiently unlikely distribution :wink: we can come close!)
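One way to experiment with the mean-versus-median question is a small sketch like the following. The quiz scores are made up purely for illustration (out of 10), and the variable names are hypothetical.

```python
from statistics import mean, median

# Made-up quiz scores out of 10, purely for illustration.
low_outlier = [0, 9, 9, 10, 10]    # one student skipped the quiz
high_outlier = [3, 3, 4, 4, 10]    # one student aced a hard quiz

for scores in (low_outlier, high_outlier):
    m, med = mean(scores), median(scores)
    above = sum(s > m for s in scores)  # how many scores beat the mean
    print(scores, "mean:", m, "median:", med, "above mean:", above)
```

In the first list a single low score drags the mean below the median, so four of the five students end up above average; in the second a single high score pulls the mean above the median.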

JoAuSc said:
What would you get if you minimized an expression like \sum |y_i - a|^n for various n's? Do the resulting expressions have any use, or are they just a slightly different, more complicated mean?

Well, first of all, what can you say about the "resulting expressions"?
 
JoAuSc said:
What would you get if you minimized an expression like \sum |y_i - a|^n for various n's? Do the resulting expressions have any use, or are they just a slightly different, more complicated mean?

I suppose you could call the value of a that minimizes \sum |y_i - a|^n the \mathcal L^n mean. The question is, does such a mean yield any practical value? The median and mean are quite useful statistical measures. The \mathcal L^{\infty} mean might be of some use (just a supposition; I can't think of any off the top of my head). For example, the \mathcal L^{\infty} mean of the set {0, 0, 3} is 1.5.
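To see where the 1.5 comes from, here is a small sketch. For {0, 0, 3} the stationarity condition 2a^(n-1) = (3 - a)^(n-1) gives a_n = 3/(1 + 2^(1/(n-1))) for n > 1, which tends to the midrange (min + max)/2 as n grows. The function name `lp_center_003` is hypothetical, and the closed form applies only to this particular multiset.

```python
def lp_center_003(n):
    """Minimizer of 2*a**n + (3 - a)**n on [0, 3], valid for n > 1.

    Comes from 2*a**(n-1) = (3 - a)**(n-1); specific to {0, 0, 3}.
    """
    return 3 / (1 + 2 ** (1 / (n - 1)))


data = [0, 0, 3]
midrange = (min(data) + max(data)) / 2   # minimizes max_i |y_i - a|

for n in (2, 4, 10, 100, 1000):
    print(n, lp_center_003(n))
print("midrange:", midrange)
```

The printed values start at the ordinary mean for n = 2 and creep toward the midrange as n increases, since for large n the largest deviation dominates the sum.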
 
