# Least Squares Method: What is the 'measured mean value of y'?

1. Jan 27, 2013

### nerdy_hottie

1. The problem statement, all variables and given/known data

So I'm doing a Least Squares Analysis and I'm wondering about what the 'measured mean value of y for replicate measurements of the unknown' value is supposed to be. I have no idea in the world what it's asking for. The value it is speaking of is not the same as the average value in y. I will post the example so you can see what I'm talking about.

Least-Squares Spreadsheet

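| X | Y |
|---|---|
| 1 | 2 |
| 3 | 3 |
| 4 | 4 |
| 6 | 5 |

| Label | Value | Value | Label |
|---|---|---|---|
| m | 0.615384615 | 1.346153846 | b |
| sm | 0.054392829 | 0.214144783 | sb |
| R² | 0.984615385 | 0.196116135 | sy |

n= 4
Mean y= 3.5
Σ(xi-mean x)²= 13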

*********Measured y= 2.72
k= number of replicate measurements of y= 1
Derived x= 2.2325
sx= 0.373502805

2. Relevant equations
I'm looking for an equation, or an explanation as to how to obtain the value.

3. The attempt at a solution

I have asterisks (********) next to the measured y value in the spreadsheet. (The value is 2.72). The only reason I know what it is in this case is because this is an example from my text book. I have no idea where it comes from, but I need it for an equation to be able to do my lab and I don't know how to find the value.
As far as I can make sense of it, I have no means of calculating 'measured mean value of y for replicate measurements of the unknown', as there are no replicate measurements of the y values. Right?
Just in case it helps, this is for an analytical chemistry lab, but it's pertaining to statistics, so I asked it here.

Thanks.

2. Jan 27, 2013

### I like Serena

Welcome to PF, nerdy_hottie!

It looks like your measured y value of 2.72 is given and not calculated.
The purpose is to find the corresponding x.
The derived x is found by applying the inverse of the found linear relationship.

The 2.72 appears to be the result of a set of k=1 measurements.
This is relevant for the estimated sx, the standard deviation of the derived x.
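
As a rough check of this reading, the sketch below refits the four (x, y) points from the textbook spreadsheet, inverts the line at the given y = 2.72, and evaluates the sx expression that the original poster quotes in post 11 (with the square taken over the whole difference). The function names `fit_line` and `inverse_predict` are made up for illustration; only the numbers come from the spreadsheet.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = m*x + b with the usual summary values."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_mean - m * x_mean
    # Standard deviation of the residuals, with n - 2 degrees of freedom.
    s_y = math.sqrt(sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    return {"m": m, "b": b, "s_y": s_y, "n": n, "y_mean": y_mean, "sxx": sxx}

def inverse_predict(fit, y_measured, k):
    """Derived x and its uncertainty when y_measured is the mean of k replicates."""
    m = fit["m"]
    x = (y_measured - fit["b"]) / m
    s_x = (fit["s_y"] / abs(m)) * math.sqrt(
        1 / k + 1 / fit["n"] + (y_measured - fit["y_mean"]) ** 2 / (m ** 2 * fit["sxx"])
    )
    return x, s_x

fit = fit_line([1, 3, 4, 6], [2, 3, 4, 5])
x_derived, s_x = inverse_predict(fit, y_measured=2.72, k=1)
print(f"m = {fit['m']:.4f}, b = {fit['b']:.4f}, s_y = {fit['s_y']:.4f}")
print(f"derived x = {x_derived:.4f}, s_x = {s_x:.4f}")
```

With these inputs the sketch reproduces the spreadsheet's derived x of 2.2325 and sx of about 0.3735, which is consistent with 2.72 being an input rather than something computed from the calibration points.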

3. Jan 27, 2013

### nerdy_hottie

So how would I find the given value for another set of values? Is there a table or something based on the number of k?

4. Jan 27, 2013

### I like Serena

The idea is that a new set of y measurements is done for a fixed unknown value of x.

The more measurements, the more accurate y will be, the more accurate will the linear relationship be, and the more accurate will the corresponding resulting x be.

I don't have the formulas at hand, but typically the standard deviations will decrease by a factor of about √k.
I guess what you would need is those formulas.
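
A small numerical check of the √k remark, as a sketch only: it plugs the summary values from the textbook spreadsheet in post 1 into the sx expression quoted by the original poster in post 11 and varies k. The helper name `s_x_for_k` is hypothetical. Under that formula only the 1/k term shrinks with more replicates, so the improvement levels off instead of following √k exactly.

```python
import math

# Summary values copied from the textbook spreadsheet in post 1.
M, S_Y, N = 0.615384615, 0.196116135, 4
Y_MEAN, SXX, Y_MEASURED = 3.5, 13.0, 2.72

def s_x_for_k(k):
    """Uncertainty in the derived x when y_measured is the mean of k replicates."""
    distance_term = (Y_MEASURED - Y_MEAN) ** 2 / (M ** 2 * SXX)
    return (S_Y / abs(M)) * math.sqrt(1 / k + 1 / N + distance_term)

for k in (1, 2, 4, 9):
    print(f"k = {k}: s_x = {s_x_for_k(k):.4f}")
```

Going from k = 1 to k = 4 reduces sx by less than a factor of 2 here, because the 1/n term and the distance term do not depend on k.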

5. Jan 27, 2013

### nerdy_hottie

I have the formulas for finding sx (uncertainty in x), and the corresponding formulas for all the other values I have listed. I have another set of data for which I want to find the value of 'measured y', and I can't proceed without knowing what it is. So are you saying that if I have another set of data with only one replicate measurement of y, the value will always be the given 2.72?

6. Jan 27, 2013

### I like Serena

I'm saying that the y measurement of 2.72 (and its number of measurements k=1) is not calculated from the data you have shown.
It is drawn from elsewhere.

7. Jan 27, 2013

### nerdy_hottie

Yes, but is it constant across all data sets with number of measurements, k=1 ? I mean, if I don't have to calculate it from the data given, and it's a given value for k=1, then isn't it a constant?

8. Jan 27, 2013

### I like Serena

I have seen only 1 dataset with only 4 measurements.
I guess it's a constant across this dataset...

For which purpose do you need it?

9. Jan 27, 2013

### nerdy_hottie

Okay sorry for any confusion but I didn't want to take the time to post all the data. I was just trying to find out the meaning of that measured y value, and apply it to the data I have now and all other data sets in the future. Right now the data set I'm working with is as follows:

Determination of Cu in Brass Using AA Spec.

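| Conc. (ppm) | Abs. |
|---|---|
| 0.000 | 0.000 |
| 2.044 | 0.268 |
| 4.088 | 0.509 |
| 6.132 | 0.723 |

| Label | Value | Value | Label |
|---|---|---|---|
| m | 0.118 | 0.014 | b |
| sm | 0.004 | 0.016 | sb |
| R² | 0.997 | 0.019 | sy |

n= 3
Mean y= 0.500
Σ(xi-mean x)²= 8.355872

Measured y= ????
sx= ???? (need measured y)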
Hope that's a little clearer.

10. Jan 27, 2013

### I like Serena

Seems to me you are supposed to measure the absorption yourself a couple of times.
And then fill in that value.
Didn't you say this is for a chem lab?

From that you can find the copper concentration and its associated uncertainty.

You would use the relation:
$$\text{Absorption} = (0.118 \pm 0.004) \times \text{Concentration} + (0.014 \pm 0.016)$$

Last edited: Jan 27, 2013

11. Jan 27, 2013

### nerdy_hottie

I have measured the absorbance.. the values are above.
"Abs
0.000
0.268
0.509
0.723 "
for the corresponding values of concentration.
I have calculated average, "Mean y=0.500", and other such parameters, as seen above. I am performing a least squares analysis, and am as far as calculating sx using the formula
$$s_x = \frac{s_y}{|m|}\sqrt{\frac{1}{k}+\frac{1}{n}+\frac{(y-\overline{y})^2}{m^2\,\Sigma(x_i-\overline{x}^2)}}$$

I just need that value for measured y.

12. Jan 27, 2013

### Ray Vickson

This might not be true *exactly* as written. In regression analysis there are expressions available that give "prediction intervals" for y(x) and "confidence intervals" for E[y(x)] in terms of x, so the width of an uncertainty bracket is different for different values of x. See, e.g.,
http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm .

Since the intervals for m and b are correlated, we cannot just use the two intervals separately, as your expression does, although that might give a pretty good approximation in some cases.

13. Jan 27, 2013

### I like Serena

Yes, so you did 3 measurements to calibrate, using known concentrations.
Next you would pick substance X with an unknown concentration of copper.
Do k absorption measurements, average them, and fill the result into your formula to find the standard deviation of the concentration.

Btw, be careful to put the last square outside the parentheses. It should be $(x_i-\bar x)^2$.
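
For reference, here is the expression with that correction applied, written out in full. The symbols follow the spreadsheet labels used in the thread; the reading of each symbol is inferred from how the posts use it.

$$s_x = \frac{s_y}{|m|}\sqrt{\frac{1}{k} + \frac{1}{n} + \frac{(y - \bar{y})^2}{m^2 \sum_i (x_i - \bar{x})^2}}$$

Here $y$ is the mean of the $k$ replicate readings of the unknown, $\bar{y}$ and $\bar{x}$ are the means of the $n$ calibration points, $s_y$ is the standard deviation of the residuals about the fitted line, and $m$ is the slope.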

14. Jan 27, 2013

### I like Serena

Yep, those were the formulas I was looking for in post #4.
It appears the OP is supposed to use a version that is even more advanced than the ones mentioned.

Actually, this is pretty advanced for a chem lab.

15. Jan 27, 2013

### nerdy_hottie

Oh my gosh, I'm sorry, I don't know what I'm talking about. Yes, I have other values for 'substance X', as you called it: samples of brass, for which I have separate absorbance values. I made up three different solutions (dilutions) of brass using three different masses of the same brass solid, and I have one absorbance for each solution. Each of these absorbances is actually an average of three readings, because the machine I used (an atomic absorption spectrometer, AA spec.) takes three separate readings of a sample and reports their average (listed below).
So I have three values for brass, and each is an average the machine took.
Does that mean that the k value is 3 (because the number of replicate measurements, or times the machine took an absorbance value, is 3)? Am I understanding this right at all, or totally wrong?
And if the value of k is 3, what then is the corresponding value of 'measured y' for that set of samples?
(I don't think you need it, but the absorbances for my three separate brass samples are:
0.521, 0.511, 0.524)

16. Jan 27, 2013

### I like Serena

Good!

Your 'measured y' would be 0.521 for the first brass sample.
And indeed you would have k=3 replicate measurements.
Fill that into your formula, and you'll get the standard deviation for the concentration of copper in this brass sample.

Repeat for the other 2 samples to find the sx in their copper concentrations as well.

Or am I misunderstanding and are all those measurements for the same sample of unknown brass?
If that is the case, you should average them and use k=9.
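
A sketch of that calculation for the copper data posted in this thread, under two assumptions that should be checked against the lab handout. First, the calibration line is fitted to all four standards, including the 0.000 ppm blank; this is what reproduces the posted m ≈ 0.118 and b ≈ 0.014, and it makes n = 4. Second, each brass absorbance is treated as the mean of k = 3 instrument readings. The function names are illustrative, and the results are concentrations in the measured solutions, not in the solid brass.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = m*x + b with the usual summary values."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_mean - m * x_mean
    # Standard deviation of the residuals, with n - 2 degrees of freedom.
    s_y = math.sqrt(sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    return {"m": m, "b": b, "s_y": s_y, "n": n, "y_mean": y_mean, "sxx": sxx}

def inverse_predict(fit, y_measured, k):
    """Derived x and its uncertainty when y_measured is the mean of k replicates."""
    m = fit["m"]
    x = (y_measured - fit["b"]) / m
    s_x = (fit["s_y"] / abs(m)) * math.sqrt(
        1 / k + 1 / fit["n"] + (y_measured - fit["y_mean"]) ** 2 / (m ** 2 * fit["sxx"])
    )
    return x, s_x

# Calibration standards from post 9 (assumption: the blank is part of the fit).
conc_ppm = [0.000, 2.044, 4.088, 6.132]
absorbance = [0.000, 0.268, 0.509, 0.723]
fit = fit_line(conc_ppm, absorbance)
print(f"m = {fit['m']:.3f}, b = {fit['b']:.3f}, s_y = {fit['s_y']:.3f}")

# Brass absorbances from post 15, each assumed to be the mean of k = 3 readings.
results = []
for y in (0.521, 0.511, 0.524):
    x, s_x = inverse_predict(fit, y, k=3)
    results.append((y, x, s_x))
    print(f"A = {y:.3f}: Cu = {x:.2f} +/- {s_x:.2f} ppm")
```

For the first sample this gives roughly 4.30 ppm with sx near 0.13 ppm. Each brass solution gets its own sx because the (y - mean y) term differs from sample to sample.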

Last edited: Jan 27, 2013

17. Jan 27, 2013

### nerdy_hottie

No, you're understanding correctly.
But now I see the place of my confusion in the first place. I thought that sx would be only one value for the whole data set. Now I see that (for this data set), there will be three separate values of sx.
However, going back to the first example, where the measured value of y was 2.72: I know that it's a given number and not calculated in any way, but where does the value come from? It is not a value in the list of y values (the only values are 2, 3, 4, 5), so where does it come from, you know what I mean? And in this example in the book, there is only one value for sx, for the range of all the data. So I'm not saying you're wrong, but it's just that my book only has one value for the whole set.

18. Jan 27, 2013

### I like Serena

Well, I can only assume that the example in your book had 4 measurements with known concentrations, and 1 measurement for an unknown concentration.
The sx would be for that one unknown concentration.

But... there may be more than one sx mentioned.
When you do a linear regression, you can also determine another sx.
For instance $s_x=\sqrt{\sum (x_i - \bar x)^2 \over n-1}$.
This could be part of the calculation of m.
Either way, this sx would have no purpose for you.

19. Jan 27, 2013

### nerdy_hottie

Alright thanks for all the help, I know what I'm going to do now, whether it's right or not, I don't care much at this point. Most of our mark is based on how accurate our results are, so if this is wrong or not it shouldn't affect my mark a whole lot.

20. Jan 27, 2013

### I like Serena

Would you mind letting me know how it ends?
