1. Jun 16, 2011

### hoodrych

Before you begin reading, I want to say thank you for helping, or attempting to help me. I really appreciate any help you can give me! Also, warning: Wall of Text approaching fast.

EDIT: I've been informed by Stephen Tashi on this forum that I am misusing the term Confidence Interval. What I am actually looking for appears to be the "Prediction Interval". The person who assigned me this task used the term CI but it looks like it is misused all the time, and that he meant PI.

Objectives:
• Quantify the variability of exchange rates between system #1 and system #2
• Determine best-fit distribution
• Find 95% confidence interval

Please note: I am doing this in Excel, so I will be using Excel functions to calculate stdev, etc. But if you aren't familiar with Excel, you can still be of help to me! The concepts are what I need help with!

Here is an explanation of the data I was given and what I have done so far:
System #1 and #2 BOTH have two lists of 48 numbers. One list being the closing market price in USD and the other list the closing market price in the native currency.
So I have "Sys#1 USD", "Sys#1 Native", "Sys#2 USD" and "Sys#2 Native" columns of 48 values each. System #1 is the actual closing mkt price, while System #2 is the one we are testing to see how much it differs from the correct values.

(1.)
I found the exchange rate to the dollar for each system by simply dividing Native/USD for the corresponding system.

(2.)
I then found the percent error of the foreign exchange rates of System 2 compared to System 1.

(Image of the percent error formula; link broken: http://www.pstcc.edu/departments/natural_behavioral_sciences/E2010D0101.gif)
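Steps (1.) and (2.) could be sketched roughly as follows. Since the formula image is no longer available, this assumes the usual percent error definition, |test − reference| / |reference| × 100, with System #1 as the reference. The prices are made up for illustration and stand in for the real 48-row columns; the function names are hypothetical.

```python
# Hypothetical closing prices; the real data would have 48 rows per column.
sys1_usd = [10.00, 12.50, 8.00]
sys1_native = [13.00, 16.30, 10.45]
sys2_usd = [10.00, 12.50, 8.00]
sys2_native = [13.05, 16.25, 10.50]

def exchange_rates(native, usd):
    """Native currency units per US dollar for each row (step 1)."""
    return [n / u for n, u in zip(native, usd)]

def percent_errors(test, reference):
    """Percent error of each test rate against the reference rate (step 2).

    Assumes the absolute-value form of percent error; drop abs() on the
    numerator if a signed error is wanted instead.
    """
    return [abs(t - r) / abs(r) * 100 for t, r in zip(test, reference)]

rate1 = exchange_rates(sys1_native, sys1_usd)   # System #1: the reference
rate2 = exchange_rates(sys2_native, sys2_usd)   # System #2: under test
pct_err = percent_errors(rate2, rate1)
print(pct_err)
```

Whether the error should be signed or absolute matters later: a signed error can legitimately be negative, while an absolute error cannot.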

(3.)
I proceeded to find the standard deviation and mean using the simple functions Excel comes equipped with. Excel functions below.
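For readers without Excel, step (3.) might look like this in Python. The post does not say which Excel functions were used, so the comparison to =AVERAGE and =STDEV is an assumption, and the percent errors are hypothetical placeholders for the 48 real values.

```python
import statistics

# Hypothetical percent errors standing in for the 48 real values.
pct_err = [0.12, 0.45, 0.30, 0.08, 0.61, 0.27, 0.39, 0.33]

mean = statistics.mean(pct_err)     # comparable to Excel's =AVERAGE(range)
stdev = statistics.stdev(pct_err)   # sample standard deviation (divides by n - 1),
                                    # comparable to Excel's =STDEV(range)
print(mean, stdev)
```

The sample (n − 1) version is the appropriate one here, because the 48 days are a sample of the ongoing process rather than the whole population.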

(4.)
I was informed that the =CONFIDENCE function in Excel is actually NOT what I want, because it calculates a CI for the true mean of all future data, and I do not know the true mean of all future data, only the mean of my sample of 48 days.

I was told by my coworker to use the =NORMINV(probability,mean,standard_dev) function. To my understanding, this method "fits a normal distribution to the data and then makes a prediction assuming that this fit is correct."
I'm not sure whether my data is normally distributed, so I don't know if I can use =NORMINV.

So basically, how do I calculate a 95% confidence interval of this data and determine the best fit distribution? Should I be using =NORMINV?
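One informal way to judge whether a normal fit is plausible is to compare the sorted data against the quantiles a normal distribution would produce (the idea behind a normal Q-Q plot). The sketch below is only a rough screen, not a formal test; the function name, the plotting positions (i − 0.5)/n, and the sample values are all assumptions made for illustration.

```python
from statistics import NormalDist, mean

def qq_correlation(data):
    """Correlation between sorted data and matching standard-normal quantiles.

    Values close to 1 suggest the data line up well with a normal shape;
    clearly lower values suggest skew or heavy tails.
    """
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i - 0.5) / n avoid asking for the 0th or 100th percentile.
    zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mz = mean(xs), mean(zs)
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_z = sum((z - mz) ** 2 for z in zs)
    return cov / (var_x * var_z) ** 0.5

# Hypothetical percent errors standing in for the 48 real values.
sample = [0.12, 0.45, 0.30, 0.08, 0.61, 0.27, 0.39, 0.33]
print(qq_correlation(sample))
```

With only 48 points, a check like this can rule out a badly non-normal shape but cannot prove normality.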

Thank you so much for your help!

Last edited by a moderator: May 5, 2017
2. Jun 17, 2011

### hoodrych

I was informed that I want Prediction Interval by my coworker. I no longer need the Confidence Interval.

Yet I still went ahead and calculated the values I would get if I used the NORMINV function.

=NORMINV(0.025, 0.317153, 0.23099973) = -0.135597829
=NORMINV(0.975, 0.317153322, 0.2309997) = 0.769904
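For anyone checking these numbers outside Excel, Python's standard library has an equivalent inverse normal CDF. This sketch uses the mean and standard deviation quoted in the formulas above; the variable names are my own.

```python
from statistics import NormalDist

mean, stdev = 0.317153322, 0.2309997   # sample mean and st. dev. quoted above

fitted = NormalDist(mu=mean, sigma=stdev)
lower = fitted.inv_cdf(0.025)   # counterpart of =NORMINV(0.025, mean, stdev)
upper = fitted.inv_cdf(0.975)   # counterpart of =NORMINV(0.975, mean, stdev)
print(lower, upper)
```

Both bounds are just mean ± about 1.96 standard deviations, which is why the interval is symmetric around the mean.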

So I REALLY still need to find out how to calculate and interpret the PI, but I'm curious how to use these two values.

What would my claim be? And under what conditions could I make that claim?
Here's my attempt at interpreting the NORMINV result:
That under the condition that our sample's distribution correctly represents the true population's distribution (what is the true population though? there isn't one, the values are created every day, so n is infinite), 95% of future values will fall in between -.135597829 and .769904?
^probably not right.
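For comparison, one common textbook form of a prediction interval for a single future value is mean ± t × s × sqrt(1 + 1/n), which widens the NORMINV range to account for the mean and standard deviation being estimated from only 48 points. This sketch assumes the values are independent and normally distributed, and it hard-codes t ≈ 2.0117 as an approximate table value for 97.5% with 47 degrees of freedom (in Excel this would come from something like =TINV(0.05, 47)).

```python
import math

n = 48
mean, stdev = 0.317153322, 0.2309997   # sample statistics quoted in this thread

# Assumption: approximate t-table value for the 97.5th percentile, 47 d.f.
T_975_DF47 = 2.0117

# The sqrt(1 + 1/n) factor adds the uncertainty in the estimated mean
# on top of the spread of an individual future value.
half_width = T_975_DF47 * stdev * math.sqrt(1 + 1 / n)
pi_lower = mean - half_width
pi_upper = mean + half_width
print(pi_lower, pi_upper)
```

With n = 48 the result is only slightly wider than the plain NORMINV range, so the two approaches tell much the same story here.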

3. Jun 20, 2011

### Stephen Tashi

I think you'll get more answers if you write more concisely. Don't put so many remarks and side questions in with the main issues. If you have 100 questions, it's best to begin with the first one or two.

I think your claim is basically correct. However, you should not say "95% of future values will fall..."; you should say "there is a 95% probability that a future value will fall...". The percentage of the time that something actually happens does not have to equal its probability.
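A small simulation can illustrate the difference between the probability and the observed percentage. This sketch assumes, purely for illustration, that the fitted normal distribution is exactly right, then counts how many simulated values in each batch of 48 land inside the 2.5% to 97.5% range; the seed and batch sizes are arbitrary choices.

```python
import random
from statistics import NormalDist

mean, stdev = 0.317153322, 0.2309997   # statistics quoted earlier in the thread
fitted = NormalDist(mean, stdev)
lower, upper = fitted.inv_cdf(0.025), fitted.inv_cdf(0.975)

rng = random.Random(0)   # fixed seed so the run is repeatable

def fraction_inside(n_values):
    """Share of n_values simulated draws that land in [lower, upper]."""
    draws = [rng.gauss(mean, stdev) for _ in range(n_values)]
    return sum(lower <= d <= upper for d in draws) / n_values

# Each batch of 48 has a 95% chance per value, yet the realized share varies.
batches = [fraction_inside(48) for _ in range(5)]
print(batches)

# Only over a very long run does the share settle near 0.95.
print(fraction_inside(100_000))
```

Individual batches of 48 will generally not show exactly 95% inside the range, even though every single value has a 95% chance of falling there.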

In practical statistics, most populations are infinite. So that's no problem.

I think a reader of your report would also be interested in a prediction interval based on the PERCENTILE function that bpet suggested in another thread. There is nothing wrong with presenting several different prediction intervals based on different assumptions if you make clear what those assumptions are.
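A percentile-based interval makes no normality assumption; it simply reads the 2.5th and 97.5th percentiles off the sorted data. The sketch below uses linear interpolation between ranks, which is how I understand Excel's PERCENTILE to behave; the function name and the sample values are hypothetical.

```python
import math

def percentile_inc(data, p):
    """Linearly interpolated percentile of data, with p between 0 and 1."""
    xs = sorted(data)
    h = (len(xs) - 1) * p            # fractional index into the sorted data
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)    # clamp so p = 1 returns the maximum
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

# Hypothetical percent errors standing in for the 48 real values.
pct_err = [0.12, 0.45, 0.30, 0.08, 0.61, 0.27, 0.39, 0.33]

pi_lower = percentile_inc(pct_err, 0.025)   # like =PERCENTILE(range, 0.025)
pi_upper = percentile_inc(pct_err, 0.975)   # like =PERCENTILE(range, 0.975)
print(pi_lower, pi_upper)
```

With only 48 observations the extreme percentiles rest on one or two data points each, so this interval is sensitive to individual outliers.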