How to Determine a 95% Confidence Interval and Best-Fit Distribution in Excel?

  • Context: Undergrad 
  • Thread starter Thread starter hoodrych
  • Start date Start date
Click For Summary
SUMMARY

This discussion focuses on calculating a 95% Prediction Interval (PI) and determining the best-fit distribution using Excel. The user initially confused the term Confidence Interval (CI) with Prediction Interval, as clarified by Stephen Tashi. The user has two datasets of 48 values each representing exchange rates and has utilized Excel functions such as =STDEV, =AVERAGE, and =NORMINV to analyze the data. The main goal is to ascertain the variability between two systems and accurately interpret the results of the prediction interval.

PREREQUISITES
  • Understanding of Excel functions: =STDEV, =AVERAGE, =NORMINV
  • Knowledge of statistical concepts: Prediction Interval vs. Confidence Interval
  • Familiarity with normal distribution and its properties
  • Basic grasp of exchange rate calculations and percent error
NEXT STEPS
  • Learn how to calculate Prediction Intervals using the PERCENTILE function in Excel
  • Explore the implications of normal distribution assumptions in statistical analysis
  • Research methods for determining best-fit distributions for datasets
  • Study the differences between Confidence Intervals and Prediction Intervals in depth
USEFUL FOR

Data analysts, financial analysts, and anyone involved in statistical analysis using Excel, particularly those working with exchange rate data and prediction modeling.

hoodrych
Messages
17
Reaction score
0
Before you begin reading, I want to say thank you for helping, or attempting to help me. I really appreciate any help you can give me! Also, warning: Wall of Text approaching fast.


EDIT: I've been informed by Stephen Tashi on this forum that I am misusing the term Confidence Interval. What I am actually looking for appears to be the "Prediction Interval". The person who assigned me this task used the term CI but it looks like it is misused all the time, and that he meant PI.




Objectives:
  • Quantify the variability of exchange rates between system #1 and system #2
  • Determine bestfit distribution
  • Find 95% confidence interval

Please note: I am doing this in excel, so I will be using excel functions to calculate stdev ect. But if you aren't familiar with excel you can still be of help to me! The concepts are what I need help with!

Here an explanation of the data I was given and what I have done so far:
System #1 and #2 BOTH have two lists of 48 numbers. One list being the closing market price in USD and the other list the closing market price in the native currency.
So I have "Sys#1 USD", "Sys#1 Native", "Sys#2 USD" and "Sys#2 Native" columns of 48 values each. System #1 is the actual closing mkt price, while System #2 is the one we are testing to see how much it differs from the correct values.


(1.)
I found the exchange rate to the dollar for each system by simply dividing Native/USD for the corresponding system.

(#1Native/#1USD) = (#1Native/#1 $1 USD) and (#2Native/#2USD) = (#2Native/#2 $1 USD)


(2.)
I then found the percent error of the foreign exchange rates of System 2 compared to System 1.

[PLAIN]http://www.pstcc.edu/departments/natural_behavioral_sciences/E2010D0101.gif


[ ( (abs[(#2Native/#2 $1USD) - (#1Native/#1 $1USD)]) / (#1Native/#1 $1USD) ] * 100


(3.)
I proceeded to find the standard deviation and mean using the simple functions excel comes equipped with. Excel functions below.

=STDEV(values)
=AVERAGE(values)

(4.)
I was informed that using the =CONFIDENCE function in excel was actually NOT what I want because it calculated the CI with the true mean of all future data, and I do not know the true mean value of all future data, only of my sample of 48 days.

I was told to use the =NORMINV(probability,mean,standard_dev) function by my coworker. To my understanding, this method "fits a normal distribution to the data and then makes a prediction assuming that this fit is correct."
I'm not sure if my data is a normal distribution, so do not know if I can use =NORMINV?


So basically, how do I calculate a 95% confidence interval of this data and determine the best fit distribution? Should I be using =NORMINV?

Thank you so much for your help!
 
Last edited by a moderator:
Physics news on Phys.org
I was informed that I want Prediction Interval by my coworker. I no longer need the Confidence Interval.Yet I still went ahead and calculated the values I would get if I used the NORMINV function.

=NORMINV(.025, .317153, .23099973) = -.135597829
=NORMINV(0.975,0.317153322, .2309997) = .769904

So I REALLY still need to find out how to calculate and interpret the PI, but I'm curious how to use these two values.

What would my claim be? And under what conditions could i make that claim?
Here's my attempt and interpreting the NORMINV-
That under the condition that our sample's distribution correctly represents the true population's distribution (what is the true population though? there isn't one, the values are created every day, so n is infinite), 95% of future values will fall in between -.135597829 and .769904?
^probably not right.
 
hoodrych said:
=NORMINV(.025, .317153, .23099973) = -.135597829
=NORMINV(0.975,0.317153322, .2309997) = .769904
...
What would my claim be?
...
That under the condition that our sample's distribution correctly represents the true population's distribution (what is the true population though? there isn't one, the values are created every day, so n is infinite), 95% of future values will fall in between -.135597829 and .769904?
^probably not right.

I think you'll get more answers if you write more concisely. Don't put so many remarks and side questions in with the main issues. If you have a 100 questions, it's best to begin with the first one or two.

I think your claim is basically correct. You should not say "95% of future values will fall..." you should say "there is a 95% probability that a future value will fall..." The percentage that something actually happens does not have to equal its probability.

In practical statistics, most populations are infinite. So that's no problem.

I think a reader of your report would also be interested in a prediction interval based on the PERCENTILE function that bpet suggested in another thread. There is nothing wrong with presenting several different prediction intervals based on different assumptions if you make clear what those assumptions are.
 

Similar threads

Replies
6
Views
3K
  • · Replies 3 ·
Replies
3
Views
1K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 7 ·
Replies
7
Views
4K
  • · Replies 6 ·
Replies
6
Views
4K
  • · Replies 2 ·
Replies
2
Views
4K
  • · Replies 22 ·
Replies
22
Views
7K
  • · Replies 2 ·
Replies
2
Views
3K