
Statistical significance in experimentally obtained data sets

  1. Jun 17, 2013 #1
    I have a set of data that was recorded from an engine that we are testing. We've noticed lately that a particular pressure value will sometimes spike with no apparent explanation, as seen in the attached graph. The pressure in question is passively regulated by a pump, but it is also dependent on operating factors in the engine. I'm trying to narrow down whether this spike is indicative of a problem with the pump or if there's something else wrong with the engine to cause this spike. Plotting the pressure against other measurements hasn't really helped, but I wanted to see if there's a way I can statistically measure the significance of changes that occur, or whether it's just typical variation.

    I recall from Stats that my professor used Student's t-distribution to test whether the difference between two data sets (in my case, before and after the pressure spike) was statistically significant, but when I tried it in Excel (using this tutorial), it gave me exactly a 100% probability that this pressure spike occurred by chance (we know from testing that's not the case). I've since read that the t-distribution is typically used when the population standard deviation isn't directly measurable, but I do know that from the data, so is there another way to go about this?

    EDIT: Just to clarify, to obtain my t value I set mu = average pressure before the spike, xbar = average pressure after the spike, s = standard deviation of the pressure after the spike, and n = the number of data points after the spike (201 in my case). I plugged those into the t-statistic formula, t = (xbar - mu)/(s/sqrt(n)), and got 213.4, and the t-distribution probability for that is 1.
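    The same computation is easy to check outside Excel. A minimal sketch (the values of mu, xbar, and s below are made-up stand-ins for the real data; with 200 degrees of freedom the t-distribution is so close to normal that the standard normal tail is a fair approximation):

    ```python
    import math
    from statistics import NormalDist

    # Made-up stand-ins for the real pressure data
    mu = 100.0    # mean pressure before the spike (hypothesized mean)
    xbar = 115.0  # mean pressure after the spike
    s = 1.0       # std dev of the post-spike samples
    n = 201       # number of post-spike samples

    t = (xbar - mu) / (s / math.sqrt(n))

    # With df = 200 the t-distribution is essentially normal, so the
    # standard normal is a fair stand-in for the tail areas.
    p_right = 1.0 - NormalDist().cdf(t)  # the tail the test actually needs: ~0
    p_left = NormalDist().cdf(t)         # the area Excel reported: ~1
    print(t, p_right, p_left)
    ```

    The point is that a huge t-statistic puts essentially all of the distribution's area to its left, so the left-tail number is ~1 while the p-value (the right tail) is ~0.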

    Attached Files: (graph of the pressure value vs. time)

    Last edited: Jun 17, 2013
  3. Jun 17, 2013 #2


    Staff Emeritus
    Science Advisor
    Gold Member

    You interpreted your T-test backwards. That 100% probability number says that if the data sampled after the spike was drawn from the same distribution as the data before the spike, then there is a 100% probability you would have seen a smaller T-value than 213.4. So the probability you would see 213.4 or greater is essentially 0, which means that this is NOT just random noise.
  4. Jun 17, 2013 #3


    Homework Helper

    "I plugged that into the T-value formula and got 213.4, and the T-dist probability for that is 1."

    Excel is a horrible tool for statistical analysis of any kind, for many reasons. Here you've run into an old "feature" of the program that is, in essence, a foolish way of reporting p-values. If you sketch the t-distribution and locate your t-value, you'll see it lies far to the right of the distribution. Depending on your alternative hypothesis, the p-value is either the area to the right of this t, or twice the area to the right of this t. The number Excel gives is the area to its *left*, which is why you saw 1; the actual p-value is essentially 0.

    But I'm not sure this is the analysis you should do. The measurements are clearly not independent, since they come from the same mechanism, and t-tests require independent values.
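    The dependence between successive measurements is easy to check numerically with a lag-1 autocorrelation. A minimal sketch (the series below are synthetic; the real check would use the logged pressure values):

    ```python
    def lag1_autocorr(x):
        """Lag-1 autocorrelation: near 0 for independent samples,
        near +/-1 when each sample strongly depends on the previous one."""
        n = len(x)
        m = sum(x) / n
        num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
        den = sum((xi - m) ** 2 for xi in x)
        return num / den

    # A slowly drifting trace (like a pressure reading sampled once per
    # second) is strongly autocorrelated; an alternating one is not.
    drifting = [0.1 * i for i in range(100)]
    alternating = [1.0, -1.0] * 50
    print(lag1_autocorr(drifting), lag1_autocorr(alternating))
    ```

    If the pressure trace shows strong autocorrelation, the effective sample size is much smaller than 201 and the t-test's independence assumption is violated.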
    What is graphed on the horizontal axis of the plot you attached?
  5. Jun 17, 2013 #4
    My mistake, I will keep that in mind!

    Time, measurements are recorded once per second.