Statistical test for comparing two error signals

  • #1
aydos
Problem: I have a sensor monitoring a process which is controlled by a feedback controller. This sensor fails from time to time and I need to replace it with a new one. I have always used the same type of sensor, say type A. Some sensor manufacturers are offering me an alternative sensor technology, say sensor type B, to measure the same process with the same theoretical signal characteristics. This needs to be tested, though. I cannot afford to simply replace sensor type A with sensor type B and see how the controller performs. What I can do is install sensor type B to monitor the same process while it remains "offline" (not used by the controller). This allows me to monitor both signals in parallel. By comparing the two signals, how can I determine if sensor type B will not produce negative impacts in my controller?

Current plan:
I am planning to trial sensor type B by installing three sensors monitoring the same process:
Sensor 1) Sensor type A, this is the reference sensor
Sensor 2) Sensor type A, this is candidate #1
Sensor 3) Sensor type B, this is candidate #2
Generate two error time series: the first is the error between candidate #1 and the reference, the second is the error between candidate #2 and the reference. The conclusion of a successful trial should be that the difference between the two error series is statistically insignificant.
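For concreteness, here is a minimal sketch of how the two error series could be built from time-aligned logs of the three sensors. The arrays and noise levels below are simulated stand-ins, not real trial data.

```python
import numpy as np

# Simulated stand-ins for time-aligned readings of the three sensors.
rng = np.random.default_rng(0)
true_signal = np.sin(np.linspace(0, 20, 1000))
ref = true_signal + rng.normal(0, 0.05, 1000)      # sensor 1: type A, reference
cand_a = true_signal + rng.normal(0, 0.05, 1000)   # sensor 2: type A, candidate #1
cand_b = true_signal + rng.normal(0, 0.07, 1000)   # sensor 3: type B, candidate #2

err_a = cand_a - ref   # error series 1: candidate #1 vs reference
err_b = cand_b - ref   # error series 2: candidate #2 vs reference
```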

Statistical test:
My first thought was to use a Student's t-test to compare the error signals, but I understand the t-test only tests for differences in mean values. I suspect I also need to know whether the error variances are the same.

Questions:
- Will the F-test provide a test that is sensitive to both mean and variance differences?
- Would anyone suggest an alternative approach?
- I am collecting data for a 24-hour period. During this time, the process operates under 5 different regimes. Should I break up the time series into 5 segments and run separate tests?
- Will this split approach change the test criteria?

Other info:
- The error signals have a good approximation to a normal distribution
- The measurement noise is not time-correlated
 
  • #2
Using statistics to do hypothesis testing is a subjective procedure. A person with experience solving problems exactly like yours might be able to say what methods work well. I can't. I can only offer a few comments.

aydos said:
By comparing the two signals, how can I determine if sensor type B will not produce negative impacts in my controller?

One consideration is whether your controller is an "integrating" controller that uses the average output of the sensor over a window of time to do its computation (or, if it's an analog controller, it might do the electronic equivalent of integration). If that is the case, then the time average of sensor B vs. sensor A will be the crucial signal.
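As a rough illustration of that point, one could compare the two sensors through a moving average over the controller's window, since that would be the signal the controller actually acts on. The window length below is a placeholder, and ref / cand_b refer to the simulated arrays from the sketch in post #1.

```python
import numpy as np

def moving_average(x, window=60):
    # Simple boxcar average as a stand-in for the controller's integration window.
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

# Difference between the window-averaged candidate-B signal and the reference.
avg_diff = moving_average(cand_b) - moving_average(ref)
```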
Statistical test:
My first thought was to use a Student's t-test to compare the error signals, but I understand the t-test only tests for differences in mean values. I suspect I also need to know whether the error variances are the same.

You can test for the equality of variances first and, if the data passes, do the t-test next. (The "significance level" of the two-step procedure is not the same as the "significance level" assigned to each step.)
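A minimal sketch of that two-step procedure, using scipy and assuming err_a and err_b are the two error series from the trial (Levene's test is used here as the variance check; a raw F statistic could be substituted):

```python
from scipy import stats

alpha = 0.05  # per-step significance level; the overall level differs, as noted above

# Step 1: test for equal variances.
stat_var, p_var = stats.levene(err_a, err_b)

# Step 2: compare means, pooling variances only if step 1 did not reject.
if p_var > alpha:
    stat_t, p_t = stats.ttest_ind(err_a, err_b, equal_var=True)   # pooled t-test
else:
    stat_t, p_t = stats.ttest_ind(err_a, err_b, equal_var=False)  # Welch's t-test
```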

Questions:
- Will the F-test provide a test that is sensitive to both mean and variance differences?

The word "sensitive" is, as far as I know, a subjective term. The quantification of the sensitivity of a test in frequentist statistics is done by computing "power" functions for the statistic. For example, suppose you decide the significance level is 0.05. We can imagine a simulation that takes paired samples (a,b) from two distributions. One distribution has mean 0 and standard deviation 1. The other distribution has mean x and standard deviation 1+y where x and y are held constant while the sampling is done. From the samples, we compute the probability that an F test would judge the two distributions different at the significance level of 0.05. By repeating this simulation for various values of (x,y) we create a power function for the test. Roughly speaking, this shows how well the test does in detecting deviations of various sizes from the null hypothesis. How you use such a curve to make a decision about what to do is subjective.
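As a sketch of how such a power function could be estimated by simulation (the sample size, grid of y values, and number of repetitions below are arbitrary; note that the variance-ratio F test does not depend on the mean shift x, so only y is varied here):

```python
import numpy as np
from scipy import stats

def f_test_rejects(a, b, alpha=0.05):
    # Two-sided F test for equality of variances of two samples.
    f = np.var(a, ddof=1) / np.var(b, ddof=1)
    dfa, dfb = len(a) - 1, len(b) - 1
    p = 2 * min(stats.f.cdf(f, dfa, dfb), stats.f.sf(f, dfa, dfb))
    return p < alpha

rng = np.random.default_rng(1)
n = 100        # samples per signal
trials = 2000  # Monte Carlo repetitions per grid point

for y in [0.0, 0.1, 0.2, 0.3, 0.5]:
    rejections = sum(
        f_test_rejects(rng.normal(0, 1, n), rng.normal(0, 1 + y, n))
        for _ in range(trials)
    )
    print(f"sigma ratio {1 + y:.1f}: estimated power {rejections / trials:.3f}")
```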

An interesting twist to your problem is that if sensor B has a smaller standard deviation than sensor A, this might indicate that it's a better sensor. However, if your controller has an algorithm that attempts to compensate for sensor noise, it might be better tuned to working with sensor A than sensor B.

I Googled briefly for power curves for the F-test, but didn't find any simple results on that topic. Perhaps you can.

- Would anyone suggest an alternative approach?

I advocate using computer simulations, when possible. However, I don't know if the number of variables in your problem and the algorithm used by the controller are known and simple enough to simulate.
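If the loop can be reduced to something simple, even a toy simulation can be informative. The sketch below is only an illustration of the idea: a first-order plant under proportional control, run once with sensor-A-like noise and once with sensor-B-like noise, so the resulting control errors can be compared. The plant model, gains, and noise levels are placeholders, not the actual system.

```python
import numpy as np

def run_loop(noise_sigma, n_steps=5000, setpoint=1.0, kp=0.5, dt=0.1, tau=2.0, seed=0):
    # Discretized first-order plant with a proportional controller acting on
    # a noisy measurement of the state.
    rng = np.random.default_rng(seed)
    x = 0.0
    errors = np.empty(n_steps)
    for k in range(n_steps):
        measured = x + rng.normal(0, noise_sigma)   # sensor reading
        u = kp * (setpoint - measured)              # proportional control action
        x += dt * (-x / tau + u)                    # plant update
        errors[k] = setpoint - x                    # true control error
    return errors

err_with_a = run_loop(noise_sigma=0.05)
err_with_b = run_loop(noise_sigma=0.07)
print(np.std(err_with_a), np.std(err_with_b))
```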

- I am collecting data for a 24-hour period. During this time, the process operates under 5 different regimes. Should I break up the time series into 5 segments and run separate tests?

I'd say yes, split it up. I heard one lecturer in statistics say "Stratification never hurts. Always stratify when you can." (In statistics, "stratification" refers to analyzing different regimes separately and, if desired, combining the results to represent a mixed population.)
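A sketch of what that could look like in code, assuming a regime array of labels (0 to 4) aligned with the error series and taken from the process logs:

```python
import numpy as np
from scipy import stats

def compare_per_regime(err_a, err_b, regime, alpha=0.05):
    # Run the variance check and t-test separately within each operating regime.
    results = {}
    for r in np.unique(regime):
        a, b = err_a[regime == r], err_b[regime == r]
        _, p_var = stats.levene(a, b)
        _, p_mean = stats.ttest_ind(a, b, equal_var=(p_var > alpha))
        results[r] = {"p_var": p_var, "p_mean": p_mean, "n": len(a)}
    return results
```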

Will this split approach change the test criteria?

The test criteria are going to be up to you. There won't be a mathematical answer to this unless you can define precisely what you are trying to quantify, maximize or minimize, etc.
 
  • #3
aydos said:
Other info:
- The error signals have a good approximation to a normal distribution
- The measurement noise is not time-correlated
That pretty much goes out the door (and so do your statistical tests) when you are looking for failures. Failures are, by definition, regimes where the sensor does not behave according to spec.

Assuming no failures (and if you are seeing failures in a 24-hour window you have some pretty lousy sensors), all that you will garner from testing is that your sensors behave better than spec. Spec behavior is what manufacturers guarantee. That means that a random sensor that you buy is almost certainly going to outdo spec. Manufacturers don't want to be faced with lawsuits because their spec is one-sigma behavior.

Lacking spec values, what your testing can do is elicit non-failure behavior. You can use this (or spec behavior) to test for failures. Are the sensed values at all consistent with expectations? If they're not, you have a suspect sensor. What to do? That depends on the sensor. You need to have some kind of model of how the sensor fails. Does it go utterly schizo, generating random values? Does it freeze? Does it fail off-scale high / off-scale low? Does it just send a bad value every once in a while, but then return to nominal?
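As a sketch of the kind of plausibility checks that follow from such a failure model, applied to a residual series (sensor reading minus expected value); the scale limits, spike threshold, and run length below are hypothetical and would come from the spec or the trial data:

```python
import numpy as np

def flag_suspect_samples(x, lo=-10.0, hi=10.0, spike_sigma=6.0, freeze_len=50):
    # Flag samples that look like off-scale values, isolated spikes, or a frozen output.
    x = np.asarray(x, dtype=float)
    flags = {}
    flags["off_scale"] = (x < lo) | (x > hi)                       # off-scale low / high
    resid = x - np.median(x)
    mad = np.median(np.abs(resid)) or 1e-12
    flags["spike"] = np.abs(resid) > spike_sigma * 1.4826 * mad    # isolated outliers
    same = np.concatenate(([False], np.diff(x) == 0))              # repeated consecutive values
    run = np.zeros(len(x), dtype=int)
    for i in range(1, len(x)):
        run[i] = run[i - 1] + 1 if same[i] else 0
    flags["frozen"] = run >= freeze_len                            # long constant stretch
    return flags
```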
 

What is a statistical test for comparing two error signals?

A statistical test for comparing two error signals is a method used to determine whether there is a significant difference between the errors produced by two different processes or treatments. It helps scientists make informed decisions about which process or treatment is more effective or accurate.

How does a statistical test for comparing two error signals work?

A statistical test for comparing two error signals works by calculating a test statistic, a numerical value that summarizes the difference between the two error signals. This test statistic is then compared to a critical value based on the chosen level of significance and the degrees of freedom. If the test statistic falls beyond the critical value (in the rejection region), the two error signals are considered significantly different.
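For a two-sample t-test at the 0.05 level, that comparison looks roughly like the following (err_a and err_b stand for two hypothetical error samples):

```python
import numpy as np
from scipy import stats

n1, n2 = len(err_a), len(err_b)

# Pooled variance and the t statistic summarizing the difference in means.
sp2 = ((n1 - 1) * np.var(err_a, ddof=1) + (n2 - 1) * np.var(err_b, ddof=1)) / (n1 + n2 - 2)
t_stat = (np.mean(err_a) - np.mean(err_b)) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-sided critical value at significance level 0.05.
df = n1 + n2 - 2
t_crit = stats.t.ppf(1 - 0.05 / 2, df)

significantly_different = abs(t_stat) > t_crit
```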

What types of statistical tests can be used to compare two error signals?

There are several types of statistical tests that can be used to compare two error signals, including t-tests, ANOVA (analysis of variance), and non-parametric tests such as the Wilcoxon rank-sum test. The choice of test depends on the type of data and the assumptions that can be made about the underlying distribution of the data.
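For reference, all three families of tests mentioned here are available in scipy and can be applied to the same pair of error samples (err_a and err_b as above):

```python
from scipy import stats

t_res = stats.ttest_ind(err_a, err_b)        # two-sample t-test
anova_res = stats.f_oneway(err_a, err_b)     # one-way ANOVA (equivalent to the t-test for two groups)
ranksum_res = stats.ranksums(err_a, err_b)   # Wilcoxon rank-sum (non-parametric)
```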

What are the assumptions of a statistical test for comparing two error signals?

The assumptions of a statistical test for comparing two error signals depend on the specific test being used. Some common assumptions include normality of the data, independence of the data points, and equal variances between the two error signals. It is important for scientists to carefully consider these assumptions before choosing and interpreting the results of a statistical test.
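A sketch of checking those assumptions on the two error series before committing to a test (the lag-1 autocorrelation is only a crude stand-in for a full independence check):

```python
import numpy as np
from scipy import stats

_, p_norm_a = stats.shapiro(err_a)       # normality of each error series
_, p_norm_b = stats.shapiro(err_b)
_, p_eqvar = stats.levene(err_a, err_b)  # equal variances between the two series

def lag1_autocorr(x):
    # Lag-1 autocorrelation as a quick look at time correlation in the noise.
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

r1_a, r1_b = lag1_autocorr(err_a), lag1_autocorr(err_b)
```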

When should a statistical test for comparing two error signals be used?

A statistical test for comparing two error signals should be used when there is a need to determine whether there is a significant difference between the errors produced by two different processes or treatments. This is often the case in scientific research, where scientists want to compare the effectiveness or accuracy of different methods or treatments. It is important to use a statistical test to make an objective and data-driven decision rather than relying on subjective judgments.
