2 samples T test in case of non-normal distribution

uzi kiko · Jun 20, 2018

Hi All

I ran an experiment in which I measured the change in the impedance of a coil while changing an environment parameter X times.
For each change I collected ~20 samples.
So I have a table with X rows, one per value of the parameter, and 20 columns holding the repeated samples.
Now I would like to compare each row with the others and find out whether there is a significant difference between them.
Naturally I was thinking about a T test. Unfortunately the measurement distribution is not normal, although it is symmetric around the mean. (I ran a two-sample Kolmogorov-Smirnov test, and it rejected the hypothesis that my sample came from a normal distribution.) You can find a figure of the distribution here:
https://drive.google.com/open?id=1yp_Ufa4-N8kQD1twVCnHszL9RYLV-_3u

I know that if I had more samples I could average groups of samples until the averages were normally distributed and then use a T test, but I would like to avoid repeating the experiment.

Now (sorry about the long introduction...) my questions are:
1) Do you know of a transformation that I can apply to my data so that I can use a T test?
2) If there is no such transformation, which nonparametric test would you suggest?

Thanks a lot
Mosh
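
For question 2, one nonparametric option is a permutation test on the difference of group means, which does not assume normality. The sketch below is only an illustration of that idea: the function name `permutation_test`, the resample count, and the two example groups are assumptions, not data from the experiment.

```python
import random
from statistics import fmean

def permutation_test(a, b, n_resamples=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an approximate p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # The +1 terms count the observed labeling itself, so p is never 0.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical impedance readings, for illustration only.
baseline = [100.0, 100.2, 99.8, 100.1, 99.9, 100.3, 99.7, 100.0]
shifted = [101.0, 101.2, 100.8, 101.1, 100.9, 101.3, 100.7, 101.0]
print(permutation_test(baseline, shifted))
```

A rank-based test such as Mann-Whitney U would be another candidate; which one fits best depends on what "difference" should mean for the measurement.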

tnich · Jun 20, 2018

uzi kiko said:

I ran an experiment in which I measured the change in the impedance of a coil while changing an environment parameter X times.
For each change I collected ~20 samples.
So I have a table with X rows, one per value of the parameter, and 20 columns holding the repeated samples.
Now I would like to compare each row with the others and find out whether there is a significant difference between them.

This seems an odd way to analyze that sort of data. Presumably your intent in performing the experiment is to test some theory, and your theory includes a mathematical model that you can use to predict the relationship between your environment parameter and your measured impedance. If that is the case, then what parameters of the model do you want to estimate or verify using your data? If not, then why are you doing the experiment?

Dale · Jun 20, 2018

uzi kiko said:

I changed an environment parameter X times.
For each change I collected ~20 samples.
So I have a table with X rows, one per value of the parameter, and 20 columns holding the repeated samples.

Is the environment parameter a numerical value or a categorical value?

uzi kiko · Jun 21, 2018

Thanks a lot for your quick responses.

Regarding tnich's question:
I am now at the next stage of my research.
(You can find my paper here:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0186381)

In the first stage we did indeed build a mathematical model. Now I am at the stage of verifying/identifying my experimental setup, and specifically I would like to understand what the resolution of the setup is.

Regarding Dale's question:
My environment parameter is a numerical value (the volume ratio between blood and brain tissue).

Dale · Jun 21, 2018

uzi kiko said:

My environment parameter is a numerical value (the volume ratio between blood and brain tissue).

Then you want to use a regression approach, not a t test.

uzi kiko · Jun 21, 2018

Thank you Dale,
I agree that if I want to understand the relationship between the impedance and the blood/brain ratio I should use regression.
But the most important parameter that I have to identify is the minimum blood/brain ratio that my device can measure.

I would like to explain how I was thinking of using the T test:
Let's say I have 6 measurements, each containing 20 samples.
The first measurement is my baseline, where there is only brain tissue without any blood.
Now I take the next measurement (let's say with 2 ml of blood). If there is a significant difference between the 2 groups, I can say that my device is sensitive enough to detect a change of 2 ml of blood.
But if there is no significant difference between the 2 groups, I will take the 4 ml sample and compare, and so on.

Do you think that I can do this with regression?

Dale · Jun 21, 2018

The problem is that this approach is strongly dependent on the number of samples. Suppose you use 20 samples and find a significant difference at 4 ml but not at 2 ml. Then, without changing your device you do the same experiment but with 200 samples. In that case you would be likely to find a significant difference at 2 ml also. Has your device become better? No. So this process doesn’t characterize the device.
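
The sample-size dependence can be seen with a little arithmetic. The sketch below assumes a fixed true shift `delta` and a fixed noise level `sigma` (both hypothetical numbers), and uses a normal approximation to the t distribution for the p-value, so the values are only indicative.

```python
import math
from statistics import NormalDist

def expected_statistic(delta, sigma, n_per_group):
    """Expected two-sample test statistic for a true mean shift `delta`
    when each group has `n_per_group` samples with standard deviation `sigma`."""
    standard_error = sigma * math.sqrt(2 / n_per_group)
    return delta / standard_error

def two_sided_p(z):
    """Two-sided p-value under a normal approximation (an assumption here)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical device: the same shift and the same noise in both cases.
delta, sigma = 0.5, 1.0
for n in (20, 200):
    z = expected_statistic(delta, sigma, n)
    print(f"n={n}: statistic={z:.2f}, p={two_sided_p(z):.2g}")
```

Nothing about the device changes between the two rows; only `n` does, yet the shift goes from not significant to clearly significant at the 0.05 level.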

Dale · Jun 21, 2018

Here are my thoughts. First, you want to do the regression so that you get an idea about the relationship between mL of blood and impedance.

Then, you want to characterize the 0 mL blood condition very well. You should acquire as many samples as feasible, and calculate a 95% confidence interval.

Using your regression you can convert that upper 95% limit to a mL blood measurement. That would likely be your best lower threshold. So use that volume (probably round up) as your candidate threshold and acquire a bunch of data at that volume also, and maybe one more slightly larger volume too.

Once you have that, you can do a ROC analysis to determine your best threshold for discriminating between blood and no blood as well as your sensitivity and specificity at that threshold.
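
A bare-bones ROC sweep might look like the sketch below. It assumes that impedance rises when blood is present, picks the "best" threshold by Youden's J (sensitivity + specificity - 1), which is only one possible criterion, and uses invented readings.

```python
def roc_points(negatives, positives):
    """Return (threshold, sensitivity, specificity) for every observed value.

    A reading is classified as "blood present" when it is >= threshold.
    """
    points = []
    for threshold in sorted(set(negatives) | set(positives)):
        sensitivity = sum(x >= threshold for x in positives) / len(positives)
        specificity = sum(x < threshold for x in negatives) / len(negatives)
        points.append((threshold, sensitivity, specificity))
    return points

def best_by_youden(points):
    """Pick the point that maximizes Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)

# Hypothetical readings at 0 mL and at the candidate volume.
no_blood = [99.6, 99.7, 99.8, 99.9, 100.0, 100.1, 100.2, 100.3, 100.4, 100.5]
with_blood = [100.2, 100.6, 100.7, 100.8, 100.9, 101.0, 101.1, 101.2, 101.3, 101.4]

threshold, sens, spec = best_by_youden(roc_points(no_blood, with_blood))
print(f"threshold={threshold}, sensitivity={sens}, specificity={spec}")
```

Unlike the p-value from a T test, the sensitivity and specificity estimated this way do not improve just because more samples are collected; more samples only make the estimates more precise.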

uzi kiko · Jun 21, 2018

Thank you very much!

WWGD · Jun 21, 2018

How about a test of difference of proportions (with the baseline being 0)? It needs simple random sampling, which I think you have, and independence, which I think you have as well, and it is non-parametric. Wouldn't that work?
