As a hypothetical analogy, think of it as measuring the height of the sun in the sky in winter (A) compared with summer (B), in 4 nearby villages (independent observations) over 3 consecutive days (assuming the height is stable across those days).

Since I want to know the true behavior of each replicate, I average its data over the three days (to remove irrelevant fluctuations). This is the key step, but I think it is also a problem.

Now I calculate the mean and the SD of condition A and condition B (based on the 3-day average of each replicate) and want to run a statistical test on the height at t=12:00.
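To make the procedure concrete, here is a minimal sketch in Python of what I am doing. All numbers are made up for illustration, the names `condition_a`, `condition_b`, `replicate_means` and `welch_t` are hypothetical, and I am assuming an unpaired two-sample comparison (Welch's t statistic); none of this is my real data.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up readings at t=12:00: 4 replicates (rows) x 3 consecutive days (columns).
condition_a = [
    [20.1, 19.8, 20.4],
    [21.0, 20.7, 21.2],
    [19.5, 19.9, 19.6],
    [20.6, 20.3, 20.8],
]
condition_b = [
    [22.3, 22.0, 22.6],
    [23.1, 22.8, 23.0],
    [21.9, 22.2, 22.1],
    [22.7, 22.5, 22.9],
]


def replicate_means(days_by_replicate):
    """Collapse the dependent daily readings into one value per replicate."""
    return [mean(days) for days in days_by_replicate]


def welch_t(x, y):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    vx = stdev(x) ** 2 / len(x)
    vy = stdev(y) ** 2 / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


a_means = replicate_means(condition_a)
b_means = replicate_means(condition_b)

# Mean and SD per condition are computed from the 4 replicate averages only.
print("A: mean=%.2f sd=%.2f n=%d" % (mean(a_means), stdev(a_means), len(a_means)))
print("B: mean=%.2f sd=%.2f n=%d" % (mean(b_means), stdev(b_means), len(b_means)))

t_stat, df = welch_t(a_means, b_means)
print("t=%.3f df=%.2f" % (t_stat, df))
```

In this layout the test only ever sees 4 values per condition; the 12 daily readings never enter it as separate observations.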

My question: what is the best way to deal with the data? If I perform a t-test on the averages, each condition has 4 independent replicates, but underneath each replicate there are 3 dependent measurements.

Am I over-inflating the significance by doing that, and if so, how can it be corrected? Any thoughts are welcome.

Example of three days' worth of data, averaged into one value:

A1 ~~~ = ~

A2 ~~~ = ~

A3 ~~~ = ~

A4 ~~~ = ~

B1 ~~~ = ~

B2 ~~~ = ~

B3 ~~~ = ~

B4 ~~~ = ~

# How to deal with averaging before calculating statistics

