Divide observational data into normal periods for study

Rosemary902348 · Feb 26, 2018

When analyzing hydrological and climatological time series or observations, it is common practice to compare statistics computed over normal periods, which the WMO calls "WMO Climatological Normals".

These periods normally consist of 30 years of data. Say we want to compare two normal periods with each other, for example 1951-1980 and 1981-2010.

Now if we compute averages for all the seasons in the period 1951-1980, the last month in this period is December. If we regard December, January, and February as a complete season (winter), we have an incomplete winter at the end of the period 1951-1980. Taking December to represent a whole winter would be inaccurate. A climatic parameter can vary greatly from month to month, and multiplying the December values by 3 wouldn't yield a result based on actual observations. Also, making December the whole winter of 1980 would underestimate the real winter average. This would in turn lower the overall 30-year mean and give a wrong picture of the actual seasonal mean in that normal period.

The same goes for the beginning of the period 1981-2010, where January and February form an incomplete winter.

Now, in order to get complete seasons at the boundary between the normal periods, wouldn't it be best to split the periods as follows: 1951-1980.11.30 and 1980.12.01-1990?

It would also be best to cut out December completely at the end of the period: 1980.12.01-1990

So that it becomes: 1980.12.01-1990.11.30

The same goes for the earlier normal period: 1951-1980.11.30

There we would cut out January and February of 1951, so that we get: 1951.03.01-1980.11.30

In this way we make sure we have complete seasons in both normal periods. If we also compare monthly averages (Jan, Feb, Mar, ...), we would still split the periods in the same manner, so that exactly the same periods are used as for the seasonal averages.

However, once we have split the periods as 1951.03.01-1980.11.30 and 1980.12.01-1990.11.30, is it still normal in academic papers to say that we are comparing the normal periods 1951-1980 and 1981-2010, when there is actually a small overlap, since the second period starts in December 1980?
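
The split described above can be sketched in Python to see how many complete seasons it leaves. This is only an illustration: the names `SEASON_OF_MONTH`, `season_year`, and `seasons_in_period` are made up here, and labelling a winter by the year of its January is an assumption rather than a rule taken from this thread.

```python
from collections import Counter
from datetime import date

# Meteorological seasons by month number (Northern Hemisphere).
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def season_year(year: int, month: int) -> int:
    """Label a month with the year of the season it belongs to.

    Assumption: December is counted with the following January and
    February, so December 1979 belongs to winter 1980.
    """
    return year + 1 if month == 12 else year


def seasons_in_period(start: date, end: date) -> Counter:
    """Count the distinct seasons touched by the months from start to end.

    If start and end sit on season boundaries, every counted season
    is a complete one.
    """
    seen = set()
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        seen.add((SEASON_OF_MONTH[month], season_year(year, month)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return Counter(name for name, _ in seen)


# The trimmed first period: 1951.03.01-1980.11.30
counts = seasons_in_period(date(1951, 3, 1), date(1980, 11, 30))
print(dict(counts))
```

With these bounds the period holds 29 winters but 30 springs, summers, and autumns, so trimming to whole seasons leaves the winters one short.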

If this is not the normal procedure, what is the way to get around this? What do researchers normally do in this case?

In addition, cutting off seasons/months in this manner yields an unequal number of months/seasons. The analysis might have data from more Februaries than Januaries, so the February statistics would be more reliable. Another disadvantage is that cutting out actual observations loses valuable data, which lowers the reliability of the statistics.

scottdave · Feb 26, 2018

In a period of 30 years (360 months), is 2 months going to greatly alter a trend for the period?

jim mcnamara · Feb 26, 2018

Northern Hemisphere only:
By the meteorological calendar, winter traditionally starts on 1 December. The seasons are defined meteorologically as:
spring (1 March, April, 31 May),
summer (1 June, July, 31 August),
autumn (1 September, October, 30 November),
winter (1 December, January, 28(29) February).

If you want seasons aggregated that is how you can do it. Note that winter gets "shorted" on days because of February.
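
As a rough check on how much winter is "shorted", the day counts can be computed with the standard library. The names `SEASON_MONTHS` and `season_length` are invented for this sketch, and labelling a winter by the year of its January is an assumption.

```python
import calendar

# Meteorological seasons as month numbers (Northern Hemisphere).
SEASON_MONTHS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}


def season_length(season: str, season_year: int) -> int:
    """Number of days in a meteorological season.

    Assumption: a winter is labelled by the year of its January, so
    its December comes from the previous calendar year.
    """
    total = 0
    for month in SEASON_MONTHS[season]:
        year = season_year - 1 if month == 12 else season_year
        # monthrange returns (weekday of first day, number of days)
        total += calendar.monthrange(year, month)[1]
    return total


for name in SEASON_MONTHS:
    print(name, season_length(name, 1981))
```

Spring and summer have 92 days, autumn has 91, and winter has 90 (91 when its February falls in a leap year).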

There are new temperature normals calculated every 10 years, using 30 years of data. So your idea of aggregating on a 10 year basis is not what the normals are. You are reusing data, and need to accommodate that. I guess. Assuming I understood what you said... Why? Normals are often considered an interesting kind of longer period running average.
https://www.ncdc.noaa.gov/data-access/land-based-station-data/.../climate-normals

When we worked with this kind of data we went straight to daily readings for each station.

Afraid I am not helping much because I really do not get what you are trying to analyze for - what exact hypothesis are you testing?

Rosemary902348 · Feb 26, 2018

Sorry for not being clearer. I'll try to put it in simpler terms.
I am comparing data from 2 periods. Each period is 30 years long. Let's just call them period 1 and period 2 for simplicity.
Looking just at seasons, we create average values for each season.

So the average for a winter in period 1 would be the average of all the winters in period 1.

Now in period 1, the last winter consists only of December, in other words it is not a complete winter.

So my question is how to handle the data from that last December. How should we weight it: should it have the same weight as each of the complete winters in period 1? Or should one simply exclude the last December completely? What is normally done?

Or is it as simple as this:
the average value for, say, winter 1960 would be made up of:
January 1960, February 1960 and December 1960. That way we have a "complete winter" for 1960, even though the months don't really belong to the same winter: December is far from January and February of the same year, which seems wrong if what matters is accurate, reality-based statistics for one continuous winter.

scottdave · Feb 26, 2018

Instead of averaging days in each winter, then averaging the winters, could you just take all days which are considered winter days and average those? Then each day carries equal weighting.
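
A small sketch of the difference between the two approaches, using made-up numbers rather than observations. The dictionary keys and the values are purely illustrative.

```python
from statistics import mean

# Made-up daily values, NOT real observations.
winters = {
    "complete": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # stands in for a full winter
    "partial": [10.0, 12.0],                     # stands in for a December-only winter
}

# Average each winter first, then average the winter averages:
# the short winter counts as much as the full one.
mean_of_means = mean([mean(days) for days in winters.values()])

# Pool every winter day and average once: each day has equal weight.
pooled_mean = mean([x for days in winters.values() for x in days])

print(mean_of_means, pooled_mean)
```

The two results differ whenever the winters hold different numbers of days, because the mean of means gives the few days of the partial winter the same say as the whole complete winter.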

jim mcnamara · Feb 26, 2018

Normals by convention do not span meteorological winters as you see. Try creating month "normals" first. In fact some analyses used daily data because actual seasons and traditional months do not align well. Some approaches to analysis use 11 November - 11 February as winter. Our Gregorian calendar has strong religious and political roots, like most calendars, so the 11 November approach works approximately with sun angle for a given latitude. Don't assign any great scientific meaning to months.

To answer your question you can have "unbalanced" data sets - like an extra period (group of days which are labelled "month") or missing periods. It just makes things messier. Now how to do it. I'm going to send the bat signal to some folks who have done serious stats more recently than I have:
@StoneTemplePython @Dale
Can we get some help, please? Links:
https://www.ncdc.noaa.gov/cag/statewide/time-series
https://www.ncdc.noaa.gov/news/defining-climate-normals-new-ways

Dale · Feb 26, 2018

I agree with @scottdave above. Don’t take the average of the temperatures in one winter and then average the averages. Take all of the winter days and average them all together.

If you feel that you absolutely must do an average of averages, then just do a weighted average to account for the missing data. If 2/3 is missing from the last winter and 1/3 is missing from the first, then just weight them as 1/3 and 2/3 respectively.
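
A minimal sketch of that weighting, with made-up monthly values and a hypothetical helper `weighted_mean`. It assumes every month counts equally, so a winter's weight is simply the fraction of its three months that is present.

```python
import math


def weighted_mean(values, weights):
    """Weighted average: sum(w * v) / sum(w)."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Made-up monthly values, NOT observations.
first_winter = [2.0, 4.0]        # January and February only (1/3 missing)
full_winter = [3.0, 6.0, 9.0]    # complete winter
last_winter = [12.0]             # December only (2/3 missing)

winter_means = [
    sum(w) / len(w) for w in (first_winter, full_winter, last_winter)
]
weights = [2 / 3, 1.0, 1 / 3]    # fraction of each winter that was observed

weighted = weighted_mean(winter_means, weights)

# Pooling every month directly should give the same number.
all_months = first_winter + full_winter + last_winter
pooled = sum(all_months) / len(all_months)

print(weighted, pooled, math.isclose(weighted, pooled))
```

With weights equal to the observed fraction, the weighted average of averages agrees with the pooled average up to floating-point rounding, so the two suggestions are consistent with each other.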

Rosemary902348 · Feb 26, 2018

Thinking about it now, I just realized that we are looking at the average season over all seasons in the whole period. So I believe the average would be the same regardless of which months are grouped together; it all averages out in the end when one season is made out of all seasons.
We are dealing with daily precipitation, which we sum up for every season/month (because it is more common to give values in mm/month or mm/season when not looking at daily values). Would it still be correct to weight the last winter, which is missing 2/3 of its data, by 1/3? When it comes to temperature I agree. But the last season of precipitation data would have only about one third of a full season's precipitation (just as an example; monthly variation is large in some places, so this is not necessarily true). It should weigh 1/3, yes, I guess.
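
For seasonal totals, one way to read the 1/3 weighting is to divide all observed precipitation by the equivalent number of complete seasons. The sketch below uses made-up data, a hypothetical function `mean_seasonal_total`, and an assumed nominal season length of 90 days. It also assumes the missing days behave like the observed ones, which, as noted, is not necessarily true where monthly variation is large.

```python
import math


def mean_seasonal_total(daily_by_season, nominal_days):
    """Estimate the mean total per season (mm/season) from daily sums.

    daily_by_season: one list of daily totals (mm) per season; an
        incomplete season simply has fewer entries.
    nominal_days: assumed length of a complete season in days.
    """
    total_mm = sum(sum(days) for days in daily_by_season)
    observed_days = sum(len(days) for days in daily_by_season)
    # Each season counts by the fraction of it that was observed.
    equivalent_seasons = observed_days / nominal_days
    return total_mm / equivalent_seasons


# Made-up data: 2.0 mm on every day, NOT observations.
complete_a = [2.0] * 90
complete_b = [2.0] * 90
december_only = [2.0] * 30       # about one third of a winter

seasons = [complete_a, complete_b, december_only]

estimate = mean_seasonal_total(seasons, nominal_days=90)

# Naive alternative: treat the partial total as if it were a full winter.
naive = sum(sum(days) for days in seasons) / len(seasons)

print(estimate, naive)
```

In this example the estimate comes out at about 180 mm/season, matching the complete winters, while the naive mean of the three totals is 140 mm/season because the December-only total is counted as a whole winter.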

olivermsun · Feb 27, 2018

What are you actually trying to do? Compute trends? Compare various averages between the periods?

Rosemary902348 · Feb 28, 2018

Thanks for all the input, this definitely helps me get in the right direction.

olivermsun said:

What are you actually trying to do? Compute trends? Compare various averages between the periods?

In short, I am just trying to get the average for each season in period A and the average for each season in period B. So in period A we end up with 4 average values, and the same for B. The available observation data are daily sums, and the desired unit is [mm/season]. My only slight doubt now is how to weight one specific season if, for example, half a month of data is missing from it, but the previously suggested weighting kind of makes sense.
