
Data analysis - Comparing temperature data from multiple sources

  1. Apr 28, 2014 #1

    Dear Peter,

    I am an engineering student from Japan. I have installed 8 sensors, 4 in a rural area and 4 in an urban area. The sensors measure temperature and other properties as time series. I am now using one month of data to compare the temperature trend between the urban and rural areas. The steps I followed are as follows.
    1) I averaged the data from the 4 sensors at each time step (separately for urban and rural, for each day).

    2) Likewise, I averaged the above values over 1 month at each time step, for both urban and rural.
    3) Now I am comparing the urban and rural data as time series.

    Is this a proper way to compare them, or are there other techniques available?

    In this case, do I need to check the significance of the data? If yes, which significance test would be appropriate?
  3. Apr 28, 2014 #2



    Staff: Mentor

    Who is Peter?

    Where are your sensors located? Far away, or within the same region?
    Do you have uncertainties from your individual datapoints? Do you have calibration uncertainties? Did you test all 8 sensors in the same place at some point?

    Averaging over a whole month and then comparing results is problematic - there is no meaningful way to assign an uncertainty in between those steps. Clearly the temperature values will have large variations due to the day/night cycle and weather effects - but that is common to all sensors.

    For each timestep, you can subtract the average (of all 8 sensors) - that way, your values are just sensitive to temperature differences. Averaging those should give a way more relevant figure, and you can use standard uncertainty propagation to find an uncertainty on the average.
    Be careful with the interpretation, however - a difference between measurements in rural and urban areas does not have to come from rural and urban areas, there are many possible sources for differences.
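    A minimal sketch of the per-timestep approach described above, using only the Python standard library. The function name and the input layout (one list of 4 readings per timestep for each area) are assumptions made for illustration, and the standard error treats the timesteps as independent, which real temperature series usually are not.

```python
import math
import statistics

def mean_urban_rural_difference(urban, rural):
    """urban[t] and rural[t] hold the 4 sensor readings at timestep t (assumed layout)."""
    diffs = []
    for u_row, r_row in zip(urban, rural):
        # Average of all 8 sensors at this timestep (day/night cycle, weather).
        common = statistics.fmean(list(u_row) + list(r_row))
        # Anomalies: each group's mean departure from the common average.
        u_anom = statistics.fmean([x - common for x in u_row])
        r_anom = statistics.fmean([x - common for x in r_row])
        # The common term cancels here; the anomalies themselves are what
        # you would inspect sensor by sensor.
        diffs.append(u_anom - r_anom)
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean, ASSUMING independent timesteps.
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, sem
```

    Because consecutive readings are correlated, the returned standard error is probably too small; it is a lower bound rather than a final uncertainty, and calibration offsets between sensors are not included.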
  4. Apr 29, 2014 #3

    Thank you very much for your reply. I also have a similar problem with remote sensing data for shortwave radiation.

    My points are located about 1 km apart within both the urban area (4 points) and the rural area (4 points), and the urban and rural areas are about 6 km apart on the same plane. I am taking the data from remote sensing. I am analyzing shortwave radiation for daytime only. Because cloud formation differs, the shortwave radiation also varies from point to point. The data are daily time series (6 am to 6 pm).

    To compare rural and urban shortwave radiation, I averaged the time series values of the points at each time, for urban and likewise for rural.

    Urban average(i) 6:00 = (U1+U2+U3+U4)/4
    Urban average(i) 7:00 = (U1+U2+U3+U4)/4
    Urban average(i) 8:00 = (U1+U2+U3+U4)/4
    Rural average(i) 6:00 = (R1+R2+R3+R4)/4
    Rural average(i) 7:00 = (R1+R2+R3+R4)/4
    Rural average(i) 8:00 = (R1+R2+R3+R4)/4
    The urban and rural averages are calculated in the same way, at every time step, for the whole month.

    The values (U1, U2, U3, U4) and (R1, R2, R3, R4) each vary among themselves because of differences in cloud thickness.

    Next, the urban average at each time of day is averaged over the whole month to get Uavg, and likewise for Ravg, both still as time series over the day.
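    The two averaging steps above can be sketched as follows. This is only an illustration with the standard library: the function name and the nested-list layout (day, then hour, then the 4 point values) are assumptions, not the layout of the actual spreadsheet.

```python
from statistics import fmean

def group_means(sw):
    """sw[day][hour] is the list of point values (e.g. U1..U4) for that day and hour."""
    # Step 1: average the points at each hour of each day.
    per_day = [[fmean(points) for points in day] for day in sw]
    # Step 2: average those hourly values over all days, hour by hour.
    n_hours = len(per_day[0])
    monthly = [fmean([day[h] for day in per_day]) for h in range(n_hours)]
    return per_day, monthly
```

    Calling it once with the urban values and once with the rural values gives the per-day averages and the monthly series (Uavg and Ravg in the notation above).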

    I made a virtual data set by picking the maximum over all days at each time of day, which resembles the ideal shape of the shortwave radiation, SWmax.

    To compare the result I used ((SWmax-Uavg)/Uavg*100)% and ((SWmax-Ravg)/SWmax*100)%

    The difference between the percentages is not very large at some times, so I am a little confused about whether these differences are significant, because there is much more variation within the data set. Please suggest how I can compare them properly and in a significant way.

  5. Apr 30, 2014 #4



    Staff: Mentor

    I don't see what those values are supposed to mean. The first one is "how large is the maximum (only urban or both?) compared to the urban average that day", okay (I don't see the interpretation of this value. What does "50%" tell you, for example?). But the second, where you divide by SWmax? "how large is the rural average compared to the maximum, and then take the negative value of this".
  6. Apr 30, 2014 #5
    I took the maximum value at each time of day, combining both urban and rural, over the whole month. I simply used =MAX(..:..) in Excel to get this. The final comparison is in the form of percentages. But the formula I gave above contained a small mistake, sorry.

    To compare the result I used ((SWmax-Uavg)/Uavg*100)% and ((SWmax-Ravg)/SWmax*100)% (Wrong)
    To compare the result I used ((SWmax-Uavg)/SWmax*100)% and ((SWmax-Ravg)/SWmax*100)%(Correct)
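    A short sketch of the corrected formula, for illustration only. It assumes the same nested layout as the spreadsheet description (day, then hour, then 4 point values), takes SWmax as the largest individual point value at each hour over all days and both areas, and assumes SWmax is never zero; the function name is made up.

```python
from statistics import fmean

def percent_reduction(urban, rural):
    """urban[day][hour] and rural[day][hour] are lists of point values (assumed layout)."""
    n_hours = len(urban[0])
    result = []
    for h in range(n_hours):
        # All values observed at this hour, over every day and point.
        u_vals = [v for day in urban for v in day[h]]
        r_vals = [v for day in rural for v in day[h]]
        # SWmax: maximum at this hour over both areas and the whole month.
        sw_max = max(u_vals + r_vals)
        # With the same number of points every day, the mean of all values
        # equals the monthly mean of the daily averages (Uavg, Ravg).
        u_pct = (sw_max - fmean(u_vals)) / sw_max * 100
        r_pct = (sw_max - fmean(r_vals)) / sw_max * 100
        result.append((sw_max, u_pct, r_pct))
    return result
```

    Both percentages share the same denominator, so their difference is just (Ravg - Uavg)/SWmax*100 and can be compared directly.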

    Actually, the percentage gives the fractional decrease in shortwave flux. 50% means that there is cloud thick enough to block 50% of the shortwave radiation at that time.

    In my case I got less than 1% difference between the urban and rural percentages. Can I draw a conclusion from such a small difference? And do I need to check the significance of the raw data? When I ran a t-test for each time, the percentage came out to be more than 40%, and at some times even 90%.
  7. May 1, 2014 #6



    Staff: Mentor

    This is hard to tell without knowing the actual setup and seeing the data, but I guess 1% is below multiple systematic effects, maybe even below the statistical effects.
    What is the significance of data? Significance of what?
  8. May 1, 2014 #7
    I am told that, to validate the data and to compare two different things, say urban and rural in my case, I have to perform a t-test between the urban data set and the rural data set. But I am not sure why I should check the significance between urban and rural.

    Can you explain when and under what circumstances we need to compare the significance? And do I need to run a t-test between urban and rural?
  9. May 1, 2014 #8



    Staff: Mentor

    I don't know what you mean with "compare the significance".
    I don't think so, but it depends on what exactly you want to find out.

    Data analysis methods always have some purpose. You cannot just "analyze data" and then everything is done. You have to define what you want to know first.
  10. May 2, 2014 #9
    My purpose for the data analysis is to compare the shortwave radiation between urban and rural, as I mentioned in #3. So my question is: for this comparison between urban and rural shortwave radiation, do we need to test the data with a t-test or something else?

    Thank you very much for your kind suggestion.
  11. May 2, 2014 #10



    Staff: Mentor

    That is a very broad topic. And I doubt you can get this at all with your data, as there are so many other factors that can influence your values.

    I guess you can use a t-test, but then you have to prepare a data sample that is suitable for it. Especially the required normal distribution of your values looks problematic - none of your measurement series will have a normal distribution. Maybe the difference between the averages, for each time step, can satisfy that.
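    As a rough illustration of testing the per-timestep differences, here is a paired t statistic computed with the standard library. The function name and inputs are assumptions; the result is only meaningful if the differences are roughly normal and independent, which has to be checked on the real data.

```python
import math
import statistics

def paired_t_statistic(urban_avg, rural_avg):
    """urban_avg[t], rural_avg[t]: group averages at the same timestep t."""
    # Work on the differences, not on the raw (non-normal) series.
    diffs = [u - r for u, r in zip(urban_avg, rural_avg)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation
    # One-sample t statistic for "mean difference = 0".
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1  # statistic and degrees of freedom
```

    The returned t value and degrees of freedom can be looked up in a t table (or passed to a statistics package) to get a p-value. If all differences are identical the standard deviation is zero and the division fails, so that case needs separate handling.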
  12. May 3, 2014 #11


    Science Advisor

    If your hypothesis is that one set is higher than the other, you can rank order the 8 data for each day (or make one giant rank ordering for all data) and apply some non-parametric statistics tests. Then you would not need to make a model for either set of data.
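    One possible rank-based test of this kind is an exact rank-sum (permutation) test, sketched below for the 8 values of a single day or timestep. The function name is made up, and the test assumes the 8 values are exchangeable under the null hypothesis that urban and rural do not differ.

```python
from itertools import combinations

def rank_sum_exact_p(urban, rural):
    """One-sided exact p-value for 'urban values tend to rank higher than rural'."""
    values = list(urban) + list(rural)
    n_u = len(urban)
    # Rank all values together, giving tied values their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    observed = sum(ranks[:n_u])
    # Enumerate every way of labelling n_u of the values as "urban".
    count = total = 0
    for combo in combinations(range(len(values)), n_u):
        total += 1
        if sum(ranks[c] for c in combo) >= observed:
            count += 1
    return observed, count / total
```

    With 4 urban and 4 rural values there are only 70 possible labellings, so the smallest p-value a single day can give is 1/70. Pooling many days gives more power, but the full enumeration grows quickly and a normal approximation or a statistics package would be needed instead.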