Program that reads data from a file and calculates the mean?

Discussion Overview

The discussion revolves around writing a program that reads data from two files to calculate the mean, standard deviation, and standard error of the values in each file. The focus is on understanding file I/O in programming and addressing issues related to the calculation of standard deviation.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • The original poster shares an initial code snippet that reads twelve values from a file and prints them, expressing difficulty with file I/O.
  • Another participant suggests checking for compilation and execution errors before seeking help, emphasizing the importance of error messages.
  • A later reply indicates that the participant has made progress but is still struggling with incorrect standard deviation values, providing specific data from the files for reference.
  • One participant points out that the variable used for standard deviation calculations retains the last value read, suggesting alternatives such as saving values in an array or using a different formula for variance.
  • The original poster accepts this explanation and says it makes sense.
  • A different participant introduces the concept of an online algorithm for calculating mean and variance in a single pass through the data, referencing external resources for further understanding.

Areas of Agreement / Disagreement

Participants agree on the cause of the incorrect standard deviation: the second loop reuses the last value read instead of each data point. Several fixes are offered (storing the values in an array, reading the file twice, a sum-of-squares formula, or an online algorithm), and the thread does not settle on one of them.

Contextual Notes

The thread ends without a corrected version of the program being posted, so none of the suggested fixes is shown in working form.

youngfreedman
Hi everyone.

I'm trying to write a program that reads data from 2 files and then calculates the mean, standard deviation and standard error of both files (separate values for each). I'm struggling to get my head around simple I/O, so excuse the poor attempt, but this is what I have so far: (I'm only attempting to just print out each value for now.)

Code:
    program data
    implicit none

    integer             :: j
    double precision    :: test

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
    do j = 1, 12
        read(100,*) test
        print *, 'N1=', test
    end do

    end program data

If it helps, the file is a list of monthly rainfalls for a year.

Thanks for any help!
 
I can't immediately see anything wrong. Does it compile, link, execute? If you get an error message in one of those steps, that is the time to ask questions. And tell us what the error message is complete with line number.
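
One way to get a usable error message out of the I/O steps themselves is to ask each statement for its status. The following is only a minimal sketch, assuming the same file name and twelve records as in the post. The `iostat=` specifier is standard Fortran; the program name `read_check` and the variables `ios` and `val` are made up for illustration.

```fortran
program read_check
  implicit none

  integer          :: j, ios
  double precision :: val

  ! iostat is set to 0 on success and to a nonzero code on failure,
  ! so the program can report the problem instead of aborting.
  open(unit = 100, file = 'tmax_1910.txt', status = 'old', &
       action = 'read', iostat = ios)
  if (ios /= 0) then
    print *, 'could not open tmax_1910.txt, iostat =', ios
    stop 1
  end if

  do j = 1, 12
    read(100, *, iostat = ios) val
    if (ios /= 0) then
      ! Nonzero here means a read error or an early end of file.
      print *, 'read failed at record', j, ', iostat =', ios
      exit
    end if
    print *, 'value', j, '=', val
  end do

  close(100)
end program read_check
```

If the file is missing or has fewer than twelve lines, this prints which step failed rather than stopping with a runtime error.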
 
FactChecker said:
I can't immediately see anything wrong. Does it compile, link, execute? If you get an error message in one of those steps, that is the time to ask questions. And tell us what the error message is complete with line number.

I've actually made a lot of progress since posting this, I was going to delete the post but don't know how. However, I do still need help. Here's my code at this point:

Code:
    program data
    implicit none

    integer             :: R, F
    double precision    :: x, sum = 0, mean, y, mean2, sum2 = 0, var =0, sdv, var2 = 0, sdv2

    open(unit = 100, file = 'tmax_1910.txt', status = 'old', action = 'read')
     do R = 1,  12
     read(100,*) x
     sum = sum + x

    end do

    mean = (sum)/12

     do R = 1, 12
      var = var + (((x - mean)**2.0)/12)
      sdv = var**0.5

    end do
    open(unit = 200, file = 'tmax_2010.txt', status = 'old', action = 'read')
     do F = 1, 12
     read(200,*) y
     sum2 = sum2 + y

    end do

    mean2 = (sum2)/12

     do F = 1, 12
      var2 = var2 + (((y - mean2)**2.0)/12)
      sdv2 = var2**0.5
    end do

    print *, 'mean=', mean, 'mean2=', mean2, 'sdv=', sdv, 'sdv2=', sdv2

    end program data

This prints the correct mean values, but the values for standard deviation (for both files) are incorrect. For reference, here are the numbers in each file:

File 1: 5.0, 6.6, 9.3, 10.4, 14.0, 18.0, 16.9, 18.6, 15.4, 13.1, 5.4, 7.6 (actual standard dev = 4.9..., my value = 4.09...)
File 2: 3.2, 4.3, 9.5, 13.0, 14.5, 19.2, 20.8, 19.0, 17.2, 12.9, 7.2, 2.0 (actual standard dev = 6.6..., my value = 9.9...)

Thanks.

EDIT: Just to add, the values I get for mean are correct.
 
Your values of x are changing correctly in the first loop but once you get out of that loop, you are left with x = last value read. So the second loop has a constant x value. Your alternatives are to loop through the x values twice (either saving an array of them or reading them twice) or using a different formula for the variance.
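
To see the effect in numbers: with x stuck at the last value, the posted loop adds the same term twelve times, so sdv collapses to |x_last - mean|. That is |7.6 - 11.69| = 4.09 for file 1 and |2.0 - 11.9| = 9.9 for file 2, which are exactly the wrong values reported. A rough sketch of the "save them in an array" alternative might look like the following. The file 1 values are hard-coded so it runs on its own, and the names `two_pass` and `n` are illustrative. Dividing by n - 1 (the sample form) is an assumption here: it gives roughly 4.96 for file 1, in line with the "actual" 4.9... quoted in the post, whereas dividing by 12 as in the posted code would give roughly 4.74.

```fortran
program two_pass
  implicit none

  integer, parameter :: n = 12
  integer            :: i
  double precision   :: x(n), mean, var, sdv

  ! File 1 values from the post, hard-coded so the sketch is self-contained.
  ! With the file open on unit 100, fill the array with read(100,*) x(i)
  ! inside a loop over i instead.
  x = [5.0d0, 6.6d0, 9.3d0, 10.4d0, 14.0d0, 18.0d0, &
       16.9d0, 18.6d0, 15.4d0, 13.1d0, 5.4d0, 7.6d0]

  ! First pass: the mean.
  mean = sum(x) / n

  ! Second pass: squared deviations, now using each stored value.
  var = 0.0d0
  do i = 1, n
    var = var + (x(i) - mean)**2
  end do
  var = var / (n - 1)   ! assumed sample variance; use n for the population form
  sdv = sqrt(var)

  print *, 'mean =', mean, ' sdv =', sdv
end program two_pass
```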

You can use a formula for variance where you accumulate the sum of x^2 in the first loop and the sample mean. That saves you from looping through the x values twice.
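
One common form of that formula is s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1). The sketch below accumulates both sums in a single loop; it is an illustration under assumptions, not the poster's program. The file 2 values are hard-coded, the array `vals` stands in for the file, and the names `sx` and `sxx` are invented here. The n - 1 divisor is again an assumption (it gives roughly 6.60 for file 2, close to the 6.6... quoted above).

```fortran
program sum_of_squares
  implicit none

  integer, parameter :: n = 12
  integer            :: i
  double precision   :: vals(n), x, sx, sxx, mean, var, sdv

  ! File 2 values from the post, hard-coded so the sketch is self-contained.
  vals = [3.2d0, 4.3d0, 9.5d0, 13.0d0, 14.5d0, 19.2d0, &
          20.8d0, 19.0d0, 17.2d0, 12.9d0, 7.2d0, 2.0d0]

  sx  = 0.0d0   ! running sum of x
  sxx = 0.0d0   ! running sum of x**2

  do i = 1, n
    x   = vals(i)        ! stands in for read(200,*) x
    sx  = sx + x
    sxx = sxx + x * x
  end do

  mean = sx / n
  var  = (sxx - sx * sx / n) / (n - 1)   ! assumed sample variance
  sdv  = sqrt(var)

  print *, 'mean =', mean, ' sdv =', sdv
end program sum_of_squares
```

This form subtracts two numbers that can be nearly equal, so it may lose precision when the mean is large compared with the spread of the data. For twelve values of this size in double precision that should not matter.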
 
FactChecker said:
Your values of x are changing correctly in the first loop but once you get out of that loop, you are left with x = last value read. So the second loop has a constant x value. Your alternatives are to loop through the x values twice (either saving an array of them or reading them twice) or using a different formula for the variance.

You can use a formula for variance where you accumulate the sum of x^2 in the first loop and the sample mean. That saves you from looping through the x values twice.

I hadn't considered that. That makes sense, thanks !
 
The algorithm @FactChecker mentions is called an online algorithm. The idea is that you can type a single datastream at the keyboard and do things with the data like calculate mean, standard deviation, and variance. Works for a file, too: you just read through the file one time.

Wikipedia has examples if you search for 'variance'; the original Knuth version is easy to understand.
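
For reference, the single-pass update usually meant here is commonly attributed to Welford and is presented by Knuth. A rough Fortran sketch follows, with the file 1 values hard-coded in place of the file. The names `welford`, `k`, `m2`, `delta` and `se` are illustrative, the n - 1 divisor is an assumption, and the standard error is taken as sdv / sqrt(n), the usual definition for the standard error of the mean.

```fortran
program welford
  implicit none

  integer, parameter :: n = 12
  integer            :: i, k
  double precision   :: vals(n), x, mean, m2, delta, var, sdv, se

  ! File 1 values from the post, hard-coded so the sketch is self-contained.
  vals = [5.0d0, 6.6d0, 9.3d0, 10.4d0, 14.0d0, 18.0d0, &
          16.9d0, 18.6d0, 15.4d0, 13.1d0, 5.4d0, 7.6d0]

  k    = 0        ! number of values seen so far
  mean = 0.0d0    ! running mean
  m2   = 0.0d0    ! running sum of squared deviations from the mean

  do i = 1, n
    x     = vals(i)               ! stands in for read(100,*) x
    k     = k + 1
    delta = x - mean              ! deviation from the old mean
    mean  = mean + delta / k      ! update the mean
    m2    = m2 + delta * (x - mean)   ! uses old and new mean
  end do

  var = m2 / (k - 1)              ! assumed sample variance
  sdv = sqrt(var)
  se  = sdv / sqrt(dble(k))       ! standard error of the mean

  print *, 'mean =', mean, ' sdv =', sdv, ' se =', se
end program welford
```

Unlike the sum-of-squares form, this update never subtracts two large sums, which is why it is often preferred when the data are read once from a stream.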
 