1. Not finding help here? Sign up for a free 30min tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Java - Importing .csv file for calculating selected averages

  1. Aug 5, 2014 #1
    1. Problem: The task is to import an excel (.csv) file containing monthly weather data and produce a programme that can execute different averages on a yearly basis. i.e.;, produce average rainfall, maximum or minimum temperatures for desired years.

    I have thus far managed to calculate average values for entire rows, but I want the programme to only calculate averages of desired values (yearly basis), so that this year's average data can be requested.

    2. Relevant Information:

    The format of the excel file is (representing a column each): {Year, Month, Max Temp, Min Temp, Rainfall, ...}

    In the attempt below, avMonth is showing all the months listed then the average at the bottom as I expected. I have no idea where to go now to complete the task.


    3. Attempt:
    Mod note: Added [ code ] tags to restore indentation.
    Code (Text):

    package weatherPackage;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;

    public class Main {
       
        public static void main(String[] args) {
            String fileName = "sheffielddata.csv";
            File file = new File(fileName);
            try {
                Scanner inputStream = new Scanner(file);
                double sum = 0;
                double numberofMonths = 0;
                while (inputStream.hasNext()){
                    String data = inputStream.next();
                    String[] values = data.split(",");
                    double avMonth = Double.parseDouble(values [1]);
                    sum += avMonth;
                    numberofMonths++;
                    System.out.println(avMonth);
                }
                inputStream.close();
                System.out.println("Average - " + (sum / numberofMonths));
               
            } catch (FileNotFoundException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
           
        }
    }
     
    Last edited by a moderator: Aug 8, 2014
  2. jcsd
  3. Aug 5, 2014 #2
    Hint: You're allowed to read the file twice - and after the first read, you'll know how many data values there are.
     
  4. Aug 5, 2014 #3
    Thanks for the response Scott, although I can't say I am any the wiser. I am struggling a lot due to next-to-no java experience. Any kind of elaboration would be much appreciated.

    My numberOfMonths gives me total number of rows, which calculates me my overall averages. But I want to just calculate averages on a yearly basis as chosen by indexing so...

    1) Set up variables for calculating averages
    a) Since the divisor is over a year on a monthly chart it is just 12, so I can just put this in.
    b) To be thorough, I could divide my numberOfMonths by the actual number of years, but this I am not sure how to do. It would need to only register every 12 rows (since each row is a month). Could this be done by use of indexOf? If so, how do I go about this on the multiple basis I need?

    2) Are all the rows already indexed? When I println(numberOfMonths) I get 1 through to 804 (which is the number of rows). Can I implement where by inserting these values the corresponding row of data can be displayed?

    So many questions and I apologise that I aren't as clear as could be.
    Many thanks again, and to all in advance.
     
  5. Aug 6, 2014 #4
    The first thing you need to do is decide how you going to access the data.
    1) One way would be to keep the data in the file and only read what you need when you need it. That would be a bit complicated and perhaps not that efficient.
    2) Another would be to try to collect and compute everything you needed in a single pass through the data. If possible, that would be the most efficient use of system resources.
    3) Less complicated would be to read the entire file into a "List" and then copy the list into an array.
    4) The least complicated is to read through your data twice (two passes). On the first pass, you just count the months. On the second pass, you read everything into a convenient array. Since that's the simplest for coding, that's what I show in the examples below.

    Also in my examples below, I am assuming that the first two value are the year and month.

    Here's one way of getting the data into an array:
    1) Define a structure for my weather data, for example:
    Code (Text):

      public class Weather {
        public int nYear;
        public int nMonth;
        public double dAvTemp;
        public double dMaxTemp;
        public double dMinTemp;
        public double dAvHumidity;
        ... whatever ...
      }
    2) As you did in your code, read the data set and determine numberOfMonths.
    3) Declare your array, as with:
    Code (Text):

      Weather[] WeatherArray = new Weather [numberOfMonths];
    4) Read through the file again, this time storing everything away in the array:
    Code (Text):

      for(int nMonth=0; nMonth<numberOfMonths; nMonth++) {
        String data = inputStream.next();
        String[] values = data.split(",");
        WeatherArray[nMonth].nYear      = Integer.parseInt(values[0]);
        WeatherArray[nMonth].nMonth    = Integer.parseInt(values[1]);
        WeatherArray[nMonth].dAvTemp = Double.parseDouble(values [2]);
        ... whatever ...
      }
    With all the data in an array, I guessing you would know how to compute any statistics. For example, average March temperatures or maximum January temperature.

    Without knowing the details of the csv file, it's hard to suggest anything further.

    If you don't actually have year information in the row, then you can do "nFullYears = numberOfMonths/12" as you suggested. If you have the option, put the year and month with the data.

    If you want to find a break from one year to the next, you can do this:
    Code (Text):

      int nThisMonth = 0;
      int nTotalMonths = 0;
      double dAvTemp = 0.0;
      double dMaxTemp = -999.9;
      for(int nRow=0; nRow<nomberOfMonths; nRow++) {
        if(WeatherArray[nRow].nMonth != nThisMonth) {
          if(nTotalMonths>0) {
            dAvTemp = dAvTemp/nTotalMonths;
            print everything for this year ...
          }
          nThisMonth = nMonth;
          nTotalMonths = 0;
          dAvTemp = 0.0;
          dMaxTemp = -999.9;
        }
        nTotalMonths++;
        dAvTemp += WeatherArray[nRow].dTemp;
        double dMax = WeatherArray[nRow].dMaxTemp;
        if(dMaxTemp<dMax) dMaxTemp = dMax;
      }
      if(nTotalMonths>0) {
        dAvTemp = dAvTemp/nTotalMonths;
        print everything for this year ...
      }
     
    I don't think I've answered all your questions. But I think you have enough to experiment.
    As you experiment, you should discover different ways of doing different things.
    I hope you have a build environment handy.
     
  6. Aug 6, 2014 #5
    Thanks again Scott great help and given me a much better perspective of things, I will follow through as you suggested.


    Unfortunately, I am again at a loss with the computations. I have been researching using indexOf, but have not had any success when using within [code/] println [code/] to get index values of first occurrence of, for example, 1994.

    How would I, for example, calculate the average temperature of all months in 1994, given that the data is given monthly? This would also help me in figuring out how to select by year for computations.

    Again, thanks so much for the help.
     
  7. Aug 6, 2014 #6
    Let's assume that the year and month are included each line of the *.csv file - as I described above. And also assume that we have read the data into the array as described above.
    Then:
    Code (Text):

      double av1994 = 0.0;
      int nM1994 = 0;
      for(int nMonth=0; nMonth<numberOfMonths; nMonth++) {
        if(WeatherArray[nMonth].nYear==1994) {
          nM1994++;
          av1994 += WeatherArray[nMonth].dTemp;
        }
      }
      if(nM1994>0) {
        av1994 = av1994/nM1994;
        ... output the result ...
      } else {
        ... report that there is no data for 1994 ...
      }
     
  8. Aug 6, 2014 #7
    Thanks again Scott for all your help, this is an edit after working back through again.

    I have compiled as you instructed, and up to where you left of after constructing WeatherArray.
    At this point, I can get the earlier print out of average over a column (that helps in getting numberOfMonths) but I can't get a print out of anything after WeatherArray, not even numberofMonths, it is returning
    "Exception in thread "main" java.lang.IllegalStateException: Scanner closed
    at java.util.Scanner.ensureOpen(Scanner.java:1070)
    at java.util.Scanner.next(Scanner.java:1358)
    at sheffieldWeatherPackage.Weather.main(Weather.java:37)


    Here is my code, much replicating yours, for reference;


    package sheffieldWeatherPackage;

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;

    public class Weather {
    public int nYear;
    public int nMonth;
    public double maxTemp;
    public double minTemp;
    public double afDays;
    public double rainFall;
    public double sunHrs;


    public static void main(String[] args) {
    String fileName = "sheffielddata.csv";
    File file = new File(fileName);
    try {
    Scanner inputStream = new Scanner(file);
    double sum = 0;
    int numberofMonths = 0;
    while (inputStream.hasNext()){
    String data = inputStream.next();
    String[] values = data.split(",");
    double avRain = Double.parseDouble(values[5]);
    sum += avRain;
    numberofMonths++;
    System.out.println(numberofMonths);
    }
    inputStream.close();
    System.out.println("Average - " + (sum / numberofMonths));

    Weather[] WeatherArray = new Weather [numberofMonths];
    for (int nMonth = 0; nMonth<numberofMonths; nMonth++) {
    String data = inputStream.next();
    String[] values = data.split(",");
    WeatherArray[nMonth].nYear = Integer.parseInt(values[0]);
    WeatherArray[nMonth].nMonth = Integer.parseInt(values[1]);
    WeatherArray[nMonth].maxTemp = Integer.parseInt(values[2]);
    WeatherArray[nMonth].minTemp = Integer.parseInt(values[3]);
    WeatherArray[nMonth].afDays = Integer.parseInt(values[4]);
    WeatherArray[nMonth].rainFall = Integer.parseInt(values[5]);
    WeatherArray[nMonth].sunHrs = Integer.parseInt(values[6]);



    System.out.println(numberofMonths);


    }

    } catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    }

    }
    }
     
    Last edited: Aug 6, 2014
  9. Aug 8, 2014 #8

    Mark44

    Staff: Mentor

    The code tags should look like this:
    [code] your code [/code]
     
  10. May 16, 2015 #9
    I would like to provide the following code in calculating average rainfall, maximum or minimum temperatures.

    Code (Java):
    public static void readCSV() throws FileNotFoundException {
            // 1st, config the CSV reader, such as line separator, column separator and so on
            CsvParserSettings settings = new CsvParserSettings();
            settings.getFormat().setLineSeparator("\n");

            // 2nd, creates a CSV parser with the configs
            CsvParser parser = new CsvParser(settings);

            // 3rd, parses all rows from the CSV file into a 2-dimensional array
            List<String[]> resolvedData = parser.parseAll(new FileReader("/examples/example.csv"));

            // 4th, fill the data list with 1st column in all rows in the csv file.
            List<Integer> data = new ArrayList<Integer>();
            for (String[] row : resolvedData) {
                data.add(Integer.parseInt(row[0]));
            }

            // 5th, find out the smallest, largest, average and sum
            int smallest = Integer.MAX_VALUE, largest = 0, average = 0, sum = 0;
            for (Integer val : data) {
                sum += val;
                smallest = val <= smallest ? val : smallest;
                largest = val > largest ? val : largest;
            }

            average = data.isEmpty() ? 0 : sum / data.size();
        }
    In this code example, I used the open source library uniVocity-parsers to parse CSV files into list of arrays just in few lines of code. You can try to extend from this code to fullfill your requirements better.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted



Similar Discussions: Java - Importing .csv file for calculating selected averages
Loading...