Output of max consecutive values/days

  • Context: Fortran
  • Thread starter: Mizarge
  • Tags: Max, Output

Discussion Overview

The discussion revolves around programming in Fortran to analyze climate data, specifically focusing on identifying consecutive days without precipitation and determining the longest dry periods within each year from a dataset spanning 1947 to 2016. Participants are seeking assistance with coding logic and implementation details.

Discussion Character

  • Technical explanation
  • Exploratory
  • Homework-related

Main Points Raised

  • One participant describes their project and shares their initial Fortran code for extracting days without precipitation.
  • Another participant suggests adding logic to detect consecutive dry days and proposes using integer division to extract the year from the date.
  • A participant expresses gratitude for the year extraction method but struggles with the logic to count consecutive dry days, sharing their current approach and seeking further guidance.
  • Another participant critiques the logic of incrementing the year variable and recommends a more organized approach using state variables to track the current and previous day's conditions.
  • One participant hints at considering how humans might approach the problem, implying a need for intuitive logic in the coding process.

Areas of Agreement / Disagreement

Participants generally agree on the need to refine the logic for counting consecutive dry days, but there are multiple competing views on the best approach to implement this logic. The discussion remains unresolved regarding the optimal coding strategy.

Contextual Notes

Participants have not reached consensus on the best method to implement the logic for counting consecutive dry days, and there are unresolved aspects regarding how to handle transitions between years in the data.

Mizarge
Hello everyone,

I am working on a project with climate data and would like to write a program that finds consecutive days with no precipitation and, beyond that, only the period of each year with the longest lack of rainfall. As quite a newbie with Fortran, I have spent the last few days reading forums, but so far I have not managed to find the right code. It would be great to find some help or maybe just hints here :)

The input is a .txt file that contains daily measurements of different values. Of note are the date (messdat; YYYYMMDD) and the precipitation value (rsk). Looking at the period 1947-2016 (19470101 to 20161231), I wrote code that extracts all days without precipitation, i.e. those with rsk values of less than 0.1: [just ignore all other variables, they may eventually be used later]

Fortran:
program dry_period
   
    implicit none
   
   
   
    integer n
    parameter(n=100000)
    integer anzja,stat_id(n),messdat(n),qn3(n),qn4(n),rskf(n)
    integer anfp,endp,i, anzahl
    character path*2, file1*20, datafile*40, dummy*80
    real fx(n),fm(n),rsk(n),sdk(n),shkt(n),nm(n),vpm(n),pm(n)
    real tmk(n),upm(n),txk(n),tnk(n),tgk(n)

    ! anzja        = period of integration
    ! stat_id    = station ID
    ! messdat     = Date [yyyymmdd]
    ! qn3         = Qualitylevel of column
    ! qn4         = Qualitylevel of column
    ! rskf         = type of precipitation [0-9]
    ! fx         = daily maximum Wind [m/s]
    ! fm         = daily mean wind [m/s]
    ! rsk         = daily precipitation [mm]
    ! sdk         = daily sunshine duration [h]
    ! shkt         = daily snow depth [cm]
    ! nm         = mean cloud amount [1/8]
    ! vpm         = mean vapour pressure [hPa]
    ! pm         = mean air pressure [hPa]
    ! tmk         = mean temperature [°C]
    ! upm         = mean relative humidity [%]
    ! txk         = daily maximum of air temperature [°C] in 2m Height
    ! tnk         = daily minimum of air temperature [°C] in 2m Height
    ! tgk         = Minimum of air temperature [°C] in 5cm Height
    ! eor         = End of Record
    ! default    = -999   
       
    parameter    (anzja= 70,             ! 70 years: Integrationperiod 1947-2016)
     1             path='./',             ! Output path        
     1              file1='Erg_Trockenph.v1')    ! Output file for dry periods       
   
    data datafile /'1.txt'/

   
c -------- Columns of data ---------

    ! 01. column: stat_id
    ! 02. column: messdat
    ! 03. column: qn3
    ! 04. column: fx
    ! 05. column: fm
    ! 06. column: qn4
    ! 07. column: rsk
    ! 08. column: rskf
    ! 09. column: sdk
    ! 10. column: shkt
    ! 11. column: nm
    ! 12. column: vpm
    ! 13. column: pm
    ! 14. column: tmk
    ! 15. column: upm
    ! 16. column: txk
    ! 17. column: tnk
    ! 18. column: tgk

c -------- Defining Period for Program ---------

    write(*,*) 'insert start date:'
    write(*,*) 'Format YYYYMMDD, e.g. 17811231'
    read(*,*) anfp
    write(*,*) 'insert end date'   
    write(*,*) 'Format YYYYMMDD, e.g. 19470101'
    read(*,*) endp
       
    open(10, file=datafile)
   
   
   
c ------ skip header ----------

    read(10,*) dummy

c ----- skip data until beginning date -------

    do
    read(10,*,end=300) stat_id(1),messdat(1),qn3(1),fx(1),fm(1),
     1  qn4(1),rsk(1),rskf(1),sdk(1),shkt(1),nm(1),vpm(1),pm(1),tmk(1),
     1  upm(1),txk(1),tnk(1),tgk(1)
    if (messdat(1) .ge. anfp) exit
    end do
   
c ----- Read data until end date -----------------

    do i=2,n
    read(10,*,end=300) stat_id(i),messdat(i),qn3(i),fx(i),fm(i),
     1  qn4(i),rsk(i),rskf(i),sdk(i),shkt(i),nm(i),vpm(i),pm(i),tmk(i),
     1  upm(i),txk(i),tnk(i),tgk(i)
         if (messdat(i) .gt. endp) exit
         anzahl = i
     end do
            
300    continue
    close(10)        
        

c ---- find days without rainfall
   
    open(21,file=path//file1)
     do i=1,anzahl   
      if (rsk(i).lt.0.1) then 
      write(21,*) stat_id(i), messdat(i), rsk(i)
      end if
     enddo
     close(21)
    
     write(*,*) 'days without rain written'
        
         end

So by now, I have a new file with the station ID (column 1), the date (column 2) and the precipitation value of every day without precipitation (column 3). What I am trying to find is only the period with the most dry days in each year. The best case would be (written to a new file) something like this:

stat_id...[messdat = beginning of period]...[count of consecutive days without rainfall]

or maybe even just
stat_id - year - count

Unfortunately, my implementation fails, as I have no idea how to tie the values to the year alone while ignoring MMDD, and especially how to filter the longest period out of the data. I read something about using
Fortran:
diff(rsk(i)) ==1
however, I'm not sure how to use that.

Maybe someone knows a preferably easy way to solve this or an idea what I could try. I think it is not that difficult, but I am missing the necessary knowledge :)
 
I assume that your original data file had consecutive records, one for each day. If so, you should add logic to your second loop to detect consecutive days without rain. You will have to decide what to do about a dry period that is split between years.

An easy way to detect the year is to divide the date by 10000 and store it into an integer variable, year. Integer division will chop off (truncate) the lower 4 digits, leaving only the year digits.
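As a minimal sketch of that division (free-form Fortran; the module and function names `date_parts`, `year_of` and `mmdd_of` are made up for illustration and are not part of the original program):

```fortran
module date_parts
  implicit none
contains
  ! Year of a YYYYMMDD integer: integer division truncates the MMDD digits.
  pure integer function year_of(yyyymmdd)
    integer, intent(in) :: yyyymmdd
    year_of = yyyymmdd / 10000
  end function year_of

  ! Month-and-day part (MMDD): the remainder of the same division.
  pure integer function mmdd_of(yyyymmdd)
    integer, intent(in) :: yyyymmdd
    mmdd_of = mod(yyyymmdd, 10000)
  end function mmdd_of
end module date_parts
```

With the arrays from the question, `year_of(messdat(i))` would give the year of record i, and comparing it with the year of the previous record shows where a new year starts.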

I also don't see how the diff( rsk(i) ) will help. (I'm not even sure what diff is.)
 
Thank you for your help! The division worked perfectly for getting only the year; that had not occurred to me.
However, I still have problems with the logic part. I added a counter that is basically just (i), as I thought I could pick out the values whose count shows the most consecutive numbers, but the different approaches I tried were unsuccessful. My idea now is something like this:
Fortran:
open(21,file=path//file1)
     do i=1,anzahl
      year = messdat(i)/10000

     year = anf/10000                     ! Starting with the first year
           do while (year .eq. anf/10000)   !Do while [the first year]
             if (rsk(i) .lt. 0.1 ) then               ! First day without rain -> count 1
             count=1
             end if

             if (rsk(i) .AND. rsk(i+1) .lt. 0.1) then     !Second and next day without rain -> count 2
             count=2
.
.                                                          ! The program should only keep the highest count
.
             
             end do

write(21,*) stat_id(i), messdat2(i),messdat(i),rsk(i),count
        endif
           year=year+1            !continue with the next year
     enddo
     close(21)
  
     write(*,*) 'dry periods written in data'

Is there an easier way to get the max dry period?
Maybe I can also use the count-argument, I am still figuring out and learning everything :rolleyes:
Therefore, further hints or help would be much appreciated.
 
I don't think that your year=year+1 really does anything since year keeps being set at the top of the loop. It looks like it will repeat the logic for a year for every record of that year.

Assuming that the records are in chronological order, with one record for each day, I recommend that you:
1) only loop through the records once,
2) make some "state variables" (like todays_year, yesterdays_year, todays_rain, yesterdays_rain), that you update for each record and
3) create a section of code within the loop for each possibility (like: if ((todays_year .eq. yesterdays_year) .and. (todays_rain .eq. 0) .and. (yesterdays_rain .eq. 1)) then ... )

That will help you to keep the logic organized.

Don't forget to set the yesterday variables at the beginning so that the first record will be done correctly and to print/record results for the last year when the last record has been read.
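One possible shape for such a single pass, as a rough sketch rather than a finished solution. It is written in free-form Fortran, and the names `dry_spells`, `longest_dry_per_year`, `run_len` and so on are invented here. Two assumptions are built in: a dry run is cut at the year boundary, and a negative rsk (such as the -999 default) is treated as "not dry" instead of passing the `< 0.1` test.

```fortran
module dry_spells
  implicit none
  real, parameter :: dry_limit = 0.1   ! same threshold as in the question
contains
  ! One pass over chronologically ordered daily records.
  ! For every calendar year it keeps the longest run of dry days
  ! and the date on which that run started.
  subroutine longest_dry_per_year(messdat, rsk, years, best_len, best_start, nyears)
    integer, intent(in)  :: messdat(:)     ! dates as YYYYMMDD
    real,    intent(in)  :: rsk(:)         ! daily precipitation [mm]
    integer, intent(out) :: years(:), best_len(:), best_start(:)
    integer, intent(out) :: nyears         ! number of years actually filled
    integer :: i, this_year, prev_year, run_len, run_start
    logical :: dry

    years = 0
    best_len = 0
    best_start = 0
    nyears = 0
    run_len = 0
    run_start = 0
    prev_year = -huge(1)                   ! guarantees the first record opens a year

    do i = 1, size(messdat)
      this_year = messdat(i) / 10000
      if (this_year /= prev_year) then
        if (nyears == size(years)) exit    ! output arrays are full
        nyears = nyears + 1
        years(nyears) = this_year
        run_len = 0                        ! assumption: runs do not span years
        prev_year = this_year
      end if

      ! Assumption: negative values are missing data, not dry days.
      dry = rsk(i) >= 0.0 .and. rsk(i) < dry_limit
      if (dry) then
        if (run_len == 0) run_start = messdat(i)
        run_len = run_len + 1
        if (run_len > best_len(nyears)) then
          best_len(nyears) = run_len
          best_start(nyears) = run_start
        end if
      else
        run_len = 0
      end if
    end do
  end subroutine longest_dry_per_year
end module dry_spells
```

Because the best run is updated on every dry day, nothing is left pending when the last record has been read. If dry periods that cross New Year should count as one run, the `run_len = 0` at the year change is the line to reconsider.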
 
Just a point. Think about how humans might solve it.
You'd write down the date, write maybe "R" for rain "N" for no rain, then look back to the previous line and decide:
If it changed, i.e., from N to R or the other way around, then
write 1 (we'll call that the sequence number)
else
take the previous line's number and add one, write new sequence number

Do that for every line in the data file.

At the end, notice that every sequence number == 1 (except on the very first line) marks the end of the previous "streak".
You can capture that notation of the data with sequence numbers in an array of FORTRAN RECORDs or, in standard Fortran, derived types:
https://gcc.gnu.org/onlinedocs/gfortran/STRUCTURE-and-RECORD.html

Fortran:
STRUCTURE /item/
  INTEGER sequence
!  rain is R or N
  CHARACTER(LEN=2) rain
  CHARACTER(LEN=12) date
  REAL precip
END STRUCTURE

! Define two variables: a single record of type ``item'' called "work",
! and "rainlog", which can hold up to 70 years' worth of daily events.
! (STRUCTURE/RECORD are DEC extensions; gfortran needs -fdec-structure.)
RECORD /item/ work, rainlog(25570)

Now all you have to do is loop and check for a 1 in sequence. Back up one array element, and the sequence value there is the number of days of rain/no rain in the previous run. If you need precip totals, you can sum them up the same way you accumulate the sequence numbers.
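A rough sketch of that numbering with a standard derived type instead of STRUCTURE/RECORD. The names `rain_log`, `number_streaks` and `longest_dry` are invented for illustration, the component is called `seq` instead of sequence, the date is stored as an integer, and the dry test is the same `< 0.1` as in the question (so missing values such as -999 would still need separate handling).

```fortran
module rain_log
  implicit none
  ! Standard-Fortran counterpart of the STRUCTURE /item/ above.
  type :: item
    integer   :: seq = 0        ! sequence number within the current streak
    character :: rain = 'N'     ! 'R' = rain, 'N' = no rain
    integer   :: date = 0       ! YYYYMMDD
    real      :: precip = 0.0   ! daily precipitation [mm]
  end type item
contains
  ! Set the rain flag and the sequence number of every record, in order.
  subroutine number_streaks(rainlog)
    type(item), intent(inout) :: rainlog(:)
    integer :: i
    do i = 1, size(rainlog)
      rainlog(i)%rain = merge('N', 'R', rainlog(i)%precip < 0.1)
      if (i == 1) then
        rainlog(i)%seq = 1
      else if (rainlog(i)%rain /= rainlog(i-1)%rain) then
        rainlog(i)%seq = 1                      ! state changed: new streak
      else
        rainlog(i)%seq = rainlog(i-1)%seq + 1   ! same state: extend streak
      end if
    end do
  end subroutine number_streaks

  ! Longest 'N' streak. A streak ends where the next record restarts at 1
  ! (or at the last record); its length is the sequence number at that end.
  integer function longest_dry(rainlog)
    type(item), intent(in) :: rainlog(:)
    integer :: i
    logical :: streak_ends
    longest_dry = 0
    do i = 1, size(rainlog)
      streak_ends = (i == size(rainlog))
      if (.not. streak_ends) streak_ends = (rainlog(i+1)%seq == 1)
      if (streak_ends .and. rainlog(i)%rain == 'N') then
        longest_dry = max(longest_dry, rainlog(i)%seq)
      end if
    end do
  end function longest_dry
end module rain_log
```

To get one result per year, the same function could be applied to the slice of the array that belongs to each year, if restarting the numbering at each year boundary is acceptable.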

This is called an online algorithm: it takes a data stream and organizes it as it arrives. It is one of the standard approaches for this kind of problem.
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm

Why did I do this? Your code has many variables, whereas "rainlog" keeps track of everything for ~70 years' worth of data. You do not have to rewrite your existing code; simply recognize that this is often the better choice.
 
