Output of max consecutive values/days

  • Fortran
  • Thread starter Mizarge
  • #1
Mizarge
Hello everyone,

I am working on a project with climate data and would like to write a program that finds consecutive days with no precipitation and, for each year, only the longest period without rainfall. As quite a newbie regarding Fortran, I have spent the last few days reading forums, but so far I have not managed to find the right approach. It would be great to find some help or maybe just hints here :)

The input is a .txt file that contains daily measurements of different values. The relevant ones are the date (messdat; YYYYMMDD) and the precipitation value (rsk). Looking at the period 1947-2016 (19470101 to 20161231), I wrote code that extracts all days without precipitation, i.e. those with rsk values of less than 0.1: [just ignore all other variables, they may eventually be used later]

Fortran:
program dry_period
   
    implicit none
   
   
   
    integer n
    parameter(n=100000)
    integer anzja,stat_id(n),messdat(n),qn3(n),qn4(n),rskf(n)
    integer anfp,endp,i, anzahl
    character path*2, file1*20, datafile*40, dummy*80
    real fx(n),fm(n),rsk(n),sdk(n),shkt(n),nm(n),vpm(n),pm(n)
    real tmk(n),upm(n),txk(n),tnk(n),tgk(n)

    ! anzja        = period of integration
    ! stat_id    = station ID
    ! messdat     = Date [yyyymmdd]
    ! qn3         = Qualitylevel of column
    ! qn4         = Qualitylevel of column
    ! rskf         = type of precipitation [0-9]
    ! fx         = daily maximum Wind [m/s]
    ! fm         = daily mean wind [m/s]
    ! rsk         = daily precipitation [mm]
    ! sdk         = daily sunshine duration [h]
    ! shkt         = daily snow depth [cm]
    ! nm         = mean cloud amount [1/8]
    ! vpm         = mean vapour pressure [hPa]
    ! pm         = mean air pressure [hPa]
    ! tmk         = mean temperature [°C]
    ! upm         = mean relative humidity [%]
    ! txk         = daily maximum of air temperature [°C] in 2m Height
    ! tnk         = daily minimum of air temperature [°C] in 2m Height
    ! tgk         = Minimum of air temperature [°C] in 5cm Height
    ! eor         = End of Record
    ! default    = -999   
       
    parameter    (anzja= 70,             ! 70 years: Integrationperiod 1947-2016)
     1             path='./',             ! Output path        
     1              file1='Erg_Trockenph.v1')    ! Output file for dry periods       
   
    data datafile /'1.txt'/

   
c -------- Columns of data ---------

    ! 01. column: stat_id
    ! 02. column: messdat
    ! 03. column: qn3
    ! 04. column: fx
    ! 05. column: fm
    ! 06. column: qn4
    ! 07. column: rsk
    ! 08. column: rskf
    ! 09. column: sdk
    ! 10. column: shkt
    ! 11. column: nm
    ! 12. column: vpm
    ! 13. column: pm
    ! 14. column: tmk
    ! 15. column: upm
    ! 16. column: txk
    ! 17. column: tnk
    ! 18. column: tgk

c -------- Defining Period for Program ---------

    write(*,*) 'insert start date:'
    write(*,*) 'Format YYYYMMDD, e.g. 19470101'
    read(*,*) anfp
    write(*,*) 'insert end date:'
    write(*,*) 'Format YYYYMMDD, e.g. 20161231'
    read(*,*) endp
       
    open(10, file=datafile)
   
   
   
c ------ skip header ----------

    read(10,*) dummy

c ----- skip data until beginning date -------

    do
    read(10,*,end=300) stat_id(1),messdat(1),qn3(1),fx(1),fm(1),
     1  qn4(1),rsk(1),rskf(1),sdk(1),shkt(1),nm(1),vpm(1),pm(1),tmk(1),
     1  upm(1),txk(1),tnk(1),tgk(1)
    if (messdat(1) .ge. anfp) exit
    end do
   
c ----- Read data until end date -----------------

c     record 1 was already read above, so at least one record is valid
    anzahl = 1
    do i=2,n
    read(10,*,end=300) stat_id(i),messdat(i),qn3(i),fx(i),fm(i),
     1  qn4(i),rsk(i),rskf(i),sdk(i),shkt(i),nm(i),vpm(i),pm(i),tmk(i),
     1  upm(i),txk(i),tnk(i),tgk(i)
         if (messdat(i) .gt. endp) exit
         anzahl = i
     end do
            
300    continue
    close(10)        
        

c ---- find days without rainfall
   
    open(21,file=path//file1)
c     start at 1: record 1 is the first day of the selected period
     do i=1,anzahl
      if (rsk(i).lt.0.1) then 
      write(21,*) stat_id(i), messdat(i), rsk(i)
      end if
     enddo
     close(21)
    
     write(*,*) 'days without rain written'
        
         end

So now I have a new file with the station ID (column 1), the date (column 2) and the precipitation value of every dry day (column 3). What I want to find is only the period with the most consecutive dry days in each year. The best case would be (written to a new file) something like this:

stat_id...[messdat = beginning of period]...[count of consecutive days without rainfall]

or maybe even just
stat_id - year - count

Unfortunately, my implementation fails, as I have no idea how to tie the values to the year only (ignoring MMDD), and especially how to filter the longest period out of the data. I read something about using
Fortran:
diff(rsk(i)) ==1
however, I'm not sure how to use that.

Maybe someone knows a preferably easy way to solve this or an idea what I could try. I think it is not that difficult, but I am missing the necessary knowledge :)
 
  • #2
I assume that your original data file had consecutive records, one for each day. If so, you should add logic to your second loop to detect consecutive days without rain. You will have to decide what to do about a dry period that is split between years.

An easy way to detect the year is to divide the date by 10000 and store it into an integer variable, year. Integer division will chop off (truncate) the lower 4 digits, leaving only the year digits.
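The integer division described above can be sketched as follows. This is a minimal free-form example, not part of the original program; the sample date and the extra `month` and `day` variables are assumptions added for illustration.

```fortran
program split_date
    implicit none
    integer :: messdat, year, month, day

    messdat = 19470315                  ! hypothetical date in YYYYMMDD form
    year  = messdat / 10000             ! integer division drops MMDD: 1947
    month = mod(messdat, 10000) / 100   ! keep MMDD, then drop DD: 3
    day   = mod(messdat, 100)           ! last two digits: 15

    print '(i4,1x,i2,1x,i2)', year, month, day
end program split_date
```

Because `messdat` and `10000` are both integers, the division truncates, so no rounding or conversion is needed.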

I also don't see how the diff( rsk(i) ) will help. (I'm not even sure what diff is.)
 
  • #3
Thank you for your help! Getting only the year with the division worked perfectly; that had not occurred to me.
However, I still have problems with the logic part. I added a count that is essentially just (i), as I thought it might be possible to pick out the values with the count that shows the most consecutive numbers, but my different approaches were unsuccessful. My idea now is something like this:
Fortran:
open(21,file=path//file1)
     do i=1,anzahl
      year = messdat(i)/10000

     year = anf/10000                     ! Starting with the first year
           do while (year .eq. anf/10000)   !Do while [the first year]
             if (rsk(i) .lt. 0.1 ) then               ! First day without rain -> count 1
             count=1
             end if

             if (rsk(i) .AND. rsk(i+1) .lt. 0.1) then     !Second and next day without rain -> count 2
             count=2
.
.                                                          ! The program should only keep the highest count
.
             
             end do

write(21,*) stat_id(i), messdat2(i),messdat(i),rsk(i),count
        endif
           year=year+1            !continue with the next year
     enddo
     close(21)
  
     write(*,*) 'dry periods written in data'

Is there an easier way to get the max dry period?
Maybe I can also use the count-argument, I am still figuring out and learning everything :rolleyes:
Therefore, further hints or help would be much appreciated.
 
  • #4
I don't think that your year=year+1 really does anything since year keeps being set at the top of the loop. It looks like it will repeat the logic for a year for every record of that year.

Assuming that the records are in chronological order, with one record for each day, I recommend that you:
1) only loop through the records once,
2) make some "state variables" (like todays_year, yesterdays_year, todays_rain, yesterdays_rain), that you update for each record and
3) create a section of code within the loop for each possibility (like: if ((todays_year .eq. yesterdays_year) .and. (todays_rain .eq. 0) .and. (yesterdays_rain .eq. 1)) then ... )

That will help you to keep the logic organized.

Don't forget to set the yesterday variables at the beginning so that the first record will be done correctly and to print/record results for the last year when the last record has been read.
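A minimal sketch of this single-pass approach, in free-form Fortran, might look like the following. Several things here are assumptions rather than part of the thread: the small hard-coded sample stands in for the data read from the file, a dry spell is cut off at the year boundary (one possible answer to the split-year question), negative values such as the -999 default are treated as missing and end a streak, and the names `run`, `run_start`, `best` and `best_start` are hypothetical.

```fortran
program longest_dry_spell
    implicit none
    integer, parameter :: n = 10
    ! Hypothetical sample: dates (YYYYMMDD) and daily precipitation [mm]
    integer :: messdat(n) = [20151228, 20151229, 20151230, 20151231, &
                             20160101, 20160102, 20160103, 20160104, &
                             20160105, 20160106]
    real :: rsk(n) = [0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0]
    integer :: i, todays_year, yesterdays_year
    integer :: run, run_start, best, best_start

    ! Initialise the "yesterday" state so the first record is handled correctly
    yesterdays_year = messdat(1) / 10000
    run = 0
    run_start = 0
    best = 0
    best_start = 0

    do i = 1, n
        todays_year = messdat(i) / 10000
        if (todays_year /= yesterdays_year) then
            ! Year changed: report the finished year, then start afresh
            print '(i4,1x,i8,1x,i4)', yesterdays_year, best_start, best
            run = 0
            best = 0
            best_start = 0
        end if
        ! Dry day: valid (non-negative) value below 0.1 mm
        if (rsk(i) >= 0.0 .and. rsk(i) < 0.1) then
            if (run == 0) run_start = messdat(i)
            run = run + 1
            if (run > best) then
                best = run
                best_start = run_start
            end if
        else
            run = 0
        end if
        yesterdays_year = todays_year
    end do

    ! Do not forget the last year after the final record
    print '(i4,1x,i8,1x,i4)', yesterdays_year, best_start, best
end program longest_dry_spell
```

For the sample data this reports a longest dry spell of 2 days starting 20151228 for 2015 and of 3 days starting 20160101 for 2016. If a spell should instead be allowed to run across New Year, the reset of `run` at the year change would have to be dropped and the attribution to a year decided separately.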
 
  • #5
Just a point. Think about how humans might solve it.
You'd write down the date, write maybe "R" for rain "N" for no rain, then look back to the previous line and decide:
If it changed, i.e. from N to R or the other way around, then
write 1 (we'll call that the sequence number)
else
take the previous line's number and add one, write new sequence number

Do that for every line in the data file.
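The pencil-and-paper procedure above could be sketched like this in free-form Fortran. The sample precipitation values, the 0.1 mm threshold taken from the first post, and the array names `rain` and `seq` are assumptions for illustration.

```fortran
program sequence_numbers
    implicit none
    integer, parameter :: n = 8
    ! Hypothetical daily precipitation values [mm]
    real :: rsk(n) = [0.0, 0.0, 2.1, 0.4, 0.0, 0.0, 0.0, 5.0]
    character(len=1) :: rain(n)
    integer :: seq(n), i

    do i = 1, n
        ! 'N' = no rain (below 0.1 mm), 'R' = rain
        rain(i) = merge('N', 'R', rsk(i) < 0.1)
        if (i == 1) then
            seq(i) = 1                    ! first line always starts a streak
        else if (rain(i) /= rain(i-1)) then
            seq(i) = 1                    ! state changed: new streak
        else
            seq(i) = seq(i-1) + 1         ! same state: extend the streak
        end if
        print '(a1,1x,i3)', rain(i), seq(i)
    end do
end program sequence_numbers
```

The first-line test is kept in its own branch because Fortran does not guarantee short-circuit evaluation, so a combined condition could read `rain(0)`.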

At the end, notice that every sequence number equal to 1 (except on the very first line) marks the start of a new streak, so the line before it is the end of the previous one.
You can store the data together with the sequence numbers in an array of Fortran RECORDs (a legacy extension that gfortran supports) or of derived types:
https://gcc.gnu.org/onlinedocs/gfortran/STRUCTURE-and-RECORD.html

Fortran:
STRUCTURE /item/
  INTEGER sequence
!  rain is R or N
  CHARACTER(LEN=2) rain
  CHARACTER(LEN=12) date
  REAL precip
END STRUCTURE

! Define two variables: a single record of type ``item'' ("work")
! and an array "rainlog" which can hold up to 70 years' worth of daily events
RECORD /item/ work, rainlog(25570)

Now all you have to do is loop and check for a 1 in sequence; back up one array element, and its sequence is the number of days of rain/no rain in the streak that just ended. If you need precipitation totals, you can sum them up the same way you accumulated the sequence numbers.

This is called an online algorithm: it processes a data stream one item at a time and organizes it as it goes. It is one of the standard approaches for this kind of problem.
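As a rough sketch of that scan, the example below uses a standard derived `type` instead of the STRUCTURE/RECORD extension (which, as far as I know, gfortran only accepts with `-fdec-structure`). The sample data, the component name `seq` (used in place of `sequence`), the integer `date` and the variable `longest` are assumptions, not taken from the post.

```fortran
program streak_scan
    implicit none
    type :: item
        integer :: seq            ! position within the current streak
        character(len=1) :: rain  ! 'R' = rain, 'N' = no rain
        integer :: date           ! YYYYMMDD
        real :: precip            ! daily precipitation [mm]
    end type item

    integer, parameter :: n = 8
    type(item) :: rainlog(n)
    ! Hypothetical sample data
    integer :: dates(n) = [20160101, 20160102, 20160103, 20160104, &
                           20160105, 20160106, 20160107, 20160108]
    real :: mm(n) = [0.0, 0.0, 2.1, 0.4, 0.0, 0.0, 0.0, 5.0]
    integer :: i, longest

    ! Fill the log and number each day within its streak
    do i = 1, n
        rainlog(i)%date = dates(i)
        rainlog(i)%precip = mm(i)
        rainlog(i)%rain = merge('N', 'R', mm(i) < 0.1)
        if (i == 1) then
            rainlog(i)%seq = 1
        else if (rainlog(i)%rain /= rainlog(i-1)%rain) then
            rainlog(i)%seq = 1
        else
            rainlog(i)%seq = rainlog(i-1)%seq + 1
        end if
    end do

    ! A streak ends at element i-1 when element i restarts at 1,
    ! or when i-1 is the last element (i = n + 1)
    longest = 0
    do i = 2, n + 1
        if (i <= n) then
            if (rainlog(i)%seq /= 1) cycle
        end if
        if (rainlog(i-1)%rain == 'N') longest = max(longest, rainlog(i-1)%seq)
    end do

    print '(a,i0)', 'longest dry streak: ', longest
end program streak_scan
```

With the sample values the longest dry streak is 3 days. Restricting the scan to one year at a time would give the per-year maximum asked for in the first post.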
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm

Why did I do this? Your code has many variables; "rainlog" keeps track of everything for ~70 years' worth of data. You do not have to rewrite your existing code; simply recognize that this is often a better choice.
 

What is the output of max consecutive values/days?

The output of max consecutive values/days refers to the maximum number of consecutive values or days in a given data set. This can also be interpreted as the longest streak of consecutive values or days.

How is the output of max consecutive values/days calculated?

The output is calculated by identifying the longest streak of consecutive values or days in a given data set. This can be done manually by looking at the data or through computer algorithms.

Why is the output of max consecutive values/days important?

The output is important because it can provide insights into patterns and trends in the data. It can also help in identifying outliers or abnormalities in the data.

Can the output of max consecutive values/days vary?

Yes. The result depends on the data set and on the definitions used, for example the threshold below which a day counts as dry and whether a streak that spans a year boundary is split between the two years.

How can the output of max consecutive values/days be used in research or analysis?

The output can be used to track and analyze trends, identify potential outliers, and make predictions based on past patterns. It can also be used to compare data sets and identify differences or similarities in consecutive streaks.
