
Fortran, reading a particular column of a matrix, allocatable

  1. May 31, 2013 #1

    fluidistic

    Gold Member

    I have a program that builds a histogram from data read from a file, but it only works if the data is a single column (an nx1 matrix).
    I've downloaded a data file that is a 159663x12 matrix, but only the 8th column contains numbers I'm interested in. The other columns contain letters and/or numbers.
    So I cannot read the data file and assign, say, a(i) to the values in the first column, because those values might not even be numbers.
    What I want is something like
    Code (Text):
    read(11,*) , , , , , , ,x_i, , , ,
    where I don't want to read the blank entries.
    I'm sure there's a way to do this, but despite reading what I could find on Google, it's not at all clear to me how to do it.
    I'll probably have further questions about allocatable arrays later.
    Thanks for any help.


    Edit: To complicate things, some if not most columns have no values for a particular row. For example here is how the file starts:
    Code (Text):
    10207538       A E M, Doshtagir                                             BAN M                            1864  0   30 0000  i  
    10206612       A K M, Sourab                                                BAN M                            1714  0   30 0000  i  
    5045886        A K, Kalshyan                                                IND M                            1943  9   15 1964      
    8605360        A La, Teng Hua                                               CHN F                            1915  0   30 1993  wi  
    5031605        A, Akshaya                                                   IND F                            2014  0   15 1994  w  
    5080444        A, Sohita                                                    IND F                            1447  0   30 1995  wi  
    5706068        A. Nashir, Mohd Khairul Nazrin                               MAS M                            1878  0   30 0000  i  
    10201971       A.f.m., Mahfuzul Haque                                       BAN M                            1758  0   15 0000      
    10202650       A.k. Azad, Akand                                             BAN M                            1692  0   30 0000  i  
    10210997       A.K.M. Mehfuz                                                BAN M                            2015  0   30 0000  i  
    24663832       Aab, Manfred                                                 GER M                            1758  0   15 1963      
    1701991        Aaberg, Anton                                                SWE M                            2372  0   10 1972      
    1513966        Aabid, Ryaad                                                 NOR M                            1698  0   15 1958      
    1407589        Aabling-Thomsen, Jakob                                       DEN M   FM                       2336  7   15 1985      
    12524670       Aadeli, Arvin                                                IRI M                            2037  5   15 0000      
    5072662        Aadhityaa, M                                                 IND M                            1893  0   15 1999      
    25034677       Aadish S                                                     IND M                            1528  0   15 1999      
    25088955       Aadit Bhatia                                                 IND M                            1544  0   30 2002      
    5086183        Aaditt, M K                                                  IND M                            1610  0   30 1996  i  
    25083821       Aaditya jain                                                 IND M                            1297  0   30 2000      
    5027942        Aaditya, Jagadeesh                                           IND M                            1811  0   15 1998      
    25011952       Aadityan G                                                   IND M                            1627  0   15 2001      
    5063485        Aadityan, N.                                                 IND M                            1758  0   15 1996      
    1431692        Aae, Jesper                                                  DEN M                            1881  0   30 1954      
    1427024        Aagaard, Gert                                                DEN M                            2049  7   15 1966      
    1401815        Aagaard, Jacob                                               DEN M   GM        FST            2519  0   10 1973      

    Here is my code:
    Code (Text):
    program histogram
    implicit none

    real :: mean
    integer :: x_i ,i  ,k  
    integer, dimension(0:3000) :: icount
    character (5):: a,b,c,d,e,f,g,h,j,l,m
    i=1

    open(12,file='histchess.txt',status='old')
    open(11,file='crat.txt', status='old')

    icount=0
    do while (i.le.159663)
    read(11,*)a,b,c,d,e,f,g,x_i,h,j,l,m
    icount(x_i)=icount(x_i)+1



    i=i+1
    end do

        do k=0,3000,1
        write(12,*)k,icount(k)
        end do

    close(11)
    close(12)



    end program
    It compiles but when I execute it, I get the error
    Code (Text):
    At line 15 of file chesshistogram.f90 (unit = 11, file = 'crat.txt')
    Fortran runtime error: Bad integer for item 1 in list input
     
     
    Last edited: May 31, 2013
  3. May 31, 2013 #2
    real :: mean
    integer i,k
    integer,allocatable, dimension(:) :: x_i
    character (5):: a,b,c,d,e,f,g,h,j,l,m


    open(12,file='histchess.txt',status='old')
    open(11,file='crat.txt', status='old')

    i=0
    do
    read(11,*,end = 200)
    i=i+1
    end do

    200 allocate(x_i(i))

    rewind(11)

    do k=1,i
    read(11,*)a,b,c,d,e,f,g,x_i,h,j,l,m
    write(12,*)k,x_i(k)
    end do

    close(11)
    close(12)

    end program


    Could you try this code and let me know if it works?
     
    Last edited: May 31, 2013
  4. May 31, 2013 #3

    Mark44

    Staff: Mentor

    Your variables a through g are 5-character strings. The very first item you attempt to read is 10207538, which contains 8 characters. That's a problem.

    I think that you will need to use formatted input rather than list-directed input (the * in your read statement) to make sure that specific items in your table go into the appropriate variables. For more information, see http://www.oc.nps.edu/~bird/oc3030_online/fortran/io/io.html or any other Fortran docs on the read statement. The section on Format Descriptors in that link might be helpful.
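
    As a minimal sketch of the difference (the sample line and the variable names below are made up for illustration and do not reflect the real file layout): list-directed input treats blanks and commas as separators, so a name like "A, Akshaya" is split into two items, whereas a format reads fixed column ranges regardless of what they contain.

    ```fortran
    program list_vs_formatted
        implicit none
        ! Hypothetical sample record: id in columns 1-7, name in columns 10-23
        character(len=40) :: line
        character(len=12) :: a, b, c
        character(len=20) :: name
        integer           :: id

        line = '5031605  A, Akshaya    IND'

        ! List-directed read: blanks and commas both end an item,
        ! so the name is split across b and c.
        read (line, *) a, b, c
        print *, trim(a), '|', trim(b), '|', trim(c)

        ! Formatted read: I7 takes columns 1-7, 2X skips two columns,
        ! A14 takes columns 10-23 as one field, comma included.
        read (line, '(I7, 2X, A14)') id, name
        print *, id, '|', trim(name)
    end program list_vs_formatted
    ```

    Here the list-directed read should place "A" and "Akshaya" in separate variables, while the formatted read keeps "A, Akshaya" together because it takes a fixed range of columns as one field.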
     
  5. May 31, 2013 #4

    fluidistic

    Gold Member

    I tried it; it returned an error similar to the one I got before:
    Code (Text):
    Fortran runtime error: Bad integer for item 5 in list input
    Like Mark44 said, it probably has to do with the format.

    OK, I will read this and try the format statement. I don't really understand why my variables "a" through "g" are 5-character strings.
    Edit: never mind, that's because I declared them as character(5). I tried several other lengths and got the same error (even with character(9)).
     
  6. May 31, 2013 #5

    fluidistic

    Gold Member

    I've read about format but I'm still confused.
    I tried to sidestep the problem by removing columns 1 to 7 and 9 to 12 with awk:
    Code (Text):
    awk '{for(i=1;i<=NF;++i) if (i != 1 && i!=2 && i!=3 && i!=4 && i!=5 && i!=6 && i!=7 && i!=9 && i!=10 && i!=11 && i!=12) printf("%s ", $i);  printf("\n"); }' crat.txt >crat4.txt
     
    The new .txt file did indeed have only one column left, but not the one I wanted. It even contained letters, i.e. elements of other columns. Here is a small part of it:
    Code (Text):
    30
    0
    0
    WF w
     
    0
    15
    15
    30
    30
    15
    0
    6
    0
    1827
    0
    30
    6
    0
    30
    30
    CM
     
    15
    30
    15
    0
    8
    0
    18
    0
    1513
     
    30
    15
    10
    15
    0
    15
    2347
     
    2131
    WF w
     
    0
    0
    15
    30
    30
    This leads me to think that Fortran won't be able to make sense of the columns of the .txt file either, even if I tell it to read only column 8.
    I guess I chose the wrong data file to download. Too complicated.
     
  7. May 31, 2013 #6

    Mark44

    Staff: Mentor

    I would have the program read everything in the text file, but discard most of it, leaving only what you want.

    Here are the first couple of lines of your text file:
    Code (Text):

    10207538     A E M, Doshtagir     BAN M  1864  0   30 0000 i  
    10206612     A K M, Sourab        BAN M  1714  0   30 0000 i
    I removed a bunch of spaces so the lines weren't so long.

    You need to look at the text file to see how it is arranged. Each line seems to consist of a number of fields, some of which can be null, and most of which you don't care about. One possible solution would be to read in everything up to the number you want (1864 in the first line and 1714 in the second line) as one variable, read in the number as your second variable, and then read in everything else as your third variable.

    The statements to do this would look something like this:
    Code (Text):
        read (11, 100) Str1, Number, Str2
    100 format (A50, I5, A30)
    Str1 and Str2 would be declared as character string variables (character(len=50) and character(len=30)), and Number would be declared as an integer variable. I am assuming that the number you want starts in column 51 and is no more than 5 digits. Unit 11 is the unit your program opens crat.txt on.
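
    One way to try this idea without the data file is an internal read from a string. The sketch below assumes the same guessed layout (50 characters, then a 5-wide integer, then 30 characters); the sample record is built by hand for illustration and does not match the real column positions in crat.txt.

    ```fortran
    program fixed_width_demo
        implicit none
        character(len=85) :: line
        character(len=50) :: str1
        character(len=30) :: str2
        integer           :: number

        ! Build a hypothetical record: text in columns 1-50,
        ! the integer in columns 51-55, more text in columns 56-85.
        line        = ''
        line(1:50)  = '10207538  A E M, Doshtagir  BAN M'
        line(51:55) = ' 1864'
        line(56:85) = '  0   30 0000  i'

        ! Same format as the file read above, applied to the string.
        ! Each variable is a scalar, so each one matches exactly one descriptor.
        read (line, '(A50, I5, A30)') str1, number, str2

        print *, 'rating = ', number
    end program fixed_width_demo
    ```

    Once the widths are adjusted to the real column positions, the same format can be used with read (11, ...) on the file itself.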
     
  8. May 31, 2013 #7

    fluidistic

    Gold Member

    Thanks a lot Mark44, I'm going to check this out really seriously as soon as I can. Meanwhile, I was able to manually delete all columns but the 8th using Kate. I've plotted the column I'm interested in with gnuplot. I took a screenshot so you can see.
     

    Attached Files:

  9. May 31, 2013 #8

    Mark44

    Staff: Mentor

    As they say, there's more than one way to skin a cat.

    What are you plotting? Is it the number of chess games won by the individuals listed in the file? I'm guessing that the three-letter fields are the country, with DEN = Denmark and IND = India. If so, what do BAN and MAS mean? (If you know.)
     
  10. May 31, 2013 #9

    fluidistic

    Gold Member

    I'm plotting the Elo ratings on the x-axis versus the number of people with that Elo rating on the y-axis. The data is for May of this year and apparently contains the names of all active players who are "members" of FIDE. Anyone can download the file from FIDE's website; it's around 24 Mb, I think.
    As for the countries, you are right. By looking at the names, I'm guessing BAN stands for Bangladesh, but I don't know about MAS. Probably Arabic, judging by the names.
     
  11. Jun 1, 2013 #10
    That's data in fixed column positions and of known size, so it's an easy job:

    Code (Text):
    !----------------------------------------------------------------------------80
    !> @file elo_read.f90
    !! sample for reading data of known size given in known fixed tab format
    !------------------------------------------------------------------------------

    program elo_read
        implicit none
        integer,            parameter :: FD     = 11
        character(len=*),   parameter :: FN     = 'elo_data.txt' ! whatever
        !integer,            parameter :: M      = 159663  ! number of rows
        integer,            parameter :: M      = 26      ! number of rows
        integer,            parameter :: HEAD   = 109     ! chars before int datum
        integer,            parameter :: TAIL   = 16      ! chars after int datum

        character(len=13)             :: sfmt
        character(len=HEAD)           :: head_buffer
        character(len=TAIL)           :: tail_buffer ! just for completeness sake
        integer                       :: data_(M)
        integer                       :: stat, i

        ! building the format string from parameters
        write(sfmt, fmt="(A2,I3,A5,I2,A1)") "(A", HEAD, ",I4,A" , TAIL, ")"
        open(unit=FD, file=FN, status='OLD', err=110, iostat=stat)

        data_ = 0

        i = 1
        do while (stat == 0 .and. i <= M)
            read(FD, fmt=sfmt, err=120, iostat=stat) &
                                head_buffer, data_(i), tail_buffer
            i = i + 1
        end do

        write(*,*) data_ ! or whatever processing is required

        goto 999
    110 stop "OPEN" !> @TODO implement more specific error handling
    120 stop "READ" !> @TODO implement more specific error handling
    999 stop
    end program elo_read
     
    Last edited: Jun 1, 2013
  12. Jun 1, 2013 #11

    fluidistic

    Gold Member

    I see. What does A30 stand for though?
    I've the following code to try this way:
    Code (Text):
    program histogram
    implicit none

    !real :: mean
    integer :: x_i ,j  ,k  
    integer, dimension(0:3000) :: icount
    character(5), dimension(7):: Str1
     character(5), dimension(4):: Str2
    j=1

    open(12,file='histchess.txt',status='old')
    open(11,file='crat.txt', status='old')

    100 format (A109, I5, A30)
    icount=0
    do while (j.le.159663)

    read (11, 100) Str1, x_i, Str2

    icount(x_i)=icount(x_i)+1



    j=j+1
    end do

        do k=0,3000,1
        write(12,*)k,icount(k)
        end do

    close(11)
    close(12)

    end program
    But I'm getting the error
    Code (Text):
    At line 18 of file chesshistogram2.f90 (unit = 11, file = 'crat.txt')
    Fortran runtime error: Expected INTEGER for item 2 in formatted transfer, got CHARACTER
    (A109, I5, A30)
           ^
     

    Thanks a lot for your time. The code is a bit over my head. I mean I can mainly understand it but I couldn't write it on my own.
     
  13. Jun 1, 2013 #12
    You're welcome.

    The code simply reads the leading 109 characters and discards them, then reads the 4-character representation of the Elo integer and stores it as an int in an array of ints ("data_") for further processing.

    The only slightly tricky part is building the format string for the read() more or less dynamically. I do this to get the most out of the constants declared with the "parameter" attribute, so that those values are not buried deep inside a format statement or, worse, duplicated elsewhere in the code.
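
    For the histogram itself, the same fixed-column idea can be sketched as below. It assumes the rating sits in columns 110 to 113 (HEAD = 109 followed by I4, as in the code in post #10) and reuses the file names and the 0:3000 range from the original program; whether those columns are right for the real file needs checking. Each line is read whole into a scalar string and the rating is then read from a substring. A buffer declared as character(5), dimension(7), as in post #11, would behave differently: every array element is a separate input item, so the second element gets matched against the I5 descriptor.

    ```fortran
    program elo_histogram
        implicit none
        integer, parameter :: MAX_ELO = 3000
        integer            :: icount(0:MAX_ELO)
        integer            :: rating, k, stat
        character(len=200) :: line   ! assumed to be longer than any record

        icount = 0

        open(11, file='crat.txt', status='old', action='read')
        do
            ! Read the whole record as text; stop at end of file or on error.
            read(11, '(A)', iostat=stat) line
            if (stat /= 0) exit

            ! Assumption: the rating occupies columns 110-113.
            read(line(110:113), '(I4)', iostat=stat) rating
            if (stat /= 0) cycle     ! e.g. a header line with text in those columns

            if (rating >= 0 .and. rating <= MAX_ELO) then
                icount(rating) = icount(rating) + 1
            end if
        end do
        close(11)

        open(12, file='histchess.txt', status='replace', action='write')
        do k = 0, MAX_ELO
            write(12, *) k, icount(k)
        end do
        close(12)
    end program elo_histogram
    ```

    Reading until end of file avoids hard-coding the 159663 rows. Note that an all-blank rating field is read as 0 by default, so such rows would be counted in bin 0.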
     