How do I find an exact line in a large file in Fortran90?

SUMMARY

The discussion centers on reading a specific matrix from a file in Fortran90, specifically after locating the line "OVERLAP MATRIX - CELL N. 1( 0 0 0)". Users suggest using a loop to read lines until the target line is found, followed by reading the matrix data. A sample code snippet is provided, but issues arise with string comparison and line reading. The consensus is to ensure exact matches for the target string and to streamline the code for clarity and functionality.

PREREQUISITES
  • Fortran90 programming language
  • File I/O operations in Fortran
  • String manipulation and comparison in Fortran
  • Matrix representation in Fortran
NEXT STEPS
  • Review Fortran90 file handling techniques
  • Learn about string comparison methods in Fortran
  • Explore matrix operations in Fortran, including diagonalization
  • Investigate debugging techniques for Fortran code
USEFUL FOR

Physics students, Fortran programmers, and anyone involved in data processing and matrix manipulation in scientific computing.

kranthi4689
I'm new to programming. I have been working with Fortran90 for my physics project and I have to read data from a file. I need to find a specific matrix, print it to a different file, and diagonalize it. How do I read the matrix that follows a specific line in a file? My file looks like this:

OVERLAP MATRIX - CELL N. 1( 0 0 0)

1 2 3 4 5 6 ...
1 1.0000E+00
2 6.5891E-01 1.0000E+00
3 0.0000E+00 0.0000E+00 1.0000E+00
4 0.0000E+00 0.0000E+00 0.0000E+00 1.0000E+00
5 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.0000E+00
6 0.0000E+00 0.0000E+00 6.7373E-01 0.0000E+00 0.0000E+00 1.0000E+00
7 0.0000E+00 0.0000E+00 0.0000E+00 6.7373E-01 0.0000E+00 0.0000E+00 ...
......

If the line "OVERLAP MATRIX - CELL N." is found, I need to read the matrix below it. How can I achieve this in Fortran? Thank you.
 
From the looks of your data, you can use the line number at the beginning of each line (is that something you can count on?) to loop through a "do-nothing" read loop to the line you want and then start the real reading of the matrix.
 
FactChecker said:
From the looks of your data, you can use the line number at the beginning of each line (is that something you can count on?) to loop through a "do-nothing" read loop to the line you want and then start the real reading of the matrix.
No. The line number can vary from file to file, so I cannot use it.
 
You need to read in a line as a string, then compare with the line you are looking for. You loop over that until a match is read, then you can start reading in the matrix.
 
Can you assume that the line you want is unique in the file? Or at least that it is the first occurrence of that text line? You might be surprised at how often things go wrong in a file of data.
 
DrClaude said:
You need to read in a line as a string, then compare with the line you are looking for. You loop over that until a match is read, then you can start reading in the matrix.
I did that using code from Stack Overflow, but it is not working:
Fortran:
program open_file
implicit none
integer ::ios
character (len =39) :: str_name
character , allocatable :: command(:)
character (len=200) :: line
integer :: n,i
str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='InAs_bulk_lanl2dz.outp',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
    n = 0
    do
        read(10, '(A)', iostat=ios) line
        if (ios/= 0 ) exit
    n=n+1
    end do
print*, "File contains ", n, "commands"
print*, str_name , len_trim(str_name)
    allocate(command(n))
    rewind(10)
    do i = 1, n
        read(10,'(A)') command(i)
    end do
    close(10)
    do i=1,n
    if (trim(command(i)) /= trim(str_name)) then
    print *, "target"
        else
            !print*, command(i), str_name
            print*, "not found"
    endif
enddo
ENDPROGRAM

Any help would be appreciated.
 
FactChecker said:
Can you assume that the line you want is unique in the file? Or at least that it is the first occurrence of that text line? You might be surprised at how often things go wrong in a file of data.
Yes, the line is unique in the file.
 
kranthi4689 said:
Yes, the line is unique in the file.
Good. Then you are safe in using @DrClaude 's recommendation.
 
FactChecker said:
Good. Then you are safe in using @DrClaude 's recommendation.
I did try that approach, but for some reason I cannot match that string. I posted the code above. If possible, could you look into it?
 
  • #10
There are many problems with the code. This first part seems to only count the number of lines which is completely unnecessary.
You need something like

Fortran:
implicit none
integer ::ios
character (len =39) :: str_name
character (len=200) :: line
str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='InAs_bulk_lanl2dz.outp',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
    do
        read(10,'(A)') line
        if (line(1:39) == str_name) exit
    end do
! continue with reading the matrix

You have to make sure that str_name matches exactly the line in the file.
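
As written, the loop above stops with a runtime error if the file ends before the header is found, because the read has no iostat. Below is a minimal sketch of the same search with end-of-file handling. The file name overlap_demo.dat and the lines written to it are made up so that the example runs on its own; with real data, drop the first open/write/close group and point fname at the actual output file.

```fortran
program find_header
    implicit none
    character(len=*), parameter :: fname  = 'overlap_demo.dat'   ! hypothetical file name
    character(len=*), parameter :: header = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
    character(len=200) :: line
    integer :: ios
    logical :: found

    ! Write a tiny stand-in file (assumed contents) so the sketch is self-contained.
    open(unit=10, file=fname, status='replace', action='write')
    write(10, '(A)') 'SOME OTHER OUTPUT'
    write(10, '(A)') header
    write(10, '(A)') '   1  1.0000E+00'
    close(10)

    open(unit=10, file=fname, status='old', action='read', iostat=ios)
    if (ios /= 0) stop 'Error opening input file'

    found = .false.
    do
        read(10, '(A)', iostat=ios) line
        if (ios /= 0) exit                        ! end of file or read error
        if (line(1:len(header)) == header) then   ! same test as in the post above
            found = .true.
            exit
        end if
    end do

    if (found) then
        ! The file is now positioned on the line after the header.
        read(10, '(A)', iostat=ios) line
        if (ios == 0) print *, 'Header found; next line: ', trim(line)
    else
        print *, 'Header not found'
    end if
    close(10)
end program find_header
```

The found flag separates "header located" from "ran out of file", which the original loop cannot tell apart.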
 
  • #11
DrClaude said:
There are many problems with the code. This first part seems to only count the number of lines which is completely unnecessary.
You need something like

Fortran:
implicit none
integer ::ios
character (len =39) :: str_name
character (len=200) :: line
str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='InAs_bulk_lanl2dz.outp',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
    do
        read(10,'(A)') line
        if (line(1:39) == str_name) exit
    end do
! continue with reading the matrix

You have to make sure that str_name matches exactly the line in the file.
I tried to read the line, but the program does not find it. I did exactly as you said.
Fortran:
program read_mat
implicit none
integer ::ios,i
character (len =39) :: str_name
character (len=1000) :: line
str_name='OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='data.dat',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
    do
        read(10,'(A)',end=100) line
        if (line(1:39) == str_name) then
        write(*,*) "found line"
        endif
    end do
100 close(10)
! continue with reading the matrix
end program read_mat
My file is:
OVERLAP MATRIX - CELL N. 1( 0 0 0)

1 2 3 4 5 6 7 8 9 10

1 1.0000E+00
2 6.5891E-01 1.0000E+00
3 0.0000E+00 0.0000E+00 1.0000E+00
4 0.0000E+00 0.0000E+00 0.0000E+00 1.0000E+00
5 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.0000E+00
6 0.0000E+00 0.0000E+00 6.7373E-01 0.0000E+00 0.0000E+00 1.0000E+00
7 0.0000E+00 0.0000E+00 0.0000E+00 6.7373E-01 0.0000E+00 0.0000E+00 1.0000E+00
8 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 6.7373E-01 0.0000E+00 0.0000E+00 1.0000E+00
9 3.5521E-02 1.2488E-01 -9.4002E-02 9.4002E-02 9.4002E-02 -1.7531E-01 1.7531E-01 1.7531E-01 1.0000E+00
10 1.1830E-01 2.7534E-01 -1.8038E-01 1.8038E-01 1.8038E-01 -3.1590E-01 3.1590E-01 3.1590E-01 6.8256E-01 1.0000E+00
11 6.2932E-02 9.8027E-02 -4.8573E-02 1.1828E-01 1.1828E-01 4.4339E-02 9.1732E-02 9.1732E-02 0.0000E+00 0.0000E+00
12 -6.2932E-02 -9.8027E-02 1.1828E-01 -4.8573E-02 -1.1828E-01 9.1732E-02 4.4339E-02 -9.1732E-02 0.0000E+00 0.0000E+00
13 -6.2932E-02 -9.8027E-02 1.1828E-01 -1.1828E-01 -4.8573E-02 9.1732E-02 -9.1732E-02 4.4339E-02 0.0000E+00 0.0000E+00
14 1.8034E-01 2.6410E-01 5.8440E-04 2.1454E-01 2.1454E-01 1.7823E-01 1.9997E-01 1.9997E-01 0.0000E+00 0.0000E+00
15 -1.8034E-01 -2.6410E-01 2.1454E-01 5.8440E-04 -2.1454E-01 1.9997E-01 1.7823E-01 -1.9997E-01 0.0000E+00 0.0000E+00
16 -1.8034E-01 -2.6410E-01 2.1454E-01 -2.1454E-01 5.8440E-04 1.9997E-01 -1.9997E-01 1.7823E-01 0.0000E+00 0.0000E+00

11 12 13 14 15 16
11 1.0000E+00
12 0.0000E+00 1.0000E+00
13 0.0000E+00 0.0000E+00 1.0000E+00
14 6.8765E-01 0.0000E+00 0.0000E+00 1.0000E+00
15 0.0000E+00 6.8765E-01 0.0000E+00 0.0000E+00 1.0000E+00
16 0.0000E+00 0.0000E+00 6.8765E-01 0.0000E+00 0.0000E+00 1.0000E+00

FOCK MATRIX - CELL N. 1( 0 0 0)
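
For the layout shown in this listing (lower triangle only, printed in column blocks, each row starting with its row number), one way to read the matrix once the header has been found is sketched below. This is a sketch under assumptions, not a tested reader for the real output: the dimension n and the block width nblk are taken as known (16 and 10 in the listing above, 3 and 2 in the small stand-in file written here), and the file name and the numbers in the stand-in file are made up.

```fortran
program read_overlap
    implicit none
    integer, parameter :: n = 3       ! matrix dimension, assumed known (16 in the listing above)
    integer, parameter :: nblk = 2    ! columns per block (10 in the listing above)
    character(len=*), parameter :: fname = 'overlap_demo.dat'   ! hypothetical file name
    character(len=200) :: line
    real :: s(n, n)
    integer :: ios, i, j, j0, j1, idx

    ! Write a small stand-in file in the same layout (values are invented).
    open(unit=10, file=fname, status='replace', action='write')
    write(10, '(A)') 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
    write(10, '(A)') ''
    write(10, '(A)') '           1           2'
    write(10, '(A)') '   1  1.0000E+00'
    write(10, '(A)') '   2  6.5891E-01  1.0000E+00'
    write(10, '(A)') '   3  0.0000E+00  2.5000E-01'
    write(10, '(A)') ''
    write(10, '(A)') '           3'
    write(10, '(A)') '   3  1.0000E+00'
    close(10)

    open(unit=10, file=fname, status='old', action='read', iostat=ios)
    if (ios /= 0) stop 'Error opening input file'

    ! Skip forward to the header; a substring match avoids counting blanks.
    do
        read(10, '(A)', iostat=ios) line
        if (ios /= 0) stop 'Header not found'
        if (index(line, 'OVERLAP MATRIX') > 0) exit
    end do

    s = 0.0
    j0 = 1
    do while (j0 <= n)
        j1 = min(j0 + nblk - 1, n)
        call next_nonblank()          ! column-number line, not needed
        do i = j0, n
            call next_nonblank()
            ! Row label, then the entries of row i that fall inside this block.
            read(line, *, iostat=ios) idx, (s(i, j), j = j0, min(i, j1))
            if (ios /= 0 .or. idx /= i) stop 'Unexpected matrix layout'
        end do
        j0 = j1 + 1
    end do
    close(10)

    ! Only the lower triangle is in the file; mirror it into the upper one.
    do i = 1, n
        do j = i + 1, n
            s(i, j) = s(j, i)
        end do
    end do

    do i = 1, n
        print '(3ES12.4)', s(i, :)
    end do

contains

    ! Read lines from unit 10 into "line" until a non-blank one is found.
    subroutine next_nonblank()
        do
            read(10, '(A)', iostat=ios) line
            if (ios /= 0) stop 'Unexpected end of file'
            if (len_trim(line) > 0) exit
        end do
    end subroutine next_nonblank

end program read_overlap
```

Each line is read into a string first and then parsed with a list-directed internal read, so the exact column widths of the output do not matter. The check idx /= i catches the case where the assumed n or nblk does not match the file.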
 
  • #12
I see no line in your file that says
"'OVERLAP MATRIX - CELL N. 1( 0 0 0)"
but I do see
"OVERLAP MATRIX - CELL N. 1( 0 0 0)"
 
  • #13
anorlunda said:
I see no line in your file that says
"'OVERLAP MATRIX - CELL N. 1( 0 0 0)"
but I do see
"OVERLAP MATRIX - CELL N. 1( 0 0 0)"
Sorry, I don't understand. What is it that's missing?
 
  • #14
Your code is hard to follow due to your inconsistent indentation. I have added comment lines with numbers in your code that are tied to my further comments below your code.
Fortran:
program open_file
implicit none
integer ::ios
character (len =39) :: str_name
character , allocatable :: command(:)
character (len=200) :: line
integer :: n,i
str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='InAs_bulk_lanl2dz.outp',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
    n = 0                                          ! 1 Why are this line and the following lines indented?
    do                                             ! 2 The do loop is not part of the if statement, but the indentation suggests that it is.
        read(10, '(A)', iostat=ios) line
        if (ios/= 0 ) exit
    n=n+1                                          ! 3 The line should be indented the same as the line above it.
    end do
print*, "File contains ", n, "commands"
print*, str_name , len_trim(str_name)
    allocate(command(n))                           ! 4 Why is this line indented?
    rewind(10)
    do i = 1, n
        read(10,'(A)') command(i)
    end do
    close(10)
    do i=1,n
    if (trim(command(i)) /= trim(str_name)) then   ! 5 Should be indented
    print *, "target"
        else                                       ! 6 Else should be aligned with the if statement it matches
            !print*, command(i), str_name
            print*, "not found"
    endif
enddo
ENDPROGRAM                                         ! 7 I'm not sure this will compile without a space between end and program

Here is how I would lay out your program to make it more readable:
Fortran:
program open_file
implicit none
integer ::ios
character (len =39) :: str_name
character , allocatable :: command(:)
character (len=200) :: line
integer :: n,i
str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'
open(unit=10,FILE='InAs_bulk_lanl2dz.outp',iostat=ios)
if ( ios /= 0 ) stop "Error opening inputfile"
n = 0
do
    read(10, '(A)', iostat=ios) line
    if (ios/= 0 ) exit
    n=n+1
end do
print*, "File contains ", n, "commands"
print*, str_name , len_trim(str_name)
allocate(command(n))
rewind(10)
do i = 1, n
    read(10,'(A)') command(i)
end do
close(10)
do i=1,n
    if (trim(command(i)) /= trim(str_name)) then
        print *, "target"
    else
        !print*, command(i), str_name
        print*, "not found"
    endif
enddo
END PROGRAM open_file
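
Layout aside, two details keep this program from ever reporting a match. The declaration "character , allocatable :: command(:)" gives each element a length of 1, so every stored line is cut down to its first character. The if test also uses /=, so "target" is printed for the lines that do not match. Below is a sketch with both points changed. It assumes lines are no longer than 200 characters, and the file name and file contents are made up so that it runs on its own.

```fortran
program open_file_fixed
    implicit none
    character(len=*), parameter :: fname = 'overlap_demo.dat'   ! hypothetical file name
    character(len=39) :: str_name
    character(len=200), allocatable :: command(:)   ! len=200; a bare "character" holds one character
    character(len=200) :: line
    integer :: ios, n, i

    str_name = 'OVERLAP MATRIX - CELL N.   1(  0  0  0)'

    ! Write a tiny stand-in file (assumed contents) so the sketch is self-contained.
    open(unit=10, file=fname, status='replace', action='write')
    write(10, '(A)') 'SOME OTHER OUTPUT'
    write(10, '(A)') str_name
    write(10, '(A)') '   1  1.0000E+00'
    close(10)

    open(unit=10, file=fname, status='old', action='read', iostat=ios)
    if (ios /= 0) stop 'Error opening input file'

    ! First pass: count the lines.
    n = 0
    do
        read(10, '(A)', iostat=ios) line
        if (ios /= 0) exit
        n = n + 1
    end do
    print *, 'File contains ', n, ' lines'

    ! Second pass: store every line at full length.
    allocate(command(n))
    rewind(10)
    do i = 1, n
        read(10, '(A)') command(i)
    end do
    close(10)

    ! Report the line that equals the target (== instead of /=).
    do i = 1, n
        if (trim(command(i)) == trim(str_name)) then
            print *, 'target found on line ', i
        end if
    end do
    deallocate(command)
end program open_file_fixed
```

Storing the whole file is kept here only to stay close to the original program; for a large file, the single read loop from post #10 avoids holding every line in memory.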
 
  • #15
Mark44 said:
Your code is hard to follow due to your inconsistent indentation. I have added comment lines with numbers in your code that are tied to my further comments below your code.
Thank you for the quick edit, but even then the code doesn't find the line. Can you tell me where I am going wrong?
 
  • #16
The formatting in these forum posts has hidden the fact that the text being searched for has a different number of blank spaces than the line in the file. That causes the match to fail. Make sure that your copy and paste is exact.
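
If getting the blanks exactly right is a recurring problem, one option is to compare the two strings after collapsing every run of blanks to a single blank. The sketch below shows the idea. The function name squeeze is made up for this example, the leading blank in from_file is an assumption about how such a line might look, and tab characters are not handled.

```fortran
module squeeze_mod
    implicit none
contains
    ! Return s with leading blanks removed and every run of blanks reduced to one.
    function squeeze(s) result(t)
        character(len=*), intent(in) :: s
        character(len=len(s)) :: t
        integer :: i, k
        logical :: prev_blank

        t = ''
        k = 0
        prev_blank = .true.            ! so that leading blanks are dropped
        do i = 1, len_trim(s)
            if (s(i:i) == ' ') then
                if (.not. prev_blank) then
                    k = k + 1
                    t(k:k) = ' '
                end if
                prev_blank = .true.
            else
                k = k + 1
                t(k:k) = s(i:i)
                prev_blank = .false.
            end if
        end do
    end function squeeze
end module squeeze_mod

program compare_squeezed
    use squeeze_mod
    implicit none
    character(len=200) :: from_file, wanted

    from_file = ' OVERLAP MATRIX - CELL N.   1(  0  0  0)'   ! spacing as it might appear in the file
    wanted    = 'OVERLAP MATRIX - CELL N. 1( 0 0 0)'         ! spacing as displayed in the forum posts

    print *, 'raw match:      ', from_file == wanted
    print *, 'squeezed match: ', squeeze(from_file) == squeeze(wanted)
end program compare_squeezed
```

The raw comparison is false and the squeezed one is true, since the two strings differ only in the number of blanks. In the search loop this would replace the test on line(1:39) with squeeze(line) == squeeze(str_name).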

For many things like this, I like to use small Perl programs (Python would also work) to pre-process input files so that the FORTRAN code does not have to do string searches or manipulation. Those languages (especially Perl) are very good at those tasks. The pre-processing programs can be called from the FORTRAN program in a system call or vice versa. I always use the scripting language (Perl/Python/Bash etc.) as the top level program.
 