The exact reading of a text (fortran 90)

  1. May 25, 2014 #1
    Hello everyone,
    I am making an RSA algorithm (cryptography) with my little knowledge and I am having serious problems with the reading of the text.
    I am making it with a simple text reading (no raw format) and the main problem is the length of each line.
    For example in this text:
    ___________________________________
    Hello world, i'm doing a text to show you

    an easy example
    ___________________________________

    the first line has 42 characters, the second 0 and the third 16.
    What I normally do is pad all the lines to the same length, 42 for example (m=42), and read them either as one string per line or into a matrix (character type).
    Like this
    Code (Text):

    do i=1,n   ! n = number of lines (I know how to count the lines, but not their lengths)
      read(unit=10,fmt="(42a1)") (x(i,j),j=1,m)   ! unit 10 stands for whichever unit the file is open on
    enddo
     
    But the problem is that the matrix has a lot of "space" characters that I don't want to read! (the trailing spaces on the right, not all the spaces)
    I am trying to make something like a matrix with a "flexible" length (I don't know if that exists) to read each line exactly

    If you can't help me this way, can you point me to a good link to learn how raw read/write works? (I suppose that would help me, but maybe I'm wrong too)
     
  3. May 25, 2014 #2
    The first thing to remember is that, when reading list-directed input in Fortran, commas and blank spaces act as value separators. So, in order to read an entire string (including blank spaces) as a single value and into a single variable, you need to specify the format and tell it to read all 40, 80 characters or whatever your widest row may be...and your string variable needs to be just as wide.

    The best thing is to read entire lines at a time and let Fortran figure that for you (i.e., where the lines end); then, you come back and inspect the line and figure out how wide it really is.

    Here is a program that reads lines of variable lengths and populates your matrix.

    Code (Text):

    program rrr
        integer, parameter :: nrows = 3, ncols = 45

        character(len=20) :: fmt    
        character(len=ncols) :: line
        character, dimension(nrows,ncols) :: x    
       
        write(fmt,*) "(a", ncols, ")"
       
        x = ''
        do i = 1,nrows
            read(*,fmt) line
            do j = 1, len_trim(line)
                x(i,j) = line(j:j)
            end do
        end do
       
        write(*,*) "x = "
        do i = 1,nrows
            write(*,*) (x(i,j),j=1,ncols)
        end do

    end program rrr
     
    but I am not 100% sure of what you want to achieve...whether you would be happy with an array of strings or a matrix of characters or what.

    In any case, I hope the program above gives you a couple of hints (and tricks) on how to go about it.

    gsal
     
  4. May 25, 2014 #3

    AlephZero

    Science Advisor
    Homework Helper

    If you want to read exactly what is on the disk file, the only option in Fortran 90 is to open the file for unformatted direct (random) access with a record length of 1 character, and then you can read one character at a time.
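
    As a rough sketch of that approach: the file name demo.txt and the unit numbers are made up for illustration, and recl=1 is assumed to mean one byte (some compilers count RECL for unformatted files in larger units unless told otherwise). The program first writes a tiny test file so that it can be run on its own.

```fortran
program read_bytes_direct
    ! Sketch only: read a file one byte at a time through
    ! unformatted direct access. "demo.txt" is a hypothetical name.
    implicit none
    character(len=1) :: c
    integer :: irec, ios

    ! Create a small test file so the example is self-contained.
    open(unit=10, file="demo.txt", status="replace", action="write")
    write(10,'(a)') "ab "      ! note the trailing blank
    write(10,'(a)') ""         ! an empty line
    write(10,'(a)') "c"
    close(10)

    ! Assumption: recl=1 means "one byte" for this compiler.
    open(unit=11, file="demo.txt", status="old", action="read", &
         access="direct", form="unformatted", recl=1)

    irec = 1
    do
        read(11, rec=irec, iostat=ios) c
        if (ios /= 0) exit     ! no such record: past the last byte
        write(*,'(a,i4,a,i4)') "byte", irec, " has code", iachar(c)
        irec = irec + 1
    end do
    close(11)
end program read_bytes_direct
```

    The trailing blank and the end-of-line bytes all appear in the output as ordinary characters, which is the point of reading this way.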

    There is a slightly nicer option, introduced in Fortran 2003, using "stream I/O", which just treats the file as a string of characters and ignores the ends of the lines. Some compilers (e.g. GNU gfortran) have that option even though they don't support all of the newer standard.
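
    A minimal sketch of the stream option, assuming a compiler that supports access="stream", inquire(size=) and deferred-length character variables (all newer than Fortran 90). The file name demo.txt is made up, and the program writes it first so that it can be run on its own.

```fortran
program read_whole_file_stream
    ! Sketch only: read an entire file into one string with stream access.
    implicit none
    character(len=:), allocatable :: buf
    integer :: fsize, i, ios

    ! Create a small test file so the example is self-contained.
    open(unit=10, file="demo.txt", status="replace", action="write")
    write(10,'(a)') "ab "      ! note the trailing blank
    write(10,'(a)') ""         ! an empty line
    write(10,'(a)') "c"
    close(10)

    ! Ask for the file size in bytes; a negative value means "unknown".
    inquire(file="demo.txt", size=fsize)
    if (fsize < 0) stop "file size could not be determined"
    allocate(character(len=fsize) :: buf)

    open(unit=11, file="demo.txt", status="old", action="read", &
         access="stream", form="unformatted")
    read(11, iostat=ios) buf   ! one read fills the whole string
    close(11)
    if (ios /= 0) stop "could not read the file"

    do i = 1, len(buf)
        write(*,'(a,i4,a,i4)') "byte", i, " has code", iachar(buf(i:i))
    end do
end program read_whole_file_stream
```

    The end-of-line bytes end up inside buf like any other character, so the program has to deal with them itself.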

    These solutions are operating system dependent, because different systems represent "end of line" in different ways on the disk. Linux uses one character, but Windows uses two.

    The best OS-independent way is to use a C routine to read the data instead of Fortran. Many Fortran compilers have a non-standard built-in function called GETC or FGETC which calls the corresponding C library routine.

    gsal's code "nearly works", except it can't tell you when the last character(s) in a line really were blank characters that were stored on the disk. For many purposes that doesn't matter, but it's not telling you exactly what was on the disk file. It also depends on knowing the maximum length of any line in the file. If the character string is too short to hold the whole line, you won't get any error messages.
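
    For the trailing-blank issue specifically, a non-advancing read with size= reports how many characters were actually taken from the record. The following is only a sketch: maxlen is an assumed upper bound, and is_iostat_end / is_iostat_eor are Fortran 2003 intrinsics (in strict Fortran 90 the end= and eor= labels play the same role). It reads from standard input.

```fortran
program exact_line_length
    ! Sketch only: report the exact length of each input line,
    ! trailing blanks included, using a non-advancing read.
    implicit none
    integer, parameter :: maxlen = 150   ! assumed upper bound
    character(len=maxlen) :: line
    integer :: nchars, ios, nrows

    nrows = 0
    do
        read(*, '(a)', advance='no', size=nchars, iostat=ios) line
        if (is_iostat_end(ios)) exit     ! no more lines
        if (ios == 0) then
            ! The buffer was filled before the end of the record was seen,
            ! so the line has maxlen or more characters.
            write(*,*) "line", nrows + 1, "does not fit in", maxlen, "characters"
            stop
        end if
        if (.not. is_iostat_eor(ios)) stop "read error"
        ! End of record reached: nchars is the true length of this line.
        nrows = nrows + 1
        write(*,'(a,i5,a,i4,a)') "line", nrows, " has", nchars, " characters"
    end do
end program exact_line_length
```

    Unlike len_trim(line), nchars still counts blanks that were really stored at the end of a line, and an overlong line is reported instead of being silently truncated.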
     
    Last edited: May 25, 2014
  5. May 25, 2014 #4
    Thank you, both of you. Your answers are very helpful !

    The problem with gsal's solution is that, when writing the matrix, all the matrix columns have the same lines, and also the backspace character. This makes my program very slow (it performs the cryptography operation on each character), but thanks to your answer I realized that I do not need to read all the text at once; I can read it line by line and use the "len_trim" that you showed.

    On the other hand, AlephZero's answer cleared up all the information I had found on the internet about this matter. I did not know the difference between "unformatted" and "I/O stream". I think that solution would be the best for cryptography, but it is a little beyond me, so I will try what I described above.
     
  6. May 25, 2014 #5
    Sorry that my solution cannot be perfect; after all, I do not know exactly what you are trying to do.

    In any case, I am not sure if you figured it out or not, but the way I typically work is with standard input and standard output...my program, as-is, works just fine if run with input re-direction:
    Code (Text):

    rrr.exe < input.file
     
    The writing of the matrix is correct, too...did you run my entire program as-is, with input re-direction? or did you try to incorporate stuff into your version? If your program is not working correctly, maybe you should post the entire source and let us take a look at it.

    Or whatever...I think by now you probably know better how to go about it.
     
  7. May 26, 2014 #6
    No, no, your program works really fine!
    But I mean that in my program I take an old text and perform a difficult operation on each character to form another text. The program you wrote sends its output to the screen, which makes it impossible for me to use. I tried to modify it to write the output to a text file, but the matrix "X" keeps the backspaces (in the new text).

    But your program really is fine. I know that I did not ask the question with all my conditions from the beginning, but I was a mess before I wrote this post. I asked something (not exactly what I needed) to hear the answers and see if they could help me, and they really did. Thanks.
     
    Last edited: May 26, 2014
  8. May 26, 2014 #7
    I don't know Fortran beyond 90.

    And I don't know cryptography...are you trying to encrypt a file? I presume that for encrypting a file, you really need to process the entire file in one shot? Correct? 'cause if encryption happens one line at a time, it could easily be reverse engineered?

    In any case, I presume that if we were using tcl, python, perl or stuff like that, the first thing that one would do is to read the entire file into a single string...so, to that end, AlephZero's recommendation sounds good, if you can go beyond Fortran 90.

    In ANY case, I don't quite understand your statement when you say "all the matrix columns have the same lines, and also the backspace character"...do you mean to say, all the rows have the same number of columns? Even if they are empty? ("backspace"-> blank space?).

    If that is the case....well, yeah, a matrix is a matrix and it is typically of rectangular shape...if not being stored via some sparse technique.

    If you would like to use some kind of matrix, where the rows (lines of text) are NOT the same length and possibly zero length when empty, then, you need what is typically referred to as "ragged" matrix.

    Immediately, two choices come to mind:
    1) an array of variable length character strings.
    2) an array of variable length character arrays.

    Because of your original intent to use a matrix, I implemented choice 2), above, in the program below; it uses a custom type definition and a pointer.

    Code (Text):

    program rrr
        integer, parameter :: nmax = 2000, ncols = 150

        type array
            character, dimension(:), pointer :: x
        end type array
        type(array), dimension(nmax) :: ragged    

        integer :: i, n, ios
        character(len=20) :: fmt    
        character(len=ncols) :: line
       
        write(fmt,*) "(a", ncols, ")"    

        nrows = 0
        do
            read(*,fmt,iostat=ios) line
            if (ios /= 0) exit
            if (len_trim(line) == 0) then        
    !            nrows = nrows+1
    !            allocate( ragged(nrows)%x(1) )
    !            ragged(nrows)%x(1) = ""
            else
                nrows = nrows+1
                allocate( ragged(nrows)%x(len_trim(line)) )
                do j = 1, len_trim(line)
                    ragged(nrows)%x(j) = line(j:j)
                end do
            end if
        end do

        do i = 1,nrows
            write(*,'("Row #",i2," is",i3," characters long")') i, size(ragged(i)%x,dim=1)
        end do
       
        write(*,*)
        write(*,*) "============================================"
        write(*,*) "Replicating, x = "
        do i = 1,nrows
            write(*,*) (ragged(i)%x(j), j=1, size(ragged(i)%x,dim=1) )
        end do
        write(*,*) "============================================"
       
    end program rrr
     
    I increased the parameters to allow for the widest line in the file to be 150 characters and the total number of lines in the file to be 2000...but this is only because I am using re-direction; if you do not use re-direction and instead actually open the file inside the code, you can open the file, count the number of lines, close it, allocate memory, re-open the file, read it...you get the point.

    The code as shown ignores blank lines; un-comment those 3 lines and it puts them back as length 1 with a blank character in them.
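
    For comparison, choice 1) (an array of variable-length strings) combined with the count-then-allocate idea might look like the sketch below. It assumes a compiler with Fortran 2003 deferred-length character components, so it goes beyond Fortran 90; the type name vstring, the file name demo.txt and the 150-character limit are made up for illustration.

```fortran
program ragged_strings
    ! Sketch only: store each line of a file as a string of its own length.
    implicit none
    integer, parameter :: maxlen = 150   ! assumed widest line
    type vstring
        character(len=:), allocatable :: s
    end type vstring
    type(vstring), dimension(:), allocatable :: rows
    character(len=maxlen) :: line
    integer :: i, nrows, ios

    ! Create a small test file so the example is self-contained.
    open(unit=10, file="demo.txt", status="replace", action="write")
    write(10,'(a)') "Hello world"
    write(10,'(a)') ""
    write(10,'(a)') "an easy example"
    close(10)

    open(unit=11, file="demo.txt", status="old", action="read")

    ! First pass: count the lines.
    nrows = 0
    do
        read(11, '(a)', iostat=ios) line
        if (ios /= 0) exit
        nrows = nrows + 1
    end do

    ! Second pass: allocate exactly nrows entries and fill them.
    allocate(rows(nrows))
    rewind(11)
    do i = 1, nrows
        read(11, '(a)') line
        rows(i)%s = trim(line)   ! allocated on assignment; blank lines get length 0
    end do
    close(11)

    do i = 1, nrows
        write(*,'("Row #",i4," is",i4," characters long: [",a,"]")') &
            i, len(rows(i)%s), rows(i)%s
    end do
end program ragged_strings
```

    Blank lines are kept here as zero-length strings, and trailing blanks are still lost because of trim().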
     
  9. May 26, 2014 #8
    Hello, sorry for asking again, but I have a problem with the tip gsal gave me.

    I made a program that takes a text and produces an output of numbers (exactly as many as the number of characters), like this

    Code (Text):

    !to read the number of lines in the text
    open(unit=111,file="text.txt",action="read",status="old")
    nrows=0
       do
         read(unit=111,fmt="(1a)",iostat=s) line
         if (s<0) then
           exit
         endif
         nrows=nrows+1
       enddo
    close(unit=111)

    open(unit=111,file="text.txt",action="read",status="old")
    open(unit=112,file="cyber.txt",action="write",status="replace")

    ncols=1000

    allocate (x(ncols),x1(cols),y(ncols))

       do i=1,nrows
         read(unit=111,fmt="(a1000)"),line
         do j=1,len_trim(line)
           x(j)=line(j:j)
           x1(j)=ichar(x(j))
           call encrypt(x1(j),E,N1,y(j))   !-->make a operation to transform into another number (y)
         enddo
         write(unit=112,fmt="(1000i10)"),(y(j),j=1,len_trim(line))
       enddo

    deallocate (x,x1,y)
    close(unit=111)
    close(unit=112)
     
    This works perfectly fine and produces a text with the format "(1000i10)" (I have attached it if you want to see it).
    But the problem comes later, when reading the text of numbers. I read it like this:

    Code (Text):

    !some definitions
    integer,dimension(:),allocatable::x1,y1
    character(len=1),dimension(:),allocatable::x

    !to read the number of lines again
    open(unit=111,file="text.txt",action="read",status="old")
    nrows=0
       do
         read(unit=111,fmt="(1a)",iostat=s) line
         if (s<0) then
           exit
         endif
         nrows=nrows+1
       enddo
    close(unit=111)

    open(unit=111,file="cyber.txt",action="read",status="old")
    open(unit=112,file="new_text.txt",action="write",status="replace")

    allocate(x(cols),x1(cols),y(cols))

     do i=1,nrows
         read(unit=111,fmt="(a10000)"),line  !--> make it with 10000, because the for each character,
                                                             !     makes 10 numbers (or spaces)
        do j=1,len_trim(line)/10                    !-->for the same reason
            read(unit=111,fmt="(1000i10)"),y(j)
            call decrypt(y(j),P,Q,D,x1(j))         !--> makes an operation with number y to became another number
            x(j)=achar(x1(j))
         enddo
         write(unit=112,fmt="(a1000)"),(x(j),j=1,len_trim(line))
       enddo

    close(unit=111)
    close(unit=112)
    deallocate(x,x1,y)
     
    It gives me the error of
    "Traceback: not available, compile with -ftrace=frame or -ftrace=full
    Fortran runtime error: End of file"
    but I don't know why it hits the end of file before it has processed everything. I tried adding iostat to see where the problem is, and it seems to be at "i=1", the very first iteration?!?

    I know it is a lot to look at; if you can't help me, I understand, it is my job after all XD.
     


  10. May 26, 2014 #9
    In your first piece of source code, above, I noticed:
    • I don't see where you declare x, x1, y
    • the first do-loop, determines nrows. Yet, you do not seem to make any kind of allocation based on nrows; that means, there is no need to open the file, read, close...may as well just read it once until the end of file.
    • the allocate line seems to have a typo...allocate (x(ncols),x1(cols),y(ncols))...is cols (without the n) defined before? is it non-zero?
    • the second loop ignores blank lines...hope you are doing that on purpose.

    In your second piece of source code:
    • at the top, you declare x1 and y1, but I don't see where you declare y...which you later allocate

    Other than that, I don't have your entire program (and workflow) to carry out any kind of testing and debugging of my own.
     
  11. May 26, 2014 #10
    Sorry about that, I was trying to simplify things a little (because my program isn't in English) and I made a few mistakes. I have attached the real program (I added some explanations with the ! symbol). But maybe it's too long to read.
    Thanks for the help
    In the meantime, I will see if I can do something regarding your suggestions

    I compile it with g95 like this

    g95 -o rsa5.x big_integer_module.f95 rsa5.f95
     


  12. May 26, 2014 #11
    Regarding your answer, the second loop ignores the blank lines? Isn't it the same as the first? I don't see the difference...
     
  13. May 26, 2014 #12
    The difference is that the second loop has an inner loop "j=1,len_trim(line)" which will not be executed if the line is empty or made up of blank spaces. The write statement that follows has an implied loop with the same limits, so it writes no values (just an empty record)...otherwise, you would have been writing the previous y.
     
  14. May 26, 2014 #13
    compiled the program...running it with 1 to create keys...it is not working...it looks like the key_generator call just does not return as I am never asked for the number to encrypt the file...

    ...or attach a couple of private and public keys for me.
     
  15. May 26, 2014 #14
    hhhmmm...it looks like I was able to compile the program and able to run it and encode a message.

    But, I had to change a couple of things.

    First, because it was getting stuck while generating keys, I moved the random_seed() call out of the "prime_number_aurkite()" subroutine and placed it at the top of the main program so that it only gets called once per run....otherwise, because it was being called without arguments, I think it might have been generating the same number every time and generating the same P and Q and never getting out of the loop.
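
    A sketch of seeding the generator once at the start of the main program. What random_seed() does without arguments depends on the compiler, so this version builds a seed from the system clock, which is a common Fortran 90 idiom. The constant 37 and the variable names are arbitrary, and a clock-based seed is only good enough for testing.

```fortran
program seed_once
    ! Sketch only: seed the random number generator a single time,
    ! then draw numbers wherever they are needed.
    implicit none
    integer :: n, i, clock
    integer, dimension(:), allocatable :: seed
    real :: r

    call random_seed(size=n)            ! how many integers the seed needs
    allocate(seed(n))
    call system_clock(count=clock)      ! changes from run to run
    seed = clock + 37 * (/ (i - 1, i = 1, n) /)
    call random_seed(put=seed)          ! done once, at program start
    deallocate(seed)

    ! Later calls just draw numbers; they do not touch the seed again.
    do i = 1, 3
        call random_number(r)
        write(*,*) r
    end do
end program seed_once
```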

    Also, I did not like that in your sub "prime_number_aurkite()", you are allocating Y(N), which defines Y as an array with items from 1:N...but next you do: Y(0)=3! In other words, you assign to index 0 (zero).
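
    If index 0 is really wanted, the lower bound can be given explicitly in the allocate statement. A small sketch, assuming Y is an integer array (the attached source is not shown here, so the type and the value of N are guesses):

```fortran
program zero_based
    ! Sketch only: allocate an array whose first index is 0.
    implicit none
    integer, dimension(:), allocatable :: Y
    integer :: N

    N = 5
    allocate(Y(0:N))       ! indices 0, 1, ..., N
    Y = 0
    Y(0) = 3               ! now a valid element
    write(*,*) "lower bound:", lbound(Y, 1), " upper bound:", ubound(Y, 1)
    write(*,*) "Y(0) =", Y(0)
    deallocate(Y)
end program zero_based
```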

    After I was able to create keys and encrypt a file, I was not able to unencrypt...

    the first thing was that I got an end-of-file error, which I think is because you try to read twice as many times from the file...
    shouldn't your code
    Code (Text):

       do i=1,n
         read(unit=111,fmt="(a10000)"),line
         do j=1,len_trim(line)/10
           read(unit=111,fmt="(1000i10)"),y(j)   ! <-- this reads from the file again
           call decrypt(y(j),P,Q,D,x1(j))
           x(j)=achar(x1(j))
         enddo
         write(unit=112,fmt="(a1000)"),(x(j),j=1,len_trim(line))
       enddo
     
    say
    Code (Text):

       do i=1,n
         read(unit=111,fmt="(a10000)"),line
         do j=1,len_trim(line)/10
           read(line,fmt="(1000i10)"),y(j)   ! <-- read from the string "line" instead
           call decrypt(y(j),P,Q,D,x1(j))
           x(j)=achar(x1(j))
         enddo
         write(unit=112,fmt="(a1000)"),(x(j),j=1,len_trim(line))
       enddo
     
    Anyway, that's where I am...and can't unencrypt the encrypted code...the program terminates, but the output file is very large and does not have much.
     
  16. May 26, 2014 #15
    Anyways, thanks for all your help.
    I was compiling the program on a docfis server, and on that server the program always runs fine, but sometimes it does not on my own computer. So maybe that is why you couldn't run it on the first attempt.
    I will try to see if I can do more, and I am also looking into the stream I/O that was mentioned before.
     
  17. May 26, 2014 #16
    I don't know why, but it reads the first number m times, where m is the number of columns
     
  18. May 26, 2014 #17
    I think it's done; if that was the only problem in the program, I will let you know
     
  19. May 26, 2014 #18
    Finally!! The only changes I made are in the part that caused the error: I made the blank lines between paragraphs come through (like you said), and I fixed the bad reading I mentioned. Like this:
    Code (Text):

       do i=1,n
         read(unit=111,fmt="(a10000)"),line1
        if (len_trim(line1)/=0) then
         read(line1,fmt="(1000i10)"),(y(j),j=1,len_trim(line1)/10)
         do j=1,len_trim(line1)/10
           call decrypt(y(j),P,Q,D,x1(j))
           x(j)=achar(x1(j))
         enddo
         write(unit=112,fmt="(1000a)"),(x(j),j=1,len_trim(line1)/10)
        else if (len_trim(line1)==0) then
         write(unit=112,fmt="(1a)"),''
        endif
       enddo
     
    The only important change was to this loop
    Code (Text):

    do j=1,len_trim(line)/10
      read(line,fmt="(1000i10)"),y(j)
    enddo
     
    which I now write like this
    Code (Text):

    read(line,fmt="(1000i10)"),(y(j),j=1,len_trim(line)/10)
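
    The difference between the two forms can be seen in a small test on its own. This is only a sketch with made-up values: every read from a character variable starts again at the beginning of the string, so separate reads keep returning the first number, while one read with an implied do-loop walks along the string.

```fortran
program internal_read_demo
    ! Sketch only: separate internal reads versus one read with an implied do-loop.
    implicit none
    character(len=30) :: line
    integer, dimension(3) :: y
    integer :: j

    ! Build a line holding three numbers, 10 characters each.
    write(line,'(3i10)') 11, 22, 33

    ! Separate reads: each one starts at the beginning of "line".
    do j = 1, 3
        read(line,'(1000i10)') y(j)
    end do
    write(*,*) "separate reads:", y     ! 11 11 11

    ! One read with an implied do-loop: the fields are taken in order.
    read(line,'(1000i10)') (y(j), j = 1, len_trim(line)/10)
    write(*,*) "single read:   ", y     ! 11 22 33
end program internal_read_demo
```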
     
    About the reverse engineering you mentioned, why is that? What is the difference between reading everything at once and reading line by line like I do?
     
  20. May 26, 2014 #19
    I see the problem...in the read statements.

    Here is a shortcut...forget about reading the line as text first and THEN trying to read integers out of it...in this case, it is simpler to read all integers in one shot into the integer array and be done with it.

    Inside the clause "else if (zenbakia==3) then", this is what the do-loop that works for me looks like
    Code (Text):

       do i=1,n
         read(111,'(1000i10)' ) y
         k = 1
         do while (y(k) > 0)
           call decrypt(y(k),P,Q,D,x1(k))
           x(k)=achar(x1(k))
           write(112,'(a1,$)') x(k)
           k = k + 1
         end do
         write(112,*)
       enddo
     
    I can now unencrypt the encrypted file...how do you say it anyway? Unencrypt? Uncrypt? Decrypt?

    Anyway...that's my solution.
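
    One detail that loop relies on: when a record holds fewer numbers than the array has elements, the missing fields are read as blanks and come out as zeros (the default pad="yes" behaviour), which is what stops the "do while (y(k) > 0)". A small sketch using a scratch file and made-up values; the explicit bound check on k is an addition, in case a line ever fills the whole array.

```fortran
program short_record_demo
    ! Sketch only: reading a short record into a longer integer array.
    implicit none
    integer, dimension(5) :: y
    integer :: k

    open(unit=20, status="scratch", form="formatted")
    write(20,'(1000i10)') 104, 105      ! a record with only two numbers
    rewind(20)

    y = -1
    read(20,'(1000i10)') y              ! asks for five numbers
    write(*,*) y                        ! 104 105 0 0 0

    ! Count the values on this line without running past the array.
    k = 1
    do while (k <= size(y))
        if (y(k) <= 0) exit
        k = k + 1
    end do
    write(*,*) "values on this line:", k - 1
    close(20)
end program short_record_demo
```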

    gsal
     
  21. May 26, 2014 #20
    missed you for 2 minutes...saw your solution...I like mine better ;-) ...so much briefer, no divide by 10 or anything like that...
     