
[FORTRAN] Writing to same file from multiple processors?

  1. Oct 5, 2011 #1
    I am working in FORTRAN
    I need to avoid writing to the same file at the same time while working in a parallel environment.

    I have a program (that I have not written and cannot edit) that calls a subroutine (that I have written and can edit). The subroutine is called over and over again by several processors working at the same time. I am having issues with more than one processor writing to the same file at the same time.

    I need a way to avoid writing to the file at the same time.

    The best way that I can think of is to break the file into several files; one per processor and name them by processor. However, I need some way to determine which processor is doing the writing.

    Does anyone know how to do that?
    Or does anyone have any other suggestions?

    Thank you,
  3. Oct 5, 2011 #2
    I think there are many ways to solve this. Off the top of my head: allocate a specified region of memory as a set of flags, one per processor, that each processor sets while it is writing.
  4. Oct 5, 2011 #3
    I am still very new to FORTRAN and from many people's view programming in general. When you say a "memory space" I am not sure what you mean.

    Can you explain this suggestion in more detail?

    If my notion of memory space is correct then I should remind you that I have no control over the program that is calling my subroutine.
  5. Oct 5, 2011 #4
    I think I got it. The program that calls my subroutine keeps the same process ID for each processor running the program. I think I can use the getpid() function from FORTRAN and write to a file named based on the integer that getpid() returns.
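
    As a minimal sketch of that idea (assuming gfortran, where GETPID() is available as an intrinsic extension, and Fortran 2008's NEWUNIT= specifier; the subroutine name and the file-name pattern here are just placeholders):

    ```fortran
    subroutine log_result(value)
        implicit none
        real, intent(in) :: value
        integer :: pid, unit
        character(len=32) :: fname

        ! Build a per-process file name such as "output_12345.dat".
        ! GETPID() is a GNU extension; other compilers may need a C binding.
        pid = getpid()
        write (fname, '(A,I0,A)') 'output_', pid, '.dat'

        ! Open in append mode, write one record, and close again,
        ! so each call is self-contained.
        open (newunit=unit, file=fname, position='append', &
              status='unknown', action='write')
        write (unit, *) value
        close (unit)
    end subroutine log_result
    ```

    Since every process gets its own file, no two writers ever touch the same file at the same time.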

    Thank you for your time,
    Timothy Van Rhein
  6. Oct 5, 2011 #5


    Homework Helper

    I was thinking you could use some type of messaging system to send all writes to a single thread that the subroutine creates. (That requires a global variable to track the first call, so you know when to create the thread.) The catch is that unless the main program calls a second subroutine when it has finished, there is no way to flush the pending writes and close the file.

    Even with the multiple file scheme based on getpid(), how will your program know when to close all those files, or will your program open a file, append data, and close a file on each call?

    Another advantage of writing multiple files shows up if the program generates data faster than a single hard drive can write. In that case, if you have multiple hard drives, you can keep the separate files on separate drives.
    Last edited: Oct 5, 2011
  7. Oct 5, 2011 #6
    The subroutine will open a file, append data, and close a file on each call. I did not think about using multiple hard drives. I will consider that.

    Thanks for the reply,
    Timothy Van Rhein
  8. Oct 5, 2011 #7



    An issue with this is how to map each process id to a hard drive and file name when you don't know the process ids in advance. I assume process ids are 32-bit (or larger) values, so you can't use a huge array to do the mapping directly. Instead, you can use a global array whose size equals the maximum number of processes the program could run at one time, then search for the current id (and add it if not found) each time the subroutine is called. The index of the entry holding the current process id then becomes part of the hard drive / file name. Unless there are a large number of processes, the overhead of this search-and-add scheme would be small compared to the actual write time to the hard drives. You'll also need a global index for the next available (empty) entry in the array. This global index would initially be zero, indicating an empty process id array. The search loop only scans indexes less than the global index (so on the first call there is no search, because the index is zero). Each time you add a process id to the array, you increment the global index.
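
    The search-and-add scheme described above could be sketched like this (a rough illustration only; the module and function names are made up, MAX_PROCS is an assumed upper bound, and this only works if the parallel workers actually share the subroutine's global state, e.g. as threads in one address space — separate OS processes would each see their own private copy of the array):

    ```fortran
    module pid_map
        implicit none
        integer, parameter :: max_procs = 64      ! assumed maximum number of processes
        integer, save :: pids(max_procs) = 0      ! known process ids
        integer, save :: nused = 0                ! global index: next free slot - 1
    contains
        integer function pid_index(pid)
            integer, intent(in) :: pid
            integer :: i
            ! Search only the entries filled so far; on the very first
            ! call nused is zero, so no search happens.
            do i = 1, nused
                if (pids(i) == pid) then
                    pid_index = i
                    return
                end if
            end do
            ! Not found: add the id and advance the global index.
            nused = nused + 1
            pids(nused) = pid
            pid_index = nused
        end function pid_index
    end module pid_map
    ```

    The small integer returned by pid_index could then be used to pick a drive and build a file name, instead of using the raw (large) process id.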
    Last edited: Oct 6, 2011
  9. Oct 6, 2011 #8
    Hmmmm... With these issues in mind I may not write to multiple drives. I only have a maximum of three drives available: one in the tower and two externals. I could also use network drives, but I question whether that would be any faster at all. I really do not have any experience with these things.

    Thanks for the thought,
  10. Oct 6, 2011 #9



    Depends on the speed of the external interface or network, and the overhead involved. I don't know how fast your program generates data, but assuming it isn't capturing data in real time from some instrumented device (or another fixed-rate source at a high data rate), it shouldn't be a problem, other than it may throttle the rate at which the program runs if data is generated faster than the writes can occur. If speed were an issue, you could consider a RAID setup in your tower to utilize multiple hard drives.

    I assume your library routines for file open, write, and close support a multi-processing, pre-emptive environment; if not, you would need to disable multi-threading, use some type of semaphore lock during file operations, or use the messaging-to-a-single-process method I mentioned before.
  11. Oct 6, 2011 #10
    I am pretty sure the calling program takes care of all of that for me. I am running it and I don't seem to be running into any problems except for generating too much data. I need to make some revisions.

    Thanks for the help,
    Timothy Van Rhein