
MPI in Fortran

  1. Jun 12, 2009 #1
    I'm just starting to use MPI in Fortran90 and have some questions on improving performance.

    At the start of a calculation, I run a stage to distribute all required data to the slave nodes. This is a one-off step, but seems to be taking a long time.

    The code running on the master node looks something like:

    DO i_node = 1, nnodes
       CALL MPI_SEND(.....)
       CALL MPI_SEND(.....)
    END DO

    The number of data items, and hence calls to MPI_SEND, is of order 50.

    Would it be better to use MPI_BCAST instead of including the MPI_SENDs in a loop over i_node? If so, do the slaves need to use MPI_RECV or MPI_BCAST to receive the data?

    Would using non-blocking sends be better? I'm assuming I'd then need to include several MPI_WAIT calls after the loop. If I were to use non-blocking sends, would each call in the loop need a different id of some kind, to be later referred to in a unique call to MPI_WAIT?
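    To make the question concrete, here is roughly what I imagine the non-blocking version would look like (array names a and b, the counts, and the tags are just placeholders; I gather MPI_ISEND hands back a request handle rather than a process id, and one MPI_WAITALL can then complete the whole batch):

    ```fortran
    ! Sketch only -- a, b, na, nb, tag_a, tag_b are placeholders.
    ! Each MPI_ISEND returns a request handle; MPI_WAITALL completes them all.
    INTEGER :: requests(2*nnodes), statuses(MPI_STATUS_SIZE, 2*nnodes)
    INTEGER :: i_node, nreq, ierr

    nreq = 0
    DO i_node = 1, nnodes
       nreq = nreq + 1
       CALL MPI_ISEND(a, na, MPI_DOUBLE_PRECISION, i_node, tag_a, &
                      MPI_COMM_WORLD, requests(nreq), ierr)
       nreq = nreq + 1
       CALL MPI_ISEND(b, nb, MPI_DOUBLE_PRECISION, i_node, tag_b, &
                      MPI_COMM_WORLD, requests(nreq), ierr)
    END DO
    ! a and b must not be modified until the waits have completed.
    CALL MPI_WAITALL(nreq, requests, statuses, ierr)
    ```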

    Are there any benefits to using MPI_PACK to combine all the data items, then using a single send to transmit the packed buffer (with it being unpacked on reaching the slave nodes)?
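    The pattern I have in mind for the master side would be something like this (buffer name, sizes and tag are placeholders; the slaves would mirror it with MPI_RECV followed by MPI_UNPACK calls in the same order):

    ```fortran
    ! Sketch only -- a, b, na, nb, buffer, bufsize, tag are placeholders.
    ! Pack heterogeneous items into one buffer, then issue a single send.
    INTEGER :: position, ierr, i_node

    DO i_node = 1, nnodes
       position = 0
       CALL MPI_PACK(a, na, MPI_DOUBLE_PRECISION, buffer, bufsize, &
                     position, MPI_COMM_WORLD, ierr)
       CALL MPI_PACK(b, nb, MPI_INTEGER, buffer, bufsize, &
                     position, MPI_COMM_WORLD, ierr)
       ! position now holds the packed size in bytes
       CALL MPI_SEND(buffer, position, MPI_PACKED, i_node, tag, &
                     MPI_COMM_WORLD, ierr)
    END DO
    ```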

    Apologies if the above seems like a lot of questions! Any help very much appreciated!!
  3. Jun 15, 2009 #2



    Are you sure this is what is taking a long time? I use MPI in C, not Fortran, but what is often slow in an MPI program is starting all the processes. This can sometimes take up to a minute when there are a lot of them.

    Try adding a Barrier before the Send loop, and after the barrier print something on the screen.
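    In Fortran that check would look roughly like this (a sketch only; the point is that the barrier separates process start-up time from the time spent in the send loop itself):

    ```fortran
    ! Everyone waits here, so start-up cost is excluded from the timing.
    CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
    t0 = MPI_WTIME()

    ! ... the MPI_SEND distribution loop goes here ...

    PRINT *, 'distribution loop took ', MPI_WTIME() - t0, ' seconds'
    ```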

    Also, if the number of tasks you are distributing isn't equal to the number of slave processes, this isn't a very good way to do it. It's better to send each process one task, then use Iprobe in a loop to check when a process has finished, Recv the result, and send it another task; repeat until you have all the results (I'm assuming the tasks are independent of each other).
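    The master side of that task farm looks roughly like this in Fortran (a sketch only; tasks, result, the counts and the tags are placeholders):

    ```fortran
    ! Sketch of a master/worker task farm -- names are placeholders.
    next_task = 1
    DO i_node = 1, nnodes                    ! seed every slave with one task
       CALL MPI_SEND(tasks(1, next_task), task_len, MPI_DOUBLE_PRECISION, &
                     i_node, tag_work, MPI_COMM_WORLD, ierr)
       next_task = next_task + 1
    END DO

    ndone = 0
    DO WHILE (ndone < ntasks)
       ! Non-blocking check: has any slave posted a result?
       CALL MPI_IPROBE(MPI_ANY_SOURCE, tag_done, MPI_COMM_WORLD, &
                       flag, status, ierr)
       IF (flag) THEN
          src = status(MPI_SOURCE)
          CALL MPI_RECV(result, result_len, MPI_DOUBLE_PRECISION, src, &
                        tag_done, MPI_COMM_WORLD, status, ierr)
          ndone = ndone + 1
          IF (next_task <= ntasks) THEN      ! hand the idle slave more work
             CALL MPI_SEND(tasks(1, next_task), task_len, &
                           MPI_DOUBLE_PRECISION, src, tag_work, &
                           MPI_COMM_WORLD, ierr)
             next_task = next_task + 1
          END IF
       END IF
    END DO
    ```

    This keeps all the slaves busy even when tasks take different amounts of time, since work goes to whichever process frees up first.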

    I just hope those 50 sends aren't different pieces of the same task that you are sending 1 by 1 instead of putting them in a single package...
    Last edited: Jun 15, 2009
  4. Jun 16, 2009 #3


    Science Advisor

    I believe our in-house CFD code uses a BCAST at the start of the run. I'm not sure exactly how you're sending your messages, but make sure they're in 1D packed arrays. MUCH faster.
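    For what it's worth, a minimal sketch of a broadcast looks like this (array name and size are placeholders). Note that MPI_BCAST is a collective call: every rank in the communicator, root and slaves alike, makes the same call, and on the non-root ranks the buffer is filled in; the slaves do not pair it with MPI_RECV.

    ```fortran
    ! Sketch only -- data and n are placeholders.
    PROGRAM bcast_sketch
      USE mpi
      IMPLICIT NONE
      INTEGER, PARAMETER :: root = 0, n = 50
      INTEGER :: ierr, my_rank
      DOUBLE PRECISION :: data(n)

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)

      IF (my_rank == root) data = 1.0d0   ! root fills the buffer

      ! Identical call on every rank; slaves receive data through it.
      CALL MPI_BCAST(data, n, MPI_DOUBLE_PRECISION, root, &
                     MPI_COMM_WORLD, ierr)

      CALL MPI_FINALIZE(ierr)
    END PROGRAM bcast_sketch
    ```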