SFA10
I'm just starting to use MPI in Fortran 90 and have some questions about improving performance.
At the start of a calculation, I run a stage to distribute all required data to the slave nodes. This is a one-off step, but seems to be taking a long time.
The code running on the master node looks something like:
DO i_node = 1, nnodes
CALL MPI_SEND(...)
...
CALL MPI_SEND(...)
END DO
The number of data items, and hence the number of MPI_SEND calls per node, is of order 50.
Would it be better to use MPI_BCAST instead of putting the MPI_SENDs in a loop over i_node? If so, do the slaves need to call MPI_RECV or MPI_BCAST to receive the data?
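To make the question concrete, this is roughly the broadcast version I have in mind for a single data item. It is only a sketch: the array name data_a and its size are made up, the master is assumed to be rank 0, and I am assuming an MPI installation that provides the mpi module. As far as I understand the standard, every rank in the communicator makes the same MPI_BCAST call with the master as root, so there is no separate MPI_RECV on the slaves.

```fortran
PROGRAM bcast_sketch
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000       ! hypothetical array size
  INTEGER :: ierr, rank
  DOUBLE PRECISION :: data_a(n)        ! hypothetical data item

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! Only the master (assumed to be rank 0) fills the array
  IF (rank == 0) data_a = 1.0d0

  ! Master and slaves all make the same call; root = 0 is the sender
  CALL MPI_BCAST(data_a, n, MPI_DOUBLE_PRECISION, 0, MPI_COMM_WORLD, ierr)

  CALL MPI_FINALIZE(ierr)
END PROGRAM bcast_sketch
```

With about 50 data items this would presumably mean about 50 MPI_BCAST calls in total, rather than 50 MPI_SEND calls per node.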
Would using non-blocking sends (MPI_ISEND) be better? I am assuming I'd then need to include several MPI_WAIT statements after the loop. If I were to use non-blocking sends, would each call in the loop need its own request handle, to be referred to later in a matching call to MPI_WAIT?
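For the non-blocking variant, this is the kind of pattern I am picturing, again for one made-up data item (data_a) with the master assumed to be rank 0 and the mpi module assumed to be available. Each MPI_ISEND gets its own entry in a requests array, and a single MPI_WAITALL after the loop stands in for the individual MPI_WAIT calls. The send buffer is not modified until the wait has completed.

```fortran
PROGRAM isend_sketch
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000            ! hypothetical array size
  INTEGER :: ierr, rank, nprocs, i_node
  INTEGER, ALLOCATABLE :: requests(:)       ! one request handle per send
  DOUBLE PRECISION :: data_a(n)             ! hypothetical data item

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  IF (rank == 0) THEN
    data_a = 1.0d0
    ALLOCATE(requests(nprocs-1))
    ! Start all sends without waiting for any of them to complete
    DO i_node = 1, nprocs-1
      CALL MPI_ISEND(data_a, n, MPI_DOUBLE_PRECISION, i_node, 0, &
                     MPI_COMM_WORLD, requests(i_node), ierr)
    END DO
    ! Block until every outstanding send has completed
    CALL MPI_WAITALL(nprocs-1, requests, MPI_STATUSES_IGNORE, ierr)
    DEALLOCATE(requests)
  ELSE
    ! Slaves use an ordinary blocking receive
    CALL MPI_RECV(data_a, n, MPI_DOUBLE_PRECISION, 0, 0, &
                  MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
  END IF

  CALL MPI_FINALIZE(ierr)
END PROGRAM isend_sketch
```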
Is there any benefit to using MPI_PACK to combine all the data items and then transmitting them with a single send (unpacking them on arrival at the slave nodes)?
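The packed version I am considering would look something like the sketch below, with just two made-up items (an integer n_steps and an array coeffs) standing in for the real list of about 50. The master is assumed to be rank 0, the mpi module is assumed to be available, and the buffer is sized with MPI_PACK_SIZE instead of a guessed constant.

```fortran
PROGRAM pack_sketch
  USE mpi
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 100             ! hypothetical array size
  INTEGER :: ierr, rank, nprocs, i_node
  INTEGER :: pos, bufsize, size_i, size_d
  INTEGER :: n_steps                        ! hypothetical scalar item
  DOUBLE PRECISION :: coeffs(n)             ! hypothetical array item
  CHARACTER, ALLOCATABLE :: buffer(:)

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! Ask MPI for an upper bound on the packed size of each item
  CALL MPI_PACK_SIZE(1, MPI_INTEGER, MPI_COMM_WORLD, size_i, ierr)
  CALL MPI_PACK_SIZE(n, MPI_DOUBLE_PRECISION, MPI_COMM_WORLD, size_d, ierr)
  bufsize = size_i + size_d
  ALLOCATE(buffer(bufsize))

  IF (rank == 0) THEN
    n_steps = 10
    coeffs = 1.0d0
    ! Pack both items into one buffer; pos advances after each call
    pos = 0
    CALL MPI_PACK(n_steps, 1, MPI_INTEGER, buffer, bufsize, pos, &
                  MPI_COMM_WORLD, ierr)
    CALL MPI_PACK(coeffs, n, MPI_DOUBLE_PRECISION, buffer, bufsize, pos, &
                  MPI_COMM_WORLD, ierr)
    ! One send per slave instead of one send per item per slave
    DO i_node = 1, nprocs-1
      CALL MPI_SEND(buffer, pos, MPI_PACKED, i_node, 0, MPI_COMM_WORLD, ierr)
    END DO
  ELSE
    CALL MPI_RECV(buffer, bufsize, MPI_PACKED, 0, 0, MPI_COMM_WORLD, &
                  MPI_STATUS_IGNORE, ierr)
    ! Unpack in the same order the master packed
    pos = 0
    CALL MPI_UNPACK(buffer, bufsize, pos, n_steps, 1, MPI_INTEGER, &
                    MPI_COMM_WORLD, ierr)
    CALL MPI_UNPACK(buffer, bufsize, pos, coeffs, n, MPI_DOUBLE_PRECISION, &
                    MPI_COMM_WORLD, ierr)
  END IF

  DEALLOCATE(buffer)
  CALL MPI_FINALIZE(ierr)
END PROGRAM pack_sketch
```

I realise the packed buffer could presumably also be passed to a single MPI_BCAST in place of the send loop, which is part of why I am asking how these options compare.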
Apologies if the above seems like a lot of questions! Any help very much appreciated!