Hi. I'm trying to parallelize my code. I'm new to MPI and I'm learning the basics.
I want to use a domain decomposition strategy in my code. I've been reading a bit, and wanted to use this kind of subroutine to exchange points between neighbors. I've started by trying to modify a code presented by Gropp in his book. I'll give the entire code, which is publicly available on his webpage, and then explain the modification I want to make and the problems I'm facing.
This code implements a domain decomposition strategy to solve the Poisson equation in 2D. It uses a 1D domain decomposition, which means the domain is decomposed into "slices": the original array a(maxn,maxn) is distributed over the processes, each of which handles a subarray a(maxn,s:e).
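For example, with nx = ny = 1024 and 4 processes, MPE_DECOMP1D gives process 0 the columns s = 1 to e = 256, process 1 the columns 257 to 512, and so on; inside the subroutines each process then works with a local array declared as a(0:nx+1,s-1:e+1), i.e. its own slice plus one "ghost" column on each side.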
This is the code:
Fortran:
!*******************************************************************
! oned.f90 - a solution to the Poisson problem using Jacobi
! interation on a 1-d decomposition
!
! The size of the domain is read by processor 0 and broadcast to
! all other processors. The Jacobi iteration is run until the
! change in successive elements is small or a maximum number of
! iterations is reached. The difference is printed out at each
! step.
!*******************************************************************
program main
!
include "mpif.h"
integer maxn
parameter (maxn = 1024+2)
double precision a(maxn,maxn), b(maxn,maxn), f(maxn,maxn)
integer nx, ny
integer myid, numprocs, ierr
integer comm1d, nbrbottom, nbrtop, s, e, it
double precision diff, diffnorm, dwork
double precision t1, t2
external diff
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
!
if (myid .eq. 0) then
!
! Get the size of the problem
!
! print *, 'Enter nx'
! read *, nx
nx = 1024
endif
call MPI_BCAST(nx,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
ny = nx
!
! Get a new communicator for a decomposition of the domain
!
call MPI_CART_CREATE(MPI_COMM_WORLD, 1, numprocs, .false., &
.true., comm1d, ierr )
!
! Get my position in this communicator, and my neighbors
!
call MPI_COMM_RANK( comm1d, myid, ierr )
call MPI_Cart_shift( comm1d, 0, 1, nbrbottom, nbrtop, ierr )
!
! Compute the actual decomposition
!
call MPE_DECOMP1D( ny, numprocs, myid, s, e )
!
! Initialize the right-hand-side (f) and the initial solution guess (a)
!
call onedinit( a, b, f, nx, s, e )
!
! Actually do the computation. Note the use of a collective operation to
! check for convergence, and a do-loop to bound the number of iterations.
!
call MPI_BARRIER( MPI_COMM_WORLD, ierr )
t1 = MPI_WTIME()
do it=1, 10000
! do it=1, 2
call exchng1( a, nx, s, e, comm1d, nbrbottom, nbrtop )
call sweep1d( a, f, nx, s, e, b )
call exchng1( b, nx, s, e, comm1d, nbrbottom, nbrtop )
call sweep1d( b, f, nx, s, e, a )
dwork = diff( a, b, nx, s, e )
call MPI_Allreduce( dwork, diffnorm, 1, MPI_DOUBLE_PRECISION, &
MPI_SUM, comm1d, ierr )
if (diffnorm .lt. 1.0e-5) exit
! if (myid .eq. 0) print *, 2*it, ' Difference is ', diffnorm
enddo
if (myid .eq. 0 .and. it .gt. 100) print *, 'Failed to converge'
t2 = MPI_WTIME()
if (myid .eq. 0) then
print *, 'Converged after ', 2*it, ' Iterations in ', t2 - t1, &
' secs '
print *,' Difference is ', diffnorm
endif
!
call MPI_FINALIZE(ierr)
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
subroutine exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
use mpi
integer nx, s, e, comm1d, nbrbottom, nbrtop
double precision a(0:nx+1,s-1:e+1)
integer ierr, req(4)
!
call MPI_IRECV(&
a(1,s-1), nx, MPI_DOUBLE_PRECISION, nbrbottom, 0, &
comm1d, req(1), ierr)
call MPI_IRECV(&
a(1,e+1), nx, MPI_DOUBLE_PRECISION, nbrtop, 1, &
comm1d, req(2), ierr)
call MPI_ISEND(&
a(1,e), nx, MPI_DOUBLE_PRECISION, nbrtop, 0, &
comm1d, req(3), ierr)
call MPI_ISEND(&
a(1,s), nx, MPI_DOUBLE_PRECISION, nbrbottom, 1, &
comm1d, req(4), ierr)
!
call MPI_WAITALL(4, req, MPI_STATUSES_IGNORE, ierr)
return
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!
! This routine show how to determine the neighbors in a 2-d decomposition of
! the domain. This assumes that MPI_Cart_create has already been called
!
subroutine fnd2dnbrs( comm2d, &
nbrleft, nbrright, nbrtop, nbrbottom )
integer comm2d, nbrleft, nbrright, nbrtop, nbrbottom
!
integer ierr
!
call MPI_Cart_shift( comm2d, 0, 1, nbrleft, nbrright, ierr )
call MPI_Cart_shift( comm2d, 1, 1, nbrbottom, nbrtop, ierr )
!
return
end
!
subroutine fnd2ddecomp( comm2d, n, sx, ex, sy, ey )
integer comm2d
integer n, sx, ex, sy, ey
integer dims(2), coords(2), ierr
logical periods(2)
!
call MPI_Cart_get( comm2d, 2, dims, periods, coords, ierr )
call MPE_DECOMP1D( n, dims(1), coords(1), sx, ex )
call MPE_DECOMP1D( n, dims(2), coords(2), sy, ey )
!
return
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
subroutine onedinit( a, b, f, nx, s, e )
integer nx, s, e
double precision a(0:nx+1, s-1:e+1), b(0:nx+1, s-1:e+1), &
f(0:nx+1, s-1:e+1)
!
integer i, j
!
forall(j=s-1:e+1)
forall( i=0:nx+1)
a(i,j) = 0.0d0
b(i,j) = 0.0d0
f(i,j) = 0.0d0
end forall
end forall
!
! Handle boundary conditions
!
forall(j=s:e)
a(0,j) = 1.0d0
b(0,j) = 1.0d0
a(nx+1,j) = 0.0d0
b(nx+1,j) = 0.0d0
end forall
if (s .eq. 1) then
forall( i=1:nx)
a(i,0) = 1.0d0
b(i,0) = 1.0d0
end forall
endif
!
return
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
! This file contains a routine for producing a decomposition of a 1-d array
! when given a number of processors. It may be used in "direct" product
! decomposition. The values returned assume a "global" domain in [1:n]
!
subroutine MPE_DECOMP1D( n, numprocs, myid, s, e )
integer n, numprocs, myid, s, e
integer nlocal
integer deficit
!
nlocal = n / numprocs
s = myid * nlocal + 1
deficit = mod(n,numprocs)
s = s + min(myid,deficit)
if (myid .lt. deficit) then
nlocal = nlocal + 1
endif
e = s + nlocal - 1
if (e .gt. n .or. myid .eq. numprocs-1) e = n
return
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!
! Perform a Jacobi sweep for a 1-d decomposition.
! Sweep from a into b
!
subroutine sweep1d( a, f, nx, s, e, b )
integer nx, s, e
double precision a(0:nx+1,s-1:e+1), f(0:nx+1,s-1:e+1), &
b(0:nx+1,s-1:e+1)
!
integer i, j
double precision h
!
h = 1.0d0 / dble(nx+1)
do j=s, e
do i=1, nx
b(i,j) = 0.25 * (a(i-1,j)+a(i,j+1)+a(i,j-1)+a(i+1,j)) - &
h * h * f(i,j)
enddo
enddo
return
end
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!
! The rest of the 1-d program
!
double precision function diff( a, b, nx, s, e )
integer nx, s, e
double precision a(0:nx+1, s-1:e+1), b(0:nx+1, s-1:e+1)
!
double precision sum
integer i, j
!
sum = 0.0d0
do j=s,e
do i=1,nx
sum = sum + (a(i,j) - b(i,j)) ** 2
enddo
enddo
!
diff = sum
return
end
So now I would like to modify the subroutine that performs the exchange between neighbours. Each sweep is carried out by an independent process, and the exchange of "ghost points" is what gives each process the information it needs from its neighboring processes.
At the moment only a single layer of ghost points is exchanged between neighbors at each boundary of the subdomains. The corresponding subroutine is:
Fortran:
subroutine exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
use mpi
integer nx, s, e, comm1d, nbrbottom, nbrtop
double precision a(0:nx+1,s-1:e+1)
integer ierr, req(4)
!
call MPI_IRECV(&
a(1,s-1), nx, MPI_DOUBLE_PRECISION, nbrbottom, 0, &
comm1d, req(1), ierr)
call MPI_IRECV(&
a(1,e+1), nx, MPI_DOUBLE_PRECISION, nbrtop, 1, &
comm1d, req(2), ierr)
call MPI_ISEND(&
a(1,e), nx, MPI_DOUBLE_PRECISION, nbrtop, 0, &
comm1d, req(3), ierr)
call MPI_ISEND(&
a(1,s), nx, MPI_DOUBLE_PRECISION, nbrbottom, 1, &
comm1d, req(4), ierr)
!
call MPI_WAITALL(4, req, MPI_STATUSES_IGNORE, ierr)
return
end
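If I read it correctly, each process sends its first and last owned columns, a(1:nx,s) and a(1:nx,e), to its bottom and top neighbors, and posts the receives with a(1,s-1) and a(1,e+1) as buffers, so every message is exactly nx contiguous double precision values (one interior column).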
I would like to change this subroutine to exchange more than a single layer of points, say four points, for example. However, when I try to do this, I get error messages. This is my modified subroutine:
Fortran:
subroutine exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
use mpi
integer nx, s, e, comm1d, nbrbottom, nbrtop
double precision a(0:nx+1,s-3:e+3)
integer ierr, req(4)
!
call MPI_IRECV(&
a(1,s-3:s), 4*nx, MPI_DOUBLE_PRECISION, nbrbottom, 0, &
comm1d, req(1), ierr)
call MPI_IRECV(&
a(1,e:e+3), 4*nx, MPI_DOUBLE_PRECISION, nbrtop, 1, &
comm1d, req(2), ierr)
call MPI_ISEND(&
a(1,e:e+3), 4*nx, MPI_DOUBLE_PRECISION, nbrtop, 0, &
comm1d, req(3), ierr)
call MPI_ISEND(&
a(1,s-3:s), 4*nx, MPI_DOUBLE_PRECISION, nbrbottom, 1, &
comm1d, req(4), ierr)
!
call MPI_WAITALL(4, req, MPI_STATUSES_IGNORE, ierr)
return
end
However, when the program calls this subroutine, I get the error: Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
I know I'm doing something silly; probably I'm not handling the sizes of the arrays correctly. Or maybe it's something more subtle, like how I'm using the message tags, I'm not sure. I thought that perhaps somebody here could help me.
I don't know if the idea is clear. I would like to extend each of the subdomains so that they overlap their neighbors by four points at each boundary, and then exchange those points between neighbors.
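To make the intent more concrete, this is roughly the kind of exchange I have in mind. It's only a sketch: the name exchng4 is just for illustration, and it assumes that every array (and the decomposition) would also be changed to carry four ghost columns on each side, which I haven't actually done yet.
Fortran:
subroutine exchng4(a, nx, s, e, comm1d, nbrbottom, nbrtop)
use mpi
integer nx, s, e, comm1d, nbrbottom, nbrtop
! sketch only: assumes the whole program uses a four-column ghost layer
double precision a(0:nx+1,s-4:e+4)
integer ierr, req(4)
!
! receive four whole columns into the bottom and top ghost regions;
! each block of four columns is contiguous in memory, 4*(nx+2) values
call MPI_IRECV(&
a(0,s-4), 4*(nx+2), MPI_DOUBLE_PRECISION, nbrbottom, 0, &
comm1d, req(1), ierr)
call MPI_IRECV(&
a(0,e+1), 4*(nx+2), MPI_DOUBLE_PRECISION, nbrtop, 1, &
comm1d, req(2), ierr)
! send the four owned columns adjacent to each boundary
call MPI_ISEND(&
a(0,e-3), 4*(nx+2), MPI_DOUBLE_PRECISION, nbrtop, 0, &
comm1d, req(3), ierr)
call MPI_ISEND(&
a(0,s), 4*(nx+2), MPI_DOUBLE_PRECISION, nbrbottom, 1, &
comm1d, req(4), ierr)
!
call MPI_WAITALL(4, req, MPI_STATUSES_IGNORE, ierr)
return
end
Is something along these lines the right way to do it, or am I thinking about this wrongly?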
There are many things I don't understand. When the send and receive are performed, how does each process know in which part of the array it has to put the received information? Is it simply written over the overlapping part of the array? I mean, the shared elements have the same indices, so when the data is received, does it overwrite those points in the part of the array belonging to that process?
Thanks in advance.