
Developing a simple program to solve the diffusion equation

  1. Oct 16, 2007 #1
    Hi there,
    I'm looking for an algorithm describing a numerical solution of the diffusion equation (1D or 2D).
    I've taken a look at some textbooks, such as the ones by Hamilton and Henry, but I didn't find a simple method.
    Does anybody know of one? Where can I find what I'm looking for?
    Thanks.
     
    Last edited: Oct 17, 2007
  3. Oct 17, 2007 #2

    Morbius

    Science Advisor

    libertad,

    It's trivial to write an algorithm to solve the diffusion equation. In 1-D, the diffusion equation
    couples together 3 adjacent zones - the diffusion term giving the leakage out the left face of the
    center zone involves a difference between the flux in the zone to the left and the flux in the center
    zone.

    Likewise, the diffusion term giving leakage out the right side involves the flux to the right and the
    flux in the center zone. Therefore, the total balance equation couples 3 adjacent zones - left, center,
    right - for each center zone. If you write the equations for all the zones in matrix form, you get a
    tri-diagonal matrix. You solve that with a simple Gaussian reduction - two sweeps - a forward
    elimination of the lower diagonal, and a backward substitution that solves the equation.

    In 2-D, you couple together 5 zones - left, right, front, back, and center. Again in
    matrix form, this yields a penta-diagonal matrix. Two forward elimination sweeps - one for each of
    the now 2 lower diagonals - followed by a backward substitution.
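
    For concreteness, here is a minimal sketch of that 1-D solve in Python/NumPy. The grid, cross
    sections, and boundary conditions below are illustrative choices of mine, not the only ones:

    [code]
    # A minimal sketch of the scheme above. Illustrative assumptions (mine):
    # steady state, uniform zone width h, constant diffusion coefficient D and
    # absorption sigma_a, flat source S, zero-flux (phi = 0) boundaries.
    import numpy as np

    def diffusion_1d(n=50, h=1.0, D=1.0, sigma_a=0.1, S=1.0):
        lower = np.full(n, -D / h**2)                # coupling to the left zone
        upper = np.full(n, -D / h**2)                # coupling to the right zone
        diag = np.full(n, 2.0 * D / h**2 + sigma_a)  # center-zone balance
        rhs = np.full(n, S)

        # Forward elimination of the lower diagonal...
        for i in range(1, n):
            w = lower[i] / diag[i - 1]
            diag[i] -= w * upper[i - 1]
            rhs[i] -= w * rhs[i - 1]

        # ...then backward substitution.
        phi = np.empty(n)
        phi[-1] = rhs[-1] / diag[-1]
        for i in range(n - 2, -1, -1):
            phi[i] = (rhs[i] - upper[i] * phi[i + 1]) / diag[i]
        return phi
    [/code]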

    Duck soup!!

    Dr. Gregory Greenman
    Physicist
     
  4. Oct 17, 2007 #3
    It's slightly more complicated than Morbius says. I recently had to do a 1-D program and it took me about a week.

    The way to attack the problem of making a program is to divide it into parts. First determine the composition and geometry of the system you're looking at; this will let you set up the matrix for the calculations you need to do. Duderstadt and Hamilton should have a section on numerical solutions - I know Stacey's does, since I'm using his - and those will show you how to set the matrix up. After you have this, you can go on and do the calculations Morbius mentioned above. It is pretty straightforward, but if you're like me and not great at programming, it will take a while.

    Incidentally, I'm now working on a 2-D code. I'm teaching myself FORTRAN from the basics.
     
  5. Oct 18, 2007 #4

    Morbius

    Science Advisor

    Candyman,

    Actually, it is EXACTLY as easy as I stated. If you can program the matrix setup and the Gaussian
    reduction, that's all there really is to it. You don't need to go looking in books for the solution strategy;
    it's really easy to figure out.

    Dr. Gregory Greenman
    Physicist
     
  6. Oct 19, 2007 #5
    Hi Dr. Gregory,
    I actually want to test some of my parallel computing techniques by coding a program to solve the diffusion equation.
    I'm a fairly experienced developer. If you have any problems developing software applications, don't hesitate to contact me.
     
  7. Oct 19, 2007 #6
    The real power of parallelization comes with higher-dimensional diffusion, in which case a more efficient way of calculating diffusion is through Monte Carlo integration (since the error scales as 1/sqrt(N) rather than with the dimensionality), which indeed parallelizes trivially. For a 1D case I can't see any reason why you would parallelize.
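
    As a toy illustration (everything below is a stand-in example of my own) of why the dimension doesn't enter the error scaling:

    [code]
    # Rough sketch (mine, toy example): Monte Carlo estimate of an integral
    # over the unit hypercube; the standard error falls as 1/sqrt(N) with no
    # reference to the dimension d.
    import numpy as np

    rng = np.random.default_rng(0)

    def mc_integrate(f, d, n):
        x = rng.random((n, d))            # n sample points in [0, 1]^d
        vals = f(x)
        return vals.mean(), vals.std(ddof=1) / np.sqrt(n)  # estimate, std. error

    # Integral of sum(x_i^2) over [0, 1]^6; exact answer is 6/3 = 2.
    est, err = mc_integrate(lambda x: (x**2).sum(axis=1), d=6, n=100_000)
    [/code]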

    If you still want to solve it via matrix methods, then look at the ESSL/BLAS/LAPACK libraries and MPI.
     
  8. Oct 19, 2007 #7

    Morbius

    Science Advisor

    libertad,

    If you want something that runs in parallel [ the Gaussian elimination is not a parallel algorithm ], then
    I would suggest some form of Krylov-subspace solver, e.g. a conjugate gradient method.

    Methods such as Krylov-subspace solvers can be made to run in parallel. Much of the work is
    doing the matrix multiplies.

    However, there will be one or two "dot products" of the solution and/or correction vectors.

    These dot products require a "sync point" - basically an operation called an "all reduce".

    Each processor computes its own portion of the dot product, a "partial sum" for the part of the
    vector that the processor "owns". Then all the processors have to sum all their partial sums to
    the "Boss" processor. The Boss processor then has to broadcast the dot product to all the workers.

    The "step length" in the Krylov solve is typically the quotient of two dot products.

    That's my profession also. I'm a "computational physicist". I write physics simulation programs
    for Lawrence Livermore National Laboratory. All of our work is done in parallel since we have
    computers with as many as 131,000 processors:

    https://asc.llnl.gov/computing_resources/bluegenel/

    LLNL's Center for Advanced Scientific Computing just won one of R&D Magazine's
    "Top 100" Awards in 2007 for the "Hypre" solver library:

    http://www.rdmag.com/ShowPR.aspx?PUBCODE=014&ACCT=1400000100&ISSUE=0707&RELTYPE=PR&ORIGRELTYPE=FE&PRODCODE=00000000&PRODLETT=G&CommonCount=0 [Broken]

    http://www.llnl.gov/str/Oct07/pdfs/10_07.4.pdf

    http://www.llnl.gov/CASC/linear_solvers/

    I thought you might be interested in reading about Hypre.

    Dr. Gregory Greenman
    Physicist
     
    Last edited by a moderator: May 3, 2017
  9. Oct 20, 2007 #8
    Dr. Gregory Greenman

    Thanks for your recommendations. As I gathered from your note, you and your team are doing great parallelization based on distributed processors.
    What I intend to do is parallelization on a local computer with a single processor. I want to divide my computation tasks into 2 or more separate threads running simultaneously. After doing so, I will do some parallelization on a dual-core processor.
    Because I'm a nuclear engineer, I intend to test my approach on the diffusion equation.
    Nevertheless, it's an honor to get to know more about you and your projects in parallelization.

    Thanks.
     
  10. Oct 20, 2007 #9
    So what does one typically use as a scattering function? (I'm assuming that we're talking about solving the Boltzmann-Poisson equation for neutron diffusion?)
     
  11. Oct 20, 2007 #10

    Morbius

    Science Advisor

    quetzalcoatl,

    First - it's the STANDARD DEVIATION that scales as 1 / sqrt(N) [ the variance scales as 1/N ] - and
    that's only a scaling, NOT the error itself; and it holds for a 0-dimensional quantity like the k-effective.

    The question as to whether it is more efficient to use Monte Carlo or deterministic methods depends
    on what you want to calculate - and for a typical reactor problem, deterministic methods beat
    Monte Carlo HANDS DOWN!!!

    This was covered in a session at the American Nuclear Society Topical Meeting in Computational
    Methods held last April in Monterey, California. The session was hosted by Prof. William Martin,
    Chairman of the Nuclear Engineering Program at the University of Michigan.

    The question under discussion was a conjecture at a previous meeting by Dr. Kord Smith of
    Studsvik Scanpower, Idaho Falls, ID. The most widely used codes for reactor analysis are the suite from
    Studsvik - SIMULATE, CASMO, ... - authored by Dr. Smith and his colleagues. [ Kord and I both
    did our graduate studies together with Professor Alan F. Henry at MIT in the 1970s ]

    Kord's conjecture is that it will be about 15-20 years before one can use Monte Carlo as the basis
    for reactor reload calculations. The consensus of the session hosted by Professor Martin is that
    Kord's conjecture is about right; maybe 10-15 years at best.

    The reason that deterministic methods win over Monte Carlo has to do with what I said above about
    it depending on what you want to calculate. For something simple like a k-eff calculation, or some
    other 0-dimensional quantity, the scaling is 1 / sqrt(N) for the STANDARD DEVIATION!!

    However, if one is going to do something like a reactor reload calculation, then the quantity of interest
    is the peak heat flux or some other limit that the calculation has to show to be within the tech specs
    for the reactor.

    The peak heat flux occurs at a point in space which is NOT known a priori. Therefore, although the
    standard deviation scales with the inverse root of the number of particles, only a very small fraction
    of the neutron histories will pass through the peak heat flux point - far too few to give one
    anything approaching an acceptable error.

    That's why the 1 / sqrt(N) is NOT the error. That only gives the scaling; NOT the error.
    There is another coefficient out front of that inverse root N scaling. The scaling with N is
    only half the story.

    For something like a point power, or a point heat flux, that coefficient is BIG. So even though
    you can decrease the error by sqrt(N) by making N - the number of neutron histories - large, it will take
    an absolutely HUGE number of histories before you make a dent in that large coefficient.

    The problem is that the quantity of interest - a peak power or peak heat flux - is a very small part
    of the phase space. Zero-dimensional quantities like k-effective which are integrals over the entire
    problem space converge quite fast with Monte Carlo.
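
    A toy illustration of that prefactor (my own, not a transport calculation): with the same N histories,
    compare a global average against a tally in a cell covering 0.1% of the domain:

    [code]
    # Toy illustration (mine): same N histories, same 1/sqrt(N) scaling,
    # very different prefactors for a global vs. a point-like quantity.
    import numpy as np

    rng = np.random.default_rng(1)
    N = 100_000
    x = rng.random(N)                 # stand-in for where each history lands

    global_mean = x.mean()            # 0-D, integral-over-the-problem quantity
    p_hit = (x > 0.999).mean()        # tally in a "peak" cell: 0.1% of the domain

    # The rare tally's relative standard error is sqrt((1 - p) / (p * N)):
    # still 1/sqrt(N), but with a prefactor of roughly sqrt(1/p) ~ 30 here,
    # versus about 0.6 for the global mean.
    [/code]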

    However, point quantities DO NOT. If one knew the point in question a priori - that is - if the
    question was "How many neutrons will be absorbed by a detector at location (x,y,z)?" - then such a
    question can be solved readily by Adjoint Monte Carlo.

    One uses the Monte Carlo technique to solve the Adjoint Boltzmann Transport Equation. When
    you do that, the roles of sources and sinks of neutrons reverse. That is, the number of neutrons that
    will be absorbed by a detector in the forward problem is equal to an integral over the whole problem
    of "adjoint neutrons" sourced at the detector location. Thus in adjoint space, one again is solving
    for an integral over a large volume of the problem.

    However, one has to know the location of the detector a priori, because one has to source
    "adjoint neutrons" in at that location. In the reactor problem, one doesn't know that a priori.

    Therefore, even with the fastest computers; Monte Carlo loses very badly to deterministic methods
    even if the deterministic methods don't take advantage of parallelism. If the deterministic methods
    are coded to take advantage of what parallelism they can; Monte Carlo loses even more.

    I am currently working on a Monte Carlo code; so if anything, my "bias" should be in favor of the
    Monte Carlo. However, facts are facts; for a reactor calculation, deterministic methods beat
    Monte Carlo hands down, and will continue to do so for many years to come.

    Dr. Gregory Greenman
    Physicist
     
  12. Oct 21, 2007 #11
    Dr. Greenman,

    OK, so you are saying that for this particular problem there is a large scaling prefactor in front of the 1/sqrt(N), and so this currently makes non-stochastic methods desirable. (However, since you say that you are developing a stochastic code, I take this to mean that the high levels of parallelism available cheaply now allow you to "make a dent" in that prefactor - otherwise you wouldn't bother.)

    Unfortunately, we are not always so fortunate... for example, in Quantum Monte Carlo we have no hope of diagonalizing the Hamiltonian, and so we must learn to live with the sampling issues inherent in fermionic systems (due to the antisymmetry of the wavefunction, we run into a "sign problem" whereby calculating averages from the density matrix requires a delicate cancellation of two very large quantities - a situation that is numerically unstable). Ceperley and others have developed methods of nodal sampling (essentially umbrella sampling) that make this somewhat more tractable, but not much.

    Take path integral MC, for example: it truly is tantalizing, since we have a small scaling prefactor, the problem factorizes linearly, and the MC algorithm is trivially parallel. Unfortunately, as I said above, the sampling space is so large (and the method of summation is inherently unstable) that unless we can focus around the nodes we are doomed - but if you knew where the nodes are (where all of the interesting chemistry happens), then why bother!

    Q

    P.S. - I have used the Blue Gene at SDSC extensively, but given the SUs for their slower hardware I found it somewhat unattractive for what I am doing. I understand the purpose for which the BG was supposedly designed (FLOPS/kW and FLOPS/m^2), but (a) neither of these things really helps me and (b) power efficiency doesn't seem to be stopping anyone else from building highly parallel systems either, i.e. look at TACC's Ranger (which at 504 TFlops will displace LLNL's BG/L from the #1 spot on the top500). I think this spells the end of future Blue Gene systems. The Opteron-based Cray XT3 is also an excellent system. I have nothing against the BG per se, but PPC is largely a failure (i.e. no 64-bit) and IBM is trying to spin it a different way.

    http://www.tacc.utexas.edu/resources/hpcsystems/ [Broken]
     
    Last edited by a moderator: May 3, 2017
  13. Oct 21, 2007 #12

    Morbius

    Science Advisor

    quetzalcoatl,

    Your assumption is that our reason for developing the Monte Carlo code is to take advantage of
    high levels of parallelism. That's incorrect.

    The reason for using the Monte Carlo code is that for the applications that we are interested in;
    there are certain effects that are more easily implemented in a Monte Carlo code. It's not the
    parallelism that gives us an advantage over deterministic methods - it's that certain effects are
    more difficult to implement in the deterministic code.

    Additionally, for our application, unlike the reactor problem, the important constraint is not a
    local one.

    Dr. Gregory Greenman
    Physicist
     
  14. Oct 21, 2007 #13

    Morbius

    Science Advisor

     
    Last edited: Oct 21, 2007
  15. Oct 21, 2007 #14
    Morbius,

    You and I have something in common then: I also have implemented a Monte Carlo code in order to avoid doing many-body molecular dynamics with a potential energy function whose derivative is numerically ill-behaved! (Although I do not get away completely unscathed, since the systems that I am studying via this method exhibit long correlation times, which means lots of "wasted" CPU time - but at least the many-body problem is now tractable!)

    Q
     
  16. Oct 21, 2007 #15
    I kindly disagree. If you read the BG white papers, those are EXACTLY the reasons they give for achieving such high parallelism.

    At 440 MHz with 512 MB of memory (BG/L), the motive was certainly NOT per-node performance: it was to achieve a low power-consumption footprint, thus allowing higher levels of parallelism.

    From IBM's architectural white paper - please see the following for an overview of the Blue Gene architecture:

    http://www.research.ibm.com/journal/rd/492/gara.html [Broken]

    Q
    P.S. - note that each BG node consumes only 20 W; that is not by accident.
     
    Last edited by a moderator: May 3, 2017
  17. Oct 21, 2007 #16
    The first BG/P is not expected to be operational until Spring 2008 (and at only 111 TFlops at that; petascale systems will follow later) - Ranger will be up and in use sometime in December, making it #1 at least for a little while :)

    Q

    P.S. - I do not work for TACC, I only like their systems :)
     
  18. Oct 21, 2007 #17

    Morbius

    Science Advisor

    quetzalcoatl,

    That Monte Carlo parallelizes trivially is only trivial when one doesn't think about it too much.

    It would be trivial to parallelize Monte Carlo if one can devote a processor to tracking the entire
    lifecycle of a simulation particle - from birth to death. Unfortunately, that is seldom the case.

    Since a simulation particle can theoretically go anywhere in the problem domain, in order for a
    processor to track a simulation particle from birth to death; it must have a complete description of
    the entire problem geometry. In a machine like Blue Gene/L which has 131,000 processors; full
    utilization of the machine would require 131,000 duplicate representations to reside in memory;
    since each processor must have the complete geometry.

    This would be wasteful of computer memory. No - large Monte Carlo simulations are run on large
    parallel machines using a technique called "domain decomposition". In essence, each processor
    "owns" only a piece of the entire problem. As a simulation particle reaches the boundary of the domain
    on a given processor, it transits into a neighboring domain owned by another processor, and must
    be "handed off" to that processor.

    Once processors have to communicate particles back and forth - then one has the same problems
    that one has with a deterministic system. Once the "cradle to grave on a single processor" paradigm
    is broken, one loses the easy parallelism.

    In fact the parallelism problems are WORSE for Monte Carlo. In a deterministic code, if each
    processor has an equivalent-sized piece of the problem, then the processors have roughly equivalent
    work. The amount of work a given processor has to do depends on how much of the problem it
    "owns"; and the amount of work does not depend on the solution.

    Therefore, if the domains are parcelled out roughly equally to all processors, then all the processors
    should arrive at the "sync points" where they need to communicate at roughly equivalent times. This
    makes a "synchronous communication" model viable. That is, all processors arrive at the communication
    point at roughly the same time. Therefore, they wait until all processors have arrived, and can then
    exchange the data that is needed across processor boundaries. Additionally, the amount of data
    being shared is symmetric. If processor "i" borders processor "j" at "n" faces, then "i" needs to send
    "j" "n" values for each variable being shared; and vice versa - "j" needs to send "i" the equivalent
    amount. Each processor needs to know 1 value per face per variable in a 1st order approximation.
    Each processor would have to go 2 values into the adjacent domain for a 2nd order approximation...
    Regardless of order, there's always a symmetric exchange of data.
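
    Here is a hedged sketch of that symmetric exchange for a 1-D chain of domains, assuming mpi4py
    is available (the details are illustrative):

    [code]
    # Hedged sketch (mine) of the symmetric exchange above, using mpi4py:
    # a 1-D chain of domains, one value per face per variable.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    local = np.linspace(rank, rank + 1.0, 10)   # this processor's zones

    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    # Each rank sends one face value each way and receives one back; the
    # traffic is symmetric, so all ranks reach this sync point together.
    from_left, from_right = np.empty(1), np.empty(1)
    comm.Sendrecv(local[:1], dest=left, recvbuf=from_left, source=left)
    comm.Sendrecv(local[-1:], dest=right, recvbuf=from_right, source=right)
    [/code]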

    That isn't the case in Monte Carlo. In Monte Carlo, there may be a very active region bordering a
    very inactive region. The amount of data that needs to flow is not symmetric. Because processors
    don't have an equal amount of work to do - given the varying numbers of simulation particles - one
    can't assume that they will all arrive at a "sync point" nearly simultaneously.

    Therefore, one can't use a "synchronous" communication paradigm unless one is willing to have
    processors waiting a large portion of the time. In this case, an "asynchronous" communication
    model is used. Processors exchange data not in a determinate way, but in an indeterminate way.
    Processors "buffer" particles as needed, and neighboring processors flush these buffers when more
    particles are needed to process.

    The asynchronous communication model also opens the door to potential deadlock conditions if one
    is not careful.

    Additionally, unlike the deterministic case, where processors have roughly equal amounts of work
    to do since they are solving the same equations on roughly equal numbers of zones, Monte Carlo
    simulations inherently are not "load balanced". The workload among the processors may not be
    shared equally.

    One could even the workload and "load balance" by redistributing the regions of space that a
    given processor "owns". As regions of the problem are transferred between processors in order to
    balance the load, so too the simulation particles that currently occupy those regions must be
    transferred. This extensive communication may offset much of the gains to be had from balancing
    the load. Therefore, one should only load balance when it will yield a net improvement in computer
    efficiency. Some of my colleagues have figured out how to do that, and have presented papers
    on the subject. For example, one that was presented in Avignon, France last year is available from
    the Dept. of Energy at:

    http://www.osti.gov/bridge/purl.cover.jsp?purl=/878214-gj95NC/

    One property that one wishes a parallel simulation to have is that the answers not depend on the
    number of processors on which the simulation was run. If processors "owned" the random number
    seeds, then the history of a given particle would depend on the set of processors that it visited and
    when it did so. Clearly, this would impart a dependence of the simulation results on the number of
    processors involved. In order to sidestep this issue, it is useful for each simulation particle to carry
    its own random number seed. This gives each simulation particle its own pseudo-random number
    sequence, and it won't matter which processor calculates the next number in the sequence.

    However, at the beginning of the problem, the initialization of the random number seeds has to be
    coordinated among the processors in order to avoid duplication.
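
    A small sketch of per-particle streams (illustrative names, using NumPy's SeedSequence; not our
    production code):

    [code]
    # Sketch (mine, names illustrative): per-particle random number streams,
    # spawned once from a master seed at setup so no two particles duplicate
    # a sequence and no processor "owns" the seeds.
    import numpy as np

    def make_particle_streams(n_particles, master_seed=12345):
        children = np.random.SeedSequence(master_seed).spawn(n_particles)
        return [np.random.default_rng(s) for s in children]

    # Whichever processor currently holds a particle draws the next number
    # from that particle's own stream, so its history is identical no matter
    # how many processors the run uses.
    streams = make_particle_streams(4)
    flights = [rng.exponential() for rng in streams]  # e.g. distance to next collision
    [/code]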

    All in all, I have coded both deterministic methods as well as Monte Carlo methods. The parallelism
    of the Monte Carlo methods is at least one or two orders of magnitude more complex than that of
    the deterministic methods.

    Dr. Gregory Greenman
    Physicist
     
    Last edited: Oct 21, 2007
  19. Oct 21, 2007 #18

    Morbius

    Science Advisor

    quetzalcoatl,

    That's for the UNCLASSIFIED systems. You don't know what the CLASSIFIED ones are doing.

    Dr. Gregory Greenman
    Physicist
     
  20. Oct 21, 2007 #19
    Morbius,

    Yes, I am familiar with domain decomposition methods - in fact, I utilize decompositions in solving for the many-body polarization states (iterative Gauss-Seidel solver with preconditioning) in my MC code.

    It sounds like you are focusing on a very particular type of problem - neutron diffusion in some fissile material. However, keep in mind that if one were, for example, solving for a statistical mechanical observable, then doing the trivial parallelization where each processor handles a separate (independent) sample is appropriate - in fact, desirable, since the goal is to sample as many states from the ensemble as possible, and this method scales linearly; one could hardly do better for this type of problem! The same is true in path integral MC, since one wishes to calculate the density matrix accurately - far from being wasteful, starting independent configurations on many processors separately is the best way to do this!

    Molecular dynamics is more like the case that you describe, since we are propagating the entire system along a trajectory in phase space - in which case the forces/potential are decomposed by domain. There are very effective libraries for doing so (CHARM++ being one) that can take advantage of the various interconnect topologies.

    Keep in mind that "Monte Carlo" does not refer to neutron diffusion - it is nothing more than an efficient way of numerically doing integrals!

    Regards,

    Q
     
  21. Oct 21, 2007 #20
    Morbius,

    I do not know anything about classified systems - I would never have guessed that such large systems would even be classified, since the current #1 BG at LLNL is certainly not classified (actually #1, #3, and #6, all NNSA, are listed on top500.org).

    Q
     
  22. Oct 21, 2007 #21

    Morbius

    Science Advisor

    quetzalcoatl,

    You sure are batting ZERO today. The issue DEFINITELY WAS performance.

    It was the US Congress that commissioned BlueGene/L, and the stated intent was to recapture
    the #1 slot of the Top500 list for the USA, because at the time the #1 slot had been held by Japan's
    Earth Simulator.

    Evidently you don't understand that it didn't matter HOW IBM achieved the #1 supercomputer
    designation. The charge from the US Congress was that they recapture the #1 title for the USA.

    For someone to say that performance wasn't an issue shows a profound lack of understanding.
    You think I don't already know this!!! I know even more about this than I
    am allowed to tell. Please don't patronize me by implying that I don't know
    about the Blue Gene architecture.
     
  23. Oct 21, 2007 #22

    Morbius

    Science Advisor

    Quetzalcoatl,

    WRONG again. The #1 BlueGene/L at LLNL is a CLASSIFIED system.

    Its existence has been disclosed, but it is NOT online on any open network.

    All the machines you listed were CLASSIFIED during the procurement and build process.

    Part of the workload for these machines will be calculations in support of the Dept. of Energy/
    National Nuclear Security Administration Stockpile Stewardship missions.

    Therefore, because of the nature of some of the work that will be run on these machines; the
    details of what is being built and how is not revealed.

    When there is a machine sitting inside of a secure computer room, inside a secure building, inside
    a secure laboratory; it does no harm to announce the existence of the machine. Nobody can exploit
    that information for nefarious purposes anymore; the machine is secure.

    Dr. Gregory Greenman
    Physicist
     
  24. Oct 21, 2007 #23
    That is up to you: you obviously did not know that the goal was an increased ratio of performance per watt and performance per m^2, as I pointed out (is IBM lying in the white paper I posted the link to?)... that is how/why the #1 slot was attained, since you can stuff 1024 nodes in a single, standard rack... big deal. The goal was NOT to increase the performance per node, as you had originally implied.

    It was not my intent to make this nasty, but that is your tone: how could you not know about the performance/power ratio?!?! Did you think that 440 MHz PPCs were chosen for their individual performance?

    YOU called my claims about the performance/power ratio wrong, but in fact I have cited that this is the correct design motive! I have posted an entire document about this!

    Congress' motives are irrelevant to this discussion.

    Q
     
  25. Oct 21, 2007 #24

    Morbius

    Science Advisor

    quetzalcoatl,

    NOPE - we're not focusing on a very particular type of problem, as far as the code is concerned.
    The code is a multi-physics, multi-use code.

    I focused on the very particular type of problem - the neutron transport in a reactor - as an example
    of where deterministic methods best Monte Carlo methods hands down, and will continue to do
    so for a number of years. That was merely to rebut the contention that Monte Carlo would
    be more efficient than deterministic methods for reactor calculations.
    You sure didn't understand my previous post as to why you CAN'T do that!!!

    In order to realize this trivial parallelization, you need a single processor to follow a simulation
    particle from birth to death. That means that the processor has to know the specifications of the
    entire problem - since the simulation particle can go anywhere. However, we are dealing with very
    LARGE simulations. Even the US Gov't can only afford enough memory for ONE copy
    of the problem description to be in computer memory.

    That single copy of the problem is divided among all the CPUs assigned to the task. Since no one
    CPU has the entire description of the problem; then one CPU can not follow the particle from birth to
    death. The particle has to jump from processor to processor to processor in the course of its
    calculational lifetime.

    It is the synchronizing of all these jumps from processor to processor - as well as things like
    population control, tallies... ALL of which need multiple processors coordinating - that breaks
    the trivial parallelism.

    Therefore, one never has the trivial parallelism, except for brief instants of time. Yes - there are
    brief periods when a single processor tracks a single particle. But that doesn't last very long before
    one of these events that requires coordination among the processors happens.
    The genesis of "Monte Carlo" WAS for doing neutron transport [ not diffusion; which is a low-order approximation to transport ].
    The "Monte Carlo" method was invented for doing neutron transport. Later, it was realized that one could also do integrals.

    Consider the history of the Monte Carlo method:
    http://en.wikipedia.org/wiki/Monte_Carlo_method

    "Monte Carlo methods were originally practiced under more generic names such as "statistical sampling". The
    "Monte Carlo" designation, popularized by early pioneers in the field (including Stanislaw Marcin Ulam,
    Enrico Fermi, John von Neumann and Nicholas Metropolis), is a reference (Nicholas Metropolis noted that
    Stanislaw Marcin Ulam had an uncle who would borrow money from relatives because he "just had to go to
    Monte Carlo.") to the famous casino in Monaco. Its use of randomness and the repetitive nature of the process
    are analogous to the activities conducted at a casino. Stanislaw Marcin Ulam tells in his autobiography
    Adventures of a Mathematician that the method was named in honor of his uncle, who was a gambler, at the
    suggestion of Metropolis."


    In case you don't recognize the cast of characters - Ulam, Fermi, von Neumann, Metropolis - they were all
    pioneers in the field of neutron and radiation transport at Los Alamos National Laboratory, where the technique
    was invented. Monte Carlo started out as a way of doing nuclear particle simulations. Its use as a numerical
    integration technique was realized later.

    Dr. Gregory Greenman
    Physicist
     
    Last edited: Oct 21, 2007
  26. Oct 21, 2007 #25
    I think that if you stop and listen to what I am saying, you will realize that it is correct.

    Consider, for example, a periodic system of N gas molecules contained in a specific volume at a specific temperature. Let's say we have 1000 processors available on some system. If we have 1000 uncorrelated starting configurations, we could start up 1000 separate Monte Carlo simulations (with individual RNG seeds) in order to calculate some observable (let's say, for example, the pressure of the gas). Each processor calculates its own average pressure. At the end of a certain number of steps, those 1000 averages (and the configurations too, if we wish) are sent back to the head node for the final averaging.

    This is what we call trivial parallelization, and it is VERY useful for studying physical systems.
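
    A rough sketch of this scheme in Python with multiprocessing (the observable here is a stand-in of mine):

    [code]
    # Rough sketch (mine): independent chains, one per worker, each with its
    # own seed; the "head node" just averages the per-worker averages.
    import numpy as np
    from multiprocessing import Pool

    def run_chain(seed, n_steps=100_000):
        rng = np.random.default_rng(seed)
        samples = rng.random(n_steps)       # stand-in for sampled pressures
        return samples.mean()               # this worker's average

    if __name__ == "__main__":
        seeds = np.random.SeedSequence(2007).spawn(8)   # uncorrelated streams
        with Pool(processes=8) as pool:
            averages = pool.map(run_chain, seeds)
        print(np.mean(averages))            # final averaging on the head node
    [/code]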
     