
GPU Programming

  1. Mar 19, 2010 #1


    Homework Helper

  3. Mar 19, 2010 #2
    Yes, GPUs outperform CPUs at this kind of computation.

    I have an Nvidia GeForce 9800, which is one of the lowest-end graphics cards for gamers these days.

    I ran a benchmark program that comes with Nvidia's CUDA toolkit. The benchmark pulled about 230 gigaflops (single precision) from the GPU, while the CPU pulls around 2 gigaflops or less.
  4. Mar 19, 2010 #3


    Homework Helper

    Thanks. I also found the following link that explains a lot:


    Here's what I don't understand. In order to use the GPU for an intensive application like computational fluid dynamics (CFD), do I need to write or use specialized code for parallel processing, or will existing code run ~100x faster with no modification? Is it that simple?

    Does this mean I could buy a new Nvidia graphics card for my ~5-year-old desktop computer and turn it into a computing monster?
  5. Mar 19, 2010 #4
    You have to modify the code, unless the software you are trying to run is already CUDA-enabled.

    The CUDA toolkit gives you all the necessary tools and libraries to program in C right away. From what I understand it's very easy to learn if you know C; in fact, the code that talks to the GPU follows the same syntax as C. I haven't actually learned to program the GPU yet, but it's on my to-do list.

    Browse around the Nvidia site. They have lots of tutorials, video lectures, and material on parallel algorithms for matrix multiplication and n-body problems.
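
    For a taste of the syntax, the canonical first example in those tutorials is a vector add. Here's a sketch from memory, so treat names like vecAdd as illustration rather than anything official:

    // vecadd.cu -- minimal CUDA example; compile with: nvcc vecadd.cu
    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    // The kernel is plain C plus the __global__ qualifier; each thread adds one element.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }

    int main(void)
    {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a = (float *)malloc(bytes);
        float *b = (float *)malloc(bytes);
        float *c = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        float *da, *db, *dc;                        // device-side copies
        cudaMalloc((void **)&da, bytes);
        cudaMalloc((void **)&db, bytes);
        cudaMalloc((void **)&dc, bytes);
        cudaMemcpy(da, a, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, b, bytes, cudaMemcpyHostToDevice);

        // Launch enough 256-thread blocks to cover all n elements.
        vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
        cudaMemcpy(c, dc, bytes, cudaMemcpyDeviceToHost);

        printf("c[0] = %f\n", c[0]);                // expect 3.000000
        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(a); free(b); free(c);
        return 0;
    }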
  6. Mar 19, 2010 #5


    Homework Helper

    Ah, so it's limited to C. I guess you need a compiler that is specific to the GPU, hence the need for the SDK. I'm beginning to get the picture. Rats, I was hoping I could have some fun with the vast library of existing Fortran programs that are out there.

    Anyway, I'll browse. Thanks.
  7. Mar 20, 2010 #6


    Gold Member

  8. Mar 24, 2010 #7
    It's not limited to C anymore. CUDA 3 officially supports C++ and Fortran. There are also unofficial bindings for C#, Python, and others, but those can get unpleasant. And yes, CPUs are pretty much done for, both in scientific and desktop computing.

    GF100 processors have a programmable L2 cache, which makes them more suitable for native programming and code translation. Essentially, they behave much more like a CPU from a programmer's perspective. They're also a boatload faster than the last generation, especially with 64-bit floats.
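
    To illustrate the 64-bit point: doubles have actually been available since compute capability 1.3 (the GT200 generation), but you have to ask nvcc for them explicitly, otherwise they get demoted to floats (the compiler warns about it). A sketch, where the name daxpy is just illustration:

    // Compile with: nvcc -arch=sm_13 daxpy.cu   (use sm_20 for GF100)
    // Without the arch flag, older toolchains demote double to float with a warning.
    __global__ void daxpy(double a, const double *x, double *y, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];   // evaluated in hardware double precision
    }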
    Last edited: Mar 24, 2010
  9. Mar 24, 2010 #8


    Staff Emeritus
    Science Advisor
    Gold Member

    Parallel programming is hard even in the best of cases. GPUs are not the best of cases: they put rather significant constraints on how you can store, access, and manipulate data. They are very good at doing things similar to their designed purpose, but to my knowledge it is difficult to use them efficiently for dissimilar tasks.
  10. Mar 25, 2010 #9
    CUDA is easy to learn, but it's hard to achieve the "theoretical" peak performance. There are all sorts of quirks and bottlenecks that limit what you can do and how fast you can do it on a GPU. For example, there's no such thing as cheap random memory access: uncoalesced access to global memory is serialized and carries significant penalties. Threads can only cooperate through what's called "shared memory", which on pre-GF100 processors is limited to 16 kilobytes per multiprocessor (the GeForce 9800 GTX has 16 multiprocessors, the GTX 285 has 30), or effectively less than 1 KB per thread. And even then there are limitations on what you can access without penalties.
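
    To make the shared-memory point concrete, here's a sketch of the usual block-wise sum reduction, in the style of Nvidia's SDK examples (simplified, so read it as illustrative):

    // Each block stages its chunk of global memory into the fast on-chip
    // shared memory, then does the whole reduction there.
    #define BLOCK 256

    __global__ void blockSum(const float *in, float *out, int n)
    {
        __shared__ float s[BLOCK];                 // lives in the 16 KB per multiprocessor
        int i = blockIdx.x * BLOCK + threadIdx.x;

        s[threadIdx.x] = (i < n) ? in[i] : 0.0f;   // one coalesced read from global memory
        __syncthreads();

        // Tree reduction entirely in shared memory: no further global traffic.
        for (int stride = BLOCK / 2; stride > 0; stride /= 2) {
            if (threadIdx.x < stride)
                s[threadIdx.x] += s[threadIdx.x + stride];
            __syncthreads();
        }

        if (threadIdx.x == 0)
            out[blockIdx.x] = s[0];                // one partial sum per block
    }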

    On top of that, "theoretical" performance is always quoted in terms of 32-bit floats, mainly because that's what GPUs are optimized for. If you want to work with 8-bit integers (maybe you're doing video processing?), modern CPUs provide SIMD instructions that operate on 16 of them at once. On a GPU, you have to do everything byte by byte.
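
    For comparison, here's what the CPU side of that looks like with SSE2 intrinsics, which every x86-64 chip supports (a small self-contained sketch):

    // sse_bytes.c -- 16 byte-wide additions per instruction on the CPU
    #include <emmintrin.h>   // SSE2 intrinsics
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t a[16], b[16], c[16];
        for (int i = 0; i < 16; ++i) { a[i] = (uint8_t)i; b[i] = 100; }

        __m128i va = _mm_loadu_si128((const __m128i *)a);
        __m128i vb = _mm_loadu_si128((const __m128i *)b);
        __m128i vc = _mm_add_epi8(va, vb);        // 16 additions in one instruction
        _mm_storeu_si128((__m128i *)c, vc);

        printf("%d %d\n", c[0], c[15]);           // prints: 100 115
        return 0;
    }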

    But there's hope. The GF100 family should be nicer than its predecessors for programming purposes (though, as everyone knows, Nvidia is having severe problems with yields, and no one knows whether they will ship even 10,000 of them worldwide before the end of 2010). Eventually GPUs will learn SIMD and adapt to all data types, and things will get better.

    Modern multi-core CPUs can pull on the order of 50 gigaflops (e.g., four cores × 3 GHz × 4 single-precision operations per cycle with SSE comes to roughly 48 gigaflops).
    Last edited: Mar 25, 2010
  11. Apr 18, 2010 #10
  12. Aug 10, 2010 #11
    Leveraging GPUs does not always require C, C++, or Fortran. For some applications, MATLAB with Jacket from AccelerEyes can get a pretty good performance return: http://www.accelereyes.com
  13. Aug 10, 2010 #12


    Homework Helper

    There is also a math library for ATI GPUs:

    http://developer.amd.com/gpu/acmlgpu/pages/default.aspx [Broken]
    Last edited by a moderator: May 4, 2017
  14. Aug 10, 2010 #13


    Science Advisor
    Homework Helper

    OpenCL is a little simpler than CUDA; it's more C-like, while a lot of CUDA is closer to the OpenGL shader language and more assembler-ish.
    OpenCL is also cross-platform: it will run on Nvidia and ATI cards, and will transparently fall back to multi-core CPUs if no GPU is available.

    On Nvidia hardware it's basically translated into the same instructions as CUDA by the CL compiler at run time.
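
    To give a feel for the difference: the kernel itself is nearly the same C, but it's handed to the driver as a string and compiled for whatever device is present at run time. Here's a minimal vector add in C, with error checking stripped for brevity (link with -lOpenCL; the names are just illustration):

    // clvadd.c -- minimal OpenCL sketch; compile with: gcc clvadd.c -lOpenCL
    #include <CL/cl.h>
    #include <stdio.h>

    static const char *src =
        "__kernel void vadd(__global const float *a, __global const float *b,\n"
        "                   __global float *c)\n"
        "{ int i = get_global_id(0); c[i] = a[i] + b[i]; }\n";

    int main(void)
    {
        enum { N = 1024 };
        float a[N], b[N], c[N];
        for (int i = 0; i < N; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        cl_platform_id plat;
        cl_device_id dev;
        cl_int err;
        clGetPlatformIDs(1, &plat, NULL);
        // CL_DEVICE_TYPE_DEFAULT falls back to the CPU if no GPU is available.
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, &err);
        cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, &err);

        // The kernel source is compiled for the device at run time.
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
        clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
        cl_kernel k = clCreateKernel(prog, "vadd", &err);

        cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(a), a, &err);
        cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(b), b, &err);
        cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof(c), NULL, &err);

        clSetKernelArg(k, 0, sizeof(da), &da);
        clSetKernelArg(k, 1, sizeof(db), &db);
        clSetKernelArg(k, 2, sizeof(dc), &dc);

        size_t global = N;
        clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
        clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof(c), c, 0, NULL, NULL);

        printf("c[0] = %f\n", c[0]);   // expect 3.000000
        return 0;
    }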
  15. Aug 16, 2010 #14
    Does anyone know if there are any code/software/hardware platforms out there that will do DFT (density functional theory) calculations using OpenCL?