• #1
Dr Transport
Science Advisor
Insights Author
Gold Member
2,538
680
INTRODUCTION
As a long-time computer programmer, and a High-Performance Computer (HPC) user for almost as long, I really didn’t know anything about how these machines actually worked under the hood. I still don’t, really. A few years ago, when I was working at one of the US National Labs, I decided that a fun way to learn would be to build my own Raspberry Pi cluster. Well, life got in the way: I started collecting parts for it but never got around to building it.
Last year, my employer spent millions of dollars on a whole new set of HPCs, and so far we haven’t gotten the bang for the buck we had anticipated. To get them into the hands of the working engineers, we broke some of them apart into their constituent pieces and made them mimic the machines we already had on hand. This is all well and good, but we have machines with 2000 nodes that we would like to set up and use on CFD...


Answers and Replies

  • #2
35,631
7,501
Very nice! Have fun with your Pi setup!
I've done some experimenting with multithreading on my Dell workstation at home, which has a 10-core Intel Xeon Scalable processor. I've found that unless each thread is doing a lot of work, the overhead of setting up the threads means a single thread will probably finish sooner.
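A minimal sketch of that overhead effect, in Python rather than on the workstation setup described above. The function names (`sum_squares`, `threaded_sum_squares`, `average_seconds`) and the input size are illustrative assumptions. In CPython the global interpreter lock also keeps threads from speeding up pure-Python arithmetic, so this only illustrates the fixed cost of creating and joining threads when each one has very little to do.

```python
import threading
import time


def sum_squares(values):
    """Single-threaded reference: sum of v*v over values."""
    return sum(v * v for v in values)


def threaded_sum_squares(values, n_threads):
    """Same result, but split into n_threads contiguous chunks."""
    chunk = (len(values) + n_threads - 1) // n_threads  # ceiling division
    partial = [0] * n_threads

    def worker(index):
        start = index * chunk
        partial[index] = sum_squares(values[start:start + chunk])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(partial)


def average_seconds(func, *args, repeats=200):
    """Average wall-clock time of func(*args) over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    tiny = list(range(64))  # far too little work to justify four threads
    print("single thread:", average_seconds(sum_squares, tiny))
    print("four threads: ", average_seconds(threaded_sum_squares, tiny, 4))
```

For an input this small, the four-thread timing would be expected to come out larger than the single-thread one, although the exact numbers depend on the machine.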
 
  • #3
Dr Transport
Science Advisor
Insights Author
Gold Member
2,538
680
Very true, but in my case I am using machines with ~100 nodes at work, and most of the tools developed haven't been upgraded to handle machines of that type. I'm doing some Python development, and having some of my codes use mpi4py would help throughput somewhat.

Mostly what I'm doing is to play around because I've never had any formal education in multi-node machines.
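For anyone else playing around, here is a rough sketch of what splitting a loop across ranks with mpi4py can look like. The work function `process`, the helper `split_for_rank`, and the item list are hypothetical stand-ins, not anything from the codes mentioned above. The mpi4py calls used (`MPI.COMM_WORLD`, `Get_rank`, `Get_size`, `gather`) are part of its standard API, and the sketch falls back to a single process if mpi4py is not installed.

```python
def split_for_rank(items, rank, size):
    """Return the strided share of `items` owned by `rank` out of `size` ranks."""
    return items[rank::size]


def process(item):
    """Hypothetical stand-in for the real per-item work."""
    return item * item


def main():
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    except ImportError:
        # Assumption: without mpi4py, behave like a one-rank job.
        comm, rank, size = None, 0, 1

    items = list(range(20))  # illustrative workload
    local = [process(x) for x in split_for_rank(items, rank, size)]

    # Collect every rank's partial results on rank 0.
    gathered = comm.gather(local, root=0) if comm is not None else [local]

    if rank == 0:
        total = sum(sum(part) for part in gathered)
        print("ranks:", size, "total:", total)


if __name__ == "__main__":
    main()
```

Saved under a name of your choosing, this would typically be launched with something like `mpiexec -n 4 python your_script.py`. The strided split keeps the share per rank within one item of each other without any communication up front.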
 
  • #4
35,631
7,501
Converting a program that runs in a single process and parallelizing it to take advantage of multiple/many cores has turned out to be a fairly difficult problem. A natural split for many programs is to have user interface stuff in one thread, and calculations in one or more other threads. A problem that arises is trying to keep multiple threads occupied, rather than having some of them idle while others are chugging away. For your situation with CFD calculations that take on the order of months, it really makes sense to split the computations among a bunch of nodes.

What I was doing with my 10-core machine was writing Intel AVX-512 (Advanced Vector Extensions) assembly code that used SIMD (single instruction, multiple data) instructions, which can do calculations on 16 floats (512 bits) in a single operation. The calculations happened so quickly that it didn't make sense to split them into multiple threads -- setting up the threads took orders of magnitude longer than the calculations required.
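The point about keeping every thread occupied can be illustrated with a small simulation. This is only a sketch under assumed task durations: it compares pre-splitting the tasks into equal-sized blocks against handing each task to whichever worker becomes free first. The function names and the duration list are made up for illustration.

```python
import heapq


def static_makespan(durations, n_workers):
    """Finish time when tasks are pre-split into equal-count contiguous blocks."""
    if not durations:
        return 0
    block = (len(durations) + n_workers - 1) // n_workers  # ceiling division
    loads = [sum(durations[i:i + block]) for i in range(0, len(durations), block)]
    return max(loads)


def dynamic_makespan(durations, n_workers):
    """Finish time when each task goes to whichever worker frees up first."""
    free_at = [0] * n_workers  # min-heap of the times at which workers go idle
    for duration in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


if __name__ == "__main__":
    tasks = [9, 1, 1, 1, 1, 1, 1, 1]  # one long task followed by many short ones
    print(static_makespan(tasks, 2))   # 12: one worker idles while the other grinds
    print(dynamic_makespan(tasks, 2))  # 9: short tasks fill in around the long one
```

With the static split, the second worker finishes after 4 time units and then sits idle for 8, which is exactly the situation described above.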
 
  • #5
Dr Transport
Science Advisor
Insights Author
Gold Member
2,538
680
One of the problems I've been working on lately is mapping temperatures to a mesh for inclusion in another analysis. I take CFD-generated temperatures, map them to another mesh, and run that in a second analysis code. The run time scales linearly with mesh size, so if 100K facets take an hour, 1M facets take ~10 hours. I really don't want to wait all day or overnight to get the following calculation running, because management believes that anything that runs longer than overnight or more than a day is wasted time. (This whole thing stems from the fact that management is very impatient. I keep telling them that good measurements or calculations are like fine wine: they take time to make.)
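As a back-of-the-envelope sketch of that scaling, the arithmetic can be written out as below. The reference point (100K facets in one hour) is from the numbers above. The parallel estimate is an added assumption: it uses Amdahl's law with a `serial_fraction` parameter, and assumes the mapping can be divided evenly among workers with no communication cost.

```python
def serial_hours(n_facets, ref_facets=100_000, ref_hours=1.0):
    """Linear extrapolation from the reference point (100K facets per hour)."""
    return ref_hours * n_facets / ref_facets


def parallel_hours(n_facets, n_workers, serial_fraction=0.0):
    """Amdahl's-law estimate; serial_fraction is the share that cannot be split."""
    total = serial_hours(n_facets)
    return total * (serial_fraction + (1.0 - serial_fraction) / n_workers)


if __name__ == "__main__":
    print(serial_hours(1_000_000))                               # 10.0 hours on one process
    print(parallel_hours(1_000_000, 10))                         # ideal split over 10 workers
    print(parallel_hours(1_000_000, 10, serial_fraction=0.1))    # with 10% unsplittable work
```

Even a modest serial fraction limits the gain, so the ideal figure should be read as a lower bound on run time rather than a prediction.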
 
  • #6
pbuk
Science Advisor
Gold Member
2,756
1,464
Interesting article, and a Pi cluster is an interesting thing in its own right, but can't you run your CFD computations on a GPU?
 
  • #7
Dr Transport
Science Advisor
Insights Author
Gold Member
2,538
680
Sure, but you forget that the codes have to be specifically written or adapted for that. I use a code at work that doesn't have GPU support for some of its functionality. My computers at home don't have any additional GPUs either, just the graphics capability on the motherboard, because I didn't specifically purchase one.
 
  • #8
759
294
Dear Santa. Please can I have an HPC for Christmas.

You sure can - actually I just happen to have the 'starter kit' for you right now :wink:
 
  • #9
759
294
I was thinking of writing an article on net-booting Raspberry Pis. Just wondering if that might be a useful idea for your cluster, instead of running the individual stations from SD cards? I would imagine it might be easier when it comes to upgrading.

Also found the following product, which may or may not be of interest:

https://shop.pimoroni.com/products/cluster-hat
 
  • #10
Dr Transport
Science Advisor
Insights Author
Gold Member
2,538
680
Interesting, I never thought about net-booting; I might have to investigate that. The SD cards have the operating system on them, so I don't know how to get around that.

The Cluster HAT is another interesting find. To be honest, all the sites out there show a system like mine with the hardware and associated fans. I'm not sure which would be better. From a heat standpoint, I can't see the Cluster HAT handling heat dissipation as well as individual fans.
 
  • #11
759
294
A Pi without an SD card has enough intelligence to reach a TFTP server and be directed to a boot directory from which it can load its boot files. Ultimately it reaches a point where it can connect to network storage via NFS. Expect a fairly steep learning curve though - there are quite a few 'ingredients' needed to make it all happen!

https://linuxhit.com/raspberry-pi-pxe-boot-netbooting-a-pi-4-without-an-sd-card/

I guess the only way to test the Pi Zero cluster product would be to buy one and try it out.
 
