Mastering Parallel Computing on Linux: From Cluster Setup to 16 Processors

AI Thread Summary
To start with parallel processing and building a cluster, it's recommended to explore the Message Passing Interface (MPI), which is essential for linking programs on clusters. While MPI is useful for coding in a parallel environment, those looking to create their own cluster should focus on Linux resources specific to cluster building. For a 16-node setup, selecting fast CPUs, a motherboard with Gigabit Ethernet, and ample RAM is crucial. It’s noted that dual-CPU configurations can outperform single-CPU setups in certain scenarios. For larger clusters, investing in reliable rack-mounted hardware is essential for stability, as well as understanding network tuning and using quality switches to avoid performance issues. Additionally, for software development, utilizing cloud computing services like Amazon EC2 can be beneficial. Functional programming languages such as Haskell and Erlang are highlighted for their suitability in this domain due to their side-effect-free nature, suggesting a vibrant hobbyist community for further exploration.
welatiger
I asked a question earlier saying "I need to do parallel processing", but I still want to know where to start.

I need to learn parallel computing, i.e. I hope to build a cluster.

The topic is Linux parallel processing using clusters; we have 16 processors.
 
MPI is a library for writing code that runs on a parallel system. However, if you're trying to MAKE a parallel system (i.e. build your own cluster), that's not what you need. I'd start by looking through Linux websites on cluster building (it also depends a lot on the kind of cluster you want to build).
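To give a feel for the message-passing style that MPI code follows, here is a minimal sketch. It is not MPI: it uses threads and queues from Python's standard library as a stand-in for processes on separate nodes. The names `worker` and `run_job`, and the idea of one "inbox" queue per rank, are assumptions made purely for illustration.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive a chunk of numbers, sum it, and send the partial result back."""
    chunk = inbox.get()              # blocking "receive" from the coordinator
    outbox.put((rank, sum(chunk)))   # "send" the partial sum to the coordinator


def run_job(data, n_workers=4):
    """Scatter data over n_workers stand-in ranks and reduce their partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()

    # Scatter: worker r gets every n_workers-th element starting at index r.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])

    # Reduce: collect one partial sum per worker, in whatever order they arrive.
    total = sum(results.get()[1] for _ in range(n_workers))

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(run_job(list(range(1, 101))))  # prints 5050, the sum of 1..100
```

A real MPI program has the same shape (scatter the data, compute locally, reduce the results), but the ranks are separate processes that may sit on different machines.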
 
Good point!

What we found was that at 16 nodes pretty much anything will work: just buy whatever CPU is fastest per dollar at the moment, and get a motherboard with Gigabit Ethernet and as much RAM as you can afford. Look at dual-CPU machines when Dell are having a sale. Eight dual-CPU machines are often faster than sixteen single-CPU ones, because half of your interconnects are then inside a machine and super fast rather than going over the network.
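One way to see the "half of your interconnects" remark is to count links for a specific communication pattern. The sketch below assumes ranks are placed on machines in consecutive blocks and that each rank only talks to the next rank in a ring. Both are assumptions for illustration, and real codes may communicate differently. `count_links` is a hypothetical helper, not part of any library.

```python
def count_links(n_cpus, cpus_per_node):
    """Count ring-neighbour links that stay inside a node vs cross the network.

    Assumes ranks 0..n_cpus-1 fill nodes in consecutive blocks and each rank
    talks only to rank + 1 (wrapping around at the end).
    """
    if n_cpus % cpus_per_node != 0:
        raise ValueError("n_cpus must be a multiple of cpus_per_node")
    internal = network = 0
    for rank in range(n_cpus):
        neighbour = (rank + 1) % n_cpus
        if rank // cpus_per_node == neighbour // cpus_per_node:
            internal += 1   # both ranks sit in the same machine
        else:
            network += 1    # this link goes over Ethernet
    return internal, network


print(count_links(16, 1))  # (0, 16): every link crosses the network
print(count_links(16, 2))  # (8, 8): half the links stay inside a machine
```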

There is an O'Reilly book "Building Beowulf clusters" but it is out of date and wasn't very good when it was new.

For larger clusters (more than 64 nodes) it's worth buying decent rack-mount hardware from a proper vendor; otherwise you never have a system that stays stable long enough to complete a job before some fan fails and a machine hangs.
Racks, networking and cooling start to cost you as much as the hardware at this point.
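The stability problem can be made concrete with a rough model. The sketch below assumes nodes fail independently at a constant rate and that one failure kills the whole job. The model and the 10,000-hour mean time between failures are illustrative assumptions, not measurements from any real cluster.

```python
import math


def completion_probability(n_nodes, job_hours, node_mtbf_hours):
    """Probability that no node fails during the job.

    Assumes independent failures at a constant rate per node (an exponential
    model, chosen here only for simplicity).
    """
    return math.exp(-n_nodes * job_hours / node_mtbf_hours)


ASSUMED_MTBF_HOURS = 10_000  # hypothetical per-node figure
for nodes in (16, 64, 256):
    p = completion_probability(nodes, job_hours=48, node_mtbf_hours=ASSUMED_MTBF_HOURS)
    print(f"{nodes:>3} nodes: {p:.2f} chance a 48-hour job finishes")
```

With these made-up numbers the chance of finishing drops from roughly 0.93 at 16 nodes to roughly 0.29 at 256 nodes, which is the effect described above: per-node reliability matters far more as the cluster grows.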

Learn about network tuning and TCP packets, and buy decent switches; don't daisy-chain home-grade ones. If you need lower latency than Ethernet can give you, it's probably time to pay the experts.
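To see why latency rather than raw bandwidth is often the limit, a simple cost model helps: time per message is a fixed latency plus size divided by bandwidth. The sketch below uses the Gigabit Ethernet line rate and an assumed 50 microsecond latency. That latency is an illustrative placeholder, not a measured value, and `transfer_time` is a hypothetical helper.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple cost model: fixed per-message latency plus size over bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


GIGABIT_BYTES_PER_S = 1e9 / 8   # 1 Gbit/s expressed in bytes per second
ASSUMED_LATENCY_S = 50e-6       # illustrative figure only, not a measurement

for size in (64, 64 * 1024, 64 * 1024 * 1024):
    t = transfer_time(size, ASSUMED_LATENCY_S, GIGABIT_BYTES_PER_S)
    latency_share = ASSUMED_LATENCY_S / t
    print(f"{size:>10} bytes: {t * 1e3:.3f} ms, latency share {latency_share:.0%}")
```

Under this model small messages are almost entirely latency, so a code that exchanges many tiny messages gains little from more bandwidth.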
 
If you are interested in the software development side of it, I would recommend just buying CPU time on one of the many "cloud computing" networks. Look into Amazon EC2 or Sun's Grid.

Also, functional programming in Haskell, Erlang or Standard ML is ideal because of its "no side effects" nature: a function that touches nothing outside itself can safely be run on any processor, in any order.
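The "no side effects" idea can be illustrated in any language. The sketch below is in Python rather than one of the functional languages named above, purely for brevity. `collatz_steps` and `parallel_map` are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def collatz_steps(n):
    """Pure function: the result depends only on n and nothing outside is modified."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def parallel_map(func, items, workers=4):
    """Apply func to every item using a pool of workers, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    inputs = list(range(1, 11))
    # Because collatz_steps has no side effects, the parallel result must
    # match the sequential one regardless of which worker ran which input.
    assert parallel_map(collatz_steps, inputs) == [collatz_steps(n) for n in inputs]
    print(parallel_map(collatz_steps, inputs))
```

Threads are used here only so the example runs anywhere. For CPU-bound work in CPython, `ProcessPoolExecutor` from the same module offers the same `map` interface with separate processes.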
 
I feel like there's probably a sizable hobby community for this kind of stuff. If you can find the right website there's probably a wealth of information.
 
Thank you so much.
 