Why not more cores in today's CPUs?

Vanadium 50 · Nov 21, 2009

Perhaps someone can explain this to me.

An i7 has a transistor count of 731M. A Pentium has a transistor count of 3M. So in principle, Intel could build a 200 core processor today. Each core would be no "smarter" than a Pentium, but that's pretty much all you need.

Now, I recognize that this is extreme, and such a processor would be bottlenecked by memory bandwidth, but the basic idea seems sound - gain throughput by putting more, less capable, cores on the chip: 20 or 25 Pentium 4's seems not to be outside the realm of possibility.
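
As a rough check on the arithmetic in this post, here is a minimal sketch using only the transistor counts quoted above. The function names are made up for illustration, and it assumes the entire budget goes to cores, with nothing left for cache, interconnect, or I/O.

```python
# Transistor counts as quoted in the post (approximate figures).
I7_TRANSISTORS = 731_000_000
PENTIUM_TRANSISTORS = 3_000_000


def cores_that_fit(budget, per_core):
    """Upper bound on core count if every transistor went into cores."""
    return budget // per_core


def per_core_budget(budget, cores):
    """Transistors available to each core for a given core count."""
    return budget / cores


if __name__ == "__main__":
    print("Pentium-class cores in an i7 budget:",
          cores_that_fit(I7_TRANSISTORS, PENTIUM_TRANSISTORS))
    for n in (20, 25):
        millions = per_core_budget(I7_TRANSISTORS, n) / 1e6
        print(f"{n} cores leaves about {millions:.1f}M transistors per core")
```

So "200 cores" is within the raw budget, and the "20 or 25" figure corresponds to roughly 30M to 37M transistors per core.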

Why don't we see this on the market?

Hepth · Nov 21, 2009

I thought we did? I know our lab had a 35-processor "supercomputer" made of Mac G4s running in parallel back in the day.
I think building such a thing, while useful, is not useful at the consumer level. They DO exist, though (that's what supercomputers are):

Prior to May 2007 the CAS supercomputer was a linux beowulf cluster. In 2002 this became only the second machine in Australia to exceed 1 Tflops in performance and by 2004 further expansion provided a theoretical peak speed of 2 Tflops. The cluster comprised the following hardware:
200 Pentium 4 3.2 GHz nodes
32 Pentium 4 3.0 GHz nodes
90 Dual Pentium 4 2.2 GHz server class nodes.

So they do it, but it's expensive to build and cool. 32 processors is still a LOT of heat.

Vanadium 50 · Nov 21, 2009

There are many parallel clusters. I'm talking about a single chip - allocating the feature count differently.

You do bring up a good point about heat, though.

slider142 · Nov 21, 2009

Too many bottlenecks on consumer-level motherboards and devices, plus the heat problem. There are already ridiculously sized cooling devices on existing motherboards. As an aside, see Mark Russinovich running Windows 7 on a 256-processor computer (not a processor with 256 cores).

Borek · Nov 21, 2009

slider142 said:

running Windows 7 on a 256-processor computer

256 processors... I don't think my room is large enough for a Windows 7 machine, so I will stick to my XP.

slider142 · Nov 21, 2009

Borek said:

256 processors... I don't think my room is large enough for a Windows 7 machine, so I will stick to my XP.

LoL. Seriously, Windows 7 is more efficient than XP at memory management and core management, so if you get a chance, try it out. Unlike Vista, it does not assume you have a PC with a fast graphics processor, and loses or replaces most of the background services that make Windows Vista respond like molasses to simple commands. If you're running it on a laptop, Windows 7 also has better power management (it is designed to run efficiently on netbooks, as opposed to XP which has no optimizations for such low resource devices).
I'm running 7 on an 8-year-old desktop PC and a 5-year-old laptop, and it multitasks much faster than the snail that was Vista, with additional user interface features and hardware management that are not present in XP.

waht · Nov 21, 2009

Intel built an 80-core CPU, but it comes with a new set of problems:

http://news.cnet.com/Intel-shows-off-80-core-processor/2100-1006_3-6158181.html

Intel used 100 million transistors on the chip, which measures 275 millimeters squared. By comparison, its Core 2 Duo chip uses 291 million transistors and measures 143 millimeters squared. The chip was built using Intel's 65-nanometer manufacturing technology, but any likely product based on the design would probably use a future process based on smaller transistors. A chip the size of the current research chip is likely too large for cost-effective manufacturing.

Intel demonstrated the chip running an application created for solving differential equations. At 3.16GHz and with 0.95 volts applied to the processor, it can hit 1 teraflop of performance while consuming 62 watts of power. Intel constructed a special motherboard and cooling system for the demonstration in a San Francisco hotel.
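
The figures in that quote can be turned into two derived numbers. This is a small sketch that takes the quoted values at face value (1 teraflop, 62 watts, 3.16 GHz, 80 cores); the function names are illustrative only.

```python
# Figures as quoted in the CNET excerpt above.
PEAK_FLOPS = 1e12       # 1 teraflop
POWER_WATTS = 62.0
CLOCK_HZ = 3.16e9
CORES = 80


def gflops_per_watt(flops, watts):
    """Energy efficiency in GFLOPS per watt."""
    return flops / watts / 1e9


def flops_per_core_per_cycle(flops, cores, clock_hz):
    """Floating-point operations each core must retire every clock cycle."""
    return flops / (cores * clock_hz)


if __name__ == "__main__":
    print(f"{gflops_per_watt(PEAK_FLOPS, POWER_WATTS):.1f} GFLOPS per watt")
    print(f"{flops_per_core_per_cycle(PEAK_FLOPS, CORES, CLOCK_HZ):.2f} "
          "FLOPs per core per cycle")
```

In other words, each of the 80 cores has to sustain roughly four floating-point operations per clock to reach the quoted teraflop.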

waht · Nov 21, 2009

But GPU graphics cards can surpass that in computation:

The Next Generation CUDA Architecture, Code Named Fermi
The Soul of a Supercomputer in the Body of a GPU
The next generation CUDA architecture, code named “Fermi”, is the most advanced GPU computing architecture ever built. With over three billion transistors and featuring up to 512 CUDA cores, Fermi delivers supercomputing features and performance at 1/10th the cost and 1/20th the power of traditional CPU-only servers.

http://www.nvidia.com/object/fermi_architecture.html

Mech_Engineer · Nov 21, 2009

Graphics cards jumped on the "massively parallel" bandwagon several years ago. Take, for example, ATI's new top-of-the-line card, the Radeon 5870:

MaximumPC said:

This new chip is no shrinking violet in the numbers department. Every number associated with the new Radeon 5800 series is staggering: 2.15 billion transistors, 2.7 trillion floating-point operations a second [TFlops], more than 20 gigapixels per second throughput, 1,600 shader units

The card has 1,600 parallel processing units, albeit simple ones compared to a primary CPU. When used with software optimized for fully parallel operation, speedups of two orders of magnitude are possible. ATI's definition of a "processing unit" differs somewhat from nVidia's, but suffice it to say that with a mere 512 processors the nVidia card will be pretty ridiculous.

Keep in mind the FLOPS quoted for graphics cards are usually single-precision calculations, while CPUs may be performing double-precision. This difference can make comparisons difficult.

Blenton · Nov 23, 2009

I'm no expert, but if you built a CPU out of 200 Pentium cores, its clock speed would be no higher than that of an individual Pentium chip. For that reason, I think it is more important to have a chip with a higher clock speed than one with more parallel cores. Since today's CPUs seem to have hit a limit at around 3.2 GHz, more cores were introduced to push performance past the clock limitation.

Hardware-wise, it is better to have a 3.1 GHz dual-core CPU than a 2.4 GHz quad-core, considering that many programs still don't benefit from a dual-core or quad-core CPU, let alone an N-core CPU.
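
One way to put numbers on this trade-off is Amdahl's law, which is not mentioned in the post but is a standard model for it. The sketch below assumes that performance scales linearly with clock speed and that both chips do the same work per clock; both are simplifications, and the function names are illustrative.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_performance(clock_ghz, cores, parallel_fraction):
    """Toy score: clock speed times Amdahl speedup (assumes equal work per clock)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Compare the two chips from the post for programs that are
    # 0%, 50%, and 90% parallelizable.
    for p in (0.0, 0.5, 0.9):
        dual = relative_performance(3.1, 2, p)
        quad = relative_performance(2.4, 4, p)
        winner = "dual" if dual > quad else "quad"
        print(f"parallel fraction {p:.1f}: dual={dual:.2f} quad={quad:.2f} -> {winner}")
```

Under these assumptions the 2.4 GHz quad-core only pulls ahead once a bit more than 60% of the program's work can run in parallel, which fits the point that poorly threaded software favors the faster dual-core.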

The_Absolute · Nov 23, 2009

When will we see a CPU for a personal computer with TFLOP performance? I don't understand why graphics cards have so many more FLOPS than a CPU has.

mgb_phys · Nov 23, 2009

The_Absolute said:

I don't understand why graphics cards have so many more FLOPS than a CPU has.

Because they are highly parallelized, they can do a lot of FLOPS if you want the same FLOP applied to 6 or 256 memory locations at the same time. If you want a different FLOP applied in each case, they are slow.

It's like the difference between a printer, that is slow but you can have each new character printed differently, and a printing press that can print an entire page in the time it takes to print one character - but you have to have all the characters preset.
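
The same idea as a toy cost model: a deliberately simplified sketch, not how any particular GPU schedules work. It assumes all lanes share one instruction stream, so each distinct operation needs its own pass over the lanes. The function name and operation labels are made up for illustration.

```python
def lockstep_passes(ops_per_lane):
    """Passes needed when every lane must execute the same instruction at once.

    Toy model: one pass per distinct operation requested across the lanes.
    """
    return len(set(ops_per_lane))


if __name__ == "__main__":
    lanes = 256

    # Printing press: the same operation applied to every lane.
    uniform = ["multiply"] * lanes

    # Printer: a different operation wanted on every lane.
    divergent = [f"op_{i}" for i in range(lanes)]

    print("uniform work:  ", lockstep_passes(uniform), "pass")
    print("divergent work:", lockstep_passes(divergent), "passes")
```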

The difficulty of putting more cores onto a single chip is the I/O (apart from how to program the thing). If they all share the same memory bus, then on a 200-core chip each core's access to RAM is up to 200x slower while it waits for its turn, making memory about as fast as disk access. To get around this you have to put a lot of cache onto the chip so that each core's data is already on board. This is what uses most of the i7's transistor count: it has a huge amount of on-chip cache.
The other alternative is to give the cores separate buses, but that means a chip with a lot more pins, which is tricky given that an i7 socket already has 1366.
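
A back-of-the-envelope sketch of the shared-bus argument. It assumes the worst case, where every cache miss waits behind all the other cores. The latencies (1 ns for cache, 60 ns for RAM) are illustrative placeholders, not measurements of any real chip.

```python
def avg_access_ns(cores, hit_rate, cache_ns=1.0, ram_ns=60.0):
    """Average memory access time with one shared bus (worst-case contention).

    Assumption: a miss has to wait its turn behind every other core, so its
    cost is ram_ns * cores. cache_ns and ram_ns are illustrative values only.
    """
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns * cores


if __name__ == "__main__":
    for hit_rate in (0.0, 0.90, 0.99):
        t = avg_access_ns(cores=200, hit_rate=hit_rate)
        print(f"200 cores, hit rate {hit_rate:.2f}: about {t:.0f} ns per access")
```

Even in this crude model, the average access time is dominated by the misses, which is why so much of the transistor budget goes to cache rather than to more cores.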
