Why not more cores in today's CPUs?


by Vanadium 50
Tags: cores, cpus
Vanadium 50
#1
Nov21-09, 12:48 PM
Mentor
P: 15,569
Perhaps someone can explain this to me.

An i7 has a transistor count of 731M. A Pentium has a transistor count of 3M. So in principle, Intel could build a 200 core processor today. Each core would be no "smarter" than a Pentium, but that's pretty much all you need.

Now, I recognize that this is extreme, and such a processor would be bottlenecked by memory bandwidth, but the basic idea seems sound - gain throughput by putting more, less capable, cores on the chip: 20 or 25 Pentium 4s seems not to be outside the realm of possibility.
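The arithmetic behind that estimate can be sketched in a few lines. This is only a rough upper bound: it assumes the whole transistor budget goes to cores, with nothing left for cache or interconnect. The i7 and Pentium counts are the ones quoted above; the Pentium 4 count of 42M is an assumption brought in for illustration and is not from this thread.

```python
# Transistor counts in millions.
I7_TRANSISTORS_M = 731        # figure quoted in the post
PENTIUM_TRANSISTORS_M = 3     # figure quoted in the post
PENTIUM4_TRANSISTORS_M = 42   # assumption: not stated in the thread

def cores_that_fit(budget_m, per_core_m):
    """Upper bound on core count if the entire budget were spent on cores."""
    return budget_m // per_core_m

pentium_cores = cores_that_fit(I7_TRANSISTORS_M, PENTIUM_TRANSISTORS_M)
p4_cores = cores_that_fit(I7_TRANSISTORS_M, PENTIUM4_TRANSISTORS_M)
print(pentium_cores, p4_cores)  # 243 17
```

Under the assumed Pentium 4 count, the bound comes out a little under the 20 to 25 guessed above, but it is in the same ballpark.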

Why don't we see this on the market?
Hepth
#2
Nov21-09, 12:57 PM
PF Gold
P: 444
I thought we did? I know our lab had a 35-processor "supercomputer" made up entirely of Mac G4s running in parallel back in the day.
I think building such a thing, while useful, is not useful at the consumer level. They DO exist, though (that's what supercomputers are):

Prior to May 2007 the CAS supercomputer was a linux beowulf cluster. In 2002 this became only the second machine in Australia to exceed 1 Tflops in performance and by 2004 further expansion provided a theoretical peak speed of 2 Tflops. The cluster comprised the following hardware:
200 Pentium 4 3.2 GHz nodes
32 Pentium 4 3.0 GHz nodes
90 Dual Pentium 4 2.2 GHz server class nodes.
So they do it, but it's expensive to build and cool. 32 processors is still a LOT of heat.
Vanadium 50
#3
Nov21-09, 01:09 PM
Mentor
P: 15,569
There are many parallel clusters. I'm talking about a single chip - allocating the feature count differently.

You do bring up a good point about heat, though.

slider142
#4
Nov21-09, 01:12 PM
P: 876

Too many bottlenecks on consumer-level motherboards and devices, plus the heat problem. There are already ridiculously sized cooling devices on existing motherboards. As an aside, click here to see Mark Russinovich running Windows 7 on a 256-processor computer (not a processor with 256 cores).
Borek
#5
Nov21-09, 01:18 PM
Admin
P: 22,655
Quote by slider142:
running Windows 7 on a 256-processor computer
256 processors... I don't think my room is large enough for a Windows 7 machine, I will stick to my XP.
slider142
#6
Nov21-09, 01:27 PM
P: 876
Quote by Borek:
256-processors... I don't think my room is large enough for Windows 7 machine, I will stick to my XP.
LoL. Seriously, Windows 7 is more efficient than XP at memory management and core management, so if you get a chance, try it out. Unlike Vista, it does not assume you have a PC with a fast graphics processor, and loses or replaces most of the background services that make Windows Vista respond like molasses to simple commands. If you're running it on a laptop, Windows 7 also has better power management (it is designed to run efficiently on netbooks, as opposed to XP which has no optimizations for such low resource devices).
I'm running 7 on an 8-year-old desktop PC and a 5-year-old laptop. It multitasks much faster than the snail that was Vista, and I get the additional user interface features and hardware management that are not present in XP.
waht
#7
Nov21-09, 01:36 PM
P: 1,636
Intel built an 80-core CPU, but it comes with a new set of problems:

http://news.cnet.com/Intel-shows-off...3-6158181.html

Intel used 100 million transistors on the chip, which measures 275 millimeters squared. By comparison, its Core 2 Duo chip uses 291 million transistors and measures 143 millimeters squared. The chip was built using Intel's 65-nanometer manufacturing technology, but any likely product based on the design would probably use a future process based on smaller transistors. A chip the size of the current research chip is likely too large for cost-effective manufacturing.

Intel demonstrated the chip running an application created for solving differential equations. At 3.16GHz and with 0.95 volts applied to the processor, it can hit 1 teraflop of performance while consuming 62 watts of power. Intel constructed a special motherboard and cooling system for the demonstration in a San Francisco hotel.
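As a rough check on the figures quoted above, here is a small sketch of what they imply per watt and per core. It uses only the numbers in the quote (1 teraflop, 62 watts, 3.16 GHz, 80 cores) and assumes the teraflop is spread evenly over all 80 cores, which the article does not state.

```python
PEAK_FLOPS = 1e12    # 1 teraflop, from the quoted article
POWER_W = 62         # watts, from the quoted article
CORES = 80           # from the post above
CLOCK_HZ = 3.16e9    # 3.16 GHz, from the quoted article

# Energy efficiency in GFLOPS per watt.
gflops_per_watt = PEAK_FLOPS / 1e9 / POWER_W

# Floating-point operations each core must retire per clock cycle,
# assuming the work is spread evenly over all cores.
flops_per_core_per_cycle = PEAK_FLOPS / (CORES * CLOCK_HZ)

print(f"{gflops_per_watt:.1f} GFLOPS/W")
print(f"{flops_per_core_per_cycle:.2f} FLOPs per core per cycle")
```

So each of the simple cores would need to sustain roughly four floating-point operations per cycle to reach the quoted peak.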
waht
#8
Nov21-09, 01:41 PM
P: 1,636
But graphics cards (GPUs) can surpass that in raw computation:


The Next Generation CUDA Architecture, Code Named Fermi
The Soul of a Supercomputer in the Body of a GPU
The next generation CUDA architecture, code named “Fermi”, is the most advanced GPU computing architecture ever built. With over three billion transistors and featuring up to 512 CUDA cores, Fermi delivers supercomputing features and performance at 1/10th the cost and 1/20th the power of traditional CPU-only servers.
http://www.nvidia.com/object/fermi_architecture.html
Mech_Engineer
#9
Nov21-09, 03:22 PM
Sci Advisor
PF Gold
P: 2,234
Graphics cards jumped on the "massively parallel" bandwagon several years ago. Take, for example, ATI's new top-of-the-line card, the Radeon 5870:

Quote by MaximumPC:
This new chip is no shrinking violet in the numbers department. Every number associated with the new Radeon 5800 series is staggering: 2.15 billion transistors, 2.7 trillion floating-point operations a second [TFlops], more than 20 gigapixels per second throughput, 1,600 shader units
The card has 1,600 parallel processing units, albeit simple ones compared to a primary CPU. When used with software optimized for fully parallel operation, speed increases of two orders of magnitude are possible. ATI's definition of a "processing unit" differs somewhat from nVidia's, but suffice it to say that even with a mere 512 processors the nVidia card will be pretty ridiculous.

Keep in mind the FLOPS quoted for graphics cards are usually single-precision calculations, while CPUs may be performing double-precision. This difference can make comparisons difficult.
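To show what the single versus double distinction means in practice, here is a minimal sketch using only Python's standard `struct` module. The helper `to_single` is a name made up for this example; it rounds a double-precision Python float to the nearest single-precision value.

```python
import struct

def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = 16777217.0  # 2**24 + 1, exactly representable in double precision

print(x == 16777217.0)             # True: double precision keeps the value
print(to_single(x) == 16777216.0)  # True: single precision rounds it to 2**24
```

A single-precision operation carries about half the bits of a double-precision one, so a single-precision FLOPS figure and a double-precision one are not counting the same amount of work.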
Blenton
#10
Nov23-09, 04:02 AM
P: 193
I'm no expert, but if you built a CPU out of 200 Pentium cores, its clock would be no faster than that of an individual Pentium chip. For that reason, I think it is more important to have a chip with a higher clock speed than one with more parallel cores. Since today's CPUs seem to have hit a limit at around 3.2 GHz, more cores were introduced to push systems past the clock limitation.

Hardware-wise, it is better to have a 3.1 GHz dual-core CPU than a 2.4 GHz quad-core, considering that many programs still don't benefit from a dual-core or quad-core CPU, let alone an N-core CPU.
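One way to put numbers on that trade-off is Amdahl's law, which is brought in here as an illustration and is not something the post itself invokes. The sketch below assumes both chips do the same work per clock cycle and that a fraction `p` of the program parallelizes perfectly; both assumptions are simplifications.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work is parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

def relative_throughput(clock_ghz, cores, parallel_fraction):
    """Clock speed times Amdahl speedup; assumes equal work per clock on both chips."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)

for p in (0.0, 0.5, 0.9):
    dual = relative_throughput(3.1, 2, p)   # 3.1 GHz dual core
    quad = relative_throughput(2.4, 4, p)   # 2.4 GHz quad core
    print(f"p={p:.1f}  dual={dual:.2f}  quad={quad:.2f}")
```

Under these assumptions the dual core wins for serial or half-parallel code, and the quad core only pulls ahead once roughly 62% or more of the program runs in parallel.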
The_Absolute
#11
Nov23-09, 04:36 AM
P: 182
When will we see a CPU for a personal computer with TFLOP performance? I don't understand why graphics cards have so many more FLOPS than a CPU has.
mgb_phys
#12
Nov23-09, 08:48 AM
Sci Advisor
HW Helper
P: 8,961
Quote by The_Absolute:
I don't understand why graphics cards have so many more FLOPS than a CPU has.
Because they are highly parallelized: they can do a lot of FLOPS if you want the same floating-point operation applied to 6 or 256 memory locations at the same time. If you want a different operation applied in each case, they are slow.

It's like the difference between a printer, which is slow but lets you print each new character differently, and a printing press, which can print an entire page in the time it takes to print one character - but you have to have all the characters preset.
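Here is a toy model of that behaviour, not a description of any real GPU. It assumes lanes are grouped in fixed-width groups that run in lockstep, and that a group needs one pass for every distinct operation its lanes ask for. The function name and the group width of 32 are made up for the example.

```python
def lockstep_passes(ops, group_width):
    """Passes needed when each lockstep group can run only one kind of operation per pass."""
    passes = 0
    for start in range(0, len(ops), group_width):
        group = ops[start:start + group_width]
        passes += len(set(group))  # one pass per distinct operation in this group
    return passes

uniform = ["add"] * 256                      # same operation on 256 locations
mixed = ["add", "mul", "sub", "div"] * 64    # four different operations interleaved

print(lockstep_passes(uniform, 32))  # 8
print(lockstep_passes(mixed, 32))    # 32
```

In this model the same 256 operations take four times as many passes once they stop being uniform, which is the printing press having to be reset for every different character.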

The difficulty of putting more cores onto a single chip is the I/O (apart from how to program the thing). If they all share the same memory bus, then each core's access to RAM is up to 200x slower while it waits for its turn, making memory about as fast as disk access. To get round this, you have to put a lot of cache onto the chip so that each core's data is already on board. This is what uses most of the i7's transistor count: it has a huge amount of on-chip cache.
The other alternative is to use separate buses, but that means a chip with a lot more pins, which is tricky given that an i7 socket already has 1366.
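The shared-bus argument can be sketched as a simple average-access-time model. This is a worst-case toy model: it assumes every cache miss waits behind all the other cores on the one bus, and the latencies of 1 ns for cache and 60 ns for RAM are placeholder assumptions, not measurements.

```python
def average_access_ns(hit_rate, cache_ns, ram_ns, cores_sharing_bus):
    """Average access time when every cache miss queues behind all cores on one bus."""
    contended_ram_ns = ram_ns * cores_sharing_bus  # worst case: wait for every other core
    return hit_rate * cache_ns + (1.0 - hit_rate) * contended_ram_ns

CACHE_NS = 1.0   # assumption: on-chip cache latency
RAM_NS = 60.0    # assumption: uncontended RAM latency

for hit_rate in (0.0, 0.9, 0.99):
    avg = average_access_ns(hit_rate, CACHE_NS, RAM_NS, 200)
    print(f"hit rate {hit_rate:.2f}: about {avg:.0f} ns per access")
```

With no cache, 200 cores sharing one bus see RAM that is 200 times slower than a single core would; only a very high hit rate brings the average back down, which is why so much of the chip ends up as cache.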

