Can Engineers Overcome the Impending Limitations of Moore's Law?

  • Thread starter: ElliotSmith
  • Tags: Law, Moore's law

Discussion Overview

The discussion centers on the limitations of Moore's Law and the potential challenges faced by engineers and computer scientists in advancing microprocessor technology. Participants explore various aspects including theoretical limits, architectural improvements, and the future of quantum computing.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Exploratory

Main Points Raised

  • Some participants express concern that the physical limits of transistor size are approaching, potentially within the next decade, and question how engineers will address this issue.
  • Others suggest that current methods, such as increasing the number of layers on chips, may not be sustainable due to heat dissipation challenges.
  • A participant raises the possibility of quantum computers becoming commonplace, questioning the feasibility of room-temperature quantum computing and mentioning topological insulators as a theoretical solution.
  • Some argue that while the individual components of CPUs can only shrink so much, improvements in architecture and software efficiency can still enhance performance.
  • There is mention of advancements in compiler technology that may help optimize CPU performance without necessarily increasing clock speeds.
  • One participant claims that the "wall" of performance has already been reached, noting that CPU clock speeds have stagnated around 3-3.5 GHz.
  • Another viewpoint suggests that the current demand for computing power is not as pressing for the average user, implying that innovation may be driven more by specific industries than by general consumer needs.
  • Concerns are raised about quantum effects, such as electron tunneling, becoming significant at smaller feature sizes, potentially impacting future chip designs.
  • Some participants reference upcoming CPU architectures and their specifications, indicating ongoing developments in the industry.

Areas of Agreement / Disagreement

Participants express a range of views on the current state and future of microprocessor technology, with no clear consensus on whether the limitations of Moore's Law have been reached or what the best path forward may be. Multiple competing perspectives on the role of quantum computing and architectural improvements are present.

Contextual Notes

Limitations include uncertainties about the timeline for reaching physical limits, the impact of heat dissipation on chip design, and the speculative nature of quantum computing advancements.

  • #31
ElliotSmith said:
The scientific limit on how small you can make a functionally viable transistor is very fast approaching and should hit a stone wall within the next 10 years or less. How will electronic engineers and computer scientists compensate for this problem? ... Are there any workarounds on the table being discussed and researched for this issue? ...

There are four main limits on CPU performance:

(1) Clock speed scaling, related to Dennard scaling, which plateaued in the mid-2000s; this prevents much faster clock speeds: https://en.wikipedia.org/wiki/Dennard_scaling

(2) Hardware ILP (Instruction Level Parallelism) limits: A superscalar out-of-order CPU cannot execute more than roughly eight instructions in parallel, and the latest CPUs (Haswell, IBM POWER8) are already at this limit. You cannot go much beyond an 8-wide CPU because of dependency checking, register renaming, and similar tasks, whose cost grows (at least) quadratically with issue width; there is no way around them for a conventional out-of-order superscalar machine. There will likely never be a 16-wide superscalar out-of-order CPU.

(3) Software ILP limits on existing code: Even given infinite superscalar resources, existing code typically does not contain more than 8 independent instructions in any group. If the intrinsic parallelism isn't present in a single-threaded code path, nothing can be done. Newly written software and compilers can in theory generate higher-ILP code, but if the hardware is limited to 8-wide issue, there's no compelling reason to do so.

(4) Multicore CPUs, limited by (a) Heat: The highest-end Intel Xeon E5-2699 v3 has 18 cores, but the clock speed of each core is limited by TDP: https://en.wikipedia.org/wiki/Thermal_design_power
(b) Amdahl's Law: As core counts increase to 18 and beyond, even a tiny fraction of serialized code will "poison" the speedup and cap improvement (see the sketch after this list): https://en.wikipedia.org/wiki/Amdahl's_law
(c) Coding practices: It's harder to write effective multi-threaded code; however, newer software frameworks help somewhat.
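
To make (4b) concrete, here's a minimal sketch of the Amdahl's-law arithmetic in Python (the 18-core count matches the Xeon above; the serial fractions are illustrative assumptions):

```python
# Amdahl's law: speedup(N) = 1 / (s + (1 - s) / N), where s is the
# serial fraction of the workload and N is the number of cores.

def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even a small serial fraction caps an 18-core chip well below 18x:
for s in (0.01, 0.05, 0.10):
    print(f"serial={s:.0%}: 18 cores -> {amdahl_speedup(s, 18):.1f}x, "
          f"ceiling -> {1.0 / s:.0f}x")
```

With just 5% serial code, 18 cores deliver under 10x, and no core count can ever exceed 20x.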

While transistor scaling will continue for a while, heat will increasingly limit how much of that functional capacity can be used at once. This is called the "dark silicon" problem: you can have lots of on-chip functionality, but it cannot all be active simultaneously. See the paper "Dark Silicon and the End of Multicore Scaling".
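
As a toy illustration of the dark-silicon trend (all numbers below are made-up assumptions, not figures from the paper), suppose transistor count doubles each generation while per-transistor switching power falls only 1.4x because Dennard scaling has ended:

```python
# Toy dark-silicon model under a fixed power budget.
power_budget_w = 100.0         # fixed chip TDP
transistors = 1.0e9            # transistor count at generation 0
watts_per_transistor = 1.0e-7  # active switching power at generation 0

for gen in range(5):
    full_chip_watts = transistors * watts_per_transistor
    usable = min(1.0, power_budget_w / full_chip_watts)
    print(f"gen {gen}: fraction of die usable at once = {usable:.0%}")
    transistors *= 2.0           # Moore's law: density keeps doubling
    watts_per_transistor /= 1.4  # post-Dennard: power falls more slowly
```

Each generation, the fraction of the die that can be lit up at once shrinks, which is exactly the dark-silicon effect.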

What can be done? There are several possibilities along different lines:

(1) Increasingly harness high transistor counts for specialized functional units. For example, Intel Core CPUs since Sandy Bridge have had Quick Sync, a dedicated video transcoder: https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video This is about 4-5x faster than other methods, and Intel's Skylake CPU will have a greatly improved Quick Sync that handles many additional codecs. Given sufficient transistor budgets, you can envision similar specialized units for diverse tasks; they could simply sit idle until called on, then deliver great performance in that narrow area. This general direction is called integrated heterogeneous processing.

(2) Enhance the existing instruction set with specialized instructions for justifiable cases. For example, Intel Haswell CPUs have 256-bit vector instructions, and Skylake will have AVX-512. Partly because of these instructions, a Xeon E5-2699 v3 can do about 800 Linpack gigaflops, roughly 10,000 times faster than the original Cray-1. That obviously requires vectorizing the code, but that's a well-established practice (see the sketch after this list).

(3) Use more aggressive architectural methods to squeeze out additional single-thread performance. Most such techniques have already been exploited, but a few are left, such as data speculation. Data speculation differs from control speculation, which is already used to predict branches. In theory, data speculation could provide an additional 2x performance on single-threaded code, but it would require significant added complexity. See "Limits of Instruction Level Parallelism with Data Speculation": http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.47.9196&rep=rep1&type=pdf

(4) Use VLIW (Very Long Instruction Word) methods. This sidesteps the hardware limits on dependency checking, etc., by doing that work at compile time. In theory, a progressively wider CPU could be designed as technology improves, able to run single-threaded code 32 or more instructions wide. This approach was unsuccessfully attempted by Intel with Itanium, and CPU architects still debate whether a fresh attempt would work. A group at Stanford is actively pursuing bringing a VLIW-like CPU, called the Mill, to the commercial market: http://millcomputing.com/ VLIW approaches require that software be rewritten, but using conventional techniques and languages, not different paradigms like vectorization or multiple threads.
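
As a small illustration of the vectorized coding style from point (2), here's a Python/NumPy sketch; NumPy's compiled array loops can use SIMD units such as AVX internally, so it stands in for what vectorized native code does (the array size is arbitrary):

```python
import numpy as np

a = np.random.rand(100_000)
b = np.random.rand(100_000)

# Scalar style: one multiply-add per loop iteration.
out = np.empty_like(a)
for i in range(len(a)):
    out[i] = 2.0 * a[i] + b[i]

# Vectorized style: one whole-array expression, executed in bulk
# by compiled (and typically SIMD-accelerated) loops.
out_vec = 2.0 * a + b

assert np.allclose(out, out_vec)
```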
 
  • #32
phinds said:
Right. And Bill Gates was SURE that 640K would be all the memory anyone would ever need. It was just inconceivable that more could be required for a single person.

He actually never said that but it's a popular urban legend that he did.

Back on topic: Moore's Law seems to be reaching the end of its life now. We're moving to distributed systems and multicore machines, and Amdahl's Law is the new one to watch.

http://en.wikiquote.org/wiki/Bill_Gates

http://en.wikipedia.org/wiki/Amdahl's_law
 
  • #33
Carno Raar said:
He actually never said that but it's a popular urban legend that he did.
Either way, my point remains exactly the same.
 
  • #34
Amdahl's law is algorithm dependent. So it's not the same kind of thing as Moore's Law.
 
  • #35
SixNein said:
Amdahl's law is algorithm dependent. So it's not the same kind of thing as Moore's Law.

It's an appropriate answer for the OP's question.

"The scientific limit on to how small you can make a functionally viable transistor is very fast approaching and should hit a stone wall within the next 10 years or less. How will electronic engineers and computer scientists compensate for this problem?"

A valid answer is that we spin up more cloud instances and learn to write concurrent code. Right now, Amdahl and Moore are both limiting factors in the growth of large computer systems. Moore will doubtless become less important in the near future, while we're only just starting to get our heads around concurrency issues. I say concurrency, not parallelism, as I don't yet have access to properly parallel hardware ... :-)
 
  • #36
Carno Raar said:
I say concurrency, not parallelism, as I don't yet have access to properly parallel hardware ... :)
If you don't have parallel hardware, concurrency is just sequential execution with extra overhead. That is, if you have a single-threaded process on a single CPU and you make it multi-threaded but still run it on that single CPU, all you have done is add thread overhead.
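
A minimal sketch of that point, assuming CPython (whose global interpreter lock serializes threads on CPU-bound work, so this mimics a single CPU); exact timings will vary by machine:

```python
import threading
import time

N = 5_000_000

def count(n: int) -> None:
    # Purely CPU-bound work: no I/O to overlap.
    total = 0
    for _ in range(n):
        total += 1

start = time.perf_counter()
count(N)
print(f"single thread: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
threads = [threading.Thread(target=count, args=(N // 4,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"four threads:  {time.perf_counter() - start:.2f}s")
```

The four-thread version does the same total work but typically runs no faster, and often slightly slower, because of thread-management overhead.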
 
  • #37
phinds said:
If you don't have parallel hardware, concurrency is just sequential execution with extra overhead. That is, if you have a single-threaded process on a single CPU and you make it multi-threaded but still run it on that single CPU, all you have done is add thread overhead.
Managing multiple downloads + many other use-cases.

"Take this list of URLs and download them all". You don't want to sit there doing nothing while your 1st and only download times out.

Edit: Yes, you can implement this single-threaded with async and a non-blocking downloader, but that's a bit awkward, and most libraries implement non-blocking downloads with threads anyway.
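
For the URL-list case, a minimal sketch with a thread pool (the URLs and worker count here are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]

def fetch(url: str) -> bytes:
    # Each thread blocks on the network; other downloads proceed meanwhile.
    with urlopen(url, timeout=30) as resp:
        return resp.read()

with ThreadPoolExecutor(max_workers=8) as pool:
    for url, body in zip(urls, pool.map(fetch, urls)):
        print(url, len(body), "bytes")
```

Because the threads overlap network waits, the total time approaches that of the slowest single download rather than the sum of all of them.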
 
  • #38
Carno Raar said:
Managing multiple downloads + many other use-cases.
Good point. Thanks. I had not thought about I/O bound processes.
 
