Data in the CPU cache

  • #26
phinds
Science Advisor
Insights Author
Gold Member
2019 Award
16,177
6,186
I'm afraid that my spelling error has hijacked this thread.
Oh ... what were we talking about? :smile:
 
  • #27
DrClaude
Mentor
7,339
3,520
To bring things back on track, here is a figure from C. W. Ueberhuber, Numerical Computation: Methods, Software, and Analysis (Springer, Berlin, 1997):

[Attached figure: fft004.jpg - computing time as a function of problem size]

There is a significant increase in computing time once the data no longer fit in the cache.
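If anyone wants to see the effect on their own machine, here is a minimal micro-benchmark sketch in C (not from the book; the sizes, pass counts, and sequential access pattern are arbitrary choices). It reports the average time per element as the working set grows; the per-element time typically jumps once the array no longer fits in a given cache level.

[CODE]
/* Minimal cache-cliff demo: average time per element vs. working-set size.
   Build with, e.g., gcc -O2 cachedemo.c -o cachedemo */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    /* Working sets from 4 KB up to 64 MB, doubling each time. */
    for (size_t bytes = 4096; bytes <= 64UL * 1024 * 1024; bytes *= 2) {
        size_t n = bytes / sizeof(long);
        long *a = malloc(n * sizeof(long));
        if (!a) return 1;
        for (size_t i = 0; i < n; i++) a[i] = (long)i;

        /* Repeat small arrays more often so every size does similar work. */
        size_t passes = 256UL * 1024 * 1024 / bytes + 1;
        volatile long sink = 0;

        clock_t t0 = clock();
        for (size_t p = 0; p < passes; p++)
            for (size_t i = 0; i < n; i++)
                sink += a[i];
        clock_t t1 = clock();

        double ns = 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC
                    / ((double)n * (double)passes);
        printf("%8zu KB : %6.2f ns/element\n", bytes / 1024, ns);
        free(a);
    }
    return 0;
}
[/CODE]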
 
  • #28
1,516
617
You should never worry about the cache as a programmer. Do that once you start profiling your software and update your algorithm if there are problems.

You cannot access the cache in any language other than assembly, and even then you have no assurance of what actually happens. CPUs may not even have certain caches, so usually even the operating system is agnostic about the on-chip cache. The only way I know of to even hint to the computer that you want something kept in cache is to use the PREFETCH instruction in IA-32 assembly. But again, it's only a hint, not a command.

Storing a small text file would likely not be very useful, unless you plan on accessing parts of it trillions of times, in which case your hardware will likely have already placed it in cache for you.
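Most C compilers actually expose that hint as a builtin, so you don't have to write the assembly by hand. A minimal sketch using GCC/Clang's __builtin_prefetch (the function name, array, and look-ahead distance are made-up illustration); it is still only a hint that the CPU is free to ignore.

[CODE]
/* Prefetch hint through a compiler builtin -- still a hint, not a command.
   Works with GCC and Clang; on x86 it typically emits a PREFETCH instruction. */
#include <stddef.h>

double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Ask for data a couple of cache lines ahead.
           Arguments: address, 0 = read access, 3 = keep in all cache levels. */
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 0, 3);
        s += a[i];
    }
    return s;
}
[/CODE]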
 
  • #29
nsaspook
Science Advisor
940
1,358
You should never worry about the cache as a programmer. Do that once you start profiling your software and update your algorithm if there are problems.
It depends on what type of programmer you are. :biggrin: Embedded/systems programming in C at the kernel/driver level, even on devices like the 4-core ARMv8 RPi3 with cache, requires very close attention to cache issues: processor pipeline effects, branch execution times, memory barriers for cache coherency, and such.

http://www.geeksforgeeks.org/branch-prediction-macros-in-gcc/
https://www.kernel.org/doc/Documentation/memory-barriers.txt
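The branch-prediction macros in the first link boil down to GCC's __builtin_expect. Here is a minimal sketch of the pattern (the error-check example is made up, not code from the kernel); laying the expected path out contiguously is friendlier to both the branch predictor and the instruction cache.

[CODE]
/* The kernel's likely()/unlikely() macros are thin wrappers around GCC's
   __builtin_expect.  They tell the compiler which branch is the common case
   so the hot path can be laid out contiguously. */
#include <stdio.h>
#include <stdlib.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

static int process(const char *buf)
{
    if (unlikely(buf == NULL)) {      /* rare error path */
        fprintf(stderr, "null buffer\n");
        return -1;
    }
    return 0;                         /* common case */
}

int main(void)
{
    char *p = malloc(16);
    int rc = process(p);
    free(p);
    return rc < 0;
}
[/CODE]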
 
  • #30
1,616
969
You should never worry about the cache as a programmer.
Except when it becomes problematic to tell your customers to buy bigger/stronger HW...
 
  • #31
1,367
61
I don't remember studying anything about cache in the OS course. It was taught in the computer architecture course.
 
  • #32
nsaspook
Science Advisor
940
1,358
Cache in the news, in a bad way. While the effects of caching are usually transparent at the user level and higher OS levels, it is possible to exploit cache behaviour in attacks, even from JavaScript.
https://www.vusec.net/projects/anc/
 
  • #33
FactChecker
Science Advisor
Gold Member
5,581
2,059
It depends on what type of programmer you are. :biggrin: Embedded/systems programming in C at the kernel/driver level, even on devices like the 4-core ARMv8 RPi3 with cache, requires very close attention to cache issues: processor pipeline effects, branch execution times, memory barriers for cache coherency, and such.
I agree. But that being said, I think that it is a subject for an advanced, specialized programmer. A typical programmer would have a long time to learn about cache before anyone expects him to deal with these issues. And they would probably do it with frequent consultation with the manufacturer.
 
Last edited:
  • #34
33,632
5,289
A typical programmer would have a long time to learn about cash before anyone expects him to deal with these issues.
Here we go again...
 
  • #36
1,616
969
And they would probably do it with frequent consultation with the manufacturer.
Common folk have to rely on the various 'performance optimization guide' documents, like this. Almost every CPU (and GPU) manufacturer provides similar documents.
You have to have some serious background to get (real) personal attention.

Apart from professionals, there are many amateurs who give this field a try. It is simply a matter of hitting performance bottlenecks, and even without access to professional programming materials that happens surprisingly often.
 
  • #37
FactChecker
Science Advisor
Gold Member
5,581
2,059
Apart from professionals, there are many amateurs who give this field a try. It is simply a matter of hitting performance bottlenecks, and even without access to professional programming materials that happens surprisingly often.
Ok. I'll buy that. My experience was in an unusual environment.
 
  • #38
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
2019 Award
24,559
7,449
But it is NOT just data. Not just things like loop counters, it is the program code as well as data.
True, but this takes care of itself. When an instruction is executed, the odds are very high that the next instruction executed is the next instruction in memory, and that was read into the cache at the same time the instruction in question was loaded. The data cache is what the programmer needs to think about.

You cannot access the cache in any language other than assembly
Not true. On the Intel Knights Landing, there is high-speed memory (called MCDRAM) that can be used as a cache between the main memory and the chip. The programmer can let it act as a cache automatically, or she can use a variation on malloc to allocate memory directly in the MCDRAM, thus pinning those objects into the fast memory.

In general, one can do cache management indirectly by careful placement of objects - one can calculate c = a + b in such a way that when one of a or b is read into the cache, the other is as well.
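As a concrete (made-up) sketch of that last point in C: interleave a and b in one array of pairs, so the cache line that brings in a[i] usually brings in b[i] with it. Whether this wins depends on the access pattern, but it illustrates the idea.

[CODE]
/* Two layouts for computing c[i] = a[i] + b[i]. */
#include <stddef.h>

/* Separate arrays: a[i] and b[i] generally live in different cache lines. */
void add_split(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Interleaved pairs: the cache line that delivers ab[i].a delivers ab[i].b
   as well, so one memory fetch serves both operands. */
struct pair { double a, b; };

void add_interleaved(const struct pair *ab, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = ab[i].a + ab[i].b;
}
[/CODE]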
 
  • #39
FactChecker
Science Advisor
Gold Member
5,581
2,059
True, but this takes care of itself. When an instruction is executed, the odds are very high that the next instruction executed is the next instruction in memory, and that was read into the cache at the same time the instruction in question was loaded.
But there are important, clever exceptions. For instance, the branch predictor usually assumes that the last instruction of a loop is followed by the first instruction of the loop, because a loop usually runs several iterations before it is done. It is very hard to do better than the automatic optimization; it's usually best to respect it and work with it. Unfortunately, in a lot of safety-critical work, the optimization must be kept at very low levels or turned off completely.
 
  • #40
.Scott
Homework Helper
2,536
914
I don't think cache management is part of the O.S.; it is part of the on-board processing, the "computer within the computer" as it were. I'm not 100% sure that that's always the case.
Yes, for most processors, cache is primarily a processor feature that operates with little or no direct encouragement from the software.

Here are some situations where knowledge of cache behavior is important:
1) Compiler-writing. This is perhaps the most important.
2) Debugging when using hardware debugging tools. The host is the processor where the debugging software is running. The target is the processor being debugged. When the host and target are the same processor, the caching can be invisible. But when they are not, the target may have asynchronous debugging features. Without awareness of the caching, the debugging environment can often produce perplexing situations.
3) Multicore environments. When you have several processors on a single chip that share memory, you will be provided machine-level instructions such as "instruction sync" and "data sync" that force the cache to become synced with memory (see the sketch after this list). You may also have mechanisms (such as supplemental address spaces) for accessing memory without caching.
4) If instruction timing becomes critical, you will need to consider caching - and that can be impractical. What you really need to do is make the instruction timing non-critical.
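Here is the sketch promised under point 3: a portable C11 version of the "make my write visible to the other core, in order" pattern (the producer/consumer names are mine). On cache-coherent systems these fences are ordering barriers rather than explicit cache flushes, but the compiler lowers them to the same class of sync/barrier instructions mentioned above (e.g. DMB on ARM, sync on PowerPC).

[CODE]
/* Point 3 in portable C11: publish data from one core to another in a
   defined order.  The fences compile down to the machine's barrier/sync
   instructions. */
#include <stdatomic.h>
#include <stdbool.h>

static int payload;
static atomic_bool ready = false;

void producer(void)                      /* runs on core A */
{
    payload = 42;                        /* plain store */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

int consumer(void)                       /* runs on core B */
{
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                                /* spin until the flag is visible */
    atomic_thread_fence(memory_order_acquire);
    return payload;                      /* guaranteed to read 42 */
}
[/CODE]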

So getting back to the part of the original question:
"Is it possible to store a small .txt file into the cache?"
Kind of, but not really.
If you read a text file into RAM and begin searching through it, it will be pulled into cache memory. If it's less than half the size of the cache, it is likely to be pulled in in its entirety.

But it gets pulled in implicitly, not because of explicit instructions. And if you continuously interrupt the process with other memory-intensive procedures, it may never be wholly included in cache.
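To tie that back to the original .txt question, here is a minimal sketch (the file name is made up). Notice that nothing in it mentions the cache: repeatedly scanning the buffer is all it takes for the hardware to keep it cached, provided it fits.

[CODE]
/* Read a small text file into RAM and scan it repeatedly.  Nothing here
   refers to the cache; the CPU caches the buffer on its own because we
   keep touching it. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *f = fopen("small.txt", "rb");          /* hypothetical file */
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    if (size <= 0) return 1;

    char *buf = malloc((size_t)size);
    if (!buf || fread(buf, 1, (size_t)size, f) != (size_t)size) return 1;
    fclose(f);

    /* After the first pass the data is (probably) served from cache,
       assuming the file is smaller than the cache. */
    long count = 0;
    for (int pass = 0; pass < 1000; pass++)
        for (long i = 0; i < size; i++)
            if (buf[i] == 'e') count++;

    printf("saw 'e' %ld times over all passes\n", count);
    free(buf);
    return 0;
}
[/CODE]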
 
  • #41
rbelli1
Gold Member
950
356
https://software.intel.com/en-us/bl...dram-high-bandwidth-memory-on-knights-landing
In what world is 16GB a small amount of RAM? It's not an impressively large amount but still quite a lot.

Swing that at me in a few years and it will probably be a whole different story.

BoB
Last edited by a moderator:
  • #42
3,379
943
The primitive video game 'space invaders' could be done in one kilobyte.
At the time, that was very impressive.
 
  • #43
rbelli1
Gold Member
950
356
The primitive video game 'space invaders' could be done in one kilobyte.
1/8 kilobyte on the Atari 2600. Not as nice as the arcade version but still impressive what you can do with 128 bytes.

BoB
 
  • #44
FactChecker
Science Advisor
Gold Member
5,581
2,059
https://software.intel.com/en-us/bl...dram-high-bandwidth-memory-on-knights-landing
In what world is 16GB a small amount of RAM? It's not an impressively large amount but still quite a lot.

Swing that at me in a few years and it will probably be a whole different story.

BoB
Interesting. It looks like the MCDRAM is Level-3 memory that can be used entirely as cache, entirely as addressable memory, or split between the two, depending on the BIOS settings. https://colfaxresearch.com/knl-mcdram/ has examples of how to use it in each case. So it can be directly controlled by the programmer as addressable memory and be faster than Level-3 cache.
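Based on the Colfax article, the "variation on malloc" appears to be the memkind library's hbwmalloc interface. A minimal sketch, assuming memkind is installed and the MCDRAM is configured in flat or hybrid mode (link with -lmemkind):

[CODE]
/* Allocating directly in KNL MCDRAM via memkind's hbwmalloc interface
   (flat or hybrid mode).  Link with -lmemkind. */
#include <stdio.h>
#include <stdlib.h>
#include <hbwmalloc.h>

int main(void)
{
    size_t n = 1 << 20;
    /* hbw_check_available() returns 0 when high-bandwidth memory exists. */
    int have_hbw = (hbw_check_available() == 0);

    double *a = have_hbw ? hbw_malloc(n * sizeof *a)   /* pinned in MCDRAM */
                         : malloc(n * sizeof *a);      /* ordinary DRAM    */
    if (!a) return 1;

    for (size_t i = 0; i < n; i++) a[i] = (double)i;
    printf("a[42] = %f (MCDRAM: %s)\n", a[42], have_hbw ? "yes" : "no");

    if (have_hbw) hbw_free(a); else free(a);
    return 0;
}
[/CODE]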
 
  • #45
3,379
943
1/8 kilobyte on the Atari 2600. Not as nice as the arcade version but still impressive what you can do with 128 bytes.

BoB
Atari, yeah - things like the CPU directly addressing video RAM.
What, er ... video RAM?
 
  • #46
rbelli1
Gold Member
950
356
CPU directly addressing video RAM
No video RAM on the 2600 and contemporary machines. They directly addressed the beam.

Direct VRAM access was standard on all CGA, EGA, VGA, and XGA systems, and on most other systems of that era from all brands. The video memory was mapped into the normal address space. Some systems used that ability to display more colors than were possible with documented (datasheet) operation.
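For the curious, this is what "mapped into the normal address space" looked like in practice. A real-mode sketch assuming a 16-bit DOS compiler (Turbo/Borland C's MK_FP from dos.h) and the colour text buffer at segment 0xB800; it won't build or run on a modern protected-mode OS.

[CODE]
/* Writing straight into the colour text buffer at B800:0000.
   Each character cell is two bytes: the ASCII code, then an attribute. */
#include <dos.h>   /* MK_FP -- 16-bit DOS compilers only */

int main(void)
{
    unsigned char far *vram = (unsigned char far *)MK_FP(0xB800, 0x0000);
    const char *msg = "HELLO";
    int i;

    for (i = 0; msg[i] != '\0'; i++) {
        vram[2 * i]     = msg[i];   /* character */
        vram[2 * i + 1] = 0x1F;     /* white text on a blue background */
    }
    return 0;
}
[/CODE]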

programmer as addressable memory and be faster than Level-3 cache
16GB of close DRAM is certainly a performance opportunity. Bump that to SRAM and you can fly.

BoB
 
  • #47
jim mcnamara
Mentor
3,909
2,299
I think this link would be really appropriate as an answer to the first question, as the thread seems to have 'wandered'. Look for a discussion of data locality.

https://www.akkadia.org/drepper/cpumemory.pdf

This is a bit old, but still very relevant.
 
  • #48
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
2019 Award
24,559
7,449
In what world is 16GB a small amount of RAM?
In a world where it is shared by 256 processes.
 
  • #49
rbelli1
Gold Member
950
356
In a world where it is shared by 256 processes.
I just looked at the Intel Xeon Phi series. I had no idea anything like that existed.

BoB
 
