Searching for Answers on Processor Performance & Real-Time Processing

  • #1
Pattonias
I have a question for which I can't seem to find a satisfactory answer.

We do a lot of intense real-time processing of audio signals in my line of work (generally programmed in C).
I am a mechanical engineer with only a rudimentary understanding of the process that goes on in the software for the project but I can't seem to find an answer that really fits our situation.

We often spec laptops/PCs for field work. The computers I select for approval, with the newest architecture and high performance, are often overruled in favor of processors that are sometimes several years old and seemingly only have a higher clock rate: 3.1 GHz (2-3 years old) as opposed to 2.5-2.8 GHz (current generation, new architecture).

When performing intense processing is it still purely the clock rate that matters? Why do benchmarks show improved performance in the lower clock rate processors with newer architecture, but this does not seem to be important when selecting a PC for custom coded applications?

I guess my question is whether or not there is actually a need, or if the other engineers are following a rule of thumb that discounts the advantages of new architecture.

When trying to research this, I found that most of the comments centered on the perceived performance advantage of current-generation processors on specific metrics such as file transfer or game performance. Those gains could plausibly come from the architecture, but they cannot necessarily be attributed to the actual processing being done faster.
 
  • #2
Pattonias said:
We often spec laptops/PCs for field work. The computers I select for approval, with the newest architecture and high performance, are often overruled in favor of processors that are sometimes several years old and seemingly only have a higher clock rate: 3.1 GHz (2-3 years old) as opposed to 2.5-2.8 GHz (current generation, new architecture).
Your thread title is "Processor Question for Intense Single Core Calculations," which confused me somewhat. If you're asking about Intel or AMD processors produced in the last two or three years, I believe all of them are multicore processors, with at least two physical cores.

Pattonias said:
When performing intense processing is it still purely the clock rate that matters?
Not necessarily. CPU designers can improve performance by reducing the number of micro-instructions needed for frequently used instructions, so that even at the same clock rate, a new design can outperform a CPU with the older design. Another possibility is better branch prediction, which lowers the chance of mispredicting what the next instruction will be, so the instruction pipeline needs to be flushed less often.
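To make the clock-rate-versus-architecture trade-off concrete, here is a minimal sketch of the usual back-of-the-envelope model: throughput is roughly instructions per clock (IPC) times clock rate. The function name and the IPC figures in the note below are assumptions chosen for illustration, not measurements of any real processor.

```c
/* Rough throughput model: instructions per second = IPC * clock rate (Hz).
 * This ignores memory stalls, turbo behavior, and everything else that
 * makes real workloads deviate from the simple product. */
double instructions_per_second(double ipc, double clock_hz)
{
    return ipc * clock_hz;
}
```

With a hypothetical IPC of 1.0 at 3.1 GHz the model gives 3.1e9 instructions per second, while a hypothetical IPC of 1.3 at 2.6 GHz gives 3.38e9, about 9% more despite the lower clock. Whether a newer design really achieves a higher IPC on a given program can only be settled by measuring that program.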
Pattonias said:
Why do benchmarks show improved performance in the lower clock rate processors with newer architecture, but this does not seem to be important when selecting a PC for custom coded applications?
The benchmarks are based on commonly performed tasks, and probably don't include the kinds of tasks for signal processing that you're doing. That would be my guess.
 
  • #3
Mark44 said:
Your thread title is "Processor Question for Intense Single Core Calculations," which confused me somewhat. If you're asking about Intel or AMD processors produced in the last two or three years, I believe all of them are multicore processors, with at least two physical cores.
Sorry, I should have clarified. The code we are running is not generally designed to run on multiple cores, so the single core processing power seems to be the driving metric. In this instance a dual core laptop with higher clock rate was chosen over a newer laptop with quad core, but lower clock rate.

Your answer is very helpful. I imagine that at the very least the computer chosen can likely do the work, but it may remain something of a mystery to me until I dive into what the program is actually doing that becomes so CPU intensive. It very well could be that the new architecture does not offer any advantages to the specific process we are running.
 
  • #4
Besides my previous comments, processor designers add new features to new architectures. One feature that I'm particularly interested in is AVX-512 (Advanced Vector eXtensions, with 512-bit registers), which is currently available only on Xeon Phi and the Xeon Scalable processors (Xeon Bronze, Xeon Silver, Xeon Gold, Xeon Platinum) from Intel. AFAIK, AMD doesn't have this capability in their Zen and Ryzen processors.

Having 512-bit registers means you can load 16 floats (@ 4 bytes each) into a single register, and another 16 floats into another register, to do whatever computation you need, all in a single operation. This could be advantageous in audio signal processing.
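As a rough illustration of the 16-floats-per-register idea, the sketch below applies a gain to an audio buffer in groups of 16 samples, written in plain C rather than assembly. The names `LANES` and `apply_gain` are made up for this example. Whether the inner loop actually becomes a single 512-bit operation depends on the compiler, its options (for example `-O3 -mavx512f` on GCC or Clang), and the target CPU; the code itself runs correctly on any machine.

```c
#include <stddef.h>

/* Number of 32-bit floats that fit in one 512-bit register: 512 / 32 = 16. */
#define LANES ((size_t)(512 / (8 * sizeof(float))))

/* Multiply every sample by `gain`, working in groups of LANES samples.
 * The fixed-length inner loop is the shape a vectorizing compiler can
 * map onto one wide multiply per group. */
void apply_gain(float *samples, size_t n, float gain)
{
    size_t i = 0;

    /* Full groups of 16 samples. */
    for (; i + LANES <= n; i += LANES)
        for (size_t k = 0; k < LANES; ++k)
            samples[i + k] *= gain;

    /* Leftover samples that do not fill a whole group. */
    for (; i < n; ++i)
        samples[i] *= gain;
}
```

A buffer of 20 samples, for instance, is handled as one group of 16 plus 4 leftover samples.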

My previous computer runs an Intel Core i7 (from about 5 years ago). It supports some 256-bit registers, but only a limited number of operations can be performed on them. I just bought a new computer with a Xeon Silver processor, with support for many AVX-512 capabilities. As soon as I get it up and running, I will start writing programs (in x64 assembly code) to take advantage of these new capabilities.
 
  • #5
Pattonias said:
Sorry, I should have clarified. The code we are running is not generally designed to run on multiple cores, so the single core processing power seems to be the driving metric.
I thought that might be the case, based on your thread title.
Pattonias said:
In this instance a dual core laptop with higher clock rate was chosen over a newer laptop with quad core, but lower clock rate.
Have you or your team thought about modifying the code to take advantage of the multiple cores? Instead of having a single thread doing all the work, it might be advantageous to divide the work so that two or more threads are working on different, but independent, parts of the problem. I've done some exploration of this, but you need to give each thread a big chunk of stuff to work on, otherwise the overhead of starting and stopping threads can eat up the time saved by having multiple threads involved.
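A minimal sketch of that idea in C, using OpenMP as one possible way to spread independent blocks over cores. OpenMP is an assumption here (nothing in the thread says the team uses it), and the function names and the sum-of-squares workload are placeholders. If the compiler is not told to enable OpenMP (for example with `-fopenmp` on GCC), the pragma is ignored and the loop simply runs on one thread with the same results.

```c
#include <stddef.h>

/* Placeholder per-block workload: energy (sum of squares) of one block. */
static double block_energy(const float *block, size_t len)
{
    double e = 0.0;
    for (size_t i = 0; i < len; ++i)
        e += (double)block[i] * block[i];
    return e;
}

/* Compute out[b] for every block b. The blocks are independent, so the
 * iterations can run on different cores. schedule(static) hands each
 * thread one contiguous run of blocks, which keeps the work per thread
 * large relative to the cost of starting the threads. */
void energy_per_block(const float *samples, size_t n_blocks,
                      size_t block_len, double *out)
{
    #pragma omp parallel for schedule(static)
    for (long b = 0; b < (long)n_blocks; ++b)
        out[b] = block_energy(samples + (size_t)b * block_len, block_len);
}
```

This only pays off when each block is large enough; for small blocks the threading overhead described above can outweigh the gain.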
 
  • #6
Mark44 said:
Have you or your team thought about modifying the code to take advantage of the multiple cores? Instead of having a single thread doing all the work, it might be advantageous to divide the work so that two or more threads are working on different, but independent, parts of the problem. I've done some exploration of this, but you need to give each thread a big chunk of stuff to work on, otherwise the overhead of starting and stopping threads can eat up the time saved by having multiple threads involved.

We were actually talking about this yesterday. We have the capability of writing the code to multiple threads, but the project is in the prototype phase, so currently it is only written to a single thread. I would guess that the code could eventually be optimized should we develop a next generation of the system, but currently they are going with quick and "easy" on the software side. (I say easy in jest. These guys are very good at what they do.)

With regard to the Xeon Phi's 512-bit registers, I'll have to do more research, but you have given me some really interesting starting points.
 
  • #7
It may be a silly question, but won't those extra cores off-load the PC's house-keeping from the core running your algorithm?

This 8-core CAD-tower PC is clocked significantly slower than the single-core zoomer I built some years ago, but processes my 'work' much faster as 'house-keeping' now only uses 3~5 % of CPU cycles...
YMMV.
 
  • #8
Pattonias said:
With regard to the Xeon Phi's 512-bit registers, I'll have to do more research, but you have given me some really interesting starting points.
The Xeon Phi processors are very pricey. The top of the line is model Xeon Phi 7290, with 72 cores. The clock speed is relatively low, 1.5 GHz normal or 1.7 GHz turbo. One website offers this processor for about $6000 -- that's just the processor and nothing else.

For what I needed, I chose a Xeon Silver model 4114, with 10 cores. The same site offers this processor for about $1200, but I've seen it offered at $740 or thereabouts at other sites.
 
  • #9
Pattonias said:
When performing intense processing is it still purely the clock rate that matters?
Definitely not, but without further details it is hard to say anything for sure. There are just too many components (with many parameters) around a CPU.

Pattonias said:
When trying to research this I found that most of the comments centered around perceived performance advantage of current generation processor on specific metrics...
Actually, this is what might bring you closer to practical answers. You should just start nagging the guys at the software department to add a benchmark mode to the software.
 
  • #10
Hi Pattonias,

We have a similar use case where I work in that we need to optimize our single-core performance since several of the codes we use aren't written for multi-threading and will not be upgraded to multi-threading.

What we have found is that newer architectures are optimized for things other than raw single-core performance. For instance, they can have better multi-core performance and are typically much more power efficient (lower watts/MIPS).

Our experience is this: you have to benchmark candidate machines using your own code. The published benchmarks have not been as helpful for us as I expected. We have found our fastest machine for doing single-core processing is a 6-year-old machine using the Intel Xeon with the Westmere architecture. It still works best for us. We do a lot of multi-core stuff and for that we have a lot of newer machines (best overall is based on Intel Xeon with Broadwell architecture).
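
A minimal sketch of what benchmarking with your own code can look like in C. The names `workload_fn`, `best_cpu_seconds`, and `dummy_workload` are placeholders for this example; in practice the workload would be the real signal-processing routine fed with representative input. It uses the standard `clock()` function, which reports processor time used by the program, so it suits a single-threaded, compute-bound routine; a wall-clock timer would be the better choice when checking real-time deadlines.

```c
#include <stddef.h>
#include <time.h>

/* Signature of the routine to be timed; ctx carries its input. */
typedef void (*workload_fn)(void *ctx);

/* Run the workload `runs` times and return the smallest CPU time in
 * seconds. Taking the minimum reduces the influence of interruptions
 * by other processes. Returns -1.0 if runs <= 0. */
double best_cpu_seconds(workload_fn fn, void *ctx, int runs)
{
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        clock_t t0 = clock();
        fn(ctx);
        clock_t t1 = clock();
        double s = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || s < best)
            best = s;
    }
    return best;
}

/* Stand-in workload: ctx points to a size_t iteration count. The
 * volatile accumulator keeps the loop from being optimized away. */
void dummy_workload(void *ctx)
{
    volatile double acc = 0.0;
    size_t n = *(size_t *)ctx;
    for (size_t i = 0; i < n; ++i)
        acc += (double)i * 0.5;
}
```

Running the same binary and the same input on each candidate machine and comparing the returned times gives a number that reflects your workload rather than a published benchmark's.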

Still, the surprising thing is we still keep the Westmere machine around to run a specific simulator because it is the fastest thing we have for single-core operations. (Moore's Law? haha). I have no idea if this is relevant but our simulation machines run Linux.

So my advice to you is to look into a used server that is Westmere-based. It may be just the ticket (and cheap too!).
 
  • #11
On-board memory access might be another speed factor to consider. More memory at each level of the memory hierarchy is an advantage. And you should not forget to bind your program to a core at the highest priority. If you are running Windows, you need at least one other processor core to take care of the Windows OS tasks.
 
  • #12
FactChecker said:
On-board memory access might be another speed factor to consider. More memory at each level of the memory hierarchy is an advantage. And you should not forget to bind your program to a core at the highest priority. If you are running Windows, you need at least one other processor to take care of the Windows OS tasks.
How does one tell a single core that a self-written program has the highest priority? How does one tell another core to take care of Windows?
Do you have a guide or something for this?
 
  • #13
For Windows, I believe it is called something like 'CPU affinity'.
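For a C program, one way this could look is sketched below, using the Win32 calls `SetProcessAffinityMask` and `SetPriorityClass`. The helper names `single_core_mask` and `pin_and_prioritize` are invented for this example, and the choice of `HIGH_PRIORITY_CLASS` is an assumption; the right priority level for a given application needs testing. On non-Windows systems the sketch compiles but deliberately does nothing.

```c
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Affinity masks use one bit per logical core: bit 0 is core 0, bit 1
 * is core 1, and so on. Returns 0 if `core` does not fit a 64-bit mask. */
uint64_t single_core_mask(unsigned core)
{
    return core < 64 ? (uint64_t)1 << core : 0;
}

/* Restrict the current process to one core and raise its priority.
 * Returns 0 on success, -1 on failure or on a non-Windows system. */
int pin_and_prioritize(unsigned core)
{
    uint64_t mask = single_core_mask(core);
    if (mask == 0)
        return -1;
#ifdef _WIN32
    HANDLE self = GetCurrentProcess();
    if (!SetProcessAffinityMask(self, (DWORD_PTR)mask))
        return -1;
    if (!SetPriorityClass(self, HIGH_PRIORITY_CLASS))
        return -1;
    return 0;
#else
    return -1; /* other operating systems are not handled in this sketch */
#endif
}
```

Pinning the program to one core does not by itself keep Windows tasks off that core; it only stops the program from moving, leaving the scheduler free to run other work on the remaining cores. The same affinity and priority settings can also be changed by hand for a running process in Task Manager.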
 
  • #15
Take a look at OpenCL to run with your C on your system. It may make it easy to use more of your CPU cores, or justify a cheap graphics card for the GPU. For work that parallelizes well, that could give you something like 20 times the throughput with little effort.
 

What is processor performance?

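Processor performance refers to the speed and efficiency with which a processor executes instructions and handles tasks. Clock speed (in gigahertz) and the number of cores (physical or virtual) are the most commonly quoted specifications, but they do not measure performance on their own; the work done per clock cycle, which depends on the architecture, matters as well.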

Why is processor performance important?

Processor performance is important because it directly impacts the speed and responsiveness of a computer or device. A faster processor allows for quicker loading times, smoother multitasking, and better overall performance.

How does real-time processing differ from regular processing?

Real-time processing is processing that must produce its result within a fixed deadline, often milliseconds or microseconds. It is used in tasks such as live audio signal processing, gaming, and high-frequency trading. Regular processing, on the other hand, has no such deadline and can take longer to execute a task without that counting as a failure.

What factors affect processor performance?

Several factors can affect processor performance, including clock speed, number of cores, cache memory, and architecture. Other factors such as software optimization, thermal design, and power management also play a role in overall performance.

How can I improve processor performance?

Processor performance can be improved by upgrading to a faster processor with more cores or a larger cache, optimizing software (for example, spreading work across cores), and ensuring proper thermal design and power management. Overclocking, or increasing the clock speed beyond the manufacturer's specifications, can also improve performance, but it can lead to instability and damage to the processor if not done correctly.
