Do soft errors still occur in today's machines?

In summary, soft errors occur mostly in DRAM, and they are not the leading cause of the issues that a reboot fixes.
  • #1
Muhammad Usman
TL;DR Summary
Do soft errors still occur in today's machines, especially laptops and routers, and are they the most probable cause of the problems that a reboot fixes?
Hi,

I was reading about soft errors. I read that they are basically caused by alpha particles and sometimes by thermal issues (too much heat in the machine). I have found a lot of mitigation techniques, but I am curious whether these errors still occur and whether they are the leading cause of the issues that a reboot fixes. Thanks
 
  • #2
Most problems that a reboot fixes are software bugs. That is why I banned reset buttons from motherboards that were to be used in critical applications (find the error - do not act as if nothing happened!).
 
  • #3
The majority of crashes are due to software bugs. Very few crashes are due to soft errors.
If it is important to be free of soft errors, then you must run the program twice and check that the outputs are the same.
Soft errors due to contaminated packaging have been greatly reduced over the last 40 years.
https://en.wikipedia.org/wiki/Soft_error#Designing_around_soft_errors
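The run-twice-and-compare idea can be sketched in a few lines of Python. This is only an illustration: `run_twice_and_compare`, `MismatchError` and the toy `checksum` function are hypothetical names made up here, and the sketch assumes the computation is deterministic, so that any disagreement between the two runs points at a transient fault rather than at the program itself.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class MismatchError(RuntimeError):
    """Raised when two runs of the same computation disagree."""


def run_twice_and_compare(compute: Callable[[], T]) -> T:
    """Run a deterministic computation twice and return the result
    only if both runs agree (detection only, no correction)."""
    first = compute()
    second = compute()
    if first != second:
        raise MismatchError(f"results differ: {first!r} vs {second!r}")
    return first


def checksum(data: bytes) -> int:
    """Toy workload (hypothetical): a simple modular sum of the bytes."""
    return sum(data) % 65521


result = run_twice_and_compare(lambda: checksum(b"soft errors"))
```

Two runs can only detect a disagreement; they cannot tell which of the two results is the wrong one.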
 
  • #4
Baluncore said:
If it is important to be soft error free, then you must run the program twice and check the outputs are the same.
In critical applications (where delay is not acceptable) it is not twice but three times in parallel, and the requirement is that at least two results agree.
This is triple modular redundancy (TMR).
I guess this counts as an existing mitigation technique for soft errors.
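A minimal sketch of the two-out-of-three vote, in Python. The function names are hypothetical, and real TMR systems do the voting in hardware on redundant modules; this only shows the voting logic itself.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(results: Sequence[Hashable]) -> Hashable:
    """Return the value that at least two of three results agree on."""
    if len(results) != 3:
        raise ValueError("expected exactly three results")
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no two results agree")
    return value


def bitwise_vote(a: int, b: int, c: int) -> int:
    """Per-bit majority of three words, the form a hardware voter takes."""
    return (a & b) | (a & c) | (b & c)


# One of three copies disagrees; the vote masks it.
voted = majority_vote([42, 42, 43])
# Each copy may have a different flipped bit; voting per bit still recovers the word.
voted_bits = bitwise_vote(0b1011, 0b1001, 0b1101)
```

Unlike running twice, three copies allow the faulty result to be outvoted without stopping or re-running anything.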

In everyday computer use the end result is mostly the same for software bugs and soft errors: after a single event you will never know which of the two happened.
 
  • #5
Soft errors occur mostly in DRAM; other parts of a computer have built-in rejection of low-energy events and should not have soft errors if designed properly.

In terrestrial applications, soft error rates are negligible for most DRAM chips manufactured after 2010. Before 2010 the rate was higher because some DRAM makers (I cannot say who, due to restrictions in my work contract) used a fault-prone variant of deep-trench technology. The faulty technology was common enough to popularize ECC and parity-checked DRAM modules.
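To illustrate what a parity-checked module does, here is a small Python sketch of even parity on one byte. The helper names `parity_bit`, `store` and `check` are hypothetical, and an actual module computes this in hardware on every access.

```python
from typing import Tuple


def parity_bit(byte: int) -> int:
    """Even parity: 1 if the byte has an odd number of set bits, else 0."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> Tuple[int, int]:
    """Keep the data byte together with its parity bit (9 bits in total)."""
    return (byte & 0xFF, parity_bit(byte))


def check(word: Tuple[int, int]) -> bool:
    """True if the stored parity still matches the data."""
    data, parity = word
    return parity_bit(data) == parity


word = store(0b10110010)
one_flip = (word[0] ^ 0b00001000, word[1])   # a single bit flipped
two_flips = (word[0] ^ 0b00011000, word[1])  # two bits flipped

ok_clean = check(word)       # True: nothing changed
ok_one = check(one_flip)     # False: the single flip is detected
ok_two = check(two_flips)    # True: a double flip slips through
```

Parity only detects an odd number of flipped bits and cannot say which bit is wrong, which is why ECC modules store more check bits per word.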

In space applications radiation doses are much higher (0.3-10 sieverts/year) than on the ground (~0.002 sieverts/year), so soft errors in the form of bit flips and single-event upsets (SEUs) are still common in satellites, despite modern DRAM being less sensitive.
 

1. Do soft errors still occur in today's machines?

Yes, soft errors still occur in today's machines. Soft errors are random, temporary, and non-destructive errors that can occur in electronic systems due to external factors, such as cosmic rays or electrical noise. These errors can affect the data being processed or stored in a system, but they do not cause permanent damage.

2. What causes soft errors in machines?

Soft errors are caused by external factors, such as cosmic rays or electrical noise, that can disrupt the normal operation of electronic systems. These external factors can create an electrical charge that interferes with the data being processed or stored in a system, leading to a soft error.

3. Can soft errors be prevented?

While it is impossible to completely prevent soft errors, they can be mitigated through the use of error-correcting codes (ECC) and other techniques. ECC adds extra bits to data being stored or transmitted, allowing the system to detect and correct errors. Other techniques, such as shielding and redundancy, can also help reduce the occurrence of soft errors.
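As an illustration of how extra bits allow correction, here is a sketch of the classic Hamming(7,4) code in Python: 4 data bits plus 3 check bits, able to correct any single flipped bit. The function names are made up for this example, and real ECC memory uses wider codes over whole memory words, so treat this as a toy model of the principle.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword.
    Bit positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: List[int]) -> Tuple[List[int], int]:
    """Correct at most one flipped bit. Returns (data bits, syndrome);
    a nonzero syndrome is the 1-based position that was corrected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a soft error at position 5
recovered, position = hamming74_decode(stored)
```

Here `recovered` is the original `[1, 0, 1, 1]` and `position` is 5. Two simultaneous flips in one codeword would be miscorrected by this code.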

4. Do soft errors affect all types of machines?

Yes, soft errors can occur in all types of electronic systems, including computers, servers, and mobile devices. They can also affect different components within these systems, such as memory, processors, and storage devices.

5. How often do soft errors occur in machines?

The frequency of soft errors varies depending on the type of machine and its environment. In general, soft errors occur very rarely, with an average rate of 1 error per 1 billion device hours. However, in high-altitude or high-energy environments, such as in space or near nuclear reactors, the frequency of soft errors can be higher.
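To put a rate like this in perspective, the expected number of errors is simply rate × number of devices × hours. The Python sketch below takes the 1-error-per-billion-device-hours figure from the answer above as given; the fleet size of 1000 devices is a made-up example value, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def expected_errors(errors_per_billion_hours: float,
                    devices: int,
                    hours: float) -> float:
    """Expected error count for a fleet, assuming a constant error rate."""
    return errors_per_billion_hours * devices * hours / 1e9


def mean_hours_between_errors(errors_per_billion_hours: float,
                              devices: int) -> float:
    """Average time between errors anywhere in the fleet, in hours."""
    return 1e9 / (errors_per_billion_hours * devices)


# Hypothetical fleet: 1000 devices running for one year at the quoted rate.
per_year = expected_errors(1.0, 1000, HOURS_PER_YEAR)  # roughly 0.00876
gap_hours = mean_hours_between_errors(1.0, 1000)       # about one million hours
```

At that rate even a thousand devices would see far less than one error per year, so the conclusion depends heavily on which rate applies to the hardware and environment in question.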
