In a recent exchange about error-correcting code memory (ECC memory), Linus Torvalds openly criticized Intel for not bringing ECC RAM to mainstream platforms and praised AMD for supporting it on Ryzen platforms.
ECC memory is a type of random access memory that stores an error-correcting code alongside the data, which allows it to detect and correct the most common types of data corruption. This type of memory is used in computers where data corruption cannot be tolerated under any circumstances, such as for scientific or financial calculations.
For many industries, major memory errors carry not only the risk of financial loss: in the worst case, they can seriously weaken a company's position in the market.
In this context, the tendency to keep adding memory is criticized: the more memory capacity a system has, the greater the risk that an error occurs somewhere in it. That is why server and workstation environments that require high data integrity emphasize comprehensive data protection. For example, ECC memory is used instead of normal RAM to protect against simple single-bit errors.
Given this, several approaches have been developed to deal with memory errors: immunity-aware programming, parity bits, and error-correcting code memory. ECC refers to a way of encoding data so that single-bit errors can be both detected and corrected.
In addition, ECC can also detect the rarer double-bit errors, although it cannot correct them. To take advantage of this correction method, ordinary random access memory (RAM) modules are extended with an additional memory chip that holds the check bits. That is why we talk about ECC RAM.
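The single-bit correction and double-bit detection described above can be illustrated with a toy example. The following is a minimal sketch, assuming an extended Hamming code that protects 4 data bits with 4 check bits; real ECC modules typically protect 64 data bits with 8 check bits, and the memory controller does this work in hardware. The function names `encode` and `decode` are hypothetical and chosen only for illustration.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (a list of 0/1 values)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8                             # index 0 holds the overall parity bit
    code[3], code[5], code[6], code[7] = d     # data bits sit at positions 3, 5, 6, 7
    code[1] = code[3] ^ code[5] ^ code[7]      # check bit for positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]      # check bit for positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]      # check bit for positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2                # overall parity over positions 1..7
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # The syndrome is the XOR of the positions of all set bits.
    # It is 0 for a valid codeword and equals the flipped position after one flip.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2  # 0 when the overall parity is still consistent

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume exactly one and repair it.
        # syndrome == 0 here means the overall parity bit itself was hit.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with consistent overall parity: two bits flipped.
        return None, "uncorrectable"

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                 # simulate a single bit flip
    print(decode(word))          # the flip is located and repaired
    word[5] ^= 1
    word[6] ^= 1                 # now two bits differ from the stored codeword
    print(decode(word))          # detected, but not correctable
```

A plain parity bit, by comparison, corresponds to keeping only the overall parity check: it can signal that a single bit flipped but cannot say which one, so it cannot repair the data.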
At the end of the day, there is a trade-off between protection against data loss and the higher cost of the memory. ECC therefore comes with certain drawbacks:
- Error-correcting code memory is more expensive than conventional memory due to the additional hardware required to produce it and the lower production volumes of this memory and associated components.
- Motherboards, chipsets, and processors that support error-correcting code memory are also more expensive for the same reasons.
- Error-correcting code memory can be 2 to 3 percent slower than conventional memory due to the additional time required for error checking and correction. However, modern systems integrate the error checking into the processor, so memory accesses incur no additional delay as long as no errors are detected.
Linus Torvalds' Perspective
When told, “So yeah, I totally agree that AMD offers a better deal. However, ECC doesn't really matter here,” Linus Torvalds replied:
“ECC absolutely matters.
“ECC availability matters a lot, precisely because Intel has been instrumental in killing the entire ECC industry with its horribly bad market segmentation.
“Go out there and try to find ECC DIMMs, it's really hard. Of course, probably all thanks to AMD, it may have improved a bit lately, but that's exactly my point.
“Intel has been hurting the entire industry and users because of its bad and misguided policies towards ECC. Seriously.
“And if you don't believe me, look at the multiple generations of rowhammer, where every time Intel and the memory makers claimed it would be fixed next time.”
In his post, Torvalds points the finger at Intel for the lack of widespread ECC adoption in the mainstream space.
Torvalds believes this is due to Intel's thoroughly misguided stance on ECC support in its consumer processors and chipsets, saying that this alone has removed any incentive for memory makers to create desktop ECC memory for the general public.
Torvalds also praised AMD for its unofficial support for ECC. Even though the support is unofficial, Linus is still very happy that AMD offers the option at all on Ryzen platforms.
“I don't really care if my system is equipped with ECC or not. This is not the problem. If I have memory errors, I'm actually pretty good at figuring them out. Also, I end up using fairly ‘safe’ machines. I make sure I have an over-specified power supply, I live mostly at sea level, I don't overclock, and I buy reputable products,” said Torvalds.