Segfaults due to defective RAM

This blog post was published 10 years ago and may or may not have aged well. While reading please keep in mind that it may no longer be accurate or even relevant.

If you get various segfaults on your Linux machine, like these:

spamd child[2656]: segfault at 200251c208 ip 00007fa039223684 sp 00007fff77953680 error 4 in libperl.so.5.14.2[7fa03916a000+177000]

or:

clamd[3311]: segfault at 1000000008 ip 00007f00200b3751 sp 00007fff3e2cef60 error 4 in libclamav.so.6.1.17[7f001fff1000+988000]

or:

php5[14914]: segfault at 7fff7d2939c8 ip 00000000006bf04d sp 00007fff6d293860 error 6 in php5[400000+6f3000]

or:

PassengerHelper[11644]: segfault at ffffffffca4ef420 ip 0000000000492fea sp 00007f5b81e991d0 error 7 in PassengerHelperAgent[400000+203000]

Then no, your system has not suddenly gone crazy, and neither have you. It could simply be that a RAM module is defective! To diagnose this, we should run a RAM test (such as Memtest86+) from the boot manager.
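To make log lines like the ones above easier to compare, here is a rough Python sketch that pulls them apart. It assumes an x86-64 machine, where the number after "error" is the page-fault error code printed in hex (bit 0: the page was present, bit 1: the access was a write, bit 2: the fault happened in user mode). The function name decode_segfault and the returned field names are my own invention, not part of any tool.

```python
import re

# Matches the kernel's "segfault at ..." message format quoted in this post.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split one segfault log line into its parts (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "object": m.group("obj"),
        "fault_address": int(m.group("addr"), 16),
        # Where inside the binary or library the crashing instruction sits.
        "offset_in_object": ip - base,
        # Assumed x86 page-fault error code bits:
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


line = ("spamd child[2656]: segfault at 200251c208 ip 00007fa039223684 "
        "sp 00007fff77953680 error 4 in libperl.so.5.14.2[7fa03916a000+177000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset_in_object"]), info["access"], info["mode"])
```

If the crashes are spread over unrelated programs and the offsets differ every time, that fits the defective RAM explanation better than a bug in any single program.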

If we are operating a server that we can’t reboot, there is an excellent tool called memtester, a memory tester that runs on a live system. It is part of the Debian distribution; install it with apt-get install memtester.

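Check top to see how much free RAM is available. Say you have 10 GB of free RAM: then tell memtester to test 8 GB of it (so that 2 GB remain free for the running system to operate). The first argument is the amount of memory in megabytes and the second is the number of loops, so memtester 8000 3 tests roughly 8 GB three times.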
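If you would rather compute the number than read it off top, the arithmetic can be sketched in Python like this. It is only a sketch: it assumes /proc/meminfo has a MemAvailable line (older kernels only have MemFree, which is used as a fallback), and the 2048 MB reserve as well as both function names are my own choices.

```python
def memtester_megabytes(meminfo_text, reserve_mb=2048):
    """Return how many MB to give memtester while keeping reserve_mb free."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # memory sizes are in kB
    # Prefer MemAvailable; fall back to MemFree on older kernels.
    available_kb = fields.get("MemAvailable", fields.get("MemFree", 0))
    return max(available_kb // 1024 - reserve_mb, 0)


def suggest_command(path="/proc/meminfo", loops=3):
    """Build a memtester command line from the live system (hypothetical helper)."""
    with open(path) as f:
        return f"memtester {memtester_megabytes(f.read())} {loops}"


sample = "MemTotal:       16777216 kB\nMemAvailable:   10485760 kB\n"
print(memtester_megabytes(sample))  # 10240 MB available minus 2048 MB reserve
```

Calling suggest_command() on the server itself prints a command line you can check by eye before running it.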

In my case, memtester indeed detected faulty RAM:

memtester 8000 3
Loop 1/3:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : testing  30FAILURE: 0xffffffffffffffff != 0xfffffffbffffffff at offset 0x36e77910.
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok

So, when I replaced the RAM, the segfaults stopped.
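The FAILURE line shows two values that should have been identical. XOR-ing them reveals which bits differ. Here is a small Python sketch of that; the function name flipped_bits is just mine.

```python
def flipped_bits(a, b):
    """Return the positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# The two values from the "Solid Bits" failure in the memtester output.
print(flipped_bits(0xffffffffffffffff, 0xfffffffbffffffff))  # [34]
```

So exactly one bit, bit 34 of that 64-bit word, read back as 0 instead of 1. A single flipped bit like this is easy to picture turning a valid pointer into a wild one, which would end in a segfault.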

You could run memtester regularly to make sure the RAM is okay. Needless to say, healthy RAM is a crucial part of every hosting operation!
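For regular runs, for instance from cron, a small wrapper can turn the exit status of memtester into a readable report. This is a sketch under assumptions: the meaning of the exit status bits is taken from my reading of the memtester man page, so please verify it against your installed version. The function names are made up, and memtester usually needs root so that it can lock the memory it tests.

```python
import subprocess

# Exit status bits as I understand the memtester man page (an assumption;
# check `man memtester` on your system). A status of 0 means all tests passed.
EXIT_FLAGS = {
    0x01: "could not allocate or lock memory, or bad invocation",
    0x02: "stuck address test failed",
    0x04: "one of the other tests failed",
}


def explain_exit_status(status):
    """Translate a memtester exit status into a list of problem descriptions."""
    return [text for flag, text in EXIT_FLAGS.items() if status & flag]


def run_memtester(megabytes, loops=1):
    """Run memtester (must be installed) and return the list of problems."""
    result = subprocess.run(
        ["memtester", str(megabytes), str(loops)],
        capture_output=True,
        text=True,
    )
    return explain_exit_status(result.returncode)


print(explain_exit_status(0x04))
```

An empty list from run_memtester means the run came back clean. Anything else is worth an e-mail to yourself.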

In my case, however, the segfaults had also corrupted MySQL tables, which I had to clean up manually.

If you found a mistake in this blog post, or would like to suggest an improvement, please send me an e-mail at michael@franzl.name; as the subject, please use the prefix "Comment to blog post" and append the post title.
 
Copyright © 2023 Michael Franzl