Removing millions of files in a directory

This blog post was published 11 years ago and may or may not have aged well. While reading, please keep in mind that it may no longer be accurate or even relevant.

At one point, an SMTP submission user account on my Exim email instance got compromised and started sending out spam e-mails at a high rate. As a consequence, my Exim instance got blacklisted, deliveries were deferred, and millions of spam messages accumulated in the mail queue (which stores its messages as regular files in the directories /var/spool/exim4/input and /var/spool/exim4/msglog).

There were so many files that I was not able to remove them with the standard rm ./* wildcard approach, because the shell's wildcard expansion alone consumed too much CPU and RAM: the shell has to build the complete argument list in memory before rm is even started.
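
For illustration only, the naive attempt looked roughly like this (not a working solution; on a directory this large the shell either grinds away on the expansion or, if the expanded list exceeds the kernel's argument size limit, typically aborts with an "Argument list too long" error):

cd /var/spool/exim4/input
rm ./*    # the shell expands the wildcard to millions of arguments before rm runs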

I was also not able to use find . -type f -delete, as suggested by some tutorials on the internet. The reason is that it too collected filenames upfront, which exhausted the RAM.

My solution was to use Perl, whose readdir function allows streaming the entries of a directory one at a time, so memory usage stays constant:

cd directory/to/be/emptied
perl -e 'opendir D, "." or die $!; while ($n = readdir D) { next if $n eq "." or $n eq ".."; print "$n\n"; unlink $n }'

This command started deleting files after only about one minute of initialization. The deletion speed on an ext4 filesystem on an SSD was about 200 files per second.
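
As a variation on the same idea (not part of the original solution, and the interval of 10000 is an arbitrary choice), the one-liner can be extended to print a running count instead of every filename, which makes it easier to estimate progress on a queue of unknown size:

cd directory/to/be/emptied
perl -e 'opendir my $d, "." or die $!; my $i = 0; while (my $n = readdir $d) { next if $n eq "." or $n eq ".."; if (unlink $n) { ++$i; print "deleted $i files\n" unless $i % 10000 } }'

Like the original one-liner, this reads one directory entry at a time, so memory usage stays flat no matter how many files remain.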

If you found a mistake in this blog post, or would like to suggest an improvement, please send me an e-mail to michael@franzl.name; as the subject, please use the prefix "Comment to blog post" followed by the post title.
 
Copyright © 2023 Michael Franzl