Cleaning Disks with the Schneier Method

TL;DR - Look at this gist

Last year I was introduced to Plex.  I was really excited about the idea of converting my now seemingly ancient DVD collection into digital files so I could watch movies and television shows on any device in the house and while on the road.  The challenge I was facing was resources: I lacked a system with enough compute and storage to run a stable Plex box.  So I decided to build one!

Fast forward a few weeks and I had managed to get some fantastic deals on eBay for old hardware: dual Intel L5640s and a Supermicro motherboard to act as a chariot for the aforementioned CPUs.  I paired them with 24GB of ECC RAM that I had lying around for other projects.

Next, the elephant in the room was storage.  I wasn't quite sure how much storage I would actually need, but I knew that I was going to be limited to 2 terabyte hard drives due to the age of the hardware I purchased.  So I went on a hunt to find reasonably priced drives that were NOT refurbished, and I happened to come across some Western Digital 2TB enterprise drives on Amazon.  Digging in a little deeper, I saw the seller noted that the drives being sold were new old stock.  I was able to pick up twelve disks for around 30 dollars apiece, and I thought I had scored big time!

Excitedly I ripped open almost two dozen cardboard boxes and assembled my new media storage server in a whirlwind that left a trail of boxes and screws throughout my workspace.  Thankfully my lovely bride tolerated the mess until I was able to ensure that the new machine powered up and was working as expected.

Up to this point I had never purchased new old stock, so I wanted to make sure that the labels on the drives actually matched what the drives reported.  To do so I installed smartmontools so that I could get actual data on the condition of the drives, from the drives.  I used the trusty lsblk (short for list block devices) to get the drive identifiers.  Then I began to check the disks themselves:

jgavinray@plex:~$ sudo smartctl -A /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-48-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   172   168   021    Pre-fail  Always       -       6391
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       29
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   042   000    Old_age   Always       -       371
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       29
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       28
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   114   094   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

As you can see from above, the disks were certainly not factory fresh.  (The Old_age entries in the TYPE column are only attribute categories, not a verdict on the disk.)  What caught my eye was Power_On_Hours: these disks had been used, but not for very long.  In the example above you can see that this particular disk had been running for 371 hours, or about 15 days.  The disks I received had a range of power-on hours, from 3 hours to over 500 hours.  While certainly not new, that is within my comfort zone for using the disks anyway without returning them.  At thirty dollars apiece I doubt I could have found a better deal anyway.
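To compare power-on hours across a dozen disks without reading each table by hand, something like the following could pull out just that one value.  This is a minimal sketch rather than the commands used for this build: power_on_hours and hours_to_days are helper names made up for illustration, and it assumes the raw value is a plain integer in the tenth column, as in the output above.

```shell
#!/bin/bash
# Read `smartctl -A` output on stdin and print the raw Power_On_Hours value.
# Assumes RAW_VALUE is the tenth whitespace-separated field, as in the table above.
power_on_hours() {
    awk '$2 == "Power_On_Hours" { print $10 }'
}

# Convert hours to whole days using integer division.
hours_to_days() {
    echo $(( $1 / 24 ))
}

# Report every disk given on the command line (needs root and smartmontools).
# With no arguments the loop simply does nothing.
for disk in "$@"; do
    hours=$(sudo smartctl -A "$disk" | power_on_hours)
    [ -n "$hours" ] || continue   # skip disks that do not report the attribute
    echo "$disk: $hours hours (~$(hours_to_days "$hours") days)"
done
```

Passing it /dev/sde from the example above would report 371 hours, or roughly 15 days.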

So now that I have disks, I want to make sure I am getting a clean start.  I have read several of Bruce Schneier's books and many of his blog posts.  He is an advocate of privacy, so I wanted to make sure that I nuked these disks the right way.  In his book Applied Cryptography: Protocols, Algorithms, and Source Code in C, Bruce describes his method of data sanitization: seven overwrite passes, one of all ones, one of all zeros, and five of random data.  As I was looking over the details, several things became clear to me.  One, this would certainly make sure that the disks I received are mine and that there would be no residual data left behind.  Two, my installation of Ubuntu Linux had all of the required software preinstalled to perform this wipe.  So I decided to whip up a little bash script to perform the work!

#!/bin/bash
# The purpose of this script is to wipe a file system or
# physical disk using the Schneier data sanitization method:
# one pass of ones, one pass of zeros, then five passes of random data.

if [ -z "$1" ]
  then
    echo "No arguments supplied, please pass the path of the disk or "
    echo "partition you want sanitized."
    exit 1
fi

read -p "Are you absolutely sure you want to completely wipe $1? [y/N]: " -n 1 -r
echo    # (optional) move to a new line
if [[ ! $REPLY =~ ^[Yy]$ ]]
then
    [[ "$0" = "$BASH_SOURCE" ]] && exit 1 || return 1 # handle exits from shell or function but don't exit interactive shell
fi

echo "Starting data sanitization"
# Pass 1: all ones (0xFF).  tr turns the zero stream into 0xFF bytes and
# stops with "No space left on device" once the target is full.
tr '\0' '\377' < /dev/zero > "$1"
echo "First pass done"
# Pass 2: all zeros (0x00).  dd also ends with "No space left on device".
/bin/dd if=/dev/zero of="$1" bs=128M status=progress
echo "Second pass done"
# Passes 3 through 7: random data.
/bin/dd if=/dev/urandom of="$1" bs=128M status=progress
echo "Third pass done"
/bin/dd if=/dev/urandom of="$1" bs=128M status=progress
echo "Fourth pass done"
/bin/dd if=/dev/urandom of="$1" bs=128M status=progress
echo "Fifth pass done"
/bin/dd if=/dev/urandom of="$1" bs=128M status=progress
echo "Sixth pass done"
/bin/dd if=/dev/urandom of="$1" bs=128M status=progress
echo "Seventh pass done"

echo "$1 has been destroyed."
exit 0

Note: The latest version of this can be seen as a public gist on GitHub.
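Before pointing a script like this at real hardware, the same pass sequence can be rehearsed on a throwaway file.  The sketch below is a variant under stated assumptions, not the gist itself: wipe_file is a hypothetical helper, and because a regular file has no end to run into, each pass is bounded with head -c to the file's current size.  It is only a rehearsal of the pattern; overwriting a file in place is no guarantee that the underlying disk blocks are overwritten.

```shell
#!/bin/bash
# Overwrite a regular file in place with the seven-pass sequence.
# wipe_file is a hypothetical helper for practice runs, not part of the gist.
wipe_file() {
    local target=$1 size pass
    size=$(( $(wc -c < "$target") ))   # current size in bytes

    # Pass 1: all ones (0xFF).  conv=notrunc overwrites without truncating first.
    head -c "$size" /dev/zero | LC_ALL=C tr '\0' '\377' | dd of="$target" conv=notrunc 2>/dev/null
    # Pass 2: all zeros (0x00).
    head -c "$size" /dev/zero | dd of="$target" conv=notrunc 2>/dev/null
    # Passes 3 through 7: random data.
    for pass in 3 4 5 6 7; do
        head -c "$size" /dev/urandom | dd of="$target" conv=notrunc 2>/dev/null
    done
}
```

After sourcing the file, calling wipe_file on a scratch file leaves it the same size but filled with random bytes from the final pass.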

I was shocked at how long the above process took: about 3 days per disk.  That said, after running the script on the drives I had some confidence that the disks were clean and not on the verge of failure.
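The three days is less surprising once the arithmetic is written down: seven passes over a 2TB disk is 14TB of writes.  As a rough sketch, the hypothetical estimate_hours helper below turns a disk size, a pass count, and an assumed sustained write speed into hours.  The 54 MB/s figure is not a measurement from this build; it is simply the average rate that works out to about three days.

```shell
#!/bin/bash
# Estimate wipe time in whole hours.
# Arguments: disk size in GB (decimal), number of passes, sustained speed in MB/s.
estimate_hours() {
    local size_gb=$1 passes=$2 mb_per_s=$3
    local total_mb=$(( size_gb * 1000 * passes ))   # total data written, in MB
    echo $(( total_mb / mb_per_s / 3600 ))          # seconds to hours, rounded down
}

estimate_hours 2000 7 54   # 2TB disk, seven passes, assumed 54 MB/s average
```

With those inputs it prints 72, which is three days.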

After the above was done, it only took a few minutes to install OpenZFS on my Ubuntu box and set up a raidz2 pool across these aged magnetic disks.  I can now stream classics from my DVD collection to any device in the house and have a measure of confidence that there is some redundancy in the data stored on the newly acquired disks.
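For anyone following along, the pool setup might look roughly like this.  On Ubuntu the OpenZFS tools come from the zfsutils-linux package (sudo apt install zfsutils-linux).  Everything else in the sketch is an assumption: the pool name tank, the device paths, and the idea that all twelve disks sit in a single raidz2 group are placeholders, not the layout actually used here.  The helpers only print the command and a rough capacity figure; nothing is created.

```shell
#!/bin/bash
# Build (but do not run) the zpool command for a raidz2 pool.
# The pool name and device list are supplied by the caller; the ones
# used below are placeholders.
raidz2_command() {
    local pool=$1
    shift
    echo "zpool create $pool raidz2 $*"
}

# Rough usable capacity in TB: raidz2 spends two disks' worth of space
# on parity.  Ignores metadata and padding overhead.
raidz2_usable_tb() {
    local disks=$1 tb_each=$2
    echo $(( (disks - 2) * tb_each ))
}

raidz2_command tank /dev/sdb /dev/sdc /dev/sdd /dev/sde
raidz2_usable_tb 12 2   # twelve 2TB disks in one raidz2 group
```

The second call prints 20, so a single twelve-disk raidz2 group would offer roughly 20TB before overhead while surviving the loss of any two disks.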

All in all, this was a really neat learning experience, and it gave me a tool that I use at work to ensure that disks are clean when they are decommissioned or when we hand hardware off to folks outside of my team.