Ever since I got my server started, it’s gone through various changes. It started on OpenSolaris, eventually stabilized on FreeNAS, and finally matured into a more permanent FreeBSD install.

Along the way there have been some hardware changes, but mostly in the core guts of the machine. What hasn’t changed are the hard drives it has been running on. Unfortunately, they’re also getting a tad long in the tooth. Dumping out the Power_On_Hours line from smartctl gives me a range of 30998–43674 hours (3.54–4.99 years). Yup. The oldest have been powered on for nearly 5 years now.
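For anyone checking the arithmetic, here is a minimal sketch of the hours-to-years conversion. It assumes a plain 365-day year (8760 hours), and the function name `hours_to_years` is made up for this example.

```python
# Assumption: one year = 365 days * 24 hours = 8760 hours (leap days ignored).
HOURS_PER_YEAR = 24 * 365


def hours_to_years(power_on_hours: int) -> float:
    """Convert a Power_On_Hours raw value to years, rounded to 2 decimals."""
    return round(power_on_hours / HOURS_PER_YEAR, 2)


# The two ends of the range quoted in the post.
youngest = hours_to_years(30998)
oldest = hours_to_years(43674)
print(f"{youngest}-{oldest} years")
```

With the two figures above, this reproduces the 3.54 and 4.99 year values.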

Most of the drives are doing fine (Reallocated_Sector_Ct reads 0 for half of them), but some are slowly accruing bad sectors. Most of those are in the single digits, but I have two at 15 and 37, respectively. Unfortunately, these errors usually pop up overnight during the daily script FreeBSD runs from the smartmontools port, so by morning I’ve already gotten the email saying the drive has been taken offline (an example is in the rest of the post). Since these are all in a RAIDZ setup, losing a single drive is no big deal, but I do have to resolve the issue so the array does not stay degraded. After going through this for numerous incremental errors (out of a dozen read errors, I get maybe 1 or 2 reallocated sectors), I’ve semi-automated the process, although I still need to write a better bash script that does it without intervention.
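To keep an eye on those counters without reading the full report by hand, something like the following could pull a raw value out of the attribute table. This is only a sketch: the sample text imitates the column layout `smartctl -A` prints for ATA drives, its rows are made up for illustration, and `raw_value` is a hypothetical helper, not part of smartmontools.

```python
from typing import Optional

# Illustrative excerpt in the layout of a `smartctl -A` attribute table.
# The rows are invented for this example, not copied from a real drive.
SAMPLE_REPORT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       37
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       43674
"""


def raw_value(report: str, attribute: str) -> Optional[int]:
    """Return the integer RAW_VALUE for one attribute, or None if absent."""
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns: the name is the 2nd
        # column and the raw value is the 10th.
        if len(fields) >= 10 and fields[1] == attribute:
            try:
                return int(fields[9])
            except ValueError:
                # Some drives report non-integer raw values; skip those here.
                return None
    return None


reallocated = raw_value(SAMPLE_REPORT, "Reallocated_Sector_Ct")
if reallocated:
    print(f"drive has {reallocated} reallocated sectors")
```

On a live system the report text would come from running smartctl against each device instead of the sample string. Some drives format raw values differently, so the parsing may need adjusting.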
