Impact of power failure during SpinRite 6.1 write?


Journaling file system
SpinRite does not use or care about the existence of the file system whatsoever. It runs sector by sector (LBA by LBA). If you have a power failure during a write, you're going to experience data loss, and potentially low-level corruption of the magnetics (on spinning rust). Your best option is to not have a power failure, so use a UPS (or a laptop with a quality battery).
 
Thanks for explaining that. I'll investigate UPS's!

An afterthought: my SR usage is infrequent, perhaps once a year. So, for me, a cheaper but effective mitigation would be to image the SSD just before running SR, and if the worst happens, restore that image.

Perhaps this risk (and suggested mitigation) could be highlighted in the SR user manual or FAQs?
 
It's worse than simply the loss of a sector or two. Whatever is in the disk's cache is lost. Additionally, since writes are performed out of order, i.e. the disk firmware may use the elevator algorithm, data written to a database may be corrupt because writes are not atomic.

However, this can be somewhat mitigated if the disk a) supports tagged queuing and b) the O/S uses it. SCSI devices do support this. Some ATA devices may (poorly). Tagged queuing requires the use of DMA. (And the SCSI protocol does support an atomic write command.)
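
To make the reordering point concrete, here is a toy Python simulation. It is only a sketch: it does not model any real drive firmware, and the function names, LBAs and the three-write "transaction" are made up for illustration. A database submits its commit record last, the drive services the queue in ascending-LBA (elevator) order instead, and power is cut after the first write completes.

```python
def elevator_order(pending, head_pos=0):
    """Order queued (lba, data) writes as a simple one-direction elevator would:
    ascending LBAs at or above the head position first, then the rest ascending."""
    ahead = sorted(w for w in pending if w[0] >= head_pos)
    behind = sorted(w for w in pending if w[0] < head_pos)
    return ahead + behind


def run_until_power_cut(disk, pending, completed_before_cut, head_pos=0):
    """Apply only the first `completed_before_cut` writes in serviced order.
    Everything still sitting in the cache is lost."""
    serviced = elevator_order(pending, head_pos)
    for lba, data in serviced[:completed_before_cut]:
        disk[lba] = data
    return disk


# Hypothetical layout: data page at LBA 100, index page at 500, commit record at 50.
disk = {100: "old-data", 500: "old-index", 50: "no-commit"}

# The application submits the commit record LAST, expecting it to land last.
submitted = [(500, "new-index"), (100, "new-data"), (50, "commit")]

after = run_until_power_cut(dict(disk), submitted, completed_before_cut=1)
print(after)
```

In this run the commit record reaches the platter while the data and index pages still hold their old contents, which is exactly the inconsistent state a database cannot detect on its own. Ordered or tagged queuing exists to let the host forbid that reordering.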

And, yes a UPS is the best solution. I do have a UPS for my machines in my basement. When the UPS reaches a certain battery level the machines will shut themselves down. I use Network UPS Tools to manage this.
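
For anyone curious what that shutdown decision looks like, here is a minimal Python sketch. NUT's own `upsmon` already handles this, so this is illustration only. It parses text in the `name: value` form that `upsc` prints and checks the `ups.status` flags (`OB` for on battery, `LB` for low battery) and `battery.charge`. The 30% threshold and the function names are assumptions for the example, not NUT defaults.

```python
def parse_upsc(text):
    """Parse upsc-style output (one "name: value" pair per line) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def should_shut_down(values, min_charge=30):
    """True when the UPS reports low battery (LB), or when it is on battery (OB)
    and the charge has fallen to min_charge percent or below."""
    flags = values.get("ups.status", "").split()
    charge = float(values.get("battery.charge", "100"))
    on_battery = "OB" in flags
    return "LB" in flags or (on_battery and charge <= min_charge)


# Sample text in the shape upsc prints; the values themselves are made up.
sample = """battery.charge: 27
ups.status: OB DISCHRG
"""
print(should_shut_down(parse_upsc(sample)))
```

A real setup would feed this from the running UPS driver rather than a hard-coded string, and would trigger an orderly shutdown instead of printing.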
 
UPS is the best solution
Aside from preventing data loss when an actual power outage occurs, it's also beneficial to condition the power and prevent momentary power glitches that can be the source of unexplained system instability and data corruption. There are higher-quality UPSes (at higher prices, of course) that run all input power through their output sine wave reconstruction, which has the benefit of providing the most stable power possible for your equipment. Different manufacturers use different terms, but you will probably encounter online (the most capable and expensive), line-interactive, and offline.
 
True. I keep my UPS plugged into a power-conditioning surge protector. The reason: we had an open neutral between the meter and the panel. The connection broke. They should have run the neutral from the pole directly into the panel, but instead they had spliced it just behind the meter. When it broke, the lights flickered wildly during a terrible wind storm. I unplugged most gear, but any surge protectors still plugged in were fried, and the smell of burnt electronics (in the surge protectors) was apparent. The surges were so bad that light bulbs popped.

The surges also fried the ($400) circuit board in the furnace.

Even though the UPS does protect from surges and conditions power, I protect my UPS too as it's much more expensive than a sacrificial surge protector.

We're thinking of installing a whole house surge protector/power conditioner in the panel.
 
Yes, a whole-panel protector will help, as will putting SPDs after the breakers, though that is hard on US panels: most SPDs are designed for DIN rail use, and US residential boards are not really sold in DIN rail format. Outside the USA it is common to have SPDs in the panel, protected by a breaker, so you add a few and put them after the circuit protection, so that the SPD protects against overvoltage and trips the breaker.
 
It's worse than simply the loss of a sector or two. Whatever is in the disk's cache is lost. Additionally, since writes are performed out of order, i.e. the disk firmware may use the elevator algorithm, data written to a database may be corrupt because writes are not atomic.

However, this can be somewhat mitigated if the disk a) supports tagged queuing and b) the O/S uses it. SCSI devices do support this. Some ATA devices may (poorly). Tagged queuing requires the use of DMA. (And the SCSI protocol does support an atomic write command.)
Those are valid comments when running a drive under a modern OS, which does things like command queuing and enables drive write caching. However, SpinRite does not. It only issues one command at a time, without command queuing, so you could only lose a single command's worth in the buffer. I don't think that SpinRite runs with write caching enabled. However, since SpinRite will issue sequential commands as large as 16MB, ignoring the filesystem, you can still wind up corrupting a lot of data across several files if the power goes out in the middle of a write.
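
For a sense of scale, here is a quick calculation of how many sectors one such transfer spans. It assumes "16MB" means 16 MiB and uses the two common logical sector sizes, 512 and 4096 bytes. Those sizes are assumptions for illustration, not figures taken from SpinRite's documentation.

```python
TRANSFER_BYTES = 16 * 1024 * 1024  # assumption: "16MB" taken as 16 MiB


def sectors_per_transfer(sector_size):
    """Number of whole sectors covered by one maximum-size transfer."""
    return TRANSFER_BYTES // sector_size


for size in (512, 4096):
    print(f"{size}-byte sectors: {sectors_per_transfer(size)} per transfer")
```

That works out to 32768 sectors at 512 bytes, or 4096 sectors at 4096 bytes, in flight in a single command, which is easily enough to span many small files.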

Levels 1 and 2, which only do reads, are pretty safe from power failure. However, if SpinRite has recovered the data from a bad sector during these levels and is in the process of writing it out when the power failure happens you could lose the data. If it's happening in the middle of the reassign, it could even leave the drive confused as to where the valid sector is.
 
With cache enabled, TCQ reduces the possibility of data loss during a power failure.

At most one sector could be lost, though some drives will complete a write operation before parking heads while power is lost.

The impact of what is lost depends on whether a single data sector or a sector containing critical filesystem metadata is lost. Of course any app, whether SpinRite, dd, or any other block-level app, doesn't know the structure of the filesystem written to the media, hence it's a crapshoot.

Personally, I wouldn't worry about the possibility, at least not on any UFS, ZFS, or EXT* filesystem. NTFS and Windows files may be another story. Though NTFS should be able to handle it, the loss or corruption of a Windows O/S file would make me nervous, as Windows IMO is fragile compared to other O/Ss.
 
Levels 1 and 2, which only do reads
Ah... no. Not quite correct. Level 1 is indeed read-only. Safe.

But Level 2 will be read only unless/until data recovery is required. Then writing of recovered data will occur.

For a healthy readable drive, however, Levels 1 and 2 would both be read only. And both safe.
 
I suggested (see #3 above) that, for me, a cheaper but effective mitigation against power cuts whilst SR was running would be to image the SSD before running SR, and if the worst happens restore that image (I use Image for Linux). Would anyone care to comment? Is that likely to be effective? Are there any gotchas?

PS: I'm not interested in further UPS discussion!
 
A viable approach methinks. The possible "Gotcha" is that if a power failure occurs during the image process, the image is corrupt and useless. But when power is restored, the drive could then simply be re-imaged before SpinRite-ing.
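
On that gotcha: one way to tell whether an image finished and reads back intact is to hash the data as it is copied and then re-hash the image file afterwards. Below is a minimal Python sketch of the idea, assuming a raw sector-by-sector image. It is not how Image for Linux works internally, and the function names are made up for the example.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # copy 1 MiB at a time


def copy_with_hash(src_path, dst_path):
    """Copy src to dst in chunks and return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while chunk := src.read(CHUNK):
            digest.update(chunk)
            dst.write(chunk)
        dst.flush()
        os.fsync(dst.fileno())  # push the image to stable storage before trusting it
    return digest.hexdigest()


def file_hash(path):
    """SHA-256 of a file, read back from storage in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            digest.update(chunk)
    return digest.hexdigest()


def image_and_verify(src_path, dst_path):
    """True only if the image read back matches what was read from the source."""
    return copy_with_hash(src_path, dst_path) == file_hash(dst_path)
```

Against a real drive the source would be a block device and the destination a file on a different disk, for example a placeholder call like `image_and_verify("/dev/sdX", "/mnt/backup/ssd.img")` run with sufficient privileges. If power fails mid-copy, the function never returns True, so an incomplete image is never mistaken for a good one.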

Comments re UPS's not mentioned above (for anyone reading this):

A UPS should not be expected to complete a SpinRite scan in progress when power goes out.

While a UPS does provide protection against momentary power failures / fluctuations (its purpose) a UPS will not have enough battery time to complete any lengthy SpinRite scan. However, the UPS will provide the user the opportunity (also its purpose) to safely terminate SpinRite and safely shut down the system until stable power is resumed. And then either resume SpinRite from the point of interruption or simply restart SpinRite from the beginning.
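
A rough back-of-the-envelope estimate shows why. The sketch below divides usable battery energy by the load. All figures are hypothetical (a 12 V, 9 Ah battery and a 150 W load), the 85% inverter efficiency is an assumption, and real runtimes are usually shorter because lead-acid capacity drops at high discharge rates.

```python
def runtime_minutes(battery_wh, load_watts, inverter_efficiency=0.85):
    """Rough UPS runtime in minutes: usable energy divided by load.
    Ignores battery age and capacity loss at high discharge rates."""
    return battery_wh * inverter_efficiency / load_watts * 60


# Hypothetical figures: 12 V x 9 Ah = 108 Wh battery, 150 W PC plus monitor.
battery_wh = 12 * 9
print(f"{runtime_minutes(battery_wh, 150):.0f} minutes")
```

Even this optimistic estimate comes out to well under an hour, while a full scan of a large drive can run for many hours.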
 
I suggested (see #3 above) that, for me, a cheaper but effective mitigation against power cuts whilst SR was running would be to image the SSD before running SR, and if the worst happens restore that image (I use Image for Linux). Would anyone care to comment? Is that likely to be effective? Are there any gotchas?
That depends on whether you are running SR as a preventative measure or to recover data. In the latter case, either your image might contain the corrupt data, or the act of taking the image might tip the drive over a failure point.