Is there a point at which slow SpinRite performance means the drive is bad?


I'm not going to post every drive's results, as I have a stack of them I'm working through. But I thought those who contributed guidance on this latest drive we were discussing might be interested in "the rest of the story". It also gives a good basis for asking a few final questions.

While I was tempted to kill SpinRite processing of this drive, I let it run. It took 4+ days, but it completed. I've attached the relevant screens showing status after completion. Points of interest:
  • Graphics Status Display: one Unrecovered area of storage.
  • Real-Time Activities: 14 command timeouts, 14 minor troubles
  • SMART System Monitor: ECC corrected 112/114, relocated sect 76/77, uncorrectable 84/100, pending sectors 84/100.
I've read the doc on all of these, and the SMART doc too. Based on the above, what would your verdict on this drive be? Would running SpinRite again on it potentially clean it up further to a reliable state?

One last question about SpinRite usage: is it possible to rerun SpinRite on only the specific problem areas indicated by these results, and ignore the rest of the drive, which tested fine?

Thanks so much for your help!
 

Attachments: IMG_0804.jpeg, IMG_0805.jpeg, IMG_0808.jpeg
is it possible to rerun SpinRite on only the specific problem areas only as indicated by these results
Sure, in a manner of speaking. You should have the logs, which should give you LBA numbers (or, more likely, a suitable range), and you can specify the range on the command line or in the UI. On the screen just before final confirmation, there should be a note on what to press to edit the range. It used to be Shift-Enter, but that changed toward the end of 6.1 development, and I forget the new sequence; it might be TAB. See the FAQ on the site for the command line options if you want to go that route: https://www.grc.com/sr/faq.htm (search for "What are SpinRite’s command line options?").
 
One last question about SpinRite usage: is it possible to rerun SpinRite on only the specific problem areas only as indicated by these results, and ignore the rest of the drive which tested fine?
Yes! One way is via the command line as pholder noted.

Another way: When you get to the "Before Beginning" screen do not press Enter. Press TAB instead. This takes you to a screen where you may specify starting and stopping points for a more surgical SpinRite run.

Of course, either way you need to know where to start and where to stop. This can be either by a percentage or by sector number. This information is typically found in the SRLOG file. It is also specified on screen by SpinRite when a run is aborted in progress.
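Since a start and stop point can be given either as a percentage or as a sector number, it can help to convert between the two before editing the range. The following is only a sketch: the function names, the drive size, and the idea of padding the range with a safety margin are illustrative assumptions, not anything SpinRite itself provides.

```python
def percent_to_lba(percent: float, total_sectors: int) -> int:
    """Map a position given as a percentage of the drive to a sector number."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # Clamp so that 100% maps to the last valid sector, not one past it.
    return min(int(total_sectors * percent / 100.0), total_sectors - 1)


def lba_to_percent(lba: int, total_sectors: int) -> float:
    """Map a sector number to its position as a percentage of the drive."""
    if not 0 <= lba < total_sectors:
        raise ValueError("lba is outside the drive")
    return 100.0 * lba / total_sectors


def padded_range(first_bad: int, last_bad: int, total_sectors: int, margin: int):
    """Widen a problem range by `margin` sectors on each side (assumed margin),
    keeping the result inside the drive."""
    start = max(0, first_bad - margin)
    stop = min(total_sectors - 1, last_bad + margin)
    return start, stop
```

For example, with a hypothetical sector count taken from the drive's identification screen, `padded_range(first_bad, last_bad, total_sectors, margin)` gives a start and stop sector that could then be entered on the range screen, and `lba_to_percent` gives the equivalent percentages.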
 
Update -- I have made my way through all 13 of my drives, and here are the first-pass totals:
  • 7 good (mostly, minor stuff)
  • 3 bad: either outright unreadable, or SpinRite estimated months to process and the estimate stayed that high after 30-60 minutes of running (I exited SpinRite).
  • 3 iffy: multi-day processing times, moderate errors.
Thank you everyone for your guidance on running SpinRite and reading the results. 7 of those drives are going back into circulation in a new QNAP DAS, I'll hold a funeral for the 3 bad ones, and the 3 iffy ones are being experimented with using @DanR's XFER guidance (4096). Thanks to the helpful pointers, I'm starting to get the feel of all this. These last XFER tests will probably take a couple of weeks, but hopefully in the near future I can get a blog post out. If XFER and using these drives in the QNAP array produce any interesting nuggets, I'll post here.

This has been a great exercise on a great tool -- thanks again to all who have contributed.
 
Re: "... [SpinRite] estimating months to process ..."

2 things:

For Data Recovery, try DYNASTAT 1 to keep SpinRite from spending 5 minutes on every unreadable sector:

SPINRITE NORAMTEST DYNASTAT 1

For Drive Maintenance, try DYNASTAT 0 to eliminate data recovery altogether:

SPINRITE NORAMTEST DYNASTAT 0

Level 2, 3, 4, or 5: you decide how thorough a test you want to throw at anything.

Those might speed through drives for reuse.

That being said, I have let SpinRite have at it for 2 weeks, and I was pleased to discover that it recovered everything, even the drive itself.

Let us know if that helps make the worst drives usable again.
 
@peterblaise — thanks for the reply. I have no need for data recovery at all. For the 13 drives I mentioned that I ran SpinRite on, I used this command:

SPINRITE NORAMTEST LEVEL 3 DYNASTAT 0 NOREWRITE

For the 3 iffy drives out of those 13 after one run (moderate issues), I’m now running the following on those:

SPINRITE NORAMTEST LEVEL 3 DYNASTAT 0 NOREWRITE XFER 4096

Though this does bring up another question I have. Why, when I specify “LEVEL 3” on the command line, does SpinRite still ask me to select the Level I want to run? It appears that specifying the Level on the command line has no effect.
 
I hadn’t planned to make this thread longer, but since a question from the new circumstances aligns with the original thread topic, I’ll put it here. I have been running SpinRite on one of those “iffy” drives (moderate issues) for a few days: about 50 hours so far, with an estimated 350 hours remaining. I've been running it with this command:

SPINRITE NORAMTEST LEVEL 3 DYNASTAT 0 NOREWRITE XFER 4096

I got to thinking about it — what does it mean when you have a drive that’s processing very slowly (relatively speaking), but you aren’t getting any (or many) errors? Is it possible that such a drive will be cleaned up and returned to normal use? Or is it garbage regardless of lack of errors?
 
Personally I would take note of the slow area then re-run SR over the same area a few times on level 3-5 to see if it's still slow or if working the platters helped in cleaning it up. Then decide.
 
Though this does bring up another question I have. Why, when I specify “LEVEL 3” on the command line, does SpinRite still ask me to select the Level I want to run? It appears that specifying the Level on the command line has no effect.
In this case, Level 3 has been selected (via the command line) and will show at the very top center of the Level Selection screen.
SpinRite is merely giving you the option of changing your mind if a different level is desired.

BTW: You could also try XFER 2048 or XFER 1024, for example. However, progress could be slower as SpinRite's "bites" now involve fewer bytes. :)
 
Personally I would take note of the slow area then re-run SR over the same area a few times on level 3-5 to see if it's still slow or if working the platters helped in cleaning it up. Then decide.
Best I can tell, there’s no area that’s been slower or faster than the others. All processing of the drive has been slow. The original estimate for drive processing time was around 400 hours, and the rate of processing hasn’t changed substantially after 50 hours. That’s really the question, what does it mean if processing of the entire drive is slow, but there isn’t much in the way of errors?
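The "is the whole drive slow, or only part of it?" question can be checked with simple arithmetic on the elapsed time and the fraction completed. This is a minimal sketch using the figures from this thread (about 50 hours elapsed and about 350 remaining, so roughly 12.5% done); the function names and the 25% tolerance are assumptions chosen for illustration, and the progress samples would have to be noted by hand from SpinRite's screen.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Project the total run time, assuming the rate so far holds."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def rate_is_steady(samples, tolerance=0.25):
    """Return True if progress per hour stays within `tolerance` of its mean.

    `samples` is a time-ordered list of (elapsed_hours, fraction_done) pairs.
    """
    rates = []
    for (t0, f0), (t1, f1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("samples must be ordered by increasing time")
        rates.append((f1 - f0) / (t1 - t0))
    if not rates:
        raise ValueError("need at least two samples")
    mean = sum(rates) / len(rates)
    if mean <= 0:
        return False
    return all(abs(r - mean) <= tolerance * mean for r in rates)


# 50 hours elapsed at 12.5% done projects to the ~400 hours first estimated.
print(projected_total_hours(50, 0.125))
```

A steady rate across the whole run points to something that affects every transfer, while a rate that drops in one region points to a localized slow area worth re-running on its own.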
 
If this is the drive in the pictures above, which shows a number of cabling errors, and it is consistently slow in SR, perhaps there are REAL cabling errors (bad contacts or damaged traces). Have the SMART stats changed significantly during the run?
 
This is not that same drive. This is one of the other “iffy” drives that either had moderate errors or were processing extremely slowly in my first pass through the 13 drives, so I exited SpinRite and set it aside for a second pass with different parameters. That is where my question comes from: this drive is processing extremely slowly (original estimate ~400 hours, and several days in, that rate has held steady), yet almost no errors have arisen. I am attaching screenshots of the current status, where you can see:

  • Graphic Status Display: no problems
  • Real-Time Activities: no errors
  • SMART System Monitor: ECC corrected - one red square (but notice that has decreased from three as SpinRite processing has proceeded)
Hence the question: what should be concluded when drive processing is agonizingly slow but turns up no (or few) errors? Does it mean there are no errors, but actual read/write performance will make the drive practically unusable? Or does it mean the drive is fine for normal use, and the SpinRite run was very slow for some reason that won’t matter in normal usage?

Thanks for your guidance!
 

Attachments: IMG_0832.jpeg, IMG_0833.jpeg, IMG_0834.jpeg, IMG_0835.jpeg, IMG_0836.jpeg
This is not that same drive.

  • Graphic Status Display: no problems
  • Real-Time Activities: no errors
  • SMART System Monitor: ECC corrected - one red square (but notice that has decreased from three as SpinRite processing has proceeded)
Hence, the question — what should be concluded when the drive processing is agonizingly slow, but comes up with no (or few) errors?
As you say, no errors are showing, although I do notice that you are only using 4096-byte blocks. That will slow SR down; the AHCI driver should be able to process 32k blocks, which should be 8 times faster.
 
My original run on all of my hard drives used 32k blocks. After that first pass, I had 7 good drives, 3 bad drives, and 3 “iffy” drives which either had moderate errors or were processing so slowly they were either bad or needed different parameters. So taking the guidance of another (@DanR) earlier in this thread, I switched to 4k blocks for a second run only on the 3 “iffy” drives.

This is a good opportunity to clarify what that 32k -> 4k change actually does. My understanding is that it determines the size of the blocks SpinRite performs read-write I/O with. Larger blocks mean faster drive processing, but issues are also reported against that larger amount of drive space. However, if there are errors or slow I/O (I don’t know all the myriad contributing reasons) within the area a larger block addresses, switching to a smaller block size can help isolate the problem, perhaps revealing that only a smaller area is actually having problems.

So for example, let’s say I use 32k block size, and am returned errors. If I reprocess that 32k area of the disk using a 4k block size, that addresses that 32k block disk area in 8 different 4k blocks, and SpinRite may now be able to determine that, for example, the error exists only in 4k block 5, not the entire 32k block area. I would assume this would favorably change the final statistics on the drive as a whole, and ideally show the drive to have issues on a lesser area of the drive. What it also does is further isolate issues to smaller areas and more specific locations on the drive, making it possible to focus just on those areas for additional SpinRite runs if desired.

Have I got that right? (If not, please someone set me straight! :) )

Anyway, returning to the drive in question: on this second run using 4k blocks, there doesn’t seem to be any significant difference in processing speed (though I wasn’t directly comparing 32k vs 4k estimated times). Both were just extremely slow, so I’m letting this current 4k run complete, if it can. But the lack of errors has got me thinking about the proper conclusion. What happens if this thing finally completes with no (or very few) errors, but the drive processing was agonizingly slow (which, if stopped right now, would be the case)? Is that drive good to put back into use? Or is it bad, and should it be destroyed with the other dead drives? Part of what I’m asking is whether there’s a scenario where the drive is otherwise fine and will perform fine during normal use, but something about the nature of SpinRite operations just manifests as extremely slow processing on some drives without necessarily meaning drive dysfunction. In other words: dog-slow SpinRite processing, but the drive is still good. Can anyone speak to this?
 
what that 32k -> 4k change actually does.
Your understanding may be a little off. The request size is always a multiple of 512 bytes, as that is the size of an LBA. So really what is changing is the number of LBAs that are being requested from the drive at once. Assuming all of them succeed, this just means SpinRite can go faster because the overhead per LBA is lower. (The drive's internal code presumably handles optimizations like knowing it needs more sequential blocks, so it arranges to optimize the reading and communicating with the PC.)
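The "lower overhead per LBA" point can be put into rough numbers. The sketch below is a toy cost model only, not a measurement of any real drive or of SpinRite: the function name, the fixed per-request overhead, and the per-LBA transfer time are all assumed values for illustration.

```python
import math


def scan_seconds(total_lbas: int, lbas_per_request: int,
                 overhead_s: float, seconds_per_lba: float) -> float:
    """Estimate scan time as (requests * fixed overhead) + raw transfer time.

    The raw transfer time is the same whatever the request size; only the
    number of requests, and so the total overhead, changes.
    """
    if lbas_per_request <= 0:
        raise ValueError("lbas_per_request must be positive")
    requests = math.ceil(total_lbas / lbas_per_request)
    return requests * overhead_s + total_lbas * seconds_per_lba
```

In this model, cutting the request size by a factor of 8 multiplies only the overhead term by 8. If per-request overhead is small next to transfer time, the overall slowdown is far less than 8 times, which would fit a drive that is about equally slow at both sizes.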

However, once a problem occurs, SpinRite has to abandon all of the big block work and get down to work on that specific LBA. It's going to keep retrying it. At this point, it's all up to the drive and its internal processes, and speed is no longer the main concern. So, while it may affect drive behaviour, because many things can, and we don't know what the firmware is coded like, changing the size of reads shouldn't affect drive reliability... but during my beta testing of SpinRite, I found a drive that did vary its behaviour based on block size, and I think that was the genesis of Steve providing the command line option to override SpinRite's automatic logic.
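The "big requests until something fails, then work on the specific LBA" idea can be illustrated with a small simulation. This is a toy model written from the description above, not SpinRite's actual algorithm: the drive is simulated as a set of bad LBA numbers, and `scan`, `max_retries`, and the request counting are assumptions made for the example.

```python
def scan(total_lbas, lbas_per_request, bad_lbas, max_retries=3):
    """Simulate a scan; return (requests_issued, unreadable_lbas)."""
    bad = set(bad_lbas)
    requests = 0
    unreadable = []
    for start in range(0, total_lbas, lbas_per_request):
        chunk = range(start, min(start + lbas_per_request, total_lbas))
        requests += 1  # one large request covering the whole chunk
        if not any(lba in bad for lba in chunk):
            continue  # the whole request succeeded, move on
        # The large request failed: fall back to one LBA at a time.
        for lba in chunk:
            for _attempt in range(max_retries):
                requests += 1
                if lba not in bad:
                    break  # this LBA read fine
            else:
                unreadable.append(lba)  # every retry failed
    return requests, unreadable
```

In this model the request size changes how many requests are issued, but the same LBA ends up flagged either way, because the problem is always narrowed down to individual LBAs once a large request fails.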
 
dog-slow SpinRite processing, but the drive is still good
Some questions:

Is the current 4K run the first pass Level 3 run? That is, you have not re-started SpinRite from the beginning for a second pass? And the Graphic Status Display (GSD) screen shows no blocks with R's, U's, or B's?

If the drive has not been used for a long time, the bit patterns on the platter surfaces will weaken over time and become progressively harder to read. SpinRite can be very patient and persistent in trying to read the sectors, taking lots of time if necessary.

A clean GSD screen suggests that the blocks were read successfully (no U's), DynaStat likely not needed (no R's), and the data was rewritten successfully (no B's), refreshing the bit patterns.

A second pass Level 2 run should then proceed at normal speed on a now refreshed, now good drive. If the second pass is still dog-slow, however, then the drive is bad.
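The rule of thumb above can be written down as a small decision helper. This is only a sketch of the reasoning in this post: the function name, the notion of a "normal" run time for a comparable healthy drive, and the `slow_factor` threshold are all assumptions that would need to be chosen by the person testing.

```python
def second_pass_verdict(gsd_clean: bool, second_pass_hours: float,
                        normal_hours: float, slow_factor: float = 3.0) -> str:
    """Classify a drive after a second (Level 2) pass.

    normal_hours: assumed run time for a comparable healthy drive.
    slow_factor:  assumed multiple of normal_hours that counts as dog-slow.
    """
    if not gsd_clean:
        return "check errors first"  # R, U, or B blocks need attention first
    if second_pass_hours <= slow_factor * normal_hours:
        return "likely good"  # the refresh brought speed back to normal
    return "likely bad"  # still dog-slow after a full rewrite
```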
 
Thanks for the great reply! That is really interesting. If the XFER block size parameter was added to address behavior that varied with block size, then I suppose the actual net effect of that parameter (beyond the number of LBAs requested) lies in what that varied behavior was. Would you mind expounding on that a little more? I’m really curious as to what it was. From a career in software development, my blind guess would be something to do with I/O buffering optimizations and possibly cache management. I don’t know the internals, but those things absolutely can cause behavioral variances in systems. Fascinating stuff; if you can share more, I would really welcome it.
 
Thanks for the reply! Yeah, the most recent screenshots I posted apply. Every run I have done has been Level 3. The first run on all the drives restarted SpinRite fresh for each drive using this command:

SPINRITE NORAMTEST LEVEL 3 DYNASTAT 0 NOREWRITE

This second run on just the 3 “iffy” drives is using this command:

SPINRITE NORAMTEST LEVEL 3 DYNASTAT 0 NOREWRITE XFER 4096

On this current drive in question, the one currently running (the same one shown in the most recent screenshots I posted), the GSD screen is totally clean. Also, the drive has not been used for a long time (years). The run is not even halfway done yet; it will take several more days to complete. But I gather from your post above that, if the GSD screen is still clean once the run completes, then running SpinRite again on the drive (using the first command above) should be much faster (hours for the entire drive), and the drive should be good. Is that right?
 
NOREWRITE is intended for the situation where data recovery is paramount. Only a 100% successfully read sector will be rewritten. Partially read sectors, with read errors, will NOT be rewritten, thus preserving the data. If data is of no concern here then NOREWRITE is pointless.

DYNASTAT 0 does no data recovery (DynaStat is disabled). Just one normal read is done. If the read is successful the sector is rewritten. Partially read sectors would be rewritten with zeros for the unreadable data, thus losing data.

Level 3 rewrites every sector.

I do not understand why either of the above command lines would be so slow, unless the drive is inherently slow. If Level 3 is not speeding up the drive then the drive is suspect.

I would suggest starting a normal scan at level 2 on this drive. No need for level 3 to rewrite everything yet again. No need for DynaStat 0 since the entire drive has been successfully rewritten. What speed does an unfettered level 2 run at?
 
@DanR, thanks for a great reply. Good to know about NOREWRITE. Somewhere along the line I gathered that the option was part of eliminating data recovery when it isn't necessary. I had also gathered from some comments, and it appears I misunderstood, that the way to do a complete refresh of drives which require no data recovery (I have zero need for any data recovery on any drive) was to do a Level 3 with DynaStat 0. It appears I was mistaken.

Considering what I am trying to accomplish, looking at the level explanations in the SpinRite FAQ, and the comments above, is a Level 2 what I need, or should I do a Level 1? Given the FAQ’s description of Level 1 essentially being a Level 2 but without data recovery, that sounds like what I need. @DanR can you confirm? If that is indeed the case, then that is worthy of killing this current SpinRite run that’s been going on for days.

I look forward to your response!