WD Red 3TB HDD hangs on a bad section of the disk

tahoebill

Feb 27, 2023
I have a 3TB WD Red HDD that was recently marked as "degraded" by my Synology NAS. I placed the drive in an eSATA dock, formatted it in Windows, and found that it passed "CHKDSK /x /f /r". When I tried to test the drive in SpinRite 6.1 using levels 1 through 3, it hung in a loop trying to read a bad area of the disk, constantly switching between 16- and 256-sector reads.
 
I wonder if SpinRite depends on the drive generating a timeout on unsuccessful reads. Normal desktop drives generate read timeouts, whereas NAS drives do not.
I have no idea what this means.

A drive either responds within a certain time or it does not. If it takes the drive longer to respond than we (software, an OS, etc.) are willing to wait, we say the read or write request "timed out". That does not mean the drive would not eventually return a response.

So if a drive "timed out", it means it did not respond within the time we were willing to wait. Given more time, the drive might eventually respond (with the requested data or an error), or it may never respond at all.
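To make the distinction concrete, here is a minimal sketch (not anyone's real driver code) in which the "drive" is a worker thread whose `delay_s` models how long it spends on error recovery before answering. The names `read_sector` and `read_with_deadline` are hypothetical; the point is that a timeout only means the *host* stopped waiting:

```python
import concurrent.futures
import time

def read_sector(delay_s: float) -> bytes:
    """Stand-in for a drive read; delay_s models how long the drive
    grinds away at error recovery before it finally answers."""
    time.sleep(delay_s)
    return b"\x00" * 512

def read_with_deadline(delay_s: float, timeout_s: float) -> str:
    """Issue the read and wait at most timeout_s for a reply.
    Declaring a timeout only means WE gave up waiting; the worker
    (the drive) may still finish and answer later."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(read_sector, delay_s)
        try:
            future.result(timeout=timeout_s)
            return "data"
        except concurrent.futures.TimeoutError:
            return "timed out"

print(read_with_deadline(delay_s=0.01, timeout_s=1.0))  # data
print(read_with_deadline(delay_s=0.5, timeout_s=0.1))   # timed out
```

Note that in the second call the worker does eventually complete its read; the caller has simply already walked away, which is exactly the situation described above.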

RAID controllers may be less patient than 'normal' PCs. That is why drives intended for use in a NAS or RAID array may respond more quickly, spending less time on error recovery before returning either the data or an error, so they do not trip the NAS's shorter timeout.

Anyway, back to SpinRite: SpinRite does nothing special when it's simply reading sectors from a drive. In other words, it sends a read command, gives the drive time to respond (I forget how long exactly), and if the drive does not respond within that window it considers the request timed out. But again, this time-out window is arbitrary. Say SpinRite waits 5 seconds while chkdsk /r, which uses Windows I/O, waits 20 seconds; if the drive takes 10 seconds to respond, chkdsk may notice no issues while SpinRite 'times out'.
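The arithmetic above boils down to which side gives up first. A toy model (not SpinRite's code; the 5 s, 20 s and 10 s figures are the hypothetical ones from the paragraph above):

```python
def host_observes(drive_response_s: float, host_timeout_s: float) -> str:
    """Return what the host sees for a drive that answers after
    drive_response_s seconds when the host waits at most host_timeout_s."""
    if drive_response_s <= host_timeout_s:
        return "response (data or error)"
    return "timeout"

drive_takes = 10.0  # seconds the drive spends on error recovery

print(host_observes(drive_takes, 5.0))   # SpinRite-like 5 s window -> timeout
print(host_observes(drive_takes, 20.0))  # chkdsk-like 20 s window -> response (data or error)
```

The same drive, with the same problem, thus looks "fine" to one tool and "broken" to another purely because of the waiting policy.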

The question then becomes: what is an acceptable time for a drive to respond with either the requested data or an error? If you rely on SpinRite to make that call, then the drive under test appears to have some kind of problem. It may be wise to look at the SMART data.
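When checking SMART data, two standard attributes are the usual first stop for surface trouble: Reallocated_Sector_Ct and Current_Pending_Sector. Below is a minimal sketch of pulling their raw values out of a `smartctl -A`-style attribute table; the SAMPLE text is made up for illustration, though the attribute names and column layout follow the usual smartctl output:

```python
# Made-up two-row excerpt in the usual `smartctl -A` column layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   140    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
"""

def raw_values(table: str) -> dict:
    """Map attribute name -> raw value (last column) for each table row."""
    out = {}
    for line in table.splitlines():
        parts = line.split()
        # A data row starts with a numeric attribute ID and has 10 columns.
        if len(parts) >= 10 and parts[0].isdigit():
            out[parts[1]] = int(parts[-1])
    return out

vals = raw_values(SAMPLE)
print(vals)  # {'Reallocated_Sector_Ct': 12, 'Current_Pending_Sector': 8}
if vals.get("Current_Pending_Sector", 0) > 0:
    print("sectors are waiting to be reallocated - surface trouble is likely")
```

A non-zero Current_Pending_Sector raw value in particular would fit the behavior described in this thread: sectors the drive cannot currently read but has not yet given up on.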
 
I think if you watch SpinRite's “Real Time Activities” screen you'll find that SpinRite is slowly moving forward rather than looping. If the drive has a "region" of trouble, SpinRite will gradually work its way forward through that region. During SpinRite 6.1's extensive development work we found that asking damaged drives to read large blocks at once was far less useful than asking for only a few. So, when SpinRite encounters a problem it drops down to much smaller read requests while it works past the trouble. After encountering a problem with a sector it will drop to 16-sector transfers (as you saw) in the hope of getting those to succeed. If it can, it will cautiously move up to 256-sector requests for a while, then finally return to its highest-speed 32,768-sector transfers.
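That step-down/step-up behavior can be sketched as a small state machine. This is only an illustration of the strategy described above, not SpinRite's actual code; the `is_bad(lba)` callback is a hypothetical stand-in for an unreadable sector:

```python
SIZES = [16, 256, 32768]  # sectors per request, small -> large

def scan(total_sectors: int, is_bad) -> list:
    """Walk the drive, shrinking the transfer size on trouble and
    cautiously growing it again on success. Returns a log of
    (start_lba, request_size, success) tuples."""
    lba, level, log = 0, len(SIZES) - 1, []
    while lba < total_sectors:
        size = min(SIZES[level], total_sectors - lba)
        block_ok = not any(is_bad(s) for s in range(lba, lba + size))
        log.append((lba, size, block_ok))
        if block_ok:
            lba += size
            level = min(level + 1, len(SIZES) - 1)  # cautiously speed back up
        elif level > 0:
            level = 0                               # trouble: retry at 16 sectors
        else:
            lba += size                             # smallest size still fails: move past it
    return log

# A 64-sector toy drive with one bad sector at LBA 5:
for entry in scan(64, lambda s: s == 5):
    print(entry)
```

Run against the toy drive, the log shows exactly the pattern described: a large read fails, the same spot is retried at 16 sectors, and once past the trouble the request size climbs again, so what looks like "looping" on screen is actually slow forward progress.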

So, @tahoebill, from your description it appears that everything is working correctly. (y)
 
OK, that seems reasonable. I just got frustrated watching it go through over 12,400 iterations with nothing changing on the screen except the read size switching between 16 and 256 sectors. Maybe once a sector has irrecoverable bytes, the whole sector could just be marked as bad, rather than trying to read the rest of the bytes?
 
The trouble is, many users really, really want to get every last bit of data from the drive. We DO allow SpinRite's data-recovery "strength" to be tuned up or down (even down to zero!), but we've kept that as a command-line option since it needs to be used with care!
 
So there is one thing you need to be aware of: Red drives are specially designed for use in a NAS and run different firmware that is supposed to give the NAS more control. If you look at your drive's label, it will probably say something like "NASWare 2.0". Because NASes are focused on data safety, they are usually much more inclined to report a problem as soon as possible, to give you enough time to replace the drive before it fails completely. With a NAS it is always better to replace a failing drive and rebuild your RAID before the drive dies and the RAID degrades or fails, at which point your data is at much greater risk. (You ARE doing backups though, right? RAID is NOT a backup.)