NVME ssd "fault" and resolution

orbital-hamburger

New member
Mar 14, 2024
This NVMe SSD froze halfway through both Level 1 and Level 2 scans. I identified the affected partition, backed up its files (losing one file to corruption), and then wrote zeros over the entire partition using dd. After that, SpinRite scans completed successfully.

Attachments: spinrite_stuck_nvme_1.jpg, spinrite_stuck_nvme_2.jpg, spinrite_stuck_nvme_3.jpg
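
For anyone wanting to picture the "find the unreadable area, then overwrite it with zeros" approach described in this post, here is a minimal Python sketch. It is not the dd command that was actually run and not anything SpinRite does internally; the function names `find_unreadable_chunks` and `zero_fill` and the 1 MiB chunk size are assumptions made for illustration. It assumes a Unix-like system (it uses `os.pread` and `os.pwrite`) and a path you have permission to open.

```python
import os

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def find_unreadable_chunks(path, chunk_size=CHUNK):
    """Return a list of (offset, length) spans whose read raised an I/O error."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file; on Linux the
        # same call also reports the size of a block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                os.pread(fd, length, offset)
            except OSError:
                bad.append((offset, length))
            offset += length
    finally:
        os.close(fd)
    return bad


def zero_fill(path, offset, length, chunk_size=CHUNK):
    """Overwrite `length` bytes at `offset` with zeros. This destroys that data."""
    fd = os.open(path, os.O_WRONLY)
    try:
        written = 0
        while written < length:
            n = min(chunk_size, length - written)
            # pwrite may write fewer bytes than asked, so add what it reports.
            written += os.pwrite(fd, bytes(n), offset + written)
        os.fsync(fd)  # ask the OS to push the zeros out to the device
    finally:
        os.close(fd)
```

Back up everything you can before calling anything like `zero_fill`, exactly as was done here: the overwritten data is gone. Also note that reads may be served from the operating system's cache rather than the drive, so a clean pass from a sketch like this is weaker evidence than a SpinRite scan.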
 
It would seem that this drive was overdue for a refresh. Thus there was a sector that was simply no longer readable. Writing all zeros there "refreshed" the "data" restoring readability. :)

May I ask what version of SpinRite you were using? SpinRite 6.0?
 
@orbital-hamburger: The only thing I would add to what these guys already correctly noted is that SpinRite doesn't yet have native NVMe drivers (v7.0 will). So, if an NVMe drive is available to SpinRite at all, it will be through the BIOS. But since the BIOS only needs to be sufficient to allow a system to boot an OS, its NVMe support may be lacking. It's very "dumb" that the BIOS was unable to recover from a simple read error on that drive... and with each passing day I'm becoming more anxious to get to work on v7.0... But I need to get v6.1 solidly launched before I can let myself start having THAT sort of fun again! (y)
 
Hello all.

I'm seeing a similar issue here, also with an NVMe drive.

Attachment: 1710803371268.png

But here's my issue: I'd like to get the entire device scanned, but SpinRite stops here, citing an inability to continue working with the device. However, simply exiting the application and restarting it lets SpinRite see the device and continue working with it.

So the problem as I see it, is that the application is able to continue but needs to be restarted. It seems to me that whatever the condition causing the state can be cleared, but isn't (attempted). Perhaps this is due to inconsistent system-to-system behavior, where the lowest common denominator is to ask the user to reset the system. It's a shame, though, since stopping and restarting the scan is cumbersome, to say the least.

For completeness the system and device under test are:
motherboard: ASRock X470D4U
cpu: Ryzen 7 5700G
nvme: Samsung 870 EVO nvme 2TB
 
"So the problem as I see it, is that the application is able to continue but needs to be restarted. It seems to me that whatever the condition causing the state can be cleared, but isn't (attempted)."
I agree with you 100%. There should be nothing that exiting and restarting SpinRite does that SpinRite could not do without exiting.
 
@bluesmoke: How is that NVMe drive attached to the machine? It is BIOS drive 82. Is it internal and attached to the machine or plugged into an external USB-to-NVMe adapter? I'm very interested since I've spent an outrageously large amount of time on this specific problem and I believed that it was resolved. There's no question that the BIOS is the culprit, so SRv7 will not have any of this mess... but you're also correct, as I noted before, that there's no reason why SpinRite should not be able to "clear" the trouble.

Also... before this occurs, are you seeing a two-minute countdown, in yellow, in the upper-left corner of the screen?

Thanks!!
 
The problem area on the drive was a region that had not been read for a long time. The version of SpinRite used was the second final release of 6.1, the one that fixed ISO bootable media creation. Going forward, a regular refresh L1 or L2 scan hopefully will maintain SSD freshness. Thanks, Steve (and testers), for all your hard work.
 
"Going forward, a regular refresh L1 or L2 scan hopefully will maintain SSD freshness."
Um, . . . sorry, no, it will not.

Level 1 is read only - no refreshing occurs.
Level 2 will only refresh those areas that are hard enough to read to trigger data recovery and a subsequent rewrite of the recovered data.
Judicious use of Level 3/4 would be required to refresh all of the media and maintain "freshness".
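
To make the distinction between the levels concrete, here is a toy Python model of which sectors end up rewritten (and so "refreshed") at each level. It encodes only the three statements above and is in no way SpinRite's actual logic; the function name `rewritten_sectors` and the `needed_recovery` flags are hypothetical.

```python
def rewritten_sectors(level, needed_recovery):
    """Return the indices of sectors that get rewritten at a given level.

    needed_recovery[i] is True when sector i was hard enough to read that
    data recovery kicked in (a made-up flag for this illustration).
    """
    if level == 1:
        # Read only: nothing is ever written back.
        return []
    if level == 2:
        # Only sectors that triggered recovery are rewritten.
        return [i for i, hard in enumerate(needed_recovery) if hard]
    if level in (3, 4):
        # Every sector is rewritten, whether or not it read cleanly.
        return list(range(len(needed_recovery)))
    raise ValueError("this sketch only models levels 1 through 4")


flags = [False, True, False, False]   # one troublesome sector out of four
print(rewritten_sectors(1, flags))    # []
print(rewritten_sectors(2, flags))    # [1]
print(rewritten_sectors(3, flags))    # [0, 1, 2, 3]
```

On a drive where every sector still reads without trouble, Levels 1 and 2 both come back with an empty list in this model, which is the point being made: neither one refreshes healthy-looking media.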
 
"Level 1 is read only - no refreshing occurs."
This depends on your goals and the drive's firmware. Since some flash drives can, at least in theory, suffer from "read disturb", it seems plausible that forcing a flash device to read its data can cause the controller to realize the media needs work and do it all on its own. If it were my hardware, I would start with a Level 1 to make sure there are no read problems that show up as errors. I would also watch the speed test results to see if anything measurable was improving.

Most drives don't slow down enough for you to notice any need for them to be faster. If the drive does slow down in your estimation (rather than theoretically, from watching the speed tests), THEN (and only then) would I consider whether a full rewrite (a Level 4) was of any value. Even then, I would be selective about which sections of the drive I exercised. A flash drive works better when it is not close to full, so one presumes a large section of the drive would be unoccupied anyway.

Hopefully SpinRite 7 will automate more of this for you in the future.
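
As a rough illustration of "watching the speed" per section of a drive, the following Python sketch times sequential reads over equal slices of a file or device and reports MB/s for each slice. It is not SpinRite's speed test; the name `region_read_speeds`, the slice count, and the chunk size are all assumptions, and it needs a Unix-like system for `os.pread`.

```python
import os
import time


def region_read_speeds(path, regions=8, chunk_size=1024 * 1024):
    """Return the measured read speed in MB/s for each equal slice of `path`."""
    speeds = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        region_len = size // regions
        for r in range(regions):
            start = r * region_len
            # The last slice absorbs any remainder so the whole span is covered.
            end = size if r == regions - 1 else start + region_len
            offset = start
            t0 = time.perf_counter()
            while offset < end:
                data = os.pread(fd, min(chunk_size, end - offset), offset)
                if not data:
                    break
                offset += len(data)
            elapsed = time.perf_counter() - t0
            mb_read = (offset - start) / 1e6
            speeds.append(mb_read / elapsed if elapsed > 0 else float("inf"))
    finally:
        os.close(fd)
    return speeds
```

A slice that reads far slower than its neighbors would be a candidate for the selective rewrite described above. Keep in mind that the operating system's cache can inflate these numbers on a second run, so treat the results as a hint rather than a measurement.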
 
Argh. I apologize for missing this earlier.

To answer your question, it is connected directly to an NVMe slot on the motherboard, not USB.

I am also seeing the countdown before it throws the error screen.

Please let me know if I can provide additional details.
 
@bluesmoke : This is as frustrating for me as it is for you, since I would love nothing more than to be able to clear that trouble and proceed. And I agree that since exiting and restarting SpinRite clears the trouble, SpinRite ought to be able to do the same without exiting. But it is already issuing drive resets and waiting for the drive to come back up (that's the countdown timer that's shown). So, without having the machine in front of me I'm unable to figure out what more I/SpinRite can do.

It won't help you today, but one of the features of SR7 will be remote Internet debugging so I'll be able to remotely see and fix what's going on with such mystery machines.
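
The reset-and-wait behavior described above (issue a drive reset, then count down while waiting for the drive to come back) can be pictured with a short Python sketch. This is not SpinRite's code. `reset_drive` and `drive_ready` are hypothetical callables standing in for whatever the real hardware access would be, and the 120-second default simply mirrors the two-minute countdown mentioned in this thread.

```python
import time


def reset_and_wait(reset_drive, drive_ready, timeout_s=120.0, poll_s=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Reset the drive, then poll until it responds or the timeout expires.

    Returns True if the drive came back in time, False otherwise.
    `clock` and `sleep` are injectable so the loop can be exercised
    without really waiting two minutes.
    """
    reset_drive()
    deadline = clock() + timeout_s
    while clock() < deadline:
        if drive_ready():
            return True
        sleep(poll_s)  # one tick of the "countdown"
    return False
```

In the situation reported here, a loop of this shape would return False: the countdown runs out and the error screen appears, even though a fresh start of the program can talk to the drive again.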
 
OK, I understand. Also, my frustration is not with SR but with my own poor response time, since you clearly responded weeks ago and I failed to reciprocate.

SR7 makes sense since you've been attempting to bring SR6 to a close. Also, I know you have a lot of hardware already, but I'm local and can arrange to get this system to you if you are interested.
 
While that IS tempting, I think I need to keep moving forward. It's with a huge sigh of relief that the third release is out now and at some point I really need to decide that it's good enough. SR7 will be adding both native USB and NVMe support and SpinRite 6.1 is not really "supposed" to support NVMe yet. It's difficult for me to not keep doing everything I can, but I need to decide that for this free upgrade to v6.1, it's enough for the time being.