SSD has taken itself offline


Salient Henry

New member
Nov 27, 2023
I have a 120GB SSD and wanted to equalize the read speeds across the drive. I ran a Level 3 scan and it failed with the "drive has taken itself offline" error. Is the drive bad? Should I replace it? The read speed has also decreased after running the Level 3 scan. After I power off the machine and unplug the drive I can restart SpinRite, but I receive the error again at the same location on the drive.
 

Attachments

  • after level 3.jpg (179.1 KB)
  • before level 3 scan.jpg (290.1 KB)
  • s drive offline error.jpg (181 KB)
Hello @Salient Henry. At this point I believe that this is a remaining loose end for SpinRite's RC5. Something did go bump in that drive. But I believe that SpinRite should be better at recovering from that event. Resolving this is at the top of my task list once I get to work on SpinRite's RC6. You should just note the percentage or sector number then restart SpinRite yourself from a bit earlier (to show that the issue was resolved). And it sure does appear that your drive will benefit! Please feel free to post your after-Level3 benchmark results if you choose! (y)
 
These older SSDs quite frequently degraded in speed and would quickly come back to life with a 10-second secure erase (at the expense of any files on the drive that aren't backed up in advance).

As they degrade pretty quickly and are of little to no value, I'd replace it with something more reliable.
 
What you're saying, @lcoughey, fits the model of read fatigue: A 10-second "secure" erase would not have time to perform a full physical zeroing of the SSD NAND media, right? It would essentially be performing a mass TRIM to release all of the LBA-to-physical media mapping. In that case, any subsequent "reads" would be virtual, returning (probably) all 0's for LBA space that has never been written and mapped, and any writes would succeed by re-writing and thus refreshing the NAND. At that point, such freshly re-written NAND would, indeed, read quickly and reliably since it would have been recently written. :)
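To make the "mass TRIM" idea concrete, here is a toy sketch in Python of a translation layer that only forgets its LBA-to-physical mapping. Everything in it (the `ToyFTL` class, its method names, the 512-byte sector size, and the all-zeros result for unmapped reads) is an assumption made for illustration; it is not how any particular drive's firmware works.

```python
# Toy model of a flash translation layer (FTL). Purely illustrative:
# the class and method names are hypothetical, not any vendor's firmware.
SECTOR_BYTES = 512  # assumed sector size


class ToyFTL:
    def __init__(self, num_lbas):
        self.num_lbas = num_lbas
        self.mapping = {}  # LBA -> index of the physical page holding its data
        self.nand = []     # physical pages, appended to on every write

    def _check(self, lba):
        if not 0 <= lba < self.num_lbas:
            raise IndexError(f"LBA {lba} out of range")

    def write(self, lba, data):
        # Each write lands on a fresh physical page; the old page goes stale.
        self._check(lba)
        self.nand.append(data)
        self.mapping[lba] = len(self.nand) - 1

    def read(self, lba):
        # An unmapped LBA is a "virtual" read: no NAND is touched and
        # zeros are returned (assumed behavior for this sketch).
        self._check(lba)
        page = self.mapping.get(lba)
        if page is None:
            return b"\x00" * SECTOR_BYTES
        return self.nand[page]

    def mass_trim(self):
        # Forget the mapping only. The stale pages are still in self.nand.
        self.mapping.clear()


ftl = ToyFTL(num_lbas=8)
ftl.write(3, b"A" * SECTOR_BYTES)
ftl.mass_trim()

zeros_after_trim = ftl.read(3) == b"\x00" * SECTOR_BYTES
stale_data_still_on_nand = (b"A" * SECTOR_BYTES) in ftl.nand

ftl.write(3, b"B" * SECTOR_BYTES)  # a re-write maps the LBA to fresh NAND
fresh_read = ftl.read(3)
```

In this model the erase is instant because only the mapping is dropped, reads through the interface see zeros, and the old contents remain on the physical pages until something actually overwrites or erases them.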
 
GOOD!!!
Actually, my understanding is that a secure erase issues a command that results in the physical reset of all the blocks in parallel, as per the following.
I was going to add that simply wiping the FTL data would not be truly secure so I'm glad to learn that "Secure Erase" is really that. How certain are you of the timing? That's what caused me to wonder about how "secure" it could be given only 10 seconds. Perhaps Colby has some real-world feedback from his explorations?
 
I just had a chat with Roman, the SSD and flash guru, among other things, at ACE Labs and he has corrected me.

@Steve wins this point.

1701267052690.png
 
I appreciate the closure on this, @lcoughey. That makes the most sense given what we know of NAND writing. And it's going to be very useful in the future to recognize that the meaning of "Secure Erase" needs to be tempered with the caveat "secure from the outside", meaning secure from someone who is only able to access the memory through the FTL (what your NAND guru friend refers to as the Translator)... but not secure from someone who's willing to go to a lot more work. As he notes, that "internal" level of security will require much more time to achieve. Thanks, again.
 
Yes. Without the translator, the data on the NAND is essentially useless anyway. Bypassing with a direct read would only result in scrambled encrypted sectors. So, pretty much impossible to recover, but easily done in Hollywood.
 
Ah, yes... that's certainly a good point, too. Depending upon the mapping block size, the physical media would be a jumble of those fragmented blocks without any clear relationship to their original mapping.
 
I agree, Joep. I suspect that the only way to really assure users who want 100% assurance that an SSD is truly wiped will be to deliberately overwrite the entire accessible LBA data space -- acknowledging that this might not reach regions that have been mapped out due to trouble or wear leveling. Then follow that up with whatever best secure erasure API the device might have. Short of using undocumented manufacturer-specific commands, that would appear to provide the best possible assurance. And, of course, if someone did not want to invest that much time then they could settle for only using the fast secure erase option.
 
@Salient Henry : I was just writing elsewhere (answering a rhetorical question about the feasibility of running SpinRite backwards from the back to the front of a drive) and I recalled that another of the differences between the lower and the higher SpinRite levels is that Levels 1 and 2 deliberately use much shorter block transfers (1024 sectors) since they are "forward only" modes that do not always return to the front of the block for a write or re-read. But Levels 3 and above, which do continuously return to the beginning of each block, deliberately use SpinRite's much longer 32,768-sector (16MB) transfers because that allows SpinRite to proceed MUCH faster on "good" drives.

The reason Levels 1 & 2 use shorter blocks is that during SpinRite's development we encountered damaged drives whose firmware appeared to not deal well with such large transfer requests when in the presence of any drive trouble. I believe that this was always with “spinners” and I don't recall this happening with SSDs, though that may have been because we didn't encounter any SSDs that had trouble.

With that bit of background... I'd LOVE to have you try starting SpinRite and adding xfer 128 to the command line. Run SpinRite at Level 3 and over the end of that drive where it's been dying... and let's see whether reducing the block transfer length allows SpinRite and the drive to get past that “sore spot.” (y)
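For a sense of scale, the transfer lengths mentioned above can be converted to bytes with a few lines of Python. This is only arithmetic, and it rests on two assumptions: sectors are 512 bytes (which matches 32,768 sectors being 16MB), and the number given to xfer is a count of sectors.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors (32,768 sectors = 16 MiB)


def transfer_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Size in bytes of a single transfer of `sectors` sectors."""
    return sectors * sector_bytes


# Labels are descriptive only; "xfer 128" is assumed to mean 128 sectors.
transfers = [
    ("Levels 1 and 2", 1024),
    ("Levels 3 and above", 32768),
    ("xfer 128", 128),
]

for label, sectors in transfers:
    kib = transfer_bytes(sectors) // 1024
    print(f"{label}: {sectors} sectors = {kib} KiB")
```

Under those assumptions the default Level 3 transfer is 256 times larger than the xfer 128 transfer, so a single request touches far less of the drive at once.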
 
@Salient Henry : Back to the reason for your original posting...
I've just finished the work to make SpinRite much more "patient" with drives that appear to be taking more than 10 seconds to come back online. With the latest release (pre-release 5.01) SpinRite will now wait up to 60 seconds following a drive reset after an error before it gives up on a drive. And during that waiting it will display a countdown so that the user knows what's going on. I will be very interested in learning whether this works with that SSD you have. Thanks!

(More information is here: https://forums.grc.com/threads/pre-release-5-01.1415/)
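The "wait up to 60 seconds with a countdown" behavior described above can be sketched as a simple polling loop. This is not SpinRite's code, just a minimal Python illustration of the idea; `wait_for_drive`, `is_ready`, and the one-second polling interval are all hypothetical names and values chosen for the example.

```python
import time


def wait_for_drive(is_ready, timeout_s=60, poll_s=1.0,
                   sleep=time.sleep, report=print):
    """Poll is_ready() for up to timeout_s seconds, reporting a countdown.

    is_ready is a hypothetical callable that returns True once the drive
    responds again. Returns True if the drive came back, False otherwise.
    """
    remaining = timeout_s
    while remaining > 0:
        if is_ready():
            return True
        report(f"Waiting for drive: {remaining:.0f}s remaining")
        sleep(poll_s)
        remaining -= poll_s
    return is_ready()  # one last check once the time has run out


# Usage with a simulated drive that answers on its fourth poll.
# sleep is replaced with a no-op so the example finishes immediately.
polls = {"count": 0}


def simulated_drive_ready():
    polls["count"] += 1
    return polls["count"] > 3


came_back = wait_for_drive(simulated_drive_ready, sleep=lambda seconds: None)
print("Drive back online" if came_back else "Gave up on drive")
```

Passing `sleep` and `report` in as parameters keeps the loop easy to try without real hardware or a real 60-second wait.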