SSD without an L3 improvement



DarkwinX

Thought I'd open up the discussion on an SSD I have which doesn't improve its read performance after an L3 or even L5 scan.

Attached are some before and after shots. The after is actually after an L3, L5, and an L3 with a forced xfer of 32MB in case the cache was getting in the way.

At this point could it be just degraded cells or bad firmware?
 

Attachments

  • after (Medium).jpg (321.8 KB)
  • before (Medium).jpg (265.7 KB)
Have you tried doing a "secure erase"? On an SSD this only takes a few seconds, and supposedly it can reset performance back to default in some situations.
 
Hmm, I wonder if it will truly give an increase. AFAIK what will happen is that the underlying MLC will get purged of data, but the mapping table will also get reset.

So until new data is written it will probably seem to operate at max speed due to being unmapped, and then revert.

Will give it a shot anyway and see.
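The "fast until remapped" behavior described above can be sketched with a toy FTL model. This is purely illustrative; the cost numbers and structure are invented, not real firmware:

```python
# Toy model of an SSD's Flash Translation Layer (FTL). Illustrates why a
# drive benchmarks at "full speed" right after its mapping table is reset:
# unmapped LBAs never touch NAND at all.

NAND_READ_COST = 50   # hypothetical microseconds for a mapped read
UNMAPPED_COST = 1     # hypothetical cost to synthesize zeros for an unmapped LBA

class ToyFTL:
    def __init__(self):
        self.mapping = {}          # LBA -> physical NAND page

    def write(self, lba, page):
        self.mapping[lba] = page   # writing (re)maps the LBA to NAND

    def read_cost(self, lba):
        # An unmapped LBA needs no NAND access: the controller just
        # returns zeros, which is far faster than a real read.
        return NAND_READ_COST if lba in self.mapping else UNMAPPED_COST

ftl = ToyFTL()
print(ftl.read_cost(100))   # 1: unmapped, no NAND read happens
ftl.write(100, page=7)
print(ftl.read_cost(100))   # 50: mapped, pays the real NAND read cost
```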
 
Hmm I wonder if it will truly give an increase. AFAIK what will happen is that the underlying MLC will get purged of data but the mapping table will also get reset.
We were discussing this recently. It's certainly possible for an "insecure erase" to simply wipe the mapping table, what's known as the FTL (the Flash Translation Layer). This would be no different than TRIMming the entire drive surface. And since the underlying data would not have been immediately wiped, it's not secure. But we're told that a "secure erase" will not only do that, it will also proactively wipe the entire drive's underlying data.

So, it would be quite interesting to learn whether a secure erase is able to do for an SSD what an L3 or L5 cannot.

Remember, though, that the drive will APPEAR to be screaming immediately following a secure erase, because no real reading will be occurring. So you'll need to run an L3 pass afterward to remap the logical addresses to physical media.
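The distinction drawn above, between wiping only the FTL and also wiping the NAND, can be sketched in a few lines. This is a toy model under assumed behavior, not any real controller's logic:

```python
# Sketch contrasting an "insecure" erase (TRIM / FTL wipe only) with a
# secure erase (FTL wipe plus erasure of the underlying NAND).

class ToySSD:
    def __init__(self, pages=4):
        self.nand = [b""] * pages     # physical NAND pages
        self.ftl = {}                 # LBA -> NAND page index

    def write(self, lba, data):
        self.ftl[lba] = lba           # trivial 1:1 placement for the sketch
        self.nand[lba] = data

    def trim_all(self):
        # "Insecure erase": only the mapping is dropped; the old data
        # still sits in NAND and could be recovered at the chip level.
        self.ftl.clear()

    def secure_erase(self):
        # Secure erase: drop the mapping AND wipe every NAND page.
        self.ftl.clear()
        self.nand = [b""] * len(self.nand)

ssd = ToySSD()
ssd.write(0, b"secret")
ssd.trim_all()
print(ssd.nand[0])     # b'secret': data survives a bare FTL wipe
ssd.secure_erase()
print(ssd.nand[0])     # b'': gone after a secure erase
```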
 
So I completed a secure erase on the drive, ran the benchmark and, as expected, saw maximum read results due to the erased FTL.

I then completed a Level 3 and ran the benchmark again, and the front-of-drive rate appears to be back near maximum speed.

If we assume that SpinRite was working correctly in the original test, then the benchmark results were indicative of some other issue with the NVM.

Would erasing the FTL (and the underlying data) result in the suspected bad NVM being randomly spread across the new FTL mapping? If so, what was the "front of the drive" could now be spread across the entire drive, with the read-performance impact averaged across the whole drive as well?
 

Attachments

  • 1. after secure erase.jpg (212 KB)
  • 2. after level 3.jpg (201.8 KB)
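The "spreading" hypothesis in the post above can be illustrated with a toy simulation (all numbers are hypothetical): if degraded, slow blocks happened to back the front of the LBA space, a fresh mapping would scatter them and average out the penalty:

```python
import random

# Toy illustration: slow blocks concentrated at the front of the LBA
# space before a secure erase, scattered randomly afterward.
random.seed(1)
BLOCKS = 1000
FAST, SLOW = 500, 300                  # pretend per-block read speeds, MB/s

# Before: the degraded (slow) blocks back the first 10% of LBAs.
speeds = [SLOW] * 100 + [FAST] * 900
front_before = sum(speeds[:100]) / 100

# After a secure erase, assume the controller hands out physical blocks
# in a fresh, effectively random order.
remapped = speeds[:]
random.shuffle(remapped)
front_after = sum(remapped[:100]) / 100

print(front_before)             # 300.0: the front of the drive reads slow
print(round(front_after))       # near the drive-wide average instead
print(sum(remapped) / BLOCKS)   # 480.0: the overall average is unchanged
```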
Of course there will be a difference between a secure erase / TRIM and some refresh operation. The difference is that after a secure erase / TRIM, NO DATA is read from NAND; after a 'refresh', data still needs to be read from NAND. They are physically different processes.

If there's no improvement after a refresh, it simply means there's nothing to improve. Reading actual data will always take time. A refresh ensures the data was written to physically different NAND, so read conditions are optimal. If OTOH reads crawl after a refresh, THEN there may be reason for concern. The simple fact that reading from a mapped LBA is slower than reading from an unmapped LBA is, in itself, to be expected.
 
Would erasing the FTL (and the underlying data) result in the suspected bad NVM to be randomly spread across the new FTL mapping? Therefore what was the "front of the drive" could be spread across the entire drive now and the read performance impact averaged across the drive as well?

How did you determine this SSD is 'bad' in the first place?

There's no reason for an SSD to randomly distribute data all over the place. Could it have been written to different physical addresses than before? Yes, sure.
 
If OTOH reads crawl after a refresh, THEN there may be reason for concern.
That's what my first post is saying. Using SpinRite to refresh (L3) gave no improvement to the front of the drive; it hovered around 300 MB/s.

It took a secure erase and then a refresh (L3) to show a full improvement (from 300-ish MB/s to 500-ish MB/s).

Since we've seen time and again that an L3 always brings an SSD much closer to the link speed, I'm inclined to think it was an issue with the underlying NAND.
 
[Attached image: 1703949031026.png]


This drive is over a decade old. I don't know how much use it saw, but NAND wears, and no amount of refreshing will counter that. Oh, and by "crawl" I mean that quite literally.
 
Would erasing the FTL (and the underlying data) result in the suspected bad NVM to be randomly spread across the new FTL mapping? Therefore what was the "front of the drive" could be spread across the entire drive now and the read performance impact averaged across the drive as well?

I don't have an answer, but I have seen similar results in the past with some SSDs.

Was the SSD used on an OS and/or controller that didn't support TRIM?
 
So I completed a secure erase on the drive, ran the benchmark and as expected the maximum read results were seen due to the erased FTL.

I then completed a level 3 and performed the benchmark again and the front of drive rate appears to be closer to maximum speed again.
If we assume that Spinrite was working correctly in the original test then that means that the benchmark results were indicative of some other issue with the NVM.

Would erasing the FTL (and the underlying data) result in the suspected bad NVM to be randomly spread across the new FTL mapping? Therefore what was the "front of the drive" could be spread across the entire drive now and the read performance impact averaged across the drive as well?

A complicating factor is the intelligence of the SSD's firmware and controller. If I am not mistaken, this SandForce controller incorporates "DuraClass" and "DuraWrite" technology, which means the controller does not simply take data, find a spot, and write it; instead it does several things first. Simplified: if we ask it to write 512 zero bytes to LBA 100, it will compare this to what's currently at LBA 100, and if it finds 512 zeros there, it writes nothing, so nothing gets refreshed either. IOW, if SR comes along, reads 512 zeros, and writes them back, nothing changes. It will also evaluate the "entropy" of the data and compress low-entropy data. So even if some inversion takes place, inversion of low-entropy data will give low-entropy data (00 00 00 00 inverted becomes FF FF FF FF = low (zero, actually) entropy), so it can compress the sh*t out of that, and as a result only a small portion of those 512 bytes will be written anywhere.

Where am I going with this? Well, we cannot compare your first tests (assuming the drive truly contained data) with the post-secure-erase tests, even if the Level 3 you ran after that wrote to the drive. To truly compare the post-secure-erase vs. refresh tests, we'd need to fill the drive with high-entropy data after the secure erase. The above might very well explain the discrepancy between the pre-secure-erase and post-secure-erase tests.

IOW, there's so much going on that you cannot conclude "bad NAND" based on some numbers that, in all honesty, are the result of sampling a tiny amount of the total drive space. You also can't just compare two different drives using different controllers, different NAND, etc.

What you could do to evaluate the state of an SSD is observe actual errors, truly slow reads/writes, and SMART statistics such as the amount of data written, the number of spare blocks utilized, etc.
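The DuraWrite-style write path described above (skipping identical writes, compressing low-entropy data) can be sketched as follows. This is an assumed model for illustration only, not SandForce's actual algorithm:

```python
import zlib

# Assumed DuraWrite-like write path: compare before writing, and
# compress whatever does get written.
def controller_write(stored, lba, data):
    """Return the number of bytes physically written to NAND."""
    if stored.get(lba) == data:
        return 0                        # identical data: write skipped
    stored[lba] = data
    return len(zlib.compress(data))     # low-entropy data compresses away

stored = {}
zeros = bytes(512)              # 512 zero bytes
inverted = b"\xff" * 512        # inverted zeros: still zero entropy

first = controller_write(stored, 100, zeros)
again = controller_write(stored, 100, zeros)
flipped = controller_write(stored, 100, inverted)

print(first)    # small: 512 zeros compress to a handful of bytes
print(again)    # 0: rewriting identical data touches nothing
print(flipped)  # small again: FF FF ... compresses just as well
```

So a "read and write back" pass over such a controller can end up physically rewriting almost nothing, which is why it may not refresh the NAND at all.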
 
So I completed a secure erase on the drive, ran the benchmark and as expected the maximum read results were seen due to the erased FTL.

I then completed a level 3 and performed the benchmark again and the front of drive rate appears to be closer to maximum speed again.
If we assume that Spinrite was working correctly in the original test then that means that the benchmark results were indicative of some other issue with the NVM.
SUPER interesting testing, @DarkwinX. We know from many others' tests that SpinRite's passes can improve performance. What you've just demonstrated is that, as it stands, it isn't guaranteed to. But SpinRite 7 is going to be all about doing that... so we have some very interesting explorations ahead!
 