SMR drives being coy, and what happens when SpinRite falls for it?


Peter P

Member
Nov 1, 2023
6
1
Canada
This is a deepish dive so feel free to give the convo a miss (^-^)
But there is one VERY important aspect SMR users need to know IMO.

I understood the basic principles behind SMR (Shingled Magnetic Recording) drives, but hadn't given much thought to the actual implementation. I recently watched this VERY informative presentation that IMO is well suited to audiences like us.


Despite recommendations not to run Level 3+ on SMR drives, many users might not even know they have one. Even the 34-page manual for my Seagate ST8000AS0002 simply states "TGMR recording technology provides the drives with increased areal density" ...indeed

The inner workings of SMR drives may be beyond the scope of SR 6.1, but I think it's still helpful to know (or speculate) what's happening.
Unless otherwise informed, I'm assuming SR 6.1 is 'SMR unaware'.

[Screenshot from the video "SMR Drives explained and use cases" at 00:06:59]


For level 2, how does the recovery differ between "band data" and "cache data"? What happens after the recovered data is written to the persistent cache? Does SpinRite verify what it just wrote from the cache or the band? What are the implications of bands having to be rewritten in their entirety when you change even 1 bit? How are bad sectors mapped in the cache? Are entire bands mapped out if there is a defect anywhere within?

**The really important part**
I have experimented with levels 3-5, and observed behaviour consistent with the persistent cache explained in the above video. In short, ALL data is written to the persistent cache first. Only then is it transferred to the appropriate band. The drive will wait for a few seconds of idle time before transferring data from cache to band. Or if the cache is full, the drive will interrupt further input until it can clear some space.
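To make the behaviour described above concrete, here is a toy Python model of it. This is only a sketch under stated assumptions: the class name `ToySmrDrive`, the cache size, and the drain rate are all made up for illustration, and real drive firmware is certainly far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToySmrDrive:
    """Toy model of a drive-managed SMR disk with a persistent cache.

    All names and sizes are hypothetical, chosen only for illustration.
    """
    cache_capacity: int = 4        # sectors the persistent cache can hold
    drain_per_idle_tick: int = 2   # sectors moved to bands per idle tick
    cache: dict = field(default_factory=dict)  # lba -> data awaiting transfer
    bands: dict = field(default_factory=dict)  # lba -> data in its permanent band
    stalls: int = 0                # times a write had to wait for cache space

    def idle(self, ticks: int = 1) -> None:
        """Simulate idle time: move cached sectors to their bands."""
        for _ in range(ticks):
            for lba in list(self.cache)[: self.drain_per_idle_tick]:
                self.bands[lba] = self.cache.pop(lba)

    def write(self, lba: int, data: bytes) -> None:
        """Every write lands in the persistent cache first."""
        if lba not in self.cache and len(self.cache) >= self.cache_capacity:
            self.stalls += 1       # cache full: the host waits while it drains
            self.idle()
        self.cache[lba] = data

    def read(self, lba: int):
        """Return (data, source); cached data shadows the band copy."""
        if lba in self.cache:
            return self.cache[lba], "cache"
        return self.bands.get(lba), "band"


drive = ToySmrDrive()
drive.bands[7] = b"old"
first_read = drive.read(7)    # the initial read comes from the band
drive.write(7, b"new")
verify_read = drive.read(7)   # an immediate verify is served by the cache
drive.idle()
settled_read = drive.read(7)  # after idle time the band holds the new data
```

In this model a read issued right after a write never touches the band, which is the concern raised here: a verify pass only tests the permanent location if the cache has had time to drain first.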

The bottom line is that while SpinRite's initial read is from a "band track", I think ALL subsequent write/read/verify is done ONLY on the persistent cache. The only way to verify the 'permanent data' is to somehow flush the cache before reading back what was just written. Or work in timed intervals to give the cache time to write everything out before verifying.

I did not record any figures or make calculations. But I did observe that after pausing SpinRite, I could feel the heads continue sweeping back and forth for several minutes, a clear indication the drive was busy transferring data from the persistent cache to the data bands. When the heads went still, I'd unpause SpinRite and let it get on with its work. The time remaining would drop, then start climbing again when the cache filled up.

Given the persistent cache is tens of gigabytes, I imagine even SR level 5 does all of its flipping, writing and verifying entirely there, completely unaware it's not touching the data's permanent location on the bands.

Level 3 can be useful for refreshing the bits on the media, but as I understand it levels 4 & 5 should absolutely be avoided. They work the drive so hard there's a real possibility of some data corruption. And level 5's final verification would give a false sense of security as it's more likely reading from the media cache instead of the band.

Whew, this went longer than I expected :) . Well anyway, just happy to share my findings with anyone who might be interested. And if anything here can inform SpinRite 7 development, so much the better.
 
Unless otherwise informed, I'm assuming SR 6.1 is 'SMR unaware'.
Thanks for your posting. You should consider yourself “otherwise informed” since SpinRite v6.1 IS SMR Aware! <g> SpinRite uses every available indicator (several are available) to detect SMR drives and to caution its user against running any "wholesale rewriting" level on such drives. It =IS= possible for a drive to be SMR and to be deliberately hiding that fact. (Some manufacturers have gotten themselves in hot water by doing this in the past.) So SpinRite does everything IT can to catch and inform users. And it also makes this very clear throughout its documentation. (y)
 
Yes, I think this is correct. It has been discussed in the development newsgroup. It's pretty much like running SpinRite on SSDs, in the sense that the LBA to PBA mapping is dynamic and volatile. Reading LBA n and writing some inverted pattern back to LBA n isn't at all useful, because it tells you nothing about the condition of the sector at LBA n: there is no fixed physical sector n. LBA n is a virtual address that gets mapped to some physical location, and you don't know what that location is.
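The point about dynamic mapping can be shown with a small Python sketch. It assumes a purely log-structured translator in which every write goes to a fresh physical location; `ToyTranslator` and its fields are hypothetical names, and whether a given SMR drive's firmware actually works this way is not established here.

```python
class ToyTranslator:
    """Toy log-structured LBA-to-PBA map (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.lba_to_pba = {}    # logical address -> current physical address
        self.media = {}         # physical address -> stored data
        self.next_free_pba = 0  # next unused physical location

    def write(self, lba: int, data: bytes) -> int:
        """Store data at a fresh physical location and remap the LBA to it."""
        pba = self.next_free_pba
        self.next_free_pba += 1
        self.media[pba] = data
        self.lba_to_pba[lba] = pba
        return pba

    def read(self, lba: int) -> bytes:
        """Read whatever physical location the LBA currently maps to."""
        return self.media[self.lba_to_pba[lba]]


t = ToyTranslator()
original_pba = t.write(100, b"user data")
inverted_pba = t.write(100, b"inverted pattern")  # lands somewhere else
restored_pba = t.write(100, b"user data")         # and somewhere else again
```

Under this assumption the three writes to LBA 100 touch three different physical locations, so a write/read-back test says nothing about the location that held the data originally.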

The first place I looked was in spinrite.dev, but I couldn't find any significant discussion. It's quite possible I missed it. I can't seem to retrieve old headers in Gravity, and the search function on the forum web page doesn't work for me. But it's good to know this has all been discussed!

The problem you address goes pretty much for both SMR drives and SSDs since they decoupled LBA from PBA addresses and use some form of 'translator'.
Is that in fact how it works for SMR drives? I hadn't been able to find that out definitively.
Granted, I have limited knowledge, but I'm not sure how far comparisons of SMR with SSD can be made. SMR drives have no need for wear levelling, and PBA fragmentation would impose a performance hit on them, unlike on an SSD. Just an educated guess, but it seemed to me that the LBA vs PBA relationship presented to the OS was reversed, i.e. the OS is aware of the physical location, but entirely blind to the persistent cache and its mapping.

BTW SpinRite has some tricks (6.1 at least) to detect if it's dealing with an SMR drive. I think one is the drive supporting TRIM while reporting a rotational speed > 0, and I think there was another one that I forgot and would have to look up again.
I only have that one model of SMR drive to test with. SpinRite RC2 did not take any particular notice that I could see, hence my long and slightly misguided posting 😅
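For anyone curious, the TRIM-plus-rotation heuristic mentioned above could be sketched like this in Python. This is not SpinRite's code, just an illustration of the idea. It assumes you already have the 256 words returned by ATA IDENTIFY DEVICE; word 169 bit 0 (TRIM supported) and word 217 (nominal media rotation rate) are the standard fields, while the function name and the sample values are made up.

```python
def looks_like_hidden_smr(identify_words) -> bool:
    """Heuristic: a drive that supports TRIM yet reports a spindle speed.

    identify_words: sequence of 256 16-bit words from ATA IDENTIFY DEVICE.
    """
    # Word 169, bit 0: TRIM is supported via DATA SET MANAGEMENT.
    trim_supported = bool(identify_words[169] & 0x0001)
    # Word 217: 0x0001 means non-rotating media; 0x0401-0xFFFE is the rpm.
    rotation_rate = identify_words[217]
    is_rotating = 0x0401 <= rotation_rate <= 0xFFFE
    return trim_supported and is_rotating


def make_identify(trim: bool, rotation_rate: int):
    """Build a fake IDENTIFY block for demonstration (sample data only)."""
    words = [0] * 256
    words[169] = 1 if trim else 0
    words[217] = rotation_rate
    return words


spinning_with_trim = looks_like_hidden_smr(make_identify(True, 5900))
ssd = looks_like_hidden_smr(make_identify(True, 0x0001))
plain_hdd = looks_like_hidden_smr(make_identify(False, 7200))
```

A drive that spins but does not advertise TRIM would pass straight through this particular check, which would be consistent with some SMR models going undetected.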
 
Done and thank you! I should have made some inquiries before jumping to conclusions. ☺️
I'm having trouble accessing older spinrite.dev headers and couldn't find any relevant discussion. The model SMR drive I have seems to be one of those that slips past the test. RC2 did not give any indication it was aware this was an SMR drive.

I only read the option menu descriptions and hadn't looked at the new documentation. Indeed in FAQ section B, it says "...any level above 2 should be used sparingly on any solid-state or SMR drives". How often have I told users to read the manual? 😅
 
What drive do you have, Peter? I'd be glad to grab one to see whether SpinRite might be able to detect it. (y)
Sorry for originally stating the model number so obtusely 😅 . It's an 8 TB Seagate Archive model ST8000AS0002. I inherited a few survivors for helping out with a fire investigation. I've been using them, but now with SR 6.1 I can test them properly. I have attached the manual and a log file from testing for your convenience.
 

Attachments

  • ST8000AS0002 test.txt
    10.1 KB · Views: 145
  • Seagate Archive HDD 6 & 8 TB product manual.zip
    495.7 KB · Views: 147
Thanks Peter. I wasn't paying attention since Colby knew which drive you were using. I've just ordered one from Amazon for delivery tomorrow. I'll see whether there might be any way for SpinRite to detect that this drive is SMR. It's clearly spelled out that it's for "Archive Use" only.
 
Despite the name "Archive", I've been beating them up with JBOD hot storage. Other than needing to occasionally catch its breath, it's proving to be quite reliable. For its intended purpose, I think it'd be an excellent performer.

I managed to download all the spinrite.dev headers, and now see all the SMR conversations I missed out on. I'll definitely be looking at those and will contribute there if I have anything of value to add.
 
I've been perusing old newsgroup postings and found a reference to the ST8000AS0002. For convenience in case anyone wants to review it, here is the beginning of the relevant thread. I did not recognize anything in the documentation indicating this drive is host aware, but given my lack of expertise I might have missed it. I will make any further comments in that thread to provide some degree of continuity.

Message-ID: <tkpce6$168e$1@GRC>

Subject: Re: Detection of CMR vs SMR drives
From: Scott F <scott200g@notreally.gmail.com>
Date: Sun, 13 Nov 2022 00:06:30 -0000 (UTC)
Newsgroups: grc.spinrite.dev

Steve Gibson <news007_@_grc.com> wrote:
> Following up on what Scott F wrote...
>
>> I would suggest that SpinRite should issue a Report Zones
>> command; if SR gets a response back, indicating the drive is
>> either Host Managed or Host Aware SMR, SR should refuse to
>> operate on those drives.
>
> Yeah. I agree, Scott. SpinRite should make sure that it won't
> run on any of those. That's annoying.
>
I couldn’t find any of those HGST HM-SMR drives for sale, but the Seagate
ST8000AS0002 seems readily available on eBay for about $80. That drive,
based on the documentation, is Host Aware SMR, so it could work as either
Host Managed or Drive Managed, but it should respond to the Zone ATA
commands so you can test that logic.