Bit Rot, Windows 7, 2 TB SSD, Long Ago


rfrazier

Well-known member
Sep 30, 2020
320
100
Hi all. I have a question for those of you who have been around the block in the computer world for a while. I'm still trying to figure out how to pose the question. I posted another thread back in June 2021 or so about multiple admin programs BSOD in Windows 7. I'm starting a new thread because I want to focus on a new topic very specifically. Many of you suggested things in the other thread and I appreciate it. Basically, I've been struggling with BSODs in Windows 7 for a year or more. The PC will just crash to the blue, out of the blue. Then it may run for days. I've tried literally hundreds of troubleshooting steps, and I don't want to rehash that here. I've tried spinner hard drives, SSDs, shutting down this and that, even a whole separate PC of the same model. So, let's see if I can make this make sense.

For a while, I had given up and just reconciled myself to having to endure a crash once or twice a week. I had cloned my system from a 1 TB spinner to a 1 TB SSD and then later to a 2 TB Samsung 860 Pro SSD. So, even on the 2 TB SSD, I was still getting crashes 1 - 2 times per week. Then, very recently, within the last week, the crashes got much more frequent. No, I cannot think of anything definitive that I did. The drive got to where it would crash every few minutes. But, if I could get it to run long enough, it would pass chkdsk just fine. Or, I could boot something else and run chkdsk with it in an external enclosure. As I said, I even put the drive in a whole other computer, with the same results. I was able to start using a 2 TB spinner backup that I had from a few months ago. It worked much better, other than being dog slow. I've had the spinner BSOD a time or two, so it didn't totally fix the problem, but at least it was usable. Over and over I tried to get the SSD to work, including copying files from the spinner to the SSD. Everything failed. I am very convinced at this point that the crashes are caused by the CONTENTS of the drive, rather than the drive itself. Eventually, I cloned the spinner to the SSD with a copy box and now I'm actually running on the SSD. So, that corroborates the theory that the contents of the drive were causing the problem. I haven't run in this mode long enough to know whether it won't crash at all.

Now I get to the heart of the matter. Sorry for the long lead up. Something is bugging me in the back of my mind about a memory from long ago, something like 15 years. I think it had something to do with bit rot, i.e., data corruption, in Windows 7, with large hard drives, and maybe related to IDE vs. AHCI mode. I'm running IDE mode. It may have had to do with the registry, or with backup. In fact, it's so vague, I can't even latch on to enough of it to describe it to you. So, here's the question. Do any of you remember any situation involving Windows 7, large hard drives, and bit rot that could cause my 2 TB SSD to suddenly become unreliable and start crashing? Keep in mind that once I cloned my spinner back to my SSD, the SSD is happy again. AND, all the while, the SSD would pass chkdsk when in an external enclosure. All this has me quite baffled. I've even been wondering if we had a rash of sun spots or something. @Steve has said that in these new drives, a few electrons in the memory storage cell can change the data. Frankly, that scares me, but I was really hoping to make the 2 TB drive work. The PC is usable again and is running the SSD. But, I don't want to be in this same situation again in 2 months if bit rot is happening. For what it's worth, I ran SpinRite level 4 on the drive when I first got it, so I had every reason to trust it.

As always, all help is appreciated. Even geeks need tech support sometimes.

May your bits be stable and your interfaces be fast. :cool: Ron
 

PHolder

Well-known member
Sep 16, 2020
769
2
359
Ontario, Canada
I had given up and just reconciled myself to having to endure a crash once or twice a week.
My only suggestion is to get the necessary tools and skill to capture the info about the blue screen so that you have a clue what specifically is crashing. Microsoft makes free tools for "minidump analysis" which is what I think you need for this. In the past I had to do it myself, and I recall I ended up downloading some driver developer toolkit. In my case, it was my GPU that was overheating. The fan eventually seized.
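Before digging into the dumps themselves, it can help to see how often the machine has actually been crashing. Below is a minimal sketch, assuming Windows has been writing minidumps as `*.dmp` files into a single folder (by default that folder is `C:\Windows\Minidump`). The function names `crash_times` and `gaps_in_hours` are made up for this illustration, and the file modification time is only a rough stand-in for the time of the crash.

```python
from datetime import datetime, timezone
from pathlib import Path


def crash_times(dump_dir: Path) -> list[datetime]:
    """Return the modification times (UTC) of *.dmp files, oldest first."""
    return sorted(
        datetime.fromtimestamp(p.stat().st_mtime, tz=timezone.utc)
        for p in dump_dir.glob("*.dmp")
    )


def gaps_in_hours(times: list[datetime]) -> list[float]:
    """Return the number of hours between each pair of consecutive crashes."""
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
```

Calling `gaps_in_hours(crash_times(Path(r"C:\Windows\Minidump")))` would then show whether the gaps are days long or minutes long, which is a useful thing to write down before and after any change to the system.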
 

DiskTuna

Well-known member
Jan 3, 2021
93
10
Netherlands
Seems to me that you have decided already it has to do with the hard drives.

Now, this is just anecdotal, but in all the experience I've had with systems that "crashed to the blue, out of the blue" (nice one BTW), the cause was almost always something other than the hard drive.

Bit rot is a discussion I have gotten myself into too often already. I recover data from USB flash drives and memory cards, and yes I do see what you could call bit rot: That is bits that are not what they're supposed to be according to ECC data.

What I struggle with is explaining how bit rot would manifest itself without triggering read errors. A 'flipped' bit or a stuck bit is caught by ECC. Too many flipped/stuck bits for ECC to cope with should result in an error that's passed all the way up to the OS level. So in my mind, theoretically you should never have corrupt data without being notified at some level about a read error.

Reality: people send me corrupt files (digital photos in my case) every day, that just sat on a drive with no file system errors, that can be read without error and are yet corrupt.
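One way to catch this kind of silent corruption, where the file still reads without error, is to record a checksum for every file while the data is known to be good and compare again later. The following is only a sketch of that idea, not a tool anyone in this thread is using. The function names are hypothetical, and it only makes sense for files that are not supposed to change (photos, archives, installers), since a legitimate edit changes the checksum too.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files present in both manifests whose digests differ."""
    return sorted(name for name in old.keys() & new.keys() if old[name] != new[name])
```

Building one manifest from a backup and another from the working drive, then passing both to `changed_files`, gives a list of files whose contents differ even though both copies read cleanly.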
 

rfrazier

Well-known member
Sep 30, 2020
320
100
Hi all. I love the thoughts y'all share so keep them coming. Yay, the PC made it overnight with no crash. I haven't run a full AV scan, which seems to have been a causative factor in the past. Let me clarify the timeline a bit. Between the time I had posted the June / July thread and now, I had gotten to a point of about 1-2 crashes per week with no real solution. I had gone through several storage drives then ended up with the 2 TB Samsung 860 Pro SSD. BSOD error reports were always different, and I couldn't find a pattern. I had actually run a memory test a few times. That was fine. I was living with it. Almost always, after a crash, I could run chkdsk and it would be OK since NTFS is pretty resilient.

Just within the last week, the BSODs became much more prevalent, to the point where I couldn't even get the computer to boot and run chkdsk. I was able to run the SSD in an external enclosure and do a chkdsk successfully. I was also able to retrieve data from the drive with it in an external enclosure and I think I got everything important. At least I hope so. Anything else should be in my online Jungledisk backup. But, I was never able to get the unit to boot and remain stable. I even put the SSD in a totally separate laptop of the same exact model and it crashed the same way. That led me to believe it wasn't the PC's hardware, unless they share a common design flaw. It also led me to believe it wasn't the drive's hardware. I concluded it was the drive's contents causing the crash within just a few minutes.

Since I gave up and cloned my 2 month old backup spinner drive to the SSD, the pc is running fine (thus far) on the SSD. I've had the backup spinner crash before at least once, so I may be back at the 1-2 crashes per week concept. There also may be multiple factors in play. What was really worrying me, and what caused me to create this thread, is the very rapid degradation of the SSD and the decline from 1-2 crashes per week to 1 every 2 minutes. The fact that the pc is (relatively) happy again confirms that the contents of the drive were at least a substantial part of the problem.

minidump analysis
GPU that was overheating. The fan eventually seized.
@PHolder Good suggestion about the minidump analysis. If the machine continues to crash often enough to make it worth the trouble, I may have to look into that. Interesting that you've mentioned overheating and fans. I've also thought that it might be some thermal thing. In fact, I usually set the PC on the floor on its edge at night, sitting vertically. It occurred to me that the heat pipe may not work that way. No proof yet, but I may try to stop doing that. Also, regarding fans. Long ago in another life, I did some Litecoin mining. I had several GPUs running 24 hours per day. I had to make warranty claims on several after just a few months because of fan failure. That's a huge weak point in many systems. Dirt cheap fans usually don't even have real bearings. Just a so-called sleeve with an axle going through it and a bit of grease. Those have very low life expectancy. One of my pet peeves. If you need a fan, get a Noctua. Pricey but great quality.

BlueScreenView
@AlanD That's also a great suggestion if I have to get to that level of troubleshooting again.

sat on a drive with no file system errors, that can be read without error and are yet corrupt.
@DiskTuna That's what worries me. 1-2 crashes / week I can live with, although I don't want to. But, I'm hoping that my SSD won't degrade again in a couple of months and I'll be back in the crash every 2 minutes mode. Like I said in the original post, I was wondering if a bunch of sunspots or something scrambled it.

May your bits be stable and your interfaces be fast. :cool: Ron
 

MichaelRSorg

Well-known member
Nov 1, 2020
88
13
RouterSecurity.org
I completely agree with the suggestion to boot into a memtest CD/DVD/flash drive. If it is a heat issue, running memtest for a few hours might let you see that too.

I also agree that the time has come to look into the BSOD error codes and Nirsoft is a great place to start.

Samsung Magician software is your friend. For one thing, it has something like chkdsk. Also, it lets you set aside part of the drive as extra spare area (overprovisioning).

Your guess about using a large drive with Windows 7 is a good one. To address that, why not partition the drive? I have not done this in a long time, but there is free partitioning software in most every copy of Linux that you can run from a Linux Live CD/DVD. I think it's called parted.
 

rfrazier

Well-known member
Sep 30, 2020
320
100
If it is a heat issue, running memtest for a few hours might let you see that too.
@MichaelRSorg You raise some good points. The PC has been running for another day without fail, and I'm quite OK with that. But, I may have to go back into troubleshooting mode if it acts up. I'm specifically trying not to set the computer on its edge, which could affect the heat pipe. I've thoroughly cleaned the fan and radiator and re-pasted the heat sink pads. Earlier in the year, I ran Prime95 to peg the CPU along with heavy disk activity, and the CPU and SSD temps never got above their maximum limits. I've run both the Microsoft memory test and MemTest86 through at least one pass. They never show errors. But, if I start getting crashes again, I might have to do the multi-hour thing.
Samsung Magician
Also excellent suggestions if the need arises. I've spent so much cumulative time on the once or twice a week BSOD's that I'm just trying to use the machine now rather than fixing it. Its current iteration has been running a couple of days but I have yet to determine if the intermittent BSOD's are back. It's obvious, though, that the once every minute or two BSOD's are gone based on the act of cloning my 2 month old spinner backup over to the SSD.
allocate part of the drive with extra spare sectors.
Good point. I try to leave about 10% of the drive unprovisioned.
Your guess about using a large drive with Windows 7
This gets back to the reason I started this thread. And, unfortunately, I haven't been able to resurrect enough of my old memory on the issue to bring clarity. So, here's what my current partition structure looks like. I start with 2 TB. Take out some for overprovisioning. Take out some for formatting. And, this installation has a small recovery partition. The partition structure is MBR. I'm running Windows 7 64-bit SP1. So, when it's all said and done, Disk Management says the disk is 1907 GB, which probably means GiB. It says the Windows partition is 1696 GB, which probably means GiB. So, my biggest concern in this thread was the fact that the SSD went from 1-2 BSODs / week from whatever cause to 1 every minute or two, until I restored the backup. So, do we actually know that having a large partition like that will cause bit rot or data corruption in Windows 7? I'm pretty sure that MBR has a limit of 2 TB for a partition. Not sure if that's 2 TB or 2 TiB. But my actual partition size is substantially below that. I'm hoping it's OK now, but I'd prefer not to be cloning the backup again in another two months. Hope that makes sense. Apparently spinning rust still has advantages.
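On the 2 TB vs. 2 TiB question, the arithmetic can be checked directly. An MBR partition entry stores its starting sector and its length as 32-bit sector counts, so with 512-byte logical sectors the ceiling works out to 2 TiB, which is about 2.2 TB in decimal units. The sketch below assumes 512-byte logical sectors (an assumption about this drive, not something reported in the thread) and takes the Disk Management figures quoted above as GiB. `headroom_gib` is a made-up helper name.

```python
SECTOR_BYTES = 512         # assumption: the drive reports 512-byte logical sectors
MBR_MAX_SECTORS = 2 ** 32  # MBR start and length fields are 32 bits wide

GIB = 2 ** 30
TIB = 2 ** 40
TB = 10 ** 12

MBR_LIMIT_BYTES = MBR_MAX_SECTORS * SECTOR_BYTES


def headroom_gib(size_gib: float) -> float:
    """Return how many GiB remain between size_gib and the MBR ceiling."""
    return (MBR_LIMIT_BYTES - size_gib * GIB) / GIB


print(MBR_LIMIT_BYTES / TIB)           # 2.0
print(round(MBR_LIMIT_BYTES / TB, 3))  # 2.199
print(headroom_gib(1907))              # whole disk: 141.0
print(headroom_gib(1696))              # Windows partition: 352.0
```

Under those assumptions, both the 1907 GiB disk and the 1696 GiB partition sit below the MBR ceiling, so the partition table format alone would not be expected to cause addressing problems here.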

May your bits be stable and your interfaces be fast. :cool: Ron
 

rfrazier

Well-known member
Sep 30, 2020
320
100
Hi all. I wanted to provide you with some more information, or lack of it. I just started Samsung Magician on this Samsung 860 Pro 2 TB drive. To my surprise, none of the following features work on this drive: drive health / TB written, diagnostic scan, overprovisioning, performance optimization. It says the drive doesn't support the features. Not quite sure what to make of it and whether it's good or bad, but it's something you might consider if buying a Samsung Pro drive. In terms of overprovisioning, I'm doing that myself by using a reduced partition size. I'm also doing my own makeshift diagnostic scan by running chkdsk with the check box turned on to scan for and attempt recovery of bad sectors. This is kind of like a SpinRite Level 2 scan. The scan is still running, but so far, it's happy.

May your bits be stable and your interfaces be fast. :cool: Ron
 

Dave New

Active member
Nov 23, 2020
34
9
I ran into trouble running a 1 TB drive in an older Windows 7 Sony VAIO laptop (it came new with Vista, to give you an idea of the age of the BIOS, etc). Apparently, the BIOS in the laptop doesn't support LBA properly for a 1 TB drive, but nothing (neither the BIOS nor the Win 7 OS) ever told me that. It started failing mysteriously a few days after I upgraded, in particular telling me that I was not running a 'genuine' version of Windows, and Windows Update would refuse to run.

I re-imaged the drive a couple more times, before I finally came to the realization that there were some LBA mapping issues caused by a BIOS that had been written when there were no 1 TB laptop drives around (SSD or HD) and never updated. I tried a workaround, loading an Intel driver that was supposed to deal with this kind of hooey, but it was also unstable.

I finally retired the laptop (no, the manufacturer had no BIOS update to fix it, either), and picked up a refurbished Win 10 HP ProBook laptop at the Dayton Hamvention. Really nice deals in used laptops abounded then (maybe less so, now?) and I got a quad-core i7 with 16 GB of RAM and a 256 GB SSD for US$250. I swapped out the drive for a 1 TB SSD for less than US$100, and swapped the narrow-view angle LCD display for an IPS one for another pittance.

Way overkill, but this is now my ham shack computer, and has been running happily there ever since.

Now the oldest laptop I still have in use is my 10-year-old Photoshop laptop, an ASUS that originally came with Windows 8 (ach!), 8 GB of RAM, and a 512 GB HD. I upgraded the RAM to 16 GB (max for this laptop) and the HD to a 1 TB SSD - what a screamer! Boots Windows 10 in about 10 seconds. Downside? It doesn't have a TPM of any sort, much less a v1.2 one, so in about four years, it will have to be replaced, when Microsoft drops support for Windows 10.

Oh, yeah, I use Backblaze to keep all my laptops backed up. It's saved my bacon a couple of times now, since even (or especially?) HDs and SSDs WILL FAIL. It's not a question of if, but when. I had an external 4 TB USB HD full of tens of thousands of my photographs fail. I sent off for the encrypted 8 TB drive from Backblaze, and in a couple of days, I was back up and running. The only cost was shipping the 8 TB drive back. It's $5 a month for unlimited backup of all locally-connected drives of each machine.
 