M.2 drive overheats almost immediately, SR pauses, but drive keeps on heating up

Dexterous

Member
May 8, 2024
I've been trying to run Spinrite on an old Dell Inspiron 5570 laptop. The machine came with a Kingston SSD, and its owner later added an M.2 2280 drive (model # CT1000MX500SSD4). This second drive rapidly overheats when Spinrite accesses it at level 2. The SMART temperature check triggers at around 55C and automatically pauses Spinrite until the drive cools down. However, the drive never cools down! Indeed, the temperature keeps creeping up to well above 70C. At that point I stop Spinrite and return to DOS for a while, or shut off the machine, then try again.

This doesn't seem right. Could Spinrite's monitoring of the SMART parameter be causing the drive to heat up like this?

I've ordered a heatsink for the drive - it clearly needs one, regardless of the above, but I am wondering how much drive activity Spinrite might be initiating during the SMART 'cooldown' waiting period.

Thanks!
-Ray
 
It seems possible the device itself is running hot and that something it runs in Windows helps it stay cooler, but that's not running when SpinRite runs DOS. Try the experiment to boot to DOS but do not start SpinRite at all, and see what happens while it idles running DOS.
 
I've ordered a heatsink for the drive - it clearly needs one
Maybe a dumb question, but considering it's in a laptop, is there room for a heatsink? Laptop cover will be able to go back on with it in place?

I have had laptops that overheated with SR 6.0, and sometimes I'd just take the cover off, set them on end, and point a desk fan at them while SR ran. I agree with @PHolder that Windows power management settings help keep things cool and when not in Windows, things can get hotter.
 
I think 55C is a fixed number based on Steve's experience with hard drives, so it's not necessarily a warning from the SSD itself. I can't find the specs for that particular SSD, but maybe 70-75C is simply where this SSD normally sits. Either way a heatsink is a good call; the cooler the better IMO.
 
It seems possible the device itself is running hot and that something it runs in Windows helps it stay cooler, but that's not running when SpinRite runs DOS. Try the experiment to boot to DOS but do not start SpinRite at all, and see what happens while it idles running DOS.
I was running DOS. I don't know how it behaves under Windows. I don't have a good way to look at the SMART values when in DOS so I rely on Spinrite. If I let it sit for a while, either at the DOS command prompt or completely powered off, and then start Spinrite, the drive temp seems fine. (I'll have to check the actual value - but I think it was under 50C at rest.)

If I run level 2 on that machine's SSD drive, I believe the M.2 drive's temperature remains stable, as it should. It's only when Spinrite starts to pay attention to the M.2 drive that its temperature starts to shoot up. For example, the drive benchmark tool will cause the temp to go up as well. That's expected. I'm hoping that the heatsink will mitigate that.

My concern, and the reason I'm posting this, is that Spinrite may be inadvertently causing the temperature rise while it's monitoring the SMART temperature and waiting for it to go back down.


The M.2 drive may be defective somehow, or it may truly need a heatsink, or both. I've seen mention on Reddit that this particular model may in fact tend to run hot and can benefit from a heatsink.

This is not a huge issue for the laptop itself - I am going to wipe it and install Linux. I can't even log in to Windows since my friend forgot the password. If Linux works well on that machine I can replace the M.2 drive with a new one. The entire machine will be wiped either way.

Regards,
Ray
 
Maybe a dumb question, but considering it's in a laptop, is there room for a heatsink? Laptop cover will be able to go back on with it in place?

I have had laptops that overheated with SR 6.0, and sometimes I'd just take the cover off, set them on end, and point a desk fan at them while SR ran. I agree with @PHolder that Windows power management settings help keep things cool and when not in Windows, things can get hotter.

Yeah I cracked the back of the laptop and checked: there appears to be room for a thin heatsink, so that's what I ordered from Amazon. The M.2 drive wasn't screwed down, by the way, which was a shock to me. It was just flapping loosely. I'm amazed it ran at all, TBH.

I'm not particularly invested in this drive nor this laptop - it is destined to be wiped and have Linux installed, if it otherwise seems to function well. I just wanted to report the temperature rise issue as being odd.

Maybe Spinrite should abort without user intervention if the temperature continues to rise like this? It seems like it might be something that would be good for safety, if nothing else.
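
To make the suggestion concrete, here is a minimal sketch of what such a fail-safe could look like. This is not SpinRite's actual logic; the function name `cooldown_decision` and its parameters are hypothetical. The 55C resume point comes from the pause threshold mentioned in this thread, and the 70C hard limit and the five-poll rising streak are assumptions chosen only for illustration.

```python
def cooldown_decision(readings, resume_below=55, abort_at=70, max_rising_polls=5):
    """Decide what a paused scan should do, given temperatures polled while paused.

    Hypothetical sketch: returns "resume", "abort", or "waiting".
    All thresholds are illustrative assumptions, not SpinRite's real values.
    """
    rising = 0        # rises seen since the last drop
    previous = None   # previous reading, if any
    for temp in readings:
        if temp >= abort_at:
            # Hard limit reached while supposedly cooling down: give up.
            return "abort"
        if temp < resume_below:
            # Drive has cooled below the pause threshold: continue the scan.
            return "resume"
        if previous is not None:
            if temp > previous:
                rising += 1
                if rising >= max_rising_polls:
                    # Still climbing even though the scan is paused: give up.
                    return "abort"
            elif temp < previous:
                # Any drop resets the count of rises.
                rising = 0
        previous = temp
    # Ran out of readings without reaching a decision.
    return "waiting"
```

With this sketch, a drive that settles back (for example 56, 55, 54) resumes, while one that keeps climbing during the pause ends in an abort rather than waiting forever.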
 
Maybe Spinrite should abort without user intervention if the temperature continues to rise like this?
Normally, it does. When it has paused on me in the past, the drive temperature DOES drop. If it's not for you (I'd try cycling through the screens, perhaps it is dropping, but not updating the display, and cycling through screens will force it to update temp status), then it could be an issue with M.2. I think SR61 has other potential issues with M.2 drives, from what I've read on here.
 
I think you're missing my point. This issue is not that the drive is overheating during normal Spinrite operation. It makes total sense that a drive with poor cooling could overheat in that context. That's why there is an over-temp screen which halts the normal Spinrite activity until the temperature drops. Totally sensible - we surely agree on that point.

What I'm saying is that while Spinrite is displaying that over-temp warning screen it is also constantly polling the SMART temperature. I assume that Spinrite is not doing anything else during that period. That, presumably, is enough to continue to overheat this drive. If this happens while the computer is unattended Spinrite could conceivably push the drive's temperature to dangerous levels.
 
No, I'm not missing your point. I'm just telling you that normally for me, when SR pauses, the drive temp DOES go down. But because you're doing this on an M.2 (which I've never tested under SR61), it may be a compatibility issue with M.2. I have seen other reports on here with unexpected behavior on M.2 drives. So it's good that you are reporting it. My comment about cycling through the screens was just a way to double-check that the drive temp really is rising while SR is paused due to overheating.
 
My apologies then. I didn't know it was possible to cycle through screens when the temperature warning is being displayed. I remember three options:

1) cancel further work
2) wait until temperature drops down and auto-resume (the default selection)
3) ignore temperature warnings and keep working.

The current drive temperature is displayed at the bottom, so there is obviously some SMART polling occurring. If I get a chance this evening I'll try to see if I can cycle through screens when it's in this state. I also hope to get the heatsink later today so I'll have another data point.

Thanks,
Ray
 
Also, my apologies. I am going by memory and it's been several months since I've run SR on a system where the drive was overheating. I may be remembering wrong. But fortunately, I have a laptop (that is almost never used) that I remember being very good at overheating the drive during a SR run, so I'm running it now so I can see how the newest Release 3 version behaves during an overheat event. The drive in the laptop is a 2.5" SATA. I will update with my results.
 
@Dexterous Your laptop is a "system". It has cooling for the entire device. That cooling may not be working well or settings may be misconfigured in the BIOS. My previous suggestion was that the system may get hot just idling irrespective of any software running if the cooling sub-system is not working well or is not being driven by DOS to do the right thing. While SpinRite does ease up on the drive when it detects SMART over temps, it does not back off using the CPU. Since it's a system, the CPU heat may be driving the system temperature up. I had an old Acer and an old Compaq laptop that went this way, and in the end I trashed them because they just couldn't stay cool under any circumstances. They would start off at room temp, quickly get to over-temp, and then shut down, in essence crashing.
 
The system's temperature management is fine - like I mentioned, I was able to run Spinrite on the main SSD drive without overheating anything. As soon as Spinrite addresses the M.2 drive, its temperature shoots up. Given that this is a cheap after-market drive, without a proper heatsink, and it's sitting somewhat removed from the main system fan, I'm not surprised that it's overheating. I can fix that a number of ways, including ultimately sending the laptop and the M.2 drive to e-waste.

What I am concerned about here is that Spinrite may inadvertently be pushing up the drive's temperature while it's waiting for it to cool down. That may be a dangerous situation for a person who leaves this system to run Spinrite unattended. There doesn't seem to be a fail-safe mechanism in this circumstance.
 
I can confirm that the latest release of SR61 works normally when the drive overheats. In the system I mentioned earlier, the drive hits 56C and SR stops with the red screen, and within a few seconds, the displayed temperature drops to 54 and the scan automatically resumes. So there is some issue with your system where the temp continues to climb once the red pause screen pops up. It may be an M.2 thing, it may be the lack of a heatsink, or some combination?
 
Yes - those are reasonable assumptions. I don't know if the fact it's an M.2 drive matters - you guys are the experts in that department, but it's plausible. I will install a heatsink on the M.2 and try again.
I might also plug the M.2 into another machine and run Spinrite on it there. Not sure if I have a free M.2 slot anywhere though.
 
I will install a heatsink on the M.2
This seems likely to be a waste of time. If the drive temperature keeps going up, it will still fill the sink capacity of the heatsink ... it just may take a little longer than without it. You COULD try putting the M.2 into an external case of some kind (that you could apply cooling to if needed), but that likely means USB and USB is not really advised for use with SpinRite 6.1.
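
The "fill the sink capacity" argument can be checked against a simple lumped thermal model, C·dT/dt = P − (T − T_ambient)/R. The sketch below is only an illustration: the function names are hypothetical, and the power, thermal resistance and heat capacity values are made-up numbers, not measurements of this drive or any particular heatsink. In this model, added thermal mass (larger C) only delays the rise, whereas a lower thermal resistance to ambient (smaller R) lowers the temperature the drive eventually settles at. Which effect dominates inside a closed laptop is exactly the open question.

```python
def simulate_temp(power_w, r_th, c_th, ambient_c=30.0, seconds=3600, dt=1.0):
    """Euler-integrate C*dT/dt = P - (T - T_ambient)/R.

    power_w: heat dissipated by the drive (W)
    r_th:    thermal resistance to ambient (degrees C per W)
    c_th:    heat capacity of drive plus heatsink (J per degree C)
    Returns the list of temperatures, one per time step, starting at ambient.
    """
    temp = ambient_c
    trace = [temp]
    for _ in range(int(seconds / dt)):
        temp += dt * (power_w - (temp - ambient_c) / r_th) / c_th
        trace.append(temp)
    return trace


def steady_state(power_w, r_th, ambient_c=30.0):
    """Temperature the model converges to: T_ambient + P * R."""
    return ambient_c + power_w * r_th


# Illustrative (made-up) cases, all with 2 W of drive power:
bare = simulate_temp(2.0, r_th=25.0, c_th=5.0, seconds=5000)         # no heatsink
mass_only = simulate_temp(2.0, r_th=25.0, c_th=20.0, seconds=5000)   # more mass, same R
better_sink = simulate_temp(2.0, r_th=15.0, c_th=20.0, seconds=5000) # more mass, lower R
```

In this model `mass_only` heats more slowly than `bare` but converges to the same `steady_state` value, while `better_sink` converges to a lower one.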
 
I agree. My point remains that I really don't care about this system nor this drive. I just want to be sure Spinrite has a failsafe for this situation. It would not be good if someone comes along in a couple of years and destroys a drive because its temperature rose beyond safe limits.
 
I'm going to mention again that 55C is a legacy Spinrite warning from back before SSDs. Samsung SSDs are rated to 70C operating temperature so it's likely that Crucial is similar.
 
The software is not going to destroy the drive by asking it to be a drive. That is on the drive.
Good point - I was clearly being a bit dramatic. I just thought it might be a good idea, and relatively simple, to have a fail-safe in Spinrite's automated cool-down mechanism.

That said, I've seen silicon destroy itself plenty of times when it runs over temp (typically 115-120C or higher). Even if it doesn't quite get up to that extreme temperature, sustained operation at high temperature will prematurely age the silicon.

How do over-temp alarms get handled in modern operating systems? If this drive was overheating in Windows or Linux would it raise an alarm to the kernel so the OS could take action? I don't think DOS is able to handle that type of signal, if that is indeed the mechanism. I really don't know the details of all this stuff. I just wanted to report what may be a problem. I am probably making more of it than is warranted.
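
This does not answer whether the kernel receives an alarm, but once Linux is installed the temperature can at least be watched from user space. The sketch below assumes smartmontools is installed, that the drive reports SMART attribute 194 (Temperature_Celsius) in the usual `smartctl -A` table, and that the drive appears as `/dev/sda`; the device path and both function names are hypothetical.

```python
import subprocess


def parse_temperature_celsius(smartctl_attr_output):
    """Return the raw value of SMART attribute 194 from `smartctl -A` text, or None.

    Assumes the standard ten-column attribute table, where the raw value
    is the tenth whitespace-separated field on the attribute's line.
    """
    for line in smartctl_attr_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_drive_temperature(device="/dev/sda"):
    """Run `smartctl -A` on the given device (path is an assumption) and parse it.

    Usually needs root privileges; returns None if the attribute is absent.
    """
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_temperature_celsius(result.stdout)
```

Calling `read_drive_temperature()` in a loop while the drive is under load would show whether it levels off or keeps climbing under Linux as well.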

I'm curious though: Maybe I'll try to see what happens if I just let it run to see how high it will go. Will it hit 100C? 120C? Or will the drive shut itself off? I don't know.

The heatsink won't arrive until tomorrow evening so I probably won't be able to try it until Friday at the earliest.