I ran ReadSpeed 1,000 times

ColbyBouma

UPDATE: The strange results are isolated to 1 SSD. See Run 4 for more details.

I wrote the following batch file to run ReadSpeed rel 1 1,000 times in a row on my HP DM1Z laptop, which has a 128 GB Samsung PM830 SSD.

Code:
:LOOP
        RS.EXE
        IF NOT EXIST RS999.TXT GOTO LOOP

The results were WAY more interesting than I expected! It started off fine, but on run 78 the numbers took a nosedive. I happened to watch this one while it was happening. The bar at the top was noticeably slower, but fairly smooth. However, on the subsequent runs where the numbers were a little lower, there was noticeable stuttering in the top bar. Also, after a couple hours, there was a noticeable hang between runs, and my USB drive's activity light was flashing like crazy. That's strange, because the log files usually write instantly.

I have a few theories:
  • Overheating
    • This is a laptop, and the fan was loud during the whole run. However, I didn't notice any slowdown like this when I was filling the drive with random data or zeroes.
  • Memory leak
    • Is it possible that ReadSpeed doesn't release memory after it runs since it doesn't expect to be run more than a couple times?
  • Internal drive maintenance
    • Could the drive be doing something in the background?
Here's a chart of the 0% column. Trying to show all 5 columns at once just creates a jumbled mess :) The CSV is attached below for anyone who wants to create their own charts.

1609553148581.png
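For anyone working from the attached CSV, one simple way to pick out the slow runs is to compare each value against the median of the column. This is only a minimal sketch: the function name `find_dips`, the 0.9 cutoff, and the sample numbers are assumptions made for illustration, not values taken from the actual results.

```python
from statistics import median


def find_dips(speeds, fraction=0.9):
    """Return the 0-based run indices whose speed is below fraction * median."""
    cutoff = fraction * median(speeds)
    return [i for i, s in enumerate(speeds) if s < cutoff]


# Made-up numbers for illustration only; not taken from the attached CSV.
sample = [540.1, 539.8, 540.3, 410.2, 540.0, 405.7, 539.9]
print(find_dips(sample))
```

Using the median rather than the mean keeps the baseline from being dragged down by the slow runs themselves.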


Here's the PowerShell code I used to create the CSV file. This was a 1-liner, but I cleaned it up for this post.
Code:
# Grab every file in the current folder and loop through them.
Get-ChildItem | ForEach-Object {

    # Read the file and return just the 9th line (PowerShell uses 0-based indexing).
    $Line9 = (Get-Content $_.FullName)[8]

    # Return an array of the elements that are not spaces.
    # '\s+' is regex for "1 or more whitespace characters".
    $Elements = $Line9 -split '\s+'

    # Grab the 5 results. There's an empty element at the end, otherwise it would have been -5 to -1.
    # -1 is the last element of an array, -2 is second to last, etc.
    # .. is the range operator.
    $RsResults = $Elements[-6..-2]

    # Turn the array into a string, using the characters in '' as a delimiter
    $CsvString = $RsResults -join ', '
  
    # Add the string to the CSV file. The first line will create the file.
    $CsvString | Out-File -FilePath 1000.csv -Append

}
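
For readers without PowerShell, the same extraction can be sketched in Python. Instead of relying on the results being on line 9, this version takes the first line whose last five whitespace-separated tokens all parse as numbers. The function name `parse_readspeed_line` is hypothetical, and the sketch assumes a single-drive log, so only the first results row is returned.

```python
def parse_readspeed_line(log_text):
    """Return the five MB/s values from the first results row of a ReadSpeed log."""
    for line in log_text.splitlines():
        tokens = line.split()
        if len(tokens) < 5:
            continue
        try:
            return [float(t) for t in tokens[-5:]]
        except ValueError:
            continue  # banner, header, or separator line
    return None


# Header, separator, and results row copied from the Run 4 log in this thread.
log = (
    "Driv Size  Drive Identity     Location:    0      25%     50%     75%     100\n"
    "---- ----- ---------------------------- ------- ------- ------- ------- -------\n"
    " 81  120GB CT120BX100SSD1                520.2   542.4   420.4   431.2   542.5\n"
)
values = parse_readspeed_line(log)
print(", ".join(str(v) for v in values))
```

The header row is skipped because tokens such as `25%` do not parse as numbers.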

SMART data
Run 2
Run 3
Run 4
Run 5
 

Attachments

  • 1000.csv.txt
    68.4 KB
  • 1000 Runs 1.zip
    446.9 KB
@Dagannoth : Wow!! VERY cool work. Nice going! And, FWIW, I don't think that temperature explains what you were seeing because the pattern is extremely regular. It appears that the 1st and 3rd "dips" are missing. But then the pattern really sets in. This is fascinating. And let's remember that this is just READING the memory. Reading NAND flash is non-destructive and SSDs are rated to do that extensively.
 
It's most likely heat related. You really stressed your laptop out... and potentially lowered its life span accordingly.
I'm not worried about this system. It's an old laptop I haven't used for years, and an old SSD I got for cheap as part of a box of SSDs from eBay, so I am more than willing to sacrifice it in the name of science :D
 
Maybe run it in a refrigerator or somewhere cold to see how temperature related it is.
 
I calculated the time between runs based on the creation time of the log files. As I suspected from watching the later runs, the gap got larger as time went on. Interestingly, the same spikes are present in this chart. (EDIT: I just realized this makes perfect sense. The slower runs take longer 🤦‍♂️)

The chart contains 998 data points because the first line will always be 0, and I threw away the second line because my first test (RS000.TXT) was a few minutes before the rest of them since I wrote LOOP.BAT after it. I should have deleted that first log file before running LOOP.BAT, but I forgot.

Theory #4: my laptop was overheating, not the SSD. I am going to run this test on a desktop that has plenty of airflow to test the heat theory.

1609560731987.png


Here's the PowerShell code I used to generate this CSV.
Code:
Get-ChildItem *.txt | ForEach-Object {

    $TimeStamp = (Get-ItemProperty $_.FullName).LastWriteTime

    if ( $PreviousTime ) {
        $Result = ($TimeStamp - $PreviousTime).TotalSeconds
    } else {
        $Result = 0
    }

    $Result | Out-File -FilePath TimeBetweenRuns.csv -Append

    $PreviousTime = $TimeStamp

}
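
The same gap calculation can be sketched in Python. The helper names `gaps_in_seconds` and `log_gaps` are hypothetical, and the timestamps in the example are made up. As in the PowerShell version, the first entry is 0 because there is no earlier file to compare against.

```python
import os


def gaps_in_seconds(timestamps):
    """Return the gap before each timestamp; the first entry is 0 by convention."""
    if not timestamps:
        return []
    gaps = [0.0]
    for previous, current in zip(timestamps, timestamps[1:]):
        gaps.append(current - previous)
    return gaps


def log_gaps(paths):
    """Gaps between the last-modified times of the given files, in the order given."""
    return gaps_in_seconds([os.path.getmtime(p) for p in paths])


# Made-up timestamps (seconds) for illustration.
print(gaps_in_seconds([100.0, 125.5, 151.0]))
```

Passing `sorted(glob.glob("RS*.TXT"))` to `log_gaps` would be one way to feed it the log files in name order.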
 

Attachments

  • TimeBetweenRuns.csv.txt
    7.8 KB
I think we can rule out overheating. Here are the results from moving the SSD to a desktop that has a case fan blowing directly over the SSD. This run looks very similar to the first run. Also, I overwrote the drive with zeroes first, just like I did before the first run.

1609590115710.png


Here's column 0 from runs 1 and 2.
1609590973552.png


Run 3 is already underway. I want to see how much of a difference it makes when I tell ReadSpeed the filename to use, instead of RS having to read the drive to determine the next filename. This requires the previous 1,000 log files to be left behind.

Code:
FOR %%A IN (*.TXT) DO RS.EXE %%A

Run 4 will be with a different drive. I've determined this behavior is not linked to the computer, so now I want to see if this drive is just weird, or if this is an artifact of running RS way too many times.
 

Attachments

  • 1000-2.csv.txt
    68.4 KB
  • 1000 Runs 2.zip
    448.2 KB
Here are the results from Run 3. As I expected, it didn't affect the dips. My main goal was to reduce the total run time, which I did. This run was 3:55, whereas the previous runs were 7:27 and 5:11. One odd thing that happened is that the loop started at RS005.TXT, so I had to edit the CSV file.

1609647534288.png


Runs 2 and 3 together.
1609647728936.png


The time between runs looks MUCH better now. I had to remove a few outliers due to the log file order problem. Also, I had to switch to LastWriteTime in my PowerShell code since I re-used the log files. I will update my code in the previous post.
1609648364416.png


I want to find a better way to create a loop. Unfortunately, FOR /L and SET /A aren't supported in FreeDOS.
 

Attachments

  • 1000.csv.txt
    58.4 KB
  • 1000 Runs 3.zip
    448 KB
Milton from the newsgroup wanted some information about this PM830 SSD, so here's the SMART data for it.

Code:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.4-pmagic64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     SAMSUNG SSD PM830 2.5" 7mm 128GB
Serial Number:    S0TYNSBCB03775
LU WWN Device Id: 5 002538 043584d30
Firmware Version: CXM03D1Q
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jan  3 04:54:23 2021 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (  540) seconds.
Offline data collection
capabilities:              (0x53) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (   9) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       18414
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       2087
175 Program_Fail_Count_Chip 0x0032   100   100   010    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   010    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0013   091   091   010    Pre-fail  Always       -       310
178 Used_Rsvd_Blk_Cnt_Chip  0x0013   095   095   010    Pre-fail  Always       -       92
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   095   095   010    Pre-fail  Always       -       172
180 Unused_Rsvd_Blk_Cnt_Tot 0x0013   095   095   010    Pre-fail  Always       -       3860
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 CRC_Error_Count         0x003e   253   253   000    Old_age   Always       -       106
232 Available_Reservd_Space 0x0013   095   095   000    Pre-fail  Always       -       1924
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       21016805020
242 Total_LBAs_Read         0x0032   099   099   000    Old_age   Always       -       76913913611

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     18402         -
# 2  Extended offline    Completed without error       00%      1296         -
# 3  Short offline       Completed without error       00%      1232         -
# 4  Short offline       Completed without error       00%         2         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
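
To put the two LBA counters above into more familiar units, here is a rough conversion sketch. It assumes the raw values of attributes 241 and 242 count 512-byte logical sectors, matching the "Sector Size" line in the report; that interpretation is an assumption, since raw SMART values are vendor specific. The function name `lbas_to_terabytes` is hypothetical.

```python
SECTOR_BYTES = 512  # from "Sector Size: 512 bytes logical/physical" in the report


def lbas_to_terabytes(lba_count, sector_bytes=SECTOR_BYTES):
    """Convert an LBA count to decimal terabytes (10**12 bytes)."""
    return lba_count * sector_bytes / 1e12


written = lbas_to_terabytes(21016805020)  # raw value of Total_LBAs_Written
read = lbas_to_terabytes(76913913611)     # raw value of Total_LBAs_Read
print(f"written: {written:.2f} TB, read: {read:.2f} TB")
```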
 
Run 4 is complete. I swapped out the PM830 for a 120 GB Crucial BX100 SSD. I overwrote it with zeroes first, just like the previous runs.

The results are what I originally expected Run 1 to look like. I believe this shows that the strange behavior is entirely isolated to the PM830. I'm still very curious what's going on, but at least this strange behavior isn't due to ReadSpeed.

The chart with all 5 columns is readable, so I went with it instead of just column 0.
1609673472048.png


Also, I got tired of fighting with batch scripts, so I switched to QBasic. Here's the script I wrote, LOOP.BAS:
Code:
FOR i% = 0 TO 999
        ' https://www.tek-tips.com/viewthread.cfm?qid=565458
        padded$ = RIGHT$("000" + LTRIM$(STR$(i%)), 3)
        PRINT padded$
        SLEEP 1

        name$ = "RS" + padded$ + ".TXT"
        SHELL "RS.EXE " + name$
NEXT i%
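
The padding line in LOOP.BAS prepends zeroes and then keeps the rightmost three characters; the `LTRIM$` is there because QBasic's `STR$` puts a leading space in front of non-negative numbers. A small Python sketch of the same trick, with the hypothetical helper name `log_name`, checked against Python's built-in zero padding:

```python
def log_name(i):
    """Build a name like RS007.TXT using the pad-then-keep-rightmost-3 trick."""
    padded = ("000" + str(i))[-3:]
    return "RS" + padded + ".TXT"


# The trick agrees with format-spec zero padding for every value LOOP.BAS uses.
for i in range(1000):
    assert log_name(i) == f"RS{i:03d}.TXT"

print(log_name(7), log_name(42), log_name(999))
```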

Here's the SMART data for the BX100. I grabbed this before starting Run 4.
Code:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.4-pmagic64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Silicon Motion based SSDs
Device Model:     CT120BX100SSD1
Serial Number:    1534F00B31FC
LU WWN Device Id: 5 00a075 1f00b31fc
Firmware Version: MU02
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan  2 23:22:04 2021 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x71) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0002)    Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  10) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x0035)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   100   100   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0000   100   100   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       1411
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       86
160 Uncorrectable_Error_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
161 Valid_Spare_Block_Cnt   0x0000   100   100   000    Old_age   Offline      -       109
163 Initial_Bad_Block_Count 0x0000   100   100   000    Old_age   Offline      -       19
164 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       33662
165 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       97
166 Min_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       1
167 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       30
168 Max_Erase_Count_of_Spec 0x0000   100   100   000    Old_age   Offline      -       2000
169 Remaining_Lifetime_Perc 0x0000   100   100   000    Old_age   Offline      -       100
175 Program_Fail_Count_Chip 0x0000   100   100   000    Old_age   Offline      -       0
176 Erase_Fail_Count_Chip   0x0000   100   100   000    Old_age   Offline      -       0
177 Wear_Leveling_Count     0x0000   100   100   000    Old_age   Offline      -       1
178 Runtime_Invalid_Blk_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Cnt_Total  0x0000   100   100   000    Old_age   Offline      -       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      -       0
192 Power-Off_Retract_Count 0x0000   100   100   000    Old_age   Offline      -       34
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       27
195 Hardware_ECC_Recovered  0x0000   100   100   000    Old_age   Offline      -       119397
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0000   100   100   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0000   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0000   100   100   000    Old_age   Offline      -       0
232 Available_Reservd_Space 0x0000   100   100   000    Old_age   Offline      -       100
241 Host_Writes_32MiB       0x0000   100   100   000    Old_age   Offline      -       84601
242 Host_Reads_32MiB        0x0000   100   100   000    Old_age   Offline      -       79756
245 TLC_Writes_32MiB        0x0000   100   100   000    Old_age   Offline      -       134648

SMART Error Log not supported

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Also, I ran ReadSpeed once before overwriting the drive with zeroes.
Code:
  +--------------------------------------------------------------------------+
  | ReadSpeed: Hyper-accurate mass storage read-performance benchmark. rel 1 |
  |  Benchmarked values are in megabytes read per second at five locations.  |
  +--------------------------------------------------------------------------+

Driv Size  Drive Identity     Location:    0      25%     50%     75%     100
---- ----- ---------------------------- ------- ------- ------- ------- -------
 81  120GB CT120BX100SSD1                520.2   542.4   420.4   431.2   542.5

                  Benchmarked: Sunday, 2021-01-03 at 06:01
 -----------------------------------------------------------------------------
   See the ReadSpeed forums at forums.grc.com for help and community support.
 

Attachments

  • 1000 Runs 4.zip
    428.2 KB
  • 1000.csv.txt
    68.4 KB
@DiskTuna I didn't read the whole paper, just the abstract. But, that's interesting ... and disturbing.

May your bits be stable and your interfaces be fast. :cool: Ron
 
Run 5. This time I rebooted after every test. Unfortunately, the results are pretty boring :ROFLMAO:

0% is lower because I installed Windows on this PC, so that region is no longer empty.
1610571690224.png


Here's how long each run took:
1610571730877.png


Here's the script I wrote, REBOOT.BAS. It took me a while because I am very rusty with QBasic.
Code:
File$ = "COUNT.DAT"

' Read the current count.
OPEN File$ FOR INPUT AS #1
INPUT #1, Count
CLOSE #1

' https://www.tek-tips.com/viewthread.cfm?qid=565458
padded$ = RIGHT$("00" + LTRIM$(STR$(Count)), 3)
PRINT padded$
SLEEP 1

' Build the filename, and run RS.
name$ = "RS" + padded$ + ".TXT"
SHELL "RS.EXE " + name$

' Increment the count and write it to the file.
Count = Count + 1
OPEN File$ FOR OUTPUT AS #1
WRITE #1, Count
CLOSE #1

IF Count < 1000 THEN
        ' The "REBOOT" alias isn't recognized for some reason.
        SHELL "FDAPM WARMBOOT"
ELSE
        ' Delete the count file.
        KILL File$

        PRINT "Done"
        
        ' End without "press any key to continue".
        SYSTEM
END IF

Here's my AUTOEXEC.BAT:
Code:
@echo off
SET PATH=\;\FREEDOS

IF EXIST COUNT.DAT GOTO run

ECHO 0 > COUNT.DAT

:run
    QBASIC.EXE /RUN REBOOT.BAS
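
The overall flow of AUTOEXEC.BAT plus REBOOT.BAS is a counter that survives reboots by living in a file. Here is a minimal Python simulation of that flow, assuming nothing beyond what the two scripts above do. The function name `run_once` is hypothetical, the benchmark call is replaced by a comment, and the reboot is replaced by a returned flag.

```python
import os
import tempfile


def run_once(count_path, total=1000):
    """One boot's worth of work: return the log name used and whether to reboot."""
    # AUTOEXEC.BAT's job: create the counter file on the very first boot.
    if not os.path.exists(count_path):
        with open(count_path, "w") as f:
            f.write("0")
    with open(count_path) as f:
        count = int(f.read().strip())
    name = f"RS{count:03d}.TXT"  # the benchmark would be launched here
    count += 1
    if count < total:
        with open(count_path, "w") as f:
            f.write(str(count))
        return name, True  # the real script reboots at this point
    os.remove(count_path)  # finished: delete the counter, no reboot
    return name, False


# Simulate three "boots" in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "COUNT.DAT")
    names = []
    reboot = True
    while reboot:
        name, reboot = run_once(path, total=3)
        names.append(name)
print(names)
```

Deleting the counter file at the end is what lets the next session start again from zero.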
 

Attachments

  • 1000 Runs 5.zip
    430.7 KB
  • 1000.csv.txt
    68.4 KB
@DiskTuna I didn't read the whole paper, just the abstract. But, that's interesting ... and disturbing.
I think reading the Conclusion (rather short, on page 12) might be helpful as well, as a truncated summary of the many specifics discussed in the paper. 😤😷🙌

I also think my FireCuda drive, with its NAND cache, will probably be affected by those errors as time goes on.
 
Internal drive maintenance
  • Could the drive be doing something in the background?

If the drive is doing something in the background, you might be able to rule that out by pausing for a specific amount of time (optimally, the amount of time it takes for the drive to complete its internal maintenance) after each run. That would increase the total run time, of course, by waittime*999 or so.
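
As a quick worked example of that cost, here is the arithmetic as a short sketch. The 30-second pause is an arbitrary value picked for illustration, not a measured maintenance time, and `added_run_time` is a hypothetical helper name.

```python
def added_run_time(wait_seconds, runs=1000):
    """Extra seconds spent pausing, with one pause in each gap between runs."""
    return wait_seconds * (runs - 1)


extra = added_run_time(30)  # 30 s is an arbitrary example pause
print(extra, "seconds, or about", round(extra / 3600, 1), "hours")
```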


I love these detailed analyses!

In addition, I have something of an obsession with compression. Accordingly, I took your ten .png files and compressed them more... attached to this message. *shrug* Anyone else who loves lossless (or lossy) compression of images and many other file types might appreciate FileOptimizer or the compression discussions and software from encode's forums. Encode also makes a good, small file hasher.
 

Attachments

  • 10images.zip
    240.6 KB