I ran ReadSpeed 1,000 times


ColbyBouma

Well-known member
Dec 26, 2020
UPDATE: The strange results are isolated to 1 SSD. See Run 4 for more details.

I wrote the following batch file to run ReadSpeed (rel 1) 1,000 times in a row on my HP DM1Z laptop, which has a 128 GB Samsung PM830 SSD.

Code:
:LOOP
        RS.EXE
        IF NOT EXIST RS999.TXT GOTO LOOP

The results were WAY more interesting than I expected! It started off fine, but on run 78 the numbers took a nosedive. I happened to watch this one while it was happening. The bar at the top was noticeably slower, but fairly smooth. However, on the subsequent runs where the numbers were a little lower, there was noticeable stuttering in the top bar. Also, after a couple hours, there was a noticeable hang between runs, and my USB drive's activity light was flashing like crazy. That's strange, because the log files usually write instantly.

I have a few theories:
  • Overheating
    • This is a laptop, and the fan was loud during the whole run. However, I didn't notice any slowdown like this when I was filling the drive with random data or zeroes.
  • Memory leak
    • Is it possible that ReadSpeed doesn't release memory after it runs since it doesn't expect to be run more than a couple times?
  • Internal drive maintenance
    • Could the drive be doing something in the background?
Here's a chart of the 0% column. Trying to show all 5 columns at once just creates a jumbled mess :) The CSV is attached below for anyone who wants to create their own charts.

1609553148581.png


Here's the PowerShell code I used to create the CSV file. This was a 1-liner, but I cleaned it up for this post.
Code:
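# Grab every .TXT log file in the current folder and loop through them.
# Filtering on *.txt keeps the output file (1000.csv) from being picked up by the loop.
Get-ChildItem *.txt | ForEach-Object {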

    # Read the file and return just the 9th line (PowerShell uses 0-based indexing).
    $Line9 = (Get-Content $_.FullName)[8]

    # Return an array of the elements that are not spaces.
    # '\s+' is regex for "1 or more whitespace characters".
    $Elements = $Line9 -split '\s+'

    # Grab the 5 results. The split leaves an empty element at the end, otherwise the range would have been -5 to -1.
    # -1 is the last element of an array, -2 is second to last, etc.
    # .. is the range operator.
    $RsResults = $Elements[-6..-2]

    # Turn the array into a string, using the characters in '' as a delimiter
    $CsvString = $RsResults -join ', '
  
    # Add the string to the CSV file. The first line will create the file.
    $CsvString | Out-File -FilePath 1000.csv -Append

}
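
For anyone who would rather not use PowerShell, here is a minimal Python sketch of the same idea: split a result line on whitespace and count back from the end. The function name `parse_result_line` is made up for this example, and the sample line is the BX100 result posted later in this thread. Counting from the end matters because the drive identity can itself contain spaces.

```python
def parse_result_line(line: str) -> list[float]:
    """Return the five benchmark values from one ReadSpeed result line."""
    # str.split() with no argument splits on runs of whitespace and
    # drops leading/trailing empties, so there is no empty last element.
    fields = line.split()
    # The drive identity may contain spaces, so take the last five fields.
    return [float(field) for field in fields[-5:]]


# Sample line copied from the BX100 result later in this thread.
sample = " 81  120GB CT120BX100SSD1                520.2   542.4   420.4   431.2   542.5"
values = parse_result_line(sample)
print(", ".join(f"{v:.1f}" for v in values))  # 520.2, 542.4, 420.4, 431.2, 542.5
```

This only parses a single line. Picking the right line out of each log file (the 9th, in the PowerShell above) is left out on purpose.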

SMART data
Run 2
Run 3
Run 4
Run 5
 

Attachments

  • 1000.csv.txt
    68.4 KB
  • 1000 Runs 1.zip
    446.9 KB
@Dagannoth : Wow!! VERY cool work. Nice going! And, FWIW, I don't think that temperature explains what you were seeing because the pattern is extremely regular. It appears that the 1st and 3rd "dips" are missing. But then the pattern really sets in. This is fascinating. And let's remember that this is just READING the memory. Reading NAND flash is non-destructive and SSDs are rated to do that extensively.
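
One way to put a number on how regular the dips are would be to find where each dip starts and look at the spacing between them. The sketch below is only an illustration: `find_dip_starts`, `gaps`, and the 90%-of-median threshold are assumptions made for this example, and the data is synthetic, not the values from the attached CSV.

```python
from statistics import median


def find_dip_starts(speeds, fraction=0.9):
    """Return the indexes where a stretch of below-threshold runs begins."""
    # Hypothetical threshold: anything under 90% of the median counts as a dip.
    threshold = fraction * median(speeds)
    starts = []
    in_dip = False
    for i, speed in enumerate(speeds):
        below = speed < threshold
        if below and not in_dip:
            starts.append(i)
        in_dip = below
    return starts


def gaps(starts):
    """Return the number of runs between consecutive dip starts."""
    return [b - a for a, b in zip(starts, starts[1:])]


# Synthetic data, NOT the real results:
# 100 runs at 250 MB/s with a 3-run dip to 150 MB/s every 20 runs.
speeds = [250.0] * 100
for start in range(10, 100, 20):
    for k in range(3):
        speeds[start + k] = 150.0

starts = find_dip_starts(speeds)
print(starts)        # [10, 30, 50, 70, 90]
print(gaps(starts))  # [20, 20, 20, 20]
```

If the gaps from the real 0% column came out nearly constant, that would support a periodic cause over a thermal one. The threshold would likely need tuning for real data.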
 
It's most likely heat related, you really stressed your laptop out... and potentially lowered its life span accordingly.
I'm not worried about this system. It's an old laptop I haven't used for years, and an old SSD I got for cheap as part of a box of SSDs from eBay, so I am more than willing to sacrifice it in the name of science :D
 
Maybe run it in a refrigerator or somewhere cold to see how temperature related it is.
 
I calculated the time between runs based on the creation time of the log files. As I suspected from watching the later runs, the gap got larger as time went on. Interestingly, the same spikes are present in this chart. (EDIT: I just realized this makes perfect sense. The slower runs take longer 🤦‍♂️)

The chart contains 998 data points because the first line will always be 0, and I threw away the second line because my first test (RS000.TXT) was a few minutes before the rest of them since I wrote LOOP.BAT after it. I should have deleted that first log file before running LOOP.BAT, but I forgot.

Theory #4: my laptop was overheating, not the SSD. I am going to run this test on a desktop that has plenty of airflow to test the heat theory.

1609560731987.png


Here's the PowerShell code I used to generate this CSV.
Code:
Get-ChildItem *.txt | ForEach-Object {

    $TimeStamp = (Get-ItemProperty $_.FullName).LastWriteTime

    if ( $PreviousTime ) {
        $Result = ($TimeStamp - $PreviousTime).TotalSeconds
    } else {
        $Result = 0
    }

    $Result | Out-File -FilePath TimeBetweenRuns.csv -Append

    $PreviousTime = $TimeStamp

}
 

Attachments

  • TimeBetweenRuns.csv.txt
    7.8 KB
I think we can rule out overheating. Here are the results from moving the SSD to a desktop that has a case fan blowing directly over the SSD. This run looks very similar to the first run. Also, I overwrote the drive with zeroes first, just like I did before the first run.

1609590115710.png


Here's column 0 from run 1 and 2.
1609590973552.png


Run 3 is already underway. I want to see how much of a difference it makes when I tell ReadSpeed the filename to use, instead of RS having to read the drive to determine the next filename. This requires the previous 1,000 log files to be left behind.

Code:
FOR %%A IN (*.TXT) DO RS.EXE %%A

Run 4 will be with a different drive. I've determined this behavior is not linked to the computer, so now I want to see if this drive is just weird, or if this is an artifact of running RS way too many times.
 

Attachments

  • 1000-2.csv.txt
    68.4 KB
  • 1000 Runs 2.zip
    448.2 KB
Here are the results from Run 3. As I expected, supplying the filename didn't affect the dips. My main goal was to reduce the total run time, which I did. This run took 3:55 (h:mm), whereas the previous runs took 7:27 and 5:11. One odd thing that happened is that the loop started at RS005.TXT, so I had to edit the CSV file.

1609647534288.png


Run 2 and 3 together.
1609647728936.png


The time between runs looks MUCH better now. I had to remove a few outliers due to the log file order problem. Also, I had to switch to LastWriteTime in my PowerShell code since I re-used the log files. I will update my code in the previous post.
1609648364416.png


I want to find a better way to create a loop. Unfortunately, FOR /L and SET /A aren't supported in FreeDOS.
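
One possible workaround, sketched here in Python and meant to be run on a different machine, is to skip the loop entirely and generate an unrolled batch file with one explicit line per run. This assumes RS.EXE takes the log filename as its only argument, as in the FOR loop above. The function name `build_unrolled_batch` is made up for this example.

```python
def build_unrolled_batch(runs: int = 1000, exe: str = "RS.EXE") -> str:
    """Build the text of a batch file with one explicit RS.EXE line per run."""
    lines = ["@ECHO OFF"]
    for i in range(runs):
        # Zero-pad to three digits so the names are RS000.TXT through RS999.TXT.
        lines.append(f"{exe} RS{i:03d}.TXT")
    # DOS batch files conventionally use CR+LF line endings.
    return "\r\n".join(lines) + "\r\n"


script = build_unrolled_batch()
preview = script.splitlines()
print(preview[1])    # RS.EXE RS000.TXT
print(preview[-1])   # RS.EXE RS999.TXT
print(len(preview))  # 1001
```

The resulting text can be saved to a .BAT file (opened with `newline=""` so Python does not translate the line endings again) and copied to the boot drive. It is not elegant, but it avoids needing FOR /L or SET /A at all.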
 

Attachments

  • 1000.csv.txt
    58.4 KB
  • 1000 Runs 3.zip
    448 KB
Milton from the newsgroup wanted some information about this PM830 SSD, so here's the SMART data for it.

Code:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.4-pmagic64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     SAMSUNG SSD PM830 2.5" 7mm 128GB
Serial Number:    S0TYNSBCB03775
LU WWN Device Id: 5 002538 043584d30
Firmware Version: CXM03D1Q
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jan  3 04:54:23 2021 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (  540) seconds.
Offline data collection
capabilities:              (0x53) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (   9) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       18414
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       2087
175 Program_Fail_Count_Chip 0x0032   100   100   010    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   010    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0013   091   091   010    Pre-fail  Always       -       310
178 Used_Rsvd_Blk_Cnt_Chip  0x0013   095   095   010    Pre-fail  Always       -       92
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   095   095   010    Pre-fail  Always       -       172
180 Unused_Rsvd_Blk_Cnt_Tot 0x0013   095   095   010    Pre-fail  Always       -       3860
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 CRC_Error_Count         0x003e   253   253   000    Old_age   Always       -       106
232 Available_Reservd_Space 0x0013   095   095   000    Pre-fail  Always       -       1924
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       21016805020
242 Total_LBAs_Read         0x0032   099   099   000    Old_age   Always       -       76913913611

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     18402         -
# 2  Extended offline    Completed without error       00%      1296         -
# 3  Short offline       Completed without error       00%      1232         -
# 4  Short offline       Completed without error       00%         2         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
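
To put attributes 241 and 242 in perspective, here is a rough conversion to terabytes. It assumes the raw values count 512-byte LBAs, which matches the sector size reported above but is not something smartctl states for this drive, so treat the result as an estimate. `lbas_to_tb` is a name made up for this example.

```python
SECTOR_BYTES = 512  # assumption: raw values count 512-byte logical sectors


def lbas_to_tb(lbas: int, sector_bytes: int = SECTOR_BYTES) -> float:
    """Convert an LBA count to decimal terabytes (1 TB = 10**12 bytes)."""
    return lbas * sector_bytes / 1e12


written = lbas_to_tb(21_016_805_020)  # raw value of Total_LBAs_Written
read = lbas_to_tb(76_913_913_611)     # raw value of Total_LBAs_Read
print(f"written: {written:.2f} TB, read: {read:.2f} TB")
# written: 10.76 TB, read: 39.38 TB
```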
 
Run 4 is complete. I swapped out the PM830 for a 120 GB Crucial BX100 SSD. I overwrote it with zeroes first, just like the previous runs.

The results are what I originally expected Run 1 to look like. I believe this shows that the strange behavior is entirely isolated to the PM830. I'm still very curious what's going on, but at least this strange behavior isn't due to ReadSpeed.

The chart with all 5 columns is readable, so I went with it instead of just column 0.
1609673472048.png


Also, I got tired of fighting with batch scripts, so I switched to QBasic. Here's the script I wrote, LOOP.BAS:
Code:
FOR i% = 0 TO 999
        ' https://www.tek-tips.com/viewthread.cfm?qid=565458
        padded$ = RIGHT$("000" + LTRIM$(STR$(i%)), 3)
        PRINT padded$
        SLEEP 1

        name$ = "RS" + padded$ + ".TXT"
        SHELL "RS.EXE " + name$
NEXT i%

Here's the SMART data for the BX100. I grabbed this before starting Run 4.
Code:
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.5.4-pmagic64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Silicon Motion based SSDs
Device Model:     CT120BX100SSD1
Serial Number:    1534F00B31FC
LU WWN Device Id: 5 00a075 1f00b31fc
Firmware Version: MU02
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan  2 23:22:04 2021 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (    0) seconds.
Offline data collection
capabilities:              (0x71) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0002)    Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  10) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x0035)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   100   100   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0000   100   100   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       1411
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       86
160 Uncorrectable_Error_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
161 Valid_Spare_Block_Cnt   0x0000   100   100   000    Old_age   Offline      -       109
163 Initial_Bad_Block_Count 0x0000   100   100   000    Old_age   Offline      -       19
164 Total_Erase_Count       0x0000   100   100   000    Old_age   Offline      -       33662
165 Max_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       97
166 Min_Erase_Count         0x0000   100   100   000    Old_age   Offline      -       1
167 Average_Erase_Count     0x0000   100   100   000    Old_age   Offline      -       30
168 Max_Erase_Count_of_Spec 0x0000   100   100   000    Old_age   Offline      -       2000
169 Remaining_Lifetime_Perc 0x0000   100   100   000    Old_age   Offline      -       100
175 Program_Fail_Count_Chip 0x0000   100   100   000    Old_age   Offline      -       0
176 Erase_Fail_Count_Chip   0x0000   100   100   000    Old_age   Offline      -       0
177 Wear_Leveling_Count     0x0000   100   100   000    Old_age   Offline      -       1
178 Runtime_Invalid_Blk_Cnt 0x0000   100   100   000    Old_age   Offline      -       0
181 Program_Fail_Cnt_Total  0x0000   100   100   000    Old_age   Offline      -       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      -       0
192 Power-Off_Retract_Count 0x0000   100   100   000    Old_age   Offline      -       34
194 Temperature_Celsius     0x0000   100   100   000    Old_age   Offline      -       27
195 Hardware_ECC_Recovered  0x0000   100   100   000    Old_age   Offline      -       119397
196 Reallocated_Event_Count 0x0000   100   100   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0000   100   100   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0000   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0000   100   100   000    Old_age   Offline      -       0
232 Available_Reservd_Space 0x0000   100   100   000    Old_age   Offline      -       100
241 Host_Writes_32MiB       0x0000   100   100   000    Old_age   Offline      -       84601
242 Host_Reads_32MiB        0x0000   100   100   000    Old_age   Offline      -       79756
245 TLC_Writes_32MiB        0x0000   100   100   000    Old_age   Offline      -       134648

SMART Error Log not supported

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Also, I ran ReadSpeed once before overwriting the drive with zeroes.
Code:
  +--------------------------------------------------------------------------+
  | ReadSpeed: Hyper-accurate mass storage read-performance benchmark. rel 1 |
  |  Benchmarked values are in megabytes read per second at five locations.  |
  +--------------------------------------------------------------------------+

Driv Size  Drive Identity     Location:    0      25%     50%     75%     100
---- ----- ---------------------------- ------- ------- ------- ------- -------
 81  120GB CT120BX100SSD1                520.2   542.4   420.4   431.2   542.5

                  Benchmarked: Sunday, 2021-01-03 at 06:01
 -----------------------------------------------------------------------------
   See the ReadSpeed forums at forums.grc.com for help and community support.
 

Attachments

  • 1000 Runs 4.zip
    428.2 KB
  • 1000.csv.txt
    68.4 KB
@DiskTuna I didn't read the whole paper, just the abstract. But, that's interesting ... and disturbing.

May your bits be stable and your interfaces be fast. :cool: Ron
 
Run 5. This time I rebooted after every test. Unfortunately, the results are pretty boring :ROFLMAO:

0% is lower because I installed Windows on this PC, so that region is no longer empty.
1610571690224.png


Here's how long each run took:
1610571730877.png


Here's the script I wrote, REBOOT.BAS. It took me a while because I am very rusty with QBasic.
Code:
File$ = "COUNT.DAT"

' Read the current count.
OPEN File$ FOR INPUT AS #1
INPUT #1, Count
CLOSE #1

' https://www.tek-tips.com/viewthread.cfm?qid=565458
padded$ = RIGHT$("00" + LTRIM$(STR$(Count)), 3)
PRINT padded$
SLEEP 1

' Build the filename, and run RS.
name$ = "RS" + padded$ + ".TXT"
SHELL "RS.EXE " + name$

' Increment the count and write it to the file.
Count = Count + 1
OPEN File$ FOR OUTPUT AS #1
WRITE #1, Count
CLOSE #1

IF Count < 1000 THEN
        ' The "REBOOT" alias isn't recognized for some reason.
        SHELL "FDAPM WARMBOOT"
ELSE
        ' Delete the count file.
        KILL File$

        PRINT "Done"
        
        ' End without "press any key to continue".
        SYSTEM
END IF

Here's my AUTOEXEC.BAT:
Code:
@echo off
SET PATH=\;\FREEDOS

IF EXIST COUNT.DAT GOTO run

ECHO 0 > COUNT.DAT

:run
    QBASIC.EXE /RUN REBOOT.BAS
 

Attachments

  • 1000 Runs 5.zip
    430.7 KB
  • 1000.csv.txt
    68.4 KB
@DiskTuna I didn't read the whole paper, just the abstract. But, that's interesting ... and disturbing.
I think reading the Conclusion (rather short, on page 12) might be helpful as well, as a truncated summary of the many specifics discussed in the paper. 😤😷🙌

I also think my FireCuda drive—with its NAND cache—will probably also be affected by those errors as time goes on.
 