Seagate's Seek Error Rate, Raw Read Error Rate, and Hardware ECC Recovered SMART attributes
09-14-2011 11:44 PM - last edited on 09-15-2011 01:28 AM

Seagate's Seek Error Rate, Raw Read Error Rate, and Hardware ECC Recovered SMART attributes create a lot of anxiety amongst Seagate users. This is because the raw values are typically very high, and the normalised values (Current / Worst / Threshold) are usually quite low. Despite this, the numbers in most cases are perfectly OK.

The anxiety arises because we intuitively expect the normalised values to reflect a "health" score, with 100 being the ideal value, and the raw values to reflect an error count, in which case 0 would be the most desirable value. However, Seagate calculates and applies these attribute values in a counterintuitive way. In fact the normalised values of Seagate's Seek Error Rate, Raw Read Error Rate, and Hardware ECC Recovered attributes are logarithmic, not linear, and the raw values are sector counts or seek counts, not error counts.

Seagate's SMART documentation is not publicly available. The following information has not been gleaned from any official source; it is based on my own testing and observation, and on testing by others, so it may contain errors.

Seek Error Rate

The raw value of each SMART attribute occupies 48 bits. Seagate's Seek Error Rate attribute consists of two parts -- a 16-bit count of seek errors in the uppermost 4 nibbles, and a 32-bit count of seeks in the lowermost 8 nibbles. To see these data we need a SMART utility that reports all 48 bits, preferably in hexadecimal. Two such utilities are HD Sentinel and HDDScan.

I believe the relationship between the raw and normalised values of the SER attribute is given by ...

normalised SER = -10 log (lifetime seek errors / lifetime seeks)

If the drive has recorded no errors, we still need to set the number of errors to 1, otherwise the logarithm would be undefined.

The following table correlates the normalised SER against the actual error rate:

90 = <= 1 error per 1000 million seeks
80 = <= 1 error per 100 million seeks
70 = <= 1 error per 10 million seeks
60 = <= 1 error per million seeks
50 = 10 errors per million seeks
40 = 100 errors per million seeks
30 = 1000 errors per million seeks
20 = 10 errors per thousand seeks

A drive that has not yet recorded 1 million seeks will show 100 and 253 for the Current and Worst values. I believe this is because the data are not considered statistically significant until the drive has recorded 1 million seeks. When this target is reached, the values drop to 60 and 60, assuming there have been no errors.

By way of example, here are the SMART data for my 13GB Seagate HDD:
http://www.users.on.net/~fzabkar/SmartUDM/13GB.RPT

Attribute          ID  Threshold  Value  Worst  Raw
===================================================
Seek Error Rate     7      30       53     38   052E0E3000EC

The number of lifetime seek errors is 0x052E (uppermost 4 nibbles) = 1326, and the number of lifetime seeks is 0x0E3000EC (lowermost 8 nibbles) = 238 026 988.

Using Google's calculator ...

http://www.google.com/search?q=0x052E+in+decimal
http://www.google.com/search?q=0x0E3000EC+in+decimal

Applying the formula ...

normalised SER = -10 log (0x052E / 0x0E3000EC)

http://www.google.com/search?q=-10+log+(0x052E+/+0x0E3000EC)

... we get a result of 52.54, which rounds to the drive's reported Value of 53.
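If you'd rather not do the arithmetic by hand, here is a minimal Python sketch of the above calculation. The 16-bit/32-bit split of the raw value reflects my own reading of the attribute, not any official Seagate layout:

import math

def normalised_ser(raw_hex):
    # Uppermost 4 nibbles = 16-bit seek error count (my inference)
    # Lowermost 8 nibbles = 32-bit lifetime seek count (my inference)
    raw = int(raw_hex, 16)
    seek_errors = raw >> 32
    seeks = raw & 0xFFFFFFFF
    errors = max(seek_errors, 1)   # log 0 is undefined
    return -10 * math.log10(errors / seeks), seek_errors, seeks

print(normalised_ser("052E0E3000EC"))
# -> (52.54..., 1326, 238026988)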
Here is a second example:
http://www.users.on.net/~fzabkar/SmartUDM/120GB.RPT

Attribute          ID  Threshold  Value  Worst  Raw
===================================================
Seek Error Rate     7      30       79     60   00000580A6AC

The above drive is in fact error free. It has recorded 0x0580A6AC seeks (= 92 million) without error. Applying the formula ...

normalised SER = -10 log (1 / 0x0580A6AC)

... we get a result of 79.65. Note that we have used 1 instead of 0 for the error count, because log 0 is undefined.

Raw Read Error Rate and Hardware ECC Recovered

The raw values of the RRER and HER attributes represent a sector count, not an error count. This figure rolls over to 0 once the count reaches about 250 million. I suspect that the drive records the total number of errors in each block of 250 million sectors, and then recalculates the normalised value of each attribute accordingly. This would mean that RRER and HER are updated on a rolling-average basis rather than a lifetime basis. I'm almost certain that the normalised values are also logarithmic, but I'm not sure how they are calculated. The figure of 250 million sectors applies to the 7200.11 and DiamondMax 22 models, but may not apply to all.

While writing this article I came upon a Seagate document entitled "Diagnostic Commands". It doesn't discuss SMART attributes, but it refers to an "Error Recovery Usage Rate" and defines it as ...

Error Recovery Usage Rate = -log10 { (number of sectors in which controller invoked specified error recovery scheme) / [ (number of sectors transferred) x (512 bytes/sector) x (8 bits/byte) ] }

This lends support to my Seek Error Rate formula, and suggests that the RRER and HER attributes may be calculated in a similar way. The document also mentions (but does not discuss) 5 different error recovery schemes:

HARD = multiple retries invoked and failed
FIRM = multiple retries invoked
SOFT = 5 retries invoked
OTF  = 1 retry invoked (On The Fly)
RAW  = OTF ECC invoked

"On The Fly" means that errored data are corrected using the ECC bytes, without an additional access of the platters.

Based on the abovementioned Error Recovery Usage Rate formula, I now postulate that the normalised value of the Raw Read Error Rate attribute could be calculated as follows:

normalised RRER = -10 log (number of errored sectors / total bits transferred)

The total number of bits in a 250-million-sector block is ...

(250 million sectors) x (512 bytes/sector) x (8 bits/byte) = 1.024 x 10^12

It seems to me that it makes more sense to use a round figure, say 10^12. If we now let the number of errors equal 0 (or 1), then we have ...

max normalised RRER = -10 log (1 / 10^12) = 120

Similarly, if we let the number of errors equal 250 million (ie every sector is errored), the ratio becomes 1 errored sector per 4096 bits, so we have ...

min normalised RRER = -10 log (1 / 4096) = 36

Therefore, if my hypothesis is correct, we would expect the threshold value of the RRER attribute to be 36, and its maximum possible value to be 120. In fact my Internet research tends to confirm a maximum of 120 for 7200.11 models, but the threshold figure is 34.
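For what it's worth, here is the same postulate expressed as a Python sketch. Everything in it -- the formula, the 250-million-sector window, and the error-count floor of 1 -- is hypothesis rather than documented Seagate behaviour:

import math

SECTOR_BITS = 512 * 8          # 4096 bits per sector

def normalised_rrer(errored_sectors, sectors=250_000_000):
    # Postulated formula: errored sectors per bit transferred,
    # on a -10*log10 scale
    errors = max(errored_sectors, 1)   # log 0 is undefined
    return -10 * math.log10(errors / (sectors * SECTOR_BITS))

print(normalised_rrer(0))              # ~120, error-free window
print(normalised_rrer(250_000_000))    # ~36, every sector errored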
FWIW, here are the numbers for my own Seagate drives:

Attribute               ID   Threshold  Value  Worst  Raw           Model
==========================================================================
Raw Read Error Rate      1       6       114    100   00000386EBBA  ST3320620A
Raw Read Error Rate      1       6        64     62   00000AFD20E3  ST3120026A
Raw Read Error Rate      1      34        77     66   000007820F8F  ST340016A
Raw Read Error Rate      1       0        79     78   00000753BA8E  ST313021A
Hardware ECC recovered 195       0       100     63   00000C62F66E  ST3320620A
Hardware ECC recovered 195       0        64     62   00000AFD20E3  ST3120026A
Hardware ECC recovered 195       0        77     66   000007820F8F  ST340016A
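As a rough sanity check of the rollover theory, the RRER raw values in this table can be decoded as plain sector counts (again, that interpretation is my own, not official):

for model, raw_hex in [("ST3320620A", "00000386EBBA"),
                       ("ST3120026A", "00000AFD20E3"),
                       ("ST340016A",  "000007820F8F"),
                       ("ST313021A",  "00000753BA8E")]:
    # decode the 48-bit raw value as a single sector count
    print(f"{model}: {int(raw_hex, 16):,} sectors")

All four counts come out below 250 million, which is at least consistent with the rollover hypothesis.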