LinuxQuestions.org - sed - Replace numbers from one to several digits

Hello,

I am currently learning how to use sed and want to format a sample output of smartctl.

This is the output:

Code:

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x000f  099  099  051    Pre-fail  Always      -      2376

  3 Spin_Up_Time            0x0007  091  091  011    Pre-fail  Always      -      3620

  4 Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      405

  5 Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x000f  253  253  051    Pre-fail  Always      -      0

  8 Seek_Time_Performance  0x0025  100  100  015    Pre-fail  Offline      -      0

  9 Power_On_Hours          0x0032  100  100  000    Old_age  Always      -      717

 10 Spin_Retry_Count        0x0033  100  100  051    Pre-fail  Always      -      0

 11 Calibration_Retry_Count 0x0012  100  100  000    Old_age  Always      -      0

 12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      405

 13 Read_Soft_Error_Rate    0x000e  099  099  000    Old_age  Always      -      2375

183 Runtime_Bad_Block      0x0032  100  100  000    Old_age  Always      -      0

184 End-to-End_Error        0x0033  100  100  000    Pre-fail  Always      -      0

187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      2375

188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0

190 Airflow_Temperature_Cel 0x0022  084  074  000    Old_age  Always      -      16 (Lifetime Min/Max 16/16)

194 Temperature_Celsius    0x0022  084  071  000    Old_age  Always      -      16 (Lifetime Min/Max 16/16)

195 Hardware_ECC_Recovered  0x001a  100  100  000    Old_age  Always      -      3558

196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      0

197 Current_Pending_Sector  0x0012  098  098  000    Old_age  Always      -      81

198 Offline_Uncorrectable  0x0030  100  100  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      1

200 Multi_Zone_Error_Rate  0x000a  100  100  000    Old_age  Always      -      0

201 Soft_Read_Error_Rate    0x000a  253  253  000    Old_age  Always      -      0

My goal is to remove the ID numbers (so 1 - 201).
At the moment my command removes just the first digit of every number:

Code:

root@localhost:/var/prtg/scriptsxml# smartctl | grep -A25 'Vendor' | sed -n -E 's/(\s*)[0-9](\s*)//p'

Raw_Read_Error_Rate    0x000f  099  099  051    Pre-fail  Always      -      2376

Spin_Up_Time            0x0007  091  091  011    Pre-fail  Always      -      3620

Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      405

Reallocated_Sector_Ct  0x0033  100  100  010    Pre-fail  Always      -      0

Seek_Error_Rate        0x000f  253  253  051    Pre-fail  Always      -      0

Seek_Time_Performance  0x0025  100  100  015    Pre-fail  Offline      -      0

Power_On_Hours          0x0032  100  100  000    Old_age  Always      -      717

0 Spin_Retry_Count        0x0033  100  100  051    Pre-fail  Always      -      0

1 Calibration_Retry_Count 0x0012  100  100  000    Old_age  Always      -      0

2 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      405

3 Read_Soft_Error_Rate    0x000e  099  099  000    Old_age  Always      -      2375

83 Runtime_Bad_Block      0x0032  100  100  000    Old_age  Always      -      0

84 End-to-End_Error        0x0033  100  100  000    Pre-fail  Always      -      0

87 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      2375

88 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0

90 Airflow_Temperature_Cel 0x0022  084  074  000    Old_age  Always      -      16 (Lifetime Min/Max 16/16)

94 Temperature_Celsius    0x0022  084  071  000    Old_age  Always      -      16 (Lifetime Min/Max 16/16)

95 Hardware_ECC_Recovered  0x001a  100  100  000    Old_age  Always      -      3558

96 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      0

97 Current_Pending_Sector  0x0012  098  098  000    Old_age  Always      -      81

98 Offline_Uncorrectable  0x0030  100  100  000    Old_age  Offline      -      0

99 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      1

00 Multi_Zone_Error_Rate  0x000a  100  100  000    Old_age  Always      -      0

01 Soft_Read_Error_Rate    0x000a  253  253  000    Old_age  Always      -      0

How do I rewrite the command to remove two- and three-digit numbers?

There are several ways to do that:

Code:

sed 's/^....//'            # remove the first 4 characters

sed 's/^\s*[0-9]* //'      # remove leading whitespace, any digits and one space (\s is a GNU extension)

sed -r 's/^\s*[0-9]+\s//'  # probably works too (-r is GNU sed's older spelling of -E)
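To see how the first two variants behave, here is a minimal sketch run against three lines in the style of the smartctl output above. The `sample` variable and its shortened lines are made up for illustration, and `[[:space:]]` stands in for `\s` so the sketch does not depend on GNU sed.

```shell
# Illustrative sample: one-, two- and three-digit IDs, right-aligned in a 3-character column.
sample='  1 Raw_Read_Error_Rate 0x000f
 10 Spin_Retry_Count 0x0033
201 Soft_Read_Error_Rate 0x000a'

# Variant 1: drop the first four characters, whatever they are.
by_width=$(printf '%s\n' "$sample" | sed 's/^....//')

# Variant 2: drop leading whitespace, then any digits, then one space.
by_pattern=$(printf '%s\n' "$sample" | sed 's/^[[:space:]]*[0-9]* //')

printf '%s\n' "$by_width"
printf '%s\n' "$by_pattern"
```

Both print the three attribute lines without their IDs. The fixed-width variant only works as long as the ID column stays four characters wide, while the pattern variant follows the digits themselves.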

Well, first, if you want to skip the first line of the output:

Code:

sed -n -E '1d; s/(\s*)[0-9]//p'

But that still leaves you with your original problem. The sed language can't really do comparisons or calculations, so I would suggest AWK instead.

Code:

smartctl | awk '$1+0<100'
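As a rough sketch of what that awk filter does, here it is applied to a made-up three-line sample (the `sample` variable is only for illustration). The second command is a variation of mine, on the assumption that header lines should not pass the filter.

```shell
# Hypothetical sample: header plus a two-digit and a three-digit ID.
sample='ID# ATTRIBUTE_NAME
 12 Power_Cycle_Count
183 Runtime_Bad_Block'

# As posted: a non-numeric first field such as "ID#" evaluates to 0, so the header is printed too.
as_posted=$(printf '%s\n' "$sample" | awk '$1+0<100')

# Variation (an assumption about what is wanted): require a purely numeric first field.
numeric_only=$(printf '%s\n' "$sample" | awk '$1 ~ /^[0-9]+$/ && $1+0 < 100')

printf '%s\n' "$as_posted"
printf '%s\n' "$numeric_only"
```

Note that this selects whole lines by the value of the ID; it does not remove the ID column.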

Back to the original question, if you are not looking for a criterion more specific than the number of digits in the first column, then use interval operators,

Code:

smartctl | sed -n -E '/^[[:space:]]*[0-9]{1,2}[[:space:]]/p'

# or

smartctl | sed -n -E '/^[[:space:]]*[0-9]{1,2}\b/p'
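A small sketch of the interval operator on made-up sample lines (the `sample` variable is illustrative only). The first command selects lines by the number of digits in the ID, as above. The second reuses the same idea with {1,3} in a substitution to strip the ID, which is my assumption about how it maps onto the original goal.

```shell
# Illustrative sample with one-, two- and three-digit IDs.
sample='  9 Power_On_Hours
 10 Spin_Retry_Count
183 Runtime_Bad_Block'

# Print only the lines whose ID has one or two digits.
short_ids=$(printf '%s\n' "$sample" | sed -n -E '/^[[:space:]]*[0-9]{1,2}[[:space:]]/p')

# Remove an ID of one to three digits, plus the whitespace around it.
stripped=$(printf '%s\n' "$sample" | sed -E 's/^[[:space:]]*[0-9]{1,3}[[:space:]]+//')

printf '%s\n' "$short_ids"
printf '%s\n' "$stripped"
```

The 183 line is skipped by the first command because after at most two digits the next character is another digit, not whitespace.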

My solution is not too different from what the others have posted, except I combined the characters into a character class.

Code:

sed -E 's/^[[:space:][:digit:]]+//g'

This matches and removes one or more whitespace/numeric characters from the beginning of each line (the + needs -E).

Quote:

Originally Posted by individual (Post 6075631)

My solution is not too different from what the others have posted, except I combined the characters into a character class.

Code:

sed -E 's/^[[:space:][:digit:]]+//g'

This matches and removes one or more whitespace/numeric characters from the beginning of each line (the + needs -E).

Since your pattern specifies the beginning of the line, isn't the trailing "g" superfluous?

To replace from one to several digits, use the + quantifier combined with [0-9] (because sed doesn't support \d and [[:digit:]] is unnecessarily verbose). Perl-style regex engines also offer a lazy variant, +?, but sed's POSIX regular expressions have no lazy quantifiers.

+ means "at least one, as many as possible"
+? means "at least one, as few as possible" (not available in sed)

Contrast these with *, which means "none required, as many as possible", and *?, which means "none required, as few as possible" (also not available in sed).

The difference between greedy and lazy quantifiers can be subtle. In your example it would not matter which is used, but there are other situations where it does matter, so it's a useful thing to keep in mind when working with engines that support both.

Note that sed needs extended mode (-E) for an unescaped + to work (GNU sed also accepts \+ in basic mode); * works in both modes.
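A quick way to see the effect of -E on the + quantifier, sketched with a single made-up input line (the `line` variable is illustrative):

```shell
line='201 Soft_Read_Error_Rate'

# Basic mode: + is an ordinary character, so nothing matches and the line is unchanged.
basic_plus=$(printf '%s\n' "$line" | sed 's/^[0-9]+ //')

# Extended mode: + means "one or more", and being greedy it takes all three digits.
extended_plus=$(printf '%s\n' "$line" | sed -E 's/^[0-9]+ //')

# * is a quantifier in both modes, so this works without -E.
basic_star=$(printf '%s\n' "$line" | sed 's/^[0-9]* //')

printf '%s\n' "$basic_plus" "$extended_plus" "$basic_star"
```

The first command prints the line untouched; the other two print it without the ID.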

So, to handle lines starting with optional spaces, then at least one digit, then a space:

Code:

sed -E 's/^ *[0-9]+ //'
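For illustration, here is that command applied to a shortened, made-up version of the table (the `sample` variable is not real smartctl output). It is only a sketch, but it shows that the ^ anchor keeps the later numeric columns intact and that the header, which has no leading digits, is left alone.

```shell
# Illustrative sample: header line plus two attribute lines with numeric columns.
sample='ID# ATTRIBUTE_NAME VALUE RAW_VALUE
  1 Raw_Read_Error_Rate 099 2376
190 Airflow_Temperature_Cel 084 16'

# Optional leading spaces, at least one digit, one space: only the ID column matches.
result=$(printf '%s\n' "$sample" | sed -E 's/^ *[0-9]+ //')

printf '%s\n' "$result"
```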

One final note: the above uses literal space characters. Both \s and the long-winded [[:space:]] also match five whitespace characters other than space (tab, newline, vertical tab, form feed and carriage return), so for clarity they should be avoided when only looking for spaces.
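The difference shows up as soon as the input contains a tab. A minimal sketch, using a made-up line indented with a tab (the `tabbed` variable is illustrative; smartctl itself pads with spaces in the output shown above):

```shell
# A line indented with a tab instead of spaces (printf expands \t).
tabbed=$(printf '\t12 Power_Cycle_Count')

# Literal spaces only: the tab is not matched, so the line is left unchanged.
spaces_only=$(printf '%s\n' "$tabbed" | sed -E 's/^ *[0-9]+ //')

# [[:space:]] also matches the tab, so the ID is removed.
any_space=$(printf '%s\n' "$tabbed" | sed -E 's/^[[:space:]]*[0-9]+[[:space:]]//')

printf '%s\n' "$spaces_only" "$any_space"
```

Which behaviour is right depends on the input: the literal space makes the intent explicit, while the class is more forgiving of mixed whitespace.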

Quote:

Originally Posted by rnturn (Post 6075716)

Since your pattern specifies the beginning of the line, isn't the trailing "g" superfluous?

Indeed, good catch. I do it out of habit.
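For anyone who wants to check this, a small sketch with a made-up input line (the `line` variable is illustrative): with the ^ anchor the g flag changes nothing, whereas without the anchor it would also eat the later numeric columns.

```shell
line=' 10 Spin_Retry_Count 100 100'

# Anchored at the start of the line: at most one match is possible, with or without g.
with_g=$(printf '%s\n' "$line" | sed -E 's/^[[:space:][:digit:]]+//g')
without_g=$(printf '%s\n' "$line" | sed -E 's/^[[:space:][:digit:]]+//')

# Unanchored with g: every run of whitespace/digits on the line is removed.
unanchored=$(printf '%s\n' "$line" | sed -E 's/[[:space:][:digit:]]+//g')

printf '%s\n' "$with_g" "$without_g" "$unanchored"
```

The first two print "Spin_Retry_Count 100 100"; the unanchored version prints only "Spin_Retry_Count".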