LinuxQuestions.org


rock1961 08-02-2011 03:00 PM

awk and sed questions
 
I have an MRTG config file in which I am trying to replace some strings. The lines in question look like this:

Title[3560-1_10127]: Traffic Analysis for 10127 -- 3560-1

The second instance of 10127 on the line needs to be replaced with the string "Port 27". That's easy to do with sed, but what I want is to do it for all the ports, incrementing my way through the file. In other words, there are lines for 10101 through 10148, and I want to replace the second instance of 10101 on its line with "Port 1", then move on to the line with 10102 and replace its second instance with "Port 2", and so on.

This is breaking my brain. Do any of you have any ideas?

thanks

crts 08-02-2011 03:10 PM

Hi,

do you mean something like this?
Code:

sed -nr 's/(Title\[.*_([0-9]{3}([0-9]{2})).*)\2/\1Port \3/p' file
If this is not what you are looking for then you will have to provide a more representative example and what results you expect.

PTrenholme 08-02-2011 04:42 PM

Here's a gawk program that works for your sample:
Code:

$ cat rock1961.gawk
#!/bin/gawk -f
/\[[[:digit:]-]+_[[:digit:]]+\]/ {
  input=$0
  n=split(input,part,/[[\[\]_]/)
  if (input ~ " " part[3]) {
    port=substr(part[3],4)+0
    sub(" " part[3], " Port " port, input)
  }
  $0=input
}
{print}

and here's the output:
Code:

$ cat rock1961.txt
Title[3560-1_10127]: Traffic Analysis for 10127 -- 3560-1

$ ./rock1961.gawk rock1961.txt
Title[3560-1_10127]: Traffic Analysis for Port 27 -- 3560-1

Notes:
  1. This assumes that your port number is derived from the second part of the code in the square brackets by removing the first three digits of that part.
  2. To make the script executable, run chmod +x with the script name as its argument (chmod +x rock1961.gawk).
  3. The input is copied to /dev/stdout, with the targeted lines altered. To write it to a file, just redirect the output to whatever file you wish. (E.g., ./rock1961.gawk data.file > altered_data.file)
  4. The target lines are selected by the presence of the [nnn-nn_nnnnn], but not checked for the "Title" at the start. If that would be useful, just change the selection criteria from /\[[[:digit:]-]+_[[:digit:]]+\]/ to /Title[[:space:]]*\[[[:digit:]-]+_[[:digit:]]+\]/
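Note 4 can be sketched as a small shell function. This is not the script above but a variation under stated assumptions: the function name relabel_title_ports is made up for illustration, the selector is anchored on "Title" and uses [0-9] instead of [[:digit:]] so it should also run on awks without POSIX character classes, and the device name inside the brackets is assumed to contain no underscore.

```shell
# Hypothetical helper: relabel the port only on Title[...] lines.
# Assumes the bracketed key looks like <device>_<5-digit id> and that
# the device name itself contains no underscore.
relabel_title_ports() {
  awk '
    /^Title *\[[^_]*_[0-9]+\]/ {
      id = $0
      sub(/^Title *\[[^_]*_/, "", id)   # drop everything up to the underscore
      sub(/\].*/, "", id)               # drop the closing bracket and the rest
      port = substr(id, 4) + 0          # "10105" -> "05" -> 5
      sub(" " id, " Port " port)        # the id in brackets follows "_", not a space
    }
    { print }
  ' "$@"
}
```

Usage would be along the lines of relabel_title_ports mrtg.cfg > mrtg.cfg.new, where both file names are placeholders. Lines that do not start with "Title" pass through unchanged.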

colucix 08-02-2011 05:11 PM

In GNU awk the gensub function can replace the Nth occurrence of a matching pattern:
Code:

awk '{$0=gensub(/101([0-9][0-9])/,"Port \\1",2)}1' file
However, this leaves a leading zero on port numbers 1 to 9 (e.g. "Port 01"). It is a bit trickier if you want to remove it:
Code:

awk '{port=gensub(/.*101([0-9][0-9]).*/,"\\1","g");print gensub(/101([0-9][0-9])/,("Port " port+0),2)}' file
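
Since gensub is a gawk extension, here is a rough sketch of the same "second occurrence" idea for other awks, using a match() loop. The function name port_label_2nd is invented for this example, and the pattern 101[0-9][0-9] is carried over from the one-liners above rather than checked against a full config file.

```shell
# Hypothetical helper: replace only the 2nd occurrence of 101NN on each
# line with "Port N", without relying on gawk-only gensub().
port_label_2nd() {
  awk '
    {
      line = $0; out = ""; seen = 0
      while (match(line, /101[0-9][0-9]/)) {
        seen++
        hit = substr(line, RSTART, RLENGTH)
        if (seen == 2)
          hit = "Port " (substr(hit, 4) + 0)     # adding 0 strips the leading zero
        out  = out substr(line, 1, RSTART - 1) hit
        line = substr(line, RSTART + RLENGTH)    # continue after this match
      }
      print out line
    }
  ' "$@"
}
```

Lines with fewer than two matches are printed unchanged.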

theNbomr 08-02-2011 05:14 PM

You might want to consider always using at least the maximum number of digits required for the largest port number, with leading zeros. This makes the listing sort well, since the alphabetic form and the numeric form sort the same. Might not apply to your situation, but if it does, now is the time to get it right.

--- rod.

Edit: wouldn't you know someone would say the same thing while I was typing. Note to self: learn to type fasterer.
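
The sorting point can be illustrated with a short sketch. The helper name port_labels and the sample port numbers are made up for the demonstration; LC_ALL=C is set only to keep the sort order predictable.

```shell
# Hypothetical helper: print one label per port using the given printf
# format, then sort the labels as plain text.
port_labels() {
  fmt=$1; shift
  for n in "$@"; do
    printf "$fmt\n" "$n"
  done | LC_ALL=C sort
}

port_labels 'Port %d'   1 2 10 27   # text order: Port 1, Port 10, Port 2, Port 27
port_labels 'Port %02d' 1 2 10 27   # text order: Port 01, Port 02, Port 10, Port 27
```

With two-digit padding the text order and the numeric order agree, which is the property described above. Pass the numbers without leading zeros, since printf would read something like 08 as an invalid octal value.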

David the H. 08-02-2011 06:09 PM

sed can also replace the nth instance on a line.
Code:

sed -i -r 's/101([0-4][0-9])/Port \1/2' file
This is a little off though, since it matches everything from 10100 to 10149, not just 10101 to 10148. It sure would be nice if regex could match number sequences larger than one digit.

So you'd probably have to embed it in a shell loop instead to match each number in turn, and fool with zero-padding as well.

bash v.4's zero-padding in brace-expansion makes it relatively easy.
Code:

for num in {01..48}; do
    sed -i "s/101${num}/Port ${num#0}/2" file
done

Change the brace expansion to "{0{1..9},{10..48}}" for v.3.
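
As an alternative to the loop, the 01 to 48 range can be spelled out with alternation so that a single sed pass is enough. This is only a sketch: the function name relabel_sed is invented, -E is used as the portable spelling of GNU sed's -r, and it assumes each line carries the id exactly twice, as in the sample.

```shell
# Hypothetical helper: one sed pass that covers exactly 10101..10148.
# First expression: 10101..10109, dropping the leading zero.
# Second expression: 10110..10148.
relabel_sed() {
  sed -E \
    -e 's/1010([1-9])/Port \1/2' \
    -e 's/101([1-3][0-9]|4[0-8])/Port \1/2' \
    "$@"
}
```

It reads stdin or the named files and writes to stdout; add -i in the same way as above if you want in-place editing with GNU sed. Ids such as 10100 and 10149 are left alone.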

grail 08-02-2011 06:14 PM

Also if you know the format is always the same:
Code:

awk '$5 = "Port " int(substr($5,4))' file
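
One caveat with the one-liner: it assigns $5 on every input line, so lines with a different layout get a "Port" field added too. A guarded variant might look like the sketch below. The function name relabel_field5 is made up, and it assumes the id is always the fifth whitespace-separated field on Title lines, as in the sample.

```shell
# Hypothetical helper: only touch Title[...] lines whose 5th field
# looks like 101NN; every other line is printed as it came in.
relabel_field5() {
  awk '
    /^Title\[/ && $5 ~ /^101[0-9][0-9]$/ {
      $5 = "Port " int(substr($5, 4))   # "10127" -> "27" -> 27
    }
    { print }
  ' "$@"
}
```

Note that assigning to a field makes awk rebuild the line with single spaces between fields, which is harmless for the sample but would collapse any runs of spaces or tabs.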

rock1961 08-03-2011 09:06 AM

Solved Awk question
 
Quote:

Originally Posted by PTrenholme (Post 4432007)

This was the ticket. Running this script against the file and redirecting the output to another file, then changing the Title string to PageTop and running it again against the new file, did exactly what I wanted. Thank you so much for your help.
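
The two runs described here (once for Title, once for PageTop) could probably be folded into one pass. The following is a sketch only: the function name relabel_both is invented, and the exact shape of the PageTop lines is an assumption, since the thread never shows one. It looks for the bracketed id as a literal string in the text after the closing bracket.

```shell
# Hypothetical helper: handle Title[...] and PageTop[...] lines together.
# Assumes the key is <device>_<id> with no underscore in the device name
# and that the id reappears literally after the closing bracket.
relabel_both() {
  awk '
    /^(Title|PageTop)\[[^_]*_[0-9]+\]/ {
      rb   = index($0, "]")             # position of the closing bracket
      head = substr($0, 1, rb)          # e.g. Title[3560-1_10127]
      rest = substr($0, rb + 1)         # text after the bracketed key
      id = head
      sub(/.*_/, "", id)                # keep what follows the underscore
      sub(/\]/, "", id)                 # drop the closing bracket
      pos = index(rest, id)
      if (pos > 0)
        rest = substr(rest, 1, pos - 1) "Port " (substr(id, 4) + 0) substr(rest, pos + length(id))
      $0 = head rest
    }
    { print }
  ' "$@"
}
```

Something like relabel_both mrtg.cfg > mrtg.cfg.new (placeholder file names) would then replace both passes, leaving all other lines as they are.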


All times are GMT -5.