LinuxQuestions.org - how to remove two alternate char from file

- - how to remove two alternate char from file (https://www.linuxquestions.org/questions/programming-9/how-to-remove-two-alternate-char-from-file-783266/)

saurin

01-19-2010 12:16 PM

how to remove two alternate char from file

How can I remove alternate characters from a file using a shell script?
If the file content is "1234567890", the output file should be "24680".

Somebody already told me that the following is the solution, and it works.

sed 's/\(.\)\(.\)/\2/g' filename

Now, what small modification is needed to remove two alternate bytes at a time?
If the file content is "1234567890", the output file should be "125689".

PTrenholme

01-19-2010 12:45 PM

Just replace the . by .. in the two grouping expressions.

Read man sed or info sed for an explanation of the stream editor.

Anyhow, the dot matches any single character, the parentheses define groups of expressions, and \2 refers to the second group, so s/\(.\)\(.\)/\2/g means "for every group of two characters, substitute the second character of the group." Thus sed 's/\(..\)\(..\)/\2/g' filename would substitute, for every group of four characters, the last two characters in the group.
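
To make the difference between the back-references concrete, here is a small sketch that runs the commands discussed above on the sample input. The function names are hypothetical helpers introduced only for illustration; each reads standard input. The third variant, which uses \1 instead of \2, is an addition not taken from the posts: it keeps the first two characters of every group of four, which appears closer to the output the question asks for.

```shell
# Hypothetical helper names; each filter reads stdin and writes stdout.

# Original command: keep the second character of every pair.
keep_second_of_pair() { sed 's/\(.\)\(.\)/\2/g'; }

# Suggested modification: keep the LAST two characters of every group of four.
keep_last_two_of_four() { sed 's/\(..\)\(..\)/\2/g'; }

# Assumed alternative (not from the thread): keep the FIRST two of every four.
keep_first_two_of_four() { sed 's/\(..\)\(..\)/\1/g'; }

printf '1234567890\n' | keep_second_of_pair      # prints 24680
printf '1234567890\n' | keep_last_two_of_four    # prints 347890
printf '1234567890\n' | keep_first_two_of_four   # prints 125690
```

In the last two cases the trailing "90" is passed through unchanged, because it does not form a complete group of four and so the pattern never matches it.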

By the way, most distributions ship with tools to convert from Unicode to alternative encodings.

saurin

01-19-2010 01:07 PM

Quote:

Originally Posted by PTrenholme (Post 3832442)

Can we pass a particular range? For example, if I want to remove the character that occurs at every 100-character interval, is there a method for that?

PTrenholme

01-19-2010 03:54 PM

Sure. Look at the (<expression>){n,m} construction in the manual or info documentation (referenced above) for details; in sed's default basic regular expression syntax it is written \(<expression>\)\{n,m\}. (Or look at info grep in the "Regular Expressions" section.) If you don't have it installed, the pinfo command is a nice-to-have "enhanced" info reader. It replaces both the info and man commands and is (IMHO) somewhat easier to navigate.
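
As a rough sketch of how the interval construction could be applied to the "every 100th character" case: capture N-1 characters with \{N-1\}, match one more character, and put back only the captured part. The function name drop_every_nth is a hypothetical helper, and the sketch assumes N is an integer of at least 2. Note that sed works line by line, so the count restarts on every line.

```shell
# drop_every_nth N: delete every Nth character of each input line.
# Hypothetical helper for illustration; assumes N is an integer >= 2.
drop_every_nth() {
    n=$1
    keep=$((n - 1))
    # \(.\{keep\}\) captures keep characters, the following . is the Nth one;
    # replacing the whole match with \1 drops that Nth character.
    sed "s/\(.\{$keep\}\)./\1/g"
}

printf '1234567890\n' | drop_every_nth 5   # prints 12346789
```

For the case asked about, the call would be drop_every_nth 100 < filename, which expands to sed 's/\(.\{99\}\)./\1/g'.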

ghostdog74

01-19-2010 10:46 PM

No need for a regex. KISS.

Code:

# echo "1234567890" | awk -vFS= '{for(i=2;i<=NF;i+=2) printf "%s", $i; print ""}'

24680
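
Splitting a line into one-character fields with an empty FS works in gawk and mawk, but as far as I know it is not guaranteed by POSIX awk. A possible alternative, sketched below, walks the line with substr() instead and also covers the "two characters at a time" variant. The function name keep_drop and its two arguments are hypothetical; both are assumed to be positive integers.

```shell
# keep_drop KEEP DROP: from each line, keep KEEP characters, skip DROP, repeat.
# Hypothetical helper; assumes KEEP and DROP are positive integers.
keep_drop() {
    awk -v keep="$1" -v drop="$2" '{
        out = ""
        # Step through the line one keep+drop block at a time,
        # copying only the first "keep" characters of each block.
        for (i = 1; i <= length($0); i += keep + drop)
            out = out substr($0, i, keep)
        print out
    }'
}

printf '1234567890\n' | keep_drop 2 2   # prints 125690
printf '1234567890\n' | keep_drop 1 1   # prints 13579
```

Unlike the one-liner above, this keeps the first character of each block rather than the second, so keep_drop 1 1 yields 13579, not 24680.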

All times are GMT -5.