LinuxQuestions.org

Sunray74 12-03-2014 01:32 PM

Script to replace Control characters with a number, if not with Space
 
Hi Guys,

I'm new to Linux and I have a project where my script needs to read text files and replace the Control-L character in the first byte with a number. If there is no Control-L character present, the script should insert a space as the first byte of every line in the file.

I have written the script as far as replacing the Control-L character, but I am unable to add a decode-style condition to the sed command, i.e. if there is no Control-L, insert a space in the first byte.

My Script:

** Quote **

#!/bin/bash

outdir='outdir'
indir=`pwd`

if [ $# -eq 1 ]
then
    filename=$1

    >$outdir/$filename.txt

    files=`ls $indir/$filename[a-z].txt`

    for i in $files
    do
        i=`basename $i`
        sed s/^L/1/g $indir/$i > $outdir/$i
        cat $outdir/$i >> $outdir/$filename.txt
        rm $outdir/$i
    done
else
    echo "Invalid args"
    exit 1
fi

** UnQuote **

Any help would be highly appreciated.

Regards.

grail 12-03-2014 07:21 PM

Couple of points:

1. Please use [code][/code] tags around code or data to protect formatting

2. 'Invalid args' is semi-informative, but it does not tell the user what the correct input would have been

3. $() is clearer and more versatile than ``

4. [[]] is preferred over [] and (()) preferred for arithmetic

5. Use your globbing directly in the for loop and try not to use ls. This example may not have issues, but should your file names contain unusual characters, such as spaces, the for loop will not behave as expected
Code:

for i in $indir/$filename[a-z].txt
6. Use meaningful variable names instead of something like the 'i' in the for loop above. In a large script the start of the loop may be many lines back, so when you come across $i you have no real idea what it refers to

7. An alternative to basename is the ## parameter expansion, e.g. ${i##*/}

8. Quote all variables, again to preserve any unusual characters ... of course the caveat is you may well want the unusual characters to expand

9. Quote sed statements (I have also shown quoting for variables below)
Code:

sed 's/^L/1/g' "$indir/$i" > "$outdir/$i"
10. You go from indir_i to outdir_i to outdir_filename and then remove outdir_i. Why not cut out the middle man and simply redirect the sed output straight into outdir_filename? This also allows you to remove the basename line, as we can now simply use our for loop variable:
Code:

sed 's/^L/1/g' "$i" >> "$outdir/$filename"
11. The sed pattern flag, 'g', makes no sense given your requirement that we are looking at the start of the line. Currently you will replace '^L' anywhere in the line and not specifically at the start.
Here is the alternative, where the first ^ is the start-of-line anchor and ^L is the Control-L character:
Code:

sed 's/^^L/1/' "$i" >> "$outdir/$filename"
Above are merely suggestions :)
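
For what it's worth, here is a rough sketch of how those suggestions could fit together. It is not the poster's final script: merge_parts is a made-up function name, the combined file keeps the .txt suffix from the original script, and the Control-L byte is produced with printf rather than typed literally. At this point it only performs the ^L to 1 replacement.

```shell
#!/bin/bash
# Sketch only: the original script reworked along the suggestions above.
# merge_parts is a hypothetical name; indir/outdir mirror the original script.
merge_parts() {
    if [[ $# -ne 1 ]]; then
        # Tell the user what the correct input looks like (point 2).
        echo "Usage: merge_parts <filename-prefix>" >&2
        return 1
    fi

    local prefix=$1
    local indir=$PWD
    local outdir='outdir'          # must already exist, as in the original
    local ff
    ff=$(printf '\f')              # a literal Control-L (form feed) byte

    : > "$outdir/$prefix.txt"      # truncate (or create) the combined file

    local part
    for part in "$indir/$prefix"[a-z].txt; do   # glob directly, no ls (point 5)
        [[ -e $part ]] || continue              # the glob matched nothing
        echo "processing ${part##*/}" >&2       # ## instead of basename (point 7)
        # Anchored replacement without the g flag (point 11), written
        # straight into the combined file (point 10).
        sed "s/^$ff/1/" "$part" >> "$outdir/$prefix.txt"
    done
}
```

Assuming files such as reporta.txt and reportb.txt in the current directory, calling merge_parts report would combine them into outdir/report.txt.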

OK, on to your actual question. You are already replacing the '^L', so now you need to tell sed what to do for the lines that do not start with ^L.
An important tip here is that you need to process the non-^L lines first (try placing the solution below in the alternate order and see what happens):
Code:

sed -re 's/^([^^L])/ \1/' -e 's/^^L/1/' "$i" >> "$outdir/$filename"
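
To see the ordering point without touching real files, a small experiment along these lines may help. It assumes GNU sed (for -r); the sample text is invented and the Control-L byte is generated with printf so it can be pasted safely.

```shell
#!/bin/bash
# Demo of why the order of the two sed expressions matters (GNU sed assumed).
ff=$(printf '\f')                  # a literal Control-L (form feed) byte

# Hypothetical sample: one line starting with Control-L, one without.
sample=$(printf '\fpage header\nplain line')

# Order from the post: space in front of non-^L lines first, then ^L -> 1.
right=$(printf '%s\n' "$sample" | sed -r -e "s/^([^$ff])/ \1/" -e "s/^$ff/1/")

# Reversed order: ^L becomes 1 first, and that 1 then counts as a
# "non-^L" first character, so the line also gets a space in front of it.
wrong=$(printf '%s\n' "$sample" | sed -r -e "s/^$ff/1/" -e "s/^([^$ff])/ \1/")

printf 'order from the post:\n%s\n' "$right"   # "1page header", " plain line"
printf 'reversed order:\n%s\n' "$wrong"        # " 1page header", " plain line"
```

One observation that is not part of the original answer: an empty line has no first character for [^^L] to match, so it is left without the leading space.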

Sunray74 12-03-2014 11:30 PM

Thanks a lot, grail. I will test the functionality and update you accordingly.

Thanks once again for your suggestions; I will make sure that I follow the standards.

Sunray74 12-04-2014 10:40 AM

Thanks a ton! The code has worked for me, and I have taken your suggestions into consideration and modified the code accordingly.

Thanks once again.

Regards

grail 12-04-2014 08:27 PM

No probs ... glad we got there :) Please mark as SOLVED once you have your solution.

Sunray74 12-08-2014 10:55 AM

Hi Grail,

I need your help in finding a way to detect a "*" (asterisk) in the file name, for example test*123. Based on that, I have to put a character (any of the 4 letters F, P, R or X) in place of the "*" and write the file name as "testF123" (the asterisk replaced with "F" in this case) to the output file.

If the file name has a *, there should be a condition that writes one of the 4 letters mentioned above in place of the * and writes the new file name (TESTF123) to the top left of the output, on the 3rd or 4th line to be precise. If there is no "*" in the file name, the earlier logic should be applied as is.

Does that make sense?

Please let me know if you need more clarification, so that I can get back to you with more information.

Thanks for your help.

grail 12-08-2014 06:26 PM

Quoting ... quoting ... quoting :)

Double quotes around a variable stop the shell from expanding unusual characters in its value, such as *, and single quotes force every character to be taken as is. Using these two concepts you should easily be able to get bash to work out which files you need to update.
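
As a small illustration of those two quoting rules, something like the following could be tried. The name test*123 comes from the question; the variable names and the choice of F are only placeholders, and sed is used for the replacement simply because the thread is already built around it.

```shell
#!/bin/bash
# Sketch: keep a literal * in a name from being globbed, then swap it for a letter.
name='test*123'     # single quotes: the * is stored exactly as typed
letter='F'          # one of F, P, R or X, chosen by whatever rule applies

case "$name" in
    *'*'*)          # the quoted '*' matches a literal asterisk in the name
        # Double quotes keep "$name" from being expanded as a glob.
        newname=$(printf '%s\n' "$name" | sed "s/\*/$letter/")
        ;;
    *)              # no asterisk: leave the name alone
        newname=$name
        ;;
esac

printf '%s\n' "$newname"    # prints testF123
```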

Sunray74 12-10-2014 10:34 AM

Thanks for providing the tips. I was able to achieve the requirement I posted a couple of days back with the help of the sed command.

Also, can you guide me on how to search for a specific string in a file and add a new string below the searched string?

Ex: using my above script, I need to search for the word "search_word" in the files my script is reading and insert a new word "new_word" exactly below it, then write that to the output. "search_word" would occur in many places but would always start at the first byte.

I would appreciate it if you could shed some light.

Regards

pan64 12-10-2014 10:42 AM

Quote:

Originally Posted by Sunray74 (Post 5282543)
Also, can you guide me on how to search for a specific string in a file and add a new string below the searched string?

In general:
sed 's/<search string>/<replace string>/' inputfile > outputfile
works; you only need to specify those strings. For example, the beginning of a line is ^, so:
sed 's/^search_word/new_wordsearch_word/'
will do something like what you described. See the sed man page, tutorials, and usage examples to find the most convenient way.

Sunray74 12-10-2014 11:51 AM

Hi Pan64,

Thanks for the idea, but this would actually replace the existing string, which I do not want.
I tried that option to search for 'search_word' and replace it with 'search_word new_word'; that works, but my requirement is as below.

File:

xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
xxxxxxxxxxxxxxxxxxx

New Output should be like this...

xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
new_wordxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
new_wordxxxxxxxxxxx

I would appreciate it if you could guide me in achieving the required output.

Thanks a lot for all your help.

Regards,

pan64 12-10-2014 12:00 PM

So that means you want to print the original line and the modified version too?
sed '/search_word/{p;s/search_word/replace_word/}'
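
A quick way to try that command on the command line is sketched below. The sample lines are invented, new_word stands in for replace_word, and the ^ anchor and the ; before the closing brace are additions for this sketch (the question says the word always starts at the first byte, and some sed implementations want the ;).

```shell
#!/bin/bash
# Hypothetical sample input in the layout shown in the question.
input=$(printf 'xxxx\nsearch_wordxxxx\nyyyy')

# p prints the matching line unchanged; s then rewrites the copy still in
# the pattern space, which sed prints at the end of the cycle as usual.
result=$(printf '%s\n' "$input" |
    sed '/^search_word/{p;s/search_word/new_word/;}')

printf '%s\n' "$result"
# xxxx
# search_wordxxxx
# new_wordxxxx
# yyyy
```

Note that the rest of the matched line is carried over onto the new line, since s only replaces the word itself.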

Sunray74 12-10-2014 12:05 PM

Awesome, thanks a lot.

I have tested the logic on the command line and it worked. I will include this logic in my script, test the functionality, and update you with the results.

Thanks once again Pan64.

pan64 12-10-2014 12:09 PM

You are welcome.
(If you really want to say thanks, just press YES.)

Sunray74 12-10-2014 01:37 PM

Hi Pan64,

I made a mistake here: the original file has empty lines after the "search_word" line, and only "new_word" should be inserted below "search_word", but the command carried the rest of the line over as well.

File:

xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx

xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx

xxxxxxxxxxxxxxxxxxx

New Output should be like this...

xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
new_word

xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx
search_wordxxxxxxxx
new_word

xxxxxxxxxxxxxxxxxxx

Sorry for the confusion; please provide me with the correct syntax.

grail 12-10-2014 07:02 PM

So try:
Code:

sed '/search/a\new_word' file
Also, you can look here for more information on how to use sed.
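
To check the append command against the layout from the previous post, a sketch like this could be used. The sample lines are invented, the pattern is anchored as ^search_word on the assumption that the word always starts at the first byte, and the a command is written in its two-line form, which is the more portable spelling of the same thing.

```shell
#!/bin/bash
# Hypothetical sample input: the matching line is followed by an empty line.
input=$(printf 'xxxx\nsearch_wordxxxx\n\nyyyy')

# a appends a new line of text after every line that matches the address.
# The text goes on the line after "a\".
result=$(printf '%s\n' "$input" | sed '/^search_word/a\
new_word')

printf '%s\n' "$result"
# xxxx
# search_wordxxxx
# new_word
#
# yyyy
```

Unlike the p;s approach, the matched line is left untouched and only the new text is added below it.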


All times are GMT -5.