LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 02-13-2017, 03:40 PM   #1
jdoginky
LQ Newbie
 
Registered: Feb 2017
Location: Louisville Ky.
Posts: 7

Rep: Reputation: Disabled
Text conversion help


Hi, new to the group, and first time poster.
I'm trying to convert "abc" to "def" in a file, but only when found in positions 3,4,5 and 32,33,34.

Example:
I want to convert:
3 abc eelee ref: OAK ARR: ONT abc 0236
to:
3 def eelee ref: OAK ARR: ONT def 0236

but not change:
3 lmn eelee abc: OAK ARR: ONT lmn 0400 abc

I thought sed would be my best option, but I'm having trouble figuring out how.

Last edited by jdoginky; 02-13-2017 at 03:45 PM.
 
Old 02-13-2017, 04:08 PM   #3
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 13,576

Rep: Reputation: 4341
If position is important, awk is probably a better tool.
 
Old 02-13-2017, 04:13 PM   #4
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,237

Rep: Reputation: 1653
You say '3-5' and '32-34' which are character numbers. That can be very difficult to nail down, especially if there are any length differences or spaces that push your characters over 1 or 2.

Would it be better to look at it like 'column number'? Like for your example:

Code:
1  2    3    4    5   6    7   8    9
3 abc eelee ref: OAK ARR: ONT abc 0236
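If it helps to see which field number awk would give each piece of a line, a quick sketch like the one below prints them side by side. It assumes whitespace-delimited fields, and the `line` and `numbered` variables are just illustrative names wrapped around the sample from the first post.

```shell
# Print each whitespace-separated field together with its awk field number.
line='3 abc eelee ref: OAK ARR: ONT abc 0236'
numbered=$(printf '%s\n' "$line" | awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
        sep = (i > 1) ? " " : ""      # single space between entries
        out = out sep i "=" $i        # e.g. 2=abc
    }
    print out
}')
printf '%s\n' "$numbered"
```

If two values run together with no space between them, awk will count them as a single field, which is worth checking against the real data before relying on field numbers.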
 
Old 02-17-2017, 02:01 PM   #5
jdoginky
LQ Newbie
 
Registered: Feb 2017
Location: Louisville Ky.
Posts: 7

Original Poster
Rep: Reputation: Disabled
Sorry, I didn't provide a very good example. There are some columns (fields) that run together, which would rule out your suggestion.

I want to convert:
3 abc eelee ref: OAK ARR: ONT abc0236
to:
3 def eelee ref: OAK ARR: ONT def0236

but not change:
3 lmn eelee abc: OAK ARR: ONT lmn0400 abc
 
Old 02-17-2017, 02:13 PM   #6
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,237

Rep: Reputation: 1653
That shouldn't matter programmatically. For example, would this pseudo-code definition do what you expect?

Code:
if data in col 1,2,3 = 3,abc,eelee
then 
change data in col2=def AND col8=def0236
 
Old 02-17-2017, 02:14 PM   #7
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 8,422
Blog Entries: 13

Rep: Reputation: 3740
Quote:
Originally Posted by jdoginky
Hi, new to the group, and first time poster.
I'm trying to convert "abc" to "def" in a file, but only when found in positions 3,4,5 and 32,33,34.

Example:
I want to convert:
3 abc eelee ref: OAK ARR: ONT abc 0236
to:
3 def eelee ref: OAK ARR: ONT def 0236

but not change:
3 lmn eelee abc: OAK ARR: ONT lmn 0400 abc

I thought sed would be my best option, but having problem figuring out how.
Hi,

While I tend to agree with pan64 that awk may be better, you should post your attempts with sed or awk to show what you have tried.

LQ members are happy to help you. However, they are volunteers, and they are also here to help you learn how to do this by yourself. It's therefore best to see your earlier attempts, so members can see how you approach a solution and then offer refinements. Please post some of your attempts and describe where the outcomes were not correct, or what you wished to do but could not because of your inexperience with sed, awk, or some other tool.
 
Old 02-17-2017, 03:39 PM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,769

Rep: Reputation: 3052
sed does this fairly easily. Happy to show an example once you show your attempts.
 
Old 03-02-2017, 10:09 AM   #9
jdoginky
LQ Newbie
 
Registered: Feb 2017
Location: Louisville Ky.
Posts: 7

Original Poster
Rep: Reputation: Disabled
Thanks all. As I've looked more closely at this, I think I have a clear picture of what the original VB code was doing.

If char 3-5 is in the ALL_WIDGETS file
    If char 3-5 and 138-140 match
        convert both to alt_widget
    else
        convert char 3-5 to alt_widget
        replace char 138-140 with spaces

My problem is what logic to use to compare char 3-5 with char 138-140 on each line, to decide whether to convert both occurrences or to convert 3-5 and blank out 138-140 when they differ.

> cat ALL_WIDGETS
abc,hij
def,klm
ghi,nop



> cat convert_widgets
for WIDGETs in `cat ALL_WIDGETS`
do
    WIDGET=`echo $WIDGETs | cut -d, -f1`
    grep -q $WIDGET $source_file
    if [ $? = 0 ]; then
        alt_WIDGET=`echo $WIDGETs | cut -d, -f2`

        ##### If char 3-5 and char 138-140 match, convert both
        ##### (pseudo-code condition: this per-line comparison is the part I can't work out)
        if [ char 138-140 = char 3-5 ]; then
            echo "Converting $WIDGET to $alt_WIDGET @ 3-5"
            sed -E "s/^(.{2})$WIDGET/\1$alt_WIDGET /" $source_file > $source_file.tmp
            mv $source_file.tmp $source_file

            echo "Converting $WIDGET to $alt_WIDGET @ 138-140"
            sed -E "s/^(.{137})$WIDGET/\1$alt_WIDGET /" $source_file > $source_file.tmp
            mv $source_file.tmp $source_file

        else #### Otherwise, convert 3-5, and replace 138-140 with spaces
            echo "Converting $WIDGET to $alt_WIDGET @ 3-5"
            sed -E "s/^(.{2})$WIDGET/\1$alt_WIDGET /" $source_file > $source_file.tmp
            mv $source_file.tmp $source_file
            # blank out 138-140 (three spaces)
            echo "Blanking out $WIDGET @ 138-140"
            sed -E "s/^(.{137})$WIDGET/\1   /" $source_file > $source_file.tmp
            mv $source_file.tmp $source_file
        fi
    fi
done



Thanks for your help.
 
Old 03-02-2017, 10:31 AM   #10
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,428
Blog Entries: 3

Rep: Reputation: 2205
The script will be more readable if you enclose it in [code] [/code] tags.

I would go with awk as suggested earlier, if the data is in space-delimited columns like this:

Code:
3 abc eelee ref: OAK ARR: ONT abc 0236
3 def eelee ref: OAK ARR: ONT def 0236 
3 lmn eelee abc: OAK ARR: ONT lmn 0400
Then you can do the substitution in one line:

Code:
awk '$2 == "abc" && $2 == $8 { $2 = $8 = "def"; } { print; }' $source_file >> $temp_file;
You can even pass variables to awk.

Code:
awk --assign widget='abc' --assign altwidget='def' '$2 == widget && $2 == $8 { $2 = $8 = altwidget; } { print; }' $source_file >> $temp_file;
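If the fields run together or the layout is fixed-width, field assignment is less of a fit: assigning to $2 or $8 also makes awk rebuild the line with single spaces between fields, which would shift the columns. A minimal sketch of the same idea by character position, using substr(), might look like the following. The variables p1 and p2 and the sample lines are assumptions for illustration only. The short sample has its second widget at column 31, whereas a file with the widget at 3-5 and 138-140 would use p1=3 and p2=138.

```shell
# Start columns of the two 3-character windows (hypothetical values for the short sample).
p1=3
p2=31
result=$(printf '%s\n' \
  '3 abc eelee ref: OAK ARR: ONT abc0236' \
  '3 abc eelee ref: OAK ARR: ONT xyz0400' \
  '3 lmn eelee abc: OAK ARR: ONT lmn0400 abc' |
  awk -v widget='abc' -v alt='def' -v p1="$p1" -v p2="$p2" '
    # Only touch lines long enough to hold both windows, with the widget in the first one.
    length($0) >= p2 + 2 && substr($0, p1, 3) == widget {
        head  = substr($0, 1, p1 - 1)             # text before the first window
        mid   = substr($0, p1 + 3, p2 - p1 - 3)   # text between the two windows
        tail  = substr($0, p2 + 3)                # text after the second window
        blank = sprintf("%3s", "")                # three spaces
        # Second window: convert it when it matches, otherwise blank it out.
        second = (substr($0, p2, 3) == widget) ? alt : blank
        $0 = head alt mid second tail
    }
    { print }
  ')
printf '%s\n' "$result"
```

Because the line is rebuilt from substr() pieces rather than from fields, the spacing outside the two windows is left exactly as it was.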
 
Old 03-02-2017, 11:20 AM   #11
jdoginky
LQ Newbie
 
Registered: Feb 2017
Location: Louisville Ky.
Posts: 7

Original Poster
Rep: Reputation: Disabled
Thanks, Turbocapitalist. Unfortunately, my file is not space-delimited. However, every occurrence of 'widget' that I want to convert is at char 3-5 and 138-140 on the lines where 'widget' is found.
 
Old 03-02-2017, 11:31 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,769

Rep: Reputation: 3052
As the fields may be run together, awk may be a little harder to use for the solution. Here is the sed I was thinking of (with the widget at char 3-5 and 138-140, there are 132 characters between the two occurrences):
Code:
sed -r 's/^(.{2})(abc)(.{132})\2(.*)$/\1def\3def\4/'
So you can simply loop over your ALL_WIDGETS file and use the sed on the source file as required, something like:
Code:
while IFS=, read -r current new
do
  sed -r -i "s/^(.{2})($current)(.{132})\2(.*)$/\1$new\3$new\4/" "$source_file"
done<ALL_WIDGETS
You might need to tweak it (can't remember if you have to escape the $ terminator), but you get the idea
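To also cover the case where the second position holds something different and should be blanked, a second expression can follow the first. This is only a sketch: the `gap` value, the sample lines and the variable names are assumptions chosen so the example fits on short lines. The gap is the second start column minus the first start column minus 3, so the short sample (columns 3 and 31) uses 25.

```shell
gap=25                        # hypothetical: 31 - 3 - 3 for the short sample lines
current='abc'
new='def'
blank=$(printf '%3s' '')      # three spaces
converted=$(printf '%s\n' \
  '3 abc eelee ref: OAK ARR: ONT abc0236' \
  '3 abc eelee ref: OAK ARR: ONT xyz0400' \
  '3 lmn eelee abc: OAK ARR: ONT lmn0400 abc' |
  sed -E \
    -e "s/^(.{2})${current}(.{${gap}})${current}/\1${new}\2${new}/" \
    -e "s/^(.{2})${current}(.{${gap}}).{3}/\1${new}\2${blank}/")
# First expression: both windows match, convert both.
# Second expression: only reached with the widget still in the first window,
# so it converts that one and blanks whatever sits in the second window.
printf '%s\n' "$converted"
```

The order matters: once the first expression has converted a line, the widget is no longer in the first window, so the second expression leaves that line alone. This assumes the replacement text is never itself the widget being searched for.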

Last edited by grail; 03-02-2017 at 11:32 AM. Reason: Updated range
 
Old 03-02-2017, 11:40 AM   #13
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 8,422
Blog Entries: 13

Rep: Reputation: 3740
I'm not attempting to write a script for this. However, my tack when I do something like this in sed, or just with emacs search and replace, is to find a unique property of what I wish to change and use that as my search spec.

From all of your examples, you wish to change all [SPACE]abc[SPACE] to another pattern.

So code for that. Your contrary examples show either abc[COLON] or abc[other characters]

My points there being that you bring up character position, repeatedly, however I see no examples yet showing the simple search won't work. I daresay, (sorry) that you can find or invent more examples to contradict that. But consider (1) if you have more examples, then you really should show all examples now, not incrementally (2) if you're inventing contrary examples because you wish to stick with this adamant restriction, then I really can't help you much except to say that once you find <pattern> you can then evaluate the position to determine if it meets the further criteria for substitution.

My next point is about universal behavior. To wit, take my example of either sed or emacs search and replace. Those are universal in that I have to give a search string and a replacement string. If you have an extremely specific edit requirement, that is fine, but if I'm fixing one thing once, I do that and move on. If I know I will be fixing something many times over, then I will write reusable code or a script to do so, allowing for arguments and options rather than coding to one very particular string and range of columns.
 
Old 03-02-2017, 03:34 PM   #14
jdoginky
LQ Newbie
 
Registered: Feb 2017
Location: Louisville Ky.
Posts: 7

Original Poster
Rep: Reputation: Disabled
rtmistler, here is a before-and-after sample of an actual file, showing what I am trying to achieve. The values will change, but the changes will always be made to columns 3-5 and 138-140 (if applicable).
(note: XX in the AFTER sample represents spaces)

BEFORE:
2UBBD 0008W16 01DEC1630DEC1602DEC16 02DEC16C BatchName 2015.2.4 1948000880
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
3 BBD77920101A01DEC1623DEC16 2345 AAA01500150+0100 ARN03450345+0100 71P TIC BBD7793 S F123VVFDAC 000881
3 BBD77920201A04DEC1618DEC16 2345 AAA23402340+0100 BRB00450045+0000 71P TIC BBD7793 S F123VVFDAC 000882
3 BBD77930101A01DEC1622DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC BRA7797 S F123VVFDAC 000883
3 BBD77930102A01DEC1622DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BBD7792 S F123VVFDAC 000884
3 BBD77930201A02DEC1616DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC BBD77921 S F123VVFDAC 000885
3 BBD77930202A02DEC1616DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BBD77922 S F123VVFDAC 000886
3 BBD77930301A05DEC1619DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BBD7792 S F123VVFDAC 000887
3 BBD77930401A23DEC1623DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC BRA7753 S F123VVFDAC 000888
3 BBD77930402A23DEC1623DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BBD0000 S F123VVFDAC 000889


AFTER:
2UBF 0008W16 01DEC1630DEC1602DEC16 02DEC16C BatchName 2015.2.4 1948000880
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
3 BF 77920101A01DEC1623DEC16 2345 AAA01500150+0100 ARN03450345+0100 71P TIC BF 7793 S F123VVFDAC 000881
3 BF 77920201A04DEC1618DEC16 2345 AAA23402340+0100 BRB00450045+0000 71P TIC BF 7793 S F123VVFDAC 000882
3 BF 77930101A01DEC1622DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC XX 7797 S F123VVFDAC 000883
3 BF 77930102A01DEC1622DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BF 7792 S F123VVFDAC 000884
3 BF 77930201A02DEC1616DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC BF 7795 S F123VVFDAC 000885
3 BF 77930202A02DEC1616DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BF 7798 S F123VVFDAC 000886
3 BF 77930301A05DEC1619DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BF 7792 S F123VVFDAC 000887
3 BF 77930401A23DEC1623DEC16 2345 ARN18451845+0100 BRB20352035+0000 71P TIC XX 7753 S F123VVFDAC 000888
3 BF 77930402A23DEC1623DEC16 2345 BRB21452145+0000 AAA22552255+0100 71P TIC BF 0000 S F123VVFDAC 000889
 
Old 03-02-2017, 03:47 PM   #15
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 8,422
Blog Entries: 13

Rep: Reputation: 3740
In that example case you can globally replace "BBD" with "BF " and "BRA" with three space characters.

I would do that in two passes with sed.
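As a rough sketch of that idea, the two substitutions could be chained as two expressions in one sed call, which gives the same result as two separate passes for input like this. The sample lines below are abbreviated, made-up stand-ins modelled on the posted data, not the real record widths, and the variable names are only illustrative.

```shell
blank=$(printf '%3s' '')      # three spaces
cleaned=$(printf '%s\n' \
  '3 BBD77920101A 71P TIC BBD7793 S' \
  '3 BBD77930101A 71P TIC BRA7797 S' |
  sed -e 's/BBD/BF /g' -e "s/BRA/${blank}/g")
# BBD becomes "BF " (padded to three characters) and BRA becomes three spaces,
# so every other column keeps its position.
printf '%s\n' "$cleaned"
```

Note that a global replace like this changes the strings wherever they appear on a line, not only at fixed positions, so it only fits if BBD and BRA never occur elsewhere in the data.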

Besides examples, how about a more inclusive summary of your requirements? For instance, I can see BBD in the header, and perhaps that helps you determine which string to change later in the file. BRA appears only sparingly, so what would tell someone entering the filename and search strings as script arguments that they need to specify that string?
 
  

