LinuxQuestions.org
Old 03-03-2015, 12:14 AM   #1
sam@
Member
 
Registered: Sep 2013
Posts: 31

Rep: Reputation: Disabled
Replace 6th column entries


Hi,

My input file looks like this:

Code:
String000002  GeneWise        CW     48945   49354   .       -       0       Pt=PEQU_00004;
String000002  LEN   NA    52125   52604   0.945751        -       .       PID=PEQU_00005;lvid_id=PEQ_28708;
String000002  LEN   CW     52125   52604   .       -       0       Pt=PEQU_00005;
String000002  WEise        NA    66200   66667   45.48   -       .       PID=PEQU_00006;lvid_id=Os03t0797100-00-D1363;Shift=0;
String000002  WEise        CW     66200   66667   .       -       0       Pt=PEQU_00006;
String000002  GUST        NA    90829   91128   0.21    +       .       PID=PEQU_00007;lvid_id=A00088;
String000002  GUST        CW     90829   91128   0.21    +       0       Pt=PEQU_00007;
String000002  LEN   NA    104627  107284  0.499954        -       .       PID=PEQU_00008;lvid_id=PEQ_36749;
String000002  LEN   CW     104627  105584  .       -       1       Pt=PEQU_00008;
I want to replace every entry in the 6th column with a dot (.).

My original file has decimal values in column 6, and some entries are already a dot. I just want every 6th-column entry to be a dot.
My required output file is:


Code:
String000002  GeneWise        CW     48945   49354   .       -       0       Pt=PEQU_00004;
String000002  LEN   NA    52125   52604   .        -       .       PID=PEQU_00005;lvid_id=PEQ_28708;
String000002  LEN   CW     52125   52604   .       -       0       Pt=PEQU_00005;
String000002  WEise        NA    66200   66667   .   -       .       PID=PEQU_00006;lvid_id=Os03t0797100-00-D1363;Shift=0;
String000002  WEise        CW     66200   66667   .       -       0       Pt=PEQU_00006;
String000002  GUST        NA    90829   91128   .    +       .       PID=PEQU_00007;lvid_id=A00088;
String000002  GUST        CW     90829   91128   .    +       0       Pt=PEQU_00007;
String000002  LEN   NA    104627  107284  .        -       .       PID=PEQU_00008;lvid_id=PEQ_36749;
String000002  LEN   CW     104627  105584  .       -       1       Pt=PEQU_00008;
Is there a sed or awk command that I could use?
 
Old 03-03-2015, 12:34 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,910

Rep: Reputation: 7318
yes: awk ' { $6="." }'
 
1 members found this post helpful.
Old 03-03-2015, 08:41 AM   #3
sam@
Member
 
Registered: Sep 2013
Posts: 31

Original Poster
Rep: Reputation: Disabled
Hi, I tried awk ' { $6="." }' infile > outfile

But it gave me an empty file. Am I missing anything?
 
Old 03-03-2015, 08:47 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,910

Rep: Reputation: 7318
Yes, the action block never prints anything. Choose one:
Code:
awk '$6="."' in > out
- or -
awk '{$6=".";print}' in > out
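
A minimal sketch of why the first attempt produced an empty file while these two variants print, assuming a hypothetical space-separated sample (the variable names are made up for illustration):

```shell
# Hypothetical two-line sample; fields separated by single spaces.
sample='a b c d e 0.94 - .
f g h i j 45.48 + 0'

# Action without print: field 6 is changed, but nothing is written out.
no_print=$(printf '%s\n' "$sample" | awk '{ $6 = "." }')

# Assignment used as a pattern: its value "." is a non-empty string, so it is
# true and awk falls back to the default action, which prints the record.
as_pattern=$(printf '%s\n' "$sample" | awk '$6 = "."')

# Explicit print inside the action.
with_print=$(printf '%s\n' "$sample" | awk '{ $6 = "."; print }')

printf 'no print:   [%s]\n' "$no_print"
printf 'as pattern: [%s]\n' "$as_pattern"
printf 'with print: [%s]\n' "$with_print"
```

The first variable comes back empty, and the other two hold the same text with a dot in column 6.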
 
Old 03-03-2015, 09:44 AM   #5
sam@
Member
 
Registered: Sep 2013
Posts: 31

Original Poster
Rep: Reputation: Disabled
Hi,
Thanks, it did replace the 6th column entries, but it also removed the spacing between columns, which I need for further processing.


My columns are separated by tabs, which were eliminated in the process. Here is how it looks now:
Code:
String000002 GeneWise CW 48945 49354 . - 0 Pt=PEQU_00004;
String000002 LEN NA 52125 52604 . - . PID=PEQU_00005;lvid_id=PEQ_28708;
String000002 LEN CW 52125 52604 . - 0 Pt=PEQU_00005;
String000002 WEise NA 66200 66667 . - . PID=PEQU_00006;lvid_id=Os03t0797100-00-D1363;Shift=0;
String000002 WEise CW 66200 66667 . - 0 Pt=PEQU_00006;
String000002 GUST NA 90829 91128 . + . PID=PEQU_00007;lvid_id=A00088;
String000002 GUST CW 90829 91128 . + 0 Pt=PEQU_00007;
String000002 LEN NA 104627 107284 . - . PID=PEQU_00008;lvid_id=PEQ_36749;
String000002 LEN CW 104627 105584 . - 1 Pt=PEQU_00008;
 
Old 03-03-2015, 09:55 AM   #6
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
It still has all fields. What is the problem with the spacing?
 
Old 03-03-2015, 10:10 AM   #7
sam@
Member
 
Registered: Sep 2013
Posts: 31

Original Poster
Rep: Reputation: Disabled
If you compare it with my required output file, that file has tab spacing between the fields, which is important for further parsing.

Here is the sample output required:
Code:
String000002  GeneWise        CW     48945   49354   .       -       0       Pt=PEQU_00004;
String000002  LEN   NA    52125   52604   .        -       .       PID=PEQU_00005;lvid_id=PEQ_28708;
String000002  LEN   CW     52125   52604   .       -       0       Pt=PEQU_00005;
String000002  WEise        NA    66200   66667   .   -       .       PID=PEQU_00006;lvid_id=Os03t0797100-00-D1363;Shift=0;
String000002  WEise        CW     66200   66667   .       -       0       Pt=PEQU_00006;
String000002  GUST        NA    90829   91128   .    +       .       PID=PEQU_00007;lvid_id=A00088;
String000002  GUST        CW     90829   91128   .    +       0       Pt=PEQU_00007;
String000002  LEN   NA    104627  107284  .        -       .       PID=PEQU_00008;lvid_id=PEQ_36749;
String000002  LEN   CW     104627  105584  .       -       1       Pt=PEQU_00008;


Here is the result of the code:

Code:
String000002 GeneWise CW 48945 49354 . - 0 Pt=PEQU_00004;
String000002 LEN NA 52125 52604 . - . PID=PEQU_00005;lvid_id=PEQ_28708;
String000002 LEN CW 52125 52604 . - 0 Pt=PEQU_00005;
String000002 WEise NA 66200 66667 . - . PID=PEQU_00006;lvid_id=Os03t0797100-00-D1363;Shift=0;
String000002 WEise CW 66200 66667 . - 0 Pt=PEQU_00006;
String000002 GUST NA 90829 91128 . + . PID=PEQU_00007;lvid_id=A00088;
String000002 GUST CW 90829 91128 . + 0 Pt=PEQU_00007;
String000002 LEN NA 104627 107284 . - . PID=PEQU_00008;lvid_id=PEQ_36749;
String000002 LEN CW 104627 105584 . - 1 Pt=PEQU_00008;
It did the replacement, but the tab spacing is required for further parsing.

Is it possible to keep the file's format and replace only the 6th column, using sed or awk?

Last edited by sam@; 03-03-2015 at 10:11 AM.
 
Old 03-03-2015, 11:19 AM   #8
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Not without knowing what the separators are supposed to be. The input seems to have varying field widths, which would be why the columns don't line up.

It almost looks like the field width varies with the contents: more spaces make things wider.
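
One way to check what actually separates the fields is to make the whitespace visible. This is only a sketch on a made-up line (the variable names are hypothetical); it uses the standard sed "l" command:

```shell
# Hypothetical line: a tab between the first two fields, a space before the third.
line=$(printf 'String000002\tLEN NA')

# sed's "l" command prints the line unambiguously: tabs appear as \t and the
# end of the line is marked with $.
visible=$(printf '%s\n' "$line" | sed -n 'l')
printf '%s\n' "$visible"
```

Running `sed -n 'l' infile` on the real file would show the same kind of output for every line. Long lines are folded with a trailing backslash.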

If each field is tab separated, that can be fixed: use -F to specify that the field separator is a tab. By default, fields are separated by one or more whitespace characters (spaces or tabs). If the fields are delimited by just a tab, an explicit tab separator preserves spaces inside a field, because they are no longer treated as separators. Note that once a field is assigned, awk rebuilds the line with the output separator OFS, which defaults to a single space, so OFS has to be set to a tab as well.
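
A minimal sketch of the tab-separator approach, assuming the file really is tab separated. The sample line is modelled on the data in this thread, and the variable names are hypothetical. Both FS and OFS are set, because awk rejoins the fields with OFS after a field is assigned:

```shell
# Hypothetical tab-separated sample modelled on the data in this thread.
sample=$(printf 'String000002\tLEN\tNA\t52125\t52604\t0.945751\t-\t.\tPID=PEQU_00005;')

# Split on tabs only and rejoin with tabs; the trailing 1 is an always-true
# pattern that prints every record.
result=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = OFS = "\t" } { $6 = "." } 1')
printf '%s\n' "$result"
```

On a real file this would be `awk 'BEGIN { FS = OFS = "\t" } { $6 = "." } 1' infile > outfile`.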
 
Old 03-04-2015, 12:37 AM   #9
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,910

Rep: Reputation: 7318
sed 's/^\([^ \t]*\s*[^ \t]*\s*[^ \t]*\s*[^ \t]*\s*[^ \t]*\s*\)[^ \t]*\(\s.*\)/\1.\2/'
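
The command above relies on \s and \t, which are GNU sed extensions. A sketch of the same idea with POSIX character classes follows, assuming a sed that accepts -E (GNU and BSD sed both do). The sample line and variable names are made up. The original whitespace is left untouched, whatever mix of tabs and spaces it is:

```shell
# Hypothetical line mixing tabs and runs of spaces between fields.
sample=$(printf 'String000002  LEN\tNA    52125\t52604   0.945751  -\t.')

# Capture the first five fields together with the whitespace that follows each,
# then replace the sixth run of non-blank characters with a dot.
result=$(printf '%s\n' "$sample" |
  sed -E 's/^(([^[:space:]]+[[:space:]]+){5})[^[:space:]]+/\1./')
printf '%s\n' "$result"
```

Because only the sixth field is rewritten, this works even when it is not clear whether the separators are tabs or spaces.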
 
  

