LinuxQuestions.org
Old 01-14-2011, 03:31 AM   #1
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Rep: Reputation: 0
Sed adds a space instead of tab at end of line


The objective is to read a file line by line and append a tab followed by a value (a number) to each line.

My script:

Code:
i=0
val=44
while read line
do
  #Ignore empty lines
  case "$line" in
        "") echo >> Report.tmp.tsv; continue;;
  esac
  case "$i" in
    0)
      dt=`date  +%d-%m-%y`
      echo $line | sed s/$/'\t'$dt/ >> Report.tmp.tsv;;
    1) echo $line | sed s/$/'\t'$val/ >> Report.tmp.tsv;;

    2) echo $line | sed s/$/'\t'$val/ >> Report.tmp.tsv;;
    3) echo $line | sed s/$/'\t'$val/ >> Report.tmp.tsv;;
    
  esac
  i=$(($i+1))
  val=$(($val+1))
done < Report.tsv

rm Report.tsv
mv Report.tmp.tsv Report.tsv
Report.tsv before running the script for the first time:
Code:
Test Line 1
This is Line number two
Line number is three
Fourth line of original report
At the end of the first run, each line of Report.tsv has a space appended instead of a tab.
From the second run onwards, on the other hand, each line has a tab appended.

I noticed this when Report.tsv was imported into an OpenOffice spreadsheet.
The first set of appended values is merged into the original column (strings), while the subsequent appended values fall into distinct columns.

What is wrong here? Please guide.
Thanks in advance.
 
Old 01-14-2011, 03:59 AM   #2
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
Sed is a line editor. A line will end with a new-line. Using "s/$/\t/" will only add a tab before the end of the line.

You can build up lines and then change all of the newlines (except the last) to tabs. This is usually done by adding a line to the Hold buffer; recalling the hold buffer; and performing a global replace "s/\n/\t/".

Here is an example, extracting one of the records of the lspci output, and replacing the newlines with tabs:
Code:
/sbin/lspci -v | sed -n '/Network controller/,/^$/{ /^$/!H
                                                    /^$/{H;g;s/\n/\t/gp}}'
Note the use of braces to group commands together when you want to perform more than one sed command inside a subrange. The `g' flag is needed at the end of the substitute command `s' to substitute all of the newlines in the line.

If all you want to do is replace all the newlines in a file with tabs, you could use the `tr' program instead:

tr '\n' '\t' <original_file >newfile

Last edited by jschiwal; 01-14-2011 at 04:00 AM.
 
Old 01-14-2011, 04:58 AM   #3
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Original Poster
Rep: Reputation: 0
jschiwal,

Really appreciate your quick reply.
I didn't quite understand why sed adds a space (not new line) at the end of a line when it has been asked to add a tab.
When I run my script a third time, the first two appended columns end up separated by spaces (where there were tabs earlier) and the third by a proper tab.
This is extremely confusing.
Also the same sed command works like a charm if I ask it to add a comma instead of \t at end of each line.
The sed-hold buffer example you presented didn't work for me. Get "sed: -e expression #1, char 25: extra characters after command" error. I am researching more on sed and hold buffer.
And [tr '\n' '\t'] replaces every single \n by a \t so it isn't really helpful for me.

Again, thanks much for the reply.
 
Old 01-14-2011, 05:15 AM   #4
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Hi,

are you sure that it is not a setting in OpenOffice that malforms the file?
Can you post the output of the following command
Code:
od -c Report.tmp.tsv
directly after the sed's have been applied?
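For a quicker check than reading the whole dump, the number of tabs on each line can be printed. This is only a sketch, assuming a POSIX awk; the scratch file and its two sample lines are made up for illustration and are not the thread's report.

```shell
#!/usr/bin/env bash
# Create a scratch file: the first line contains a tab, the second only spaces.
tmp=$(mktemp)
printf 'Test Line one\t44\nLine two 45\n' > "$tmp"

# With the field separator set to a tab, NF - 1 is the number of tabs on a line
# (for non-empty lines).
counts=$(awk -F'\t' '{ print NF - 1 }' "$tmp")
echo "$counts"

rm -f "$tmp"
```

A line that reports 0 has no tab in it at all, which is what a merged column in the spreadsheet would suggest.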
 
Old 01-14-2011, 05:22 AM   #5
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 360

Rep: Reputation: 170
Tabs need to be quoted to survive being echoed
Code:
line=abc$'\t'123    #  $'\t' is a tab character in bash
echo $line
abc 123             # tab has changed to a space

echo "$line"
abc     123          # tab has been preserved
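A minimal sketch of why this happens, assuming bash with the default IFS (space, tab, newline): an unquoted expansion is split into words at every IFS character, and echo then joins its arguments with single spaces. The variable names `unquoted` and `quoted` are illustrative only.

```shell
#!/usr/bin/env bash
# Build a string that contains a real tab character.
line=$(printf 'abc\t123')

# Unquoted: the shell splits the value at the tab, echo receives two
# arguments and joins them with a single space.
unquoted=$(echo $line)

# Quoted: echo receives one argument with the tab intact.
quoted=$(echo "$line")

# Dump the bytes so the difference is visible.
printf '%s' "$unquoted" | od -c
printf '%s' "$quoted" | od -c
```

The first dump shows a space between `c` and `1`, the second shows `\t`.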
 
Old 01-14-2011, 05:36 AM   #6
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
I cut and pasted my posted example. It didn't have an error.

If your file is highly structured, consider using awk instead of sed. However, using tabs as record separators instead of field separators is very odd. Normally, tabs separate fields in a record, and newlines separate records.
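As a rough sketch of what an awk version could look like for the report in this thread: the sample lines stand in for Report.tsv, the starting value 44 is taken from the original script, and the input is assumed to contain no empty lines (they would be counted by `NR` here).

```shell
#!/usr/bin/env bash
# Sample input standing in for Report.tsv.
input=$(printf 'Test Line 1\nThis is Line number two\nLine number is three\n')

# OFS is a tab, so "print $0, x" joins the line and the new value with a tab.
# The first line gets the date; later lines get val + (NR - 1), which matches
# the counter in the original script (45 for the second line, and so on).
output=$(printf '%s\n' "$input" |
    awk -v OFS='\t' -v val=44 -v dt="$(date +%d-%m-%y)" '
        NR == 1 { print $0, dt; next }
                { print $0, val + NR - 1 }')

printf '%s\n' "$output" | od -c
```

Because awk writes the separator itself, there is no echo and no unquoted expansion that could turn a tab into a space.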

Also, put your sed commands in double quotes if you use bash variables. An alternative is to enclose fixed text in single quotes, and variables in double quotes. You need to do the latter if you use `$' in a sed command meaning end of line.

Code:
head kmenu.trace | sed "s/^/$Date\t/"
14-01-11        execve("/usr/bin/kmenuedit", ["/usr/bin/kmenuedit"], [/* 92 vars */]) = 0
14-01-11        brk(0)                                  = 0x602000
14-01-11        mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0049ead000
14-01-11        access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
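A small sketch of the double-quoting advice applied to the end-of-line case. `\t` in the replacement is understood by GNU sed but is not guaranteed elsewhere, so this version (an assumption about portability needs, not something the thread requires) passes a literal tab held in a shell variable. The names `tab`, `val` and `result` and the sample text are illustrative.

```shell
#!/usr/bin/env bash
tab=$(printf '\t')      # a literal tab character
val=44

# Double quotes let the shell expand $tab and $val before sed runs.
# The "$" in "$/" stays literal because "/" cannot start a variable name.
result=$(printf '%s\n' 'Test Line 1' | sed "s/$/${tab}${val}/")

printf '%s\n' "$result" | od -c
```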

Last edited by jschiwal; 01-14-2011 at 05:37 AM.
 
Old 01-14-2011, 05:47 AM   #7
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by crts View Post
Hi,

are you sure that it is not a setting in OpenOffice that malforms the file?
Can you post the output of the following command
Code:
od -c Report.tmp.tsv
directly after the sed's have been applied?
In fact, OpenOffice asks me beforehand which delimiter to use, and I select tab.

Output of
Code:
od -c Report.tmp.tsv
after 1st run:

Code:
0000000   T   e   s   t       L   i   n   e       o   n   e  \t   1   4
0000020   -   0   1   -   1   1  \n   T   h   i   s       i   s       L
0000040   i   n   e       n   u   m   b   e   r       t   w   o  \t   4
0000060   5  \n   L   i   n   e       n   u   m   b   e   r       i   s
0000100       t   h   r   e   e  \t   4   6  \n   F   o   u   r   t   h
0000120       l   i   n   e       o   f       o   r   i   g   i   n   a
0000140   l       r   e   p   o   r   t  \t   4   7  \n
0000154
The above file (generated after the first round of sed commands) opens successfully in OpenOffice with tab as the delimiter.

The following is the output of
Code:
od -c Report.tmp.tsv
after second run:


Code:
0000000   T   e   s   t       L   i   n   e       o   n   e       1   4
0000020   -   0   1   -   1   1  \t   1   4   -   0   1   -   1   1  \n
0000040   T   h   i   s       i   s       L   i   n   e       n   u   m
0000060   b   e   r       t   w   o       4   5  \t   4   5  \n   L   i
0000100   n   e       n   u   m   b   e   r       i   s       t   h   r
0000120   e   e       4   6  \t   4   6  \n   F   o   u   r   t   h    
0000140   l   i   n   e       o   f       o   r   i   g   i   n   a   l
0000160       r   e   p   o   r   t       4   7  \t   4   7  \n
0000176
When this one is opened in OpenOffice with tab as the delimiter, the first set of appended values is merged with the row headers.

Thanks.
 
Old 01-14-2011, 06:34 AM   #8
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Kenhelm View Post
Tabs need to be quoted to survive being echoed
Code:
line=abc$'\t'123    #  $'\t' is a tab character in bash
echo $line
abc 123             # tab has changed to a space

echo "$line"
abc     123          # tab has been preserved
I understand.
I'm indeed on bash.
Wrote this script to test:

Code:
line="Temporary line number one"
echo $line
val=55
some=`echo $line | sed s/$/'\t'$val/`
echo "$some"

some=`echo $some | sed s/$/'\t'$val/`
echo "$some"
output:

Code:
Temporary line number one
Temporary line number one	55
Temporary line number one 55	55
Looks like for each new tab all previous tabs are lost in bash shell.
If I wrap $some by more than one set of quotes (") all the tabs are lost including the last one.
Wonder how can I solve this easily.
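The earlier tab is lost in the second step because `$some` is expanded without quotes there. A sketch of the same test with every expansion quoted, assuming GNU sed (which accepts `\t` in the replacement):

```shell
#!/usr/bin/env bash
line="Temporary line number one"
val=55

# First append: quote "$line" so echo gets the text unchanged.
some=$(echo "$line" | sed "s/$/\t$val/")

# Second append: quote "$some" as well, otherwise the tab added above
# is split away by the shell and replaced by a space.
some=$(echo "$some" | sed "s/$/\t$val/")

printf '%s\n' "$some" | od -c
```

With both expansions quoted, the dump should show a `\t` before each `55`.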

Thanks.
 
Old 01-14-2011, 06:41 AM   #9
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Original Poster
Rep: Reputation: 0
jschiwal

I completely agree that tabs are normally used to separate fields and new-lines to separate records.
This particular report is to be updated on a nightly basis, and it should be easy to compare values for subsequent days. Hence the upside-down design!

Thanks for the pointer to awk and importance of quotes in sed.
 
Old 01-14-2011, 06:42 AM   #10
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
After your post #7 I realize that you are running the script twice on the same file. I initially assumed that you were processing several different files with your script. In this case, double-quoting like
Code:
echo "$line" ...
should take care of the issue, as suggested by Kenhelm.
 
Old 01-14-2011, 06:45 AM   #11
kaprasanna1
LQ Newbie
 
Registered: Jan 2011
Posts: 6

Original Poster
Rep: Reputation: 0
Following script seems to have done the trick:
Code:
i=0
val=44
while read line
do
    #Ignore empty lines
    case "$line" in
        "") echo >> Report.tmp.tsv; continue;;
        esac
    case "$i" in
         0)
            dt=`date  +%d-%m-%y`
            some=`echo "$line" | sed s/$/'\t'$dt/`
            echo "$some" >> Report.tmp.tsv;;
         1) some=`echo "$line" | sed s/$/'\t'$val/`
            echo "$some" >> Report.tmp.tsv;;
         2) some=`echo "$line" | sed s/$/'\t'$val/`
            echo "$some" >> Report.tmp.tsv;;
         3) some=`echo "$line" | sed s/$/'\t'$val/`
            echo "$some" >> Report.tmp.tsv;;
        esac
  i=$(($i+1))
  val=$(($val+1))
done < Report.tsv
rm Report.tsv
mv Report.tmp.tsv Report.tsv
Output of od -c Report.tsv:

Code:
0000000   T   e   s   t       L   i   n   e       o   n   e  \t   1   4
0000020   -   0   1   -   1   1  \t   1   4   -   0   1   -   1   1  \n
0000040   T   h   i   s       i   s       L   i   n   e       n   u   m
0000060   b   e   r       t   w   o  \t   4   5  \t   4   5  \n   L   i
0000100   n   e       n   u   m   b   e   r       i   s       t   h   r
0000120   e   e  \t   4   6  \t   4   6  \n   F   o   u   r   t   h    
0000140   l   i   n   e       o   f       o   r   i   g   i   n   a   l
0000160       r   e   p   o   r   t  \t   4   7  \t   4   7  \n
0000176
Thanks.
 
Old 01-14-2011, 06:54 AM   #12
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by kaprasanna1 View Post
If I wrap $some by more than one set of quotes (") all the tabs are lost including the last one.
Wonder how can I solve this easily.
Not sure why you want to use 'multiple' enclosing quotes. However, if you use an even number of pairs of double quotes, then you are not actually enclosing your variable.
Example:
Code:
echo "" $line ""
              ^^ opening and closing quote.
     ^^ opening and closing quote.
As you can see, the quotes are not interpreted as outer and inner quotes. Do you want to echo the quotes themselves? As in
Code:
echo "$line"
"content of line"
Then you will have to escape the double-quotes when you assign the value to line:
Code:
line="\"content of line\""
Please elaborate a bit more on the situation that triggers this issue. I am not sure if I fully understand what you are trying to do.
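The point about paired quotes can be checked by counting how many arguments the command actually receives. This is only a sketch; `count_args` is a hypothetical helper written for the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of arguments it was given.
count_args() { echo "$#"; }

line=$(printf 'abc\t123')

# "" $line "" : two empty quoted strings around an UNQUOTED $line, so
# $line is still word-split into abc and 123 (four arguments in total).
doubled=$(count_args "" $line "")

# "$line" : a single argument, with the tab preserved inside it.
single=$(count_args "$line")

echo "doubled=$doubled single=$single"
```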
 
Old 01-14-2011, 09:24 AM   #13
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Maybe this can give you some ideas:
Code:
i=0
val=44

while read line
do
    #Ignore empty lines
    [[ -z $line ]] && continue

    case $((i++)) in
        0)  dt=`date  +%d-%m-%y`;;
    [1-3])  (( dt = val++ ));;
    esac

    echo "$line" | sed "s/$/\t$dt/" >> Report.tmp.tsv
done < Report.tsv
rm Report.tsv
mv Report.tmp.tsv Report.tsv
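The same idea can also be written without sed or echo on the data at all, which sidesteps the quoting problem. A sketch under a few assumptions: bash, `IFS= read -r` so that leading whitespace and backslashes survive, and empty lines copied through unchanged as in the original script. The function name `append_column` is hypothetical, and unlike the original `case` on 0 to 3 it appends a value to every non-empty line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one tab-separated column to each
# non-empty line of the file named in $1.
append_column() {
    local file=$1 tmp i=0 val=44 extra line
    tmp=$(mktemp) || return 1

    # IFS= and -r stop read from trimming whitespace or eating backslashes.
    while IFS= read -r line; do
        if [[ -z $line ]]; then
            echo >> "$tmp"            # keep empty lines as they are
            continue
        fi
        if (( i == 0 )); then
            extra=$(date +%d-%m-%y)   # first line gets the date
        else
            extra=$val                # later lines get the counter value
        fi
        (( i++, val++ ))
        # printf writes the tab itself; no word splitting is involved.
        printf '%s\t%s\n' "$line" "$extra" >> "$tmp"
    done < "$file"

    mv "$tmp" "$file"
}
```

Calling `append_column Report.tsv` once per night would add one column per run, and earlier tabs stay intact because the line is only ever expanded inside quotes.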
 
  

