Shell question: pad the end of each line with spaces to 80 chars?

di11rod · 04-14-2011, 06:19 PM

I've searched the forums and the google looking for a means to do this and haven't found anything I can use.

I have a large file that looks like this:

Code:

18000034161828M850
18000034172676M850
98   093095
D    01   6H0NH    031111    18AH001HBS
18000032384257M850
18000032225047M850
18000032550899M850
18000032561615M850
18000032342743M850

I need to add spaces at the end of each line to ensure that every line has 80 chars before the carriage return. I was thinking something like this, but it doesn't do the right thing:

Code:

cat filename | sed -e 's/$/(bunch of spaces)/' | cut -c1-80 > filename2

Any ideas? I'm on fedora, so I can use awk, sed, bash, ksh, etc.

Appreciatively,

di11rod

theNbomr · 04-14-2011, 06:42 PM

This might do the trick.

Code:

while read line; do printf "%-80s\n" $line; done < filename

--- rod.

kurumi · 04-14-2011, 08:14 PM

Ruby(1.9+)

Code:

$ ruby -i.bak -ne 'puts $_.chomp.ljust(80)' file

di11rod · 04-15-2011, 12:09 AM

Thank you so much for these suggestions! I'll try them when I get to work tomorrow.

Appreciatively,

di11rod

Telengard · 04-15-2011, 02:09 AM

Quote:

Originally Posted by theNbomr

Code:

while read line; do printf "%-80s\n" $line; done < filename

Should work in AWK too

Code:

awk '{printf "%-80s\n", $1}' filename

di11rod · 04-15-2011, 11:02 AM

Kurumi-

I think your suggestion just cuts the line to no greater than 80 chars. At least that's what the filename.bak file looks like.

Code:

$ ruby -i.bak -ne 'puts $_.chomp.ljust(80)' file

thenbomr-

Code:

while read line; do printf "%-80s\n" $line; done < onefile > twofile

Yours is the closest. It puts 80 chars at the end of each line, but some of my lines have spaces between elements, and your solution chops each at the space and puts those values on separate lines.

So, the lines that look like this come out fine as desired:

Code:

18000034002519M850

But the lines that look like this get broken up into separate lines:

Code:

98   098100
D    01   6H0NH    031111    18AH001HBS

Is there a way to tie the delimiter to an existing carriage return instead of a space?

Telengard-

The awk syntax doesn't work because it is referencing field one with the $1 part. On the lines with spaces, it breaks each value out to separate lines. If there were a way to avoid referencing the fields and treat the entire line as a field...

Thanks for these suggestions. If there's a way to tweak theNbomr's solution so it deals with the lines with spaces, then I'm golden.

Appreciatively,

di11rod

di11rod · 04-15-2011, 11:39 AM

I apologize for being a lazy baby asking the programming forum to spoon-feed me a solution. Duh. I figured it out. I used sed to replace the blanks with ':', then used the awk syntax provided by Telengard, then used sed to change the colons back to spaces.

I love the elegance of command-line scripting. Here is the solution!

Code:

cat onefile | sed 's/ /:/g' | awk '{printf "%-80s\n", $1}' | sed 's/:/ /g' > twofile
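One caveat with the colon round trip: it assumes the data never contains a ':' of its own. A minimal sketch of the problem follows; the `roundtrip` function name and the sample input line are made up for illustration and are not from the original file.

```shell
# Same pipeline as above, wrapped in a function so it can read stdin.
roundtrip() {
  sed 's/ /:/g' | awk '{printf "%-80s\n", $1}' | sed 's/:/ /g'
}

# Hypothetical input line that already contains a colon.
printf 'a:b c\n' | roundtrip | cut -c1-5
```

The colon that was in the input comes back as a space, so `a:b c` turns into `a b c`. If the real data can never contain colons, the pipeline is safe on that count.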

Thanks for taking the time to help me out.

Appreciatively,

di11rod

Telengard · 04-15-2011, 01:11 PM

Quote:

Originally Posted by di11rod

The awk syntax doesn't work because it is referencing field one with the $1 part. On the lines with spaces, it breaks each value out to separate lines. If there were a way to avoid referencing the fields and treat the entire line as a field...

Sorry, that was my mistake. I was testing it that way and forgot to change it. I should have posted this:

Code:

foo$ awk '{printf "%-80s\n", $0}' onefile
18000034161828M850
18000034172676M850
98   093095
D    01   6H0NH    031111    18AH001HBS
18000032384257M850
18000032225047M850
18000032550899M850
18000032561615M850
18000032342743M850
foo$

Quote:

Originally Posted by di11rod

Code:

cat onefile | sed 's/ /:/g' | awk '{printf "%-80s\n", $1}' | sed 's/:/ /g' > twofile

I'm quite certain it can be done much more elegantly and efficiently, if you are willing to put a little more time and study into it.

We can eliminate the UUOC (useless use of cat). The sed program accepts the name of the file to process as a second argument, so it can read the file directly. That gives us this:

Code:

foo$ sed 's/ /:/g' onefile | awk '{printf "%-80s\n", $1}' | sed 's/:/ /g'
18000034161828M850
18000034172676M850
98   093095
D    01   6H0NH    031111    18AH001HBS
18000032384257M850
18000032225047M850
18000032550899M850
18000032561615M850
18000032342743M850
foo$

Now combine that knowledge with what we learned from the corrected awk command above. The sed command is not needed at all; awk can do the entire job itself.

Code:

foo$ awk '{printf "%-80s\n", $0}' onefile
18000034161828M850
18000034172676M850
98   093095
D    01   6H0NH    031111    18AH001HBS
18000032384257M850
18000032225047M850
18000032550899M850
18000032561615M850
18000032342743M850
foo$

To demonstrate that the lines are really padded as you want I will use tr to turn all the spaces into hyphens.

Code:

foo$ awk '{printf "%-80s\n", $0}' onefile | tr ' ' '-'
18000034161828M850--------------------------------------------------------------
18000034172676M850--------------------------------------------------------------
98---093095---------------------------------------------------------------------
D----01---6H0NH----031111----18AH001HBS-----------------------------------------
18000032384257M850--------------------------------------------------------------
18000032225047M850--------------------------------------------------------------
18000032550899M850--------------------------------------------------------------
18000032561615M850--------------------------------------------------------------
18000032342743M850--------------------------------------------------------------
foo$
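Another way to check the padding, sketched here under the assumption that a scratch directory is acceptable, is to let awk report the length of every output line. The `tmpdir` and `lengths` variable names are only for this illustration, and the sample lines are a subset of the data shown above.

```shell
# Scratch directory so no real files are overwritten.
tmpdir=$(mktemp -d)
printf '%s\n' '18000034161828M850' '98   093095' \
  'D    01   6H0NH    031111    18AH001HBS' > "$tmpdir/onefile"

awk '{printf "%-80s\n", $0}' "$tmpdir/onefile" > "$tmpdir/twofile"

# List the distinct line lengths in the output.
# A single "80" means every line was padded to 80 characters.
lengths=$(awk '{ print length($0) }' "$tmpdir/twofile" | sort -u)
echo "$lengths"
```

If any input line were longer than 80 characters, its length would show up as an extra value in that list.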

So here's my final answer on how to pad lines to 80 characters with spaces using AWK.

Code:

awk '{printf "%-80s\n", $0}' FILENAME

If the input file is named onefile and the output file is to be named twofile then it can be invoked thusly:

Code:

awk '{printf "%-80s\n", $0}' onefile > twofile
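If the width ever needs to change, one possible variation is to pass it in as an awk variable and build the format string from it. This is only a sketch: the `pad_to` function name and the `w` and `fmt` variables are made up here.

```shell
# pad_to WIDTH: pad each line of stdin with spaces to at least WIDTH characters.
pad_to() {
  awk -v w="$1" 'BEGIN { fmt = "%-" w "s\n" } { printf fmt, $0 }'
}

# Show the padding as hyphens, using a width of 10 to keep the output short.
printf 'abc\n' | pad_to 10 | tr ' ' '-'
```

For a file it would be invoked as `pad_to 80 < onefile > twofile`.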

If you are processing your file with awk, then use the full capabilities of the AWK language to do as much of the work as possible. I can't think of a single reason to ever pipe sed into awk or the reverse.

Many more solutions are possible. Unix, and Linux by extension, is rife with text processing programs. Each has its own place and serves its own purpose. My own approach is to learn one at a time as well as I can before moving on to the next. AWK is my current obsession, but maybe soon it will be sed, ruby, or python.

If you feel your question has been answered then please consider using the thread tools to mark this thread solved. Otherwise please post back to let us know what problems remain.

HTH

theNbomr · 04-15-2011, 02:48 PM

Quote:

Originally Posted by di11rod

Kurumi-

thenbomr-

Code:

while read line; do printf "%-80s\n" $line; done < onefile > twofile

Yours is the closest. It puts 80 chars at the end of each line, but some of my lines have spaces between elements, and your solution chops each at the space and puts those values on separate lines.

So, the lines that look like this come out fine as desired:

Code:

18000034002519M850

But the lines that look like this get broken up into separate lines:

Code:

98   098100
D    01   6H0NH    031111    18AH001HBS

Is there a way to tie the delimiter to an existing carriage return instead of a space?

Sorry, I don't know how I managed to post the wrong version. This seems to work.

Code:

while read line; do printf "%-80s\n" "$line"; done < filename

--- rod.

AnanthaP · 04-16-2011, 05:57 AM

In awk.

{ if (length($0) < 80) printf "%-80s\n", $0 ;
  else print $0 }

H_TeXMeX_H · 04-16-2011, 07:07 AM

Quote:

Originally Posted by AnanthaP

In awk.

{ if (length($0) < 80) printf "%-80s\n", $0 ;
  else print $0 }

What if the length is greater than 80? You don't need to answer, but that's the first thing I thought of when I saw this thread.

kurumi · 04-16-2011, 10:36 PM

Quote:

Originally Posted by di11rod

Kurumi-

I think your suggestion just cuts the line to no greater than 80 chars. At least that's what the filename.bak file looks like.

Code:

$ ruby -i.bak -ne 'puts $_.chomp.ljust(80)' file

filename.bak is the original file. ljust() is not used for "cutting".

AnanthaP · 04-16-2011, 11:54 PM

Quote:

I think your suggestion just cuts the line to no greater than 80 chars. At least that's what the filename.bak file looks like.

I think the OP didn't want to cut lines > 80 in length. So I just added the else part in awk.

Clearly, the end application must decide what to do with lines longer than 80 characters. But you don't want to lose the additional bytes when coding a module that pads to a length of 80.
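For what it's worth, the 80 in `%-80s` is a minimum field width, so printf on its own does not cut longer lines. Cutting only happens if a precision is added, as in `%-80.80s`. A small sketch to compare the two; the variable names `long`, `kept` and `cut80` are just for this example.

```shell
# A 90-character test line made of zeros.
long=$(printf '%090d' 0)

# Minimum width only: the long line passes through at its full length.
kept=$(printf '%s\n' "$long" | awk '{printf "%-80s\n", $0}' | awk '{ print length($0) }')

# Width plus precision: the line is cut to 80 characters.
cut80=$(printf '%s\n' "$long" | awk '{printf "%-80.80s\n", $0}' | awk '{ print length($0) }')

echo "$kept $cut80"
```

The first number stays at 90 and the second drops to 80, so which format to use depends on whether the end application wants long lines kept or cut.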

gnashley · 04-17-2011, 03:53 AM

Set the IFS to newline to take care of lines with spaces, and simply echo any lines longer than 80 chars. Why use sed/awk/perl/ruby/python when bash can do it all?

Code:

( IFS=$'\n' ; while read -r line; do
  if [[ ${#line} -gt 80 ]] ; then
    echo "$line"
  else
    printf "%-80s\n" "$line"
  fi
done < filename )
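A commonly used alternative to setting IFS to newline is to clear IFS for the `read` command only and add `-r`. This is a sketch rather than a drop-in replacement, and the `pad80` function name is made up here. Clearing IFS keeps leading and trailing whitespace on each line, and `-r` stops `read` from treating backslashes as escapes.

```shell
pad80() {
  # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
  # The || test also handles a final line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%-80s\n' "$line"
  done
}

# Hypothetical input with leading spaces and a backslash; show the first 5 chars.
printf '  a\\b\n' | pad80 | cut -c1-5
```

It reads stdin, so a file would be processed with `pad80 < filename`. As with the awk version, lines longer than 80 characters pass through unchanged because `%-80s` only sets a minimum width.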

konsolebox · 04-17-2011, 03:56 AM

Quote:

Originally Posted by di11rod

thenbomr-

Code:

while read line; do printf "%-80s\n" $line; done < onefile > twofile

Yours is the closest. It puts 80 chars at the end of each line, but some of my lines have spaces between elements, and your solution chops each at the space and puts those values on separate lines.

Perhaps $line should just be enclosed in double quotes to prevent it from expanding to multiple arguments:

Code:

while read line; do printf "%-80s\n" "$line"; done < onefile > twofile
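To see why the quotes matter, here is a small sketch using one of the sample lines and a width of 15 to keep the output short. The `unquoted` and `quoted` variable names are just for this example, and the default IFS is assumed.

```shell
line='98   093095'

# Unquoted: the shell splits $line into two words, and printf
# reuses the format once per argument, giving two output lines.
unquoted=$(printf '[%-15s]\n' $line)

# Quoted: printf receives one argument and prints one line.
quoted=$(printf '[%-15s]\n' "$line")

printf '%s\n' "$unquoted" "$quoted"
```

The unquoted call prints `98` and `093095` on separate padded lines, which matches the broken-up output described in the quote above. The quoted call prints the whole line once with its inner spaces intact.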

Another way with bash:

Code:

SPACES='                                                                                '
while read LINE; do TEMP=$LINE$SPACES; echo "${TEMP:0:80}"; done < onefile > twofile
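Counting out 80 spaces by hand is easy to get wrong, so one option is to let printf build the string. A minimal sketch, reusing the SPACES name from the code above:

```shell
# Build a string of exactly 80 spaces instead of typing it out.
# printf pads the empty string to width 80 and emits no newline.
SPACES=$(printf '%80s' '')

# Confirm the length.
echo "${#SPACES}"
```

The assignment can stand in for the literal SPACES='...' line; the rest of the loop stays the same.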

Placing in a script:

script.sh

Code:

#!/bin/bash
SPACES='                                                                                '
while read LINE; do TEMP=$LINE$SPACES; echo "${TEMP:0:80}"; done

Code:

bash script.sh < onefile > twofile

---- Update ----

To prevent chopping longer lines:

Code:

#!/bin/bash
SPACES='                                                                                '
while read LINE; do
    if [[ ${#LINE} -ge 80 ]]; then
        echo "$LINE"
    else
        TEMP=$LINE$SPACES
        echo "${TEMP:0:80}"
    fi
done