LinuxQuestions.org
Old 06-30-2012, 10:09 AM   #1
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Rep: Reputation: 76
Substituting some characters in a text file (shell script).


Hi:

I have a text file from which I want to eliminate all newlines save in the case where two consecutive newlines are present. That is, one possible algorithm would be the following.

Code:
1. p = p + 1                  # advance character pointer
2. if char at position p = \n
       if char at position p+1 = \n
           p = p + 1          # keep both newlines and skip past the second
       else
           replace char at p with ' '
3. goto step 1
If I want to implement this in Bash, then
(a) I would begin with a while loop.
(b) I must treat p as a numeric variable.
(c) Is readline able to read character by character?
(d) And what statement would I use to write to a file?
(e) Would it not be easier to have two files: one input file and one output file?

Could you give me some hints covering these points?

Last edited by stf92; 06-30-2012 at 10:11 AM.
 
Old 06-30-2012, 11:49 AM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Code:
sed '/^$/d' oldfile > newfile
Not exactly what you requested: this one deletes all empty lines. I suspect that sed can also be used for the problem as stated.
 
Old 06-30-2012, 01:24 PM   #3
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
I consider myself able to write the script. If only I knew how to write one character at a time, in the style of C's fputc, putc and putchar. But by reading the bash manual the only builtin command that does output is printf. For input, on the other hand, there is the read builtin command.
 
Old 06-30-2012, 02:35 PM   #4
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
Code:
$ help read
read: read [-ers] [-u fd] [-t timeout] [-p prompt] [-a array] [-n nchars] [-d delim] [name ...]
...
If -n is supplied with a non-zero NCHARS argument, read returns after
NCHARS characters have been read.
$ help echo
echo: echo [-neE] [arg ...]
     Output the ARGs.  If -n is specified, the trailing newline is
    suppressed. ...
Quote:
If only I knew how to write one character at a time, in the style of C's fputc, putc and putchar. But by reading the bash manual the only builtin command that does output is printf.
You can use printf or echo for output:
Code:
$ echo -n x ; echo -n y ; echo -n z ; echo
xyz
$ printf x ; printf y ; printf z ; printf '\n'
xyz
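A putchar-style helper can be built on printf. This is only a sketch: the name put_char is made up for illustration and is not something bash provides. The '%s' format is used so that the character itself is never interpreted as a format string or as an option.

```bash
# put_char CHAR: hypothetical helper, writes its argument with no trailing newline.
# The '%s' format keeps characters such as % or \ from being read as directives.
put_char() {
    printf '%s' "$1"
}

put_char x; put_char y; put_char z; printf '\n'
```

Because the helper only writes to standard output, it can be redirected like any other command, for example put_char x >&2.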
 
Old 06-30-2012, 02:43 PM   #5
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
Thanks ntubski. 'read -u fd' reads input from file descriptor fd. But then there must be a write command that writes to a given file descriptor!

Last edited by stf92; 06-30-2012 at 03:11 PM.
 
Old 06-30-2012, 03:24 PM   #6
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
You can use output redirection (this works for any command, not just echo):
Code:
echo -n x >file  # write the character 'x' to "file"
echo -n x >&2    # write the character 'x' to standard error
Actually you could use input redirection instead of -u for read as well.
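On the question of writing to a given file descriptor: a descriptor can be opened with exec and then used as a redirection target. A minimal sketch, assuming descriptors 3 and 4 are free; copy_lines and the file names are placeholders chosen for the example.

```bash
# copy_lines INFILE OUTFILE: hypothetical helper that copies a file line by line
# through explicitly opened file descriptors.
copy_lines() {
    exec 3< "$1" 4> "$2"                # open fd 3 for reading, fd 4 for writing
    while IFS= read -r line <&3; do     # read from fd 3 (bash also accepts: read -u 3)
        printf '%s\n' "$line" >&4       # write to fd 4
    done
    exec 3<&- 4>&-                      # close both descriptors
}

tmpdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$tmpdir/in.txt"
copy_lines "$tmpdir/in.txt" "$tmpdir/out.txt"
cat "$tmpdir/out.txt"
```

The descriptors are opened once and stay open for the whole loop, which is the closest shell equivalent to C's fopen followed by repeated fputc calls.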
 
Old 06-30-2012, 03:56 PM   #7
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
Yes,
while
done < "$infile"

The thing is that I must either modify the file in place or else take the file as input and write the output to another file. But I think I now have enough material to start working out how to do it. Thanks a lot.

Last edited by stf92; 06-30-2012 at 05:04 PM.
 
Old 06-30-2012, 05:19 PM   #8
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Your original problem statement:
Quote:
I have a text file from which I want to eliminate all newlines save in the case where two consecutive newlines are present.
A solution:
Code:
sed -n 'h; :1 n; /./{H; b1}; /^$/p; x; s/\n/ /g; p' oldfile > newfile
I've done a few tests of this, but I won't advertise it as bulletproof.

Last edited by pixellany; 06-30-2012 at 05:20 PM. Reason: typo
 
Old 06-30-2012, 07:50 PM   #9
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
It seems I have no choice but to get familiar with sed. Thanks a lot, pixellany. The command seems to work fine.
 
Old 06-30-2012, 10:04 PM   #10
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
awk version; a bit longer but more readable, I think.
Code:
awk '{if (length($0)) printf("%s ",$0); else print""} END{print""}' oldfile > newfile
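A variation on the same idea uses awk's paragraph mode. This is a sketch under some assumptions: paragraphs are separated by one or more blank lines, the wanted output is one line per paragraph with a single blank line between them, and join_paragraphs is just a name picked for the example.

```bash
# join_paragraphs [FILE...]: hypothetical helper; reads stdin when no file is given.
join_paragraphs() {
    awk 'BEGIN { RS = "" }              # empty RS: records are blank-line-separated paragraphs
         {
             gsub(/\n/, " ")            # newlines inside a paragraph become spaces
             printf "%s%s\n", sep, $0   # sep is empty before the first paragraph
             sep = "\n"                 # afterwards it supplies the blank line
         }' "$@"
}

printf 'To be or not to be.\nThat is the question.\n\nWhether nobler\nin the mind,\n' | join_paragraphs
```

With files it would be called as join_paragraphs oldfile > newfile. Unlike the one-liner above, it leaves no trailing space at the end of each paragraph.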
 
Old 06-30-2012, 10:31 PM   #11
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
Thank you ntubski. Some day I'll get quite familiar with sed and awk. In the meantime, I'll keep your examples in order to study them in the future. I think that using the given file as input and letting the modified file be another file (output file) I can do it within a while loop and input/output redirection. For instance, for output, I'll use something like

echo -n $var1 >>outfile.
 
Old 06-30-2012, 11:50 PM   #12
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 4,442

Original Poster
Rep: Reputation: 76
Code:
semoi@darkstar:~/script/el_mio$ cat f1
#!/bin/bash

# 1. Read a char from infile
# 2. Output it to stdout
# 3. Goto step 1

while read -n 1 car1  # -n 1 reads only one char
do
  echo -n $car1       # -n: do not output \n  
done < infile   
exit
semoi@darkstar:~/script/el_mio$ cat infile
To be or not to be. That is the question.
Whether 'tis nobler in the mind,
semoi@darkstar:~/script/el_mio$ ./f1
Tobeornottobe.Thatisthequestion.Whether'tisnoblerinthemind,semoi@darkstar:~/script/el_mio$
As you can see, either read or echo eats the spaces. Also, line terminators. Why?
 
Old 07-01-2012, 02:47 AM   #13
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428
Hi.

From `help read':
Quote:
Read a line from the standard input and split it into fields.

Reads a single line from the standard input, or from file descriptor FD
if the -u option is supplied. The line is split into fields as with word
splitting, and the first word is assigned to the first NAME, the second
word to the second NAME, and so on, with any leftover words assigned to
the last NAME. Only the characters found in $IFS are recognized as word
delimiters.
So to preserve spaces try this:
Code:
IFS="" read -n1 x && echo "[$x]"
i.e. set IFS to the empty string, so that read does no word splitting.
To preserve newlines as well, add -d "":
Code:
while IFS="" read -r -n 1 -d "" car1  # -n 1: read one char; -d "": newline is not the delimiter; -r: keep backslashes
do
	echo -n "$car1"       # -n: do not output \n
done
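Putting the pieces from this thread together, the algorithm from the first post could be sketched roughly as follows. Some assumptions are mine: the function name join_chars is invented, runs of two or more newlines are passed through unchanged, and a final newline is written at the end of the output instead of a trailing space.

```bash
# join_chars: hypothetical name; reads stdin one character at a time (bash only).
join_chars() {
    local c i pending=0
    while IFS= read -r -n 1 -d '' c; do
        if [[ $c == $'\n' ]]; then
            pending=$((pending + 1))    # count newlines, decide at the next character
            continue
        fi
        if (( pending == 1 )); then
            printf ' '                  # a lone newline becomes a space
        else
            for (( i = 0; i < pending; i++ )); do
                printf '\n'             # two or more newlines are kept as they are
            done
        fi
        pending=0
        printf '%s' "$c"
    done
    if (( pending > 0 )); then
        printf '\n'                     # assumption: end the output with one newline
    fi
}

printf 'To be or not to be.\nThat is the question.\n\nWhether nobler\n' | join_chars
```

Counting pending newlines avoids the need to look ahead at position p+1, which read cannot do on a stream.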

Last edited by firstfire; 07-01-2012 at 02:50 AM.
 
Old 07-01-2012, 03:09 AM   #14
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Quote:
Originally Posted by stf92
Thank you ntubski. Some day I'll get quite familiar with sed and awk. In the meantime, I'll keep your examples in order to study them in the future. I think that using the given file as input and letting the modified file be another file (output file) I can do it within a while loop and input/output redirection. For instance, for output, I'll use something like

echo -n $var1 >>outfile.
That will work but will be gruesomely slow compared to an awk or sed solution. No matter for input files of a few hundred characters but for bigger files processed regularly ...
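Part of the cost is that >>outfile inside the loop reopens the output file for every single character. A sketch that keeps the while loop but redirects it only once is below; copy_chars is a made-up name, and this will still be far slower than sed or awk on large files.

```bash
# copy_chars INFILE OUTFILE: hypothetical helper (bash only), copies character by character.
copy_chars() {
    local c
    while IFS= read -r -n 1 -d '' c; do
        printf '%s' "$c"
    done < "$1" > "$2"      # both files are opened once, for the whole loop
}

tmpdir=$(mktemp -d)
printf 'To be or not to be.\nThat is the question.\n' > "$tmpdir/infile"
copy_chars "$tmpdir/infile" "$tmpdir/outfile"
cat "$tmpdir/outfile"
```

The redirections after done apply to every command inside the loop, so nothing inside the loop needs its own > or >>.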
 
Old 07-01-2012, 07:10 AM   #15
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Quote:
Originally Posted by ntubski
awk version; a bit longer but more readable.......
Awwwww, that takes all the fun out of it..

Seriously, I am jealous of the awk wizards---someday, I'll learn it.
 
  


All times are GMT -5. The time now is 10:47 PM.
