LinuxQuestions.org
Old 08-02-2010, 12:50 PM   #1
twoleggedtripod
LQ Newbie
 
Registered: Aug 2010
Posts: 3

Rep: Reputation: 0
Replacing word occurrence with an increasing number in a file using bash


Hi there.

I have a file in the form below, and wish to replace each start line with an increasing number. So instead of:

Code:
start
content content
start
content content
start
content content
I want to generate:

Code:
1
content content
2
content content
3
content content
I've tried this code in bash:

Code:
for ((a=1; a=100 ; a++))
do
        sed 's/start/'$a'/' input > output
done
to make it work but unfortunately it just spits out the following:

Code:
100
content content
100
content content
100
content content
After several searches and a bit of messing around, it's clear I'm missing something, so was wondering if anyone could offer any insight?

Thanks a lot
 
Old 08-02-2010, 12:58 PM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
Well, I would not agree with the method you are using, but the reason your for statement is not working is that you assign the variable two different values: first the value 1 and then the value 100. Let us have a look at the makeup of the for statement and see if that helps you:

for(<blah>;<foo>;<bar>)

for - command or statement to be used

<blah> - set variable to initial value

<foo> - provide a test (expression) that decides when to stop; the loop keeps running while it is true

<bar> - increase variable by a set amount

Your issue is at the <foo> stage: you have assigned the value 100 to the variable instead of testing whether it has reached 100.

I will leave the rest to you
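
For readers following along, the three parts described above can be sketched as a working loop. This is only an illustration: the function name count_up and the limit of 5 are made up here, not taken from the thread. The point is that the middle expression has to be a test such as a <= limit. Writing a=100 there is an assignment, and in arithmetic context its value (100) is non-zero, so it always counts as true.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count from 1 up to the given limit.
count_up() {
    local limit=$1 a out=""
    # init: a = 1; test: a <= limit; step: a++
    for ((a = 1; a <= limit; a++)); do
        out+="$a "
    done
    printf '%s\n' "${out% }"   # drop the trailing space
}

count_up 5   # prints: 1 2 3 4 5

# An assignment used as a test: (( a = 100 )) sets a and succeeds,
# because the assigned value is non-zero.
a=1
if (( a = 100 )); then
    echo "a is now $a"   # prints: a is now 100
fi
```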
 
Old 08-02-2010, 01:06 PM   #3
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 249
Your comparison operator is not a comparison operator.
You have assigned something to the variable a twice.

http://www.linuxconfig.org/Bash_scri...ic-comparisons

Last edited by smoker; 08-02-2010 at 01:08 PM. Reason: It took me a long time to type this, I started when there were no replies !
 
Old 08-02-2010, 01:19 PM   #4
twoleggedtripod
LQ Newbie
 
Registered: Aug 2010
Posts: 3

Original Poster
Rep: Reputation: 0
Ah, sorry about that, I'd cleaned up the code and missed an operator in there. It should be:

Code:
for ((a=1; a<=100 ; a++))
do
        sed 's/start/'$a'/' input > output
done
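
A small experiment suggests why this version still ends up with the same number on every line: each pass runs sed over the untouched input, replaces every start with the current value of a, and overwrites output, so only the last pass survives. The temporary directory and the three-pass limit below are assumptions made to keep the sketch self-contained.

```shell
#!/usr/bin/env bash
# Scratch directory so the sketch does not touch real files (assumption).
tmp=$(mktemp -d)
printf 'start\ncontent content\nstart\ncontent content\n' > "$tmp/input"

for ((a = 1; a <= 3; a++)); do
    # Every pass reads the original input and truncates output again.
    sed "s/start/$a/" "$tmp/input" > "$tmp/output"
done

cat "$tmp/output"
# 3
# content content
# 3
# content content
```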
 
Old 08-02-2010, 01:41 PM   #5
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 249
You have still not compared 2 things. Look at the examples.
 
Old 08-02-2010, 04:33 PM   #6
dannybpng
Member
 
Registered: Sep 2003
Location: USA
Distribution: Fedora 23
Posts: 72

Rep: Reputation: 19
Here is one way to do it:

a=0
while IFS= read -r Line
do
        # bump the counter on each line that is exactly "start"
        if [ "$Line" = "start" ]; then
                ((a++))
        fi
        echo "$Line" | sed "s/start/$a/"
done < input > output


I would just do it with awk like so:

awk '/start/ {count++; print count; next}
{print}' input > output
 
Old 08-03-2010, 07:07 AM   #7
Trickie
Member
 
Registered: Sep 2004
Posts: 38

Rep: Reputation: 23
I'm not sure why you would want to do that within the file when you can enumerate the lines as the file is read. For example, when using cat -n or a text editor like vi. Is this a homework question from college?
 
Old 08-03-2010, 09:47 AM   #8
twoleggedtripod
LQ Newbie
 
Registered: Aug 2010
Posts: 3

Original Poster
Rep: Reputation: 0
No, not homework. I attempted to make it as general as possible, so I can understand the reasoning.

What I'm writing it for is an output file from a different program. In short, the program runs something, names the cycle=x on one line, provides stuff I need, then a variable amount of data. Earlier, each run was labeled cycle=1, cycle=2 etc, so that was simply a case of using "grep -A $numberoflines cycle=$a", successfully using the "for ((a=1; a<=100 ; a++))" to strip that from a file, and do what I wanted to it:

Code:
#for ((n=1; n <= 9 ; n++))
#do
#        grep -A $1 "cycle =      $n" potentialcurve > out_$n
#done
#
#for ((m=10; m <= 90 ; m++))
#do
#        grep -A $1 "cycle =     $m" potentialcurve > out_$m
#done
#
#for ((x=1; x <= 90 ; x++))
#do
#       sed '1s/.*/$coord/' out_$x > tmol_$x
#       echo '$end' >> tmol_$x
#       babel -itmol tmol_$x -oxyz test_$x.xyz
#       cat test_$x.xyz >> combined.xyz
#done
#
#rm o* t*
However, under a different option the cycle runs for each point a few times and spits the final answer out, producing cycle=13, cycle=25, cycle=5 etc., which makes the previous method useless.

From what I understood, the "for ((a=1; a<=100 ; a++))" line says start at 1, then go to another number (in this case 100) at regular intervals (a++), however it seems I'm wrong with this interpretation.

The awk method does work and is murderously simple, but it is admittedly something I haven't looked at at all yet (but this is a newbie forum, right?)

From what I understand from other posters I'm missing something small and easy to find from my first attempt, which I also admittedly haven't figured out yet, but I'll keep you posted.
 
Old 08-03-2010, 11:40 AM   #9
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
The thread is marked "[SOLVED]", what was the solution? (Be polite, give us feedback, let us & those who follow know what you did.)

I thought there was a missing '$', but
Code:
for ((a=1; a<=10 ; a++)); do echo $a; done
and
Code:
for ((a=1; $a<=10 ; a++)); do echo $a; done
both work.
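
Both forms work because bash treats a bare name inside (( )) as an arithmetic variable, so the $ is optional there. A quick check, with a value chosen only for illustration:

```shell
#!/usr/bin/env bash
a=7
results=""
# Bare name: looked up as a variable by the arithmetic context.
(( a <= 10 )) && results="$results bare"
# With $: expanded to 7 first, then compared.
(( $a <= 10 )) && results="$results dollar"
echo "${results# }"   # prints: bare dollar
```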

Personally, I would use awk.
(See Taylor's Laws of Programming)
 
Old 08-03-2010, 01:48 PM   #10
jthill
Member
 
Registered: Mar 2010
Distribution: Arch
Posts: 211

Rep: Reputation: 67
sed isn't really the right tool for this: starting 10000 processes for a 10000 line file is ... not aesthetically pleasing. Plus, in circumstances other than casual use it can actually turn into a performance problem. So get used to avoiding things like that.

The awk oneliner is probably best:
Code:
awk '/start/{sub(/start/,++n)};{print}'
or if you want the substitution only on lines that contain *only* 'start',
Code:
awk '/^start$/{$0=++n};{print}'
but you can do it without ever leaving bash if you want. The whole-line case is really easy. I'll explain the IFS stuff in a minute.

Code:
IFS='' n=0; while read -r; do [[ $REPLY == start ]] && REPLY=$((++n))
echo "$REPLY"; done
unset IFS
and bash's parameter expansion can do quite a lot -- it's (relative to what bash can do) also easy to do the more general substitution in bash:
Code:
IFS=''; n=0
while read -r; do
[[ $REPLY == *start* ]] && REPLY=${REPLY%%start*}$((++n))${REPLY#*start}
echo "$REPLY"
done
unset IFS
The shell uses the 'IFS' variable to decide where to split words. The 'read' command distributes the fields of each line into the variables you give it, or puts the whole line in 'REPLY' if you don't give it any. With the default IFS, an unquoted $REPLY is split on whitespace, so echo prints whitespace sequences collapsed to a single space. Setting IFS to '' tells the shell not to do any splitting. Do remember to unset it afterwards, which restores the default behavior.
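
To make the splitting behaviour concrete, here is a minimal sketch. It shows that the collapsing comes from expanding a variable without quotes, and that writing IFS= directly in front of read limits the change to that one command, so nothing needs to be unset afterwards. The sample strings are invented for the demonstration.

```shell
#!/usr/bin/env bash
line='a   b'
collapsed=$(echo $line)      # unquoted: split on IFS, rejoined with one space
kept=$(echo "$line")         # quoted: whitespace preserved

# Default IFS: read strips leading and trailing whitespace from a named variable.
read -r trimmed <<< '  start here  '
# IFS= only for this read: the line is kept as is.
IFS= read -r exact <<< '  start here  '

printf '[%s] [%s] [%s] [%s]\n' "$collapsed" "$kept" "$trimmed" "$exact"
# [a b] [a   b] [start here] [  start here  ]
```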
 
Old 08-04-2010, 02:00 AM   #11
rojee
LQ Newbie
 
Registered: Dec 2007
Posts: 3

Rep: Reputation: 0
cat & sed: too easy, as per:

[root@hostest ~]# cat it.txt
start
content content
start
content content
start
content content

[root@hostest ~]# sed '/^start/d' it.txt |cat -n |sed 's/content.*/\n&/'
1
content content
2
content content
3
content content
[root@hostest ~]#
 
Old 08-06-2010, 09:36 AM   #12
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
rojee,

I think OP made a slight mistake in his sample file: 'start' is a literal, but 'content content' is not. Therefore you cannot expect a search for "content" to be useful. As I said, I would use awk & jthill was kind enough to provide 2 awk-based solutions.


twoleggedtripod,

It's a custom at LQ to give positive feedback to those who tried to help you by telling them what solution you adopted. That also helps create an archive of solutions for others in the future. Please give back to LQ, don't be a taker only.

Last edited by archtoad6; 08-08-2010 at 01:31 PM. Reason: typo
 
Old 08-08-2010, 07:42 AM   #13
rojee
LQ Newbie
 
Registered: Dec 2007
Posts: 3

Rep: Reputation: 0
sed is powerful, not as powerful as "awk" but way easier

[root@hostest ~]# cat text.txt
start
contantly changing content content
start
mary had a little lamb
start
and everwhere mary went
[root@hostest ~]#
[root@hostest ~]#
[root@hostest ~]# sed '/^start/d' text.txt |cat -n |sed 's/\t.*/\n&/'
1
contantly changing content content
2
mary had a little lamb
3
and everwhere mary went
[root@hostest ~]#
 
Old 08-08-2010, 01:47 PM   #14
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
Quote:
Originally Posted by twoleggedtripod
In short, the program runs something, names the cycle=x on one line, provides stuff I need, then a variable amount of data.
I take "variable amount of data" to mean possibly more than one line. The sed approaches posted so far don't allow for that. I still think awk is the right tool for this job.
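
One way to handle blocks of any length is to renumber only the header lines and pass everything else through. The sketch below assumes the headers look like "cycle =     13" (the exact spacing in the real output file is not shown in full in the thread, so the pattern is a guess) and uses made-up sample data.

```shell
#!/usr/bin/env bash
# Hypothetical sample: cycle numbers out of order, blocks of different lengths.
sample='cycle =     13
x 1.0
y 2.0
cycle =     25
x 3.0
cycle =      5
x 4.0
y 5.0
z 6.0'

renumbered=$(printf '%s\n' "$sample" |
    awk '/^cycle =/ { n++; $0 = "cycle = " n }   # replace the header with a running count
         { print }')                             # every line, changed or not, is printed

printf '%s\n' "$renumbered"
```

Because the counter only advances on header lines, the number of data lines between two headers does not matter.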

BTW, using '/' as the delimiter in a substitution in both sed & perl is optional; almost any other character is allowed (RTM).
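
For example, a different delimiter saves escaping when the pattern itself contains slashes. The path below is only an illustration:

```shell
#!/usr/bin/env bash
path='/usr/local/bin'
# Using | as the delimiter, so the slashes in the pattern need no backslashes.
moved=$(printf '%s\n' "$path" | sed 's|/usr/local|/opt|')
echo "$moved"   # prints: /opt/bin
```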

Also, FWIW, code samples & ASCII files are best put in "Code:" blocks when posting on LQ. Long or wide ones go better in a pastebin.
 
  

