LinuxQuestions.org
12-23-2010, 11:40 PM   #1
venkatrg
LQ Newbie
 
Registered: Dec 2010
Location: Chennai, India
Posts: 4

Rep: Reputation: 0
Bash script to parse a file to get a set of lines between specific characters


Hi all,
I have a log file that contains information like this:

----------------------------
r11141 | prasath-palani | 2010-12-23 16:21:24 +0530 (Thu, 23 Dec 2010) | 1 line
Changed paths:
M /projects/
M /projects/
M /applications
updated for integration test
----------------------------
r11140 | upendra.sahu | 2010-12-23 16:09:38 +0530 (Thu, 23 Dec 2010) | 1 line
Changed paths:
M /projects/trunk/
M /projects/trunk/
A /projects/trunk/
updated to use the Manifest function
----------------------------

What I need is to copy the data between the "----------------------------" lines into separate files. For example, the first set of data between the "----------------------------" lines should go into one file and the next set of data into another file.

Can anyone help me write a bash script for this?
 
12-24-2010, 12:03 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
What have you tried and where are you stuck?
 
12-24-2010, 12:47 AM   #3
Snark1994
Senior Member
 
Registered: Sep 2010
Distribution: Debian
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 346
While I'm sure you can do it in bash, I find bash rather hard to get my head around - it seems hard to debug and incredibly picky about syntax. If you're just starting out scripting, a language like perl or python would seem to be a better choice to me...
 
12-24-2010, 05:22 AM   #4
venkatrg
LQ Newbie
 
Registered: Dec 2010
Location: Chennai, India
Posts: 4

Original Poster
Rep: Reputation: 0

Quote:
Originally Posted by grail
What have you tried and where are you stuck?
Hi,
I finally spent 4 hours on this and created the script. I am posting it here, as it may be useful for others.


#!/bin/bash
logfile="$1/trunklog.txt"

#Exit if the trunklog.txt file doesn't exist in $1 (the path to the log file)
if [ ! -f "$logfile" ]; then
echo "File trunklog.txt doesn't exist in the $1 directory"
exit 1
fi

#Create the log file in the path
cd "$1"
cd "../frags"
#Delete all the files/folders in the ../frags directory
rm -rf *


#Parses the input "temp.txt" file and creates the logfrags file, named after the
#revision number given in the first line of the input "temp.txt" file.
#Deletes the passed "temp.txt" file once the logfrags file is created.
processFile()
{
firstline="TRUE"
fname="xxx.txt"

# Set loop separator to end of line
BAKIFS=$IFS
IFS=$(echo -en "\n\b")

exec 3<&0
exec 0<"$1"
while read -r line
do
#echo $line
var="$line"

#Read the first line in the file to get the revision number
if [[ "$var" =~ "| 1 line" ]] && [ $firstline = "TRUE" ] ; then
fname=${var%%\|*}
fname=${fname#r}
fname=${fname//[[:space:]]}
echo $var >> "$fname"
firstline="FALSE"
else
echo $var >> "$fname"
fi
done
exec 0<&3
# restore $IFS which was used to determine what the field separators are
IFS=$BAKIFS

#Delete the temp.txt file which is passed as argument to this function
rm "$1"
}

# Read the ./state/trunklog.txt file line by line and parse it by
# calling the processFile function to create the frag files
readlogfile()
{
first="TRUE"

# Set loop separator to end of line
BAKIFS=$IFS
IFS=$(echo -en "\n\b")

exec 3<&0
exec 0<"$1"

while read -r line
do
#echo $line
#Remove the first line in the trunklog.txt file
if [ $first = "TRUE" ]; then
first="FALSE"
elif [ $first = "FALSE" ]; then
if [[ ! "$line" =~ ---------* ]]; then
echo "$line" >> "temp.txt"
else
processFile "temp.txt"
fi
fi
done

exec 0<&3

# restore $IFS which was used to determine what the field separators are
IFS=$BAKIFS
}


echo "Creating individual numbered revision log fragments (logfrags) files. . . . ."

readlogfile "$logfile"

echo ". . . Done"

echo;echo "logfrags files created in $PWD/ directory"
exit 0
 
12-24-2010, 05:32 AM   #5
vikas027
Senior Member
 
Registered: May 2007
Location: Sydney
Distribution: RHEL, CentOS, Ubuntu, Debian, OS X
Posts: 1,305

Rep: Reputation: 107
You can also use a one liner awk for this purpose.

See this thread http://www.linuxquestions.org/questi...-shell-595506/
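
As a rough illustration of the awk approach (a sketch written for this log format, not copied from the linked thread): the function below treats any line made up only of dashes as a separator and writes each block to a file named after the revision number on the block's first line. The function name split_log_awk and the output names such as 11141.txt are assumptions made for the example.

```shell
#!/bin/bash
# Sketch: split an "svn log -v" style file into one file per revision.
# split_log_awk is a hypothetical helper name; output files are written
# as <outdir>/<revision>.txt.
split_log_awk() {
    local logfile=$1 outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v outdir="$outdir" '
        /^-+$/ {                       # separator line: finish the current block
            if (out != "") close(out)
            out = ""
            next
        }
        out == "" && NF == 0 { next }  # ignore blank lines between blocks
        out == "" {                    # first line of a block: "r11141 | user | ..."
            rev = $1
            sub(/^r/, "", rev)         # drop the leading "r"
            out = outdir "/" rev ".txt"
        }
        { print > out }                # copy the line into the current block file
    ' "$logfile"
}
```

It could be called as split_log_awk state/trunklog.txt frags. A commit message that itself contains a line of only dashes would be treated as a separator, so this is only suitable where that cannot happen.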

Also, if your problem is resolved, go to top of your thread and in the Thread Tools click on "Mark this thread as solved".
 
12-24-2010, 06:55 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,006

Rep: Reputation: 3191
venkatrg - Firstly, I commend you on your script as it is quite in depth. Secondly, if you use [code][/code] tags around your code it will maintain formatting and be a lot easier to read.

As vikas has pointed out there are easier ways, but I would like to help you with what you have presented.

I will start from the top:

1. You refer to $1 all the way through the script. Are you aware that when the script is called, $1 is the first argument on the command line after the script name, but when you call one of your own functions, $1 is the first argument after the function name? I ask because it is very confusing, from a reading point of view, to know which $1 is being referenced.

2. Let me know what you think might happen if the following were the only lines in your code:
Code:
#Create the log file in the path
cd "$1"
cd "../frags"
#Delete all the files/folders in the ../frags directory
rm -rf *
Now we call the script but make a little typo in our haste:
Code:
./script / folder
#         ^ this is a space between slash and word folder (typo)
Think about what might happen here??

3. Echo not necessary here:
Code:
IFS=$(echo -en "\n\b")
# could just be
IFS=$'\n\b'
I am also curious how backspace (\b) will be a separator??

4. In processFile function, why the need to set var=$line? Is there a reason you could not simply use $line in the places where you have $var?

5. No need to mix up testing options:
Code:
if [[ "$var" =~ "| 1 line" ]] && [ $firstline = "TRUE" ] ; then
#becomes
if [[ "$var" =~ "| 1 line" &&  $firstline = "TRUE" ]] ; then
6. You open file descriptor '3':
Code:
exec 3<&0
This is done in both functions, but the descriptor is never actually closed. The line to close it would be:
Code:
exec 3<&-
7. Escape (\) not required here:
Code:
fname=${var%%\|*}
8. The following compound if statement inside the while loop for function readlogfile has me confused:
Code:
while read -r line
do
#echo $line
#Remove the first line in the trunklog.txt file
    if [ $first = "TRUE" ]; then
        first="FALSE"
    elif [ $first = "FALSE" ]; then
        if [[ ! "$line" =~ ---------* ]]; then
            echo "$line" >> "temp.txt"
        else
            processFile "temp.txt"
        fi
    fi
done
So if we start from the top of this snippet:

a. We are using a while loop to read from the passed-in log file.
b. The use of if then elif is not required as there are only 2 options for $first so this could become a simple if / else construct:
Code:
    if [ $first = "TRUE" ]; then
        first="FALSE"
    else
        if [[ ! "$line" =~ ---------* ]]; then
            echo "$line" >> "temp.txt"
        else
            processFile "temp.txt"
        fi
    fi
c. Once in the else, we check whether the line contains dashes. If it does, we call the function processFile. My issue here is that if the dashes are in the first line we look at, then we will call the function on a file that does not exist. I realise that the previous if is probably coping with this, but it might be an idea to test that the file exists prior to calling the function.

9. When using =~ you are doing a regular expression comparison (of sorts), so placing * at the end of the pattern is not required:
Code:
if [[ ! "$line" =~ ---------* ]]; then
# same result as
if [[ ! "$line" =~ --------- ]]; then
Because =~ matches anywhere in the string, once the dashes have been matched you do not care what comes after.
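
Going back to point 2, one way to reduce the risk is to check that the directory change worked before deleting anything, and to delete by explicit path instead of a bare rm -rf *. The following is only a minimal sketch, assuming the frags directory sits next to the log directory as in the original script; the function name clean_frags_dir is hypothetical.

```shell
#!/bin/bash
# clean_frags_dir is a hypothetical helper: it empties <logdir>/../frags
# and refuses to delete anything if that directory cannot be reached.
clean_frags_dir() {
    local logdir=$1 fragsdir
    # Resolve the target inside a command substitution so the caller's
    # working directory is untouched; a failed cd fails the assignment.
    fragsdir=$(cd "$logdir/../frags" 2>/dev/null && pwd) || {
        echo "frags directory not found next to '$logdir'" >&2
        return 1
    }
    # Delete by explicit path; ${fragsdir:?} aborts if the variable is empty.
    rm -rf -- "${fragsdir:?}"/*
}
```

With the typo from point 2 (the script receiving / as its first argument), the cd to /../frags would normally fail and the function returns 1 without running rm at all.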

I hope you do not take any of the above as negative. It is solely meant to aid you in improving what you have.

Something to think about ... generally you will create functions for tasks that you repeat several times in code, but most of your tasks are fairly linear so it may be just as easy to have most of this as a continuous code piece (just a thought).
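
To illustrate what a more linear version might look like, here is a rough sketch under a few assumptions: separator lines consist only of dashes, the first non-blank line after a separator starts with the revision (for example r11141), and each fragment is written to a file named after that number. The name split_log and its two arguments are made up for the example. The task is wrapped in a single function only so it can be called with different paths.

```shell
#!/bin/bash
# split_log is a hypothetical name for a linear version of the task:
# read the log once and write each revision block to <outdir>/<revision>.
split_log() {
    local logfile=$1 outdir=$2 line out=""
    [[ -f $logfile ]] || { echo "no such file: $logfile" >&2; return 1; }
    mkdir -p "$outdir" || return 1

    # "IFS= read -r" keeps each line intact, and redirecting the loop's
    # stdin avoids saving and restoring file descriptors by hand.
    while IFS= read -r line; do
        if [[ $line =~ ^-+$ ]]; then
            out=""                      # separator: the next line starts a new block
        elif [[ -z $out ]]; then
            [[ -n $line ]] || continue  # skip blank lines between blocks
            out=${line%%|*}             # "r11141 " from "r11141 | user | ..."
            out=${out#r}                # drop the leading "r"
            out=$outdir/${out//[[:space:]]/}
            printf '%s\n' "$line" > "$out"
        else
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$logfile"
}
```

A call such as split_log state/trunklog.txt frags would then replace both functions and the temporary file. Unlike the original, it does not depend on the header line ending in "| 1 line".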

Look forward to seeing how you go.
 
  

