Bash script to parse a file to get the set of lines between specific characters
Hi all,
I have a log file that contains information like this:

----------------------------
r11141 | prasath-palani | 2010-12-23 16:21:24 +0530 (Thu, 23 Dec 2010) | 1 line
Changed paths:
M /projects/
M /projects/
M /applications
updated for integration test
----------------------------
r11140 | upendra.sahu | 2010-12-23 16:09:38 +0530 (Thu, 23 Dec 2010) | 1 line
Changed paths:
M /projects/trunk/
M /projects/trunk/
A /projects/trunk/
updated to use the Manifest function
----------------------------

I need to copy the data between the "----------------------------" lines into separate files. For example, the first set of data between the separators should go in one file, and the next set of data in another file. Can anyone help me write a bash script for this? |
What have you tried and where are you stuck?
|
While I'm sure you can do it in bash, I find bash rather hard to get my head around - it seems hard to debug and incredibly picky about syntax. If you're just starting out scripting, a language like perl or python would seem to be a better choice to me...
|
I finally spent 4 hours and created the script. I am posting it here in case it is useful for others:

#!/bin/bash
logfile="$1/trunklog.txt"

#Exit if the trunklog.txt file doesn't exist in $1 {Path to the logfile}
if [ ! -f "$logfile" ]; then
    echo "File trunklog.txt doesn't exist in the $1 directory"
    exit 1
fi

#Create the log file in the path
cd "$1"
cd "../frags"

#Delete all the files/folders in the ./frag directory
rm -rf *

#Parses the input "tmp.txt" file and creates the logfrags file with file name as
#revision number given in the first line of the input "tmp.txt" file.
#Deletes the passed "tmp.txt" file once the logfrags file is created.
processFile()
{
    firstline="TRUE"
    fname="xxx.txt"
    # Set loop separator to end of line
    BAKIFS=$IFS
    IFS=$(echo -en "\n\b")
    exec 3<&0
    exec 0<"$1"
    while read -r line
    do
        #echo $line
        var="$line"
        #Read the first line in the file to get the revision number
        if [[ "$var" =~ "| 1 line" ]] && [ $firstline = "TRUE" ] ; then
            fname=${var%%\|*}
            fname=${fname#r}
            fname=${fname//[[:space:]]}
            echo $var >> "$fname"
            firstline="FALSE"
        else
            echo $var >> "$fname"
        fi
    done
    exec 0<&3
    # restore $IFS which was used to determine what the field separators are
    IFS=$BAKIFS
    #Delete the tmp.txt file which is passed as argument to this function
    rm "$1"
}

# Read the ./state/trunklog.txt file line by line and parse it by
# calling the processFile function to create the frag files
readlogfile()
{
    first="TRUE"
    # Set loop separator to end of line
    BAKIFS=$IFS
    IFS=$(echo -en "\n\b")
    exec 3<&0
    exec 0<"$1"
    while read -r line
    do
        #echo $line
        #Remove the first line in the trunklog.txt file
        if [ $first = "TRUE" ]; then
            first="FALSE"
        elif [ $first = "FALSE" ]; then
            if [[ ! "$line" =~ ---------* ]]; then
                echo "$line" >> "temp.txt"
            else
                processFile "temp.txt"
            fi
        fi
    done
    exec 0<&3
    # restore $IFS which was used to determine what the field separators are
    IFS=$BAKIFS
}

echo "Creating individual numbered revision log fragments (logfrags) files. . . . ."
readlogfile "$logfile"
echo ". . . Done"
echo;echo "logfrags files created in $PWD/ directory"
exit 0
|
You can also use an awk one-liner for this purpose.
See this thread: http://www.linuxquestions.org/questi...-shell-595506/ Also, if your problem is resolved, go to the top of your thread and, under Thread Tools, click "Mark this thread as solved". |
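As a rough illustration of the awk approach mentioned above (not the code from the linked thread), the splitting could be sketched as follows. The function name split_log_awk and its two arguments are made up for this example. It assumes that every separator is a line consisting only of hyphens and that each entry starts with a line like "r11141 | ...". Files are named after the revision number without the leading "r", as in the script posted earlier.

```shell
#!/bin/bash
# Sketch only: split_log_awk, logfile and outdir are illustrative names.
split_log_awk() {
    local logfile=$1 outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        /^-----+$/ {                  # separator line: finish the current entry
            if (out != "") close(out)
            out = ""
            next
        }
        out == "" {                   # first line of an entry, e.g. "r11141 | user | ..."
            name = $1                 # first whitespace-separated field
            sub(/^r/, "", name)       # drop the leading "r"
            out = dir "/" name
        }
        { print > out }               # write the line to the current entry file
    ' "$logfile"
}
```

It would be called as something like split_log_awk state/trunklog.txt frags, where both paths are placeholders.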
venkatrg - Firstly, I commend you on your script, as it is quite in-depth. Secondly, if you use [code][/code] tags around your code, it will maintain formatting and be a lot easier to read.

As vikas has pointed out, there are easier ways, but I would like to help you with what you have presented. I will start from the top:

1. You refer to $1 all the way through the script. Are you aware that when the script is called, $1 is the first item on the command line after the script name (up to the next space), but when you call one of your own functions, it is the first item after the function name (up to the next space)? I ask because it is very confusing, from a reading point of view, to know which $1 is being referenced.

2. Let me know what you think might happen if the following were the only lines in your code:
Code:
#Create the log file in the path
Code:
./script / folder

3. Echo is not necessary here:
Code:
IFS=$(echo -en "\n\b")

4. In the processFile function, why the need to set var=$line? Is there a reason you could not simply use $line in the places where you have $var?

5. No need to mix up testing options:
Code:
if [[ "$var" =~ "| 1 line" ]] && [ $firstline = "TRUE" ] ; then
Code:
exec 3<&0
Code:
exec 3<&-
Code:
fname=${var%%\|*}
Code:
while read -r line
a. We are using a while loop to read from the passed-in log file.
b. The use of if / then / elif is not required, as there are only 2 options for $first, so this could become a simple if / else construct:
Code:
if [ $first = "TRUE" ]; then

9. When using =~ you are doing a regular expression comparison (of sorts), so placing * at the end of the pattern is not required:
Code:
if [[ ! "$line" =~ ---------* ]]; then

I hope you do not take any of the above as negative. It is solely meant to aid you in improving what you have :)

Something to think about ... generally you will create functions for tasks that you repeat several times in code, but most of your tasks are fairly linear, so it may be just as easy to have most of this as one continuous piece of code (just a thought).

Look forward to seeing how you go. |
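For reference, here is one way several of the points above might look when put together: the positional parameters are given names once, only [[ ]] tests are used, the output directory is checked before anything is written, and a single loop does the work without a temporary file. This is only a sketch under assumptions, not a drop-in replacement. The name split_log and its arguments are made up, and it assumes that separators are lines of five or more hyphens and that each entry's first line looks like "r11141 | ...".

```shell
#!/bin/bash
# Sketch only: split_log, logfile and outdir are illustrative names.
split_log() {
    local logfile=$1 outdir=$2       # name the positional parameters once
    local sep='^-{5,}$'              # a line made up only of hyphens
    local line out=""

    [[ -f $logfile ]] || { echo "No such file: $logfile" >&2; return 1; }
    mkdir -p "$outdir" || return 1   # stop here rather than write somewhere unexpected

    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $sep ]]; then
            out=""                   # separator: the next line starts a new entry
        elif [[ -z $out ]]; then
            # First line of an entry, e.g. "r11141 | user | date | 1 line"
            out=${line%%|*}          # keep the text before the first "|"
            out=${out#r}             # drop the leading "r"
            out=$outdir/${out//[[:space:]]/}
            printf '%s\n' "$line" > "$out"
        else
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$logfile"
}
```

Called as split_log state/trunklog.txt frags (placeholder paths), it would leave one file per revision in the frags directory. Nothing is deleted beforehand, so an existing file for the same revision is simply overwritten.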