LinuxQuestions.org
Old 10-11-2013, 10:25 AM   #1
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Rep: Reputation: 30
Bash Script Help - Need to read first few characters and make decision


Greetings All,

I am trying to pull data from the logs of a RH linux server. What I am needing looks like this.

at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,
at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,
at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,


Here is the command I run to get the output above:
grep VelocityEngine /wwwroot/current/tomcat/logs/catalina.out.nightly_20131010060005 | grep 'Left side' | awk '{print $16,$17}'

What I am trying to do is see if the line starts with 'at', if so remove it and the last X number of characters. If not, then remove the last X number of characters.

Is there a command that I can use or a string of commands to meet this?
 
Old 10-11-2013, 11:17 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,008

Rep: Reputation: 3193
Yes ... and you are already using it ... awk

1. You can remove all of your references to grep as awk already does regex matching.

2. Based on your displayed output it would be clear that $16 is either 'at' or blank so simply test it

3. Use substr, sub, gensub (or any other string manipulator) to remove last characters
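As a rough sketch of point 1, the two grep calls can be folded into awk patterns joined with &&. The sample line is the one quoted later in this thread; the scratch file created with mktemp is only there for illustration and is not part of the original command.

```shell
# Write one sample log line (quoted later in this thread) to a scratch file.
log=$(mktemp)
cat > "$log" <<'EOF'
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]
EOF

# Both patterns must match, mirroring: grep VelocityEngine | grep 'Left side'
matched=$(awk '/VelocityEngine/ && /Left side/ {print $16, $17}' "$log")
printf '%s\n' "$matched"
rm -f "$log"
```

For this particular line, field 15 is "at", so $16 is already the path. That suggests the field positions may differ from line to line in the real log.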
 
Old 10-11-2013, 12:37 PM   #3
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by grail View Post
Yes ... and you are already using it ... awk

1. You can remove all of your references to grep as awk already does regex matching.

2. Based on your displayed output it would be clear that $16 is either 'at' or blank so simply test it

3. Use substr, sub, gensub (or any other string manipulator) to remove last characters
Yes, you are correct. The $16 is the 'at' and there is no blank line. I already tested, as you can see in the output.
I don't really use awk other than printing a certain field.

What might a command be to do as you suggest?
 
Old 10-11-2013, 02:00 PM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by jaa1180 View Post
What I am trying to do is see if the line starts with 'at', if so remove it and the last X number of characters. If not, then remove the last X number of characters.
With this InFile ...
Code:
at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,
at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,
at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,
... this code ...
Code:
X=25  # X = length adjustment for lines which DO begin with "at" (the last X-2 characters get trimmed, see the substr note below).
Y=40  # Y = number of characters to be trimmed from lines which DON'T begin with "at".
awk -v X="$X" -v Y="$Y"  \
     '{ # substr(s, start, len): the third argument is a length, not an end position.
        if (substr($0,1,2)=="at") $0=substr($0,3,length($0)-X)
                             else $0=substr($0,1,length($0)-Y);
        print}' "$InFile" >"$OutFile"
... produced this OutFile ...
Code:
 /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8a
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d
 /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8a
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d
 /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8a
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d
Daniel B. Martin
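An alternative sketch that avoids counting characters: strip a leading "at " and everything from "[line" onward with sub(). It assumes every line has one of the two shapes shown in the InFile above, and it is not the method posted in this thread.

```shell
# Feed the two line shapes from the InFile through awk.
trimmed=$(printf '%s\n' \
  'at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line' \
  '/wwwroot/current/tomcat/webapps/../../dotCMS/assets/e/a/eaf77a9f-d2fe-4ce6-a8ac-5d0f8c8fd8fe.vtl[line 171,' |
  awk '{
    sub(/^at /, "")        # drop a leading "at " if present
    sub(/\[line.*$/, "")   # drop "[line" and everything after it
    print
  }')
printf '%s\n' "$trimmed"
```

Both input shapes should reduce to the same bare path ending in .vtl, with no X or Y to tune.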
 
Old 10-11-2013, 02:11 PM   #5
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Original Poster
Rep: Reputation: 30
Okay, thank you sir. I will work through the code and learn what is happening - to learn and have AWK power.
 
Old 10-11-2013, 02:55 PM   #6
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by jaa1180 View Post
I will work through the code and learn what is happening - to learn and have AWK power.
$0 refers to each line in the InFile, taken in turn.
substr means "substring" and is a way to select a portion of a string.
So, if (substr($0,1,2)=="at") compares the part of $0 from position 1 for a length of 2 to the character string "at".

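If the comparison is true, then $0=substr($0,3,length($0)-X)
$0 is replaced by a subset of $0... namely, the portion starting at position 3 with a length of length($0)-X. The third argument to substr is a length, not an end position, so because we start at position 3 this "chops off" the last X-2 characters rather than X.
We start at position 3 to get rid of the unwanted "at" (the space that followed it is kept).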

If the comparison is false, then $0=substr($0,1,length($0)-Y)
$0 is replaced by a subset of $0... namely, the portion starting at position 1 and ending with whatever position "chops off" the last Y characters.

Finally, we do this: print}'
which says to print $0 (which has been changed in some way).

$InFile >$OutFile identifies the input and output files.
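A small sketch of how substr counts, using a made-up short string in place of a real log line (the string "at /a/b.vtl[line" is hypothetical and chosen only to keep the arithmetic visible):

```shell
demo=$(awk 'BEGIN {
  s = "at /a/b.vtl[line"                  # hypothetical stand-in, 16 characters long
  print substr(s, 1, 2)                   # start at 1, take 2 characters
  print substr(s, 4, length(s) - 3 - 5)   # skip "at " (3 chars), stop before "[line" (5 chars)
}')
printf '%s\n' "$demo"
```

The second call shows the general pattern: when the start position moves right, the length has to shrink by the same amount, otherwise fewer characters than intended are removed from the end.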


Daniel B. Martin

Last edited by danielbmartin; 10-11-2013 at 02:56 PM. Reason: Cosmetic improvement
 
Old 10-11-2013, 07:01 PM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,008

Rep: Reputation: 3193
Also see if this helps address not using grep:
Code:
awk '/VelocityEngine/ && /Left side/{x=($16 == "at")?25:40;print substr($17,1,length($17)-x)}' /wwwroot/current/tomcat/logs/catalina.out.nightly_20131010060005
 
Old 10-14-2013, 11:03 AM   #8
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by grail View Post
Also see if this helps address not using grep:
Code:
awk '/VelocityEngine/ && /Left side/{x=($16 == "at")?25:40;print substr($17,1,length($17)-x)}' /wwwroot/current/tomcat/logs/catalina.out.nightly_20131010060005
Sorry, this did not work. It was really slow and produced:
Code:
live/c5d0ef3e-3dbe-45ee-bb4d-

live/c5d0ef3e-3dbe-45ee-bb4d-


















live/cbb73b0e-8933-458b-969f-

live/cbb73b0e-8933-458b-969f-
live/cbb73b0e-8933-458b-969f-
live/cbb73b0e-8933-458b-969f-

live/cbb73b0e-8933-458b-969f-
live/cbb73b0e-8933-458b-969f-
live/cbb73b0e-8933-458b-969f-
live/cbb73b0e-8933-458b-969f-
But many, many blank lines and a bunch of similar lines to the above.
I will work through it though. Thank you for the help!
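A possible explanation for the blank lines, offered as a sketch rather than a diagnosis: for a log line shaped like the sample below (quoted later in this thread), "at" is field 15, so $16 is the path and $17 is just "267,". Asking substr for length($17)-40 characters is then a negative length, which awk treats as an empty result. The scratch file is an assumption made for illustration.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]
EOF

# Show what actually sits in fields 15 and 17 for this line.
fields=$(awk '{print $15 "|" $17}' "$log")

# Run the command from the post above against the same line.
out=$(awk '/VelocityEngine/ && /Left side/{x=($16 == "at")?25:40;print substr($17,1,length($17)-x)}' "$log")

printf 'fields 15 and 17: %s\n' "$fields"
printf 'result: [%s]\n' "$out"
rm -f "$log"
```

Lines where the text before "at" has a different number of words would shift the fields again, which may account for the mix of blank and partial lines.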

---------- Post added 10-14-13 at 11:04 AM ----------

Oh, thank you sir. I was not kidding about me working through it and learning. But wow, thanks!
 
Old 10-14-2013, 11:29 AM   #9
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Original Poster
Rep: Reputation: 30
Reviewing my original post, I may have explained this incorrectly.
I have a script that I threw together real quick just to grab all the VTL files. However, I noticed it was missing files.
So what I am trying to do is produce a more accurate list of file names in a TXT file.

I am working toward trimming the front and the back of the line from the log file.
When I use the following:
Code:
awk '/VelocityEngine/ && /wwwroot/ && /Left side/' /wwwroot/current/tomcat/logs/catalina.out.nightly_20131010060005
the output is many, many lines of (different file names and locations)...
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]

So... I need to remove everything and just be left with the file name aadf529e-d867-4b23-8397-e0458392ceea.vtl - then output this file name to a file I call velocityfilenames.txt.

This is the end goal.

To get close to the right information before I was using...
Code:
 grep VelocityEngine /wwwroot/current/tomcat/logs/$CFILE | grep 'Left side' | awk '{print $16}' | sed 's/.\{5\}$//' | sed 's/^.\{37\}//' > /home/tfn9845/velocityfilenames.txt
Where $CFILE is the previous day log file.

Hopefully I have not made the confusion worse.
 
Old 10-14-2013, 11:55 AM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,008

Rep: Reputation: 3193
Well I do find it interesting that when using awk you use an extra regex, /wwwroot/, but for some reason grep does not require this??

So I will not guess and just stick with the original:
Code:
awk -F"[[ ]" '/VelocityEngine/ && /Left side/{print $(NF-4)}' /wwwroot/current/tomcat/logs/$CFILE
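A sketch of why this works: with both "[" and space acting as field separators, the tail of each line splits into the path, "line", "267,", "column" and "147]", so the path is the fifth field from the end no matter how many words come before it. The sample line is the one quoted earlier in the thread; the scratch file is an assumption for illustration.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]
EOF

# Count back from the last field instead of forward from the first.
path=$(awk -F'[[ ]' '/VelocityEngine/ && /Left side/ {print $(NF-4)}' "$log")
printf '%s\n' "$path"
rm -f "$log"
```

Counting from NF sidesteps the shifting field numbers that come from the variable-length message in the middle of each line.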
 
Old 10-14-2013, 12:05 PM   #11
jaa1180
Member
 
Registered: Oct 2003
Location: USA, Tennessee
Distribution: Ubuntu
Posts: 307

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by grail View Post
Well I do find it interesting that when using awk you use an extra regex, /wwwroot/, but for some reason grep does not require this??

So I will not guess and just stick with the original:
Code:
awk -F"[[ ]" '/VelocityEngine/ && /Left side/{print $(NF-4)}' /wwwroot/current/tomcat/logs/$CFILE
Greetings Grail,
Thank you sir.

Looks like that is working pretty well.

/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
live/cbb73b0e-8933-458b-969f-97be36ecd8c3_4.field
live/cbb73b0e-8933-458b-969f-97be36ecd8c3_4.field
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/0/5/0582de90-2dc8-4954-b6d6-70748cc9ab3b.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/0/5/0582de90-2dc8-4954-b6d6-70748cc9ab3b.vtl
/wwwroot/current/tomcat/webapps/../../dotCMS/assets/0/5/0582de90-2dc8-4954-b6d6-70748cc9ab3b.vtl

Very nice!
Thanks again!
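To get from these full paths to the bare file names mentioned earlier as the end goal, one option (a sketch, not something posted in this thread) is to split the extracted path on "/" and keep the last piece. The file name velocityfilenames.txt comes from the earlier post; the scratch directory and sample.log are assumptions for illustration.

```shell
workdir=$(mktemp -d)
cat > "$workdir/sample.log" <<'EOF'
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]
2013-10-09 09:27:55,852 ERROR org.apache.velocity.app.VelocityEngine - Left side ($newsItem.newsTags.size()) of '>' operation has null value at /wwwroot/current/tomcat/webapps/../../dotCMS/assets/a/a/aadf529e-d867-4b23-8397-e0458392ceea.vtl[line 267, column 147]
EOF

awk -F'[[ ]' '/VelocityEngine/ && /Left side/ {
  n = split($(NF-4), parts, "/")   # break the path into its components
  print parts[n]                   # the last component is the file name
}' "$workdir/sample.log" | sort -u > "$workdir/velocityfilenames.txt"   # sort -u collapses repeats

cat "$workdir/velocityfilenames.txt"
```

Entries such as the live/...field lines in the output above would also pass through this; adding a pattern like /\.vtl\[/ to the awk condition is one way to keep only the .vtl files, if that is what is wanted.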
 
  

