so that I can assign those items to variables. I have tried using sed to replace anything before the ':' with nothing (i.e. sed 's/.*://') but unfortunately the file I'm parsing is a bit more complicated than the one above. I am using this as an example for simplicity as I feel there must be a way to make grep go back and search again from the top of the file for a new string.
Does anyone have any idea how to make that happen?
It simply misses the file name. In this case grep expects input from the keyboard (standard input) and you have to terminate it using Ctrl-D, whereas Ctrl-C interrupts the whole process. Anyway, why don't you show us a piece of the real input? Maybe we can give some more help.
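A minimal sketch of that behaviour, assuming a made-up temporary log file and the pattern 'stuff' (neither comes from the real system): grep reads standard input whenever no file operand is given, so it needs either a file name or a pipe feeding it.

```shell
# Hypothetical log file, created only for this demonstration.
log=$(mktemp)
printf 'some stuff here\nnothing to see\nmore stuff\n' > "$log"

# grep 'stuff'      # no file and no pipe: grep waits on the keyboard until Ctrl-D

# Variant 1: name the file as an operand.
from_file=$(grep 'stuff' "$log")

# Variant 2: feed grep through a pipe, as with tail on a log file.
from_pipe=$(tail -n 500 "$log" | grep 'stuff')

printf '%s\n' "$from_file"
rm -f "$log"
```

Both variants print the same two matching lines, in the order they appear in the file.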
Thanks for the replies. David, I'm not actually using cat. I am using tail -500 logfile.log (it's a log file) | grep 'stuff' | cut -d blah blah blah
Unfortunately I can't get a sample of the exact output because it is on a private system, but it basically takes this format:
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set url = URL
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set url = URL
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set parameter: PARAM
[mm/dd/yyyy hh:mm:ss] Set url = URL
I need PARAM, PARAM, PARAM and URL for each site. The desired output would be
SITE
PARAM
PARAM
PARAM
URL
SITE
PARAM
PARAM
PARAM
URL
SITE
PARAM
PARAM
PARAM
URL
And actually, it doesn't even need to be output, I just need those things separated out so that I can manipulate them. It seems I can get all of the sites, all of one param or another, or all the URLs using grep and cut/sed, but I can't get them ordered in the way I want because once the file is "grepped" once, grep doesn't continue from the top again. I hope this isn't too vague. I would love to be able to post the actual log file but I just can't do it.
Thanks guys! Both of these look to have potential but neither of them worked quite like I'd expected. The reason is that some of the PARAMS have special characters in them (for example one of them is a DN string like cn=username,ou=a,ou=b,ou=c,dc=a,dc=b,dc=c)
With awk, I was able to get everything except one of the PARAMs which happens to have spaces in it (I assume because awk is using the last field and using a space as the delimiter?)
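That guess about the delimiter can be checked with a small sketch. The log line below is invented for the example (a parameter value containing spaces); by default awk splits fields on runs of whitespace, so $NF is only the last word, while -F ': ' splits on a colon followed by a space.

```shell
# Hypothetical log line with spaces in the value.
line='[mm/dd/yyyy hh:mm:ss] Set parameter: some value with spaces'

# Default field separator: whitespace, so $NF is just the last word.
default_fs=$(printf '%s\n' "$line" | awk '{ print $NF }')

# Separator ": " (colon plus space): $NF is everything after the label.
colon_fs=$(printf '%s\n' "$line" | awk -F ': ' '{ print $NF }')

printf 'default: %s\ncolon:   %s\n' "$default_fs" "$colon_fs"
```

The colons inside the timestamp are not followed by a space, so they do not split the line.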
With translate I was able to get SITE and only one of the PARAMs, I'm guessing because some of the PARAMs have colons in them. I've come up with some stuff I can post without giving away too much info. This is the exact format the log file follows (punctuation and everything):
Code:
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE1
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.principal=CN=user,OU=a,OU=b,DC=a,DC=b,DC=c,DC=d,DC=e,DC=f
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set parameter: some.more.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example.com/
Creating a connection config for: SITE2
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.principal=CN=user,OU=a,OU=b,OU=c,DC=a,DC=b,DC=c,DC=d,DC=e
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set parameter: some.more.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example2.com/
Creating a connection config for: SITE3
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.principal=CN=user,OU=a,OU=b,OU=c,OU=d,DC=a,DC=b,DC=c,DC=d
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set parameter: some.more.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example3.com/
Note that the OU structures are different and will vary depending on the site, so unfortunately I do not have a specific number of fields for that line. Likewise, notice that there are a couple of random lines in the middle of each block which I don't need, although that might be OK because I can probably grep them out if I can get everything else right. Unfortunately the format's not uniform, but if I can get close I might be able to figure the rest out on my own. I'm still playing with awk and tr to see if I can get this to work, but in the meantime, if you guys are able to get the output above from the code above that, I should be in business!
I got it guys! I ended up just piping a bunch of sed commands together after using awk to print out the last field using the field separator of ":"! I used a little bit of each of your replies combined with a bit of my own tweaking!
Here is my final command (I tested it on the real log file and with a little tweaking I got it to work like I planned). This assumes that "testfile" has the format given above:
Code:
cat testfile | awk -F ": " '{ print $NF }' | grep -v 'need' | sed 's/.*java.naming.security//' | sed 's/.*principal=//' | sed 's/.*credentials=//' | sed 's/.*provider.url = //'
THANKS!!!
Last edited by StupidNewbie; 03-16-2012 at 03:23 PM.
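Since the original goal was to get the items into variables, here is one possible sketch of that last step. It assumes every site block yields exactly four lines after filtering (site, principal, credentials, URL); the helper name extract, the variable names, and the shortened DN are made up for the example.

```shell
# Hypothetical test file in the format posted above (one site only).
testfile=$(mktemp)
cat > "$testfile" <<'EOF'
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE1
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.principal=CN=user,OU=a,DC=b
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set parameter: some.more.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example.com/
EOF

# Same extraction idea as the command above, reduced to one awk and one sed call.
extract() {
    awk -F ': ' '!/need/ { print $NF }' "$1" |
        sed -e 's/.*principal=//' -e 's/.*credentials=//' -e 's/.*provider\.url = //'
}

# Read the extracted lines back four at a time, one variable per item.
report=$(extract "$testfile" | while IFS= read -r site &&
        IFS= read -r principal &&
        IFS= read -r credentials &&
        IFS= read -r url
do
    printf '%s|%s|%s|%s\n' "$site" "$principal" "$credentials" "$url"
done)

printf '%s\n' "$report"
rm -f "$testfile"
```

Because the loop sits on the right-hand side of a pipe, the four variables only exist inside the loop body in most shells, so whatever needs them has to happen there.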
There's generally no need to mix and match grep, sed, and awk. sed can do everything grep can do and more, and awk is a full text-processing scripting language that can completely replace the other two, and then some.
grep and sed can also be handed multiple expressions at once, using the "-e" option.
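A short sketch of what that looks like, using three sample lines in the format from this thread (the variable names are only for illustration):

```shell
# Sample input mimicking the log format posted earlier.
sample='[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE1
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example.com/'

# grep: each -e adds another pattern; a line matching any of them is printed.
matched=$(printf '%s\n' "$sample" | grep -e 'config for: ' -e 'provider\.url = ')

# sed: each -e adds another command; they run in order on every line.
stripped=$(printf '%s\n' "$matched" | sed -e 's/.*for: //' -e 's/.*provider\.url = //')

printf '%s\n' "$stripped"
```

The matching lines come out in file order, whichever pattern they matched.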
Also, don't forget that "." is a regex operator, meaning "match any character", so you have to escape it or use a bracket expression if you want to match a literal period.
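To see the difference, here is a small sketch with two made-up input lines, one with a real period and one with an "X" in its place:

```shell
# Two hypothetical lines: only the first contains a literal period before "url".
lines='java.naming.provider.url
java.naming.providerXurl'

# Unescaped "." matches any character, so both lines are counted.
loose=$(printf '%s\n' "$lines" | grep -c 'provider.url')

# Escaped period: only the literal "." matches.
escaped=$(printf '%s\n' "$lines" | grep -c 'provider\.url')

# Bracket expression: same effect as escaping.
bracket=$(printf '%s\n' "$lines" | grep -c 'provider[.]url')

printf 'loose=%s escaped=%s bracket=%s\n' "$loose" "$escaped" "$bracket"
```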
It's possible to compact the command even more if you use extended regular expressions (the -r option in sed). Then you can use parentheses to group a list of alternate values to match (separated by "|").
Code:
sed -r -e '/need/d' -e 's/.*(for: |java\.naming\.security|principal=|credentials=|provider\.url = )//' infile.txt
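For comparison, a rough awk-only equivalent might look like the sketch below. It assumes the log format posted earlier in the thread; the temporary file and the variable name awk_only are just for the demonstration, and the list of labels is trimmed to the ones that actually end a prefix.

```shell
# Hypothetical input file with one site block in the posted format.
infile=$(mktemp)
cat > "$infile" <<'EOF'
[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE1
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.principal=CN=user,OU=a,OU=b,DC=a,DC=b,DC=c,DC=d,DC=e,DC=f
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set parameter: some.more.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example.com/
EOF

awk_only=$(awk '
    # Skip the unwanted lines (the job grep -v or sed /need/d did).
    /need/ { next }
    {
        # Delete everything up to and including the label, then print the rest.
        sub(/.*(for: |principal=|credentials=|provider\.url = )/, "")
        print
    }
' "$infile")

printf '%s\n' "$awk_only"
rm -f "$infile"
```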
Thanks David. Even though I got this to work, I will give that a shot too. I tried using sed before (by itself) and it just became so cluttered and cryptic I couldn't keep track of what I was replacing. Also, there were some quirks, like sed not properly interpreting braces {} to make a pattern repeat a specific number of times. That became an issue with the OU string, since DC= and OU= each repeat multiple times, and the number of repetitions is unknown each time. Anyway, I will give your code a shot and see if it looks cleaner and works the same way. Thanks!
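On the braces quirk: in sed's default (basic) regular expressions the repetition braces and the grouping parentheses have to be backslash-escaped, while in extended mode they are written plain. The sketch below uses a shortened, made-up DN and assumes a sed that accepts -E for extended regular expressions; for an unknown number of repetitions, "+" or "*" avoids counting altogether.

```shell
# Hypothetical DN with three OU components.
dn='CN=user,OU=a,OU=b,OU=c,DC=a,DC=b'

# Basic regex: group and interval need backslashes.
bre=$(printf '%s\n' "$dn" | sed 's/\(OU=[^,]*,\)\{3\}//')

# Extended regex: plain parentheses and braces.
ere=$(printf '%s\n' "$dn" | sed -E 's/(OU=[^,]*,){3}//')

# Unknown number of repetitions: "+" matches one or more OU components.
any=$(printf '%s\n' "$dn" | sed -E 's/(OU=[^,]*,)+//')

printf '%s\n%s\n%s\n' "$bre" "$ere" "$any"
```

All three commands strip the OU components and leave the CN and DC parts in place.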
NB: Your first post showed Ubuntu and the latest one Mac OS X as the system you are posting from. The sed delivered with a Mac is the BSD version and has no -r option. If you are running the command there, you can compile GNU sed, as I did for exactly that purpose.
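Another way around the -r difference, sketched under the assumption of a reasonably recent sed on both systems: BSD sed spells the extended-regex switch -E, and current GNU sed accepts -E as well. The sample input and the variable names are made up for the example.

```shell
# Sample input in the posted log format.
sample='[mm/dd/yyyy hh:mm:ss] Creating a connection config for: SITE1
[mm/dd/yyyy hh:mm:ss] Set parameter: some.stuff.i.dont.need
[mm/dd/yyyy hh:mm:ss] Set parameter: java.naming.security.credentials=somehashedvalue
[mm/dd/yyyy hh:mm:ss] Set java.naming.provider.url = http://www.example.com/'

# -E instead of -r; the expressions themselves are unchanged in spirit.
portable=$(printf '%s\n' "$sample" |
    sed -E -e '/need/d' -e 's/.*(for: |principal=|credentials=|provider\.url = )//')

printf '%s\n' "$portable"
```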