Hey guys
I am new to regular expressions and could use a little help if possible. Below is some output that I want to parse/filter so I have been trying my hand at some regular expressions.
I am trying to return any data samples with:
Size: 5000 to 30000
Bitrate: 320 or higher
Length: Not zero
After filtering everything to these requirements I then need to grab the previous line containing the link. I had this working before using the "-B 1" switch in order to get the previous line but once I changed my regex a bit it stopped working.
The best I have come up with so far for parsing the size is the following, but I am aiming for 5 MB to 30 MB rather than 10 MB to 30 MB:
#Size: [0-9][0-9][0-9][0-9][K0-9][KB] #1 meg to 100meg
#Size: [0-3][0-5][0-9][0-9][0-9]KB #10 meg to 30meg
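A per-digit class such as [0-3][0-5][0-9][0-9][0-9] checks each position independently, so it cannot express a numeric range: it accepts 05000 and 35999 but rejects 16000. One possible way to cover 5000 to 30000 KB is to split the range by digit count and join the pieces with alternation in an extended regex. This is only a sketch; the variable name size_re and the test sizes are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical pattern: 5000-9999, then 10000-29999, then exactly 30000.
size_re='Size: ([5-9][0-9]{3}|[12][0-9]{4}|30000)KB'

# Made-up sizes to show which ones the pattern accepts.
for kb in 4999 5000 10607 16000 30000 30001; do
    if printf 'Size: %sKB Bitrate: 320\n' "$kb" | grep -Eq "$size_re"; then
        echo "$kb: match"
    else
        echo "$kb: no match"
    fi
done
```

Here 4999 and 30001 are rejected and the other four sizes match. The -E switch is needed so that the parentheses, | and {3} are read as regex operators.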
Code:
Sample data
[702] slsk://tarabusaw/E:/_GROOVESHARK/DMC/Commercial/2001/220/03 - DJ Luck & MC Neat Megamix - Les Adams.mp3
Size: 10607KB Bitrate: 192 Length: 0:00 Queue: 0 Speed: 22322 Free: Y filetype: mp3
Code:
Bash script to regex data ( currently not working )
file=$( cat result.txt | grep 'Size: [0-3][0-5][0-9][0-9][0-9]KB' | grep 'Bitrate: [34][28][0-9]' | grep -v 'Length: 0:00' | grep -B 1 'slsk')
echo $file
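A likely reason the pipeline above prints nothing: the first grep keeps only the attribute lines, so the link lines are already gone by the time grep -B 1 'slsk' runs. Below is a minimal sketch of one possible reordering. It assumes a grep that supports -B and -o (GNU grep does) and that every link line is immediately followed by its attribute line. The function name, the two patterns and the sample records are illustrative assumptions, not taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the listing on stdin, prints the matching slsk links.
filter_links_grep() {
    # 1. Drop zero-length attribute lines (link lines pass through untouched).
    # 2. Match size and bitrate on the attribute line and keep the line before it.
    # 3. Print only the slsk://...mp3 part; this also discards the attribute
    #    lines and the "--" separators that -B can emit.
    grep -v 'Length: 0:00' |
        grep -B 1 -E 'Size: ([5-9][0-9]{3}|[12][0-9]{4}|30000)KB Bitrate: (3[2-9][0-9]|[4-9][0-9]{2}) ' |
        grep -o 'slsk://.*\.mp3'
}

# Made-up records in the same layout as the sample data.
filter_links_grep <<'EOF'
[1] slsk://user/music/a.mp3
Size: 10607KB Bitrate: 192 Length: 0:00 Queue: 0 Speed: 22322 Free: Y filetype: mp3
[2] slsk://user/music/b.mp3
Size: 8200KB Bitrate: 320 Length: 3:31 Queue: 0 Speed: 22322 Free: Y filetype: mp3
EOF
```

With these two records only slsk://user/music/b.mp3 is printed. The key point is that the context switch has to be applied at the stage that still sees both lines of a record.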
It appears that your regex only shows data with a length of 0:00 instead of data with a length other than 0:00.
Also, the second line still appears. I'm trying to dump just the slsk: links up to .mp3, but of course only those that match the criteria. Thanks again for the help. Do you have any idea why this is happening?
May want to be careful there, as the current solution now includes Length = 0:00, which is what was being asked to exclude.
Another interesting thing to factor in would be how well you know the data prior to running the script.
I ask this because if out of maybe 1000s of lines there are potentially only a handful with 'slsk' in them, the script may be looking at the wrong information first (just a thought)
Another point: from memory, bitrate is normally one of a fixed set of values, i.e. I do not think you could have a bitrate of 100 (could be wrong of course).
Assuming correct, [34][28][0-9] would yield results which cannot exist but may be in the data ... again just a thought
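To make that concrete: [34][28][0-9] accepts only four narrow bands (320-329, 380-389, 420-429 and 480-489), which is not the same as "320 or higher". The sketch below compares it with one possible alternative. The name new_re, the helper matches and the list of bitrates are assumptions made for illustration.

```shell
#!/usr/bin/env bash
old_re='Bitrate: [34][28][0-9]'    # pattern from the original script
# Hypothetical replacement: 320-399, 400-999, or any four-or-more digit value.
new_re='Bitrate: (3[2-9][0-9]|[4-9][0-9]{2}|[1-9][0-9]{3,}) '

# Report whether a line ($2) matches an extended regex ($1).
matches() { printf '%s\n' "$2" | grep -Eq "$1" && echo yes || echo no; }

for rate in 192 256 320 350 385 448; do
    line="Size: 10607KB Bitrate: $rate Length: 3:31"
    echo "$rate: old=$(matches "$old_re" "$line") new=$(matches "$new_re" "$line")"
done
```

For 350 and 448 the old pattern reports no while the alternative reports yes; both agree on the other four values.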
Yes you are correct, it displays entries only with 0:00 length.
Also the concerns that you raised are not really a concern as the bitrate seems to consistently work and the slsk link is always the line preceding the attribute line (size etc).
@Scottish_Jason - Note that I changed the bit rate expressions to match the one line of data that you supplied.
Yes I see that, but it still should have returned 192k samples in that case
edit: Actually I am getting results now that I have changed the bitrate back to the previous one.. great!
The only problem left is that it displays both lines. While writing this I think I just remembered a switch that prints only one line? Will go and check.
edit: ohhh -C 1 .... and it is already implemented, hmm...
Last edited by Scottish_Jason; 11-25-2014 at 11:07 PM.
Hmmm, lots of possible corner cases.
If it must be done in bash, I'd probably extract the numeric values into an array, and do real arithmetic tests on the values. sed or grep can do the extraction easily.
Better option might be a language with regex and proper logic idioms. Perl or awk might be a good start.
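The bash route suggested above could look roughly like the sketch below: pull the numbers out with a regex, then compare them arithmetically instead of digit by digit. It assumes bash (for [[ =~ ]] and BASH_REMATCH), that the attribute line always starts with Size:, Bitrate: and Length: in that order, and that a zero length is always written 0:00. The function name and the sample records are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the listing on stdin and prints the link that
# precedes every attribute line passing the numeric tests.
filter_links_bash() {
    local prev='' line size bitrate length
    local re='^Size: ([0-9]+)KB Bitrate: ([0-9]+) Length: ([0-9:]+)'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            size=${BASH_REMATCH[1]}
            bitrate=${BASH_REMATCH[2]}
            length=${BASH_REMATCH[3]}
            # 10# forces base 10 so a leading zero is not read as octal.
            if (( 10#$size >= 5000 && 10#$size <= 30000 && 10#$bitrate >= 320 )) &&
               [[ $length != 0:00 ]]; then
                # Strip the leading "[702] " style index, keep the link.
                printf '%s\n' "${prev#* }"
            fi
        fi
        prev=$line
    done
}

# Made-up records in the same layout as the sample data.
filter_links_bash <<'EOF'
[1] slsk://user/music/a.mp3
Size: 10607KB Bitrate: 192 Length: 0:00 Queue: 0 Speed: 22322 Free: Y filetype: mp3
[2] slsk://user/music/b.mp3
Size: 8200KB Bitrate: 320 Length: 3:31 Queue: 0 Speed: 22322 Free: Y filetype: mp3
EOF
```

Because the previous line is remembered in a variable, no -B 1 is needed, and changing a limit means editing one number rather than reworking a character class.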
I was thinking about doing it that way but came to the conclusion it might be over my head. I am fairly new to bash and regex and have never used perl etc. Only C+
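For what it is worth, awk syntax is close to C, so the approach may be less of a leap than it sounds. Here is a minimal sketch, assuming the attribute line is space-separated with the size in field 2, the bitrate in field 4 and the length in field 6, as in the sample data. The function name and the sample records are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the listing on stdin and prints the whole link
# line that precedes every attribute line passing the tests.
filter_links_awk() {
    awk '
        /^Size: / {
            # Fields: "Size:", "10607KB", "Bitrate:", "192", "Length:", "0:00", ...
            size = $2 + 0        # "+ 0" converts the leading digits to a number
            bitrate = $4 + 0
            if (size >= 5000 && size <= 30000 && bitrate >= 320 && $6 != "0:00")
                print prev
        }
        { prev = $0 }            # remember every line for the next iteration
    '
}

# Made-up records in the same layout as the sample data.
filter_links_awk <<'EOF'
[1] slsk://user/music/a.mp3
Size: 10607KB Bitrate: 192 Length: 0:00 Queue: 0 Speed: 22322 Free: Y filetype: mp3
[2] slsk://user/music/b.mp3
Size: 8200KB Bitrate: 320 Length: 3:31 Queue: 0 Speed: 22322 Free: Y filetype: mp3
EOF
```

With these records only the line [2] slsk://user/music/b.mp3 is printed. The comparisons are ordinary numeric tests, so the 5000 to 30000 range needs no regex at all.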