I just want the characters which are to the left of the first .(dot) in the FQDN. I could get it using substr and the split function, but how do I get it through a regex? Thanks!
Last edited by kdelover; 11-27-2010 at 07:43 AM.
Reason: [Solved]
I just want the characters which are to the left of the first .(dot)
You did yourself and the readers a favor by explaining the requirement fairly clearly and unambiguously. Your description of the requirement almost translates itself into regex code.
'characters': any old character, in regex-speak, is matched by '.' (strictly, any character except a newline, unless the /s modifier is used). So that becomes the first regex meta-character.
Code:
.
Since you specified plural, it has to be two or more, so the '.' meta-character gets modified by something that says 'two or more':
Code:
{2,}
Since this is Perl and regexes, we should make it look like it. So far, we've got
Code:
/.{2,}/
Now, we want to terminate the match by specifying a literal '.' (dot). To match a dot, we have to escape it from being interpreted as a regex meta-character, and we do that by preceding it with a '\'. We can tack that onto the regex that we've built up so far.
Code:
/.{2,}\./
So, we've matched anything including the terminating dot character. To extract just the characters preceding the dot, we can enclose it in parentheses:
Code:
/(.{2,})\./
Now we can test the scalar against the regex, and any matching string will be waiting for us in the special variable '$1' (because it was the 'first' parenthesized subset).
But this is Perl, and Perl regexes are greedy. When matching URLs, that can be a problem, because by nature, Perl regexes will swallow up the biggest possible matching substring, and we want the smallest. So, we can add one last bit of regex notation, and we should be done:
Code:
/(.{2,}?)\./
The question mark following the quantifier tells the regex to be 'non-greedy'. It will stop trying to match at the first, not the last, dot in the string.
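To see the difference between the greedy and non-greedy forms side by side, here is a minimal sketch. The name 'www.example.com' is just a made-up sample, not something from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample name, chosen only for illustration.
my $fqdn = 'www.example.com';

# Greedy: .{2,} grabs as much as it can, so the match ends at the LAST dot.
my ($greedy) = $fqdn =~ /(.{2,})\./;

# Non-greedy: .{2,}? grabs as little as it can, so the match ends at the FIRST dot.
my ($lazy) = $fqdn =~ /(.{2,}?)\./;

print "greedy: $greedy\n";   # greedy: www.example
print "lazy:   $lazy\n";     # lazy:   www
```

Matching in list context, as done here, returns the captured group directly, which saves testing the match and then reading '$1' in a separate step.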
At least, that's my interpretation of the question.
BTW, you did specify 'characters' (plural), but you probably meant 'character or characters', and not surprisingly, the regex that matches that is quite different. So, you've also pointed out the importance of being accurate about your spec.
Yes, that looks about right to me. However, the '{1,}' notation can be replaced by the nicer '+' quantifier. Likewise, the '*' quantifier is equivalent to '{0,}', and the '?' quantifier is equivalent to '{0,1}'; both brace forms are legal in Perl, although I use them here mainly to describe what the shorthand means. There are, of course, many other notations used in regular expressions, and if you are interested, you can find many good references on the subject online.
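If you want to convince yourself that the brace forms and the shorthand quantifiers behave the same way, a quick check along these lines should do. The test strings are arbitrary, and this is only a sketch, not an exhaustive proof.

```perl
use strict;
use warnings;

# Each pair holds a brace-style quantifier and its shorthand equivalent.
my @pairs = (
    [ qr/^a{1,}$/,  qr/^a+$/ ],   # one or more
    [ qr/^a{0,}$/,  qr/^a*$/ ],   # zero or more
    [ qr/^a{0,1}$/, qr/^a?$/ ],   # zero or one
);

# Count the cases where the two forms disagree on a test string.
my $mismatches = 0;
for my $pair (@pairs) {
    my ($braces, $short) = @$pair;
    for my $s ('', 'a', 'aa', 'aaa', 'b') {
        my $x = $s =~ $braces ? 1 : 0;
        my $y = $s =~ $short  ? 1 : 0;
        $mismatches++ if $x != $y;
    }
}

print "mismatches: $mismatches\n";   # mismatches: 0
```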
I think this is more like what you originally intended
Code:
$a =~ /(.+?)\./
It wasn't clear from your original post whether the scalar against which to match was to include the leading '$host=', or whether that was a fragment of Perl code. If the scalar you are scanning is just the IP, then you can simplify the match by anchoring it to the beginning of the string, using the '^' notation:
Code:
$a =~ m/^(.+?)\./
When I want to extract characters that lie between some specified delimiters, I tend to use a regex that says 'match the opening delimiter, followed by everything that isn't a closing delimiter':
Code:
# delimiters are '.' (dot)
$a =~ m/\.([^.]+)/
This tends to eliminate unexpected greediness of the quantifiers.
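Applied to the original question, the same 'everything that isn't the delimiter' idea might look like the sketch below. The sample name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample name, chosen only for illustration.
my $fqdn = 'www.example.com';

# Host part: from the start of the string up to, but not including, the first dot.
# [^.]+ cannot run past a dot, so greediness is not an issue.
my ($host) = $fqdn =~ /^([^.]+)\./;

# The pattern from the post above: the label between the first dot and the next one.
my ($next) = $fqdn =~ /\.([^.]+)/;

print "host: $host\n";   # host: www
print "next: $next\n";   # next: example
```

Inside a character class the dot is literal, so '[^.]' needs no backslash.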
Thanks NBOR. I was trying to use a regex to extract two numeric fields which are separated by whitespace and words. I could get as far as extracting one numeric field, but I am unable to get the second, even though I tried making the regex non-greedy. Let me keep trying to get both numeric fields. Thanks again for your explanation in post #4; it has really helped me a lot.
Code:
#!/usr/bin/perl
use strict;
use warnings;
my $output=`awk '/MemFree:|SwapFree:/ {print}' /proc/meminfo`;
print "Awk Output:\n$output\n";
$output=~s/\n//;
print "After removing New line:\n$output\n";
my $regex= $1 if ($output=~m/(\w+\s+)kB/);
print "$regex\n";
Output:
Code:
Awk Output:
MemFree: 2925076 kB
SwapFree: 1366012 kB
After removing New line:
MemFree: 2925076 kBSwapFree: 1366012 kB
2925076
Yikes! Calling awk from perl is heretical! There is nothing you can do in awk that you cannot do in perl, and it is almost certainly faster than launching awk.
Thanks Rod. Yeah, I know it's bad practice to use awk in Perl scripts. Well, this is what I did to get the MemFree and SwapFree values:
Code:
my $cmd='/bin/cat /proc/meminfo';
$cmd=~s/\s//g; # remove what ever white spaces are there
my ($mem,$swap)=$cmd=~m/MemFree:(\d+).*SwapFree:(\d+)/;
This gives me the MemFree and SwapFree values. I was wondering, can I make this regex even smaller? Sorry if I have been dragging this question out for too long.
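For comparison, here is a sketch of two other ways to pull out both numbers without stripping the whitespace first. The sample text is copied from the output shown earlier in the thread; a real script would read /proc/meminfo instead. Neither variant is necessarily 'smaller', they are just alternatives.

```perl
use strict;
use warnings;

# Sample text taken from the output shown earlier in the thread.
my $text = "MemFree: 2925076 kB\nSwapFree: 1366012 kB\n";

# Variant 1: one match in list context returns both captures in order.
# The /s modifier lets '.' cross the newline between the two lines.
my ($mem, $swap) = $text =~ /MemFree:\s+(\d+).*SwapFree:\s+(\d+)/s;
print "MemFree=$mem SwapFree=$swap\n";   # MemFree=2925076 SwapFree=1366012

# Variant 2: with /g in list context, every (name, value) pair is returned,
# which drops straight into a hash. /m makes '^' match at each line start.
my %free = $text =~ /^(MemFree|SwapFree):\s+(\d+)/mg;
print "$_=$free{$_}\n" for sort keys %free;
```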
Your regex looks fine. I wouldn't try to overthink the whole thing too much. However, your use of
Code:
my $cmd='/bin/cat /proc/meminfo';
is full of holes. For starters, you should have used backticks, not single quotes. I'll assume this was a transcription error, although copy & paste tends to prevent those. Having said that, the whole 'use a script to dump a file' paradigm is completely unnecessary. Perl is quite capable of opening and reading a file. What's more, it even knows how to do the right thing by associating a commandline filename with its standard input file descriptor. See my use of this facility in my previous post.
Code:
while( <> ){
Perl will read its data in the same way, using this method, whether it gets a filename as $ARGV[0], or if data is provided on its standard input using IO redirection or from a pipe. For the simple script in my previous post, the following three commandlines will all work equivalently:
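As a separate illustration of the '<>' idiom (this is a sketch, not the script from the earlier post), the following reads its input through '<>' and picks out the two values. So that it runs anywhere, it writes the sample lines from earlier in the thread to a temporary file and puts that filename in @ARGV, which is an assumption made purely for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for /proc/meminfo so the sketch runs anywhere; the two lines are
# the sample values shown earlier in the thread.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "MemFree: 2925076 kB\nSwapFree: 1366012 kB\n";
close $tmp or die "close: $!";

# Simulate a filename given on the command line. With an empty @ARGV,
# <> would read standard input instead.
@ARGV = ($path);

my %free;
while ( my $line = <> ) {
    # Keep only the two fields of interest.
    $free{$1} = $2 if $line =~ /^(MemFree|SwapFree):\s+(\d+)/;
}

print "$_: $free{$_} kB\n" for sort keys %free;
```

In a real script you would drop the temporary file and the @ARGV assignment, and then either name /proc/meminfo on the command line or feed it in by redirection or a pipe.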