[SOLVED] proper syntax for grep "a[;$]" - endline in charset...
You don't; the line ending isn't a possibility, it's a certainty. All that matters is what happens before it.
If you omit the '$' at the end, there could be additional text on the line that you're not matching. Putting a '$' at the end means there is no additional text left on the line to match.
grep only operates on single lines -- you'd need to use something like awk to operate across newlines.
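For readers who do want "a followed by a semicolon or by the end of the line", here is a minimal sketch of why the bracket form from the thread title fails and what alternation does instead. It assumes a grep that supports `-E` (extended regular expressions); the `sample` helper and its test lines are made up for illustration.

```shell
# Hypothetical sample input: four short lines.
sample() { printf '%s\n' 'a' 'a;b' 'ab' 'a$b'; }

# Inside [...] the $ is a literal dollar sign, not an anchor,
# so this matches "a;" or "a$" but never "a" at the end of a line.
in_brackets=$(sample | grep 'a[;$]')

# Alternation in an extended regex: "a" followed by ";" or by end of line.
alternation=$(sample | grep -E 'a(;|$)')

printf 'bracket form:\n%s\n' "$in_brackets"
printf 'alternation:\n%s\n' "$alternation"
```

The bracket form selects `a;b` and `a$b`, while the alternation selects `a` and `a;b`.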
Although what jhwilliams said is correct, there are situations where you want that anyway.
Try this:
Code:
echo $'\n'
echo $"\n"
printf $'\n'
printf $"\n"
And your solution does not do what you have described.
It works differently.
The "$" in sed stands for a virtual, non-existent character that marks the end of the line. There is no such thing as a newline...
$ echo "a" | sed 's/$/ /' | grep "a[; ]" | sed 's/ $//'
a
ugly, but working for me...
Could you explain exactly what your purpose is for doing this? If you gave us some actual details about what you want to do, we could perhaps come up with something less "rude" for you.
Perhaps you want something more like this?
Code:
grep "a[; ]*$"
That's "a", followed by zero or more (*) spaces or semicolons, followed by the line-ending anchor. If it's not the newline you're worried about, but the word ending (i.e. it can appear anywhere in the line), then use "\>", the word-ending anchor, instead.
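A small sketch of the two patterns just described, run against made-up input. The `lines` helper is hypothetical, and `\>` is assumed to be available (it is a GNU extension that some other greps also accept).

```shell
# Hypothetical sample input.
lines() { printf '%s\n' 'a' 'a; ' 'ab' 'a b'; }

# "a", then any run of ";" or space characters, then end of line.
at_eol=$(lines | grep 'a[; ]*$')

# "a" at the end of a word, anywhere in the line.
at_word_end=$(lines | grep 'a\>')

printf 'end of line:\n%s\n' "$at_eol"
printf 'end of word:\n%s\n' "$at_word_end"
```

Note that `\>` counts digits and underscores as word characters, so it would not treat the "DV" in "DV2000" as a complete word.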
And I hope you realize that it's almost never necessary to use grep and sed together like this. sed can do line filtering on its own.
BTW, uhelp, FYI, your use of $"" has no significance here. Despite the visual appearance, it has no direct relationship to $''. It's for setting up strings that can be translated according to different locales:
ok... i have a file with sequences of [A-Z] or [0-9]:
DV 2000 ACER-TRAVELMATE 44
DV 6000-ACER TRAVELMATE 55
DV/9000 ACER TRAVELMATE 77
i want to grep within these lines with these rules:
if the test string is a sequence of letters, then grep "[^A-Z]$testString[^A-Z]"
if the test string is a number, then grep "[^0-9]$testString[^0-9]"
in this way test strings "DV", "44", "55" and "77" won't give any grep output, because beginning of line is not [^A-Z] and end of line is not [^0-9].
so i need a way to tell grep to include beginning and end of line to the admitted characters.
what i do now is simply add a space at the first and last positions of the string. so the [^A-Z] and [^0-9] conditions are satisfied...
is there a smarter way to do the same?
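One possible answer, sketched under the assumption that `grep -E` is available: instead of padding the lines with spaces, let the separator be "start of line or a non-matching character" on the left and "a non-matching character or end of line" on the right. The function names below are made up for illustration; the data is the three sample lines from the post.

```shell
# The three sample lines from the post.
data() {
  printf '%s\n' \
    'DV 2000 ACER-TRAVELMATE 44' \
    'DV 6000-ACER TRAVELMATE 55' \
    'DV/9000 ACER TRAVELMATE 77'
}

# (^|[^A-Z]) reads "start of line or a non-letter";
# ([^A-Z]|$) reads "a non-letter or end of line".
match_letters() { data | grep -E "(^|[^A-Z])$1([^A-Z]|\$)"; }

# Same idea for numeric test strings, with [^0-9].
match_digits() { data | grep -E "(^|[^0-9])$1([^0-9]|\$)"; }

match_letters DV                       # DV starts every line
match_digits 44                        # 44 ends the first line
match_digits 00 || echo 'no match'     # 00 is always surrounded by digits
```

With this form, "DV" matches all three lines and "44" matches the first one, with no extra spaces added to the input.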
i want to grep within these lines with these rules:
if the test string is a sequence of letters, then grep "[^A-Z]$testString[^A-Z]"
if the test string is a number, then grep "[^0-9]$testString[^0-9]"
this is very confusing---it implies that you need to first test the "test string", and then build the grep syntax based on that.
How about just telling us the desired output after operating on various lines in the file?---i.e. show us an input file, and then the desired output.
the code i posted is a working script...
you can launch it and see what it does...
i'd like to do exactly the same thing, but with simpler code: as you can see, now $_out needs to be processed with sed: once to add a space at the beginning and at the end of every line, and once to remove those spaces before displaying output...
this could be obtained by letting $_sep contain "beginning and end of lines"...
sorry, maybe i didn't explain properly...
i believe forums are made to let people who know more help people who don't know or know less. i surely belong to one of these last categories, and i will never complain about anyone for anything. i can hardly believe that so many wonderful helpers exist. so thanks, thanks and thanks again. always.
that said, the thread has been marked as solved because i was told there is no solution other than my "ugly fix"... the fix works, but since i feel like a baby outside the greatest cake shop in the world, i'd REALLY like to learn if my fix is the best that can be done... and when you showed interest after the thread was closed, i smelled the flavour of one of those wonderful cakes...
i didn't post the expected output because i thought it wouldn't have helped. and if i was wrong, posting the whole code would have helped you much more. but it seems like i was wrong twice...
the function is used to simulate how a search engine works.
if i search for a string, the text is split into the minimum number of substrings made only of numbers or only of letters: "ACER DV2001EL" is split into the substrings "ACER DV 2001 EL". other chars ([-/ ;,.-]) are interpreted as substring separators. so the string "ACER DV2-001EL" is split into the substrings "ACER DV 2 001 EL".
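The splitting step described above could be sketched as follows. This is not the poster's code (which is not shown in the thread); `split_query` is a hypothetical name, it assumes a sed that supports `-E`, and it treats every non-alphanumeric character as a separator rather than only the listed ones.

```shell
# Hypothetical helper: split a query into runs of letters or runs of digits.
split_query() {
  printf '%s\n' "$1" |
    tr '[:lower:]' '[:upper:]' |
    sed -E -e 's/[^A-Z0-9]+/ /g' \
           -e 's/([A-Z])([0-9])/\1 \2/g' \
           -e 's/([0-9])([A-Z])/\1 \2/g'
}

split_query 'ACER DV2001EL'     # prints: ACER DV 2001 EL
split_query 'ACER DV2-001EL'    # prints: ACER DV 2 001 EL
```

The first sed expression collapses separators into single spaces; the other two insert a space wherever a letter meets a digit or a digit meets a letter.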
in my code, this is done by:
after that, the search engine looks for the lines in the input file that contain all of the substrings. but a substring must not be preceded or followed by a character of the same charset. if the input file contains the line "COMPUTER ACER DV2001EL NEW", the second search string ("ACER DV2-001EL") gives no results, since the substring "2" is followed by "0" and does not match.
in my code this is done adding a custom "separator" before and after every substring:
Code:
for _string in $_query; do
    if [[ $_string =~ [0-9] ]]; then
        _sep="[^0-9]"
    else
        _sep="[^A-Z]"
    fi
    _out=$(echo "$_out" | grep -i "${_sep}${_string}${_sep}")
done
this works fine in the previous example, but would not work if one of the substrings is the first or the last "word" of the line in the input file. if the line is "ACER DV2001EL NEW", the search string "ACER DV2001EL" would give no results, since grep looks for a NON [A-Z] character before the substring "ACER".
i guess it could be useful for many purposes, so i'm asking anyone who knows more than me (and i know there are a LOT of you) if this can be done in a better way than adding a space at the beginning and at the end of every line of the input file, so that grep finds a separator before and after each substring.
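A hedged sketch of how the loop above might avoid the space padding: each separator is wrapped in an alternation with `^` or `$`, so a line edge counts as a separator. It assumes `grep -E`, assumes the tokens are purely alphanumeric (so they contain no regex metacharacters), and uses a hypothetical function name, `filter_tokens`, that reads candidate lines from standard input.

```shell
# Hypothetical helper: print the stdin lines that contain every token,
# each bounded by a separator character or by the start/end of the line.
filter_tokens() {
  _out=$(cat)
  for _string in "$@"; do
    case $_string in
      *[0-9]*) _sep='[^0-9]' ;;   # numeric token: neighbours must be non-digits
      *)       _sep='[^A-Z]' ;;   # letter token: neighbours must be non-letters
    esac
    _out=$(printf '%s\n' "$_out" |
      grep -iE "(^|${_sep})${_string}(${_sep}|\$)")
  done
  if [ -n "$_out" ]; then
    printf '%s\n' "$_out"
  fi
}

printf '%s\n' 'ACER DV2001EL NEW' 'COMPUTER ACER DV2001EL NEW' |
  filter_tokens ACER DV 2001 EL
```

Both lines are printed here, including the one where "ACER" is the first word. With the tokens `ACER DV 2 001 EL` nothing is printed, because "2" is followed by another digit.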