regex substring
Hi
I have a string: stuff_bitmore_needed_stuffnotneeded_083.txt
The objective is to extract the 'needed' part of the string; however, it may not always be 6 characters, it could be more or less. Is there a regex solution to my problem? I want to use it in a bash script. Your help is appreciated. Thank you |
Certainly - several no doubt.
But to use regex you have to be able to precisely define what to keep and/or what to discard. Precisely. |
There is probably a regex solution, but you will need to supply a better description of the string and the part you wish to extract.
The basic problem is to work out a regular expression which describes the needed part and how it may be recognized within the entire string. For example, if as in your string the needed part is always after the second underscore and contains no underscores itself, you might use something like this: Code:
s/^[^_]+_[^_]+_([^_]+).*/\1/
To get help here you will need to provide a few real examples of the strings you want to extract from, along with the results you would expect from each. If there is a pattern to the strings then describing that pattern will lead most directly to the solution. |
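For use in a bash script, a minimal sketch of how that substitution might be run, assuming a sed that accepts the -E flag (GNU and BSD sed both do). The flag is needed so that + and the unescaped parentheses are treated as extended-regex operators. The variable names str and needed are only illustrative.

```shell
#!/usr/bin/env bash
# Sample string from the first post.
str="stuff_bitmore_needed_stuffnotneeded_083.txt"

# Skip the first two underscore-delimited fields, capture the third,
# and discard the rest of the line.
needed=$(printf '%s\n' "$str" | sed -E 's/^[^_]+_[^_]+_([^_]+).*/\1/')

printf '%s\n' "$needed"   # prints: needed
```

If a line does not match the pattern at all, sed prints it unchanged, so a real script may want to check the result before using it.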
Umm, it's more complicated than I thought. All the strings follow the pattern above; all have the underscores in the same positions.
This is the closest I have got: [^_][\w][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z][a-z] which gives me: e_needed. I know this is very poor. I am experimenting with it on https://regexr.com/ Hope that helps |
Regex always exposes corner cases, so be very careful what you choose to use. "\w" is usually defined (e.g. in those engines that choose to follow perlre) to include the underscore character.
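A quick way to see this from a terminal, as a sketch that assumes GNU grep (both -o and \w are extensions there, not POSIX). The variable names are only illustrative.

```shell
#!/usr/bin/env bash
str="stuff_bitmore_needed_stuffnotneeded_083.txt"

# \w covers letters, digits AND the underscore, so the first match
# runs all the way to the dot instead of stopping at an underscore.
with_w=$(printf '%s\n' "$str" | grep -oE '\w+' | head -n 1)

# [[:alpha:]] covers letters only, so the first match stops at the first underscore.
alpha_only=$(printf '%s\n' "$str" | grep -oE '[[:alpha:]]+' | head -n 1)

printf '%s\n' "$with_w"      # prints: stuff_bitmore_needed_stuffnotneeded_083
printf '%s\n' "$alpha_only"  # prints: stuff
```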
|
I could not get the substitutions to work at the site you linked, but found this site which does seem to work: regex101.com. In the expression you show above, you can replace the repeated [a-z]'s with [a-z]+ (one or more characters in range a-z), but I don't think that will do what you want. The example I gave above does work at the URL I have linked, but you need to enter the match and substitution patterns in separate input elements, like so... Code:
Regular Expression:
I would encourage you to open a terminal on your GNU/Linux machine and learn by using grep and sed from the command line. It will teach you the skills without any quirks which web-applications sometimes have, and in the text environment where regular expressions natively exist! Plus you will have all the native documentation available at the same time: man regex, man pcre, man pcresyntax and man pcrepattern, and more! For example, putting your test patterns in a file named 'infile' and using sed to match/replace (again with the above sample): Code:
cat infile |
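The command in the post above is cut off in the thread. As a rough sketch of what such a test could look like (not necessarily the poster's original command), the following writes a few strings to a file named infile and runs the earlier substitution over it. Only the first string comes from the thread; the other two are made-up test data, and the temporary directory is just there to avoid clobbering an existing file.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so no existing 'infile' is overwritten.
tmpdir=$(mktemp -d)

# First line is the sample from the thread; the other two are hypothetical
# test strings with a shorter and a longer third field.
printf '%s\n' \
    "stuff_bitmore_needed_stuffnotneeded_083.txt" \
    "aa_bb_ab_cc_001.txt" \
    "x_y_longerfield_z_999.txt" > "$tmpdir/infile"

# sed can read the file directly, so no cat is required.
result=$(sed -E 's/^[^_]+_[^_]+_([^_]+).*/\1/' "$tmpdir/infile")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

With those three lines the third field should come out regardless of its length.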
Or if regexp is not a religion
Code:
str="stuff_bitmore_needed_stuffnotneeded_083.txt" |
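The rest of that snippet is missing from the thread. One regex-free possibility, offered only as a sketch and not as what the poster actually wrote, is plain shell parameter expansion. The names rest and needed are illustrative.

```shell
#!/usr/bin/env bash
str="stuff_bitmore_needed_stuffnotneeded_083.txt"

rest=${str#*_}       # drop the shortest prefix ending in "_": bitmore_needed_stuffnotneeded_083.txt
rest=${rest#*_}      # drop the next field the same way:      needed_stuffnotneeded_083.txt
needed=${rest%%_*}   # drop everything from the first remaining "_" onward

printf '%s\n' "$needed"   # prints: needed
```

This needs no external command, which can matter if the script processes many filenames in a loop.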
If _ is the delimiter you can do it easily (but the OP should tell us whether that is the case):
Code:
P=( ${str//_/ } ) |
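The array line above relies on unquoted word splitting, so it would misbehave if a string ever contained spaces or glob characters. A slightly more defensive variant of the same idea, as a sketch assuming bash (read -a and <<< are bash features), with parts and needed as illustrative names:

```shell
#!/usr/bin/env bash
str="stuff_bitmore_needed_stuffnotneeded_083.txt"

# Split on "_" only for this one read; the global IFS is left untouched.
IFS=_ read -r -a parts <<< "$str"

# Array indices start at 0, so the third field is index 2.
needed=${parts[2]}

printf '%s\n' "$needed"   # prints: needed
```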
Thank you everyone that contributed. It seems I have a lot to learn.
|
you are welcome.
If you think your problem is solved, please mark the thread solved. If you have some additional questions, do not hesitate, just ask. And if you really want to say thanks, just click on yes. (And obviously every one of us has a lot to learn.) |