LinuxQuestions.org


rsmitha 09-06-2012 11:12 PM

effective regular expression for pattern matching
 
Hello All,

I have been trying to find a proper regular expression to match (rather unmatch) some strings in the files I have.
I have pasted some strings here from a file named "TestFile":
virtual void Handle(const ABC::reportStatusEngaged& ) = 0;
virtual void Handle(const ABC::reportStatusPathOk& ) = 0;
virtual void Handle(const ABC::reportStatusSteeringOverride& ) = 0;
virtual void Handle(const ABC::reportStatusHeadingConverged& ) = 0;
virtual void Handle(const ABC::reportStatusSpeedOK& ) = 0;
virtual void Handle(const ABC::reportStatusHardwareOK& ) = 0;
virtual void Handle(const ABC::reportStatusImplementOk& ) = 0;
virtual void Handle(const ABC::reportStatusDmuOrientationOk& ) = 0;
virtual void Handle(const ABC::reportStatusValveSteering& ) = 0;
virtual void Handle(const ABC::reportStatusValveOnline& ) = 0;
virtual void Handle(const ABC::reportStatusEngageLimits& ) = 0;
virtual void Handle(const ABC::reportStatusElectricSteer& ) = 0;
virtual void Handle(const ABC::reportStatusValveRequiresPowerCycle& ) = 0;
virtual void Handle(const ABC::reportGuidanceDisengageReasonToken& ) = 0;
virtual void Handle(const ABC::reportGuidanceRemoteEngageToken& ) = 0;
virtual void Handle(const ABC::reportPathErrorsToken& ) = 0;

My aim is to find the strings that do not already have "Token" immediately before the "&" and insert it there. When I run the script on this file again after the replacement, it should not show any matches.
I tried a number of things out of which one is shown below:
sed -ne 's/\(re.*\)[^Token]&/\1Token\&/' TestFile

The trouble is, it seems to skip some strings like:
reportStatusSteeringOverride&
reportStatusHeadingConverged&
It also seems to skip the strings containing "OK".

Please help and advise.

chrism01 09-07-2012 12:30 AM

Can you post the desired output from that lot, as your description is unclear to me.

grail 09-07-2012 07:28 AM

Your current regex says: find "re" followed by zero or more characters, then a single character that is not one of 'T', 'o', 'k', 'e' or 'n', followed by &.
Also, as you have used '-n' in your sed, you either need to remove it or put a 'p' at the end of the replacement:
Code:

sed -ne 's/\(re.*\)[^Token]&/\1Token\&/p' TestFile
You can probably see why you are getting errors. Looking at your 2 examples:

reportStatusSteeringOverride& ... This will not be displayed as the 'e' prior to & is part of the characters of Token
reportStatusHeadingConverged& ... This will be displayed as 'd' is not part of Token
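
To make the bracket-expression behaviour concrete, here is a small sketch that feeds just those two names (not the whole TestFile) through the same substitution. The variable name `out` is only for illustration. Note a side effect that is easy to miss: `[^Token]` consumes the character it matches, and since that character sits outside the `\(...\)` group, it is dropped from the result.

```shell
# Only the single character right before & is tested against [^Token].
out=$(printf '%s\n' 'reportStatusSteeringOverride&' 'reportStatusHeadingConverged&' |
  sed -n 's/\(re.*\)[^Token]&/\1Token\&/p')

# The "Override" line is not printed (its 'e' is in the set);
# the "Converged" line is printed, but its final 'd' has been eaten.
printf '%s\n' "$out"
# prints: reportStatusHeadingConvergeToken&
```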

pan64 09-07-2012 07:54 AM

[^Token] matches any single character other than T, o, k, e or n (it does not mean "something that does not contain the word Token").

I would suggest searching for Token& and skipping those lines:
sed -e '/Token&/b' -e 's/&/Token&/' file

rsmitha 09-07-2012 06:01 PM

Hi Chris,

The desired output would be to have all the strings mentioned above with the word "Token" embedded in them before the "&".
For example, the first one: virtual void Handle(const ABC::reportStatusEngaged& ) should become: virtual void Handle(const ABC::reportStatusEngagedToken& )

Hi Grail,
The problem is "converged" is also not being displayed.

Also, could you guys tell me how to use a regexp to negatively match an entire word?
I tried [^\<Token\<], but it does not seem to work.

Regards,
rsmitha.

grail 09-08-2012 03:23 AM

Maybe something like:
Code:

sed -n '/Token&/! s/&/Token&/p' file
If you are using this to edit a file, change the -n to -i and remove the 'p' at the end.
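
As a quick way to check this, the sketch below runs the command on a scratch file holding three of the declarations from the first post, then runs it again on the result. The temporary file and the variable names (`tmp`, `changed`, `fixed`, `again`) are assumptions made for the example. It pipes the result instead of using -i so that the original file is left untouched.

```shell
# Hypothetical scratch file with three declarations from the thread.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
virtual void Handle(const ABC::reportStatusEngaged& ) = 0;
virtual void Handle(const ABC::reportStatusSteeringOverride& ) = 0;
virtual void Handle(const ABC::reportPathErrorsToken& ) = 0;
EOF

# First pass: print only the lines that get changed (the first two).
changed=$(sed -n '/Token&/! s/&/Token&/p' "$tmp")
printf '%s\n' "$changed"

# Apply the substitution to every line, then run the check again:
# every line now has Token&, so the second pass prints nothing.
fixed=$(sed '/Token&/! s/&/Token&/' "$tmp")
again=$(printf '%s\n' "$fixed" | sed -n '/Token&/! s/&/Token&/p')
printf 'second pass output: [%s]\n' "$again"

rm -f "$tmp"
```

The `/Token&/!` address is what stands in for a "negative match" here: the line is tested for the whole word first, and the substitution only runs on lines where that test fails.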

rsmitha 09-08-2012 11:10 PM

Hi Grail,

I think this works !!

Thank you !!
I was initially doing something like this with sed:
sed -n '/re.*Token&/!p' $file
It was giving me the right matches but I did not know how to feed this to the replacement command.
Your reply has given me that. :)


All times are GMT -5.