LinuxQuestions.org
Old 02-10-2011, 04:30 PM   #1
lexit
LQ Newbie
 
Registered: Feb 2011
Posts: 2

Rep: Reputation: 0
how to use regexp in diff


I need to compare 2 files using diff. The problem I've encountered is that I need to exclude lines that contain certain phrases. I know that diff supports the -I switch, but no matter how I try to form the regexp it doesn't seem to work the way I expect it to. If anyone has used the -I switch before, could you please post some examples of how it is used? Thanks.


diff -I "\[skipthisline\]" file1 file2 > output.diff

I need to exclude lines that contain the string "[skipthisline]" but I have no idea what syntax is used after the -I switch. Is it supposed to be enclosed in quotes, or in slashes like /\[skipthisline\]/, or entered without either? I need to include a backslash before each bracket so that it's interpreted as a literal string rather than as a set of characters like [a-z]. Do I need to use 2 backslashes? "\\[skipthisline\\]"

Is it sufficient to simply type the string I want to match or do I need to match the entire line in order to exclude it from the output?

.*\[skipthisline\].*

or

^.*\[skipthisline\].*$
 
Old 02-11-2011, 08:11 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Apparently, the regex has to match the changed line in both files for it to work. If the line on either side doesn't match, diff still shows the whole hunk as a difference.
Code:
$ cat fileA.txt
[Fry]
[Leela]
[Bender]
[Farnsworth]
[Amy]
[Hermes]
[Zoidberg]
[Nibbler]

$ cat fileB.txt
[Fry]
[Leela]
[Bender]
[Prof. Farnsworth]
[Amy]
[Hermes]
[Dr. Zoidberg]
[Nibbler]

$ diff fileA.txt fileB.txt
4c4
< [Farnsworth]
---
> [Prof. Farnsworth]
7c7
< [Zoidberg]
---
> [Dr. Zoidberg]

$ diff -I '^\[Dr.*\]' fileA.txt fileB.txt
4c4
< [Farnsworth]
---
> [Prof. Farnsworth]
7c7
< [Zoidberg]
---
> [Dr. Zoidberg]

$ diff -I '^\[.*Zoid.*\]' fileA.txt fileB.txt
4c4
< [Farnsworth]
---
> [Prof. Farnsworth]

$ diff -I 'Zoid.*\]$' fileA.txt fileB.txt
4c4
< [Farnsworth]
---
> [Prof. Farnsworth]
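Inserting a blank line right next to one of those lines, so that the two files no longer line up there, also foils the regex match: the blank line ends up in the same hunk and doesn't match the regex, so the whole hunk is still shown.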
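Applied to the original question, the same rule can be checked with a small script. This is only a sketch: it assumes GNU diff, and the file names and contents are made up for illustration. It also shows that the pattern doesn't need to cover the whole line; it only has to occur somewhere in it.

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'stamp 1 [skipthisline]' 'beta' > "$tmp/file1"
printf '%s\n' 'alpha' 'stamp 2 [skipthisline]' 'beta' > "$tmp/file2"
printf '%s\n' 'alpha' 'stamp 2' 'beta' > "$tmp/file3"

# Both changed lines contain the marker, so the hunk should be ignored.
# "|| true" only stops diff's non-zero exit status from aborting the script.
both=$(diff -I '\[skipthisline\]' "$tmp/file1" "$tmp/file2" || true)

# Only the file1 line contains the marker, so the hunk should still be reported.
one=$(diff -I '\[skipthisline\]' "$tmp/file1" "$tmp/file3" || true)

printf 'marker on both sides: [%s]\n' "$both"
printf 'marker on one side only:\n%s\n' "$one"
rm -rf "$tmp"
```

The first diff should print nothing, and the second should print the usual 2c2 hunk.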

As for escaping characters, first realize that protecting them from the shell and protecting them inside the regex pattern are two separate things. To fully protect a string from the shell, you should usually use single quotes. See here for a full discussion on shell quoting: http://www.tldp.org/LDP/abs/html/quoting.html
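To see what the shell actually hands to diff, you can print the argument instead of running diff. A minimal sketch; show is just a hypothetical helper name, not part of diff:

```shell
# Print each argument exactly as the program receives it from the shell.
show() { printf '%s\n' "$@"; }

show '\[skipthisline\]'     # single quotes: passed through untouched
show "\[skipthisline\]"     # double quotes: a backslash before [ is kept
show "\\[skipthisline\\]"   # double quotes: \\ collapses to a single backslash
show \[skipthisline\]       # unquoted: the shell removes the backslashes
```

The first three forms all deliver \[skipthisline\] to the program, so doubling the backslashes inside double quotes changes nothing. Only the unquoted form loses its backslashes and arrives as [skipthisline], which the regex engine would then read as a character list. No slashes are needed around the pattern.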

After you've protected the string from the shell and passed it to the program, you then have to escape the characters that are special in the regex itself, including brackets. There are a couple of ways to do this. Backslashing is one way, as I demonstrated above. You can also put the character inside a bracket expression, which lists a set of acceptable characters and can hold otherwise special ones such as brackets.
Code:
diff -I '[[].*Zoid.*[]]$' fileA.txt fileB.txt
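But to match a ']', it has to be the first character in the list. This doesn't work, because '[ ]' closes the list after the space and the next ']' is then a separate literal character: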
Code:
$ diff -I '[[].*Zoid.*[ ]]$' fileA.txt fileB.txt
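The three ways of writing the pattern can be compared side by side with a throwaway script. Again just a sketch, assuming GNU diff; the ignored helper and the one-line files are made up for the test:

```shell
# Two hypothetical one-line files that differ only in the Zoidberg line.
tmp=$(mktemp -d)
printf '%s\n' '[Zoidberg]' > "$tmp/a"
printf '%s\n' '[Dr. Zoidberg]' > "$tmp/b"

# ignored PATTERN: report whether diff -I PATTERN hides the only hunk.
ignored() {
    if [ -z "$(diff -I "$1" "$tmp/a" "$tmp/b")" ]; then
        echo "ignored"
    else
        echo "shown"
    fi
}

r1=$(ignored '^\[.*Zoid.*\]')      # backslash-escaped brackets
r2=$(ignored '[[].*Zoid.*[]]$')    # ']' first in the list, so it is literal
r3=$(ignored '[[].*Zoid.*[ ]]$')   # a space list followed by a literal ']'

printf '%s\n' "escaped: $r1" "bracket list: $r2" "misplaced bracket: $r3"
rm -rf "$tmp"
```

The first two patterns should report "ignored" and the third "shown", since neither line ends in a space followed by ']'.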
More on regex syntax here: http://www.tldp.org/LDP/abs/html/x16775.html
and here: http://www.regular-expressions.info/posixbrackets.html
 
2 members found this post helpful.
Old 02-12-2011, 05:02 PM   #3
lexit
LQ Newbie
 
Registered: Feb 2011
Posts: 2

Original Poster
Rep: Reputation: 0
Excellent tutorial and examples on how to use regexp with diff. Many thanks for your helpful reply.
 
  


All times are GMT -5.
