LinuxQuestions.org
06-04-2009, 09:51 AM   #1
Geneset
Way of ignoring out-of-order lines in diff?


Hi Folks.

I'm working on diffing HTML pages and MySQL database dumps, and I've come across a (minor) issue.

I can use the regular expression engine to ignore certain "words" that I know to have changed (dates, version numbers, hostnames, etc.), but (for a reason that is being hunted separately) data is occasionally reported on the web page out of order.

This behaviour is not a fault, and I know that I can do
Code:
diff <(sort filenameA) <(sort filenameB)
and that solves the issue for a single file.

But I'm dealing with snapshots consisting of hundreds of pages. I have been recursively diffing each directory successfully, which solves every comparison problem except this one and leaves me with a single file that lists the differences between the directories, file by file.

I could do something along the lines of
Code:
diff <(find dirnameA -type f | xargs sort) <(find dirnameB -type f | xargs sort)
But then the input to diff looks like one long file, and I lose the delineation between files.
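One way to keep the per-file boundaries is to sort and diff each pair of files separately and print a header for every file that differs. The following is only a sketch under some assumptions: the function name `sorted_dir_diff` and the `=== path ===` header are made up for illustration, `mktemp` is assumed to be available, file names are assumed not to contain newlines, and files that exist only in the second directory are not reported.

```shell
# Compare two directory snapshots file by file, ignoring line order.
# Usage: sorted_dir_diff DIR_A DIR_B   (hypothetical helper name)
sorted_dir_diff() {
    dir_a=$1
    dir_b=$2
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    # List files relative to DIR_A so the same path can be looked up in DIR_B.
    (cd "$dir_a" && find . -type f) | LC_ALL=C sort | while IFS= read -r rel; do
        if [ ! -f "$dir_b/$rel" ]; then
            echo "Only in $dir_a: $rel"
            continue
        fi
        LC_ALL=C sort "$dir_a/$rel" > "$tmp_a"
        LC_ALL=C sort "$dir_b/$rel" > "$tmp_b"
        # diff exits non-zero when the sorted files differ; only then
        # print a header, so file boundaries stay visible in the report.
        if ! out=$(diff "$tmp_a" "$tmp_b"); then
            echo "=== $rel ==="
            printf '%s\n' "$out"
        fi
    done
    rm -f "$tmp_a" "$tmp_b"
}
```

Calling `sorted_dir_diff dirnameA dirnameB > report.txt` would give one report with a header per changed file. Any ignore patterns already passed to diff could be added to the `diff` call inside the loop.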

After inspecting the man pages, I can't find an (obvious) way to ignore out-of-order lines.

Does anyone have any bright ideas on either:

A) a regex I could use in diff to compare each line against the other lines in the current file, or
B) a way of post-processing the resulting diff file to excise the offending swaps?
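For option B, one possible approach is to cancel out every removed line that also shows up as an added line. The sketch below assumes the diff is in the default "normal" format (lines prefixed with `< ` and `> `) and covers a single pair of files; the name `strip_swaps` is hypothetical. It drops the hunk headers (such as `3c3`), since they no longer mean much once swapped lines are removed.

```shell
# Read a normal-format diff on stdin and print only the removed/added
# lines that have no matching counterpart on the other side.
strip_swaps() {
    awk '
        # Count how often each line content was removed or added.
        /^< / { removed[substr($0, 3)]++; next }
        /^> / { added[substr($0, 3)]++; next }
        END {
            # A line removed N times and added M times is a net change
            # only for the surplus on one side.
            for (line in removed) {
                n = removed[line] - added[line]
                while (n-- > 0) print "< " line
            }
            for (line in added) {
                n = added[line] - removed[line]
                while (n-- > 0) print "> " line
            }
        }
    ' | LC_ALL=C sort   # awk array order is unspecified, so sort the result
}
```

Used as `diff filenameA filenameB | strip_swaps`, this leaves the same net additions and removals that diffing the sorted files would show.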


Regards
G

Last edited by Geneset; 06-04-2009 at 09:52 AM. Reason: Spelling
 
06-04-2009, 09:56 AM   #2
MensaWater
Have you tried sdiff? It tries to put the differences side by side. I often find it more useful than diff, though it is not quite a perfect tool.
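As a rough illustration of the sdiff suggestion combined with sorting, the sketch below compares two files side by side with line order ignored. The function name `sorted_sdiff` is made up, and `mktemp` is assumed to be available.

```shell
# Side-by-side comparison of two files with line order ignored.
# Usage: sorted_sdiff FILE_A FILE_B   (hypothetical helper name)
sorted_sdiff() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    LC_ALL=C sort "$1" > "$tmp_a"
    LC_ALL=C sort "$2" > "$tmp_b"
    # -s suppresses lines common to both files, leaving only differences.
    sdiff -s "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

Like diff, sdiff returns a non-zero exit status when the files differ, so the function passes that status through.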
 
06-04-2009, 10:01 AM   #3
Geneset
Original Poster
I thought of that too; it was my initial approach to this issue. But without the regex capabilities of diff, the perceived error rate greatly increases: all of the changes we already know about get displayed, instead of the output being reduced to the differences that we DON'T know about.
Thank you for your feedback, jl.
 
  

