LinuxQuestions.org
Old 11-30-2010, 03:31 PM   #1
ab52
LQ Newbie
 
Registered: Jul 2010
Posts: 9

Rep: Reputation: 0
Comparing two files


I have two text files and want to compare the differences between them, but not all of them: there are only about 30 lines of relevant text I want to compare.

Any ideas, in either Perl or bash?

thanks
Adam
 
Old 11-30-2010, 03:37 PM   #2
GrapefruiTgirl
Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 543
Hi there,

probably better to provide some more information in order for folks to give some decent suggestions. I.e. do you want to compare 30 lines in one file, to an entire second file? Or compare 30 lines in one file to 30 lines in the second file? Maybe you want to compare the first 30 lines of each file, then quit? Are the lines consecutive in each file? What kind of data is in each file - plain alphabet soup, or some sort of XML?

In cases like this, it's often a great idea to show us snippets of the data files, and show what sort of output you expect.

Do you know any Perl or Bash? I'd lean towards Perl so far for this, but who knows, it may be a relatively easy job, and bash might do it. Maybe neither of these tools will seem to be the right one, once the problem is better understood.

Cheers!
 
Old 11-30-2010, 03:46 PM   #3
ab52
LQ Newbie
 
Registered: Jul 2010
Posts: 9

Original Poster
Rep: Reputation: 0
OK thanks, I will get you some examples to have a look at.

My programming skills are limited to none.

thanks
Adam
 
Old 11-30-2010, 03:59 PM   #4
XavierP
Moderator
 
Registered: Nov 2002
Location: Kent, England
Distribution: Lubuntu
Posts: 19,174
Blog Entries: 4

Rep: Reputation: 428
Moved: This thread is more suitable in Programming and has been moved accordingly to help your thread/question get the exposure it deserves.
 
Old 11-30-2010, 04:51 PM   #5
garyg007
Member
 
Registered: Aug 2008
Location: north-east ohio
Distribution: Debian-squeeze/stable;
Posts: 279
Blog Entries: 1

Rep: Reputation: 31
@ab52

are you looking for something like this excerpt from a perl document I found?
Code:
Doing String Selections (Parsing)
If regular expressions' only benefit was looking for a (albeit complex)
string within a string, it wouldn't be worth learning. Regular expressions
(and Perl itself, for that matter) really start earning their keep by allowing
you to select and process substrings based on what they contain, and the
context in which they appear.
For instance, create a program whose input is a piped in directory
command and whose output is stdout, and whose output represents a batch
file which copies every file (not directory) older than 12/22/97 to a
directory called \oldie. This would be pretty nasty in C or C++. The
directory output would look something like this:
  Volume in drive D has no label
  Volume Serial Number is 4547-15E0
  Directory of D:\polo\marco
.                   <DIR>            12-18-97 11:14a .
..                  <DIR>            12-18-97 11:14a ..
INDEX       HTM             3,237    02-06-98 3:12p index.htm
APPDEV      HTM             6,388    12-24-97 5:13p appdev.htm
NORM        HTM             5,297    12-24-97 5:13p norm.htm
IMAGES              <DIR>            12-18-97 11:14a images
TCBK        GIF               532    06-02-97 3:14p tcbk.gif
LSQL        HTM             5,027    12-24-97 5:13p lsql.htm
CRASHPRF    HTM            11,403    12-24-97 5:13p crashprf.htm
WS_FTP   LOG            5,416 12-24-97 5:24p WS_FTP.LOG
FIBB     HTM           10,234 12-24-97 5:13p fibb.htm
MEMLEAK HTM            19,736 12-24-97 5:13p memleak.htm
LITTPERL        <DIR>            02-06-98 1:58p littperl
         9 file(s)              67,270 bytes
         4 dir(s)        132,464,640 bytes free
UUUUgly! I'd hate to do this in C or C++. But wait. It's 18 lines in Perl:
while(<STDIN>)
  {
  my($line) = $_;
  chomp($line);
  if($line !~ /<DIR>/)               #directories don't count
    {
    #** only lines with dates at position 28 and (long) filename at pos 44 **
    if ($line =~ /.{28}(\d\d)-(\d\d)-(\d\d).{8}(.+)$/)
      {
      my($filename) = $4;
      my($yymmdd) = "$3$1$2";
      if($yymmdd lt "971222")
        {
        print "copy $filename \\oldie\n";
        }
      }
    }
  }
The above snippet came from "Perl Regular Expressions (With Snippets)" in Steve Litt's Perls of Wisdom, presented by Troubleshooters.Com and Code Corner.

Last edited by garyg007; 11-30-2010 at 04:55 PM.
 
Old 11-30-2010, 08:38 PM   #6
Lsatenstein
Member
 
Registered: Jul 2005
Location: Montreal Canada
Distribution: Fedora Core 6 XEN
Posts: 194
Blog Entries: 1

Rep: Reputation: 36
File Comparisons using editor and color indications

Quote:
Originally Posted by ab52
I have two text files and want to compare the differences between them, but not all of them: there are only about 30 lines of relevant text I want to compare.

Any ideas, in either Perl or bash?

thanks
Adam
I am not sure from your request whether you want this done interactively (in a viewer) or in batch. Quite a few text editors let you open two files and compare them. One editor that I used in Windows (yes, that is where I found the tool) allowed me to see what was inserted and removed in each file, using colors.
 
Old 12-01-2010, 06:13 AM   #7
Mark1986
Member
 
Registered: Aug 2008
Location: Netherlands
Distribution: Xubuntu
Posts: 87

Rep: Reputation: 11
If you are using Windows you can use TextDiff. If you are using some Linux version, you might want to try sdiff. It has built-in options to compare only those lines you want to compare. It is, however, a command-line tool, and its side-by-side output can become a bit nasty when you compare long lines.
 
Old 12-01-2010, 07:20 AM   #8
frogweasel
LQ Newbie
 
Registered: Mar 2008
Location: Poquoson, VA
Distribution: Ubuntu 10.04
Posts: 12

Rep: Reputation: 0
I agree with others that an appropriate editor is the best choice, but you seem to want to script this.
If so, and if the files are of a predictable length and format, this may work for you:

For simplicity, assume two files of 10 lines each.
You want to compare lines 5-7 only.

head -n 7 filename | tail -n 3 > /tmp/temp.txt   # create a file with the lines to be compared (5-7)

Do that with both files and use diff or sdiff to compare.

If the file formats are not predictable, additional work will have to be done.
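The head/tail steps above could be wrapped up roughly like this. It is only a sketch: the function names `extract_lines` and `compare_range` are made up for illustration, and it assumes both files really do contain at least as many lines as the end of the range.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): print lines START..END of FILE
# using head and tail. If FILE has fewer than END lines, tail will return
# the wrong lines, which is why the files must be of a predictable length.
extract_lines() {
    local start=$1 end=$2 file=$3
    # head keeps lines 1..end; tail keeps the last (end - start + 1) of those.
    head -n "$end" "$file" | tail -n "$(( end - start + 1 ))"
}

# Hypothetical helper: compare the same line range of two files.
# Returns diff's exit status: 0 if the ranges match, 1 if they differ.
compare_range() {
    local start=$1 end=$2 file1=$3 file2=$4
    local tmp1 tmp2 status
    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }
    extract_lines "$start" "$end" "$file1" > "$tmp1"
    extract_lines "$start" "$end" "$file2" > "$tmp2"
    diff "$tmp1" "$tmp2"
    status=$?
    rm -f "$tmp1" "$tmp2"
    return "$status"
}

# Example call (file names are placeholders):
#   compare_range 5 7 first.txt second.txt
```

Using mktemp instead of a fixed /tmp/temp.txt avoids clobbering a file that another run is still using.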
 
Old 12-01-2010, 09:26 AM   #9
dannybpng
Member
 
Registered: Sep 2003
Location: USA
Distribution: Fedora 20
Posts: 58

Rep: Reputation: 19
SED (stream editor) would be a possible choice. Here is the way to get a range of lines out of files and use diff on them.

Print lines 5 to 10 inclusive:
sed -n '5,10p' file1.txt > section1.txt

Print lines starting with the line beginning with "START" till a line beginning with "END":
sed -n '/^START/,/^END/p' file2.txt > section2.txt

diff section1.txt section2.txt
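
One thing to watch with the pattern form: sed prints every START..END block it finds, and if no END line follows a START it keeps printing to the end of the file. If only the first block is wanted, something like the sketch below may help. The function name `first_block` is invented for this example, and the START/END markers are just the ones used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the block from the first line beginning with
# "START" through the next line beginning with "END", then stop reading.
first_block() {
    # Inside the range: print each line, and quit once the END line is printed.
    sed -n '/^START/,/^END/{p;/^END/q;}' "$1"
}

# Example use (file names are placeholders):
#   first_block file1.txt > section1.txt
#   first_block file2.txt > section2.txt
#   diff section1.txt section2.txt
```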

Dan
 
Old 12-01-2010, 10:20 AM   #10
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 230
Edit: Dan posted while I was composing. My suggestion is now a bit redundant.

sed can do the line selection in one step:
Code:
sed -n '5,7p' file1 > temp1
IMNRHO, this problem is too simple to bother w/ Perl. -- I see it as a 3-liner in bash.
Generalizing the line #s to <w>,<x>,<y>,<z>:
Code:
sed -n '<w>,<x>p' file1 > temp1
sed -n '<y>,<z>p' file2 > temp2
diff temp1 temp2  | less -S#33

If the size of the files does not make the process too long, you could diff the files 1st & use sed or grep to disregard the irrelevant. This would avoid the creation of temp files.
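
A rough sketch of that diff-first idea, assuming the relevant lines can be recognised by some pattern (the function name `diff_matching` and the pattern are hypothetical, not something from the original question):

```shell
#!/usr/bin/env bash
# Hypothetical helper: diff two whole files, then keep only the changed
# lines that match PATTERN. No temp files are created.
diff_matching() {
    local pattern=$1 file1=$2 file2=$3
    # In diff's default output, lines removed from file1 start with "< "
    # and lines added in file2 start with "> "; everything else is dropped.
    diff "$file1" "$file2" | grep -E '^[<>] ' | grep -e "$pattern"
}

# Example call (placeholders): show only differences on lines containing "size="
#   diff_matching 'size=' file1 file2
```

The exit status here is grep's, so it is 0 only when at least one relevant difference was found.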

Last edited by archtoad6; 12-01-2010 at 10:23 AM.
 
Old 12-01-2010, 11:08 AM   #11
johannes121
LQ Newbie
 
Registered: Feb 2009
Distribution: Debian
Posts: 1

Rep: Reputation: 6
Quote:
Originally Posted by archtoad6
Code:
sed -n '<w>,<x>p' file1 > temp1
sed -n '<y>,<z>p' file2 > temp2
diff temp1 temp2  | less -S#33

If the size of the files does not make the process too long, you could diff the files 1st & use sed or grep to disregard the irrelevant. This would avoid the creation of temp files.
Or you could just do it as a one-liner (without temp files):

Code:
diff <(sed -n '<w>,<x>p' file1) <(sed -n '<y>,<z>p' file2)
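
For repeated use, the one-liner could be wrapped in a small function with the placeholders turned into arguments. This is only a sketch: the name `diff_ranges` and the argument order are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (name and argument order are assumptions):
#   diff_ranges FILE1 W X FILE2 Y Z
# compares lines W..X of FILE1 with lines Y..Z of FILE2.
diff_ranges() {
    if [ "$#" -ne 6 ]; then
        echo "usage: diff_ranges FILE1 W X FILE2 Y Z" >&2
        return 2
    fi
    local file1=$1 w=$2 x=$3 file2=$4 y=$5 z=$6
    local n
    # Reject anything that is not a plain number before handing it to sed.
    for n in "$w" "$x" "$y" "$z"; do
        case $n in
            ''|*[!0-9]*) echo "diff_ranges: not a line number: $n" >&2; return 2 ;;
        esac
    done
    # Process substitution <(...) is a bash feature; plain sh does not have it.
    diff <(sed -n "${w},${x}p" "$file1") <(sed -n "${y},${z}p" "$file2")
}

# Example call (placeholders): lines 5-7 of file1 against lines 12-14 of file2
#   diff_ranges file1 5 7 file2 12 14
```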
 