LinuxQuestions.org
02-13-2012, 05:24 AM   #1
p3rcy
LQ Newbie
 
Registered: Feb 2012
Posts: 4

Rep: Reputation: Disabled
Comparing two files


Hi,
I am new to Linux and C++.
I want to compare two files and write the values from the first file that are not present in the second file to a third file. E.g.:

File1 File2 File3
1 1 2
2 3 4
3 5
4 7
5 9
11

As we can see, the values from File1 that are not in File2, i.e. 2 and 4, have been printed to File3. I know we have to use arrays. Please help me.
 
02-13-2012, 05:27 AM   #2
millgates
Member

Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
Hi,
1) Is this homework?
2) If not, is there a reason why you want to do this in C++?
3) Can you show us what you have so far?
4) Are those files sorted?

Last edited by millgates; 02-13-2012 at 05:30 AM. Reason: typo
 
02-13-2012, 05:46 AM   #3
p3rcy
LQ Newbie
 
Registered: Feb 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
No, this is not homework; it is for work at the office. I can do the sorting myself. But I am out of touch with programming and basically want to read from a file into an array. If someone can help me with the complete program, I'd be very grateful. It can also be in Java, not necessarily C++.

But I'm completely out of touch with Java. I can only remember how to compile now!
 
02-13-2012, 06:21 AM   #4
millgates
Member

Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
Well if you're on unix, something like this could do the job (if each file contains each number only once):

Code:
 cat "file1" "file2" "file2"|sort -n | uniq -c | awk '{if($1 == 1) print $2}' > file3
If you insist on using C++, loading a list of ints into memory could look like this.

Code:
#include <iostream>
#include <fstream>
#include <list>

int main(){
	std::ifstream file1("file1", std::ios::in);
	if (!file1) { return 1; }
	std::list < int > list1;
	int temp;
	while ((file1 >> temp)) {
	       	list1.push_back(temp); 
		std::cout << temp << std::endl;
	}

	// do something with list
	file1.close();
	return 0;
}
You may want to take care of some details, such as handling files that contain characters other than digits (the loop above simply stops reading at the first token that is not a number), but this could be a start.

Last edited by millgates; 02-13-2012 at 06:24 AM.
 
02-13-2012, 06:22 AM   #5
Weapon S
Member
 
Registered: May 2011
Location: Netherlands
Distribution: Debian, Archlinux
Posts: 262
Blog Entries: 2

Rep: Reputation: 49
Ehm... read the manual

[edit]
:-[ I've been beaten and outclassed by the previous post.

Diff is the tool you want.
Code:
man diff
I thought you could make it show only the lines that were missing from one file, but maybe you need to 'pipe' it through grep. (grep can get pretty complicated, but it's useful to know you don't need to construct a 'regular expression'. Just giving the string you are looking for as an argument to grep qualifies... in most cases.)
Quote:
I know we have to use arrays.
That does sound an awful lot like homework...

Last edited by Weapon S; 02-13-2012 at 06:24 AM.
 
02-13-2012, 06:32 AM   #6
millgates
Member

Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
Quote:
Originally Posted by Weapon S
[edit]
Diff is the tool you want.
Code:
man diff
I thought you could make it show only the lines that were missing from one file, but maybe you need to 'pipe' it through grep.
For the diff version, perhaps something like this might work:

Code:
diff file1 file2| awk '{if ($1 == "<") print $2 }'

Last edited by millgates; 02-13-2012 at 06:33 AM.
 
02-13-2012, 06:36 AM   #7
millgates
Member

Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
Quote:
Originally Posted by p3rcy
As we can see, values from File1 i.e 2 and 4 are not in file2 and have been printed to file3.
Btw, shouldn't 11 also be in file3?
 
02-13-2012, 09:03 AM   #8
p3rcy
LQ Newbie
 
Registered: Feb 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
@millgates :
No, 11 was in file2. Improper formatting in my post, I guess.
I tried diff and the code you posted, but it seems to show all the unique entries from both files.

@Weapon S : Thank you for dedicating your time as well.

Thank you to both of you. But I created my own code somehow in haste and I haven't checked the redundancy of LOCs. But it worked for now.

Code:
#include <stdio.h>
#include <iostream>
#define NOT_FOUND -1
#define MAX 1100

using namespace std;

int search( const int arr[], int target, int n );
void showAry( int arr[], int n );


int main()
{
    int x[MAX],y[MAX];
    int sizex=0,sizey=0;
    int i;

    FILE* fin;
    fin=fopen("data.txt", "r");
    if(fin==NULL)
    {
        printf("Error opening file ... Press 'Enter' to exit ... ");
        getchar();
        return -1;
    }
    /* sizex holds the number of elements once the loop stops: at EOF,
       at the first non-numeric token, or when the array is full */
    for (sizex = 0; sizex < MAX && fscanf(fin, "%d" , &x[sizex]) == 1 ; ++sizex) {;}
    fclose (fin);


    fin=fopen("data2.txt", "r");
    if(fin==NULL)
    {
        printf("Error opening file ... Press 'Enter' to exit ... ");
        getchar();
        return -1;
    }

    /* same for the second file; the count ends up in sizey */
    for (sizey = 0; sizey < MAX && fscanf(fin, "%d" , &y[sizey]) == 1 ; ++sizey) {;}
    fclose (fin);

    cout<<endl;

    /* print each value of data.txt that does not occur in data2.txt */
    for(i=0;i<sizex;i++)
      {
       if(search(y, x[i], sizey)==NOT_FOUND) cout<<x[i]<<endl;
      }

    cout<<endl;
    return(0);
}

/* debugging helper: prints the first n elements of arr */
void showAry( int arr[], int n )
{
    int i;
    for( i=0; i<n; ++i )
        printf( "%d ", arr[i] );
}

/* linear search: returns the index of target in arr, or NOT_FOUND */
int search( const int arr[], int target, int n )
{
    int i;
    for( i=0; i<n; ++i )
        if( target==arr[i] ) return i;

    return NOT_FOUND;
}
Usage: When executed, the program compares the contents of data.txt with data2.txt and prints the values from data.txt that are not in data2.txt to the screen.
 
02-13-2012, 09:21 AM   #9
millgates
Member

Registered: Feb 2009
Location: 192.168.x.x
Distribution: Slackware
Posts: 852

Rep: Reputation: 389
Quote:
Originally Posted by p3rcy
@millgates :
I tried diff and the code you put, but it seems to show all the unique entries from both the files.
Are you sure you copied those examples correctly?

Quote:
Originally Posted by p3rcy
But I created my own code somehow in haste and I haven't checked the redundancy of LOCs. But it worked for now
This looks more like C than C++, actually. Or something in between.

Code:
printf("Error opening file ... Press 'Enter' to exit ... ");
cout<<endl;
You really shouldn't mix these two together.
You also don't have to store both files in memory. One is enough
 
02-13-2012, 10:38 AM   #10
p3rcy
LQ Newbie
 
Registered: Feb 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
Yes, I copied the examples correctly. Actually, the files I wanted to compare had more than 1000 entries, and that's why the whole chaos.

Quote:
Originally Posted by millgates
You really shouldn't mix these two together.
You also don't have to store both files in memory. One is enough
Yes, I know. But like I said, I'm really out of touch and developed the whole program in haste overlooking even the basic optimisations.

One's programming skills can really come in handy in life!

I'll research more with 'diff'. Thanks.
 
02-14-2012, 12:27 PM   #11
Reuti
Senior Member

Registered: Dec 2004
Location: Marburg, Germany
Distribution: openSUSE 15.2
Posts: 1,339

Rep: Reputation: 260
Quote:
Originally Posted by p3rcy
I'll research more with 'diff'.
Besides diff, there is join (part of the text utilities), which can print unpairable lines, giving you the entries unique to the first file as desired:
Code:
$ join -v1 file1 file2
or, if the files are unsorted (the process substitution below needs bash or a similar shell):
Code:
$ join -v1 <(sort file1) <(sort file2)
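For anyone who still wants this in C++, the sort-then-compare idea can be sketched with `std::set_difference` from the standard library. The function name `only_in_first` is made up here. Unlike join, which compares lines as text, this sketch compares the values numerically.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the values of `a` that are not in `b`,
// in ascending order. Both vectors are taken by value so that they can
// be sorted locally without touching the caller's data.
// If a value is repeated, set_difference keeps the surplus copies of `a`.
std::vector<int> only_in_first(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<int> result;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(result));
    return result;
}
```

Called with the values from the first post, `only_in_first({1, 2, 3, 4, 5}, {1, 3, 5, 7, 9, 11})` yields 2 and 4.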
 
  

