LinuxQuestions.org - file compare and result output


arn2025

10-12-2012 09:56 AM

Hello,

I have two files, 100kactivesept.csv and 100kactivevms.csv. Both files are really long. I want to output the values of field 1 in file 1 that are not in field 2 of file 2, and vice versa: the values of field 2 in file 2 that are not in field 1 of file 1. I hope I am making sense.

Quote:

$ head 100kactivesept.csv
012071967934,100000
000029600012,100000
012071963923,100000
012071940886,100000
000029614463,100000
374745625,100000
374748422,100000
012071956396,100000
000029613510,100000
012071967627,100000

Quote:

/cygdrive/e/voucher/September
$ head 100kactivevms.csv
AT100K|247131123|
AT100K|247130535|
AT100K|247130418|
AT100K|247130988|
AT100K|247130550|
AT100K|247130997|
AT100K|247130157|
AT100K|247131072|
AT100K|247131081|
AT100K|247130601|

shivaa

10-12-2012 10:43 AM

As far as I understand your requirement, if you want only the 2nd field of the first file and the 1st field of the second file, then do the following.
1st file:
% awk -F"," '{print $2}' <filename> > /tmp/firstfile.txt
and for the 2nd file:
% awk -F"|" '{print $1}' <filename> > /tmp/secondfile.txt
Then you'll get your desired output in /tmp/firstfile.txt and /tmp/secondfile.txt.
If you also want to merge the output of /tmp/firstfile.txt and /tmp/secondfile.txt side by side in columns, use the following command:
% paste /tmp/firstfile.txt /tmp/secondfile.txt

Hope this helps; if not, please explain your question a little more.

arn2025

10-12-2012 11:34 AM

This does not give the desired output. I would like to get the field 1 values that are not in field 2 of file 1, sort of like a VLOOKUP.

schneidz

10-12-2012 11:53 AM

^ what have you tried so far? i think awk, grep, sed, cut, ... would be useful in this exercise.

i would change one of the files so that the fields match and then run grep -v -f against it (or grep -v in a while loop). note that the pattern file has to come right after -f.
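
A minimal sketch of that grep idea, assuming the key is field 1 of the comma-separated file and field 2 of the pipe-separated file. The file names and values below are made-up sample data, not the poster's real files:

```shell
# Hypothetical sample data in a scratch directory (not the real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# File 1: comma-separated, key assumed to be field 1.
printf '%s\n' '111,100000' '222,100000' '333,100000' > sept.csv
# File 2: pipe-separated, key assumed to be field 2.
printf '%s\n' 'AT100K|222|' 'AT100K|333|' 'AT100K|444|' > vms.csv

# Reduce each file to its bare key column so the fields match.
cut -d, -f1 sept.csv > sept.keys
cut -d'|' -f2 vms.csv > vms.keys

# -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file.
grep -F -x -v -f vms.keys sept.keys > only_in_sept.txt
grep -F -x -v -f sept.keys vms.keys > only_in_vms.txt

cat only_in_sept.txt   # keys present only in file 1
cat only_in_vms.txt    # keys present only in file 2
```

With -f there is no need for a while loop, since grep reads every pattern from the file in one pass. The comparison is purely textual, so a key with leading zeros and the same number without them are treated as different values.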

shivaa

10-12-2012 01:00 PM

Quote:

Originally Posted by arn2025 (Post 4804050)

This does not give the desired output. I would like to get the field 1 values that are not in field 2 of file 1, sort of like a VLOOKUP.

To be honest, it is not clear from your question what exactly you want. Please explain with an example and elaborate.

theNbomr

10-12-2012 04:12 PM

Yes, please give examples and a written explanation of how some parts of each file match. In the sample you provided, I can see almost no matching strings. For the sake of optimization, it would be useful to know a few more things:

are the records in each file ordered/sorted in any way?
are the records in each file unique (can a record in either be duplicated)?
is the order of the output records significant?
is there any positional correspondence between the records? i.e. does record #1 correspond to record #1 in each file?
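
The first two questions can be checked from the command line. A rough sketch, assuming the key is field 1 of a comma-separated file; the file name and values are made-up sample data:

```shell
# Hypothetical sample data in a scratch directory (not the real file).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '333,100000' '111,100000' '333,100000' > sept.csv

# Extract the key column (assumed to be field 1).
cut -d, -f1 sept.csv > sept.keys

# sort -c only checks the order; it exits non-zero if the input is not sorted.
if sort -c sept.keys 2>/dev/null; then sorted=yes; else sorted=no; fi

# uniq -d prints one copy of each repeated line; it needs sorted input.
dups=$(sort sept.keys | uniq -d)

echo "sorted: $sorted"
echo "duplicate keys: $dups"
```

For this sample the keys are neither sorted nor unique, so the script reports "no" and lists 333 as a duplicate.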

That last one points to the possibility of some very lengthy searches. It may be necessary to scan each record of each file for every record of the other file, or even twice if you need to do the reverse comparison.
Your second sample file doesn't really look like a CSV formatted file. Are we to assume what the field delimiters are?
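
If the order of the output does not matter, one way to avoid scanning every record against every other record is to sort both key lists once and let comm walk them side by side. This is only a sketch under assumptions: `|` is taken to be the delimiter of the second file, the keys are field 1 and field 2 respectively, and the file names and values are made-up sample data:

```shell
# Hypothetical sample data in a scratch directory (not the real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '333,100000' '111,100000' '222,100000' > sept.csv
printf '%s\n' 'AT100K|222|' 'AT100K|444|' 'AT100K|333|' > vms.csv

# Use one fixed collation so sort and comm agree on the ordering.
export LC_ALL=C

# Extract the key columns, then sort and drop duplicates.
cut -d, -f1 sept.csv | sort -u > sept.sorted
cut -d'|' -f2 vms.csv | sort -u > vms.sorted

# comm columns: 1 = only in first file, 2 = only in second, 3 = in both.
comm -23 sept.sorted vms.sorted > only_in_sept.txt   # suppress columns 2 and 3
comm -13 sept.sorted vms.sorted > only_in_vms.txt    # suppress columns 1 and 3
comm -12 sept.sorted vms.sorted > in_both.txt        # suppress columns 1 and 2

cat only_in_sept.txt only_in_vms.txt
```

Both directions of the comparison come out of the same two sorted files, so the reverse lookup costs nothing extra beyond a second comm call.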

--- rod.

All times are GMT -5.