awk Question? Search by line, using 2 files?
Hi guys,
I need some help searching two files, if possible. I need to make sure that the sequence of characters in each line of File1 is also contained somewhere in a line of File2. If a line in File1 does not appear in File2, I need to redirect the contents of that line to a new file. The file formats do not match, so a line from File1 may not appear at the leftmost position of a line in File2, i.e. it could be 20 characters in. Does anyone have any ideas? I was thinking awk, but I do not know how to achieve this.

File1 is in this format:
0000HHGJ47P49e71ULCKQN6BSO0
0017RKQEU5CSAe7PVH9PDDSMDQL
002D9856TGG6De4A5JEP9KA5RDR
002Q68CL8D6JCe0DHO6SJU13K2K
003OFNNNLSL8IeFN253UP0LFSK1

File2 is in this format:
[root@c001n01 IC]# head -10 Enumeration.reflections.0.F1.uniq
0000HHGJ47P49e71ULCKQN6BSO0~MD5_R_2ELR14GQH1SAI9J3HAU9B6UVN6~MD5~M2
0000Q4AJ7MEKGe5Q5BAEN71KMIV~MD5_R_832BFUCEFE9RD39MJKAPTUTJ1H~MD5~M2
0001CAMSNRL8ReC39IBSLOR2573~MD5_R_14AV9PVVOIGOEC6LAV8UJT77I2~MD5~M2
.
.
(Thousands of lines)
.
.
G4128H5E2040J65A40T16HQNF679QFHFP3EDS3Te3AHRHJH969EOP~GM_R_CRSM79KOGN5H7887OKVMQCQQIQ~MD5~M2
G4128H5E54N0CANSU6B8411JHUD4MK8AUNLKBFKe2O1OPBHHCK09T~GM_R_46GD5OOKCMALDF4FBHSP2TFBVS~MD5~M2
G4128HAGN3S0I4QKB67Q6CI55VBDSTCQ5OLH08Ne5TTVV4UBFKOV7~GM_R_6HBKI5A645S9JEI0V56LKJ3061~MD5~M2

Thanks,
Mick |
You might use awk, but I would use simple bash:
Code:
#!/bin/bash |
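For anyone reading along, here is a minimal sketch of how a loop like this could look. It is an illustration only, not necessarily the script posted above: the function name find_missing and its three arguments are made up for the example. It reads File1 line by line, asks grep whether that exact string occurs anywhere in File2, and appends the line to the output file if it does not.

```shell
#!/bin/bash
# find_missing is a hypothetical helper name, used only for this sketch.
# Usage: find_missing FILE1 FILE2 OUTFILE
find_missing() {
    local patterns=$1 haystack=$2 out=$3
    : > "$out"                      # start with an empty output file
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines: an empty pattern would match every line.
        [ -z "$line" ] && continue
        # -F  treat the line as a fixed string, not a regular expression
        # -q  print nothing, only set the exit status
        # --  end of options, in case a line starts with "-"
        if ! grep -qF -- "$line" "$haystack"; then
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$patterns"
}
```

It would be called as something like: find_missing File1 File2 missing.txt. Using -F matters here because the lines are literal IDs; without it, any regex metacharacter in a line would change what is matched.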
Quote:
This worked a treat, thanks. Is there any way I could alter the script slightly so that, once one instance of a line from File1 is found in File2, the script moves on to test the next line in File1 (rather than looking for a second instance)? This would make the search a lot faster, as File2 is huge and there is only ever going to be one instance of each line in File2 anyway. Thanks in advance! |
According to the grep man page, I think it's doing that already:
-q, --quiet, --silent
    Quiet; do not write anything to standard output. Exit immediately with zero status if any match is found, even if an error was detected. Also see the -s or --no-messages option. |
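Since -q already stops at the first match, the remaining cost is that File2 gets rescanned once for every line in File1, and all the way to the end for lines that are missing. If that is still too slow, one possible alternative is to read File2 only once. The sketch below is an assumption-laden illustration, not something tested on the real data: it assumes GNU grep, that no File1 line is a substring of another File1 line, and the function name find_missing_onepass is made up. It uses -f to load all of File1 as fixed-string patterns and -o to print just the matched parts, then reports the File1 lines that never showed up.

```shell
#!/bin/bash
# find_missing_onepass is a hypothetical helper name, used only for this sketch.
# Usage: find_missing_onepass FILE1 FILE2 OUTFILE
find_missing_onepass() {
    local patterns=$1 haystack=$2 out=$3 found
    found=$(mktemp)
    # Single pass over File2: -f reads every File1 line as a pattern,
    # -F makes them fixed strings, -o prints only the matched text.
    grep -oFf "$patterns" "$haystack" | sort -u > "$found"
    # Keep the File1 lines that are not in the "found" list.
    # -x matches whole lines only, -v inverts the selection.
    # grep exits with status 1 when nothing is selected; that is not an error here.
    grep -vxFf "$found" "$patterns" > "$out" || [ "$?" -eq 1 ]
    rm -f "$found"
}
```

The trade-off is memory: grep has to hold all of File1's lines as patterns at once, which is usually fine when File1 is much smaller than File2.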