cut and grep commands found no rows
Hello,
I would like to remove the rows found in the file found_160k.txt from the file 160k-1.txt. Is there a line number limitation with this command (because the reported row count is "0")?
Code:
root@SAMSUNG:~# cut -d: -f1 found_160k.txt | grep -vf- 160k-1.txt | wc -l
0
found_160k.txt (contains 3000 rows), for example:
Code:
00373f5500d74281d926ed11d84b1168:amigo':123456789
160k-1.txt (contains 160 000 rows), for example:
Code:
00373f5500d74281d926ed11d84b1168:amigo'
Thank you in advance. |
The grep command makes no sense. You're selecting an inverted match of nothing. Are you trying to find the number of lines that were not cut? grep doesn't know what the "-f1" in the cut command means, and the dash after the "f" in the grep command should not be there. I'm not sure what you're attempting to do, but omitting the grep command would give you the number of lines in the file.
|
Thank you very much.
I just want to remove the rows found in the "found_160k.txt" file from the file "160k-1.txt". |
Have you tried with a smaller sample set to see why your command is not working? Most commands do have some type of limitation; however, if you were to hit it you would get an error message.
A first simple test to see whether it is a limit would be to make a copy of the 'found' file and add a single entry which should get returned. I would add that this is often a case where you could use a single tool like awk instead of two commands which might have issues :) |
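The single-tool awk approach mentioned above could look roughly like this. The file names follow the thread, but the sample data here is invented; the idiom reads the small file first (NR==FNR), remembers each first field, then prints only the lines of the big file whose first field was not remembered.

```shell
# Made-up sample data standing in for the real 3000 / 160000 row files.
printf '%s\n' 'aaa:amigo:123456789' 'bbb:karl:000' > found_160k.txt
printf '%s\n' 'aaa:amigo' 'ccc:eva' > 160k-1.txt

# Pass 1 (NR==FNR, i.e. still reading the first file): store field 1 as
# a key in the "seen" array, then skip to the next line.
# Pass 2: the default action prints any line whose field 1 is not a key.
awk -F: 'NR==FNR { seen[$1]; next } !($1 in seen)' found_160k.txt 160k-1.txt
```

This keeps the whole "found" key set in memory once, instead of re-scanning patterns per line the way grep -f does.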
I tried with a smaller sample set. It worked. So maybe it is a limitation. :(
Thanks. |
Maybe try using Perl / Python / Ruby, as these may handle files of this size better than the commands being used.
Another option could be to use xargs to feed the data to grep? |
If the files are in the same order, you could use the utility "comm" to show which lines are unique to the second file. The different options, such as -1 and -3, can be combined.
|
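A quick sketch of the comm idea with made-up keys. comm requires both inputs to be sorted; -1 and -3 suppress the first column (lines unique to file 1) and the third (lines common to both), leaving only the lines unique to the second file.

```shell
# Hypothetical key lists; comm needs both sorted.
printf '%s\n' aaa ccc | sort > keys_found.txt        # keys already "found"
printf '%s\n' aaa bbb ccc ddd | sort > all_keys.txt  # full key list

# -1 hides lines only in keys_found.txt, -3 hides lines in both files,
# so what remains is the lines only in all_keys.txt.
comm -1 -3 keys_found.txt all_keys.txt
```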
Yes, `cut ... | grep -v -f - file` should work in general. If it works with a smaller set, you need to check the error code returned. Probably out of memory, or something "strange" happened.
But without the real data and a reproduction we cannot give you a correct answer (just guesses...) |
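Two things worth checking under that theory: grep's exit status, and whether the patterns are being treated as regular expressions. With 3000 patterns, plain grep -f can be slow and memory-hungry, and data like the thread's `amigo'` lines may contain regex-significant characters; -F makes every pattern a fixed string. A sketch with invented data:

```shell
# Made-up stand-ins for the real files.
printf '%s\n' 'aaa:amigo:123456789' 'bbb:karl:000' > found_160k.txt
printf '%s\n' 'aaa:amigo' 'ccc:eva' > 160k-1.txt

# -F: treat each pattern as a fixed string, not a regex.
# -f -: read the pattern list from stdin (the cut output).
# -v: print lines that match no pattern.
cut -d: -f1 found_160k.txt | grep -vFf - 160k-1.txt
echo "grep exit status: $?"
```

An exit status of 2 (rather than 0 or 1) would indicate grep itself failed, e.g. from running out of memory.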
Given
Code:
bash-4.4$ cat 160-k1.txt
Code:
bash-4.4$ join -t ":" -v1 160-k1.txt found_160k.txt
|
freeroute, how did you solve it with the large files?
|
Thanks for your question. This weekend I will try a solution. Do you have a suggestion, maybe? Someone told me to try awk, so I will; unfortunately I have never used awk. |
My suggestion was with "comm". For example:
Code:
comm -1 -3 <(cut -d : -f 1-2 found.txt | sort) <(sort longlist.txt) |
Thanks. It would be great if the "comm" command works. I will reply once I have tried it. (I am on a desktop PC now; this evening I can try on the laptop, which has only 1 GB of RAM.)
|