Old 11-02-2016, 09:01 PM   #1
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Rep: Reputation: Disabled
cut and grep commands not finding rows


Hello,

I would like to remove rows found in the file found_160k.txt from the file 160k-1.txt.

Is there a limit on the number of lines these commands can handle? I ask because the number of rows found is 0:

root@SAMSUNG:~# cut -d: -f1 found_160k.txt | grep -vf- 160k-1.txt | wc -l
0

found_160k.txt (3,000 rows) contains lines like:

00373f5500d74281d926ed11d84b1168:amigo':123456789

160k-1.txt (160,000 rows) contains lines like:
00373f5500d74281d926ed11d84b1168:amigo'


Thank you in advance.
 
Old 11-02-2016, 09:28 PM   #2
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,524

Rep: Reputation: 1015
The grep command makes no sense. You're selecting an inverted match of nothing. Are you trying to find the number of lines that were not cut? Grep doesn't know what "-f1" in the cut command means. The dash after f in the grep command should not be there. I'm not sure what you're attempting to do, but omitting the grep command would give you the number of lines in the file.
 
1 member found this post helpful.
Old 11-02-2016, 09:47 PM   #3
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Original Poster
Rep: Reputation: Disabled
Thank you very much.
I would just like to remove the rows found in the file "found_160k.txt" from the file "160k-1.txt".
 
Old 11-02-2016, 09:53 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,037

Rep: Reputation: 3203
Have you tried with a smaller sample set to see why your command is not working? Most commands do have some type of limitation; however, if you were to hit it you would normally get an error message.
A first simple test to see whether it is a limit issue would be to make a copy of the 'found' file and add a single entry that should get returned.

I would add that this is often a case where you could use a single tool like awk instead of two commands, which might have issues.
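
A minimal awk sketch of that single-tool idea (assuming the key to match on is the first two colon-separated fields, i.e. hash:user, and that the output file name remaining.txt is just an example):

Code:
# build a lookup table of keys from found_160k.txt, then print only the
# lines of 160k-1.txt whose hash:user key is not in that table
awk -F: 'NR==FNR { del[$1 FS $2]; next }
         { key = $1 FS $2; if (!(key in del)) print }' found_160k.txt 160k-1.txt > remaining.txt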
 
1 member found this post helpful.
Old 11-02-2016, 10:16 PM   #5
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Original Poster
Rep: Reputation: Disabled
I tried with a smaller sample set. It worked, so maybe it is a limitation.
Thanks.
 
Old 11-02-2016, 11:41 PM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,037

Rep: Reputation: 3203
Maybe try using Perl / Python / Ruby, as these may handle files of this size better than the commands being used.

Another option could also be to use xargs to send the data to grep.
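
One way to read that idea, sketched under the assumption that the original pipeline fails because grep cannot hold all 3,000 patterns at once: split the pattern list into chunks and remove them pass by pass (the chunk size of 500 and the names patterns.txt / remaining.txt are arbitrary examples):

Code:
# extract the hash patterns once, as in the original pipeline
cut -d: -f1 found_160k.txt > patterns.txt
# split them into files of 500 patterns each: chunk_aa, chunk_ab, ...
split -l 500 patterns.txt chunk_
# filter the big file repeatedly, one chunk of patterns at a time
cp 160k-1.txt remaining.txt
for f in chunk_*; do
    grep -v -F -f "$f" remaining.txt > tmp.txt   # -F: the hashes are literal strings
    mv tmp.txt remaining.txt
done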
 
1 member found this post helpful.
Old 11-03-2016, 03:05 AM   #7
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,752
Blog Entries: 4

Rep: Reputation: 3970
If the files are in the same order, you could use the utility "comm" to show which lines are unique to the second file. The different options, such as -1 and -3, can be combined.
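
A tiny illustration of what combining -1 and -3 does (a.txt and b.txt are placeholder names; comm expects both files sorted):

Code:
$ printf 'alpha\nbeta\n'        > a.txt   # the lines to remove
$ printf 'alpha\nbeta\ngamma\n' > b.txt   # the big list
$ comm -1 -3 a.txt b.txt                  # suppress lines only in a.txt (-1) and lines in both (-3)
gamma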
 
1 member found this post helpful.
Old 11-03-2016, 05:58 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 24,269

Rep: Reputation: 7965
Yes, cut | grep -v -f - file should work in general. If it works with a smaller set, you need to check the error codes returned. Probably it ran out of memory, or something "strange" happened.

But without the real data and a reproduction we cannot give you a correct answer (just a guess...)
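
A sketch of how those error codes could be checked in bash, plus one "strange" thing worth ruling out (only a guess at the cause): a single blank line in the pattern stream makes grep's empty pattern match every line, which with -v would leave a count of 0.

Code:
# rerun the original pipeline, then print the exit status of every stage (bash only)
cut -d: -f1 found_160k.txt | grep -v -f - 160k-1.txt | wc -l
echo "exit codes: ${PIPESTATUS[@]}"   # grep: 0 = lines found, 1 = none, 2 = error

# rule out empty patterns and stray carriage returns in the pattern file
grep -c '^$' found_160k.txt           # blank lines become empty patterns after cut
grep -c $'\r' found_160k.txt          # lines with CR (Windows line endings)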
 
1 member found this post helpful.
Old 11-03-2016, 07:39 AM   #9
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,574

Rep: Reputation: 2847
Given
Code:
bash-4.4$ cat 160-k1.txt 
00373f5500d74281d926ed11d84b1168:amigo'
00473f5500d74281d926ed11d84b1168:amigo'
00573f5500d74281d926ed11d84b1168:amigo'
bash-4.4$ cat found_160k.txt 
00373f5500d74281d926ed11d84b1168:amigo':123456789
00473f5500d74281d926ed11d84b1168:amigo':123456789
00773f5500d74281d926ed11d84b1168:amigo':123456789
then
Code:
bash-4.4$ join -t ":" -v1 160-k1.txt found_160k.txt
00573f5500d74281d926ed11d84b1168:amigo'
Note - from 'man join'
Quote:
Important: FILE1 and FILE2 must be sorted on the join fields.
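
Since the real files may not already be sorted on the hash field, a sketch of the full sequence for the original file names (the .sorted names are just examples):

Code:
# sort both files on the first colon-separated field, then keep only the
# lines of 160k-1.txt whose hash has no partner in found_160k.txt
sort -t: -k1,1 160k-1.txt     > 160k-1.sorted
sort -t: -k1,1 found_160k.txt > found_160k.sorted
join -t ":" -v1 160k-1.sorted found_160k.sorted > remaining.txt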
 
1 member found this post helpful.
Old 11-04-2016, 05:35 AM   #10
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,752
Blog Entries: 4

Rep: Reputation: 3970
freeroute, how did you solve it with the large files?
 
Old 11-04-2016, 07:32 AM   #11
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Turbocapitalist View Post
freeroute, how did you solve it with the large files?
Hi,

Thanks for your question.
This weekend I will try a solution.
Do you have a suggestion, maybe? Someone told me to try awk, so I will, although unfortunately I have never used awk.
 
Old 11-04-2016, 07:41 AM   #12
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,752
Blog Entries: 4

Rep: Reputation: 3970
My suggestion was with "comm". For example:

Code:
comm -1 -3 <(cut -d : -f 1-2 found.txt | sort) <(sort longlist.txt)
Though I don't know the internal workings well enough to say how to reduce the memory needed or predict which part might run out.
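
To actually produce the cleaned file rather than just print it, the same command can be redirected, and line counts give a quick sanity check (remaining.txt is an example name):

Code:
comm -1 -3 <(cut -d : -f 1-2 found.txt | sort) <(sort longlist.txt) > remaining.txt
wc -l longlist.txt found.txt remaining.txt   # remaining should shrink by the number of matched entries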
 
Old 11-04-2016, 07:47 AM   #13
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Original Poster
Rep: Reputation: Disabled
Thanks. It would be great if the "comm" command works. I will reply once I have tried it. (I am on a desktop PC now; this evening I can try on my laptop, which has only 1 GB of RAM.)
 
Old 11-04-2016, 05:33 PM   #14
freeroute
Member
 
Registered: Jul 2016
Location: Hungary
Distribution: Debian
Posts: 69

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Turbocapitalist View Post
My suggestion was with "comm". For example:

Code:
comm -1 -3 <(cut -d : -f 1-2 found.txt | sort) <(sort longlist.txt)
Though I don't know the internal workings well enough to say how to reduce the memory needed or predict which part might run out.
I read the manual and examples for the "comm" command. It is a very simple and very useful command. It works. Thank you very much for your help again.
 
  

