LinuxQuestions.org
Old 10-07-2015, 11:21 AM   #1
jeriryan
Member
 
Registered: Apr 2003
Location: United States
Distribution: RHEL 5.4, Snow Leopard
Posts: 87

Rep: Reputation: 15
find: argument list too long


Hi all,
I've seen similar questions regarding argument list too long, but nothing exactly like my problem. My starting point is a large list of 'interesting' files, say in INTERESTINGFILES.txt. I need to search through this list a bunch of times for stuff like specific permissions, or certain extensions, and other stuff. On most systems it isn't a problem to do:

Code:
INTFILES=`cat INTERESTINGFILES.txt`
PERMS=`find $INTFILES -perm -4000`
EXT=`find $INTFILES -name "*.ext"`
but on some systems where INTERESTINGFILES.txt is very large, the find commands fail with "argument list too long" because $INTFILES is too long. My initial fix for this is to wrap it in a loop that runs through INTERESTINGFILES.txt and does a find on each file:

Code:
for item in $INTFILES
do
find $item -perm -4000 >> PERMS
done
and just work with the output of the PERMS file, but it's much slower using the loop than just doing it all in the find command (when it works). Another way I suppose would be to take INTERESTINGFILES.txt, do an ls on each file, stick that list in a new file, and do a grep whenever I need something, but that is definitely not as versatile as the find command (at least as far as checking permissions). Any suggestions other than the slow loop wrapper?
 
Old 10-07-2015, 11:32 AM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 3,774
Blog Entries: 1

Rep: Reputation: 1339
What does an entry in the interestingfiles.txt file look like? And where are you getting them from? Because, as I see it, in an architectural way, whatever is making the interestingfiles.txt should be using a script/code that checks each file as it is being added.

Last edited by szboardstretcher; 10-07-2015 at 11:34 AM.
 
Old 10-07-2015, 11:41 AM   #3
jeriryan
Member
 
Registered: Apr 2003
Location: United States
Distribution: RHEL 5.4, Snow Leopard
Posts: 87

Original Poster
Rep: Reputation: 15
So I think someone wanted to save time later when looking for certain files by only doing one global find. interestingfiles.txt is a big global find that looks like:

find / -perm -4000 -o -name "*.txt" -o -group abc -o -user bob..... >>INTERESTINGFILES.txt

So instead of later doing a global find for each type of item, he can just search for it in interestingfiles.txt, which is (usually) a much smaller subset and thus faster.

It's possible I could edit this original find command, but could you suggest a way that this original global find could be modified to sort files as they are found on the fly?
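One possible way to sort on the fly, as a rough sketch only: GNU find has an -fprint action and a comma operator, so a single pass can write each kind of match to its own file. This assumes GNU findutils. The demo directory and the output names PERMS.txt and EXT.txt are made up for illustration, and only two of the conditions from the global find are shown.

```shell
# Hypothetical demo tree; on the real system the start point would be /.
demo=$(mktemp -d)
mkdir "$demo/tree"
touch "$demo/tree/notes.txt" "$demo/tree/suid_tool"
chmod 4755 "$demo/tree/suid_tool"

# Each parenthesised group tests one condition and writes its matches
# to its own file. The comma operator (GNU find) evaluates both groups
# for every file, so one traversal fills both lists.
find "$demo/tree" \
    \( -perm -4000 -fprint "$demo/PERMS.txt" \) , \
    \( -name '*.txt' -fprint "$demo/EXT.txt" \)

perms=$(cat "$demo/PERMS.txt")
ext=$(cat "$demo/EXT.txt")
echo "setuid: $perms"
echo ".txt:   $ext"
rm -rf "$demo"
```

A file that matches more than one condition lands in more than one list, which is probably what you want here.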
 
Old 10-07-2015, 11:46 AM   #4
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 3,774
Blog Entries: 1

Rep: Reputation: 1339
So if that is the command you are using to create the file, then every file that is in it will fit the find description in your original question:

Code:
-perm -4000
-name "*.txt"
Because "find / -perm -4000 -o -name "*.txt" -o -group abc -o -user bob..... >>INTERESTINGFILES.txt" will only add files with permission 4000 and extension .txt to the INTERESTINGFILES.txt list.

Thought of another way: what is the difference between this:

Code:
find / -perm -4000 -o -name "*.txt" -o -group abc -o -user bob >> INTERESTINGFILES.txt
INTFILES=`cat INTERESTINGFILES.txt`
PERMS=`find $INTFILES -perm -4000`
EXT=`find $INTFILES -name "*.txt"`
and this?

Code:
find / -perm -4000 -o -name "*.txt" -o -group abc -o -user bob >> INTERESTINGFILES.txt
INTFILES=`cat INTERESTINGFILES.txt`
PERMS=`cat INTERESTINGFILES.txt`
EXT=`cat INTERESTINGFILES.txt`
I would hope I am misunderstanding your question. If I am, please provide a fake line from the INTERESTINGFILES.txt file, and explain what you want to do with it.

Last edited by szboardstretcher; 10-07-2015 at 11:48 AM.
 
Old 10-07-2015, 12:03 PM   #5
jeriryan
Member
 
Registered: Apr 2003
Location: United States
Distribution: RHEL 5.4, Snow Leopard
Posts: 87

Original Poster
Rep: Reputation: 15
The find command has -o's, not -a's, so the files in INTERESTINGFILES.txt will match one of the requirements of the find, but not all of them. A line from INTERESTINGFILES.txt will just have a file name, the output of the global find command, matching one of the many requirements.

I now need to process this list of files and determine which of them have perm 4000, or are owned by a specific user, etc.. and put them in their own variables. The best way I know to do that is with another find command, this time searching through my smaller list instead of the whole file system, e.g.:
PERMS=`find $INTFILES -perm -4000`
BOB=`find $INTFILES -user bob`
but sometimes $INTFILES is too large, and I get "argument list too long" with the find.

In other words, from the start someone was trying to save time by pre-gathering all the interesting files we would need to examine later. If he hadn't done this, we would just be doing:
PERMS=`find / -perm -4000`
BOB=`find / -user bob`
EXT=`find / -name "*.txt*"`
but this is a wasteful number of global finds and on this hardware, takes a very long time for each of those to complete.
 
Old 10-07-2015, 12:40 PM   #6
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,600

Rep: Reputation: 1241
Very likely this would be faster using Perl or Python.

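The parameter list to a process is limited to 131072 bytes (ARG_MAX in /usr/include/linux/limits.h) on older kernels such as the one in RHEL 5. Kernels from 2.6.23 onward allow a quarter of the stack size limit instead, and getconf ARG_MAX reports the value in effect. The limit covers the environment as well as the arguments. So depending on the file name lengths, 131072 bytes could hold as few as 100 names, or as many as around 10,000.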
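A rough way to see whether a given list can fit is to compare its size with what getconf ARG_MAX reports. This is only an estimate, since the environment and per-argument overhead also count against the limit. The three paths below are made-up demo entries standing in for INTERESTINGFILES.txt.

```shell
# Hypothetical demo list; substitute the real INTERESTINGFILES.txt.
list=$(mktemp)
printf '%s\n' /etc/passwd /etc/hostname "/tmp/name with spaces" > "$list"

# Limit the running system reports for arguments plus environment, in bytes.
arg_max=$(getconf ARG_MAX)

# Bytes in the list; each newline stands in for the NUL that ends an argument.
list_bytes=$(wc -c < "$list")

echo "ARG_MAX=$arg_max list_bytes=$list_bytes"
if [ "$list_bytes" -ge "$arg_max" ]; then
    echo "list cannot fit on one command line"
else
    echo "list may fit, depending on environment size"
fi
rm -f "$list"
```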

The reason Perl or Python would be faster is that there are no internal limits to the size of the array (and no reason to use an array anyway). A simple loop in either language can do a stat of the file, and check whatever else you want - without the fork/exec overhead of doing the equivalent using find.
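This is not the Perl or Python version, but the same idea can be sketched in plain shell, staying with the language already used in the thread. The test [ -u ] and case are shell builtins, so checking the setuid bit or an extension forks nothing per file. The demo files and the PERMS and EXT output names are assumptions for illustration. A check like -user bob has no builtin equivalent and would still need stat or a real scripting language.

```shell
# Hypothetical demo data; in practice "$list" would be INTERESTINGFILES.txt.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/set uid"
chmod 4755 "$demo/set uid"
list="$demo/list"
printf '%s\n' "$demo/plain.txt" "$demo/set uid" > "$list"

# One pass over the list. read -r with an empty IFS keeps whitespace
# and backslashes in each line intact.
while IFS= read -r f; do
    # [ -u ] is true when the set-user-ID bit is set (like -perm -4000).
    if [ -u "$f" ]; then
        printf '%s\n' "$f" >> "$demo/PERMS"
    fi
    # Suffix match on the path, comparable to -name "*.txt".
    case $f in
        *.txt) printf '%s\n' "$f" >> "$demo/EXT" ;;
    esac
done < "$list"

perms=$(cat "$demo/PERMS")
ext=$(cat "$demo/EXT")
echo "setuid: $perms"
echo ".txt:   $ext"
rm -rf "$demo"
```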

Last edited by jpollard; 10-07-2015 at 12:41 PM.
 
1 members found this post helpful.
Old 10-07-2015, 12:53 PM   #7
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 3,774
Blog Entries: 1

Rep: Reputation: 1339
Agreed. IMO This would be MUCH MUCH better as a python script, php script, or a C program (if you swing that way) than a shell script.
 
Old 10-07-2015, 01:50 PM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,243

Rep: Reputation: 2684
I was wondering if we could back up just a little to the original information provided.

1. Error being received :- find: argument list too long

2. Information from OP :- I've seen similar questions regarding argument list too long, but nothing exactly like my problem.

3. Further information provided :- on some systems where INTERESTINGFILES.txt is very large


Now I am curious, exactly what searching did you do that didn't cover passing a very large amount of data to find (essentially the exact error message) to get said error message??

I can clearly see the reason for asking for an alternative to the loop you are using, which by the way should not be a for loop unless you can guarantee no whitespace in any path / file name, but the original premise seems highly unlikely.
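For reference, a sketch of one common alternative to the loop: let xargs split the list into batches that fit under the limit and run one find per batch. It assumes one path per line with no newlines inside names, and an xargs that supports -0 and a find that supports -maxdepth (both GNU and BSD do). The -maxdepth 0 is my assumption that only the listed paths should be tested, not the contents of any listed directories. The demo files are made up.

```shell
# Hypothetical demo data; in practice the list would be INTERESTINGFILES.txt.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/set uid"
chmod 4755 "$demo/set uid"
printf '%s\n' "$demo/plain.txt" "$demo/set uid" > "$demo/list"

# tr turns each newline into NUL so xargs -0 treats every line as exactly
# one argument, spaces included. xargs then runs as few sh/find pairs as
# the argument limit allows; each batch arrives in the inner shell as "$@".
perms=$(tr '\n' '\0' < "$demo/list" |
    xargs -0 sh -c 'find "$@" -maxdepth 0 -perm -4000' sh)

echo "$perms"
rm -rf "$demo"
```

The trailing sh fills $0 of the inner shell so that every path ends up in "$@".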
 
  


Tags
bash, scripting


All times are GMT -5.