Linux - Newbie: This Linux forum is for members that are new to Linux.
I've seen similar questions about "argument list too long", but nothing exactly like my problem. My starting point is a large list of 'interesting' files, say in INTERESTINGFILES.txt. I need to search through this list repeatedly for things like specific permissions or certain extensions. On most systems it isn't a problem to do:
PERMS=$(find $INTFILES -perm -4000)
but on some systems where INTERESTINGFILES.txt is very large, the find commands fail with "argument list too long" because $INTFILES is too long. My initial fix is to wrap it in a loop that runs through INTERESTINGFILES.txt and does a find on each file:
for item in $INTFILES; do
    find "$item" -perm -4000 >> PERMS
done
and then work with the PERMS file, but the loop is much slower than doing it all in one find command (when that works). Another option, I suppose, would be to take INTERESTINGFILES.txt, run ls on each file, put that listing in a new file, and grep it whenever I need something, but that is definitely not as versatile as find (at least as far as checking permissions). Any suggestions other than the slow loop wrapper?
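One middle ground, sketched here as a suggestion rather than anything from the thread: let xargs batch the list, so find is forked once per batch instead of once per file. The demo files below are hypothetical stand-ins for the real INTERESTINGFILES.txt.

```shell
# Demo stand-in for the real list (hypothetical files):
cd "$(mktemp -d)"
touch demo_a demo_b
chmod u+s demo_a
printf '%s\n' demo_a demo_b > INTERESTINGFILES.txt

# xargs splits the list into batches that fit the kernel's argument-size
# limit, so find runs once per batch rather than once per file:
xargs -r -a INTERESTINGFILES.txt -d '\n' sh -c 'find "$@" -perm -4000' sh > PERMS
cat PERMS    # demo_a
```

The -a, -d, and -r options are GNU xargs; the sh -c trick is what lets the batched paths land before find's expression instead of after it.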
What does an entry in the INTERESTINGFILES.txt file look like? And where are you getting them from? Because, as I see it, architecturally, whatever is generating INTERESTINGFILES.txt should be using a script/code that checks each file as it is being added.
Last edited by szboardstretcher; 10-07-2015 at 11:34 AM.
The global find command uses -o's, not -a's, so each file in INTERESTINGFILES.txt matches at least one of the requirements of the find, but not all of them. A line of INTERESTINGFILES.txt is just a file name from the output of that global find, matching one of its many requirements.
I now need to process this list of files and determine which of them have permission 4000, which are owned by a specific user, etc., and put each group in its own variable. The best way I know to do that is with another find command, this time searching through my smaller list instead of the whole file system, e.g.:
PERMS=$(find $INTFILES -perm -4000)
BOB=$(find $INTFILES -user bob)
but sometimes $INTFILES is too large, and I get "argument list too long" with the find.
In other words, from the start someone was trying to save time by pre-gathering all the interesting files we would need to examine later. If he hadn't done this, we would just be doing:
PERMS=$(find / -perm -4000)
BOB=$(find / -user bob)
EXT=$(find / -name "*.txt*")
but that is a wasteful number of global finds, and on this hardware each of them takes a very long time to complete.
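One alternative not raised in the thread, sketched under the assumption of GNU find: a single traversal can feed all three lists at once, using the ',' operator and -fprint. The demo tree and the current user stand in for / and bob so the sketch runs anywhere.

```shell
# Demo tree standing in for / (hypothetical files):
cd "$(mktemp -d)"
mkdir tree
touch tree/note.txt tree/plain
chmod u+s tree/plain

# One traversal, three output lists: ',' evaluates both sides of each pair,
# and -fprint writes matching paths to the named file.
find tree \( -perm -4000 -fprint PERMS \) , \
          \( -user "$(id -un)" -fprint BOB \) , \
          \( -name '*.txt*' -fprint EXT \)
cat PERMS    # tree/plain
```

This pays the cost of walking the tree only once, which is exactly the waste the post above is complaining about.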
Very likely this would be faster using Perl or Python.
The parameter list to a process is limited to 131072 bytes (/usr/include/linux/limits.h). So depending on the file name lengths, that could be as few as 100, or as many as around 10,000.
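The limit quoted above can also be checked at run time; getconf reports the value the running kernel actually enforces (on newer kernels it is derived from the stack limit, so it may be larger than the 131072 in limits.h):

```shell
# ARG_MAX is the byte limit on argv plus environment for a new process:
getconf ARG_MAX
```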
The reason Perl or Python would be faster is that neither has an internal limit on the size of an array (and there is no reason to use an array anyway). A simple loop in either language can stat each file and check whatever else you want, without the fork/exec overhead of running find on every entry.
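For the setuid check at least, the same fork-free idea works without leaving the shell: '[ -u FILE ]' is a builtin test, so a single shell process can stat every file in the list. A sketch, with demo files standing in for the real list:

```shell
# Demo stand-in for the real list (hypothetical files):
cd "$(mktemp -d)"
touch f1 f2
chmod u+s f1
printf '%s\n' f1 f2 > INTERESTINGFILES.txt

# '[ -u FILE ]' is a shell builtin testing the setuid bit, so this loop
# checks every file without forking a find per entry:
while IFS= read -r f; do
    [ -u "$f" ] && printf '%s\n' "$f"
done < INTERESTINGFILES.txt > PERMS
cat PERMS    # f1
```

Ownership or extension tests would still need an external stat or a Perl/Python loop, which is where those languages pull ahead.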
I was wondering if we could back up just a little to the original information provided.
1. Error being received :- find: argument list too long
2. Information from OP :- I've seen similar questions regarding argument list too long, but nothing exactly like my problem.
3. Further information provided :- on some systems where INTERESTINGFILES.txt is very large
Now I am curious: exactly what searching did you do that didn't cover passing a very large amount of data to find (essentially the exact error message) as the cause of said error message?
I can clearly see the reason for asking for an alternative to the loop you are using (which, by the way, should not be a for loop unless you can guarantee no whitespace in any path or file name), but the original premise seems highly unlikely.
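For what it's worth, the whitespace-safe shape of that loop reads the list line by line instead of word-splitting a variable. A sketch, with demo files standing in for the real list:

```shell
# Demo list containing a name with a space (hypothetical files):
cd "$(mktemp -d)"
touch "a b" c
chmod u+s "a b"
printf '%s\n' "a b" c > INTERESTINGFILES.txt

# Reading line by line keeps 'a b' as one path; a for loop over $INTFILES
# would split it into 'a' and 'b':
while IFS= read -r item; do
    find "$item" -perm -4000
done < INTERESTINGFILES.txt > PERMS
cat PERMS    # a b
```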