LinuxQuestions.org

laggerific 09-18-2007 02:34 PM

"argument list too long" - why am I getting this with the find command
 
Hey all,

I'm running into this issue on OS X 10.3.5, but since these are basic Linux commands I thought I would check with you all to see if I am doing anything wrong.

I am trying to split up a huge directory based on the modified date of the files within. Due to the "Argument list too long" issue with mv and cp and such, I googled around and found that to get around this it was recommended to use find | xargs. Here was the command to just get files older than 30 days out of the directory:
Code:

find . -ctime +30 -print | xargs -J % mv % /destination/directory/
That worked fine in moving 10s of thousands of files...what I find odd, though, is what happens when I try to do the same thing on these relocated files to break them up further with the following command:
Code:

find . -name 200703* -print | xargs -J % mv % /destination/directory/
I get an "argument list too long" on the find command...yet the find command worked superbly on the larger set of data. Is there something obvious I am doing wrong?

Thanks.

jlliagre 09-18-2007 04:11 PM

It's the first command (find) that breaks: the shell expands the unquoted 200703* before find runs, and the result is larger than what your environment can handle.

Just use
Code:

find . -name "200703*" -print | ...
and that should work.
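
To see the difference the quotes make, here is a small sketch. The scratch directory and file names are invented for illustration; it only counts how many arguments the pattern turns into before any command runs.

```shell
# Scratch directory and file names are hypothetical.
demo=$(mktemp -d)
touch "$demo/20070301.log" "$demo/20070302.log" "$demo/20070401.log"

# Unquoted: the shell replaces the pattern with every matching name first.
set -- "$demo"/200703*
unquoted_count=$#
echo "unquoted pattern became $unquoted_count arguments"

# Quoted: the pattern stays a single argument, and find does the matching.
set -- "200703*"
quoted_count=$#
echo "quoted pattern is still $quoted_count argument"

matches=$(find "$demo" -name "200703*" -print | wc -l | tr -d ' ')
echo "find matched $matches files"
```

With tens of thousands of matching files, the unquoted form is what pushes the command line past the system limit.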

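By the way, if you have a POSIX-compliant find, you can avoid xargs and use the better "+" termination feature. The {} must come immediately before the +, so mv needs a small sh wrapper to put the destination last.

e.g.:
Code:

find . -name "200703*" -exec sh -c 'mv "$@" /destination/directory' sh {} +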
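
POSIX only specifies the "+" form when {} is immediately followed by the +, so the destination directory cannot simply be written after {}. One possible way around that is a small inline sh script that receives the destination first and the file names after it. This is a sketch under that assumption; the scratch directories and file names are made up.

```shell
# Scratch directories and file names are hypothetical.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/20070301 report.txt" "$src/20070302.txt" "$src/20070401.txt"

# find appends as many names as fit in place of {}; the inline sh script
# takes the destination as its first argument and moves the rest there.
find "$src" -type f -name "200703*" \
    -exec sh -c 'dest=$1; shift; mv "$@" "$dest"' sh "$dest" {} +

moved=$(find "$dest" -type f | wc -l | tr -d ' ')
left=$(find "$src" -type f | wc -l | tr -d ' ')
echo "moved $moved files, left $left behind"
```

With GNU mv, "mv -t /destination/directory {} +" would do the same job without the wrapper, but that option is not available in every mv.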

trashbird1240 09-19-2007 08:36 AM

You need quotation marks to stop the shell from expanding the wildcard.

Btw, find is a Unix command, it wasn't invented with Linux ;)

Joel

jschiwal 09-19-2007 08:50 AM

Quote:

Originally Posted by jlliagre (Post 2896181)
It's the first command (find) that breaks: the shell expands the unquoted 200703* before find runs, and the result is larger than what your environment can handle.

Just use
Code:

find . -name "200703*" -print | ...
and that should work.

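By the way, if you have a POSIX-compliant find, you can avoid xargs and use the better "+" termination feature. The {} must come immediately before the +, so mv needs a small sh wrapper to put the destination last.

e.g.:
Code:

find . -name "200703*" -exec sh -c 'mv "$@" /destination/directory' sh {} +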

I think you can run into the same problem if you use {} + and have a large number of results. The xargs command has a couple of options to limit the number of arguments handled at once.

If any of the files found might have whitespace in the name, consider using (the -t option needs GNU mv):
Code:

find . -name "200703*" -print0 | xargs -0 -n 1000 mv -t /destination/directory
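
A quick way to check how the null-delimited pipeline batches names is to substitute echo for mv. This is only a sketch with invented file names; it uses -n, the xargs option that caps the number of arguments per command.

```shell
# Scratch directory and file names are hypothetical.
work=$(mktemp -d)
touch "$work/200703 one.txt" "$work/200703 two.txt" "$work/200703 three.txt"

# -print0 and -0 keep each name intact despite the spaces.
# -n 2 caps every command at two names, so echo runs twice for three files.
batches=$(find "$work" -name "200703*" -print0 | xargs -0 -n 2 echo | wc -l | tr -d ' ')
echo "echo was run $batches times"
```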

Suppose that you copy a large number of files to another machine and want to verify that the copies went OK. You could use "find <directory> -type f -exec md5sum '{}' \; >target.md5".

After copying the "target.md5" file to the other machine you can verify the files with "md5sum --check target.md5 >checked". Imagine that the files that checked out OK can now be deleted from the source machine. The lines for the files that are OK to delete end with ": OK". So if you copy the "checked" file to the source directory you can run:
"sed -n 's/: OK$//p' checked | tr '\n' '\000' | xargs -0 -n 1000 rm -v"
The "tr" command converts newline characters to null characters, allowing for filenames with spaces or other "evil" filename characters.
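
The delete step can be tried safely on scratch files. In this sketch the "checked" file is written by hand as a stand-in for real "md5sum --check" output, and the file names are invented.

```shell
# Scratch directory and file names are hypothetical.
work=$(mktemp -d)
touch "$work/a b.txt" "$work/c.txt"

# Hand-written stand-in for the output of "md5sum --check".
printf '%s\n' "$work/a b.txt: OK" "$work/c.txt: FAILED" > "$work/checked"

# Keep only the names whose line ends in ": OK", turn newlines into
# nulls, and let xargs hand the names to rm in batches.
sed -n 's/: OK$//p' "$work/checked" | tr '\n' '\000' | xargs -0 -n 1000 rm

remaining=$(find "$work" -type f -name "*.txt" | wc -l | tr -d ' ')
echo "$remaining text file left"
```

Only the file that verified is removed; the one marked FAILED stays in place for another copy attempt.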

laggerific 09-20-2007 09:43 AM

Fantastic...thanks all...I had thought I tried putting the wildcard in quotes, but I guess not. But, once I did, I had no issues...I couldn't get the find without xargs to work, but I'm not sure if it's an issue with my implementation or OSX's find command.

Either way...thanks again.

jschiwal 09-22-2007 12:45 AM

I've found the form "| tr '\n' '\000' | xargs -0" very useful in one-liners. It allows you to process a list of files, such as the files unique to one of two lists, and treat the output as if it came from the "find" command.

Another example is extracting the XML file from a saved K3B job, extracting the filenames, and using the results to delete the files which were backed up. Having intermediate files can also be useful for debugging, for the readability of your script, or for allowing a larger number of files to be processed.
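
As a sketch of the "files unique to one of the lists" case, comm can produce the list and the tr | xargs -0 form can consume it. The list files, directory and file names below are all invented for illustration.

```shell
# Scratch directories, list files and file names are hypothetical.
work=$(mktemp -d)
mkdir "$work/files"
touch "$work/files/keep me.txt" "$work/files/old copy.txt" "$work/files/shared.txt"

# comm needs sorted input; list_a has one name that list_b lacks.
printf '%s\n' "keep me.txt" "old copy.txt" "shared.txt" | sort > "$work/list_a"
printf '%s\n' "keep me.txt" "shared.txt" | sort > "$work/list_b"

# comm -23 prints the lines found only in the first list.
comm -23 "$work/list_a" "$work/list_b" | tr '\n' '\000' |
    (cd "$work/files" && xargs -0 rm)

kept=$(find "$work/files" -type f | wc -l | tr -d ' ')
echo "$kept files kept"
```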

jlliagre 09-22-2007 03:23 AM

Quote:

Originally Posted by jschiwal (Post 2896783)
I think you can run into the same problem if you use {}+ if you have a large number of results.

Actually not. {} + is a simpler and better solution: it can handle any number of results and odd filenames with spaces or newlines in their names, and it is designed to avoid the -print0 and xargs tricks.
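
One way to watch the batching is to have each invocation report how many names it received. The sketch below uses a handful of invented files, so everything fits in a single batch; with enough files find would start further batches on its own instead of failing.

```shell
# Scratch directory and file names are hypothetical.
work=$(mktemp -d)
for i in 1 2 3 4 5; do touch "$work/file $i.txt"; done

# With \; the command runs once per file.
per_file=$(find "$work" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# With + the names are grouped; five short names fit in one command line.
batched=$(find "$work" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

echo "runs with \\; : $per_file, runs with + : $batched"
echo "argument size limit on this system: $(getconf ARG_MAX) bytes"
```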

jschiwal 09-22-2007 01:34 PM

Code:

      -exec command {} +
              This variant of the -exec option runs the specified command on
              the selected files, but the command line is built by appending
              each selected file name at the end; the total number of
              invocations of the command will be much less than the number of
              matched files. The command line is built in much the same way
              that xargs builds its command lines. Only one instance of '{}'
              is allowed within the command. The command is executed in the
              starting directory.

Are you sure? The xargs command has an option to limit the number of arguments handled at a time. Part of the description implies that all of the arguments are added. And part of it implies that it isn't the case.

<update>The info file says that it is done so that the maximum command line length isn't exceeded. So the manpage could be worded better, or that phrase from the info file included.</update>

I had a problem getting it to work until I found the part in the info manual that explains that the + character is used in place of the ; character to terminate the command. The manpage could have said that the '+' character was a command terminator.
Example: find ./ -maxdepth 1 -name "*.txt" -exec command '{}' \+

The info manual does say that using xargs may be faster.
Quote:

The above use of `-exec' causes `find' to build up a long command
line and then issue it. This can be less efficient than some uses of
`xargs'; for example `xargs' allows new command lines to be built up
while the previous command is still executing, and allows you to
specify a number of commands to run in parallel. However, the `find
... -exec ... +' construct has the advantage of wide portability.
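
The parallel mode the info manual mentions is the -P option of xargs. It is an extension found in GNU and BSD xargs rather than part of POSIX, so treat this as a sketch for systems that have it; the files and the choice of gzip as the command are invented for illustration.

```shell
# Scratch directory and file names are hypothetical.
work=$(mktemp -d)
for i in 1 2 3 4; do echo "data $i" > "$work/f$i.txt"; done

# -n 1 gives each gzip a single file; -P 2 keeps up to two gzip
# processes running at the same time.
find "$work" -type f -name "*.txt" -print0 | xargs -0 -n 1 -P 2 gzip

zipped=$(find "$work" -type f -name "*.gz" | wc -l | tr -d ' ')
echo "$zipped files compressed"
```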

jlliagre 09-22-2007 02:52 PM

The GNU find manual page is indeed misleading.

By the way, you do not need to escape the plus like you do with the semicolon, as '+' isn't a special character to the shell.
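
This can be checked without running find at all, by looking at what the shell passes along. A minimal sketch:

```shell
# The shell removes the backslash before any command sees the argument,
# so both spellings arrive as the single character "+".
set -- \+ +
escaped_plus=$1
plain_plus=$2

# A bare ";" would end the shell command, so it must be escaped or quoted.
set -- \; ';'
escaped_semicolon=$1
quoted_semicolon=$2

echo "plus: '$escaped_plus' and '$plain_plus'"
echo "semicolon: '$escaped_semicolon' and '$quoted_semicolon'"
```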

jschiwal 09-23-2007 04:31 AM

I think the one example in the info manual did escape it in case you have regular expression globbing turned on. In that case '+' means at least one of the previous character.

hurry6 10-01-2007 10:52 AM

Find - "Argument list too long"
 
I am also running into a similar issue when trying to archive the files in a directory that are older than 90 days. It worked very well with a small number of files.

find * -mtime +90 > file.out

I have tried narrowing down on file name but the results were similar.

Does anyone have any idea how I can construct this command without causing "argument list too long"?

jlliagre 10-01-2007 12:01 PM

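Try this one. It lets find walk the directory itself instead of having the shell expand the *; the sed step strips the leading ./ and drops hidden entries so the output matches what find * would print.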
Code:

find . -mtime +90 -type f | sed -e 's/^\.\///' -e '/^\./d' > file.out
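
To try the filter on something small first, a scratch tree like the following can be used. The file names and the 2007 timestamps are invented for illustration.

```shell
# Scratch directory, file names and timestamps are hypothetical.
work=$(mktemp -d)
mkdir "$work/sub"
touch -t 200701010000 "$work/old.log" "$work/.hidden" "$work/sub/deep.log"
touch "$work/new.log"

# Same pipeline as above, run inside the scratch directory and sorted
# so the result does not depend on directory order.
listing=$(cd "$work" && find . -mtime +90 -type f |
    sed -e 's/^\.\///' -e '/^\./d' | sort)
echo "$listing"

expected=$(printf '%s\n' "old.log" "sub/deep.log")
```

The recent file and the hidden file are both left out, which is the same set that find * -mtime +90 -type f would list if the shell expansion did not fail.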


All times are GMT -5.