How long can my command-line be?
I was writing a bash script that builds a very long list of file names that are passed to a program. Unfortunately, I was getting a lot of "Argument list too long" errors. The last time I ran ./configure I was told that I could have a command line 2^31 characters long, yet I would often get this message with only 1800 parameters... sometimes. Other times I could fit 2000. I came to the conclusion that the sheer number of characters on the command line was too large. So I thought that maybe if I put a backslash and a newline after every 100 file names it would work. It didn't.
So if a command line can only have a certain number of characters, how do I find out what this number is? Can I pluck this out of an environment variable? Or is there another reason I'd be getting this? It occurs to me that the call I was making was to tar (before I discovered --files-from), and the message was that same "Argument list too long".
It's a kernel limit, named ARG_MAX and defined in limits.h. You can query it with getconf
Code:
c:~$ getconf ARG_MAX
Note that this buffer containing the argument list is also shared with all of your shell variables... if you have a lot of variables set, or if one variable is very large, you may run into 'argument list too long' relatively quickly. I learned this the hard way: I have a number of shell functions set up which load data into shell variables, and one of them has a tendency to write a lot of data to one of my variables in certain situations (this is an ... *ahem* undocumented feature). This left no space in the argument list. I had to run 'set | less' to find the offending variables, then clear them.
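If you suspect exported variables are eating the buffer, a quick way to check is sketched below (only a rough accounting -- only exported variables are passed to the new process, and values containing newlines will skew the per-variable numbers):
Code:
$ env | wc -c                                                 # total bytes of the exported environment
$ env | awk -F= '{print length($0), $1}' | sort -rn | head    # largest exported variables first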
The answer to this problem is to use xargs: in a directory which contains 9000 files, use
Code:
find . -maxdepth 1 -type f | xargs gzip
instead of
Code:
gzip *
Reading the man pages of xargs, and truly grokking the contents, is one of those things that will make you understand Linux at a deeper level.
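As a side note, if the file names may contain spaces or newlines, the usual variant (a sketch, assuming GNU find and xargs) is to have find emit NUL-separated names and tell xargs to expect them; xargs then splits the list across as many gzip invocations as needed to stay under the limit:
Code:
find . -maxdepth 1 -type f -print0 | xargs -0 gzip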
Quote:
Code:
$ foo=$(cat /dev/urandom | cut -c -2097152 ); ls *
This worked:
Code:
$ foo=$(cat /dev/urandom | strings | head -40980); ls *
Code:
$ ls * | wc -l
Thanks bartonski :)
Tried it but ... Code:
c:~$ foo=$(cat /dev/urandom | strings | head -40980); ls /usr/bin
Quote:
Code:
c:~$ foo=$(cat /dev/urandom | strings | head -40980); ls /usr/bin/*
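The difference, as far as I can tell, is that ls /usr/bin passes a single argument and lets ls read the directory itself, while ls /usr/bin/* makes the shell expand the glob into thousands of separate arguments before the exec, and that expansion is what blows past ARG_MAX. A quick way to see the size of such an expansion without triggering the error (echo is a shell builtin, so no exec happens):
Code:
$ echo /usr/bin/* | wc -c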
Quote:
Code:
c:~$ getconf ARG_MAX
Code:
$ foo=$(cat /dev/urandom | strings | head -40980); ls *
The way that I actually filled $foo was to run foo=$(cat /dev/urandom | strings | head -$x); ls *. I started with $x=20 and manually doubled $x until I got this to fail; I posted the first value of $x which failed for me. I'm quite certain that there's a better way to fill $foo, but I was being lazy.
Ok, I figured it out: this only happens if foo is exported. Here's the code I ran:
Code:
$ foo="xxx"; while [ ${#foo} -lt 4194305 ]; do echo -n "${#foo}: "; export foo="$foo$foo"; ls * | wc -l; done
Actually, on my system,
Code:
$ getconf ARG_MAX
I guess that I have at least 32768 bytes (32768 = 131072 - 98304) worth of stuff sitting around in the argument list buffer. That seems a little odd... (/me opens a new shell)
Code:
$ set | wc -c
Code:
$ echo "$(set | wc -c) + $(ls * | wc -c)" | bc
The comments in limits.h are somewhat revealing:
Code:
#define ARG_MAX 131072 /* # bytes of args + environ for exec() */
Quote:
This is generic for *n*x processes rather than specific to shell scripts. Netsearching did not turn up a good description, but IIRC Stevens' UNIX systems programming book described how each process has kernel-space memory and process-space memory. The kernel-space memory includes ARG_MAX space for the data passed on the exec* family of system calls -- the executable (path) name, the arguments and the environment variables. In the specific case of bash calling a bash script, one of the *exec*e calls must be used or the envars would be lost. When you write "I wonder how you find out what the size of the exec() environment is ...", it should (TM!) be a NULL-terminated array of char* pointers, each pointing to a null-terminated "name=value" string (or equivalent). Netsearching indicated that it is implementation-dependent whether all these pointers count against the ARG_MAX space or not, so the scheme I have suggested places an upper limit on the space taken out of ARG_MAX for envar storage.
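A rough back-of-the-envelope check of the remaining headroom from the shell, based on that description (only a sketch -- it ignores the per-pointer overhead just mentioned, so treat the result as an upper bound on the argument bytes left):
Code:
$ echo $(( $(getconf ARG_MAX) - $(env | wc -c) ))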
Quote:
If there's a potential problem, you should be avoiding it in the first place. Just like in C you need to care about where your pointers are going, or you need to check whether your malloc succeeded, don't you? ;) Sure, we could get a shell with an unlimited environment and with dynamic memory handling, but this is beside the point. The truth is that if someone wants bash to look like C++, then s/he should be using C++ in the first place, because at some point someone might also think of turning C++ into a bash clone mwhaha :jawa: Shells have always been this way; there are probably billions of lines of shell code around the world, and those scripts would be much bigger in most other programming languages, even the higher-level ones. Shell languages are as they are for a reason: they make a lot of assumptions to make scripting a lot easier, and the limited environment is one of those assumptions. The shell way is that way. You either use xargs, or save the stuff to a file, or parse it in a loop (see the sketch below). By the way, this is neither new nor specific to Linux. I can perfectly remember the "out of environment space" errors in DOS (any version), with both command.com and 4dos. The only shell I can think of right now that allocates environment space dynamically is the OS/2 one, as far as I can remember. That doesn't mean there aren't more around that can do it...
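To make those three options concrete, here are hedged one-liners (gzip, the archive name and the directory layout are only placeholders for illustration):
Code:
# 1. Let xargs split the list into several invocations
find . -maxdepth 1 -type f | xargs gzip
# 2. Save the list to a file and let the program read it (cf. tar's --files-from, mentioned earlier)
find . -maxdepth 1 -type f > filelist.txt && tar -czf backup.tar.gz --files-from=filelist.txt
# 3. Loop over the names one at a time; the glob expands inside the shell,
#    so no exec() argument list is built and ARG_MAX never comes into play for the loop itself
for f in ./*; do [ -f "$f" ] && gzip "$f"; done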
Quote:
Eric Raymond puts it best in The Art of Unix Programming.