LinuxQuestions.org


wasim_jd 12-04-2011 01:29 AM

How to face the error "Argument list too long" in bash?
 
Hi All,
I'm new here. Could anyone help me write a small script? I have hundreds of files in a directory, and when I run a simple awk command on them it gives me this error:
Code:

awk -F "," '{if($15==588 ) print $5,$48 }' 20111101*
-bash: /usr/bin/awk: Argument list too long

So I want to write a simple script where I can save the output.
The files in the directory are named by date:
2011MMDD0001
2011MMDD0002
2011MMDD0003
.....
....
.....
We could have a loop that increments the date, or please suggest whatever you EXPERTS think is best.

Regards,
Wasim.

colucix 12-04-2011 01:59 AM

Yes, a loop over dates is exactly what you need. However, from your example it seems that, apart from the date, there is also a numeric counter in your file names, so you should take it into account in order to pass a single argument to the awk command, e.g.
Code:

#!/bin/bash
#
#  Initial date
#

date=20111101
#
#  Loop over dates until final date is reached
#  and nested loop over count while the file exists
#

while [[ $date -le 20111130 ]]
do
  #
  #  Reset the counter for each date, otherwise the
  #  files of the following days are never found
  #
  count=1
  while [[ -f ${date}$(printf "%04d" "$count") ]]
  do
    awk -F, '{if($15==588 ) print $5,$48}' "${date}$(printf "%04d" "$count")"
    ((count++))
  done
  date=$(date -d "$date 1 day" +%Y%m%d)
done


wasim_jd 12-04-2011 02:27 AM

Didn't work
 
Thanks colucix, but I didn't get any result. Furthermore, I checked a single file and it does contain matching data. Could you please look into it again? In addition, I want the total count for a single date, which I have been getting like this:
awk -F "," '{if($84 ~ /^LA_FCC/) print}' 20110626* | wc -l

Thanks a lot in advance...

colucix 12-04-2011 02:31 AM

Maybe you need to change the file name specification inside the script with the actual names. Please, show us (all or an excerpt of) the output from
Code:

ls 20110626*
Thanks.

wasim_jd 12-04-2011 02:42 AM

Didn't work, and the files below do contain data
 
The files listed below do contain data. I want a count for each file: say, if field $2 matches LA_BC, then I want the total number of LA_BC matches. The script you provided earlier doesn't give any result either.
WS01:/opt # vi la.sh
#!/bin/bash
#
# Initial date
#
date=20111101
#
# Set initial counter
#
count=1
#
# Loop over dates until final date is reached
# and nested loop over count until file exists
#
while [[ $date -le 20111130 ]]
do
while [[ -f ${date}$(printf "%04d" $count) ]]
do
awk "," -F, '{if($84 ~ /^LA_BC/) print $2}' ${date}$(printf "%04d" $count)
((count++))
done
date=$(date -d "$date 1 day" +%Y%m%d)
done
WS01:/opt # ./la.sh
WS01:/opt #
I got nothing as a result. Here is the listing you asked for:

201106260000000001.txt 201106260751020158.txt 201106261236040315.txt 201106261649450472.txt 201106261943430629.txt
201106260003000002.txt 201106260754030159.txt 201106261238100316.txt 201106261650550473.txt 201106261944540630.txt
201106260006010003.txt 201106260757020160.txt 201106261239030317.txt 201106261651010474.txt 201106261945000631.txt
201106260009040004.txt 201106260800030161.txt 201106261240190318.txt 201106261652090475.txt 201106261946270632.txt
201106260012030005.txt 201106260803020162.txt 201106261242010319.txt 201106261653290476.txt 201106261947430633.txt
201106260015030006.txt 201106260806000163.txt 201106261243500320.txt 201106261654000477.txt 201106261948040634.txt
201106260018040007.txt 201106260809010164.txt 201106261245010321.txt 201106261656070478.txt 201106261949260635.txt
201106260021040008.txt 201106260812000165.txt 201106261247320322.txt 201106261657050479.txt 201106261950310636.txt
201106260024050009.txt 201106260815050166.txt 201106261248000323.txt 201106261659350480.txt 201106261951030637.txt
201106260027030010.txt 201106260818030167.txt 201106261250480324.txt 201106261700020481.txt 201106261952210638.txt
201106260030030011.txt 201106260821050168.txt 201106261251040325.txt 201106261701570482.txt 201106261953480639.txt
201106260033000012.txt 201106260824030169.txt 201106261253570326.txt 201106261703010483.txt 201106261954040640.txt
201106260036040013.txt 201106260827000170.txt 201106261254020327.txt 201106261705250484.txt 201106261955260641.txt
201106260039010014.txt 201106260830000171.txt 201106261257040328.txt 201106261706040485.txt 201106261956530642.txt
201106260042010015.txt 201106260833030172.txt 201106261259290329.txt 201106261707550486.txt 201106261957000643.txt
201106260045050016.txt 201106260836020173.txt 201106261300010330.txt 201106261709020487.txt 201106261958170644.txt

colucix 12-04-2011 03:23 AM

The listing of your directory reveals that the actual file names are not the ones mentioned in your first post. In the suggested code I built the file name using your example, but now the solution must be different.

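First, the digits after the date in the file names are not a simple sequential count (they look like a time stamp followed by a counter), so the names cannot be rebuilt from a counter alone. Therefore you can remove the count part from the script and loop over the actual file names using a for loop, e.g.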
Code:

#!/bin/bash
#
#  Initial date
#
date=20111101
#
#  Loop over dates until final date is reached
#  and nested loop over the files of each date
#
while [[ $date -le 20111130 ]]
do
  for file in ${date}??????????.txt
  do
    awk -F, '{if($15==588 ) print $5,$48}' "$file"
  done 
  date=$(date -d "$date 1 day" +%Y%m%d)
done

Actually this can be condensed into a single loop (by removing the date part as well):
Code:

#!/bin/bash
for file in *.txt
do
  awk -F, '{if($15==588 ) print $5,$48}' "$file"
done

provided the files you want to check are all in the current working directory.
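
If you want a count per file rather than the matching lines, the same loop idea can be sketched roughly as below. This is only an illustration, not tested against your data: count_matches is a made-up helper name, and the field number and pattern are passed as arguments so you can plug in 84 and ^LA_BC (or whatever you actually need). Since it is a shell function, the long file list is never handed to an external command in one go.

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption, not from the thread):
#   count_matches FIELD REGEX FILE...
# Prints "name: count" for every file, where count is the number of
# lines whose comma-separated FIELD matches REGEX.
count_matches() {
  local field=$1 regex=$2 file
  shift 2
  for file in "$@"
  do
    # Skip anything that is not a regular file, e.g. an unmatched
    # glob that the shell left as literal text.
    [[ -f $file ]] || continue
    awk -F, -v f="$field" -v re="$regex" -v name="${file##*/}" \
      '$f ~ re { n++ } END { print name ": " (n + 0) }' "$file"
  done
}

# Example call (assumed file names, run inside the data directory):
#   count_matches 84 '^LA_BC' 20110626*.txt
```

awk still starts once per file here, exactly as in the loop above; the only change is that each run prints a count instead of the lines.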

Dark_Helmet 12-04-2011 03:31 AM

Here's another option:

1. Create a basic script to do the awk and/or wc commands you want:
Code:

#!/bin/bash

# Begin /opt/awk_subscript.bash

filename=${1}
repetitionCount=$( awk -F "," '{if($84 ~ /^LA_FCC/) print}' "${filename}" | wc -l )
echo "${filename}: ${repetitionCount}"

# End /opt/awk_subscript.bash

2. Make the script executable:
Code:

chmod ugo+x /opt/awk_subscript.bash
3. Issue a cd command to move to the directory containing your data files

4. Issue the following find command:
Code:

find . -maxdepth 1 -type f -name '20111101*' -exec /opt/awk_subscript.bash "{}" \;
I have not tested any of this, but I have done something similar for other problems. Besides, I don't have any sample files.

Anyway, the general idea is this: the awk_subscript.bash file is written to handle a single file at a time. The find command causes the shell to find each file matching the '20111101*' wildcard in the current directory. As each matching file is found, awk_subscript.bash is launched with the matching filename as an argument.

This method is not an efficient use of system resources, since it starts a new shell, awk, and wc process for every file, but it will work.
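
A possible variation, if the per-file overhead matters: find can hand the files to awk in batches with "-exec ... {} +", and awk can keep one counter per file through its FILENAME variable. The sketch below is untested against real data, and count_per_file with its four arguments is a name I made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions):
#   count_per_file DIR NAME_GLOB FIELD REGEX
# find passes as many file names to each awk run as fit on one
# command line, so "Argument list too long" cannot occur.
count_per_file() {
  local dir=$1 glob=$2 field=$3 regex=$4
  find "$dir" -maxdepth 1 -type f -name "$glob" \
    -exec awk -F, -v f="$field" -v re="$regex" \
      '$f ~ re { n[FILENAME]++ }
       END { for (name in n) print name ": " n[name] }' {} +
}

# Example call (assumed values, run inside the data directory):
#   count_per_file . '20111101*' 84 '^LA_FCC'
```

Two caveats: files without any match are not listed at all, and the order of the output lines is not defined, so pipe the result through sort if the order matters.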

wasim_jd 12-04-2011 04:31 AM

Thanks to colucix and Dark_Helmet
 
I created a file, la.sh, in vi with the following code given by colucix:
Code:

#!/bin/bash
for file in *.txt
do
  awk -F, '{if($15==588 ) print $5,$48}' $file
done

Then I saved it with
:wq
and executed the file with
# ./la.sh | wc -l
This does what I needed, but I don't know how efficient it is.

Thanks again.

colucix 12-04-2011 05:43 AM

Quote:

Originally Posted by wasim_jd (Post 4541517)
and this works fine what I needed but I dont know how efficient is this....

For sure it is less efficient than running a single awk command, because awk is started once per file. On the other hand, awk is notable for its speed and parses millions of lines very quickly. Anyway, due to the limit on the total length of the command line arguments, the file list has to be split up in some way. For an exhaustive explanation of the alternatives to circumvent the argument list too long problem, take a look at this article on Linux Journal.
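
To give an idea of one such alternative, here is a minimal sketch that relies on printf being a bash builtin: a builtin is not started through exec, so the limit does not apply to it, and xargs then splits the list into batches that do fit. The function name run_awk_batched is made up for this example, and xargs -0 is assumed to be available (it is in GNU and BSD xargs).

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption, not from the thread):
#   run_awk_batched AWK_PROGRAM FILE...
# Runs the awk program over all files with as few awk processes as
# the argument length limit allows.
run_awk_batched() {
  local prog=$1
  shift
  # Nothing to do when no file names were given.
  (( $# )) || return 0
  # printf is a shell builtin, so the long list never goes through
  # exec; xargs -0 reads the NUL-separated names and batches them.
  printf '%s\0' "$@" | xargs -0 awk -F, "$prog"
}

# The limit itself (in bytes) can be shown with:
#   getconf ARG_MAX
# Example call (assumed file names, run inside the data directory):
#   run_awk_batched '$15 == 588 { print $5, $48 }' 20111101*
```

Whether this is noticeably faster than the for loop depends on the number and size of the files, so measure both with time before settling on one.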

As an aside: when posting lines of code in your threads, it's better to enclose them in CODE tags. This improves the readability of the code, preserves spacing and indentation, and more often shows other users the exact solution to your problems, since squeezed spaces and/or hidden characters can be misleading in some cases.

To use CODE tags you have two options:
1. switch to Advanced mode, then select the text/code written in your post and press the # button
2. Literally write the CODE tags before and after your code as
[CODE]some code here[/CODE].
Cheers!


All times are GMT -5.