Summing up counts returned by several bash commands to divide load
I have a script that executes the following to count all instances with the matching regex:
zcat /path/to/logs/today/* | grep '%[A-Z0-9_]\+-' | grep -v 'Primary ID' | wc -l

It looks through every file in the today directory, finds every line that matches that regex (skipping lines that contain 'Primary ID'), and returns the SUM of those lines for a report I run.

This works fine on a directory that contains fewer than 15000 individual files, but any time I run it on a directory with more than 15K files, I get "Argument list too long", regardless of the piping that occurs after the zcat.

So the only way I can think of to accomplish this is to run it in stages, for example:

Stage 1: ls | wc -l (this returns the total count of files in the directory, e.g. 45000)
Stage 2: Divide by 3 = 3 sets of 15000 files
Stage 3: Run the command on the first 15000 files (listed alphabetically) and return that count:
zcat /path/to/logs/today/* | grep '%[A-Z0-9_]\+-' | grep -v 'Primary ID' | wc -l
Stage 4: Run the command on the second set of 15K and return a count
Stage 5: Run the command on the third set of 15K and return a count
Stage 6: Sum up the counts returned by all three sets

Can anyone suggest a way to achieve this? The above command is executed as part of a script that uses variables for the directory.
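For reference, the staged plan above (fixed-size batches, one count per batch, then a sum) could be sketched roughly as follows. This is only an illustration under assumptions: the function name count_in_batches and its arguments are made up for the example, and it relies on GNU find and xargs for -maxdepth, -print0, -0 and -r.

```shell
# Hypothetical helper (not from the original script):
#   count_in_batches DIR [BATCH_SIZE]
# Counts matching lines batch by batch and prints the sum of the batch counts.
# The ( ... ) body runs in a subshell so dir and batch_size stay local.
count_in_batches() (
    dir=$1
    batch_size=${2:-15000}

    # find emits NUL-terminated file names; xargs -n hands at most
    # batch_size of them to each inner shell, which prints one count
    # per batch. awk then adds up those per-batch counts.
    find "$dir" -maxdepth 1 -type f -print0 |
        xargs -0 -r -n "$batch_size" sh -c \
            'zcat "$@" | grep "%[A-Z0-9_]\+-" | grep -v "Primary ID" | wc -l' sh |
        awk '{ total += $1 } END { print total + 0 }'
)
```

It would be called as something like count_in_batches /path/to/logs/today 15000. The batch size only needs to be small enough that one batch of file names fits within the system's argument limit.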
You can try:
Code:
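for i in /path/to/logs/today/* ; do zcat "$i" ; done | grep '%[A-Z0-9_]\+-' | grep -v 'Primary ID' | wc -l
The glob is not a problem here because the shell expands it internally for the loop; the argument-length limit only applies when the whole list is handed to a single external command.

Alternatively you could use find to serialize such access:
Code:
find /path/to/logs/today -name '*' -exec zcat {} ';' | grep '%[A-Z0-9_]\+-' | grep -v 'Primary ID' | wc -l
In both cases, there is one zcat process per file. Otherwise you have to do something awkward, such as reading file names in groups of 1000 to build a list and then running one zcat per list, and you would still need something like find to create those lists.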
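If one zcat per file turns out to be slow, find can do the grouping itself: ending -exec with + instead of ';' makes find pass as many file names to each zcat as fit within the argument limit. A minimal sketch, assuming the same log layout; the function name count_matches is made up for the example, and -type f is added so that only regular files are handed to zcat:

```shell
# Hypothetical helper (not from the original post):
#   count_matches DIR
# Prints the number of matching lines across all gzip files under DIR.
count_matches() (
    dir=$1

    # "-exec zcat {} +" runs zcat once per group of file names rather
    # than once per file; the decompressed output of all groups flows
    # into the same pipeline, so a single wc -l gives the total.
    find "$dir" -type f -exec zcat {} + |
        grep '%[A-Z0-9_]\+-' |
        grep -v 'Primary ID' |
        wc -l
)
```

It would be used as count_matches /path/to/logs/today. Because every group writes to the same pipe, there is no need to add up separate counts afterwards.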
You, Sir, are awesome. Thank you so much for taking the time to think about the problem and come up with a great solution. I ended up going with your second solution as it'll fit into my original script much more easily. I owe you a beer!