Split a file and process the resulting files in parallel
Hi,
I have huge files (around 100 GB) that I need to post-process after they have been generated.
I wonder if there is an easy way to split such a file and then process each piece that comes out of split.
I mean doing it automatically in one command line, without having to wait for split to finish before starting the processing commands.
Here is an example of what I mean: split a file and then count the number of "McDonald" words in each piece.
split -b 1000m -a 3 sourcefile.txt resultfile | for each split file do "grep -c McDonald"
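A sketch of what I am after, assuming GNU coreutils split (8.13 or newer), which has a --filter option: each chunk is piped straight into the command instead of being written to disk, so the processing starts as soon as the first chunk is read. A tiny demo file stands in here for the real 100 GB sourcefile.txt, and small line-based chunks replace the real -b 1000m -a 3 sizes, purely for illustration:

```shell
# Tiny stand-in for the real 100 GB sourcefile.txt.
printf 'McDonald a\nother line\nMcDonald b\n' > sourcefile.txt

# Split into 2-line chunks for the demo (the real file would use
# -b 1000m -a 3) and pipe each chunk into grep -c instead of
# writing split files to disk. Prints one McDonald count per chunk.
split -l 2 --filter='grep -c McDonald' sourcefile.txt
```

The --filter commands run as each chunk is produced, so there is no separate wait for split. If true parallelism across chunks is needed and GNU parallel is available, something like `parallel --pipepart -a sourcefile.txt --block 1000M grep -c McDonald` reads the parts by file offset and runs the grep jobs concurrently, without writing any split files at all.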
I hope you understand what I want to do.
Br Mathias
PS. The file server is very fast, so I do not expect I/O wait to be the limiting factor.