LinuxQuestions.org


soupmagnet 12-13-2012 01:11 AM

piping gzip into split
 
I'm working on a shell script for an Android device that continually saves the output of logcat to a series of files for later reference. The script currently looks like this:

Code:

logcat | split -l 1000 - /some_dir/log-
I need to somehow pipe gzip into it but it doesn't seem as though either the shell or Android is responsive to wildcards or regular expressions...or maybe I'm doing it wrong.

If I write the code as so:

Code:

logcat | split -l 1000 - /some_dir/log- | gzip -c /some_dir/log-*
...it returns something to the effect of "gzip: some_dir/log-* : No such file or directory", even though there are files in that directory that match that pattern.

I would like to zip each file as it is created, if possible. What am I missing? Will a regular expression like [^\.gz$] work in this situation? (I'm not even sure that's the correct way to exclude files that are already gzipped.)

pan64 12-13-2012 04:12 AM

The problem with your example is that split writes the files itself, so nothing is piped on to gzip.
You will need to compress first and then split:
http://stackoverflow.com/questions/1...z-zip-or-bzip2
http://www.linuxquestions.org/questi...-files-347840/
http://techlogbook.wordpress.com/200...iles-in-linux/
Alternatively, you will need to write a small script that compresses the files on the fly after splitting. That would require some Perl or similar knowledge.

ntubski 12-13-2012 10:45 AM

If you have GNU split you can use the --filter option:

Code:

logcat | split -l 1000 --filter='gzip > $FILE' - /some_dir/log-
But it's quite likely Android has a smaller version of split that doesn't support that.
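
For reference, a minimal sketch of the same idea that also appends ".gz" to each chunk name, so the compressed files are easy to tell apart. It assumes GNU split; the temporary directory is hypothetical, and seq stands in for logcat so that the example terminates.

```shell
# Assumes GNU split; busybox/toybox builds of split may not have --filter.
# seq stands in for logcat here so the pipeline ends on its own.
outdir=$(mktemp -d)

# split runs the filter once per chunk and sets $FILE to the chunk's name.
# Single quotes keep $FILE from being expanded by the calling shell.
seq 1 2500 | split -l 1000 --filter='gzip > "$FILE.gz"' - "$outdir/log-"

ls "$outdir"
```

With 2500 input lines and 1000 lines per chunk, this should leave three compressed chunks in the directory.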

soupmagnet 12-13-2012 03:36 PM

Quote:

Originally Posted by ntubski (Post 4848578)
If you have GNU split you can use the --filter option:

Code:

logcat | split -l 1000 --filter='gzip > $FILE' - /some_dir/log-
But it's quite likely Android has a smaller version of split that doesn't support that.

You're right. Android doesn't seem to support '--filter'


I've decided it's probably best to run a separate script to check the contents of "/some_dir" for anything NOT ending in .gz and zip those files accordingly, but I'm having trouble with regex.



Let's say, in "/some_dir", files are listed as such:

log-aa.gz
log-ab.gz
log-ac.gz
log-ad
log-ae
log-af

Entering:
Code:

ls *.gz
..will obviously output
log-aa.gz
log-ab.gz
log-ac.gz

But I need to match everything BUT that which contains or ends in ".gz". I know I have to somehow get "negative lookahead" to work, but it doesn't seem to want to. Taking the most simplistic approach (to avoid problems with the "."), I tested a few commands like:

Code:

ls "(?!gz)"
Code:

ls "*(?!gz)"
Code:

ls "(?!gz)$"
Code:

ls "*(?!gz)$"
...all of which return "No such file or directory", including on my laptop (so as to eliminate Android as the culprit).

I understand "plenty" about matching strings with regex, just not inversely. What am I missing? I am pretty confused about the correct usage of negative and positive lookahead, and I haven't found much that is either useful or easily understood.


Eventually I would like to include the "." in the pattern because as you can tell, there will eventually be a file saved as "log-gz" which would be matched against the "gz" argument.

Quote:

Originally Posted by pan64
That would require some perl or similar knowledge

As far as I am aware, Android doesn't support Perl so that may be out of the question.

[edit:] Also, the script is logging the output of 'logcat', which runs continuously. Without 'split', the saved files are too large to be displayed as one text file, so zipping the output and then splitting it would be pointless: each file would hold only the beginning of the 'logcat' output and not the most recent. As it stands now, a file is created and filled until it reaches 1000 lines, and then a new file is created, so each new file contains the most recent data.

chrism01 12-13-2012 08:08 PM

This seems to work
Code:

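cd tmp
for file in *
do
        # [[ ... != *.gz ]] is a glob (not regex) match: true when the name does not end in .gz
        if [[ $file != *.gz ]]
        then
                # quote the name so files containing spaces are handled correctly
                gzip "$file"
        fi
done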

Code:

#before
 ls
log-aa.gz  log-ab.gz  log-ac.gz  log-ad  log-ae  log-af  t.t

#After
 ls
log-aa.gz  log-ab.gz  log-ac.gz  log-ad.gz  log-ae.gz  log-af.gz  t.t.gz


soupmagnet 12-13-2012 10:45 PM

Quote:

Originally Posted by chrism01 (Post 4848944)
This seems to work
Code:

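cd tmp
for file in *
do
        # [[ ... != *.gz ]] is a glob (not regex) match: true when the name does not end in .gz
        if [[ $file != *.gz ]]
        then
                # quote the name so files containing spaces are handled correctly
                gzip "$file"
        fi
done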

Code:

#before
 ls
log-aa.gz  log-ab.gz  log-ac.gz  log-ad  log-ae  log-af  t.t

#After
 ls
log-aa.gz  log-ab.gz  log-ac.gz  log-ad.gz  log-ae.gz  log-af.gz  t.t.gz


I tried using a 'for' loop but it never worked for me (improper use of variables I'm guessing).

Regardless, I worked out something that could be done all in one script.

Code:

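# Writer: split the logcat output into 1000-line files named /some_dir/log-aa, log-ab, ...
function1(){
while true; do
    # "-" makes split read standard input; without it, /some_dir/log- is taken as the input file
    logcat | split -l 1000 - /some_dir/log-
done
}

# Compressor: once an hour, gzip every file whose path ends in "-" plus two characters
function2(){
while true; do
    var=`find /some_dir -ipath "*-??"`
    sleep 1          #to avoid an error when the script starts
    gzip $var        #left unquoted so that each path found becomes its own argument
    sleep 3599        #script will check every hour for unzipped files
done
}

# The pipe carries no data; it only makes both loops run at the same time
function1 | function2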

So far it's working exactly as it should. Thanks anyway for everyone's help.
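
One thing to watch with an hourly sweep like this: the pattern also matches the chunk that split is still writing to. A minimal sketch of one way around that, assuming the chunk names sort in creation order (as split's aa, ab, ac suffixes do), is to compress every matching file except the last one. The function name compress_finished is hypothetical.

```shell
# Hypothetical helper: gzip every finished chunk in a directory and leave the
# newest uncompressed chunk alone, since split may still be writing to it.
compress_finished() {
    dir=$1
    prev=
    # log-?? matches exactly two characters after "log-", so log-aa.gz is
    # never picked up. The shell expands the glob in sorted order.
    for file in "$dir"/log-??; do
        [ -f "$file" ] || continue
        # A newer chunk exists, so the previous one is complete.
        if [ -n "$prev" ]; then
            gzip "$prev"
        fi
        prev=$file
    done
}
```

Inside the hourly loop, the find and gzip lines could then be replaced with `compress_finished /some_dir`. When no chunk matches, the loop body does nothing, so gzip is never run without arguments.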


All times are GMT -5.