LinuxQuestions.org


Doknik 07-13-2011 05:17 AM

Shell Script or Command to Remove PDF file from large logs
 
Hi,
I need to remove a large binary file (a PDF) from a large log file that is generated daily. It is seriously hogging space on our servers. I need to strip the PDF from the logs to make them smaller and more manageable.

I need to take out the text (or binary data) between the following pairs of strings:

<my:PDF> and </my:PDF>


<applicationForm> and </applicationForm>

<image> and </image>

<extractedSignature> and </extractedSignature>


I am not sure whether the sed utility can do this. These are large files and need to be pruned. I am not seeking log rotation advice, just a script or command that can strip the text between the strings above out of these large logs. I am not sure how to achieve this with sed, tail, head, tr, or any other facility.
Your help would be greatly appreciated.

colucix 07-13-2011 05:42 AM

You can try:
Code:

sed '/<applicationForm>/,/<\/applicationForm>/d' file.log
If you are satisfied with the result (sent to standard output), you can run the command again with the -i option to actually change the file content. Or use -i.bck to keep a backup copy of the original file, so that you can easily diff the input against the output. I would run multiple sed commands to remove the different tag pairs. Hope this helps.
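
For reference, the same range delete can cover all four tag pairs from the question in a single pass by giving sed several -e expressions. This is only a sketch: strip_blobs is a hypothetical helper name, and it assumes each opening and closing tag sits on its own line. The range delete removes whole lines, so any log text sharing a line with a tag is removed too, and if an opening and closing tag share one line the range runs on to the next closing tag (or to the end of the file).

```shell
# strip_blobs FILE: print FILE to standard output with every block between
# the four tag pairs removed, tag lines included.
# (strip_blobs is a hypothetical helper name, not a standard command.)
strip_blobs() {
    sed -e '/<my:PDF>/,/<\/my:PDF>/d' \
        -e '/<applicationForm>/,/<\/applicationForm>/d' \
        -e '/<image>/,/<\/image>/d' \
        -e '/<extractedSignature>/,/<\/extractedSignature>/d' \
        "$1"
}
```

Running strip_blobs file.log | less previews the result without touching the file. Once the output looks right, the same -e expressions can be passed to sed -i.bck (GNU sed) to edit file.log in place and keep the original as file.log.bck.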

Doknik 07-14-2011 08:39 AM

Colucix, thanks a million. I tried this and it works; it reduced the file size drastically. I really appreciate your help. Have a good day.


All times are GMT -5.