GREP help.. Urgent :(
Hi, I am a newbie and need your expert help.
I have a doc root (let's say /site/mysite/docs/) where I want to run a recursive grep over all the directories and write the list of matching files to file_list.txt. The search works like this:
1. Capture all files that have "<!--# ((Any Text Here)) -->"
2. Capture all files that have BOTH "<!--# ((Any Text Here)) -->" and "<!--#include virtual= ((Path To SSI/HTML)) -->"
3. Ignore all files that have ONLY "<!--#include virtual= ((PATH TO SSI/HTML))-->"
I was able to get the first two points done with the following:
find /site/mysite/docs/ -exec grep -ls '<!--#' {} \; > ssi_file_list.txt
However, my boss needs to cut out the files that have ONLY "<!--#include virtual= ((PATH TO SSI/HTML))-->". Can someone help? |
Not sure that I've understood; however, to exclude a match from your grep, use -v. For example:
grep -something- | grep -v -e '<!--#include virtual= ((PATH TO SSI/HTML))-->'
Hope this helps. |
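One thing worth keeping in mind with the -v approach: it drops matching lines, not matching files. A minimal sketch with made-up sample lines (the directive values and the variable name remaining are only for illustration):

```shell
#!/bin/bash
# Three hypothetical input lines: an include directive, another SSI
# directive, and plain HTML.
remaining=$(printf '%s\n' \
    '<!--#include virtual="/inc/nav.html" -->' \
    '<!--#echo var="DATE_LOCAL" -->' \
    '<p>plain</p>' |
    grep -- '<!--#' |                      # keep lines with any SSI directive
    grep -v -- '<!--#include virtual=')    # then drop the include lines

printf '%s\n' "$remaining"
```

Only the non-include directive line is left, so whether anything survives the second grep can serve as a per-file test.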
If you're using a recent version of GNU grep you can try its recursive capability (option -r) to avoid the find command. Anyway, you have to apply grep multiple times to include all the requirements. For example, to find out the files that have only the <!--#include virtual= pattern, you might do something like this:
Code:
while read file
Code:
BEGIN {
Code:
#!/bin/bash |
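For what it's worth, here is one way the three requirements could be folded into a single pass. This is only a sketch, not the code from the posts above: the demo tree created with mktemp and the function name ssi_files are made up for illustration, and it assumes GNU grep (for -r and -o) and that each directive sits on a single line. Point it at /site/mysite/docs instead of the demo tree for real use.

```shell
#!/bin/bash
# Hypothetical demo tree standing in for the real docroot.
docroot=$(mktemp -d)
printf '%s\n' '<!--#echo var="DATE_LOCAL" -->' > "$docroot/a.html"
printf '%s\n' '<!--#include virtual="/inc/head.html" -->' \
              '<!--#echo var="DATE_LOCAL" -->' > "$docroot/b.html"
printf '%s\n' '<!--#include virtual="/inc/head.html" -->' > "$docroot/c.html"
printf '%s\n' '<p>no directives</p>' > "$docroot/d.html"

# Print every file under $1 that has at least one SSI directive
# other than include virtual.
ssi_files() {
    # grep -rl lists the files that contain any directive at all.
    grep -rl -- '<!--#' "$1" | sort | while IFS= read -r file; do
        # grep -o emits each directive on its own line; the second grep
        # succeeds only if one of them is not an include virtual.
        if grep -o -- '<!--#[^>]*' "$file" |
           grep -qv -- '^<!--#include virtual='; then
            printf '%s\n' "$file"
        fi
    done
}

result=$(ssi_files "$docroot")
printf '%s\n' "$result"
rm -rf "$docroot"
```

In the demo, a.html (other directive only) and b.html (both kinds) are listed, while c.html (include virtual only) and d.html (no directives) are skipped. Splitting with -o rather than filtering whole lines means a line that carries both an include and another directive is still counted.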
You could probably combine like so with a small modification:
Code:
#!/usr/bin/awk -f
Code:
find /site/mysite/docs -type f -exec ./test.awk {} \; |
@ Everyone: Thanks a ton, guys. Let me try out the options and update you gurus :)
@ colucix: Hey, I am sorry for using a word like "Urgent". This is my first post ever to any forum and I didn't know the etiquette :) And thanks again for your help! Saggu |
Is there an option with find to ignore binary files such as .gz, .tar, .mp3, .gif, etc.? |
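As far as I know, find has no "skip binaries" switch, but it can exclude files by name. A minimal sketch, assuming that filtering on extension is good enough; the demo directory and file names are made up for illustration. GNU grep also has -I, which treats binary files as non-matching, if going by content rather than by name is preferred.

```shell
#!/bin/bash
# Hypothetical demo tree standing in for the real docroot.
docroot=$(mktemp -d)
printf '%s\n' '<!--#echo var="DATE_LOCAL" -->' > "$docroot/page.html"
# A text file named .gz, to show that the filter goes by name only.
printf '%s\n' '<!--#echo var="DATE_LOCAL" -->' > "$docroot/backup.gz"
: > "$docroot/logo.gif"

# Negate a group of -name tests so files with these extensions
# never reach grep.
matches=$(find "$docroot" -type f \
    ! \( -name '*.gz' -o -name '*.tar' -o -name '*.mp3' -o -name '*.gif' \) \
    -exec grep -l -- '<!--#' {} + | sort)

printf '%s\n' "$matches"
rm -rf "$docroot"
```

Only page.html is reported here; backup.gz is skipped because of its name even though it contains the pattern.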
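Code:
#!/bin/bash
# ssi_list.txt holds one path per line for every file under the docroot.
ssi_list="/home/wwwdocs/ssi_list.txt"
while IFS= read -r line; do
    # Keep the file if any line has an SSI directive other than include virtual.
    if grep -- '<!--#' "$line" | grep -qv -- '<!--#include virtual='; then
        echo "$line" >> ssi_list_refined.txt
    fi
done < "$ssi_list"

ssi_list.txt has the list of all the files under the /site/mysite/docs/ docroot. It's working! Thank you for all your help :) |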