LinuxQuestions.org


secretlydead 11-29-2007 11:56 PM

grepping "'s and .'s and the such
 
Hi,

I've got about a hundred thousand million files that I need to do a find replace on.

They are php files. and there are strings like so:

".$studentname.".tablename

that need to be changed to:

student.".$studentname."tablename

when i grep for ".$studentname.".tablename, it also gives me results like $studentname, not even accounting for the .'s and the $'s, etc.

going through the whole thing, finding and replacing with geany is a seriously tedious option that i hope to avoid.


anyone got a clue how this could be done?

matthewg42 11-30-2007 12:28 AM

If you run the command:
Code:

grep "foo bar" input_file
The shell processes the quotes when it handles the command. In this case it means the pattern is:
Code:

foo bar
including the space. Grep therefore receives this list of arguments in argv (excluding argv[0] which is the program name):
  1. foo bar
  2. input_file
The thing to note is that the shell strips out the quotes because the shell itself is interpreting them. If you want to prevent the shell from interpreting the quotes, you can do one of many things, depending on what you want to happen. Firstly, you can escape each quote:
Code:

grep \"foo bar\" input_file
This might not be what you want in this case, because the input parameter list will be:
  1. "foo
  2. bar"
  3. input_file
so the search pattern would be "foo and the input files would be bar" and input_file.

You can also use a double quote within double quotes, by escaping it:
Code:

grep "foo\"bar" input_file
Here the quote survives as part of a single argument, so the input parameter list will be:
  1. foo"bar
  2. input_file

You can also put double quotes within single quotes. Single quotes prevent anything inside them from being interpreted by the shell:
Code:

grep 'foo"bar' input_file
argv would look like this:
  1. foo"bar
  2. input_file
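
To see these argument lists for yourself, here is a minimal sketch. show_args is a hypothetical helper written only for this illustration (it is not a standard command); it prints each argument it receives on its own line, so you can see what a program such as grep would find in argv for each quoting style above.

```shell
# show_args is a hypothetical helper, not a standard command:
# it prints each argument it receives on its own line, in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

plain=$(show_args "foo bar" input_file)        # quotes removed, one pattern argument
escaped=$(show_args \"foo bar\" input_file)    # quotes kept, but the space splits the words
inner=$(show_args "foo\"bar" input_file)       # escaped quote inside double quotes
single=$(show_args 'foo"bar' input_file)       # double quote inside single quotes

# Print the four argument lists, separated by --- lines.
printf '%s\n---\n' "$plain" "$escaped" "$inner" "$single"
```

The last two forms should produce the same argument list, which is why single quotes are usually the easier choice when the pattern itself contains double quotes.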

The single quotes are also useful for you because your pattern contains the $ character followed by a word. If you did not quote the string, or if you used double quotes, the shell would interpret $studentname to mean "the value of the shell variable studentname". This is almost certainly not what you want. Consider this:
Code:

grep \".$studentname.\".tablename input_file
Here we have escaped the quotes, so they will make it through to the grep argv list, but the shell will replace $studentname with the value of that variable in the shell, which is probably not set. So grep will get this argument list:
  1. "..".tablename
  2. input_file

To prevent both quote expansion and variable expansion, just whack the whole lot in single quotes, like this:
Code:

grep '".$studentname.".tablename' input_file
Here's what grep sees...
  • ".$studentname.".tablename
  • input_file
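
The difference between the two forms can be checked with a short sketch like the one below. It assumes studentname is empty (which is what an unset shell variable expands to) and uses printf in place of grep so that only the shell's processing is visible.

```shell
# Assumption for the demo: studentname is empty, which is what an unset
# shell variable expands to.
studentname=''

# Escaped quotes survive, but the shell still expands $studentname.
expanded=$(printf '%s' \".$studentname.\".tablename)

# Single quotes suppress both quote removal and variable expansion.
literal=$(printf '%s' '".$studentname.".tablename')

printf 'escaped quotes: %s\n' "$expanded"
printf 'single quotes:  %s\n' "$literal"
```

Only the single-quoted form keeps the text $studentname intact.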

But wait, this is not what you really want... grep interprets the first argument as a regular expression pattern, and in this both . and $ have special meanings. If you want to find literal strings instead of searching for regular expressions, you can use fgrep (or grep -F) instead. You might have to know about regular expressions though, because you want to do a replace, and you will probably move on to using sed for that, which expects a regular expression for its search pattern...

You can go and read a regular expression tutorial if you want to know exactly why, but here I will just say that you need to prefix the $ and the . characters with a backslash to make grep treat them as literal characters, and not as special patterns:
Code:

grep '"\.\$studentname\."\.tablename' input_file
Here's what grep sees...
  • "\.\$studentname\."\.tablename
  • input_file


Here's a handy hint: If you want to check to see what grep would see after quote processing of the shell, just pass that string to the echo command and have a look at the output. e.g:
Code:

% echo '"\.\$studentname\."\.tablename'
"\.\$studentname\."\.tablename

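As a rough comparison of the two approaches (an escaped regular expression versus a fixed-string search), the sketch below builds a small sample file and counts matching lines both ways. The file contents are made up for the illustration, and grep -F is used as the equivalent of fgrep.

```shell
# Made-up sample data: only the first line contains the exact string.
sample=$(mktemp)
cat > "$sample" <<'EOF'
$q = "SELECT * FROM ".$studentname.".tablename";
$q = "SELECT * FROM ".$studentname."_tablename";
echo $studentname;
EOF

# Regular expression with . and $ escaped so they match literally.
regex_hits=$(grep -c '"\.\$studentname\."\.tablename' "$sample")

# Fixed-string search: no escaping needed, the pattern is taken literally.
fixed_hits=$(grep -c -F '".$studentname.".tablename' "$sample")

printf 'regex: %s, fixed: %s\n' "$regex_hits" "$fixed_hits"
rm -f "$sample"
```

Both counts should agree, since the two commands look for the same literal text.
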

secretlydead 11-30-2007 12:46 AM

thanks, i appreciate the reply... it didn't occur to me to escape the $

what i did was set the variable like so:
myvar2='".$username.".vocablists'

and then ran the command

grep -i $myvar2 ./*


which works well, and paves the way for using sed to replace it

matthewg42 11-30-2007 01:05 AM

That works to get the quotes and $ to grep, but you still need to escape the special characters in the regular expression. $ means "end of line" and . means "any character". The . might result in a false positive. GNU grep seems to be smart enough to realise that a $ in the middle of a pattern should be treated as a literal character, but this might not be true in other grep implementations.

To illustrate the false positive problem:
Code:

echo -e "one.two\none8two" |grep 'one.two'
one.two
one8two

You can see that the pattern with the . matches both, and probably should not. You should escape . characters in the RE pattern if you want to detect only a . character:
Code:

echo -e "one.two\none8two" |grep 'one\.two'
one.two
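
The same applies when the pattern is kept in a shell variable, as in the previous post. The sketch below is only an illustration: the sample lines are made up, the variable names loose and strict are hypothetical, and the counts assume a grep (such as GNU grep) that treats a $ in the middle of a pattern as a literal character. The variable is also double-quoted when it is used, so the shell passes it to grep as a single argument.

```shell
# Made-up sample data: the second line is a deliberate near miss.
sample=$(mktemp)
printf '%s\n' '".$username.".vocablists' '"x$usernamex"yvocablists' > "$sample"

loose='".$username.".vocablists'          # unescaped: each . matches any character
strict='"\.\$username\."\.vocablists'     # escaped: . and $ must match literally

# Quote the variable so the shell hands it to grep unchanged.
loose_hits=$(grep -c -- "$loose" "$sample")
strict_hits=$(grep -c -- "$strict" "$sample")

printf 'loose: %s, strict: %s\n' "$loose_hits" "$strict_hits"
rm -f "$sample"
```

The unescaped pattern should also count the near-miss line, while the escaped one should count only the exact string.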


secretlydead 11-30-2007 01:48 AM

i see.

but if i grep for '\$username\.'

it still gives me responses like:

$username;
and
$username (that's a space after it)

rhoekstra 11-30-2007 02:07 AM

With compliments to matthewg42, with his very clear explanation on grep.. well done..

As for the solution, you could consider using sed instead of grep.. sed can parse files and do search and replace in them..

consider this:
Code:

sed -i.bak 's/"\.\$studentname\."\.tablename/student.".$studentname."tablename/g' <filename>
Now, assuming sed is modern enough to support -i, this tells sed to do an in-place search and replace in the file and to keep the original as <file>.bak as a back-up.. of course it's up to the user (you) whether you want to edit this way... Could be tricky, but I happily use it in most cases.
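
Since the original question involves a large number of files, here is a hedged sketch of how the same substitution might be applied to every .php file under a directory with find. It runs in a scratch directory with made-up file contents, uses the replacement text from the first post, and assumes a sed that accepts -i.bak; try it on a copy of the real files first.

```shell
# Scratch directory with made-up sample files for the demo.
workdir=$(mktemp -d)
printf '%s\n' '$q = "SELECT * FROM ".$studentname.".tablename";' > "$workdir/a.php"
printf '%s\n' 'echo $studentname;' > "$workdir/b.php"

# Run the in-place substitution on every .php file, keeping .bak copies.
find "$workdir" -name '*.php' -exec \
    sed -i.bak 's/"\.\$studentname\."\.tablename/student.".$studentname."tablename/g' {} +

result=$(cat "$workdir/a.php")       # line that contained the target string
untouched=$(cat "$workdir/b.php")    # line without the target string
backup=$(cat "$workdir/a.php.bak")   # original content saved by -i.bak

printf '%s\n' "$result" "$untouched"
rm -rf "$workdir"
```

Only the line containing the full quoted string should change; a bare $studentname elsewhere is left alone, and the .bak files make it possible to undo the edit.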

matthewg42 11-30-2007 03:49 AM

Quote:

Originally Posted by secretlydead (Post 2975310)
i see.

but if i grep for '\$username\.'

it still gives me responses like:

$username;
and
$username (that's a space after it)

Not for me:
Code:

% cat testfile
$username.
Note that the following line has a space at the end:
$username
$username_

% grep '\$username.' testfile
$username.
$username
$username_

% grep '\$username\.' testfile
$username.



All times are GMT -5.