LinuxQuestions.org

rmanocha 06-29-2004 09:13 AM

PHP script/function to get 4-5 words around a keyword in a string (like Google)
 
Hey guys,
I am writing a search engine for a website.
I was just wondering how I would write efficient/nice/clean PHP code that takes the 4-5 words around a matched keyword in a string (the string holds the contents of a text file) and then displays the results in a similar fashion to Google.
I was thinking this could be done with regexps, but since I am no pro with them, I am calling on you guys for help.
Please suggest something.
Thanks

david_ross 06-29-2004 01:50 PM

Something like this?
Code:

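<?php

$string = "I was just wondering how i would write effecient/nice/clean PHP code for taking 4-5 words around a matched keyword in a string which contains the contents of a text file and then display the results in a similar fashion as google.";

$word = "just";

// Escape the keyword so any regex metacharacters in it are matched literally.
$quoted = preg_quote($word, "/");

// Capture up to five words on either side of the keyword.
preg_match("/(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?$quoted ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)?/i",$string,$result);

// Highlight the keyword, case-insensitively like the search above.
print preg_replace("/$quoted/i","<B>$0</B>",$result[0]);

?>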

Will output:
I was <B>just</B> wondering how i would write

Hero Doug 06-29-2004 02:47 PM

This will work as well.

PHP Code:

<?php

/* NOTE - It's your responsibility to format the search phrase removing any stupid characters like ^ and # */
/* The search phrase */
$SearchPhrase = "linux tutorials articles php here things run test";

/* An example of keywords you might be searching */
$Keywords = "This is an example of what you might find in a column containing keywords that are used to search for such things as linux tutorials or help with php. You may also notice that I'm running out of things to say, so I'll end this here.";

/* Find the amount of words */
$Words = substr_count($SearchPhrase, " ");

/* Explode the search phrase */
$Exploded = explode(" ", $SearchPhrase);

/* Put the keywords in a variable for highlighting */
$Highlighted = $Keywords;

/* Loop through the words */
for($i = 0; $i <= $Words; $i++){

    /* Highlight the words by overwriting the same set of keywords */
    $Highlighted = preg_replace("/" . $Exploded["$i"] . "/", "<B>" . $Exploded["$i"] . "</B>", $Highlighted);

}

echo $Highlighted;

?>


rmanocha 06-29-2004 03:33 PM

Thanks guys,
I will try both these pieces of code first thing in the morning and let you know how it goes.
I also wrote some code, but it is really inefficient: I basically break the string up into an array at every space, check each word against each search word (a nested foreach loop, very inefficient), and then print out the text around the matched word.
Anyway, now that you guys have given me this code, I can remove what I wrote and use either of these.
Thanks again.
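
For reference, the split-and-scan approach described above could look roughly like this. It is only a sketch: the function name snippet_around and the $radius parameter are made up for illustration, it assumes PHP 7.1 or later, and it compares whole whitespace-separated words, so "google." with a trailing period would not match "google".

```php
<?php
// Hypothetical helper: return up to $radius words on each side of the first
// word equal to $keyword (case-insensitive), or null when there is no hit.
function snippet_around(string $text, string $keyword, int $radius = 5): ?string
{
    // Split on runs of whitespace so repeated spaces do not create empty words.
    $words = preg_split('/\s+/', trim($text));

    foreach ($words as $i => $w) {
        if (strcasecmp($w, $keyword) === 0) {
            $start  = max(0, $i - $radius);
            $length = ($i - $start) + 1 + $radius; // words before + hit + words after
            $slice  = array_slice($words, $start, $length);
            // Wrap the hit itself in <B> tags.
            $slice[$i - $start] = '<B>' . $w . '</B>';
            return implode(' ', $slice);
        }
    }
    return null;
}

echo snippet_around("I was just wondering how i would write efficient code", "just", 3);
```

With several search words this would be called once per word, which is the nested loop mentioned in the post.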

rmanocha 06-30-2004 05:14 AM

All right, I tested both pieces of code, and the segment given by david_ross seems to work better for my needs.
However, I am looking for a few enhancements. As of now, the code only finds the first match in the text, highlights it and prints it out.
I want to find all the matches and print them.
This is roughly what Google does too: it finds all the matches and then prints them out.
Also, if the search text provided by the user is more than one word, another problem arises. I changed the code given by david_ross to this:
Code:

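$search_arr = explode(" ", $search);
$out = "....";
foreach ($search_arr as $tmp) {
    // Escape the term so it is matched literally inside the pattern.
    $quoted = preg_quote($tmp, "/");
    // Only append a snippet when this term actually occurs in the text.
    if (preg_match("/(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?$quoted ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)?/i", $text, $result)) {
        $out .= preg_replace("/$quoted/i", "<B>$0</B>", $result[0]);
        $out .= "....";
    }
}
return $out;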

This works in that it finds the correct places and puts the dots in just fine. However, a problem arises when the first search term appears later in the text than the second one. The snippets come out in search-term order rather than text order: the snippet for the later match is printed before the snippet for the earlier match, even though the earlier match comes first in the text, simply because its keyword comes second in the search phrase.
I hope I have explained my problem well enough and that someone will be able to help me.
Thanks again.
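
One way to get the snippets in text order rather than search-term order is to record where each match starts and sort on that before building the output. A rough sketch follows, assuming PHP 7.1 or later. The function name ordered_snippets and the $radius parameter are hypothetical, and a shorter counted pattern stands in for the ten optional groups. PREG_OFFSET_CAPTURE is the standard flag that makes preg_match() return each match together with its byte offset.

```php
<?php
// Hypothetical helper: one snippet per search term, ordered by where the
// snippet starts in $text rather than by the order of the search terms.
function ordered_snippets(string $text, string $search, int $radius = 5): string
{
    $hits = [];
    foreach (preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY) as $term) {
        $quoted = preg_quote($term, '/');
        // Up to $radius words before and after the whole-word term.
        $pattern = "/(?:\w+\W+){0,$radius}\b$quoted\b(?:\W+\w+){0,$radius}/i";
        if (preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
            // $m[0] is [matched text, byte offset of the match].
            $snippet = preg_replace("/\b$quoted\b/i", '<B>$0</B>', $m[0][0]);
            $hits[] = [$m[0][1], $snippet];
        }
    }

    // Earliest position in the text first.
    usort($hits, function ($a, $b) {
        return $a[0] <=> $b[0];
    });

    $out = '....';
    foreach ($hits as $hit) {
        $out .= $hit[1] . '....';
    }
    return $out;
}

echo ordered_snippets("the quick brown fox jumps over the lazy dog", "lazy quick", 1);
```

Sorting on the offset keeps the loop over search terms simple and leaves the ordering to one step at the end.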

rmanocha 07-12-2004 07:54 AM

Quote:

Originally posted by david_ross
Something like this?
Code:

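<?php

$string = "I was just wondering how i would write effecient/nice/clean PHP code for taking 4-5 words around a matched keyword in a string which contains the contents of a text file and then display the results in a similar fashion as google.";

$word = "just";

// Escape the keyword so any regex metacharacters in it are matched literally.
$quoted = preg_quote($word, "/");

// Capture up to five words on either side of the keyword.
preg_match("/(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?$quoted ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)?/i",$string,$result);

// Highlight the keyword, case-insensitively like the search above.
print preg_replace("/$quoted/i","<B>$0</B>",$result[0]);

?>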

Will output:
I was <B>just</B> wondering how i would write

I was just wondering what I can do to make this code faster. Currently, if I do not include this code in my script I get a load time of 0.003 seconds, and if it is included I get 6.34 seconds. Granted, I am currently using a P3 500 MHz with 128 MB RAM, and this would be much faster on other machines, but I was still wondering if there is any way to speed it up, since the difference in load times is so large.
Thanks
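
A likely reason for the slowdown is that the pattern is made of ten optional groups and optional spaces, so at every position in a large text the regex engine has many combinations to try before it can give up. One possible way around that is to locate the keyword with a plain substring search and only split a small window of text around the hit. This is a sketch under assumptions, not a drop-in replacement: fast_snippet, $radius and $window are hypothetical names, it assumes PHP 7.1 or later, stripos() matches substrings (so "just" would also hit inside "adjust"), and a word cut off by the edge of the window is kept as-is.

```php
<?php
// Hypothetical helper: first occurrence of $word with up to $radius words
// on each side, or null when $word does not occur in $text.
// $radius is assumed to be at least 1.
function fast_snippet(string $text, string $word, int $radius = 5, int $window = 80): ?string
{
    $pos = stripos($text, $word); // plain case-insensitive search, no regex
    if ($pos === false) {
        return null;
    }
    $end = $pos + strlen($word);

    // Only look at a limited number of characters on each side of the hit.
    $left  = substr($text, max(0, $pos - $window), min($pos, $window));
    $right = substr($text, $end, $window);

    // Keep the last $radius words before the hit and the first $radius after.
    $before = array_slice(preg_split('/\s+/', $left, -1, PREG_SPLIT_NO_EMPTY), -$radius);
    $after  = array_slice(preg_split('/\s+/', $right, -1, PREG_SPLIT_NO_EMPTY), 0, $radius);

    // Highlight the text exactly as it appears in the source.
    $hit = '<B>' . substr($text, $pos, strlen($word)) . '</B>';
    return trim(implode(' ', $before) . ' ' . $hit . ' ' . implode(' ', $after));
}

echo fast_snippet("I was just wondering how i would write efficient code", "just", 2);
```

The work done after the search depends on the window size rather than on the length of the whole file.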

keefaz 07-12-2004 09:16 AM

To find all matches, use preg_match_all() in place of preg_match() in david_ross's code (PHP's preg functions have no g switch), like:
PHP Code:

preg_match_all("/(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?$word ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)?/i",$string,$result);

This way you don't need to loop over the text yourself, as the entire text will be parsed and all matches will be returned in $result[0].
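
A rough sketch of collecting every hit in one call with preg_match_all(), which is how PHP matches globally. The sample text, the two-word radius and the shorter counted pattern are just for illustration.

```php
<?php
$string = "Linux is free. Many people run Linux at home, and some run linux at work too.";
$word   = "linux";

// Escape the keyword so it is matched literally.
$quoted = preg_quote($word, "/");

// Each match is the keyword plus up to two words on either side.
// All full matches end up in $results[0].
preg_match_all("/(?:\w+\W+){0,2}\b$quoted\b(?:\W+\w+){0,2}/i", $string, $results);

foreach ($results[0] as $snippet) {
    // Highlight the keyword inside each snippet, keeping its original case.
    print preg_replace("/\b$quoted\b/i", "<B>$0</B>", $snippet) . "\n";
}
```

Matches do not overlap, so two hits that sit close together share the words between them instead of each getting a full set of neighbours.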

rmanocha 07-12-2004 10:40 AM

Quote:

Originally posted by keefaz
To find all matches, use preg_match_all() in place of preg_match() in david_ross's code (PHP's preg functions have no g switch), like:
PHP Code:

preg_match_all("/(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?$word ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)? ?(\w+)?/i",$string,$result);

This way you don't need to loop over the text yourself, as the entire text will be parsed and all matches will be returned in $result[0].

Thanks, but I no longer want all matches to be highlighted. Only the first match, along with the few neighbouring words, should be printed out with the search term highlighted.
I have this working right now, but the problem is that it takes too much time: almost 8 seconds to display such a result with only 2 hits.
I need to improve the time, and even though a faster machine will change this significantly, I know I can definitely improve my code too. So again, I call out for any help that is ready to come my way... :)
Please.

rmanocha 07-13-2004 06:19 AM

Anyone... someone... if you are a regexp guru, please help me out here.


All times are GMT -5.