LinuxQuestions.org
Old 08-07-2004, 02:44 AM   #1
amit_28oct
Member
 
Registered: Feb 2004
Posts: 31

Rep: Reputation: 15
how can I separate normal text from html tags spell check it & then again place it ins


Hello
I am developing a spell check utility using PHP 4.2.2 on Red Hat 9, Apache & Netscape 7.1.
I have developed it successfully for normal text,
but the problem comes when I try to check spelling in HTML-formatted text.

Here's how it works:
I have one PHP page named compose_richtext.php. It provides a toolbar which I got from http://www.kevinroth.com/rte/demo.htm, where the user can type formatted text. There is a built-in spell check feature in that toolbar, but I don't want to use it for some reasons, so I implemented my own code.
When the user clicks on spell check, the whole data is passed to the next page, i.e. Spellcheck.php. There I am using pspell_check & pspell_suggest for spell checking.
But the problem is how to distinguish the actual text from HTML tags. I looked in the PHP manual & found

htmlentities
nl2br
htmlspecialchars

but I don't think these will work for me.
I can easily use strip_tags, but then I will lose all the tags, so how will I place the actual text back in formatted form?

I read that something like tidy->ishtml exists, but that doesn't work with PHP 4.2.2; also I was unable to find proper documentation on how to use it.

So can you please help me & tell me how I can separate normal text from HTML tags, spell check it & then place it inside the HTML tags again?


Amit
 
Old 08-07-2004, 04:38 AM   #2
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 50
I don't have pspell on any PHP installation I have access to, but you should find it simple to adapt this code:

PHP Code:
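<?php
// Split the input into tags, single spaces and words; the captured
// delimiters (tags and spaces) stay in the result.
$ary = preg_split('((<[^>]*>)|( ))', $_GET['stp'], -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);

$out = "";

for ($i = 0; $i < count($ary); $i++) {
  $a = $ary[$i];
  if ($a == " " || substr($a, 0, 1) == "<") {
    // tag or space: copy through unchanged
    $out .= $a;
  } else {
    //spellcheck here
    $out .= $a;
  }
}

print($out);
?>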
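
To see why the tags survive this kind of split, here is a minimal sketch. The function name split_html and the sample string are made up for illustration, and the pattern is written with / delimiters, which should behave the same as the one above. It only shows that the pieces, joined back together, give the original markup.

```php
<?php
// Hypothetical helper: split HTML into tags, single spaces and words.
// Because the delimiters are captured, no character of the input is dropped.
function split_html($html)
{
    return preg_split('/(<[^>]*>| )/', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
}

$pieces = split_html('<b>say sayy</b> ok');
print_r($pieces);

// Joining the pieces gives back the original markup unchanged.
var_dump(implode('', $pieces) === '<b>say sayy</b> ok');
```

So as long as each word piece is either left alone or replaced by its correction, the formatting around it is kept.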
 
Old 08-07-2004, 05:16 AM   #3
amit_28oct
Member
 
Registered: Feb 2004
Posts: 31

Original Poster
Rep: Reputation: 15
Thanks for your support, kev.
But with this code, what will happen if the user's text message contains < > signs? For example, the user's text could be
Arrow looks like this >------->
The second thing, which I mentioned in my question, was how I will place all the text in HTML format again after checking the spelling, because in this case we will lose all the info about the previous formatting, since the string $out is just plain text.

Here is the code which I am using


Regards
Amit
Code:
<?php
// $rtedata is the base64-encoded rich text posted by compose_richtext.php.
$formatedtext = base64_decode($rtedata);
$normaltext = strip_tags($formatedtext);
$word = '';
$updated_msg = '';

// True when $c is a single ASCII letter (A-Z or a-z).
function is_letter($c)
{
	if (strlen($c) != 1) {
		return false;
	}
	$o = ord($c);
	return ($o >= 65 && $o <= 90) || ($o >= 97 && $o <= 122);
}

if ($_REQUEST['Button'] == "spellcheck_rich") {
	$pspell_link = pspell_new("en");
	$str_len = strlen($normaltext);
	for ($i = 0; $i < $str_len; $i++) {
		$c = $normaltext[$i];
		if ($c == "\n") {
			echo "<br>";
			$updated_msg .= "\n";
		} else if (!is_letter($c)) {
			// punctuation, digits, spaces: copy through unchanged
			echo $c;
			$updated_msg .= $c;
		} else {
			if ($word == '') {
				$loc = $i; // offset of the word's first letter in $normaltext
			}
			$word .= $c;
			// The word ends when the text ends or the next character is not a letter.
			if ($i + 1 == $str_len || !is_letter($normaltext[$i + 1])) {
				if (!pspell_check($pspell_link, $word)) {
					// misspelt: show an editable field plus a "?" button for suggestions
					$size = strlen($word);
?>
		<input type="text" value="<?php echo $word ?>" size="<?php echo $size ?>" name="elements<?php echo "_".$loc."_".$size ?>">
		<input type="button" value="?" onclick="Suggest('<?php echo base64_encode($word) ?>','<?php echo $loc ?>','<?php echo $size ?>')">
<?php
				} else {
					echo $word;
				}
				$updated_msg .= $word;
				$word = '';
			}
		}
	}//For closed
}

$updated_msg = base64_encode($updated_msg);
?>
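
The code above records $loc and $size against the text after strip_tags, so those offsets no longer line up with the formatted message. One possible alternative, sketched here under assumptions, is to record the same two numbers against the HTML itself and splice corrections back in with substr_replace. The helper names word_positions and replace_word are hypothetical, and the pattern only treats runs of ASCII letters as words.

```php
<?php
// Hypothetical helper: list every word in $html with its byte offset,
// skipping tags, entities such as &lt; and everything that is not a letter.
function word_positions($html)
{
    $pattern = '/<[^>]*>|&[#A-Za-z0-9]+;|[^A-Za-z<&]+|[<&]/';
    $words = array();
    $pieces = preg_split($pattern, $html, -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE);
    foreach ($pieces as $piece) {
        // With PREG_SPLIT_OFFSET_CAPTURE each piece is array(text, offset).
        $words[] = array(
            'word' => $piece[0],
            'loc'  => $piece[1],
            'size' => strlen($piece[0]),
        );
    }
    return $words;
}

// Hypothetical helper: replace $size bytes at offset $loc with $replacement.
function replace_word($html, $loc, $size, $replacement)
{
    return substr_replace($html, $replacement, $loc, $size);
}

$html = '<b>say sayy</b>';
foreach (word_positions($html) as $w) {
    echo $w['word'], ' at ', $w['loc'], "\n";
}
echo replace_word($html, 7, 4, 'say'), "\n";
```

If several words are corrected in one pass, applying the replacements from the last offset to the first keeps the earlier offsets valid even when a correction changes the length of a word.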

Last edited by amit_28oct; 08-07-2004 at 05:47 AM.
 
Old 08-07-2004, 05:42 AM   #4
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 50
<edit>
Re: your edit

Quote by amit_28oct:
But with this code, what will happen if the user's text message contains < > signs? For example, the user's text could be
Arrow looks like this >------->


If they're not tags, then you should have already converted them to &lt; and &gt;
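
A small sketch of that conversion, assuming the typed text is available as a plain string before it is mixed with any markup (the variable names are made up). htmlspecialchars turns each literal < and > into an entity, so a tag pattern no longer matches anything in it.

```php
<?php
$typed = 'Arrow looks like this >------->';

// Every literal < and > (and & and quotes) becomes an HTML entity.
$escaped = htmlspecialchars($typed);
echo $escaped, "\n";

// A tag pattern like the one used for splitting now finds no tag here.
var_dump(preg_match('/<[^>]*>/', $escaped) === 0);
```

The escaped arrow would still reach the spell checker as an ordinary piece of text, so pieces that contain entities or punctuation probably need to be skipped rather than checked.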

Quote by amit_28oct:
The second thing, which I mentioned in my question, was how I will place all the text in HTML format again after checking the spelling, because in this case we will lose all the info about the previous formatting, since the string $out is just plain text.

As I say, $out is HTML, it is not just the plain text. Run it and see.
</edit>

If the user's message contains < > signs, then how are you differentiating them from tags? The code I gave above does place everything back into its original form, with tags, after you've spell checked it, so I'm not sure what you mean. Below is what I believe to be correct code, but as I don't have PHP compiled with the spelling libraries I can't check it. If it doesn't work I'll need to see the output.

PHP Code:
<?php
// Split the input into tags, single spaces and words, keeping the delimiters.
$ary = preg_split('((<[^>]*>)|( ))', $_GET['stp'], -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);

$out = "";
$psl = pspell_new("en");

for ($i = 0; $i < count($ary); $i++) {
  $a = $ary[$i];
  if ($a == " " || substr($a, 0, 1) == "<") {
    // tag or space: copy through unchanged
    $out .= $a;
  } else {
    if (pspell_check($psl, $a)) {
      $out .= $a;
    } else {
      // misspelt word: wrap it in red
      $out .= sprintf("<font color=\"#ff0000\">%s</font>", $a);
    }
  }
}

print($out);
?>
This should take a GET variable stp, spell check each word, and output the text with its tags preserved, with each incorrectly spelt word shown in red.
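
For anyone who, like me, has no pspell to hand, the same idea can be tried with a stand-in dictionary. This is only a sketch: mark_misspelt and toy_check are hypothetical names, and the two-word dictionary is invented so the example runs anywhere.

```php
<?php
// Hypothetical helper: $is_correct is any callable that takes a word and
// returns true when the word is spelled correctly.
function mark_misspelt($html, $is_correct)
{
    $pieces = preg_split('/(<[^>]*>| )/', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $out = '';
    foreach ($pieces as $piece) {
        if ($piece === ' ' || $piece[0] === '<'
                || call_user_func($is_correct, $piece)) {
            // tags, spaces and correct words pass through unchanged
            $out .= $piece;
        } else {
            $out .= '<font color="#ff0000">' . $piece . '</font>';
        }
    }
    return $out;
}

// Stand-in dictionary (an assumption, not pspell) so the sketch runs anywhere.
function toy_check($word)
{
    return in_array(strtolower($word), array('say', 'hello'));
}

echo mark_misspelt('<b>say sayy say</b>', 'toy_check'), "\n";

// With pspell installed, the checker could wrap pspell_check() instead:
// $psl = pspell_new("en");
// $check = function ($w) use ($psl) { return pspell_check($psl, $w); };
```

Passing the checker in as a callable keeps the tag handling separate from the dictionary, which makes it easy to test the splitting on its own.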

Last edited by kev82; 08-07-2004 at 05:50 AM.
 
Old 08-07-2004, 06:31 AM   #5
amit_28oct
Member
 
Registered: Feb 2004
Posts: 31

Original Poster
Rep: Reputation: 15
Hello kev,
here are 2 sample outputs for different values of stp:
stp="<span style=\"font-weight: bold;\">say</span>";
style="font-weight: bold;">say
stp="<b>say sayy say</b>";
say sayy say

Try Yahoo's spell check function in Netscape & then, if possible, try Yahoo's facility for typing formatted text (in IE). Observe how spell checking works in both cases; that is what I am trying to do.

Amit

Last edited by amit_28oct; 08-07-2004 at 06:44 AM.
 
Old 08-07-2004, 07:09 AM   #6
kev82
Senior Member
 
Registered: Apr 2003
Location: Lancaster, England
Distribution: Debian Etch, OS X 10.4
Posts: 1,263

Rep: Reputation: 50
Well, one of us is doing something very wrong, because when I run your examples through my code it comes out fine.

check out http://khn.homelinux.net/kev/spell.php?stp=<span style=\"font-weight: bold;\">say</span>

It is definitely spell checking the right stuff, and not the rest.
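
One thing that might differ between our two setups is how the value of stp reaches the script. A hedged sketch of building the test link with urlencode, assuming the backslashes in the samples above are only PHP string escapes and not part of the data:

```php
<?php
$stp = '<span style="font-weight: bold;">say</span>';

// Percent-encode the markup so spaces, quotes and angle brackets survive the URL.
$url = 'http://khn.homelinux.net/kev/spell.php?stp=' . urlencode($stp);
echo $url, "\n";

// Decoding the query string gives back exactly the markup that was sent.
parse_str(parse_url($url, PHP_URL_QUERY), $params);
var_dump($params['stp'] === $stp);
```

PHP decodes $_GET values on its own, so the script itself would not need to change for this.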
 
  

