Posted by Norman Peelman on 12/04/06 04:28
<alxasa@gmail.com> wrote in message
news:1165114672.184887.241070@79g2000cws.googlegroups.com...
> Hello, could someone please see where/how this working spellchecker
> code parses words? It does a great job in my application, connects to
> a DB with words and don't want to rip & replace the whole thing due to
> this one glitch. I think it can be fixed. Basically, it accepts whole
> words like "green" "blue" "orange".... however, if a word like
> green8blue is sent thru, it ignores the 8 and considers 'green' 'blue'
> as two distinct words. The code has problems with special characters
> like ' apostrophe , and does the same thing. Maybe someone can see how
> it handles numbers and special characters, and come up with a more
> elegant solution. :))
>
<snipped>
> /**
> * matches all the words and puts them in the word array
> *
> * Matches all the words in the string it is passed
> * and puts them in the word array ready for checking.
> *
> * @param string $str // the string to get the words from
> * @return bool returns true if it can match some words or it returns false
> */
> function get_words_from_str($str)
> {
>     $pattern = "/[a-zA-Z']+/i"; // pattern to make only word chars
>     if (preg_match_all($pattern, $str, $word)) // match them
>     {
>         for($i=0;isset($word[0][$i]);$i++)
>         {
>             $this->words[$i] = $word[0][$i]; // store them in the word array
>         }
>         return true;
>     }
>     return false;
> }
>
Change: $pattern = "/[a-zA-Z']+/i"; // pattern to make only word chars
to: $pattern = "/[a-zA-Z0-9'_-]+/i"; // pattern to make only word chars
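The original character class only allows letters and the apostrophe, so any
other character (such as the 8 in green8blue) ends the current match and the
string gets split into 'green' and 'blue'. Adding 0-9, underscore and hyphen to
the class keeps such strings together as one word. The hyphen goes last in the
class so it is taken literally rather than as a range. (The /i flag is
redundant here since the class already lists a-z and A-Z, but it is harmless.)
That should get you pretty close.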
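
For a quick way to see the difference, here is a rough standalone sketch that runs both patterns over the same input. The function name extract_words and the sample string are made up for illustration; they are not part of the spellchecker class, and the real method stores its matches in $this->words instead of returning them.

```php
<?php
// Hypothetical helper for illustration only; not the spellchecker's own method.
// Returns every run of characters matched by $pattern in $str.
function extract_words(string $pattern, string $str): array
{
    if (preg_match_all($pattern, $str, $matches)) {
        return $matches[0]; // index 0 holds the full matches
    }
    return []; // nothing matched
}

$old = "/[a-zA-Z']+/i";       // letters and apostrophe only
$new = "/[a-zA-Z0-9'_-]+/i";  // also digits, underscore, hyphen (hyphen last = literal)

$sample = "green8blue don't";

print_r(extract_words($old, $sample)); // green, blue, don't
print_r(extract_words($new, $sample)); // green8blue, don't
```

Whether green8blue should then count as a valid word is up to whatever is in your word DB; the pattern only decides where one word stops and the next begins.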
Norm