Posted by Dam on 01/26/07 05:39
Hi,
You can try an edit distance such as the Levenshtein distance, which
counts single-character edits and is a finer-grained similarity measure
than SOUNDEX:
http://www.google.com.ar/search?hl=es&q=distance+algorithm&btnG=B%C3%BAsqueda&meta=
http://en.wikipedia.org/wiki/Levenshtein_distance
I believe there are some T-SQL implementations of these methods.
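
To show what the Levenshtein distance computes, here is a minimal sketch.
It is written in Python rather than T-SQL, just to keep it short and easy
to test, and the names levenshtein, suggest and max_distance are made up
for this example. It is not taken from any particular T-SQL implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    # previous[j] = distance between the empty string and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def suggest(word, candidates, max_distance=2):
    """Hypothetical helper: candidates within max_distance edits of word,
    closest first."""
    scored = [(levenshtein(word, c), c) for c in candidates]
    return [c for d, c in sorted(scored) if d <= max_distance]
```

Against a word list of about 20,000 rows this compares the input with every
word, so in practice you would probably pre-filter first (for example by
word length) before computing distances.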
However, I guess Google does more than look for the words most similar
to your keywords. Maybe you could combine similarity with how often each
word is used, to guess what the user is most likely looking for.
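
One way to picture the "most similar and most used" idea is the sketch
below. It is only an assumption about how the two signals could be
combined: it uses the ratio from Python's standard difflib module as a
stand-in similarity score (any edit distance would work too), and the
usage counts, the rank_suggestions name and the 0.6 threshold are all
invented for the example.

```python
from difflib import SequenceMatcher


def rank_suggestions(query, usage_counts, min_similarity=0.6, limit=5):
    """Return up to `limit` words, most similar first; ties on
    similarity are broken by how often the word is used."""
    scored = []
    for word, count in usage_counts.items():
        similarity = SequenceMatcher(None, query, word).ratio()
        if similarity >= min_similarity:
            scored.append((similarity, count, word))
    # Highest similarity first, then highest usage, then alphabetical.
    scored.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [word for _, _, word in scored[:limit]]


# Made-up usage counts, e.g. how often each word was searched for.
usage = {"test": 900, "text": 400, "tent": 50, "banana": 10}
suggestions = rank_suggestions("tesst", usage)
```

Here "test" comes out on top because it is the closest match, and "text"
is listed before "tent" because both are equally similar to the input but
"text" is used more often.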
Hope that helps,
Damian
On 25 Jan, 20:38, "Pacific Fox" <tacofl...@gmail.com> wrote:
> I am trying to recreate the same functionality Google has in regards to
> suggesting words (not names), when you misspell something it comes up
> with suggestions.
>
> We have a list of words in the database to match against.
>
> I've looked at SOUNDEX but it is not close enough, DIFFERENCE is even
> worse.
> The only way I can get SOUNDEX to be more accurate is with
> SELECT [word]
> FROM [tbl_word]
> WHERE SOUNDEX( word ) = SOUNDEX( 'test' )
>   AND LEN( word ) = LEN( 'test' )
>
> I've been looking at Regular Expression matching which I reckon would
> provide more accurate matches. Not sure how that will affect
> performance, as we could be talking about 20,000 records.
>
> I have also been looking at the Double Metaphone algorithm.
>
> Is there something else that I am missing? Does anyone know what to
> use in a situation like this?
>
> Thanks in advance.