Posted by Jochem Maas on 06/20/05 23:28
Thanks, Steve!
Steve Edberg wrote:
> At 6:00 PM +0200 6/20/05, Cilliè wrote:
>
>>> out of interest what are you trying/going to do with
>>> such a list?
>>
>>
>> playing with categorizing stuff based on word frequency and relevance
>> to other stuff with similar word frequency.
>> "the" will give a lot of false positives :)
>
>
>
> You might want to look at full-text indexing or text analysis/data
> mining software, e.g.:
>
> http://www.textanalysis.info/
>
> You could also check the stop-word lists from MySQL's fulltext indexing,
> or from search engines like htdig...
>
> http://dev.mysql.com/doc/mysql/en/fulltext-search.html
>
> http://www.htdig.org/
>
> Googling for the phrase "stop word list" may also be useful.
>
> steve
>
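
For anyone following the thread, here is a rough sketch of the stop-word idea in Python: count word frequencies while skipping words like "the" that would otherwise produce false positives. The STOP_WORDS set and the word_frequencies function are made up for illustration and are not taken from MySQL or htdig; a real stop-word list would be much longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Return a Counter of lowercase words in text, ignoring stop words."""
    # Split into runs of letters/apostrophes after lowercasing.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

freq = word_frequencies("The cat sat on the mat and the cat slept.")
print(freq.most_common(3))
```

Swapping in one of the published stop-word lists mentioned above should only require replacing the contents of STOP_WORDS.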