|  | Posted by Chung Leong on 08/11/06 00:28 
brock@gunter-smith.com wrote:> I'd like to be able to take a string and search within it for all words
 > (of the longest length possible) that are possibly contained within it
 > (in sequence, we're not re-ordering the letters in the string).
 > Obviously the brute force approach (which may be the only solution) is
 > to iterate through a dictionary file searching for occurances of each
 > entry within the string.
 >
 > If anyone has done anything similar to this, were there any other
 > methods used to reduce the number of iterations required like using a
 > list of common words that are not generally elements of other words
 > that can be quickly broken out from the string? Or are there libraries
 > that may be of use in efficiently processing this type of search?
 >
 > e.g.  given the string "themeether", possible solutions might be
 > {'the','meet','her'} or {'theme','ether'}
 
 That's very similiar to the Thai word-breaking problem.  Is that what
 you're trying to do, in fact? There's a link listing some of the
 approaches:
 
 http://www.fi.muni.cz/~xantos/poster/#x1-3000
  Navigation: [Reply to this message] |