Posted by Carl Vondrick on 08/10/06 19:27
brock@gunter-smith.com wrote:
 > I'd like to be able to take a string and search within it for all words
 > (of the longest length possible) that are possibly contained within it
 > (in sequence, we're not re-ordering the letters in the string).
 > Obviously the brute force approach (which may be the only solution) is
 > to iterate through a dictionary file searching for occurrences of each
 > entry within the string.
 
 It sounds like you are after an LCS (Longest Common Subsequence)
 implementation.  Just google for "longest common subsequence" and you'll
 get a thousand ways to do it.  Wikipedia has one that seems to work
 well: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
 
 LCS is used in diff algorithms.
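For reference, here is a minimal sketch of the textbook dynamic-programming LCS in Python. The function name `lcs` is just a placeholder of mine, not taken from any library. Keep in mind that a subsequence does not have to be contiguous, so for whole words sitting inside a string a plain substring test may turn out to be closer to what you want.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover the characters.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("themeether", "theme"))  # prints "theme"
```

This runs in O(len(a) * len(b)) time and memory, which is fine for short strings but adds up if you repeat it for every entry in a large dictionary file.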
 
 > If anyone has done anything similar to this, were there any other
 > methods used to reduce the number of iterations required like using a
 > list of common words that are not generally elements of other words
 > that can be quickly broken out from the string? Or are there libraries
 > that may be of use in efficiently processing this type of search?
 >
 > e.g.  given the string "themeether", possible solutions might be
 > {'the','meet','her'} or {'theme','ether'}
 >
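The "themeether" example reads more like splitting a string into dictionary words than like LCS, so here is a rough sketch of that approach as well. It is a memoized recursive search, not an LCS, and the names `segmentations` and `split_from` and the tiny word list are made up for illustration. In practice you would load the words from your dictionary file.

```python
from functools import lru_cache


def segmentations(text, words):
    """Return every way to split text into a sequence of known words."""
    vocab = frozenset(words)
    # No candidate piece needs to be longer than the longest known word.
    longest = max((len(w) for w in vocab), default=0)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the text means one complete (empty) split.
        if start == len(text):
            return ((),)
        results = []
        for end in range(start + 1, min(len(text), start + longest) + 1):
            piece = text[start:end]
            if piece in vocab:
                for rest in split_from(end):
                    results.append((piece,) + rest)
        return tuple(results)

    return [list(s) for s in split_from(0)]


words = ["the", "meet", "her", "theme", "ether"]  # stand-in for a dictionary file
print(segmentations("themeether", words))
# prints [['the', 'meet', 'her'], ['theme', 'ether']]
```

Keeping the words in a set makes each lookup cheap, so the work depends on the length of the string and the longest word rather than on the size of the dictionary. The memoization means each starting position is only solved once.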