Posted by Arnold Shore on 05/30/06 03:56
Folks, I'm looking for an OSS version of the subject application, smaller and
simpler than, say, OWL. For example, I18n is not needed, which should help
keep size and complexity down.
The only semi-heavy work it needs to do is to parse and full-text index the
common MS Office file formats, like Word and Excel, plus PDFs. It also needs
the Porter stemmer function.
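For illustration only, here is a minimal sketch of the indexing side, written
in Python. It assumes the text has already been extracted from the Word, Excel
and PDF files by separate tools. All names here (naive_stem, build_index,
search, the sample file names) are hypothetical, and naive_stem is a crude
placeholder, not the Porter algorithm; a real Porter implementation would be
passed in its place.

```python
import re
from collections import defaultdict


def naive_stem(word):
    """Placeholder stemmer (NOT Porter): strips a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs, stem=naive_stem):
    """Build an inverted index: stemmed term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[stem(token)].add(doc_id)
    return index


def search(index, query, stem=naive_stem):
    """Return the ids of documents containing every stemmed query term."""
    terms = [stem(t) for t in tokenize(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Assumed input: plain text already extracted from each file.
docs = {
    "report.doc": "Quarterly budgets were indexed",
    "sheet.xls": "Budget totals by quarter",
    "manual.pdf": "Indexing the manuals",
}
index = build_index(docs)
print(sorted(search(index, "budget")))
```

Because the stemmer is passed in as a parameter, the same index code would
work unchanged with a proper Porter stemmer.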
I'm satisfied that all the pieces exist. While I could build this using
antiWord and some PHP Excel and PDF classes, I want to make sure that ground
hasn't already been well plowed. I've searched the usual sources, FreshMeat
and SourceForge, somewhat casually, but struck out.
I'd appreciate any thoughts, URLs, etc. Thanks, all.
AS