Posted by Benjamin Niemann on 03/29/06 11:53
Jo wrote:
> Thanks.
> I'm writing an HTML parser that removes the tags and keeps the sensible
> text. It is a tool written in C#. Can I add another tool to it, such as
> HTML Tidy, to do the cleanup? Would that be right?
It can save you a lot of time. Tidy can also convert HTML to XHTML, which
can then be parsed with an XML parser, so you can analyze the contents
more conveniently with tools like XPath.
For C# the article <http://www.devx.com/dotnet/Article/20505/0/page/1> may
help.
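
To illustrate the second half of that pipeline, here is a minimal sketch of
pulling text out of XHTML with an XML parser and XPath-style queries. It is
in Python only for brevity; the same steps map onto System.Xml in C#. It
assumes Tidy has already produced well-formed XHTML. The sample document
and the helper names (extract_text, extract_paragraphs) are made up for
illustration, and real Tidy output may carry a DOCTYPE and named entities
that need extra handling.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so queries must be namespace-aware.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample standing in for Tidy's XHTML output.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Demo</title></head>
<body>
<h1>Hello</h1>
<p>Some <b>bold</b> text.</p>
</body>
</html>"""


def _normalize(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def extract_text(xhtml):
    """Return all text inside <body>, with tags removed."""
    root = ET.fromstring(xhtml)
    body = root.find("x:body", XHTML_NS)
    if body is None:
        return ""
    # itertext() yields every descendant text node in document order.
    return _normalize("".join(body.itertext()))


def extract_paragraphs(xhtml):
    """Return the text of each <p> element, selected with an XPath-style query."""
    root = ET.fromstring(xhtml)
    return [
        _normalize("".join(p.itertext()))
        for p in root.findall(".//x:p", XHTML_NS)
    ]


if __name__ == "__main__":
    print(extract_text(SAMPLE_XHTML))
    print(extract_paragraphs(SAMPLE_XHTML))
```

Once the input is well-formed XML, selecting only certain parts of the page
(here, the paragraphs) is a one-line query rather than hand-written tag
matching.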
--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://pink.odahoda.de/