Posted by Jason Petersen on 10/04/47 11:10
On Thu, 10 Mar 2005 00:18:05 +0100, BlackDex <black.dex@lycos.nl> wrote:
> Hello ppl,
>
> I have a question about regex and html parsing.
>
> I have the following code:
> ---
> <p class=MsoNormal><font size=3 face="Comic Sans MS"><span lang=NL
> style='font-size:12.0pt;font-family:"Comic Sans MS"'> </span></font></p>
I'm guessing that you're writing a function to clean up "HTML" that users
upload via a web form? I would start by looking at this Perl script,
which fixes markup generated by MS products:
http://www.fourmilab.ch/webtools/demoroniser/
Also, PHP's tidy extension (built on libtidy) might be useful for you,
although I haven't used it personally.
http://us2.php.net/manual/en/ref.tidy.php
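
For what it's worth, a call into the tidy extension might look roughly
like the sketch below. It assumes the tidy extension is installed, and
the variable names ($dirty, $config, $clean) and the choice of options
(clean, word-2000, show-body-only) are only my guess at a reasonable
starting point, so check them against the manual page above before
relying on the result.

```php
<?php
// Hypothetical input: the Word-generated fragment from the question.
$dirty = '<p class=MsoNormal><font size=3 face="Comic Sans MS"><span lang=NL '
       . 'style=\'font-size:12.0pt;font-family:"Comic Sans MS"\'> </span></font></p>';

// Tidy options (an assumed starting point, not a tested recipe).
$config = array(
    'clean'          => true,  // replace presentational tags such as <font> with style rules
    'word-2000'      => true,  // strip the extra markup Word adds when saving as HTML
    'show-body-only' => true,  // return only the body contents, not a full document
);

// tidy_repair_string() parses the markup, repairs it, and returns it as a string.
$clean = tidy_repair_string($dirty, $config, 'utf8');

echo $clean;
```

The point of going through a real parser here is that you don't need a
regex for every odd attribute Word emits; you can still run a small
regex pass over the tidied output afterwards if something slips through.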