Reply to Re: HTMLPurifier - Standard Compliant HTML Filtering

Posted by Ambush Commander on 08/19/06 15:33

John Dunlop wrote:
> Do you mean standards compliant, valid or something else? If you mean
> standards compliant - assuming that that includes HTML - you would have
> to assign meanings to all the ambiguous clauses of the HTML4.01 spec
> (strictly speaking, all of them). If you mean valid, you would have to
> guess or somehow infer what any invalid markup was intended to mean
> before you could sort it.

In a way, both. I can't be completely standards compliant, because
technically that would mean I'd let XSS through. What I can do is,
while disallowing XSS, ensure that any output the filter gives won't
break an XHTML 1.0 Transitional page's validation at the W3C validator.
This is no easy task, especially since the spec doesn't get everything
right (for example, SGML exclusions). Currently, the only things that
still trip up the filter are control characters and codepoints that
SGML doesn't allow: anything else you throw at it will be turned into
something that will validate.
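
To make the control character problem concrete, here is a rough sketch in
Python (not the filter's own code, and not something the filter does yet)
of dropping the codepoints that the HTML 4.01 SGML declaration marks as
UNUSED. The function name strip_non_sgml is made up for illustration, and
silently deleting the characters is only one possible policy.

```python
import re

# Codepoints marked UNUSED in the HTML 4.01 SGML declaration:
# 0-8, 11-12, 14-31, 127-159, and the surrogate range 55296-57343.
# Tab (9), line feed (10) and carriage return (13) stay allowed.
_NON_SGML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f\ud800-\udfff]")


def strip_non_sgml(text: str) -> str:
    """Return text with every disallowed codepoint removed (hypothetical helper)."""
    return _NON_SGML.sub("", text)
```

For example, strip_non_sgml("a\x00b\tc") keeps the tab but drops the NUL
byte.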

As for valid, people use deprecated elements and attributes like <font>
and <center> all the time. The filter converts these into their proper
representations (<span style=""> and <div style="text-align:center;">),
so it can be quite smart about that sort of thing (it also closes <p>
tags automatically, etc). It's kind of like Tidy, except that Tidy
doesn't guarantee validation. We do.
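
The <font> and <center> conversion can be sketched like this. It is a toy
illustration in Python using the standard library's html.parser, not the
filter's actual implementation: the class and function names are made up,
only the color and face attributes are mapped, and it does no XSS
filtering, tag balancing or void-element handling at all.

```python
from html import escape
from html.parser import HTMLParser


class DeprecatedTagRewriter(HTMLParser):
    """Rewrites <center> and <font> into styled <div>/<span> (illustrative only)."""

    # Closing tags must follow the same renaming as the opening tags.
    END_TAGS = {"center": "div", "font": "span"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "center":
            self.out.append('<div style="text-align:center;">')
        elif tag == "font":
            given = dict(attrs)
            css = []
            if given.get("color"):
                css.append("color:%s;" % given["color"])
            if given.get("face"):
                css.append("font-family:%s;" % given["face"])
            self.out.append('<span style="%s">' % escape("".join(css), quote=True))
        else:
            # Every other tag is passed through with its attributes re-escaped.
            rendered = "".join(
                ' %s="%s"' % (name, escape(value or "", quote=True))
                for name, value in attrs
            )
            self.out.append("<%s%s>" % (tag, rendered))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % self.END_TAGS.get(tag, tag))

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def rewrite_deprecated(html_text):
    parser = DeprecatedTagRewriter()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)
```

Feeding it '<center><font color="red">hi</font></center>' gives
'<div style="text-align:center;"><span style="color:red;">hi</span></div>'.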
