Re: Closing tags within <script></script>

Posted by Benjamin Niemann on 10/13/06 17:49

Hello,

Jeffrey wrote:

> I've found an oddity with HTML/Javascript that I'm hoping someone on
> this list could shed some light on for me. This arose when I was using
> the libxml parser to parse some HTML web pages.

libxml is correct (too correct for this kind of use); these and other
websites are not.

Since you obviously cannot fix documents that are not your own, and far too
many documents on the web are malformed, invalid or simply a heap of s**t,
using a strict parser like libxml is not a wise decision.
There are special parsers built to deal with such 'tag-soup' documents,
e.g. 'Beautiful Soup' for Python
<http://www.crummy.com/software/BeautifulSoup/>.
There may be similar packages for the language of your choice (if it does
not happen to be Python).
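
To illustrate the difference between a strict and a forgiving parser, here
is a minimal sketch that uses only the Python standard library. It is not
Beautiful Soup and not libxml: xml.etree.ElementTree stands in for "a strict
parser" and html.parser for "a tag-soup tolerant one". The sample markup and
the names SOUP, TagCollector, strict_parse_ok and lenient_parse are made up
for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical tag-soup input: an unclosed <p>, plus a closing tag
# inside a string within <script>.
SOUP = (
    '<html><body><p>Hello'
    '<script>document.write("<b>hi</b>");</script>'
    '</body></html>'
)


class TagCollector(HTMLParser):
    """Records start tags and the raw text found inside <script>."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.script_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks; collect them all.
        if self._in_script:
            self.script_text.append(data)


def strict_parse_ok(markup):
    """Return True if the strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def lenient_parse(markup):
    """Return (start tags, script text) without failing on tag soup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags, "".join(collector.script_text)


if __name__ == "__main__":
    print("strict parser accepts input:", strict_parse_ok(SOUP))
    tags, script = lenient_parse(SOUP)
    print("start tags:", tags)
    print("script text:", script)
```

The strict parser gives up at the first well-formedness error, while the
lenient one keeps going and treats the content of <script> as plain text.
Beautiful Soup goes further and builds a navigable tree from such input.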

HTH

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://pink.odahoda.de/
