Re: htmlentities & charencoding — PHP Programming Language

Posted by Andy Hassall on 07/11/06 22:08

On Tue, 11 Jul 2006 17:36:20 -0400, Jerry Stuckle <jstucklex@attglobal.net>
wrote:

>Mel wrote:
>> On 2006-07-11 21:52:53 +1000, Jerry Stuckle <jstucklex@attglobal.net> said:
>>
>>> The HTML validator on w3.org is decent, but it doesn't handle
>>> javascript very well. I just ignore the errors in javascript; for
>>> instance, something like:
>>>
>>> j=4&i;
>>>
>>> The "&i" is not a valid html entity - but it's valid javascript code.
>>> And this javascript wouldn't work:
>>>
>>> j = 4&amp;i;
>>
>> No, it wouldn't, but valid XHTML _requires_ you to precede the embedded
>> JavaScript with the appropriate CDATA marker. The character '&' is
>> reserved by the markup just like '>' and '<'. Not adhering to the
>> outlined standards simply encourages bad markup and makes cross-browser
>> compatibility more difficult. It's a big stretch to equate cross-browser
>> issues with unencoded ampersands, but it's not that difficult to deal
>> with. Javascript has some functional string methods for encoding HTML
>> entities.
>
>Who said anything about XHTML? This is straight html.
>
>And the point is - this is valid javascript, but the validator on w3.org
>doesn't recognize it as such. Therefore it spits out errors where there
>are none.

Yes, this seems to be backed up by HTML 4.01 appendix B.3.2, which even has an
example of the contents of a <script> element in VBScript using & as a string
concatenation operator.

http://www.w3.org/TR/html4/appendix/notes.html#notes-specifying-data

It discusses how to avoid accidentally closing the <script> element, but seems
to indicate that & doesn't start a character reference inside <script>, as
that's automatically CDATA. So validators producing errors in this case would
appear to be wrong.
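
To make the two points above concrete, here is a minimal sketch. It assumes a hypothetical helper, escapeEndTagOpen (my own name, not from the spec or any library), that breaks up every "</" in script source before it is placed inside a <script> element, which is the approach B.3.2 suggests for JavaScript. It also shows that & in the example is simply the bitwise AND operator.

```javascript
// Hypothetical helper (an assumption for illustration, not a library API):
// break up every "</" so the script text cannot end the <script> element early.
// Inside a JavaScript string literal, "<\/" evaluates to the same text as "</".
function escapeEndTagOpen(source) {
  return source.replace(/<\//g, "<\\/");
}

// "&" here is JavaScript's bitwise AND, not the start of a character reference.
const i = 6;
const j = 4 & i; // binary 100 AND 110 gives 100

// Script source containing "</", which HTML 4.01 treats as ending the element.
const unsafe = 'document.write("<b>hi</b>");';
const safe = escapeEndTagOpen(unsafe);

console.log(j);
console.log(safe);
```

The & needs no such treatment inside <script> in HTML 4.01; only the "</" sequence does.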

However, validator.w3.org currently handles the example given without error. I
uploaded the following:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict //EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15">
<title>Page</title>
</head>
<body>

<script type="text/javascript">
j=4&i;
</script>

</body>
</html>

It responded:

This Page Is Valid -//W3C//DTD HTML 4.01 Strict //EN!

(It also validates as Transitional, unsurprisingly.) Has its behaviour changed
recently? Did it use to produce errors in this case?

The "HTML Tidy" validator as used in the HTML Validator Firefox extension also
accepts & within <script> without complaint, and correctly complains about "</"
appearing in the script source.

--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
