Posted by Jerry Stuckle on 07/11/06 11:52
Taras_96 wrote:
> Hi all,
>
> I was hoping to get some clarification on a couple of questions I have:
>
> 1) When should htmlspecialchars be used? As a general rule should
> it be used for text that may contain special characters that is going
> to be rendered in the browser (ie: text that isn't in tags)? I've got a
> javascript onclick handler whose code includes an ampersand and the
> HTML validator complains. I don't know if I should escape the
> ampersand, or even if it's possible (seeing that the text is inside an
> HTML attribute).
>
Well, I haven't looked at the code, but I suspect htmlspecialchars()
would be faster, since it converts fewer characters and has fewer
options.
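The HTML validator on w3.org is decent, but it doesn't handle javascript
very well. I just ignore the errors in javascript; for instance,
something like:
j=4&i;
The "&i" is not a valid html entity - but it's valid javascript code.
And this javascript wouldn't work:
j = 4&amp;i;
At least not inside an HTML <script> block or an external .js file.
Inside an attribute such as onclick the browser decodes entities before
it runs the code, so there you can (and should) write the ampersand as
&amp;.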
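For the onclick case specifically, here is a rough sketch of letting PHP do the escaping. The handler code and the link markup are made up for illustration; the only point is that attribute values are entity-decoded by the browser, so the ampersand can be escaped there without breaking the script.

```php
<?php
// Hypothetical handler code containing a bare ampersand (bitwise AND).
$js = 'j = 4 & i;';

// Escape the code for use inside a double-quoted HTML attribute.
$attr = htmlspecialchars($js, ENT_QUOTES, 'UTF-8');

echo '<a href="#" onclick="' . $attr . '">click</a>', "\n";
// prints: <a href="#" onclick="j = 4 &amp; i;">click</a>
```

The browser turns &amp; back into & before it hands the attribute value to the javascript engine, and the validator no longer sees a stray "&".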
> Why would you ever use htmlentities as opposed to htmlspecialchars? The
> only reason I can think of is if your page's charset doesn't support
> the special character you're trying to render (for example, the euro
> using Latin1), but then why wouldn't you just change the page's charset
> to UTF-8 (unless your editor can't save in UTF-8, which might
> indicate it's time to get another editor). The comment on the PHP manual
> entry for htmlentities, 'Please, don't use htmlentities to avoid XSS!
> Htmlspecialchars is enough!' seems to suggest that the uses for
> htmlentities are limited, since it needn't be used to avoid XSS.
>
Just changing the page charset doesn't change what PHP uses. You can
pass a charset to either function, but if you need more than the five
chars handled by htmlspecialchars() (&, ", ', < and >, with the single
quote converted only when ENT_QUOTES is set) you need to use
htmlentities().
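A quick way to see the difference between the two functions is to run both on the same strings. This is only a sketch: the sample strings are invented, the "\u{...}" escapes assume PHP 7 or later, and I'm passing ENT_QUOTES and 'UTF-8' explicitly rather than relying on defaults, which have changed between PHP versions.

```php
<?php
$markup = '<a href="x">Tom & Jerry\'s</a>';
$accent = "caf\u{00E9} \u{20AC}5"; // e acute and euro sign, as UTF-8

// htmlspecialchars() touches only the five markup characters.
$special = htmlspecialchars($markup, ENT_QUOTES, 'UTF-8');
echo $special, "\n";
// prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;

// It leaves accented characters and the euro sign alone.
$same = htmlspecialchars($accent, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every character that has a named entity.
$named = htmlentities($accent, ENT_QUOTES, 'UTF-8');
echo $named, "\n";
// prints: caf&eacute; &euro;5
```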
And the notes are comments - from users, not the PHP developers. I give
it some credence, but not as much as the "official" word from the PHP
developers. And if you look through them enough, you'll find errors and
other people who get in and correct the errors. Not that much different
than what you find here on usenet.
> 2) A comment in the PHP manual entry for htmlentities states that their
> function can be used to 'replace any characters in a string that could
> be 'dangerous' to put in an HTML/XML file with their numeric entities
> (e.g. &#233; for [e acute])'. Why would it be dangerous!?
>
Don't know here, but I suspect browsers may act differently in different
languages. But I have enough trouble with my native language, so I
really haven't worried about it. But again that's a user comment.
> 3) What are some typical uses of specifying HTTP input/output character
> encoding? If it is used to convert output, why wouldn't you just change
> the output page's char encoding? If it's used to convert input from say
> UTF-8 to Latin1, couldn't you just use a function to do this?
>
I use these functions anytime I'm displaying data input by the user,
read from a database, etc. You never know when the data might contain a
'<', a '"', etc.
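In practice that usually ends up as a small wrapper that every piece of outside data goes through on its way into the page. The helper name e() and the sample comment below are my own inventions for illustration, not anything from the PHP manual.

```php
<?php
// Hypothetical helper: escape a value for HTML text or a quoted attribute.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Stand-in for data that came from a form or a database row.
$comment = '<script>alert("hi")</script>';

$html = '<p>' . e($comment) . '</p>';
echo $html, "\n";
// prints: <p>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;</p>
```

The browser then shows the text of the comment instead of running it.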
Changing the char encoding for the page doesn't convert any characters.
All it does is tell the browser how to handle the characters. It's up
to you, the programmer, to ensure the character encoding you use matches
that of the page.
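To make the distinction concrete, here is a small sketch, assuming the iconv extension is available (it normally is). The sample string is invented. Declaring the charset only labels the bytes; converting them is a separate step.

```php
<?php
// Declaring the charset only labels the output; it converts nothing.
// (Has no visible effect when run from the command line.)
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$latin1 = "caf\xE9"; // "cafe" with e acute, as ISO-8859-1 bytes

// The actual conversion has to be done explicitly.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($latin1), "\n"; // prints: 636166e9
echo bin2hex($utf8), "\n";   // prints: 636166c3a9
```

Sending $latin1 under a UTF-8 header would give the browser an invalid byte sequence; sending $utf8 gives it what the header promised.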
> That's about it!
>
> Thanks in advance
>
> Taras
>
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================