Posted by Gleep on 12/16/85 11:57
On Mon, 04 Sep 2006 11:24:04 +0200, Wim Cossement <wcosseme@nospam.bcol.be> wrote:
>Hello,
>
>I was wondering if there are a few good pages and/or examples on how to
>process form data correctly for putting it in a MySQL DB.
>
>Since I'm not used to using PHP a lot, I already found out that
>addslashes() can be used to escape some characters, but I'm having some
>more problems with for instance ä, å and µ (since the text is scientific)
>Now some people also throw in htmlspecialchars() to convert those to
>HTML entities, but some nest htmlspecialchars() in addslashes() and
>others do the opposite.
>
>Is there a good and error proof way of ensuring that what one puts in a
>textarea gets stored and can be retrieved safe and sound?
>
>Thanks in advance,
>
>Wimmy
I found some user comments in the PHP manual under htmlspecialchars() that I think might help.
Also, if you need to save special characters, I suggest turning off magic quotes with
set_magic_quotes_runtime(0); that suppresses the backslashes it normally adds.
After inspecting the non-native encoding problem, I noticed that if, for example, the encoding is
Cyrillic and I write Latin characters that are not part of that encoding (æ, the ae-ligature, for
example), the browser will send the named entity, &aelig; in this case.
Therefore, the only way I see to display multilingual text that is encoded with entities is:
<?php
echo str_replace('&amp;', '&', htmlspecialchars($txt));
?>
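To see what that one-liner does to a mixed string, here is a minimal sketch. The function name escape_keep_entities() and the sample input are made up for illustration, and the explicit ENT_QUOTES and 'UTF-8' arguments are an assumption about the page encoding. Note the side effect: a literal & typed by the user also ends up as a bare & in the output.

```php
<?php
// Hypothetical helper: escape markup, then undo the escaping of ampersands
// so entities the browser sent (e.g. &aelig;) still render as characters.
function escape_keep_entities($txt)
{
    // Assumption: the page is UTF-8; adjust the charset to match yours.
    $escaped = htmlspecialchars($txt, ENT_QUOTES, 'UTF-8');
    return str_replace('&amp;', '&', $escaped);
}

echo escape_keep_entities('<b>&aelig;</b> & more'), "\n";
// prints: &lt;b&gt;&aelig;&lt;/b&gt; & more
```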
The regex for numeric entities will skip the Latin-1 textual entities.
A sample function, if anybody wants to turn HTML entities (and special characters) back into plain
characters (e.g. "&egrave;", "&lt;", etc.):
function html2specialchars($str){
$trans_table = array_flip(get_html_translation_table(HTML_ENTITIES));
return strtr($str, $trans_table);
}
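A short usage sketch of the same idea, assuming UTF-8 input and output (the function name entities_to_chars() and the sample string are illustrative only). The built-in html_entity_decode() is shown next to it for comparison; unlike the table lookup, it also decodes numeric entities.

```php
<?php
// Same idea as html2specialchars() above, with the table built for UTF-8.
// Assumption: PHP 5.3.4 or later, which accepts the charset argument.
function entities_to_chars($str)
{
    $table = array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8'));
    return strtr($str, $table);
}

$sample = '&egrave; &lt;b&gt;';
echo entities_to_chars($sample), "\n";                       // è <b>
echo html_entity_decode($sample, ENT_QUOTES, 'UTF-8'), "\n"; // è <b>
```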
Quite often, on HTML pages that are not encoded as UTF-8, when people write in a non-native
encoding, some browsers (Internet Explorer for sure) will send the out-of-charset characters as
HTML entities, such as &#1073; for the small Russian 'b'.
htmlspecialchars() will break this entity, since it changes every & to &amp;.
What I usually do is either turn &amp; back into & so the correct characters appear in the
output, or use a regex to restore only the numeric character entities to their original form:
<?php
// treat this as pseudo-code, it hasn't been tested...
$result = preg_replace('/&amp;#(x[a-f0-9]+|[0-9]+);/i', '&#$1;', $source);
?>
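A runnable sketch of that regex approach, assuming the pattern is applied to the output of htmlspecialchars() (the function name escape_keep_numeric_entities() and the sample input are hypothetical):

```php
<?php
// Hypothetical helper: escape everything, then restore numeric entities only.
function escape_keep_numeric_entities($source)
{
    // Assumption: UTF-8 page; ENT_QUOTES also escapes single quotes.
    $escaped = htmlspecialchars($source, ENT_QUOTES, 'UTF-8');
    // "&amp;#1073;" or "&amp;#x431;" becomes "&#1073;" / "&#x431;" again.
    return preg_replace('/&amp;#(x[a-f0-9]+|[0-9]+);/i', '&#$1;', $escaped);
}

echo escape_keep_numeric_entities('&#1073; & <i>'), "\n";
// prints: &#1073; &amp; &lt;i&gt;
```

On PHP 5.2.3 and later, htmlspecialchars() also takes a fourth argument, double_encode; passing false leaves existing entities (named ones included) untouched, which may make the regex unnecessary.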
Why &#039;? The HTML and XML DTDs proposed &apos; for this.
See http://www.w3.org/TR/html/dtds.html#a_dtd_Special_characters
So better use this:
$text = htmlspecialchars($text, ENT_QUOTES);
$text = preg_replace('/&#0*39;/', '&apos;', $text);
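Wrapped up as a small function, the two lines above behave roughly like this. The name escape_with_apos() and the sample string are illustrative, and the explicit ENT_HTML401 flag and 'UTF-8' charset are assumptions (the flag needs PHP 5.4 or later):

```php
<?php
// Hypothetical helper: escape with ENT_QUOTES, then swap the numeric
// apostrophe entity for the named one.
function escape_with_apos($text)
{
    // With the HTML 4.01 flag, a single quote becomes &#039;.
    $text = htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    return preg_replace('/&#0*39;/', '&apos;', $text);
}

echo escape_with_apos("it's <ok>"), "\n";
// prints: it&apos;s &lt;ok&gt;
```

Since PHP 5.4, combining ENT_QUOTES with ENT_XML1, ENT_XHTML or ENT_HTML5 makes htmlspecialchars() emit &apos; directly, so the preg_replace() step may not be needed there.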