Posted by deko on 02/12/07 09:58
>>>> As I understand it, the characters that make up an Internet domain name
>>>> can consist of only alpha-numeric characters and a hyphen
>>>> (http://tools.ietf.org/html/rfc3696)
>>> ..."Any characters, or combination of bits (as octets), are permitted in
>>> DNS names. However, there is a preferred form that is required by most
>>> applications.".....
>>
>> I just tried registering various domain names with an underscore. The
>> registrar's system rejected it. While this may not be the best
>> verification, I have yet to see a valid Internet domain with an underscore
>> or any other non-alphanumeric character (other than a hyphen).
>
> There are efforts to fully internationalise DNS entries, so even non-roman
> based character sets are allowed. See for instance
> <http://www.ietf.org/rfc/rfc4185.txt>. We're not there yet by a long shot,
> but there's no doubt it will happen.
Eventually, I'm sure.
Getting back to my regex question, I wonder if it would be better to check for
illegal characters:
// one character class instead of alternation: $ ^ * ( ) are literal inside [...]
if (preg_match('/[`~!@#$%^&*()_+=\[\]{}|;:\'"<>?]/', $url_a['host'])) ???
I'm not having much luck catching invalid hostnames otherwise...
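One alternative to listing illegal characters is to go the other way and accept only what the preferred form allows (letters, digits and hyphens, per the RFC 3696 link above). The following is a rough sketch, not a definitive validator: the function name is_valid_hostname is made up for illustration, the 253-character overall limit and the 63-character label limit are assumptions about the usual textual limits, and it does not attempt internationalised names or the rule that a top-level label should not be all digits.

```php
<?php
// Hypothetical helper (not from the original post): whitelist check for the
// "preferred form" of a hostname, i.e. letters, digits and hyphens only.
function is_valid_hostname(string $host): bool
{
    // Assumed overall limit for the dotted textual form of a name.
    if ($host === '' || strlen($host) > 253) {
        return false;
    }

    // One label: 1 to 63 characters, alphanumeric at both ends,
    // hyphens permitted only in the middle.
    $label = '[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?';

    // One or more labels separated by single dots.
    // i = case-insensitive, D = "$" matches only at the very end of the string.
    $pattern = '/^' . $label . '(?:\.' . $label . ')*$/Di';

    return preg_match($pattern, $host) === 1;
}

var_dump(is_valid_hostname('example.com'));          // bool(true)
var_dump(is_valid_hostname('my_host.example.com'));  // bool(false)
var_dump(is_valid_hostname('-bad.example.com'));     // bool(false)
```

With something like this, the check in the post would read if (!is_valid_hostname($url_a['host'])) and anything outside the allowed set is rejected without having to enumerate and escape every punctuation character.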