Posted by Steve on 01/24/07 23:16
"Taras_96" <taras.di@gmail.com> wrote in message
news:1169556359.958468.157380@s48g2000cws.googlegroups.com...
| Hi everyone,
|
| I'm lost as to the meaning of a function being 'binary-safe'. The
| Wikipedia entry states that:
|
| "essentially one that treats its input as a raw stream of data without
| any specific format" and that
|
| "Most functions using any special or markup characters, such as escape
| codes or those that expect null-terminated strings are not binary
| safe".
first, wiki is not what i'd call an entirely accurate resource. most of the
time the answers are very good, but those with no reference point in the
subject matter have no way to discern the bad. second, languages handle
'special', 'markup', 'escape codes', and 'null-terminated strings'
differently. that definition is a general statement at best, more of a
'for instance'. perhaps the author should have further qualified his
definition with examples from a specific language. anyway...
| This to me implies that the function need not know what the bytes
| represent, it operates on the data as a raw byte stream.
that's correct.
| If this is the
| case, I can't see how strpos would be binary-safe (it is documented as
| such).
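just like c (and other c-ish languages), php sees a string as an array of
bytes. unlike c, though, php stores the length alongside the bytes, so it
doesn't rely on a null-terminator to know where the array ends. a \0 in the
middle is just another byte. so...it is binary-safe.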
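a minimal sketch of what 'array of bytes' means in practice, assuming a
made-up five-byte string with a NUL in the middle. the variable names are
only for illustration:

```php
<?php
// hypothetical sample data: five bytes with a NUL byte in the middle
$haystack = "ab\0cd";

$length = strlen($haystack);        // counts every byte, NUL included
$offset = strpos($haystack, "cd");  // keeps searching past the NUL byte

var_dump($length);  // int(5)
var_dump($offset);  // int(3)
```

if the NUL were treated as an end marker, the length would come out as 2 and
the search for "cd" would fail.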
| For example, by calling 'strpos('a','cat')', surely the function
| itself must know that the strings 'a' and 'cat' are null terminated,
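php knows how php stores strings, so the function never has to go looking
for a terminator. it is binary-safe since it knows from the stored lengths
where to start and stop looking for the 'needle' in the 'haystack'. both
arguments are treated as arrays of bytes. (side note: strpos takes the
haystack first, so you'd want strpos('cat', 'a').)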
| and thus according to the wikipedia definition (since special
| significance is given to the null character) the function is not binary
| safe.
fuck wiki. it's usually close but is not a reliable source of information.
| To further illustrate my point, what would happen if the strings
| that were passed in were encoded in UCS-2, where the string 'a' would
| be encoded as '00 61' (assuming big endian encoding) - wouldn't strpos
| recognise the '00' byte as an ASCII null character, and conclude that
| the first parameter was simply an empty string?
then i suppose you'd decode them from ucs-2 into a string, at which point
php would know what it was dealing with. what if the incoming was a base 64
encoded string? i suppose you decode it and, voilà. either way, php
consistently sees these strings as an array of bytes - whether encoded or
decoded.
strpos would see '00 61' as a two-byte string...the 00 is just a byte like
any other, not an end-of-string marker. what strpos won't know is that those
two bytes make up one character. this is where your en/decoding comes into
play.
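a rough sketch of the ucs-2 case, assuming the big-endian bytes are written
out by hand rather than produced by a conversion function. the variable
names are made up for the example:

```php
<?php
// 'a' in big-endian UCS-2 is the two bytes 00 61
$ucs2_a = "\x00\x61";

var_dump(strlen($ucs2_a));          // int(2): the 00 byte is counted
var_dump(strpos($ucs2_a, "\x61"));  // int(1): the 61 byte is found after the 00

// hypothetical haystack: 'cat' in big-endian UCS-2 (00 63 00 61 00 74)
$ucs2_cat = "\x00\x63\x00\x61\x00\x74";

// the needle 00 61 is matched byte for byte inside the haystack
$pos = strpos($ucs2_cat, $ucs2_a);
var_dump($pos);                     // int(2): a byte offset, not a character index
```

the result is in bytes, so the second character shows up at offset 2. getting
character positions is the part that needs decoding first.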
| This post:
|
http://groups.google.com/group/php.general/browse_thread/thread/c401d89a4a68e94b/e7820fe40145ac7b?lnk=st&q=what+is+%22binary+safe%22&rnum=4&hl=en#e7820fe40145ac7b
| offers a different definition, which doesn't make much sense to me.
not different at all. it doesn't seem that either source has made much
sense to you (not trying to be rude).
| To add to the confusion, other websites suggest that 'binary-safe'
| simply implies that the function can operate on the given data without
| altering it.
why is that confusing? it's correct in this context, though 'binary-safe'
also has other meanings in programming.
| What exactly is binary-safe in the PHP context? Could examples of
| binary-safe and non binary-safe functions be given?
fairly close to what wiki states...just not stated clearly enough for you to
follow. you've given yourself the first binary-safe example, strpos. just
google for others...or look at the other php string functions and see which
are, and which are not, safe...then, given what each does, see if you can
infer why one is and another isn't.
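to make that concrete, a rough sketch of one safe/unsafe pair. it assumes
strcmp as the binary-safe one and strcoll as the one that isn't (the php
manual describes strcoll as not binary safe). the sample strings and variable
names are made up:

```php
<?php
// hypothetical sample data: the strings differ only after the NUL byte
$left  = "a\0b";
$right = "a\0c";

// strcmp compares all three bytes of each string
$safe = strcmp($left, $right);

// strcoll hands the bytes to the C library's strcoll, which stops at the NUL
$unsafe = strcoll($left, $right);

var_dump($safe < 0);  // bool(true): 'b' sorts before 'c'
var_dump($unsafe);    // int(0): both strings look like just "a"
```

same input, two answers. the only difference is whether the function trusts
the stored length or goes hunting for a terminator.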