Posted by Taras_96 on 01/25/07 04:14
>
> | This to me implies that the function need not know what the bytes
> | represent, it operates on the data as a raw byte stream.
>
> that's correct.
>
Thus, by this definition, wouldn't strpos NOT be binary safe, since it
needs to know something about what is represented by the raw byte
stream? In particular, that the byte 0x00 represents the end of a
string.
Going back to my example, say we call strpos('cat', 'a') with the
strings encoded in UCS-2 (big-endian).
So, in terms of bytes, strpos would be passed 0x00 0x61 as the needle
'a'. Because the function imposes some meaning on specific bytes,
in particular 0x00, the function would conclude that the needle
was an empty string. Strpos can't blindly operate on the
bytes it receives; it must interpret them to find the end of strings.
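
To make the byte-level concern concrete, here is a minimal sketch in Python (not PHP, and not a claim about how strpos is actually implemented). It assumes big-endian UCS-2, which matches UTF-16BE for these characters, and `c_style_length` is a hypothetical helper that mimics a routine treating 0x00 as a terminator:

```python
# 'a' and 'cat' as big-endian UCS-2 bytes (same as UTF-16BE for these characters).
needle = "a".encode("utf-16-be")       # bytes 0x00 0x61
haystack = "cat".encode("utf-16-be")   # bytes 0x00 0x63 0x00 0x61 0x00 0x74


def c_style_length(data: bytes) -> int:
    """Hypothetical helper: length as seen by a routine that stops at the first 0x00."""
    end = data.find(b"\x00")
    return len(data) if end == -1 else end


# A NUL-terminated view sees an empty string, because the first byte is 0x00.
nul_terminated_view = c_style_length(needle)

# A length-aware view keeps both bytes and can search without interpreting them.
length_aware_view = len(needle)
position = haystack.find(needle)
```

Whether a given function behaves like the first view or the second depends on whether it receives an explicit length along with the bytes, which is the question being asked here.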
Compare this with, say, an array_join function, where the two parameters
just need to be melded together - no interpretation of the input byte
sequences is needed whatsoever, you just need to join the two
together! This to me seems a more correct view of operating on the byte
stream.
> strpos would recognize '00' as two characters of a string...not as one
> individual byte equal to \0. this is where your en/decoding comes into play.
I would have thought if you passed in the byte sequence '0x00610000' (the
null-terminated string 'a' encoded in UCS-2) strpos would *not*
recognise the first byte as the characters 00 - those would be encoded
as (well, in ASCII anyway) '0x3030'.
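
A small Python sketch of the distinction, assuming ASCII for the two-character text and big-endian UCS-2 (equivalent to UTF-16BE here) for the null-terminated 'a'; the variable names are just for illustration:

```python
# The two-character text "00" in ASCII: each '0' is the byte 0x30.
text_zeros = "00".encode("ascii")

# The string 'a' followed by a NUL character, in big-endian UCS-2:
# 0x00 0x61 for 'a', then 0x00 0x00 for the terminator.
nul_terminated_a = "a\x00".encode("utf-16-be")

# The first byte of the UCS-2 data is a real zero byte, not the digit '0'.
first_byte_is_zero = nul_terminated_a[0] == 0
first_byte_is_digit = nul_terminated_a[0] == text_zeros[0]
```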
>
> | This post:
> |http://groups.google.com/group/php.general/browse_thread/thread/c401d...
> | offers a different definition, which doesn't make much sense to me.
>
> not different at all. it doesn't seem that either sources have made much
> sense to you (not trying to be rude).
>
No offense taken - if they made much sense I wouldn't be posting.
Taras