|
Posted by Chung Leong on 01/27/06 02:23
ralphie wrote:
> i see.. eventhough i'd rather have proper utf8 in the database what?
> whatever somethimes you have to cut down your expectation to reality :-)
Ok, after playing around with this a bit more I have somewhat of a
solution. It's a ugly, nasty hack and quite limited. Basically, since
the mssql extension won't return the data as UTF-8, we will ask MSSQL
to return the text as binary, and then we convert it to UTF-8 with
iconv.
To insert some text, we first convert from UTF-8 to UCS-2, little
endian:
$ucs2 = iconv("UTF-8", "UCS-2LE", $utf8);
Then we put it into binary representation:
$a = unpack('H*hex', $s);
$hex = '0x' . $a['hex'];
Now stick that into a INSERT statement. MSSQL will do an implicit
conversion to nvarchar if that's the type of the column:
mssql_query("INSERT INTO table (nvarchar_col) VALUES($hex)");
Retrieval is a bit tricky, as the mssql extension is too dumb to get
varbinary bigger than 255. It handles image columns with no problem
however. So what we do is cast from nvarchar to varbinary, then from
there to image:
mssql_query("SELECT CAST(CAST(nvarchar_col AS varbinary(8000)) AS
image) AS col FROM table");
The data comes out in UCS-2, so we have to convert it to UTF-8:
$utf8 = iconv("UCS-2LE", "UTF-8", $ucs2);
It's not a great solution but seems to work in my tests. Conversion is
only possible between nvarchar and varbinary. You can't cast a ntext
column to image. So you're limited to nvarchar's 4000 maximum size.
Navigation:
[Reply to this message]
|