|
Posted by Erland Sommarskog on 04/01/07 10:00
pinney.colton@gmail.com (pinney.colton@gmail.com) writes:
> I have a SQL 2005 database that's created by a survey data collection
> system. Users of this system are fairly non-technical and have little
> to no conscious control over the datatypes. As a result, the data is
> mostly stored quite inefficiently as varchars. For example, there is
> data that could be stored in a column of bits and it's stored as a
> varchar value of 0 or 1. (Yuck, I know.)
Well, obvious varchar is not right, but it is not that much of a disaster,
although there is a two-byte overhead for each value. What is more important
is that there is a constraint that disallows any other values. But I
can guess from the background, that there isn't. :-)
More generally, there is not really an agreement in the SQL community how
to store boolean values. Bit is probably the most commonly used data type,
and I am that camp myself. But others advocate things like:
iscompleted char(1) NOT NULL
CONSTRAINT ckc_iscompleted CHECK (iscompleted IN ('Y', 'N'))
Others use T or F for true or false. Or J or N as we did in an older
incarnation of our system. (J from "ja", the Swedish word for "yes".) So
why not 0 or 1?
What I am getting at here is that while the varchar thing is not optimal,
it may not be broken enough to warrant a change. It may be sufficient to
add a constraint to ascertain that there are no other values.
> I am building a reporting system using this raw data and have a new
> table structure designed that is much more efficient (and better for
> reporting). Does anyone have any suggestions for getting this data
> into my new structure? Specifically, how would you recommend checking
> that varchar field and determining it could be stored as a bit?
SELECT varcharcol, COUNT(*) FROM tbl GROUP BY varcharcol
should give you an indication of what really is in that column. Say that you
find something like:
0 1234
1 9802
Y 23
N 12
X 1
Then you could convert to bit in this way:
SELECT CASE WHEN varcharcol IN ('0', 'N') THEN convert(bit, 0)
WHEN varcharcol IN ('1', 'Y') THEN convert(bit, 1)
END
Question remains what you should do with that X column. Make the bit
column nullable and store it as NULL? Just drop that row on the floor?
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
[Back to original message]
|