|
Posted by Erland Sommarskog on 03/13/06 13:55
David Portas (REMOVE_BEFORE_REPLYING_dportas@acm.org) writes:
> If you want to record more than 3ms precision then you'll have to use a
> non-DATETIME datatype. IDENTITY won't help you do that. If you don't
> need that level of precision then you can just increment the
> num_of_hits for each hit within the same 3ms timeframe. Again, IDENTITY
> doesn't help you. Either way my design is perfectly sound.
No, it isn't. Using no_of_hits is a kludge, and also a performance problem,
because instead of a plain insert, you need to do an IF EXISTS and then
INSERT or UPDATE. Similarly, if you want count the number of hits, you
need to sum no_of_hits, rather than doing count(*) which is likely to
give the optimizer fewer choices for an effective query plan.
The problem is fairly apparent here, because SQL Server has a fairly
low resolution on time. But the problem is not the resolution, the
problem is that time is a contiguous entity that does not have discreet
values. Using time for as a primary key has the same problem as using a
floating-point as a primary key.
On the other hand, IDENTITY is a discrete set of value that is easy
to work with. After all, that is all we know. This was hit 56762 that
was registered. It happens to have the same values in the database as
hit 56763, but we don't know if the values they model were the same or
not. Thus, it would be incorrect to handle them as being the same. The
only thing we can to discern them, is to add a number to the observations,
so that we know that they are distinct.
> Unfortunately these types examples just tend to become rather tedious
> "what if this..." and "what if that..." exchanges. As a result they
> aren't very informative. Fundamentally, my observation is simply this:
> that there is no information in duplicate data that cannot also be
> modelled in relational form (i.e. with keys).
Yes, these exchanges are tedious, when someone insists on putting the cart
before the horse, and tries to press a circular reality into a squared
model.
no_of_hits here is perfect example of this. It adds no information to
the data, but only serves to make the data more difficult to work with.
Certainly, it makes the relational purists sleep better at night, but
that is usually not what the customer is paying for.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
[Back to original message]
|