|
Posted by Hugo Kornelis on 11/02/06 23:55
On 1 Nov 2006 15:11:58 -0800, --CELKO-- wrote:
>>> So I'll join in the challenge - how would your constraint look like if it _has_ to work on MS SQL Server 2005? <<
>
>This grep checks email format against RFC 3696 and was written by David
>Thompson
>
>[a-z0-9!$'*+\-_]+(\.[a-z0-9!$'*+\-_]+)*
>@([a-z0-9]+(-+[a-z0-9]+)*\.)+
>([a-z]{2}aero|arpa|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|travel)
Hi Joe,
The requirement was actually to check a URL, not an email address. But
I'd be just as happy with a working CHECK constraint for well-formed
email address - the actual issue here is that I believe that there ARE
things that are better handled in the CLR than in ANSI SQL.
Oh, and if you ever have to build a REAL email address validator, please
find a better one than the one above - David Thompson appears to be
unaware that people outside of the US also have mail now, since he
disallows all non-US top-level domain names.
>Let's translate it. The SQL Server LIKE predicate is missing the
>Kleene plus + and Kleene star * repetition tokens. We can fake those
>with a table of repetitions and a JOIN. The * table has an empty
>string in it in addition to repetitions of the character search
>pattern.
>
>CREATE TABLE Symbols
>(char_cnt INTEGER NOT NULL PRIMARY KEY,
> token VARCHAR(255) NOT NULL);
>INSERT INTO Symbols VALUES (1, '[a-z0-9!$'*+-_]');
>INSERT INTO Symbols VALUES (2, '[a-z0-9!$'*+-_][a-z0-9!$'*+-_]');
>Etc.
>
>So to get a pattern for a string with a dot in it, you would use
>
>SELECT 'Valid'
> FROM Symbols AS A1, Symbols AS A2
> WHERE @target LIKE A1.token + '.' + A2.token
> AND LEN(@target) < char_cnt - 1;
So how would you incorporate this logic in a SQL Server CHECK constraint
which (as you undoubtebly know) is not allowed to access other tables or
even other rows from the same tables?
(snip)
>The final tuning trick is the CASE expression from Hell. Using the
>order of execution of the WHEN clauses, test for common simple errors
>and finish the monster like predicate.
>
>SELECT ..
> FROM ..
> WHERE CASE
> WHEN <@ count is not one> THEN 'F'
> WHEN <illegal char in target> THEN 'F'
> WHEN <invalid suffix> THEN 'F'
> Etc.
> WHEN <LIKE predicate from Hell> THEN 'T'
> ELSE 'F' END = 'T';
>
>Is this ugly? Well, I have seen more tables with longer rows in a lot
>of code posted here.
I don't care if it's ugly - it's not a constraint! How is this SELECT
going to prevent someone entering data that is not well-formed?
>Is it easy to understand, if you read SQL? I think so.
It will probably no lonnger be easy to understand once it's finished (if
you are able to get it working in a CHECK constraint at all!!). The
first parts (<@ count is not one> and <illegal char in target>) will
probably be quite straightforward, The next (<invalid suffix>) less so.
And the <LIKE predicate from Hell>? Well, if that refers to the SELECT
'Valid' (...) query from the start of your post, it's definitely not
easy. I consider myself to be quite good at understanding SQL, and I had
to look hard before I understood the trick. A less experienced SQL coder
would probably not get it.
(snip)
>I think this demonstates that it is possible and could be put into a
>Standard SQL procedure.
If standard SQL allows a subquery in a CHECK constraint, then you have
indeed demonstrated that it is possible in standard SQL. But the
question was to implement the constraint in SQL Server, using only
standard SQL (or rather, the subset of standard SQL that SQL Server
implements). I maintain that it is not possible.
>What do you do when the DB you are DBA-ing has the grep you like in
>your favorite CLR language, the grep fred likes in his favorite CLR
>language, and so forth for a dozen different developers? What if one
>grep is not like another grep? Look at how many versions of grep we
>have in Unix alone.
In such a case, I'd do the same as what I'd do if the DB I am DBA-ing
has table and column names in PascalCase, Fred likes to code in
camelCase and so forth for a dozen different developers. I would force
all developers to follow one standard (set by me), making sure that
management backs me on this. Developers who don't follow the standard
will very soon realise that I am the one who decides what does and what
doesn't get deployed on the production server, and that following the
standards is a good way to improve the chance of getting the code
deployed.
With CLR, it's the same. I get to review the code before I install the
assembly on the production server, so it better be in a language I am
able to read. And if an email or URL validator is already installed,
there's no chance that any second such validator will be added - if the
first doesn't work satisfactory, just file a change request but don't
duplicate code.
I really don't see the big deal. If you put the code to validate in a
SQL stored procedure or user-defined function, you'd also have to deal
with various developers who might code competing versions of the same
code.
--
Hugo Kornelis, SQL Server MVP
Navigation:
[Reply to this message]
|