| 
	
 | 
 Posted by Hugo Kornelis on 11/02/06 23:55 
On 1 Nov 2006 15:11:58 -0800, --CELKO-- wrote: 
 
>>> So I'll join in the challenge - how would your constraint look like if  it _has_ to work on MS SQL Server 2005? << 
> 
>This grep checks email format against RFC 3696 and was written by David 
>Thompson 
> 
>[a-z0-9!$'*+\-_]+(\.[a-z0-9!$'*+\-_]+)* 
>@([a-z0-9]+(-+[a-z0-9]+)*\.)+ 
>([a-z]{2}aero|arpa|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|travel) 
 
Hi Joe, 
 
The requirement was actually to check a URL, not an email address. But 
I'd be just as happy with a working CHECK constraint for well-formed 
email address - the actual issue here is that I believe that there ARE 
things that are better handled in the CLR than in ANSI SQL. 
 
Oh, and if you ever have to build a REAL email address validator, please 
find a better one than the one above - David Thompson appears to be 
unaware that people outside of the US also have mail now, since he 
disallows all non-US top-level domain names. 
 
>Let's translate it.  The SQL Server LIKE predicate is missing the 
>Kleene plus + and Kleene star * repetition tokens.  We can fake those 
>with a table of repetitions and a JOIN.  The * table has an empty 
>string in it in addition to repetitions of the character search 
>pattern. 
> 
>CREATE TABLE Symbols 
>(char_cnt INTEGER NOT NULL PRIMARY KEY, 
> token VARCHAR(255) NOT NULL); 
>INSERT INTO Symbols VALUES (1, '[a-z0-9!$'*+-_]'); 
>INSERT INTO Symbols VALUES (2, '[a-z0-9!$'*+-_][a-z0-9!$'*+-_]'); 
>Etc. 
> 
>So to get a pattern for a string with a dot in it, you would use 
> 
>SELECT 'Valid' 
>  FROM Symbols AS A1, Symbols AS A2 
> WHERE @target LIKE A1.token + '.' + A2.token 
>  AND LEN(@target) < char_cnt - 1; 
 
So how would you incorporate this logic in a SQL Server CHECK constraint 
which (as you undoubtebly know) is not allowed to access other tables or 
even other rows from the same tables? 
 
(snip) 
>The final tuning trick is the CASE expression from Hell.  Using the 
>order of execution of the WHEN clauses, test for common simple errors 
>and finish the monster like predicate. 
> 
>SELECT .. 
>  FROM .. 
> WHERE CASE 
>       WHEN <@ count is not one> THEN 'F' 
>       WHEN <illegal char in target> THEN 'F' 
>       WHEN <invalid suffix> THEN 'F' 
>       Etc. 
>       WHEN <LIKE predicate from Hell> THEN 'T' 
>       ELSE 'F' END = 'T'; 
> 
>Is this ugly?  Well, I have seen more tables with longer rows in a lot 
>of code posted here. 
 
I don't care if it's ugly - it's not a constraint! How is this SELECT 
going to prevent someone entering data that is not well-formed? 
 
>Is it easy to understand, if you read SQL?  I think so. 
 
It will probably no lonnger be easy to understand once it's finished (if 
you are able to get it working in a CHECK constraint at all!!). The 
first parts (<@ count is not one> and <illegal char in target>) will 
probably be quite straightforward, The next (<invalid suffix>) less so. 
And the <LIKE predicate from Hell>? Well, if that refers to the SELECT 
'Valid' (...) query from the start of your post, it's definitely not 
easy. I consider myself to be quite good at understanding SQL, and I had 
to look hard before I understood the trick. A less experienced SQL coder 
would probably not get it. 
 
(snip) 
>I think this demonstates that it is possible and could be put into a 
>Standard SQL procedure. 
 
If standard SQL allows a subquery in a CHECK constraint, then you have 
indeed demonstrated that it is possible in standard SQL. But the 
question was to implement the constraint in SQL Server, using only 
standard SQL (or rather, the subset of standard SQL that SQL Server 
implements). I maintain that it is not possible. 
 
>What do you do when the DB you are  DBA-ing has the grep you like in 
>your favorite CLR language, the grep fred likes in his favorite CLR 
>language, and so forth for a dozen different developers?  What if one 
>grep is not like another grep?  Look at how many versions of grep we 
>have in Unix alone. 
 
In such a case, I'd do the same as what I'd do if the DB I am DBA-ing 
has table and column names in PascalCase, Fred likes to code in 
camelCase and so forth for a dozen different developers. I would force 
all developers to follow one standard (set by me), making sure that 
management backs me on this. Developers who don't follow the standard 
will very soon realise that I am the one who decides what does and what 
doesn't get deployed on the production server, and that following the 
standards is a good way to improve the chance of getting the code 
deployed. 
 
With CLR, it's the same. I get to review the code before I install the 
assembly on the production server, so it better be in a language I am 
able to read. And if an email or URL validator is already installed, 
there's no chance that any second such validator will be added - if the 
first doesn't work satisfactory, just file a change request but don't 
duplicate code. 
 
I really don't see the big deal. If you put the code to validate in a 
SQL stored procedure or user-defined function, you'd also have to deal 
with various developers who might code competing versions of the same 
code. 
 
--  
Hugo Kornelis, SQL Server MVP
 
  
Navigation:
[Reply to this message] 
 |