|
Posted by rcamarda on 12/02/06 02:14
Celko, sometimes I cring when I see your replies.
1. I use an ETL tool, it creates the DDL, I just copied and pasted. My
extreme apologies.
2. I use an ETL tool, it handles the relationships and data validation.
I haven't put those in the tables and have relied on validation within
the tool.
3. This is a data warehouse, not an operational system. I build the
data for speedy reporting, not data entry.
4. Valid point, I need better hygen with alisases.
5. this might be weakness in my SQL skills. Example. the business keys
for the table im trying to update is ADMREP_ID, SESSION_ID and
SESSION_DAY. I concatinate them into a single field so I have a unique
key. I guess this would be my primary key.
6. Yes, it is a Y/N field. Tables that are slowly changing dimensions
have start_date, end_date, update_date and curr_ind. Y is the current
record as it is in the operational system.
7. This is an area that I would like to have more control within the
ETL tool. it looks at the data and determines what type to create. Of
course I could create the tables ahead of time, and not let the tool
create them for me.
8. the table I am trying to update is the summary table. Not sure what
you are meaning.
9. yes, if this were an operational database. This is a data warehouse
for reporting. I want to keep the on the fly computations to a minimum.
Also, the session_day is computed by taking the current date minus the
last day to add drop. This allows us to compare a given day session by
session. "How does day 0 for session 200703 compare to day 0 for
session 200603". Day 0 (last day to add or drop classes) varies year to
year and session to session.
10. This data is for a data warehouse. I have 3 databases, SOURCE,
STAGING and TARGET. Stating is as you describe, My admissions officer
dimension has the keys and captions, where the student table has the
admissions officer key, but not any information about the admissions
officer. When I build the final data in Target, I have a lot of
redundancy and pull in captions into the final tables.
11. Good point, i've ment to convert those to char.
12. Data warehouse, final data is de-normalized. (My vendor of our lead
managment system says the same thing, which I remind them that I am not
building data for data entry, but for reporting. This is also why I can
smoke their home grown reporting with my data and Cognos tools).
Thanks for the constructive suggestions Celko. I've learned a lot on
how I might change the UPdate to run faster...no wait a minute, you
just finished berating me and came just sort of calling me stupid...
Never mind.
--CELKO-- wrote:
> This is a mess. I would start over and get it right.
>
> 1) Why do you use all uppercase letters? It is a screaming bitch to
> read and we did this because we had no choice in the 1950's with
> punch cards.
> 2) Why do you have no keys in any table? Most columns are NULL-able, a
> sure sign that you have a rotten data model (if you have one at all).
> . In fact you have more NULLs than the entire database for a major
> automobile company.
> 3) Why are you storing computable totals in a table? That is how we
> did it with file systems; in SQL, we use a VIEW. Or we use a front-end
> report application.
> 4) Why did you pick the unreadable alphabetic aliases? That is, why
> does CC remind the reader that the base table is "D_Bi_Student"?
> 5) Why did you use a "_key" suffix? That tells us HOW an attribute
> is use, not what the data element IS. Also, think about how stupid
> "unique_key" is - are there any non-unique keys???
> 6) Is "curr_ind" actually a flag? We do not write with assembly
> level bit flags in SQL. It is redundant in a properly designed schema.
> The status is computed from the current state of the basic data.
> 7) Why did you use so many proprietary data types?
> 8) Why did you mix summary data and base data in the same table? Do
> you know what 1NF and the definition of a table are?
> 9) Having both "session_dt" and "session_day" is a bit
> redundant.
> 10) Why are you keeping captions in a table with the data? A caption
> is part of a display and display is done in the front end, never in the
> database.
> 11) Why are so many of your codes VARCHAR(n)? Most codes are fixed
> length.
> 12) It looks like you need to split up your tables and normalize the
> schema. Think about the basic entities with which you are working -
> students, candidates, admissions personnel, etc. and then once those
> are in tables, think about their relationships.
Navigation:
[Reply to this message]
|