|
Posted by HC on 02/05/07 05:23
On Feb 4, 10:48 pm, "HC" <hboo...@gte.net> wrote:
> On Feb 4, 5:38 pm, Erland Sommarskog <esq...@sommarskog.se> wrote:
>
> > HC (hboo...@gte.net) writes:
> > > select tMO.* from caredata.dbo.tbl_medications as tM inner join
> > > caredata_aa.dbo.tbl_medicationorders as tMO on tMO.medid = tM.pkey
> > > where tMO.resid = 43
>
> > > the join clause tMO.medid is matched against the primary key of the tM
> > > table. There is no index for tMO.medid, nor is there one for
> > > tMO.resid. Would it help to index them since the Clustered Index Seek
> > > is being performed against the tbl_Medications table?
>
> > There should be an index in resid. If the major part of the queries
> > against tbl_medicationorders are against resid, then it's probably a
> > good idea to cluster on this column. (You would then have to drop
> > the primary key, and the reapply it as nonclustered.)
>
> > > In my program I do a LEFT OUTER JOIN instead of an INNER JOIN, but the
> > > performance is the same: slow. So, I just left it at an INNER JOIN to
> > > try to keep things simple.
>
> > Well, since you only return data from tMO, this may be even better:
>
> > SELECT tMO.*
> > FROM caredata_aa.dbo.tbl_medicationorders tNO
> > WHERE tMO.resid = 43
> > AND EXISTS (SELECT *
> > FROM caredata.dbo.tbl_medications tM
> > WHERE tMO.medid = tM.pkey)
>
> > Not that it is likely to affect performance, but you would avoid
> > returning the same row twice from meditioncationorders.
>
> > > I ran the query as you asked, against both database engines and have
> > > posted them online. I was having problems with my regular website so
> > > I had to cobble one together for this purpose. It was done through a
> > > web-interface so I just UUE encoded a ZIP file of the text file with
> > > the results. Each db engine's results are in a different file. I
> > > included the query (as seen above), and the statistics. You can find
> > > them here:http://home.earthlink.net/~hboothe/On that lame page
> > > you'll see links on the left side for the pages dealing with each db
> > > engine.
>
> > I'm afraid that they appear as somewhat cryptic to me. I guess that I
> > could copy them into a text editor and try to decode them, but I'm
> > lazy. Couldn't you just upload the output as text files there.
>
> > > The 14 second process I refer to (that takes over 4 minutes on
> > > Express) is using the same tables. What I am trying to do is to pare
> > > this stuff down to the smallest piece that exhibits the unwanted
> > > behavior so that the situation is not obfuscated unnecessarily. I
> > > think that once the problem with the little query is fixed, the
> > > problem with the big query will be fixed. The process that takes 14
> > > seconds on these little tables in MSDE and over 4 minutes in Express
> > > is nasty. I have multiple databases. One database is the parent of
> > > the whole system, containing information that is relevant across each
> > > of the other databases. You may notice in the query I've posted that
> > > the tMO table is from one db (caredata_aa.dbo.tbl_MedicationOrders)
> > > and tM is from another (caredata.dbo.tbl_Medications). My program
> > > (system) can support many of the _aa databases (_aa, _ab, _ac, etc.).
> > > The process that takes 14 seconds on MSDE is supposed to report to the
> > > users those medications that are not in use by any medication order in
> > > any of the databases (so, no match from tM.pkey to tMO.medid from any
> > > tMO of which there may be as many as 4 (maximum number currently
> > > used)). Each record must not match across each of multiple db's.
>
> > Ehum, this design sounds dubious. I get the impression that if this was
> > all in the same database, and the same tables, this could be done in a
> > single query.
>
> > Or is there any particular reason you have multiple databases?
>
> > Anyway, it occurred to me, there is one thing to check for. Run this query:
>
> > select name from sys.databases where is_auto_close_on = 1
>
> > For all databses that appear do:
>
> > ALTER DATABASE db SET AUTO_CLOSE OFF
>
> > --
> > Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se
>
> > Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
> > Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx-Hide quoted text -
>
> > - Show quoted text -
>
> Hello, Erland. First, I have put a variety of indexes into the
> tbl_medicationorders (clustered and unclustered) with no real
> difference in the Total Execution Time of the query. Second, I could
> do the select as you suggest but I'm not getting any records from the
> database other than the 12 I expect (for this particular select
> statement, so I the modified statement would seem unnecessary. Third,
> and this is the worst, I've found something that makes a difference in
> the execution time of the queries every time.
>
> Briefly, before I mention what has drastically shortened the execution
> time of the query let me say that i thought it was flaky as hell so I,
> thinking it might be a problem with my installation of SQL Express, re-
> installed SQL Express from a fresh download from MS and reloaded the
> DB's. I get the same flaky problems in query execution time.
>
> Here's what I did after the reload, carefully mapped out just as I did
> it. I rebooted after re-installing Express, just as a precaution. I
> detached the databases from MSDE and attached them to Express
> (different boxes, of course). I ran sp_updatestats on each DB. I ran
> my query as I had previously posted with the inner join. I ran it
> three times with the STATISTICS PROFILE ON and, in Management Studio I
> had it show me the Execution Plan and the Client Statistics tabs. I
> had noticed some really strange results, sometimes really fast,
> sometimes really slow, even if I had not changed the database. I
> found something that makes a difference in the query execution speed
> but it's really bizarre. Here is what I got:
>
> The average Total Execution Time for those three runs is 1,202.667
> ms. Really slow and intolerable. Running this query from this box
> against my MSDE system gives me responses in the 15-30 ms range.
>
> I use the Object Explorer to locate a table in the caredata database.
> I right-click on a table (one completely unrelated to the query) and
> click Modify. I change NOTHING, just look at the window that shows me
> the fields of the table. I click back on my query tab and, after
> resetting client statistics, I run the query 3 times.
>
> Now, the average Total Execution Time is down to 650.6667 ms. No
> kidding. Same query, same data, same server, same table, EVERYTHING
> is the same except now I have a tab open showing me the properties of
> a table in the Caredata datbase.
>
> Next, I open another unrelated table in the caredata_aa database (the
> other DB involved in the query) in the same way (right-click, hit
> Modify, change NOTHING). When I return to the tab with the query on
> it and reset the client statistics and then run it three times I get
> an average Total Execution Time of 62.000 ms.
>
> What I'm seeing is that when I have an object (table) open from the
> databases involved in the query (even if it's just a view of the
> record layout in the Management Express console), the query runs
> ridiculously faster, almost as fast as it runs against the MSDE
> database.
>
> I have reposted the data on that site for you. I didn't like how it
> put line breaks in it which is why I UUE encoded it. You can take UUE
> encoded stuff and just save it as a text file with a UUE extension and
> programs like WinZip will read it. It's there now in plain text under
> the SQL Express Results or here is a direct link: http://
> home.earthlink.net/~hboothe/id2.html
>
> I posted the results twice, once from when I had the other tables open
> so the query was fast and once when the other tables were closed so
> the query was long.
>
> I don't understand why opening up the tables would make a difference.
> If it was on a remote server that had to be sought out it might make
> sense that the query would run faster if the server had already been
> discovered but since it's running in shared memory that seems
> unnecessary.
>
> I ran the query you provided, as a nested select statement but the
> response times were roughly the same as what I got with the join, on
> average across 3 runs, 1,181.667ms Total Execution Time.
>
> In all candor, the design may be dubious or outright stupid. The
> design is based on a prior product which used separate sub-directories
> to store the data for different clients. The idea here is that my
> software product can keep information for one or more businesses.
> That is, I may install my software at one company and they, as a
> service, might host information for many other businesses on my
> application. However, I may install the software at one company who
> only hosts information for themselves. Each company that is hosted on
> the system has it's data completely separate from the others. Any
> information that would need to be shared and used across all the
> different businesses on one installation go in the "parent" database
> (which I call the repository) and any information specific to the
> particular client goes in the "facility" database. I'm seeing some
> ways I could have done it differently now, as a result of our
> discussion, but when I started this project and product 2 years ago I
> didn't consider doing it differently. Now the thing is installed
> across several locations and I have another client about to go live
> with a new system at the end of the month, presumably with Vista which
> means no MSDE, so I'm in no position to try to re-do the DB structure/
> layout/design.
>
> Hmm, at the end you mention the auto close thingy. I just pulled the
> db's with auto close and it's all the ones I attached. I ran the auto
> close = burn and die (okay, I'm being goofy, I ran the alter database
> db set auto_close off) and then ran my query and it was blazing fast.
> That auto_close, from the name, would explain why having open
> connections to the DB would fix the problem.
>
> I'm going to reboot now (to make sure I've not done anything else that
> might be affecting it) and then I'll check it and post back.
>
> Thank you again for all your help and time.
>
> --HC
Erland, I have rebooted after running the commands to set AUTO_CLOSE
to off and the queries are fast still. That seems to be the largest
problem. The program response time for the simple query I've posted
before is in the 0.1 down to 0.08 second range, plenty fast. The
nasty query has gone from 4 minutes or more to just over 1 minute.
That's still too long but I think that with some tweaking of the
indexes I can get that down. It's not as good as MSDE kicking it out
in 14-30 seconds, but it's WAY better than > 4 minutes.
It turns out, as I'm sure you know, that AUTO_CLOSE defaults to on
ONLY for SQL Express (not for other versions of SQL 2005 (from my
reading of BOL)). Turning it off was a huge help to me and my
program.
I will play with indexes tomorrow and the next day to see if I can
speed the rest of process up, but for now, the biggest problem I was
facing (the incredible slowness of the routine work) is fixed.
Thank you very much for all your time and effort on this. I very much
appreciate it.
--HC
Navigation:
[Reply to this message]
|