|
Posted by serge on 12/04/05 23:19
Thanks for the reply Erland.
> That was a long post, and I guess this is because this is a complex
> problem. Or at least, I like to believe so, as I understood far from
> all of it.
I don't think it's a complex problem. It's a complex problem for me
since I've never run into this type of scenario before, at least I don't
remember if I ever did.
> For instance, I failed to understand why you could not use
> dynamic SQL.
Ok I am asking the same question as you now "Why I can't use
dynamic SQL?" Well maybe I am trying to address this problem
in a matter that is not possible. Let me answer your questions below
before I elaborate more on this.
> There was a table and some sample data for it, but I did not really
> understand how it was used. Does one row describe a search? Or
> does all rows with the same REPSELNO describe a search?
Yes, all rows with the same REPSELNO describe a search.
> (In such case, should there not be a header table defining the search,
> and then a sub-table with the details?)
You are right a header table a sub-table with all the details is the correct
way to do this:
IF(SELECT OBJECT_ID('REP_SEL')) IS NOT NULL
DROP TABLE REP_SEL
GO
CREATE TABLE REP_SEL
(
REPSELNO INT IDENTITY(1, 1) NOT NULL, -- Idenitifies
-- which report query this "sales question" is part of
OPERATOR TINYINT NOT NULL -- 1 is logical operator AND, 2 is OR
)
GO
IF(SELECT OBJECT_ID('REP_SEL_DET')) IS NOT NULL
DROP TABLE REP_SEL_DET
GO
CREATE TABLE REP_SEL_DET
(
AUTOID INT IDENTITY(1, 1) NOT NULL,
REPSELNO INT NOT NULL, -- Foreign Key for REP_SEL
SupplierID INT NOT NULL, -- from the Suppliers table
ProductID INT NOT NULL, -- from the Products table, if you choose
--a ProductID, SupplierID is selected also by
inheritence
CategoryID INT NOT NULL, -- from the Categories table
SOLDDFROM DATETIME NULL, -- Sold from which date
SOLDTO DATETIME NULL, -- Sold to which date
MINSALES INT NOT NULL, -- The minimum amount of sales
MAXSALES INT NOT NULL, -- The maximum amount of sales
)
GO
> How do you describe the AND/OR thing you are talking about?
I create a new REP_SEL record and specify that ALL conditions
must be TRUE. Therefore I would specify OPERATOR = 1 (meaning
use Logical Operator AND).
> It would certainly help to have a more concrete example where you show
> some sample entries, and the exact results those examples are supposed
> to give.
For example my sales questions would be something like:
REPSEL Sales question #1
I want the employee name(s) who sold products
- from supplier 1
- BETWEEN '7/1/1996' AND '7/10/1996'
- Between amount $300 and $500
REPSEL Sales question #2
I want the employee name(s) who sold products
- from supplier 2
- BETWEEN '7/1/1996' AND '7/10/1996'
- Between amount $300 and $500
REPSEL Sales question #3
I want the employee name(s) who sold products
- from supplier 3
- BETWEEN '7/1/1996' AND '7/10/1996'
- Between amount $300 and $500
The manually hand-written SP for the same 3 questions above is:
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
WHERE OrderDate BETWEEN '7/1/1996' AND '7/10/1996'
AND Suppliers.SupplierID IN (1, 2, 3)
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 300 AND 500
If you run this you should get 1 employee name who met all three conditions:
EmployeeID LastName FirstName Amount
3 Leverling Janet 336.0000
If I were to ask the same three questions with using the OR logical operator
then I don't need the common intersection of all three results, I care to
get
the result of any of the three questions. Obviously the example I used above
are not always the type of questions that are being asked. Each question
could
have completely different date range, amount range and the SupplierID,
ProductID
and CategoryID can be specified.
The equivalent of the above three questions would be:
INSERT INTO REP_SEL
SELECT 1 -- I am specifying logical operator AND to be used
GO
INSERT INTO REP_SEL_DET
SELECT 1, 1, -1, -1, '7/1/1996', '7/1/1996', 300, 500 UNION ALL
SELECT 1, 2, -1, -1, '7/1/1996', '7/1/1996', 300, 500 UNION ALL
SELECT 1, 3, -1, -1, '7/1/1996', '7/1/1996', 300, 500
GO
The above scenario was simple. Now what if i change the common parameters:
REPSEL Sales question #1
I want the employee name(s) who sold products
- from supplier 1
- ProductID = 2 (SupplierID is automatically chosen and has the value 1)
- BETWEEN '1/1/1996' AND '5/31/1997'
- Between amount $300 and $3000
REPSEL Sales question #2
I want the employee name(s) who sold products from supplier 2 AND
- from supplier 2
- ProductID = 5 (SupplierID is automatically chosen and has the value 2)
- BETWEEN '1/1/1996' AND '1/1/1997'
- Between amount $500 and $1500
REPSEL Sales question #3
I want the employee name(s) who sold products from supplier 3
- from supplier 3
- ProductID = 7 (SupplierID is automatically chosen and has the value 3)
- BETWEEN '1/1/1996' AND '5/31/1997'
- Between amount $200 and $1750
The manually hand-written SP for the same 3 questions above is 3 different
SQL statements:
-- for Question #1
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount,
[Order Details].ProductID
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
WHERE OrderDate BETWEEN '1/1/1996' AND '5/31/1997'
AND [Order Details].ProductID = 2
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName,
[Order Details].ProductID
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 300 AND 3000
The result (the list of employee names are):
EmployeeID LastName FirstName Amount ProductID
1 Davolio Nancy 1748.0000 2
2 Fuller Andrew 380.0000 2
4 Peacock Margaret 2492.8000 2
6 Suyama Michael 1140.0000 2
9 Dodsworth Anne 304.0000 2
-- for Question #2
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount,
[Order Details].ProductID
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
WHERE OrderDate BETWEEN '1/1/1996' AND '1/1/1997'
AND [Order Details].ProductID = 5
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName,
[Order Details].ProductID
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 500 AND 1500
The result (the list of employee names are):
EmployeeID LastName FirstName Amount ProductID
1 Davolio Nancy 1105.0000 5
4 Peacock Margaret 544.0000 5
8 Callahan Laura 544.0000 5
-- for Question #3
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount,
[Order Details].ProductID
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Suppliers ON Products.SupplierID = Suppliers.SupplierID
WHERE OrderDate BETWEEN '1/1/1996' AND '5/31/1997'
AND [Order Details].ProductID = 7
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName,
[Order Details].ProductID
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 200 AND 1750
The result (the list of employee names are):
EmployeeID LastName FirstName Amount ProductID
1 Davolio Nancy 540.0000 7
2 Fuller Andrew 720.0000 7
8 Callahan Laura 360.0000 7
Now my query is to find out the employee name(s) that are in the result of
ALL
my three sales questions. So in T-SQL I am not to do UNION of the results,
rather do an INNER JOIN of all the three results. The JOIN result would
obviously be in this case:
EmployeeID LastName FirstName
1 Davolio Nancy
I would find out that only Nancy Davalio met all my sales questions.
The equivalent of the above three questions would be:
INSERT INTO REP_SEL
SELECT 1 -- I am specifying logical operator AND to be used
GO
INSERT INTO REP_SEL_DET
SELECT 2, 1, 2, -1, '1/1/1996', '5/31/1997', 300, 3000 UNION ALL
SELECT 2, 2, 5, -1, '1/1/1996', '1/1/1997', 500, 1500 UNION ALL
SELECT 2, 3, 7, -1, '1/1/1996', '5/31/1997', 200, 1750
GO
I am going to a lot of time reading your articles on Dynamic SQL as I am
sure they will be very helpful:
http://www.sommarskog.se/dyn-search.html
Now to elaborate my earlier points.
I am thinking to write a single SP that handles all the possible scenarios
using dynamic SQL:
From my original post, i have 5 possible scenarios for the columns
SupplierID, ProductID and CategoryID. These are the dynamic input
parameters, there can be 5 different combinations to choose from:
a-SupplierID only
b-SupplierID and a ProductID,
c-SupplierID and a CategoryID
d-SupplierID, ProductID and a CategoryID
e-CategoryID only
The main SP would use a cursor, temp table or table variable, i would
loop through each sales question and call the dynamic SQL SP recursively
and do JOIN operations between each call. As soon as 1 sales question
returns no row I can stop from proceeding and conclude that all my sales
questions DO not have a common employee. Or as soon as the joining of the
temp table and the latest call to the dynamic SQL sp fail to find any
employees
common, I will stop from proceeding also and conclude that all my sales
questions DO not have a common employee.
Is this what I'll have to end up doing?
I originally said I don't think I can use Dynamic SQL is because I was
trying to find out if there's a way to do this using a single T-SQL query
without looping/processing each sales question separately? Maybe there is
a way with some type of Dynamic JOINs or maybe I am just dreaming of
something that is not possible.
How does OLAP and DrillThrough feature do these types of queries? Isn't
OLAP built specifically to handle these types of questions? Is there
anything
or any T-SQL code from OLAP inner-workings that can be taken and used
to handle my problem?
Other points I have is I am trying to determine if there are maybe some
methods that could easily start by eliminating early in the processing if
the
end result would be FALSE. In other words is there a technique to maybe
sort the sales questions one way and before even starting to process each
sales questions, find out that some sales questions contradict each other
therefore mathematically the whole thing will never return a common result.
For example:
Question #1:
I want all employees that sold CategoryID = 1 (Beverages) between
'1/1/1996' and '1/1/1998' and between $1 and $100000
Question #2:
I want all employees that sold CategoryID = 1 (Beverages) between
'1/1/2004' and '1/1/2005' and between $1 and $100000
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Categories ON Products.CategoryID = Categories.CategoryID
WHERE OrderDate BETWEEN '1/1/1996' AND '1/1/1998'
AND Products.CategoryID = 1
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 1 AND 100000
ORDER BY Employees.EmployeeID, Employees.LastName, Employees.FirstName
GO
SELECT Employees.EmployeeID, Employees.LastName, Employees.FirstName,
SUM([Order Details].UnitPrice * Quantity) AS Amount
FROM Employees
JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
JOIN Products ON [Order Details].ProductID = Products.ProductID
JOIN Categories ON Products.CategoryID = Categories.CategoryID
WHERE OrderDate BETWEEN '1/1/2004' AND '1/1/2005'
AND Products.CategoryID = 1
GROUP BY Employees.EmployeeID, Employees.LastName, Employees.FirstName
HAVING SUM([Order Details].UnitPrice * Quantity) BETWEEN 1 AND 100000
ORDER BY Employees.EmployeeID, Employees.LastName, Employees.FirstName
GO
The equivalent of the above two questions would be:
INSERT INTO REP_SEL
SELECT 1 -- I am specifying logical operator AND to be used
GO
INSERT INTO REP_SEL_DET
SELECT 3, -1, -1, 1, '1/1/1996', '1/1/1998', 1, 100000 UNION ALL
SELECT 3, -1, -1, 1, '1/1/2004', '1/1/2005', 1, 100000
GO
Obviously Northwind Orders does not have any dates in 2004
so both questions 1 & 2 will never have a JOIN that will return a resultset.
If I were to have to process various types of questions and they are in the
range of dozens of questions, can I find a trick to easily eliminate the
whole
processing of the dozens of questions by not processing each question 1
by one from the beginning? What if all the questions had common result
and only the last question ends up returning no common result? Can't I
have dealt with the last question earlier?
If you have read part or all the post then I Thank you very
much for having taken the time to read.
[Back to original message]
|