query assistance: northwind

Showing posts with label northwind. Show all posts

Friday, March 30, 2012

Query Help

Hi All,
I'm not sure if this even possible.
Here is the query from NorthWind database:
SELECT TOP 3 Employees.EmployeeID,
Orders.CustomerID
FROM Orders,Employees
WHERE Employees.EmployeeID = 9
AND Orders.EmployeeID = Employees.EmployeeID
Result:
EmployeeID CustomerID
9 RICSU
9 ERNSH
9 SAVEA
Is there a way to return this query as 1 record and name columns
CustomerID1, CustomerID2, CustomerID3
EmployeeID CustomerID1 CustomerID2
CustomerID3
9 RICSU ERNSH
SAVEA
Thanks,
Walter
I'd say do that in your user interface code rather than in SQL. Leave
the database for what it does best - retrieving data. How difficult can
it be to display a data set as horizontal list?
Certainly you can do it in the database, but do you really want 3
*random* customers for each employee? You haven't specified ORDER BY in
your query so you can't predict which three customers you'll get back.
Let's assume that you really wanted the first three orders placed by an
employee:
SELECT MAX(employeeid),
MAX(CASE WHEN ord = 1 THEN customerid END),
MAX(CASE WHEN ord = 2 THEN customerid END),
MAX(CASE WHEN ord = 3 THEN customerid END)
FROM
(SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
FROM Orders AS O1
JOIN Orders AS O2
ON O1.employeeid = O2.employeeid
AND (O1.orderdate > O2.orderdate OR
(O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
WHERE O1.employeeid =9
GROUP BY O1.employeeid, O1.customerid, O1.orderdate
HAVING COUNT(*)<=3) AS T ;
David Portas
SQL Server MVP
|||David,
Thanks for your help.Works like a champ!
I don't have an interface for this query.
Walter
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@.acm.org> wrote in message
news:1124208616.358495.95330@.g47g2000cwa.googlegro ups.com...
> I'd say do that in your user interface code rather than in SQL. Leave
> the database for what it does best - retrieving data. How difficult can
> it be to display a data set as horizontal list?
> Certainly you can do it in the database, but do you really want 3
> *random* customers for each employee? You haven't specified ORDER BY in
> your query so you can't predict which three customers you'll get back.
> Let's assume that you really wanted the first three orders placed by an
> employee:
> SELECT MAX(employeeid),
> MAX(CASE WHEN ord = 1 THEN customerid END),
> MAX(CASE WHEN ord = 2 THEN customerid END),
> MAX(CASE WHEN ord = 3 THEN customerid END)
> FROM
> (SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
> FROM Orders AS O1
> JOIN Orders AS O2
> ON O1.employeeid = O2.employeeid
> AND (O1.orderdate > O2.orderdate OR
> (O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
> WHERE O1.employeeid =9
> GROUP BY O1.employeeid, O1.customerid, O1.orderdate
> HAVING COUNT(*)<=3) AS T ;
> --
> David Portas
> SQL Server MVP
> --
>

Wednesday, March 28, 2012

Query Help

Hi All,
I'm not sure if this even possible.
Here is the query from NorthWind database:
SELECT TOP 3 Employees.EmployeeID,
Orders.CustomerID
FROM Orders,Employees
WHERE Employees.EmployeeID = 9
AND Orders.EmployeeID = Employees.EmployeeID
Result:
EmployeeID CustomerID
9 RICSU
9 ERNSH
9 SAVEA
Is there a way to return this query as 1 record and name columns
CustomerID1, CustomerID2, CustomerID3
EmployeeID CustomerID1 CustomerID2
CustomerID3
9 RICSU ERNSH
SAVEA
Thanks,
WalterI'd say do that in your user interface code rather than in SQL. Leave
the database for what it does best - retrieving data. How difficult can
it be to display a data set as horizontal list?
Certainly you can do it in the database, but do you really want 3
*random* customers for each employee? You haven't specified ORDER BY in
your query so you can't predict which three customers you'll get back.
Let's assume that you really wanted the first three orders placed by an
employee:
SELECT MAX(employeeid),
MAX(CASE WHEN ord = 1 THEN customerid END),
MAX(CASE WHEN ord = 2 THEN customerid END),
MAX(CASE WHEN ord = 3 THEN customerid END)
FROM
(SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
FROM Orders AS O1
JOIN Orders AS O2
ON O1.employeeid = O2.employeeid
AND (O1.orderdate > O2.orderdate OR
(O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
WHERE O1.employeeid =9
GROUP BY O1.employeeid, O1.customerid, O1.orderdate
HAVING COUNT(*)<=3) AS T ;
David Portas
SQL Server MVP
--|||David,
Thanks for your help.Works like a champ!
I don't have an interface for this query.
Walter
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@.acm.org> wrote in message
news:1124208616.358495.95330@.g47g2000cwa.googlegroups.com...
> I'd say do that in your user interface code rather than in SQL. Leave
> the database for what it does best - retrieving data. How difficult can
> it be to display a data set as horizontal list?
> Certainly you can do it in the database, but do you really want 3
> *random* customers for each employee? You haven't specified ORDER BY in
> your query so you can't predict which three customers you'll get back.
> Let's assume that you really wanted the first three orders placed by an
> employee:
> SELECT MAX(employeeid),
> MAX(CASE WHEN ord = 1 THEN customerid END),
> MAX(CASE WHEN ord = 2 THEN customerid END),
> MAX(CASE WHEN ord = 3 THEN customerid END)
> FROM
> (SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
> FROM Orders AS O1
> JOIN Orders AS O2
> ON O1.employeeid = O2.employeeid
> AND (O1.orderdate > O2.orderdate OR
> (O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
> WHERE O1.employeeid =9
> GROUP BY O1.employeeid, O1.customerid, O1.orderdate
> HAVING COUNT(*)<=3) AS T ;
> --
> David Portas
> SQL Server MVP
> --
>

Query Help

Hi All,
I'm not sure if this even possible.
Here is the query from NorthWind database:
SELECT TOP 3 Employees.EmployeeID,
Orders.CustomerID
FROM Orders,Employees
WHERE Employees.EmployeeID = 9
AND Orders.EmployeeID = Employees.EmployeeID
Result:
EmployeeID CustomerID
9 RICSU
9 ERNSH
9 SAVEA
Is there a way to return this query as 1 record and name columns
CustomerID1, CustomerID2, CustomerID3
EmployeeID CustomerID1 CustomerID2
CustomerID3
9 RICSU ERNSH
SAVEA
Thanks,
WalterI'd say do that in your user interface code rather than in SQL. Leave
the database for what it does best - retrieving data. How difficult can
it be to display a data set as horizontal list?
Certainly you can do it in the database, but do you really want 3
*random* customers for each employee? You haven't specified ORDER BY in
your query so you can't predict which three customers you'll get back.
Let's assume that you really wanted the first three orders placed by an
employee:
SELECT MAX(employeeid),
MAX(CASE WHEN ord = 1 THEN customerid END),
MAX(CASE WHEN ord = 2 THEN customerid END),
MAX(CASE WHEN ord = 3 THEN customerid END)
FROM
(SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
FROM Orders AS O1
JOIN Orders AS O2
ON O1.employeeid = O2.employeeid
AND (O1.orderdate > O2.orderdate OR
(O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
WHERE O1.employeeid =9
GROUP BY O1.employeeid, O1.customerid, O1.orderdate
HAVING COUNT(*)<=3) AS T ;
--
David Portas
SQL Server MVP
--|||David,
Thanks for your help.Works like a champ!
I don't have an interface for this query.
Walter
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@.acm.org> wrote in message
news:1124208616.358495.95330@.g47g2000cwa.googlegroups.com...
> I'd say do that in your user interface code rather than in SQL. Leave
> the database for what it does best - retrieving data. How difficult can
> it be to display a data set as horizontal list?
> Certainly you can do it in the database, but do you really want 3
> *random* customers for each employee? You haven't specified ORDER BY in
> your query so you can't predict which three customers you'll get back.
> Let's assume that you really wanted the first three orders placed by an
> employee:
> SELECT MAX(employeeid),
> MAX(CASE WHEN ord = 1 THEN customerid END),
> MAX(CASE WHEN ord = 2 THEN customerid END),
> MAX(CASE WHEN ord = 3 THEN customerid END)
> FROM
> (SELECT O1.employeeid, O1.customerid, COUNT(*) AS ord
> FROM Orders AS O1
> JOIN Orders AS O2
> ON O1.employeeid = O2.employeeid
> AND (O1.orderdate > O2.orderdate OR
> (O1.orderdate = O2.orderdate AND O1.orderid >= O2.orderid))
> WHERE O1.employeeid =9
> GROUP BY O1.employeeid, O1.customerid, O1.orderdate
> HAVING COUNT(*)<=3) AS T ;
> --
> David Portas
> SQL Server MVP
> --
>

Tuesday, March 20, 2012

Query Execution Plan

I'm studying the effects of indexes on query statements and trying to understand the underlying processes such as BTree's, etc. For the Northwind table 'Orders', I left the Primary Key index alone, but put an index on the field 'ShipVia'. My question is more focused on the theory and mechanical methods behind the scenes more than specific to tuning a particular SQL statement. I want to know why SQL Server would choose to use the Primary Key clustered index and not use my newly created index to execute the following statement. Wouldn't the leaf nodes of the index point to the records located in the data pages and therefore are best retrievable through the new index?

StmtText
-
SELECT *
FROM Orders
where ShipVia = 3

(1 row(s) affected)

StmtText

|--Clustered Index Scan(OBJECT:([Northwind].[dbo].[Orders].[PK_Orders]), WHERE:([Orders].[ShipVia]=Convert([@.1])))

It does use my index when I change the query as such:

SELECT ShipVia
FROM Orders
where ShipVia = 3

(2 row(s) affected)

StmtText
--
|--Index Seek(OBJECT:([Northwind].[dbo].[Orders].[IX_Orders]), SEEK:([Orders].[ShipVia]=Convert([@.1])) ORDERED FORWARD)

(1 row(s) affected)

By the way, I did take into account the size of the table. I do understand that if the table is small enough, the benefits of an index can be lost. I tried a test in which I duplicated the records to generate over a million each. The query execution plan did not change.

The problem is two-fold: that you're issuing a SELECT * & that your predicate is probably not restrictive enough. For SQL Server to use your index in this case, it would have to look up the actual record from the clustered index for each match it finds in the secondary index. Looking up the record in the clustered index involves reading about 2-3 pages (root node and leaf node, with a possible intermediate node). So if enough rows qualify (say more than 100 rows) then it would be much more expensive to use this index because it would involve reading about 300 pages from disk.

You can check these effects yourself by forcing the index (use the with (index='[index name]') hint, see the article on FROM clause in books on-line) and using "set statistics io on". To have SQL Server use the index, either make the filter restrictive enough (very few rows have ShipVia = 3) or modify the query as you have done.

|||

I believe what you just told be totally clicked! Thank you.

I remember reading now that only clustered indexes have the actual record as the leaf node and any other indexes would simply hold an ID of the record (true?). That being said, it makes sense to not use the secondary index in many cases.

Thank you Mostafa

|||My pleasure. And yes what you said is (almost) true: secondary indexes hold the clustering key of the record if it's unique (e.g. when it's the primary key). So in your example if the primary key (and clustering key) of Orders is OrderID, then each record in the ShipVia index would hold (ShipVia, OrderID).

Wednesday, March 7, 2012

Query Cost - how is it calculated?

Hi,
I'm trying to figure out where the value of "Cost" comes from in the
Query Analyzer.
For example, when running this on Northwind:
select * from dbo.[Alphabetical list of products]
You get this 3 parts for the Query Plan. The part of
"Categories.PK_Categories" has a cost of 24% of the
total and a value of 0.0119... Where does this value
come from?
All I get is "cost for CPU" and "cost for I/O" - but adding
these together does not equal that value.
Anyone have a clue?
--
With regards,
Martijn Tonies
Database Workbench - tool for InterBase, Firebird, MySQL, Oracle & MS SQL
Server
Upscene Productions
http://www.upscene.comI doubt that anyone outside of Microsft, and maybe Kalen, could really
answer that question...
Not only does the optimizer estimate IO and CPU, it does some estimation
about how much physical IO is required based on some expectation of how much
of the data might be in memory... It also looks at the recent use of tables
to determine the level of locking for certain operations as well...
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"Martijn Tonies" <m.tonies@.upscene-removethis.nospam.com> wrote in message
news:ekzghVpZFHA.3040@.TK2MSFTNGP14.phx.gbl...
> Hi,
> I'm trying to figure out where the value of "Cost" comes from in the
> Query Analyzer.
> For example, when running this on Northwind:
> select * from dbo.[Alphabetical list of products]
> You get this 3 parts for the Query Plan. The part of
> "Categories.PK_Categories" has a cost of 24% of the
> total and a value of 0.0119... Where does this value
> come from?
> All I get is "cost for CPU" and "cost for I/O" - but adding
> these together does not equal that value.
>
> Anyone have a clue?
> --
> With regards,
> Martijn Tonies
> Database Workbench - tool for InterBase, Firebird, MySQL, Oracle & MS SQL
> Server
> Upscene Productions
> http://www.upscene.com
>
>|||Hello Wayne,
> I doubt that anyone outside of Microsft, and maybe Kalen, could really
> answer that question...
> Not only does the optimizer estimate IO and CPU, it does some estimation
> about how much physical IO is required based on some expectation of how
much
> of the data might be in memory... It also looks at the recent use of
tables
> to determine the level of locking for certain operations as well...
Hmm - when extracting a plan via SHOWPLAN_ALL, I do get the estimates
of some of the info (IO/CPU), but I have no idea where the actual "cost"
comes
from.
I've been fiddling around with multiplying IO * "nr of executes" and CPU *
nr of
executes, but it doesn't really add up properly... :-/
With regards,
Martijn Tonies
Database Workbench - tool for InterBase, Firebird, MySQL, Oracle & MS SQL
Server
Upscene Productions
http://www.upscene.com
>
> --
> Wayne Snyder, MCDBA, SQL Server MVP
> Mariner, Charlotte, NC
> www.mariner-usa.com
> (Please respond only to the newsgroups.)
> I support the Professional Association of SQL Server (PASS) and it's
> community of SQL Server professionals.
> www.sqlpass.org
> "Martijn Tonies" <m.tonies@.upscene-removethis.nospam.com> wrote in message
> news:ekzghVpZFHA.3040@.TK2MSFTNGP14.phx.gbl...
> > Hi,
> >
> > I'm trying to figure out where the value of "Cost" comes from in the
> > Query Analyzer.
> >
> > For example, when running this on Northwind:
> >
> > select * from dbo.[Alphabetical list of products]
> >
> > You get this 3 parts for the Query Plan. The part of
> > "Categories.PK_Categories" has a cost of 24% of the
> > total and a value of 0.0119... Where does this value
> > come from?
> >
> > All I get is "cost for CPU" and "cost for I/O" - but adding
> > these together does not equal that value.
> >
> >
> > Anyone have a clue?
> >
> > --
> > With regards,
> >
> > Martijn Tonies
> > Database Workbench - tool for InterBase, Firebird, MySQL, Oracle & MS
SQL
> > Server
> > Upscene Productions
> > http://www.upscene.com
> >
> >
> >
>

Query Concept

Hi,
I would like to use Northwind as an example.
I am trying to use the Orders table and why the following query only returns
one record for each customer if the outer query pass info into the inner
query since each customer has more than one records in the Orders table.
e.g. customerid "ALFKI" has six records in Orders table, but after the query
run, it only returns one record for "ALFKI". I assume "ALFKI" should pass
into the inner query six times... it is supposed to show six identitcal
reocrds for "ALFKI", but not, why?
select a.* from orders a where a.orderdate = (select max(orderdate) from
orders b where a.customerid = b.customerid)
order by a.customerid
Thanks
EdmundIn principle the subquery is evaluated for each row but it will always
return the same value for any given customerid. If there is only one
value that corresponds to the maximum date for "ALKFI" then you will
only get one row returned for that customer.
Please show us exactly what result you want then maybe someone can help
with a solution.
David Portas
SQL Server MVP
--|||Actually, i am just wondering why I don't get six records back for customer
"ALFKI" but just one since I guess the outer query is passing into the inner
query for six times even I may get six identical records...
"David Portas" wrote:

> In principle the subquery is evaluated for each row but it will always
> return the same value for any given customerid. If there is only one
> value that corresponds to the maximum date for "ALKFI" then you will
> only get one row returned for that customer.
> Please show us exactly what result you want then maybe someone can help
> with a solution.
> --
> David Portas
> SQL Server MVP
> --
>|||Because the subquery will return only one record, which has the last
orderdate, for each CustomerID.
Perayu
"Ed" <Ed@.discussions.microsoft.com> wrote in message
news:D85ECABE-E0C5-431F-9948-B329086B1A31@.microsoft.com...
> Hi,
> I would like to use Northwind as an example.
> I am trying to use the Orders table and why the following query only
> returns
> one record for each customer if the outer query pass info into the inner
> query since each customer has more than one records in the Orders table.
> e.g. customerid "ALFKI" has six records in Orders table, but after the
> query
> run, it only returns one record for "ALFKI". I assume "ALFKI" should pass
> into the inner query six times... it is supposed to show six identitcal
> reocrds for "ALFKI", but not, why?
> select a.* from orders a where a.orderdate = (select max(orderdate) from
> orders b where a.customerid = b.customerid)
> order by a.customerid
> Thanks
> Edmund
>
>
>
>|||Maybe the following example will make it clearer:
SELECT a.customerid, a.orderdate,
(SELECT MAX(orderdate)
FROM orders b
WHERE a.customerid = b.customerid) AS subquery_result
FROM orders a
WHERE a.customerid = 'ALFKI' ;
Here I've just moved the subquery into the select list so that you can
see the result it returns for each row - the same value each time. You
can also see that there is only ONE row that matches the value returned
by the subquery, therefore that's the only row that satisfies the
condition in your WHERE clause.
David Portas
SQL Server MVP
--|||Ed,

> Actually, i am just wondering why I don't get six records back for custome
r
> "ALFKI" but just one since I guess the outer query is passing into the inn
er
> query for six times even I may get six identical records...
From all alfki's orders, just one is the latest. Eventhough the outer query
is passing value six times, the value is the same (customerid) and the value
returned is always the same (max(orderdate)).
declare @.d datetime
select @.d = max(orderdate)
from orders
where customerid = 'alfki'
print @.d
select *
from orders
where customerid = 'alfki' and orderdate = @.d
go
AMB
"Ed" wrote:
> Actually, i am just wondering why I don't get six records back for custome
r
> "ALFKI" but just one since I guess the outer query is passing into the inn
er
> query for six times even I may get six identical records...
> "David Portas" wrote:
>

Friday, March 30, 2012

Query Help

Wednesday, March 28, 2012

Query Help

Query Help

Tuesday, March 20, 2012

Query Execution Plan

Wednesday, March 7, 2012

Query Cost - how is it calculated?

Query Concept

query assistance

Blog Archive

About Me