Monday, March 26, 2012

Query goes from 1 second to 40 minutes with only 1 more record?

I have a very strange problem where I have a simple select count(*) with a handle of of joined tables that works fine and returns a count of records in under 1 second when the criteria in the where clause limits the count to 36984. But if I change the criteria to where the number of records would be 1 higher, 36985, the query takes ~40 minutes!

Here's the query

SELECT count(*)
FROM AT JOIN Al ON Al.AlID= AT.ALID
JOIN DProfile ON Al.DProfileID= DProfile.DProfileID
JOIN Label ON DProfile.LabelID = Label.LabelID
JOIN PricingTier ON PricingTier.LabelID = DProfile.LabelID
JOIN PricingTarget on PricingTarget.PricingTargetID = PricingTier.PricingTargetID
JOIN ATP ON ATP.PricingTierID = PricingTier.PricingTierID
WHERE PricingTarget.Target = 'AT'
AND PricingTier.MaxAgeInDays = 0
AND ATP.LengthMultiple = 0
AND AT.ATID > 408095
AND AT.ATID < 451199

Notice the part of the where clause in red. This is how I'm changing the number of rows that select count(*) should find.

Here's the output from showplan

|--Compute Scalar(DEFINE:([Expr1021]=CONVERT_IMPLICIT(int,[Expr1028],0)))
|--Stream Aggregate(DEFINE:([Expr1028]=Count(*)))
|--Hash Match(Inner Join, HASH:([MGN2].[dbo].[AL].[ALID])=([MGN2].[dbo].[AT].[ALID]))
|--Hash Match(Inner Join, HASH:([MGN2].[dbo].[DProfile].[DProfileID])=([MGN2].[dbo].[AL].[DProfileID]), RESIDUAL:([MGN2].[dbo].[DProfile].[DProfileID]=[MGN2].[dbo].[AL].[DProfileID]))
| |--Nested Loops(Inner Join, WHERE:([MGN2].[dbo].[PricingTier].[LabelID]=[MGN2].[dbo].[DProfile].[LabelID]))
| | |--Nested Loops(Inner Join, OUTER REFERENCES:([MGN2].[dbo].[PricingTier].[PricingTargetID]))
| | | |--Nested Loops(Inner Join, OUTER REFERENCES:([MGN2].[dbo].[ATP].[PricingTierID]))
| | | | |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[ATP].[PK_ATPID]), WHERE:([MGN2].[dbo].[ATP].[LengthMultiple]=(0)))
| | | | |--Clustered Index Seek(OBJECT:([MGN2].[dbo].[PricingTier].[PK_PricingTierID]), SEEK:([MGN2].[dbo].[PricingTier].[PricingTierID]=[MGN2].[dbo].[ATP].[PricingTierID]), WHERE:([MGN2].[dbo].[PricingTier].[MaxAgeInDays]=(0)) ORDERED FORWARD)
| | | |--Clustered Index Seek(OBJECT:([MGN2].[dbo].[PricingTarget].[PK_PricingTargetID]), SEEK:([MGN2].[dbo].[PricingTarget].[PricingTargetID]=[MGN2].[dbo].[PricingTier].[PricingTargetID]), WHERE:([MGN2].[dbo].[PricingTarget].[Target]='AT') ORDERED FORWARD)
| | |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[DProfile].[PK_D_Profiles]))
| |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[AL].[PK_AL]))
|--Clustered Index Seek(OBJECT:([MGN2].[dbo].[AT].[PK_Track]), SEEK:([MGN2].[dbo].[AT].[ATID] > (408095) AND [MGN2].[dbo].[AT].[ATID] < (451199)) ORDERED FORWARD)

Now, if I change the part of the where clause from 'AND AT.ATID > 408095' to 'AND AT.ATID > 408094', increase the number returned by one, the query goes from taking 1 second to 40 minutes. Here's the showplan text that is different, when the only difference in the query is changing the number in the where clause.

|--Compute Scalar(DEFINE:([Expr1021]=CONVERT_IMPLICIT(int,[Expr1028],0)))
|--Stream Aggregate(DEFINE:([Expr1028]=Count(*)))
|--Merge Join(Inner Join, MERGE:([MGN2].[dbo].[PricingTarget].[PricingTargetID])=([MGN2].[dbo].[PricingTier].[PricingTargetID]), RESIDUAL:([MGN2].[dbo].[PricingTarget].[PricingTargetID]=[MGN2].[dbo].[PricingTier].[PricingTargetID]))
|--Clustered Index Scan(OBJECT:([MGN2].[dbo].[PricingTarget].[PK_PricingTargetID]), WHERE:([MGN2].[dbo].[PricingTarget].[Target]='AT') ORDERED FORWARD)
|--Nested Loops(Inner Join, WHERE:([MGN2].[dbo].[AL].[ALID]=[MGN2].[dbo].[AT].[ALID]))
|--Nested Loops(Inner Join, WHERE:([MGN2].[dbo].[ATPrice].[PricingTierID]=[MGN2].[dbo].[PricingTier].[PricingTierID]))
| |--Sort(ORDER BY:([MGN2].[dbo].[PricingTier].[PricingTargetID] ASC))
| | |--Hash Match(Inner Join, HASH:([MGN2].[dbo].[DProfile].[DProfileID])=([MGN2].[dbo].[AL].[DProfileID]), RESIDUAL:([MGN2].[dbo].[DProfile].[DProfileID]=[MGN2].[dbo].[AL].[DProfileID]))
| | |--Nested Loops(Inner Join, WHERE:([MGN2].[dbo].[PricingTier].[LabelID]=[MGN2].[dbo].[DProfile].[LabelID]))
| | | |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[PricingTier].[PK_PricingTierID]), WHERE:([MGN2].[dbo].[PricingTier].[MaxAgeInDays]=(0)) ORDERED FORWARD)
| | | |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[DProfile].[PK_D_Profiles]))
| | |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[AL].[PK_AL]))
| |--Clustered Index Scan(OBJECT:([MGN2].[dbo].[ATPrice].[PK_ATPriceID]), WHERE:([MGN2].[dbo].[ATPrice].[LengthMultiple]=(0)))
|--Clustered Index Seek(OBJECT:([MGN2].[dbo].[AT].[PK_Track]), SEEK:([MGN2].[dbo].[AT].[ATID] > (408094) AND [MGN2].[dbo].[AT].[ATID] < (451199)) ORDERED FORWARD)

I tried increasing the size of the TempDB from the default of 8 MB to 1000 MB for both the files, but it behaves the exact same way. I'm at a loss as to what is causing this dramatic difference for such a simple query. Anyone have an idea?

Thanks!

By the way, this is on SQL Server 2005, if that helps any.|||

Jim:

Good post. I appreciate your inclusion of the plan and the research you put into your question. It appears to me that you are at a plan "crossover" point. What I mean is that you are at the point where the mere addition of one more record to the output range causes the query optimizer to choose a different query plan. In your plan I see many "clustered index scans". A cover index might be appropriate to improve the performance of your query. Can you post the indexes, keys and unique constraints of all of your tables?

What follows below is academic. I am not sure whether I need to include this or not because it looks like you have already done this with your testing, so it looks like you already have a good handle on this. If this is of no interest, just skip the rest.


Dave

I created a mock-up table with 32767 rows of data. One of the columns is a "testDate" column with an associated index. After creating the table I performed a number of select queries with SHOWPLAN_TEXT turned on so that I could discover the filter criteria at which the query plan switched from "INDEX SEEK" to "CLUSTERED INDEX SCAN". I found that when I switch the filter date from "1/31/7" to "1/30/7" that the plan switched. Therefore I ran these two queries with SHOWPLAN _TEXT turned on to illustrate the query plan "crossover" point. I think that your query has a similar issue. My main question to you is was it your intent to find the crossover or is this something that jumped up and bit you? OUCH!

if exists
( select 0 from sysobjects
where type = 'U'
and id = object_id ('dbo.crossoverTest')
)
drop table dbo.crossoverTest
go

create table dbo.crossoverTest
( rid integer
constraint pk_crossoverTest primary key,
filler char (200),
testDate datetime
)
go

create index testDate_ndx
on dbo.crossoverTest (testDate)
go

insert into dbo.crossoverTest
select iter,
'Filler',
dateadd (mi, -17*iter, convert(datetime, '2/3/7'))
from (
select 256*b.number + a.number as iter
from master.dbo.spt_values a (nolock)
inner join master.dbo.spt_values b (nolock)
on a.[name] is null
and b.[name] is null
and b.number <= 127
and a.number <= 255
and 256*b.number + a.number > 0
) as small_iterator

go

update statistics dbo.crossoverTest
go

set showplan_text on
go

select rid,
left (filler, 10) as Filler,
testDate
from crossoverTest
where testDate >= '1/31/7'

go

-- StmtText
-- --
-- |--Compute Scalar(DEFINE:([Expr1002]=substring(Convert([crossoverTest].[filler]), 1, 10)))
-- |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([tempdb].[dbo].[crossoverTest]) WITH PREFETCH)
-- |--Index Seek(OBJECT:([tempdb].[dbo].[crossoverTest].[testDate_ndx]), SEEK:([crossoverTest].[testDate] >= 'Jan 31 2007 12:00AM') ORDERED FORWARD)

-- Table 'crossoverTest'. Scan count 1, logical reads 813, physical reads 0, read-ahead reads 0.

go

select rid,
left (filler, 10) as Filler,
testDate
from crossoverTest
where testDate >= '1/30/7'

go

-- StmtText
--
-- |--Compute Scalar(DEFINE:([Expr1002]=substring(Convert([crossoverTest].[filler]), 1, 10)))
-- |--Clustered Index Scan(OBJECT:([tempdb].[dbo].[crossoverTest].[pk_crossoverTest]), WHERE:([crossoverTest].[testDate]>='Jan 30 2007 12:00AM'))

-- Table 'crossoverTest'. Scan count 1, logical reads 913, physical reads 0, read-ahead reads 0.


go

set showplan_text off
go

|||

Thanks Dave,

Ultimately, a simple index was the issue. What perplexed me was just that the difference between the queries when it returned one more row was soooo dramatic. And to answer your question, I had just come upon the crossover issue through trial and error. I hadnt noticed the query plans were different until I got to the point that 1 row was the difference between success and failure.

No comments:

Post a Comment