Key Lookup Threshold

February 5, 2009 · Posted in SQL Server 

Gail talks about bookmark lookups…. err.. key lookups in this post. So are they good or bad? Well, like many things in SQL, it depends. The main factor is the number of rows returned. A few rows are fine but the cost rises sharply with larger result sets. There comes a point where the threshold is crossed and a scan is more efficient. This is because a scan leverages sequential IO while a lookup does random IO.

Here are the results of the code at the end of the post.

Rows IO  
Scan 422  
1 row 4  
5 rows 12  
10 rows 22  
25 rows 52  
50 rows 114  
100 rows 217  
250 Rows 526 <–Threshold
500 Rows 1043  
1000 Rows 4141  

 

As you can see, at 250 rows, we have crossed the threshold and it is cheaper do a scan. If you are passing in a literal, the optimizer can detect this and switch to a scan. If it is a stored proc or parameterized SQL, a plan is cached the first time it is run. Problems happen when the result size greatly varies depending on the parameter. There are ways around this all with their pro’s and con’s. Here are some:

  • A covering index.
  • A statement level recompile hint
  • Plan guides and hints

Here is the sample code that can repro these numbers on SQL Server 2008.

CREATE TABLE #temp
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
c1 CHAR(1000) DEFAULT( 'blah'),
c2 INT
)
INSERT INTO #temp(c2)
VALUES
(1)
GO 
INSERT INTO #temp(c2)
VALUES
(5)
GO 5
INSERT INTO #temp(c2)
VALUES
(10)
GO 10
INSERT INTO #temp(c2)
VALUES
(25)
GO 25
INSERT INTO #temp(c2)
VALUES
(50)
GO 50
INSERT INTO #temp(c2)
VALUES
(100)
GO 100
INSERT INTO #temp(c2)
VALUES
(250)
GO 250
INSERT INTO #temp(c2)
VALUES
(500)
GO 500
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000

CREATE INDEX ix ON #temp(c2
)

--The baseline
SET STATISTICS io ON
SELECT
*
FROM #temp WITH (INDEX=1)
WHERE c2 = 1

--1 row returned
SELECT *
FROM #temp
WHERE c2 = 1

--5 rows returned
SELECT *
FROM #temp
WHERE c2 = 5

--10 rows returned
SELECT *
FROM #temp
WHERE c2 = 10

--25 rows returned
SELECT *
FROM #temp
WHERE c2 = 25

--50 rows returned
SELECT *
FROM #temp
WHERE c2 = 50

--100 rows returned
SELECT *
FROM #temp
WHERE c2 = 100

--250 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 250

--1000 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 1000

This content is published under the Attribution-Share Alike 3.0 Unported license.

Comments

blog comments powered by Disqus