Key Lookup Threshold
Gail talks about bookmark lookups…. err.. key lookups in this post. So are they good or bad? Well, like many things in SQL, it depends. The main factor is the number of rows returned. A few rows are fine but the cost rises sharply with larger result sets. There comes a point where the threshold is crossed and a scan is more efficient. This is because a scan leverages sequential IO while a lookup does random IO.
Here are the results of the code at the end of the post.
| Rows | IO | |
| Scan | 422 | |
| 1 row | 4 | |
| 5 rows | 12 | |
| 10 rows | 22 | |
| 25 rows | 52 | |
| 50 rows | 114 | |
| 100 rows | 217 | |
| 250 Rows | 526 | <–Threshold |
| 500 Rows | 1043 | |
| 1000 Rows | 4141 |
As you can see, at 250 rows, we have crossed the threshold and it is cheaper do a scan. If you are passing in a literal, the optimizer can detect this and switch to a scan. If it is a stored proc or parameterized SQL, a plan is cached the first time it is run. Problems happen when the result size greatly varies depending on the parameter. There are ways around this all with their pro’s and con’s. Here are some:
- A covering index.
- A statement level recompile hint
- Plan guides and hints
Here is the sample code that can repro these numbers on SQL Server 2008.
CREATE TABLE #temp
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
c1 CHAR(1000) DEFAULT( 'blah'),
c2 INT
)
INSERT INTO #temp(c2)
VALUES
(1)
GO
INSERT INTO #temp(c2)
VALUES
(5)
GO 5
INSERT INTO #temp(c2)
VALUES
(10)
GO 10
INSERT INTO #temp(c2)
VALUES
(25)
GO 25
INSERT INTO #temp(c2)
VALUES
(50)
GO 50
INSERT INTO #temp(c2)
VALUES
(100)
GO 100
INSERT INTO #temp(c2)
VALUES
(250)
GO 250
INSERT INTO #temp(c2)
VALUES
(500)
GO 500
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
CREATE INDEX ix ON #temp(c2
)
--The baseline
SET STATISTICS io ON
SELECT *
FROM #temp WITH (INDEX=1)
WHERE c2 = 1
--1 row returned
SELECT *
FROM #temp
WHERE c2 = 1
--5 rows returned
SELECT *
FROM #temp
WHERE c2 = 5
--10 rows returned
SELECT *
FROM #temp
WHERE c2 = 10
--25 rows returned
SELECT *
FROM #temp
WHERE c2 = 25
--50 rows returned
SELECT *
FROM #temp
WHERE c2 = 50
--100 rows returned
SELECT *
FROM #temp
WHERE c2 = 100
--250 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 250
--1000 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 1000
This content is published under the Attribution-Share Alike 3.0 Unported license.
