SELECT is using a non-PK Index instead of the PK
SELECT is using a non-PK Index instead of the PK
In the [dbo].[Programs]
table, the column [Id]
is the PrimaryKey (not par of a composite key). There are also quite a few other indexes on that table.
[dbo].[Programs]
[Id]
When I'm running this simple query, SELECT [Id] FROM [dbo].[Programs]
, here is the execution plan:
SELECT [Id] FROM [dbo].[Programs]
My question is: Why is it not just using the PK index instead?
Performance is not an issue as that table has 23 rows, but I just find it odd, and I want to understand why SqlServer is right, and why I'm wrong to assumed it would be better.
2 Answers
2
The query optimizer will look to see what's the quickest way (as far as it can tell) to get you your data.
It might be interesting to check the execution plan of the same query, with ORDER BY [Id]
added.
ORDER BY [Id]
I'll assume that Id
is not only the primary key, but that its index is the clustered index on the table. What that means is that, to go through the index and pick out the Id
values, it'll have to read the entire table (because the clustered index is really the full table, sorted by the index key).
Id
Id
I'll also assume that IX_SupplyProgramId
has a limited number of other columns included. Note that all indexes will have the value of the clustered index, as that's how they connect the index back to the actual row. And, every value of Id
will be included
IX_SupplyProgramId
Id
So, the amount of data that has to be read if it reads the clustered index is probably larger than the amount to read in the other index. Read are one of the most expensive operations the query engine does, so reducing reads is a good thing.
So, it uses the smaller index to get the Id
values, instead of the clustered index.
Id
[SupplyProgramId]
[Id]
[Id]
@Tipx I don't believe that uniqueness has absolutely anything to do with this case.
– Aaron Bertrand♦
Sep 14 '18 at 19:00
@AaronBertrand If it wasn't "Non-Nullable and Unique", it couldn't use the other IX because it wouldn't guarantee that every [Id] would be referenced strictly one. Some [Id] could have no references to them, making the result set incomplete. Some [Id]s could be referenced multiple times, adding the need to do a distinct on the result.
– Tipx
Sep 14 '18 at 19:04
@Tipx - Aaron is correct - even if there were 20-30 rows with each unique value of
SupplyProgramId
, each row would have to appear in the index. Since each row appears in the index, each Id
appears in the index. And, note that Id
is unique (as the primary key) - repeats of the SupplyProgramId
don't mean repeats of the Id
.– RDFozz
Sep 14 '18 at 19:06
SupplyProgramId
Id
Id
SupplyProgramId
Id
@RDFozz Alright, I guess I don't understand something then. I don't want to spill too much in comments (as some would direct us to the chat), so I'll let it go!
– Tipx
Sep 14 '18 at 19:11
It has nothing to do with primary key, except that the primary key is typically clustered.
In your case SQL Server is not using the clustered index because, by definition, the clustered index includes all of the columns in the table. Since you only want Id
, it is using a skinnier index that satisfies your query simply because it's less work to do so, and even if your table only has a single column, it's still going to choose the non-clustered index.
Id
If I ask you to get me a beer from the fridge, your choices are to:
In your case, 1. is using the clustered index, 2. is using some wide index, and 3. is using an index that only contains Id
.
Id
A clustered index is simply not always the best choice for an operation, much like a Ferrari is not the car you always want to use for a task (racing someone vs. towing a yacht, for example).
Thanks for contributing an answer to Database Administrators Stack Exchange!
But avoid …
To learn more, see our tips on writing great answers.
Required, but never shown
Required, but never shown
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Oh, so because
[SupplyProgramId]
is not nullable, and unique, it know it will contain every[Id]
strictly once... damn he's wise! I tried using the sort on[Id]
, and it does indeed use the PK Index at that point. So when it reads the Clustered PK Index, since it contains the data too, it's heavier to read hence why it's not doing it... nice! Thanks sir!– Tipx
Sep 14 '18 at 18:56