Indexes and Partitioning

Who is this article for?

This article is designed for people that are already familiar with SQL Server partitioning that want a deeper understanding of how indexing in partitioned tables works.

There are a lot of good articles regarding partitioning, how to create partition schemes and functions. One of my favourites is Brent Ozar that gives an easy introduction and contains links to more deep down resources.

Introduction

Indexes in partitioned tables can be divided into two subgroups:

  • Aligned
  • Unaligned

An aligned index, it’s an index that uses the same partition scheme and column as its table. SQL Server will align indexes for you unless you specify something different, like another partition scheme or filegroup.

The reasoning behind this feature, it’s that in general it’s better to have all the indexes aligned to be able to benefit from partition switching. A good starting point is as always msdn.

Let’s jump into code

First of all, let’s create a simple partitioned table in the heap with some records.

CREATE PARTITION FUNCTION PF_T1(INT) AS RANGE LEFT FOR VALUES (0,5,10)

CREATE PARTITION SCHEME PS_T1 AS PARTITION PF_T1 TO ([SECONDARY],[F1],[F2],[F3])

CREATE TABLE T1(
       ID INT IDENTITY(1,1),
       PartitionKey INT NOT NULL
)ON PS_T1(PartitionKey)

INSERT INTO T1(PartitionKey) VALUES (0),(4),(5),(7),(8),(11),(12),(13)

Now, let’s see where our data is to start with.

EXEC dbo.SeePartitions 'T1'

jsblog

No surprises here, we are in the heap using the Partition Scheme. Now that we know where we are let’s play with some clustered indexes. Let’s start by something simple, a clustered non unique index on the partitioned column.

CREATE CLUSTERED INDEX T1_clustered
   ON T1 (PartitionKey)

EXEC dbo.SeePartitions 'T1'

jsblog1

If we take a look to the index, no surprises either, it’s aligned using the Partition Scheme.

EXEC dbo.SeeIndex 'T1_clustered'

jsblog2

 

 

 

 

Let’s drop the previous index and check the results.

DROP INDEX T1_clustered ON T1

EXEC dbo.SeePartitions 'T1'

jsblog3

 

So we are back on the heap.

CREATE CLUSTERED INDEX T1_clustered
   ON T1 (ID)

EXEC dbo.SeePartitions 'T1'

EXEC dbo.SeeIndex 'T1_clustered'

jsblog4

What has happened there? Why does my index have two columns when I specified only one? If you take a closer look to the statement, we are creating the index on column “ID” not on the “PartitionKey”, that is not the column our table is partitioned on. As mentioned before, SQL Server will try to keep our indexes aligned unless we are very explicit about it and we tell it not to. Microsoft documentation states:

“When partitioning a nonunique clustered index, and the partitioning column is not explicitly specified in the clustering key, SQL Server adds the partitioning column by default to the list of clustered index keys.”

Now, if you look at that index in SSMS you will see that that column is not in the General tab, but it is specified in the Storage tab.

jsblog5

If you click “Script”, you will see what SQL Server has done behind the scenes.

CREATE CLUSTERED INDEX [T1_clustered] ON [dbo].[T1]
(
       [ID] ASC
) ON [PS_T1]([PartitionKey])

Ok, let’s drop the previous index and create a new one.

DROP INDEX T1_clustered ON T1

CREATE UNIQUE CLUSTERED INDEX T1_clustered
   ON T1 (ID)

jsblog6

We get a really good error, it means that the “PartitionKey” needs to be explicitly in the index because our index is unique. Ok, so let’s add it.

CREATE UNIQUE CLUSTERED INDEX T1_clustered
   ON T1 (ID, PartitionKey)

EXEC dbo.SeePartitions 'T1'

EXEC dbo.SeeIndex 'T1_clustered'

jsblog7

The above behaviour is the same for nonclustered indexes so, what if I do know what I’m doing and I really don’t want to have my indexes aligned. In that case you need to be explicit about the filegroup you want the index to live in.

CREATE UNIQUE NONCLUSTERED INDEX T1_nonclustered
   ON T1 (ID)
   ON [PRIMARY]

jsbog8

Conclusions

SQL server tries really hard to keep your indexes aligned for a very good reason, unaligned indexes will have a very painful consequence: switching partitions won’t be possible.

There are very few scenarios where performance can be improved by having unaligned indexes, like calculating a MAX without filtering by the partition key, a really good article can be found here, but rarely the benefits outweighs the cost.

Leave a Reply