SQL Indexes – Indexing Theory Simply, Balanced Tree, Heaps

Last modified date:

Correct table indexing in SQL Server is a base for good database performance during querying. You need to understand how SQL Server stores data into tables/indexes if you want to create appropriate sql indexes. It is also important to know how to approach these data correctly during querying.

How Does SQL Server Organize Data Physically?

File page is the smallest unit for reading and writing into db objects in SQL Server. Every single sql file page haves 8 KB and is related to one object (for example table or index). Individual file pages are organized into Extents. Each extent consists of 8 file pages.

SQL Extents

On the highest level, data are organized into 2 file types. We will focus on the first one (MDF) in this article

  • MDF files where the data is
  • LDF files of transactional log

Key to Understanding SQL Indexes? Logical Data Organization in SQL Server.

The way of physical data storing on disk was mentioned in the introduction. Now we finally get to the logical organization. Data are stored in file pages as was already mentioned. There is plenty of them and so there must be a system by which SQL server orientates in it.

Performance of SQL queries depends directly on ability of SQL engine to give out individual file pages to some table which we are trying to query. This performance depends on how are the tables logically organized.

We use special system objects called Index allocation map (IAM) for these purposes. Each table haves assigned at least one such object. These objects work on linking principle – it links individual file pages with tables.

According to number of IAM in a table,  we distinguish 2 organization methods – Heap and balanced tree

Heap – Not Organized Data

Heap means tables which are not organized in any way ­:) and have only 1 IAM (so called first IAM). It is like going for a book to the library which is completely unorganized. You would have to go through all the books to find the one you are looking for.

You will get a table of this type whenever you create it without the primary key or without indexes. It is simply a heap of unorganized file pages. SQL server must scan whole heap if we query such table with a condition or try JOIN with another table. This means every page file is scanned separately – and this takes a long while.

SQL heap

Source: Itzik Ben-Gan, Dejan Sarka, Ron Talmage. Querying Microsoft SQL Server 2012. Microsoft press, Vydání 2012. 752 stran. ISBN 0735666059

Balanced Tree – Organized Data

On the other hand, data organization such as balanced tree is something completely different. Table gets organized as a balanced tree anytime you create clustered index (i.e. primary key) above the table.

This architecture creates clusters and therefore is the scan for the records much faster. SQL Server does not have to scan whole table as in heap case. It searches individual clusters. Sql indexes therefore work like if you search library for books by genre and author.

Source: Itzik Ben-Gan, Dejan Sarka, Ron Talmage. Querying Microsoft SQL Server 2012. Microsoft press, Vydání 2012. 752 stran. ISBN 0735666059

Next articles will have me look at index fragmentation and I will also demonstrate, how to repair indexes automatically using SQL script

My name is Jan Zedníček and I work as a BI Developer at Kentico Software in Brno. Mostly you can see me there working in the office, but I also work partly as a freelancer. I have been working as a freelancer for many companies for more than 5 years, but Kentico is the matter of my heart. I used to work as a financial controller at companies like Aero Vodochody or Sberbank and I also used to be a bond program manager in Unicapital Investment group. When I am not at work, I like playing volleyball, chess, doing a workout in the gym and I enjoy tasting of best quality rums. I am trying to summarize all my knowledge on this website not to forget them (because of the rum effect, you know =) and to put them forward to anyone. Don´t worry about asking for help or writing some comments.

Category: Excel tutorials SQL Administration Tags:

About Ing. Jan Zedníček - BI Developer, Finance controller

My name is Jan Zedníček and I work as a BI Developer at Kentico Software in Brno. Mostly you can see me there working in the office, but I also work partly as a freelancer. I have been working as a freelancer for many companies for more than 5 years, but Kentico is the matter of my heart. I used to work as a financial controller at companies like Aero Vodochody or Sberbank and I also used to be a bond program manager in Unicapital Investment group. When I am not at work, I like playing volleyball, chess, doing a workout in the gym and I enjoy tasting of best quality rums. I am trying to summarize all my knowledge on this website not to forget them (because of the rum effect, you know =) and to put them forward to anyone. Don´t worry about asking for help or writing some comments.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.