Correct table indexing in SQL Server is a base for good database performance during querying. You need to understand how SQL Server stores data into tables/indexes if you want to create appropriate sql indexes. It is also important to know how to approach these data correctly during querying.

How Does SQL Server Organize Data Physically?

File page is the smallest unit for reading and writing into db objects in SQL Server. Every single sql file page haves 8 KB and is related to one object (for example table or index). Individual file pages are organized into Extents. Each extent consists of 8 file pages.

SQL Extents

On the highest level, data are organized into 2 file types. We will focus on the first one (MDF) in this article

  • MDF files where the data is
  • LDF files of transactional log

Key to Understanding SQL Indexes? Logical Data Organization in SQL Server.

The way of physical data storing on disk was mentioned in the introduction. Now we finally get to the logical organization. Data are stored in file pages as was already mentioned. There is plenty of them and so there must be a system by which SQL server orientates in it.

Performance of SQL queries depends directly on ability of SQL engine to give out individual file pages to some table which we are trying to query. This performance depends on how are the tables logically organized.

We use special system objects called Index allocation map (IAM) for these purposes. Each table haves assigned at least one such object. These objects work on linking principle – it links individual file pages with tables. According to number of IAM in a table,  we distinguish 2 organization methods – Heap and balanced tree

Heap – Not Organized Data

Heap means tables which are not organized in any way ­:) and have only 1 IAM (so called first IAM). It is like going for a book to the library which is completely unorganized. You would have to go through all the books to find the one you are looking for.

You will get a table of this type whenever you create it without the primary key or without indexes. It is simply a heap of unorganized file pages. SQL server must scan whole heap if we query such table with a condition or try JOIN with another table. This means every page file is scanned separately – and this takes a long while.

SQL heap
Source: Itzik Ben-Gan, Dejan Sarka, Ron Talmage. Querying Microsoft SQL Server 2012. Microsoft press, Vydání 2012. 752 stran. ISBN 0735666059

Balanced Tree – Organized Data

On the other hand, data organization such as balanced tree is something completely different. Table gets organized as a balanced tree anytime you create clustered index (i.e. primary key) above the table.

This architecture creates clusters and therefore is the scan for the records much faster. SQL Server does not have to scan whole table as in heap case. It searches individual clusters. Sql indexes therefore work like if you search library for books by genre and author.

Source: Itzik Ben-Gan, Dejan Sarka, Ron Talmage. Querying Microsoft SQL Server 2012. Microsoft press, Vydání 2012. 752 stran. ISBN 0735666059

Next articles will have me look at index fragmentation and I will also demonstrate, how to repair indexes automatically using SQL script

Rate this post

Ing. Jan Zedníček - Data Engineer & Controlling

My name is Jan Zedníček and I have been working as a freelancer for many companies for more than 10 years. I used to work as a financial controller, analyst and manager at many different companies in field of banking and manufacturing. When I am not at work, I like playing volleyball, chess, doing a workout in the gym.

🔥 If you found this article helpful, please share it or mention me on your website

Leave a Reply

Your email address will not be published. Required fields are marked *