Database Indexes

Some are excerpts from Grokking System Design

The Concept

Simply saying, an index is a data structure that can be perceived as a table of contents that points us to the location where actual data lives.

Goal of Indexing

The goal of creating an index on a particular table in a database is to make it faster to search through the table and find the row or rows that we want.

Type of Indexing

  • Primary Index − Primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.

  • Secondary Index − Secondary index may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.

  • Clustering Index − Clustering index is defined on an ordered data file. The data file is ordered on a non-key field.

When to Use Indexing

In the case of data sets that are many terabytes in size but with very small payloads (e.g., 1 KB), indexes are a necessity for optimizing data access.

Cons of Using Indexes

How do Indexes decrease write performance? since we not only have to write the data but also have to update the index.

So adding indexes can increase the read performance, but at the same time, decrease the write performance.

This performance degradation applies to all insert, update, and delete operations for the table. Thus, adding unnecessary indexes on tables should be avoided, and indexes that are no longer used should be removed.

Tradeoff

Is the database to provide a data store more often written or read from? If writes are more common operation, then it's probably not worth using indexing.

Reference & Reading List

Last updated