What are Indexes
In this tutorial, we are going to discuss what indexes are in databases. When database queries start to become too slow, one of the very first things you should turn to is database indexing.
Indexes in databases are data structures that provide quick lookup of rows in a table based on the values of one or more columns. They enhance the performance of database queries by reducing the number of rows that need to be scanned when searching, sorting, or joining tables. Indexes are critical for optimizing the performance of database systems, especially in applications with large datasets or complex query requirements.
The goal of creating an index on a particular table in a database is to make it faster to search through the table and find the row or rows that we want. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.
Indexes are primarily used to improve the performance of SELECT queries by enabling the database system to locate rows more efficiently. They speed up data retrieval operations by reducing the number of rows that need to be scanned or searched.
Indexes are typically implemented as B-tree data structures, although other data structures such as hash indexes or bitmap indexes may also be used depending on the database system and the query requirements.
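To make the difference between these structures concrete, here is a minimal Python sketch. It is only an analogy: a dict stands in for a hash index and a sorted list stands in for a B-tree, and the `ages` column and its values are hypothetical. Real database engines implement both very differently, but the trade-off is the same: a hash-style index answers equality lookups, while an ordered index also answers range lookups.

```python
import bisect

# Hypothetical column values; the position in the list acts as the row id.
ages = [34, 19, 52, 27, 41, 19]

# Hash-style index: value -> list of row ids. Useful for equality lookups only.
hash_index = {}
for row_id, age in enumerate(ages):
    hash_index.setdefault(age, []).append(row_id)

# Ordered index: (value, row id) pairs kept sorted, standing in for a B-tree.
ordered_index = sorted((age, row_id) for row_id, age in enumerate(ages))


def range_lookup(low, high):
    """Return row ids with low <= age <= high using binary search."""
    start = bisect.bisect_left(ordered_index, (low, -1))
    end = bisect.bisect_right(ordered_index, (high, len(ages)))
    return [row_id for _age, row_id in ordered_index[start:end]]


print(hash_index[19])        # equality lookup through the hash-style index
print(range_lookup(25, 45))  # range lookup through the ordered index
```

The hash-style index has no notion of order, so a query such as "ages between 25 and 45" would have to check every key, which is why ordered structures are the usual default.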
Indexes can be created on one or more columns of a table. Single-column indexes are used to speed up queries that filter or sort data based on a single column, while composite indexes are used for queries that involve multiple columns.
Index Creation
- Indexes can be created using SQL CREATE INDEX statements. The syntax varies slightly between different database systems.
- Indexes can be created explicitly by database administrators or, in some database systems, automatically, for example to enforce primary key and unique constraints or through tuning features that react to query patterns and performance considerations.
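As a small illustration of index creation, the sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database. The `books` table, its columns, and the index names are assumptions made up for this example, and other database systems may need slightly different syntax.

```python
import sqlite3

# In-memory SQLite database; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT, writer TEXT, subject TEXT, published TEXT)"
)

# Single-column index: helps queries that filter or sort on title alone.
conn.execute("CREATE INDEX idx_books_title ON books (title)")

# Composite index: helps queries that filter on writer, or on writer and subject.
conn.execute("CREATE INDEX idx_books_writer_subject ON books (writer, subject)")

# Ask SQLite which indexes now exist on the table (column 1 is the index name).
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('books')"))
print(index_names)
```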
Index Usage
- Database queries can use indexes to quickly locate rows that match search conditions specified in WHERE clauses.
- Indexes also facilitate efficient sorting of query results and joining of tables by providing ordered access paths to the data.
- However, indexes may not always be used by the query optimizer if it determines that a full table scan or other access method is more efficient based on factors like data distribution, index selectivity, and query complexity.
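One way to see whether the optimizer actually uses an index is to ask the database for its query plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The `books` table and its data are hypothetical, and the exact wording of the plan text differs between SQLite versions and between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, writer TEXT)")
conn.executemany(
    "INSERT INTO books (title, writer) VALUES (?, ?)",
    [(f"Title {i}", f"Writer {i % 50}") for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM books WHERE title = ?"

# Without an index on title, SQLite has to scan the whole table.
before = plan(query, ("Title 42",))

# With an index on title, SQLite can search the index instead.
conn.execute("CREATE INDEX idx_books_title ON books (title)")
after = plan(query, ("Title 42",))

print("before:", before)
print("after: ", after)
```

Checking the plan like this is a cheap way to confirm that a new index is being picked up before relying on it.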
Example: A library catalog
A library catalog is a register that contains the list of books found in a library. The catalog is organized like a database table generally with four columns: book title, writer, subject, and date of publication. There are usually two such catalogs: one sorted by the book title and one sorted by the writer name. That way, you can either think of a writer you want to read and then look through their books or look up a specific book title you know you want to read in case you don’t know the writer’s name. These catalogs are like indexes for the database of books. They provide a sorted list of data that is easily searchable by relevant information.
Simply put, an index is a data structure that can be thought of as a table of contents that points us to the location where the actual data lives. So when we create an index on a column of a table, we store that column's values, along with a pointer to the whole row, in the index. Let’s assume a table containing a list of books; the following diagram shows what an index on the ‘Title’ column looks like:
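The same idea can be sketched in a few lines of Python. This is a simplified model, not how a real database stores its data: the table is a plain list, a row's position in the list plays the role of the pointer, and the sample rows are chosen only for illustration.

```python
import bisect

# The "table": rows in insertion order. A row's list position acts as its pointer.
books = [
    ("Ulysses", "James Joyce"),
    ("Dune", "Frank Herbert"),
    ("Emma", "Jane Austen"),
    ("Beloved", "Toni Morrison"),
]

# The "index" on title: (title, pointer) pairs kept sorted by title.
title_index = sorted((title, pos) for pos, (title, _writer) in enumerate(books))
titles = [title for title, _pos in title_index]


def find_by_title(title):
    """Binary-search the index, then follow the pointer to the full row."""
    i = bisect.bisect_left(titles, title)
    if i < len(titles) and titles[i] == title:
        return books[title_index[i][1]]
    return None


print(title_index)
print(find_by_title("Emma"))
```

The table itself stays in its original order. Only the index is sorted, which is what lets the lookup skip most of the rows.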
Purpose of Database Indexes
1. Faster Data Retrieval: Indexes significantly speed up query execution by providing a more efficient means of locating data, which can lead to a reduction in the number of disk I/O operations and CPU usage.
2. Sorting and Ordering: Indexes can be used to quickly sort and order the data in a table based on specific criteria, which can be useful for reporting or displaying data in a specific order.
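The sorting benefit can also be observed in a query plan. In the SQLite sketch below (again with a hypothetical `books` table), the plan for an ORDER BY query includes an explicit sort step until an index on the sort column exists. The plan wording is SQLite-specific and may vary by version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, writer TEXT)")
conn.executemany(
    "INSERT INTO books (title, writer) VALUES (?, ?)",
    [(f"Title {i:04d}", f"Writer {i % 50}") for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT title FROM books ORDER BY title"

# No index: SQLite reads the rows and sorts them in a temporary structure.
before = plan(query)

# Index on title: rows can be read already sorted, so no separate sort is needed.
conn.execute("CREATE INDEX idx_books_title ON books (title)")
after = plan(query)

print("before:", before)
print("after: ", after)
```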
How Indexes Improve Query Performance
1. Reduced Table Scans: By using an index, the database can avoid full table scans, which require reading every row in a table. Instead, the database looks up the search value in the index and follows its pointers straight to the matching rows, reducing the amount of data that needs to be read and processed.
2. Efficient Data Access: Indexes provide a more efficient means of accessing data by organizing it in a way that allows the database to quickly locate the rows that meet the query criteria.
3. Index Selectivity: Indexes with high selectivity can improve query performance by reducing the number of rows that need to be accessed. High selectivity means that the index can effectively filter out a large number of rows, thereby reducing the amount of work required to process a query.
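Selectivity can be estimated in several ways. One common approximation, used in the sketch below, is the ratio of distinct values to total rows in a column. The `emails` and `statuses` columns are hypothetical sample data, and a real optimizer relies on its own statistics rather than this simple ratio.

```python
def selectivity(values):
    """Ratio of distinct values to total rows; 1.0 means every value is unique."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical columns from a 1000-row users table.
emails = [f"user{i}@example.com" for i in range(1000)]
statuses = ["active" if i % 2 else "inactive" for i in range(1000)]

email_selectivity = selectivity(emails)      # every value is distinct
status_selectivity = selectivity(statuses)   # only two distinct values

print(email_selectivity, status_selectivity)
```

Under this measure, an index on the email column narrows a lookup to a single row, while an index on the status column still leaves about half of the table to read, so the optimizer may well prefer a full scan.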
How Indexes Decrease Write Performance
It’s important to note that while indexes can significantly improve query performance, they also come with some overhead. Indexes require additional storage space and can slow down write operations, such as INSERT, UPDATE, and DELETE, since the indexes must be updated along with the table data. Therefore, it’s essential to strike a balance between the number of indexes and their impact on query performance and storage requirements.
When adding rows or making updates to existing rows for a table with an active index, we not only have to write the data but also have to update the index. This will decrease the write performance. This performance degradation applies to all insert, update, and delete operations for the table. For this reason, adding unnecessary indexes on tables should be avoided and indexes that are no longer used should be removed.
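A rough way to observe this overhead is to time the same batch of inserts against a table with and without secondary indexes. The sketch below does that with an in-memory SQLite database. The `events` table, its columns, and the row count are assumptions for illustration, and the timings depend heavily on the machine, the database system, and the data, so treat the numbers as indicative only.

```python
import sqlite3
import time


def timed_insert(num_indexes, num_rows=20_000):
    """Insert num_rows rows into a fresh table that has num_indexes secondary indexes.

    Returns (elapsed_seconds, rows_in_table).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT)")
    for col in ["a", "b", "c"][:num_indexes]:
        conn.execute(f"CREATE INDEX idx_events_{col} ON events ({col})")

    # Scatter the values so index entries are not inserted in sorted order.
    rows = [
        (f"a{i * 7919 % num_rows}", f"b{i * 104729 % num_rows}", f"c{i}")
        for i in range(num_rows)
    ]

    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO events (a, b, c) VALUES (?, ?, ?)", rows)
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return elapsed, count


for n in (0, 3):
    elapsed, count = timed_insert(n)
    print(f"{n} indexes: inserted {count} rows in {elapsed:.3f}s")
```

Every extra index is one more structure that each insert has to update, so the run with three indexes would normally be expected to take longer than the run with none.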
To summarize, adding indexes is about improving the performance of search queries. If the database is a data store that is written to often and read from rarely, then slowing down the more common operation, writing, is probably not worth the improvement in read performance.
That’s all about what indexes are in databases. If you have any queries or feedback, please email us at contact@waytoeasylearn.com. Enjoy learning, enjoy system design!