MySQL Indexing
We hope that this brief introduction can help users get much more out of their MySQL database since in the time we’ve been helping people host their web applications, the most frequent cause of performance issues we’ve observed has been missing or improperly used MySQL indexes.
What are MySQL Indexes?
The database server’s task is to search through all the rows in your table to locate those that match your request each time your web application executes a database query using a WHERE clause. Each time more rows need to be examined as the table gets bigger.
By taking data from a column in your table and storing it alphabetically in a different area called an index, indexes provide a solution to this issue that is precisely the same as the index in a reference book. All data types can be processed using the same method; for instance, dates and numbers are stored in date order and respect.
Making a Basic Index
It is worthwhile to learn how to generate indexes using your web framework, such as migrations in Ruby on Rails, but in this article, I will share the raw MySQL commands. Many web frameworks will help with the formation of indexes.
In this illustration, a group of kids in a very sizable school will be seated at a table. Given that our institution has 100,000 pupils, this is a sizable table. The table’s first few rows appear as follows:
ID | First Name | Last Name | Class |
---|---|---|---|
1 | ravi | kumar | 9A |
2 | shiva | roi | 9B |
3 | ramya | kumari | 9A |
4 | divya | singh | 9B |
The following query was used to construct the table:
CREATE TABLE `studentsData` ( `id` int(11) NOT NULL AUTO_INCREMENT, `firstName` varchar(255) DEFAULT NULL, `lastName` varchar(255) DEFAULT NULL, `class` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB
The following query was used to create the index for a column:
CREATE INDEX by_lastName ON studentsData (`lastName`);
Our website application runs these three queries in the table:
Find a student using their ID
Find students using their last names
List every pupil in the class.
We must make sure that each of these columns has its own index because each of those queries looks for information in a single column.
The first query, SELECT * FROM students WHERE id = 1, is unique since it uses the PRIMARY KEY to look up a row. No more optimization is necessary because we already know precisely which row we desire. When it is practicable, it is desirable to always look up rows in this manner, and nearly all tables should have a distinct column specified as the primary key in this manner.
The second query, SELECT * FROM students WHERE lastName = “kumar” will search a column for which there is currently no index, and this is precisely the kind of query that, if left unchecked, will scan the whole table. Let’s immediately make an index for that column:
CREATE INDEX by_lastName ON studentsData (`lastName`);
By doing this, we can quickly search these up because the table will now have an index named by_lastName that contains indexes that copy the lastName column.
We can quickly look up the students in a certain class by employing the exact same concept in the class column. We should make the following straightforward index on it:
CREATE INDEX by_class ON studentsData (`class`);
Querying Multiple Columns
What happens when we search the database for many columns and our queries get a little more complicated? In this instance, we are looking for a student by the name of Smith because we are aware of their class.
SELECT * FROM studentsData WHERE class = 9A AND lastName = ‘kumar'
We won’t gain from using both of our current indexes to execute this query because MySQL typically uses one index to query the table. But at this point, we also don’t need to create any more indexes!
The database server will examine the table and find that there is an index on class and that there are only roughly 20 students in each class. The 20 students in the class will be located using the by_class index we already constructed, and then each row’s lastName will be manually checked. In comparison to scanning all 100,000 rows, checking 20 rows causes the server no trouble, and we’ve avoided memory waste by not adding any more indexes.
Indexing Multiple Columns
Though we had small class numbers in our previous example, what if we occasionally had courses with more than 100 students? In this situation, it would be wise to expand our by_class index so that it also contains the lastName. This would need more memory but allow us to effectively search on both columns at once.
DROP INDEX by_class ON studentsData; CREATE INDEX by_class_and_lastName ON studentsData (class, lastName);
We removed the by_class index, but why? Because our new index will allow us to search by class, last name, and solely by class, it will make the by_class index unnecessary.
What makes us keep by_last_name then? Indexes, however, can only be utilized from the very beginning. The query must use the index’s parts sequentially, starting with the first, even though the entire index need not be used.
Choosing Columns to Index
It is extremely feasible to index your database without looking at your queries by simply scanning through your tables and indexing columns that will obviously be searched on. Columns like the following should nearly always be indexed:
- columns (typically with a _id at the end) that link with other tables.
- Every time a user checks in, their usernames and/or email addresses are searched for.
- whatever field a URL uses. A uuid field is commonly available in apps for this use. Keep in mind that id ought to be a primary key already.
Summary
You are probably missing an index on your table if you notice that your SELECT queries are taking a long time to execute. Consider which columns are being searched on, and use the following syntax to build an index
CREATE INDEX by_studentId ON grades (studentId);
Starting with indexing columns that link with other tables (whose names typically finish in _id) is always a good idea. Look for columns you know you’ll frequently search on after that. Concentrate on the biggest tables.