So far in this MongoDB Schema Design Anti-Patterns series, we've discussed avoiding massive arrays as well as a massive number of collections.
Today, let's talk about indexes. Indexes are great (seriously!), but it's easy to get carried away and make indexes that you'll never actually use. Let's examine why an index may be unnecessary and what the consequences of keeping it around are.
#Unnecessary Indexes
Before we go any further, we want to emphasize that indexes are good. Indexes allow MongoDB to efficiently query data. If a query does not have an index to support it, MongoDB performs a collection scan, meaning that it scans every document in a collection. Collection scans can be very slow. If you frequently execute a query, make sure you have an index to support it.
Now that we have an understanding that indexes are good, you might be wondering, "Why are unnecessary indexes an anti-pattern? Why not create an index on every field just in case I'll need it in the future?"
We've discovered three big reasons why you should remove unnecessary indexes:
- Indexes take up space. Each index is at least 8 kB and grows with the number of documents associated with it. Thousands of indexes can begin to drain resources.
- Indexes can impact the storage engine's performance. As we discussed in the previous post in this series about the Massive Number of Collections Anti-Pattern, the WiredTiger storage engine (MongoDB's default storage engine) stores a file for each collection and for each index. WiredTiger will open all files upon startup, so performance will decrease when an excessive number of collections and indexes exist.
- Indexes can impact write performance. Whenever a document is created, updated, or deleted, any index associated with that document must also be updated. These index updates negatively impact write performance.
In general, we recommend limiting your collection to a maximum of 50 indexes.
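One way to check a database against this guideline is to count the indexes on each collection. The following is a minimal sketch intended for mongosh. The helper name `collectionsOverIndexLimit` is hypothetical, and the database handle is passed in as a parameter so the function can be tried against any database.

```javascript
// Hypothetical helper: list collections whose index count exceeds a limit.
// Intended for mongosh, where `db` is the current database handle.
function collectionsOverIndexLimit(db, limit = 50) {
  const offenders = [];
  // Restrict the listing to real collections, since views do not have indexes.
  for (const { name } of db.getCollectionInfos({ type: "collection" })) {
    // getIndexes() returns one document per index, including the default _id index.
    const count = db.getCollection(name).getIndexes().length;
    if (count > limit) {
      offenders.push({ collection: name, indexes: count });
    }
  }
  return offenders;
}

// In mongosh: collectionsOverIndexLimit(db)
```

An empty result means every collection in the database is within the limit.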
To avoid the anti-pattern of unnecessary indexes, examine your database and identify which indexes are truly necessary. Unnecessary indexes typically fall into one of two categories:
- The index is rarely used, or not used at all.
- The index is redundant because another compound index covers it.
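The second category can be checked mechanically: an index is a candidate for removal when its keys form a leading prefix of another index's keys. The sketch below is one way to express that check in plain JavaScript. The helper name and the sample field names (`category`, `price`) are hypothetical, and the check deliberately ignores index options such as unique, sparse, partial, or TTL settings, which can make a prefix index worth keeping.

```javascript
// Hypothetical helper: find indexes whose keys are a strict leading prefix
// of another index. Each entry mirrors the { name, key } shape of getIndexes().
function findPrefixRedundantIndexes(indexes) {
  const redundant = [];
  for (const candidate of indexes) {
    const candidateKeys = Object.entries(candidate.key);
    const coveredBy = indexes.find((other) => {
      const otherKeys = Object.entries(other.key);
      return (
        other !== candidate &&
        otherKeys.length > candidateKeys.length &&
        // Every field and direction must match, in the same position.
        candidateKeys.every(
          ([field, dir], i) => otherKeys[i][0] === field && otherKeys[i][1] === dir
        )
      );
    });
    if (coveredBy) {
      redundant.push({ name: candidate.name, coveredBy: coveredBy.name });
    }
  }
  return redundant;
}

// Hypothetical sample data for illustration only.
const sampleIndexes = [
  { name: "_id_", key: { _id: 1 } },
  { name: "category_1", key: { category: 1 } },
  { name: "price_1", key: { price: 1 } },
  { name: "category_1_price_1", key: { category: 1, price: 1 } },
];
console.log(findPrefixRedundantIndexes(sampleIndexes));
```

With the sample data, only `category_1` is reported, because `price` is not the leading field of the compound index.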
#Example
Consider Leslie from the incredible TV show Parks and Recreation. Leslie often looks to other powerful women for inspiration.
Let's say Leslie wants to inspire others, so she creates a website about her favorite inspirational women. The website allows users to search by full name, last name, or hobby.
Leslie chooses to use MongoDB Atlas to create her database. She creates a collection named `InspirationalWomen`. Inside of that collection, she creates a document for each inspirational woman. Below is a document she created for Sally Ride.
```
// InspirationalWomen collection

{
  "_id": {
    "$oid": "5ec81cc5b3443e0e72314946"
  },
  "first_name": "Sally",
  "last_name": "Ride",
  "birthday": {
    "$date": "1951-05-26T00:00:00.000Z"
  },
  "occupation": "Astronaut",
  "quote": "I would like to be remembered as someone who was not afraid to do what she wanted to do, and as someone who took risks along the way in order to achieve her goals.",
  "hobbies": [
    "Tennis",
    "Writing children's books"
  ]
}
```
Leslie eats several sugar-filled Nutriyum bars, and, riding her sugar high, creates an index for every field in her collection.
[Image: "There's a secret ingredient in these Nutriyum bars that make me feel so good."]
She also creates a compound index on the `last_name` and `first_name` fields, so that users can search by full name. Leslie now has one collection with eight indexes:
- `_id` is indexed by default (see the MongoDB Docs for more details)
- `{ first_name: 1 }`
- `{ last_name: 1 }`
- `{ birthday: 1 }`
- `{ occupation: 1 }`
- `{ quote: 1 }`
- `{ hobbies: 1 }`
- `{ last_name: 1, first_name: 1 }`
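For readers who want to reproduce this setup, the index list above could be created with something like the following mongosh sketch. The helper name `createLeslieIndexes` is hypothetical, and the collection handle is passed in as a parameter. The article does not say how Leslie actually created her indexes, so this is only one plausible way.

```javascript
// Sketch for mongosh: create the seven indexes Leslie added herself.
// The _id index is not listed because MongoDB creates it automatically.
function createLeslieIndexes(coll) {
  const keySpecs = [
    { first_name: 1 },
    { last_name: 1 },
    { birthday: 1 },
    { occupation: 1 },
    { quote: 1 },
    { hobbies: 1 },
    { last_name: 1, first_name: 1 },
  ];
  const results = [];
  for (const spec of keySpecs) {
    // Collect whatever createIndex() returns for each key specification.
    results.push(coll.createIndex(spec));
  }
  return results;
}

// In mongosh: createLeslieIndexes(db.InspirationalWomen)
```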
Leslie launches her website and is excited to be helping others find inspiration. Users are discovering new role models as they search by full name, last name, and hobby.
#Removing Unnecessary Indexes
Leslie decides to fine-tune her database and wonders if all of those indexes she created are really necessary.
She opens the Atlas Data Explorer and navigates to the Indexes pane. She can see that the only two indexes that are being used are the compound index named `last_name_1_first_name_1` and the `hobbies_1` index. She realizes that this makes sense.
Her queries for inspirational women by full name are covered by the `last_name_1_first_name_1` index. Additionally, her query for inspirational women by last name is covered by the same `last_name_1_first_name_1` compound index since the index has a `last_name` prefix. Her queries for inspirational women by hobby are covered by the `hobbies_1` index. Since those are the only ways that users can query her data, the other indexes are unnecessary.
[Figure: Screenshot of the Atlas Data Explorer's Indexes pane]
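The same usage information is also available from the shell through the `$indexStats` aggregation stage, which emits one document per index with an `accesses.ops` counter. The following mongosh sketch summarizes that output. The helper name `indexUsage` is hypothetical, and the counters only cover the period since the `accesses.since` timestamp on the server that answered the query.

```javascript
// Hypothetical helper for mongosh: report how often each index has been used.
function indexUsage(coll) {
  // $indexStats returns one document per index on the collection.
  const stats = coll.aggregate([{ $indexStats: {} }]).toArray();
  return stats
    // Number() handles both plain numbers and 64-bit integer wrappers.
    .map((s) => ({ name: s.name, ops: Number(s.accesses.ops) }))
    // Least-used indexes first, so removal candidates appear at the top.
    .sort((a, b) => a.ops - b.ops);
}

// In mongosh: indexUsage(db.InspirationalWomen)
```

Indexes that report zero operations over a representative period are candidates for removal, with the exception of the `_id` index.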
In the Data Explorer, Leslie has the option of dropping all of the other unnecessary indexes. Since MongoDB requires an index on the `_id` field, she cannot drop this index.
In addition to using the Data Explorer, Leslie also has the option of using MongoDB Compass to check for unnecessary indexes. When she navigates to the Indexes pane for her collection, she can once again see that the `last_name_1_first_name_1` and the `hobbies_1` indexes are the only indexes being used regularly. Just as she could in the Atlas Data Explorer, Leslie has the option of dropping each of the indexes except for `_id`.
[Figure: Screenshot of Compass's Indexes pane]
Leslie decides to drop all of the unnecessary indexes. After doing so, her collection now has the following indexes:
- `_id` is indexed by default
- `{ hobbies: 1 }`
- `{ last_name: 1, first_name: 1 }`
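Leslie drops her indexes through the UI, but the same cleanup could be scripted. The mongosh sketch below drops every index that is not on a keep list. The helper name `dropIndexesExcept` is hypothetical, and the index names assume MongoDB's default naming (for example, `hobbies_1`).

```javascript
// Hypothetical helper for mongosh: drop every index not named in keepNames.
function dropIndexesExcept(coll, keepNames) {
  const dropped = [];
  for (const { name } of coll.getIndexes()) {
    // The default _id index cannot be dropped, so always skip it.
    if (name === "_id_" || keepNames.includes(name)) continue;
    coll.dropIndex(name);
    dropped.push(name);
  }
  return dropped;
}

// In mongosh:
// dropIndexesExcept(db.InspirationalWomen, ["hobbies_1", "last_name_1_first_name_1"])
```

The function returns the names of the indexes it dropped, which makes the result easy to review or log.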
#Summary
Creating indexes that support your queries is good. Creating unnecessary indexes is generally bad.
Unnecessary indexes reduce performance and take up space. An index is considered to be unnecessary if (1) it is not frequently used by a query or (2) it is redundant because another compound index covers it.
You can use the Atlas Data Explorer or MongoDB Compass to help you discover how frequently your indexes are being used. When you discover an index is unnecessary, remove it.
Be on the lookout for the next post in this anti-patterns series!