A technique to speed up data retrieval in databases by creating a separate data structure (index) that stores search keys and pointers to the actual data.
In large databases, searching row-by-row is too slow. Indexing allows databases to find, fetch, and sort data faster by reducing the number of disk accesses needed. It’s like an optimized table of contents for your database.
Use indexing when you need to make query operations (like SELECT, JOIN, ORDER BY) significantly faster on large datasets.
You need to know
Choosing the right columns to index is important. Indexing improves read performance but can slow down writes (inserts/updates/deletes) because the index also has to be updated.
Dense vs Sparse Indexing:
Dense: Every record has an index entry.
Sparse: Only selected records have an index entry, pointing to blocks where others can be found.
Clustered vs Non-Clustered Indexing:
Clustered: Data is ordered to match the index.
Non-Clustered: Index contains pointers to the unordered data.
Multilevel Indexing:
If indexes grow too big, a hierarchy (index on the index) can be built to minimize memory usage and access time.
Popular technologies (indexing algorithms)
B+ Trees are the most commonly used index structure. It supports both exact match and range queries efficiently
Hash Indexes are best for exact matches but not range queries. Hashing is fast for direct lookups (
WHERE id = 123
).
Like posts like this?
Every week, you'll get a new system design concept, broken down like this one.
Free subscribers also get a little bonus:
🎁 The System Design Interview Preparation Cheat Sheet
If you're into visuals, paid subscribers unlock:
→ My Excalidraw system design template – so you have somewhere to start
→ My Excalidraw component library – used in the diagram of this issue
No pressure though. Your support helps me keep writing, and I appreciate it more than you know ❤️