An index is a data structure that makes lookups fast. Without one, the database scans every row. With the right index, it jumps directly to the rows you need. Understanding indexes is the single highest-leverage database skill.
How B-Tree Indexes Work
Most database indexes are B-trees (balanced trees). Think of it like a phone book: instead of reading every entry, you jump to the right section, then the right page, then the right entry. The cost goes from O(n) to O(log n).
When you create an index:
CREATE INDEX idx_users_email ON users(email);
The database builds a separate sorted data structure pointing from email values to the corresponding rows. Lookups by email are now fast.
Composite Indexes
Indexes on multiple columns follow the “leftmost prefix” rule:
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
This index helps with:
WHERE user_id = 123(uses first column)WHERE user_id = 123 AND created_at > '2026-01-01'(uses both columns)WHERE user_id = 123 ORDER BY created_at(uses both for filter and sort)
It does not help with:
WHERE created_at > '2026-01-01'alone (can’t skip the first column)
Column order in composite indexes matters. Put the most selective column first, and the column you range-scan on last.
When Indexes Hurt
Indexes aren’t free:
- Write overhead. Every INSERT, UPDATE, and DELETE must update the index. More indexes = slower writes.
- Storage. Indexes consume disk space. A table with 10 indexes might have more index storage than data storage.
- Maintenance. Indexes can become fragmented over time and need rebuilding.
The EXPLAIN Plan
Always check whether your index is being used:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 42
AND status = 'pending'
ORDER BY created_at DESC
LIMIT 10;
Look for:
- Index Scan or Index Only Scan: Your index is being used.
- Seq Scan: Full table scan. Either no suitable index exists or the optimizer decided scanning is cheaper (small tables, low selectivity).
- Bitmap Index Scan: The database is combining multiple indexes.
Rules of Thumb
- Index columns in WHERE, JOIN, and ORDER BY clauses.
- Don’t index everything. If a column has low cardinality (e.g., boolean), an index probably won’t help.
- Cover your queries. An “index-only scan” is fastest because the database never reads the table itself.
- Monitor slow queries. Let real workload data guide your indexing decisions, not guesses.
- Remove unused indexes. They cost write performance for zero benefit.
Good indexing is the difference between a query that takes 500ms and one that takes 2ms. It’s worth understanding deeply.