The 7 Postgres Indexes That Took My API From 400ms to 40ms
- I took one of my APIs from 400ms p95 to 40ms p95 by fixing 7 missing or wrong Postgres indexes.
- Most of my slow queries were not slow because of bad SQL; they were slow because Postgres had to scan the whole table.
- Partial indexes and covering indexes did more for me than plain B-tree indexes on primary columns.
- `EXPLAIN ANALYZE` and `pg_stat_statements` are the two tools that found every real bottleneck for me.
The API in question powers my analytics dashboard. Nothing fancy. A few endpoints that read from a Postgres database with around 8 million rows spread across 14 tables. It had gotten slow. 400ms p95 on the hot endpoints, 900ms p99, timeouts on the reports page when I filtered by date range. I spent a Saturday looking at indexes instead of rewriting code, and that Saturday is the reason the dashboard now answers in 40ms. This post is what I changed, in order, with the numbers before and after. Postgres indexes are the lever I keep underestimating, and I want to stop doing that.
I am going to assume you know roughly what an index is. What I am not assuming is that you have seen the specific index shapes that saved me the most time. Partial indexes, covering indexes, and expression indexes do not get enough airtime in the tutorials I learned from. They should.
The 400ms Baseline: What My Queries Actually Looked Like
Before touching anything, I turned on `pg_stat_statements` and let it run for a full day. This is step zero. If you do not know which queries are slow, you will optimize the wrong things. I have done it. I have spent an hour tuning a query that accounts for 0.4% of total query time while the one eating 41% of the database kept running untouched.
```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT
    substring(query, 1, 80) AS query,
    calls,
    round(total_exec_time::numeric, 1) AS total_ms,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 1) AS pct
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
```
The results were humbling. Four queries were responsible for 78% of total execution time. Every one of them was doing a sequential scan on tables with millions of rows. I ran `EXPLAIN (ANALYZE, BUFFERS)` on each and confirmed it. Seq Scan, Seq Scan, Seq Scan, and a Nested Loop with a Seq Scan on the inner side.
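That confirmation step looks like this. The query shape here (a user's recent events) is the same one the article's hot endpoints run; the exact literal is illustrative:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 50;

-- Before indexing, the plan contains lines like:
--   Seq Scan on events  (cost=... rows=...) (actual time=...)
--   Buffers: shared read=...
-- "shared read" is blocks pulled from disk; a high number on a
-- hot query is the smell you are looking for.
```

`BUFFERS` is the flag worth remembering: timing alone can lie when the cache is warm, but buffer counts tell you how much data the plan actually touched.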
This is the honest baseline I started from:
| Endpoint | p50 | p95 | p99 | Worst query |
|---|---|---|---|---|
| GET /events | 120ms | 410ms | 920ms | filter by user_id + created_at |
| GET /reports | 280ms | 890ms | 2100ms | aggregate by date range |
| GET /users/:id/summary | 60ms | 220ms | 480ms | join users to events |
| GET /search | 180ms | 520ms | 1400ms | ILIKE on title |
I want to be clear that this was not a case of Postgres being slow. Postgres was doing exactly what I told it to do. I had not told it how to find the rows I cared about without reading every row, so it read every row. That is on me.
The 7 Indexes That Actually Moved the Needle
These are the seven indexes I added, in the order I added them, with the before and after for the query that changed most.
1. Composite index on (user_id, created_at DESC)
The events table is ordered by write time. My queries are ordered by (user_id, created_at DESC) because I always want a user's newest events first. A single-column index on `user_id` is not enough. Postgres can find the user's rows but then has to sort them by `created_at`, which on a busy user is a lot of rows.
```sql
CREATE INDEX CONCURRENTLY idx_events_user_created
    ON events (user_id, created_at DESC);
```
Before: 410ms p95. After: 24ms p95. One index, 17x faster, and this was the single biggest win of the day. `CONCURRENTLY` matters. On a table I cannot take offline, a non-concurrent `CREATE INDEX` holds a lock that blocks writes. Always concurrent for production tables.
2. Partial index for unread events
A specific subset of events is "unread". That subset is less than 2% of total rows. A full index on `read_at` would index every row, read or not. A partial index stores only the rows matching the predicate, so it is smaller, faster to read, and cheaper to maintain.
```sql
CREATE INDEX CONCURRENTLY idx_events_unread
    ON events (user_id, created_at DESC)
    WHERE read_at IS NULL;
```
Before: the notification badge count query took 180ms. After: 3ms. The index is 94% smaller than the full equivalent, which also means it fits in cache and stays there.
3. Covering index for the hottest list query
This is the one I wish I had learned five years earlier. When Postgres uses an index to find rows, it still has to fetch the row data from the heap unless all the columns you need are inside the index itself. A covering index adds those columns via `INCLUDE`, so the heap fetch never happens. This is called an index-only scan, and it is as fast as Postgres gets.
```sql
CREATE INDEX CONCURRENTLY idx_events_list_covering
    ON events (user_id, created_at DESC)
    INCLUDE (title, type, read_at);
```
`EXPLAIN` went from an `Index Scan` with heap fetches to an `Index Only Scan`. The list endpoint that renders the dashboard feed dropped from 96ms to 11ms. One caveat: index-only scans depend on the visibility map, so a heavily updated table can keep showing `Heap Fetches` in the plan until autovacuum catches up. The trade-off is disk space. My covering index is 2.3x the size of the plain version. On an 8 million row table, that is about 240 MB. Worth it for a hot endpoint.
4. Expression index for case-insensitive search
My search endpoint was doing `LOWER(title) ILIKE '%term%'`. A leading wildcard cannot use a B-tree index no matter what, so that stays a sequential scan. For the prefix case, I can use an expression index on `LOWER(title)` with the right operator class, as long as the query is rewritten to match the indexed expression: `LOWER(title) LIKE 'term%'`.
```sql
CREATE INDEX CONCURRENTLY idx_events_title_lower
    ON events (LOWER(title) text_pattern_ops);
```
Searches that start from the beginning of the title dropped from 520ms to 14ms. For full substring search I eventually added `pg_trgm` and a GIN index, but that is another post. Start with the cheap win. Most of my search queries were prefix matches anyway.
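For the record, the trigram version for true substring search looks roughly like this; the index name is mine, and `gin_trgm_ops` comes with the `pg_trgm` extension:

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX CONCURRENTLY idx_events_title_trgm
    ON events USING GIN (title gin_trgm_ops);

-- With this in place, even leading-wildcard searches can use the index:
--   SELECT * FROM events WHERE title ILIKE '%term%';
```

The trigram index is bigger and slower to maintain than the expression index above, which is why the prefix index is the right first move.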
5. GIN index on JSONB metadata
My events have a JSONB `metadata` column with things like device type, country, referrer. I was filtering on `metadata ->> 'country' = 'DE'` all over the place. Postgres supports GIN indexes on JSONB, and the `jsonb_path_ops` operator class is smaller and faster than the default for containment checks. The catch: no GIN index helps a `->>` equality filter. I had to rewrite those filters as containment queries, `metadata @> '{"country": "DE"}'`, which the index does support.
```sql
CREATE INDEX CONCURRENTLY idx_events_metadata_gin
    ON events USING GIN (metadata jsonb_path_ops);
```
Country filter queries dropped from 320ms to 18ms. The index is larger than a B-tree (28% of the table size in my case), but it covers any key inside the JSONB object. One index, many queries, all fast.
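Side by side, the rewrite that makes the index usable:

```sql
-- Uses the jsonb_path_ops GIN index (containment):
SELECT count(*)
FROM events
WHERE metadata @> '{"country": "DE"}';

-- Does NOT use any GIN index (key extraction + equality,
-- falls back to a sequential scan):
--   WHERE metadata ->> 'country' = 'DE'
```

Same rows returned, completely different plan. This is an easy one to miss because both queries are correct; only one is indexable.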
6. BRIN index for time-series scans
This one is niche but saved me on the reports endpoint. When your data is naturally ordered on disk by a column (in my case, `created_at`, because I never update old events), a BRIN index stores tiny summaries per block of pages instead of one entry per row. It is about 1000x smaller than an equivalent B-tree. For range scans across millions of rows, it is often fast enough and cheap enough to leave on every table.
```sql
CREATE INDEX CONCURRENTLY idx_events_created_brin
    ON events USING BRIN (created_at) WITH (pages_per_range = 32);
```
The report aggregate that scanned 90 days of data dropped from 890ms to 62ms. The index is 4 MB on an 8 million row table. Four megabytes.
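The query shape that benefits is a plain range scan over a column that correlates with physical row order. A query in the shape of my report aggregate:

```sql
-- BRIN lets the planner skip every block range whose created_at
-- summary falls outside the 90-day window.
SELECT date_trunc('day', created_at) AS day,
       count(*)                      AS events
FROM events
WHERE created_at >= now() - interval '90 days'
GROUP BY 1
ORDER BY 1;
```

If the table's rows are not physically ordered by the column (lots of updates, or out-of-order inserts), BRIN degrades badly; check correlation in `pg_stats` before reaching for it.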
7. Foreign key index on the join side
I had a foreign key from `events.organization_id` to `organizations.id` but no index on `events.organization_id`. Postgres does not create one for you when you add a foreign key. The join from organizations to events was a sequential scan on the events table every time.
```sql
CREATE INDEX CONCURRENTLY idx_events_org_id
    ON events (organization_id);
```
The organization summary endpoint dropped from 480ms to 9ms. I went back and audited every foreign key in my schema. I was missing three more. I added them all.
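The audit can be automated. A sketch of the catalog query I mean, restricted to single-column foreign keys for simplicity; it lists FK columns that no index on the table leads with:

```sql
SELECT
    c.conrelid::regclass AS table_name,
    a.attname            AS fk_column,
    c.conname            AS constraint_name
FROM pg_constraint c
JOIN pg_attribute a
  ON a.attrelid = c.conrelid
 AND a.attnum   = c.conkey[1]
WHERE c.contype = 'f'              -- foreign key constraints only
  AND NOT EXISTS (
        SELECT 1
        FROM pg_index i
        WHERE i.indrelid = c.conrelid
          AND i.indkey[0] = c.conkey[1]  -- some index leads with the FK column
  );
```

Anything this returns is a join or a cascaded delete waiting to seq-scan.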
Where I Got Indexes Wrong
I also made mistakes. A few of the indexes I added first turned out to be useless or actively harmful, and I want to save you the debugging time.
Indexes I never used. My first instinct was to index every column that appeared in a `WHERE` clause. Postgres has `pg_stat_user_indexes` to tell you which indexes are actually being read. After a week of production traffic, I queried it and found three indexes with zero scans. I dropped them.
```sql
SELECT
    schemaname, relname, indexrelname,
    idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```
Every unused index is pure cost. It slows down writes, bloats backups, competes for cache. If it is not being scanned, drop it.
Indexes on low-cardinality columns. I indexed a `status` column with 4 distinct values. Postgres looked at the stats and chose a sequential scan anyway, because fetching 40% of the rows through an index is slower than just reading the table. The correct fix was a partial index, like the unread one above. Whenever a column has few distinct values but you mostly query one specific value, a partial index wins over a full index.
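The shape of that fix, assuming the value you hunt for is something like `'pending'` (the column and value here are illustrative):

```sql
-- A full index on status is mostly dead weight: the common values
-- cover too much of the table for the planner to bother with it.
-- The partial index stores only the rare rows you actually query.
CREATE INDEX CONCURRENTLY idx_events_pending
    ON events (created_at DESC)
    WHERE status = 'pending';
```

The query must repeat the predicate (`WHERE status = 'pending'`) for the planner to match it to the partial index.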
Indexing before migration finishes. I once added an index during a schema migration on a table that was being heavily written to. Without `CONCURRENTLY`, `CREATE INDEX` takes a `SHARE` lock that blocks every write to the table; mine held it for 90 seconds and blocked production writes the entire time. That incident is why every index I create now has `CONCURRENTLY` in the statement, no exceptions, even in development, because muscle memory is how you avoid incidents.
Over-covering indexes. My second attempt at the covering index on events had `INCLUDE (title, type, read_at, metadata, raw_payload)`. Postgres refused to do an index-only scan anyway because `raw_payload` was too large. Covering indexes make sense when the included columns are small and hot. Large blobs belong in the heap.
Monitoring: How I Know an Index Is Earning Its Keep
After the dust settled, I set up three queries I now run weekly. Together they tell me whether my indexes are doing useful work.
Index hit ratio. The ratio of index reads coming from cache vs disk. Below 99% on a busy table is a sign the working set does not fit in memory or the indexes are too big.
```sql
SELECT
    relname,
    idx_blks_read,
    idx_blks_hit,
    round(100.0 * idx_blks_hit / nullif(idx_blks_hit + idx_blks_read, 0), 2) AS hit_pct
FROM pg_statio_user_indexes
WHERE idx_blks_read > 0
ORDER BY idx_blks_read DESC
LIMIT 20;
```
Bloat. Indexes accumulate dead tuples over time, especially on tables with lots of updates. The `pgstattuple` extension gives you a real number. Above 30% bloat, I run `REINDEX CONCURRENTLY` (available in Postgres 12+). Before 12 I used `pg_repack`.
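For B-tree indexes specifically, `pgstattuple` ships a `pgstatindex` function whose leaf density is the number I watch; the index name here is the one from step 1:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- avg_leaf_density well below ~70% means the index pages are
-- mostly empty space, i.e. bloat; leaf_fragmentation shows
-- out-of-order leaf pages.
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('idx_events_user_created');

-- The fix, on Postgres 12+, without blocking writes:
--   REINDEX INDEX CONCURRENTLY idx_events_user_created;
```

Note that `pgstatindex` reads the whole index, so run it off-peak on big ones.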
Unused indexes. The same `pg_stat_user_indexes` query from above, run on a fresh snapshot once a month. Anything at zero scans gets dropped.
None of this is exotic. All of it is built into Postgres. Most of it I learned by reading the docs after the fact, instead of before. If you do this before you go to production, you are ahead of where I was.
Bottom Line
Indexes are the highest leverage work I have ever done on a database and the one I keep putting off because it feels less fun than writing new features. That is a trap. A Saturday on indexes gave me a 10x speedup on the user-visible parts of my dashboard and retired three tickets about slowness that had been open for weeks. The tools to find the right indexes are already in your database. `pg_stat_statements` for the hottest queries. `EXPLAIN ANALYZE` for the execution plan. `pg_stat_user_indexes` for usage. If you have not opened any of those this month, open one today. Your p95 will thank you.