The table market_trades has an integer column market_id, which takes values from 1 to 5. This seems like a pretty straightforward query to get the number of trades on each market:
WITH markets AS (
    SELECT unnest('{1,2,4}'::int[]) AS market_id
)
SELECT m.market_id, count(td.id) AS agg
FROM markets m
LEFT JOIN market_trades td
    ON td.market_id = m.market_id
GROUP BY m.market_id
ORDER BY m.market_id;
But it takes about 10 full seconds to run! I added an index on market_trades.market_id, but it did not speed up the query at all.
The market_trades table has about 30 million rows.
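For reference, the index was added with a statement along these lines. This is only a sketch: the index name market_trades_market_id_idx is illustrative (an assumption, not necessarily the exact name used), and the ANALYZE step is an assumption too, included to make sure the planner statistics are current after creating the index.

```sql
-- Plain B-tree index on the join column.
-- NOTE: the index name is hypothetical, chosen for illustration.
CREATE INDEX market_trades_market_id_idx
    ON market_trades (market_id);

-- Refresh planner statistics for the table (assumed step).
ANALYZE market_trades;
```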
Here is the output of EXPLAIN ANALYZE on the above query (on depesz):
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1142468.03..1142468.28 rows=100 width=8) (actual time=62052.515..62052.516 rows=3 loops=1)
  Sort Key: m.market_id
  Sort Method: quicksort Memory: 25kB
  CTE markets
    -> Result (cost=0.00..0.51 rows=100 width=0) (actual time=0.006..0.010 rows=3 loops=1)
  -> HashAggregate (cost=1142463.20..1142464.20 rows=100 width=8) (actual time=62052.502..62052.504 rows=3 loops=1)
        -> Hash Right Join (cost=3.25..992529.47 rows=29986746 width=8) (actual time=1.398..51289.914 rows=14297964 loops=1)
              Hash Cond: (td.market_id = m.market_id)
              -> Seq Scan on market_trades td (cost=0.00..580208.46 rows=29986746 width=8) (actual time=0.007..21659.649 rows=29985911 loops=1)
              -> Hash (cost=2.00..2.00 rows=100 width=4) (actual time=0.023..0.023 rows=3 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 1kB
                    -> CTE Scan on markets m (cost=0.00..2.00 rows=100 width=4) (actual time=0.010..0.017 rows=3 loops=1)
EDIT: Clearly the index on market_trades.market_id is not being used. Why not?