I have several processes that aggregate statistics on items (how many times each item was shown, clicked, etc. – around 30 factors).
Every 5 minutes I flush the aggregated data to PostgreSQL (9.1).
I have 250K aggregated items with statistics.
Table key:
item_id, process_id (process_id is part of the key in order to avoid locking), channel, (2 more fields).
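For illustration, a minimal sketch of what such a table might look like (the names of the 2 extra key fields and the counter columns are hypothetical, and only a few of the ~30 counters are shown):

    CREATE TABLE statistics (
        item_id     bigint  NOT NULL,
        process_id  integer NOT NULL,
        channel     integer NOT NULL,
        extra_1     integer NOT NULL,            -- the 2 additional key fields (hypothetical names)
        extra_2     integer NOT NULL,
        counter_a   bigint  NOT NULL DEFAULT 0,  -- ~30 counter columns in total
        counter_b   bigint  NOT NULL DEFAULT 0,
        PRIMARY KEY (item_id, process_id, channel, extra_1, extra_2)
    );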
The process is like this (a SQL sketch of one iteration follows the steps):
open a transaction
try to update (most of the time the item already exists in the DB, so the update succeeds).
the update is done like this – update statistics set counter_a = counter_a + 1, … where item_id = x and channel = y and …;
insert if the update matched no row.
commit
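Roughly, one iteration for a single item would look like the sketch below. The literal key values are illustrative, only two of the ~30 counters and none of the 2 extra key fields are shown, and the NOT EXISTS guard stands in for the "did the update hit a row" check, which in the real process is presumably made in application code from the update's row count:

    BEGIN;

    -- first try to bump the counters of an existing row
    UPDATE statistics
       SET counter_a = counter_a + 1,
           counter_b = counter_b + 5          -- ... and the other counter columns
     WHERE item_id = 42 AND process_id = 7 AND channel = 3;

    -- if the UPDATE matched no row, insert a fresh one instead
    INSERT INTO statistics (item_id, process_id, channel, counter_a, counter_b)
    SELECT 42, 7, 3, 1, 5
    WHERE NOT EXISTS (
        SELECT 1 FROM statistics
         WHERE item_id = 42 AND process_id = 7 AND channel = 3
    );

    COMMIT;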
This takes 15 minutes for 250K updates.
Any advice?
I saw this question:
Optimizing bulk update performance in Postgresql
but the case there is not relevant to mine.
I can't drop the indexes because I need them for the updates.
Is there any way to avoid the MVCC overhead, given that this is the only process that writes the data, and I don't need isolation or even transactional guarantees in case of failure?
The DB machine runs on an Amazon "hi1.4xlarge" instance (64 GB RAM, 8 cores).