Under high load the server app stops responding to clients because BEGIN/COMMIT statements execute very slowly (some take 15 seconds). The queries are simple: insert, update two columns on one record by id, select a,b,c from d order by limit 50, etc.
pgAdmin shows a lot of locks and queries like:
set extra_float_digits=3; set ssl_renegotiation_limit=0; select 'npgsql12345';
DISCARD ALL
COMMIT
BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
(2000 granted locks and 0 not granted)
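To break those lock counts down further, a query along these lines could be run against the system views. It is only a minimal sketch, assuming the PostgreSQL 9.4 layout of pg_locks and pg_stat_activity (joined on pid); the column aliases are illustrative.

```sql
-- Count locks grouped by session state, lock mode and granted flag.
-- A large number of granted locks held by 'idle in transaction' sessions
-- would look very different from locks held by 'active' sessions.
SELECT a.state,
       l.mode,
       l.granted,
       count(*) AS locks
FROM pg_locks AS l
LEFT JOIN pg_stat_activity AS a ON a.pid = l.pid
GROUP BY a.state, l.mode, l.granted
ORDER BY locks DESC;
```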
After some time the server starts responding again and the locks disappear (only 10 remain), but within another minute those 2000 locks are back. The client app reports that the server app's response time goes from 600 ms (the network worker interval) to 60 seconds, then drops to ~4 seconds, and then climbs again.
There are ~5 actively used tables, some of them have up to 200k rows.
The server app uses NHibernate with Npgsql.
VACUUM/ANALYZE didn’t help.
This query takes up to 11 seconds when executed remotely:
BEGIN;
UPDATE users SET nickname = 'abc' WHERE id = 55455;
COMMIT;
But after a few attempts it executes in 160 ms.
The same query can also be slow without BEGIN/COMMIT, so I’m not sure if it’s really related to transactions.
But EXPLAIN ANALYZE executes very fast and shows an execution time of < 1 ms.
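For reference, a measurement of this kind can be taken without persisting the change by wrapping it in a transaction that is rolled back. This is a sketch, not necessarily the exact command used above. The BUFFERS option is an addition for illustration. Note that EXPLAIN ANALYZE really executes the UPDATE, and that the reported time covers only the statement itself, not the COMMIT.

```sql
BEGIN;
-- Executes the UPDATE for real and reports plan, timing and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE users SET nickname = 'abc' WHERE id = 55455;
-- Undo the change made by the analyzed statement.
ROLLBACK;
```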
Postgres version: 9.4.
Log:
2015-02-05 10:19:57 GMT LOG duration: 663.000 ms statement: COMMIT
2015-02-05 10:20:04 GMT LOG duration: 556.000 ms statement: INSERT INTO public.user_items (upgrade_stage, upgrade_started_at, upgrade_end_at, upgrade_quick_delivery_done, item_info_id, user_id) VALUES (((0)::int4), ((NULL)::timestamp), ((NULL)::timestamp), ((FALSE)::bool), ((6)::int4), ((126437)::int4)); select lastval()
2015-02-05 10:20:04 GMT LOG duration: 1143.000 ms statement: COMMIT
2015-02-05 10:20:04 GMT LOG duration: 805.000 ms statement: COMMIT
2015-02-05 10:20:04 GMT LOG duration: 805.000 ms statement: COMMIT
2015-02-05 10:20:04 GMT LOG duration: 788.000 ms statement: COMMIT
2015-02-05 10:20:06 GMT LOG duration: 758.000 ms statement: COMMIT
What can be the cause of the slowdown?
Update
Settings:
log_min_duration_statement = 500
work_mem = 32MB
max_connections = 600
shared_buffers = 1024MB
effective_cache_size = 2048MB
checkpoint_segments = 10
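The values actually in effect on the running server can be cross-checked against pg_settings. This is a minimal sketch that assumes PostgreSQL 9.4, where checkpoint_segments still exists.

```sql
-- Show the effective value, unit and origin of each setting listed above.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('log_min_duration_statement',
               'work_mem',
               'max_connections',
               'shared_buffers',
               'effective_cache_size',
               'checkpoint_segments')
ORDER BY name;
```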
Also I activated the stats extension.