If I insert a single row into a Postgres database, it takes 18 ms. If I instead insert many rows with a single statement like this:
INSERT INTO contacts (numbers)
SELECT DISTINCT ARRAY[
    (random() * 99999999)::integer,
    (random() * 99999999)::integer
]
FROM generate_series(1, 4000000) AS x(id);
and I vary the number of rows inserted (by changing the upper bound of generate_series), the time grows nonlinearly. Here is the data:
- 1 record: 18 ms
- 20k records: 36 seconds
- 50k records: 151 seconds
- 100k records: 750 seconds
Why does the time grow so much faster than linearly? I need 10 million records in my database for load testing, and it seems faster to insert 50k rows and then insert another 50k than to insert 100k at once, since 151 + 151 < 750.
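If batching really is faster, my plan would be to load the 10 million rows in separately committed batches, roughly like this (a sketch assuming PostgreSQL 11 or newer, since COMMIT inside a DO block only works from that version; the batch size of 50k and the count of 200 are just picked from the numbers above):

DO $$
BEGIN
    -- 200 batches of 50,000 rows each, about 10 million rows in total
    FOR batch IN 1..200 LOOP
        INSERT INTO contacts (numbers)
        SELECT DISTINCT ARRAY[
            (random() * 99999999)::integer,
            (random() * 99999999)::integer
        ]
        FROM generate_series(1, 50000) AS x(id);
        COMMIT;  -- give each batch its own transaction (PostgreSQL 11+, outside an explicit transaction block)
    END LOOP;
END;
$$;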
Any insight on this topic would be appreciated. My assumption is that Postgres keeps the data needed to roll back in case the query critically fails or is cancelled by the user, so that it never "half inserts" the request.
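If that is the explanation, I would expect the cost to depend on where the transaction boundaries are, so a comparison like this (the same 100k rows, split into two 50k statements, with and without an explicit surrounding transaction) should show a difference:

-- Variant A: two 50k batches inside one explicit transaction
BEGIN;
INSERT INTO contacts (numbers)
SELECT DISTINCT ARRAY[
    (random() * 99999999)::integer,
    (random() * 99999999)::integer
]
FROM generate_series(1, 50000) AS x(id);

INSERT INTO contacts (numbers)
SELECT DISTINCT ARRAY[
    (random() * 99999999)::integer,
    (random() * 99999999)::integer
]
FROM generate_series(1, 50000) AS x(id);
COMMIT;

-- Variant B: run the same two INSERT statements without the BEGIN/COMMIT,
-- so that each one is committed as its own transaction.

Is that a fair way to test the hypothesis?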