Channel: Question and Answer » postgresql

Using EXCEPTION to ignore duplicates during bulk inserts


In PostgreSQL 9.3.5, I’m importing records from an external source where duplicates are VERY rare, but they do happen. Given a readings table with a unique compound key on (real_time_device_id, recorded_at), the following will fail once in a blue moon:

INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
  ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
  ... many more records ...
  ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);

(FWIW, the above fails ‘properly’ with a duplicate key violation.)

I know that handling exceptions is expensive, but as I said, duplicate entries are very rare. So to keep the code simple, I followed an example given in Optimal way to ignore duplicate inserts?:

BEGIN
    INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
      ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
      ... many more records ...
      ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);
EXCEPTION WHEN unique_violation THEN
  -- silently ignore inserts
END;

The above gets two errors:

psql:sketches/t15.sql:11: ERROR:  syntax error at or near "INSERT"
LINE 2:         INSERT INTO readings (real_time_device_id, recorded_...
                ^
psql:sketches/t15.sql:14: ERROR:  syntax error at or near "EXCEPTION"
LINE 1: EXCEPTION WHEN unique_violation THEN
        ^

Can anyone set me straight on the correct syntax? Or is my error deeper than mere syntax? (For example, will all of the INSERTs be ignored if there is a single duplicate?)
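For what it’s worth, my current understanding is that `BEGIN ... EXCEPTION ... END` is PL/pgSQL block syntax, not plain SQL, so presumably the snippet would need to be wrapped in a `DO` block to even parse. An untested sketch of what I think that would look like:

```sql
DO $$
BEGIN
    INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
      ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
      -- ... many more records ...
      ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);
EXCEPTION WHEN unique_violation THEN
    -- swallow the error; note this discards the WHOLE failed INSERT,
    -- since a single multi-row INSERT is atomic
    NULL;
END;
$$;
```

If that’s right, it also seems to answer my parenthetical: one duplicate would abort the entire multi-row INSERT, and the handler would then silently drop every row in the batch, not just the duplicate.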

Updated question

As @dezso points out, there are a few problems with the code above. So (at the risk of having the question flagged as too general), what’s a good way to do bulk inserts where very few (< 0.1%) are duplicates?
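One approach I’m considering, sketched below and untested: load each batch into a staging table with no unique constraint, then insert only the rows that don’t already exist. The staging table name `readings_stage` is my own invention here, not anything from my schema:

```sql
-- Assumption: the raw batch is first loaded (via COPY or a multi-row
-- INSERT) into a staging table with the same columns as readings.
CREATE TEMP TABLE readings_stage (LIKE readings INCLUDING DEFAULTS);

-- ... load the batch into readings_stage here ...

-- Insert only rows whose compound key is not already in readings,
-- de-duplicating within the batch itself via DISTINCT ON.
INSERT INTO readings (real_time_device_id, recorded_at, duration)
SELECT DISTINCT ON (s.real_time_device_id, s.recorded_at)
       s.real_time_device_id, s.recorded_at, s.duration
FROM readings_stage s
WHERE NOT EXISTS (
    SELECT 1
    FROM readings r
    WHERE r.real_time_device_id = s.real_time_device_id
      AND r.recorded_at = s.recorded_at
);
```

I assume this still has a race if another session can insert the same keys concurrently between the NOT EXISTS check and the INSERT, but for a single-writer import that may not matter.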
