
Streaming replication in PostgreSQL 9.3 using two different servers


Settings on the master server, in postgresql.conf:

max_wal_senders = 1
wal_level = 'archive'
archive_mode = on
archive_command = 'cd .'
wal_keep_segments = 10000

Settings on the slave server, in the recovery.conf file:

standby_mode = 'on'
primary_conninfo = 'host=ipaddress of master user=repuser'
trigger_file = '/tmp/postgresql.trigger.5432'

log_connections = on is set on both the master and the slave server.

An entry is made for the replication user in the pg_hba.conf file on the master server:

host     replication     repuser         ipaddress/32         trust

When trying to replicate, I get the following errors:

2014-07-14 19:28:22 IST LOG:  database system was shut down in recovery at 2014-07-14 19:28:21 IST
2014-07-14 19:28:22 IST LOG:  entering standby mode
2014-07-14 19:28:22 IST WARNING:  WAL was generated with wal_level=minimal, data may be missing
2014-07-14 19:28:22 IST HINT:  This happens if you temporarily set wal_level=minimal without taking a new base backup.
2014-07-14 19:28:22 IST LOG:  consistent recovery state reached at 0/19FFE28
2014-07-14 19:28:22 IST LOG:  record with zero length at 0/19FFE28
2014-07-14 19:28:22 IST FATAL:  database system identifier differs between the primary and standby
2014-07-14 19:28:22 IST DETAIL:  The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.
2014-07-14 19:28:23 IST LOG:  connection received: host=[local]
2014-07-14 19:28:23 IST FATAL:  the database system is starting up
2014-07-14 19:28:24 IST LOG:  connection received: host=[local]
2014-07-14 19:28:24 IST FATAL:  the database system is starting up
2014-07-14 19:28:25 IST LOG:  connection received: host=[local]
2014-07-14 19:28:25 IST FATAL:  the database system is starting up
2014-07-14 19:28:26 IST LOG:  connection received: host=[local]
2014-07-14 19:28:26 IST FATAL:  the database system is starting up
2014-07-14 19:28:27 IST LOG:  connection received: host=[local]
2014-07-14 19:28:27 IST FATAL:  the database system is starting up
2014-07-14 19:28:28 IST LOG:  connection received: host=[local]
2014-07-14 19:28:28 IST FATAL:  the database system is starting up
2014-07-14 19:28:29 IST LOG:  connection received: host=[local]
2014-07-14 19:28:29 IST FATAL:  the database system is starting up
2014-07-14 19:28:30 IST LOG:  connection received: host=[local]
2014-07-14 19:28:30 IST FATAL:  the database system is starting up
2014-07-14 19:28:31 IST LOG:  connection received: host=[local]
2014-07-14 19:28:31 IST FATAL:  the database system is starting up
2014-07-14 19:28:32 IST LOG:  connection received: host=[local]
2014-07-14 19:28:32 IST FATAL:  the database system is starting up
2014-07-14 19:28:33 IST LOG:  connection received: host=[local]
2014-07-14 19:28:33 IST FATAL:  the database system is starting up
2014-07-14 19:28:34 IST LOG:  connection received: host=[local]
2014-07-14 19:28:34 IST FATAL:  the database system is starting up
2014-07-14 19:28:35 IST LOG:  connection received: host=[local]
2014-07-14 19:28:35 IST FATAL:  the database system is starting up
2014-07-14 19:28:36 IST LOG:  connection received: host=[local]
2014-07-14 19:28:36 IST FATAL:  the database system is starting up
2014-07-14 19:28:37 IST LOG:  connection received: host=[local]
2014-07-14 19:28:37 IST FATAL:  the database system is starting up
2014-07-14 19:28:37 IST FATAL:  database system identifier differs between the primary and standby
2014-07-14 19:28:37 IST DETAIL:  The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.
2014-07-14 19:28:38 IST LOG:  connection received: host=[local]
2014-07-14 19:28:38 IST FATAL:  the database system is starting up
2014-07-14 19:28:39 IST LOG:  connection received: host=[local]

2014-07-14 19:28:37 IST FATAL:  database system identifier differs between the primary and standby
2014-07-14 19:28:37 IST DETAIL:  The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.

What do these lines mean?

When I run this command:

service postgresql-9.3 start

I get the following error in the startup.log file:

2014-07-15 11:25:59 IST FATAL:  lock file "postmaster.pid" already exists
2014-07-15 11:25:59 IST HINT:  Is another postmaster (PID 25961) running in data directory "/opt/postgres/PostgreSQL/9.3/data"?

Extending the question

We are new to open-source PostgreSQL and couldn't understand the answer to this question; please explain briefly if possible. We have followed the instructions from the book 'PostgreSQL 9 Administration Cookbook', specifically these steps:

Carry out the following steps:

  1. Identify your Master and Standby nodes, and ensure that they have been configured according to the best practice recipe.

  2. Configure replication security. Create or confirm the existence of the replication user on the Master node:

    CREATE USER repuser 
    SUPERUSER 
    LOGIN 
    CONNECTION LIMIT 1 
    ENCRYPTED PASSWORD 'changeme'; 
    
  3. Allow the replication user to authenticate. The following example allows access from any ip address using encrypted password authentication; you may wish to consider more restrictive options. Add the following line:

    host replication repuser 127.0.0.1/0 md5 
    
  4. Set logging options in postgresql.conf on both Master and Standby, so that you get increased information regarding replication connection attempts and associated failures.

    log_connections = on
    
  5. Set max_wal_senders on Master in postgresql.conf, or increment if the value is already non-zero.

    max_wal_senders = 1 
    wal_level = 'archive'
    archive_mode = on 
    archive_command = 'cd .' 
    
  6. Adjust wal_keep_segments on Master in postgresql.conf. Set this to a value no higher than the amount of free space on the drive on which the pg_xlog directory is mounted, divided by 16MB. If pg_xlog isn't mounted on a separate drive, then don't assume all of the current free space is available for transaction log files.

    wal_keep_segments = 10000 # e.g. 160 GB 
    
  7. Adjust hot Standby parameters if required (see later recipe)

  8. Take a base backup, very similar to the process for taking a physical backup as described in the backup chapter.

    a. Start the backup

        psql -c "select pg_start_backup('base backup for streaming rep')" 
    

    b. Copy the data files (excluding the pg_xlog directory)

        rsync -cva --inplace --exclude=*pg_xlog* \
        ${PGDATA}/ $STANDBYNODE:$PGDATA
    

    c. Stop the backup

       psql -c "select pg_stop_backup(), current_timestamp" 
    
  9. Set the recovery.conf parameters on the Standby. Note that primary_conninfo must not specify a database name, though it can contain any other PostgreSQL connection option. Note also that all options in recovery.conf are enclosed in quotes, whereas postgresql.conf parameters need not be.

    standby_mode = 'on'
    primary_conninfo = 'host=192.168.0.1 user=repuser' 
    trigger_file = '/tmp/postgresql.trigger.5432' 
    
  10. Start the Standby server.

  11. Carefully monitor replication delay until the catchup period is over. During the initial catchup period, the replication delay will be much higher than we would normally expect it to be. You are advised to set hot_standby = off for the initial period only.
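
Once the standby is running, a quick way to confirm that streaming is actually established is to query the pg_stat_replication view on the master (a sketch; the view exists in 9.3 and shows one row per connected WAL sender):

    -- Run on the master; an empty result means no standby is streaming.
    SELECT client_addr, state, sent_location, replay_location
      FROM pg_stat_replication;

If this returns no rows while the standby logs "database system identifier differs between the primary and standby", the standby's data directory was not cloned from this master (for example, it was created by a separate initdb rather than the base backup in step 8), so the base backup step needs to be repeated.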


PostgreSQL: set default psql client encoding


When I connect to a Postgresql DB using psql, I often get these messages:

=> SELECT * FROM question_view ;
ERROR:  character with byte sequence 0xd7 0x9e in encoding "UTF8" has no equivalent in encoding "LATIN1"

Following this SO answer, I understand that I should change the client encoding accordingly:

SET client_encoding = 'UTF8';

Changing the client_encoding every time I connect to the DB is cumbersome. Is there a way to permanently configure this setting, either in the .pgpass file or anywhere else?
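
There are a few standard places to persist this; .pgpass itself only stores passwords, so it cannot help here. A sketch (the role name is a placeholder):

    -- Server side, per role: applies to every future session for this user.
    ALTER ROLE myuser SET client_encoding TO 'UTF8';

    # Client side, via the libpq environment variable (e.g. in ~/.bashrc):
    export PGCLIENTENCODING=UTF8

    # Or in ~/.psqlrc, which psql executes at every interactive startup:
    \encoding UTF8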

Mixed search in multiple columns


I have a table which has a particular column of type JSON, where some user’s contact information are stored. I can alter the table and add new columns with cached values to increase search performance if needed (this is a new system, which is in the state of designing/at the beginning of implementation).

The top-level keys in the JSON are ‘name’ (both first and last name as one value), ‘address’ (a hash containing street, state, zip etc) and ‘other’ (tel, email, www etc). These are the three categories of information we are storing of customers.

One property of this JSON column is, that I really can’t tell what keys will be there (that’s why it is a JSON, so it remains flexible in this sense), but I need to search all the values.

My main problem is that I’ve been told to implement a search functionality with one single input (like Google), which would search every value, even combined (so if you search ‘John Smith Slovakia’, it would find customers named John Smith who live in Slovakia).

Another required property is that it has to support partial searches, i.e. if you want to find every customer with the last name Smith who lives in a street named ‘Somelongname’, it should be enough to type ‘smith somelong’ and it would still find them.

I’ve looked into full text search and it looked fine, except it did not really support partial searches.

Is there a better, more efficient solution than searching every input word separately across all the values using LIKE '%search_token%' and then merging the results for every search token into the final result?

BTW I’m using PostgreSQL.
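
For the partial-match requirement, one common approach is a trigram index from the pg_trgm extension, which can accelerate LIKE/ILIKE '%...%' searches. A sketch, assuming a hypothetical cached text column (here called search_text on a customers table) holding all the searchable JSON values concatenated together:

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    -- Assumed cached column, maintained alongside the JSON data.
    ALTER TABLE customers ADD COLUMN search_text text;

    CREATE INDEX customers_search_trgm_idx ON customers
    USING gin (search_text gin_trgm_ops);

    -- Every token the user types must match somewhere in the values.
    SELECT * FROM customers
    WHERE search_text ILIKE '%smith%'
      AND search_text ILIKE '%somelong%';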

PostgreSQL – If I run multiple queries concurrently, under what circumstances would I see a speedup? Under what circumstances would I see a slowdown?


I approach you all humbly as one who is NOT a DBA, and I’m sure that my question is fraught with conceptual shortcomings and “it depends on” land mines. I’m also pretty sure that all of you who choose to answer are going to want a lot more in the way of specifics than I can currently deliver.

That said, I’m curious about the following scenario in general:

  • Say that I have two non-trivial queries.
  • Query 1 requires 2 minutes to complete on average.
  • Query 2 requires 5 minutes to complete on average.

If I run them serially, one right after the other, I’m expecting it will require 7 minutes to complete on average. Is this reasonable?

More than that, however, what if I run the two queries concurrently? Two separate connections at the same time.

  • Under what conditions would I expect to see a speedup? (Total time < 7 minutes)
  • Under what conditions would I expect to see a slowdown? (Total time > 7 minutes)

Now, if I had 1,000 non-trivial queries running concurrently, I have a hunch that it would result in an overall slowdown. In that case, where would the bottleneck likely be? Processor? RAM? Drives?

Again, I know it’s probably impossible to answer the question precisely without knowing specifics (which I don’t have.) I’m looking for some general guidelines to think about when asking the following questions:

  • Under what circumstances do concurrent queries result in an overall speedup?
  • Under what circumstances do concurrent queries result in an overall slowdown?

Binary insert into Postgis geography column results in value being inserted as geometry


When writing WKB as a value (and as a binary parameter) to a geography column, the persisted value is not geography.

Example Java is also available here: https://github.com/ayuudee/issue-pad/blob/master/src/com/jesusthecat/im/pggeog/BinaryGeogTest.java

Here’s what it does:

  1. Creates a table with an ID and Geography(Point, 4326)
  2. inserts a row using WKT.
  3. inserts a row using WKB (as bytes).
  4. Prints out id, point, and ST_Summary(point).

SQL result of step #4 is:

 id |                         pt                         | st_summary 
----+----------------------------------------------------+------------
  1 | 0101000020E610000009C6C1A5E3E662406BB75D68AEED40C0 | Point[GS]
  2 | 0101000020E610000009C6C1A5E3E662406BB75D68AEED40C0 | Point[S]

Example Log of the binary insert (where pt is Geography(Point, 4326))

LOG: execute <unnamed>: insert into px(pt) values($1)

DETAIL: parameters: $1 = 'x0101000020e610000009c6c1a5e3e662406bb75d68aeed40c0'

You’ll notice that:

  1. The WKB for both #1 and #2 are the same; and
  2. The flags in the ST_Summary result for #1 are [GS], where for #2 (the binary) they’re [S].

The documentation for ST_Summary would indicate that #2 has spatial information, but is not geodetic (i.e. not geography).

I’m writing a Java library that seeks to persist geography as binary, but this would seem to indicate that it isn’t possible. Also, is it normal that this should be allowed to happen anyway (i.e. writing a value to a geography column that’s not geography)?
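
One thing worth testing (a sketch; px and pt are the table and column from the log above) is to make the geography interpretation explicit in the SQL instead of relying on the binary parameter being inferred as geography:

    -- Parse the WKB parameter explicitly as geography, not geometry.
    INSERT INTO px(pt) VALUES (ST_GeogFromWKB($1));

    -- Verify: the G flag should now appear in the summary.
    SELECT ST_Summary(pt) FROM px;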

Versions:

POSTGIS=”2.1.3 r12547″ GEOS=”3.3.3-CAPI-1.7.4″ PROJ=”Rel. 4.7.1, 23 September 2009″ GDAL=”GDAL 1.9.0, released 2011/12/29″ LIBXML=”2.7.8″ LIBJSON=”UNKNOWN” TOPOLOGY RASTER

POSTGRES PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

Would appreciate any help.

How to change schema so that account_id reference is unique among 3 tables


Original image: [schema diagram omitted]

Updated to have more-correct terminology and an ‘is_debit’ column: [schema diagram omitted]

I am designing a (PostgreSQL) schema for a lottery website that uses double-entry accounting.

‘jackpots’, ‘users’, and ‘house’ are each tables which reference an account_id, which in turn is used in the journal to record a credit or debit of money ‘amount’.

I want to ensure that the account_id is only in one table, and that it is unique in that table.

What is the best way to refactor the tables so that account_id is guaranteed to appear only once across all of them?

What I’ve considered:

  1. Creating another table, ‘account_type’, that has different codes for user accounts, jackpot accounts, and the house’s account. Then add two columns to ‘accounts’ (‘account_type’ and ‘ref_id’, the latter of which will reference the id of ‘jackpots’, ‘users’, or ‘house’). This seems somewhat inelegant, since I wouldn’t know how to link the table type to the table name using SQL alone; see the sketch after this list. I don’t mind using triggers, though, if this is the only way to an elegant, foolproof solution.

  2. Looking for some sort of built in constraint that says account_id cannot be used more than once among 3 tables. Would guess that this doesn’t exist, however.
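
Option 1 is essentially the standard supertype/subtype pattern. A minimal sketch (table and column names assumed, simplified from the diagrams); note that the UNIQUE foreign keys only prevent reuse within each table, so uniqueness across the three tables still needs a trigger or a type discriminator on the accounts table:

    CREATE TABLE accounts (
        account_id serial PRIMARY KEY
    );

    -- A UNIQUE foreign key means an account can be referenced
    -- at most once per owning table.
    CREATE TABLE users (
        user_id    serial PRIMARY KEY,
        account_id int NOT NULL UNIQUE REFERENCES accounts (account_id)
    );

    CREATE TABLE jackpots (
        jackpot_id serial PRIMARY KEY,
        account_id int NOT NULL UNIQUE REFERENCES accounts (account_id)
    );

    CREATE TABLE house (
        house_id   serial PRIMARY KEY,
        account_id int NOT NULL UNIQUE REFERENCES accounts (account_id)
    );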

Help is appreciated!

PostgreSQL 9.3: Primary key violated by a trigger INSERT


My problem

Consider a table t with many frequent updates from users, from which only the last few are relevant.

In order to keep the table size reasonable, whenever a new row is inserted, old rows from the same user_id are deleted. In order to keep an archive, the row is also written to t_history.

Both t and t_history have the same schema, in which id is a bigserial with a primary key constraint.

Implementation

Stored procedure

CREATE FUNCTION update_t_history()
RETURNS trigger
AS
$$
declare
BEGIN
    -- Insert the row to the t_history table. `id` is autoincremented
    INSERT INTO t_history (a, b, c, ...)
    VALUES (NEW.a, NEW.b, NEW.c, ...);

    -- Delete old rows from the t table, keep only the newest 9 (OFFSET 9)
    DELETE FROM t WHERE id IN (
                  SELECT id FROM t 
                  WHERE user_id = NEW.user_id 
                  ORDER BY id DESC
                  OFFSET 9);
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;

Corresponding insertion trigger:

CREATE TRIGGER t_insertion_trigger
AFTER INSERT ON t
FOR EACH ROW
EXECUTE PROCEDURE update_t_history();

The error

The trigger works well, but when I run a few dozen insertions in a single transaction, I get the following error:

BEGIN
ERROR:  duplicate key value violates unique constraint "t_history_pkey"
DETAIL:  Key (id)=(196) already exists.

Updates

  • The id field in both tables (from \d+ t):
    • id|bigint|not null default nextval('t_id_seq'::regclass)
    • "t_pkey" PRIMARY KEY, btree (id)
  • PostgreSQL version is 9.3.

Any idea why the stored procedure breaks the primary key constraint in transactions?
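
One diagnostic worth running (a sketch; ‘t_history_id_seq’ is an assumed sequence name, and note that the \d output above shows ‘t_id_seq’ for both tables, which is itself worth checking, since it would mean the two tables share one sequence) is to compare each id default against the existing rows, because a shared or lagging sequence produces exactly this duplicate-key pattern:

    -- Show the actual DEFAULT expression of each table's columns.
    SELECT c.relname, pg_get_expr(d.adbin, d.adrelid) AS default_expr
    FROM pg_attrdef d
    JOIN pg_class c ON c.oid = d.adrelid
    WHERE c.relname IN ('t', 't_history');

    -- If the history sequence lags behind max(id), resynchronize it:
    SELECT setval('t_history_id_seq', (SELECT max(id) FROM t_history));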

Add minutes contained in a column to a date


In CartoDB I want to add minutes contained in a column of data type number to a date.
e.g.

SELECT my_initial_date + 'xx minute'::INTERVAL as my_date FROM table

What is working is:

SELECT my_initial_date + '7 minute'::INTERVAL as my_date FROM table

This adds 7 minutes to each my_initial_date.
What I want to do is the same but with a variable number of minutes stored in another column, e.g.:

SELECT my_initial_date + 'column_name minute'::INTERVAL as my_date FROM table

This doesn’t work in CartoDB! How can I do this?
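
The usual fix (a sketch, using the table and column names from the example) is to build the interval from the column's value rather than embedding the column name inside a string literal:

    -- Concatenate the number with a unit, then cast the result:
    SELECT my_initial_date + (column_name || ' minute')::interval AS my_date
    FROM table;

    -- Equivalent: multiply a fixed one-minute interval by the number:
    SELECT my_initial_date + column_name * interval '1 minute' AS my_date
    FROM table;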


osm2pgsql missing coordinates


I am quite new to GIS and OSM and am trying to get my head around exporting custom OSM regions and uploading them into PostgreSQL.

I have started with a very small map area to export, with only two shapes, an island and a meadow (http://www.openstreetmap.org/#map=19/43.86436/15.34266), which exports the following .osm data:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="CGImap 0.3.3 (23630 thorn-02.openstreetmap.org)" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
    <bounds minlat="43.8635800" minlon="15.3411900" maxlat="43.8650000" maxlon="15.3445500"/>
    <node id="63910084" visible="true" version="3" changeset="17709161" timestamp="2013-09-06T20:07:53Z" user="idelac" uid="183676" lat="43.8644519" lon="15.3432012"/>
    <node id="63930603" visible="true" version="3" changeset="17709161" timestamp="2013-09-06T20:07:54Z" user="idelac" uid="183676" lat="43.8642369" lon="15.3431567"/>
    <node id="63968158" visible="true" version="3" changeset="17709161" timestamp="2013-09-06T20:07:57Z" user="idelac" uid="183676" lat="43.8645365" lon="15.3424617"/>
    <node id="1122490202" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:44Z" user="idelac" uid="183676" lat="43.8640377" lon="15.3426171"/>
    <node id="63879158" visible="true" version="3" changeset="17709161" timestamp="2013-09-06T20:07:52Z" user="idelac" uid="183676" lat="43.8639894" lon="15.3430113"/>
    <node id="2448417703" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8641440" lon="15.3427266"/>
    <node id="2448417704" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8641542" lon="15.3426165"/>
    <node id="2448417706" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8642112" lon="15.3428451"/>
    <node id="2448417707" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8642458" lon="15.3424979"/>
    <node id="2448417708" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8643394" lon="15.3428931"/>
    <node id="2448417709" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8644045" lon="15.3425008"/>
    <node id="2448417711" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8645022" lon="15.3428874"/>
    <node id="2448417712" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8645144" lon="15.3425403"/>
    <node id="2448417713" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8645937" lon="15.3427745"/>
    <node id="2448417714" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:06:52Z" user="idelac" uid="183676" lat="43.8645978" lon="15.3426673"/>
    <node id="1122490153" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:43Z" user="idelac" uid="183676" lat="43.8645802" lon="15.3431461"/>
    <node id="1122490171" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:43Z" user="idelac" uid="183676" lat="43.8646997" lon="15.3426923"/>
    <node id="1122490322" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:44Z" user="idelac" uid="183676" lat="43.8647101" lon="15.3429354"/>
    <node id="1122490460" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:45Z" user="idelac" uid="183676" lat="43.8641347" lon="15.3424630"/>
    <node id="1122490524" visible="true" version="2" changeset="17709161" timestamp="2013-09-06T20:07:45Z" user="idelac" uid="183676" lat="43.8642836" lon="15.3424149"/>
    <way id="8837893" visible="true" version="3" changeset="7098380" timestamp="2011-01-26T21:37:50Z" user="goranT" uid="217969">
        <nd ref="63879158"/>
        <nd ref="63930603"/>
        <nd ref="63910084"/>
        <nd ref="1122490153"/>
        <nd ref="1122490322"/>
        <nd ref="1122490171"/>
        <nd ref="63968158"/>
        <nd ref="1122490524"/>
        <nd ref="1122490460"/>
        <nd ref="1122490202"/>
        <nd ref="63879158"/>
        <tag k="natural" v="coastline"/>
        <tag k="place" v="island"/>
        <tag k="source" v="PGS"/>
    </way>
    <way id="236881851" visible="true" version="1" changeset="17709161" timestamp="2013-09-06T20:07:26Z" user="idelac" uid="183676">
        <nd ref="2448417709"/>
        <nd ref="2448417707"/>
        <nd ref="2448417704"/>
        <nd ref="2448417703"/>
        <nd ref="2448417706"/>
        <nd ref="2448417708"/>
        <nd ref="2448417711"/>
        <nd ref="2448417713"/>
        <nd ref="2448417714"/>
        <nd ref="2448417712"/>
        <nd ref="2448417709"/>
        <tag k="landuse" v="meadow"/>
    </way>
</osm>

Running osm2pgsql -S default.style -d gis -U postgres island.osm uploads the data into my gis database with one row in the planet_osm_line table (for the island) and one row in the planet_osm_polygon table (for the meadow).

Now, when issuing the SQL query SELECT ST_AsText(way) FROM planet_osm_line, psql returns

"LINESTRING(1707992.39 5444456.51,1707997.34 5444489.7,1707915.02 5444502.76,1707932.32 5444425.75)"

Here I don’t understand two things.

Questions

  1. Why does the Linestring only contain four points when the original coastline of the island has 10 distinct nodes? Similarly when issuing the sql query on the planet_osm_polygon table to see the polygon for the meadow shape, the polygon has fewer coordinates than its original shape described by the nodes (7 instead of 11).
  2. (answered just below in update) How are the original lat,lng coordinates projected? I know osm2pgsql defaults to mercator projection. However, I don’t understand how it computed the values in the Linestring above. Is the width of a tile (256px?) and some zoom (z=?) taken into account in the projection? Would LatLng[0,0] be projected as a Point[0,0]?

Update

The answer to my second question is that osm2pgsql projects from latitude and longitude coordinates to mercator in meters. LatLng[0,0] will indeed be projected as Point[0,0] where the point coordinates are the distance in meters from the origin on a mercator plane.

Here is the projection function, in case someone stumbles upon this question and looks for a solution:

var degrees2meters = function(lon, lat) {
        // Web Mercator x is linear in longitude: 20037508.34 m
        // (half the equatorial circumference) corresponds to 180 degrees.
        var x = lon * 20037508.34 / 180;
        // Mercator y: ln(tan(45 deg + lat/2)) gives a value in radians,
        // converted to degrees, then scaled to meters like x.
        var y = Math.log(Math.tan((90 + lat) * Math.PI / 360)) / (Math.PI / 180);
        y = y * 20037508.34 / 180;
        return [x, y];
}

I still don’t know why some of the nodes are missing in the resulting shapes in the database. Has anyone experienced this?

PostgreSQL replication with fsync disabled?


I need to run PostgreSQL in-memory (for performance reasons), so I intend to disable fsync, which means that WAL writes are no longer flushed to durable storage.

As part of my scheme to meet another requirement (that the in-memory database have somewhere to recover from when volatile memory is lost), I would like to stream or otherwise push writes to a replica. However, the PostgreSQL hot standby capability is based on the WAL, so clearly I can’t use it.

How could I achieve these goals using PostgreSQL features?

Thanks.

Adding standard set of constraints (rules) to PostGIS `raster` type column


Following on from my related prior question, let me elaborate on this topic a bit.

raster2pgsql is a raster loader executable that loads GDAL-supported raster formats into PostGIS. It has a -C flag defined as follows:

gislinux@gislinux-Precision-M4600:~$ raster2pgsql

Output:

-C  Set the standard set of constraints on the raster
column after the rasters are loaded. Some constraints may fail
if one or more rasters violate the constraint.

When I am importing my raster file like this:

gislinux@gislinux-Precision-M4600:~$ raster2pgsql -d -I -C -M -F -t 100x100 -s 4326 \
us_tmin_2012.01.asc chp05.us_tmin_new | psql -h localhost -p 5432 -U postgres -d pgrouting

Output:

ANALYZE
NOTICE:  Adding SRID constraint
CONTEXT:  PL/pgSQL function addrasterconstraints line 53 at RETURN
NOTICE:  Adding scale-X constraint

A few constraints have been applied to this new table by the -C flag:

pgrouting=# \d+ chp05.us_tmin_new

Output:

Indexes:
"us_tmin_new_pkey" PRIMARY KEY, btree (rid)
"us_tmin_new_rast_gist" gist (st_convexhull(rast))
Check constraints:
"enforce_height_rast" CHECK (st_height(rast) = ANY (ARRAY[100, 21]))
"enforce_max_extent_rast" CHECK (st_coveredby(st_convexhull(rast),

The standard constraints comprise the following rules:

  1. Width and height: This rule states that all the rasters must have the same width and height.
  2. Scale X and Y: This rule states that all the rasters must have the same scale X and Y.
  3. SRID: This rule states that all rasters must have the same SRID.
  4. Same alignment: This rule states that all rasters must be aligned to one another.
  5. Maximum extent: This rule states that all rasters must be within the table’s
    maximum extent.
  6. Number of bands: This rule states that all rasters must have the same number
    of bands.
  7. NODATA values: This rule states that all raster bands at a specific index must have the same NODATA value.
  8. Out-db: This rule states that all raster bands at a specific index must be in-db or out-db, not both.
  9. Pixel type: This rule states that all raster bands at a specific index must be of the same pixel type.

Now, in order to run the ST_MapAlgebra function, I had to drop these standard constraints individually, which I did using:

ALTER TABLE chp05.us_tmin_new DROP CONSTRAINT enforce_scalex_rast

in the pgAdmin SQL Editor for each of those standard constraints. But now I don’t know how to bring these standard constraints back. The following command is not working:

ALTER TABLE chp05.us_tmin_new ADD CONSTRAINT enforce_scalex_rast unique (rast);

and it gives the following error:

ERROR:  data type raster has no default operator class for access method "btree"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.
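
Two points are worth noting here. First, the standard constraints are CHECK constraints (as the \d+ output above shows), not UNIQUE constraints, which is why the btree error appears. Second, rather than re-adding each rule by hand, the helper function the loader itself used (visible as addrasterconstraints in the NOTICE output above) can reapply the whole set; a sketch:

    -- Reapply the full standard constraint set on schema.table.column:
    SELECT AddRasterConstraints('chp05'::name, 'us_tmin_new'::name, 'rast'::name);

DropRasterConstraints is the symmetric helper for removing them all at once instead of dropping each constraint individually.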

Need some help with this query, or some ideas for changing it to a better model


I have a dilemma in my application and I don’t know how to solve it at the DB level. The idea is that a user can be subscribed to one or more services, but services are shown depending on the user type. This may sound tricky, but it isn’t. I have a form where I pick which type of user I’m registering, but then from the services table I need to show only the services allowed for that type of user. I made this table:

CREATE TABLE "nomencla"."service" (
    "id" int4 NOT NULL,
    "name" varchar(80) COLLATE "default" NOT NULL,
    "active" bool,
    "cedulabenefsigesp" varchar(10) COLLATE "default" NOT NULL,
    "user_type" text COLLATE "default",
    CONSTRAINT "tipo_servicio_pkey" PRIMARY KEY ("id")
) WITH (OIDS=FALSE);

ALTER TABLE "nomenclator"."service" OWNER TO "postgres";
COMMENT ON COLUMN "nomenclator"."service"."user_type" IS '(DC2Type:array)';

In the user_type column I stored the allowed user types for each service as serialized values, e.g.:

1   Service1    t   0000000039  a:1:{i:0;s:1:"1";}
2   Service2    t   0000000040  a:3:{i:0;s:1:"2";i:2;s:1:"3";i:3;s:1:"4";}

Row 1 means that Service1 will be available to users of type 1; row 2 means that Service2 will be available to users of types 2, 3 and 4 only. So in the registration form for users of type 1 I should only show the Service1 option, and for users of type 2, 3 or 4 only Service2 will be shown. I need some help building a query to get that data, or help changing my model to a better solution for this problem. Any ideas?
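
Serialized arrays cannot be queried or constrained by the database, so the usual remodelling (a sketch; the join-table and column names are assumed) is a table linking services to the user types allowed to see them:

    CREATE TABLE "nomenclator"."service_user_type" (
        service_id   int4 NOT NULL REFERENCES "nomenclator"."service" (id),
        user_type_id int4 NOT NULL,
        PRIMARY KEY (service_id, user_type_id)
    );

    -- Services to show in the registration form for a type-3 user:
    SELECT s.*
    FROM "nomenclator"."service" s
    JOIN "nomenclator"."service_user_type" sut ON sut.service_id = s.id
    WHERE sut.user_type_id = 3;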

File database vs application server database for concurrent performance & bulk data

$
0
0

I asked a question about Ripple’s database implementation, and received this response:

The ripple server uses SQLite for structured data and a configurable “back end” for unstructured “bulk” storage.

The structured data consists of things like transactions indexed by which accounts they affected. The unstructured data consists of “chunks” of data indexed by hash that constitute portions of network history.

The preferred back end for bulk storage is currently RocksDB on Linux platforms.

This strikes me as strange since Ripple’s structure allows the developers to place almost any demand they wish upon the server operators. In other words, why not use a database server, specifically PostgreSQL?

I found this interesting breakdown of PostgreSQL vs SQLite, and this explanation:

It breaks down to how they implement snapshot isolation.

SQLite uses file locking as a means to isolate transactions, allowing writes to hit only once all reads are done.

Postgres, in contrast, uses a more sophisticated approach called multiversion concurrency control (MVCC) that allows multiple writes to occur in parallel with multiple reads.

First, is it true that the ideal implementation for bulk storage is to use a file database?

Second, is it true that for concurrent reads & writes, PostgreSQL vastly outperforms a file database?

Lastly, when tables approach billions of rows in length, for concurrent performance, is a file database or PostgreSQL superior?

Please assume both alternatives are ideally tuned.

Creating a unique constraint on a PostGIS 'raster' type column


I am using the following command to add a constraint to one of my raster images in PostGIS 2.1.3 (PostgreSQL 9.1.14).

ALTER TABLE schema1.table1 ADD CONSTRAINT enforce_scalex_rast unique (rast);

But I get the following error:

ERROR:  data type raster has no default operator class for access method "btree"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

Could someone kindly help me fix this error? I have no basic idea about operator classes. Thanks.

Zia.

How do I ORDER BY typical software release versions like X.Y.Z?


Given a “SoftwareReleases” table:

| id | version |
|  1 | 0.9     |
|  2 | 1.0     |
|  3 | 0.9.1   |
|  4 | 1.1     |
|  5 | 0.9.9   |
|  6 | 0.9.10  |

How do I produce this output?

| id | version |
|  1 | 0.9     |
|  3 | 0.9.1   |
|  5 | 0.9.9   |
|  6 | 0.9.10  |
|  2 | 1.0     |
|  4 | 1.1     |
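
A common way to get this ordering (a sketch; it assumes every version value is strictly dot-separated integers) is to split the string into an integer array, which PostgreSQL then compares element by element:

    SELECT id, version
    FROM "SoftwareReleases"
    ORDER BY string_to_array(version, '.')::int[];

With array comparison, a shorter array that is a prefix of a longer one sorts first, so 0.9 < 0.9.1 < 0.9.9 < 0.9.10 < 1.0 < 1.1 as required.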

Split two rows into two columns


I have the following table:

id   | name   | action | count
------------------------------
 1   | Value1 |    0   | 27
 1   | Value1 |    1   | 49
 2   | Value2 |    0   | 7
 2   | Value2 |    1   | 129
 3   | Value3 |    0   | 9
 3   | Value3 |    1   | 7

I need to make the ‘action’ column appear twice, with the count value of each line in it, something like this:

id   | name   | action1 | action2
---------------------------------
 1   | Value1 |    27   | 49
 2   | Value2 |    7    | 129
 3   | Value3 |    9    | 7

How can I do this? Here’s my script:

SELECT m.id,
    t.name,
    m.action,
    count(m.id) AS count
FROM table1 m
LEFT JOIN table2 t ON (m.id = t.id)
WHERE m.status != '2'
GROUP BY m.id,
    t.name,
    m.action
ORDER BY 1, 3
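
A standard way to pivot the two action values into columns is conditional aggregation; a sketch built on the query above (it assumes action only takes the values 0 and 1):

    SELECT m.id,
        t.name,
        count(CASE WHEN m.action = 0 THEN 1 END) AS action1,
        count(CASE WHEN m.action = 1 THEN 1 END) AS action2
    FROM table1 m
    LEFT JOIN table2 t ON (m.id = t.id)
    WHERE m.status != '2'
    GROUP BY m.id, t.name
    ORDER BY 1;

On PostgreSQL 9.4 and later the same thing can be written more readably as count(*) FILTER (WHERE m.action = 0).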

Index for finding an element in a JSON Array in PostgreSQL (with multiple JSON fields)


Original question of what this is based on: http://stackoverflow.com/questions/18404055/index-for-finding-an-element-in-a-json-array

This works fine if you only want simple matches. Suppose tracks have both an "artist" and a "title" field in the JSON data. So we have something like

INSERT INTO tracks (id, data)  VALUES (1, '[{"artist": "Simple Plan", "title": "Welcome to My Life"}]');

We create the index like this (similar to the original question)

CREATE INDEX tracks_artists_gin_idx ON tracks
USING GIN (json_val_arr(data, 'artist'));

CREATE INDEX tracks_title_gin_idx ON tracks
USING GIN (json_val_arr(data, 'title'));

So now we have two fields to match. As you can see, if we perform the original query (with very naive modifications) of:

SELECT *
FROM   tracks
WHERE  '{"ARTIST NAME"}'::text[] <@ (json_val_arr(data, 'artist'))
AND '{"TITLE"}'::text[] <@ (json_val_arr(data, 'title'))

This will give the wrong answer because the indices of the array of artist and title in the JSON array do not have to match for this query to match something in the JSON. What is the proper way of doing this query so we can get the exact match we need? Does json_val_arr need to be changed?

Edit: Why this is wrong

Suppose our table has records like

INSERT INTO tracks (id, data)  VALUES (1, '[{"artist": "Simple Plan", "title": "Welcome to My Life"}]');
INSERT INTO tracks (id, data)  VALUES (2, '[{"artist": "Another Artist", "title": "Welcome to My Life"}, {"artist": "Simple Plan", "title": "Perfect"}]');

If you query like

SELECT *
FROM   tracks
WHERE  '{"Simple Plan"}'::text[] <@ (json_val_arr(data, 'artist'))
AND '{"Welcome to my Life"}'::text[] <@ (json_val_arr(data, 'title'))

Both records will be matched (both record 1 and record 2), even though you really only wanted the first record.
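
To require artist and title to match within the same array element, the check has to be made per element rather than per document. A sketch using json_array_elements (available since 9.3); note that this form bypasses the two GIN indexes:

    SELECT t.*
    FROM tracks t
    WHERE EXISTS (
        SELECT 1
        FROM json_array_elements(t.data) AS elem
        WHERE elem->>'artist' = 'Simple Plan'
          AND elem->>'title'  = 'Welcome to My Life'
    );

To keep index support, one option would be a third indexed expression that emits a combined artist-plus-title value per element, so the pair can still be matched with <@ as before.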

PostgreSQL synchronous replication time out?

Grant usage partially on schema to user on Postgres


I granted INSERT on a specific table for one user. The problem is that I also need to grant USAGE on the schema for this user, but granting USAGE on the schema also makes all relations in that schema visible to that user.

I need this specific user to be able only to INSERT into this specific table, and to be unable to view other relations (tables, sequences, etc.) in the same schema. By ‘unable to view’ I mean not being able to see that these relations exist.

To be even clearer: from this user’s point of view, the schema has only that one table inside it and nothing more.

How to make PostgreSQL default_tablespace work properly?


My question is, “Why isn’t database ‘screwy’ created in tablespace ‘screwy’?”

I made the following script, with its output following, to show the problem I am having:

#!/bin/bash
#
export OWNER=yourself
export OBJECT=screwy
export OTHER=not${OBJECT}
export PWD='seecret'
#
sudo -u postgres psql -tc "DROP DATABASE IF EXISTS ${OWNER};"
sudo -u postgres psql -tc "DROP DATABASE IF EXISTS ${OBJECT};"
sudo -u postgres psql -tc "DROP DATABASE IF EXISTS ${OTHER};"
sudo -u postgres psql -tc "DROP TABLESPACE IF EXISTS ${OBJECT};"
sudo -u postgres psql -tc "DROP ROLE IF EXISTS ${OWNER};"
#
str=$(cat /etc/lsb-release | grep DESC)
echo "Linux version is : ${str#*=}"
echo "Postgres version is :"
sudo -u postgres psql -qtc "SELECT version();"
#
echo "CREATE ROLE ${OWNER} WITH CREATEDB LOGIN PASSWORD '${PWD}'"
sudo -u postgres psql -tc "CREATE ROLE ${OWNER} WITH CREATEDB LOGIN PASSWORD '${PWD}';"
echo "CREATE TABLESPACE ${OBJECT} OWNER ${OWNER} LOCATION '/home/${OWNER}/data';"
sudo -u postgres psql -tc "CREATE TABLESPACE ${OBJECT} OWNER ${OWNER} LOCATION '/home/${OWNER}/data';"
sudo -u postgres psql -tc "SELECT 'Tablespace "' || spcname || '" is owned by ' || usename 
                             FROM pg_catalog.pg_tablespace, pg_catalog.pg_user 
                            WHERE spcname='${OBJECT}' AND usesysid = spcowner;"
echo "CREATE DATABASE ${OWNER} TABLESPACE ${OBJECT};"
sudo -u postgres psql -tc "CREATE DATABASE ${OWNER} TABLESPACE ${OBJECT};"
#
echo "ALTER ROLE ${OWNER} SET DEFAULT_TABLESPACE TO ${OBJECT};"
sudo -u postgres psql -tc "ALTER ROLE ${OWNER} SET DEFAULT_TABLESPACE TO ${OBJECT};"
echo "GRANT CREATE ON TABLESPACE ${OBJECT} TO ${OWNER};"
sudo -u postgres psql -tc "GRANT CREATE ON TABLESPACE ${OBJECT} TO ${OWNER};"
#
echo "## As ${OWNER}"
echo "*:*:*:yourself:${PWD}" | cat > ~/.pgpass
chmod 600 ~/.pgpass
#
echo "SHOW DEFAULT_TABLESPACE;"
psql -tc "SHOW DEFAULT_TABLESPACE;"
#
echo "CREATE DATABASE ${OBJECT};"
psql -tc "CREATE DATABASE ${OBJECT};"
#
echo "CREATE DATABASE ${OTHER};"
psql -tc "CREATE DATABASE ${OTHER} TABLESPACE ${OBJECT};"
#
echo "## As postgres"
sudo -u postgres psql -qtc "SELECT 'Database "' || d.datname || '" is in tablespace "' || t.spcname || '" in disk directory "' || pg_tablespace_location(t.oid) || '".' FROM pg_database d 
                         LEFT JOIN pg_catalog.pg_tablespace t 
                                ON t.oid = d.dattablespace 
                             WHERE datistemplate = false 
                              AND datname IN ('${OTHER}', '${OBJECT}', '${OWNER}');"

When I execute it, it looks like this:

DROP DATABASE
DROP DATABASE
DROP DATABASE
DROP TABLESPACE
DROP ROLE
Linux version is : "Ubuntu 14.04 LTS"
Postgres version is :
 PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-16ubuntu6) 4.8.2, 64-bit

CREATE ROLE yourself WITH CREATEDB LOGIN PASSWORD 'seecret'
CREATE ROLE
CREATE TABLESPACE screwy OWNER yourself LOCATION '/home/yourself/data';
CREATE TABLESPACE
 Tablespace "screwy" is owned by yourself

CREATE DATABASE yourself TABLESPACE screwy;
CREATE DATABASE
ALTER ROLE yourself SET DEFAULT_TABLESPACE TO screwy;
ALTER ROLE
GRANT CREATE ON TABLESPACE screwy TO yourself;
GRANT
##  As yourself
SHOW DEFAULT_TABLESPACE;
 screwy

CREATE DATABASE screwy;
CREATE DATABASE
CREATE DATABASE notscrewy TABLESPACE screwy;
CREATE DATABASE
## As postgres
 Database "yourself" is in tablespace "screwy" in disk directory "/home/yourself/data".
 Database "screwy" is in tablespace "pg_default" in disk directory "".
 Database "notscrewy" is in tablespace "screwy" in disk directory "/home/yourself/data".

My question is, “Why isn’t database ‘screwy’ created in tablespace ‘screwy’?”
