This queue is for tickets about the DBD-Pg CPAN distribution.

Report information
The Basics
Id:
100648
Status:
resolved
Priority:
Low/Low
Queue:

People
Owner:
greg [...] turnstep.com
Requestors:
TIMB [...] cpan.org
Cc:
david [...] justatheory.com
AdminCc:

BugTracker
Severity:
Critical
Broken in:
(no value)
Fixed in:
3.5.0



CC: david@justatheory.com
Subject: Behaviour of DBD::Pg ping() doesn't match DBI spec. It doesn't check the "database server is still running" in some cases.
MIME-Version: 1.0
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 727
The DBD::Pg docs define the ping() method as "check the validity of a database handle" and it returns true in some cases *without communicating with the db server*. The DBI docs define the ping method as checking that "... the database server is still running and the connection to it is still working". I've been debugging a problem with code that's using DBIx::Connector in 'fixup' mode where DBIx::Connector didn't recover after a database crash. As fas a libpq was concerned the connection was within a transaction *so it returned true* without checking that "the database server is still running and the connection to it is still working". This meant a critical service went down and stayed down after the db restarted.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 197
Essentially, ping should never return true without actually communicating with the postgres server. That also means correctly detecting when a pgbouncer is up but the pg instance behind it is down.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-10001-1417471541-73.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 96
A potential workaround would be to add a ping() override method into DBIx::Connector::Driver::Pg
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 253
Looks like we issue a SELECT only when we are active but not when idle in a transaction. I think the correct answer is to issue our SELECT in both cases (PQTRANS_IDLE and PQTRANS_INTRANS). I've applied that in: 3e2979fe4428fa9603206455dbfb90917f1a80d3
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 2004
For the record, from IRC: [11:14am] timbunce_: G_SabinoMullane: why even consider the value of pg_db_txn_status at all? The semantics of ping are unrelated to transactions. [11:14am] timbunce_: (but maybe I'm missing something as I'm not familiar with libpq) [11:42am] G_SabinoMullane: timbunce_: Well we know we can't do anything if it returns active (query already running) or in a failed transaction. [11:43am] G_SabinoMullane: But while we can't run a ping in those circumstances, we can return a little more information (well, pg_ping does: while ping is boolean, pg_ping provides more information via integer return code) [3:42pm] timbunce_: G_SabinoMullane: it's vital that ping returns false on failure of the server/connection regardless of the state of the txn/client. [3:42pm] timbunce_: It sounds like you're saying that in certain txn/client states the state of server/connection can't be determined. Is that right? If so, please give more details. [3:45pm] G_SabinoMullane: If we are, for example, in a failed transaction, we cannot send the 'SELECT 1' to the backend. [3:45pm] G_SabinoMullane: However.. [3:45pm] G_SabinoMullane: I suppose we could still send it and swallow the error, in the theory that the ping server-still-there response trumps everything else. [3:46pm] timbunce_: G_SabinoMullane: yes, that would be fine. [3:47pm] G_SabinoMullane: I suppose the same could be said for the other states too. We can always *send* the SELECT 1, but we cannot promise we will get back a valid answer. [3:47pm] G_SabinoMullane: But in this context, a valid answer could be an ERROR. As long as it came from the server, ping can reply true. [3:48pm] timbunce_: Yes, exactly. Thanks. [3:48pm] G_SabinoMullane: Okay. I will whip up some tests when I get some tuits. [3:50pm] timbunce_: Thanks. Hopefully there's some way to cartegories errors-from-server vs errors-from-libpq. [4:07pm] timbunce_: pgbouncer is another case that needs considering - it might be up but the pg backend be down.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org> <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 432
In case it helps, you can use this code to simulate a lost network connection: my $fd = $dbh->{pg_socket} or die "fd"; open(DBH_PG_FH, "<&=".$fd) or die "open(DBH_PG_FH): $!"; close DBH_PG_FH or die "close(DBH_PG_FH): $!"; A test case for this bug would be to collect to the db, start a transaction, call the above code, and then call ping(). Currently ping() returns true in this situation. It needs to return false.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org> <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org> <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-21821-1419434085-1818.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 139
Applied some patches to fix this, test this, and properly document this. Latest git commit is 903ce34f026edc41b63a8b2dff9b55a25fa47f7f
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-21821-1419434085-1818.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org> <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org> <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org> <rt-4.0.18-21821-1419434085-1818.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-7209-1419778312-1368.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 19
Looks good. Thanks.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-7209-1419778312-1368.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org> <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org> <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org> <rt-4.0.18-21821-1419434085-1818.100648-0-0@rt.cpan.org> <rt-4.0.18-7209-1419778312-1368.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-11191-1420143347-1715.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 103
A much improved version here: 95d03c1eb2ba1332935a6a44dcfdf8ed5068c9f5 Much better testing, as well.
MIME-Version: 1.0
In-Reply-To: <rt-4.0.18-11191-1420143347-1715.100648-0-0@rt.cpan.org>
X-Mailer: MIME-tools 5.504 (Entity 5.504)
Content-Disposition: inline
X-RT-Interface: Web
References: <rt-4.0.18-11187-1417471188-1856.0-0-0@rt.cpan.org> <rt-4.0.18-11187-1417471466-803.100648-0-0@rt.cpan.org> <rt-4.0.18-21904-1417482198-425.100648-0-0@rt.cpan.org> <rt-4.0.18-3595-1417536523-780.100648-0-0@rt.cpan.org> <rt-4.0.18-2158-1418041841-266.100648-0-0@rt.cpan.org> <rt-4.0.18-21821-1419434085-1818.100648-0-0@rt.cpan.org> <rt-4.0.18-7209-1419778312-1368.100648-0-0@rt.cpan.org> <rt-4.0.18-11191-1420143347-1715.100648-0-0@rt.cpan.org>
Content-Type: text/plain; charset="utf-8"
Message-ID: <rt-4.0.18-18446-1420645422-234.100648-0-0@rt.cpan.org>
Content-Transfer-Encoding: binary
X-RT-Original-Encoding: utf-8
X-RT-Encrypt: 0
X-RT-Sign: 0
Content-Length: 133
Thank you for pushing this along. Version 3.5.0 has been released which should address all of the problems with ping() and pg_ping()


This service runs on Request Tracker, is sponsored by The Perl Foundation, and maintained by Best Practical Solutions.

Please report any issues with rt.cpan.org to rt-cpan-admin@bestpractical.com.