Error "Database on system with hostname ______ is not responding"

SUMMARY

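This error occurs when a managing system cannot communicate with one of its sources or managed systems. In the steps below, the managing system is called the target and the managed system is called the source.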

ISSUE

This issue appears in two forms:

When you click a system in the navigation pane, an error message appears: "Database on system with hostname ______ is not responding".

Alternatively, in the navigation pane of the legacy UI, an '!' may appear next to the system. Hovering over the system displays a tooltip stating "This system was incompletely loaded".

RESOLUTION

  1. From the command line, ping the source system to verify that the target is able to communicate with the source. 
  2. Verify that the source system is marked as active ("is_active") in the target's database. The following command sets is_active to true for every row in bp.systems, which corrects the entry if the source is listed as inactive.
    psql -U postgres bpdb -c "update bp.systems set is_active='t'"
  3. Verify that you can communicate with the source system database, and that the source system has the target system listed as a manager. Replace HOSTNAME in the sample below with the actual hostname of the managed system.
    psql -U postgres bpdb -h HOSTNAME -c "select * from bp.managers"
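
The three steps above can be run together from the target. The following is a minimal sketch, not a vendor-supplied tool: the function names check_source and run and the DRY_RUN variable are hypothetical. The psql commands are the ones shown in the steps, and the ping call assumes a ping that accepts -c to limit the packet count.

```shell
#!/bin/sh
# Sketch only: check_source, run, and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run steps 1-3 from the target against the source system given as $1.
# The return code (1, 2, or 3) identifies the step that failed.
check_source() {
    host=$1
    # Step 1: basic network reachability (-c 3 stops after three packets).
    run ping -c 3 "$host" || return 1
    # Step 2: mark every system as active in the target's database.
    run psql -U postgres bpdb -c "update bp.systems set is_active='t'" || return 2
    # Step 3: query the source database for its list of managers.
    run psql -U postgres bpdb -h "$host" -c "select * from bp.managers" || return 3
}
```

Running "DRY_RUN=1 check_source HOSTNAME" prints the commands (without their original quoting) instead of executing them, which is a safe way to review them first.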

If you still cannot communicate with the source database after completing all of the steps above, there is likely an error in /usr/bp/data/pg_hba.conf on the source or in /usr/bp/data/pg_service.conf on the target.

Only the last part of pg_hba.conf is configurable. It should look like this:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
host    bpdb    +bpexch,wguest  0.0.0.0/0              md5
hostssl bpdb     postgres       172.17.3.1/32         trust

These lines are the required minimum; additional entries may also be present.
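
One way to confirm that none of the required entries has been lost is to compare the file against the listing above while ignoring comments and column spacing. This is a rough sketch under that assumption: hba_has_line and check_hba are hypothetical helper names, and the expected entries are copied from the sample above, so adjust them if your system's listing differs.

```shell
#!/bin/sh
# Sketch only: hba_has_line and check_hba are hypothetical helpers.

# hba_has_line FILE "ENTRY": succeed if FILE has an uncommented line whose
# fields match ENTRY once runs of whitespace are collapsed to single spaces.
hba_has_line() {
    awk -v want="$2" '
        /^[ \t]*#/ { next }        # skip comment lines
        NF == 0    { next }        # skip blank lines
        { $1 = $1 }                # rebuild the line with single spaces
        $0 == want { found = 1 }
        END { exit !found }
    ' "$1"
}

# check_hba FILE: report each required entry that is missing from FILE.
check_hba() {
    file=$1
    status=0
    while IFS= read -r line; do
        hba_has_line "$file" "$line" || { echo "missing: $line"; status=1; }
    done <<'EOF'
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
host bpdb +bpexch,wguest 0.0.0.0/0 md5
hostssl bpdb postgres 172.17.3.1/32 trust
EOF
    return $status
}
```

On the source, "check_hba /usr/bp/data/pg_hba.conf" prints nothing and returns 0 when all five entries are present.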

pg_service.conf should appear as follows:

[localhost]
user=postgres
connect_timeout=5

[connpooldb]
user=postgres
dbname=pgbouncer
host=localhost
port=6432
connect_timeout=5

[upsilon]
user=postgres
host=localhost
port=6432
sslmode=prefer
connect_timeout=30

[HOSTNAME]
user=postgres
host=HOSTNAME
connect_timeout=3
sslmode=prefer

Here, HOSTNAME is again replaced by the hostname of the managed system. Make any needed corrections to these files, restart the database, and then repeat step 2 to resolve the issue.
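
Before restarting the database, it can help to confirm that the target's pg_service.conf really contains a section for the managed system. The helper below is a minimal sketch: service_value is a hypothetical name, and it assumes section headers and key=value lines are written without surrounding whitespace, as in the listing above.

```shell
#!/bin/sh
# Sketch only: service_value is a hypothetical helper.

# service_value FILE SECTION KEY
# Print the value of KEY inside [SECTION]; print nothing if it is absent.
service_value() {
    awk -v sec="[$2]" -v key="$3" '
        /^\[/ { in_sec = ($0 == sec); next }   # track the current section
        in_sec {
            eq = index($0, "=")
            if (eq > 0 && substr($0, 1, eq - 1) == key) {
                print substr($0, eq + 1)
                exit
            }
        }
    ' "$1"
}
```

For example, "service_value /usr/bp/data/pg_service.conf HOSTNAME host" should print HOSTNAME itself when the section is intact.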

CAUSE

This error occurs when communication is interrupted between the two systems, often due to network outages. 

Damage to the pg_hba.conf and pg_service.conf files is usually caused by an improper dump and reload of the database.
