haproxy: Monitor PostgreSQL for current master

September 24, 2019 - Reading time: 2 minutes

There are couple of methods for haproxy to monitor what PostgreSQL instance is currently master so that "dumb" clients can always write to a PostgreSQL instance (instead of connecting to the standby node).

First there is the send-binary method by replicating the PostgreSQL V3 protocol but requires trusted connection and is TCP/IP only (no SSL). Another issue is that haproxy terminates the connection dirty when connecting to standby instances.

Another method is letting haproxy call xinetd service that runs script to check if the current node is the master.

Third method is letting haproxy call monitor script directly using the external-check command. This requires that the "psql" program be installed on the haproxy server.

In haproxy.cfg the "external-check" has to be enabled in the global preferences


global
    external-check
    # chroot cannot be enabled
    # chroot /var/lib/haproxy
    ...

frontend pg_frontend
        mode tcp
        bind *:6001
        default_backend pg_backend

backend pg_backend
        mode tcp
        option external-check
        external-check command /path/to/monitor/script/check_postgresql_master.sh
        server pg01 10.0.0.1:5432 check inter 1s
        server pg02 10.0.0.2:5432 check inter 1s
        server pg03 10.0.0.3:5432 check inter 1s

The monitor script can be customized since it uses the standard psql command.


#!/bin/bash

PG_MONITOR_USER=haproxy
PG_MONITOR_PASS=haproxy
PG_MONITOR_DB=postgres

PG_PSQL=/usr/bin/psql

VIP=$1
VPT=$2
RIP=$3

if [ "$4" == "" ]; then
  RPT=$VPT
else
  RPT=$4
fi

STATUS=$(PGPASSWORD="$PG_MONITOR_PASS" $PG_PSQL -qtAX -c "select pg_is_in_recovery()" -h "$RIP" -p "$RPT" --dbname="$PG_MONITOR_DB" --username="$PG_MONITOR_USER")

if [ "$STATUS" == "f" ]; then
  # Master
  exit 0
else
  exit 1
fi

Remember to add the monitor user to the pg_hba.conf so haproxy can log in.

Caveat: Haproxy can't run in chroot mode unless all binaries and library are put into the chroot directory.


vlan bridges in netplan

March 21, 2019 - Reading time: ~1 minute

Took me a while to figure this one out in the netplan yaml syntax:


network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: no
      dhcp6: no
  bridges:
    br20:
      dhcp4: no
      dhcp6: no
      interfaces: [ vlan20 ]
      addresses: [ 192.168.20.5/24 ]
      gateway4: 192.168.20.1
      nameservers:
        addresses:
          - "192.168.20.1"
    br40:
      dhcp4: no
      dhcp6: no
      interfaces: [ vlan40 ]
  vlans:
    vlan20:
      id: 20
      link: eno1
      dhcp4: no
      dhcp6: no
    vlan40:
      id: 40
      link: eno1
      dhcp4: no
      dhcp6: no

Bonus .vimrc settings for yaml


autocmd FileType yaml setlocal ts=2 sts=2 sw=2 expandtab

Pipe cron output to syslog/journalctl

March 8, 2019 - Reading time: ~1 minute

Simple way to pipe cron script output to journalctl / syslog


MAILTO=""
* * * * * user (echo "testing") 2>&1 | logger -t tag-to-use

gives following in journalctl


$ journalctl SYSLOG_IDENTIFIER=tag-to-use
-- Logs begin at Tue 2019-02-26 16:48:53 UTC, end at Fri 2019-03-08 22:12:03 UTC. --
Mar 08 22:04:01 server tag-to-use[25769]: testing

Convert PostgreSQL cluster to use page checksums

February 26, 2019 - Reading time: 3 minutes

When upgrading PostgreSQL cluster from one major version to another you cannot use pg_upgrade tool to upgrade/convert cluster with page checksums disabled to a cluster with page checksums enabled.

Instead you have to do dump and restore.

On Debian/Ubuntu based systems this is made easier using the pg_createcluster / pg_dropcluster commands.

Here I'm converting a PostgreSQL 11 cluster named 11/main (with no page checksums enabled) to new cluster named 11/data with page checksums enabled.

Start by editing /etc/postgresql-common/createcluster.conf and change

#initdb_options = ''

to

initdb_options = '--data-checksums'

pg_lsclusters should only show one cluster running:

$ pg_lsclusters
Ver Cluster Port Status Owner    Data directory              Log file
11  main    5432 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log

Create a new cluster with

$ pg_createcluster 11 data

This creates a empty cluster with page checksums enabled on another port (usually port 5433). Verify it with pg_lsclusters:

$ pg_lsclusters
Ver Cluster Port Status Owner    Data directory              Log file
11  main    5432 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log

11  data    5433 online postgres /var/lib/postgresql/11/data /var/log/postgresql/postgresql-11-data.log

Verify that page checksums are enabled with

$ psql -p 5433 -c "show data_checksums"

 data_checksums
----------------
 on
(1 row)

To dump and reload it is simplest to do (as the postgres user)

$ pg_dumpall --port=5432 --clean | psql --port=5433

This dumps all users and databases and restores them to the new cluster.

After this is done (assuming no transactions happened in the meantime) remember to copy pg_hba.conf and other settings from /etc/postgresql/11/main to /etc/postgresql/11/data.

Drop the old cluster with

$ pg_dropcluster 11 main

and change the port on 11/data to 5432.


View CPU microcode revision from powershell

June 28, 2018 - Reading time: 5 minutes

A small powershell snippet to show what CPU microcode revision is running and what microcode revision the BIOS provides.

$registrypath = "Registry::HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION\System\CentralProcessor\0\"

$processor = (Get-ItemProperty -Path $registrypath )."ProcessorNameString"
$biosMicrocode = (Get-ItemProperty -Path $registrypath )."Previous Update Revision"
$runningMicrocode = (Get-ItemProperty -Path $registrypath )."Update Revision"

# Convert to string and remove leading zeros
$biosMicrocodeInHex = (-join ( $biosMicrocode[0..4] | foreach { $_.ToString("X2") } )).TrimStart('0')
$runningMicrocodeInHex = (-join ( $runningMicrocode[0..4] | foreach { $_.ToString("X2") } )).TrimStart('0')

Write-Host "CPU: $processor"
Write-Host "BIOS microcode revision: 0x$biosMicrocodeInHex"
Write-Host "Current microcode revision: 0x$runningMicrocodeInHex"

Example output:

CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
BIOS microcode version:  0x28
Current microcode running: 0x2D

Blacklist bad memory addresses in Windows

June 22, 2018 - Reading time: 9 minutes

I recently experienced random MEMORY_MANAGEMENT BSOD on my personal computer. When consulting Microsoft Dev Center for the error code it indicated a issue with the RAM.

The computer is 6 years old and only has 8 GB of RAM (two sticks) I’m not about to spend money to replace the RAM.

I ran memtest86 and got two bad memory spots in two tests (Moving inversions, random pattern):

  • 0x1BFAE5474
  • 0x116A3FE55

Windows is able to blacklist bad memory addresses using the bcdedit tool. However it only blacklists page of memory (4KB) instead of single address. In order to convert from memtest86 single address syntax to bcdedit pages syntax I had to remove the last 3 letters from each memory address (since 0xFFF is 4KB).

memtest86 bcdedit Memory range from Memory range to Size
0x1BFAE5474 0x1BFAE5 0x1BFAE5000 0x1BFAE5FFF 4 KB
0x116A3FE55 0x116A3F 0x116A3F000 0x116A3FFFF 4 KB

Running command promt as administrator I could blacklist the memory addresses:

# Enable memory blacklisting
bcdedit /set {badmemory} badmemoryaccess no
# Specify what addresses to blacklist
bcdedit /set {badmemory} badmemorylist 0x1bfae5 0x116a3f

and reboot.

To verify that the memory addresses are blacklisted use

bcdedit /enum {badmemory}

or use RAMMap from Microsoft.

This solved my MEMORY_MANAGEMENT BSOD and random program crashes.

Caveat

When upgrading from one major Windows versions to another (for example 1709 to 1803) these settings are lost and have to be reapplied.

This is not a permanent fix but should help on older systems or systems where you can’t change the RAM (such as embedded systems).