Introducing Query Tuning Workbooks to safely tune Postgres queries on production with pganalyze!

Step 1: Deploy the Collector

In order to monitor a Crunchy Bridge database you run the pganalyze collector directly on your database server using Postgres Container Apps.

Grant pg_monitor role and log file access to the application user

We recommend using a role with limited privileges for monitoring. You can use Crunchy Bridge's application user as the user for monitoring, or create a separate role. The instructions here assume you are using the application user.

To ensure the application user can access monitoring statistics for all users (not just itself), and can read the server log, run the following with the superuser credentials:

GRANT pg_monitor TO application;

Additionally, run the following to allow the application user to read the Postgres server log:

CREATE SCHEMA IF NOT EXISTS pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO application;
CREATE OR REPLACE FUNCTION pganalyze.read_log_file(log_filename text, read_offset bigint, read_length bigint) RETURNS text AS
$$
DECLARE
  result text;
BEGIN
  IF log_filename !~ '\A[\w\.-]+\Z' THEN
    RAISE EXCEPTION 'invalid log filename';
  END IF;

  SELECT pg_catalog.pg_read_file(
    pg_catalog.current_setting('data_directory') || '/' || pg_catalog.current_setting('log_directory') || '/' || log_filename,
    read_offset,
    read_length
  ) INTO result;

  RETURN result;
END
$$ LANGUAGE plpgsql VOLATILE SECURITY DEFINER;

Then, connect to each database that you plan to monitor on this server as a superuser (or equivalent) and run the following to enable the collection of additional column statistics and extended statistics:

CREATE SCHEMA IF NOT EXISTS pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO pganalyze;

DROP FUNCTION IF EXISTS pganalyze.get_column_stats;
CREATE FUNCTION pganalyze.get_column_stats() RETURNS TABLE(
  schemaname name, tablename name, attname name, inherited bool, null_frac real, avg_width int, n_distinct real, correlation real
) AS $$
  /* pganalyze-collector */
  SELECT schemaname, tablename, attname, inherited, null_frac, avg_width, n_distinct, correlation
  FROM pg_catalog.pg_stats
  WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;

DROP FUNCTION IF EXISTS pganalyze.get_relation_stats_ext;
CREATE FUNCTION pganalyze.get_relation_stats_ext() RETURNS TABLE(
  statistics_schemaname text, statistics_name text,
  inherited boolean, n_distinct pg_ndistinct, dependencies pg_dependencies,
  most_common_val_nulls boolean[], most_common_freqs float8[], most_common_base_freqs float8[]
) AS
$$
  /* pganalyze-collector */ SELECT statistics_schemaname::text, statistics_name::text,
  (row_to_json(se.*)::jsonb ->> 'inherited')::boolean AS inherited, n_distinct, dependencies,
  most_common_val_nulls, most_common_freqs, most_common_base_freqs
  FROM pg_catalog.pg_stats_ext se
  WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;

Note: We never collect actual table data through this method (see the NULL values in the get_column_stats function and omitted fields like most_common_vals in the get_relation_stats_ext function), but we do collect statistics about the distribution of values in your tables. You can skip creating the get_column_stats and get_relation_stats_ext helper functions if the database contains highly sensitive information and statistics about it should not be collected. This will impact the accuracy of Index Advisor recommendations.

In later steps you can now specify the application user credentials as the DB_URL.

Enable Postgres Container Apps

If you have not yet enabled Postgres Container Apps, connect to your Crunchy Bridge Postgres database, and run the following commands as a superuser:

CREATE EXTENSION IF NOT EXISTS pgpodman;

Create Crunchy Bridge API key

Providing the Crunchy Bridge API key to the collector will allow the collector to obtain accurate Storage Space information, as well as cluster information. It is recommended to create a new API key for the collector and passing it as an environment value. See the Crunchy Bridge documentation for how to create API keys.

Run collector test

Run the following command to verify the configuration. For the DB_URL, specify the application user credentials. For the CRUNCHY_BRIDGE_API_KEY, specify the API key created above.

SELECT run_container('-t -e DB_URL="REPLACE_ME" -e PGA_API_KEY="API_KEY" -e CRUNCHY_BRIDGE_API_KEY="REPLACE_ME" -e DB_ALL_NAMES=1 quay.io/pganalyze/collector:stable test');

This should return output like this:

                                run_container
--------------------------------------------------------------------------------
 I Running collector test with pganalyze-collector X.XX.X
 I [default] Testing statistics collection...
 I [default]   Test collection successful for PostgreSQL XX.X
 I [default]   Test snapshot successful (27.9 KB)
 I [default] Testing activity snapshots...
 I [default]   Test snapshot successful (2.28 KB)
 I [default] Testing log download...
 I [default]   Test snapshot successful (1.45 KB)
 I [default]   Log test successful

 Test successful. View servers in pganalyze:
  - [default]: https://app.pganalyze.com/servers/SERVER_ID

(1 row)

Start collector in the background

With a successful test, you can now start the collector container in the background:

SELECT run_container('-d -t -e DB_URL="REPLACE_ME" -e PGA_API_KEY="API_KEY" -e CRUNCHY_BRIDGE_API_KEY="REPLACE_ME" -e DB_ALL_NAMES=1 quay.io/pganalyze/collector:stable');

Once you've confirmed the install is successful and you're receiving query data in pganalyze, we recommend setting up Log Insights as a follow-up step, to automatically track log events in your database.


Couldn't find what you were looking for or want to talk about something specific?
Start a conversation with us →