Step 1: Create Monitoring User
We recommend creating a separate monitoring user on your PostgreSQL database for pganalyze.
Enabling pg_stat_statements
Connect to your database with cloudsqlsuperuser / alloydbsuperuser privileges (i.e. the initial user that was created on instance creation), e.g. using psql.
Run the following SQL commands to enable the pg_stat_statements extension, and make sure it was installed correctly:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT * FROM pg_stat_statements LIMIT 1;Create monitoring user
You can either create a monitoring user manually using psql, or create a service account and use IAM authentication.
Connect to the database you will be monitoring, again as a user with cloudsqlsuperuser / alloydbsuperuser privileges. Then run the following to create a monitoring user (we’ve generated a random password for you, but you can replace it with one of your choosing):
CREATE USER pganalyze WITH PASSWORD 'mypassword' CONNECTION LIMIT 5;
GRANT pg_monitor TO pganalyze;
CREATE SCHEMA pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO pganalyze;
GRANT USAGE ON SCHEMA public TO pganalyze;
If you enable the optional reset mode (usually not required), you will also need this helper method:
CREATE OR REPLACE FUNCTION pganalyze.reset_stat_statements() RETURNS SETOF void AS
$$
/* pganalyze-collector */ SELECT * FROM public.pg_stat_statements_reset();
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;Then, connect to each database that you plan to monitor on this server as a superuser (or equivalent) and run the following to enable the collection of additional column statistics and extended statistics:
CREATE SCHEMA IF NOT EXISTS pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO pganalyze;
DROP FUNCTION IF EXISTS pganalyze.get_column_stats;
CREATE FUNCTION pganalyze.get_column_stats() RETURNS TABLE(
schemaname name, tablename name, attname name, inherited bool, null_frac real, avg_width int, n_distinct real, correlation real
) AS $$
/* pganalyze-collector */
SELECT schemaname, tablename, attname, inherited, null_frac, avg_width, n_distinct, correlation
FROM pg_catalog.pg_stats
WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;
DROP FUNCTION IF EXISTS pganalyze.get_relation_stats_ext;
CREATE FUNCTION pganalyze.get_relation_stats_ext() RETURNS TABLE(
statistics_schemaname text, statistics_name text,
inherited boolean, n_distinct pg_ndistinct, dependencies pg_dependencies,
most_common_val_nulls boolean[], most_common_freqs float8[], most_common_base_freqs float8[]
) AS
$$
/* pganalyze-collector */ SELECT statistics_schemaname::text, statistics_name::text,
(row_to_json(se.*)::jsonb ->> 'inherited')::boolean AS inherited, n_distinct, dependencies,
most_common_val_nulls, most_common_freqs, most_common_base_freqs
FROM pg_catalog.pg_stats_ext se
WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;Note: We never collect actual table data through this method (see the NULL values in the get_column_stats function and omitted fields like most_common_vals in the get_relation_stats_ext function), but we do collect statistics about the distribution of values in your tables. You can skip creating the get_column_stats and get_relation_stats_ext helper functions if the database contains highly sensitive information and statistics about it should not be collected. This will impact the accuracy of Index Advisor recommendations.
Note that it is important you run these as a user with cloudsqlsuperuser / alloydbsuperuser privileges in order to pass down the full access to statistics tables.
Verify connection
Let’s make sure we can connect to the database with the new user:
PGPASSWORD=mypassword psql -h 1.2.3.4 -d mydatabase -U pganalyzeWrite down the username and password of the monitoring user, we will use it in the last step of this guide.
Create a service account for monitoring user
Cloud SQL and AlloyDB support IAM database authentication, which allows users to connect to the database using Google Cloud IAM credentials instead of traditional passwords. This means you can use a service account as a monitoring user.
First, create a new service account with the ID “pganalyze”, and set the following two roles in the case of Cloud SQL:
- Cloud SQL Client (
roles/cloudsql.client)- Provides connectivity access to Cloud SQL instances
- Cloud SQL Instance User (
roles/cloudsql.instanceUser)- Role allowing access to a Cloud SQL instance
Or the following roles in case of AlloyDB:
- Cloud AlloyDB Database User (
roles/alloydb.databaseUser)- Provides connectivity access to AlloyDB clusters
- Service Usage Customer (
roles/serviceusage.serviceUsageConsumer)- Provides the user access to an API that checks permissions
The email address pganalyze@your-project-id.iam.gserviceaccount.com will be given to this service account.
For the database username, Cloud SQL and AlloyDB remove the .gserviceaccount.com suffix in the email,
therefore the username for this service account becomes pganalyze@your-project-id.iam.
This username will be used in later steps.
Configure new and existing instances for IAM database authentication
Once the service account for a monitoring user is created, configure the database to monitor for IAM database authentication. Follow the Cloud SQL or AlloyDB documentation to configure.
You can skip this part if your database is already configured for IAM database authentication.
Add a service account to a Cloud SQL instance / AlloyDB cluster
Next, add a service account to a Cloud SQL instance or AlloyDB cluster to monitor. This creates an actual Postgres user for this service account. The following steps add the service account using the Google Cloud console. Please refer to the Cloud SQL or AlloyDB documentation for other ways to add it.
- In the Google Cloud console, open the Overview page of an instance / cluster to monitor.
- Select Users from the SQL navigation menu.
- Click Add user account.
- Click the Cloud IAM radio button.
- Add the email address
pganalyze@your-project-id.iamin the Principal field. - Click Add. The service account is now in the user account list.
Grant database privileges to the monitoring user and create helper functions
Next, grant database privileges needed for monitoring to the created user, then create helper functions.
Note that it is important you run these as a user with cloudsqlsuperuser / alloydbsuperuser privileges in order to pass down the full access to statistics tables.
ALTER ROLE "pganalyze@your-project-id.iam" CONNECTION LIMIT 5;
GRANT pg_monitor TO "pganalyze@your-project-id.iam";
CREATE SCHEMA pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO "pganalyze@your-project-id.iam";
GRANT USAGE ON SCHEMA public TO "pganalyze@your-project-id.iam";If you enable the optional reset mode (usually not required), you will also need this helper method:
CREATE OR REPLACE FUNCTION pganalyze.reset_stat_statements() RETURNS SETOF void AS
$$
/* pganalyze-collector */ SELECT * FROM public.pg_stat_statements_reset();
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;Then, connect to each database that you plan to monitor on this server as a superuser (or equivalent) and run the following to enable the collection of additional column statistics and extended statistics:
CREATE SCHEMA IF NOT EXISTS pganalyze;
GRANT USAGE ON SCHEMA pganalyze TO "pganalyze@your-project-id.iam";
DROP FUNCTION IF EXISTS pganalyze.get_column_stats;
CREATE FUNCTION pganalyze.get_column_stats() RETURNS TABLE(
schemaname name, tablename name, attname name, inherited bool, null_frac real, avg_width int, n_distinct real, correlation real
) AS $$
/* pganalyze-collector */
SELECT schemaname, tablename, attname, inherited, null_frac, avg_width, n_distinct, correlation
FROM pg_catalog.pg_stats
WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;
DROP FUNCTION IF EXISTS pganalyze.get_relation_stats_ext;
CREATE FUNCTION pganalyze.get_relation_stats_ext() RETURNS TABLE(
statistics_schemaname text, statistics_name text,
inherited boolean, n_distinct pg_ndistinct, dependencies pg_dependencies,
most_common_val_nulls boolean[], most_common_freqs float8[], most_common_base_freqs float8[]
) AS
$$
/* pganalyze-collector */ SELECT statistics_schemaname::text, statistics_name::text,
(row_to_json(se.*)::jsonb ->> 'inherited')::boolean AS inherited, n_distinct, dependencies,
most_common_val_nulls, most_common_freqs, most_common_base_freqs
FROM pg_catalog.pg_stats_ext se
WHERE schemaname NOT IN ('pg_catalog', 'information_schema') AND tablename <> 'pg_subscription';
$$ LANGUAGE sql VOLATILE SECURITY DEFINER;Note: We never collect actual table data through this method (see the NULL values in the get_column_stats function and omitted fields like most_common_vals in the get_relation_stats_ext function), but we do collect statistics about the distribution of values in your tables. You can skip creating the get_column_stats and get_relation_stats_ext helper functions if the database contains highly sensitive information and statistics about it should not be collected. This will impact the accuracy of Index Advisor recommendations.
Next we continue by installing the pganalyze collector:
Proceed to Step 2: Install the collector
Couldn't find what you were looking for or want to talk about something specific?
Start a conversation with us →