pganalyze Collector settings
The collector can be configured either through an
INI config file (in a package-based install) or environment
variables (typically for running via its Docker image). Most settings can be configured through either mechanism.
If both are present, the config file takes precedence. Note that a single collector instance can monitor more than
one database server, though this is not supported when configuring through environment variables.
In an INI-based setup, the
[pganalyze] section describes settings that apply to all servers. Other sections
describe the servers to monitor, how to connect to them, and server-specific configuration settings. You should
name the other sections after the servers they correspond to, though note these are not the names that will appear
in the app. In-app names are based on hostname settings as determined during monitoring. When you set environment
variables in addition to specifying a configuration file, the environment variable settings apply to all monitored
servers.
After you make changes, you can run
pganalyze-collector --test --reload to verify the new configuration and, if it works correctly, load
it into the collector background process. This minimizes monitoring
interruptions and simplifies config file updates.
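As a rough sketch of the INI layout described above, a config file monitoring two servers might look like the following. The section names, hostnames, database names, and credentials are all placeholders invented for illustration, and the `db_url` value assumes the usual Postgres connection URI form.

```ini
# Hypothetical example; every name and value below is a placeholder.

# Settings that apply to all monitored servers
[pganalyze]
api_key: your_api_key

# One section per monitored server. The section name is only a local
# label and is not the name shown in the app.
[app_primary]
db_host: primary.example.internal
db_name: app_production
db_username: pganalyze
db_password: change_me

# A second server, configured with a single URL instead of individual settings
[app_reporting]
db_url: postgres://pganalyze:change_me@reporting.example.internal:5432/reporting
```

After editing a file like this, running pganalyze-collector --test --reload as described above checks the configuration before the background process picks it up.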
The tables below list configuration settings, their defaults if not set, and their descriptions. If a setting is
configurable through environment variables, the environment variable name follows the setting in parentheses.
Environment variables for boolean settings expect 1 for true and 0 for false.
Note that these settings apply to the latest version of the collector.
Common settings for configuring collector behavior, independent of the platform.
|api_key (||n/a, required||API key to authenticate the collector to the pganalyze app. We will show this when adding a server in the app, and you can review it on your organization's Settings page under the API keys tab.|
|api_base_url (||https://api.pganalyze.com||Base URL for contacting the pganalyze API. You typically do not need to change this, unless you are running pganalyze Enterprise Server.|
|skip_if_replica (||false||Skip all metadata collection and snapshot submission while this server is a replica (according to |
|enable_log_explain (||false||Enable log-based EXPLAIN. See setup instructions, but note we recommend using auto_explain instead if possible.|
Server connection settings
How to connect to the server(s) pganalyze will be monitoring. Note that when monitoring multiple databases on the same server, the first specified is considered the primary database. This is where helper functions are expected to be defined, and the one we connect to for server-wide metrics.
|db_url (||n/a, either this or individual settings below are required||URL of the database server to monitor|
|db_name (||n/a, either this or db_url is required||Name of database to monitor; or, comma-separated list of all databases to monitor, starting with primary (last entry can be |
|db_username (||n/a, either this or db_url is required||Postgres user to connect as (we recommend using the pganalyze monitoring user here)|
|db_password (||[none]||Password for the Postgres user|
|db_host (||n/a, either this or db_url is required||Host to connect to. Or, if connecting to a local server using peer authentication, the path to a local Unix domain socket as configured by unix_socket_directories.|
|db_port (||5432||Port to connect on|
|db_sslmode (||prefer||The |
|db_sslrootcert (||system certificate store||Path to SSL certificate authority (CA) certificate(s) to use to verify the server's certificate, or |
|db_sslrootcert_contents (||n/a, see above||Alternative to above, using actual contents of the certificate(s) instead|
|db_sslcert (||[none]||Path to the client SSL certificate (optional, usually not required)|
|db_sslcert_contents (||[none]||Alternative to above, using actual contents of the certificate instead|
|db_sslkey (||[none]||Path to the secret key used for the client certificate|
|db_sslkey_contents (||[none]||Alternative to above, using actual contents of the key|
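To illustrate how the individual connection settings fit together, here is a minimal sketch of one server section that monitors two databases over a verified SSL connection. The section name, host, database names, and certificate path are placeholders, and `verify-full` is assumed here as one of the standard Postgres `sslmode` values.

```ini
# Hypothetical example; every name and value below is a placeholder.
[app_primary]
db_host: db.example.internal
db_port: 5432
# Comma-separated list; the first entry is treated as the primary database,
# where helper functions are expected to be defined
db_name: app_production,app_analytics
db_username: pganalyze
db_password: change_me
# Assumed value: verify-full is a standard Postgres sslmode
db_sslmode: verify-full
# CA certificate used to verify the server's certificate
db_sslrootcert: /etc/ssl/certs/example-ca.pem
```

When `db_url` is used instead, the individual `db_host`, `db_name`, and `db_username` settings are not required.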
PII Filtering settings
We take the responsibility of access to your database very seriously. As discussed above, we already limit the direct access we have to your data, but some personally-identifiable information or other sensitive values can still come up in query text or logs. To address this, the collector has several settings to filter these before we collect them.
|filter_log_secret (||none||One or more of |
|filter_query_sample (||none||Either |
|filter_query_text (||unparsable||Either |
By default the pganalyze collector does not filter data. Our recommended configuration for servers containing sensitive data is as follows:
filter_log_secret: all
filter_query_sample: normalize
filter_query_text: unparsable
This configuration uses the normalize setting for filter_query_sample, which removes all query parameter values from query samples, including those contained within automatically collected EXPLAIN plans.
To use auto_explain with this configuration, make sure it logs plans in the json format (other formats are currently not supported by the EXPLAIN normalization).
Schema filter settings
The pganalyze collector limits the number of schema objects (tables, views, etc.) that can be monitored on each database server. This limit is currently 5,000 tables or views per database server. If this limit is exceeded, no schema information will be collected.
You can avoid reaching this limit by using the following setting to select which tables/views should be excluded:
|ignore_schema_regexp (||[none]||Skip collecting metadata for all matching tables, schemas, or functions; match is checked against schema-qualified object names (e.g. to ignore table "foo" only in the public schema, set to |
To validate whether the setting is working as intended, you can run a query on your database server to count the number of monitored tables and views. Note that you have to run this on each database on the server and then add up the counts; the total should not exceed the 5,000 limit:
SELECT current_database() AS dbname,
       COUNT(*) AS table_and_view_count
  FROM pg_class c
  LEFT JOIN pg_catalog.pg_namespace n ON (n.oid = c.relnamespace)
 WHERE c.relkind IN ('r','v','m','p')
   AND c.relpersistence <> 't'
   AND c.relname NOT IN ('pg_stat_statements')
   AND n.nspname NOT IN ('pg_catalog','pg_toast','information_schema')
   AND (n.nspname || '.' || c.relname) !~* 'REGEXP';
Make sure to replace REGEXP with the value of your ignore_schema_regexp setting.
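As a small sketch of how the schema filter might be set, the following server section skips every object in schemas whose names begin with `tmp_`. The section name and the pattern are hypothetical; the pattern relies on the match being checked against schema-qualified names, so anchoring at the start matches on the schema name.

```ini
# Hypothetical example; section name and pattern are placeholders.
[app_primary]
# Matches schema-qualified names such as "tmp_import.orders"
ignore_schema_regexp: ^tmp_
```

With this setting, the validation query above would be run with `^tmp_` in place of REGEXP.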
Only relevant if you are running your database in Amazon RDS or Amazon Aurora. See our RDS/Aurora setup instructions for details.
Note that the
aws_endpoint_* settings are only relevant if you are using custom AWS endpoints. See
the AWS documentation for details.
|aws_region (||auto-detected from hostname||Region your AWS server is running in|
|aws_db_instance_id (||auto-detected from hostname||Instance ID of your Amazon RDS instance; may need to be set manually when using IP addresses or custom DNS records|
|aws_db_cluster_id (||auto-detected from hostname||Cluster ID of your Amazon Aurora cluster (either cluster or reader endpoint); may need to be set manually when using IP addresses or custom DNS records|
|aws_access_key_id (||[none]||Only necessary if not using recommended instance roles configuration|
|aws_secret_access_key (||[none]||See above|
|aws_account_id (||[none]||If specified, and api_system_scope (see below) is not specified, this is prepended to the auto-generated system scope (optional, can be used to, e.g., differentiate staging from production)|
|db_use_iam_auth (||[none]||Fetches a short-lived token for logging into the database instance from the AWS API, instead of using a hardcoded password in the collector configuration file. To use this setting, IAM authentication needs to be enabled on the database instance / cluster, the pganalyze IAM policy needs to cover the "rds-db:connect" privilege for the pganalyze user, and the user needs to be granted the "rds_iam" role in Postgres.|
|aws_assume_role (||[none]||If using cross-account role delegation, the ARN of the role to assume; see the AWS documentation for details|
|aws_web_identity_token_file (||[none]||If running the collector inside EKS, can be used with aws_role_arn in order to access AWS resources; see the AWS documentation for details|
|aws_role_arn (||[none]||If running the collector inside EKS, can be used with aws_web_identity_token_file in order to access AWS resources; see the AWS documentation for details|
|aws_endpoint_signing_region (||[none]||Region to use for signing requests (optional, usually not required)|
|aws_endpoint_rds_url (||[none]||URL of RDS service (optional, usually not required)|
|aws_endpoint_ec2_url (||[none]||URL of EC2 service (optional, usually not required)|
|aws_endpoint_cloudwatch_url (||[none]||URL of CloudWatch service (optional, usually not required)|
|aws_endpoint_cloudwatch_logs_url (||[none]||URL of CloudWatch log service (optional, usually not required)|
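As an illustration of when the AWS settings need to be set by hand, the sketch below assumes an RDS instance reached through a custom DNS record, so the region and instance ID cannot be detected from the hostname. All names and values are placeholders.

```ini
# Hypothetical example; every name and value below is a placeholder.
[rds_primary]
# Custom DNS record, so auto-detection from the hostname does not apply
db_host: db.internal.example.com
db_name: app_production
db_username: pganalyze
db_password: change_me
aws_region: us-east-1
aws_db_instance_id: my-rds-instance
```

For an Aurora cluster, `aws_db_cluster_id` would take the place of `aws_db_instance_id`.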
Only relevant if you are running your database in Azure using Azure Database for PostgreSQL. See our Azure setup instructions for details.
|azure_db_server_name (||auto-detected from hostname||Name of your server; may need to be set manually when using IP addresses or custom DNS records|
|azure_eventhub_namespace (||n/a, required for Log Insights||Event Hub namespace to use for log handling|
|azure_eventhub_name (||n/a, required for Log Insights||Event Hub name to use for log handling|
|azure_ad_tenant_id (||[none]||The "Directory (tenant) ID" on your application. Only necessary if not using the recommended Managed Identity setup; see these setup instructions for details|
|azure_ad_client_id (||[none]||The "Application (client) ID" on your application|
|azure_ad_client_secret (||[none]||When using client secrets, specify the generated secret here|
|azure_ad_certificate_path (||[none]||When using certificates, specify the path to your certificate here|
|azure_ad_certificate_password (||[none]||When using certificates, specify your certificate password here, if required|
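A minimal sketch of a server section for Azure Database for PostgreSQL with Log Insights might look like this, assuming the recommended Managed Identity setup so that none of the `azure_ad_*` settings are needed. The server, namespace, and Event Hub names are placeholders.

```ini
# Hypothetical example; every name and value below is a placeholder.
[azure_primary]
db_host: db.internal.example.com
db_name: app_production
db_username: pganalyze
db_password: change_me
# Set manually because the host is a custom DNS record
azure_db_server_name: my-azure-server
# Both Event Hub settings are required for Log Insights
azure_eventhub_namespace: my-eventhub-namespace
azure_eventhub_name: my-eventhub
```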
Google Cloud Platform
Only relevant if you are running your database in GCP using Google Cloud SQL or Google AlloyDB. See the GCP setup instructions for details.
|gcp_cloudsql_instance_id (||n/a, required for Log Insights (for Cloud SQL)||Google Cloud SQL instance ID|
|gcp_alloydb_cluster_id (||n/a, required for Log Insights (for AlloyDB)||Google AlloyDB cluster ID|
|gcp_alloydb_instance_id (||n/a, required for Log Insights (for AlloyDB)||Google AlloyDB instance ID (within the given cluster)|
|gcp_project_id (||n/a, required||GCP project ID; see Google documentation for details|
|gcp_pubsub_subscription (||n/a, required for Log Insights||See GCP setup instructions for details|
|gcp_credentials_file (||[none]||Only necessary if not using the recommended method of assigning the Service Account to the VM directly; see these setup instructions for details|
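For Google Cloud SQL, a server section with Log Insights enabled could be sketched as follows. This assumes the Service Account is assigned to the VM directly, so `gcp_credentials_file` is left out. All IDs are placeholders, and the exact form of the subscription value is described in the GCP setup instructions rather than here.

```ini
# Hypothetical example; every name and value below is a placeholder.
[cloudsql_primary]
db_host: 10.0.0.5
db_name: app_production
db_username: pganalyze
db_password: change_me
gcp_project_id: my-gcp-project
gcp_cloudsql_instance_id: my-cloudsql-instance
# Placeholder only; see the GCP setup instructions for the expected format
gcp_pubsub_subscription: my-log-subscription
```

For AlloyDB, `gcp_alloydb_cluster_id` and `gcp_alloydb_instance_id` would replace `gcp_cloudsql_instance_id`.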
If running on your own infrastructure, a platform other than the cloud providers listed above, or in a self-managed VM on a cloud provider, the configuration settings here may be useful.
|db_log_location (||[none]||Database log file or directory location (must be readable by the pganalyze system user)|
|db_log_syslog_server (||[none]||Local address (host:port) to listen on for syslog messages (see syslog server instructions)|
|always_collect_system_data (||false||Always gather local system metrics, regardless of whether the database address is local or remote. This is useful for setups which connect to a local database with a non-local IP address.|
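For a self-managed server where the collector runs on the database host, log collection from a local directory might be configured as in this sketch. The section name, database name, and log path are placeholders; the path must match wherever your Postgres installation actually writes its logs.

```ini
# Hypothetical example; every name and value below is a placeholder.
[selfhosted_primary]
db_host: localhost
db_name: app_production
db_username: pganalyze
db_password: change_me
# Directory (or single file) that must be readable by the pganalyze system user
db_log_location: /var/log/postgresql
```

If logs arrive over syslog instead, `db_log_syslog_server` with a local host:port address would be used in place of `db_log_location`.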
Like the general settings above, but less commonly used. We only recommend using these settings after talking to pganalyze support.
|api_system_id (||Automatically detected||Overrides the ID of the system, used for uniquely identifying the server with the pganalyze API. This is commonly what's used to refer to a single server, and defaults to the instance ID for managed database providers.|
|api_system_type (||Automatically detected||Overrides the type of the system, used for uniquely identifying the server with the pganalyze API. Must be one of the following: amazon_rds, azure_database, google_cloudsql, self_hosted or heroku.|
|api_system_scope (||Automatically detected||Overrides the scope of the system, used for uniquely identifying the server with the pganalyze API. Can be used for auxiliary identifying characteristics (e.g., region of a server ID that's re-used).|
|api_system_scope_fallback (||[none]||When the pganalyze backend receives a snapshot with a fallback scope set, and there is no server created with the regular scope, it will first search the servers with the fallback scope. If found, that server's scope will be updated to the (new) regular scope. If not found, a new server will be created with the regular scope. The main goal of the fallback scope is to avoid creating a duplicate server when changing the scope value.|
|db_log_docker_tail||[none]||Experimental: name of docker container to collect logs from using |
|disable_citus_schema_stats (||[none]||If using the Citus extension in your database, turn off the collection of statistics for distributed indexes and tables. For very large schemas, this collection can error out due to timeouts or locks. When using this option it's recommended to instead monitor the workers directly for table and index sizes.|
|query_stats_interval (||60||How often to collect query statistics, in seconds; supported values are |
|max_collector_connections (||10||Maximum connections allowed to the database with the collector application_name, in order to protect against accidental connection leaks in the collector|
|http_proxy (||[none]||Proxy to be used for all HTTP connections, such as API calls. Use for proxies that do not support SSL. Example: |
|https_proxy (||[none]||Proxy to be used for all HTTP connections, such as API calls. Use for proxies that support SSL. Example: |
|no_proxy (||[none]||Comma-delimited list of hostnames that should be accessed directly, without using a configured proxy. Has no effect unless either HTTP_PROXY or HTTPS_PROXY is specified.|
|disable_logs (||false||Disable Log Insights data collection|
|disable_activity (||false||Disable activity snapshot data collection (VACUUM, connection traces, etc.)|
|error_callback||[none]||Script to call if snapshot fails (learn more in the collector README)|
|success_callback||[none]||Script to call if snapshot succeeds (learn more in the collector README)|