PII Filtering
Personally Identifiable Information (PII) about your users must be handled very carefully in order to respect your users' privacy and to comply with laws and regulations your organization is subject to. When a third-party tool like pganalyze processes your database statistics and logs, it may involve handling data that includes this PII.
Specifically, pganalyze collects data that may include PII from Postgres logs, the pg_stat_activity view, and the pg_stat_statements view.
Note that this filtering happens in the collector, the open-source agent that connects to your database and runs on your infrastructure. Filtering occurs before the data ever reaches the pganalyze cloud service, or a pganalyze Enterprise Server installation.
In order to strike the right balance between what you share with pganalyze and the impact on monitoring, statistics, and insights available through our service, there are several collector settings that control collection of potentially sensitive information.
The default configuration may cause PII contained in query texts and parameter values to be sent to pganalyze. However, Postgres credentials and log information that could not be classified is redacted.
For servers containing sensitive information in the logs and to maximize data privacy (at the cost of less useful data in pganalyze), we recommend the following configuration:
filter_log_secret: all
filter_query_sample: normalize
filter_query_text: unparsable
Filter settings
Log secrets
Setting: filter_log_secret
/ FILTER_LOG_SECRET
This determines how content coming from logs is filtered:
credential
: Text identified as passwords or other credentials, like private keysparsing_error
: Text logged by Postgres for parsing errors (could contain anything, including credentials)statement_text
: All statement texts (which may contain table data if not using bind parameters)statement_parameter
: Bind parameters for a statement (which may contain table data for statements like INSERT or UPDATE)table_data
: Table data from constraint violations and COPY errorsops
: System errors, network errors, file locations, pg_hba.conf contents, and configured commands (e.g.archive_command
)unidentified
: Text that could not be identified and might contain secrets
Additionally, the special all
and none
setting values apply all or none of the above filters, respectively.
Defaults to credential,parsing_error,unidentified
.
Impact: Filtered content from log lines visible in Log Insights
will show up as [redacted]
.
Query samples / Automated EXPLAIN
Setting: filter_query_sample
/ FILTER_QUERY_SAMPLE
This setting determines how content in query samples (individual executions of a query collected from logs, which include Automated EXPLAIN plans) is filtered:
all
: do not collect any query samplesnormalize
: filter query samples and collected EXPLAIN plans to remove any literal constant values and replace them with placeholdersnone
: do not filter query samples at all
In order to utilize auto_explain
with the normalize
value, make sure to set
auto_explain.log_format
to json
(other formats are currently not supported
by the EXPLAIN normalization).
Defaults to none
.
Impact: Query samples and EXPLAIN plans shown in Query Performance are filtered or normalized.
Query texts
Setting: filter_query_text
/ FILTER_QUERY_TEXT
This setting determines how query text from pg_stat_statements and pg_stat_activity is filtered:
none
: do not filter query text at allunparsable
: redact query texts that could not be parsed by the collector
Defaults to unparsable
.
Impact: Unparseable queries are redacted in Query
Performance and Connections
and may show as <unparsable query>
or <truncated query>
.
Couldn't find what you were looking for or want to talk about something specific?
Start a conversation with us →