2024.04.0 Release
Docker tag: quay.io/pganalyze/enterprise:v2024.04.0
Notes
- This release requires a database migration when coming from older releases, be sure to follow the recommended upgrade steps
- pganalyze Enterprise Server now requires Postgres 12 or newer for the database that is used for storing pganalyze internal data
- If you are running an older Postgres version, upgrade it before deploying this release
Features
- New: pganalyze Index Advisor 3.0
- Read the full details in the announcement blog post
- New: pganalyze VACUUM Simulator
- Read about it in the blog post
- New: OpenTelemetry tracing integration
- Learn how to set it up in the documentation
- Update bundled collector from 0.51.1 to 0.55.0
- See the collector changelog for the full details
- Schema Statistics: Add index scans per minute stats in the Indexes tab on the Schema Statistics page
- These scans per minute stats are calculated from the last 24 hours of data, as sketched below
- You can view the data for index scans per minute over time on the index details page
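These per-minute rates are derived from successive collector snapshots of Postgres' cumulative index usage counters. To inspect the underlying counter directly, a minimal sketch using the standard Postgres statistics views:

```bash
# idx_scan is a cumulative counter; scans/min is the delta between two
# samples divided by the elapsed minutes
psql -c "SELECT schemaname, relname, indexrelname, idx_scan
           FROM pg_stat_user_indexes
          ORDER BY idx_scan DESC
          LIMIT 10;"
```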
- Add n_distinct values from pg_stats to Schema Statistics page
- This is now available on the Columns tab of the table detail page when viewing a specific table. It shows the estimated number of distinct values in a given column, matching the underlying pg_stats view (see the sketch below)
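For comparison, the raw estimate can be read from the `pg_stats` view directly (a sketch; schema and table names are illustrative):

```bash
# Negative n_distinct values are a fraction of the row count
# (-1 means every value is distinct)
psql -c "SELECT attname, n_distinct
           FROM pg_stats
          WHERE schemaname = 'public' AND tablename = 'orders';"
```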
- Update query parsing logic to Postgres 16 parser
- This adds compatibility for queries that use new syntax added in Postgres 16 (for example the new SQL/JSON syntax, illustrated below), when queries are parsed for table references or for syntax highlighting
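For example, queries using the SQL/JSON syntax added in Postgres 16 now parse correctly:

```bash
# New in Postgres 16: SQL/JSON constructors and the IS JSON predicate
psql -c "SELECT JSON_OBJECT('status': 'ok'), JSON_ARRAY(1, 2, 3);"
psql -c "SELECT '{\"a\": 1}' IS JSON OBJECT;"
```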
- EXPLAIN Plans: Improve visualization
- Padding between nodes is reduced so more content fits on screen
- Table and index references now link to their respective pages
- CTE scans now link to their parent node, making it easier to navigate large plans
- Alerts: Show 30 days of history for resolved alerts, up from 7 days
- Schema Statistics: Add extended statistics information in the Columns tab
- Support setting a default role for new members onboarded via SAML
- Previously, "View & Modify (All Servers)" role was assigned to any new members onboarded via SAML and there was no way to change this behavior. With this change, the default role can be set during the integration setup. It also can be updated to any member roles of the organization from the Integrations tab in the Settings page.
- GraphQL API: Add additional filter options for getServers
- This adds the following optional arguments:
- `withRecentData` (boolean, filters for servers that have received data in the last 24 hours)
- `systemType` (string, API system type set by the collector)
- `systemScope` (string, API system scope set by the collector)
- `systemId` (string, API system ID set by the collector)
- These can be used to either locate a specific server based on the API type/scope/id, or to check which servers may have integration issues, by virtue of not having recent data (see the example below)
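A hedged sketch of using these filters — the endpoint path, authentication header, and selected fields here are assumptions, so consult the GraphQL API documentation for your installation:

```bash
# List servers that have received data in the last 24 hours
curl -s https://pganalyze.example.com/graphql \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ getServers(withRecentData: true) { id name } }"}'
```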
- The `web` container command now uses an NGINX reverse proxy, which:
- Will always run on port `5000` for regular HTTP connections, and will proxy to either the Ruby or Rust-based backend services
- Additionally offers SSL support on port `5001`, either through an auto-generated self-signed certificate or one provided via the new `SSL_CERT` and `SSL_KEY` environment variables (see the sketch below)
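As an illustration, a container start that exposes both ports and supplies a certificate might look like this (a minimal sketch; passing PEM contents via the environment is an assumption here, so check the settings documentation for the expected format):

```bash
# Port 5000 serves plain HTTP, port 5001 serves SSL
docker run -d --name pganalyze \
  -p 5000:5000 -p 5001:5001 \
  -e SSL_CERT="$(cat /path/to/server.pem)" \
  -e SSL_KEY="$(cat /path/to/server.key)" \
  quay.io/pganalyze/enterprise:v2024.04.0
```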
- The Snapshot API will now process snapshots directly instead of uploading them to S3/MinIO and queuing them in Redis
- A typical "combined" image installation does not require any change, however you may need to adjust container configurations when running components separately
- The `snapshot_worker` and `worker_all` commands have been removed, since all snapshot processing now happens in the `web` command
Performance
- Schema statistics: Improve performance of loading scans list
- Schema statistics: Speed up loading of per-table list of scans when many scans are present
- Server-wide query performance overview: Group timeseries data if needed
- This avoids performance issues that prevented the page from loading when a server has a lot of databases. For anything beyond the 5th database (as sorted by queries per second, descending) the timeseries graph on top will now group the data as "Other databases"
Other improvements
- Query Performance: Include query tags in header of query detail page
- Add optional support for AWS_ASSUME_ROLE to access AWS S3 and KMS
- When deploying pganalyze Enterprise Server on AWS, it can sometimes be necessary to run the container in a different AWS account than the S3 buckets and KMS key provisioned for it
- In such situations an explicit `sts:AssumeRole` call is needed to allow successful access. To enable this, set the new `AWS_ASSUME_ROLE` setting, pointing to the ARN of the IAM role to be assumed (see the example below)
- Note that this setting will also be used as the default by the bundled collector that runs in the Enterprise image (see settings), but can be overwritten on a per-server basis by changing the AWS-specific server settings in the pganalyze UI
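For example (the account ID and role name are illustrative):

```bash
# The container's own AWS credentials must be allowed to call
# sts:AssumeRole on this role
docker run -d --name pganalyze \
  -e AWS_ASSUME_ROLE="arn:aws:iam::123456789012:role/pganalyze-enterprise" \
  quay.io/pganalyze/enterprise:v2024.04.0
```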
- This release changes the internal mechanism for storing timeseries data, to more efficiently compress data
- This will result in a slight temporary increase in storage used by the pganalyze statistics database, since pganalyze will dual-write in both the old and the new format
- This dual write mechanism will be removed in a subsequent Enterprise release, effectively causing a reduction in storage used once that future release is deployed
- Schema Statistics: Improve formatting of scans on table detail page
- Right-align the values in the Cost and Est. Scans / Min columns, and use a fixed-width font to make them easier to compare
- Resolved issues list: Sort list by resolved at timestamp descending by default
- The previous sort order was ascending. With descending sort order the most recently resolved events correctly show at the top of the list
- EXPLAIN Plans: Add support for parsing Memoize nodes in text-format auto_explain output
- Add support for Postgres 16 in in-app documentation
- When the server is using Postgres 16, the PostgreSQL documentation accessible via the Config Setting page will display the Postgres 16 documentation, instead of the devel (development) version
- Warn when a server is over the table limit
- When a server goes over its plan's table limit, schema processing is disabled, which breaks many pganalyze features. A warning was emitted by the collector, but this situation was not obvious in the app before. See the documentation on how to filter monitored tables to stay within the limit (one approach is sketched below)
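One approach is the collector's `ignore_schema_regexp` setting, which excludes matching tables from monitoring (a sketch; the regexp and config path are illustrative — see the collector documentation):

```bash
# Exclude scratch schemas and partition children from monitoring;
# the regexp matches schema-qualified table names
cat >> /etc/pganalyze-collector.conf <<'EOF'
ignore_schema_regexp = ^(scratch|tmp)\.
EOF
```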
- Query Performance: Warn when pg_stat_statements.track is set to "none"
- This setting stops pg_stat_statements from tracking query stats, and so effectively disables the Query Performance feature. Showing a warning makes this situation easier to discover (see below for how to check and change the setting)
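Checking and fixing the setting uses standard Postgres commands:

```bash
psql -c "SHOW pg_stat_statements.track;"
# "top" (the default) tracks top-level statements; takes effect on reload
psql -c "ALTER SYSTEM SET pg_stat_statements.track = 'top';"
psql -c "SELECT pg_reload_conf();"
```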
- Index Advisor: Keep additional type information in query analysis, to reduce data-type-related errors
- Previously we would often extract "col1 = $1" in the scan as part of query analysis (which looks at the complete query), and then later fail with a type-related error when working with the scan. Improve some cases by instead keeping additional type information, so the scan is represented as "col1 = $1::type" (or similar)
- System: Show cluster information in the Server Info panel for Crunchy Bridge servers
- When the cluster information of Crunchy Bridge servers is passed from the collector, show it in the Server Info panel in the System Overview page
Bugfixes
- Increase density of events shown on Vacuum Timeline for multi-day time selections
- Improve time range selection handling on charts
- If selecting a time range right up against the edge of a chart, it was easy to lose the selection previously, since it would be cleared immediately when the mouse left the chart area.
- Now we give a small grace period before clearing the selection, making it easier to select right up to the available data.
- Index Advisor: Fix column lookup problem caused by referencing the wrong table
- This fixes errors like `column "column_1" does not exist` when there are several tables with the same name and one of them does not have "column_1"
- VACUUM Advisor: Use the oldest available transaction ID assigned-at time when the assigned-at time corresponding to the xmin horizon is not found
- Previously, when the assigned-at time of the xmin horizon was not found within the stored statistics, we did not attempt to calculate it and treated it as data not available. With this fix, we use the oldest available transaction ID assigned-at time whenever the xmin horizon is present but its corresponding assigned-at time is not found
- This helps trigger the VACUUM Blocked by Xmin Horizon check properly in such cases going forward
- Fix loading of query stats when a query cannot be tied to a Postgres role
- Alerts & Check-Up: Make email notification available for High Replication Lag alert
- Query details page: Show IN/VALUES lists with multiple parameters as ($0) instead of (?)
- Timeseries graphs: Fix client-side cache keys to avoid stale data when switching objects
- In certain circumstances, a bug with the client-side caching key caused stale data to be shown on a subsequent request that was getting the same type of time series data with the same start/end timestamps. For example, this might have caused the query overview statistics graph on the Query Performance page to show the data from a previous database or server that was already visited in the same browser session of the same user, when navigating to a different database.
- The correct data would have shown when doing a full page refresh in the browser.
- This bug was limited to showing stale data on the client side, and did not permit any access to data the user did not already have access to.
- Query overview statistics: Show summary entry with runtime as 0.0 if no queries have run
- Previously, in cases where no queries occurred within a given 1 minute time window, the pganalyze UI would indicate that data was missing (with a grey dashed background) in the Query Performance overview. Whilst unlikely to occur on production databases, this situation can easily happen in staging/dev/QA environments.
- In order to avoid confusion, this changes the overview graph to show 0.0 as the median/95th/... runtime in such situations. If there is indeed a data problem (i.e. collector not sending statistics snapshots due to an error) the missing data indicator will be shown.
- Fix test snapshot server URL when DOMAIN_NAME is not set
- Previously, despite pointing the collector to a pganalyze Enterprise Server installation, one might have seen server URLs returned that start with "app.pganalyze.com"
- This occurred only when the DOMAIN_NAME setting was not set. Going forward, this will instead return the hostname used in the request to the test snapshot endpoint as a fallback.
- Exclude Google Cloud SQL setting from "Config - Disabled features" check
- The Alerts & Check-Up system's Config - Disabled features
check warns about disabled planner features. However, for Postgres servers on Google Cloud SQL,
it could also spuriously warn about the
enable_google_stats_collection
setting, since it follows the planner setting naming convention. - The setting has now been excluded from the check and will be ignored.
- The Alerts & Check-Up system's Config - Disabled features
check warns about disabled planner features. However, for Postgres servers on Google Cloud SQL,
it could also spuriously warn about the
- Workers: Allow snapshots up to 500MB size to be processed
- This raises the maximum size limit for snapshots from 200MB to 500MB, which can be necessary for certain use cases involving a lot of tables.
- Improve guidance when a query has no query samples
- Xmin horizon check: Correctly handle servers with stale statistics data
- Previously we may have triggered the Xmin horizon alert incorrectly when a server stopped sending data to pganalyze, since the reference time used for comparison was the time the check was run, not the time data was last received.
- Query analysis: Remove scan information when invalidated due to schema changes
- Previously, the query analysis process that finds the scans in a given query would only add new information into the system (such as an index that got added), but never remove it. This would cause incorrect data to show, for example when indexes get removed, or when other schema information changed so that a scan method no longer applies. Additionally, in situations where the query analysis logic itself changed, this may have caused duplicate scans to show on the same query
- This fix will be applied when each query is re-analyzed. To accelerate re-analysis, for example when seeing incorrect data, please reach out to support to request a given set of queries to be re-analyzed right away.
- Query Analysis: Include outdated index scan information in output, and flag it as `Index Scan using <unknown index>`
- Sometimes, due to bugs in the processing logic or other problems, the scan data on a given table or query can become stale. In such situations the lookup of which index a particular scan refers to may fail. Previously such scans were ignored in parts of the system, but not in others. For consistency, and to flag such problems more clearly, we now always show such scans, but use the special scan method `Index Scan using <unknown index>` to make it apparent the data is stale
- To fix this, query analysis needs to be re-run on affected queries, which will re-analyze the indexes that are available for use by a given query, using current schema information. This usually resolves itself, but if it doesn't, please reach out to us for instructions on how to resolve it
- Improve fetching log lines associated with a backend
- The Backend Details view (available from the Connections page) could sometimes include associated log lines in the log line count, but inadvertently exclude them when fetching the actual log lines. This fixes that discrepancy
- Improve handling of unique index constraints in UI
- In some situations, unique indexes in Postgres do not get entries in the `pg_constraint` catalog table that normally tracks constraints (see the illustration below). Since `pg_constraint` information was used to display information about constraints in pganalyze, this could lead to some indexes not being flagged as unique in the per-table index listing (they would still appear correctly on the individual index detail page) and their constraints being omitted in the per-table constraint listing
- Update the logic to always treat unique indexes as constraints
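The underlying Postgres behavior can be reproduced directly — a unique index created with CREATE UNIQUE INDEX gets no `pg_constraint` row, unlike one created via a constraint (table, index, and column names are illustrative):

```bash
psql <<'SQL'
CREATE TABLE t (a int, b int);
CREATE UNIQUE INDEX t_a_idx ON t (a);             -- no pg_constraint entry
ALTER TABLE t ADD CONSTRAINT t_b_key UNIQUE (b);  -- creates a pg_constraint entry
SELECT conname FROM pg_constraint WHERE conrelid = 't'::regclass;
SQL
```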
- Query Details: Add a hint when a query is truncated
- Index Advisor: Correctly use scans from tables with camelCase names or special characters
- Previously Index Advisor was not able to correctly load scans if the table name contained uppercase letters, special characters or other cases where additional quoting of the table name was required. This may have resulted in no index recommendations showing, even if there were missing indexes
- Update expandable panels to open when clicking on the title
- Some panels within pganalyze are collapsed by default, and include a caret in the title bar to expand them. These panels can now also be expanded by clicking on their title
- Query analysis: Fix errors loading system functions like "make_interval" on PG 14+
- Query analysis: Add support for "Tid Scan" node types
- When a query references the internal "ctid" column, query analysis would previously fail, since we did not correctly recognize the special "Tid Scan" node, which indicates that Postgres does a lookup based on the given internal tuple (row version) location, as illustrated below
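For example, this standard Postgres query produces a Tid Scan (the table name is illustrative):

```bash
# Looks up a row version by its physical location (block 0, item 1)
psql -c "EXPLAIN SELECT * FROM some_table WHERE ctid = '(0,1)';"
```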
- Schema Statistics: Show correct scan method for queries with set returning functions in the SELECT targetlist
- Previously queries that contained a set returning function (SRF) in the target list did not get processed correctly, and caused the Scans list on the Queries tab of the individual per-table Schema Statistics pages to error out
- Index Advisor: Correctly handle scans involving OR conditions
- In some cases Index Advisor incorrectly assumed an expression like "a AND (b OR c)" was not able to use a simple Index Scan that supports "a", even though "a" could be indexed in such cases
- Alerts & Check-Up: Fix layout bug in organization-wide check config listing
- Index Advisor: Fix handling of JOIN conditions when a parameterized index scan is unlikely
- Previously Index Advisor would treat a JOIN condition as being able to use an index, even if a parameterized index scan was unlikely. With this bug fix in place, it's less likely that indexes will be recommended on JOIN conditions, as they will only be considered when the generic query plan (that the pganalyze Indexing Engine calculates) contains a Nested Loop with the relevant table on the inner side, allowing a parameterized index scan
- System page: Fix missing CPU Utilization data for Amazon RDS when enhanced monitoring is off
- Onboarding: Fix error when initially loading new server from end of setup instructions
Security
- Update Ruby version for Enterprise Server to 3.2.3
- Update to Ubuntu 22.04 base image for Enterprise Server
- Require Postgres 12+ for the internal statistics database used by pganalyze Enterprise Server
- Require Redis 6.0+ when using an external Redis server specified via `REDIS_URL` (example below)
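A typical `REDIS_URL` value uses the standard Redis URL scheme (host, port, and database number are illustrative):

```bash
export REDIS_URL="redis://redis.internal:6379/0"
```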
- Routine security updates to packages in the base image and library dependencies