Customer Story:
How Notion Runs PostgreSQL at Scale on Amazon RDS with pganalyze
Notion, the popular productivity tool, has grown tremendously since its founding in 2015. By 2021, a viral TikTok trend had led to exponential user growth, and with it significant challenges in scaling its Postgres-based database infrastructure. To address these growing pains, Notion adopted pganalyze to keep its PostgreSQL databases performing well while maintaining reliability and availability.
The challenge
From 2015 to 2020, Notion relied on a single large PostgreSQL database hosted on Amazon RDS, which served them well during the early stages of product development. However, as user adoption surged, this setup began to show cracks. Challenges included outages, inefficiencies in vacuum processes, and scaling limitations that necessitated major infrastructure overhauls.
In early 2021, Notion began its journey to address these issues. This included sharding its core database and seeking tools to provide deeper visibility into PostgreSQL internals. After a recommendation from PGExperts, a leading PostgreSQL consultancy, Notion adopted pganalyze in March 2021 as a critical solution for managing database performance.
The impact of pganalyze on Notion’s database operations
With pganalyze, Notion engineers gain visibility into issues that were previously opaque, such as unused indexes contributing to table bloat and inefficient queries consuming excessive CPU and I/O.
A particularly impactful improvement involved refining a GIN index on the space_id and permission columns, used for JSON filtering with the contains operator (@>). On Notion’s development database, the index was not being used due to poor cardinality and statistics. With the help of pganalyze, it was determined that switching to a jsonb_path_ops index would better serve the query patterns for larger workspaces.
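As a rough illustration of what such a change can look like, the sketch below builds the DDL for a GIN index that uses the jsonb_path_ops operator class, along with a containment query it could serve. The source names only the columns, so the table name, index name, and the choice of a single-column index on permission are all hypothetical, and Notion's actual index definition may differ. In PostgreSQL, jsonb_path_ops supports the @> operator but not the key-existence operators, and it is usually smaller than a default jsonb GIN index.

```python
import re

# Conservative whitelist for SQL identifiers, since they are interpolated
# directly into the statement text.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check_identifiers(*names: str) -> None:
    """Reject anything that is not a plain lowercase SQL identifier."""
    for name in names:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")


def jsonb_path_ops_index_ddl(table: str, column: str, index_name: str) -> str:
    """Build DDL for a GIN index using the jsonb_path_ops operator class.

    CONCURRENTLY avoids blocking writes while the index is built.
    """
    _check_identifiers(table, column, index_name)
    return (
        f"CREATE INDEX CONCURRENTLY {index_name} "
        f"ON {table} USING gin ({column} jsonb_path_ops);"
    )


def containment_query(table: str, column: str) -> str:
    """Build a query filtering with the jsonb containment operator (@>).

    The %s placeholder follows the psycopg parameter style (an assumption);
    the JSON value is passed separately by the database driver.
    """
    _check_identifiers(table, column)
    return f"SELECT * FROM {table} WHERE {column} @> %s::jsonb;"


# Hypothetical names, for illustration only.
ddl = jsonb_path_ops_index_ddl(
    "permissions", "permission", "permissions_permission_path_idx"
)
query = containment_query("permissions", "permission")
```

Running EXPLAIN on the resulting query before and after creating the index is one way to confirm whether the planner actually picks it up.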
Implementing this change resulted in a 733% performance improvement: query runtime dropped from approximately 5000 ms to 600 ms, making the query more than 8x faster. Comparisons in staging showed no mismatches, paving the way for a seamless deployment to production.
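The two figures describe the same measurement in different ways, which a short calculation makes explicit. This is a minimal sketch using the approximate runtimes quoted above; the function names are illustrative only.

```python
def speedup_factor(before_ms: float, after_ms: float) -> float:
    """How many times faster the query runs after the change."""
    return before_ms / after_ms


def improvement_percent(before_ms: float, after_ms: float) -> float:
    """Throughput-style improvement: extra speed relative to the new runtime."""
    return (before_ms - after_ms) / after_ms * 100


def runtime_reduction_percent(before_ms: float, after_ms: float) -> float:
    """Share of the original runtime that was eliminated."""
    return (before_ms - after_ms) / before_ms * 100


before_ms, after_ms = 5000, 600  # approximate runtimes from the text

factor = speedup_factor(before_ms, after_ms)                 # about 8.33
improvement = improvement_percent(before_ms, after_ms)       # about 733
reduction = runtime_reduction_percent(before_ms, after_ms)   # 88
```

In other words, a query that is about 8.3x faster corresponds to a 733% improvement, or an 88% reduction in runtime.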
On another occasion, Ben Hughes, a Software Engineer on the infrastructure team, encountered a performance issue while a new feature was in pre-production testing. The database was becoming overloaded, causing pages in the app to load slowly or crash entirely. Initial investigation revealed that the new feature, in its state at the time, was responsible for 20% of the application’s query load.
Adoption across engineering teams
Initially adopted by the infrastructure team, pganalyze provided much-needed insights into database performance bottlenecks. Over time, its use expanded across engineering teams, empowering application developers to identify and resolve database issues independently. This broad adoption was facilitated by integrating pganalyze with Okta, allowing seamless access and fostering collaboration.
Lessons learned and future plans
Notion’s experience highlights the importance of investing in database visibility and proactive monitoring tools at the right time. By equipping engineers with pganalyze, the company reduced its reliance on infrastructure specialists while enabling faster incident resolution.
Conclusion
Notion’s partnership with pganalyze has been pivotal in its ability to scale PostgreSQL effectively. By providing actionable insights and empowering teams across the organization, pganalyze has helped Notion maintain a seamless user experience during a period of exponential growth.