5mins of Postgres E28: Row Level Security in Postgres, security invoker views and why LEAKPROOF functions matter

Today, we're going to talk about Postgres Row Level Security, the BYPASSRLS attribute, security invoker views, and Postgres performance problems when using LEAKPROOF functions and Row Level Security policies.



Share this episode: Click here to share this episode on twitter, or sign up for our newsletter and check out the newsletter archive and subscribe to our YouTube channel.


Transcript

Let's jump in.

Using Postgres Row Level Security for multi-tenant databases

In this blog post by Miki from the Nile team, Miki describes how to use RLS for a multi-tenant database. Multi-tenant databases are databases where you have different customers, aka tenants, and they each own a portion of the database. There are other approaches that you can use, for example you could use a schema per tenant or database per tenant approach.

Miki describes how to use a single database with multiple tenants. We can look at an example of how RLS works with these three tables here. We have a users table, an orgs table and an org_members table. We want to limit access to the orgs table so that only users who are actually members of the organization can see it.

First, we'll enable the ROW LEVEL SECURITY flag on the orgs table. Then, we'll create a function that determines the current user that is logged in. In Postgres, you can define arbitrary settings on the database that are kept local for the connection. So here, we use this app namespace, and then we use the current_app_user setting.

You just do a SET command on the connection, and then this is read using this current_app_user function. We then create a policy on the orgs table that says, if the SELECT from the org_members table matches the current org ID and then also the user that we're currently logged in as is a member of this organization then we return the org.

In this case, you only have a single database user, but then you have the setting that's used to control which user is actually logged into your application. When you authenticate a customer in the application, then you would do the SET command, SET app.current_app_user to the current user, and that tells the database who's currently authorized.

Row Level Security in Postgres and the BYPASSRLS attribute

RLS is pretty cool. The thing to know about RLS that's mentioned here is that when you have a user that has a BYPASSRLS attribute, then the RLS attributes don't apply. Which makes sense, cause you might want to be able to query the database or dump and restore the database for internal processes. Superusers, by default, have this BYPASSRLS attribute applied, which means if you're testing things with superuser, don't be surprised if RLS doesn't work.

Using Row Level Security in Postgres together with a view

The other thing that was mentioned in regards to security in the Hacker News discussion about Miki's article was something that I actually had forgotten that was a problem, which is when you use RLS together with a view.

If you define a view in Postgres, a view is actually defined as what's called a security definer. That means when you create a view, the user who creates the view defines the permissions for which table can be accessed or which part of the table can be loaded. If you create a view and then you query it with a user that you would expect to have access to just a subset of the view, that doesn't actually apply. Views, by default, are loading the whole data, not taking any RLS policies into account.

Views by default are loading the whole data, not taking any RLS policies into account.

Security invoker views in Postgres 15

Postgres 15 actually improves on this. Postgres 15 adds what's called a security invoker view. I can now say CREATE VIEW security_invoker = true. That means when the view is queried, then it would actually apply everything as if you were the user just querying those tables directly. For RLS that's in most cases the better choice.

There is a way to do this in older Postgres versions as well. You can use what's described in this blog post by Ben Burwell here, where you essentially define a function. This is a bit obscure, but it does apply RLS restrictions correctly.

Postgres performance problems when using LEAKPROOF functions and Row Level security policies

The other thing I want to mention is when to not use RLS. Somebody mentioned in this thread on Hacker News they used RLS and it can cause performance headaches. Even for the simplest of queries! RLS had some historic issues with complex queries and they've been improved, the planner has better logic these days.

RLS can cause performance headaches, Even for the simplest of queries.

But there is still a current problem, and it's around when functions are LEAKPROOF. When Postgres works with a row level security policy in place and it plans your query, it has to decide when to execute functions and expressions. First, you have to filter down the result set and then you can apply functions that may leak data from other rows. The function might be looking at not just the row it's actually matching, but other rows as well. When something is not marked as LEAKPROOF it means that it may leak data from outside of the actual row you are returning.

LEAKPROOF helps the optimizer, the planner, choose to apply functions ahead of the row level security check. For performance reasons, that's actually very important.

Here's a Stack Overflow post from somebody two years ago, who said "Postgres is ignoring my index". The reason was they were using an operator that was not LEAKPROOF. Operators underneath the hood are functions. ILIKE wasn't using the index when used inside RLS. They actually changed the existing function in Postgres to apply the LEAKPROOF-ness attribute. You may choose to do that.

There is actually an interesting discussion that occurred roughly around the same time, a couple of years ago, where they were discussing, upstream, which internal functions in Postgres are marked LEAKPROOF and which are not. The conclusion was mostly that they didn't really change, which is unfortunate, because sometimes you can't actually change the LEAKPROOF attribute, for example, when you're using RDS or other managed database services.

RLS can be very useful, but it does have performance implications, even for simple queries.

That's it for today's episode. Thank you so much for listening. This was 5mins of Postgres. Subscribe to our YouTube channel or follow us on Twitter to hear about next week's episodes. Talk to you next week!

What we have discussed in this episode of 5mins of Postgres


Enjoy blog posts like this?

Get them once a month to your inbox