A practical introduction to Constraint Programming using CP-SAT and Python
Imagine you're an e-commerce giant that would like to build a new warehouse to improve service to your customers, but you need to know what is the best location for it. Or you're a global shipping company that assigns packages to their delivery trucks and has to choose the best routes in order to save gas and reduce driver overtime. Or an airline that is looking to offer service to a new location, and needs to know which types of planes they should use and on what schedule, to maximize the resulting revenue.
These kinds of problems mentioned above are known as discrete optimization problems. There exist several methods that can be used to tackle such problems. In this article, we will discuss the theory and practice for one of them, called constraint programming.
This is the first part in a two part series on constraint programming, used inside pganalyze Indexing Engine and pganalyze Index Advisor. We're sharing this knowledge to help the wider programming community become familiar with how to utilize solvers like CP-SAT in practice.
A declarative paradigm
Constraint programming (CP) is a declarative paradigm used to solve discrete optimization problems. This contrasts with the imperative paradigm that we are generally used to. When programming imperatively, we describe the steps necessary to reach a result. For example, suppose that we want to know who the adults are from a given list of people:
Name | Age |
---|---|
Phil | 20 |
Emma | 17 |
David | 11 |
Thomas | 51 |
Sarah | 45 |
Rebecca | 6 |
A typical imperative approach would explain the sequence of operations required to get the desired result:
adult_people = []
for person in people:
if person.Age >= 18:
adult_people += person.Name
Phil
Thomas
Sarah
Meanwhile, the same result can be described declaratively as:
SELECT person_name FROM people WHERE age >= 18;
Phil
Thomas
Sarah
Both approaches are equivalent in their outcome, but the two processes are different. In the imperative case, the program follows each step in sequence in order to reach the result. In the declarative case, the program is given a description of the desired result (using the available constructs of the language) and gets there by itself.
The basics of constraint programming (CP)
Similarly to the declarative example mentioned above, with CP we describe the desired result to a problem. This description is called a model. The main components of a model are variables and constraints. Variables represent what we are looking for, and each variable has an associated domain which is the set of values that this variable is allowed to take. Constraints describe relationships between variables.
A solution is an assignment of values to the variables (from their domains) such that the constraints are satisfied. Let's jump straight in with a simple example:
Alice, Bob, and Carol each have $20, and they want to pool their money to purchase a candy bar worth $50 (yes, inflation is running wild). Alice has stated that she will put in at least as much money as Bob. Carol only has $5 bills, so her contribution will be a multiple of that. None of them want to contribute the exact same amount as any other.
How much should each of them chip in?
Here, we are looking for the amount of money each person should contribute to the purchase of the candy bar. This means that we need one variable per person, indicating the amount of money that this person should contribute toward the purchase (a
for Alice, b
for Bob, and c
for Carol). We start by setting the domains of these variables (the symbol ∈
means "in"):
a ∈ {0, ..., 20}
b ∈ {0, ..., 20}
c ∈ {0, ..., 20}
These domains ensure that the final values of the variables represent amounts of money that the contributors actually have in their pockets (that is, a maximum of $20). Next, we want to make sure that the combined contributions are enough to cover the price of the candy bar, so we add the constraint:
a + b + c == 50
Alice will contribute at least as much as Bob, so we translate this into:
a >= b
Carol's contribution must be a multiple of 5:
c % 5 == 0
Finally, the amount of each of their contributions must be unique. We could model this using the following constraints:
a != b
a != c
b != c
We only have a handful of people in this example so these disequalities would work well. But what if they were hundreds, or thousands? It turns out that CP has a rich catalogue of expressive constraints that can encapsulate complex concepts, called global constraints. An alternative to the above would be to use the so-called alldifferent
constraint, which ensures that a set of variables are all assigned different values:
alldifferent(a, b, c)
This completes the model of this problem. You will note that we have not assigned any values to a
, b
, or c
ourselves. We have simply defined three variables and their domains, and described the properties of the problem using constraints on those variables. Our job is done.
The piece of software that interprets this model and returns a solution is called a solver. The inner workings of a solver are outside the scope of this article, so for our purposes we will consider the solver as a black box that takes a model as input, and returns a valid solution:
a = 19
b = 11
c = 20
The solution returned is valid as each variable takes a value from its domain and all the constraints are satisfied. However, we see that Carol contributes almost twice Bob's amount. Perhaps there exists another valid solution where all parties contribute more equally to the purchase?
We can add an objective to our CP model to try to reach this goal. Adding an objective to a model allows us to minimize or maximize an expression, without compromising the validity of the resulting solution (with respect to the constraints). If we can somehow minimize the amount of money spent by the largest contributor, this should push the three contributions closer together. Our objective will then be to find a valid solution where this value is minimal:
x ∈ {0, ..., 20}
maximum(x, [a, b, c])
minimize: x
To achieve this, we create a new variable x
representing the amount of the largest contribution. The maximum
constraint takes care of assigning to x
the largest value from [a, b, c]
. The objective is then to minimize x
. The solution returned by the solver is now:
a = 18
b = 17
c = 15
x = 18
Previously there was a $9 difference between the largest and smallest contributions. With the objective we introduced this has now been reduced to $3, and this is as fair as this is going to get. Now that the basic concepts are cleared up, let's move on to a more challenging problem.
A practical example with Python and CP-SAT
Let's use this new CP knowledge to solve a more complex real-world example: the scheduling of employees for a small business.
A store owner wishes to create the weekly work schedule for its employees. The store is open from 8AM to 8PM every day, and each day is divided into three shifts of 4 hours: morning, afternoon, and evening. There are two roles in the store: cashier and restocker.
Some employees are qualified to do either role, but others can only be a cashier, or a restocker.
There has to be a cashier scheduled at all times, but restocking only takes about 4 hours every day. Hence, for the restocking task we only need to schedule an employee for a single shift every day. This can be any shift, but two restocking shifts cannot be scheduled one after the other. If a restocking is scheduled on the evening shift on Tuesday, for example, we cannot schedule the Wednesday restocking on the morning shift.
An employee that is qualified in both roles can still only be assigned to one role per shift.
Employees cannot work more than 8 hours per day, which is 2 shifts. If they do work 2 shifts in a day, we must ensure that there is no idle time between these shifts—for example, we can't schedule them on both the morning and the evening shifts of the same day, as they would be idle for 4 hours during the afternoon shift.
This is the basic premise of the problem. Let's break this down into manageable parts.
An empty model
We first start by creating an empty model using CP-SAT, an open-source CP solver developed by Google as part of its OR-Tools project.
from ortools.sat.python import cp_model
model = cp_model.CpModel()
The data
A store owner wishes to create the weekly work schedule for its employees. The store is open from 8AM to 8PM every day, and each day is divided into three shifts of 4 hours: morning, afternoon, and evening. There are two roles in the store: cashier and restocker. Some employees are qualified to do either role, but others can only be a cashier, or a restocker.
Let's create a list of employees and the roles they are qualified for:
employees = {"Phil": ["Restocker"],
"Emma": ["Cashier", "Restocker"],
"David": ["Cashier", "Restocker"],
"Rebecca": ["Cashier"]}
The schedule is said to span a week, and we are told that there are three types of shifts, and two types of roles:
days = ["Monday",
"Tuesday",
"Wednesday",
"Thursday",
"Friday",
"Saturday",
"Sunday"]
shifts = ["Morning",
"Afternoon",
"Evening"]
roles = ["Cashier",
"Restocker"]
The variables
Now, let's define what we are looking for. To describe the schedule, we need to refer to employees, roles, days, and shifts: Does Emma work as a restocker on the Monday evening shift? This can be achieved using boolean variables. A boolean variable is a variable with a domain of {0, 1}
.
schedule = {e:
{r:
{d:
{s: model.new_bool_var(f"schedule_{e}_{r}_{d}_{s}")
for s in shifts}
for d in days}
for r in roles}
for e in employees}
The function model.new_bool_var()
creates and then returns a boolean variable, which we store in schedule
. Within this structure, schedule["Emma"]["Restocker"]["Monday"]["Evening"]
refers to one of those boolean variables. This variable is equal to 1
if Emma does work as a restocker on the Monday evening shift, or to 0
if she doesn't.
Our schedule
variables are currently unconstrained. By following the problem description presented earlier and constraining these variables accordingly, we should be able to get a schedule that satisfies the requirements of the store owner.
The constraints
Let's go through the rest of the problem description to correctly constrain the schedule
variables. To add a new constraint to the model, we simply use model.add(...)
.
There has to be a cashier scheduled at all times.
Since the schedule is comprised of boolean variables, it's easy to see how we can put limits on subsets of these variables by summing them and enforcing constraints on these sums:
for d in days:
for s in shifts:
model.add(sum(schedule[e]["Cashier"][d][s] for e in employees) == 1)
If we need a cashier at all times, this means that for every day-shift pair, the sum of employees assigned the "Cashier"
role has to be equal to 1
.
For the restocking task we only need to schedule an employee for a single shift every day.
Similarly to the previous constraint, we now want the sum of all employees assigned the "Restocker"
role for all shifts of a given day to be equal to 1
:
for d in days:
model.add(sum(schedule[e]["Restocker"][d][s] for e in employees for s in shifts) == 1)
This [restocking] can be any shift, but two restocking shifts cannot be scheduled one after the other.
Because of the previous constraint, we already know that two restocking shifts cannot take place on the same day. The only way two restocking shifts could be scheduled one after the other would be to assign one on the evening shift of a day, and another one on the morning shift of the next day. By enforcing that the sum of restocking shifts for each evening-morning pair is not greater than 1
, we ensure that at most one restocking shift is scheduled for those pairs:
for i in range(len(days)-1):
model.add(sum(schedule[e]["Restocker"][days[i]]["Evening"] + schedule[e]["Restocker"][days[i+1]]["Morning"] for e in employees) <= 1)
An employee that is qualified in both roles can still only be assigned to one role per shift.
For every employee, the sum of all assigned roles for all day-shift pairs is either going to be 1
(they work a single role on this day-shift slot), or 0
(they don't work on that day-shift slot):
for e in employees:
for d in days:
for s in shifts:
model.add(sum(schedule[e][r][d][s] for r in roles) <= 1)
Some employees are qualified to do either role, but others can only be a cashier, or a restocker.
To prevent an employee from being assigned a role that they are not qualified for, we simply match the value of that role to 0 (or put differently, we're adding a constraint asserting that it's zero) everywhere for that employee:
for e in employees:
for r in roles:
for d in days:
for s in shifts:
if r not in employees[e]:
model.add(schedule[e][r][d][s] == 0)
Employees cannot work more than 8 hours per day, which is 2 shifts. If they do work 2 shifts in a day, we must ensure that there is no idle time between these shifts—in other words, we can't schedule them on both the morning and the evening shifts of the same day, as they would be idle for 4 hours during the afternoon shift.
It turns out that a single constraint can take care of both of these requirements: An employee can work either the morning shift, or the evening shift, or neither of these shifts:
for e in employees:
for d in days:
model.add(sum(schedule[e][r][d]["Morning"] + schedule[e][r][d]["Evening"] for r in roles) <= 1)
Note that the above constraint does not need to specify anything about the afternoon shift. If the employee works in the morning, they can't work in the evening, and vice-versa. This both ensures that the employee works a maximum of two shifts per day, and also that there is no idle time, since there can only be idle time if the employee works both the morning and the evening shift.
The modeling of the problem is now complete. Let's see what the results are.
Solving the model
To solve the model, we simply call a solver, with the model as an argument:
solver = cp_model.CpSolver()
solver.solve(model)
After the solving process, we can get the resulting values of the schedule
variables with solver.value(...)
. We have organized these values in the schedule shown below:
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | R | | | R | | | R | | | R | | | R | | | R | | | R | | 7 |
Emma | | | | C | | | C | | | C | | | C | | | C | | | C | | | 6 |
David | C | | | | | | | | | | | | | | | | | | | | | 1 |
Rebecca | | C | C | | C | C | | C | C | | C | C | | C | C | | C | C | | C | C | 14 |
The resulting schedule satisfies the constraints of the problem. Now let's think of additional real-world constraints that would make this more interesting.
Adding more constraints
We see that Rebecca works 14 shifts for that week. The store owner is not keen on paying overtime wages, so they wish to cap each employee's schedule to a maximum of 40 hours per week (10 work shifts):
for e in employees:
model.add(sum(schedule[e][r][d][s] for r in roles for d in days for s in shifts) <= 10)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | R | | | R | | | R | | | R | | | R | | | R | | | R | | 7 |
Emma | | C | C | | C | C | C | | | C | | | C | | | C | | | C | | | 9 |
David | C | | | C | | | | | | | | | | | | | | | | | | 2 |
Rebecca | | | | | | | | C | C | | C | C | | C | C | | C | C | | C | C | 10 |
Phil is a full-time student and would like to work exactly 4 shifts per week. He also cannot work the morning and afternoon shifts during the week, in order to attend his classes.
model.add(sum(schedule["Phil"][r][d][s] for r in roles for d in days for s in shifts) == 4)
model.add(sum(schedule["Phil"][r][d][s] for r in roles for d in days if d not in ["Saturday", "Sunday"] for s in shifts if s in ["Morning", "Afternoon"]) == 0)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | | | | | R | | | R | | | R | | | | | R | | | | | 4 |
Emma | C | R | | | | C | C | | | C | | | C | R | | C | | | C | R | | 10 |
David | | C | C | C | C | | | | | | | | | | | | | | | | | 4 |
Rebecca | | | | | | | | C | C | | C | C | | C | C | | C | C | | C | C | 10 |
Phil and Emma do not get along very well, and as such we don't want them working the same shifts.
for d in days:
for s in shifts:
model.add(sum(schedule[e][r][d][s] for e in ["Phil", "Emma"] for r in roles) <= 1)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | | R | | | R | | | R | | | R | | | | | | | | | | 4 |
Emma | C | | | C | C | | C | C | | C | C | | C | C | | | C | | | | | 10 |
David | | C | C | | | C | | | C | | | C | | R | | | R | C | C | R | | 10 |
Rebecca | | | | | | | | | | | | | | | C | C | | | | C | C | 4 |
No employee really likes to work on the weekend, so we would like to distribute these shifts equally between all employees. There are 8 such shifts (3 cashier shifts and 1 restocker shift on each day), and since we have 4 employees, we can give them all 2 shifts each for the weekend.
for e in employees:
model.add(sum(schedule[e][r][d][s] for r in roles for d in ["Saturday", "Sunday"] for s in shifts) == 2)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | | | | | R | | | R | | | | | | | R | | | R | | | 4 |
Emma | R | C | | C | C | | C | | | | | C | | C | C | | | | | C | C | 10 |
David | C | | | | | C | | C | C | C | R | | C | R | | | | C | C | | | 10 |
Rebecca | | | C | | | | | | | | C | | | | | C | C | | | | | 4 |
Emma would like to take Monday to Friday off, and asks us if this is possible. Let's see:
model.add(sum(schedule["Emma"][r][d][s] for r in roles for d in ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"] for s in shifts) == 0)
Status: INFEASIBLE
The solver returns a status of "infeasible". What does this mean?
Interlude: The solver status
The solver takes as input a model, and returns a status and a solution. Consider this simple case:
x ∈ {0, ..., 10}
y ∈ {0, ..., 10}
x + y >= 5
minimize: x + y
Status: OPTIMAL
x = 5
y = 0
The variables x
and y
both have the domain {0, ..., 10}
. The sum of the values assigned to them must be 5 or greater, and the objective is to minimize this sum. The solution (x, y) = (5, 0)
is called optimal. A solution is optimal when there does not exist another solution that is strictly better than it. For instance, the solution (x, y) = (3, 2)
would also be optimal.
Now consider the case:
x ∈ {0, ..., 10}
x >= 15
Status: INFEASIBLE
An infeasible status means that there exists no assignment of values to the variables such that the constraints are satisfied. In the above example, it is not possible to assign a value to x
that is at least as large as 15, since the largest value in its domain is 10.
There are also other possibilities. Let's take a look at this last example:
x ∈ {0, ..., 10}
y ∈ {0, ..., 10}
... // Many more variables
x + y <= 12
x >= 5
... // Many more constraints
minimize: x - 3*y + ... // Very complex objective
Sometimes, a problem is so large and complex, and takes so much time to solve, that we must interrupt the solver due to time constraints. In such a case, the solver will return one of two statuses:
- If a solution has been found, the status returned is feasible. A feasible solution means that the solution satisfies the constraints, but the solver does not know if that solution is optimal. In other words, we have a solution that works, but there may still exist a better one.
- If no solution has been found, the status returned is unknown. This means that while the solver has not found a solution, it does not know whether one exists (feasible) or if the problem has no solution (infeasible).
Let's get back to our schedule.
"Sorry, Emma"
With the infeasible status returned to us previously, we are forced to inform Emma that she can't take the whole week off, otherwise it would be impossible to fill the schedule without violating some of the other constraints. She decides to only take Monday to Wednesday off instead:
model.add(sum(schedule["Emma"][r][d][s] for r in roles for d in ["Monday", "Tuesday", "Wednesday"] for s in shifts) == 0)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | | R | | | | | | R | | | | | | | R | | | R | | | 4 |
Emma | | | | | | | | | | | R | C | | R | C | | | | | C | C | 6 |
David | C | | | C | R | | C | | | C | C | | C | C | | | | C | C | | | 10 |
Rebecca | | C | C | | C | C | | C | C | | | | | | | C | C | | | | | 8 |
Phil works exactly 4 shifts as he wants, but the other shifts are not very well distributed: Emma gets 6, David gets 10, and Rebecca gets 8. Let's see if we can improve this by adding an objective.
Objective: Distributing the shifts more evenly
We would like to distribute the shifts as fairly as possible between Emma, David, and Rebecca. We will recall that an objective allows us to minimize the value of an expression. We could, for example, minimize the shift difference between the employee who is assigned the fewest shifts, and the one who is assigned the most. Currently, Emma is assigned 6 shifts to David's 10, so this difference stands at 4 shifts.
We start by creating integer variables to track the number of shifts assigned to each employee, and enforcing the values of these variables:
# total_shifts[e] indicates the number of shifts worked by employee `e`
total_shifts = {e: model.new_int_var(0, 10, f"total_shifts_{e}")
for e in employees}
for e in employees:
model.add(total_shifts[e] == sum(schedule[e][r][d][s] for r in roles for d in days for s in shifts))
Integer variables are created with model.new_int_var(...)
, and the parameters allow us to specify the lower and upper bounds of the domain (in this case, the domain is {0, ..., 10}
). We then create variables to track the highest and lowest number of shifts assigned to any employee (with the exception of Phil, who works part-time):
min_shifts = model.new_int_var(0, 10, "min_shifts")
model.add_min_equality(min_shifts, [total_shifts[e] for e in employees if e != "Phil"])
max_shifts = model.new_int_var(0, 10, "max_shifts")
model.add_max_equality(max_shifts, [total_shifts[e] for e in employees if e != "Phil"])
The constraints model.add_min_equality(...)
and model.add_max_equality(...)
work in the same fashion as the maximum(...)
constraint presented in the candy bar example (with Alice, Bob, and Carol). For example, model.add_min_equality(var, list)
assigns to integer variable var
the smallest value among the integer variables in list
.
Finally, we minimize the difference between max_shifts
and min_shifts
:
model.minimize(max_shifts - min_shifts)
| Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday | Total |
| M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | M | A | E | |
Phil | | | R | | | | | | R | | | | | | | R | | | R | | | 4 |
Emma | | | | | | | | | | | R | C | | R | C | | | | | C | C | 6 |
David | C | | | C | R | | C | | | | C | | C | C | | | | C | C | | | 9 |
Rebecca | | C | C | | C | C | | C | C | C | | | | | | C | C | | | | | 9 |
This looks pretty good. David and Rebecca both get 9 shifts. Emma only gets 6, but that is understandable as she is taking 3 days off that week. Phil gets exactly 4 shifts, as he is supposed to. Hopefully, everyone will be happy with this schedule.
Concluding remarks
We have introduced the basics of constraint programming and discussed its main components. We have built a model to generate work schedules that can take into account the types of constraints that one could reasonably expect to be faced with in the real world.
Thanks to this model, we could easily see that it would not be possible for Emma to take five days off as she initially wanted, and we were able to distribute the work shifts as fairly as possible between the employees. In the end, we created a schedule that satisfied both the requiremends of the store owner, as well as the needs of the employees.
In the next article, we will discuss how to use constraint programming for index selection in Postgres.
The code presented in this article can be found on the pganalyze GitHub.