from scipy.optimize import linprogIn this post, I will solve a simple linear programming (LP) problem using three Python packages: Google’s ORTools, PuLP and SciPy.
This is the first time I am trying optimization in Python. My previous experience with solver software has been with some commercial solvers or MS Excel while I was in college, and using supply chain modeling software at work. For my first try, I am being guided by the posts of Mirko Stojiljković from Real Python, and Maximme Labonne from Towards Data Science.
Since I will be focusing on the packages, I won’t be explaining Linear or Mathematical Programming in this post, and will go straight to the problem, the mathematical formulation, and then the building and optimizing of the model using the three packages mentioned. Mirko’s and Maximme’s posts give good overviews, but there are a large number of material online and offline covering linear programming and operations research.
A Sample Problem
I will be using a single problem to try the three packages. I am taking a review problem from one of my college textbooks. (Winston, Wayne & Venkataraman. 2003. Introduction to Mathematical Programming. Operations Research: Volume One. Fourth Edition. Brooks/Cole. Thomson Learning) The chosen problem will be a pure linear programming problem so we will not focus on any integer or binary variables for now.
A simple profit maximization problem
A company produces tools at two plants and sells them to three customers.The cost of producing 1000 tools at a plant and shipping them to a customer is given in the table below.
| Plant | Customer 1 | Customer 2 | Customer 3 |
|---|---|---|---|
| 1 | 60 | 30 | 160 |
| 2 | 130 | 70 | 170 |
Customers 1 and 3 pay $200 per thousand tools; Customer 2 pays $150. To produce 1,000 tools at Plant 1, 200 hours of labor are needed, while 300 hours are needed at Plant 2. A total of 5,500 hours of labor are available are available for use at the two plants. Additional labor hours can be purchased at $20 per labor hour. Plant 1 can produce up to 10,000 tools while Plant 2 can produce up to 12,000 tools. Demand by each customer is assumed to be unlimited.
How many tools should be produced and shipped by each plant to each customer?
LP Formulation
I will jump straight into fomulating the problem as an LP model by defining its three major parts: the decision variables, the objective funtion, the constraints.
The decision variables are the unknowns that we want to solve for. In our problem, these are:
$ X_{ij} =$ the number of tools in thousands produced in plant \(i\) and shipped to customer \(j\) for \(i = 1, 2\), and \(j = 1, 2, 3\)
$ L =$ additional labor hours to purchase
The objective function is a value that we want to maximize or minimize and is a function of the decision variables. As we have sales and cost figures, we would expect that we want to maximize the profit generated by our decisions:
Maximize:
\(Z =\) sales from customers + cost to ship to customers + cost of additional labor hours
\(Z = 200 \sum_{i=1}^{2} X_{i1}\ + 150 \sum_{i=1}^{2} X_{i2}\ + 200 \sum_{i=1}^{2} X_{i3}\ - \sum t_{ij}X_{ij} \ - 20L\) , where \(t_{ij}\) is the cost of shipping 1000 units from plant \(i\) to customer \(j\)
\(Z = (200-60)X_{11} + (200-130)X_{21} + (150-30)X_{12} + (150-70)X_{22} + (200-160)X_{13} + (200-170)X_{23} - 20L\)
\(Z = 140X_{11} + 70X_{21} + 120X_{12} + 80X_{22} + 40X_{13} + 30X_{23} - 20L\)
The constraints represent the conditions or requirements and limit the values that variables can take.
Labor constraint:
\(200 (X_{11}+X_{12}+X_{13}) + 300(X_{21}+X_{22}+X_{23}) \leq 5500 + L\)
Production constraints:
\(X_{11} + X_{12} + X_{13} \leq 10\)
\(X_{21} + X_{22} + X_{23} \leq 12\)
Nonnegativity: All variables are nonnegative
\(X_{ij} \geq 0\) ; for every plant \(i\) and customer \(j\)
\(L \geq 0\)
Solving using SciPy
I will follow the steps outlined in Mirko’s post to solve the above problem using SciPy. The first step is to load the linprog() from SciPy’s Optimize module.
linprog() only works with minimization problems that don’t have constraints with greater than inequalities. (i.e., all variables on the left side are greater than or equal to a constant on the right side) Our problem is currently a maximization type, but we can solve this by negating the whole expression to turn it into a minimization problem.
Maximize:
\(Z = 140X_{11} + 70X_{21} + 120X_{12} + 80X_{22} + 40X_{13} + 30X_{23} - 20L\)
is equivalent to, Minimize:
\(-Z = 20L -140X_{11} - 70X_{21} - 120X_{12} - 80X_{22} - 40X_{13} - 30X_{23}\)
linprog() requires a mandatory input c which is a 1D array denoting the objective function. The following are some of its optional arguments:
A_ub- a 2D array for the left hand side of inequality constraintsb_ub- a 1D array for the right hand side of inequality constraintsA_eq- a 2D array for the left hand side of equality constraintsb_eq- a 1D array for the right hand side of equality constraintsbounds- a sequence of tuples denoting the minimum and maximum value of each variable. If only one tuple is defined, then the bounds apply to all variablesmethod- the algorithm used to solver the problem. The default is'highs'and can also be set to'highs-ds'and'highs-ipm'. Other (legacy) methods are available at the moment but will soon be deprecatedcallback- callback function that will be run once per iteration of the algorithmintegrality- a 1D array which specifies the integrality of each variable. This accepts a value of0,1,2,3for each variable.0is the default and specifies that there is no integrality constraint, while1specifies that the variable must take an integer value. A value of2or3indicates that the variable is either continuous or integral within bounds, or is zero. This argument is only accepted by the'highs'method.
Based on this, we can define variables to be used as inputs to the c and the first five arguments. When building the input arrays or lists, the order of the variables should be consistent. For our case, we will use the following order, which is how the variables appear in the minimization objective function:
\(L, X_{11} , X_{21} , X_{12} , X_{22} , X_{13} , X_{23}\)
obj = [20, -140, -70, -120, -80, -40, -30]
lhs_ineq = [[-1, 200, 300, 200, 300, 200, 300], # Labor constraint left side
[0, 1, 0, 1, 0, 1, 0], # Plant 1 production constraint left side
[0, 0, 1, 0, 1, 0, 1]] # Plant 2 production constraint left side
rhs_ineq = [5500, # Labor constraint right side
10, # Plant 1 constraint right side
12] # Plant 2 constraint right side
# no equality constraints
bnd = [(0, None)]We can pass these into the arguments of linprog() to solve the problem. The function returns an object of type scipy.optimize.OptimizeResult that gives the optimum values of the decision variables in the field x and the minimum value of the objective function in the field fun.
opt = linprog(c=obj, A_ub=lhs_ineq, b_ub=rhs_ineq, bounds=bnd)
opt message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
success: True
status: 0
fun: -2333.333333333333
x: [ 0.000e+00 1.000e+01 0.000e+00 0.000e+00 1.167e+01
0.000e+00 0.000e+00]
nit: 1
lower: residual: [ 0.000e+00 1.000e+01 0.000e+00 0.000e+00
1.167e+01 0.000e+00 0.000e+00]
marginals: [ 1.973e+01 0.000e+00 1.000e+01 2.000e+01
0.000e+00 1.000e+02 5.000e+01]
upper: residual: [ inf inf inf inf
inf inf inf]
marginals: [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00
0.000e+00 0.000e+00 0.000e+00]
eqlin: residual: []
marginals: []
ineqlin: residual: [ 0.000e+00 0.000e+00 3.333e-01]
marginals: [-2.667e-01 -8.667e+01 -0.000e+00]
mip_node_count: 0
mip_dual_bound: 0.0
mip_gap: 0.0
The solver was able to find the optimal value of $2,333 (fun) achieved by shipping 10000 pcs from Plant 1 to Customer 1 and 11667 pieces from Plant 2 to Customer 2.
Solving using PuLP
PuLP is another package for optimization and is able to call external solvers to solve the model. I will again use Mirko’s post for the steps.
We first import the functions that we will use from the pulp package.
from pulp import LpMaximize, LpProblem, LpStatus, lpSum, LpVariableThe first step is creating an object that represents the LP model using LpProblem() The sense argument specifies the objective of the problem which could either be LpMinimize (default, minimization) or LpMaximize. (maximization)
model = LpProblem(name="sample-problem", sense=LpMaximize)The next step is defining the variables using the function LpVariable(). This function has the following arguments, of which only the first is mandatory
name- name of the variablelowBound,upBound- minimum and maxumum value of the variable. Default is negative infinity and positive infinitycat- The category of the variable. The default is'Continuous', but can also be set as'Integer'or'Binary'e- used for column-based modeling (we will not discuss further for this post)
We then define the 7 decision variables with separate calls of LpVariable():
L = LpVariable(name = "L", lowBound = 0)
x11 = LpVariable(name = "x11", lowBound = 0)
x12 = LpVariable(name = "x12", lowBound = 0)
x13 = LpVariable(name = "x13", lowBound = 0)
x21 = LpVariable(name = "x21", lowBound = 0)
x22 = LpVariable(name = "x22", lowBound = 0)
x23 = LpVariable(name = "x23", lowBound = 0)In a nutshell, constraints can be added to the model as inequalities/equalities, and the objective function can be added as an expression. To illustrate:
print(type(2*L)) # This is an expression and represents 2 multiplied by L
print(type(2*L <= 5)) # This is a constraint as it includes an inequality<class 'pulp.pulp.LpAffineExpression'>
<class 'pulp.pulp.LpConstraint'>
We will add these into the model by simply adding them into model. We will be writing them in full, but there are other functions in the package that can be used to construct expressions or constraints.
To add the (maximization) objective function, we write the following:
model += 140 * x11 + 70 * x21 + 120 * x12 + 80 * x22 + 40 * x13 + 30 * x23 - 20 * LConstraints are added similarly, but as a tuple which includes a name for each constraint.
model += (200 * x11 + 200 * x12 + 200 * x13 + 300 * x21 + 300 * x22 + 300 * x21 + 300 * x23 - L <= 5500, "Labor")
model += (x11 + x12 + x13 <= 10, "Plant1_Production")
model += (x21 + x22 + x23 <= 12, "Plant2_Production")Calling the object displays the model we have defined.
modelsample-problem:
MAXIMIZE
-20*L + 140*x11 + 120*x12 + 40*x13 + 70*x21 + 80*x22 + 30*x23 + 0
SUBJECT TO
Labor: - L + 200 x11 + 200 x12 + 200 x13 + 600 x21 + 300 x22 + 300 x23 <= 5500
Plant1_Production: x11 + x12 + x13 <= 10
Plant2_Production: x21 + x22 + x23 <= 12
VARIABLES
L Continuous
x11 Continuous
x12 Continuous
x13 Continuous
x21 Continuous
x22 Continuous
x23 Continuous
We use .solve() to solve the model. A solver argument can be specified to override the default solver to be used. We use GLPK as the solver for this post.
from pulp import GLPK
model.solve(solver=GLPK(msg=False))
print(f"status: {model.status}, {LpStatus[model.status]}")
print(f"objective: {model.objective.value()}")
for var in model.variables():
print(f"{var.name}: {var.value()}")
for name, constraint in model.constraints.items():
print(f"{name}: {constraint.value()}")status: 1, Optimal
objective: 2333.3360000000002
L: 0.0
x11: 10.0
x12: 0.0
x13: 0.0
x21: 0.0
x22: 11.6667
x23: 0.0
Labor: 0.010000000000218279
Plant1_Production: 0.0
Plant2_Production: -0.3332999999999995
The results show the same objective and decision variable values as SciPy.
Solving using ORTools
The last package that we will test out in this post is Google’s ORTools. I will be using Maximme’s post as a guide where he mentions that the package is quite user-friendly and is the highest starred/rated on GitHub. The package also includes libraries specific to vehicle routing and graph problems.
pywraplp is the linear programming solver wrapper which interfaces to MIP (mixed integer programming) and linear solvers. We use the code below to import the module and then declare a solver instance for our problem using Solver(). We specify GLOP as the engine to be used.
from ortools.linear_solver import pywraplp
solver = pywraplp.Solver('Maximize profit', pywraplp.Solver.GLOP_LINEAR_PROGRAMMING)Variables can be added to the model or solver instance by using on of three functions depending on the variable type:
NumVar- continuous variablesIntVar- integer variablesBoolVar- binary variables
The functions accept the lower-bound, the upper-bound and the variable name as inputs. To specify no upper-bound, we need to use .infinity() as a value.
Our model only contains continuous variables so we will just use NumVar to declare them. This approach is similar to what was done in PuLP.
var_L = solver.NumVar(0, solver.infinity(), 'L')
var_x11 = solver.NumVar(0, solver.infinity(), 'x11')
var_x12 = solver.NumVar(0, solver.infinity(), 'x12')
var_x13 = solver.NumVar(0, solver.infinity(), 'x13')
var_x21 = solver.NumVar(0, solver.infinity(), 'x21')
var_x22 = solver.NumVar(0, solver.infinity(), 'x22')
var_x23 = solver.NumVar(0, solver.infinity(), 'x23')We add constraints using .Add() for each constraint and then passing the inequality (or equality) as an argument.
solver.Add(200 * var_x11 + 200 * var_x12 + 200 * var_x13 + 300 * var_x21 + 300 * var_x22 + 300 * var_x21 + 300 * var_x23 - var_L <= 5500)
solver.Add(var_x11 + var_x12 + var_x13 <= 10)
solver.Add(var_x21 + var_x22 + var_x23 <= 12)<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type 'operations_research::MPConstraint *' at 0x000001FFCB089990> >
The objective function can be defined by using .Maximize() or .Minimize() depending on the problem. We will be using the first one to define the original maxmimization version of the objective.
solver.Maximize(140 * var_x11 + 70 * var_x21 + 120 * var_x12 + 80 * var_x22 + 40 * var_x13 + 30 * var_x23 - 20 * var_L)With the model defined
status = solver.Solve()
print('Optimized: ', status == pywraplp.Solver.OPTIMAL)Optimized: True
We can display the objective value and the decision variables by calling the specific objects in solver.
print(f"Objective value = {solver.Objective().Value():0.1f}")
for i in solver.variables():
print(f"{i} = {i.solution_value():0.1f}")Objective value = 2333.3
L = 0.0
x11 = 10.0
x12 = 0.0
x13 = 0.0
x21 = 0.0
x22 = 11.7
x23 = 0.0
We again see that the output is the same as the previous two. The method of defining variables, constraints, and the objective function is very similar to PuLP, but all three methods produced the same results.
Initial Thoughts
We have gone through three different ways to solve LP problems in Python. It is too early for me to tell which one is best for me. The last two appear more user-friendly when coding in variables and constraints, but I am not sure how easy they are to use for large-scale problems where you will load the coefficients of the objective function or the constraints from a separate file or spreadsheet.
This demo is also limited to a very simple linear problem– there are no integer or binary variables. We also did not go through any further analysis (e.g., sensitivity analysis) and these should be considered when choosing a package to use or focus on.
For now, I have three new packages in my toolset and I will definitely play around and explore some more.