Starting Linear Programming in Python

Python
Optimization
Linear Programming
Learn With Me
Author

Derek Rodriguez

Published

November 28, 2024

In this post, I will solve a simple linear programming (LP) problem using three Python packages: Google's OR-Tools, PuLP, and SciPy.

This is my first time trying optimization in Python. My previous experience with solver software was with commercial solvers and MS Excel in college, and with supply chain modeling software at work. For this first try, I am guided by the posts of Mirko Stojiljković at Real Python and Maxime Labonne at Towards Data Science.

Since I will be focusing on the packages, I won't be explaining linear or mathematical programming in this post, and will go straight to the problem, its mathematical formulation, and then the building and optimizing of the model using the three packages mentioned. Mirko's and Maxime's posts give good overviews, and there is a large amount of material online and offline covering linear programming and operations research.

A Sample Problem

I will be using a single problem to try out the three packages, taken from a review problem in one of my college textbooks (Winston, Wayne, and Venkataraman. 2003. Introduction to Mathematical Programming. Operations Research: Volume One. 4th ed. Brooks/Cole, Thomson Learning). The chosen problem is a pure linear program, so we will not deal with any integer or binary variables for now.

A simple profit maximization problem

A company produces tools at two plants and sells them to three customers. The cost of producing 1,000 tools at a plant and shipping them to a customer is given in the table below.

Plant   Customer 1   Customer 2   Customer 3
1           60           30          160
2          130           70          170

Customers 1 and 3 pay $200 per thousand tools; Customer 2 pays $150. To produce 1,000 tools, 200 hours of labor are needed at Plant 1, while 300 hours are needed at Plant 2. A total of 5,500 hours of labor are available for use at the two plants. Additional labor hours can be purchased at $20 per labor hour. Plant 1 can produce up to 10,000 tools while Plant 2 can produce up to 12,000 tools. Demand by each customer is assumed to be unlimited.

How many tools should be produced and shipped by each plant to each customer?

LP Formulation

I will jump straight into formulating the problem as an LP model by defining its three major parts: the decision variables, the objective function, and the constraints.

The decision variables are the unknowns that we want to solve for. In our problem, these are:

\(X_{ij} =\) the number of tools (in thousands) produced at plant \(i\) and shipped to customer \(j\), for \(i = 1, 2\) and \(j = 1, 2, 3\)

\(L =\) additional labor hours to purchase

The objective function is a value that we want to maximize or minimize and is a function of the decision variables. As we have sales and cost figures, we would expect that we want to maximize the profit generated by our decisions:

Maximize:

\(Z =\) sales from customers, minus production and shipping costs, minus the cost of additional labor hours

\(Z = 200 \sum_{i=1}^{2} X_{i1} + 150 \sum_{i=1}^{2} X_{i2} + 200 \sum_{i=1}^{2} X_{i3} - \sum_{i=1}^{2}\sum_{j=1}^{3} t_{ij}X_{ij} - 20L\) , where \(t_{ij}\) is the cost of producing 1,000 tools at plant \(i\) and shipping them to customer \(j\)

\(Z = (200-60)X_{11} + (200-130)X_{21} + (150-30)X_{12} + (150-70)X_{22} + (200-160)X_{13} + (200-170)X_{23} - 20L\)

\(Z = 140X_{11} + 70X_{21} + 120X_{12} + 80X_{22} + 40X_{13} + 30X_{23} - 20L\)

The constraints represent the conditions or requirements and limit the values that variables can take.

Labor constraint:

\(200 (X_{11}+X_{12}+X_{13}) + 300(X_{21}+X_{22}+X_{23}) \leq 5500 + L\)

Production constraints:

\(X_{11} + X_{12} + X_{13} \leq 10\)

\(X_{21} + X_{22} + X_{23} \leq 12\)

Nonnegativity: All variables are nonnegative

\(X_{ij} \geq 0\) ; for every plant \(i\) and customer \(j\)

\(L \geq 0\)

Solving using SciPy

I will follow the steps outlined in Mirko's post to solve the above problem using SciPy. The first step is to import the linprog() function from SciPy's optimize module.

from scipy.optimize import linprog

linprog() only handles minimization problems, and all inequality constraints must be in "less than or equal to" form (i.e., an expression in the variables on the left side is less than or equal to a constant on the right side). Our problem is currently a maximization type, but we can solve this by negating the whole objective to turn it into a minimization problem.

Maximize:

\(Z = 140X_{11} + 70X_{21} + 120X_{12} + 80X_{22} + 40X_{13} + 30X_{23} - 20L\)

is equivalent to, Minimize:

\(-Z = 20L -140X_{11} - 70X_{21} - 120X_{12} - 80X_{22} - 40X_{13} - 30X_{23}\)

linprog() requires a mandatory input c which is a 1D array denoting the objective function. The following are some of its optional arguments:

  • A_ub - a 2D array for the left hand side of inequality constraints

  • b_ub - a 1D array for the right hand side of inequality constraints

  • A_eq - a 2D array for the left hand side of equality constraints

  • b_eq - a 1D array for the right hand side of equality constraints

  • bounds - a sequence of tuples denoting the minimum and maximum value of each variable. If only one tuple is defined, then the bounds apply to all variables

  • method - the algorithm used to solve the problem. The default is 'highs'; it can also be set to 'highs-ds' or 'highs-ipm'. Other (legacy) methods remain available for now but are deprecated

  • callback - callback function that will be run once per iteration of the algorithm

  • integrality - a 1D array specifying the integrality of each variable, with a value of 0, 1, 2, or 3 per variable. 0 (the default) imposes no integrality constraint, and 1 requires the variable to take an integer value. A value of 2 marks a semi-continuous variable (either zero or continuous within its bounds), and 3 marks a semi-integer variable (either zero or an integer within its bounds). This argument is only accepted by the 'highs' method.
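To make the argument shapes concrete, here is a toy call (the numbers are made up and unrelated to our problem). Note how a greater-than constraint has to be negated into the required less-than form:

```python
from scipy.optimize import linprog

# Toy model: minimize x + 2y subject to x + y >= 1, with x, y >= 0.
# linprog() only accepts <= rows, so x + y >= 1 becomes -x - y <= -1.
res = linprog(c=[1, 2], A_ub=[[-1, -1]], b_ub=[-1], bounds=[(0, None)])
print(res.x, res.fun)  # optimum at x = 1, y = 0 with objective value 1
```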

Based on this, we can define the variables to be used as inputs for c and the first five optional arguments. When building the input arrays or lists, the order of the variables should be consistent. For our case, we will use the following order, which is how the variables appear in the minimization objective function:

\(L, X_{11} , X_{21} , X_{12} , X_{22} , X_{13} , X_{23}\)

obj = [20, -140, -70, -120, -80, -40, -30]

lhs_ineq = [[-1, 200, 300, 200, 300, 200, 300],  # Labor constraint left side
            [0, 1, 0, 1, 0, 1, 0],  # Plant 1 production constraint left side
            [0, 0, 1, 0, 1, 0, 1]]  # Plant 2 production constraint left side

rhs_ineq = [5500,  # Labor constraint right side
            10,  # Plant 1 constraint right side
            12]  # Plant 2 constraint right side

# no equality constraints

bnd = [(0, None)]
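The single (0, None) pair above is broadcast to all seven variables. An equivalent, more explicit form spells out one pair per variable, which makes the variable count easy to check:

```python
# One (lower, upper) pair per variable, in the order
# L, x11, x21, x12, x22, x13, x23.
bnd = [(0, None)] * 7
```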

We can pass these into the arguments of linprog() to solve the problem. The function returns an object of type scipy.optimize.OptimizeResult that gives the optimum values of the decision variables in the field x and the minimum value of the objective function in the field fun.

opt = linprog(c=obj, A_ub=lhs_ineq, b_ub=rhs_ineq, bounds=bnd)

opt
        message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
        success: True
         status: 0
            fun: -2333.333333333333
              x: [ 0.000e+00  1.000e+01  0.000e+00  0.000e+00  1.167e+01
                   0.000e+00  0.000e+00]
            nit: 1
          lower:  residual: [ 0.000e+00  1.000e+01  0.000e+00  0.000e+00
                              1.167e+01  0.000e+00  0.000e+00]
                 marginals: [ 1.973e+01  0.000e+00  1.000e+01  2.000e+01
                              0.000e+00  1.000e+02  5.000e+01]
          upper:  residual: [       inf        inf        inf        inf
                                    inf        inf        inf]
                 marginals: [ 0.000e+00  0.000e+00  0.000e+00  0.000e+00
                              0.000e+00  0.000e+00  0.000e+00]
          eqlin:  residual: []
                 marginals: []
        ineqlin:  residual: [ 0.000e+00  0.000e+00  3.333e-01]
                 marginals: [-2.667e-01 -8.667e+01 -0.000e+00]
 mip_node_count: 0
 mip_dual_bound: 0.0
        mip_gap: 0.0

The solver found the optimal profit of $2,333.33 (the negative of fun, since we minimized \(-Z\)), achieved by shipping 10,000 tools from Plant 1 to Customer 1 and 11,667 tools from Plant 2 to Customer 2, with no additional labor purchased.
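A quick arithmetic check confirms the reported numbers (10.0 and 35/3 ≈ 11.667 are the nonzero entries of opt.x, in thousands of tools):

```python
# Recompute profit and labor hours from the reported solution.
x11, x22 = 10.0, 35 / 3          # thousands of tools
profit = 140 * x11 + 80 * x22    # contribution margins from the objective
labor = 200 * x11 + 300 * x22    # hours used at Plants 1 and 2
print(profit)  # 2333.33..., the negative of fun
print(labor)   # 5500.0, so the labor constraint is binding and L = 0
```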

Solving using PuLP

PuLP is another optimization package; it builds the model in Python and can call external solvers to solve it. I will again use Mirko's post for the steps.

We first import the functions that we will use from the pulp package.

from pulp import LpMaximize, LpProblem, LpStatus, lpSum, LpVariable

The first step is creating an object that represents the LP model using LpProblem(). The sense argument specifies the direction of the objective: LpMinimize (the default) for minimization or LpMaximize for maximization.

model = LpProblem(name="sample-problem", sense=LpMaximize)

The next step is defining the variables using the function LpVariable(). This function has the following arguments, of which only the first is mandatory:

  • name - name of the variable

  • lowBound , upBound - the minimum and maximum values of the variable. The defaults are negative and positive infinity

  • cat - The category of the variable. The default is 'Continuous', but can also be set as 'Integer' or 'Binary'

  • e - used for column-based modeling (we will not discuss further for this post)

We then define the 7 decision variables with separate calls of LpVariable():

L = LpVariable(name = "L", lowBound = 0)

x11 = LpVariable(name = "x11", lowBound = 0)
x12 = LpVariable(name = "x12", lowBound = 0)
x13 = LpVariable(name = "x13", lowBound = 0)

x21 = LpVariable(name = "x21", lowBound = 0)
x22 = LpVariable(name = "x22", lowBound = 0)
x23 = LpVariable(name = "x23", lowBound = 0)

In a nutshell, constraints can be added to the model as inequalities/equalities, and the objective function can be added as an expression. To illustrate:

print(type(2*L)) # This is an expression and represents 2 multiplied by L

print(type(2*L <= 5)) # This is a constraint as it includes an inequality
<class 'pulp.pulp.LpAffineExpression'>
<class 'pulp.pulp.LpConstraint'>

We will add these to the model simply by adding them to model with +=. We will write them out in full here, but the package provides other functions that can be used to construct expressions or constraints.
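As one example, lpSum() (already imported above) builds an expression from any iterable of terms, which scales better than typing every term by hand. The dictionary and variable names below are illustrative, and this sketch stands alone rather than being part of our model:

```python
from pulp import LpVariable, lpSum

# Contribution margin per thousand tools for each (plant, customer) pair.
margin = {(1, 1): 140, (2, 1): 70, (1, 2): 120,
          (2, 2): 80, (1, 3): 40, (2, 3): 30}
ship = {k: LpVariable(f"ship_{k[0]}{k[1]}", lowBound=0) for k in margin}

# One call builds the whole sales-minus-costs expression.
profit_expr = lpSum(margin[k] * ship[k] for k in margin)
print(type(profit_expr).__name__)  # LpAffineExpression
```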

To add the (maximization) objective function, we write the following:

model += 140 * x11 + 70 * x21 + 120 * x12 + 80 * x22 + 40 * x13 + 30 * x23 - 20 * L

Constraints are added similarly, but as a tuple which includes a name for each constraint.

model += (200 * x11 + 200 * x12 + 200 * x13 + 300 * x21 + 300 * x22 + 300 * x23 - L <= 5500, "Labor")
model += (x11 + x12 + x13 <= 10, "Plant1_Production")
model += (x21 + x22 + x23 <= 12, "Plant2_Production")

Calling the object displays the model we have defined.

model
sample-problem:
MAXIMIZE
-20*L + 140*x11 + 120*x12 + 40*x13 + 70*x21 + 80*x22 + 30*x23 + 0
SUBJECT TO
Labor: - L + 200 x11 + 200 x12 + 200 x13 + 300 x21 + 300 x22 + 300 x23 <= 5500

Plant1_Production: x11 + x12 + x13 <= 10

Plant2_Production: x21 + x22 + x23 <= 12

VARIABLES
L Continuous
x11 Continuous
x12 Continuous
x13 Continuous
x21 Continuous
x22 Continuous
x23 Continuous

We use .solve() to solve the model. A solver argument can be passed to override the default solver. We use GLPK for this post.
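GLPK is a separate install, so it helps to check which solver interfaces PuLP can actually call on your machine before picking one:

```python
import pulp

# All solver interfaces PuLP knows about, then only those usable locally.
print(pulp.listSolvers())
print(pulp.listSolvers(onlyAvailable=True))
```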

from pulp import GLPK
model.solve(solver=GLPK(msg=False))

print(f"status: {model.status}, {LpStatus[model.status]}")

print(f"objective: {model.objective.value()}")

for var in model.variables():
  print(f"{var.name}: {var.value()}")

for name, constraint in model.constraints.items():
  print(f"{name}: {constraint.value()}")
status: 1, Optimal
objective: 2333.3360000000002
L: 0.0
x11: 10.0
x12: 0.0
x13: 0.0
x21: 0.0
x22: 11.6667
x23: 0.0
Labor: 0.010000000000218279
Plant1_Production: 0.0
Plant2_Production: -0.3332999999999995

The results show the same objective and decision-variable values as SciPy. (The constraint values printed above are the left-hand side minus the right-hand side, so Plant2_Production has a slack of about 0.333; the tiny nonzero Labor value is rounding from GLPK reporting x22 as 11.6667.)

Solving using ORTools

The last package that we will test out in this post is Google's OR-Tools. I will be using Maxime's post as a guide, where he mentions that the package is quite user-friendly and has the most stars on GitHub among comparable packages. The package also includes libraries specific to vehicle routing and graph problems.

pywraplp is the linear programming solver wrapper, which interfaces with MIP (mixed-integer programming) and linear solvers. We use the code below to import the module and then declare a solver instance for our problem using Solver(), specifying GLOP as the backend to be used.

from ortools.linear_solver import pywraplp

solver = pywraplp.Solver('Maximize profit', pywraplp.Solver.GLOP_LINEAR_PROGRAMMING)

Variables can be added to the model or solver instance by using one of three functions, depending on the variable type:

  • NumVar - continuous variables

  • IntVar - integer variables

  • BoolVar - binary variables

The functions accept the lower bound, the upper bound, and the variable name as inputs. To leave a side unbounded, pass solver.infinity() as the value.

Our model only contains continuous variables so we will just use NumVar to declare them. This approach is similar to what was done in PuLP.

var_L = solver.NumVar(0, solver.infinity(), 'L')

var_x11 = solver.NumVar(0, solver.infinity(), 'x11')
var_x12 = solver.NumVar(0, solver.infinity(), 'x12')
var_x13 = solver.NumVar(0, solver.infinity(), 'x13')

var_x21 = solver.NumVar(0, solver.infinity(), 'x21')
var_x22 = solver.NumVar(0, solver.infinity(), 'x22')
var_x23 = solver.NumVar(0, solver.infinity(), 'x23')

We add constraints using .Add() for each constraint and then passing the inequality (or equality) as an argument.

solver.Add(200 * var_x11 + 200 * var_x12 + 200 * var_x13 + 300 * var_x21 + 300 * var_x22 + 300 * var_x23 - var_L <= 5500)
solver.Add(var_x11 + var_x12 + var_x13 <= 10)
solver.Add(var_x21 + var_x22 + var_x23 <= 12)
<ortools.linear_solver.pywraplp.Constraint; proxy of <Swig Object of type 'operations_research::MPConstraint *' at 0x000001FFCB089990> >

The objective function can be defined by using .Maximize() or .Minimize() depending on the problem. We will use the former to define the original maximization version of the objective.

solver.Maximize(140 * var_x11 + 70 * var_x21 + 120 * var_x12 + 80 * var_x22 + 40 * var_x13 + 30 * var_x23 - 20 * var_L)

With the model defined, we call .Solve() and check whether the solver found an optimal solution.

status = solver.Solve()
print('Optimized: ', status == pywraplp.Solver.OPTIMAL)
Optimized:  True

We can display the objective value and the decision variables by querying the corresponding objects in solver.

print(f"Objective value = {solver.Objective().Value():0.1f}")
for i in solver.variables():
  print(f"{i} = {i.solution_value():0.1f}")
Objective value = 2333.3
L = 0.0
x11 = 10.0
x12 = 0.0
x13 = 0.0
x21 = 0.0
x22 = 11.7
x23 = 0.0

We again see that the output matches the previous two packages. The way variables, constraints, and the objective function are defined is very similar to PuLP, and all three packages produced the same results.

Initial Thoughts

We have gone through three different ways to solve LP problems in Python. It is too early for me to tell which one is best for me. The last two appear more user-friendly for coding in variables and constraints, but I am not sure how easy they are to use for large-scale problems where you would load the coefficients of the objective function or the constraints from a separate file or spreadsheet.
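As a rough sketch of that data-driven workflow (the file layout and names here are hypothetical), the cost table could be read from a CSV and turned into the linprog cost vector instead of typing each coefficient. Note that this sketch orders the variables as \(L, X_{11}, X_{12}, X_{13}, X_{21}, X_{22}, X_{23}\), which differs from the order used earlier:

```python
import csv
import io

# Stand-in for a real file: the production-and-shipping cost table.
data = io.StringIO("plant,c1,c2,c3\n1,60,30,160\n2,130,70,170\n")
price = {1: 200, 2: 150, 3: 200}  # revenue per thousand tools, by customer

margins = []
for row in csv.DictReader(data):
    for j in (1, 2, 3):
        margins.append(price[j] - int(row[f"c{j}"]))

# Negate for linprog's minimization form; L (at $20/hour) comes first.
obj = [20] + [-m for m in margins]
print(obj)  # [20, -140, -120, -40, -70, -80, -30]
```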

This demo is also limited to a very simple linear problem: there are no integer or binary variables. We also did not go through any further analysis (e.g., sensitivity analysis), and these capabilities should be considered when choosing a package to use or focus on.
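On that note, SciPy's result object already exposes one piece of sensitivity information: the ineqlin.marginals field holds the dual values (shadow prices) of the inequality constraints. Re-running the earlier model and flipping signs (since we minimized \(-Z\)) shows what one extra unit of each right-hand side is worth:

```python
from scipy.optimize import linprog

obj = [20, -140, -70, -120, -80, -40, -30]
lhs_ineq = [[-1, 200, 300, 200, 300, 200, 300],
            [0, 1, 0, 1, 0, 1, 0],
            [0, 0, 1, 0, 1, 0, 1]]
rhs_ineq = [5500, 10, 12]
opt = linprog(c=obj, A_ub=lhs_ineq, b_ub=rhs_ineq, bounds=[(0, None)])

# Shadow prices in maximization terms: labor hours, Plant 1, Plant 2.
shadow = [-m for m in opt.ineqlin.marginals]
print(shadow)  # roughly [0.267, 86.67, 0.0]
```

An extra labor hour is worth about $0.27 of profit, well below the $20 purchase price, which is why L stays at zero; Plant 2's capacity constraint is not binding, so its shadow price is zero.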

For now, I have three new packages in my toolset and I will definitely play around and explore some more.