Applications of Lagrange Multipliers
The purpose of this note is to present a number of problem classes that can be solved using the method of Lagrange multipliers.
Equality-Constrained Quadratic Programming (ECQP)
Let \(Q\in\mathbb{R}^{n\times n}\) be a symmetric positive-definite matrix, \(c\in\mathbb{R}^n\), \(d\in\mathbb{R}\), and \(A\in\mathbb{R}^{m\times n}\), \(b\in\mathbb{R}^m\), where \(m<n\) and \(\text{rank}\,A = m\).
Consider the following minimization problem with equality constraints:
\[\min_{x\in\mathbb{R}^n}\ \frac{1}{2}x^TQx + c^Tx + d \quad \text{subject to} \quad Ax = b.\]
Let
\[f(x) = \frac{1}{2}x^TQx + c^Tx + d\]
and
\[g(x) = Ax - b.\]
If \(x^*\in\mathbb{R}^n\) is a minimizer, then by the method of Lagrange multipliers, there exists \(\lambda^*\in\mathbb{R}^m\) such that the dual feasibility equality holds:
\[Df(x^*) + {\lambda^*}^TDg(x^*) = 0^T,\]
where \(0\) on the R.H.S. denotes the zero vector in \(\mathbb{R}^n\).
Note that the following facts are being used implicitly:
- \(Df(x^*) = \frac{1}{2}{x^*}^T(Q + Q^T) + c^T = {x^*}^TQ + c^T\), and
- \(Dg(x^*) = A\)
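These derivative formulas are easy to sanity-check numerically. Below is a minimal sketch (all of `Q`, `c`, `A`, and `x` are arbitrarily generated for illustration) that compares \(Df(x)\) against central finite differences; since \(f\) is quadratic, the central difference is exact up to rounding:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
Q = M.T @ M + n * np.eye(n)          # symmetric positive definite
c = rng.standard_normal(n)
x = rng.standard_normal(n)

f = lambda x: 0.5 * x @ Q @ x + c @ x

# central finite differences along each coordinate direction
eps = 1e-6
fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
               for e in np.eye(n)])

# Df(x) = x^T Q + c^T (using the symmetry of Q)
assert np.allclose(fd, x @ Q + c, atol=1e-5)
```

(There is nothing to check for \(Dg\): \(g(x) = Ax - b\) is affine, so \(Dg(x) = A\) identically.)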
Since \(x^*\) by definition is a minimizer, it must satisfy the primal feasibility equality:
\[Ax^* = b.\]
The dual feasibility equality and the symmetry of \(Q\) tell us that
\[Qx^* + A^T\lambda^* + c = 0,\]
or
\[x^* = -Q^{-1}(A^T\lambda^* + c).\]
Substituting the above into the primal feasibility equality, we have
\[-AQ^{-1}(A^T\lambda^* + c) = b,\]
or
\[\lambda^* = -(AQ^{-1}A^T)^{-1}(b + AQ^{-1}c).\]
Here \(AQ^{-1}A^T\) is invertible: since \(\text{rank}\,A = m\), the matrix \(A^T\) has full column rank, so \(y^TAQ^{-1}A^Ty = (A^Ty)^TQ^{-1}(A^Ty) > 0\) for every nonzero \(y\in\mathbb{R}^m\) by the positive definiteness of \(Q^{-1}\).
Now \(x^*\) has the following form:
\[x^* = -Q^{-1}A^T\lambda^* - Q^{-1}c = Q^{-1}A^T(AQ^{-1}A^T)^{-1}(b + AQ^{-1}c) - Q^{-1}c.\]
Let's write some Python code to validate this formula.
```python
import numpy as np
from scipy.optimize import OptimizeResult, minimize


def generate_psd(n: int):
    # A^T A is positive definite with probability 1 for random Gaussian A
    A = np.random.randn(n, n)
    Q = A.T @ A
    return Q


def generate_matrix(size: tuple[int, int]):
    A = np.random.randn(*size)
    return A


def objective(_x, Q, c):
    return 0.5 * _x.T @ Q @ _x + c.T @ _x


def lagrangian(_x, _lambda, Q, c, A, b):
    return 0.5 * _x.T @ Q @ _x + c.T @ _x + _lambda.T @ (A @ _x - b)


def first_ecqp_solver(Q, c, A, b):
    # closed-form solution derived above
    Q_inverse = np.linalg.pinv(Q)
    M = np.linalg.pinv(A @ Q_inverse @ A.T)
    _lambda = -M @ b - M @ A @ Q_inverse @ c
    _x = -Q_inverse @ A.T @ _lambda - Q_inverse @ c
    return _x, _lambda


m, n = 3, 5
Q = generate_psd(n)
c = np.random.randn(n)
A = generate_matrix((m, n))
b = np.random.randn(m)

constraints = [{"type": "eq", "fun": lambda x: A @ x - b}]


def fun(x):
    return objective(x, Q, c)


x0 = np.random.randn(n)
result: OptimizeResult = minimize(fun, x0=x0, constraints=constraints)

_x, _lambda = first_ecqp_solver(Q, c, A, b)
_xstar = result.x
primal_min = result.fun
dual_max = lagrangian(_x=_x, _lambda=_lambda, Q=Q, c=c, A=A, b=b)

print(f"m = {m}; n = {n}")
print(f"primal minimizer by minimize(): {_xstar}")
print(f"primal minimizer by manual computation: {_x}")
print(f"primal min: {primal_min}")
print(f"dual max: {dual_max}")
print(f"duality gap: {primal_min - dual_max}")
```
If you paste and run this code, you should get output similar to the following:
```
m = 3; n = 5
primal minimizer by minimize(): [ 1.13225638  0.88939761  1.02840566  2.21816789 -0.45254542]
primal minimizer by manual computation: [ 1.13225633  0.88939765  1.02840563  2.2181678  -0.45254541]
primal min: 1.1312102952306051
dual max: 1.1312102952125787
duality gap: 1.8026469206233742e-11
```
In this code, my implementation `first_ecqp_solver()` is just a proof of concept for validating the derived formula; it is not meant to compete with the battle-tested `minimize()` function in SciPy.
Feel free to tweak the values of `m` and `n`; just make sure the assumption \(m < n\) holds.
Here the key points are:
- The two primal minimizers should be numerically the same
- The absolute value of the duality gap should be extremely small (you might sometimes encounter a negative duality gap due to floating-point error)
The Minimum-Norm Problem
Let \(A\) and \(b\) be defined as in the ECQP problem.
Consider the Minimum-Norm Problem:
\[\min_{x\in\mathbb{R}^n}\ \|x\| \quad \text{subject to} \quad Ax = b.\]
Note that the objective is not differentiable at \(x=0\in\mathbb{R}^n\), so one often considers the following problem instead, as both problems have the same minimizer:
\[\min_{x\in\mathbb{R}^n}\ \frac{1}{2}\|x\|^2 \quad \text{subject to} \quad Ax = b.\]
Before solving this problem, let's approach it from a geometric perspective. We want to find the "shortest" \(x\) that satisfies \(Ax=b\). In general, any such \(x\) can be written as \(x = x_p + x_n\), where \(x_p\) is the projection of \(x\) onto the row space of \(A\) and \(x_n\) is the projection of \(x\) onto the null space of \(A\). Since \(Ax_n = 0\), the null-space component contributes nothing to the constraint, so the shortest \(x\) is the one with \(x_n = 0\), i.e., the \(x\) that lies entirely in the row space of \(A\).
Now let's return to our original treatment of this problem. This problem is exactly a member of the ECQP class. Setting \(Q=I\) and \(c=0\), we have:
\[x^* = A^T(AA^T)^{-1}b.\]
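As a quick sanity check, here is a sketch with randomly generated data (the names and the use of `scipy.linalg.null_space` are my choices for illustration): the solution \(x^* = A^T(AA^T)^{-1}b\) should agree with the pseudoinverse solution \(A^+b\) and should have no null-space component, matching the geometric picture above.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
m, n = 3, 5
A = rng.standard_normal((m, n))  # full row rank with probability 1
b = rng.standard_normal(m)

# x* = A^T (A A^T)^{-1} b
x_star = A.T @ np.linalg.solve(A @ A.T, b)

# feasible, and agrees with the minimum-norm pseudoinverse solution
assert np.allclose(A @ x_star, b)
assert np.allclose(x_star, np.linalg.pinv(A) @ b)

# x* lies entirely in the row space: its null-space component is zero
N = null_space(A)  # columns form an orthonormal basis of null(A)
assert np.allclose(N.T @ x_star, 0)
```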
It is worthwhile to understand the solution from different perspectives.
Orthogonal Projection of a Point Onto the Affine Set
This problem is pretty similar to the minimum-norm problem.
Let \(x_0\in\mathbb{R}^n\) be fixed. Consider
\[\min_{x\in\mathbb{R}^n}\ \frac{1}{2}\|x - x_0\|^2 \quad \text{subject to} \quad Ax = b.\]
The constraint \(Ax = b\) can be rewritten as \(A(x - x_0) = b - Ax_0\). Applying the result obtained from the minimum-norm problem to \(x - x_0\), we have
\[x^* = x_0 + A^T(AA^T)^{-1}(b - Ax_0).\]
In the case \(m = 1\) and \(n = 2\), if we let \(A = [a_1\ a_2]\) and \(x_0 = [u_1\ u_2]^T\), then
\[AA^T = a_1^2 + a_2^2\]
and
\[x^* = x_0 + \frac{b - a_1u_1 - a_2u_2}{a_1^2 + a_2^2}\begin{bmatrix}a_1\\ a_2\end{bmatrix},\]
and the distance \(\|x^* - x_0\|\) has the following form:
\[\|x^* - x_0\| = \frac{|a_1u_1 + a_2u_2 - b|}{\sqrt{a_1^2 + a_2^2}},\]
which recovers the formula for the distance from a point to a line.
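To make this concrete, here is a small sketch (the line and point are arbitrarily chosen) that projects a point onto the line \(3x + 4y = 10\) and checks the distance against the familiar point-to-line formula:

```python
import numpy as np

a1, a2, b = 3.0, 4.0, 10.0       # the line 3x + 4y = 10
A = np.array([[a1, a2]])
x0 = np.array([1.0, -2.0])       # the point to project

# x* = x0 + A^T (A A^T)^{-1} (b - A x0)
x_star = x0 + (A.T @ np.linalg.solve(A @ A.T, np.array([b]) - A @ x0)).ravel()

# the projection lies on the line
assert np.isclose((A @ x_star).item(), b)

# distance matches |a1*u1 + a2*u2 - b| / sqrt(a1^2 + a2^2)
dist = np.linalg.norm(x_star - x0)
expected = abs(a1 * x0[0] + a2 * x0[1] - b) / np.hypot(a1, a2)
assert np.isclose(dist, expected)
```

For this choice of line and point, the projection works out to \([2.8,\ 0.4]^T\) with distance \(3\).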
The Minimum of the Rayleigh Quotient
Given a symmetric matrix \(A\in\mathbb{R}^{n\times n}\) and nonzero \(x\in\mathbb{R}^n\), the Rayleigh Quotient is defined to be
\[R(x) = \frac{x^TAx}{x^Tx}.\]
Consider the minimization problem:
\[\min_{x\neq 0}\ R(x).\]
Since \(R(\alpha x) = R(x)\) for any \(\alpha\neq 0\), this unconstrained problem can be turned into a minimization problem with an equality constraint:
\[\min_{x\in\mathbb{R}^n}\ x^TAx \quad \text{subject to} \quad x^Tx = 1.\]
If \(x^*\in\mathbb{R}^n\) is a minimizer of the above problem, then there exists \(\lambda^*\in\mathbb{R}\) such that
\[2{x^*}^TA = 2\lambda^*{x^*}^T.\]
(Notice that \(\lambda^*\) is a real number, so \(\lambda^* = {\lambda^*}^T\).)
Transposing both sides and using the symmetry of \(A\), this means that
\[Ax^* = \lambda^*x^*.\]
Hence \(x^*\) is an eigenvector of \(A\), and \(\lambda^*\) is the eigenvalue of \(A\) associated with \(x^*\).
Now the objective evaluated at \(x=x^*\) becomes
\[{x^*}^TAx^* = \lambda^*{x^*}^Tx^* = \lambda^*.\]
That means the objective attains its minimum at the eigenvectors of \(A\) that correspond to its smallest eigenvalue.
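This conclusion is easy to verify numerically. The sketch below (with a randomly generated symmetric matrix) checks that the Rayleigh quotient at the eigenvector returned by `np.linalg.eigh` equals the smallest eigenvalue, and that no random direction does better:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                     # symmetric matrix

rayleigh = lambda x: (x @ A @ x) / (x @ x)

eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
v_min = eigvecs[:, 0]                 # eigenvector of the smallest eigenvalue

# R(v_min) equals the smallest eigenvalue
assert np.isclose(rayleigh(v_min), eigvals[0])

# no random direction gives a smaller quotient
for _ in range(1000):
    x = rng.standard_normal(n)
    assert rayleigh(x) >= eigvals[0] - 1e-12
```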
Strictly speaking, the Minimum-Norm Problem and the Orthogonal Projection of a Point Onto the Affine Set are both instances of the ECQP problem, so this post does not introduce many truly distinct problem classes. However, I still present them together because they are merely different faces of the same underlying concept.