Primitive steps

Primitive steps create symbolic points, gradients, function values, and constraints for common algorithmic operations. They are intended to keep example scripts close to the mathematical method being analyzed.

PEPit.inexact_gradient_step!Function
inexact_gradient_step!(x0::AbstractPoint, f::AbstractFunction, gamma::Real, epsilon::Real; notion::String="absolute")

Create the symbolic primitive step inexact_gradient_step!.

This routine performs a step $x \leftarrow x_0 - \gamma d_{x_0}$ where $d_{x_0}$ is close to the gradient of $f$ in $x_0$ in the following sense:

\[\|d_{x_0} - \nabla f(x_0)\|^2 \leqslant \left\{ \begin{aligned} & \varepsilon^2 & \text{if notion is set to 'absolute'}, \\ & \varepsilon^2 \|\nabla f(x_0)\|^2 & \text{if notion is set to 'relative'}. \end{aligned} \right.\]

This relative approximation is used at least in 3 PEPit examples, in particular in 2 unconstrained convex minimizations: an inexact gradient descent, and an inexact accelerated gradient.

References: [1] E. De Klerk, F. Glineur, A. Taylor (2020). Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization, 30(3), 2053-2082.

Arguments

  • x0: starting point x0.
  • f: a function.
  • gamma: step-size parameter.
  • epsilon: the required accuracy.
  • notion: defines the mode (absolute or relative inaccuracy). By default, notion="absolute".

Returns

  • x: the output point.
  • dx0: the approximate (sub)gradient of f at x0.
  • fx0: the value of the function f at x0.

Throws

  • ErrorException (via error): if notion is not one of "absolute" or "relative".

See also Point, Expression, and add_constraint!.

PEPit.bregman_gradient_step!Function
bregman_gradient_step!(gx0::AbstractPoint, sx0::AbstractPoint, mirror_map::AbstractFunction, gamma::Real)

Create the symbolic primitive step bregman_gradient_step!.

This routine outputs $x$ by performing a mirror step of step-size $\gamma$. That is, denoting $f$ the function to be minimized and $h$ the mirror map, it performs

\[x = \arg\min_x \left[ f(x_0) + \left< \nabla f(x_0);\, x - x_0 \right>\]

      + \frac{1}{\gamma} D_h(x; x_0) \right],

where $D_h(x; x_0)$ denotes the Bregman divergence of $h$ on $x$ with respect to $x_0$.

\[D_h(x; x_0) \triangleq h(x) - h(x_0) - \left< \nabla h(x_0);\, x - x_0 \right>.\]

Arguments

  • gx0: descent direction $\textbf{gx0} \triangleq \nabla f(x_0)$.
  • sx0: starting gradient $\textbf{sx0} \triangleq \nabla h(x_0)$.
  • mirror_map: the reference function $h$ we computed Bregman divergence of.
  • gamma: step-size parameter.

Returns

  • x: new iterate $\textbf{x} \triangleq x$.
  • sx: $h$'s gradient on new iterate $x$ $\textbf{sx} \triangleq \nabla h(x)$.
  • hx: $h$'s value on new iterate $\textbf{hx} \triangleq h(x)$.

See also Point, Expression, and add_constraint!.

PEPit.bregman_proximal_step!Function
bregman_proximal_step!(sx0::AbstractPoint, mirror_map::AbstractFunction, min_function::AbstractFunction, gamma::Real)

Create the symbolic primitive step bregman_proximal_step!.

This routine outputs $x$ by performing a proximal mirror step of step-size $\gamma$. That is, denoting $f$ the function to be minimized and $h$ the mirror map, it performs

\[x = \arg\min_x \left[ f(x) + \frac{1}{\gamma} D_h(x; x_0) \right],\]

where $D_h(x; x_0)$ denotes the Bregman divergence of $h$ on $x$ with respect to $x_0$.

\[D_h(x; x_0) \triangleq h(x) - h(x_0) - \left< \nabla h(x_0);\, x - x_0 \right>.\]

Arguments

  • sx0: starting gradient $\textbf{sx0} \triangleq \nabla h(x_0)$.
  • mirror_map: the reference function $h$ we computed Bregman divergence of.
  • min_function: function we aim to minimize.
  • gamma: step-size parameter.

Returns

  • x: new iterate $\textbf{x} \triangleq x$.
  • sx: $h$'s gradient on new iterate $x$ $\textbf{sx} \triangleq \nabla h(x)$.
  • hx: $h$'s value on new iterate $\textbf{hx} \triangleq h(x)$.
  • gx: $f$'s gradient on new iterate $x$ $\textbf{gx} \triangleq \nabla f(x)$.
  • fx: $f$'s value on new iterate $\textbf{fx} \triangleq f(x)$.

See also Point, Expression, and add_constraint!.

PEPit.epsilon_subgradient_step!Function
epsilon_subgradient_step!(x0::AbstractPoint, f::AbstractFunction, gamma::Real)

Create the symbolic primitive step epsilon_subgradient_step!.

This routine performs a step $x \leftarrow x_0 - \gamma g_0$ where $g_0 \in\partial_{\varepsilon} f(x_0)$. That is, $g_0$ is an $\varepsilon$-subgradient of $f$ at $x_0$. The set $\partial_{\varepsilon} f(x_0)$ (referred to as the $\varepsilon$-subdifferential) is defined as (see [1, Section 3])

\[\partial_{\varepsilon} f(x_0)=\left\{g_0:\,\forall z,\, f(z)\geqslant f(x_0)+\left< g_0;\, z-x_0 \right>-\varepsilon \right\}.\]

An alternative characterization of $g_0 \in\partial_{\varepsilon} f(x_0)$ consists in writing

\[f(x_0)+f^*(g_0)-\left< g_0;x_0\right>\leqslant \varepsilon.\]

References: [1] A. Brndsted, R.T. Rockafellar. On the subdifferentiability of convex functions. Proceedings of the American Mathematical Society 16(4), 605-611 (1965)

Arguments

  • x0: starting point x0.
  • f: a function.
  • gamma: step-size parameter.

Returns

  • x: the output point.
  • g0: an $\varepsilon$-subgradient of f at x0.
  • f0: the value of the function f at x0.
  • epsilon: the value of epsilon.

See also Point, Expression, and add_constraint!.

PEPit.exact_linesearch_step!Function
exact_linesearch_step!(x0::AbstractPoint, f::AbstractFunction, directions)

Create the symbolic primitive step exact_linesearch_step!.

This routine outputs some $x$ by mimicking an exact line/span search in specified directions. It is used for instance by the Julia examples examples/unconstrained_convex_minimization/gradient_exact_line_search.jl and examples/unconstrained_convex_minimization/conjugate_gradient.jl.

The routine aims at mimicking the operation:

\[\begin{aligned} x & = & x_0 - \sum_{i=1}^{T} \gamma_i d_i,\\ \text{with } \overrightarrow{\gamma} & = & \arg\min_\overrightarrow{\gamma} f\left(x_0 - \sum_{i=1}^{T} \gamma_i d_i\right), \end{aligned}\]

where $T$ denotes the number of directions $d_i$. This operation can equivalently be described in terms of the following conditions:

\[\begin{aligned} x - x_0 & \in & \text{span}\left\{d_1,\ldots,d_T\right\}, \\ \nabla f(x) & \perp & \text{span}\left\{d_1,\ldots,d_T\right\}. \end{aligned}\]

In this routine, we instead constrain $x_{t}$ and $\nabla f(x_{t})$ to satisfy

\[\begin{aligned} \forall i=1,\ldots,T: & \left< \nabla f(x);\, d_i \right> & = & 0,\\ \text{and } & \left< \nabla f(x);\, x - x_0 \right> & = & 0, \end{aligned}\]

which is a relaxation of the true line/span search conditions.

Arguments

  • x0: the starting point.
  • f: the function on which the (sub)gradient will be evaluated.
  • directions: the list of all directions required to be orthogonal to the (sub)gradient of x.

Returns

  • x: such that all vectors in directions are orthogonal to the (sub)gradient of f at x.
  • gx: a (sub)gradient of f at x.
  • fx: the function f evaluated at x.

See also Point, Expression, and add_constraint!.

PEPit.inexact_proximal_step!Function
inexact_proximal_step!(x0::AbstractPoint, f::AbstractFunction, gamma::Real; opt::String="PD_gapII")

Create the symbolic primitive step inexact_proximal_step!.

This routine encodes an inexact proximal operation with step size $\gamma$. That is, it outputs a tuple $(x, g\in \partial f(x), f(x), w, v\in\partial f(w), f(w), \varepsilon)$ which are described as follows.

First, $x$ is an approximation to the proximal point of $x_0$ on function $f$:

\[x \approx \mathrm{prox}_{\gamma f}(x_0)\triangleq\arg\min_x \left\{ \gamma f(x) + \frac{1}{2}\|x-x_0\|^2\right\},\]

where the meaning of $\approx$ depends on the option "opt" and is explained below. The notions of inaccuracy implemented within this routine are specified using primal and dual proximal problems, denoted by

\[ \begin{aligned} &\Phi^{(p)}_{\gamma f}(x; x_0) \triangleq \gamma f(x) + \frac{1}{2}\|x-x_0\|^2,\\ &\Phi^{(d)}_{\gamma f}(v; x_0) \triangleq -\gamma f^*(v)-\frac{1}{2}\|x_0-\gamma v\|^2 + \frac{1}{2}\|x_0\|^2,\\ \end{aligned}\]

where $\Phi^{(p)}_{\gamma f}(x;x_0)$ and $\Phi^{(d)}_{\gamma f}(v;x_0)$ respectively denote the primal and the dual proximal problems, and where $f^*$ is the Fenchel conjugate of $f$. The options below encode different meanings of "$\approx$" by specifying accuracy requirements on primal-dual pairs:

\[(x,v) \approx_{\varepsilon} \left(\mathrm{prox}_{\gamma f}(x_0),\,\mathrm{prox}_{f^*/\gamma}(x_0/\gamma)\right),\]

where $\approx_{\varepsilon}$ corresponds to require the primal-dual pair $(x,v)$ to satisfy some primal-dual accuracy requirement:

\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) \leqslant \varepsilon,\]

where $\varepsilon\geqslant 0$ is the error magnitude, which is returned to the user so that one can constrain it to be bounded by some other values.

Relation to the exact proximal operation: In the exact case (no error in the computation, $\varepsilon=0$), $v$ corresponds to the solution of the dual proximal problem and one can write

\[x = x_0-\gamma g,\]

with $g=v=\mathrm{prox}_{f^*/\gamma}(x_0/\gamma)\in\partial f(x)$, and $x=w$.

Reformulation of the primal-dual gap: In regard with the exact proximal computation; the inexact case under consideration here can be described as performing

\[x = x_0-\gamma v + e,\]

where $v$ is an $\epsilon$-subgradient of $f$ at $x$ (notation $v\in\partial_{\epsilon} f(x)$) and $e$ is some additional computation error. Those elements allow for a common convenient reformulation of the primal-dual gap, written in terms of the magnitudes of $\epsilon$ and of $e$:

\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) = \frac{1}{2} \|e\|^2 + \gamma \epsilon.\]

Options: The following options are available (a list of such choices is presented in [4]; we provide a reference for each of those choices below).

- 'PD_gapI' : the constraint imposed on the output is the vanilla (see, e.g., [2])

\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(v;x_0) \leqslant \varepsilon.\]

This approximation requirement is used in one PEPit example: an accelerated inexact forward backward.

- 'PD_gapII' : the constraint is stronger than the vanilla primal-dual gap, as more structure is imposed
               (see, e.g., [1,5]) :

\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(g;x_0) \leqslant \varepsilon,\]

where we imposed that $v\triangleq g\in\partial f(x)$ and $w\triangleq x$. This approximation
requirement is used in two PEPit examples: in a relatively inexact proximal point algorithm and in a partially
inexact Douglas-Rachford splitting.

- 'PD_gapIII' : the constraint is stronger than the vanilla primal-dual gap, as more structure is imposed
                (see, e.g., [3]):

\[\Phi^{(p)}_{\gamma f}(x;x_0)-\Phi^{(d)}_{\gamma f}(\tfrac{x_0 - x}{\gamma};x_0) \leqslant \varepsilon,\]

where we imposed that $v \triangleq \frac{x_0 - x}{\gamma}$.

References:

[[1] R.T. Rockafellar (1976).
Monotone operators and the proximal point algorithm. SIAM journal on control and optimization, 14(5), 877-898.](https://epubs.siam.org/doi/pdf/10.1137/0314056)

[[2] R.D. Monteiro, B.F. Svaiter (2013).
An accelerated hybrid proximal extragradient method for convex optimization
and its implications to second-order methods.
SIAM Journal on Optimization, 23(2), 1092-1125.](https://epubs.siam.org/doi/abs/10.1137/110833786)

[[3] S. Salzo, S. Villa (2012).
Inexact and accelerated proximal point algorithms.
Journal of Convex analysis, 19(4), 1167-1192.](http://www.optimization-online.org/DB_FILE/2011/08/3128.pdf)

[[4] M. Barre, A. Taylor, F. Bach (2020).
Principled analyses and design of first-order methods with inexact proximal operators.](https://arxiv.org/pdf/2006.06041v3.pdf)

[[5] A. d'Aspremont, D. Scieur, A. Taylor (2021).
Acceleration Methods.
Foundations and Trends in Optimization: Vol. 5, No. 1-2.](https://arxiv.org/pdf/2101.09545.pdf)

Arguments

  • x0: point for which we aim to compute an approximate proximal step.
  • f: function whose proximal operator is approximated.
  • gamma: step-size parameter.
  • opt: option (type of error requirement) among 'PDgapI', 'PDgapII', and 'PD_gapIII'.

Returns

  • x: the approximated proximal point.
  • gx: a (sub)gradient of f at x (subgradient used in evaluating the accuracy criterion).
  • fx: f evaluated at x.
  • w: a point w such that v (see next output) is a subgradient of f at w.
  • v: the approximated proximal point of the dual problem, (sub)gradient of f evaluated at w.
  • fw: f evaluated at w.
  • eps_var: value of the primal-dual gap (which can be further bounded by the user).

See also Point, Expression, and add_constraint!.

PEPit.proximal_step!Function
proximal_step!(x0::AbstractPoint, f::AbstractFunction, gamma::Real)

Create the symbolic primitive step proximal_step!.

This routine performs a proximal step of step-size gamma, starting from x0, and on function f. That is, it performs:

\[\begin{aligned} x \triangleq \text{prox}_{\gamma f}(x_0) & \triangleq & \arg\min_x \left\{ \gamma f(x) + \frac{1}{2} \|x - x_0\|^2 \right\}, \\ & \Updownarrow & \\ 0 & = & \gamma g_x + x - x_0 \text{ for some } g_x\in\partial f(x),\\ & \Updownarrow & \\ x & = & x_0 - \gamma g_x \text{ for some } g_x\in\partial f(x). \end{aligned}\]

Arguments

  • x0: starting point x0.
  • f: function on which the proximal step is computed.
  • gamma: step-size parameter.

Returns

  • x: proximal point.
  • gx: the (sub)gradient of f at x.
  • fx: the function value of f on x.

See also Point, Expression, and add_constraint!.

PEPit.linear_optimization_step!Function
linear_optimization_step!(dir::AbstractPoint, ind::AbstractFunction)

Create the symbolic primitive step linear_optimization_step!.

This routine outputs the result of a minimization problem with linear objective (whose direction is provided by dir) on the domain of the (closed convex) indicator function ind. That is, it outputs a solution to

\[\arg\min_{\text{ind}(x)=0} \left< \text{dir};\, x \right>,\]

One can notice that $x$ is solution of this problem if and only if

\[- \text{dir} \in \partial \text{ind}(x).\]

Linear optimization oracles are classically used in conditional gradient-type algorithm (a.k.a., Frank-Wolfe) [1].

References: [1] M. Frank, P. Wolfe (1956). An algorithm for quadratic programming. Naval research logistics quarterly, 3(1-2), 95-110.

Arguments

  • dir: direction of optimization
  • ind: convex indicator function

Returns

  • x: oracle output.
  • gx: the (sub)gradient of ind on x.
  • fx: the function value of ind on x.

See also Point, Expression, and add_constraint!.

PEPit.shifted_optimization_step!Function
shifted_optimization_step!(dir::AbstractPoint, f::AbstractFunction)

Create the symbolic primitive step shifted_optimization_step!.

This routine outputs a stationary point of a minimization problem:

\[\arg\min_{x} f(x)-\left< \text{dir};\, x \right>.\]

That is, it outputs $x$ such that

\[\text{dir} \in \partial f(x).\]

Shifted optimization oracles are classically used in difference-of-convex algorithms (a.k.a., convex-concave procedure), see, e.g., [1].

References: [1] H.A. Le Thi, T. Pham Dinh (2018). DC programming and DCA: thirty years of developments. Mathematical Programming, 169(1), 5-68.

Arguments

  • dir: direction/linear shift in the objective of the optimization problem
  • f: function

Returns

  • x: oracle output.
  • gx: the (sub)gradient of f at x.
  • fx: the function value of f at x.

See also Point, Expression, and add_constraint!.