Online Gradient Descent

Source file

wc_online_gradient_descent(M::Real, D::Real, n::Int; verbose = true)

Problem statement

Compute a PEPit worst-case guarantee for wc_online_gradient_descent.

Consider the online convex minimization problem, whose goal is to sequentially minimize the regret

\[R_n \triangleq \max_{x\in Q} \sum_{i=1}^n f_i(x_i)-f_i(x),\]

where the functions $f_i$ are $M$-Lipschitz and convex, and where $Q$ is a bounded closed convex set with diameter upper bounded by $D$. We also denote by $x_\star\in Q$ the solution to the minimization problem defining $R_n$ (i.e., $x_\star$ is a reference point). Classical references on the topic include [1, 2]; such algorithms were studied using the performance estimation technique in [3] and using the related IQCs in [4].

Performance metric

This code computes a worst-case guarantee for online gradient descent (OGD) with a step-size $\gamma=D/M/\sqrt{n}$. That is, it computes the smallest possible $\tau(n, M, D)$ such that the guarantee

\[R_n \leqslant \tau(n, M, D)\]

is valid for any such sequence of queries of OGD; that is, $x_t$ are the query points of OGD.

In short, for given values of $n$, $M$, $D$: $\tau(n, M, D)$ is computed as the worst-case value of $R_n$.

Algorithm

Online gradient descent is described by

\[x_{t+1} = x_t - \gamma \nabla f_t(x_t),\]

where $\gamma=D/M/\sqrt{n}$ is a step-size.

Theoretical guarantee

We compare the numerical results with those of [2, Section 2.1.2]:

\[R_n \leqslant MD\sqrt{n}.\]

References

[1] E. Hazan (2016). Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4), 157-325.

[2] F. Orabona (2025). A Modern Introduction to Online Learning.

[3] J. Weibel, P. Gaillard, W.M. Koolen, A. Taylor (2025). Optimized projection-free algorithms for online learning: construction and worst-case analysis

[4] F. Jakob, A. Iannelli (2025). Online Convex Optimization and Integral Quadratic Constraints: A new approach to regret analysis

Arguments

M: the Lipschitz parameter.
D: the diameter of the set.
n: number of iterations.
verbose: print example and solver progress information when true.

Returns

pepit_tau: worst-case value
theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_online_gradient_descent(M, D, n; verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.707107, 0.707107)