Workshop Program

The workshop will take place on Monday, February 13th (starting at 11:30 a.m.) and Tuesday, February 14th (ending at 3 p.m.).

The program summary and the list of participants can be found in this document.
The talks will also be streamed online. Contact the organizers to receive access.

The recordings of the talks are available in a YouTube playlist.

Monday the 13th

11:30am Welcome & practical information!
11:45am Opening talk - Adrien Taylor (Inria Paris)
Biography Adrien obtained his PhD in early 2017 from the Department of Mathematical Engineering of the Université catholique de Louvain. He was then a postdoctoral researcher in the SIERRA team at Inria Paris from 2017 to 2019, before becoming a research scientist in the same team in 2019.
My personal view on computer-assisted analyses and design of optimization methods via performance estimation problems
Abstract In this presentation, I want to provide an overview of the performance estimation approach for analyzing and designing optimization methods. I will focus on what I believe are the key ingredients, strengths, and weaknesses of the methodology. I will also discuss a few aspects and links that I believe have been somewhat overlooked in recent years, as well as the use of performance estimation beyond its comfort zone.
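
As background for readers new to the approach, the worst-case analysis of, say, \(N\) steps of gradient descent with step size \(\gamma\) on the class \(\mathcal{F}_{0,L}\) of \(L\)-smooth convex functions can be phrased as an optimization problem over functions and iterates (a standard formulation, not specific to this talk):
\[
\begin{array}{ll}
\underset{f \in \mathcal{F}_{0,L},\; x_0, \dots, x_N,\; x_\star}{\text{maximize}} & f(x_N) - f(x_\star) \\
\text{subject to} & \nabla f(x_\star) = 0, \quad \|x_0 - x_\star\|^2 \le R^2, \\
& x_{k+1} = x_k - \gamma \nabla f(x_k), \quad k = 0, \dots, N-1.
\end{array}
\]
Sampling \(f\) only at the points appearing above and replacing membership in \(\mathcal{F}_{0,L}\) by the corresponding interpolation conditions turns this infinite-dimensional problem into a finite-dimensional semidefinite program.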

References:
[1] B. Goujaud, C. Moucer, F. Glineur, J. M. Hendrickx, A. Taylor, A. Dieuleveut, “PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python”, 2023. PDF
[2] Adrien Taylor, Francis Bach, “Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions”, 2021. PDF
[ Preprint , Slides , Video ]
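
The PEPit package of reference [1] automates this pipeline. Below is a minimal sketch of such a computation (worst-case of gradient descent on smooth strongly convex functions), following the usage pattern documented in [1]; the parameter values are illustrative and exact class/method names may differ slightly between versions.

from PEPit import PEP
from PEPit.functions import SmoothStronglyConvexFunction

# Worst-case of f(x_N) - f(x_*) over N gradient steps on the class of
# L-smooth, mu-strongly convex functions, with ||x_0 - x_*||^2 <= 1.
problem = PEP()
mu, L, gamma, N = 0.1, 1.0, 1.0, 5                # illustrative values
f = problem.declare_function(SmoothStronglyConvexFunction, mu=mu, L=L)
xs = f.stationary_point()                         # minimizer x_* of f
x0 = problem.set_initial_point()                  # arbitrary starting point
problem.set_initial_condition((x0 - xs) ** 2 <= 1)

x = x0
for _ in range(N):
    x = x - gamma * f.gradient(x)                 # gradient descent step

problem.set_performance_metric(f(x) - f(xs))
tau = problem.solve()                             # worst-case value, via an SDP solver
print(tau)
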
12:45pm Lunch
2:00pm Talks Session 1
Anne Rubens (UCLouvain) - A novel approach for interpolation In this talk, we present a novel approach to the interpolation problem for given function classes: what necessary and sufficient conditions must a set of data satisfy to ensure the existence of a function of the class, defined on the whole space, that interpolates the data? The derivation of such conditions is crucial in the PEP framework, since a priori tight performance guarantees can only be obtained for function classes whose interpolation conditions are known.
Our method for analyzing conditions, and hence extending the field of application of the framework, is the following: given a constraint satisfied by a finite set of data, we require it to be extensible to any arbitrary new point, rather than interpolable by a function defined on the whole space. This approach to the interpolation problem allows one to dispense with the analytic properties of the function class and work purely at an algebraic level. As a by-product, it makes it easy to obtain counterexamples when a condition characterizing a function class is not interpolable. As an illustration, we provide interpolation conditions for weakly convex functions whose quasi-subgradients are bounded, and for weakly convex functions satisfying a sharpness condition.
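
For context, the best-known examples of such conditions (standard results, not part of this talk) are those for convex and for smooth convex functions: a data set \(\{(x_i, g_i, f_i)\}_{i \in I}\) is interpolable by a proper closed convex function if and only if
\[
f_i \ge f_j + \langle g_j, x_i - x_j \rangle \quad \text{for all } i, j \in I,
\]
and by an \(L\)-smooth convex function if and only if
\[
f_i \ge f_j + \langle g_j, x_i - x_j \rangle + \frac{1}{2L}\|g_i - g_j\|^2 \quad \text{for all } i, j \in I.
\]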

Sebastian Banert (Lund University) - Deriving algorithms with deviations This talk will present a methodology to derive algorithms for convex optimisation and monotone inclusions. It is based on performance estimation of one step of a Lyapunov function, carried out with symbolic calculations. The resulting algorithms allow for a trade-off between flexibility and guaranteed performance, where the former is expressed by the possibility of choosing the next iterate within a ball whose radius is given by quantities that have already been computed. We call this concept "deviations"; it allows for heuristic modifications of existing algorithms (e.g., by means of deep learning) while retaining convergence or performance guarantees.

Sébastien Colla (UCLouvain) - Automated Performance Estimation for Decentralized Optimization via Network Size Independent Problems We present a methodology to automatically compute worst-case performance bounds for a large class of first-order decentralized optimization algorithms. These algorithms aim at minimizing the average of local functions that are distributed across a network of agents, and typically combine local computations and consensus steps. Our methodology is based on the Performance Estimation Problem (PEP) approach, which allows computing the worst-case performance, and a worst-case instance, of first-order optimization algorithms by solving an SDP. In previous work, we proposed two ways of representing consensus steps in PEPs, which allow writing and solving PEPs for decentralized optimization. In these formulations, the size of the PEP grows with the number of agents in the network, while in most cases the worst-case guarantees obtained do not change with the number of agents. In this work, we therefore propose a new PEP formulation for decentralized optimization whose size is independent of the number of agents in the network. For this purpose, we take a global view of the decentralized problem and decouple the consensus subspace from its orthogonal complement. We apply our methodology to different decentralized methods, such as DGD, DIGing and EXTRA, and obtain numerically tight performance guarantees that are valid for any network size.
[ Preprint , Slides , Video ]
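
As a concrete example of the algorithms covered, the decentralized gradient descent (DGD) update combines a consensus step, through an averaging matrix \(W = [w_{ij}]\) supported on the network, with a local gradient step (standard form, with step size \(\alpha\)):
\[
x_i^{k+1} = \sum_{j=1}^{n} w_{ij}\, x_j^{k} - \alpha \nabla f_i(x_i^{k}), \qquad i = 1, \dots, n.
\]
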
3:15pm Break
3:30pm Talks Session 2
Hadi Abbaszadehpeivasti (Tilburg University) - Convergence rate analysis of the gradient descent-ascent method for convex-concave saddle-point problems In this talk, I present our study of the gradient descent-ascent method for convex-concave saddle-point problems. We derive a new non-asymptotic global convergence rate in terms of the distance to the solution set, using the semidefinite programming performance estimation method. The resulting convergence rate incorporates most parameters of the problem, and it is exact, for one iteration, for a large class of strongly convex-strongly concave saddle-point problems. We also investigate the algorithm without strong convexity and provide some necessary and sufficient conditions under which gradient descent-ascent enjoys linear convergence.
[ Preprint , Slides , Video ]
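
For reference, one common (simultaneous) form of the gradient descent-ascent update for \(\min_x \max_y f(x, y)\), with step sizes \(\alpha, \beta > 0\), is
\[
x_{k+1} = x_k - \alpha \nabla_x f(x_k, y_k), \qquad y_{k+1} = y_k + \beta \nabla_y f(x_k, y_k);
\]
the talk may of course consider a particular variant or parametrization.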

Eduard Gorbunov (Mohamed bin Zayed University of Artificial Intelligence) - Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity Algorithms for min-max optimization and variational inequalities are often studied under monotonicity assumptions. Motivated by non-monotone machine learning applications, we follow the line of works [Diakonikolas et al., 2021, Lee and Kim, 2021, Pethick et al., 2022, Böhm, 2022] aiming at going beyond monotonicity by considering the weaker negative comonotonicity assumption. In particular, we provide tight complexity analyses for the Proximal Point, Extragradient, and Optimistic Gradient methods in this setup, closing some questions on their working guarantees beyond monotonicity.
[ Preprint , Slides , Video ]
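
For readers unfamiliar with the assumption: an operator \(F\) is \(\rho\)-negatively comonotone (with \(\rho \ge 0\)) if
\[
\langle F(x) - F(y), x - y \rangle \ge -\rho \|F(x) - F(y)\|^2 \quad \text{for all } x, y,
\]
which reduces to monotonicity when \(\rho = 0\). The extragradient update with step size \(\gamma\) reads
\[
\tilde{x}_k = x_k - \gamma F(x_k), \qquad x_{k+1} = x_k - \gamma F(\tilde{x}_k).
\]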

Moslem Zamani (Tilburg University) - The exact worst-case convergence rate of the alternating direction method of multipliers In this talk, we derive new non-ergodic convergence rates for the alternating direction method of multipliers (ADMM) using performance estimation. We give examples showing the exactness of the given bounds. Moreover, we study the linear (R-linear) convergence of ADMM under assumptions that are weaker than the existing ones in the literature.
[ Preprint , Slides , Video ]
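
For reference, the ADMM iteration for \(\min_{x,z} f(x) + g(z)\) subject to \(Ax + Bz = c\), with augmented Lagrangian \(L_\rho(x, z, y) = f(x) + g(z) + \langle y, Ax + Bz - c \rangle + \frac{\rho}{2}\|Ax + Bz - c\|^2\), is (in one standard form; the talk may use a different scaling):
\[
x^{k+1} = \arg\min_x L_\rho(x, z^k, y^k), \quad
z^{k+1} = \arg\min_z L_\rho(x^{k+1}, z, y^k), \quad
y^{k+1} = y^k + \rho\,(Ax^{k+1} + Bz^{k+1} - c).
\]
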
4:45pm Break
5:00pm Long Talk - Etienne de Klerk (Tilburg University)
Biography Etienne de Klerk completed his BSc and MSc degrees at the University of Pretoria in South Africa, and obtained his PhD degree from the Delft University of Technology in The Netherlands in 1997. From January 1998 to September 2003, he held assistant professorships at the Delft University of Technology, and from September 2003 to September 2005 an associate professorship at the University of Waterloo, Canada, in the Department of Combinatorics & Optimization. In September 2004 he was appointed at Tilburg University, The Netherlands, first as an associate professor, and then as full professor (from June 2009). From August 29th, 2012, until August 31st, 2013, he was also appointed as full professor in the Division of Mathematics of the School of Physical and Mathematical Sciences at the Nanyang Technological University in Singapore. From September 1st, 2015 to August 31st 2019, he also held a full professorship at the Delft University of Technology (1 day/week position).
Some recent advances in SDP performance analysis
Abstract Semidefinite programming (SDP) performance analysis was introduced in 2014 in a seminal paper by Drori and Teboulle. Since then, it has become a standard tool for establishing the (often exact) rates of convergence of many iterative methods, by formulating the worst-case convergence rate as the optimal value of an SDP problem. This has yielded new insight into many methods, including gradient descent, Newton’s method for self-concordant functions, ADMM, DCA, etc. In this talk we will review some recent results on SDP performance analysis for iterative methods. The talk will be based on joint work with Hadi Abbaszadehpeivasti and Moslem Zamani.
[ Preprint , Slides ]
6:00pm End of the day
7:00pm Dinner

Tuesday the 14th

9:00am Talks Session 3
Nizar Bousselmi (UCLouvain) - Performance Estimation of First-Order Methods on Functions composed with Linear Mappings We develop a Performance Estimation Problem representation for linear mappings. We consider convex optimization problems involving linear mappings, such as those whose objective functions include compositions of the type \(g(Mx)\), or featuring linear constraints of the form \(Mx=b\). First-order methods designed to tackle these problems will typically exploit their specific structure and will need to compute at each iteration products of iterates by matrices \(M\) or \(M^T\).
Our goal is to identify the worst-case behavior of such first-order methods, based on the Performance Estimation Problem (PEP) methodology. We develop interpolation conditions for linear operators \(M\) and \(M^T\), which allow us to embed them in the PEP framework and thus to evaluate the worst-case performance of a wide variety of problems involving linear mappings. We cover both the symmetric and nonsymmetric cases and allow bounds on the spectrum of these operators (lower and upper bounds on the eigenvalues in the symmetric case, maximum singular value in the nonsymmetric case). As a by-product, we also obtain interpolation conditions and worst-case performance results for the class of convex quadratic functions.
We demonstrate the scope of our tool by computing several tight worst-case convergence rates, including that of the gradient method applied to the minimization of \(g(Mx)\) and that of the Chambolle-Pock algorithm.
[ Preprint , Slides , Video ]
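
As a simple illustration of why products with \(M\) and \(M^T\) appear: for the composite objective \(F(x) = g(Mx)\), the chain rule gives
\[
\nabla F(x) = M^T \nabla g(Mx),
\]
so each gradient evaluation requires one multiplication by \(M\) (to form \(Mx\)) and one by \(M^T\).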

Céline Moucer (Inria Paris) - A systematic approach to Lyapunov analyses of continuous-time models in convex optimization First-order methods are often analyzed via their continuous-time models, where their worst-case convergence properties are usually approached via Lyapunov functions. In this work, we provide a systematic and principled approach to find and verify Lyapunov functions for classes of ordinary and stochastic differential equations. More precisely, we extend the performance estimation framework, originally proposed by Drori and Teboulle (2012), to continuous-time models. We retrieve convergence results comparable to those of discrete methods using fewer assumptions and convexity inequalities, and provide new results for stochastic accelerated gradient flows.
[ Preprint , Slides , Video ]
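
A textbook instance of the kind of continuous-time Lyapunov argument this talk systematizes: for the gradient flow \(\dot{x}(t) = -\nabla f(x(t))\) with \(f\) convex, the function
\[
V(t) = t\,\big(f(x(t)) - f(x_\star)\big) + \tfrac{1}{2}\|x(t) - x_\star\|^2
\]
is nonincreasing along trajectories, which immediately yields \(f(x(t)) - f(x_\star) \le \|x(0) - x_\star\|^2 / (2t)\).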

Yassine Kamri (UCLouvain) - On the worst-case analysis of block coordinate-wise algorithms Block coordinate-wise algorithms are an essential class of first-order optimization methods, widely used to solve large-scale optimization problems. However, their worst-case performance is still not well understood, and their practical success has not yet been explained by existing convergence analyses. We analyze the worst-case behavior of cyclic versions of block coordinate-wise algorithms in the context of unconstrained optimization of convex functions with coordinate-wise Lipschitz gradients. For this purpose, we rely on the recently proposed Performance Estimation Problem (PEP) methodology and develop a new characterization of this class of functions, from which we obtain necessary interpolation conditions.
In this work, we present a unifying framework for the worst-case analysis of block coordinate-wise algorithms, and in some situations we are able to substantially improve on the best current bounds.
Preprint: https://arxiv.org/pdf/2211.17018.pdf
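
In this setting, \(f\) has coordinate-wise Lipschitz gradients with constants \(L_1, \dots, L_n\) when, for each block \(i\), \(\|\nabla_i f(x + U_i h) - \nabla_i f(x)\| \le L_i \|h\|\) for all \(x\) and \(h\), where \(U_i\) embeds block \(i\) into the full space. A typical cyclic iteration (standard form; the talk may consider variants) then reads
\[
x^{k+1} = x^{k} - \frac{1}{L_{i_k}} U_{i_k} \nabla_{i_k} f(x^{k}), \qquad i_k = (k \bmod n) + 1.
\]
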
10:15am Break
10:35am Long Talk - Shuvomoy Das Gupta (MIT)
Biography Shuvomoy Das Gupta is a fourth-year Ph.D. student at the MIT Operations Research Center. His research focuses on developing efficient algorithms for large-scale optimization and machine learning. He obtained his M.A.Sc. from the ECE Department at the University of Toronto in 2016 and then worked as a researcher at the Research & Technology Department of Thales Canada for three years. His M.A.Sc. research on energy-efficient railway timetables has been applied to the largest installed base of communication-based train control systems worldwide. Previously, he obtained a Bachelor of Science in Electrical and Electronic Engineering from the Bangladesh University of Engineering and Technology.
Design and analysis of first-order methods via nonconvex QCQP frameworks
Abstract In this presentation, we will provide an overview of recent approaches for analyzing and designing first-order methods using nonconvex QCQPs. In this computer-assisted approach, the key idea involves posing the analysis or design of first-order methods as nonconvex but practically tractable QCQPs and then solving them to global optimality using custom spatial branch-and-bound algorithms. In particular, we will illustrate how to use this nonconvex QCQP framework to (i) analyze adaptive first-order methods, and (ii) design efficient fixed-step first-order methods.
First, we will discuss a nonconvex QCQP framework [1] to analyze the worst-case convergence of nonlinear conjugate gradient methods (NCGMs) for smooth strongly convex minimization. NCGMs are adaptive methods known for their good empirical performance but have incomplete analyses. By solving the associated nonconvex QCQPs to global optimality, we construct mathematical proofs that establish the first non-asymptotic convergence bound for the Fletcher-Reeves NCGM (historically the first NCGM) and an improved non-asymptotic convergence bound for the Polak-Ribière-Polyak NCGM. Conversely, from the solutions of those QCQPs we construct “bad” functions on which these NCGMs perform worse than regular steepest descent.
Then, we will briefly discuss BnB-PEP [2], which is a framework for numerically constructing optimal fixed-step first-order methods for convex and nonconvex optimization. We will provide examples of numerically constructed optimal methods and analytical convergence proofs with potential function structures that are generated using BnB-PEP.

References:
[1] Shuvomoy Das Gupta, Robert M. Freund, Xu Andy Sun, Adrien B. Taylor, “Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses”, 2023. PDF, Code
[2] Shuvomoy Das Gupta, Bart P.G. Van Parys, Ernest K. Ryu, “Branch-and-Bound Performance Estimation Programming: A Unified Methodology for Constructing Optimal Optimization Methods”, 2022. PDF, Code
[ Preprint , Slides , Video ]
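
For reference, the nonlinear conjugate gradient methods discussed in [1] iterate, starting from \(d_0 = -\nabla f(x_0)\),
\[
x_{k+1} = x_k + \alpha_k d_k, \qquad d_{k+1} = -\nabla f(x_{k+1}) + \beta_{k+1} d_k,
\]
where \(\alpha_k\) is chosen by (exact or approximate) line search and the two classical choices of \(\beta_{k+1}\) are
\[
\beta^{\mathrm{FR}}_{k+1} = \frac{\|\nabla f(x_{k+1})\|^2}{\|\nabla f(x_k)\|^2}, \qquad
\beta^{\mathrm{PRP}}_{k+1} = \frac{\langle \nabla f(x_{k+1}), \nabla f(x_{k+1}) - \nabla f(x_k) \rangle}{\|\nabla f(x_k)\|^2}.
\]
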
11:35am Break
11:50am Talks Session 4
Radu Dragomir (EPFL) - Computer-aided analysis of Bregman methods In this talk, I will present applications of the performance estimation framework to the analysis of Bregman/mirror descent methods in the context of relatively smooth optimization. For generic Bregman functions, the performance estimation problem is a linear program; solving it allows one to infer a lower bound showing that the standard mirror descent method is optimal. Then, I will discuss attempts at refining the analysis for the specific case of entropic mirror descent.
[ Preprint , Slides , Video ]
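
For context, with Bregman divergence \(D_h(x, y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle\), the Bregman (mirror descent) step with step size \(\gamma\) is
\[
x_{k+1} = \arg\min_x \Big\{ \gamma \langle \nabla f(x_k), x \rangle + D_h(x, x_k) \Big\},
\]
and \(f\) is said to be \(L\)-smooth relative to \(h\) when \(Lh - f\) is convex, i.e., when \(D_f(x, y) \le L\, D_h(x, y)\) for all \(x, y\).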

Teodor Rotaru (UCLouvain & KULeuven) - Tight convergence rates of the gradient method applied on smooth hypoconvex functions In this talk we present the first tight convergence analysis of the gradient method applied to smooth hypoconvex (weakly convex) functions. Hypoconvex functions are smooth nonconvex functions whose curvature is bounded and assumed to belong to the interval \([\mu, L]\), with \(\mu < 0\). Our convergence rates improve and extend the existing analysis for smooth nonconvex functions with \(L\)-Lipschitz gradient (which corresponds to the case \(\mu = -L\)), and smoothly interpolate between that class and the class of smooth convex functions. We obtain our results using the performance estimation framework adapted to hypoconvex functions, for which new interpolation conditions are determined.
We show explicit upper bounds on the minimum gradient norm of the iterates for the entire range of step sizes \((0, 2/L)\), prove that these rates are tight when the step size is smaller than or equal to a threshold larger than \(1.5/L\), and conjecture their tightness above that threshold. Convergence rates in the convex case are a particular case of our analysis. Finally, we identify the optimal constant step size that minimizes the worst-case performance of the gradient method applied to hypoconvex functions.
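
In this class, a differentiable function \(f\) with curvature in \([\mu, L]\) (one common way to state the assumption) satisfies, for all \(x, y\),
\[
\frac{\mu}{2}\|y - x\|^2 \le f(y) - f(x) - \langle \nabla f(x), y - x \rangle \le \frac{L}{2}\|y - x\|^2,
\]
i.e., \(f - \frac{\mu}{2}\|\cdot\|^2\) is convex (since \(\mu < 0\), this is weak convexity) while the curvature remains upper bounded by \(L\).
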
12:40pm Lunch
2:00pm Long Talk - Pontus Giselsson (Lund University) - part of the INMA Seminar series
Biography Pontus Giselsson is an Associate Professor at the Department of Automatic Control at Lund University, Sweden. His current research interests include mathematical optimization and its applications in, e.g., control, machine learning, statistical estimation, and signal processing. He received an M.Sc. degree from Lund University in 2006 and a Ph.D. degree from Lund University in 2012.
This talk will be co-presented by Manu Upadhyaya (Lund University).
Tight Lyapunov function existence analysis for first-order methods
Abstract We present a unifying framework for establishing linear convergence rates for common first-order methods used to solve convex optimization problems. In particular, we consider i) classes of convex optimization problems of finite sum form with (possibly strongly) convex and (possibly) smooth functional components, and ii) first-order methods that can be written in so-called state-space form, i.e., as a linear system in feedback interconnection with the subdifferentials of the functional components of the objective function. The validity of a given target linear convergence rate is established by deriving a necessary and sufficient condition for verifying the existence of a quadratic Lyapunov function for the algorithm and problem class under consideration for the chosen rate, which amounts to the feasibility of a small-sized semidefinite program. This allows us to find the smallest linear convergence rate for which such a quadratic Lyapunov function exists, by bisection search, yielding a tight procedure. The approach is numerically exemplified on several algorithmic schemes.
[ Preprint , Slides , Video ]
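
For readers unfamiliar with the setup: once an algorithm is written in state-space form \(\xi_{k+1} = A\xi_k + Bu_k\), where \(u_k\) collects the (sub)gradient evaluations performed at points \(y_k = C\xi_k\), a quadratic Lyapunov function certifying a linear rate \(\rho\) is, roughly, a function \(V(\xi) = \xi^T P \xi\) (possibly augmented with function-value terms) with \(P \succeq 0\) such that
\[
V(\xi_{k+1}) \le \rho^2\, V(\xi_k)
\]
along all trajectories generated by the problem class, a condition that can be checked via a small semidefinite program. This is a generic sketch of such analyses, not the talk's precise formulation.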

3:00pm End of PEP talks!
Interested participants are welcome to stay onsite for the remainder of the day if they wish, for further discussions or social activities.