# 2016-2017

## 2017/06/07: Erik Waingarten, "Settling the query complexity of non-adaptive junta testing"

Erik Waingarten (Columbia University)

**Abstract:**

The problem of testing whether an n-variable function is a k-junta has been one of the main problems in property testing. In this talk, I will show that any non-adaptive algorithm that tests whether an unknown Boolean function f\colon\{0,1\}^n\to\{0,1\} is a k-junta or \varepsilon-far from every k-junta must make \tilde{\Omega}(k^{3/2})/\varepsilon many queries. This result is essentially optimal given Blais’s non-adaptive junta tester, which makes \tilde{O}(k^{3/2})/\varepsilon queries and shows that adaptivity enables polynomial savings in query complexity for junta testing. At a very high level, the proof proceeds by reducing the non-adaptive junta testing of a new class of random Boolean functions to a problem of distinguishing two binomial distributions with a specific kind of noisy query.

This is joint work with Xi Chen, Rocco Servedio, Li-Yang Tan, and Jinyu Xie.

## 2017/05/24: Mohsen Ghaffari, "On the Complexity of Local Distributed Graph Problems"

Mohsen Ghaffari (ETH Zurich)

**Abstract:**

We revisit the complexity of graph problems in the well-studied LOCAL model of distributed computing, introduced by Linial [FOCS ’87]. It is widely known that for many of the classic distributed graph problems (including maximal independent set (MIS) and $(\Delta+1)$-vertex coloring), the randomized complexity is at most polylogarithmic in the size $n$ of the network, while the best known deterministic complexity is typically $2^{O(\sqrt{\log n})}$. Understanding and potentially narrowing down this exponential gap is one of the central long-standing open questions in the area of distributed graph algorithms.

We investigate this question by introducing a complexity-theoretic framework, which allows us to shed some light on the role of randomness in the LOCAL model. In particular, we prove completeness results which, roughly speaking, show that some rather rudimentary-looking problems are “complete” for essentially all the classic local distributed problems. In other words, if any of these complete problems can be solved deterministically in $\poly\log n$ rounds in the LOCAL model, we can deterministically solve essentially all the classic LOCAL-problems (including MIS and $(\Delta+1)$-coloring) in $\poly\log n$ rounds.

Perhaps the most surprising complete problem is the following: Color the nodes of the graph red or blue such that each node of a sufficiently large polylogarithmic degree has at least one neighbor of each color. The problem admits a trivial zero-round randomized algorithm, simply color each node randomly. This completeness result can be viewed as showing that the only obstacle to getting efficient deterministic algorithms in the LOCAL model is an efficient algorithm to round fractional values into integer values, while approximately preserving some linear constraints.

This talk is based on a joint work with Fabian Kuhn and Yannic Maus.

A version of the paper can be found here: https://arxiv.org/abs/1611.02663

## 2017/05/10: Avishay Tal, "Computing Requires Larger Formulas than Approximating"

Avishay Tal (IAS)

**Abstract:**

A de-Morgan formula over Boolean variables x_1, …, x_n is a binary tree whose internal nodes are marked with AND or OR gates and whose leaves are marked with variables or their negation. We define the size of the formula as the number of leaves in it. Proving that some explicit function (in P or NP) requires large formulas is a central open question in computational complexity. I

n this work, we introduce a size-amplification hardness reduction for de-Morgan formulas. We show that average-case hardness implies worst-case hardness for a larger size. More precisely, if a function f cannot be computed correctly on more than 1/2 + eps of the inputs by any formula of size s, then computing f correctly on all inputs requires size ~s*log(1/eps). The tradeoff is essentially tight. Quite surprisingly, the proof relies on a result from quantum query complexity by Reichardt [SODA, 2011].

As an application, we improve the best known formula size lower bounds for explicit functions by logarithmic factors to ~n^3/log(n). In addition, we propose candidates for explicit functions that we believe have formula size ~n^4, and prove non trivial super-quadratic formula size lower bounds for them using our reduction.

## 2017/04/26: Santosh Vempala, "Sampling Polytopes: From Euclid to Riemann"

Santosh Vempala (Georgia Tech)

**Abstract:**

We introduce the Geodesic Walk for sampling Riemannian manifolds and apply it to the problem of generating uniform random points from the interior of a convex polytope in n-dimensional Euclidean space. The walk is a simulation of a stochastic differential equation defined using a convex barrier function; each step is the solution of an ordinary differential equation. It improves the time complexity of sampling polytopes and is the first walk to achieve sub-quadratic complexity. Most of the talk will be spent introducing relevant aspects of manifolds, stochastic processes, efficient solution of differential equations, and how this walk overcomes the bottlenecks of random walks in Euclidean space. No background in string theory (or Riemannian geometry) will be assumed.

Joint work with Lee Yin Tat.

## 2017/04/12: Kasper Green Larsen, "Crossing the Logarithmic Barrier for Dynamic Boolean Data Structure Lower Bounds"

Kasper Green Larsen (Aarhus University)

**Abstract:**

This talks presents the first super-logarithmic lower bounds on the cell probe complexity of dynamic boolean (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds.

We introduce a new method for proving dynamic cell probe lower bounds and use it to prove a $\tilde{\Omega}(\lg^{1.5} n)$ lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting over F_2 ([Pat07]). Proving an $\omega(\lg n)$ lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Patrascu’s obituary [Tho13]. This result also implies the first $\omega(\lg n)$ lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D rectangle stabbing, and for the (non-boolean) problems of range selection and range median.

Our technical centerpiece is a new way of “weakly” simulating dynamic data structures using efficient one-way communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebychev) polynomials which may be of independent interest, and offers an entirely new algorithmic angle on the “cell sampling” method of Panigrahy et al. [PTW10].

## 2017/03/29: Noah Stephens-Davidowitz, "A Reverse Minkowski Theorem"

Noah Stephens-Davidowitz (NYU)

**Abstract:**

A classical problem in the geometry of numbers asks us to estimate how many lattice points lie in some ball around the origin. Minkowski’s celebrated theorem gives us a tight lower bound for this number that depends only on the determinant of the lattice. (One can think of the determinant as the limit of the density of lattice points inside large balls. So, Minkowski’s theorem gives a tight lower bound on a lattice’s “local density” based on its “global density.”) We resolve a conjecture due to Dadush that gives a nearly tight converse to Minkowski’s theorem—an upper bound on the number of lattice points in a ball that depends only on the determinants of sublattices. This theorem has numerous applications, from complexity theory, to the geometry of numbers, to the behavior of Brownian motion on flat tori.

Based on joint work with Oded Regev.

**Abstract**

A matrix M is called rigid if M has ‘high’ rank, and one needs to change ‘many’ entries of M before it has ‘low’ rank. Matrix rigidity was introduced by Leslie Valiant in 1977, as a path towards proving arithmetic circuit lower bounds. In order to prove such a bound, one needs to give an explicit rigid matrix (for certain appropriate rigidity parameters). The problem of finding such an explicit rigid matrix is still open.

One family of matrices which was conjectured to be rigid is the Walsh-Hadamard transform (a.k.a. Sylvester matrices, a.k.a. the communication matrix of Inner Product modulo 2). I will present a proof that the Walsh-Hadamard transform is, in fact, not sufficiently rigid for Valiant’s program. Our proof uses a connection from recent “polynomial method” algorithms between low rank matrices and low degree polynomials. I will also discuss a natural notion we call the probabilistic rank of a matrix, which measures the extent to which a matrix can be probabilistically approximated by low rank matrices.

Based on joint work with Ryan Williams to appear in STOC’17.

## 2017/02/29: Jelani Nelson, "Optimality of the Johnson-Lindenstrauss lemma"

Jelani Nelson (Harvard)

**Abstract**

We consider the question: given integers n,d > 1, and some 0 < epsilon < 1, what is the minimum value of m such that for all n-point subsets X of d-dimensional Euclidean space, there exists an embedding f from X to m-dimensional Euclidean space with distortion at most 1+epsilon? We show that for nearly the full range of interest for the parameters n, d, and epsilon, the Johnson-Lindenstrauss lemma is tight: there exists an n-point subset of d-dimensional Euclidean space such that any such f must have m in Omega(epsilon^{-2} log n).

Joint work with Kasper Green Larsen (Aarhus University).

## 2017/02/01: Nikhil Bansal, "Algorithmic discrepancy beyond partial coloring"

Nikhil Bansal (Eindhoven University of Technology)

**Abstract**

Partial coloring is one of the most widely used methods in combinatorial discrepancy. However, in many cases it leads to sub-optimal bounds as the partial coloring step must be iterated a logarithmic number of times, and the errors can add up arbitrarily. I will describe a recent algorithmic approach that overcomes these limitations and can be used to efficiently find colorings matching the so-called Banaszczyk’s bound for many problems.

Based on joint works with Daniel Dadush and Shashwat Garg.

## 2016/12/07: Jerry Li, "Robust Estimators in High Dimensions without the Computational Intractability"

Jerry Li (MIT)

**Abstract**

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples. Such questions have a rich history spanning statistics, machine learning and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension-dependent factors in their error guarantees. This raises the following question: is high-dimensional agnostic distribution learning even possible, algorithmically?

In this talk, we present the first computationally efficient algorithms with dimension-independent error guarantees for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of spherical Gaussians. Our algorithms achieve error that is independent of the dimension, and in many cases scales nearly optimally with the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high-dimensions, that may be applicable to many other problems.

Based on joint work with Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Ankur Moitra, and Alistair Stewart

**Abstract**

We introduce the notion of an Extremely Lossy Function (ELF). An ELF is a family of functions with an image size that is tunable anywhere from injective to having a polynomial-sized image. Moreover, for any efficient adversary, for a sufficiently large polynomial r (necessarily chosen to be larger than the running time of the adversary), the adversary cannot distinguish the injective case from the case of image size r.

We develop a handful of techniques for using ELFs, and show that such extreme lossiness is useful for instantiating random oracles in several settings. In particular, we show how to use ELFs to build secure point function obfuscation with auxiliary input, as well as polynomially-many hardcore bits for any one-way function. Such applications were previously known from strong knowledge assumptions — for example polynomially-many hardcore bits were only know from differing inputs obfuscation, a notion whose plausibility has been seriously challenged. We also use ELFs to build a simple hash function with output intractability, a new notion we define that may be useful for generating common reference strings.

Next, we give a construction of ELFs relying on the exponential hardness of the decisional Diffie-Hellman problem, which is plausible in pairing-based groups. Combining with the applications above, our work gives several practical constructions relying on qualitatively different — and arguably better — assumptions than prior works.

**Abstract**

When can pieces of information be termed substitutes or complements? I’ll propose an answer based on “value of information” theory and show that these definitions are fundamental to some problems both in game theory and algorithms. The “main result” shows that substitutes characterize good “efficient market hypothesis” equilibria in prediction markets; another shows that substitutes characterize efficient approximation algorithms for information acquisition.

The goal is to have a tutorial feel, introducing some fun concepts and tools that are less familiar to theory audiences. It will be heavy on concepts and definitions, light on technicalities.

## 2016/10/26: Claire Mathieu, "Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics"

Claire Mathieu (ENS Paris)

**Abstract**

We give the first polynomial-time approximation schemes (PTASs) for the following problems: (1) uniform facility location in edge-weighted planar graphs; (2) k-median and k-means in edge-weighted planar graphs; (3) k-means in Euclidean spaces of bounded dimension. Our first and second results extend to minor-closed families of graphs. All our results extend to cost functions that are the p-th power of the shortest-path distance. The algorithm is local search where the local neighborhood of a solution S consists of all solutions obtained from S by removing and adding 1/ϵO(1) centers.

Joint work with Vincent Cohen-Addad and Philip N. Klein.

## 2016/10/19: Ronen Eldan, "Kernel-based methods for Bandit Convex Optimization"

Ronen Eldan (Weizmann Institute)

**Abstract**

I will present the first poly-time and sqrt{T}poly{n}-regret algorithm for bandit convex optimization.

Given a convex body K, the problem can be described as the following sequential game: at each time step t=1,..,T, a player selects an action x_t in K, and simultaneously an adversary selects a convex loss function l_t:K \to [0,1]. The player’s feedback is its suffered loss, l_t(x_t). The player has access to external randomness, and can select her action based on the history. The player’s perfomance at the end of the game is measured through the regret, sum_{t=1}^T l_t(x_t) - min_{x in K} l_t(x), which compares her cumulative loss to the smallest cumulative loss she could have obtained had she known the sequence of loss functions.

Joint w. Sebastien Bubeck and Yin-Tat Lee.

## 2016/09/28: Swastik Kopparty, "High rate locally-correctable codes and locally-testable codes with subpolynomial query complexity"

Swastik Kopparty (Rutgers)

**Abstract**

Locally-correctable codes (LCCs) and locally-testable codes (LTCs) are error-correcting codes which admit sublinear-time error correcting/detecting algorithms. They have interesting connections to complexity theory, cryptography and classical coding theory. In recent years, a number of different LCCs and LTCs with constant rate and distance were discovered, and in all cases the local correcting/testing algorithms had arbitrary polynomially small query complexity.

I will talk about a recent result showing the existence of codes with constant rate and constant distance which are locally-correctable and locally-testable with subpolynomial query complexity. The key new ingredient (in the context of local correction/testing) is an expander-based distance amplification method of Alon, Edmonds and Luby.

Joint work with Or Meir, Noga Ron-Zewi and Shubhangi Saraf

## 2016/09/14: Sushant Sachdeva, “Fast Approximate Gaussian Elimination for Laplacians"

Sushant Sachdeva (Google)

**Abstract**

Solving systems of linear equations in graph Laplacians is a fundamental primitive in scientific computing. Starting with the seminal work of Spielman-Teng that gave the first nearly-linear time algorithm for solving Laplacian systems, there has been a long line of work giving faster Laplacian solvers. These solvers have had a large impact on the design of fast graph algorithms.

In this talk, I’ll present a very simple, nearly-linear time Laplacian solver that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. Our solver builds a sparse Cholesky factorization for Laplacians — the symmetric version of Gaussian elimination. More precisely, it approximates a Laplacian L as U’U, where U is a sparse upper triangular matrix. Since triangular matrices are easy to invert, this immediately implies a fast Laplacian solver via iterative refinement. The analysis is based on concentration of matrix martingales.

*This is joint work with Rasmus Kyng, and will appear at the upcoming FOCS.*