# 2017-2018

## 2018/05/23: Leonard Schulman, "Explicit Binary Tree Codes with Polylogarithmic Size Alphabet"

Leonard Schulman (Caltech)

**Abstract:**

Tree codes are “real time” or “causal” error-correcting codes. They are known to exist, but an explicit construction has been a longstanding open problem. We report on progress on this problem.

For every constant delta we give an explicit binary tree code with distance delta and alphabet size poly(log n), where n is the depth of the tree. This is the first improvement over a two-decade-old construction that has an exponentially larger alphabet of size poly(n).
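A tree code's defining property can be stated concretely: whenever two equal-length bit strings diverge at some position, the encodings must disagree on at least a delta fraction of the positions from the divergence onward, while each code symbol may depend only on the bits seen so far. Below is a minimal brute-force checker for this property (the encoder interface and names are illustrative, not from the paper); the toy encoder that outputs the entire prefix trivially achieves distance 1, but with an alphabet that grows exponentially with the depth, which is exactly the cost the new construction avoids.

```python
from itertools import product

def is_tree_code(encode, n, delta):
    """Brute-force check of the tree-code distance property: for every pair of
    length-n bit strings diverging at position i, the symbol-by-symbol
    encodings must differ in at least delta * (n - i) of positions i..n-1.
    `encode` maps a bit prefix to the code symbol for its last position,
    so causality is built in."""
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            i = next((k for k in range(n) if x[k] != y[k]), None)
            if i is None:
                continue  # identical strings never diverge
            cx = [encode(x[:k + 1]) for k in range(n)]
            cy = [encode(y[:k + 1]) for k in range(n)]
            disagreements = sum(a != b for a, b in zip(cx[i:], cy[i:]))
            if disagreements < delta * (n - i):
                return False
    return True
```

For instance, `encode = lambda p: tuple(p)` passes with delta = 1 (huge alphabet), while `encode = lambda p: p[-1]` fails already for delta = 1/2, since two paths can diverge once and then agree forever.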

As part of the analysis, we prove a bound on the number of positive integer roots a real polynomial can have in terms of its sparsity with respect to the Newton basis, a result of independent interest.

Joint work with Gil Cohen (Princeton) and Bernhard Haeupler (CMU)

## 2018/04/25: Danupon Nanongkai, "Distributed All-Pairs Shortest Paths, Exactly"

Danupon Nanongkai (KTH)

**Abstract:**

I will present the ~O(n^{5/4})-time distributed algorithm for computing all-pairs shortest paths exactly by Huang, Nanongkai, and Saranurak (FOCS 2017; https://arxiv.org/abs/1708.03903). The algorithm is fairly simple, and the talk will cover the necessary background. I will also briefly survey recent progress and some open problems in the field of distributed graph algorithms, to which this work belongs.

**Abstract:**

Comparisons are a classical and well studied algorithmic tool that is used in a variety of contexts and applications. We will discuss two manifestations of the expressiveness of these queries in machine learning and complexity theory (a more detailed overview is given below). Both manifestations are based on the notion of “inference dimension” that can be viewed as another instance of the fruitful link between machine learning and discrete mathematics — a link dating back to the discovery of the VC dimension.

**Active classification with comparison queries.** Active learning is a model for semi-supervised learning that captures situations in which unlabeled data is abundant and manually labelling it is expensive. We consider an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query); or which of two restaurants she liked more (a comparison query). We prove that the usage of comparison queries leads to an exponential improvement in the query complexity of some well studied problems. Specifically, for the class of half-spaces, we show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size n using approximately O(log n) queries.
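As a toy illustration of why comparisons help (this is not the paper's algorithm, which achieves a far smaller total query count via inference dimension), one can sort the sample by the hidden signed distance using comparison queries and then locate the single sign change with O(log n) label queries. All names below are hypothetical:

```python
import functools
import random

def reveal_labels(points, label_query, comparison_query):
    """Toy sketch: sort the sample by the hidden signed value via comparison
    queries, then find the unique -1/+1 boundary with a binary search that
    spends only O(log n) label queries."""
    order = sorted(range(len(points)),
                   key=functools.cmp_to_key(
                       lambda i, j: comparison_query(points[i], points[j])))
    lo, hi = 0, len(order)  # first rank with a positive label lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if label_query(points[order[mid]]) > 0:
            hi = mid
        else:
            lo = mid + 1
    return {idx: (+1 if rank >= lo else -1) for rank, idx in enumerate(order)}

# demo with a hidden halfspace w.x = x0 - 2*x1 in the plane
rng = random.Random(0)
w = (1.0, -2.0)
pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(64)]
score = lambda p: w[0] * p[0] + w[1] * p[1]
label = lambda p: 1 if score(p) >= 0 else -1               # label query
compare = lambda a, b: (score(a) > score(b)) - (score(a) < score(b))
labels = reveal_labels(pts, label, compare)
```

The sort uses O(n log n) comparisons; the point of the sketch is only that once comparisons order the sample, the number of *label* queries drops to logarithmic.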

**Nearly optimal linear decision trees for k-SUM and related problems**. We use the above mentioned techniques to construct linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant k, we construct linear decision trees that solve the k-SUM problem on n elements using O(n log n) linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two k-subsets; when viewed as linear queries, comparison queries are 2k-sparse and have only {−1,+1} coefficients. We give similar constructions for sorting sumsets A+B and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms.

Based on joint works with Daniel Kane, Shachar Lovett, and Jiapeng Zhang.
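The sparsity claim in the second paragraph is easy to make concrete: a comparison of the sums of two k-subsets S and T, viewed as a linear query, has coefficient +1 on S \ T, -1 on T \ S, and 0 elsewhere, hence at most 2k nonzeros, all in {−1,+1}. A small sketch (the function name is illustrative):

```python
def comparison_query_vector(S, T, n):
    """Coefficient vector of the linear query 'is sum(x_S) - sum(x_T) > 0?'
    for two subsets S, T of {0, ..., n-1}; indices shared by S and T cancel."""
    v = [0] * n
    for i in S:
        v[i] += 1
    for j in T:
        v[j] -= 1
    return v

# two 3-subsets of {0,...,7}; the shared index 2 cancels out
v = comparison_query_vector({0, 2, 5}, {1, 2, 7}, 8)
```

Here `v` equals `[1, -1, 0, 0, 0, 1, 0, -1]`: a 2k-sparse ±1 query for k = 3.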

## 2018/03/28: Artur Czumaj, "Round Compression for Parallel Matching Algorithms"

Artur Czumaj (Warwick)

**Abstract:**

For over a decade now we have been witnessing the success of massive parallel computation (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context, though, is: can we leverage this additional power to obtain even faster parallel algorithms?

A prominent example here is the fundamental graph problem of finding a maximum matching. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in O(log n) rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has n^{1+Ω(1)} memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in follow-up work, seem, however, to get stuck in a fundamental way at roughly O(log n) rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power.

In this talk, we finally show how to refute that perplexing possibility. That is, we break the above O(log n) round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a (2+ϵ) approximation to maximum matching, for any fixed constant ϵ>0, in O((loglog n)^2) rounds.

This is a joint work with Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski.

**Abstract:**

Is matching in NC? In other words, is there a deterministic fast parallel algorithm for it? This has been an open question for over three decades, ever since the discovery of Random NC matching algorithms. Within this question, the case of planar graphs had remained an enigma: On one hand, counting the number of perfect matchings is generally believed to be harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution.

The case of bipartite planar graphs was solved by Miller and Naor in 1989 via a flow-based algorithm. In 2000, Mahajan and Varadarajan gave an elegant way of using the counting of matchings to find one, hence giving a different NC algorithm.

However, non-bipartite planar graphs still didn’t yield: the stumbling block being odd tight cuts. Interestingly, these are also a key to the solution: a balanced odd tight cut leads to a straightforward divide-and-conquer NC algorithm. The remaining task is to find such a cut in NC. This requires several algorithmic ideas, such as finding a point in the interior of the minimum weight face of the perfect matching polytope and uncrossing odd tight cuts.

Paper available at: https://arxiv.org/pdf/1709.07822.pdf

Joint work with Vijay Vazirani.

## 2018/02/28: Sanjam Garg, "Identity-Based Encryption from the Diffie-Hellman Assumption"

Sanjam Garg (Berkeley)

**Abstract:**

In this talk, I will describe new constructions of identity-based encryption based on the hardness of the Diffie-Hellman problem (without using groups with pairings). Previously, constructions based on this assumption were believed to be impossible. Our construction is based on new techniques that bypass the known impossibility results, using garbled circuits that make non-black-box use of the underlying cryptographic primitives.

(Based on joint work with Nico Döttling.)

## 2018/02/14: Dor Minzer, "2-to-2 Games via expansion on the Grassmann Graph"

Dor Minzer (Tel Aviv University)

**Abstract:**

A fundamental goal in the theory of PCPs asks whether a special type of PCP, namely 2-to-2 Games, exists. This is a variant of Khot’s well-known Unique Games conjecture.

In this talk we will discuss a recent line of study establishing the 2-to-2 games conjecture, albeit with imperfect completeness. At the heart of the approach lies an object called the Grassmann Graph, and a certain linearity test on it. This leads to the study of edge expansion in this graph, and in particular, the study of (small) sets of vertices in the Grassmann Graph, whose edge expansion is bounded away from 1.
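For intuition, the Grassmann Graph itself is simple to construct at toy scale. The sketch below (hard-coded to k = 2 over F_2, with names of my choosing) builds the graph whose vertices are the 2-dimensional subspaces of F_2^n, joined when they intersect in a line:

```python
from itertools import combinations

def grassmann_graph_2(n):
    """Grassmann graph for k = 2 over F_2.  A 2-dimensional subspace is
    represented by its set of three nonzero vectors; vectors are bitmasks
    and F_2-addition is XOR.  Two subspaces are adjacent exactly when their
    intersection has dimension 1, i.e. they share one nonzero vector."""
    vertices = {frozenset({u, v, u ^ v})                 # span{u, v} \ {0}
                for u, v in combinations(range(1, 2 ** n), 2)}
    order = sorted(vertices, key=sorted)
    edges = [(a, b) for a, b in combinations(order, 2) if len(a & b) == 1]
    return order, edges
```

For n = 4 this yields the 35 two-dimensional subspaces of F_2^4 (the Gaussian binomial coefficient), each adjacent to 18 others, so the graph is regular; the talk concerns small vertex sets in such graphs whose edge expansion stays bounded away from 1.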

Based on joint works with Irit Dinur, Subhash Khot, Guy Kindler and Muli Safra

## 2018/01/31: Avi Wigderson, "Optimization, Complexity and Math (through the lens of one problem and one algorithm)"

Avi Wigderson (IAS)

**Abstract:**

In this lecture, we introduce and motivate the main characters in this plot:

– Singularity of symbolic matrices: a basic problem in both computational complexity and mathematics.

– Alternating Minimization: a basic heuristic in non-convex optimization.

I will explain how variants of this algorithm are applied to variants of this problem, how they are analyzed, and how the analysis gives rise to problems in and connections between a surprisingly diverse set of mathematical areas, including quantum information theory, non-commutative algebra and invariant theory, and analysis. Time permitting, we will discuss challenges this work raises in invariant theory and non-convex optimization.

## 2017/12/13: Sébastien Bubeck, "k-server via multiscale entropic regularization"

Sébastien Bubeck (Microsoft Research)

**Abstract:**

I will start by describing how mirror descent is a natural strategy for online decision making, specifically in online learning and metrical task systems. To motivate the k-server problem I will also briefly recall what we know and what we don’t know for structured state/action spaces in these models. Using the basic mirror descent calculations I will show how to easily obtain a log(k)-competitive algorithm for k-paging. I will then introduce our new parametrization of fractional k-server on a tree, and explain how to analyze the movement cost of entropy-regularized mirror descent on this parametrization. This leads to a depth*log(k)-competitive (fractional) algorithm for general trees, and log^2(k) for HSTs. I will also briefly mention dynamic embeddings to go beyond the standard log(n) loss in the reduction from general metrics to HSTs.

Joint work with Michael B. Cohen, James R. Lee, Yin Tat Lee, and Aleksander Madry.

## 2017/11/29: Jon Kelner, "Almost-Linear-Time Algorithms for Markov Chains and New Spectral Primitives for Directed Graphs"

Jon Kelner (MIT)

**Abstract:**

In the analysis of Markov chains, there has been a longstanding algorithmic gap between the general case, corresponding to random walks on directed graphs, and the special case of reversible chains, for which the corresponding graph can be taken to be undirected. This begins with the most basic computational task, computing the stationary distribution, and persists for many of the fundamental problems associated with random walks, such as computing hitting and commute times, escape probabilities, and personalized PageRank vectors. In the undirected case, there are algorithms for all of these problems that run in linear or nearly-linear time, whereas it was unknown in the directed case whether one could solve any of them more efficiently than an arbitrary linear system.
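To make the gap concrete: for a random walk on an undirected graph, the stationary distribution has a closed form, pi(v) proportional to deg(v), whereas in the general directed case one classically falls back on solving a linear system or iterating the walk. A minimal sketch of both (naive power iteration stands in for the generic solve; function names are mine, and the talk's contribution is precisely to replace such generic methods with almost-linear-time machinery):

```python
def stationary_undirected(adj):
    """Reversible case: random walk on an undirected graph (0/1 adjacency
    matrix).  Closed form: pi(v) = deg(v) / sum of all degrees."""
    deg = [sum(row) for row in adj]
    total = sum(deg)
    return [d / total for d in deg]

def stationary_directed(P, iters=5000):
    """General directed case: no closed form.  Naive power iteration on a
    row-stochastic transition matrix P, converging to pi with pi = pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[u] * P[u][v] for u in range(n)) for v in range(n)]
    return pi
```

On a triangle, the undirected formula gives the uniform distribution immediately; for a lazy directed 3-cycle, power iteration converges to the same uniform vector, but only after many matrix-vector products.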

More broadly, this gap has its origins in a substantial discrepancy between the state of algorithmic spectral graph theory in the undirected and directed settings. While the undirected case has a richly developed theory and a powerful collection of algorithmic tools, similar results have remained elusive for directed graphs.

In this talk, I will begin to address this by giving an algorithmic framework that solves all of the problems listed above in almost-linear time. To do so, I will develop the first directed versions of several foundational primitives from undirected algorithmic spectral graph theory that had not been known to exist for directed graphs, notably including the first directed version of graph sparsification and an almost-linear-time solver for directed Laplacian systems. If time permits, I will also briefly discuss more recent work that improves the running time to be nearly linear, thereby eliminating the gap between the undirected and directed versions of these problems (up to polylogarithmic factors).

This talk is based on work with Michael Cohen, Rasmus Kyng, John Peebles, Richard Peng, Anup Rao, Aaron Sidford, and Adrian Vladu

## 2017/11/15: Vinod Vaikuntanathan, "Program Obfuscation and Random CSPs: The Love-Hate Relationship"

Vinod Vaikuntanathan (MIT)

**Abstract:**

A recent line of work shows how to construct indistinguishability obfuscation under two assumptions: (a) that there exist k-linear maps for some constant k; and (b) that certain random O(k)-constraint satisfaction problems (CSPs) are hard in an appropriate sense. The latest of these works (by Lin and Tessaro) assumes the existence of 3-linear maps and the hardness of certain random 3-CSPs. We have had 1-linear maps since the 1970s and 2-linear maps since the 1990s, but the existence of 3-linear maps is wide open. On the other hand, we do have reasonable constructions of “secure” random 3-CSPs. The first part of the talk will describe these developments.

Much more surprising was a result (from the same work of Lin and Tessaro) which showed a construction from 2-linear maps and the hardness of random 2-CSPs over a large alphabet. Overnight, the burden of existence of IO went from the question of whether 3-linear maps exist to the completely unrelated question of whether random 2-CSPs over large alphabets are hard. In a nutshell, they require the existence of pseudo-random generators $G: \Sigma^n \to \{0,1\}^m$ for some poly(n)-size alphabet $\Sigma$, where each output bit depends on at most two input alphabet symbols, and which achieve sufficiently large stretch. In the second part of the talk, we will present a polynomial-time algorithm that breaks these random CSPs.
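The shape of the required object can be sketched directly: a 2-local map from a poly(n)-size alphabet to bits, where each output bit reads only two input symbols. The sample below is purely illustrative of that locality structure (real candidates fix carefully structured predicates; the random predicate tables used here carry no security claim, and all names are mine):

```python
import random

def sample_two_local_map(n, m, sigma, seed=0):
    """A random 2-local map G: Sigma^n -> {0,1}^m with Sigma = {0,...,sigma-1}.
    Each output bit g applies an independent random predicate (a sigma x sigma
    0/1 table) to two randomly chosen input symbols."""
    rng = random.Random(seed)
    gates = []
    for _ in range(m):
        i, j = rng.sample(range(n), 2)
        table = [[rng.randint(0, 1) for _ in range(sigma)] for _ in range(sigma)]
        gates.append((i, j, table))

    def G(x):
        return [table[x[i]][x[j]] for i, j, table in gates]

    return G, gates

G, gates = sample_two_local_map(n=8, m=32, sigma=16)
rng = random.Random(1)
x = [rng.randrange(16) for _ in range(8)]
y = G(x)
```

By construction, changing a symbol that feeds neither input of a gate cannot change that gate's output bit, which is exactly the 2-locality the attack in the talk exploits.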

Based on joint work with Alex Lombardi (MIT) and Rachel Lin (UCSB).

## 2017/11/08: Ola Svensson, "A Constant-factor Approximation Algorithm for the Asymmetric Traveling Salesman Problem"

Ola Svensson (EPFL)

**Abstract:**

We give a constant-factor approximation algorithm for the asymmetric traveling salesman problem. Our approximation guarantee is analyzed with respect to the standard LP relaxation, and thus our result confirms the conjectured constant integrality gap of that relaxation.

Our techniques build upon the constant-factor approximation algorithm for the special case of node-weighted metrics. Specifically, we give a generic reduction to structured instances that resemble but are more general than those arising from node-weighted metrics. For those instances, we then solve Local-Connectivity ATSP, a problem known to be equivalent (in terms of constant-factor approximation) to the asymmetric traveling salesman problem.

This is joint work with Jakub Tarnawski and László Végh.

## 2017/10/25: Seth Pettie, "A Time Hierarchy Theorem for the LOCAL Model"

Seth Pettie (University of Michigan)

**Abstract:**

The celebrated *Time Hierarchy Theorem* for Turing machines states, informally, that more problems can be solved given more time. The extent to which a time hierarchy-type theorem holds in the classic distributed LOCAL model has been open for many years. In particular, it is consistent with previous results that all natural problems in the LOCAL model can be classified according to a small *constant* number of complexities, such as $O(\log^* n)$, $O(\log n)$, $2^{O(\sqrt{\log n})}$, etc.

We establish the first time hierarchy theorem for the LOCAL model and prove that several *gaps* exist in the LOCAL time hierarchy. One of the gap results can be interpreted as showing that the distributed Lovasz local lemma is *complete* for randomized sublogarithmic time.

## 2017/10/11: Moses Charikar, "Hashing-based-Estimators for Kernel Density in High Dimensions"

Moses Charikar (Stanford)

**Abstract:**

Given a set of points in d dimensions, imagine putting a Gaussian distribution around each of them. How quickly can we evaluate the sum of these Gaussian densities at a new point? This computational problem (and its generalization for densities other than the Gaussian) is called kernel density estimation. This problem arises as a basic primitive in statistics (non-parametric density estimation), machine learning (kernel methods) and scientific computing (physical simulations).

The batch version of this question (compute the sum of n kernels at m given query points) is addressed by the celebrated fast multipole method from the late 80s, which has linear or near-linear complexity for points in 2 and 3 dimensions. The high dimensional case has been challenging because, at a high level, typical space partitioning approaches have an exponential dependence on the dimension.

In this talk, I will show that locality sensitive hashing (introduced in the late 90s for the approximate nearest neighbor problem in high dimensions) can be adapted to devise unbiased estimators for kernel density in high dimensions.
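The unbiasedness at the heart of the approach is easy to illustrate in miniature. The sketch below (1-D, names mine) compares the exact O(n) kernel sum with a uniform-sampling unbiased estimator; the hashing-based estimators of the talk are likewise unbiased but sample through LSH buckets, over-weighting points near the query to reduce variance in high dimensions:

```python
import math
import random

def kde_exact(data, q, h=1.0):
    """Exact kernel density at query q: the average of Gaussian kernels
    centered at the data points; costs O(n) per query."""
    return sum(math.exp(-((x - q) / h) ** 2) for x in data) / len(data)

def kde_uniform_sample(data, q, s, h=1.0, seed=0):
    """Unbiased estimator by uniform sampling: average the kernel value over
    s points drawn uniformly from the data, so E[estimate] = kde_exact(q)."""
    rng = random.Random(seed)
    return sum(math.exp(-((rng.choice(data) - q) / h) ** 2)
               for _ in range(s)) / s
```

Uniform sampling is cheap but its variance blows up when the true density is tiny; that is the regime where biasing the sampling with locality sensitive hashing pays off.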

## 2017/09/27: Yuval Peres, "Trace reconstruction for the deletion channel"

Yuval Peres (Microsoft Research)

**Abstract:**

In the trace reconstruction problem, an unknown string $x$ of $n$ bits is observed through the deletion channel, which deletes each bit with some constant probability $q$, yielding a contracted string. How many independent outputs (traces) of the deletion channel are needed to reconstruct $x$ with high probability?

The best lower bound known is linear in $n$. Until recently, the best upper bound was exponential in the square root of $n$. We improve the square root to a cube root using statistics of individual output bits and some complex analysis; this bound is sharp for reconstruction algorithms that only use this statistical information. (Similar results were obtained independently and concurrently by De, O’Donnell and Servedio). If the string $x$ is random and $q<1/2$, we can show a subpolynomial number of traces suffices by comparison to a biased random walk.
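The channel, and the statistics of individual output bits that the mean-based bounds are built on, are straightforward to simulate (a toy sketch; function names are mine):

```python
import random

def deletion_channel(x, q, rng):
    """Each bit of x is deleted independently with probability q; the
    surviving bits are concatenated into the (contracted) trace."""
    return [b for b in x if rng.random() >= q]

def single_bit_frequencies(x, q, traces, rng):
    """For each position j, the empirical frequency with which the j-th bit
    of a trace equals 1 -- the per-coordinate statistics used by mean-based
    reconstruction."""
    ones, seen = {}, {}
    for _ in range(traces):
        for j, b in enumerate(deletion_channel(x, q, rng)):
            ones[j] = ones.get(j, 0) + b
            seen[j] = seen.get(j, 0) + 1
    return {j: ones[j] / seen[j] for j in seen}

rng = random.Random(0)
x = [1] * 10 + [0] * 10
freqs = single_bit_frequencies(x, 0.5, 5000, rng)
```

For this input, the first trace bit is a 1 unless the entire leading run of ten 1s is deleted, so `freqs[0]` sits very close to 1; reconstruction algorithms invert such statistics to recover `x`.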

(Joint works with Fedor Nazarov, STOC 2017 and with Alex Zhai, FOCS 2017).