Testing in the Modern World — STOC 2026 Workshop

Date:: June 25-26, 8:30am-11:00am
Location:: Alpine Ballroom A
Contact:: Anindya De <anindyad@seas.upenn.edu>
Shivam Nadimpalli <shivamn@mit.edu>

Property testing asks when we can make reliable global decisions about massive objects (such as graphs, probability distributions, and Boolean functions) by inspecting only tiny parts of them. This simple idea has led to a rich theory, with deep connections to combinatorics, complexity theory, analysis of Boolean functions, and sublinear-time computation.

Many classical questions in the field, especially those motivated by Boolean-function query models and complexity theory, are now reaching a mature stage. This workshop will look ahead to new models and problem formulations that arise naturally in data science, statistics, and machine learning.

The goal is to bring together researchers in property testing, learning theory, complexity theory, statistics, and machine learning to revisit classical problems through modern lenses and highlight promising new directions.

Schedule

Click on talk titles or arrows for abstracts.

June 25, 8:30am-11:00am
8:30-8:35	Opening remarks
8:35-9:20	On Passive Models of Property TestingShivam Nadimpalli (MIT)slides Property testing traditionally studies what can be decided about a large object using a small number of carefully chosen queries. In many statistical and applied settings, however, such query access is unavailable: the algorithm must instead work passively with samples generated by the object or by an associated measurement process. Classical passive models of property testing often lead to strong negative results; for example, Boolean function monotonicity and linearity testing. In this talk, we will highlight a complementary perspective: there are natural and important settings where, despite having only sample access, one can obtain surprisingly strong positive results. I will focus on two such sources of passive access. The first is truncated statistics, where samples are observed only after conditioning on an unknown event, introducing both information loss and sampling bias. The second comes from signal processing and sparse recovery, where one observes random measurements, projections, or samples rather than individual coordinates of an unknown signal.
9:20-10:10	Testing high-dimensional distributions with subcube conditioningErik Waingarten (University of Pennsylvania)slides I'll discuss distribution testing in high-dimensional settings, and how the curse of dimensionality leads to sample complexities which are exponential in the underlying dimension. To get around these exponential lower bounds, we study subcube conditioning --- a query oracle which allows the algorithm to receive samples conditioned on them lying in a particular subcube of the domain. I'll discuss the new algorithmic techniques that this oracle allows us to explore, recent and not-so-recent results, and highlight some open problems.
10:10-11:00	Your data is blue, but can you prove it?Tal Herman (UC Berkeley and MIT)slides Given i.i.d. samples from an unknown distribution, what can we learn about the distribution? How many samples are required to determine whether the distribution admits various properties? These are the central questions of the field of distribution testing, and they have spurred a rich body of work, establishing a deep understanding of the computational complexity associated with such tasks, and their many applications across TCS. In this talk we explore a novel perspective on this topic: what is the complexity of verifying claims about distributions? Suppose some untrusted party allegedly drew many samples from the distribution, ran some complicated analysis on the samples, and claims that the distribution admits a certain property (for example, that the distribution has high entropy, is monotone, is far from uniform, etc.). Can they provide a concise proof of their claim? Can we verify their proof with fewer resources than required for testing the property? We review a recent line of work addressing these questions, where verification is considered via interactive proof systems (as defined in the seminal work of Goldwasser, Micali, and Rackoff '85, and introduced to this setting by Chiesa and Gur '17). We will discuss efficient proof systems for rich families of distribution properties, known lower bounds and limitations of such proofs, and present avenues for future research and possible applications for verification of data-intensive algorithms.
June 26, 8:30am-11:00am
8:30-9:20	Boolean function property testing: the next generationRocco Servedio (Columbia University)slides In the original model of property testing of Boolean functions, the goal is to determine whether an unknown and arbitrary function has a particular property versus is far from every function with the property. After several decades of study, essentially optimal bounds have been established for some fundamental testing problems in this original framework. Much less is known, though, for natural (and more challenging) variants of the original model such as tolerant property testing and relative-error property testing. This talk will survey the state of the art in these Boolean function property testing models, with an emphasis on open questions and directions for future work.
9:20-10:10	Quality control in sublinear timeCassandra Marcussen (Harvard University)slides In this talk, I will introduce "quality control problems." This is a class of problems that assess when it is safe to apply an algorithm that works well on average to a specific input. Specifically, a quality control algorithm for a distribution D and a real-valued quality measuring function rho that is concentrated around 1 for inputs from D, takes an input x and rejects every x whose quality is not close to 1, while accepting most inputs from D. Our goal is to design quality control algorithms that perform this assessment efficiently, for example, in sublinear time. Quality control problems are thus asymmetric hypothesis testing problems with worst-case soundness guarantees ("no false positives") and completeness on average ("low probability of false negatives"). An input passed by the quality control algorithm is thus safe to run on downstream algorithms whose correctness is assured on high-quality instances. In this talk, I will describe some quality control algorithms that use k-clique counts (for constant k) as the quality measure on random graphs G_{n,p}. I will show that quality control algorithms are provably faster than worst-case algorithms for clique-count estimation. I will also highlight two key techniques behind these results: composability and graph quasirandomness. Based on joint work with Ronitt Rubinfeld and Madhu Sudan.
10:10-11:00	Testable Sanitization via Boundary GeometryManolis Zampetakis (Yale University)slides Machine learning models are prone to attacks where an adversary changes the model's prediction by adding a small perturbation to the input. Two such well-known threats in classification are: 1) backdoor attacks, where a malicious trainer plants a hidden structure in the model that allows them to flip the label by small alterations of the input, and 2) adversarial examples, which arise even in honestly trained models, where an adversary can efficiently find a nearby point with the opposite label for most of the input points. Both of these liabilities share a common geometric root: the decision boundary of the model passes too close to too many data points. In this work, we introduce testable sanitization, a framework that either improves the decision boundary of a classifier or rejects it as too risky to deploy. More formally, we ask for an algorithm that, given oracle access to a classifier f and sample access to a data distribution D, either accepts and outputs a function g which is ε-close to f and robust to small perturbations, or rejects. The algorithm performs testable sanitization of the function class F if it is guaranteed to accept whenever f belongs to F. Next, we construct an algorithm that performs testable sanitization of several interesting function classes. First, our algorithm performs testable sanitization of the class of functions with a small decision boundary for accuracy ε that depends on the boundary size. We prove that this dependence is essentially optimal, i.e., no testable sanitization algorithm of this class can achieve arbitrary accuracy ε. Second, we show our algorithm achieves testable sanitization of the class of functions with a small and smooth decision boundary, for arbitrary accuracy ε. Finally, we show that our results apply to the class of halfspaces and of polynomial threshold functions.