PRIMES: Research Papers

2025 Research Papers

484) Thanh Can (PRIMES) and Thomas Rüd, Measures on Cameron's treelike classes and applications to tensor categories (4 March 2026)

Measures on Fraïssé classes are a key input in the Harman-Snowden (2022) construction of tensor categories. Treelike Fraïssé classes provide a particularly tractable source of examples. In this paper, we complete the classification of measures on Cameron's elementary treelike classes. In particular, for the class $\partial \mathfrak{T}_3(n)$ of node-colored rooted binary tree structures with $n$ colors, we classify measures by an explicit bijection with directed rooted trees edge-labeled by $\{1, \dots, n\}$ with a distinguished vertex, yielding $(2n+2)^n$ distinct $\mathbb{Z}\left[\frac 12\right]$-valued measures. For each $n \geq 1$, we use a family of measures $\mu_n^I$ and their supports $\partial \mathfrak{T}_3(n)^{\mathrm{ord}}_I$ (where $I \subseteq \{1, \dots, n\}$) to construct the Karoubi envelopes $\mathbf{Rep}(\partial \mathfrak{T}_3(n)^{\mathrm{ord}}_I;\mu^I_n)$, producing infinite families of semisimple tensor categories with superexponential growth that cannot be obtained via Deligne's interpolation of representation categories. We also prove the nonexistence of measures on the $n$-colored tree class $C_n\mathfrak{T}$ for $n \geq 2$ and the labeled tree class $L \mathfrak{T}$, extending Snowden's results for uncolored trees.

483) Sophia Hou, Field of definition of abelian surfaces with maximal Picard rank (18 Feb 2026)

We study the field of definition of abelian surfaces of maximal Picard rank and of the closely related singular K3 surfaces. Such abelian surfaces decompose as a product of two isogenous CM elliptic curves and are determined (up to isomorphism) by an integer parameter measuring the relative conductors of their elliptic curve factors and by a single CM elliptic curve. In the key case where the first parameter is a power of a prime, we show that the smallest number field over which the surface can be defined is related to the ring class field attached to the larger conductor: it is either the full ring class field or its maximal real subfield, with the two cases distinguished precisely by whether the elliptic curve parameter’s j-invariant is real. Our approach uses explicit constructions from CM theory and class field theory. For the field of definition of K3 surfaces, we use the classification via quadratic forms given by Shioda and Inose to show that for a K3 surface associated with a quadratic form that has trivial square in the form class group, a smaller field of definition than the appropriately chosen ring class field can be achieved.

482) Advaith Mopuri and Maggie Shen, Investigating Particle Properties of Saturn’s Narrow Rings from Diffraction Reconstructed Profiles Obtained from Cassini Radio Science (17 Feb 2026)

The Cassini mission’s Radio Science Subsystem (RSS) conducted occultation observations of the rings by transmitting coherent radiation at wavelengths of 0.94 cm (Ka band), 3.6 cm (X band), and 13 cm (S band) to several Deep Space Network (DSN) stations across the globe. Diffraction effects for each band are primarily caused by particles comparable in size to its wavelength. As such, comparing the wavelength dependence of optical depth to values predicted by Mie scattering theory allows us to constrain the power-law size distribution of particles in Saturn’s rings. We utilize a novel high-resolution reconstruction method to obtain the diffraction-corrected optical depth of narrow ringlets at three wavelengths. From these measurements, we infer that there are significant regional differences in size distributions throughout various narrow ringlets, which may hold clues about their varying dynamical environments. In particular, we identify differences in particle size distributions between the F ring, the structurally similar Strange ringlet, and the C ring plateaus. Our results indicate that the F ring properties are different from those of other ring regions, which may be related to the speculated clumpy nature of the F ring itself.

481) Kyle Wu, On Number Fields with Unit Group of a Prescribed Reduction (15 Feb 2026)

We investigate the attainability of various subgroups of $\mathbb F_{p^n}^\times$ as images of the unit group of a number field under reduction modulo an inert prime. We prove several results about possible images under reduction when fixing a finite field $\mathbb F_{p^n}$ and varying the number field $K$ of degree $n$ in which $p$ is inert. Using the finite-field norm, we fully describe the maximal image for general $n$ and obtain a complete description of the possible images in the quadratic case. We also consider the analogous problem for unit groups of non-maximal orders of a quadratic number field, where the number field is fixed and the order is varied. Similarly, we consider the analogous problem for $S$-unit groups of localizations of the ring of integers, where the number field is fixed and the choice of localization is varied.

480) Aiden Jeong, Bounds on the Distinguishing (Chromatic) Number of Posets (13 Feb 2026)

In 2021, Collins and Trenk introduced the distinguishing number and distinguishing chromatic number of posets as analogs for the distinguishing (chromatic) number of graphs. A coloring $c$ of a poset $P$ is distinguishing if there are no nontrivial automorphisms of $P$ preserving $c$. The distinguishing number $D(P)$ is the minimum number of colors in a distinguishing coloring. The distinguishing chromatic number $\chi_D(P)$ is the minimum number of colors in a proper distinguishing coloring. We present the bound $D(L)\le |Q_L|-h(L)+2$, where $L$ is a lattice with join-irreducible set $Q_L$ and height $h(L)$. For distributive lattices, Collins and Trenk showed that $\chi_D(L)\le |Q_L|+\chi_D(Q_L)-1$ when $\chi_D(Q_L)\ge 3$. We improve the bound to $|Q_L|+k$, where $k\le 3$ is determined by $Q_L$. Our bound is sharp for boolean lattices. We also establish bounds on the distinguishing number of graded lattices via the motion lemma, and we compute the distinguishing (chromatic) number of the Young-Fibonacci lattice.
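
The definitions above can be checked by brute force on very small posets. The following is a minimal sketch, not the method of the paper: it enumerates all automorphisms and all colorings, so it is only practical for a handful of elements. The function names and the encoding of a poset as a list of elements plus a list of strict comparabilities `(a, b)` meaning $a < b$ are assumptions made for illustration.

```python
from itertools import permutations, product

def automorphisms(elements, less_than):
    """Return every bijection of `elements` that maps the strict order onto itself."""
    relation = set(less_than)
    found = []
    for image in permutations(elements):
        f = dict(zip(elements, image))
        # On a finite poset, a bijection is an automorphism exactly when it
        # permutes the set of comparable pairs.
        if {(f[a], f[b]) for a, b in relation} == relation:
            found.append(f)
    return found

def distinguishing_number(elements, less_than):
    """Smallest k such that some k-coloring is preserved by no nontrivial automorphism."""
    nontrivial = [f for f in automorphisms(elements, less_than)
                  if any(f[x] != x for x in elements)]
    for k in range(1, len(elements) + 1):
        for colors in product(range(k), repeat=len(elements)):
            c = dict(zip(elements, colors))
            # The coloring is distinguishing if every nontrivial automorphism
            # changes the color of at least one element.
            if all(any(c[f[x]] != c[x] for x in elements) for f in nontrivial):
                return k

# Boolean lattice on two atoms: bottom < a, b < top.
b2_elements = ["0", "a", "b", "1"]
b2_order = [("0", "a"), ("0", "b"), ("0", "1"), ("a", "1"), ("b", "1")]
d_b2 = distinguishing_number(b2_elements, b2_order)

# A three-element chain has only the trivial automorphism.
d_chain = distinguishing_number([1, 2, 3], [(1, 2), (2, 3), (1, 3)])

# A three-element antichain admits every permutation as an automorphism.
d_antichain = distinguishing_number(["x", "y", "z"], [])
```

In the Boolean lattice the only nontrivial automorphism swaps the two atoms, so two colors are needed and suffice; the chain needs one color and the antichain needs three.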

479) Anay Aggarwal, Computer-Aided Discovery of Extremal Unit-Distance Graphs & Quantum Contextuality (11 Feb 2026)

We develop a general framework for the computer-aided discovery of extremal unit-distance graphs (UDGs) in Euclidean spaces and on spheres, motivated by problems in Euclidean Ramsey theory and quantum contextuality. Our methods aim to construct UDGs with large edge-density. We introduce two main computational paradigms: approximation-based methods, which relax unit-distance constraints via $(\varepsilon, \delta)$-graphs, and lattice-based methods, which reduce the infinite search space using number-theoretic lattices. We implement and compare three algorithmic approaches—reinforcement learning, simulated annealing, and numerical optimization—demonstrating their effectiveness in constructing dense planar UDGs and offering promising pathways toward new bounds in quantum contextuality. Our framework adapts to higher-dimensional spaces via the Raiskii spindle and naturally embeds into the sphere $\mathbb{S}^d_{1/\sqrt{2}}$, discovering novel UDGs in both cases. This work lays the foundation for computational discovery of many types of extremal UDGs.

478) Jiayu (Jerry) Liu (PRIMES) and Nathan Williams, Full-Twist Presentations for Fundamental Groups of Complements of Complexified Hyperplane Arrangements (8 Feb 2026)

This paper conjectures full-twist presentations for the fundamental group of the complement of a complexified finite central real hyperplane arrangement. We first investigate the case of Coxeter arrangements, generalizing the classical Artin presentation of pure braid groups. Building on Salvetti’s combinatorial model for arrangement complements, we introduce a family of generators arising from a minimal gallery of regions and describe relations produced from rank two subarrangements. We then focus on the notion of the full twist and conjecture a presentation of the fundamental group by equating all the reduced words for the full twist. We will prove this conjecture for Coxeter arrangements of types $A$, $B$, $D$, $H_3$, $I_2(m)$, and $F_4$.

477) Luv Udeshi (PRIMES) and Brynmor Chapman, Extending $\mathsf{CC}^0$ circuit upper bounds beyond symmetric functions (3 Feb 2026)

The class $\mathsf{CC}^0[m]$ of constant‐depth circuits built from unbounded‐fan‐in $\operatorname{MOD}_m$ gates reveals a remarkable landscape: when $m$ is prime, computational ability is limited in small size, yet composite $m$ unlocks unexpectedly rich capabilities. In this work, we illuminate and extend the true power of composite‐modulus counting by generalizing Chapman and Williams' result for computing symmetric functions in $\mathsf{CC}^0$, helping to support the fact that $\mathsf{CC}^0[m]$ is a versatile class capable of capturing complex structures with surprising efficiency. We additionally provide a Satisfiability Modulo Theories (SMT)-based framework for explicitly constructing and enumerating small $\mathsf{CC}^0[m]$ circuits, offering a computational tool for future research.

476) Bryan Zhu (PRIMES) and Ziang Chen, Exact Instance Compression for Convex Empirical Risk Minimization via Color Refinement (arXiv.org, 31 Jan 2026)

Empirical risk minimization (ERM) can be computationally expensive, with standard solvers scaling poorly even in the convex setting. We propose a novel lossless compression framework for convex ERM based on color refinement, extending prior work from linear programs and convex quadratic programs to a broad class of differentiable convex optimization problems. We develop concrete algorithms for a range of models, including linear and polynomial regression, binary and multiclass logistic regression, regression with elastic-net regularization, and kernel methods such as kernel ridge regression and kernel logistic regression. Numerical experiments on representative datasets demonstrate the effectiveness of the proposed approach.

475) Eddy Li (PRIMES) and Kenta Suzuki, Standard modules of the Temperley-Lieb algebra at zero (30 Jan 2026)

We explicitly describe the category of modules of the Temperley-Lieb algebra $TL_n(\beta)$ under specialization $\beta=0$ for even $n$ in terms of a quiver algebra, analogous to a result of Berest-Etingof-Ginzburg. In particular, we explicitly construct an exact sequence of the standard modules of $TL_n(0)$, which categorifies a numerical coincidence regarding the evaluation of the Jones polynomial at $t=-1$. We furthermore deduce a consequence in the representation theory of symmetric groups over characteristic two.

474) Eddy Li, Knot polynomials on braid closures (30 Jan 2026)

We explore properties of the Jones and Alexander polynomials of braid closures of braids of index $3$. We analyze how the evaluations of these polynomials at $t=-1$ force implications regarding the topological structure of the braid closure, such as its number of circles or the existence of splittings. As a consequence, we construct an infinite number of pairs of distinct non-split links that have the same HOMFLYPT polynomial.

473) Grant Blitz, Darren Han, and Hengrui Liang, Divisibility in power monoids (29 Jan 2026)

Let $M$ be a commutative monoid. The (restricted) finitary power monoid of $M$ is the monoid consisting of all finite nonempty subsets of $M$ (containing a unit) under the so-called sumset operation. We say that $M$ possesses the MCD (resp., MCD-finite) property if every nonempty finite subset of $M$ admits at least one (resp., at most finitely many) maximal common divisors (MCDs). In this paper we investigate divisibility conditions based on MCDs in the class of finitary power monoids. Our first goal is to study the ascent of the MCD and MCD-finite properties from the base monoid $M$ to its finitary power monoid $\mathcal{P}_{\mathrm{fin}}(M)$ and its restricted finitary power monoid $\mathcal{P}_{\mathrm{fin},\mathcal{U}}(M)$. We prove that both properties ascend in full generality: if $M$ is an MCD monoid (resp., an MCD-finite monoid), then so is $\mathcal{P}_{\mathrm{fin}}(M)$ (resp., $\mathcal{P}_{\mathrm{fin},\mathcal{U}}(M)$). Then we turn to the irreducible-divisor-finite (IDF) property, whose ascent to polynomial extensions has been considered over the past three decades by many authors, including Malcolmson, Okoh, and Zafrullah. In this direction, we prove that the IDF property ascends to finitary power monoids over the class of MCD-finite monoids, whose analogue for polynomial extensions was established in 2018 by Eftekhari and Khorsandi. In the final section we consider polynomial extensions: first, we prove that the MCD-finite property ascends to polynomial extensions, and then we prove that every primitive-super-primitive monoid (PSP monoid) possesses the MCD-finite property, which allows us to connect two recent results about the ascent of the IDF property to polynomial extensions.

472) Jialai She (PRIMES), Gil Alterovitz, SOLVE: A structured orthogonal latent variable framework for disentangling confounding in matrix data (28 Jan 2026), published in Biology Methods and Protocols 11:1 (2026); see reports at AAAS EurekaAlert! and MedicalXpress

Latent factor models are valuable in bioinformatics for accounting for unmeasured variation alongside observed covariates. Yet many methods struggle to separate known effects from latent structure and to handle losses beyond standard regression. We present a unified framework that augments row and column predictors with a low-rank latent component, jointly modeling measured effects and residual variation. To remove ambiguity in estimating observed and latent effects, we impose a carefully designed set of orthogonality constraints on the coefficient and latent factor matrices, relative to the spans of the predictor matrices. These constraints ensure identifiability, yield a decomposition in which the latent term captures only variation unexplained by the covariates, and improve interpretability. An efficient algorithm handles general non-quadratic losses via surrogates with monotone descent. Each iteration updates the latent term by truncated singular value decomposition of a doubly projected residual and refines coefficients by projections. The number of latent factors is selected by applying an elbow rule to a degrees-of-freedom-adjusted information criterion. A parametric bootstrap provides valid inference on feature-outcome associations under the regularized low-rank structure. Applied to real pharmacogenomic data, the method recovers biologically coherent gene-drug associations missed by standard factor models, such as the EGFR-inhibitor link, highlights novel candidates with plausible mechanisms, and reveals gene programs aligned with compound modes of action, including a latent unfolded-protein-response module affecting drug sensitivity. These results support the framework’s utility for precision oncology, yielding stronger biomarkers for patient stratification and deeper insight into drug resistance mechanisms.

471) Neil Kolekar, Maiya Qiu, and Richard Wang, On irreducible polynomial with positive integer coefficients (23 Jan 2026)

We investigate the distribution of irreducible polynomials within the semidomain of polynomials with non-negative integer coefficients. Our main result establishes that the atomic density of this structure is $1$; that is, asymptotically almost all polynomials with nonnegative integer coefficients are irreducible. We contrast this with the set of polynomials having coefficients restricted to zero and one, proving that their atomic density is exactly $1/2$. Furthermore, we derive improved asymptotic bounds for the number of reducible polynomials with bounded degree and height. Finally, we apply these global density results to the local setting, providing a new proof of a Goldbach-type theorem for Laurent polynomials.

470) Timothy Chen, Tony Lu, and Alan Yao, On the additive structure of simple semiring extensions (22 Jan 2026)

For $\alpha \in \mathbb{C}$, let $\mathbb{N}_0[\alpha]$ be the subsemiring of $\mathbb{C}$ obtained as the image of the $\alpha$-evaluation map $\mathbb{N}_0[x] \to \mathbb{C}$ defined as $p(x) \mapsto p(\alpha)$ for each polynomial $p(x) \in \mathbb{N}_0[x]$. Fundamental arithmetic and atomic aspects of the additive structure of $\mathbb{N}_0[\alpha]$ were first studied by the second author and Correa–Morris in 2022 under the assumption that $\alpha \in \mathbb{R}$. In this paper, we continue the investigation, now from the valuation-theoretic perspective and assuming the more general case of $\alpha \in \mathbb{C}$. Let $\mathcal{V}$ denote the class consisting of all the semirings $\mathbb{N}_0[\alpha]$ containing no additive irreducibles (these are precisely those having non-atomic additive structure). We show that for any algebraic number $\alpha$ the additive monoid of $\mathbb{N}_0[\alpha]$ is isomorphic to the direct product of finitely many isomorphic valuation monoids (i.e., monoids whose principal ideals form a chain under inclusion). Moreover, for any algebraic number $\alpha \in (0,1)$, the semiring $\mathbb{N}_0[\alpha]$ belongs to $\mathcal{V}$ if and only if $\alpha^{-1}$ is a Perron number having no other positive conjugates besides itself. In addition, we offer a description of the algebraic parameters $\alpha$ for which the additive structure of $\mathbb{N}_0[\alpha]$ is a valuation monoid. We also argue that the subset of $(0,1)$ consisting of all algebraic parameters $\alpha$ such that the additive structure of $\mathbb{N}_0[\alpha]$ is a valuation monoid is dense in $(0,1)$. We then consider some atomic and ideal-theoretic aspects of the monoids $M_\alpha$, identifying various classes of algebraic parameters $\alpha$ for which $M_\alpha$ is atomic but does not satisfy the ACCP. Finally, we construct non-ACCP atomic monoids $M_\alpha$ whose sets of lengths are arithmetic sequences of a prescribed difference.

469) Bofan Liu, Seabert Mao, and Michael Zhao, Prime element stability in ring extensions of integral domains (22 Jan 2026)

The behavior of prime elements under ring extensions of integral domains is a fundamental topic in commutative algebra. Given an extension of integral domains $R \subseteq T$ and a prime element $p$ of $R$, we identify conditions under which $p$ remains prime in intermediate rings. Assuming that $p$ is prime in $T$, we prove that $p$ remains prime in every intermediate ring whenever $T$ is an integral overring of a $1$-dimensional domain $R$. Furthermore, we show that if $p$ is coprime to the conductor of the extension $R \subseteq T$, then $p$ remains prime in $T$ and all intermediate rings. Next, with the help of a result on prime behavior in minimal extensions, we prove that this prime stability holds for any extension satisfying the FCP condition, i.e., every chain of distinct intermediate rings between $R$ and $T$ is finite. Finally, we determine that if an extension $R \subset T$ satisfies prime stability for a given prime element $p$ and $v_p(r)$ is finite for all nonzero $r \in R$, then $T$ must be an overring of $R$.

468) Jessica Hu, Aryan Raj, and Shijie Zhang (MIT), Sparse Inference of Nonlinear Laboratory Earthquake Dynamics (18 Jan 2026)

The frictional state of faults plays a fundamental role in earthquake nucleation and recurrence. Laboratory earthquake (“labquake”) experiments provide controlled conditions for investigating stick-slip dynamics, yielding high-resolution data on shear stress, gouge evolution, and acoustic emissions (AEs). While prior work has shown that AEs can be used to forecast failure times with machine learning, such models provide limited physical interpretability and require large datasets. Moreover, mechanistically discovering the governing equations of labquake systems remains difficult due to their nonlinear and stochastic nature. In contrast, data-driven inference of governing equations offers a transparent and flexible pathway to uncover the physical rules behind fault stress evolution. Here, we introduce a novel stochastic Bayesian inference framework to systematically analyze labquake frictional dynamics. We infer governing stochastic differential equations (SDEs) with inhomogeneous Poisson processes, incorporating both microslip and major slip failures. Our results reveal that the growth function for fault stress follows a nonlinear hyperbolic sine relation with respect to the frictional state, inferred using the ADAM-SINDy framework, which combines sparse nonlinear model discovery with Adam optimization. This approach provides physically interpretable governing equations, bridging the gap between phenomenological labquake data and theoretical friction laws. More broadly, our methodology demonstrates how stochastic inference can mechanistically uncover governing equations in complex natural systems such as landslides and fracture mechanics.

467) Maxwell Fishelson (MIT) and Michael Han, Improved Bounds for Novelty Games (13 Jan 2026)

We study the novelty game, a combinatorial problem in which $pk$ integers from $[1,N]$ are distributed evenly among $p$ non-communicating players, who each output $m$ numbers. The players must collectively ensure that at least one of them outputs a number not in the original list. Focusing on oblivious strategies, we propose a new framework for novelty games and then introduce a sequence of six optimizations based on that framework. These optimizations lead to an improvement on the upper bound compared to the previous state of the art. More specifically, we improve the bound for the $(3,2,1)$ game from approximately $1.71 \times 10^6$ to $193{,}050$, which is a reduction of over $99.8\%$. Our techniques also lead to exponential improvements in the general $(p,k,1)$ game, with a reduction of $e^{2k^\frac{p}{2}}\prod_{i=0}^{p-1}(k^i)!$. We also provide two conjectures on the lower bounds of the novelty game and conjecture that our upper bound is tight up to subexponential factors.

466) Kai Lum, Extremal Structural Results for Feedback Arc Sets and Graph Inversions (12 Jan 2026)

In a digraph, a feedback arc set is a set of edges whose removal eliminates every directed cycle, and the minimum size of such a set is denoted by $\beta(G)$. A digraph is $r$-free if it contains no directed cycles of length at most $r$. In this paper, we investigate the minimum feedback arc set in digraphs that are $(r-1)$-free. We prove that $\beta(G) \leq 1$ if $r > \lfloor\frac{2n}{3}\rfloor$, and $\beta(G) \leq 2$ if $r > \frac{n}{2}$ with a forbidden structure. We also present an efficient linear-time algorithm to identify the minimum feedback arc set when $\beta(G) = 1$. For tournaments, we further refine the extremal parameter $\operatorname{inv}_k(n)$. It is the minimum number of inversions required to transform an $n$-vertex tournament into an acyclic tournament, where each step involves reversing all edges within a subset of at most $k$ vertices. We improve the known upper bound for $\operatorname{inv}_4(n)$ using techniques involving Ramsey numbers with monochromatic subgraph structures, and the bound for $\operatorname{inv}_k(n)$ with two different approaches.
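
To make the quantity $\beta(G)$ concrete, here is a minimal brute-force sketch. It is exponential in the number of edges and is not the linear-time algorithm mentioned in the abstract; the function names and the encoding of a digraph as a vertex count plus a list of arcs `(u, v)` are assumptions made for illustration.

```python
from itertools import combinations

def is_acyclic(n, edges):
    """Kahn's algorithm: a digraph is acyclic iff every vertex can be peeled off."""
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == n

def min_feedback_arc_set(n, edges):
    """Try arc subsets in order of increasing size; return the first that breaks all cycles."""
    edges = list(edges)
    for size in range(len(edges) + 1):
        for chosen in combinations(range(len(edges)), size):
            kept = [e for i, e in enumerate(edges) if i not in chosen]
            if is_acyclic(n, kept):
                return [edges[i] for i in chosen]

# A directed 5-cycle: its shortest directed cycle has length 5.
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
beta_cycle = len(min_feedback_arc_set(5, cycle5))

# Two vertex-disjoint directed triangles on 6 vertices.
triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
beta_triangles = len(min_feedback_arc_set(6, triangles))
```

The 5-cycle gives $\beta = 1$, while the two disjoint triangles need one arc removed from each and give $\beta = 2$.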

465) Ashley Yu, Time Efficient Swap Regret Minimization (8 Jan 2026)

No-regret learning algorithms provide principled frameworks for multi-agent decision-making, with swap regret minimization enabling convergence to correlated equilibria, a stronger solution concept than the coarse correlated equilibria achieved by external regret algorithms. The classical Blum-Mansour (BM) algorithm achieves optimal $O(\sqrt{NT \log N})$ swap regret bounds, but computing the stationary distribution of an $N \times N$ Markov chain at each iteration requires $O(N^3)$ time complexity that severely limits scalability. We propose a novel approach that replaces exact stationary distribution computation with efficient sampling-based estimation, reducing per-iteration complexity from $O(N^3)$ to $O(N)$ while maintaining the fundamental structure of the original algorithm.
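
The contrast between computing a stationary distribution and estimating it by sampling can be illustrated on a toy chain. This is a hedged sketch only: the abstract does not specify the estimator, so the random-walk frequency estimate below, the function names, and the example matrix are all assumptions rather than the paper's algorithm.

```python
import random

def stationary_power(P, iters=1000):
    """Approximate the stationary distribution by repeatedly applying pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def stationary_sampled(P, steps, rng):
    """Estimate the stationary distribution from visit frequencies of one random walk."""
    n = len(P)
    counts = [0] * n
    state = 0
    for _ in range(steps):
        # Draw the next state from row `state` of the transition matrix.
        state = rng.choices(range(n), weights=P[state])[0]
        counts[state] += 1
    return [c / steps for c in counts]

# Hypothetical two-state chain; solving pi = pi P by hand gives pi = (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
exact = stationary_power(P)
estimate = stationary_sampled(P, steps=100_000, rng=random.Random(0))
```

The sampled estimate trades accuracy for cost: each step touches a single row of the matrix, whereas an exact solve works with the whole matrix at once.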

464) Aaryan Arora, MT-BN: Multi-Scale Topological Bayesian Networks for Tractable, Interpretable Structure Learning in $p \gg n$ (2 Jan 2026)

High-dimensional datasets with far more variables than samples ($p \gg n$) overwhelm classical flat Bayesian-network learners: their search space over directed acyclic graphs (DAGs) grows super-exponentially and, by treating all nodes at one level, they ignore the modular, multi-scale organization that real systems exhibit, hurting both computational tractability and interpretability. We introduce MT-BN (Multi-Scale Topological Bayesian Networks), a Bayesian structure-learning framework that infers an adaptive hierarchy of modules and learns directed influence networks at multiple resolutions, while defining within-resolution directionality on resolution-specific innovations rather than on inherited shared signal. Concretely, MT-BN learns a nested partition of the $p$ variables (via a truncated nCRP prior), associates each module at each level with latent states, decomposes each state into inherited signal plus a level-specific innovation, and learns within-level DAGs exclusively on innovation latents to avoid spurious dependencies induced by common ancestry. Connectivity is regularized by topology-aware priors that encourage sparsity, hierarchy-consistent proximity structure, and heterogeneous hub degree profiles, and a hybrid inference pipeline combines variational inference for continuous latents with stochastic structure search over hierarchies and DAGs to yield scalable computation and edge posterior summaries.

463) Adam Ge, Fairness in Embedding-Based Machine Learning Models (31 Dec 2025)

Fairness in machine learning has become a critical concern, particularly for decision making systems that rely on learned representations and are trained on data containing historical and societal biases. In this work, we study fairness in embedding-based models from two complementary perspectives. First, we examine gender bias in text embeddings produced by pretrained language models and propose a baseline method based on sparse autoencoders to disentangle a gender-related feature and mitigate bias at the embedding level. While effective for natural language data, this approach relies on carefully constructed contrasting examples and is difficult to extend to other data modalities.
To address this limitation, we propose IterativeSifting, a general framework for improving fairness in embedding-based decision making models. IterativeSifting iteratively identifies and removes latent features and proxy information associated with one or more sensitive attributes, including their intersections, while preserving task-relevant information for accurate prediction. The method is model-agnostic and applicable to a wide range of data types, including tabular and graph-structured data.
We evaluate IterativeSifting on standard fairness benchmarks, including the Adult Census Income and ACSIncome datasets, using gender and race as sensitive attributes. Experimental results show that IterativeSifting substantially reduces sensitive attribute information in learned embeddings and significantly improves intersectional fairness, as measured by mutual information and maximum equalized odds difference, while maintaining competitive predictive performance. These results demonstrate the effectiveness of IterativeSifting as a practical approach for mitigating bias in embedding-based decision making systems.

462) Aadya Goel and Mayuri Sridhar (MIT), Delete and Retain: Efficient Unlearning for Document Classification (20 Dec 2025)

Machine unlearning aims to efficiently remove the influence of specific training data from a model without full retraining. While much progress has been made in unlearning for LLMs, document classification models remain relatively understudied. In this paper, we study class-level unlearning for document classifiers and present Hessian Reassignment, a two-step, model-agnostic solution. First, we perform a single influence-style update that subtracts the contribution of all training points from the target class by solving a Hessian–vector system with conjugate gradients, requiring only gradient and Hessian-vector products. Second, in contrast to common unlearning baselines that randomly reclassify deleted-class samples, we enforce a decision-space guarantee via Top-1 classification. On standard text benchmarks, Hessian Reassignment achieves retained-class accuracy close to full retrain-without-class while running orders of magnitude faster. Additionally, it consistently lowers membership-inference advantage on the removed class, measured with pooled multi-shadow attacks. These results demonstrate a practical, principled path to efficient class unlearning in document classification.
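
The linear-algebra step mentioned above, solving a system $Hx = b$ with conjugate gradients using only Hessian-vector products, can be sketched as follows. This is an illustrative sketch assuming $H$ is symmetric positive definite; the function names and the explicit $2 \times 2$ matrix standing in for a Hessian are hypothetical, and in practice the product would come from automatic differentiation rather than a stored matrix.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(hvp, b, iters=50, tol=1e-18):
    """Solve H x = b given only a function hvp(v) that returns H v."""
    x = [0.0] * len(b)
    r = list(b)           # residual b - H x, with x = 0 initially
    p = list(r)           # current search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:      # squared residual norm is small enough
            break
        Hp = hvp(p)
        alpha = rs / dot(p, Hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Stand-in "Hessian" (symmetric positive definite) and right-hand side.
H = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]

def hvp(v):
    # Matrix-vector product; the solver never inspects H directly.
    return [dot(row, v) for row in H]

solution = conjugate_gradient(hvp, b)
```

For this system the exact solution is $x = (1/11, 7/11)$, which the solver reaches after two iterations up to rounding.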

461) Shihan Kanungo, Upper Bounds for Sequence Saturation (arXiv.org, 19 Dec 2025)

In this paper, we study the saturation function $\mathrm{Sat}(n,u)$ for sequences. Saturation for sequences was introduced by Anand, Geneson, Kaustav, and Tsai (2021), who proved that $\mathrm{Sat}(n,u)=O(n)$ for two-letter sequences $u$ and conjectured that this bound holds for all sequences. We present an algorithm that constructs a $u$-saturated sequence on $n$ letters and apply it to show $\mathrm{Sat}(n,u)=O(n)$ for several families of sequences $u$, including all repetitions of the form $abcabc\dots$. We further establish $\mathrm{Sat}(n,u)=O(n)$ for a broad class of sequences of the form $aa\dots bb$. In addition, we prove that for most sequences $u$, there exists an infinite $u$-saturated sequence. For three-letter sequences of the form $abc\dots xyz$, where $a,b,c$ are distinct and $xyz$ is a permutation of $abc$, we show—under certain structural assumptions on $u$—that $\mathrm{Sat}(n,u)=O(n)$. Finally, we describe a linear program that computes the exact value of $\mathrm{Sat}(n,u)$ for arbitrary $n$ and $u$.

460) Albert Lu, Alcatraz: Secure Remote Computation via Sequestered Encryption in Minimally Trusted Hardware (8 Dec 2025)

This paper introduces “Alcatraz,” a new architecture that enables secure remote computation with minimal trust in the hardware. In Alcatraz, sensitive data is always encrypted, except when it is inside a small, trusted circuit, which is composed of an Arithmetic Logic Unit (ALU) gated by a decryption and encryption engine. By design, the internal state of the trusted circuit is inaccessible from any software, and unencrypted data is never exposed outside the trusted circuit. Thus it is extremely difficult for any attacker to gain information about the sensitive data by observing or attacking other parts of the processor and computer, such as registers or caches, or by exploiting any microarchitectural side channels.
We implemented Alcatraz on a field-programmable gate array (FPGA) and verified with a formal proof that the circuit is secure at the wire level, which is stronger than the register-transfer-level (RTL) security proved previously. Wire-level verification has the benefit that it is much closer to the physical reality, i.e., the timing and level of signals on the wire, that may be observed by attackers.
We apply Alcatraz to single-server private information retrieval and estimate, based on benchmarks, that Alcatraz achieves a 7x to 21x speedup over the current state-of-the-art approach for private information retrieval. Our approach also reduces the communication size by five orders of magnitude.

459) Michael Middlezong, Lucas Qi, and Thomas Rüd (MIT), Orbital Integrals over Linear Groups as Local Densities (31 Oct 2025)

Orbital integrals are central to the representation theory of reductive groups, with applications to the trace formula, isogeny classes of elliptic curves, and the (relative) Langlands program. Yet explicit computations are difficult beyond $\mathrm{GL}_2$. Building on methods of Gekeler and Achter-Gordon, we extend finite counting techniques for orbital integrals to $\mathrm{GL}_n(\mathbb{Q}_p)$ and $\mathrm{SL}_n(\mathbb{Q}_p)$. For $\mathrm{GL}_n$, we relate orbital integrals to limits of density ratios over finite quotients of $\mathbb{Z}_p$, yielding explicit formulas with respect to the geometric measure. We further treat arbitrary bi-$\mathrm{GL}_n(\mathbb{Z}_p)$-invariant spherical test functions that detect distinct double cosets. For $\mathrm{SL}_n$, we introduce new conjugacy criteria and a modified ratio that accounts for orbit splitting. In the case of $\mathrm{SL}_2$, we show that these limits recover orbital integrals for both geometric and canonical measures. In all settings, we prove that the corresponding ratios converge to the desired orbital integrals.

458) Bryan Zhu, Dimension Reduction for Smooth Convex Optimization via Color Refinement (15 Oct 2025)

Color refinement is a graph isomorphism routine that can be extended to matrices to produce equitable partitions. It can be used for dimension reduction for linear programs and convex linearly constrained quadratic programs. We propose a novel dimension reduction technique for smooth convex optimization problems. This approach is, to our knowledge, the first effort to extend techniques that exploit fractional automorphisms beyond quadratic forms to arbitrary smooth convex objectives and constraints. For polynomial cases, we further (i) generalize equitability from matrices to tensors of arbitrary rank, (ii) develop a higher-order color refinement algorithm to capture higher-dimensional symmetries, and (iii) prove the uniqueness of the resulting coloring. Our reductions are exact, preserving feasibility, optimality, and convexity. Experiments show that our method achieves substantial reduction in both dimension and runtime on existing convex quadratically constrained linear programs under rounding to various precisions, as well as synthetic convex quadratically constrained quadratic programs.
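As an illustration of the starting point only, the classical color refinement routine on a graph can be sketched as follows. This is a minimal sketch, not the paper's matrix or tensor algorithm; the function name `color_refinement` and the adjacency-dictionary input format are assumptions made for this example.

```python
def color_refinement(adj):
    """Classical color refinement on a graph.

    adj maps each vertex to an iterable of its neighbours (hypothetical
    input format). Returns a dict vertex -> color id describing the
    coarsest equitable partition of the vertex set.
    """
    colors = {v: 0 for v in adj}  # start with a single color class
    while True:
        # A vertex's signature is its own color plus the sorted multiset
        # of its neighbours' colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        # Classes can only split, so an unchanged class count means stability.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


# Path 0 - 1 - 2 - 3: the endpoints form one class, the inner vertices another.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
path_colors = color_refinement(path)
```

Vertices in the same class have the same number of neighbours in every class, which is the equitability property that the dimension reduction exploits.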

457) Amogh Akella (PRIMES), Rupert Li, Diameter Bounds for Friends-and-Strangers Graphs (arXiv.org, 27 Sept 2025)

Consider two $n$-vertex graphs $X$ and $Y$, where we interpret $X$ as a social network with edges representing friendships and $Y$ as a movement graph with edges representing adjacent positions. The friends-and-strangers graph $\mathsf{FS}(X,Y)$ is a graph on the $n!$ permutations $V(X)\to V(Y)$, where two configurations are adjacent if and only if one can be obtained from the other by swapping two friends located on adjacent positions. Friends-and-strangers graphs were first introduced by Defant and Kravitz, and generalize sliding puzzles as well as token swapping problems. Previous work has largely focused on their connectivity properties. In this paper, we study the diameter of the connected components of $\mathsf{FS}(X, Y)$. Our main result shows that when the underlying friendship graph is a star with $n$ vertices, the friends-and-strangers graph has components of diameter $O(n^4)$. This implies, in particular, that sliding puzzles are always solvable in polynomially many moves. Our work also provides explicit efficient algorithms for finding these solutions. We then extend our results to general graphs in two ways. First, we show that the diameter is polynomially bounded when both the friendship and the movement graphs have large minimum degree. Second, when both the underlying graphs $X$ and $Y$ are Erdős-Rényi random graphs, we show that the distance between any pair of configurations is almost always polynomially bounded under certain conditions on the edge probabilities.
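For very small $n$, the definition above can be explored by brute force. The following is a minimal sketch, assuming a configuration is stored as a tuple whose entry at index $p$ is the person standing on position $p$; the names `fs_neighbors` and `eccentricity` are hypothetical. It performs a breadth-first search and is not the efficient algorithm from the paper.

```python
from collections import deque


def fs_neighbors(config, x_edges, y_edges):
    """Yield the configurations adjacent to config in FS(X, Y).

    x_edges is a set of frozensets (friendships); y_edges is a list of
    pairs of adjacent positions.
    """
    for p, q in y_edges:
        # Swap the occupants of adjacent positions p, q only if they are friends.
        if frozenset((config[p], config[q])) in x_edges:
            nxt = list(config)
            nxt[p], nxt[q] = nxt[q], nxt[p]
            yield tuple(nxt)


def eccentricity(start, x_edges, y_edges):
    """Breadth-first search from start.

    Returns (largest distance from start, size of start's component).
    """
    x_edges = {frozenset(e) for e in x_edges}
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in fs_neighbors(cur, x_edges, y_edges):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return max(dist.values()), len(dist)


# X = complete graph on 3 people, Y = path on 3 positions: every adjacent
# swap is allowed, so FS(X, Y) is the graph of adjacent transpositions on S_3.
k3 = [(0, 1), (0, 2), (1, 2)]
path3 = [(0, 1), (1, 2)]
result = eccentricity((0, 1, 2), k3, path3)
```

Since the state space has $n!$ configurations, this approach is only practical for tiny instances; it is meant to make the definition concrete rather than to bound diameters.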

456) Alyssa Yu (PRIMES), Laura P. Schaposnik, Dynamics of Infection Spread and Hotspot Growth in Bi-Pathogen Networks (3 Sept 2025)

Understanding the spatio-temporal evolution of epidemics with multiple pathogens requires not only new theoretical models but also careful analysis of their practical consequences. Building on the Multiplex Bi-Virus Reaction-Diffusion framework (MBRD) introduced in our companion paper, we investigate how the super-infection model (MBRD-SI) and the co-infection model (MBRD-CI) behave under different epidemiological and network conditions. Through numerical experiments, we study the effects of pathogen virulence, diffusion rates, and cross-diffusion on epidemic hotspot formation and long-term prevalence. Our results highlight the role of multiplex structure in amplifying or suppressing co-circulating infections, and provide quantitative insight into conditions that drive persistent epidemic patterns. Beyond epidemiology, these findings have broader implications for multiplex contagion processes such as information diffusion and malware propagation.

455) Alyssa Yu (PRIMES), Laura P. Schaposnik, Spatial Super-Infection and Co-Infection Dynamics in Networks (21 Aug 2025)

Understanding interactions between the spread of multiple pathogens during an epidemic is crucial to assessing the severity of infections in human communities. In this paper, we introduce two new Multiplex Bi-Virus Reaction-Diffusion models (MBRD) on multiplex metapopulation networks: the super-infection model (MBRD-SI) and the co-infection model (MBRD-CI). These frameworks capture two-pathogen dynamics with spatial diffusion and cross-diffusion, allowing the prediction of infection clustering and large-scale spatial distributions. We establish conditions for Turing and Turing-Hopf instabilities in both models and provide experimental evidence of epidemic pattern formation. Beyond epidemiology, we discuss applications of the MBRD framework to information propagation, malware diffusion, and urban transportation networks.

454) Jiya Dani, Anna Deng, Marly Gotti (Apple), Bryan Li, Arav Paladiya, Joseph Vulakh (MIT), and Jason Zeng (CrowdMath-2025), On the set of atoms and strong atoms in additive monoids of cyclic semidomains (arXiv.org, 15 Aug 2025), forthcoming in Communications in Algebra

Let $M$ be a cancellative and commutative monoid. A non-invertible element of $M$ is called an atom (or irreducible element) if it cannot be factored into two non-invertible elements, while an atom $a$ of $M$ is called strong if $a^n$ has a unique factorization in $M$ for every $n \in \mathbb{N}$. The monoid $M$ is atomic if every non-invertible element factors into finitely many atoms (repetitions allowed). For an algebraic number $\alpha$, we let $M_\alpha$ denote the additive monoid of the subsemiring $\mathbb{N}_0[\alpha]$ of $\mathbb{C}$. The atomic structure of $M_\alpha$ reflects intricate interactions between algebraic number theory and additive semigroup theory. For $m, n \in \mathbb{N}_0 \cup \{ \infty \}$ (with $m \le n$), the pair $(m,n)$ is called realizable if there exists an algebraic number $\alpha \in \mathbb{C}$ such that $M_\alpha$ has $m$ strong atoms and $n$ atoms. Our primary goal is to identify classes of realizable pairs with the long-term goal of providing a complete description of the full set of realizable pairs.

2024 Research Papers

453) Coleman DuPlessie, Dead Feature Counts in Sparse Autoencoders Predict Underlying Deep Q Networks’ Effectiveness (1 Jul 2025)

Sparse autoencoders (SAEs) are machine learning models that can be used to express the inner workings of certain other models as human-interpretable features. While sparse autoencoders work well when applied to language models, there has been little research that investigates the extent to which they generalize to other applications of machine learning. This work investigates the application of SAEs to a deep Q network trained to complete a simple task. We find that, although SAEs tend to perform well and find a number of human-interpretable features, they contain a large number of “dead features” that never activate, which suggests that more research is necessary to adapt SAEs to the unique tasks reinforcement learning models solve. In particular, we note that the most effective deep Q networks trained to complete a task tend to result in sparse autoencoders with a consistent quantity of dead features. This suggests that these sparse autoencoders may in some sense be capturing the “optimal” or “true” number of features needed to solve the toy problem we study, and the high number of dead features may simply imply that additional live features past a certain quantity are unhelpful.

452) Rohan Dhillon, Patterns in the Stable sl(N) Homology of Torus Knots (23 June 2025)

Gorsky, Oblomkov, and Rasmussen conjectured that the stable Khovanov homology of $T(n, \infty)$ — which is the limit of the Khovanov homology of the $(n,m)$-torus link as $m \to \infty$ — is isomorphic to the homology of a certain Koszul complex $W_n$. In this paper, we define a grading $L$ and conjecture that the $L$-homogeneous summands of the homology of $W_n$ satisfy a recursive relationship, reminiscent of the inclusion-exclusion principle, which would imply that the homology of $W_n$ is determined by finitely many bidegrees. We present theoretical and computational evidence for this relationship and discuss an analogous conjecture for $sl(N)$ and Lee homologies. We also make a conjecture concerning the maximal torsion order appearing in the homology of Koszul complexes corresponding to $sl(N)$ analogues of Lee homology.

451) Rajarshi Mandal (PRIMES), Ning Xie, Gil Alterovitz, Unveiling the Biochemical Mechanisms of Aging and the Implications of Oxidative Stress on Cellular Senescence through Multi-Omics Analysis of Fibroblasts (bioRxiv.org, 23 May 2025), published in 2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (IEEE CIBCB)

This research investigates the complex biochemical mechanisms underlying aging by analyzing primary human fibroblasts using a longitudinal multi-omics dataset. This dataset includes cytology, DNA methylation and epigenetic clocks, bioenergetics, mitochondrial DNA sequencing, RNA sequencing, and cytokine profiling. Key findings indicate that mitochondrial efficiency declines with age, while glycolysis becomes more prevalent to compensate for energy demands. Epigenetic clocks, such as Hannum and PhenoAge, showed strong correlations with biological age (ρ > 0.650, p < 1e-6), validating the experimental setup and confirming that the cultured fibroblasts were aging appropriately. Fibroblasts with SURF1 mutations exhibited accelerated aging, marked by bioenergetic deficits, increased cell volume, and reduced proliferative capacity, underscoring the pivotal role of mitochondrial dysfunction in cellular senescence. Novel insights were gained from analyzing cytokines like IL18 and PCSK9, some of which were linked to age-related diseases such as Alzheimer's and cardiovascular disorders. Experimental treatments revealed distinct effects on cellular aging. Dexamethasone reduced inflammation but also increased DNA methylation, induced metabolic inefficiencies, and shortened cellular lifespan. Oligomycin heightened oxidative stress and RNA degradation, emphasizing how such treatments contribute to cellular stress and metabolic imbalance while shedding light on aging mechanisms. By uncovering connections between mitochondrial dysfunction, epigenetic biomarkers, and immune dysregulation, this study identifies potential therapeutic targets for age-related diseases. Future research could validate the most promising biomarkers across diverse cell types and experimental treatments to build a more comprehensive understanding of aging.

450) Hannah Fox and Sammy Luo (MIT), Monochromatic components with many edges in random graphs (19 May 2025)

In an $r$-coloring of edges of the complete graph on $n$ vertices, how many edges are there in the largest monochromatic connected component? A construction of Gyárfás shows that for infinitely many values of $r$, there exist colorings where all monochromatic components have at most $\left(\frac{1}{r^2-r}+o(1)\right)\binom{n}{2}$ edges. Conlon, Luo, and Tyomkyn conjectured that components with at least this many edges are attainable for all $r \ge 3$. Conlon, Luo, and Tyomkyn proved this conjecture for $r=3$ and Luo proved it for $r=4$, along with a lower bound of $\frac{1}{r^2-r+\frac54}{n\choose 2}$ for all $r\ge 2$ and $n$. In this paper, we look at extensions of this problem where the graph being $r$-colored is a sparse random graph or a graph of high minimum degree. By extending several intermediate technical results from previous work in the complete graph setting, we prove analogues in both the sparse random setting and the high minimum degree setting of the bounds for $r=3$ and general $r$.

449) Arun S. Kannan (MIT) and Shihan Kanungo, Representation Theory of the Twisted Yangians in Complex Rank (arXiv.org, 9 May 2025)

In 2016, Etingof defined the notion of a Yangian in a symmetric tensor category and posed the problem to study them in the context of Deligne categories. This problem was studied by Kalinov in 2020 for the Yangian $Y(\mathfrak{gl}_t)$ of the general linear Lie algebra $\mathfrak{gl}_t$ in complex rank using the techniques of ultraproducts. In particular, Kalinov classified the simple finite-length modules over $Y(\mathfrak{gl}_t)$. In this paper, we define the notion of a \textit{twisted Yangian} in Deligne's categories, and we extend these techniques to classify finite-length simple modules over the twisted Yangians $Y(\mathfrak{o}_t)$ and $Y(\mathfrak{sp}_t)$ of the orthogonal and symplectic Lie algebras $\mathfrak{o}_t,\mathfrak{sp}_t$ in complex rank.

448) Nitya Mani (MIT) and Owen Jianwen Zhang, Tetrahedron-intersecting families of 3-uniform hypergraphs (7 May 2025)

An $H$-intersecting family is a collection of (hyper)graphs $\mathcal{F}$ on a fixed underlying set of labeled vertices, such that for each pair $G_1, G_2 \in \mathcal{F}$, the intersection $G_1 \cap G_2$ contains a subgraph isomorphic to $H$. Understanding how large $\mathcal{F}$ can be for a given $H$ is of great importance in extremal combinatorics and theoretical computer science. Ellis, Filmus, and Friedgut conjectured a tight upper bound on the size of a $K_t$-intersecting family, but only the cases of $t=3$ and $t=4$ have been resolved (by Ellis, Filmus, and Friedgut, and Berger and Zhao respectively). We resolve the case $t=5$. We also give the first resolution of an analogous conjecture in the hypergraph setting, giving a tight bound on the size of a tetrahedron-intersecting family of $3$-uniform hypergraphs.

447) Victor Gonzalez and Ishan Panpaliya (CrowdMath-2024), On GL-domains and the ascent of the IDF property (arXiv.org, 14 Apr 2025)

Following the terminology introduced by Arnold and Sheldon back in 1975, we say that an integral domain $D$ is a GL-domain if the product of any two primitive polynomials over $D$ is again a primitive polynomial. In this paper, we study the class of GL-domains. First, we propose a characterization of GL-domains in terms of certain elements we call prime-like. Then we identify a new class of GL-domains. An integral domain $D$ is also said to have the IDF property provided that each nonzero element of $D$ is divisible by only finitely many non-associate irreducible divisors. It was proved by Malcolmson and Okoh in 2009 that the IDF property ascends to polynomial extensions when restricted to the class of GCD-domains. This result was recently strengthened by Gotti and Zafrullah to the class of PSP-domains. We conclude this paper by proving that the IDF property does not ascend to polynomial extensions in the class of GL-domains, answering an open question posed by Gotti and Zafrullah.
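The defining property can be checked numerically in the most familiar GL-domain, $\mathbb{Z}$, where a polynomial is primitive exactly when the gcd of its coefficients is 1 (Gauss's lemma). The sketch below is an illustration over $\mathbb{Z}$ only; the helper names `content`, `is_primitive`, and `multiply` are hypothetical, and polynomials are assumed to be given as coefficient lists with the constant term first.

```python
from functools import reduce
from math import gcd


def content(poly):
    """gcd of the integer coefficients of poly (constant term first)."""
    return reduce(gcd, (abs(c) for c in poly), 0)


def is_primitive(poly):
    """Over Z, a polynomial is primitive when its content is 1."""
    return content(poly) == 1


def multiply(f, g):
    """Product of two polynomials given as coefficient lists."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod


f = [3, 2]            # 3 + 2x, primitive
g = [2, 3]            # 2 + 3x, primitive
fg = multiply(f, g)   # 6 + 13x + 6x^2, primitive again
```

In a general integral domain, gcds need not exist, so primitivity has to be phrased in terms of common divisors of the coefficients; the gcd-based check here is specific to $\mathbb{Z}$.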

446) Michael Lu and Tariq Osman (Brandeis University), Bounds for Smooth Siegel Theta Sums at Special Rational Parameters (30 Mar 2025)

We define inhomogeneous theta sums as exponential sums of the form \begin{equation*} S_M^f (X ; \boldsymbol{\alpha}, \boldsymbol{\beta}) := \sum_{\boldsymbol{k} \in \mathbb{Z}^n + \boldsymbol{\beta}} f(M^{-1} \boldsymbol{k}) e^{2 \pi i\left( \tfrac{1}{2} \boldsymbol{k} X {}^\mathrm{t}\!\boldsymbol{k} + \boldsymbol{k}^\mathrm{t} \boldsymbol{\alpha}\right)}, \end{equation*} where $X$ is an $n \times n$ symmetric matrix, $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \mathbb{R}^n$, and $f$ is a fixed weight function. In recent work, F. Cellarosi and the second named author showed that when $n = 1$ and $f$ is a fixed Schwartz function, there exist $\alpha, \beta \in \mathbb{R}$ such that $|S^f_M (x ; \alpha, \beta)| \ll_{f, \alpha, \beta} \sqrt{M}$ for every $x \in \mathbb{R}$ and $M \in \mathbb{N}$. We show that this does not extend to higher dimensions, i.e., there are no $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \mathbb{R}^n$ for which the bound $|S^f_M (X ; \boldsymbol{\alpha}, \boldsymbol{\beta})| \ll_{f, \boldsymbol{\alpha}, \boldsymbol{\beta}} M^{n/2}$ holds for every real symmetric matrix $X$ and every $M \in \mathbb{N}$ when $n > 1$.

445) David Cui (MIT), Jerry Zhang, Quantum-sound property tests for linear and affine linear functions (16 Mar 2025)

Given three players with some shared function over $\mathbb{F}_2$, the BLR test certifies that their shared function is linear in constant time. More specifically, if the test succeeds with probability $1-\epsilon$, then the players' functions differ from a linear function in at most $O(\epsilon)$ inputs. It has been shown that this three-player version of the BLR test is \emph{quantum-sound}: it similarly certifies the presence of a linear function over $\mathbb{F}_{2}$ within their strategies even when the players are allowed to share nontrivial correlations through entanglement. The existence of quantum-sound protocols allows us to quantize existing classical interactive protocols and prove containment between quantum multi-prover interactive proof systems, acting as the basis of foundational results in quantum complexity theory, such as $\text{MIP} \subseteq \text{MIP*}$ and $\text{MIP*} = \text{RE}$. In this paper, we generalize this property testing result and show that a variation of the linearity test over $\mathbb{F}_{p}$ is also quantum-sound. Additionally, we show that even without the consistency test, the presence of linear functions can be certified.
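For orientation, the classical single-function form of the BLR check over $\mathbb{F}_2^n$ can be sketched as below. This is not the three-player or quantum-sound protocol studied in the paper; it only shows the underlying test $f(x) + f(y) = f(x + y)$. Inputs are assumed to be encoded as $n$-bit integers so that XOR plays the role of addition, and the names `parity` and `blr_acceptance_rate` are hypothetical.

```python
import random


def parity(x):
    """Sum of the bits of x modulo 2."""
    return bin(x).count("1") % 2


def blr_acceptance_rate(f, n_bits, trials=1000, seed=0):
    """Fraction of random pairs (x, y) with f(x) + f(y) = f(x + y) over F_2."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        x = rng.getrandbits(n_bits)
        y = rng.getrandbits(n_bits)
        # XOR is addition in F_2, both on values and on bit vectors.
        if (f(x) ^ f(y)) == f(x ^ y):
            passed += 1
    return passed / trials


def linear(x):
    """A linear function: inner product with a fixed mask."""
    return parity(x & 0b1011)


def constant_one(x):
    """Not linear, since a linear function must send 0 to 0."""
    return 1


rate_linear = blr_acceptance_rate(linear, n_bits=4)
rate_constant = blr_acceptance_rate(constant_one, n_bits=4)
```

A linear function passes every trial, while the constant function 1 fails every trial because $1 + 1 = 0 \neq 1$ in $\mathbb{F}_2$.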

444) Andrew Brahms, Alan Duan, Jesse Geneson (SJSU), and Jacob Greene, Saturation of 0-1 Matrices (21 Feb 2025; arXiv.org, 5 Mar 2025)

A 0-1 matrix $M$ contains a 0-1 matrix $P$ if $M$ has a submatrix $P'$ which can be turned into $P$ by changing some of the ones to zeroes. Matrix $M$ is $P$-saturated if $M$ does not contain $P$, but any matrix $M'$ derived from $M$ by changing a zero to a one must contain $P$. The saturation function sat(n, P) is defined as the minimum number of ones of an $n \times n$ $P$-saturated 0-1 matrix. Fulek and Keszegh showed that each pattern $P$ has sat(n, P) = $O(1)$ or sat(n, P) = $\Theta(n)$. This leads to the natural problem of classifying forbidden 0-1 matrices according to whether they have linear or bounded saturation functions. Some progress has been made on this problem: multiple infinite families of matrices with bounded saturation function and other families with linear saturation function have been identified. We answer this question for all patterns with at most four ones, as well as several specific patterns with more ones, including multiple new infinite families. We also consider the effects of certain matrix operations, including the Kronecker product and insertion of empty rows and columns. Additionally, we consider the simpler case of fixing one dimension, extending results of (Fulek and Keszegh, 2021) and (Berendsohn, 2021). We also generalize some results to $d$-dimensional saturation.
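The definitions of containment and saturation can be checked directly on small matrices. The following brute-force sketch assumes matrices are given as lists of rows of 0s and 1s; the function names `contains` and `is_saturated` are hypothetical, and the running time is exponential in the pattern size.

```python
from itertools import combinations


def contains(M, P):
    """True if some submatrix of M has a one wherever P has a one."""
    k, l = len(P), len(P[0])
    for rows in combinations(range(len(M)), k):
        for cols in combinations(range(len(M[0])), l):
            if all(
                M[r][c] >= P[i][j]
                for i, r in enumerate(rows)
                for j, c in enumerate(cols)
            ):
                return True
    return False


def is_saturated(M, P):
    """True if M avoids P but changing any single zero to a one creates P."""
    if contains(M, P):
        return False
    for r, row in enumerate(M):
        for c, value in enumerate(row):
            if value == 0:
                flipped = [list(x) for x in M]
                flipped[r][c] = 1
                if not contains(flipped, P):
                    return False
    return True


P = [[1, 1]]              # two ones in the same row
M = [[1, 0], [1, 0]]      # one 1 per row, so P is avoided
saturated = is_saturated(M, P)
```

Here every row of `M` already has a one, so turning any zero into a one puts two ones in a single row and creates `P`.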

443) Weian (Andrew) Xie, Worst-case Error Bounds for Online Learning of Smooth Functions (16 Feb 2025; arXiv.org, 23 Feb 2025)

Online learning is a model of machine learning where the learner is trained on sequential feedback. We investigate worst-case error for the online learning of real functions that have certain smoothness constraints. Suppose that $\mathcal{F}_q$ is the class of all absolutely continuous functions $f: [0, 1] \rightarrow \mathbb{R}$ such that $\|f'\|_q \le 1$, and $\operatorname{opt}_p(\mathcal{F}_q)$ is the best possible upper bound on the sum of the $p^{\text{th}}$ powers of absolute prediction errors for any number of trials guaranteed by any learner. We show that for any $\delta, \epsilon \in (0, 1)$, $\operatorname{opt}_{1+\delta} (\mathcal{F}_{1+\epsilon}) = O(\min(\delta, \epsilon)^{-1})$. Combined with the previous results of Kimber and Long (1995) and Geneson and Zhou (2023), we achieve a complete characterization of the values of $p, q \ge 1$ that result in $\operatorname{opt}_p(\mathcal{F}_q)$ being finite, a problem open for nearly 30 years. We study the learning scenarios of smooth functions that also belong to certain special families of functions, such as polynomials. We prove a conjecture by Geneson and Zhou (2023) that it is not any easier to learn a polynomial in $\mathcal{F}_q$ than it is to learn any general function in $\mathcal{F}_q$. We also define a noisy model for the online learning of smooth functions, where the learner may receive incorrect feedback up to $\eta \ge 1$ times, denoting the worst-case error bound as $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q)$. We prove that $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q)$ is finite if and only if $\operatorname{opt}_p(\mathcal{F}_q)$ is. Moreover, we prove for all $p, q \ge 2$ and $\eta \ge 1$ that $\operatorname{opt}^{\text{nf}}_{p, \eta} (\mathcal{F}_q) = \Theta (\eta)$.

442) Alex Huang, Kartik Ramachandrula, Agniv Sarkar, and Lu Lu (Yale), Augmented Systems-Biology Informed Neural Networks for Parameter Identification of the Notch Model (11 Feb 2025)

The process of neurogenesis in the mammalian brain is controlled by the Notch signaling pathway, which can be modeled with a system of ordinary differential equations relating the concentrations of species. However, this system contains a relatively large number of state variables (species) and parameters; as such, it is computationally costly to model the system, even with current techniques. In this paper, we describe a neural network pipeline to elucidate properties of the system as well as forecast species. First, we extensively discuss the use of identifiability analysis in systems biology problems to offer guidance in modeling. We show the utilization of the Systems-Biology Informed Neural Networks (SBINNs) architecture to extract values of ODE parameters as well as model the dynamics of the chemical species. In addition, we describe the implementation of additions to SBINNs such as warm-starting and considering sensitivity of parameters that enhance the learning of the model. Our results should provide accurate predictions of the biochemical dynamics in the Notch signaling pathway and help neuroscientists in the field better understand the formation of neurons. We also describe how we can further this technique and evaluate other modern architectures such as PINNformers and KANs to enhance predictions.

441) Skyler Mao and David Darrow, Extending the Droplet-Wave Statistical Correspondence in Walking Droplet Dynamics (6 Feb 2025; arXiv.org, 13 Aug 2025), published in Chaos: An Interdisciplinary Journal of Nonlinear Science 36, 023112 (February 2026)

Walking droplets -- millimetric oil droplets that self-propel across the surface of a vibrating fluid bath -- exhibit striking emergent statistics that remain only partially understood. In particular, in a variety of experiments, a robust correspondence has been observed between the droplet's statistical distribution and the time-average of the wave field that guides it. Durey, Milewski, and Bush (2018) rigorously established such a correspondence in the case of a single droplet on a uniform, unbounded fluid bath, with a single, instantaneous droplet-bath impact during each vibration period. They also outlined how their result might extend to more general bath topography, and the work of Durey, Milewski, and Wang (2020) implemented such an approach for a circular bath. We attempt to complete this program in the present work, rigorously extending this statistical correspondence to account for arbitrary bath topography, arbitrary droplet-bath impact models, multiple droplet interactions, and non-resonant bouncing. We investigate this correspondence numerically in systems of one and two droplets in 1-D geometries, and particularly, we observe how the time-averaged wave field can distinguish between correlated and uncorrelated pairs of droplets.

440) Maya Kalai and Ella Kim, A Note on Inner-Product Secret-Key ABE (5 Feb 2025)

Consider a scenario where Alice would like to encrypt a message such that Bob and Charlie can decrypt it if and only if they have an attribute satisfying a given predicate. This problem is solved by attribute-based encryption and has applications in many areas, including electronic record keeping and the Internet of Things. In this note, we construct and implement a secret-key attribute-based encryption (SK-ABE) scheme supporting inner-product predicates, purely in the random oracle model. Our construction works by adapting the work of Attrapadung et al. (Crypto ’18) building secret-key ABE from constrained PRFs and the recent work of Servan-Schreiber (Asiacrypt ’24) building constrained pseudorandom functions from random oracles. Our construction is practical, which we demonstrate with a prototype implementation. With attribute vectors of length 1,000, each encryption and decryption operation takes approximately 250 microseconds, and this cost scales linearly with the length of the attribute vectors.

439) Sargam Mondal, Exact Factorizations of G-Crossed Braided Fusion Categories (31 Jan 2025)

For a finite group $ G $, a $ G $-crossed braided fusion category is defined as a $ G $-graded fusion category equipped with a $ G $-action and a $ G $-braiding. In this work, we investigate $ G $-crossed braiding structures within exact factorizations of fusion categories, which are analogous to the Zappa–Szép product in group theory. For a fusion category $\mathcal{B}$ faithfully graded by its universal grading group $ U(\mathcal{B}) $, we establish that if $\mathcal{B} = \mathcal{A} \bullet \mathcal{C}$ is an exact factorization, then the subcategories $\mathcal{A}$ and $\mathcal{C}$ are $ U(\mathcal{A}) $- and $ U(\mathcal{C}) $-crossed braided, respectively. We extend these results to $ G $-crossed commutative fusion rings, where we analyze the $ U(\mathsf{R}) $-action in exact factorizations of fusion rings. Additionally, we introduce the notion of the generalized semidirect product of fusion categories and rings and show its relationship to the bicrossed product, an equivalent formulation of exact factorization. We further establish that an exact factorization $\mathcal{B} = \mathcal{A} \bullet \mathcal{C}$ is braided if and only if $\mathcal{B} \cong \mathcal{A} \boxtimes \mathcal{C}$, and we provide a complete characterization of conditions under which bicrossed products of categories and rings are commutative or braided. Finally, for a general group $ G $, we present criteria for exact factorization and examine the implications of fusion subcategories of $ G $-crossed braided fusion categories.

438) Hunter Dinkins (MIT), Jiwu Jang, Vertex Functions of Type D Nakajima Quiver Varieties (31 Jan 2025; arXiv.org, 19 Feb 2025)

We study the quasimap vertex functions of type $D$ Nakajima quiver varieties. When the quiver varieties have isolated torus fixed points, we compute the coefficients of the vertex functions in the $K$-theoretic fixed point basis. We also give an explicit combinatorial description of zero-dimensional type $D$ quiver varieties and their vertex functions using the combinatorics of minuscule posets. Using Macdonald polynomials, we prove that these vertex functions can be expressed as products of $q$-binomial functions, which proves a degeneration of the conjectured 3d mirror symmetry of vertex functions. We provide an interpretation of type $D$ spin vertex functions as the partition functions of the half-space Macdonald processes of Barraquand, Borodin, and Corwin. This hints that the geometry of quiver varieties may provide new examples of integrable probabilistic models.

437) Chris Bao, Joshua Wang, William Zhao, Minimum and Approximate Minimum $k$-Cuts in Hypergraphs (25 Jan 2025)

The minimum cut problem and its generalizations are important to combinatorial optimization and have numerous applications in network reliability, circuit design, and clustering. Our work considers the minimum $k$-way cut problem in hypergraphs, which asks for a $k$-way partition of the vertex set that minimizes the number of crossing hyperedges. We begin by extending the work of Kogan and Krauthgamer (2014), using the randomized contraction technique introduced by Karger and Stein (1995), to bound the number of approximate minimum $k$-way cuts in low-rank hypergraphs. Next, we consider the branching contraction algorithm of Fox et al. (2019) as applied to the minimum $k$-way cut problem in unweighted hypergraphs. Under a conjectural bound on the scaled proportions of small hyperedges, we improve the running time to $\tilde{O}(mn^k + n^{4k-3})$. Finally, we generalize the near-linear time $(2+\varepsilon)$-approximation algorithm of Quanrud (2019) for the graph $k$-way cut problem, achieving an approximation ratio of $r(1+\varepsilon)$ for hypergraphs of rank $r$. As a component, we provide an algorithm for finding a minimum hypertree with improved runtime compared to the prior result of Baïou and Barahona (2023).

436) Henry Jiang, The Axial Electric Potential and Length of a Torus Knot (17 Jan 2025)

Physical knot theory, where knots are treated like physical objects, is important to many fields. One natural problem is to give a knot a uniform charge, and analyze the resulting electric field and electric potential. There have been some results on the number of critical points of the electric potential from knots, such as by Lipton (2021) and Lipton, Townsend, and Strogatz (2022). However, little analysis has been done on the electric field and electric potential using calculations for specific knots. We focus on torus knots, specifically a parametrization that embeds each knot on a torus centered at the origin with rotational symmetry about the z-axis. Particularly, in this project, we analyze the electric field along the z-axis to take advantage of symmetry. We also analyze the length of the knot as a simpler integral. We show that the electric field is zero only at the origin, and investigate the extreme points of the electric field and electric potential using numerical methods and calculations. We also demonstrate a new way to apply methods for contour integration in complex analysis to calculate the length, electric potential, and electric field, and provide an explicit approximation for the length of a torus knot.
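The length integral can be approximated numerically once a parametrization is fixed. The sketch below assumes the standard parametrization $\left((R + r\cos qt)\cos pt,\ (R + r\cos qt)\sin pt,\ r\sin qt\right)$ for $t \in [0, 2\pi]$, which may differ from the one used in the paper; the function names and default radii are hypothetical. It sums chord lengths of a fine polygonal approximation and does not use the contour-integration method of the paper.

```python
import math


def torus_knot_point(t, p, q, R=2.0, r=1.0):
    """Point at parameter t on a (p, q) torus knot (assumed parametrization)."""
    rho = R + r * math.cos(q * t)  # distance from the z-axis
    return (rho * math.cos(p * t), rho * math.sin(p * t), r * math.sin(q * t))


def torus_knot_length(p, q, R=2.0, r=1.0, segments=20000):
    """Approximate the knot length by summing chords over t in [0, 2*pi]."""
    total = 0.0
    prev = torus_knot_point(0.0, p, q, R, r)
    for k in range(1, segments + 1):
        cur = torus_knot_point(2 * math.pi * k / segments, p, q, R, r)
        total += math.dist(prev, cur)
        prev = cur
    return total


# Sanity check: with r = 0 the curve is a circle of radius R traversed
# p times, so its length should be close to 2 * pi * R * p.
degenerate = torus_knot_length(2, 3, R=1.0, r=0.0)
trefoil_like = torus_knot_length(2, 3)
```

The chord sum always underestimates the true length slightly, and the error shrinks as `segments` grows.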

435) Eric Chen and Rohith Raghavan, Comparing Methods of Opportunistic Risk Limiting Audits (17 Jan 2025)

Auditing elections is an important part of preserving faith in the electoral system and verifying the accuracy of the reported results of an election. Conventional election audits involve taking a set number or percentage of ballots and checking if the samples match the reported winner. However, these methods are unreliable for close races and excessive for races with a wide margin. Risk-limiting audits use statistical tests in order to assign a certain risk limit, the maximum probability that the results are incorrect, by sampling ballots one at a time until the risk limit is achieved. Our research explores opportunistic auditing, the ability to audit multiple races at the same time, and attempts to determine what strategies are most effective for opportunistic auditing. We examine complex multi-state and strata races that are audited using the ALPHA (Stark) supermartingale and test different sampling strategies across drifts and margins to answer the core question: how can existing auditing tests/martingales provide useful risk guarantees over multiple simultaneous races?

434) Raj Saha, Figurative Language and Mobilization to Action: a Multimethod Approach (15 Jan 2025)

This paper seeks to analyze the effect of figurative language on mobilization to action and empowerment to act. The approach toward this hypothesis is multimodal; on the one hand, we will computationally analyze peer advice datasets, and on the other hand, we will experimentally discern the influence of non-literal language. The first computational approach necessitates an accurate figurative language detection tool. Following previous literature, we plan to leverage Large Language Models for this task; BERT has already been effective for preliminary tests on metaphor detection. After developing the computational tool to detect figurative language in text, we will analyze queries and replies in online peer advice communities such as health forums, and test the influence of a reply’s figurative language on adherence to the advice given. We will then augment this analysis with two controlled experiments. The first will test whether figurative language in advice and directives increases compliance due to trust, and the second will investigate what mental processes underlie the ratio of figurative language in giving advice and responding to it.

433) Rohan Das, The Warped Tensor Product of Frobenius Algebras (15 Jan 2025)

Frobenius algebras were first studied in the 1930s due to their importance to the representation theory of finite groups. Recently, they have returned to popularity because commutative Frobenius algebras correspond exactly to two-dimensional Topological Quantum Field Theories, which combine the principles of classical field theory, special relativity, and quantum mechanics. In this paper, we introduce the warped tensor product and use it to build new symmetric monoidal structures on Frobenius algebras.

432) Coleman DuPlessie, Sparse Autoencoders for Interpretability in Reinforcement Learning Models (15 Jan 2025)

Recent work has shown that sparse autoencoders (SAEs) are able to effectively discover human-interpretable features in language models, at scales ranging from toy models to state-of-the-art large language models. This work explores whether the use of SAEs can be generalized to other varieties of machine learning, specifically, reinforcement learning, and what, if any, modifications are necessary to adapt SAEs to this substantially different task. This research investigates both qualitative and quantitative measures of SAEs’ ability to represent reinforcement learning models’ activations as interpretable features, using a toy reinforcement learning environment to conduct empirical experiments. It finds that SAEs are successfully able to break down deep Q networks’ internal activations into human-interpretable features, and, furthermore, that some of these human-interpretable features represent an internal understanding of the underlying task that could not have been discovered from a deep Q network’s output alone.

431) Govind Velamoor and Adrita Samanta, Adaptive Timeout Strategies for Microservice Applications (15 Jan 2025)

Timeouts are critical in determining whether a request has succeeded or failed. Developers face several challenges when setting timeout values in distributed systems; the specific challenge we investigate is these systems’ propensity to change over both the short and long term. We propose a timeout-optimized policy targeting change over different time scales, assuming APIs that are both idempotent and atomic. We evaluate our approaches on a home-grown microservices testbed by comparing the timeout percentage, total time taken, and closeness to actual latency when our approaches and the industry standard of Exponential Backoff are used in simulated environments with changing system performance.
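
To make the comparison concrete, here is a minimal sketch of the Exponential Backoff baseline next to one possible adaptive estimator. The adaptive part is not the policy proposed in the paper, which the abstract does not specify. It is a swapped-in illustration that borrows the smoothed-mean-plus-variance idea from TCP retransmission timers, and the names `backoff_timeouts` and `AdaptiveTimeout` as well as all default constants are assumptions.

```python
def backoff_timeouts(base, factor, cap, attempts):
    """Timeout used on each retry under capped exponential backoff."""
    return [min(base * factor ** k, cap) for k in range(attempts)]


class AdaptiveTimeout:
    """Timeout that tracks observed latency (smoothed mean plus k deviations)."""

    def __init__(self, alpha=0.125, beta=0.25, k=4.0):
        self.alpha = alpha    # weight of a new sample in the smoothed latency
        self.beta = beta      # weight of a new sample in the deviation estimate
        self.k = k            # safety margin in units of deviation
        self.smoothed = None  # smoothed latency
        self.deviation = None # smoothed absolute deviation

    def observe(self, latency):
        if self.smoothed is None:
            # First sample initializes both estimates.
            self.smoothed = latency
            self.deviation = latency / 2
        else:
            self.deviation = ((1 - self.beta) * self.deviation
                              + self.beta * abs(self.smoothed - latency))
            self.smoothed = (1 - self.alpha) * self.smoothed + self.alpha * latency

    def timeout(self):
        if self.smoothed is None:
            raise ValueError("no latency observed yet")
        return self.smoothed + self.k * self.deviation


if __name__ == "__main__":
    print(backoff_timeouts(0.1, 2, 1.0, 5))
    estimator = AdaptiveTimeout()
    for latency in (1.0, 1.0, 1.2):
        estimator.observe(latency)
    print(estimator.timeout())
```

Backoff ignores measured latency and only reacts to failures, whereas an estimator of this kind follows drift in system performance, which is the kind of change over time the abstract targets.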

430) Sophia Liao, Orbits of Standard and Semistandard Young Tableaux Under the Cactus Group (14 Jan 2025)

The cactus group $J_n$, generated by the Bender--Knuth involutions $t_i$, acts on standard and semistandard Young tableaux by swapping entries of $i$ and $i+1$. The action of $J_n$ is a combinatorial abstraction of the problem of finding natural bijections between bases of irreducible representations of the group $S_n$ and the group $S_n\times GL(N)$. We fully classify the orbits of the action of the cactus group on standard Young tableaux and pairs of standard Young tableaux. In particular, we show that the action of $J_n$ is transitive on standard Young tableaux and nearly transitive on pairs, and we conjecture that the image of $J_n$ on standard Young tableaux is either the permutation group or the alternating group. Although the action of $J_n$ on standard Young tableaux is transitive, its action on semistandard Young tableaux is not. We establish several invariants, and we find a sufficient condition for one of these invariants to be a complete invariant.
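
As an illustration of how a single involution $t_i$ acts, the following sketch implements the standard Bender--Knuth rule on a tableau stored as a list of rows. It is not taken from the paper: the function name and the list-of-rows encoding are assumptions, and the input is assumed to be a valid semistandard tableau in English notation.

```python
def bender_knuth(tableau, i):
    """Apply the Bender-Knuth involution t_i to a semistandard Young tableau.

    An entry i is 'matched' if i+1 sits directly below it, and an entry i+1 is
    matched if i sits directly above it. In each row the unmatched entries form
    a run of a copies of i followed by b copies of i+1; t_i replaces that run
    with b copies of i followed by a copies of i+1.
    """
    rows = [list(row) for row in tableau]
    result = [list(row) for row in tableau]
    for r, row in enumerate(rows):
        free_columns = []  # columns holding unmatched i or i+1, left to right
        free_i = 0         # number of unmatched entries equal to i
        for c, value in enumerate(row):
            if value == i:
                has_below = r + 1 < len(rows) and c < len(rows[r + 1])
                if not (has_below and rows[r + 1][c] == i + 1):
                    free_columns.append(c)
                    free_i += 1
            elif value == i + 1:
                if not (r > 0 and rows[r - 1][c] == i):
                    free_columns.append(c)
        free_next = len(free_columns) - free_i
        for position, c in enumerate(free_columns):
            result[r][c] = i if position < free_next else i + 1
    return result


if __name__ == "__main__":
    print(bender_knuth([[1, 1, 1, 2], [2]], 1))
    print(bender_knuth([[1, 2], [3]], 2))
```

On a standard Young tableau this reduces to swapping $i$ and $i+1$ whenever they are not adjacent in the same row or column, and leaving the tableau unchanged otherwise.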

429) Jason Mao, Differentiating Geometric Structure of Point Cloud Distributions Using Persistent Homology (14 Jan 2025)

The field of topological data analysis aims to characterize datasets by their topological structures. In particular, representative tools such as persistent homology and persistence landscapes are used to condense the information provided by a shape or a point cloud into a more compact form that highlights structural properties of the data. These tools have previously been used to analyze global properties of shapes, such as their connectivity or their genus. Our work shows the potential of these tools to capture the local, geometric properties of shapes, such as the sharpness of their angles. Furthermore, we prove a theoretical result on how numerical metrics on persistence landscapes can capture geometric distinctions between point cloud distributions with arbitrarily high probability.

428) Eric Wang, The Conformal Type Problem for Riemann Surfaces Through Discrete Analysis of Extended Speiser Graphs (11 Jan 2025)

The conformal type (parabolic or hyperbolic) of a covering surface of the Riemann sphere with $n$ singular values can be determined by the type (recurrent or transient, respectively) of the corresponding extended Speiser graph. In this paper, we look at the extended Speiser graphs of some covering surfaces whose conformal type is known via analytic methods. For a covering surface known to be parabolic, we use discrete techniques of shorting and discrete extremal length to verify that the type of its extended Speiser graph is recurrent. For a covering surface known to be hyperbolic, we provide ideas of possible approaches for proving transience of the extended Speiser graph.

427) Ziang Chen, Qiao Zhang (PRIMES), and Runzhong Wang, On the Expressive Power of Subgraph Graph Neural Networks for Graphs with Bounded Cycles (11 Jan 2025; arXiv.org, 6 Feb 2025)

Graph neural networks (GNNs) have been widely used in graph-related contexts. It is known that the separation power of GNNs is equivalent to that of the Weisfeiler-Lehman (WL) test; hence, GNNs are imperfect at identifying all non-isomorphic graphs, which severely limits their expressive power. This work investigates $k$-hop subgraph GNNs that aggregate information from neighbors with distances up to $k$ and incorporate the subgraph structure. We prove that under appropriate assumptions, the $k$-hop subgraph GNNs can approximate any permutation-invariant/equivariant continuous function over graphs without cycles of length greater than $2k+1$ within any error tolerance. We also provide an extension to $k$-hop GNNs without incorporating the subgraph structure. Our numerical experiments on established benchmarks and novel architectures validate our theory on the relationship between the information aggregation distance and the cycle size.
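
The limitation mentioned above can be seen in a small sketch of 1-dimensional Weisfeiler-Lehman color refinement. It is an illustrative implementation, not code from the paper. The function name and the adjacency-dictionary encoding are assumptions, and the graphs are refined with a shared palette so that their color histograms are directly comparable.

```python
from collections import Counter


def wl_histograms(graphs, rounds=3):
    """Run 1-WL color refinement on several graphs with a shared color palette.

    graphs: list of adjacency dicts {vertex: list of neighbors}.
    Returns one Counter of final colors per graph; graphs whose Counters differ
    are certainly non-isomorphic, while equal Counters prove nothing.
    """
    colors = [{v: 0 for v in g} for g in graphs]
    for _ in range(rounds):
        # Signature = own color plus the sorted multiset of neighbor colors.
        signatures = [
            {v: (c[v], tuple(sorted(c[u] for u in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        distinct = sorted({s for sig in signatures for s in sig.values()})
        palette = {s: k for k, s in enumerate(distinct)}
        colors = [{v: palette[s] for v, s in sig.items()} for sig in signatures]
    return [Counter(c.values()) for c in colors]


if __name__ == "__main__":
    cycle6 = {k: [(k - 1) % 6, (k + 1) % 6] for k in range(6)}
    two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                     3: [4, 5], 4: [3, 5], 5: [3, 4]}
    h1, h2 = wl_histograms([cycle6, two_triangles])
    print(h1 == h2)  # the test cannot separate these two graphs
```

A 6-cycle and two disjoint triangles are both 2-regular, so every vertex keeps the same color in every round even though the graphs differ in their cycle lengths, which is the kind of structure that aggregating over $k$-hop subgraphs is meant to expose.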

426) Victor Gonzalez, Felix Gotti, Ishan Panpaliya (CrowdMath-2024), On the ascent of almost and quasi-atomicity to monoid semidomains (arXiv.org, 9 Jan 2025)

A commutative monoid is atomic if every non-invertible element factors into irreducibles (also called atoms), while an integral (semi)domain is atomic if its multiplicative monoid is atomic. Notions weaker than atomicity have been introduced and studied during the past decade, including almost atomicity and quasi-atomicity, which were coined and first investigated by Boynton and Coykendall in their study of graphs of divisibility of integral domains. The ascent of atomicity to polynomial extensions was settled by Roitman back in 1993 while the ascent of atomicity to monoid domains was settled by Coykendall and the second author in 2019 (in both cases the answer was negative). The main purpose of this paper is to study the ascent of almost atomicity and quasi-atomicity to polynomial extensions and monoid domains. Under certain reasonable conditions, we establish the ascent of both properties to polynomial extensions (over semidomains). Then we construct an explicit example illustrating that, with no extra conditions, quasi-atomicity does not ascend to polynomial extensions. Finally, we show that, in general, neither almost atomicity nor quasi-atomicity ascend to monoid domains, improving upon a construction first provided by Coykendall and the second author for the non-ascent of atomicity.

425) Hwisoo (Harry) Kim and Kenta Suzuki (MIT), Affine Subregular Kazhdan-Lusztig Polynomials in Types D and E (8 Jan 2025)

Affine Weyl groups $\widehat W$ have a two-sided cell $c_\mathrm{subreg}$—the subregular cell—which decomposes into left cells $c^j_\mathrm{subreg}$—subregular left cells—indexed by the set $\widehat{S} = \{s_i\}$ of simple reflections. Bezrukavnikov, Kac, and Krylov compute Kazhdan-Lusztig polynomials on the left cell $c^0_\mathrm{subreg}$ corresponding to the affine reflection $s_0\in\widehat S$ for simply-laced Lie algebras, and find new character formulas for simple modules of affine Lie algebras. We extend their work and provide an explicit description of the left cell module attached to $c^j_\mathrm{subreg}$ for all $s_j\in\widehat S$ in types D and E. As a corollary, we compute the values of all parabolic inverse affine Kazhdan-Lusztig polynomials on the subregular cell. Moreover, while Bezrukavnikov, Kac, and Krylov's methods were geometric, our methods are purely algebraic, so even when $j=0$ our proof is new.

424) Aidan Gao and Junhong Lin (MIT), ConstellationNet: Reinventing Spatial Clustering through GNNs (3 Jan 2025)

Spatial clustering is a crucial field, finding universal use across criminology, pathology, and urban planning. However, most spatial clustering algorithms cannot pull information from nearby nodes and suffer performance drops when dealing with higher dimensionality and large datasets, making them suboptimal for large-scale clustering. To improve upon this, we develop ConstellationNet, a convolutional neural network (CNN)-graph neural network (GNN) framework that leverages the embedding power of a CNN, the neighbor aggregation of a GNN, and a neural network’s ability to deal with batched data to improve spatial clustering and classification with graph augmented predictions. ConstellationNet achieves state-of-the-art performance on both supervised classification and unsupervised clustering across several datasets, outperforming state-of-the-art classification and clustering while reducing model size and training time by up to tenfold and improving baselines by 10 times. Because of its fast training and powerful nature, ConstellationNet holds promise in fields like epidemiology and medical imaging, able to quickly train on new and limited data to develop robust responses.

423) Arjun Agarwal, Rachel Chen, and Rohan Garg, The Davenport Constant and Automorphically Equivalent Elements (2 Jan 2025)

Let $G$ be a finite abelian group and let $D(G)$ be the Davenport constant of the group. In this paper we demonstrate several bounds on the Davenport constant. We also investigate whether $D(G)$, along with other numerical invariants of the group, is sufficient to uniquely determine its structure. Our investigations lead us to a conjecture that relates the divisibility of the Davenport constant of the subgroups to the structure of the group. We also study the inverse Davenport problem–the structure of maximal $0$-sequences of length $D(G)$. The structure of these sequences motivates the study of necessary and sufficient conditions for two elements $x$ and $y \in G$ to be automorphic images of one another. We ultimately prove that there exists $\varphi \in \text{Aut}(G)$ such that $\varphi(x) = y$ if and only if $G/\langle x \rangle \cong G/\langle y \rangle$. This result leads to our development of the two fastest known algorithms to determine if two elements of a finite abelian group are automorphic images of one another. We use this algorithm to develop the fastest known algorithm to compute the orbits of finite abelian groups.
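
For very small groups the Davenport constant can be checked by exhaustive search, using the fact that $D(G)$ is one more than the maximal length of a zero-sum-free sequence. The sketch below is a brute-force illustration only, not one of the algorithms developed in the paper. The function name and the encoding of $G$ as a tuple of cyclic moduli are assumptions, and the running time grows quickly with the group order.

```python
from itertools import product


def davenport_constant(moduli):
    """Davenport constant of Z_{m1} x ... x Z_{mk}, by exhaustive search."""
    zero = tuple(0 for _ in moduli)
    elements = [g for g in product(*(range(m) for m in moduli)) if any(g)]

    def add(a, b):
        return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

    best = 0  # length of the longest zero-sum-free sequence found so far

    def extend(start, sums, length):
        # sums = all sums of nonempty subsequences of the current sequence
        nonlocal best
        best = max(best, length)
        # Only append elements at index >= start, since order is irrelevant.
        for k in range(start, len(elements)):
            g = elements[k]
            new_sums = sums | {g} | {add(s, g) for s in sums}
            if zero not in new_sums:
                extend(k, new_sums, length + 1)

    extend(0, frozenset(), 0)
    return best + 1


if __name__ == "__main__":
    for group in [(4,), (2, 2), (3, 3)]:
        print(group, davenport_constant(group))
```

The search confirms the classical values $D(\mathbb{Z}_n) = n$ and $D(\mathbb{Z}_n \times \mathbb{Z}_n) = 2n - 1$ on these small cases.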

422) Jiya Dani, Felix Gotti (MIT), Leo Hong, Bangzheng Li (MIT), and Shimon Schlessinger, On Finitary Power Monoids of Linearly Orderable Monoids (2 Jan 2025; arXiv.org, 6 Jan 2025)

A commutative monoid $M$ is called a linearly orderable monoid if there exists a total order on $M$ that is compatible with its operation. The finitary power monoid of a commutative monoid $M$ is the monoid consisting of all nonempty finite subsets of $M$ under the so-called sumset. In this paper, we investigated whether certain atomic and divisibility properties ascend from linearly orderable monoids to their corresponding finitary power monoids.

421) Siddharth Nirgudkar, Contextualized Transfer Learning: Transforming Heterogeneity into Predictive Power with Generative Latent Structures in Resource-Limited Settings (2 Jan 2025)

Predicting biomedical outcomes in resource-limited settings is challenging due to data scarcity and patient variability: retraining models locally lacks power, while borrowing models fails to capture context-specific causes. Current approaches frame these challenges as a tradeoff: transfer learning enhances generalization by leveraging data from other settings but sacrifices patient-specific adaptation, while contextualized learning adapts to specific contexts but struggles with limited data. We introduce Contextualized Transfer Learning (CTL) as a novel approach that reconciles these conflicting goals by introducing a new notion of shared heterogeneity. We propose that some latent variable $z$ generates many predictors and outcomes across different tasks. Through a biological lens, $z$ can represent the set of genes and genetic regulators. By modeling the joint distribution of predictors and outcomes $ p(x, y \mid c) \sim f(z(c)) $, where $ f(z(c)) $ represents the latent structure shared across contexts, we can enable information sharing across disparate outcomes, patients, and predictors, introducing a new dimension to transfer learning: generalizing across tasks while simultaneously tailoring predictions to individual patient contexts. Sample-specific understanding is still retained, as the architecture of CTL is built off Contextualized Learning (CL). Instead of only taking one context modality, CTL is able to accept numerous context modalities, some of which contain information passed from upstream models, and others that have covariate data for the current task. These context modalities serve to create a unique "bar code" for each sample, called a subtype. This is then weighted against the archetype dictionary (extrema models) to create sample-specific models, hence retaining patient-specific understanding while also allowing for the transfer of information across diverse settings.
We apply CTL to predicting Alzheimer's disease and show that CTL reduces mean square error by 22.9% compared to contextualized regression (CR) and boosts classification accuracy by 8%, outperforming population-based methods by 30%. We also show the interpretability of CTL, which places heavy emphasis on a select few predictors, a property that is critical for gaining biological insight. These results highlight CTL's potential as a powerful tool for precision diagnostics, particularly in resource-limited settings.

420) Sophia Yan (PRIMES), Steve A. McCarroll, and Nicole B. Rockweiler, A multi-omic approach to uncover enhancer-gene interactions in the human brain (31 Dec 2024)

Gene regulation is a complicated process, critical for maintaining cell type-specific functions by controlling RNA expression. Diseases such as cancer, neurological disorders, and autoimmunity stem from the mis-regulation of gene expression. Here, we computationally explored one of the primary regulators of gene expression, enhancers, at a single nucleus resolution, aiming to understand the cell type-specific functions of enhancers. We’ve proposed a strategy to use single nucleus RNA-seq (snRNA-seq) to detect a canonical marker of active enhancers, enhancer RNA (eRNA), in the Brodmann area 46 (BA46) of 180 human brains across 1,217,965 nuclei. We assessed these putative enhancers by creating a scoring system to quantify their bidirectional transcription, a property of eRNA. We found a significant positive shift in the bidirectional score distributions between our putative enhancers and the nearby regions (Wilcoxon Signed Rank Test, p-value = 5e-143), providing confidence that our regions show enhancer behavior. We then utilized the unique power of our single nucleus sequencing data to explore the cell type-specificity of these putative enhancers and observed more than twice as many enhancers expressed in multiple cell types (n = 6,477) compared to cell type-specific enhancers (n = 2,699). The mean expression level of cell type-specific enhancers was also 29 times lower than that of ubiquitously expressed enhancers. In addition, we mapped these enhancers to their putative target genes by testing for correlation between putative eRNA expression and gene expression within the same topologically associating domain. Among the 4,147 potential enhancer-gene pairs we found across seven cell types using snRNA-seq, 116 (~3%) pairs are in chromatin loops and likely interact with each other in 3D space, according to a bulk Hi-C data analysis.
Moreover, the enhancer-gene pairs identified in the multiome dataset are well replicated in the snRNA-seq dataset across the same 20 donors, with the expression correlation values following the line of best fit y = 0.400x. Furthermore, increasing the sample size of the snRNA-seq dataset to 160 donors yields similar correlation values when compared to the 20-donor snRNA-seq dataset (line of best fit of y = 0.746x), highlighting the robustness of snRNA-seq for studying enhancer-gene interactions. The putative enhancer-gene pairs provide insight into the complex regulatory networks in the brain, shedding light on how dysregulation of these regions may contribute to brain-related disorders. In addition, our framework for identifying putative active enhancers

419) Michael Han and Ashley Yu, An Empirical Evaluation of Convergence to Correlated Equilibria: Introducing Multi-Stage Multiplicative-Weights Update (31 Dec 2024)

No-regret learning algorithms are an important component of advances in solving large-scale games. These algorithms are commonly used to solve games such as Diplomacy, an AI benchmark with a large action space where agents compete to dominate a map of Europe. We introduce Multi-Stage Multiplicative-Weights Update (MS-MWU), which shows an improvement upon existing external-regret minimizing algorithms such as MWU across all our experiments. We also perform an empirical evaluation of classic no-regret algorithms such as Multiplicative-Weights Update (MWU) and Optimistic Multiplicative-Weights Update (OMWU). Furthermore, we test swap regret minimization algorithms such as the no swap-regret algorithm of Blum & Mansour (2007) and the TreeSwap algorithm of Dagan et al. (2024). We play these algorithms against each other and randomized adversaries on hundreds of subgames of Diplomacy along with Kuhn Poker and random games. Across all these games, our experiments show that MS-MWU converges significantly faster than MWU/OMWU. We experimentally show that swap regret and external regret remain very similar at all iterations. In other words, external regret minimization algorithms such as MWU outperform swap regret minimization algorithms such as BM in terms of rate of convergence and time complexity, even for very large time horizons.
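
For readers unfamiliar with the baseline, here is a minimal sketch of the classic Multiplicative-Weights Update step and of two MWU learners playing a zero-sum matrix game against each other. It illustrates plain MWU only. The multi-stage variant MS-MWU is not specified in the abstract and is not reproduced here, and the function names, the payoff-matrix encoding, and the step size are assumptions.

```python
import math


def mwu_update(weights, losses, eta):
    """One MWU step: scale each weight by exp(-eta * loss), then renormalize."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def self_play(payoff, rounds, eta):
    """Two MWU learners in a zero-sum game; payoff[i][j] is the row player's gain.

    Returns the time-averaged strategies of the row and column players.
    """
    n, m = len(payoff), len(payoff[0])
    x, y = [1.0 / n] * n, [1.0 / m] * m
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        avg_x = [a + p / rounds for a, p in zip(avg_x, x)]
        avg_y = [a + p / rounds for a, p in zip(avg_y, y)]
        # Expected loss of each pure action against the opponent's current mix.
        row_losses = [-sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        col_losses = [sum(x[i] * payoff[i][j] for i in range(n)) for j in range(m)]
        x, y = mwu_update(x, row_losses, eta), mwu_update(y, col_losses, eta)
    return avg_x, avg_y


if __name__ == "__main__":
    rock_paper_scissors = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    print(self_play(rock_paper_scissors, rounds=100, eta=0.1))
```

In a zero-sum game the time-averaged strategies of two no-external-regret learners approach an equilibrium, which is why convergence rate is the natural quantity to compare across algorithms.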

418) Adam Ge and Aadya Goel, Unlearning Mechanisms in Graph Models and Document Classification (31 Dec 2024)

We look at Machine Unlearning, the concept of making AI models “forget” about a particular section of data. In our research, we look at how the use of graphs helps in the convergence of two problems: unlearning a document classification label and unlearning the edges of a graph. We consider a graph containing nodes of words and documents, with edges indicating whether there is a relationship between a word and a document, or between two documents. Current state-of-the-art algorithms randomly reclassify the documents, but we argue that this decreases model utility. Instead, we use similarity scores to reclassify the documents into the next best class. We refine the model further by ensuring the privacy guarantee of the unforgotten class by making sure it is indistinguishable from the remaining classes. Additionally, we introduce edge/relation unlearning to refine this process. The current state-of-the-art method for edge unlearning, called GNNDelete, decreases the predicted probability of an unlearned edge to very close to 0 and assumes there is no latent relationship, which we argue decreases model utility. Instead, we refine this assumption and forget solely the information we want to forget.

417) Celine Zhang, Eric Archerman, and Simon Langowski (MIT), Using Ideas From Hardware To Accelerate Zero-Knowledge Virtual Machines (30 Dec 2024)

Zero-knowledge virtual machines (zkVMs) are an up-and-coming solution to the problem of verifiable computation: they allow a prover to generate a proof showing the correct execution of a computer program. A verifier can then quickly verify this proof without knowing certain potentially private details about the program. zkVMs stand out for how they combine the math of traditional verifiable computation schemes with the user-friendly functionality of standard compilers, such as Clang and rustc, and widely used programming languages like C++ and Rust. Traditional approaches often require translating programs into low-level domain-specific languages by hand, a process that is both labor-intensive and error-prone. zkVMs solve this issue by accepting programs in the native assembly of the chosen virtual processor (rather than an esoteric DSL). This convenience, however, comes at the cost of additional overhead, particularly in memory emulation, and makes zkVMs generally less performant than traditional techniques.
Using the prominent zkVM Jolt as an example, we seek to close this gap in performance by optimizing Jolt’s memory proofs, primarily by mirroring the memory access patterns of physical CPUs in the virtual setting. Just as physical CPUs have been improved by various hardware optimizations, our improvements – multiple in-flight instructions, batched memory reads, caching, and faster registers – seek to improve Jolt’s performance. In this report, we outline our implementation plan for these optimizations and give some preliminary predictions of their results.

416) Frank Wang (MIT) and Eric Yee, Hilbert Series of $S_3$-Quasi-Invariant Polynomials in Characteristics 2, 3 (27 Dec 2024; arXiv.org, 30 Dec 2024), published in Symmetry, Integrability and Geometry: Methods and Applications (SIGMA) 21 (2025)

We compute the Hilbert series of the space of $n=3$ variable quasi-invariant polynomials in characteristic $2$ and $3$, capturing the dimension of the homogeneous components of the space, and explicitly describe the generators in the characteristic $2$ case. In doing so we extend the work of the first author in 2023 on quasi-invariant polynomials in characteristic $p>n$ and prove that a sufficient condition found by Ren-Xu in 2020 on when the Hilbert series differs between characteristic $0$ and $p$ is also necessary for $n=3$, $p=2,3$. This is the first description of quasi-invariant polynomials in the case where the space forms a modular representation over the symmetric group, bringing us closer to describing the quasi-invariant polynomials in all characteristics and numbers of variables.

415) Shreyas Ekanathan (PRIMES), Oscar Smith, Christopher Rackauckas, A Fully Adaptive Radau Method for the Efficient Solution of Stiff Ordinary Differential Equations at Low Tolerances (arXiv.org, 18 Dec 2024)

Radau IIA methods, specifically the adaptive order radau method in Fortran due to Hairer, are known to be state-of-the-art for the high-accuracy solution of highly stiff ordinary differential equations (ODEs). However, the traditional implementation was specialized to a specific range of tolerance, in particular only supporting 5th, 9th, and 13th order versions of the tableau and only derived in double precision floating point, thus limiting the ability to be truly general purpose for highly accurate scenarios. To alleviate these constraints, we implement an adaptive-time adaptive-order Radau method which can derive the coefficients for the Radau IIA embedded tableau to any order on the fly to any precision. Additionally, our Julia-based implementation includes many modernizations to improve performance, including improvements to the order adaptation scheme and improved linear algebra integrations. In a head-to-head benchmark against the classic Fortran implementation, we demonstrate our implementation is approximately 2x faster across a range of stiff ODEs. We benchmark our algorithm against several well-reputed numerical integrators for stiff ODEs and find state-of-the-art performance on several test problems, with a 1.5-times speed-up over common numerical integrators for stiff ODEs when low error tolerance is required. The newly implemented method is distributed in open source software for free usage on stiff ODEs.

414) Yunseo Choi (Harvard University) and Katelyn Gan, Ungar Games on the Young-Fibonacci Lattice and the Lattices of the Order Ideals of Shifted Staircases (9 Dec 2024)

In 2023, Defant and Li introduced an Ungar move, which sends an element $v$ of a meet-semilattice $L$ to the meet of some subset of the elements covered by $v$. More recently, Defant, Kravitz, and Williams introduced the Ungar game on $L$, in which two players take turns making nontrivial Ungar moves starting from an element of $L$; the player who cannot make a nontrivial Ungar move loses. In this note, we settle two conjectures by Defant, Kravitz, and Williams on the Ungar games on the Young-Fibonacci lattice and the lattices of the order ideals of shifted staircases.

413) Eddy Li, Advaith Mopuri, Charles Zhang, Goldbach Theorems for Group Semidomains (arXiv.org, 21 Nov 2024)

A semidomain is a subsemiring of an integral domain. We call a semidomain $S$ additively reduced if $0$ is the only invertible element of the monoid $(S, +)$, while we say that $S$ is additively Furstenberg if every non-invertible element of $(S,+)$ can be expressed as the sum of an atom and an element of $S$. In this paper, we study a variant of the Goldbach conjecture within the framework of group semidomains $S[G]$ and group series semidomains $S[\![G]\!]$, where $S$ is both an additively reduced and additively Furstenberg semidomain and $G$ is a torsion-free abelian group. In particular, we show that every non-constant polynomial expression in $S[G]$ can be written as the sum of at most two irreducibles if and only if the condition $\mathscr{A}_+(S) = S^\times$ holds.

412) Anay Aggarwal, Marly Gotti (Apple), Ekam Kaur, and Susie Lu, Comparative Analysis of Machine Learning Models for Thyroid Cancer Recurrence Prediction (21 Nov 2024)

Thyroid cancer is one of the most common endocrine malignancies, with Differentiated Thyroid Cancer (DTC) accounting for the majority of cases. Accurate prediction of cancer recurrence is essential for improving personalized treatment and patient outcomes. This study compares six machine learning algorithms—Artificial Neural Network (ANN), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost)—to identify the best model for predicting DTC recurrence. We conducted a comparative analysis using a dataset from the UCI Machine Learning Repository, which includes demographic, clinical, and pathological data for thyroid cancer patients. Each algorithm was evaluated on key performance metrics, including accuracy, precision, recall, and specificity. Feature selection techniques, such as Principal Component Analysis (PCA) and Feature Importance Analysis (FIA), were applied to identify the most significant features influencing recurrence. Among the models tested, Random Forest achieved the highest overall accuracy and specificity, while SVM with the polynomial kernel excelled in recall, ensuring all positive cases were captured. Feature selection highlighted “Response”, “N”, “T”, “Risk”, and “Age” as the most impactful variables, contributing to model improvement and enhanced interpretability. The Random Forest model demonstrates robust predictive power and is a strong candidate for clinical applications in DTC recurrence prediction, with potential to support more tailored treatment strategies. The study underscores the role of machine learning in advancing cancer care through improved predictive accuracy and personalized risk assessment.

411) Shiqiao Zhang, A Rigidity Result for Axisymmetric Toric Ricci Solitons (arXiv.org, 30 Oct 2024)

We examine a non-axisymmetric perturbation of a family of axisymmetric toric Einstein manifolds and Ricci solitons studied in Firester–Tsiamis (2024). We establish a rigidity result stating that these axisymmetric Ricci solitons do not admit constant-angle non-axisymmetric perturbations except for conformally flat cases. For these new cases, our result leads to an explicit description of the Einstein metrics and a classification of the Ricci solitons under a volume-collapsing ansatz.

410) Foster Tom (MIT), Aarush Vailaya, Adjacent cycle-chains are $e$-positive (arXiv.org, 29 Oct 2024)

We describe a way to decompose the chromatic symmetric function as a positive sum of smaller pieces. We show that these pieces are $e$-positive for cycles. Then we prove that attaching a cycle to a graph preserves the $e$-positivity of these pieces. From this, we prove an $e$-positive formula for graphs of cycles connected at adjacent vertices. We extend these results to graphs formed by connecting a sequence of cycles and cliques.

409) Neil Krishnan, Rupert Li (MIT), On the Connectivity of Friends-and-strangers Graphs (arXiv.org, 27 Oct 2024)

Friends-and-strangers graphs, coined by Defant and Kravitz, are denoted by $\mathsf{FS}(X,Y)$ where $X$ and $Y$ are both graphs on $n$ vertices. The graph $X$ represents positions and edges mark adjacent positions while the graph $Y$ represents people and edges mark friendships. The vertex set of $\mathsf{FS}(X,Y)$ consists of all one-to-one placements of people on positions, and there is an edge between any two placements if it is possible to swap two people who are friends and on adjacent positions to get from one placement to the other. Previous papers have studied when $\mathsf{FS}(X,Y)$ is connected. In this paper, we consider when $\mathsf{FS}(X,Y)$ is $k$-connected where a graph is $k$-connected if it remains connected after removing any $k-1$ or fewer vertices. We first consider $\mathsf{FS}(X,Y)$ when $Y$ is a complete graph or star graph. We find tight bounds on their connectivity, proving their connectivity equals their minimum degree. We further consider the size of the connected components of $\mathsf{FS}(X,\mathsf{Star}_n)$ where $X$ is connected. We show that asymptotically similar conditions as the conditions mentioned by Bangachev are sufficient for $\mathsf{FS}(X,Y)$ to be $k$-connected. Finally, we consider when $X$ and $Y$ are independent Erdős--Rényi random graphs on $n$ vertices and edge probability $p_1$ and $p_2,$ respectively. We show that for $p_0 = n^{-1/2+o(1)},$ if $p_1p_2\geq p_0^2$ and $p_1,$ $p_2 \geq w(n) p_0$ where $w(n) \rightarrow 0$ as $n \rightarrow \infty,$ then $\mathsf{FS}(X,Y)$ is $k$-connected with high probability. This is asymptotically tight as we show that below an asymptotically similar threshold $p_0'=n^{-1/2+o(1)}$, the graph $\mathsf{FS}(X,Y)$ is disconnected with high probability if $p_1p_2 \leq (p_0')^2$.
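
The definition above translates directly into a small brute-force sketch that explores $\mathsf{FS}(X,Y)$ and counts its connected components. It is an illustration for tiny $n$ only, since the vertex set has $n!$ elements. The function name and the edge-list encoding are assumptions, and only ordinary connectivity is checked, not $k$-connectivity.

```python
from itertools import permutations


def fs_components(x_edges, y_edges, n):
    """Count connected components of FS(X, Y) for graphs on vertices 0..n-1.

    A placement is a tuple sigma with sigma[position] = person. Two placements
    are adjacent if they differ by swapping two friends (an edge of Y) who
    stand on adjacent positions (an edge of X).
    """
    friends = {frozenset(edge) for edge in y_edges}
    seen = set()
    components = 0
    for start in permutations(range(n)):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            sigma = stack.pop()
            for a, b in x_edges:
                if frozenset((sigma[a], sigma[b])) in friends:
                    swapped = list(sigma)
                    swapped[a], swapped[b] = swapped[b], swapped[a]
                    swapped = tuple(swapped)
                    if swapped not in seen:
                        seen.add(swapped)
                        stack.append(swapped)
    return components


if __name__ == "__main__":
    path3 = [(0, 1), (1, 2)]
    complete3 = [(0, 1), (0, 2), (1, 2)]
    print(fs_components(path3, complete3, 3))  # everyone is friends
    print(fs_components(path3, path3, 3))      # persons 0 and 2 can never pass
```

With positions on a path and a complete friendship graph every placement is reachable, whereas with a path as the friendship graph the relative order of the two non-friends is invariant, so the graph splits into two components.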

408) Evin Liang, Alexander Wang, Lerchen Zhong, On maximal common divisors in Puiseux monoids (arXiv.org, 11 Oct 2024)

Let $M$ be a commutative monoid. An element $d \in M$ is called a maximal common divisor of a nonempty subset $S$ of $M$ if $d$ is a common divisor of $S$ in $M$ and the only common divisors in $M$ of the set $\big\{ \frac{s}d : s \in S \big\}$ are the units of $M$. In this paper, we investigate the existence of maximal common divisors in rank-$1$ torsion-free commutative monoids, also known as Puiseux monoids. We also establish some connections between the existence of maximal common divisors and both atomicity and the ascending chain condition on principal ideals for the monoids we investigate here.

407) Jonathan Du, Bryan Li, Shaohuan Zhang, On the Internal Sum of Puiseux Monoids (arXiv.org, 28 Sep 2024), published in the Korean Journal of Mathematics 32:4 (2024): 745-757

In this paper, we investigate the internal (finite) sum of submonoids of rank-$1$ torsion-free abelian groups. These submonoids, when not groups, are isomorphic to nontrivial submonoids of the nonnegative cone of $\mathbb Q$, known as Puiseux monoids, and have been actively studied during the last few years. Here we study how the atomicity and arithmetic of Puiseux monoids behave under their internal (finite) sum inside the abelian group $\mathbb Q$. We study the factorization properties of such internal sums, giving priority to Cohn's notion of atomicity and the classical bounded and finite factorization properties introduced and studied in 1990 by Anderson, Anderson, and Zafrullah in the setting of integral domains, and then generalized by Halter-Koch to commutative monoids. We pay special attention to how each of the considered properties behaves under the internal sum of a Puiseux monoid with a finitely generated Puiseux monoid. Throughout the paper, we also discuss examples showing that our primary results do not hold for submonoids of torsion-free abelian groups with rank larger than $1$.

406) Marina Lin, Laura P. Schaposnik (University of Illinois at Chicago), A Carbon Aware Ant Colony System (CAACS) (arXiv.org, 11 Sep 2024)

In an era where sustainability is becoming increasingly crucial, we introduce a new Carbon-Aware Ant Colony System (CAACS) Algorithm that addresses the Generalized Traveling Salesman Problem (GTSP) while minimizing carbon emissions. This novel approach leverages the natural efficiency of ant colony pheromone trails to find optimal routes, balancing both environmental and economic objectives. By integrating sustainability into transportation models, CAACS provides a powerful tool for real-world applications, including network design, delivery route planning, and commercial aircraft logistics. Our algorithm's unique bi-objective optimization advances the study of sustainable transportation solutions.

405) Sidarth Erat, Arun S. Kannan, and Shihan Kanungo, Mixed Tensor Products, Capelli Berezinians, and Newton’s Formula for $\mathfrak{gl}(m|n)$ (arXiv.org, 4 Sep 2024), published in Transformation Groups (24 March 2025)

In this paper, we extend the results of Grantcharov and Robitaille in 2021 on mixed tensor products and Capelli determinants to the superalgebra setting. Specifically, we construct a family of superalgebra homomorphisms $\varphi_R : U(\mathfrak{gl}(m+1|n)) \rightarrow \mathcal{D}'(m|n) \otimes U(\mathfrak{gl}(m|n))$ for a certain space of differential operators $\mathcal{D}'(m|n)$, and study the homomorphism's properties. We use $\varphi_R$ to inflate representations of $U(\mathfrak{gl}(m|n))$ to those of $U(\mathfrak{gl}(m+1|n))$ and find partial criteria for when these inflations are simple. Next, we study the restriction of $\varphi_R$ to the center of $U(\mathfrak{gl}(m+1|n))$, determining its interaction with the Harish-Chandra homomorphism and the image of the Gelfand generators of the center. To do so, we prove a super-analog of Newton's formula for $\mathfrak{gl}(m)$ relating Capelli generators and Gelfand generators. Finally, we prove the kernel of $\varphi_{R_1}$ is the ideal of $U(\mathfrak{gl}(m+1|n))$ generated by the first Gelfand invariant $G_1$.

404) Boyan Litchev, Improved Performance for Private Information Retrieval (10 Jul 2024)

By allowing users to retrieve items from a database without revealing which item was retrieved, Private Information Retrieval (PIR) has enabled recent advances in anonymous communication, private streaming, and more. However, PIR is very computationally expensive, and is fundamentally limited to having a computational cost that scales linearly with the size of the database, limiting the scale of protocols that use it to millions of users. By adjusting the procedure for gadget inversions, a key step in the homomorphic multiplications used in PIR, we achieve a 30% speedup over existing state-of-the-art PIR protocols and similarly reduce network costs.

403) Stephanie Wan, Transparent Authorship Verification with Machine Learning Models (26 Jun 2024)

Authorship Verification (AV) is the task of determining if two given documents were written by the same person. AV is critical in addressing issues such as misinformation and impersonation, though it holds risks in violating privacy rights. This paper presents a publicly accessible website hosting transparent AV machine learning models. We aggregate and pre-process diverse datasets to train a lexical model based on embeddings and a stylometric model leveraging feature vectors. To enhance model transparency, we incorporate attention-based highlighting and output important features. The code and website for this paper are available at GitHub and Streamlit.

402) Sophia Lichterfeld and Kyle Hogan (MIT), An Analysis of the 2024 Web Monetization Landscape (21 Jun 2024)

W3C’s Web Monetization (WM) API offers users the ability to compensate content creators online by continuously streaming micropayments to the website owner while viewing a page. While WM could be a feasible alternative to advertisements or subscriptions, it has not yet been widely adopted by websites. Rates of WM adoption were tracked from 2019 to 2021 but have not been evaluated for the past several years. To implement WM, website owners must add a meta tag or link with a payment pointer directing the money to their online wallet into their page’s HTML head. Using the presence of the meta tag or link as an indicator of WM adoption, we built a web scraper to determine the current WM adoption rate in 2024. To expand our adoption rate results, we analyzed a dataset curated by HTTP Archive through Google’s BigQuery database. We further assessed the breakdown of wallet providers, the distribution of website hosts, and the comparison of these metrics across time points and subsets of the dataset. We hope our findings will fill this data gap and better inform approaches to increasing widespread WM adoption.
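
The adoption indicator described above, a meta tag or link carrying a payment pointer, can be checked with a few lines of standard-library Python. This is a minimal sketch and not the scraper used in the paper: the class and function names are hypothetical, and it assumes the two tag forms `<meta name="monetization" content="...">` and `<link rel="monetization" href="...">`. It inspects a single HTML string and does no crawling or network access.

```python
from html.parser import HTMLParser


class MonetizationTagFinder(HTMLParser):
    """Collects payment pointers from Web Monetization meta and link tags."""

    def __init__(self):
        super().__init__()
        self.pointers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "monetization":
            pointer = attr.get("content", "")
        elif tag == "link" and "monetization" in attr.get("rel", "").lower().split():
            pointer = attr.get("href", "")
        else:
            return
        if pointer:
            self.pointers.append(pointer)


def find_payment_pointers(html):
    """Return all payment pointers declared in the given HTML text."""
    finder = MonetizationTagFinder()
    finder.feed(html)
    finder.close()
    return finder.pointers


sample = (
    '<html><head>'
    '<meta name="monetization" content="$wallet.example.com/alice">'
    '</head><body>Hello</body></html>'
)
print(find_payment_pointers(sample))
```

A page counts as having adopted WM under this indicator when the returned list is non-empty. The wallet host in the pointer is what a breakdown by wallet provider would group on.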

2023 Research Papers

401) Anay Aggarwal, Felix Gotti, Susie Lu (CrowdMath-2023), On primality and atomicity of numerical power monoids (arXiv.org, 8 Dec 2024)

In the first part of this paper, we establish a variation of a recent result by Bienvenu and Geroldinger on the (almost) non-existence of absolute irreducibles in (restricted) power monoids of numerical monoids: we argue the (almost) non-existence of primal elements in the same class of power monoids. The second part of this paper, devoted to the study of the atomic density of $\mathcal{P}_{\text{fin}, 0}(\mathbb{N}_0)$, is motivated by work of Shitov, a recent paper by Bienvenu and Geroldinger, and some questions pointed out by Geroldinger and Tringali. There, we study atomic density through the lens of the natural partition $\{ \mathcal{A}_{n,k} : k \in \mathbb{N}_0\}$ of $\mathcal{A}_n$, the set of atoms of $\mathcal{P}_{\text{fin}, 0}(\mathbb{N}_0)$ with maximum at most $n$: \[ \mathcal{A}_{n,k} = \{A \in \mathcal{A} : \max A \le n \text{ and } |A| = k\} \] for all $n,k \in \mathbb{N}$, where $\mathcal{A}$ is the set of atoms of $\mathcal{P}_{\text{fin}, 0}(\mathbb{N}_0)$. We pay special attention to the sequence $(\alpha_{n,k})_{n,k \ge 1}$, where $\alpha_{n,k}$ denotes the size of the block $\mathcal{A}_{n,k}$. First, we establish some bounds and provide some asymptotic results for $(\alpha_{n,k})_{n,k \ge 1}$. Then, we take a probabilistic approach to argue that, for each $n \in \mathbb{N}$, the sequence $(\alpha_{n,k})_{k \ge 1}$ is almost unimodal. Finally, for each $n \in \mathbb{N}$, we consider the random variable $X_n : \mathcal{A}_n \to \mathbb{N}_0$ defined by the assignments $X_n : A \mapsto |A|$, whose probability mass function is $\mathbb{P}(X_n=k) = \alpha_{n,k}/| \mathcal{A}_n|$. We conclude by proving that, for each $m \in \mathbb{N}$, the sequence of moments $(\mathbb{E}(X_n^m))_{n \ge 1}$ behaves asymptotically as that of a sequence $(\mathbb{E}(Y_n^m))_{n \ge 1}$, where $Y_n$ is a binomially distributed random variable with parameters $n$ and $\frac12$.
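
For small $n$, the block sizes $\alpha_{n,k}$ can be tabulated by brute force. The sketch below is an illustration only, with hypothetical function names. It takes $\mathcal{P}_{\text{fin}, 0}(\mathbb{N}_0)$ to be the monoid of finite subsets of $\mathbb{N}_0$ containing $0$ under the Minkowski sum, and it uses the fact that if $A = B + C$ with $0 \in B$ and $0 \in C$, then both $B$ and $C$ are subsets of $A$. The search is exponential in $n$.

```python
from itertools import combinations


def minkowski_sum(b, c):
    """Minkowski sum of two finite sets of integers."""
    return frozenset(x + y for x in b for y in c)


def subsets_with_zero(a):
    """All subsets of a that contain 0 and at least one other element."""
    rest = sorted(a - {0})
    for size in range(1, len(rest) + 1):
        for extra in combinations(rest, size):
            yield frozenset((0,) + extra)


def is_atom(a):
    """True if a is not {0} and is not a sum of two sets different from {0}."""
    if a == frozenset({0}):
        return False
    candidates = list(subsets_with_zero(a))
    return not any(
        minkowski_sum(b, c) == a for b in candidates for c in candidates
    )


def atom_counts(n):
    """Map k to the number of atoms A with max A <= n and |A| = k."""
    counts = {}
    for a in subsets_with_zero(frozenset(range(n + 1))):
        if is_atom(a):
            counts[len(a)] = counts.get(len(a), 0) + 1
    return counts


for n in range(1, 6):
    print(n, atom_counts(n))
```

For example, $\{0,1,2\} = \{0,1\} + \{0,1\}$ is not an atom, while $\{0,2\}$ and $\{0,1,3\}$ are.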

400) Coleman DuPlessie, Aidan Gao, A Novel Review of Stability Techniques for Improved Privacy-Preserving Machine Learning (arXiv.org, 31 May 2024)

Machine learning models have recently enjoyed a significant increase in size and popularity. However, this growth has created concerns about dataset privacy. To counteract data leakage, various privacy frameworks guarantee that the output of machine learning models does not compromise their training data. However, this privatization comes at a cost by adding random noise to the training process, which reduces model performance. By making models more resistant to small changes in input and thus more stable, the necessary amount of noise can be decreased while still protecting privacy. This paper investigates various techniques to enhance stability, thereby minimizing the negative effects of privatization in machine learning.

399) Nicholas Hagedorn, Algorithmically Generated Pants Decompositions of Combinatorial Surfaces (29 Apr 2024)

We describe two algorithms that efficiently find a pants decomposition of a surface model given by taking a $2n$-sided regular polygon with unit length sides and gluing all the edges in pairs. The first algorithm closely follows Buser's proof that any surface $S$ of genus $g \ge 2$ has a pants decomposition of length at most $C(g \text{Area}(S))^{1/2}$ for some constant $C>0$. The second algorithm finds a pants decomposition by estimating the size of the largest embedded ball at a randomly chosen point on the surface. We prove that the first algorithm always gives a pants decomposition of size at most $C'g$ for some constant $C'>0$ in $O(ng + g^3)$ time. Empirically, we observe that the second algorithm outputs much shorter pants decompositions than the first.

398) Amith Saligrama, A Novel Statistical Framework for Characterizing Mosaic Altered Cells in Single-Cell RNA Data (22 Apr 2024)

We introduce a novel statistical framework to analyze single-cell gene-expression counts in samples with autosomal alterations. Unlike the loss of the Y chromosome, which is easily detected due to gene de-activation and has been explored in prior works, identifying cells with autosomal alterations is fundamentally challenging. This complexity arises because expression for autosomal chromosomes undergoing loss or alteration exhibits significant variability, rendering detection purely based on absolute counts unreliable. Our key insight for detecting chromosomal loss in a cell is based on the idea of normalizing against another chromosome, whose expression is known to be statistically independent of target chromosomal loss/mutation. This leads us to a precise characterization in terms of binomial distributions, and we can perform a hypothesis test for each cell and detect ploidy. We extend this framework for detection of cells with allelic alterations. We then develop a classification algorithm that detects chromosomal loss while controlling the false positivity rate (FPR). We validate our model by utilizing counts of single RNA molecules from haplotypes affected in a fraction of the cells analyzed, and then use the algorithm to identify cells that have lost chromosome 18 in brain cells or carry a 9q CN-LOH alteration in induced pluripotent stem cells derived from peripheral blood mononuclear cells. Cell-by-cell identification of chromosomal loss is a critical step for inferring gene expressivity, and we identify a consistent pattern of abnormal trans-chromosomal expression in cells with autosomal loss/alterations. Our study also leads to a rather surprising finding: prior studies associate 9q CN-LOH with diverse detrimental effects, and in contrast our study reveals that the mutated cells behave no differently from non-mutated cells.

397) Alan Bu, The Local-Global Principle and a Projective Twist on the Hasse Norm Theorem (21 Apr 2024)

A finite extension of global fields L/K satisfies the Hasse norm principle if any nonzero element of K has the property that it is a norm locally if and only if it is a norm globally. In 1931, Hasse proved that any cyclic extension satisfies the Hasse norm principle, providing a novel approach to extending the local-global principle to equations with degree greater than 2. In this paper, we introduce the projective Hasse norm principle, generalizing the Hasse norm principle to multiple number fields and asking whether a projective line that contains a norm locally in every number field must also contain a norm globally in every number field. We show that the projective Hasse norm principle is independent from the conjunction of Hasse norm principles in all of the constituent number fields in the general case, but that the latter implies the former when the fields are all Galois and independent. We also prove an analogue of the Hasse norm theorem for the projective Hasse norm theorem, namely that the projective Hasse norm principle holds in all cyclic extensions.

396) Raymond Luo, Cyclic Base Orderings of Matroids (18 Mar 2024)

A cyclic base ordering of a matroid $M=(E,\mathcal{I})$ is a cyclic ordering of the elements of $E$ such that every $r(E)$ consecutive elements form a base, where $r$ is the rank function of $M$. An area of research in matroid theory asks which matroid classes exhibit cyclic base orderings under certain conditions. In this paper, we provide several necessary conditions for matching and graphic matroids to have cyclic base orderings. We also provide graph operations that preserve the existence of cyclic base orderings on graphic matroids.
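
For graphic matroids the definition can be checked directly: a base is a maximal forest, so a cyclic ordering of the edges is a cyclic base ordering exactly when every window of $r(E)$ cyclically consecutive edges is acyclic. The sketch below is a hedged illustration with hypothetical function names, using a simple union-find structure. It verifies a given ordering and does not search for one.

```python
def _find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def is_forest(edges, num_vertices):
    """True if the given edges contain no cycle."""
    parent = list(range(num_vertices))
    for u, v in edges:
        ru, rv = _find(parent, u), _find(parent, v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def graphic_rank(edges, num_vertices):
    """Rank of the graphic matroid: number of edges in a maximal forest."""
    parent = list(range(num_vertices))
    rank = 0
    for u, v in edges:
        ru, rv = _find(parent, u), _find(parent, v)
        if ru != rv:
            parent[ru] = rv
            rank += 1
    return rank


def is_cyclic_base_ordering(ordering, num_vertices):
    """Check that every r cyclically consecutive edges form a base."""
    r = graphic_rank(ordering, num_vertices)
    m = len(ordering)
    return all(
        is_forest([ordering[(i + j) % m] for j in range(r)], num_vertices)
        for i in range(m)
    )


# Two orderings of the six edges of the complete graph K_4 (rank 3).
good = [(0, 1), (2, 3), (0, 2), (1, 3), (0, 3), (1, 2)]
bad = [(0, 1), (1, 2), (0, 2), (0, 3), (1, 3), (2, 3)]
print(is_cyclic_base_ordering(good, 4), is_cyclic_base_ordering(bad, 4))
```

In $K_4$ a set of three edges fails to be a base only when it forms a triangle. The second ordering fails immediately because its first three edges form the triangle on vertices 0, 1, 2.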

395) Nick Derr (MIT), Alex Zhao, Torque and Force Resulting from Eccentricity in Oscillation (7 Mar 2024)

The self-alignment and organization of small objects in fluids is important in many contexts including biology, robotics, and medicine. In laminar Stokes flow, viscous forces dominate, and Purcell’s theorem forbids time-average steady flow as a result of oscillation. In turbulent flow, inertial forces dominate. At intermediate Reynolds numbers, neither viscous nor inertial forces dominate. A number of recent works have investigated steady flows as a result of micro-oscillation in simple systems at intermediate Reynolds numbers. Here, we extend previous work and analyze the micro-oscillation behavior of fluid in a 2D circular domain, forced around two ellipses fixed in position. A perturbation analysis decomposes the problem into a series of linear problems, which are solved using the finite element method in complex numbers. The force and torque on each ellipse are computed for various geometric positions and Reynolds numbers Re. We find that in the single-ellipse case, the torque is sinusoidal in angular orientation, and also approximately proportional to Re, so angular orientation aligns perpendicular to the direction of oscillation. In the double-ellipse case, several local effects in the single-ellipse case are preserved. Furthermore, the change in the torque on one ellipse is also sinusoidal in the angular orientation of the other ellipse and proportional to Re, as well as being independent of the orientation of the first ellipse.

394) David Dong, Generalized Eulerian Numbers and Directed Friends-and-seats Graphs (arXiv.org, 29 Feb 2024)

Let $A(n,m)$ denote the Eulerian numbers, which count the number of permutations on $[n]$ with exactly $m$ descents, or, due to the Foata transform, the number of permutations on $[n]$ with exactly $m$ excedances. Friends-and-seats graphs, also known as friends-and-strangers graphs, are a seemingly unrelated recent construction in graph theory. In this paper, we introduce directed friends-and-seats graphs and establish a connection between these graphs and a generalization of the Eulerian numbers. We use this connection to reprove and extend a Worpitzky-like identity on generalized Eulerian numbers.
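
The two descriptions of the classical Eulerian numbers mentioned above, by descents and by excedances, can be compared by brute force for small $n$, together with the classical Worpitzky identity $x^n = \sum_{m} A(n,m)\binom{x+m}{n}$. This sketch covers only the classical numbers, not the generalization or the directed graphs studied in the paper, and the function names are hypothetical.

```python
from itertools import permutations
from math import comb


def eulerian_by_descents(n):
    """counts[m] = number of permutations of [n] with exactly m descents."""
    counts = [0] * n
    for p in permutations(range(1, n + 1)):
        descents = sum(p[i] > p[i + 1] for i in range(n - 1))
        counts[descents] += 1
    return counts


def eulerian_by_excedances(n):
    """counts[m] = number of permutations of [n] with exactly m excedances."""
    counts = [0] * n
    for p in permutations(range(1, n + 1)):
        # Position i + 1 (1-indexed) is an excedance when p(i + 1) > i + 1.
        excedances = sum(p[i] > i + 1 for i in range(n))
        counts[excedances] += 1
    return counts


def worpitzky_rhs(n, x):
    """Right-hand side of the classical Worpitzky identity at integer x >= 0."""
    a = eulerian_by_descents(n)
    return sum(a[m] * comb(x + m, n) for m in range(n))


print(eulerian_by_descents(4), eulerian_by_excedances(4))
print(worpitzky_rhs(3, 3), 3 ** 3)
```

Both enumerations return the same row for each $n$, as the Foata transform guarantees, and the Worpitzky sum reproduces $x^n$.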

393) Laura P. Schaposnik (University of Illinois at Chicago), Raina Wu, Influencer Identification on Link Predicted Graphs (arXiv.org, 5 Feb 2024)

What would admissions look like in a university program for influencers? In the realm of social network analysis, influence maximization and link prediction stand out as pivotal challenges. Influence maximization focuses on identifying a set of key nodes to maximize information dissemination, while link prediction aims to foresee potential connections within the network. These strategies, primarily deep learning link prediction methods and greedy algorithms, have been previously used in tandem to identify future influencers. However, given the complexity of these tasks, especially in large-scale networks, we propose an algorithm, The Social Sphere Model, which uniquely utilizes expected value in its future graph prediction and combines specifically path-based link prediction metrics and heuristic influence maximization strategies to effectively identify future vital nodes in weighted networks. Our approach is tested on two distinct contagion models, offering a promising solution with lower computational demands. This advancement not only enhances our understanding of network dynamics but also opens new avenues for efficient network management and influence strategy development.

392) Andrey Boris Khesin (MIT) and Alexander M. Li, Canonical Forms and Equivalence Classes of QECC’s in ZX Calculus (4 Feb 2024)

Quantum error-correcting codes (QECC’s) are needed to combat the inherent noise affecting quantum processes. Using ZX calculus, we represent QECC’s in a form called a ZX diagram, consisting of a graph made up of nodes and edges. In this paper, we present canonical forms for the ZX diagrams of the toric codes and certain surface codes. We derive these forms by rewriting them using the bialgebra rule, which removes extra internal nodes and was implemented through Quantomatic, and the edge local complementation rule, which exchanges the colors of two nodes. Next, we tabulate the equivalence classes, including properties such as their size and the presence (or lack) of bipartite forms, of generic ZX diagrams of QECC’s. This work expands on previous works in exploring the canonical forms of QECC’s in their ZX diagram representations.

391) Joel Hayford, Jacob Goldman-Wetzler, Eric Wang (PRIMES), Lu Lu, Speeding up and reducing memory usage for scientific machine learning via mixed precision (arXiv.org, 30 Jan 2024)

Scientific machine learning (SciML) has emerged as a versatile approach to address complex computational science and engineering problems. Within this field, physics-informed neural networks (PINNs) and deep operator networks (DeepONets) stand out as the leading techniques for solving partial differential equations by incorporating both physical equations and experimental data. However, training PINNs and DeepONets requires significant computational resources, including long computational times and large amounts of memory. In search of computational efficiency, training neural networks using half precision (float16) rather than the conventional single (float32) or double (float64) precision has gained substantial interest, given the inherent benefits of reduced computational time and memory consumed. However, we find that float16 cannot be applied to SciML methods, because of gradient divergence at the start of training, weight updates going to zero, and the inability to converge to a local minimum. To overcome these limitations, we explore mixed precision, which is an approach that combines the float16 and float32 numerical formats to reduce memory usage and increase computational speed. Our experiments showcase that mixed precision training not only substantially decreases training times and memory demands but also maintains model accuracy. We also reinforce our empirical observations with a theoretical analysis. The research has broad implications for SciML in various computational applications.
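
One of the float16 failure modes named above, weight updates going to zero, can be reproduced with nothing but the standard library. The sketch below is an illustrative toy and not the paper's experiments: it simulates rounding to half and single precision with the `struct` format codes `"e"` and `"f"`, and it assumes a fixed update of $10^{-4}$ applied to a weight near 1. Keeping the weight in float32 mirrors the idea of a higher-precision master copy that mixed precision schemes commonly use; whether the authors' setup matches this exactly is an assumption.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def train(round_weight, steps=1000, update=1e-4):
    """Apply the same small update repeatedly, rounding the weight each step."""
    w = round_weight(1.0)
    for _ in range(steps):
        w = round_weight(w + update)
    return w


# Near 1.0 the spacing of float16 values is 2**-10, roughly 9.8e-4,
# so adding 1e-4 rounds straight back to the old weight.
print("float16 weight:", train(to_half))
print("float32 weight:", train(to_single))
```

The update itself is representable in float16; it is the addition to a much larger weight that loses it. In float32 the same thousand updates accumulate to about 1.1.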

390) Dongchen Zou, Intersection Attacks in Non-Uniform Setting (22 Jan 2024)

Recently, consumer demand for privacy has spurred growth in private messaging systems. However, formally, privacy degrades in such systems when users log on and off: this change of status exposes the ongoing conversations. Intersection attacks (also known as statistical disclosure attacks) use messaging patterns or liveness information to reconstruct relationships, deanonymize users, and track user behaviors. Prior attacks assume users have an underlying uniform communication pattern for simplicity, leaving open the question of how effective such attacks would be in a non-uniform real world. We observe that effects like clustering in real social graphs and correlation between repeated conversations change the behavior and potential of such attacks. This paper provides a new approach that can consider some of these additional factors by constructing a polynomial to determine the social graph. We provide an analysis of the performance, accuracy, and convergence rate of our attack. Our attack applies to many existing anonymous communication systems, and our technique can be extended to incorporate additional factors.

389) Sophia Lichterfeld, Garima Rastogi, and Kyle Hogan (MIT), Leveraging the Escrow-Holding Abilities of Ethereum Smart Contracts to Incentivize Account Creation for the Widespread Adoption of Web Monetization Schemes (16 Jan 2024)

Traditional Web Monetization (WM) schemes that stream micropayments directly to the website owner throughout the time the user spends on the page have faced significant challenges in acquiring the widespread adoption of their platforms because they require full website participation to be implemented. However, many website owners are still unfamiliar with cryptocurrencies and online wallets; therefore, it becomes a major hindrance to obligate website owners to have already set up a completely functional WM system before any user can begin employing WM with the website. Our proposal addresses this barrier by providing users with the option to initiate WM on a web page even before its owner has had the chance to establish their end of the system. We introduce a scheme where any user wishing to employ WM on a site can begin streaming micropayments to a common smart contract address where the money will be temporarily held in escrow. Owners wanting to retrieve this revenue must adopt the WM standard for future use; thus, our approach ultimately aims to encourage the propagation of WM as a viable alternative to ads or subscriptions, especially for small websites.

388) Iz Chen, Arun S. Kannan (MIT), and Krishna Pothapragada, Classification of Non-Degenerate Symmetric Bilinear and Quadratic Forms in the Verlinde Category $Ver_4^+$ (16 Jan 2024, arXiv.org 10 Jun 2024), published in Journal of Algebra 686 (2026): 220-262

Although Deligne's theorem classifies all symmetric tensor categories (STCs) with moderate growth over algebraically closed fields of characteristic zero, the classification does not extend to positive characteristic. At the forefront of the study of STCs is the search for an analog to Deligne's theorem in positive characteristic, and it has become increasingly apparent that the Verlinde categories are to play a significant role. Moreover, these categories are largely unstudied, but have already shown very interesting phenomena as both a generalization of and a departure from superalgebra and supergeometry. In this paper, we study $Ver_4^+$, the simplest non-trivial Verlinde category in characteristic 2. In particular, we classify all isomorphism classes of non-degenerate symmetric bilinear forms and non-degenerate quadratic forms and study the associated Witt semi-ring that arises from the addition and multiplication operations on bilinear forms.

387) Boyan Litchev, Parallelizable and Updatable Private Information Retrieval (15 Jan 2024)

In traditional fully homomorphic encryption (FHE), number-theoretic transforms (NTTs) are utilized to speed up the process of multiplication. After multiplication, the ciphertext noise increases multiplicatively, meaning that few multiplications can be applied successively. To reduce this noise, certain schemes apply modulus and key-switching after multiplication. However, these operations cannot be applied to the NTT forms of ciphertexts, so ciphertexts have to be converted out of NTT form, using a significant amount of processing time and preventing parallelization. In the setting of private information retrieval (PIR), small ciphertext values, low multiplicative depth, and the usage of fresh ciphertexts in multiplications mitigate noise even without key and modulus-switching. We explore the efficiency of removing key and modulus-switching from the computation process for PIR, eliminating the need for intermediate number-theoretic transforms. This also aids in updating the result of a query when the database is modified.

386) Anna Du, Utilizing Machine Learning to Identify Time Asymmetry of DNA Loop Extrusion (15 Jan 2024)

DNA loop extrusion, mediated by cohesin protein complexes, plays a central role in genome organization. However, direct observation of loop extrusion in vivo remains challenging. This study investigates a novel methodology using time reversal asymmetry and machine learning to detect loop extrusion in microscopy data. I aim to do this by analyzing DNA motion in microscopy data, hypothesizing that movies of DNA under loop extrusion appear differently when played forward versus backward. Simulations with and without loop extrusion generate a synthetic dataset to test this hypothesis and determine the feasibility of detection. A Convolutional Neural Network (CNN) is employed to process these DNA motion movies, trained through supervised learning to distinguish between normal and reversed trajectories. The CNN’s performance, measured by its accuracy in identifying reversed motion, serves as an indicator of loop extrusion presence in the DNA. The test CNN used here achieved an accuracy consistent with random guessing on simulated data with loop extrusion, suggesting great difficulty in the prediction task. I propose further optimizations such as increasing the frame rate, change in network architecture, and extrusion parameters which may make the task easier. With additional optimization, this approach may enable time reversal and machine learning to analyze the presence of loop extrusion.

385) Kent B. Vashaw (MIT) and Justin Zhang, Non-negligible summands in tensor powers of some modular representations of finite p-groups (15 Jan 2024; arXiv.org, 21 Aug 2025)

Let $p>0$ be a prime, $G$ be a finite $p$-group and $\Bbbk$ be an algebraically closed field of characteristic $p$. Dave Benson has conjectured that if $p=2$ and $V$ is an odd-dimensional indecomposable representation of $G$ then all summands of the tensor product $V \otimes V^*$ except for $\Bbbk$ have even dimension. It is known that the analogous result for general $p$ is false. In this paper, we investigate the class of graded representations $V$ which have dimension coprime to $p$ and for which $V \otimes V^*$ has a non-trivial summand of dimension coprime to $p$, for a graded group scheme closely related to $\mathbb{Z}/p^r \mathbb{Z} \times \mathbb{Z}/p^s \mathbb{Z}$, where $r$ and $s$ are nonnegative integers and $p>2$. We produce an infinite family of such representations in characteristic 3 and show in particular that the tensor subcategory generated by any of these representations in the semisimplification contains the modulo $3$ reduction of the category of representations of the symmetric group $S_3$. Our results are compatible with a general version of Benson's conjecture due to Etingof.

384) Razzi Masroor, Hyperoctahedral Schur Category and Hyperoctahedral Web Category (15 Jan 2024; arXiv.org, 7 Feb 2024)

We extend the Schur algebra and the polynomial web category of the symmetric group to the hyperoctahedral group. In particular, we define the hyperoctahedral web category diagrammatically by generators and relations, and prove that it is equivalent to the hyperoctahedral Schur category.

383) Evan Ning (PRIMES), Nikita Lazarev (MIT), and Varun Gohil (MIT), Reinforcement Learning Based Serverless Container Autoscaler (15 Jan 2024)

Cloud computing, characterized by vast data centers with millions of high-performance computers, has revolutionized the way developers run code, offering scalability without the constraints of hardware limitations. Serverless Function as a Service (FaaS) within cloud computing has emerged as a popular paradigm, freeing users from resource management responsibilities and adopting a pay-per-function-call model. While this approach is resource-efficient and cost-effective for users, it introduces challenges for serverless providers in maintaining Quality of Service (QoS). Effective resource allocation in serverless environments is critical, yet challenging. Under-provisioning can lead to function execution failures, necessitating resource redeployment and compromising QoS. Conversely, over-provisioning results in inefficiency as functions operate with more resources than required. The dynamic nature of serverless environments, characterized by diverse functions with varying workloads and short task durations, adds complexity to resource allocation. Current serverless providers often employ Finite-State-Machine (FSM)-based resource managers, necessitating manual tuning of parameters like autoscalers, load balancers, and CPU frequency governors. To address these challenges, machine learning methods, particularly reinforcement learning (RL), have been explored. RL’s adaptability to dynamic serverless environments, where functions exhibit diverse characteristics, makes it a compelling choice. In this paper, we present an RL-based approach to resource management, leveraging its ability to simultaneously optimize multiple parameters without manual intervention. Our implementation utilizes RL algorithms, including Deep Q Learning, to provide scaling recommendations for cloud providers, demonstrating successful convergence in both horizontal and vertical scaling scenarios.
To evaluate our approach, we constructed and replicated a serverless environment using vHive, vSwarm, and Kubernetes. The results indicate not only successful convergence in scaling but also rapid adaptability—a crucial attribute in the context of dynamic serverless environments. This research contributes valuable insights into the application of RL in serverless resource management, paving the way for future advancements in the field.

382) Alan Song (PRIMES), Nikita Lazarev (MIT), Varun Gohil (MIT), and Yueying Li (MIT), SCARLET: Serverless Container Autoscaling with Reinforcement Learning Environments (15 Jan 2024)

Serverless computing is a paradigm of cloud computing that allows users to avoid challenging server management and overprovisioning of resources. In the serverless model, users submit functions to cloud providers (e.g. Google or Amazon), who deploy and execute instances of these workloads in short-lived containers before returning the output to the user. Cloud providers are thus responsible for managing computing resources such that (1) user-provider agreements on quality of service objectives are met, and (2) resources (i.e. containers) are neither over- nor underprovisioned. Current serverless systems in production address resource management with naive autoscalers that provide heuristic solutions at best. Recent research has shown that using reinforcement learning (RL) for serverless resource management is promising; however, the implementation of RL-based autoscalers in production-grade environments like Kubernetes and the evaluation of these autoscalers using realistic serverless benchmarks have been limited. We present SCARLET, a framework for RL-based autoscaling in Kubernetes clusters. In our design, users only need to provide standard Kubernetes YAML manifests and service-level agreement (SLA) configurations for each function. SCARLET also allows developers to experiment with any RL agent implemented with adherence to the standard OpenAI Gym API. Finally, we use SCARLET to implement a Deep Q-Learning model. Our evaluation demonstrates that, through implementation via SCARLET, the model satisfies quality-of-service constraints for multiple functions running concurrently.

381) Victor Gonzalez, Eddy Li, Henrick Rabinovitz, Pedro Rodriguez, and Marcos Tirador (CrowdMath-2023), On the Atomicity of Power Monoids of Puiseux Monoids (15 Jan 2024; arXiv.org, 23 Jan 2024), published in International Journal of Algebra and Computation 35:2 (2025): 167-181.

A submonoid of the additive group $\mathbb{Q}$ is called a Puiseux monoid if it consists of nonnegative rationals. Given a monoid $M$, the set consisting of all nonempty finite subsets of $M$ is also a monoid under the Minkowski sum, and it is called the (finitary) power monoid of $M$. In this paper we study atomicity and factorization properties in power monoids of Puiseux monoids. We focus especially on the ascent of the property of being atomic and both the bounded and the finite factorization properties (the ascending chain condition on principal ideals and the length-finite factorization properties are also considered here). We prove that both the bounded and the finite factorization properties ascend from any Puiseux monoid to its power monoid. On the other hand, we construct an atomic Puiseux monoid whose power monoid is not atomic. We also prove that the existence of maximal common divisors for nonempty finite subsets is a sufficient condition for the property of being atomic to ascend from a Puiseux monoid to its power monoid.

380) Ethan Liu, On the Structure and Generators of the nth-order Chromatic Algebra (arXiv.org, 11 Jan 2024)

This work investigates the intrinsic properties of the chromatic algebra, introduced by Fendley and Krushkal as a framework to study the chromatic polynomial. We prove that the dimension of the $n$th-order chromatic algebra is the $2n$th Riordan number, which exhibits exponential growth. We find a generating set of size $\binom{n}{2}$, and we provide a procedure to construct the basis from the generating set. We additionally provide proofs for fundamental facts about this algebra that appear to be missing from the literature. These include determining a representation of the chromatic algebra as noncrossing planar partitions and expanding the chromatic relations to include an edge case.

379) Roger Fan, Nitya Mani (MIT), Multidisperse Random Sequential Adsorption and Generalizations (7 Jan 2024)

In this paper, we present a unified study of the limiting density in one-dimensional random sequential adsorption (RSA) processes where segment lengths are drawn from a given distribution. In addition to generic bounds, we are also able to characterize specific cases, including multidisperse RSA, in which we draw from a finite set of lengths, and power-law RSA, in which we draw lengths from a power-law distribution.
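To make the process concrete, here is a minimal simulation sketch of one-dimensional RSA with a finite set of segment lengths. It is an illustration only, not the paper's method: the function names `simulate_rsa` and `density`, the uniform choice among lengths, and the fixed attempt budget (rather than running to true jamming) are all assumptions made for this example.

```python
import bisect
import random


def simulate_rsa(line_length, lengths, attempts, seed=0):
    """Attempt-based 1D RSA; returns accepted (start, end) segments sorted by start.

    Assumption: each attempt draws a length uniformly from `lengths` and a
    position uniformly on the line, and is rejected if it overlaps anything.
    """
    rng = random.Random(seed)
    starts = []  # sorted start points of accepted segments
    ends = []    # ends[i] is the end point of the segment starting at starts[i]
    for _ in range(attempts):
        length = rng.choice(lengths)
        if length > line_length:
            continue
        start = rng.uniform(0.0, line_length - length)
        end = start + length
        i = bisect.bisect_left(starts, start)
        # Accepted segments are disjoint and sorted, so only the two
        # neighbors of the insertion point can overlap the new segment.
        if i > 0 and ends[i - 1] > start:
            continue
        if i < len(starts) and starts[i] < end:
            continue
        starts.insert(i, start)
        ends.insert(i, end)
    return list(zip(starts, ends))


def density(segments, line_length):
    """Fraction of the line covered by the accepted segments."""
    return sum(end - start for start, end in segments) / line_length
```

Calling `density(simulate_rsa(100.0, [1.0, 2.0], 20000), 100.0)` gives an empirical coverage for one run. Estimating a limiting density would need longer lines, more attempts, and averaging over seeds.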

378) Vasiliy Neckrasov (Brandeis University), Eric Zhan, On Nontrivial Winning and Losing Parameters of Schmidt Games (arXiv.org, 1 Jan 2024), published in Results in Mathematics 80, 236 (2025)

In this paper we completely describe the winning and losing conditions different from the only "trivial" conditions known before. In other words, we solve the open question of finding a complete nontrivial Schmidt diagram. In addition, we give new bounds for two families of sets: one related to frequencies of digits in base-$2$ expansions, and one connected to the set of the badly approximable numbers.

377) Adrita Samanta and Henry Han, Visualizing Distributed Traces in Aggregate (30 Dec 2023)

Distributed systems are comprised of many components that communicate together to form an application. Distributed tracing gives us visibility into these complex interactions, but it can be difficult to reason about the system’s behavior, even with traces. Systems collect large amounts of tracing data even with low sampling rates. Even when there are patterns in the system, it is often difficult to detect similarities in traces since current tools mainly allow developers to visualize individual traces. Debugging and system optimization are difficult for developers without an understanding of the whole trace dataset. In order to help present these similarities, this paper proposes a method to aggregate traces in a way that groups together and visualizes similar traces. We do so by assigning a few traces that are representative of each set. We suggest that traces can be grouped based on how many services they share, how many levels the graph has, how structurally similar they are, or how close their latencies are. We also develop an aggregate trace data structure as a way to comprehensively visualize these groups and a method for filtering out incomplete traces if a more complete version of the trace exists. The unique traces of each group are especially useful to developers for troubleshooting. Overall, our approach allows for a more efficient method of analyzing system behavior.

376) Michael Yang, Rigidity and Rank of Group-Circulant Matrices (19 Dec 2023)

Given a finite group $G$, a ring $\Lambda,$ and a function $f : G \rightarrow \Lambda$, a $G$-circulant matrix of $f$ is a $|G| \times |G|$ matrix $M$ with rows and columns indexed by the elements of $G$ for which $M_{xy} = f(xy)$ for all $x, y \in G.$ We study the fundamental properties of $G$-circulants when $\Lambda$ is an algebraically closed field with characteristic coprime to $|G|$. We begin by proving new results about the matrix rigidity of $G$-circulants for nonabelian $G$, which are the first of their kind. We show that for any sequence of finite groups $G_i$ whose abelian normal subgroups have sufficiently small index, the family of $G_i$-circulants is not Valiant-rigid. Furthermore, we show that this result applies for families of groups $\{G_i\}_i$ whose representations are bounded above in degree. Next, we exhibit a formula for the rank of any $G$-circulant in terms of the decomposition of its corresponding function $f : G \rightarrow \Lambda$ into the matrix coefficients of the irreducible representations of $G.$ While this was known to Diaconis, we present a more elementary proof that avoids the full strength of Schur Orthogonality. We then apply this formula to the case of $G$-circulants for cyclic $G.$ Through this, we generalize a theorem of Chen, providing a necessary and sufficient criterion for when zero-one circulants are always nonsingular. Additionally, we answer an open problem about singular circulant digraphs posed by Lal and Reddy and give a probabilistic estimate for the regularity of zero-one singular circulant matrices. Lastly, we investigate orthogonal representations of graphs. Given a finite, simple graph $G,$ we provide a novel lower bound for the minimal dimension in which a faithful orthogonal representation for $G$ exists. Furthermore, we use our bound to determine the aforementioned minimal dimension for an infinite family of Kneser graphs up to a constant factor.
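The definition $M_{xy} = f(xy)$ can be illustrated in the simplest case, a cyclic group written additively. The sketch below assumes $G = \mathbb{Z}/n\mathbb{Z}$ and integer values of $f$, and computes the rank over the rationals (for a rational matrix this agrees with the rank over an algebraic closure). The helper names `cyclic_circulant` and `rank` are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def cyclic_circulant(values):
    """G-circulant for G = Z/nZ: M[x][y] = f(x + y mod n), with f(g) = values[g]."""
    n = len(values)
    return [[values[(x + y) % n] for y in range(n)] for x in range(n)]


def rank(matrix):
    """Rank over the rationals via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            if factor != 0:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r
```

For instance, the zero-one function with values `[1, 1, 0, 0]` on $\mathbb{Z}/4\mathbb{Z}$ yields a singular circulant, since the alternating sum of its rows vanishes.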

375) Rohan Das, Christopher Qiu, and Shiqiao Zhang, The Distribution of the Cokernels of Random Symmetric and Alternating Matrices over the Integers Modulo a Prime Power (13 Dec 2023)

Given a prime $p$ and positive integers $n$ and $k$, consider the ring $M_n(\mathbb{Z}/p^{k}\mathbb{Z})$ of $n \times n$ matrices over $\mathbb{Z}/p^{k}\mathbb{Z}$. In 1989, Friedman and Washington computed the number of matrices in $M_n(\mathbb{Z}/p^{k}\mathbb{Z})$ with a given residue modulo $p$ and a given cokernel $G$ subject to the condition $p^{k - 1} G = 0$. Cheong, Liang, and Strand generalized this result in 2023 by removing the condition $p^{k - 1} G = 0$, completing the description of the distribution of the cokernel of a random matrix uniformly selected from $M_n(\mathbb{Z}/p^{k}\mathbb{Z})$. In 2015, following the work of Friedman and Washington, Clancy, Kaplan, Leake, Payne, and Wood determined the distribution of the cokernel of a random $n \times n$ symmetric matrix over $\mathbb{Z}_p$, and Bhargava, Kane, Lenstra, Poonen, and Rains determined the distribution of the cokernel of a random $n \times n$ alternating matrix over $\mathbb{Z}_p$. In this paper, we refine these results by determining the distribution of the cokernels of random symmetric and alternating matrices over $\mathbb{Z}_p$ with a fixed residue modulo $p$.

374) James Unwin (University of Illinois at Chicago) and Steve Zhang, On the Optimization of Cost Functions in Absolute Plate Motion Modeling (9 Dec 2023; arXiv.org, 2 Mar 2026)

We consider the implementation of optimization techniques within the study of tectonic plate motion. Specifically, we examine the optimization underlying optAPM, a leading code for modeling absolute plate motion. We highlight that modifications in the construction of the objective function, composed of individual cost functions, can improve modelling performance. In particular, we propose a simpler and more intuitive formulation of the hotspot cost function. A key part of the new hotspot analysis is the pre-interpolation of hotspot trail data, crucial geological markers for validating absolute plate motion over O(100) Myr timescales. By reducing the propagation of modeling errors, our refined model provides more precise reconstructions of historical plate movements. Our modified hotspot modelling improves the accuracy and reliability of the optAPM outputs.

373) Anton Levonian, Existence of Circle Packings on Translation Surfaces (8 Dec 2023; arXiv.org, 11 Dec 2023)

A translation surface is a surface formed by identifying edges of a collection of polygons in the complex plane that are parallel and of equal length using only translations. We determined that the same circle packing can be realized on varying translation surfaces in a certain stratum. We also determined possible complexities of contact graphs and provided a bound on this complexity in some low-genus strata. Finally, we established the possibility of certain contact graphs’ complexities in strata with genus greater than 2.

372) Srinivas Arun, Further Bounds on the Helly Numbers of Product Sets (7 Dec 2023)

The Helly number $h(S)$ of a set $S\subseteq\mathbb{R}^d$ is defined as the smallest positive integer $h$, if it exists, such that the following statement is true: for any finite family of convex sets in $\mathbb{R}^d,$ if every subfamily of $h$ sets intersects, then all sets in the family intersect. We study Helly numbers of product sets of the form $A^d$ for some one-dimensional set $A.$ Inspired by Dillon's research on the Helly numbers of product sets, Ambrus, Balko, Frankl, Jung, and Naszódi recently obtained the first bounds for Helly numbers of exponential lattices in two dimensions, which are sets of the form $S=\{\alpha^n: n\in\mathbb{N}\}^2$ for some $\alpha>1.$ We develop a different, simpler method to obtain better upper bounds for exponential lattices. In addition, we generalize the lower bounds of Ambrus et al. to higher dimensions. We additionally investigate sets $A\subseteq\mathbb{Z}$ whose consecutive elements differ by at most $2$ such that $h(A^2)=\infty.$ We slightly strengthen a theorem of Dillon that such sets exist while also providing a shorter proof. We obtain Helly number bounds for certain sets defined by arithmetic congruences. Finally, we introduce a generalization of the notion of an empty polygon, and show that in one case, it is equivalent to the original definition.

371) Michelle Wei and Guanghao Ye (MIT), Solving Second-Order Cone Programs Deterministically in Matrix Multiplication Time (3 Dec 2023)

We propose a deterministic algorithm for solving second-order cone programs of the form \[ \min_{Ax=b,x \in \mathcal{L}_1\times \dots \times \mathcal{L}_r} c^\top x, \] which optimize a linear objective function over the set of $x\in \mathbb{R}^n$ contained in the intersection of an affine set and the product of $r$ second-order cones. Our algorithm achieves a runtime of $$\widetilde {O}((n^{\omega} + n^{2+o(1)}r^{1/6} + n^{2.5-\alpha/2 + o(1)})\log(1/\epsilon)),$$ where $\omega$ and $\alpha$ are the exponents of matrix multiplication, and $\epsilon$ is the relative accuracy. For the current values of $\omega\sim 2.37$ and $\alpha\sim 0.32$, our algorithm takes $\widetilde{O}(n^{\omega} \log(1/\epsilon))$ time. This nearly matches the runtime for solving the sub-problem $Ax=b$. To the best of our knowledge, this is the first improvement on the computational complexity of solving second-order cone programs after the seminal work of Nesterov and Nemirovski on general convex programs. For $\omega=2$, our algorithm takes $\widetilde{O}(n^{2+o(1)} r^{1/6}\log(1/\epsilon))$ time. To obtain this result, we utilize several new concepts that we believe may be of independent interest: (1) We introduce a novel reduction for splitting $\ell_p$-cones. (2) We propose a deterministic data structure to efficiently maintain the central path of interior point methods for general convex programs.

370) Sophia Liao, Harold Polo (University of Florida), A Goldbach theorem for Laurent polynomials with positive integer coefficients (arXiv.org, 2 Dec 2023), published in American Mathematical Monthly (9 Jul 2024)

We establish an analogue of the Goldbach conjecture for Laurent polynomials with positive integer coefficients.

369) Matvey Borodin, The Orbits of the Action of the Cactus Group on Arc Diagrams (arXiv.org, 2 Dec 2023)

The cactus group $J_n$ is the $S_n$-equivariant fundamental group of the real locus of the Deligne-Mumford moduli space of stable rational curves with marked points. This group plays the role of the braid group for the monoidal category of Kashiwara crystals attached to a simple Lie algebra. Following Frenkel, Kirillov and Varchenko, one can identify the multiplicity set in a tensor product of $\mathfrak{sl}_2$-crystals with the set of arc diagrams on a disc, thus allowing a much simpler description of the corresponding $J_n$-action. We address the problem of classifying the orbits of this cactus group action. Namely, we describe some invariants of this action and show that in some (fairly general) classes of examples there are no other invariants. Furthermore, we describe some additional relations, including the braid relation, that this action places on the generators of $J_n$.

368) Catherine Li and Daniel Lazarev (MIT), Spatiotemporal risk prediction for infectious disease spread and mortality (28 Nov 2023; arXiv.org, 5 Dec 2023)

With the outbreak of the COVID-19 pandemic, various studies have focused on predicting the trajectory and risk factors of the virus and its variants. Building on previous work that addressed this problem using genetic and epidemiological data, we introduce a method, Geo Score, that also incorporates geographic, socioeconomic, and demographic data to estimate infection and mortality risk by region and time. We employ gradient descent to find the optimal weights of the factors’ significance in determining risk. Such spatiotemporal risk prediction is important for informed public health decision-making so that individuals are aware of the risks of travel during an epidemic or pandemic, and, perhaps more importantly, so that policymakers know how to triage limited resources during a crisis. We apply our method to New York City COVID-19 data from 2020, predicting ZIP code-level COVID-19 risk for 2021.

367) Aryan Bora, Yunseo Choi (Harvard), and Lucas Tang, On the Spum and Sum-Diameter of Paths (27 Nov 2023), published in Discrete Mathematics 348 (2025) 114257

In a sum graph, the vertices are labeled with distinct positive integers, and two vertices are adjacent if the sum of their labels is equal to the label of another vertex. The spum of a graph G is defined as the minimum difference between the largest and smallest labels of a sum graph that consists of G in union with a minimum number of isolated vertices. More recently, Li introduced the sum-diameter of a graph G, which modifies the definition of spum by removing the requirement that the number of isolated vertices must be minimal. In this paper, we settle conjectures by Singla, Tiwari, and Tripathi and a conjecture by Li by evaluating the spum and the sum-diameter of paths.
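The definition of a sum graph translates directly into code. The sketch below builds the edge set for one given labeling and reports the difference between its largest and smallest labels. It does not compute the spum or the sum-diameter, which require minimizing over all valid labelings. The function names are chosen for this example only.

```python
from itertools import combinations


def sum_graph_edges(labels):
    """Edges {a, b} of the sum graph on distinct positive integer labels.

    Two vertices are adjacent when the sum of their labels is itself a label.
    Because labels are positive, a + b is automatically a third vertex.
    """
    label_set = set(labels)
    if len(label_set) != len(labels) or any(x <= 0 for x in labels):
        raise ValueError("labels must be distinct positive integers")
    return {
        (a, b)
        for a, b in combinations(sorted(label_set), 2)
        if a + b in label_set
    }


def label_range(labels):
    """Difference between the largest and smallest labels of this labeling."""
    return max(labels) - min(labels)
```

With labels 1, 2, 3 the only edge joins 1 and 2 (since 1 + 2 = 3), so this labeling realizes a single edge together with one isolated vertex, and its label range is 2.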

366) Artem Kalmykov (MIT), Brian Li, Intertwining operators between subregular Whittaker modules for $\mathfrak{gl}_N$ and non-standard quantizations (arXiv.org, 29 Oct 2023)

In this paper, we study intertwining operators between subregular Whittaker modules of $\mathfrak{gl}_N$ generalizing, on the one hand, the classical exchange construction of dynamical quantum groups, on the other hand, earlier results for principal W-algebras. We explicitly construct them using the generators of W-algebras introduced by Brundan-Kleshchev. We interpret the fusion on intertwining operators in terms of categorical actions and compute the semi-classical limit of the corresponding monoidal isomorphisms which turn out to depend on dynamical-like parameters.

365) Felix Gotti (MIT), Henrick Rabinovitz, On the ascent of atomicity to one-dimensional monoid algebras (arXiv.org, 28 Oct 2023), forthcoming in Journal of Algebra

A commutative cancellative monoid is atomic if every non-invertible element factors into irreducibles (also called atoms), while an integral domain is atomic if its multiplicative monoid is atomic. Back in the eighties, Gilmer posed the question of whether the fact that a torsion-free monoid $M$ and an integral domain $R$ are both atomic implies that the monoid algebra $R[M]$ of $M$ over $R$ is also atomic. In general this is not true, and the first negative answer to this question was given by Roitman in 1993: he constructed an atomic integral domain whose polynomial extension is not atomic. More recently, Coykendall and the first author constructed finite-rank torsion-free atomic monoids whose algebras over certain finite fields are not atomic. Still, the ascent of atomicity from finite-rank torsion-free monoids to their monoid algebras over fields of characteristic zero is an open problem. The main purpose of this paper is to provide a negative answer to this problem. We actually construct a rank-one torsion-free atomic monoid whose monoid algebras over any field are not atomic. To do so, we introduce and study a methodological construction inside the class of rank-one torsion-free monoids that we call lifting: it consists in embedding a given monoid into another monoid that is often more tractable from the arithmetic viewpoint.

364) Scott T. Chapman (SHSU), Joshua Jang, Jason Mao, Skyler Mao, Betti Graphs and Atomization of Puiseux Monoids (9 Oct 2023; arXiv.org, 30 Nov 2023), forthcoming in the Bulletin of the Australian Mathematical Society

Let $M$ be a Puiseux monoid, that is, a monoid consisting of nonnegative rationals (under addition). A nonzero element of $M$ is called an atom if its only decomposition as a sum of two elements in $M$ is the trivial decomposition (i.e., one of the summands is $0$), while a nonzero element $b \in M$ is called atomic if it can be expressed as a sum of finitely many atoms allowing repetitions: this sum of atoms is called an (additive) factorization of $b$. The monoid $M$ is called atomic if every nonzero element of $M$ is atomic. In this paper, we study factorizations in atomic Puiseux monoids through the lens of their associated Betti graphs. The Betti graph of $b \in M$ is the graph whose vertices are the factorizations of $b$ with edges between factorizations that share at least one atom. Betti graphs have been useful in the literature to understand several factorization invariants in the more general class of atomic monoids.
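As a small illustration of the Betti graph definition, the sketch below enumerates factorizations in a finitely generated monoid of nonnegative integers. A finitely generated Puiseux monoid can be scaled to this form by clearing denominators. It assumes the caller passes the actual list of atoms (minimal generators). The function names are hypothetical, and the paper's setting is more general than this example.

```python
from itertools import combinations


def factorizations(b, atoms):
    """All tuples c of nonnegative multiplicities with sum(c[i] * atoms[i]) == b."""
    if not atoms:
        return [()] if b == 0 else []
    first, rest = atoms[0], atoms[1:]
    result = []
    for count in range(b // first + 1):
        for tail in factorizations(b - count * first, rest):
            result.append((count,) + tail)
    return result


def betti_graph(b, atoms):
    """Vertices are factorizations of b; edges join factorizations sharing an atom."""
    vertices = factorizations(b, atoms)
    edges = [
        (u, v)
        for u, v in combinations(vertices, 2)
        if any(x > 0 and y > 0 for x, y in zip(u, v))
    ]
    return vertices, edges
```

In the monoid generated by 2 and 3, the element 6 has factorizations 2 + 2 + 2 and 3 + 3, which share no atom, so its Betti graph has two vertices and no edges. The element 8 has factorizations 2 + 2 + 2 + 2 and 2 + 3 + 3, which share the atom 2, so its Betti graph is a single edge.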

363) Hannah Fox, Agastya Goel, Sophia Liao, Arithmetic of semisubtractive semidomains (5 Oct 2023; arXiv.org, 13 Nov 2023)

A subset $S$ of an integral domain is called a semidomain if the pairs $(S,+)$ and $(S, \cdot)$ are commutative and cancellative semigroups with identities. The multiplication of $S$ extends to the group of differences $\mathcal{G}(S)$, turning $\mathcal{G}(S)$ into an integral domain. In this paper, we study the arithmetic of semisubtractive semidomains (i.e., semidomains $S$ for which either $s \in S$ or $-s \in S$ for every $s \in \mathcal{G}(S)$). Specifically, we provide necessary and sufficient conditions for a semisubtractive semidomain to satisfy the ascending chain condition on principal ideals, to be a bounded factorization semidomain, and to be a finite factorization semidomain, which are subsequent relaxations of the property of having unique factorizations. In addition, we present a characterization of half-factorial semisubtractive semidomains. Throughout the article, we present examples to provide insight into the arithmetic aspects of semisubtractive semidomains.

362) Andrew Lin, Henrick Rabinovitz, Qiao Zhang, The Furstenberg property in Puiseux monoids (arXiv.org, 21 Sept 2023)

Let $M$ be a commutative monoid. The monoid $M$ is called atomic if every non-invertible element of $M$ factors into atoms (i.e., irreducible elements), while $M$ is called a Furstenberg monoid if every non-invertible element of $M$ is divisible by an atom. Additive submonoids of $\mathbb{Q}$ consisting of nonnegative rationals are called Puiseux monoids, and their atomic structure has been actively studied during the past few years. The primary purpose of this paper is to investigate the property of being Furstenberg in the context of Puiseux monoids. In this direction, we consider some properties weaker than being Furstenberg, and then we connect these properties with some atomic results which have been already established for Puiseux monoids.

361) Akshaya Chakravarthy (PRIMES), Agustina Czenky (University of Oregon), Julia Plavnik (Indiana University Bloomington), On modular categories with Frobenius-Perron dimension congruent to 2 modulo 4 (arXiv.org, 24 Aug 2023), forthcoming in Proceedings of the American Mathematical Society

We contribute to the classification of modular categories $\mathcal{C}$ with $\operatorname{FPdim}(\mathcal{C})\equiv 2 \pmod 4$. We prove that such categories have group of invertibles of even order, and that they factorize as $\mathcal C\cong \widetilde{\mathcal C} \boxtimes \operatorname{sem}$, where $\widetilde{\mathcal C}$ is an odd-dimensional modular category and $\operatorname{sem}$ is the rank 2 pointed modular category. This reduces the classification of these categories to the classification of odd-dimensional modular categories. It follows that modular categories $\mathcal C$ with $\operatorname{FPdim}(\mathcal{C})\equiv 2 \pmod 4$ of rank up to 46 are pointed. More generally, we prove that if $\mathcal C$ is a weakly integral MTC and $p$ is an odd prime dividing the order of the group of invertibles that has multiplicity one in $\operatorname{FPdim}(\mathcal C)$, then we have a factorization $\mathcal C \cong \widetilde{\mathcal C} \boxtimes \operatorname{Vec}_{\mathbb Z_p}^{\chi},$ for $\widetilde{\mathcal C}$ an MTC with dimension not divisible by $p$.

360) Evan Chang (PRIMES), Neel Kolhe (PRIMES), Youngtak Sohn (MIT), Upper bounds on the $2$-colorability threshold of random $d$-regular $k$-uniform hypergraphs for $k\geq 3$ (arXiv.org, 3 Aug 2023)

For a large class of random constraint satisfaction problems (CSP), deep but non-rigorous theory from statistical physics predicts the location of the sharp satisfiability transition. The works of Ding, Sly, Sun (2014, 2016) and Coja-Oghlan, Panagiotou (2014) established the satisfiability threshold for random regular $k$-NAE-SAT, random $k$-SAT, and random regular $k$-SAT for large enough $k\geq k_0$ where $k_0$ is a large non-explicit constant. Establishing the same for small values of $k\geq 3$ remains an important open problem in the study of random CSPs. In this work, we study two closely related models of random CSPs, namely the $2$-coloring on random $d$-regular $k$-uniform hypergraphs and the random $d$-regular $k$-NAE-SAT model. For every $k\geq 3$, we prove that there is an explicit $d_{\ast}(k)$ which gives a satisfiability upper bound for both of the models. Our upper bound $d_{\ast}(k)$ for $k\geq 3$ matches the prediction from statistical physics for the hypergraph $2$-coloring by Dall'Asta, Ramezanpour, Zecchina (2008), thus conjectured to be sharp. Moreover, $d_{\ast}(k)$ coincides with the satisfiability threshold of random regular $k$-NAE-SAT for large enough $k\geq k_0$ by Ding, Sly, Sun (2014).

359) Henry Jiang, Shihan Kanungo, Harry Kim, A weaker notion of the finite factorization property (arXiv.org, 18 Jul 2023), published in Communications of the Korean Mathematical Society 39:2 (2024): 313–329

An (additive) commutative monoid is called atomic if every given non-invertible element can be written as a sum of atoms (i.e., irreducible elements), in which case, such a sum is called a factorization of the given element. The number of atoms (counting repetitions) in the corresponding sum is called the length of the factorization. Following Geroldinger and Zhong, we say that an atomic monoid $M$ is a length-finite factorization monoid if each $b \in M$ has only finitely many factorizations of any prescribed length. An additive submonoid of $\mathbb{R}_{\ge 0}$ is called a positive monoid. Factorizations in positive monoids have been actively studied in recent years. The main purpose of this paper is to give a better understanding of the non-unique factorization phenomenon in positive monoids through the lens of the length-finite factorization property. To do so, we identify a large class of positive monoids which satisfy the length-finite factorization property. Then we compare the length-finite factorization property to the bounded and the finite factorization properties, which are two properties that have been systematically investigated for more than thirty years.

358) Alicia Li and Matan Yablon, Adversarial Attacks Against Online Learning Agents (1 Jul 2023)

Consider a typical streaming problem, where an agent dynamically interacts with its environment to learn an optimal behavior. Such methods are used in a variety of applications, including playing Atari games and robotic hand manipulation. We analyze an agent that learns the rewards of each path in its environment, which can be modeled as determining the edge weights of a graph. We study an agent that follows an ϵ-greedy sampling strategy because this model is widely used and has been successfully applied to many problems. However, in recent years, numerous attacks have been devised against graph learning algorithms, with some methods exploiting graph structure and node features. To ultimately create a robust graph streaming algorithm based on ϵ-annealing, we first construct, implement, and analyze worst-case attacks against random-sampling and ϵ-greedy victim models. Our adversarial strategy exploits path overlaps and stalls the victim to effectively increase the corruption budget.
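For readers unfamiliar with the victim model named above, here is a generic textbook sketch of ϵ-greedy sampling with running-mean reward estimates. It is not the paper's agent or its attack. Treating each path as an index into a list of estimates, and the names `epsilon_greedy` and `update_mean`, are assumptions made for illustration.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick an index: explore uniformly with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    # Exploit: choose the option with the highest current estimate.
    return max(range(len(estimates)), key=lambda i: estimates[i])


def update_mean(estimates, counts, choice, reward):
    """Incrementally update the running mean reward of the chosen option."""
    counts[choice] += 1
    estimates[choice] += (reward - estimates[choice]) / counts[choice]


if __name__ == "__main__":
    rng = random.Random(0)
    true_rewards = [0.2, 0.5, 0.8]  # hypothetical mean rewards per path
    estimates = [0.0] * len(true_rewards)
    counts = [0] * len(true_rewards)
    for _ in range(1000):
        choice = epsilon_greedy(estimates, 0.1, rng)
        reward = true_rewards[choice] + rng.gauss(0.0, 0.1)
        update_mean(estimates, counts, choice, reward)
    print(estimates, counts)
```

An ϵ-annealing variant would shrink `epsilon` over time. The schedule is left out here because the abstract does not specify one.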

357) Linus Tang, Extremal Bounds on Peripherality Measures (arXiv.org, 27 Jun 2023)

We investigate several measures of peripherality for vertices and edges in networks. We improve asymptotic bounds on the maximum value achieved by edge peripherality, edge sum peripherality, and the Trinajstić index over $n$-vertex graphs. We also prove similar results on the maxima over $n$-vertex bipartite graphs, trees, and graphs with a fixed diameter. Finally, we refute two conjectures of Furtula, the first on necessary conditions for minimizing the Trinajstić index and the second about maximizing the Trinajstić index.

356) David Dong, Generalized Eulerian Numbers (arXiv.org, 16 Jun 2023)

Let $A(n,m)$ denote the Eulerian numbers, which count the number of permutations on $[n]$ with exactly $m$ descents. It is well known that $A(n,m)$ also counts the number of permutations on $[n]$ with exactly $m$ excedances. In this report, we define numbers of the form $A(n,m,k)$, which count the number of permutations on $[n]$ with exactly $m$ descents and the last element $k$. We then show bijections between this definition and various other analogs for $r$-excedances and $r$-descents. We also prove a variation of Worpitzky's identity on $A(n,m,k)$ using a combinatorial argument mentioned in a paper by Spivey in 2021.
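The definitions above are easy to check by brute force for small $n$. The sketch below counts permutations by descents and by excedances, and refines the descent count by the last element as in $A(n,m,k)$. The function names are chosen for this example, and enumeration over all permutations is only practical for small $n$.

```python
from itertools import permutations


def descents(perm):
    """Number of positions i with perm[i] > perm[i + 1]."""
    return sum(1 for i in range(len(perm) - 1) if perm[i] > perm[i + 1])


def excedances(perm):
    """Number of positions i (1-indexed) with perm(i) > i."""
    return sum(1 for i, value in enumerate(perm, start=1) if value > i)


def eulerian(n, m, statistic=descents):
    """Count permutations of [n] on which the given statistic equals m."""
    return sum(1 for p in permutations(range(1, n + 1)) if statistic(p) == m)


def eulerian_last(n, m, k):
    """A(n, m, k): permutations of [n] with exactly m descents and last element k."""
    return sum(
        1
        for p in permutations(range(1, n + 1))
        if p[-1] == k and descents(p) == m
    )
```

For $n = 3$ and $m = 1$, the four permutations with one descent are 132, 213, 231, and 312, so the refined counts for $k = 1, 2, 3$ are 1, 2, and 1, and they sum to $A(3,1) = 4$.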

355) Joseph Vulakh, Twisted homogeneous racks over the alternating groups (arXiv.org, 30 May 2023), published in AMS Contemporary Mathematics 813 (2025): 341–351

An important step towards the classification of finite-dimensional pointed Hopf algebras is the classification of finite-dimensional Nichols algebras arising from braided vector spaces of group type. This question is fundamentally linked with the structure of algebraic objects called racks. Of particular interest to this classification is the type D condition on racks, a sufficient condition for a rack to not be the source of a finite-dimensional Nichols algebra. In this paper, we study the type D condition in simple racks arising from the alternating groups. Expanding upon previous work in this direction, we make progress towards a general classification of twisted homogeneous racks of type D by proving that several families of twisted homogeneous racks arising from alternating groups are of type D.

354) Agustina Czenky, William Gvozdjak (PRIMES), Julia Plavnik, Classification of low-rank odd-dimensional modular categories (arXiv.org, 23 May 2023), published in Journal of Algebra, and also presented at BIMSA-Tsinghua Quantum Symmetry Seminar

We prove that any odd-dimensional modular category of rank at most $23$ is pointed. We also show that an odd-dimensional modular category of rank $25$ is either pointed, perfect, or equivalent to $\operatorname{Rep}(D^\omega(\mathbb Z_7\rtimes\mathbb Z_3))$. Finally, we give partial classification results for modular categories of rank up to $73$.

2022 Research Papers

353) Alan Bu, Felix Gotti, Bangzheng Li, Alex Zhao, One-dimensional monoid algebras and ascending chains of principal ideals (arXiv.org, 1 Sep 2024)

An integral domain $R$ is called atomic if every nonzero nonunit of $R$ factors into irreducibles, while $R$ satisfies the ascending chain condition on principal ideals if every ascending chain of principal ideals of $R$ stabilizes. It is well known and not hard to verify that if an integral domain satisfies the ACCP, then it must be atomic. The converse does not hold in general, but examples are hard to come by and most of them are the result of crafty and technical constructions. Sporadic constructions of such atomic domains have appeared in the literature in the last five decades, including the first example of a finite-dimensional atomic monoid algebra not satisfying the ACCP recently constructed by the second and third authors. Here we construct the first known one-dimensional monoid algebras satisfying the almost ACCP but not the ACCP (the almost ACCP is a notion weaker than the ACCP but still stronger than atomicity). Although the two constructions we provide here are rather technical, the corresponding monoid algebras are perhaps the most elementary known examples of atomic domains not satisfying the ACCP.

352) Matvey Borodin, Ethan Liu, Justin Zhang, Results on Vanishing Polynomials and Polynomial Root Counting with Relevant Technological Applications (arXiv.org, 24 Sept 2023), published in Proceedings of the 2023 IEEE MIT Undergraduate Research Technology Conference

We study the set of algebraic objects known as vanishing polynomials (the set of polynomials that annihilate all elements of a ring) over general commutative rings with identity. These objects are of special interest due to their close connections to both ring theory and the technical applications of polynomials, along with numerous applications to other mathematical and engineering fields. We first determine the minimum degree of monic vanishing polynomials over a specific infinite family of rings of a specific form and consider a generalization of the notion of a monic vanishing polynomial over a subring. We then present a partial classification of the ideal of vanishing polynomials over general commutative rings with identity of prime and prime square orders. Finally, we prove some results on rings that have a finite number of roots and propose a technique that can be utilized to restrict the number of roots polynomials can have over certain finite commutative rings.
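A vanishing polynomial can be checked directly in the familiar special case of the ring $\mathbb{Z}/n\mathbb{Z}$. The sketch below evaluates a polynomial at every residue. The restriction to $\mathbb{Z}/n\mathbb{Z}$ and the function names are assumptions made for illustration, since the paper works over more general commutative rings.

```python
def evaluate(coeffs, x, n):
    """Evaluate a polynomial at x modulo n using Horner's rule.

    coeffs lists the coefficients from the highest degree down to the constant.
    """
    value = 0
    for c in coeffs:
        value = (value * x + c) % n
    return value


def is_vanishing(coeffs, n):
    """True if the polynomial is zero at every element of Z/nZ."""
    return all(evaluate(coeffs, x, n) == 0 for x in range(n))
```

For example, `is_vanishing([1, 0, -1, 0], 6)` checks $x^3 - x = (x-1)x(x+1)$, a product of three consecutive integers and hence divisible by 6 at every integer, while `is_vanishing([1, -1, 0], 4)` fails because $x^2 - x$ equals 2 at $x = 2$.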

351) Daniel Kriz, Eric Shen (PRIMES), and Kevin Wu (PRIMES), Congruences between logarithms of Heegner points (26 Mar 2023)

Elliptic curves are an important class of Diophantine equations. We study certain special solutions of elliptic curves called Heegner points, which are the traces of images under modular parametrizations of complex multiplication points in the complex upper half-plane. We prove, for pairs of elliptic curves with isomorphic Galois representations, a general congruence of stabilized formal logarithms. This is done by first showing that the isomorphism of Galois representations implies a congruence of stabilized modular forms and then translating these to the congruence of formal logarithms using Honda’s theorem relating formal groups of elliptic curves to L-series and the modular parametrization. We use this congruence to show that examples of elliptic curves with analytic and algebraic rank 1 propagate in quadratic twist families.

350) Brendan Halstead, Moduli spaces of morphisms between cone stacks (22 Mar 2023)

We study morphisms between $\textit{cone stacks}$, objects defined by Cavalieri, Chan, Ulirsch, and Wise as a framework for moduli problems in tropical geometry. We construct a cone stack $[\Sigma, \Gamma]$ parameterizing morphisms between fixed cone stacks $\Sigma$ and $\Gamma.$ We also briefly discuss applications to logarithmic geometry.

349) Annie Wang, On the Hilbert Series of the Rational Cherednik Algebra in Type A_n in Characteristic p (28 Feb 2023)

We study the polynomial representation of the rational Cherednik algebra of type $A$ in characteristic $p=3$ for $p$ dividing $n-2$, parameter $t=0$, and generic parameter $c.$ We describe all the polynomials in the maximal proper graded submodule $\ker{\mathcal{B}}$, which is the kernel of the contravariant form $\mathcal{B},$ and we use this to find the Hilbert series of the irreducible quotient for the polynomial representation. We proceed degree by degree to explicitly determine the Hilbert series and work towards proving Etingof and Rains's conjecture in the case that $p=3$, $t=0$, and $n=kp+2.$

348) Tanya Khovanova (MIT), Rich Wang (PRIMES), Ending States of a Special Variant of the Chip-Firing Algorithm (arXiv.org, 21 Feb 2023), published in Enumerative Combinatorics and Applications 4:3 (2024), article #S2R20

We investigate a special variant of chip-firing, in which we consider an infinite set of rooms on a number line, some of which are occupied by violinists. In a move, we take two violinists in adjacent rooms, and send one of them to the closest unoccupied room to the left and the other to the closest unoccupied room to the right. We classify the different possible final states from repeatedly performing this operation. We introduce numbers $R(N,\ell,x)$ that count labeled recursive rooted trees with $N$ vertices, $\ell$ leaves, and the smallest rooted path ending in $x$. We describe the properties of these numbers and connect them to permutations. We conjecture that these numbers describe the probabilities of ending in different final states when the moves are chosen uniformly.
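The move described above can be simulated directly. The following is a minimal sketch, not the authors' code: it assumes that the violinists start in rooms $0, \dots, n-1$, that the "closest unoccupied room" is searched outward from the chosen pair, and that each move picks an adjacent pair uniformly at random. The names `fire_once` and `final_state` and the move cap are illustrative.

```python
import random


def fire_once(occupied, rng):
    """Apply one uniformly chosen move in place; return False if none exists."""
    # Left endpoints i of adjacent occupied pairs (i, i + 1).
    pairs = sorted(i for i in occupied if i + 1 in occupied)
    if not pairs:
        return False
    i = rng.choice(pairs)
    left = i - 1
    while left in occupied:   # closest unoccupied room to the left of room i
        left -= 1
    right = i + 2
    while right in occupied:  # closest unoccupied room to the right of room i + 1
        right += 1
    occupied -= {i, i + 1}
    occupied |= {left, right}
    return True


def final_state(n, seed=0, max_moves=100_000):
    """Start with rooms 0..n-1 occupied (an assumption) and move until stuck."""
    rng = random.Random(seed)
    occupied = set(range(n))
    for _ in range(max_moves):
        if not fire_once(occupied, rng):
            return sorted(occupied)
    # The cap is a safety net for this sketch, not a claim about the process.
    raise RuntimeError("move cap reached before a final state")


if __name__ == "__main__":
    print(final_state(3))
```

Running `final_state` over many seeds and tallying the results gives an empirical distribution over final states, which is the kind of quantity the conjecture concerns.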

347) Khalid Ajran, Juliet Bringas, Bangzheng Li, Easton Singer, Marcos Tirador (CrowdMath-2022), Factorization in Additive Monoids of Evaluation Polynomial Semirings (arXiv.org, 5 Feb 2023), published in Communications in Algebra 51:10 (2023): 4347-4362

For a positive real $α$, we can consider the additive submonoid $M$ of the real line that is generated by the nonnegative powers of $α$. When $α$ is transcendental, $M$ is a unique factorization monoid. However, when $α$ is algebraic, $M$ may not be atomic, and even when $M$ is atomic, it may contain elements having more than one factorization (i.e., decomposition as a sum of irreducibles). The main purpose of this paper is to study the phenomenon of multiple factorizations inside $M$. When $α$ is algebraic but not rational, the arithmetic of factorizations in $M$ is highly interesting and complex. In order to arrive at that conclusion, we investigate various factorization invariants of $M$, including the sets of lengths, sets of Betti elements, and catenary degrees. Our investigation gives continuity to recent studies carried out by Chapman, et al. in 2020 and by Correa-Morris and Gotti in 2022.

346) Benjamin Fan (PRIMES), Edward Qiao (PRIMES), Anran Jiao, Zhouzhou Gu, Wenhao Li, and Lu Lu, Deep Learning for Solving and Estimating Dynamic Macro-Finance Models (4 Feb 2023)

Deep learning has been shown to be an effective method for solving partial differential equations (PDEs) by embedding the PDE residual into the neural network loss function. In this paper, we design a methodology that utilizes deep learning to simultaneously solve and estimate canonical continuous-time general equilibrium models in financial economics, including (1) industrial dynamics of firms and (2) macroeconomic models with financial frictions. Through these applications, we illustrate the advantages of our method.

345) Steven Tan, Models for Somatic CAG Repeat Expansion in the Onset and Progression of Huntington's Disease (30 Jan 2023)

Huntington's Disease (HD) is an inherited neurodegenerative disease caused by alleles with 36 or more repeats of the trinucleotide sequence CAG in the huntingtin (HTT) gene. A person with HD inherits an allele with a certain CAG length (> 35) at birth, but somatic expansion within the brain is known to occur throughout their lifetime, resulting in a situation in which individual cells have longer and highly variable numbers of CAG repeats. Somatic expansion is increasingly thought to be a driver of disease onset, as age-at-onset associates with modifier alleles in DNA-repair genes that regulate somatic expansion. Thus, a better understanding of the mechanisms behind CAG repeat expansion could be crucial in revealing novel therapeutic targets. In this study, we adapted a stochastic birth-death model previously used for a different repeat-expansion disease (Myotonic Dystrophy Type 1, or DM1) to model CAG repeat expansion in HD. We made use of a new kind of biological data, in which CAG length has been measured precisely in many individual neurons of the most vulnerable type from post mortem brain samples. We found that single-process models consisting of only one length threshold and rate — models that succeeded in modeling DM1 — were unable to explain all features of repeat expansion data observed in HD patients. Effectively fitting the data required models consisting of two separate processes, suggesting that there may be two distinct biological mechanisms underlying CAG repeat expansion in HD. These processes appear to have differing rates and CAG length thresholds: one at roughly 36 CAGs — a threshold for instability — and another at 70 CAGs, which we hypothesize is a threshold for accelerated expansion. This model deepens our understanding of disease progression and can inform the design of clinical trials for new therapies that target the somatic expansion process.

344) Garett Brown, Linda He (PRIMES), and James Unwin, The Potential Impact of Primordial Black Holes on Exoplanet Systems (28 Jan 2023), forthcoming in Monthly Notices of the Royal Astronomical Society

The orbits of planetary systems can be deformed from their initial configurations due to close encounters with large astrophysical bodies. Candidates for close encounters include astrophysical black holes, brown dwarf stars, rogue planets, as well as hypothetical populations of primordial black holes (PBH) or dark matter microhalos. We show that potentially tens of thousands of exoplanetary systems in the Milky Way may have had close encounters with PBH significant enough to impact their planetary orbits. Furthermore, we propose that precision measurements of exoplanet orbital parameters could be used to infer or constrain the abundances of these astrophysical bodies. Specifically, focusing on PBH we numerically estimate the number of times that such objects pass through the local neighborhood of a given planetary system, and then analyze the statistical impact on the orbital parameters of such systems.

343) Nilay Mishra, On the Uniqueness of Certain Types of Circle Packings on Translation Surfaces (26 Jan 2023; arXiv.org, 22 Jan 2025)

Consider a collection of finitely many polygons in $\mathbb C$, such that for each side of each polygon, there exists another side of some polygon in the collection (possibly the same) that is parallel and of equal length. A translation surface is the surface formed by identifying these opposite sides with one another. The $\mathcal{H}(1, 1)$ stratum consists of genus two translation surfaces with two singularities of order one. A circle packing corresponding to a graph $G$ is a configuration of disjoint disks such that each vertex of $G$ corresponds to a circle, two disks are externally tangent if and only if their vertices are connected by an edge in $G$, and $G$ is a triangulation of the surface. It is proven that for certain circle packings on $\mathcal{H}(1, 1)$ translation surfaces, there are only a finite number of ways the packing can vary without changing the contacts graph, if two disks along the slit are fixed in place. These variations can be explicitly characterized using a new concept known as splitting bigons. Finally, the uniqueness theorem is generalized to a specific type of translation surfaces with arbitrary genus $g \geq 2$.

342) Yibo Gao (MIT) and Anthony Wang (PRIMES), Consecutive Patterns in Coxeter Groups (25 Jan 2023), published in Journal of Algebra, vol. 634 (15 November 2023): 650-666

For an arbitrary Coxeter group element $\sigma$ and a connected subset $J$ of the Coxeter diagram, the parabolic decomposition $\sigma=\sigma^J\sigma_J$ defines $\sigma_J$ as a consecutive pattern of $\sigma$, generalizing the notion of consecutive patterns in permutations. We then define the cc-Wilf-equivalence classes as an extension of the c-Wilf-equivalence classes for permutations, and identify non-trivial families of cc-Wilf-equivalent classes. Furthermore, we study the structure of the consecutive pattern poset in Coxeter groups and prove that its Möbius function is bounded by $2$ when the arguments belong to finite Coxeter groups, but can be arbitrarily large otherwise.

341) Eric Chen and Alex Zitzewitz, Unitary Conditions for Lamé and Heun Differential Operators (25 Jan 2023)

In this paper, we explore the connections between the so-called "accessory parameter" of the Heun Equation and the properties of its monodromy groups. In particular, we investigate which numerical values of the accessory parameter yield unitary monodromy groups (i.e., those that preserve a Hermitian inner product). To this end, we employ both analytical and computational methods, extending previous work on the Lamé Equation. In particular, for a large class of Heun Equations (generalizing the Lamé Equation), we prove a connection between unitarity and the traces of certain monodromy matrices. We exploit this theorem to create an algorithm that finds accessory parameters that yield unitary monodromy groups. Using this algorithm, we calculate and report the values of the accessory parameter that give rise to unitary monodromy groups. We also draw convergence maps, demonstrating the convergence and overall robustness of our algorithm. Finally, we derive an asymptotic formula for the desired accessory parameters which agrees with our numerical results.

340) Advay Goel (PRIMES) and Zoe Wellner (CMU), The Geometry and Limits of Young Partition Flow Polytopes (23 Jan 2023)

In 2017, Mészáros, Simpson, and Wellner demonstrated that certain flow polytopes resulting from Young tableaux are easily decomposed into simplices, and others have a natural relation to the well-known Tesler and CRY polytopes. Within a family of polytopes determined by a single tableau shape, they introduced the limiting polytope. The limiting polytope is a useful notion since it is easy to decompose into a product of simplices. In this work, we use geometric decomposition to further examine the limiting process within each family of polytopes. Our main results analyze the family of hooks, and we demonstrate an algorithm to get geometric decompositions.

339) Yihao (Michael) Huang (PRIMES), Shangdi Yu (MIT), and Julian Shun (MIT), Faster Parallel Exact Density Peaks Clustering (16 Jan 2023)

Clustering multidimensional points is a fundamental data mining task, with applications in many fields, such as astronomy, neuroscience, bioinformatics, and computer vision. The goal of clustering algorithms is to group similar objects together. Density-based clustering is a clustering approach that defines clusters as dense regions of points. It has the advantage of being able to detect clusters of arbitrary shapes, rendering it useful in many applications.
In this paper, we propose fast parallel algorithms for Density Peaks Clustering (DPC), a popular variant of density-based clustering. Existing exact DPC algorithms suffer from low parallelism both in theory and in practice, which limits their application to large-scale data sets. Our most performant algorithm, which is based on priority search kd-trees, achieves O(log n log log n) span (parallel time complexity). Our algorithm is also work-efficient, achieving a work complexity matching the best existing sequential exact DPC algorithm. In addition, we present another DPC algorithm based on a Fenwick tree that makes fewer assumptions for its average-case complexity to hold.
We provide optimized implementations of our algorithms and evaluate their performance via extensive experiments. On a 30-core machine with two-way hyperthreading, we find that our best algorithm achieves a 10.8–13169x speedup over the previous best parallel exact DPC algorithm. Compared to the state-of-the-art parallel approximate DPC algorithm, our best algorithm achieves a geometric mean speedup of 55.8x while being exact.
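For orientation, exact DPC in its standard formulation can be written as a sequential brute-force procedure in quadratic time. The sketch below is that naive baseline, not the parallel kd-tree or Fenwick-tree algorithms of the paper. The parameter names `d_cut` and `delta_min` and the tie-breaking rule are assumptions made for illustration.

```python
import math


def density_peaks(points, d_cut, delta_min):
    """Naive exact DPC: return one cluster label per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Density of a point: number of other points within distance d_cut.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] <= d_cut)
           for i in range(n)]

    # Strict "denser than" order; ties are broken by index (an assumption).
    def key(i):
        return (rho[i], -i)

    parent = [None] * n        # nearest point that is strictly denser
    delta = [math.inf] * n     # distance to that point
    for i in range(n):
        denser = [j for j in range(n) if key(j) > key(i)]
        if denser:
            parent[i] = min(denser, key=lambda j: dist[i][j])
            delta[i] = dist[i][parent[i]]

    # Visit points from densest to sparsest so parents are labeled first.
    label = [None] * n
    next_label = 0
    for i in sorted(range(n), key=key, reverse=True):
        if parent[i] is None or delta[i] >= delta_min:
            label[i] = next_label          # i is a density peak (cluster center)
            next_label += 1
        else:
            label[i] = label[parent[i]]    # inherit the label of the denser neighbor
    return label


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(density_peaks(pts, d_cut=1.5, delta_min=5.0))
```

The two nested passes over all pairs of points are the part that faster exact algorithms replace with spatial data structures.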

338) Andrey Khesin (MIT), Andrew Tung (PRIMES), and Karthik Vedula (PRIMES), New Properties of Intrinsic Information and Their Relation to Bound Secrecy (16 Jan 2023)

Two parties, Alice and Bob, seek to generate a mutually agreed upon string of bits, unknown to an eavesdropper Eve, by sampling repeatedly from a joint probability distribution. The secret-key rate has been defined as the asymptotic rate at which Alice and Bob can extract secret bits after sampling many times from the probability distribution. The secret-key rate has been bounded above by two information-theoretic quantities, first by the intrinsic information, and more strongly by the reduced intrinsic information. However, in this paper we prove that the reduced intrinsic information is 0 if and only if the intrinsic information is 0. This result implies that at least one of the following two conjectures is false: either the conjecture of the existence of bound secrecy, distributions where the intrinsic information is positive but the secret-key rate is 0, or the conjecture that the reduced intrinsic information equals the secret-key rate. Furthermore, we introduce a number of promising approaches for showing that bound secrecy does indeed exist using the idea of binarization of random variables. We improve on previous work by giving an explicit construction for a particular candidate for bound secrecy of an information-erasing binarization.

337) Max Xu, Gonality Sequences of Multipartite Graphs (15 Jan 2023)

In this paper, we deal with a particular sequence associated with a graph, the gonality sequence. This gonality sequence is part of the larger topic of the chip-firing game on a graph G. The gonality sequence of a graph measures how much the degree of a divisor on that graph needs to change in order to increase its rank. The portion of the gonality sequence where the input is greater than the genus is known. However, there has been little work done to find the first terms of the gonality sequence. In this paper, we partially compute the first terms of the gonality sequence for some complete multipartite graphs. In particular, the ones with all but one partite class having one vertex are analyzed, and here we present some results and further conjectures.

336) Jiayi Dong and Anshul Rastogi, Locating regions of uncertainty in distributed systems using aggregate trace data (15 Jan 2023)

Distributed systems are central to countless applications in the modern world. These applications can have tens to thousands of interacting components, making it difficult to identify the source of performance problems. Distributed tracing is widely used to elucidate the interactions within a distributed system; however, instrumenting system codebases can be tedious, and collecting tracing data generates overhead. Optimally, minimal instrumentation is added to regions of the codebase that explain the majority of the system's performance variation. We present a prototype application that highlights regions of performance uncertainty in a system, guiding developers to where instrumentation would most increase predictability. Using aggregate trace data, spans are ranked by uncertainty metrics, which are primarily the standard deviation and coefficient of variation of the exclusive latencies of an operation across multiple traces. We developed our prototype in Python and applied it to trace data extracted from HotROD. We evaluated our tool on four test scenarios where we injected latency into services in HotROD. Our tool highlights the service(s) with injected latency in all four test cases.
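The two ranking metrics named above are simple to compute once exclusive latencies have been grouped by operation. The sketch below illustrates only the metrics and is not the authors' prototype. The input layout (a dict from operation name to a list of latencies), the use of the population standard deviation, and the sample numbers are assumptions.

```python
from statistics import mean, pstdev


def rank_by_uncertainty(exclusive_latencies):
    """Return (operation, std_dev, coeff_of_variation) rows, most variable first.

    exclusive_latencies maps an operation name to the exclusive latencies
    observed for that operation across multiple traces.
    """
    rows = []
    for operation, samples in exclusive_latencies.items():
        mu = mean(samples)
        sd = pstdev(samples)                 # population standard deviation
        cv = sd / mu if mu > 0 else 0.0      # coefficient of variation
        rows.append((operation, sd, cv))
    # Rank by standard deviation; swap in r[2] to rank by coefficient of variation.
    return sorted(rows, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    # Made-up latencies in milliseconds, for illustration only.
    data = {
        "route": [50, 52, 48, 51],
        "driver": [100, 300, 120, 500],
    }
    for operation, sd, cv in rank_by_uncertainty(data):
        print(f"{operation}: sd={sd:.1f} cv={cv:.2f}")
```

The coefficient of variation normalizes by the mean, so it can surface short operations whose latency is erratic even when their absolute spread is small.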

335) Alicia Li and Matan Yablon, Adversarial Attacks Against Online Reinforcement Learning Agents in MDPs (15 Jan 2023)

Online Reinforcement Learning (RL) is a fast-growing branch of machine learning with increasingly important applications. Moreover, making RL algorithms robust against perturbations is essential to their utility in the real world. Adversarial RL, in which an attacker attempts to degrade an RL agent's performance by perturbing the environment, can be used to understand how to robustify RL systems. In this work, we connect an adversarial attack model to streaming algorithms: the victim samples paths based on its interactions with the environment, while the adversary corrupts this stream of data. We construct an attack algorithm in Markov Decision Processes (MDPs) for a random-sampling victim and prove its optimality, in addition to investigating an adversarial strategy against an epsilon-greedy victim with a warm start period. In the epsilon-greedy setting, we bound adversarial corruption and analyze how to exploit this highly adaptive model to improve upon warm start budget. Experimentally, we show that our algorithm outperforms baseline attacks, and we generate random MDPs to characterize how their general-case structure affects the adversary's ability to maintain its warm start corruption.

334) Jeffrey Chen (PRIMES) and Jesse Selover (UMass Amherst), Positivity properties of the q-hit numbers in the finite general linear group (15 Jan 2023; arXiv.org, 12 Apr 2025)

We consider the problem of counting matrices over a finite field with fixed rank and support contained in a fixed set. The count of such matrices gives a q-analogue of the classical rook number, but it is known not to be polynomial in q in general. We use inclusion-exclusion on the support of the matrices and the orbit counting method of Lewis et al. to show that the residues of these functions in low degrees are polynomial. We define a generalization of the rook and hit numbers over certain classes of graphs. This provides us with a formula for residues of the q-rook and q-hit numbers in low degrees. We analyze the residues of the q-hit number and show that the coefficient of $q-1$ in the q-hit number is always non-negative.
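For very small cases, the count in question can be checked by exhaustive enumeration. The sketch below assumes a prime field $\mathbb{F}_p$ (so that arithmetic is just integers mod $p$) and represents the support as a set of (row, column) cells. It is a brute-force check, not the inclusion-exclusion or orbit-counting method of the paper, and the function names are illustrative.

```python
from itertools import product


def rank_mod_p(matrix, p):
    """Rank of an integer matrix over F_p (p prime) by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def count_matrices(m, n, support, rank, p):
    """Count m x n matrices over F_p of the given rank, zero outside `support`."""
    cells = sorted(support)
    total = 0
    for values in product(range(p), repeat=len(cells)):
        mat = [[0] * n for _ in range(m)]
        for (i, j), v in zip(cells, values):
            mat[i][j] = v
        if rank_mod_p(mat, p) == rank:
            total += 1
    return total


if __name__ == "__main__":
    full = {(i, j) for i in range(2) for j in range(2)}
    print(count_matrices(2, 2, full, rank=2, p=3))
```

With full support and full rank the count is the order of the general linear group, which gives a quick sanity check: for example $|GL_2(\mathbb{F}_3)| = (9-1)(9-3) = 48$.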

333) Sacha Servan-Schreiber (MIT), Simon Beyzerov (PRIMES), Eli Yablon (PRIMES), and Hyojae Park (PRIMES), Private Access Control for Function Secret Sharing (15 Jan 2023)

Function Secret Sharing (FSS; Eurocrypt 2015) allows a dealer to share a function f with two or more evaluators. Given secret shares of a function f, the evaluators can locally compute secret shares of f(x) on an input x, without learning information about f.
In this paper, we initiate the study of access control for FSS. Given the shares of f, the evaluators can ensure that the dealer is authorized to share the provided function. For a function family $F$ and an access control list defined over the family, the evaluators receiving the shares of $f ∈ F$ can efficiently check that the dealer knows the access key for f.
This model enables new applications of FSS, such as: (1) anonymous authentication in a multiparty setting, (2) access control in private databases, and (3) authentication and spam prevention in anonymous communication systems.
Our definitions and constructions abstract and improve the concrete efficiency of several recent systems that implement ad-hoc mechanisms for access control over FSS. The main building block behind our efficiency improvement is a discrete-logarithm zero-knowledge proof-of-knowledge over secret-shared elements, which may be of independent interest.
We evaluate our constructions and show a 50–70× reduction in computational overhead compared to existing access control techniques used in anonymous communication. In other applications, such as private databases, the processing cost of introducing access control is only 1.5–3× when amortized over databases with 500,000 or more items.
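To make the interface of function secret sharing concrete, the sketch below shares a function by additively splitting its whole truth table. This is a deliberately naive illustration: its keys are as large as the domain, whereas practical FSS schemes produce compact keys, and it contains none of the access control or zero-knowledge machinery described above. The modulus and all names are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # illustrative modulus for additive sharing


def share_function(truth_table, parties=2):
    """Dealer: split f (given as a list of outputs) into additive shares."""
    if parties < 2:
        raise ValueError("need at least two evaluators")
    shares = [[secrets.randbelow(MODULUS) for _ in truth_table]
              for _ in range(parties - 1)]
    # The last share is chosen so that all shares sum to f(x) at every x.
    last = [(y - sum(column)) % MODULUS
            for y, column in zip(truth_table, zip(*shares))]
    return shares + [last]


def evaluate_share(share, x):
    """Evaluator: locally compute this party's share of f(x)."""
    return share[x]


def reconstruct(output_shares):
    """Combine the evaluators' output shares into f(x)."""
    return sum(output_shares) % MODULUS


if __name__ == "__main__":
    # Point function on the domain {0, ..., 7}: f(3) = 7 and f(x) = 0 otherwise.
    table = [7 if x == 3 else 0 for x in range(8)]
    key0, key1 = share_function(table)
    print(reconstruct([evaluate_share(key0, 3), evaluate_share(key1, 3)]))
```

Each individual share is a list of uniformly random residues, so on its own it reveals nothing about which input carries the nonzero value.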

332) Derek Liu (PRIMES) and Yuan Yao (MIT), Arrangements of Simplices in Fine Mixed Subdivisions (12 Jan 2023)

A regular simplex of side length $n$ can be subdivided into multiple polytopes, each of which is a Minkowski sum of some faces of a unit simplex. Ardila and Billey have shown that exactly $n$ of these cells must be simplices, and their positions must be in a “spread-out” arrangement. In this paper, we consider their question of whether every spread-out arrangement of simplices can be extended into such a subdivision, especially in the three-dimensional case. We prove that a specific class of these arrangements, namely those that project down to a two-dimensional spread-out arrangement, all extend to a subdivision.

331) George Cao (PRIMES), Kent B. Vashaw (MIT), On the decomposition of tensor products of monomial modules for finite 2-groups (arXiv.org, 11 Jan 2023)

Dave Benson conjectured in 2020 that if $G$ is a finite $2$-group and $V$ is an odd-dimensional indecomposable representation of $G$ over an algebraically closed field $\Bbbk$ of characteristic $2$, then the only odd-dimensional indecomposable summand of $V \otimes V^*$ is the trivial representation $\Bbbk$. This would imply that a tensor power of an odd-dimensional indecomposable representation of $G$ over $\Bbbk$ has a unique odd-dimensional summand. Benson has further conjectured that, given such a representation $V$, the function sending a positive integer $n$ to the dimension of the unique odd-dimensional indecomposable summand of $V^{\otimes n}$ is quasi-polynomial. We examine this conjecture for monomial modules, a class of graded representations for the group $\mathbb{Z}/{2^r}\mathbb{Z} \times \mathbb{Z}/{2^s}\mathbb{Z}$ which correspond to skew Young diagrams. We prove the tensor powers conjecture for several modules, giving some of the first nontrivial cases where this conjecture has been verified, and we give conjectural quasi-polynomials for a broad range of monomial modules based on computational evidence.

330) Jesse Geneson (SJSU), Ethan Zhou (PRIMES), Online Learning of Smooth Functions (arXiv.org, 4 Jan 2023)

In this paper, we study the online learning of real-valued functions where the hidden function is known to have certain smoothness properties. Specifically, for $q \ge 1$, let $\mathcal F_q$ be the class of absolutely continuous functions $f: [0,1] \to \mathbb R$ such that $\|f'\|_q \le 1$. For $q \ge 1$ and $d \in \mathbb Z^+$, let $\mathcal F_{q,d}$ be the class of functions $f: [0,1]^d \to \mathbb R$ such that any function $g: [0,1] \to \mathbb R$ formed by fixing all but one parameter of $f$ is in $\mathcal F_q$. For any class of real-valued functions $\mathcal F$ and $p>0$, let $\text{opt}_p(\mathcal F)$ be the best upper bound on the sum of $p^{\text{th}}$ powers of absolute prediction errors that a learner can guarantee in the worst case. In the single-variable setup, we find new bounds for $\text{opt}_p(\mathcal F_q)$ that are sharp up to a constant factor. We show for all $\varepsilon \in (0, 1)$ that $\text{opt}_{1+\varepsilon}(\mathcal{F}_{\infty}) = Θ(\varepsilon^{-\frac{1}{2}})$ and $\text{opt}_{1+\varepsilon}(\mathcal{F}_q) = Θ(\varepsilon^{-\frac{1}{2}})$ for all $q \ge 2$. We also show for $\varepsilon \in (0,1)$ that $\text{opt}_2(\mathcal F_{1+\varepsilon})=Θ(\varepsilon^{-1})$. In addition, we obtain new exact results by proving that $\text{opt}_p(\mathcal F_q)=1$ for $q \in (1,2)$ and $p \ge 2+\frac{1}{q-1}$. In the multi-variable setup, we establish inequalities relating $\text{opt}_p(\mathcal F_{q,d})$ to $\text{opt}_p(\mathcal F_q)$ and show that $\text{opt}_p(\mathcal F_{\infty,d})$ is infinite when $p<d$ and finite when $p>d$. We also obtain sharp bounds on learning $\mathcal F_{\infty,d}$ for $p < d$ when the number of trials is bounded.

329) Coleman DuPlessie and Eddie Wei, Deep Learning Transformers for Non-cyclical Kinematics (31 Dec 2022)

Machine learning is a useful tool in the field of kinematics because of its ability to easily analyze high-dimensional temporal data and recognize patterns that are often not discernible to humans. Many machine learning models have already been applied to human kinematics, yet the transformer, a model that is especially good at capturing long-distance relationships in data, has not yet been applied to this field. Because common models such as LSTMs perform much worse on non-cyclical data than on cyclical data, their usefulness in the field of kinematics is limited. We theorize that, because Transformers can better represent long-term dependencies, they will achieve superior performance on tasks in this field, where the time series data is significantly aperiodic. In this work, we have compared Transformers and similar models to an LSTM model and a heuristic benchmark on non-cyclical, 3-dimensional positional data from CMU’s Quality of Life Grand Challenge Kitchen dataset and found that vanilla Transformers are able to outperform both LSTMs and simple heuristics.

328) S. K. Devalapurkar (Harvard), and M. L. Misterka (PRIMES), Generalized n-Series and de Rham Complexes (31 Dec 2022)

The goal of this article is to study some basic algebraic and combinatorial properties of "generalized $n$-series" over a commutative ring $R$, which are functions $s: \mathbb{Z}_{\geq 0} \to R$ satisfying a mild condition. A special example of generalized $n$-series is given by the $q$-integers $\frac{q^n-1}{q-1} \in \mathbb{Z}[q]$. Given a generalized $n$-series $s$, one can define $s$-analogues of factorials (via $n!_s = \prod_{i=1}^n s(i)$) and binomial coefficients. We prove that Pascal's identity, the binomial identity, Lucas' theorem, and the Vandermonde identity admit $s$-analogues; each of these specializes to its appropriate $q$-analogue in the case of the $q$-integer generalized $n$-series. We also study the growth rates of generalized $n$-series defined over the integers. Finally, we define an $s$-analogue of the ($q$-)derivative, and prove $s$-analogues of the Poincaré lemma and the Cartier isomorphism for the affine line, as well as a pullback square due to Bhatt-Lurie.
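The $q$-integer case can be explored numerically. The sketch below evaluates the $q$-integers at an integer value of $q$, forms the $s$-factorial as the product of $s(1), \dots, s(n)$, and assumes that the $s$-binomial coefficient is the usual quotient of $s$-factorials (which is exact for $q$-integers). It then checks the classical $q$-Pascal identity $\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q$. The function names are illustrative, and the general "mild condition" on $s$ is not modeled.

```python
from math import prod


def q_integer(q):
    """Return s with s(n) = 1 + q + ... + q^(n-1), i.e. (q^n - 1)/(q - 1)."""
    return lambda n: sum(q**i for i in range(n))


def s_factorial(s, n):
    """n!_s = s(1) * s(2) * ... * s(n); the empty product gives 0!_s = 1."""
    return prod(s(i) for i in range(1, n + 1))


def s_binomial(s, n, k):
    """Quotient n!_s / (k!_s (n-k)!_s); assumed to be exact for the given s."""
    if k < 0 or k > n:
        return 0
    numerator = s_factorial(s, n)
    denominator = s_factorial(s, k) * s_factorial(s, n - k)
    if numerator % denominator:
        raise ValueError("s-binomial is not integral for this s")
    return numerator // denominator


if __name__ == "__main__":
    q = 2
    s = q_integer(q)
    n, k = 4, 2
    lhs = s_binomial(s, n, k)
    rhs = s_binomial(s, n - 1, k - 1) + q**k * s_binomial(s, n - 1, k)
    print(lhs, rhs)
```

Setting `q = 1` makes $s(n) = n$, so the same functions reduce to ordinary factorials and binomial coefficients.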

327) Matvey Borodin, Ethan Liu, Justin Zhang, The Ideal of Vanishing Polynomials and the Ring of Polynomial Functions (25 Dec 2022; arXiv.org, 24 Sept 2023)

Vanishing polynomials are polynomials over a ring which output $0$ for all elements in the ring. In this paper, we study the ideal of vanishing polynomials over specific types of rings, along with the closely related ring of polynomial functions. In particular, we provide several results on generating vanishing polynomials. We first analyze the ideal of vanishing polynomials over $\mathbb{Z}_n$, the ring of the integers modulo $n$. We then establish an isomorphism between the vanishing polynomials of a ring and the vanishing polynomials of the constituent rings in its decomposition. Lastly, we generalize our results to study the ideal of vanishing polynomials over arbitrary commutative rings.
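Over $\mathbb{Z}_n$ the definition can be tested by evaluating a polynomial at every residue. The sketch below does exactly that; the coefficient convention (highest degree first) and the function names are choices made for illustration.

```python
def evaluate(coeffs, x, n):
    """Evaluate a polynomial at x modulo n using Horner's rule.

    coeffs lists the coefficients from the highest degree down to the constant.
    """
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % n
    return acc


def is_vanishing(coeffs, n):
    """True if the polynomial is zero at every element of Z_n."""
    return all(evaluate(coeffs, x, n) == 0 for x in range(n))


if __name__ == "__main__":
    # x^3 - x = (x - 1) x (x + 1) is a product of three consecutive integers,
    # hence divisible by 6 for every integer x.
    print(is_vanishing([1, 0, -1, 0], 6))
    # x^2 - x takes the value 2 at x = 2, so it does not vanish on Z_6.
    print(is_vanishing([1, -1, 0], 6))
```

Non-monic examples can have lower degree: $3x^2 + 3x = 3x(x+1)$ vanishes on $\mathbb{Z}_6$ because $x(x+1)$ is always even.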

326) Felix Gotti (MIT), Joseph Vulakh (PRIMES), On the atomic structure of torsion-free monoids (arXiv.org, 16 Dec 2022), published in Semigroup Forum 107 (2023): 402–423

Let $M$ be a cancellative and commutative (additive) monoid. The monoid $M$ is atomic if every non-invertible element can be written as a sum of irreducible elements, which are also called atoms. Also, $M$ satisfies the ascending chain condition on principal ideals (ACCP) if every increasing sequence of principal ideals (under inclusion) becomes constant from one point on. In the first part of this paper, we characterize torsion-free monoids that satisfy the ACCP as those torsion-free monoids whose submonoids are all atomic. A submonoid of the nonnegative cone of a totally ordered abelian group is often called a positive monoid. Every positive monoid is clearly torsion-free. In the second part of this paper, we study the atomic structure of certain classes of positive monoids.

325) Paul Gutkovich (PRIMES) and Zi Song Yeoh (MIT), Computing Truncated Metric Dimension of Trees (8 Dec 2022)

Let $G=(V,E)$ be a simple, unweighted, connected graph. Let $d(u,v)$ denote the distance between vertices $u,v$. A resolving set of $G$ is a subset $S$ of $V$ such that knowing the distance from a vertex $v$ to every vertex in $S$ uniquely identifies $v$. The metric dimension of $G$ is defined as the size of the smallest resolving set of $G$. We define the $k$-truncated resolving set and $k$-truncated metric dimension of a graph similarly, but with the notion of distance replaced with $d_k(u,v) := \min(d(u,v),k+1)$.
In this paper, we demonstrate that computing the $k$-truncated metric dimension of trees is NP-hard for general $k$. We then present a polynomial-time algorithm to compute the $k$-truncated metric dimension of trees when $k$ is a fixed constant.
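The definitions above translate into a short exhaustive search that is useful for checking small examples. The sketch below tries all vertex subsets in order of size, so it runs in exponential time and is not the polynomial-time algorithm of the paper. It assumes a connected graph with at least two vertices given as an adjacency dict; the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def truncated_metric_dimension(adj, k):
    """Return (size, S) for a smallest k-truncated resolving set S."""
    vertices = sorted(adj)
    d = {u: bfs_distances(adj, u) for u in vertices}
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            # Code of v: its truncated distances d_k(v, s) = min(d(v, s), k + 1).
            codes = {tuple(min(d[v][s], k + 1) for s in S) for v in vertices}
            if len(codes) == len(vertices):   # all codes distinct: S resolves G
                return size, S
    return None  # not reached for a connected graph: the full vertex set resolves


if __name__ == "__main__":
    path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(truncated_metric_dimension(path4, k=1))
    print(truncated_metric_dimension(path4, k=3))
```

On the path with four vertices, an endpoint alone resolves every vertex when $k = 3$, but with $k = 1$ all distances above $2$ collapse to $2$ and a second landmark is needed.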

324) Nitya Mani (MIT) and Edward Yu (PRIMES), Turán Problems for Mixed Graphs (arXiv.org, 23 Oct 2022)

We investigate natural Turán problems for mixed graphs, generalizations of graphs where edges can be either directed or undirected. We study a natural Turán density coefficient that measures how large a fraction of directed edges an $F$-free mixed graph can have; we establish an analogue of the Erdős-Stone-Simonovits theorem and give a variational characterization of the Turán density coefficient of any mixed graph (along with an associated extremal $F$-free family). This characterization enables us to highlight an important divergence between classical extremal numbers and the Turán density coefficient. We show that Turán density coefficients can be irrational, but are always algebraic; for every $k \in \mathbb N$, we construct a family of mixed graphs whose Turán density coefficient has algebraic degree $k$.

323) Alan Bu, Joseph Vulakh, and Alex Zhao, Length-Factoriality and Pure Irreducibility (arXiv.org, 13 Oct 2022), published in Communications in Algebra 51:9 (2023): 3745-3755

An atomic monoid $M$ is called length-factorial if for every non-invertible element $x \in M$, no two distinct factorizations of $x$ into irreducibles have the same length (i.e., number of irreducible factors, counting repetitions). The notion of length-factoriality was introduced by J. Coykendall and W. Smith in 2011 under the term 'other-half-factoriality': they used length-factoriality to provide a characterization of unique factorization domains. In this paper, we study length-factoriality in the more general context of commutative, cancellative monoids. In addition, we study factorization properties related to length-factoriality, namely, the PLS property (recently introduced by Chapman et al.) and bi-length-factoriality in the context of semirings.

322) Alan Lee, Connectedness in Friends-and-Strangers Graphs of Spiders and Complements (arXiv.org, 5 Oct 2022)

Let $X$ and $Y$ be two graphs with vertex set $[n]$. Their friends-and-strangers graph $\mathsf{FS}(X,Y)$ is a graph with vertex set $S_n$, and two permutations $σ$ and $σ'$ are adjacent if they are separated by a transposition $\{a,b\}$ such that $a$ and $b$ are adjacent in $X$ and $σ(a)$ and $σ(b)$ are adjacent in $Y$. Specific friends-and-strangers graphs such as $\mathsf{FS}(\mathsf{Path}_n,Y)$ and $\mathsf{FS}(\mathsf{Cycle}_n,Y)$ have been researched, and their connected components have been enumerated using various equivalence relations such as double-flip equivalence. A spider graph is a collection of path graphs that are all connected to a single center point. In this paper, we delve deeper into the question of when $\mathsf{FS}(X,Y)$ is connected when $X$ is a spider and $Y$ is the complement of a spider or a tadpole.

321) Scott T. Chapman (SHSU), Caroline Liu (PRIMES), Annabel Ma (PRIMES), Andrew Zhang (PRIMES), On the factorization invariants of arithmetical congruence monoids (arXiv.org, 3 Oct 2022)

In this paper, we study various factorization invariants of arithmetical congruence monoids. The invariants we investigate are the catenary degree, a measure of the maximum distance between any two factorizations of the same element, the length density, which describes the distribution of the factorization lengths of an element, and the omega primality, which measures how far an element is from being prime.
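
As a small illustration of what a factorization in an arithmetical congruence monoid looks like, the sketch below enumerates factorizations into atoms by brute force. It assumes the standard definition $M_{a,b} = \{1\} \cup \{n \geq 1 : n \equiv a \pmod b\}$ with $a^2 \equiv a \pmod b$; the function names are hypothetical and are not taken from the paper, and no invariant from the abstract (catenary degree, length density, omega primality) is computed here.

```python
def in_acm(n, a, b):
    """Membership test for M_{a,b} = {1} U {n : n = a (mod b)} (assumed definition)."""
    return n == 1 or n % b == a % b


def is_atom(n, a, b):
    """An atom is a non-identity element that is not a product of two non-identity elements."""
    if n == 1 or not in_acm(n, a, b):
        return False
    for d in range(2, n):
        if n % d == 0 and in_acm(d, a, b) and in_acm(n // d, a, b):
            return False
    return True


def factorizations(n, a, b, smallest=2):
    """All factorizations of n into atoms of M_{a,b}, as nondecreasing tuples."""
    if n == 1:
        return [()]
    found = []
    for d in range(smallest, n + 1):
        if n % d == 0 and is_atom(d, a, b) and in_acm(n // d, a, b):
            for rest in factorizations(n // d, a, b, d):
                found.append((d,) + rest)
    return found


# Hilbert monoid M_{1,4}: 441 has two distinct factorizations into atoms.
print(factorizations(441, 1, 4))  # [(9, 49), (21, 21)]
```

The search is exponential in general and is meant only for small elements.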

320) Colin Defant (MIT), David Dong (PRIMES), Alan Lee (PRIMES), Michelle Wei (PRIMES), Connectedness and Cycle Spaces of Friends-and-Strangers Graphs (arXiv.org, 4 Sept 2022)

If $X=(V(X),E(X))$ and $Y=(V(Y),E(Y))$ are $n$-vertex graphs, then their friends-and-strangers graph $\mathsf{FS}(X,Y)$ is the graph whose vertices are the bijections from $V(X)$ to $V(Y)$ in which two bijections $\sigma$ and $\sigma'$ are adjacent if and only if there is an edge $\{a,b\}\in E(X)$ such that $\{\sigma(a),\sigma(b)\}\in E(Y)$ and $\sigma'=\sigma\circ (a\,\,b)$, where $(a\,\,b)$ is the permutation of $V(X)$ that swaps $a$ and $b$. We prove general theorems that provide necessary and/or sufficient conditions for $\mathsf{FS}(X,Y)$ to be connected. As a corollary, we obtain a complete characterization of the graphs $Y$ such that $\mathsf{FS}(\mathsf{Dand}_{k,n},Y)$ is connected, where $\mathsf{Dand}_{k,n}$ is a dandelion graph; this substantially generalizes a theorem of the first author and Kravitz in the case $k=3$. For specific choices of $Y$, we characterize the spider graphs $X$ such that $\mathsf{FS}(X,Y)$ is connected. In a different vein, we study the cycle spaces of friends-and-strangers graphs. Naatz proved that if $X$ is a path graph, then the cycle space of $\mathsf{FS}(X,Y)$ is spanned by $4$-cycles and $6$-cycles; we show that the same statement holds when $X$ is a cycle and $Y$ has domination number at least $3$. When $X$ is a cycle and $Y$ has domination number at least $2$, our proof sheds light on how walks in $\mathsf{FS}(X,Y)$ behave under certain Coxeter moves.
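
The definition of $\mathsf{FS}(X,Y)$ above can be explored by brute force for very small $n$. The following is a minimal sketch, assuming both graphs have vertex set $\{0, \dots, n-1\}$ and are given as edge lists; the function name `fs_components` is hypothetical. It counts connected components by a depth-first search over all $n!$ bijections, so it is only practical for tiny examples and does not reflect the methods of the paper.

```python
from itertools import permutations


def fs_components(n, x_edges, y_edges):
    """Count connected components of FS(X, Y) for graphs on {0, ..., n-1}."""
    y_adj = {frozenset(e) for e in y_edges}
    unseen = set(permutations(range(n)))  # sigma stored as (sigma(0), ..., sigma(n-1))
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            sigma = stack.pop()
            for a, b in x_edges:
                # The move is allowed when {sigma(a), sigma(b)} is an edge of Y.
                if frozenset((sigma[a], sigma[b])) in y_adj:
                    nxt = list(sigma)
                    nxt[a], nxt[b] = nxt[b], nxt[a]  # sigma' = sigma o (a b)
                    nxt = tuple(nxt)
                    if nxt in unseen:
                        unseen.remove(nxt)
                        stack.append(nxt)
    return components


path3 = [(0, 1), (1, 2)]
complete3 = [(0, 1), (0, 2), (1, 2)]
print(fs_components(3, path3, complete3))  # 1: FS(Path_3, K_3) is connected
print(fs_components(3, path3, path3))      # 2: FS(Path_3, Path_3) is disconnected
```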

319) Paula Bergero, Laura P. Schaposnik, and Grace Wang (PRIMES), Correlations Between COVID-19 and Dengue (arXiv.org, 27 Jul 2022), published in Nature Scientific Reports (27 Jan 2023)

A dramatic increase in the number of outbreaks of Dengue has recently been reported, and climate change is likely to extend the geographical spread of the disease. In this context, this paper shows how a neural network approach can incorporate Dengue and COVID-19 data as well as external factors (such as social behaviour or climate variables), to develop predictive models that could improve our knowledge and provide useful tools for health policy makers. Through the use of neural networks with different social and natural parameters, in this paper we define a Correlation Model through which we show that the number of cases of COVID-19 and Dengue have very similar trends. We then illustrate the relevance of our model by extending it to a Long short-term memory model (LSTM) that incorporates both diseases, and using this to estimate Dengue infections via COVID-19 data in countries that lack sufficient Dengue data.

318) Zifan (Carl) Guo (PRIMES) and William S. Moses (MIT), Understanding High-Level Properties of Low-Level Programs Through Transformers (8 July 2022)

Transformer models have enabled breakthroughs in the field of natural language processing largely because, unlike other models, Transformers can be trained on a large corpus of unlabeled data. One can then perform fine-tuning on the model to fit a specific task. Unlike natural language, which is somewhat tolerant of minor differences in word choices or ordering, the structured nature of programming languages means that program meaning can be completely redefined or be invalid if even one token is altered. In comparison to high-level languages, low-level languages are less expressive and more repetitive with more details from the computer microarchitecture. Whereas recent literature has examined how to effectively use Transformer models on high-level programming semantics, this project explores the effectiveness of applying Transformer models on low-level representations of programs that can shed light on better optimizing compilers. In this paper, we show that Transformer models can translate C to LLVM-IR with high accuracy, by training on a parallel corpus of functions extracted from 1 million compilable, open-sourced C programs (AnghaBench) and their corresponding LLVM-IR after compiling with Clang. Our model shows a $49.57\%$ verbatim match when evaluated on the AnghaBench dataset and a high BLEU score of 87.68. We also present another case study that analyzes x86_64 basic blocks for estimating their throughput and matches the state of the art. We show through ablation studies that a collection of preprocessing simplifications of the low-level programs especially improves the model’s ability to generate low-level programs, and we discuss data selection, network architecture, as well as limitations to the use of Transformers on low-level programs.

317) Tanisha Saxena (PRIMES) and Jun Wan (MIT), A Systematic Study on the Difference and Conversion Between Synchronous and Asynchronous Protocols (1 July 2022)

In this paper, we provide a fundamental analysis of the similarities and differences between synchronous and asynchronous distributed systems. Specifically, we define a special and normal adversary such that any protocol for a synchronous system that is resilient to the special adversary can be replicated by a protocol for an asynchronous system that is resilient to the normal adversary. Protocols for the synchronous model are less complex, as the guarantee that messages will be delivered within a bounded time makes it easy to determine the sequence of events in the system. However, this guarantee is unrealistic in practice, as real systems tend to be asynchronous, with messages not guaranteed to be delivered in a timely manner. Protocols for the asynchronous model, on the other hand, are more complex as there are many edge cases to account for. Our adversaries help to create intermediary models that allow us to replicate protocol outputs across both synchronous and asynchronous systems, allowing for simpler creation of protocols that remain functional under the asynchronous model.

2021 Research Papers

316) Anand, Jesse Geneson, Suchir Kaustav, Shen-Fu Tsai (CrowdMath-2021), Sequence saturation (arXiv.org, 10 May 2024), published in Discrete Applied Mathematics 360 (2025): 382-393

In this paper, we introduce saturation and semisaturation functions of sequences, and we prove a number of fundamental results about these functions. Given any forbidden sequence $u$ with $r$ distinct letters, we say that a sequence $s$ on a given alphabet is $u$-saturated if $s$ is $r$-sparse, $u$-free, and adding any letter from the alphabet to $s$ violates $r$-sparsity or induces a copy of $u$. We say that $s$ is $u$-semisaturated if $s$ is $r$-sparse and adding any letter from the alphabet to $s$ violates $r$-sparsity or induces a new copy of $u$. Let the saturation function $\operatorname{Sat}(u, n)$ denote the minimum possible length of a $u$-saturated sequence on an alphabet of size $n$, and let the semisaturation function $\operatorname{Ssat}(u, n)$ denote the minimum possible length of a $u$-semisaturated sequence on an alphabet of size $n$. For alternating sequences of the form $a b a b \dots$, we determine the saturation functions up to a multiplicative factor of $2$, and we determine the semisaturation functions up to the leading term. We demonstrate a dichotomy for the semisaturation functions of sequences: for any sequence $u$, we have $\operatorname{Ssat}(u, n) = O(1)$ if and only if the first letter and the last letter of $u$ each occur exactly once, and otherwise we have $\operatorname{Ssat}(u, n) = \Theta(n)$. For the saturation function, we show that every sequence $u$ has either $\operatorname{Sat}(u, n) \ge n$ for every positive integer $n$ or $\operatorname{Sat}(u, n) = O(1)$. We prove that every sequence $u$ in which every letter occurs at least twice has $\operatorname{Sat}(u, n) \ge n$, and we show that $\operatorname{Sat}(u, n) = \Theta(n)$ or $\operatorname{Sat}(u, n) = O(1)$ for every sequence $u$ with $2$ distinct letters.

315) Felix Gotti (MIT), Bangzheng Li (PRIMES), Divisibility and a weak ascending chain condition on principal ideals (arXiv.org, 12 Dec 2022)

An integral domain $R$ is atomic if each nonzero nonunit of $R$ factors into irreducibles. In addition, an integral domain $R$ satisfies the ascending chain condition on principal ideals (ACCP) if every increasing sequence of principal ideals (under inclusion) becomes constant from one point on. Although it is not hard to verify that every integral domain satisfying ACCP is atomic, examples of atomic domains that do not satisfy ACCP are notoriously hard to construct. The first of such examples was constructed by A. Grams back in 1974. In this paper we delve into the class of atomic domains that do not satisfy ACCP. To better understand this class, we introduce the notion of weak-ACCP domains, which generalizes that of integral domains satisfying ACCP. Strongly atomic domains were introduced by D. D. Anderson, D. F. Anderson, and M. Zafrullah in 1990. It turns out that every weak-ACCP domain is strongly atomic, and so we introduce a taxonomic classification on our class of interest: ACCP implies weak-ACCP, which implies strong atomicity, which implies atomicity. We study this chain of implications, putting special emphasis on the weak-ACCP property. This allows us to provide new examples of atomic domains that do not satisfy ACCP.

314) Tanya Khovanova (MIT) and Atharva Pathak (PRIMES), Combinatorial Aspects of the Card Game War (arXiv.org, 28 Jan 2022), published in The Australasian Journal of Combinatorics 91:2 (2025): 301-325

This paper studies a single-suit version of the card game War on a finite deck of cards. There are varying methods of how players put the cards that they win back into their hands, but we primarily consider randomly putting the cards back and deterministically always putting the winning card before the losing card. The concept of a $\textit{passthrough}$ is defined, which refers to a player playing through all cards in their hand from a particular point in the game. We consider games in which the second player wins during their first passthrough. We introduce several combinatorial objects related to the game: game graphs, win-loss sequences, win-loss binary trees, and game posets. We show how these objects relate to each other. We enumerate states depending on the number of rounds and the number of passthroughs.

313) Luke Robitaille, Topological Entropy of Simple Braids (22 Jan 2022)

Mathematical objects called $\textit{braids}$ are formed from “strands” (like string or yarn) that intertwine. A certain collection of braids, called $\textit{simple braids}$, correspond to permutations, depending on how the strands get permuted. We can think of braids as maps from a disc with some “punctures” to itself; using this idea, we can consider the $\textit{topological entropy}$ of a braid, which can be zero or positive. What proportion of simple braids have positive topological entropy? The main theorem of this project is that, in the limit as the number of strands increases, the proportion of simple braids that have positive topological entropy approaches 1. This can be proved by showing that we can almost always find a long cycle in the permutation that will enable us to get a braid with three strands that has positive topological entropy, yielding the theorem. Topological entropy of braids can have use beyond just being interesting mathematics, such as for considering how to stir fluids.

312) Andrew Gu, On LU Matrices and Springer Theory (19 Jan 2022)

In this paper, we investigate and find the number of LU matrices in $GL_n(\mathbb{F}_q)$ that are similar to a regular semisimple $s$ in $GL_n(\mathbb{F}_q)$. Linking our results with M.-T. Trinh's study of certain “generalized Steinberg varieties,” we expand on his work. Trinh has established certain numerical identities coming from a $P=W$ conjecture of Cataldo-Hausel-Migliorini between affine Springer fibers and these generalized Steinberg varieties. The results of this paper provide numerical evidence of the relation between Springer fibers and LU matrices. Using a linear-algebraic approach, we find a direct relation between LU matrices and Trinh's spaces. Consequently, we derive a closed formula for a point count of LU matrices that is a constant factor from the point count of Trinh's spaces. Furthermore, we identify a common point count among these sets. From this we propose a conjecture that generalizes our results.

311) Zifan (Carl) Guo, The Effectiveness of Transformer Models for Analyzing Low-Level Programs (18 Jan 2022)

Recently, transformer networks have enabled breakthroughs in the field of natural language processing. This is partially due to the fact that transformer models can be first trained on a large corpus of unlabeled data prior to fine-tuning on a downstream task. Unlike natural language, which is somewhat tolerant of minor differences in word choices or ordering, the structured nature of programming languages means that program meaning can be completely redefined or be invalid if even one token is altered. In comparison to high-level languages, low-level languages are less expressive and more repetitive with more details from the computer microarchitecture. Whereas recent literature has examined how to effectively use transformer models on high-level programming semantics, this project explores the effectiveness of applying transformer models on low-level representations of programs that can shed light on better optimizing compilers. In this paper, we show that transformer models can translate C to LLVM-IR with high accuracy, by training on a parallel corpus of functions extracted from 1 million compilable, open-sourced C programs (AnghaBench) and their corresponding LLVM-IR after compiling with Clang. We also present another case study that analyzes x86_64 basic blocks for estimating their throughput. We discuss various changes in data selection, program representation, network architecture, and other modifications that influence the effectiveness of transformer models on low-level programs.

310) Arun S. Kannan (MIT) and Zifan (Atticus) Wang, Representation Stability and Finite Orthogonal Groups (17 Jan 2022; arXiv.org, 20 Feb 2022), published in Algebras and Representation Theory 26 (29 March 2023): 3119–3141

In this paper, we prove stability results about orthogonal groups over finite commutative rings where 2 is a unit. Inspired by Putman and Sam (2017), we construct a category $\mathbf{OrI}(R)$ and prove a Noetherianity theorem for the category of $\mathbf{OrI}(R)$-modules. This implies an asymptotic structure theorem for orthogonal groups. In addition, we show general homological stability theorems for orthogonal groups, with both untwisted and twisted coefficients, partially generalizing a result of Charney (1987).

309) Ilaria Seidel, Bounds on Generalized Symmetric Numerical Semigroups (16 Jan 2022)

Numerical semigroups are combinatorial objects that are easy to define, but have rich connections to other fields. Certain families of numerical semigroups are of particular interest because of their connections to algebraic geometry. We focus on one such family known as symmetric semigroups, and analyze the rate of growth of the number of symmetric semigroups $S(g)$ with genus $g$. Then, we partition semigroups of genus $g$ by their Frobenius number, and denote by $N(g, F)$ the number of semigroups with genus $g$ and Frobenius number $F$. We extend results from $S(g)$ to $N(g, 2g-k)$ for $k$ fixed in the range $1 \leq k \leq g$. We state a conjecture about the local behavior of the ratio $\frac{S(g+1)}{S(g)}$, depending on the residue of $g \pmod 3$. Finally, we generalize this conjecture to include $N(g, 2g-k)$ for fixed $k$.
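
The quantities $S(g)$ and $N(g, F)$ can be tabulated for small genus by exhaustive search. The sketch below is an illustration only, not the paper's method. It relies on the standard fact that a numerical semigroup of genus $g$ has Frobenius number at most $2g-1$, so its gap set is a $g$-element subset of $\{1, \dots, 2g-1\}$, and it takes $S(g) = N(g, 2g-1)$ as the count of symmetric semigroups. The function names are hypothetical.

```python
from itertools import combinations


def is_gap_set(gaps):
    """True if the nonnegative integers outside `gaps` are closed under addition."""
    gaps = set(gaps)
    for x in gaps:
        for a in range(1, x):
            # If a and x - a are both non-gaps, their sum x must not be a gap.
            if a not in gaps and (x - a) not in gaps:
                return False
    return True


def count_by_frobenius(g):
    """Return {F: N(g, F)} for genus g >= 1 by brute force over candidate gap sets."""
    counts = {}
    for gaps in combinations(range(1, 2 * g), g):
        if is_gap_set(gaps):
            frobenius = max(gaps)
            counts[frobenius] = counts.get(frobenius, 0) + 1
    return counts


def symmetric_count(g):
    """S(g): semigroups of genus g whose Frobenius number equals 2g - 1."""
    return count_by_frobenius(g).get(2 * g - 1, 0)


for g in range(1, 6):
    table = count_by_frobenius(g)
    print(g, sum(table.values()), symmetric_count(g), table)
```

The number of candidate gap sets grows like $\binom{2g-1}{g}$, so this is only useful for checking small cases by hand.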

308) Kevin Cong, Square Tilings of Translation Surfaces (16 Jan 2022)

Translation surfaces are obtained by identifying opposite edges of a polygon with an even number of sides, paired together. We explore the question of tiling translation surfaces with squares, including the torus and the surfaces generated by the regular octagon. Given any tiling, we identify its contacts graph, a triangulation formed by assigning one vertex to each square and drawing edges between vertices corresponding to adjacent squares. In particular, we prove that under certain conditions, there is exactly one torus tiling whose contacts graph is a given torus triangulation. We then provide a method to approximately construct this tiling. We also show that the regular octagon translation surface cannot be tiled with squares. However, we give constructive tilings of translation surfaces corresponding to certain affine transformations of the octagon.

307) Akhil Kammila, Proposed Improvements to the Tor Handshake (15 Jan 2022)

Tor is the world’s largest anonymous communication network. It conceals its users’ identities by sending their traffic through three successive Tor relays. To establish connections between users, relays, and destinations, Tor uses a unique two-stage handshake. The first stage is a modified version of TLS 1.2 and the second stage is a fully encrypted exchange of Tor cells. The two-stage process enables both parties to authenticate while masking the differences between Tor’s handshake and standard TLS. The Tor handshake has multiple shortcomings when compared to widely-used cryptographic protocols like TLS and QUIC. It has high latency that detracts from the user experience and increased complexity that makes maintenance challenging. The first stage of the handshake also only supports TLS 1.2 despite TLS 1.3’s release in 2018. Our work presents an analysis of Tor’s handshake and proposes improvements. We find messages in the second stage of the Tor handshake that are redundant. Most notably, the responder sends a certificate that is not necessary for authentication. Removing these messages reduces the data transferred in the handshake without compromising the key exchange or authentication. Further, we find that removing backward compatibility from the Tor handshake allows for the trivial use of TLS 1.3 in the first stage. This reduces the round-trips and improves the security of the Tor handshake.

306) Abigail Thomas, The Implementation of Model Pruning to Optimize zk-SNARKs (15 Jan 2022)

Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARKs) are used to convince a verifier that a server possesses certain information without revealing these private inputs. Thus, zk-SNARKs can be useful when outsourcing computations for cloud computing. The proofs returned by the server must be less computationally intensive than the given task, but the more complex the task, the more expensive the proof. We present a method that involves model pruning to decrease the complexity of the given task and thus the proof as well, to allow clients to outsource more complex programs. The proposed method harnesses the benefits of producing accurate results using a lower number of constraints, while remaining secure.

305) Vishnu Emani (PRIMES), Vijay Govindarajan, and David Hoganson (Boston Children's Hospital), Computational Fluid Modeling for Surgical Planning of Single Ventricle Congenital Heart Defects (15 Jan 2022)

Single ventricle defects (SVD) refer to the collection of congenital heart defects in which one chamber of the heart remains weak or underdeveloped. The most common palliative treatment for SVD physiologies involves a 3-stage surgical intervention, ending with the Fontan procedure. For patients with bilateral Superior Vena Cavae (SVC), the bilateral bidirectional Glenn (BBDG) procedure is typically employed. The primary goal of this study was to examine the effects of various physiological factors, such as vascular sizes, hepatic vein angle, curvature and position of the Fontan conduit, and the construction of a neo-innominate vein on the distribution of hepatic flow to the lungs in BBDG geometries.

304) Tanisha Saxena (PRIMES) and Jun Wan (MIT), A Compromise Between Synchronous and Asynchronous Systems (15 Jan 2022)

In this paper, we introduce a partially synchronous model for distributed systems such that any protocol for our model can be transformed to a corresponding protocol for the asynchronous model. Given a distributed system with $n$ users, we define a normal adversary as one that allows up to $f$ (with $f < n/2$) users to send any arbitrary message at any time, and a special adversary that can, additionally, block up to $f$ message channels for any number of users. We prove that, for any synchronous protocol that is resilient to the special adversary, there is an equivalent protocol for the asynchronous model that is resilient to the normal adversary. The special adversary helps us relax the restriction of time-bounded delivery and provides a model that is useful in analyzing if a synchronous protocol can be modified to work correctly in an asynchronous distributed system. Our model provides a basis to use synchronous protocols to function on asynchronous systems such as electronic banking and Blockchain systems distributed across the Internet.

303) Yavor Litchev, Signature Scheme with Access Control (15 Jan 2022)

A wide variety of digital signature schemes currently exist, from RSA to El-Gamal to Schnorr. More recently, multi-party signature schemes have been developed, including distributed signature schemes and threshold signature schemes. In particular, threshold signature schemes provide useful functionality, in that they require the number of participating parties to pass a threshold in order to generate a valid signature. However, they are limited in their complexity, as they can only model a threshold function. The proposed signature scheme (monotonic signature scheme) allows for the modeling of complex functions, so long as they are monotonic. This would allow for a much greater degree of access control, all while security and correctness are preserved.

302) Jack Wang, Exploration of Capabilities and Limitations in View Change of the X-Fields Model (15 Jan 2022)

Generating images of the same scenes from different perspectives (whether that is from different points, from different angles, under varying illumination, or with other parameters) has a myriad of use cases, stretching from creating debug models to producing smooth videos. In the X-Fields model, hard-coded graphics tricks like lighting, 3D projection, and albedo are used to supplement neural networks in creating a differentiable map between the image parameters and the actual pixels using sample images and their corresponding coordinate values. Although X-Fields performs well on datasets of images concentrated on a 2D (x, y) plane relative to alternative interpolation methods, the original model cannot support broader, practical use cases like the interpolation of images in different 3D (x, y, z) positions. In this paper, we use 3D images and coordinates generated by the 3DB framework in our dimensionally expanded X-Fields model. We find that the new model can generate promising interpolation results with relatively sparse datasets and with large view angle changes; that parameters such as learning rate, the bandwidth parameter in soft blending, and others affect the interpolation quality and create trade-offs between training cost and interpolation quality; and that adding certain backgrounds (like the ocean) to the reference images can pose challenges for interpolation.

301) Garrett Heller (PRIMES) and Chengyang Shao (MIT), Strichartz and Multi-linear Estimates for the One-dimensional Periodic Dysthe equation (11 Jan 2022)

This paper presents Strichartz estimates for the linearized 1D periodic Dysthe equation on the torus, namely an estimate of the $L^6_{x,t}(\mathbb{T}^2)$ norm of the solution in terms of the initial data, and an estimate of the $L^4_{x,t}(\mathbb{T}^2)$ norm in terms of the Bourgain space norm. The paper also presents other results such as bilinear and trilinear estimates pertaining to local well-posedness of the 1-dimensional periodic Dysthe equation in a suitable Bourgain space, and ill-posedness results in Sobolev spaces.

300) Neil Chowdhury, Interplay Between Loop Extrusion and Compartmentalization During Mitosis (10 Jan 2022)

During mitosis, DNA changes its physical structure from diffuse chromatin spread throughout the cell nucleus to discrete, compacted, cylindrical chromatids. This process is essential for cells to be able to transfer replicated chromosomes to the daughter nuclei. During interphase, chromatin is compartmentalized into heterochromatin and euchromatin, resulting in a visible signal in Hi-C contact maps. However, as the cell enters mitosis, this signal is disrupted, only to reappear after the cell divides. This paper explores the interphase and mitotic states by modeling DNA using polymer simulations. It is shown that loop extrusion, the mechanism underlying mitotic chromosome formation, can simultaneously be responsible for disrupting compartmentalization.

299) Nathan Xiong (PRIMES) and Pu Yu (MIT), The Master Field and Free Brownian Motions (10 Jan 2022)

The master field on the plane is the large $N$ limit of the Wilson loop functionals from the two-dimensional Yang–Mills holonomy process. In this paper, we redefine the master field purely through free Brownian motions, so that its definition is independent from finite $N$ Yang–Mills theory. From this aspect, we prove that the master field does not depend on the lasso basis chosen on a graph. We also give a new, elementary proof for the Makeenko–Migdal equations, which allow us to efficiently calculate the master field of any loop via a system of differential equations. While previous work in this field is mostly differential geometric in nature, our proofs all use combinatorial techniques, heavily utilizing the moment-cumulant relation from free probability.

298) Sushanth Sathish Kumar, The Restricted Lie Algebra Structure on the Bar Spectral Sequence of an Iterated Loop Space (8 Jan 2022)

There is a rich algebraic structure in the mod $p$ homology of the iterated loop space $H_*(\Omega^n X; \mathbb{F}_p)$. It admits a Lie bracket called the Browder bracket that is compatible with the Dyer-Lashof operations $Q_0, Q_1,\ldots, Q_{n-1}$. Furthermore, the top Dyer-Lashof operation $Q_{n-1}$ is a restriction for the Browder bracket. Ni proved that the Browder bracket on the homology $H_*(\Omega^n X)$ converges to the bracket on $H_*(\Omega^{n-1} X)$ in the bar spectral sequence, making it a spectral sequence of Poisson-Hopf algebras. Our goal is to use the bar spectral sequence to relate the restricted Lie algebra structure given by the top Dyer-Lashof operation on $H_*(\Omega^n X; \mathbb{F}_2)$ to that of $H_*(\Omega^{n-1} X; \mathbb{F}_2)$.

297) Nancy Jiang, Bangzheng Li, and Sophie Zhu, On the primality and elasticity of algebraic valuations of cyclic free semirings (arXiv.org, 4 Jan 2022), published in International Journal of Algebra and Computation 33:2 (2023): 197-210

A cancellative commutative monoid is atomic if every non-invertible element factors into irreducibles. Under certain mild conditions on a positive algebraic number $\alpha$, the additive monoid $M_\alpha$ of the evaluation semiring $\mathbb{N}_0[\alpha]$ is atomic. The atomic structure of both the additive and the multiplicative monoids of $\mathbb{N}_0[\alpha]$ has been the subject of several recent papers. Here we focus on the monoids $M_\alpha$, and we study their omega-primality and elasticity, aiming to better understand some fundamental questions about their atomic decompositions. We prove that when $\alpha$ is less than 1, the atoms of $M_\alpha$ are as far from being prime as they can possibly be. Then we establish some results about the elasticity of $M_\alpha$, including that when $\alpha$ is rational, the elasticity of $M_\alpha$ is full (this was previously conjectured by S. T. Chapman, F. Gotti, and M. Gotti).

296) Kunal Kapoor (PRIMES) and Jun Wan (MIT), Consensus under a Dynamic Synchronous Model (3 Jan 2022)

With the advance of blockchain and cryptocurrency, the need for efficient and practical consensus algorithms is growing. However, most existing works only consider protocols under the synchronous setting. It is usually assumed that there exist at least $h$ users who are always honest and online. This is impractical as honest users might alternate between online and offline states. In this paper, we adapt Byzantine Broadcast protocols to a dynamic synchronous model which features sleepy/offline users as well as information gaps. We do this by building off an approach centered around a Trust Graph, modifying key algorithms from previous works such as the post-processing algorithm to ensure correctness with the dynamic model. This allows the creation of a more fault-tolerant protocol.

295) Andrew Du, Quaternion-Based Analytical Inverse Dynamics for the Human Body (31 Dec 2021)

The human body provides unique challenges to study from a dynamical perspective, due to its mechanical complexity and the difficulty of obtaining measurements of internal dynamic quantities. Thus, it is essential to create models that both simplify analysis and account for important anatomical details, the two of which must necessarily be balanced into a sufficiently accurate-yet-manageable framework. A number of critical applications require accurate inverse dynamic models of the human body, including medical treatment and virtual simulation of human motion. A recent general technique was developed by Dumas et al. that used a quaternion screw algebra to make computation of inverse dynamic quantities more practical and more efficient. In this paper, we adapt their technique to the case of human anatomy, integrating these computational improvements within a novel framework for modeling human musculature.

294) Tanmay Gupta and Anshul Rastogi, Threshold-Based Inference of Dependencies in Distributed Systems (31 Dec 2021)

Many current online services rely on the interaction between different components that form a distributed system. Analyzing distributed systems is important in performance analysis (e.g. critical path analysis), debugging, and testing new features. However, the analysis of these systems can be difficult due to limited knowledge of how components work and the variety of services and applications that are usually instrumented. The Mystery Machine, introduced by Chow et al. in 2014, has a “big data” approach, using logged events across many traces to generate and refine a causal model. We introduce Scooby Systems, our extension of The Mystery Machine’s algorithm. We introduce thresholds to increase the tolerance to violations in the formation of causal relationships. In the future, we hope to improve Scooby Systems’s scalability with a Hadoop MapReduce implementation.

293) Yihao (Michael) Huang and Claire Wang, Efficient Algorithms for Parallel Bi-core Decomposition (31 Dec 2021)

Graphs are used in the modeling of social networks, biological networks, user-product networks, and many other real-world relationships. Identifying dense regions within these graphs can often aid in applications including product-recommendation, spam identification, and protein-function discovery. A fundamental dense substructure discovery problem in graph theory is the k-core decomposition. However, the k-core decomposition does not directly apply to bipartite graphs, which are graphs that model the connections between two disjoint sets of entities, such as book-authorship, affiliation, and gene-disease association. Given the prevalence of bipartite graphs, solving the dense subgraph discovery problem on bipartite graphs has wide-reaching real-world impacts.
In this paper, we solve the bipartite analogue of the k-core decomposition problem, which is the bi-core decomposition problem. Existing sequential bi-core decomposition algorithms are not scalable to large-scale bipartite graphs with hundreds of millions of edges. Therefore, we develop a theoretically efficient parallel bi-core decomposition algorithm. Our algorithm improves the theoretical bounds of existing algorithms, reducing the length of the computation graph’s longest dependency path, which asymptotically bounds the runtime of a parallel algorithm when there are sufficiently many processors. We prove the problem of bi-core decomposition to be P-complete. We also devise a parallel bi-core index structure to allow for fast queries of the computed cores. Finally, we provide optimized parallel implementations of our algorithms that are scalable and fast. Using 30 threads, our parallel bi-core decomposition algorithm achieves up to a 44x speedup over the best existing sequential algorithm and up to a 2.9x speedup over the best existing parallel algorithm. Our parallel query implementation is up to 22.3x faster than the existing sequential query implementation.
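For readers unfamiliar with bi-cores, the following sketch computes a single core by sequential peeling. It assumes the usual definition, which the abstract does not spell out: the (α, β)-core is the maximal subgraph in which every vertex on one side has degree at least α and every vertex on the other side has degree at least β. It is a plain sequential illustration, not the parallel algorithm of the paper, and the name `alpha_beta_core` is hypothetical.

```python
def alpha_beta_core(edges, alpha, beta):
    """Return (U_core, V_core), the vertex sets of the (alpha, beta)-core.

    edges: iterable of (u, v) pairs with u on side U and v on side V.
    Vertices whose degree falls below their side's bound are peeled
    repeatedly until no violating vertex remains.
    """
    adj_u, adj_v = {}, {}
    for u, v in edges:
        adj_u.setdefault(u, set()).add(v)
        adj_v.setdefault(v, set()).add(u)

    # Worklist of vertices that currently violate their degree bound.
    stack = [("U", u) for u, nbrs in adj_u.items() if len(nbrs) < alpha]
    stack += [("V", v) for v, nbrs in adj_v.items() if len(nbrs) < beta]
    queued = set(stack)

    while stack:
        side, x = stack.pop()
        if side == "U":
            for v in adj_u.pop(x):
                adj_v[v].discard(x)
                if len(adj_v[v]) < beta and ("V", v) not in queued:
                    queued.add(("V", v))
                    stack.append(("V", v))
        else:
            for u in adj_v.pop(x):
                adj_u[u].discard(x)
                if len(adj_u[u]) < alpha and ("U", u) not in queued:
                    queued.add(("U", u))
                    stack.append(("U", u))
    return set(adj_u), set(adj_v)
```

A full bi-core decomposition would record, for every vertex, all (α, β) pairs whose core contains it. Each peeling step depends on the previous ones, and this chain of dependencies is the longest dependency path whose length the abstract says the parallel algorithm reduces.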

292) Raymond Feng, Andrew Lee, and Espen Slettnes, Results on Various Models of Mistake-Bounded Online Learning (29 Dec 2021)

We determine bounds for several variations of the mistake-bound model. The first half of our paper presents various bounds on the weak reinforcement model and the delayed, ambiguous reinforcement model. In both models, the adversary gives $r$ inputs in one round and only indicates a correct answer if all $r$ guesses are correct. The only difference between the two models is that in the delayed, ambiguous model, the learner must answer each input before receiving the next input of the round, while the learner receives all $r$ inputs at once in the modified weak reinforcement model. We also prove generalizations for multi-class functions.
Then, we prove a lower and upper bound of the maximum factor gap that are tight up to a factor of $r$ between the modified weak reinforcement model and the standard model.
Lastly, we also introduce several related models for learning with permutation patterns: the order model, the relative position model, and the delayed relative position model. In these models, a learner attempts to learn a permutation from a set of permutations $F$ by guessing statistics related to sub-permutations. We similarly define the notions of weak versus strong reinforcement and of delayed, ambiguous, reinforcement, and determine some sharp bounds by mimicking sorting algorithms.

291) Fenghuan (Linda) He, A Topological Centrality Measure for Directed Networks (24 Dec 2021; arXiv.org, 30 Jan 2022)

Given a directed network G, we are interested in studying the qualitative features of G which govern how perturbations propagate across G. Various classical centrality measures have been already developed and proven useful to capture qualitative features and behaviors for undirected networks. In this paper, we use topological data analysis (TDA) to adapt measures of centrality to capture both directedness and non-local propagating behaviors in networks. We introduce a new metric for computing centrality in directed weighted networks, namely the quasi-centrality measure. We compute these metrics on trade networks to illustrate that our measure successfully captures propagating effects in the network and can also be used to identify sources of shocks that can disrupt the topology of directed networks. Moreover, we introduce a method that gives a hierarchical representation of the topological influences of nodes in a directed network.

290) Joshua Guo (PRIMES) and Kevin Chang (MIT), On the Gauss-Epple homomorphism of the braid group $B_n$, and generalizations to Artin groups of crystallographic type (24 Dec 2021)

In this paper, we introduce a broad family of group homomorphisms that we name the Gauss-Epple homomorphisms. In the setting of braid groups, the Gauss-Epple invariant was originally defined by Epple based on a note of Gauss as an action of the braid group $B_n$ on the set $\{1, \dots, n\}\times\mathbb{Z}$; we prove that it is well-defined. We consider the associated group homomorphism from $B_n$ to the symmetric group $\text{Sym}(\{1, \dots, n\}\times\mathbb{Z})$. We prove that this homomorphism factors through $\mathbb{Z}^n\rtimes S_n$ (in fact, its image is an order 2 subgroup of the previous group). We also describe the kernel of the homomorphism and calculate the asymptotic probability that it contains a random braid of a given length. Furthermore, we discuss the super-Gauss-Epple homomorphism, a homomorphism which extends the generalization of the Gauss-Epple homomorphism and describe a related 1-cocycle of the symmetric group $S_n$ on the set of antisymmetric $n\times n$ matrices over the integers. We then generalize the super-Gauss-Epple homomorphism and the associated 1-cocycle to Artin groups of finite type. For future work, we suggest studying possible generalizations to complex reflection groups and computing the vector spaces of Gauss-Epple analogues.

289) Valeri Frumkin (MIT) and Rishabh Das (PRIMES), Thermal modulation of fluidic lenses in microgravity (22 Dec 2021)

The fluidic shaping method is an exciting new technology that allows liquids to be rapidly shaped into a wide range of optical topographies with sub-nanometer surface quality. The scale-invariance of the method makes it well suited for space-based fabrication of large fluidic optics. However, in microgravity, the resulting optical topographies are limited to constant mean curvature surfaces. Here we study how variations in surface tension result in deviations from constant mean curvature topographies, allowing one to introduce optical corrections which would not be obtainable otherwise. Under the assumption of small thermal Peclet number, we derive a differential equation governing the steady-state shape of the liquid surface under the effect of spatially varying surface tension. This equation allows us to formulate an inverse problem of finding the required surface-tension distribution for a desired correction. Lastly, we provide several examples for surface tension distributions yielding required aspheric topographies.

288) Yi Liang (PRIMES) and James Unwin (University of Illinois at Chicago), COVID-19 Forecasts via Stock Market Indicators (arXiv.org, 13 Dec 2021)

Reliable short-term forecasting can provide potentially lifesaving insights into logistical planning, and in particular, into the optimal allocation of resources such as hospital staff and equipment. By reinterpreting COVID-19 daily cases in terms of candlesticks, we are able to apply some of the most popular stock market technical indicators to obtain predictive power over the course of the pandemic. By providing a quantitative assessment of MACD, RSI, and candlestick analyses, we show their statistical significance in making predictions for both stock market data and WHO COVID-19 data. In particular, we show the utility of this novel approach by considering the identification of the beginnings of subsequent waves of the pandemic. Finally, our new methods are used to assess whether current health policies are impacting the growth in new COVID-19 cases.
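As an illustration of one of the indicators named above, here is a minimal sketch of MACD applied to a series of daily case counts. The 12/26/9 spans are the conventional stock market defaults and are an assumption here; the abstract does not state which parameters the paper uses. The function names `ema` and `macd` are likewise illustrative.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out


def macd(values, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) for a numeric series.

    macd_line   = EMA(fast) - EMA(slow)
    signal_line = EMA(signal) of macd_line
    histogram   = macd_line - signal_line
    """
    fast_ema = ema(values, fast)
    slow_ema = ema(values, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(line, signal)
    hist = [m - s for m, s in zip(line, signal_line)]
    return line, signal_line, hist


# Hypothetical daily case counts that grow steadily.
daily_cases = [100 + 5 * day for day in range(60)]
line, signal_line, hist = macd(daily_cases)
```

In technical analysis, a MACD line crossing above its signal line is conventionally read as upward momentum. In the setting of the abstract, that kind of crossing would be a candidate marker for the start of a new wave.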

287) Anuj Sakarda, Jerry Tan, and Armaan Tipirneni, On the Distance Spectra of Extended Double Stars (arXiv.org, 6 Dec 2021)

The distance matrix of a connected graph is defined as the matrix in which the entries are the pairwise distances between vertices. The distance spectrum of a graph is the set of eigenvalues of its distance matrix. A graph is said to be determined by its distance spectrum if there does not exist a non-isomorphic graph with the same spectrum. The question of which graphs are determined by their spectrum has been raised in the past, but it remains largely unresolved. In this paper, we prove that extended double stars are determined by their distance spectra.
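The definitions above can be made concrete with a short sketch. It builds the distance matrix by breadth-first search and compares graphs through the characteristic polynomial of that matrix. Two graphs have the same distance spectrum, counted with multiplicity, exactly when these polynomials agree, so exact integer arithmetic avoids floating-point eigenvalue comparisons. The Faddeev-LeVerrier recurrence used here is a choice made for this illustration, not a method taken from the paper.

```python
from collections import deque
from fractions import Fraction


def distance_matrix(adj):
    """adj: dict mapping each vertex to its neighbours (connected graph)."""
    order = sorted(adj)
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    D = [[0] * n for _ in range(n)]
    for s in order:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        for v, d in dist.items():
            D[index[s]][index[v]] = d
    return D


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - A), computed exactly."""
    n = len(A)
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A * M_{k-1} + (previous coefficient) * I
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return [int(c) for c in coeffs]


# Path on three vertices: 0 - 1 - 2.
path3 = {0: [1], 1: [0, 2], 2: [1]}
poly = char_poly(distance_matrix(path3))  # x^3 - 6x - 4
```

For the three-vertex path the polynomial is $x^3 - 6x - 4$, which has $-2$ as a root, so $-2$ belongs to its distance spectrum. Checking whether a graph is determined by its distance spectrum amounts to showing that no non-isomorphic graph yields the same polynomial.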

286) Daniel Xia (PRIMES) and Pei-Ken Hung (University of Minnesota), A Minkowski-type inequality in the AdS-Melvin space (arXiv.org, 19 Nov 2021)

The AdS-Melvin spacetime was introduced by Astorino and models the AdS soliton with electromagnetic charge. It is a static spacetime with a time-symmetric Cauchy hypersurface, which we refer to as the AdS-Melvin space. In this paper, we study a sharp Minkowski-type inequality for surfaces embedded in the AdS-Melvin space. We first prove the inequality for special cases in which the surface enjoys axisymmetry or is a small perturbation of a coordinate torus. We then use a weighted normal flow to show that the inequality holds for general surfaces.

285) Jeremy Yu (PRIMES), Lu Lu (MIT), Xuhui Meng, and George Em Karniadakis, Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems (arXiv.org, 1 Nov 2021), published in Computer Methods in Applied Mechanics and Engineering, vol. 393 (1 April 2022)

Deep learning has been shown to be an effective tool in solving partial differential equations (PDEs) through physics-informed neural networks (PINNs). PINNs embed the PDE residual into the loss function of the neural network, and have been successfully employed to solve diverse forward and inverse PDE problems. However, one disadvantage of the first generation of PINNs is that they usually have limited accuracy even with many training points. Here, we propose a new method, gradient-enhanced physics-informed neural networks (gPINNs), for improving the accuracy and training efficiency of PINNs. gPINNs leverage gradient information of the PDE residual and embed the gradient into the loss function. We tested gPINNs extensively and demonstrated the effectiveness of gPINNs in both forward and inverse PDE problems. Our numerical results show that gPINN performs better than PINN with fewer training points. Furthermore, we combined gPINN with the method of residual-based adaptive refinement (RAR), a method for improving the distribution of training points adaptively during training, to further improve the performance of gPINN, especially in PDEs with solutions that have steep gradients.
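The idea of embedding the residual gradient into the loss can be written out as follows. This is a sketch in assumed notation, since the abstract fixes no symbols: $f$ denotes the PDE residual evaluated on the network output $\hat{u}$, $\mathcal{T}_f$ a set of residual training points, $x_1, \dots, x_d$ the input coordinates, $\mathcal{L}_b$ the boundary and initial condition loss, and $w_{g_i}$ hypothetical weights for the gradient terms.

```latex
\mathcal{L}_{\mathrm{PINN}} = \mathcal{L}_f + \mathcal{L}_b,
\qquad
\mathcal{L}_f = \frac{1}{|\mathcal{T}_f|} \sum_{x \in \mathcal{T}_f} \bigl| f(x; \hat{u}) \bigr|^2,

\mathcal{L}_{\mathrm{gPINN}} = \mathcal{L}_{\mathrm{PINN}}
  + \sum_{i=1}^{d} w_{g_i} \, \frac{1}{|\mathcal{T}_f|}
    \sum_{x \in \mathcal{T}_f} \left| \frac{\partial f}{\partial x_i}(x; \hat{u}) \right|^2 .
```

The motivation for the extra terms is that an exact solution makes the residual vanish identically, so its partial derivatives vanish as well. Penalizing them therefore adds constraints without changing the set of exact minimizers.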

284) Felix Gotti (MIT) and Bangzheng Li (PRIMES), Atomic semigroup rings and the ascending chain condition on principal ideals (arXiv.org, 30 Oct 2021), published in Proceedings of the American Mathematical Society 151 (2023): 2291-2302

An integral domain is called atomic if every nonzero nonunit element factors into irreducibles. On the other hand, an integral domain is said to satisfy the ascending chain condition on principal ideals (ACCP) if every ascending chain of principal ideals terminates. It was asserted by Cohn back in the sixties that every atomic domain satisfies the ACCP, but such an assertion was refuted by Grams in the seventies with an explicit construction of a neat example. Still, atomic domains without the ACCP are notoriously elusive, and just a few classes have been found since Grams' first construction. In the first part of this paper, we generalize Grams' construction to provide new classes of atomic domains without the ACCP. In the second part of this paper, we construct what seems to be the first atomic semigroup ring without the ACCP in the existing literature.

283) Karthik Seetharaman, William Yue, and Isaac Zhu, Patterns in the Lattice Homology of Seifert Homology Spheres (arXiv.org, 26 Oct 2021)

In this paper, we study various homology cobordism invariants for Seifert fibered integral homology 3-spheres derived from Heegaard Floer homology. Our main tool is lattice homology, a combinatorial invariant defined by Ozsváth-Szabó and Némethi. We reprove the fact that the $d$-invariants of Seifert homology spheres $\Sigma(a_1,a_2,\dots,a_n)$ and $\Sigma(a_1,a_2,\dots,a_n+a_1a_2\cdots a_{n-1})$ are the same using an explicit understanding of the behavior of the numerical semigroup minimally generated by $a_1a_2\cdots a_n/a_i$ for $i\in[1,n]$. We also study the maximal monotone subroots of the lattice homologies, another homology cobordism invariant introduced by Dai and Manolescu. We show that the maximal monotone subroots of the lattice homologies of Seifert homology spheres $\Sigma(a_1,a_2,\dots,a_n)$ and $\Sigma(a_1,a_2,\dots,a_n+2a_1a_2\cdots a_{n-1})$ are the same.

282) Christian Gaetz (MIT) and Ram K. Goel (PRIMES), Products of reflections in smooth Bruhat intervals (arXiv.org, 25 Oct 2021)

A permutation is called smooth if the corresponding Schubert variety is smooth. Gilboa and Lapid prove that in the symmetric group, multiplying the reflections below a smooth element $w$ in Bruhat order in a compatible order yields back the element $w$. We strengthen this result by showing that such a product in fact determines a saturated chain $e \to w$ in Bruhat order, and that this property characterizes smooth elements.

281) Yash Agarwal (PRIMES) and Sarah Greer (MIT), Convolutional encoder decoder network for the removal of coherent seismic noise (arXiv.org, 25 Oct 2021)

Seismologists often need to gather information about the subsurface structure of a location to determine if it is fit to be drilled for oil. However, there may be electrical noise in seismic data which is often removed by disregarding certain portions of the data with the use of a notch filter. Instead, we use a convolutional encoder decoder network to remove such noise by training the network to take the noisy shot record as input and remove the noise from the shot record as output. In this way, we retain important information about the data collected while still removing coherent noise in seismic data.

280) Sophia Benjamin, Arushi Mantri, and Quinn Perian, On the Wasserstein Distance Between $k$-Step Probability Measures on Finite Graphs (arXiv.org, 20 Oct 2021)

We consider random walks $X,Y$ on a finite graph $G$ with respective lazinesses $\alpha, \beta \in [0,1]$. Let $\mu_k$ and $\nu_k$ be the $k$-step transition probability measures of $X$ and $Y$. In this paper, we study the Wasserstein distance between $\mu_k$ and $\nu_k$ for general $k$. We consider the sequence formed by the Wasserstein distance at odd values of $k$ and the sequence formed by the Wasserstein distance at even values of $k$. We first establish that these sequences always converge, and then we characterize the possible values for the sequences to converge to. We further show that each of these sequences is either eventually constant or converges at an exponential rate. By analyzing the cases of different convergence values separately, we are able to partially characterize when the Wasserstein distance is constant for sufficiently large $k$.

279) Sheryl Hsu (PRIMES), Fidel I. Schaposnik Massolo (Université Libre de Bruxelles), and Laura P. Schaposnik (University of Illinois at Chicago), The Power of Many: A Physarum Swarm Steiner Tree Algorithm (arXiv.org, 15 Oct 2021)

We create a novel Physarum Steiner algorithm designed to solve the Euclidean Steiner tree problem. Physarum is a unicellular slime mold with the ability to form networks and fuse with other Physarum organisms. We use the simplicity and fusion of Physarum to create large swarms which independently operate to solve the Steiner problem. The Physarum Steiner tree algorithm then utilizes a swarm of Physarum organisms which gradually find terminals and fuse with each other, sharing intelligence. The algorithm is also highly capable of solving the obstacle avoidance Steiner tree problem and is a strong alternative to the current leading algorithm. The algorithm is of particular interest due to its novel approach, rectilinear properties, and ability to run on varying shapes and topological surfaces.

278) Alexander Tianlin Hu (PRIMES) and Andrey Boris Khesin (MIT), Improved Graph Formalism for Quantum Circuit Simulation (arXiv.org, 20 Sep 2021)

Improving the simulation of quantum circuits on classical computers is important for understanding quantum advantage and increasing development speed. In this paper, we explore a new way to express stabilizer states and further improve the speed of simulating stabilizer circuits with a current existing approach. First, we discover a unique and elegant canonical form for stabilizer states based on graph states to better represent stabilizer states and show how to efficiently simplify stabilizer states to canonical form. Second, we develop an improved algorithm for graph state stabilizer simulation and establish limitations on reducing the quadratic runtime of applying controlled-Pauli $Z$ gates. We do so by creating a simpler formula for combining two Pauli-related stabilizer states into one. Third, to better understand the linear dependence of stabilizer states, we characterize all linearly dependent triplets, revealing symmetries in the inner products. Using our novel controlled-Pauli $Z$ algorithm, we improve runtime for inner product computation from $O(n^3)$ to $O(nd^2)$ where $d$ is the maximum degree of the graph.

277) Sophie Zhu, Factorizations in evaluation monoids of Laurent semirings (arXiv.org, 26 Aug 2021), published in Communications in Algebra 50:6 (2022): 2719-2730

For a positive real number $α$, let $\mathbb{N}_0[α,α^{-1}]$ be the semiring of all real numbers $f(α)$ for $f(x)$ lying in $\mathbb{N}_0[x,x^{-1}]$, which is the semiring of all Laurent polynomials over the set of nonnegative integers $\mathbb{N}_0$. In this paper, we study various factorization properties of the additive structure of $\mathbb{N}_0[α, α^{-1}]$. We characterize when $\mathbb{N}_0[α, α^{-1}]$ is atomic. Then we characterize when $\mathbb{N}_0[α, α^{-1}]$ satisfies the ascending chain condition on principal ideals in terms of certain well-studied factorization properties. Finally, we characterize when $\mathbb{N}_0[α, α^{-1}]$ satisfies the unique factorization property and show that, when this is not the case, $\mathbb{N}_0[α, α^{-1}]$ has infinite elasticity.

276) Felix Gotti (MIT) and Bangzheng Li (PRIMES), Divisibility in rings of integer-valued polynomials (arXiv.org, 25 July 2021), published in The New York Journal of Mathematics 28 (2022): 117–139

In this paper, we address various aspects of divisibility by irreducibles in rings consisting of integer-valued polynomials. An integral domain is called atomic if every nonzero nonunit factors into irreducibles. Atomic domains that do not satisfy the ascending chain condition on principal ideals (ACCP) have proved to be elusive, and not many of them have been found since the first one was constructed by A. Grams in 1974. Here we exhibit the first class of atomic rings of integer-valued polynomials without the ACCP. An integral domain is called a finite factorization domain (FFD) if it is simultaneously atomic and an idf-domain (i.e., every nonzero element is divisible by only finitely many irreducibles up to associates). We prove that a ring is an FFD if and only if its ring of integer-valued polynomials is an FFD. In addition, we show that neither being atomic nor being an idf-domain transfer, in general, from an integral domain to its ring of integer-valued polynomials. In the same class of rings of integer-valued polynomials, we consider further properties that are defined in terms of divisibility by irreducibles, including being Cohen-Kaplansky and being Furstenberg.

275) Beining Zhou, High-Order Sensor Array Geometries for Improved Direction of Arrival Estimation in Signal Processing (9 July 2021)

In signal processing, direction of arrival (DOA) estimation is a central problem in locating the source of a signal. It is applied extensively in wireless communication systems such as radar and GPS, in medical imaging, in telescopes, etc. Devising a signal sensor array geometry that achieves a higher degree of freedom (DOF) has been a crucial challenge in improving the efficiency of DOA estimation. Recently, high-order cumulants have been used extensively to construct high-order sensor arrays, but the state-of-the-art high-order arrays are not optimal. This paper proposes novel sensor array geometries, the high-order embedded arrays (HOEA), for the 4th and 6th orders and then extends those arrays to the 2$q$th order by layering. Compared to previous methods, the proposed HOEA significantly improves the DOF generation from $O(2^{q}N^{2q})$ to $O(17^{q/3}N^{2q})$, which increases the theoretical efficiency by $25\%$ in the 4th order, $113\%$ in the 6th, and $352\%$ in the 12th order.

274) Benjamin Chen, Practical Anonymity Sets in a Pseudonymous Forum Setting (6 July 2021)

Pseudonymous forums are online websites where users can post publicly visible content and participate in discussions under a pseudonym. Such forums are not perfectly private, as their privacy can be compromised by traffic analysis attacks. However, many methods of providing perfect privacy to such a system come with a heavy performance cost, whether in bandwidth or latency. We examine the practicality of anonymity sets, a defense against such attacks that can still provide a formal privacy guarantee with smaller performance losses, and attempt to simulate their implementation in a real-world setting using real data scraped from Reddit, a popular pseudonymous forum. We try various methods of creating these anonymity sets, finding that K-means with some dimensionality compression yields decent results; we also propose a method of defining a common traffic budget for members of a set. We find that anonymity sets are a feasible defense against such attacks in the pseudonymous forum setting.

273) Matthew Ding, An Analysis of Multi-hop Iterative Approximate Byzantine Consensus with Local Communication (27 June 2021)

Iterative Approximate Byzantine Consensus (IABC) is a fundamental problem of fault-tolerant distributed computing where machines seek to achieve approximate consensus to arbitrary exactness in the presence of Byzantine failures. We present a novel algorithm for this problem, named Relay-IABC, which relies on the usage of a multi-hop relayed messaging system and cryptographically secure message signatures. The use of signatures and relays allows the strict necessary network conditions of traditional IABC algorithms to be circumvented. In addition, we show evidence that Relay-IABC achieves faster convergence than traditional algorithms even under these strict network conditions with both theoretical analysis and experimental results.

272) Jason Yang (PRIMES), Jun Wan (MIT), and Hanshen Xiao (MIT), Decentralized Gradient Descent: how network structure affects convergence (26 June 2021)

We investigate decentralized gradient descent among a network of nodes where an adversary has corrupted certain nodes. We focus on the case where the utility functions of all nodes are 1-dimensional quadratics, and where each corrupted node is connected to all honest nodes.

271) Sheryl Hsu (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Cell fusion through slime mold network dynamics (arXiv.org, 21 June 2021)

Physarum Polycephalum is a unicellular slime mold that has been intensely studied due to its ability to solve mazes, find shortest paths, generate Steiner trees, share knowledge, remember past events, and its applications to unconventional computing. The CELL model is a unicellular automaton introduced in the recent work of Gunji et al. in 2008, that models Physarum's amoeboid motion, tentacle formation, maze solving, and network creation. In the present paper, we extend the CELL model by spawning multiple CELLs, allowing us to understand the interactions between multiple cells, and in particular, their mobility, merge speed, and cytoplasm mixing. We conclude the paper with some notes about applications of our work to modeling the rise of present day civilization from the early nomadic humans and the spread of trends and information around the world. Our study of the interactions of this unicellular organism should further the understanding of how Physarum Polycephalum communicates and shares information.

270) Linda Chen, Communication Complexity of Byzantine Broadcast (19 June 2021)

Byzantine Broadcast is a fundamental problem in distributed computing, with communication complexity being an important aspect of Byzantine Broadcast protocols. In Byzantine Broadcast, a designated leader must ensure that all honest users in a distributed system reach a consensus, even in the presence of some dishonest users. Previous works have shown an $O(n^2)$ lower bound on communication complexity, as well as protocols with $O(n^2)$ communication complexity for the honest majority scenario. In this paper, we review the previous work and provide various methods and intuition towards a possible $O(n^3)$ communication complexity lower bound for dishonest majority Byzantine Broadcast.

2020 Research Papers

269) Varun Suraj (PRIMES), Catherine Del Vecchio Fitz, Laura Kleiman, Suresh Bhavnani, Chinmay Jani, Surbhi Shah, Rana McKay, Jeremy Warner, and Gil Alterovitz, SMART COVID Navigator, a Clinical Decision Support Tool for COVID-19 Treatment: Design and Development Study, published in Journal of Medical Internet Research 24, no. 2 (18 Feb 2022)

COVID-19 caused by SARS-CoV-2 has infected 219 million individuals at the time of writing of this paper. A large volume of research findings from observational studies about disease interactions with COVID-19 is being produced almost daily, making it difficult for physicians to keep track of the latest information on COVID-19’s effect on patients with certain pre-existing conditions.

268) Ayshwarya Subramanian (Broad Institute), Mikhail Alperovich (PRIMES), Yiming Yang, and Bo Li, Biology-inspired data-driven quality control for scientific discovery in single-cell transcriptomics (bioRxiv.org, 28 Oct 2021)

Quality control (QC) of cells, a critical step in single-cell RNA sequencing data analysis, has largely relied on arbitrarily fixed data-agnostic thresholds on QC metrics such as gene complexity and fraction of reads mapping to mitochondrial genes. The few existing data-driven approaches perform QC at the level of samples or studies without accounting for biological variation in the commonly used QC criteria. We demonstrate that the QC metrics vary both at the tissue and cell state level across technologies, study conditions, and species. We propose data-driven QC (ddqc), an unsupervised adaptive quality control framework that performs flexible and data-driven quality control at the level of cell states while retaining critical biological insights and improved power for downstream analysis. On applying ddqc to 6,228,212 cells and 835 mouse and human samples, we retain a median of 39.7% more cells when compared to conventional data-agnostic QC filters. With ddqc, we recover biologically meaningful trends in gene complexity and ribosomal expression among cell-types enabling exploration of cell states with minimal transcriptional diversity or maximum ribosomal protein expression. Moreover, ddqc allows us to retain cell-types often lost by conventional QC such as metabolically active parenchymal cells, and specialized cells such as neutrophils or gastric chief cells. Taken together, our work proposes a revised paradigm to quality filtering best practices - iterative QC, providing a data-driven quality control framework compatible with observed biological diversity.

267) Robert H. Dolin, Shaileshbhai R. Gothi, Aziz Boxwala, Bret S. E. Heale, Ammar Husami, James Jones, Himanshu Khangar, Shubham Londhe, Frank Naeymi-Rad, Soujanya Rao, Barbara Rapchak, James Shalaby, Varun Suraj (PRIMES), Ning Xie, Srikar Chamala & Gil Alterovitz, vcf2fhir: a utility to convert VCF files into HL7 FHIR format for genomics-EHR integration, published in BMC Bioinformatics 22, article No. 104 (2 Mar 2021)

VCF formatted files are the lingua franca of next-generation sequencing, whereas HL7 FHIR is emerging as a standard language for electronic health record interoperability. A growing number of FHIR-based clinical genomics applications are emerging. Here, we describe an open source utility for converting variants from VCF format into HL7 FHIR format.

266) Quanlin Chen, The Center of the $q$-Weyl Algebra over Rings with Torsion (23 Jan 2021)

We compute the centers of the Weyl algebra, $q$-Weyl algebra, and the "first $q$-Weyl algebra" over the quotient of the ring $\mathbb{Z}/p^N \mathbb{Z}[q]$ by some polynomial $P(q)$. Through this, we generalize and "quantize" part of a result by Stewart and Vologodsky on the center of the ring of differential operators on a smooth variety over $\mathbb{Z}/p^N \mathbb{Z}$. We prove that a corresponding Witt vector structure appears for general $P(q)$ and compute the extra terms for special $P(q)$ with particular properties, answering a question by Bezrukavnikov of possible interpolation between two known results.

265) Tanisha Saxena and Daniel Xu, Graph Alignment-Based Protein Comparison (23 Jan 2021)

Inspired by the question of identifying mechanisms of viral infection, we are interested in the problem of comparing pairs of proteins, given by amino acid sequences and traces of their 3-dimensional structure. While it is true that the problem of predicting and comparing protein function is one of the most famous unsolved problems in computational biology, we propose a heuristic which poses it as a simple alignment problem, which - after some linear-algebraic pre-processing - is amenable to a dynamic programming solution.

264) Andrew Cai, Ratios of Naruse-Newton Coefficients Obtained from Descent Polynomials (arXiv.org, 20 Jan 2021)

We study Naruse-Newton coefficients, which are obtained from expanding descent polynomials in a Newton basis introduced by Jiradilok and McConville. These coefficients $C_0, C_1, \ldots$ form an integer sequence associated to each finite set of positive integers. For fixed nonnegative integers $a<b$, we examine the set $R_{a, b}$ of all ratios $\frac{C_a}{C_b}$ over finite sets of positive integers. We characterize finite sets for which $\frac{C_a}{C_b}$ is minimized and provide a construction to prove $R_{a, b}$ is unbounded above. We use this construction to obtain results on the closure of $R_{a, b}$. We also examine properties of Naruse-Newton coefficients associated with doubleton sets, such as unimodality and log-concavity. Finally, we find an explicit formula for all ratios $\frac{C_a}{C_b}$ of Naruse-Newton coefficients associated with ribbons of staircase shape.

263) Ishan Levy (MIT) and Justin Wu (PRIMES), The Borel Cohomology of Free Iterated Loop Spaces (16 Jan 2021; arXiv.org, 28 May 2021), forthcoming in Algebraic & Geometric Topology

We compute the $\text{SO}(n+1)$-equivariant mod $2$ Borel cohomology of the free iterated loop space $Z^{S^n}$ when $n$ is odd and $Z$ is a product of mod $2$ Eilenberg-Mac Lane spaces. When $n=1$, this recovers Ottosen and Bökstedt's computation for the free loop space. The highlight of our computation is a construction of cohomology classes using an $O(n)$-equivariant evaluation map and a pushforward map. We then reinterpret our computation as giving a presentation of the zeroth derived functor of the Borel cohomology of $Z^{S^n}$ for arbitrary $Z$. We also include an appendix where we give formulas for computing the zeroth derived functor of the cohomology of mapping spaces, and study the dependence of such derived functors on the Steenrod operations.

262) Linda Chen, Reducing Round Complexity of Byzantine Broadcast (15 Jan 2021)

Byzantine Broadcast is an important topic in distributed systems, and improving its round complexity has long been a focus of research. Under honest majority, the state of the art for Byzantine Broadcast is 10 rounds for a static adversary and 16 rounds for an adaptive adversary. In this paper, we present a Byzantine Broadcast protocol with expected 8 rounds under a static adversary and expected 10 rounds under an adaptive adversary. We also generalize our idea to the dishonest majority setting and achieve an improvement over existing protocols.

261) Zarathustra Brady (MIT) and Holden Mui (PRIMES), Symmetric Operations on Domains of Size at Most 4 (15 Jan 2021)

To convert a fractional solution to an instance of a constraint satisfaction problem into a solution, a rounding scheme is needed, which can be described by a collection of symmetric operations with one of each arity. An intriguing possibility, raised in a recent paper by Carvalho and Krokhin, would imply that any clone of operations on a set $D$ which contains symmetric operations of arities $1, 2, \ldots, |D|$ contains symmetric operations of all arities in the clone. If true, then it is possible to check whether any given family of constraint satisfaction problems is solved by its linear programming relaxation. We characterize all idempotent clones containing symmetric operations of arities $1, 2, \ldots, |D|$ for all sets $D$ with size at most four and prove that each one contains symmetric operations of every arity, proving the conjecture above for $|D|{\leq}4$.

260) Yuxiao Wang, Asymptotics for Iterating the Lusztig-Vogan Bijection for $GL_n$ on Dominant Weights (15 Jan 2021)

In this paper, we iterate the explicit algorithm computing the Lusztig-Vogan bijection in Type $A$ ($GL_n$) on dominant weights, which was proposed by Achar and simplified by Rush. Our main result focuses on describing asymptotic behavior between the number of iterations for an input and the length of the input; we also present a recursive formula to compute the slope of the asymptote. This serves as another contribution to understanding the Lusztig-Vogan bijection from a combinatorial perspective and a first step in understanding the iterative behavior of the Lusztig-Vogan bijection in Type $A$.

259) Quanlin Chen, Tianze Jiang, and Yuxiao Wang, On the Generational Behavior of Gaussian Binomial Coefficients at Roots of Unity (15 Jan 2021)

The generational behavior of Gaussian binomial coefficients at roots of unity shadows the relationship between the reductive algebraic group in prime characteristic and the quantum group at roots of unity. In this paper, we study three ways of obtaining integer values from Gaussian binomial coefficients at roots of unity. We rigorously define the generations in this context and prove such behavior at prime power and two times prime power roots of unity. Moreover, we investigate and make conjectures on the vanishing, valuation, and sign behavior under the big picture of generations.

258) Fiona Abney-McPeek, Serena An, and Jakin Ng, The Stembridge Equality for Skew Stable Grothendieck Polynomials and Skew Dual Stable Grothendieck Polynomials (15 Jan 2021; arXiv.org, 9 Feb 2021)

The Schur polynomials $s_{\lambda}$ are essential in understanding the representation theory of the general linear group. They also describe the cohomology ring of the Grassmannians. For $\rho = (n, n-1, \dots, 1)$ a staircase shape and $\mu \subseteq \rho$ a subpartition, the Stembridge equality states that $s_{\rho/\mu} = s_{\rho/\mu^T}$. This equality provides information about the symmetry of the cohomology ring. The stable Grothendieck polynomials $G_{\lambda}$, and the dual stable Grothendieck polynomials $g_{\lambda}$, developed by Buch, Lam, and Pylyavskyy, are variants of the Schur polynomials and describe the $K$-theory of the Grassmannians. Using the Hopf algebra structure of the ring of symmetric functions and a generalized Littlewood-Richardson rule, we prove that $G_{\rho/\mu} = G_{\rho/\mu^T}$ and $g_{\rho/\mu} = g_{\rho/\mu^T}$, the analogues of the Stembridge equality for the skew stable and skew dual stable Grothendieck polynomials.

257) Samuel H. Florin (PRIMES), Matthew H. Ho (PRIMES), and Zilin Jiang (MIT), On the binary adder channel with complete feedback, with an application to quantitative group testing (arXiv.org, 25 Jan 2021), published in IEEE Transactions on Information Theory 68:5 (May 2022): 2839-2856

We determine the exact value of the optimal symmetric rate point in the Dueck zero-error capacity region of the binary adder channel with complete feedback. Our motivation is a problem in quantitative group testing. Given a set of $n$ elements two of which are defective, the quantitative group testing problem asks for the identification of these two defectives through a series of tests. Each test gives the number of defectives contained in the tested subset, and the outcomes of previous tests are assumed known at the time of designing the current test. We establish that the minimum number of tests is asymptotic to $(\log_2 n) / r$, where the constant $r \approx 0.78974$ lies strictly between the lower bound $5/7 \approx 0.71428$ due to Gargano et al. and the information-theoretic upper bound $(\log_2 3) / 2 \approx 0.79248$.

256) Adithya Balachandran, Andrew Huang, and Siwen Sun, Product Expansions of q-Character Polynomials (15 Jan 2021)

We consider certain class functions defined simultaneously on the groups $Gl_n(\mathbb{F}_q)$ for all $n$, which we also interpret as statistics on matrices. It has been previously shown that these simultaneous class functions are closed under multiplication, and we work towards computing the structure constants of this ring of functions. We derive general criteria for determining which statistics have nonzero expansion coefficients in the product of two fixed statistics. To this end, we introduce an algorithm that computes expansion coefficients in general, which we furthermore use to give closed form expansions in some cases. We conjecture that certain indecomposable statistics generate the whole ring, and indeed prove this to be the case for statistics associated with matrices consisting of up to 2 Jordan blocks. The coefficients we compute exhibit surprising stability phenomena, which in turn reflect stabilizations of joint moments as well as multiplicities in the irreducible decomposition of tensor products of representations of finite general linear groups.

255) Daniel Hong, Hyunwoo Lee, and Alex Wei, Optimal solutions and ranks in the max-cut SDP (15 Jan 2021)

The max-cut problem is a classical graph theory problem which is NP-complete. The best polynomial time approximation scheme relies on semidefinite programming (SDP). We study the conditions under which graphs of certain classes have rank 1 solutions to the max-cut SDP. We apply these findings to look at how solutions to the max-cut SDP behave under simple combinatorial constructions. Our results determine when solutions to the max-cut SDP for cycle graphs are rank 1. We find the solutions to the max-cut SDP of the vertex sum of two graphs. We then characterize the SDP solutions upon joining two triangle graphs by an edge sum.

254) Sam Florin, Matthew Ho, and Rahul Thomas, Group testing for two defectives and the zero-error channel capacity (14 Jan 2021)

The issue of identifying defects in a set with as few tests as possible has many applications, including in maximum efficiency pool testing during the COVID-19 pandemic. This research aims to determine the rate of growth of the number of tests required relative to the logarithm of the size of the set. In particular, we focus on the case where there are exactly two defects in the set, which is equivalent to the problem of determining the zero-error capacity of a two-user binary adder channel with complete feedback. The channel capacity is given by a non-linear optimization problem involving entropy functions, whose optimal value remains unknown. In this paper, using the linear dependence technique, we are able to reduce the complexity of the optimization problem significantly. We also gather numerical evidence for the conjectured optimal value.

253) Sarah Chen, In silico prediction of retained intron-derived neoantigens in leukemia (8 Jan 2021)

Alternative splicing (AS) is critical for the regulation and diversification of gene expression. Conversely, splicing dysregulation, caused by mutations in splicing machinery or splice junctions, is a hallmark of cancer. Tumor-specific isoforms are a potential source of neoantigens, cancer-specific peptides presented by human leukocyte antigen (HLA) class I molecules and potentially recognized by T cells. For cancers such as acute myeloid leukemia (AML) with a low mutation burden but widespread splicing aberrations, splice variants, and retained introns (RIs) in particular, may broaden the number of suitable targets for immunotherapy. I developed a computational pipeline to predict AS-derived neoepitopes from tumor RNA-Seq. I first used the B721.221 B cell line as a model system, for which RNA-Seq, Ribo-Seq, and immunoproteome data from >90 HLA class I monoallelic lines were available. I performed de novo transcriptome assembly with StringTie, identifying on average 694±73 AS isoforms across 4 technical replicates. Using HLAthena, I identified 1,087 AS-derived neoepitopes predicted to bind across 4 frequent HLA alleles. Of them, 192 (18%) also displayed evidence of mRNA translation, measured as the alignment of ≥1 Ribo-Seq. To further increase prediction accuracy, I am currently analyzing the HLA I immunopeptidome to define the features of predicted AS isoforms more likely to be not only translated but also HLA presented. Finally, I applied my prediction pipeline to AML cell lines ($n=8$) and primary samples ($n=7$). I identified 682±113 AS isoforms in AML cell lines, similar to the 694 in B721, but the proportion of isoforms containing RIs (as opposed to alternative 5' and 3' splice sites or cassette exons) was 3.5x higher than in B721, in line with the biological relevance of RIs in particular in this disease setting.
Primary AML samples yielded 1496±294 AS isoforms, more than twofold the number in B721 or AML cell lines, thus reinforcing the significant contribution of AS to the cancer immunopeptidome. Accurate prediction of AS-derived neoantigens through this pipeline will contribute to the design of novel cancer immunotherapies.

252) Kenta Suzuki (PRIMES) and Michael E. Zieve (University of Michigan), Meromorphic functions with the same preimages at several finite sets (31 Dec 2020)

Let $p$ and $q$ be nonconstant meromorphic functions on $\mathbb{C}^m$. We show that if $p$ and $q$ have the same preimages as one another, counting multiplicities, at each of four nonempty pairwise disjoint subsets $S_1,\ldots,S_4$ of $ \mathbb{C}$, then $p$ and $q$ have the same preimages as one another at each of infinitely many subsets of $ \mathbb{C}$, and moreover $g(p)=g(q)$ for some nonconstant rational function $g(x)$ whose degree is bounded in terms of the sizes of the $S_i$'s. This result is new already when $m=1$, and it implies many previous results about the extent to which a meromorphic function is determined by its preimages of a few points or a few small sets.

251) Yavor Litchev and Abigail Thomas, Hybrid Privacy Scheme (31 Dec 2020)

Local Differential Privacy (LDP) is an approach that allows a central server to compute on data submitted by multiple users while maintaining the privacy of each user. LDP is a very efficient approach to security; however, as privacy increases, the accuracy of these computations decreases. Multi-Party Computation (MPC) is a process by which multiple parties work together to compute the output of a function without revealing their own information. MPC is highly secure and accurate for such computations, but it is very computationally expensive and slow. The proposed hybrid privacy model harnesses the benefits of both LDP and MPC to create a secure, accurate, and fast algorithm for machine learning.

250) Ho Tin Fan and Alvin Lu, Parallel Batch-Dynamic 3-Vertex Subgraph Maintenance (31 Dec 2020)

Counting certain subgraphs is a fundamental problem that is crucial in recognizing patterns in large graphs, such as social networks and biological interactomes. However, many real-world graphs are constantly evolving and are subject to changes over time, and previous work on efficient parallel subgraph counting algorithms either does not support dynamic modifications or does not extend to general subgraphs. This paper presents a theoretically efficient and demonstrably fast algorithm for parallel batch-dynamic 3-vertex subgraph counting, and the underlying data structure can be extended to maintain 4-vertex subgraph counts as well. The algorithm maintains the $h$-index of the graph, or the maximum $h$ such that the graph contains $h$ vertices with degree at least $h$, and uses this to update subgraph counts through an efficient traversal of two-paths, or wedges. For a batch of size $b$, the algorithm takes $O(bh)$ expected amortized work and $O(\log(bh))$ span with high probability.
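
The $h$-index and wedge traversal mentioned in this abstract can be illustrated with a small static, sequential sketch. This is not the paper's parallel batch-dynamic data structure; the function names `h_index`, `count_wedges`, and `count_triangles` and the adjacency-dictionary representation are assumptions made only for this example.

```python
def h_index(adj):
    """Largest h such that at least h vertices have degree >= h."""
    degrees = sorted((len(nbrs) for nbrs in adj.values()), reverse=True)
    h = 0
    for rank, deg in enumerate(degrees, start=1):
        if deg >= rank:
            h = rank
        else:
            break
    return h


def count_wedges(adj):
    """Number of two-paths (wedges): pick 2 neighbors around each center."""
    return sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())


def count_triangles(adj):
    """Count triangles by testing whether each wedge's endpoints are adjacent."""
    closed = 0
    for nbrs in adj.values():
        ordered = sorted(nbrs)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                if b in adj[a]:
                    closed += 1
    return closed // 3  # every triangle is found once per center vertex


# Hypothetical example graph: a triangle 0-1-2 with a pendant vertex 3 on 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On the example graph, two vertices have degree at least 2 but no three vertices have degree at least 3, so the $h$-index is 2.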

249) Kevin Edward Zhao (PRIMES), Vladislav Lialin & Anna Rumshisky (UMass Lowell), Text Is an Image: Augmentation via Embedding Mixing (30 Dec 2020)

Data augmentation techniques are essential for computer vision, yielding significant accuracy improvements with little engineering cost. However, data augmentation for text has always been tricky. Synonym replacement techniques require a good thesaurus and domain-specific rules for synonym selection from the synset, while backtranslation techniques are computationally expensive and require a good translation model for the language of interest.
In this paper, we present simple text augmentation techniques on the embeddings level, inspired by mixing-based image augmentations. These techniques are language-agnostic and require little to no hyperparameter tuning. We evaluate the augmentation techniques on IMDB and GLUE tasks, and the results show that the augmentations significantly improve the score of the RoBERTa model.
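
To make the idea of mixing at the embedding level concrete, here is a minimal sketch of one mixing-based augmentation from the image literature (mixup-style linear interpolation) applied to token embeddings. The abstract does not say which mixing operations the authors use, so the interpolation rule, the function name `mix_embeddings`, and the list-of-lists representation are all assumptions for illustration.

```python
def mix_embeddings(emb_a, emb_b, label_a, label_b, lam):
    """Mixup-style convex combination of two embedded, equal-length texts.

    emb_a, emb_b: lists of token embedding vectors (lists of floats).
    label_a, label_b: one-hot (or soft) label vectors.
    lam: mixing weight in [0, 1] given to the first example.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("sequences must be padded to the same length")
    mixed = [
        [lam * x + (1.0 - lam) * y for x, y in zip(tok_a, tok_b)]
        for tok_a, tok_b in zip(emb_a, emb_b)
    ]
    mixed_label = [lam * p + (1.0 - lam) * q for p, q in zip(label_a, label_b)]
    return mixed, mixed_label


# Toy 2-token, 2-dimensional embeddings (made-up numbers).
mixed, soft_label = mix_embeddings(
    [[1.0, 0.0], [0.0, 2.0]],
    [[3.0, 4.0], [2.0, 0.0]],
    [1.0, 0.0],
    [0.0, 1.0],
    lam=0.5,
)
```

Because the operation acts on vectors rather than words, a sketch like this needs no thesaurus or translation model, which is the sense in which such techniques are language-agnostic.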

248) Alvin Chen (PRIMES) and Kai Huang (MIT), Alpha invariants of $K$-semistable smooth toric Fano varieties (29 Dec 2020)

Jiang conjectured that the $\alpha$-invariant for $n$-dimensional $K$-semistable smooth Fano varieties has a gap between $\frac{1}{n}$ and $\frac{1}{n+1}$, where $\frac{1}{n+1}$ can only be achieved by projective $n$-space. Assuming a weaker version of Ewald's conjecture, we prove this gap conjecture in the toric case. We also prove a necessary and sufficient classification for all possible values of the $\alpha$-invariant for $K$-semistable smooth toric Fano varieties by providing an explicit construction of the polytopes that can achieve these values. This provides an important step towards understanding the types of polytopes that correspond to particular values of the $\alpha$-invariant; in particular, we show that $K$-semistable smooth Fano polytopes are centrally symmetric if and only if they have an $\alpha$-invariant of $\frac{1}{2}$. Lastly, we examine the effects of the Picard number on the $\alpha$-invariant, classifying the $K$-semistable smooth toric Fano varieties with Picard number 1 or 2 and their $\alpha$-invariants.

247) Vishnu Emani (PRIMES), Klaus Schmitz-Abe and Pankaj Agrawal (Boston Children's Hospital), Statistical Ranking Model for Candidate Genes in Rare Genetic Disorders (28 Dec 2020)

Genetic mutations are responsible for a significant number of rare diseases, and so investigating the genetic basis of various rare diseases has been a crucial area of study. More specifically, studying variants in the exome, the protein coding region which makes up approximately 1% of the human genome, has been proven effective at identifying the most likely pathogenic variants. The advent of whole exome and whole genome sequencing facilitates identification of the most likely pathogenic mutations much more efficiently and on a greater scale. Next-generation sequencing has been growing rapidly in the past decade and has led to numerous successful disease-detection pipelines. The pipeline involved in this study was the Variant Explorer Pipeline (VExP), developed by our laboratory to improve diagnostic yield. In the VExP pipeline, genetic variants are filtered based on a variety of criteria, which can be divided into the categories of genotype data and phenotype data (Figure 1). After the filtering process, the most likely variants are isolated, a process which requires meticulous examination of a large number of mutations. Furthermore, determining the strength of a phenotype match presents challenges because a number of resources need to be consulted to make an informed decision. The purpose of this project was to develop an automated algorithm, using a host of parameters, to rank mutation candidates based on the two computed scores for pathogenicity.

246) Neil Chowdhury, Modeling the Effect of Histone Methylation on Chromosomal Organization in Colon Cancer Cells (27 Dec 2020)

Loop extrusion and compartmentalization are the two most important processes regulating the high-level organization of DNA in the cell nucleus. These processes are largely believed to be independent and competing. Chromatin consists of nucleosomes, which contain coils of DNA wrapped around histone proteins. Besides packing DNA, nucleosomes contain an "epigenetic code" - tails of histone proteins are chemically modified at certain positions to leave certain "histone marks" on the chromatin fiber. This paper explores the effect of the H3K9me3 histone modification, which typically corresponds to inactive and repressed chromatin, on genome structure. Interestingly, in H3K9me3 domains, there are far fewer topologically associating domains (TADs) than in other domains, and there is a unique compartmentalization pattern. A high-resolution polymer model simulating both loop extrusion and compartmentalization is created to explore these differences.

245) Daniel Xu, Modeling of Network Based Digital Contact Tracing and Testing Strategies for the COVID-19 Pandemic (26 Dec 2020; arXiv.org, 28 Dec 2020), published in Mathematical Biosciences, vol. 338 (August 2021)

With more than 1.7 million COVID-19 deaths, identifying effective measures to prevent COVID-19 is a top priority. We developed a mathematical model to simulate the COVID-19 pandemic with digital contact tracing and testing strategies. The model uses a real-world social network generated from a high-resolution contact data set of 180 students. This model incorporates infectivity variations, test sensitivities, incubation period, and asymptomatic cases. We present a method to extend the weighted temporal social network and present simulations on a network of 5000 students. The purpose of this work is to investigate optimal quarantine rules and testing strategies with digital contact tracing. The results show that the traditional strategy of quarantining direct contacts reduces infections by less than 20% without sufficient testing. Periodic testing every 2 weeks without contact tracing reduces infections by less than 3%. A variety of strategies are discussed including testing second and third degree contacts and the pre-exposure notification system, which acts as a social radar warning users how far they are from COVID-19. The most effective strategy discussed in this work combined the pre-exposure notification system with testing second and third degree contacts. This strategy reduces infections by 18.3% when 30% of the population uses the app, 45.2% when 50% of the population uses the app, 72.1% when 70% of the population uses the app, and 86.8% when 95% of the population uses the app. When simulating the model on an extended network of 5000 students, the results are similar with the contact tracing app reducing infections by up to 79%.

244) Yongyi Chen (MIT) and Tae Kyu Kim (PRIMES), On Generalized Carmichael Numbers (15 Dec 2020; arXiv.org, 5 Mar 2021)

Given an integer $k$, define $C_k$ as the set of integers $n > \max(k,0)$ such that $a^{n-k+1} \equiv a \pmod{n}$ holds for all integers $a$. We establish various multiplicative properties of the elements in $C_k$ and give a sufficient condition for the infinitude of $C_k$. Moreover, we prove that there are finitely many elements in $C_k$ with one and two prime factors if and only if $k>0$ and $k$ is prime. In addition, if all but two prime factors of $n \in C_k$ are fixed, then there are finitely many elements in $C_k$, excluding certain infinite families of $n$. We also give conjectures about the growth rate of $C_k$ with numerical evidence. We explore a similar question when both $a$ and $k$ are fixed and prove that for fixed integers $a \geq 2$ and $k$, there are infinitely many integers $n$ such that $a^{n-k} \equiv 1 \pmod{n}$ if and only if $(k,a) \neq (0,2)$ by building off the work of Kiss and Phong. Finally, we discuss the multiplicative properties of positive integers $n$ such that the Carmichael function $\lambda(n)$ divides $n-k$.
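
The definition of $C_k$ above can be checked directly by brute force for small $n$. The sketch below is only an illustration of the definition, not a method from the paper; the function name `in_C_k` is made up for this example. It relies on the fact that the congruence depends only on $a$ modulo $n$, so testing the residues $0, \ldots, n-1$ suffices.

```python
def in_C_k(n, k):
    """Return True if a^(n-k+1) ≡ a (mod n) for every integer a.

    Only residues 0..n-1 are tested, since both sides depend on a mod n.
    Runs in O(n) modular exponentiations, so it is meant for small n only.
    """
    if n <= max(k, 0):
        return False  # the definition requires n > max(k, 0)
    exponent = n - k + 1  # at least 2 because n > k
    return all(pow(a, exponent, n) == a for a in range(n))


# For k = 1 the condition reads a^n ≡ a (mod n), which holds exactly for
# primes and classical Carmichael numbers (561 is the smallest of those).
small_members = [n for n in range(2, 20) if in_C_k(n, 1)]
```

With $k = 1$, the list `small_members` contains only the primes below 20, since the smallest composite satisfying the congruence is 561.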

243) William Qin, HOMFLY Polynomials of Pretzel Knots (11 Dec 2020; arXiv.org, 3 Jan 2021)

HOMFLY polynomials are one of the major knot invariants being actively studied. They are difficult to compute in the general case but can be far more easily expressed in certain specific cases. In this paper, we examine two particular knots, as well as one more general infinite class of knots.
From our calculations, we see some apparent patterns in the polynomials for the knots $9_{35}$ and $9_{46}$, and in particular their $F$-factors. These properties are of a form that seems conducive to finding a general formula for them, which would yield a general formula for the HOMFLY polynomials of the two knots.
Motivated by these observations, we demonstrate and conjecture some properties both of the $F$-factors and HOMFLY polynomials of these knots and of the more general class that contains them, namely pretzel knots with 3 odd parameters. We make the first steps toward a matrix-less general formula for the HOMFLY polynomials of these knots.

242) Jonathan Yin (PRIMES), Hattie Chung (Broad Institute), and Aviv Regev (Broad Institute), A multi-view generative model for molecular representation improves prediction tasks (7 Dec 2020), accepted paper for LMRL2020 (Learning Meaningful Representations of Life) workshop at NeurIPS 2020 (Thirty-fourth Conference on Neural Information Processing Systems)

Unsupervised generative models have been a popular approach to representing molecules. These models extract salient molecular features to create compact vectors that can be used for downstream prediction tasks. However, current generative models for molecules rely mostly on structural features and do not fully capture global biochemical features. Here, we propose a multi-view generative model that integrates low-level structural features with global chemical properties to create a more holistic molecular representation. In proof-of-concept analyses, compared to purely structural latent representations, multi-view latent representations improve model accuracy on various tasks when used as input to feed-forward prediction networks. For some tasks, simple models trained on multi-view representations perform comparably to more complex supervised methods. Multi-view representations are an attractive method to improve representations in an unsupervised manner, and could be useful for prediction tasks, particularly in contexts where data is limited.

241) Yibo Gao (MIT), Joshua Guo (PRIMES), Karthik Seetharaman (PRIMES), and Ilaria Seidel (PRIMES), The Rank-Generating Functions of Upho Posets (arXiv.org, 3 Nov 2020), published in Discrete Mathematics 345:1 (Jan 2022)

Upper homogeneous finite type (upho) posets are a large class of partially ordered sets with the property that the principal order filter at every vertex is isomorphic to the whole poset. Well-known examples include $k$-ary trees, the grid graphs, and the Stern poset. Very little is known about upho posets in general. In this paper, we construct upho posets with Schur-positive Ehrenborg quasisymmetric functions, whose rank-generating functions have rational poles and zeros. We also categorize the rank-generating functions of all planar upho posets. Finally, we prove the existence of an upho poset with uncomputable rank-generating function.

240) Jason Yang (PRIMES) and Jun Wan (MIT), On Updating and Querying Submatrices (arXiv.org, 25 Oct 2020)

In this paper, we study the $d$-dimensional update-query problem. We provide lower bounds on update and query running times, assuming a long-standing conjecture on min-plus matrix multiplication, as well as algorithms that are close to the lower bounds. Given a $d$-dimensional matrix, an \textit{update} changes each element in a given submatrix from $x$ to $x\bigtriangledown v$, where $v$ is a given constant. A \textit{query} returns the $\bigtriangleup$ of all elements in a given submatrix. We study the cases where $\bigtriangledown$ and $\bigtriangleup$ are both commutative and associative binary operators. When $d = 1$, updates and queries can be performed in $O(\log N)$ worst-case time for many $(\bigtriangledown,\bigtriangleup)$ by using a segment tree with lazy propagation. However, when $d\ge 2$, similar techniques usually cannot be generalized. We show that if min-plus matrix multiplication cannot be computed in $O(N^{3-\varepsilon})$ time for any $\varepsilon>0$ (which is widely believed to be the case), then for $(\bigtriangledown,\bigtriangleup)=(+,\min)$, either updates and queries cannot both run in $O(N^{1-\varepsilon})$ time for any constant $\varepsilon>0$, or preprocessing cannot run in polynomial time. Finally, we show a special case where lazy propagation can be generalized for $d\ge 2$ and where updates and queries can run in $O(\log^d N)$ worst-case time. We present an algorithm that meets this running time and is simpler than similar algorithms of previous works.
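
For the one-dimensional case with $(\bigtriangledown,\bigtriangleup)=(+,\min)$, the standard segment tree with lazy propagation mentioned in the abstract can be sketched as follows. This is the textbook $d = 1$ structure, not the paper's higher-dimensional algorithm; the class name `RangeAddMinTree` and its method names are assumptions for this example.

```python
class RangeAddMinTree:
    """Segment tree with lazy propagation: range add, range min, O(log N) each."""

    def __init__(self, values):
        if not values:
            raise ValueError("values must be non-empty")
        self.n = len(values)
        self.mins = [0] * (4 * self.n)   # minimum of each node's interval
        self.lazy = [0] * (4 * self.n)   # pending add not yet pushed to children
        self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.mins[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.mins[node] = min(self.mins[2 * node], self.mins[2 * node + 1])

    def _push(self, node):
        """Move a pending add from an internal node down to its two children."""
        if self.lazy[node]:
            for child in (2 * node, 2 * node + 1):
                self.mins[child] += self.lazy[node]
                self.lazy[child] += self.lazy[node]
            self.lazy[node] = 0

    def update(self, left, right, v, node=1, lo=0, hi=None):
        """Add v to every element with index in [left, right]."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return
        if left <= lo and hi <= right:
            self.mins[node] += v
            self.lazy[node] += v
            return
        self._push(node)
        mid = (lo + hi) // 2
        self.update(left, right, v, 2 * node, lo, mid)
        self.update(left, right, v, 2 * node + 1, mid + 1, hi)
        self.mins[node] = min(self.mins[2 * node], self.mins[2 * node + 1])

    def query(self, left, right, node=1, lo=0, hi=None):
        """Return the minimum over indices in [left, right]."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return float("inf")
        if left <= lo and hi <= right:
            return self.mins[node]
        self._push(node)
        mid = (lo + hi) // 2
        return min(self.query(left, right, 2 * node, lo, mid),
                   self.query(left, right, 2 * node + 1, mid + 1, hi))
```

A short usage example: `tree = RangeAddMinTree([5, 3, 8, 6])`, then `tree.update(1, 2, 4)` changes the underlying array to `[5, 7, 12, 6]`, after which `tree.query(0, 3)` returns 5. Laziness works here because adding a constant to a whole interval shifts its minimum by the same constant.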

239) Vishaal Ram (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), A modified age-structured SIR model for COVID-19 type viruses (arXiv.org, 23 Sept 2020), published in Nature Scientific Reports (2021) 11:15194

We present a modified age-structured SIR model based on known patterns of social contact and distancing measures within Washington, USA. We find that population age-distribution has a significant effect on disease spread and mortality rate, and contributes to the efficacy of age-specific contact and treatment measures. We consider the effect of relaxing restrictions across less vulnerable age-brackets, comparing results across selected groups of varying population parameters. Moreover, we analyze the mitigating effects of vaccinations and examine the effectiveness of age-targeted distributions. Lastly, we explore how our model can be applied to other states to reflect social-distancing policy based on different parameters and metrics.

238) Richard Chen (PRIMES), Feng Gui (MIT), Jason Tang (PRIMES), and Nathan Xiong (PRIMES), Few distance sets in $\ell_p$ spaces and $\ell_p$ product spaces (19 Sept 2020; arXiv.org, 26 Sept 2020), published in European Journal of Combinatorics 102 (May 2022)

Kusner asked if $n+1$ points is the maximum number of points in $\mathbb{R}^n$ such that the $\ell_p$ distance between any two points is $1$. We present an improvement to the best known upper bound when $p$ is large in terms of $n$, as well as a generalization of the bound to $s$-distance sets. We also study equilateral sets in the $\ell_p$ sums of Euclidean spaces, deriving upper bounds on the size of an equilateral set for when $p=\infty$, $p$ is even, and for any $1\le p<\infty$.

237) Tanya Khovanova (MIT) and Sean Li (PRIMES), The Penney's Game with Group Action (arXiv.org, 13 Sept 2020), published in Annals of Combinatorics (15 Jan 2022)

We generalize word avoidance theory by equipping the alphabet $\mathcal{A}$ with a group action. We call equivalence classes of words patterns. We extend the notion of word correlation to patterns using group stabilizers. We extend known word avoidance results to patterns. We use these results to answer standard questions for the Penney's game on patterns and show non-transitivity for the game on patterns as the length of the pattern tends to infinity. We also analyze bounds on the pattern-based Conway leading number and expected wait time, and further explore the game under the cyclic and symmetric group actions.
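
For orientation, the classical setting that this paper generalizes (a fair coin, individual words rather than patterns, and no group action) can be sketched with Conway's leading numbers. The code below covers only that classical case and is not the pattern-based version studied in the paper; the function names are made up for this example, and the two words are assumed to be distinct and of equal length.

```python
from fractions import Fraction


def leading_number(a, b):
    """Conway leading number of a over b for a fair coin.

    Adds 2^(k-1) for every k such that the last k letters of a
    coincide with the first k letters of b.
    """
    total = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if a[-k:] == b[:k]:
            total += 2 ** (k - 1)
    return total


def expected_wait(word):
    """Expected number of fair-coin flips until word first appears."""
    return 2 * leading_number(word, word)


def prob_second_wins(a, b):
    """Probability that b appears before a (Conway's odds formula).

    Assumes a and b are distinct words of the same length.
    """
    b_odds = leading_number(a, a) - leading_number(a, b)
    a_odds = leading_number(b, b) - leading_number(b, a)
    return Fraction(b_odds, b_odds + a_odds)
```

For instance, `expected_wait("HH")` is 6 while `expected_wait("HT")` is 4, and `prob_second_wins("HHH", "THH")` is 7/8, the familiar example of how strongly the second player can be favored.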

236) Ankit Bisain (PRIMES) and Eric J. Hanson (Brandeis University), The Bernardi Formula for Non-Transitive Deformations of the Braid Arrangement (7 Sept 2020; arXiv.org, 2 Oct 2020), published in The Electronic Journal of Combinatorics 28:4 (2021)

Bernardi has given a general formula to compute the number of regions of a deformation of the braid arrangement as a signed sum over boxed trees. We prove that the contribution to this sum of the set of boxed trees sharing an underlying rooted labeled tree is 0 or ±1 and give an algorithm for computing this value. We then restrict to arrangements which we call almost transitive and construct a sign-reversing involution which reduces Bernardi's signed sum to enumeration of a set of rooted labeled trees in this case. We conclude by explicitly enumerating the trees corresponding to the regions of certain nested Ish arrangements which we call non-negative, recovering their known counting formula.

235) Alejandro H. Morales (UMass Amherst) and William Shi (PRIMES), Refinements and Symmetries of the Morris identity for volumes of flow polytopes (7 Sept 2020; arXiv.org, 11 Feb 2021), published in Comptes Rendus Mathématique 359 (2021): 823-851

Flow polytopes are an important class of polytopes in combinatorics whose lattice points and volumes have interesting properties and relations. The Chan-Robbins-Yuen (CRY) polytope is a flow polytope with normalized volume equal to the product of consecutive Catalan numbers. Zeilberger proved this by evaluating the Morris constant term identity, but no combinatorial proof is known. There is a refinement of this formula that splits the largest Catalan number into Narayana numbers, which Mészáros gave an interpretation as the volume of a collection of flow polytopes. We introduce a new refinement of the Morris identity with combinatorial interpretations both in terms of lattice points and volumes of flow polytopes. Our results generalize Mészáros's construction and a recent flow polytope interpretation of the Morris identity by Corteel-Kim-Mészáros. We prove the product formula of our refinement following the strategy of the Baldoni-Vergne proof of the Morris identity. Lastly, we study a symmetry of the Morris identity bijectively using the Danilov-Karzanov-Koshevoy triangulation of flow polytopes and a bijection of Mészáros-Morales-Striker.

234) Vishaal Ram (PRIMES), Laura P. Schaposnik (University of Illinois at Chicago) et al., Extrapolating continuous color emotions through deep learning (2 Sept 2020), published in Physical Review Research 2:3 (September–November 2020)

By means of an experimental dataset, we use deep learning to implement an RGB (red, green, and blue) extrapolation of emotions associated to color, and do a mathematical study of the results obtained through this neural network. In particular, we see that males (type-$m$ individuals) typically associate a given emotion with darker colors, while females (type-$f$ individuals) associate it with brighter colors. A similar trend was observed with older people and associations to lighter colors. Moreover, through our classification matrix, we identify which colors have weak associations to emotions and which colors are typically confused with other colors.

233) Jesse Geneson, Suchir Kaustav, and Antoine Labelle (CrowdMath-2020), Extremal results for graphs of bounded metric dimension (arXiv.org, 31 Aug 2020), published in Discrete Applied Mathematics 309 (15 March 2022): 123-129

Metric dimension is a graph parameter motivated by problems in robot navigation, drug design, and image processing. In this paper, we answer several open extremal problems on metric dimension and pattern avoidance in graphs from (Geneson, Metric dimension and pattern avoidance, Discrete Appl. Math. 284, 2020, 1-7). Specifically, we construct a new family of graphs that allows us to determine the maximum possible degree of a graph of metric dimension at most $k$, the maximum possible degeneracy of a graph of metric dimension at most $k$, the maximum possible chromatic number of a graph of metric dimension at most $k$, and the maximum $n$ for which there exists a graph of metric dimension at most $k$ that contains $K_{n, n}$.
We also investigate a variant of metric dimension called edge metric dimension and solve another problem from the same paper for $n$ sufficiently large by showing that the edge metric dimension of $P_n^{d}$ is $d$ for $n \geq d^{d-1}$. In addition, we use a probabilistic argument to make progress on another open problem from the same paper by showing that the maximum possible clique number of a graph of edge metric dimension at most $k$ is $2^{\Theta(k)}$. We also make progress on a problem from (N. Zubrilina, On the edge dimension of a graph, Discrete Math. 341, 2018, 2083-2088) by finding a family of new triples $(x, y, n)$ for which there exists a graph of metric dimension $x$, edge metric dimension $y$, and order $n$. In particular, we show that for each integer $k > 0$, there exist graphs $G$ with metric dimension $k$, edge metric dimension $3^k(1-o(1))$, and order $3^k(1+o(1))$.
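As a small illustration of the parameter studied in this abstract: the metric dimension of a connected graph is the size of a smallest set of "landmark" vertices such that every vertex is uniquely identified by its vector of distances to the landmarks. The sketch below is a brute-force computation for tiny graphs only. The function names and the adjacency-dictionary representation are assumptions made for this example and are not taken from the paper, which obtains its extremal results by explicit constructions rather than by search.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def metric_dimension(adj):
    """Size of a smallest resolving set (brute force; assumes >= 2 vertices)."""
    vertices = sorted(adj)
    dist = {v: bfs_distances(adj, v) for v in vertices}
    for size in range(1, len(vertices) + 1):
        for landmarks in combinations(vertices, size):
            # A landmark set resolves the graph when all distance vectors differ.
            signatures = {tuple(dist[w][v] for w in landmarks) for v in vertices}
            if len(signatures) == len(vertices):
                return size
    return len(vertices)


def path_graph(n):
    """Hypothetical helper: path on vertices 0..n-1."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def cycle_graph(n):
    """Hypothetical helper: cycle on vertices 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def complete_graph(n):
    """Hypothetical helper: complete graph on vertices 0..n-1."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}
```

For instance, an endpoint of a path resolves the whole path, while a complete graph on $n$ vertices needs $n-1$ landmarks. The search is exponential in the number of vertices, so it is only suitable for experimenting with small examples.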

232) William Li, Lebesgue Measure Preserving Thompson's Monoid (30 Aug 2020)

This paper defines Lebesgue measure preserving Thompson's monoid, denoted by $\mathbb{G}$, which is modeled on Thompson's group $\mathbb{F}$ except that the elements of $\mathbb{G}$ are non-invertible. Moreover, it is required that the elements of $\mathbb{G}$ preserve Lebesgue measure. Monoid $\mathbb{G}$ exhibits very different properties from Thompson's group $\mathbb{F}$. The paper studies a number of algebraic (group-theoretic) and dynamical properties of $\mathbb{G}$ including approximation, mixing, periodicity, entropy, decomposition, generators, and topological conjugacy.

231) Srinath Mahankali, Velocity Inversion Using the Quadratic Wasserstein Metric (24 Aug 2020; arXiv.org 26 Aug 2020)

Full-waveform inversion (FWI) is a method used to determine properties of the Earth from information on the surface. We use the squared Wasserstein distance (squared $W_2$ distance) as an objective function to invert for the velocity as a function of position in the Earth, and we discuss its convexity with respect to the velocity parameter. In one dimension, we consider constant, piecewise increasing, and linearly increasing velocity models as a function of position, and we show the convexity of the squared $W_2$ distance with respect to the velocity parameter on the interval from zero to the true value of the velocity parameter when the source function is a probability measure. Furthermore, we consider a two-dimensional model where velocity is linearly increasing as a function of depth and prove the convexity of the squared $W_2$ distance in the velocity parameter on large regions containing the true value. We discuss the convexity of the squared $W_2$ distance compared with the convexity of the squared $L^2$ norm, and we discuss the relationship between frequency and convexity of these respective distances. We also discuss multiple approaches to optimal transport for non-probability measures by first converting the wave data into probability measures.

230) Michael Gerovitch, Environment-aware Pedestrian Trajectory Prediction for Autonomous Driving (21 Aug 2020)

People's safety is a primary concern in autonomous driving. There exist efficient methods for identifying static obstacles. However, the prediction of future trajectories of moving elements, such as pedestrians crossing a street, is a much more challenging problem. A promising direction of research is the use of machine learning algorithms with location bias maps. Our goal was to further explore this idea by training an interchangeable location bias map, a location-specific feature that is added into the middle of a convolutional neural network. For different locations, we used different location bias maps to allow the network to learn from different setting contexts without overfitting to a specific setting. Using pre-annotated video footage of pedestrians moving around in crowded areas, we implemented a pedestrian behavior encoding scheme to generate input and output volumes for the neural network. Using this encoding scheme, we trained our neural network and interchangeable location bias map. Our research demonstrates that the network with an interchangeable location bias map can predict realistic pedestrian trajectories even when trained simultaneously in multiple settings.

229) Andrew Shen, Towards Proving Application Isolation for Cryptocurrency Hardware Wallets (22 Jul 2020)

We often perform security-sensitive operations in our day-to-day lives such as performing monetary transactions. To perform these operations securely, we can isolate the confirmation of such operations to separate hardware devices. However, proving that these devices operate securely is still difficult given the complexity of their kernels, yet important given the rise in popularity of cryptocurrency transaction devices. To support multiple cryptocurrencies and other functionality, these devices must be able to run multiple applications that are isolated from one another as they could be potentially maliciously acting applications. We can simplify our device by modeling it as running applications sequentially in user mode. We seek to prove that these applications cannot tamper with the kernel memory and show that the kernel protection is set up correctly. To do this, we developed a RISC-V machine emulator in Rosette, which enables us to reason about the behaviour of symbolic machine states and symbolic applications. We make progress towards verifying application isolation for launching and running applications on a simple kernel.

228) Andrey Boris Khesin (MIT) and Alexander Lu Zhang (PRIMES), On Quasisymmetric Functions with Two Bordering Variables (arXiv.org, 23 Jul 2020)

We extend past results on a family of formal power series $K_{n, \Lambda}$, parameterized by $n$ and $\Lambda \subseteq [n]$, that largely resemble quasisymmetric functions. This family of functions was conjectured to have the property that the product $K_{n, \Lambda}K_{m, \Omega}$ of any two functions $K_{n, \Lambda}$ and $K_{m, \Omega}$ from the family can be expressed as a linear combination of other functions from the family. In this paper, we show that this is indeed the case and that the span of the $K_{n, \Lambda}$'s forms an algebra. We also provide techniques for examining similar families of functions and a formula for the product $K_{n, \Lambda}K_{m, \Omega}$ when $n=1$.

227) Neel Bhalla, Constructing Workflow-centric Traces in Close to Real Time for the Hadoop File System (22 Jul 2020)

Diagnosing problems in large-scale systems that use cloud-based distributed services is challenging. Workflow-centric tracing captures the workflow (work done to process requests) and dependency graph of causally-related events among the components of a distributed system. However, constructing traces has historically been performed offline in batch fashion, so trace data is not immediately available to engineers for their diagnosis efforts. In this work, we present an approach based on graph abstraction and a streaming framework to construct workflow-centric traces in near real time for the Hadoop file system. This approach will provide network operators with a real-time understanding of the distributed system behavior.

226) Yunseo Choi (PRIMES) and James Unwin (University of Illinois at Chicago), Racial Impact on Infections and Deaths due to COVID-19 in New York City (11 Jul 2020; arXiv.org, 9 Jul 2020), forthcoming in Harvard Technology Review

Redlining is the discriminatory practice whereby institutions avoided investment in certain neighborhoods due to their demographics. Here we explore the lasting impacts of redlining on the spread of COVID-19 in New York City (NYC). Using data available through the Home Mortgage Disclosure Act, we construct a redlining index for each NYC census tract via a multi-level logistical model. We compare this redlining index with the COVID-19 statistics for each NYC Zip Code Tabulation Area. Accurate mappings of the pandemic would aid the identification of the most vulnerable areas and permit the most effective allocation of medical resources, while reducing ethnic health disparities.

225) Sanath Govindarajan (PRIMES) and William S. Moses (MIT), SyFER-MLIR: Integrating Fully Homomorphic Encryption Into the MLIR Compiler Framework (3 Jul 2020)

Fully homomorphic encryption opens up the possibility of secure computation on private data. However, fully homomorphic encryption is limited by its speed and the fact that arbitrary computations must be represented by combinations of primitive operations, such as addition, multiplication, and binary gates. Integrating FHE into the MLIR compiler infrastructure allows it to be automatically optimized at many different levels and will allow any program which compiles into MLIR to be modified to be encrypted by simply passing another flag into the compiler. The process of compiling into an intermediate representation and dynamically generating the encrypted program, rather than calling functions from a library, also allows for optimizations across multiple operations, such as rewriting a DAG of operations to run faster and removing unnecessary operations.

224) Ethan Mendes (PRIMES) and Kyle Hogan (MIT), Defending Against Imperceptible Audio Adversarial Examples Using Proportional Additive Gaussian Noise (30 Jun 2020)

Neural networks are susceptible to adversarial examples, which are specific inputs to a network that result in a misclassification or an incorrect output. While most past work has focused on methods to generate adversarial examples to fool image classification networks, recently, similar attacks on automatic speech recognition systems have been explored. Due to the relative novelty of these audio adversarial examples, there exist few robust defenses for these attacks. We present a robust defense for inaudible or imperceptible audio adversarial examples. This approach mimics the adversarial strategy to add targeted proportional additive Gaussian noise in order to revert an adversarial example back to its original transcription. Our defense performs similarly to other defenses yet is the first randomized or probabilistic strategy. Additionally, we demonstrate the challenges that arise when applying defenses against adversarial examples for images to audio adversarial examples.

223) Walden Yan (PRIMES) and William S. Moses (MIT), Token pairing to improve neural program synthesis models (30 Jun 2020)

In neural program synthesis (NPS), a network is trained to output or aid in the output of code that satisfies a given program specification. In our work, we make modifications upon the simple sequence-to-sequence (Seq2Seq) LSTM model. Extending the most successful techniques from previous works, we guide a beam search with an encoder-decoder scheme augmented with attention mechanisms and a specialized syntax layer. But one of the outstanding difficulties of NPS is the implicit tree structure of programs, which makes it inherently more difficult for linearly-structured models. To address this, we experiment with a novel technique we call token pairing. Our model is trained and evaluated on AlgoLisp, a dataset of English description-to-code programming problems paired with example solutions and test cases on which to evaluate programs. We also create a new interpreter for AlgoLisp that fixes the bugs present in the builtin executor. In the end, our model achieves 99.24% accuracy at evaluation, which greatly improves on the previous state-of-the-art of 95.80% while using fewer parameters.

222) Zhenkun Li (MIT) and Jessica Zhang (PRIMES), Classification of tight contact structures on a solid torus (arXiv.org, 30 Jun 2020)

It is a basic question in contact geometry to classify all non-isotopic tight contact structures on a given 3-manifold. If the manifold has a boundary, we also need to specify the dividing set on the boundary. In this paper, we answer the classification question completely for the case of a solid torus by writing down a closed formula for the number of non-isotopic tight contact structures with any given dividing set on the boundary of the solid torus. Previously, only a few special cases were known due to work by Honda.

221) Christian Gaetz (MIT) and Katherine Tung (PRIMES), The Sperner property for $132$-avoiding intervals in the weak order (arXiv.org, 29 Jun 2020), published in Bulletin of the London Mathematical Society 53:2 (April 2021): 442-457.

A well-known result of Stanley implies that the weak order on a maximal parabolic quotient of the symmetric group $S_n$ has the Sperner property; this same property was recently established for the weak order on all of $S_n$ by Gaetz and Gao, resolving a long-open problem. In this paper we interpolate between these results by showing that the weak order on any parabolic quotient of $S_n$ (and more generally on any $132$-avoiding interval) has the Sperner property. This result is proven by exhibiting an action of $\mathfrak{sl}_2$ respecting the weak order on these intervals. As a corollary we obtain a new formula for principal specializations of Schubert polynomials. Our formula can be seen as a strong Bruhat order analogue of Macdonald's reduced word formula. This proof technique and formula generalize work of Hamaker, Pechenik, Speyer, and Weigandt and Gaetz and Gao.

220) Yuxuan (Jason) Chen, Real World Application of Event-based End to End Autonomous Driving (29 Jun 2020)

End-to-end autonomous driving has recently been a popular area of study for deep learning. This work studies the use of event cameras for real-world deep learned driving tasks in comparison to traditional RGB cameras. In this work, we evaluate existing state-of-the-art event-based models on offline datasets, design a novel model that fuses the benefits from both event-based and traditional frame-based cameras, and integrate the trained models on board a full-scale vehicle. We conduct tests in a challenging track with features unseen to the model. Through our experiments and saliency visualization, we show that event-based models actually predict the existing motion of the car rather than the active control the car should take. Therefore, while event-based models excel at offline tasks such as motion estimation, our experiments reveal a fundamental challenge in applying event-based end-to-end learning to active control tasks: the models need to learn to reason about future actions with a feedback loop that impacts their future state.

219) Arun S. Kannan (MIT) and Honglin Zhu (PRIMES), Characters for Projective Modules in the BGG Category $\mathcal{O}$ for the Orthosymplectic Lie Superalgebra $\mathfrak{osp}(3|4)$ (arXiv.org, 11 Jun 2020), published in Journal of Algebra 569 (1 March 2021): 723-757

We determine the Verma multiplicities of standard filtrations of projective modules for integral atypical blocks in the BGG category $\mathcal{O}$ for the orthosymplectic Lie superalgebra $\mathfrak{osp}(3|4)$ by way of translation functors. We then explicitly determine the composition factor multiplicities of Verma modules using BGG reciprocity.

2019 Research Papers

218) Espen Slettnes, Minimal Embedding Dimensions of Rectangle k-Visibility Graphs, published in Journal of Graph Algorithms and Applications 25:1 (January 2021): 59-96.

Bar visibility graphs were adopted in the 1980s as a model to represent traces, e.g., on circuit boards and in VLSI chip designs. Two generalizations of bar visibility graphs, rectangle visibility graphs and bar $k$-visibility graphs, were subsequently introduced. Here, we combine bar $k$- and rectangle visibility graphs to form rectangle $k$-visibility graphs (R$k$VGs), and further generalize these to higher dimensions. A graph is a $d$-dimensional R$k$VG if and only if it can be represented with vertices as disjoint axis-aligned hyperrectangles in $d$-space, such that there is an axis-parallel line of sight between two hyperrectangles that intersects at most $k$ other hyperrectangles if and only if there is an edge between the two corresponding vertices. For any graph $G$ and a fixed $k$, we prove that given enough spatial dimensions, $G$ has a rectangle $k$-visibility representation, and thus we define the minimal embedding dimension (MED) with $k$-visibility of $G$ to be the smallest $d$ such that $G$ is a $d$-dimensional R$k$VG. We study the properties of MEDs and find upper bounds on the MEDs of various types of graphs. In particular, we find that the $k$-visibility MED of the complete graph on $m$ vertices $K_m$ is at most $\lceil{m/(2(k+1))}\rceil,$ of complete $r$-partite graphs is at most $r+1,$ and of the $m^{\rm th}$ hypercube graph $Q_m$ is at most $\lceil{2m/3}\rceil$ in general, and at most $\lfloor{\sqrt{m}\,}\rceil$ for $k=0,~ m \ne 2.$

217) Zhengyang (Leo) Dong (PRIMES) and Gil Alterovitz (MIT), netAE: Semi-supervised dimensionality reduction of single-cell RNA sequencing to facilitate cell labeling, published in Bioinformatics (29 Jul 2020)

Single-cell RNA sequencing allows us to study cell heterogeneity at an unprecedented cell-level resolution and identify known and new cell populations. The current cell labeling pipeline uses unsupervised clustering and assigns labels to clusters by manual inspection. However, this pipeline does not utilize available gold-standard labels because there are usually too few of them to be useful to most computational methods. This paper aims to facilitate cell labeling with a semi-supervised method in an alternative pipeline, in which a few gold-standard labels are first identified and then extended to the rest of the cells computationally. We built a semi-supervised dimensionality reduction method, a network-enhanced autoencoder (netAE). Tested on three public datasets, netAE outperforms various dimensionality reduction baselines and achieves satisfactory classification accuracy even when the labeled set is very small, without disrupting the similarity structure of the original space.

216) Tanya Khovanova (MIT) and Kevin Wu (PRIMES), Base 3/2 and Greedily Partitioned Sequences (arXiv.org, 19 Jul 2020)

We delve into the connection between base $\frac{3}{2}$ and the greedy partition of non-negative integers into 3-free sequences. Specifically, we find a fractal structure on strings written with digits 0, 1, and 2. We use this structure to prove that the even non-negative integers written in base $\frac{3}{2}$ and then interpreted in base 3 form the Stanley cross-sequence, where the Stanley cross-sequence comprises the first terms of the infinitely many sequences that are formed by the greedy partition of non-negative integers into 3-free sequences.
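The statement in this abstract can be checked numerically for small cases. The sketch below assumes the usual digit-by-digit conversion to base $\frac{3}{2}$ with digits 0, 1, 2 (write $n = 3q + r$, emit digit $r$, continue with $2q$) and a direct greedy assignment of each integer to the first sequence in which it does not complete a 3-term arithmetic progression. Function names are chosen for this example only; the paper proves the correspondence via a fractal structure, not by computation.

```python
def base_three_halves(n):
    """Digits of n in base 3/2 (digits 0, 1, 2), most significant first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 3))  # n = 3q + r contributes digit r
        n = 2 * (n // 3)           # and carries 2q to the next position
    return "".join(reversed(digits))


def greedy_first_terms(limit):
    """First terms of the sequences in the greedy partition of 0..limit-1
    into 3-free sequences (sequences with no 3-term arithmetic progression)."""
    sequences = []
    first_terms = []
    for n in range(limit):
        for seq in sequences:
            # n is larger than every element of seq, so it can only be the
            # last term of a progression a, b, n with a = 2b - n.
            if not any((2 * b - n) in seq for b in seq):
                seq.add(n)
                break
        else:
            sequences.append({n})
            first_terms.append(n)
    return first_terms


def cross_sequence_from_base(count):
    """Even integers written in base 3/2, then read as base-3 numerals."""
    return [int(base_three_halves(2 * k), 3) for k in range(count)]
```

For example, $4 = 2\cdot\frac{3}{2} + 1$ is written as 21 in base $\frac{3}{2}$, and reading 21 in base 3 gives 7, which is the first term of the third greedy sequence.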

215) Dmitry Kleinbock (Brandeis University), Anurag Rao (Brandeis University), and Srinivasan Sathiamurthy (PRIMES), Critical loci of convex domains in the plane (26 Mar 2020; arXiv.org, 30 Mar 2020), published in Indagationes Mathematicae 32:3 (May 2021): 719-728.

Let $K$ be a bounded convex domain in $\mathbb{R}^2$ symmetric about the origin. The critical locus of $K$ is defined to be the (non-empty compact) set of lattices $\Lambda$ in $\mathbb{R}^2$ of smallest possible covolume such that $\Lambda \cap K= \lbrace 0\rbrace$. These are classical objects in geometry of numbers; yet all previously known examples of critical loci were either finite sets or finite unions of closed curves. In this paper we give a new construction which, in particular, furnishes examples of domains having critical locus of arbitrary Hausdorff dimension between $0$ and $1$.

214) P. A. Crowdmath, Propagation time for weighted zero forcing (arXiv.org, 15 May 2020)

Zero forcing is a graph coloring process that was defined as a tool for bounding the minimum rank and maximum nullity of a graph. It has also been used for studying control of quantum systems and monitoring electrical power networks. One of the problems from the 2017 AIM workshop "Zero forcing and its applications" was to explore edge-weighted probabilistic zero forcing, where edges have weights that determine the probability of a successful force if forcing is possible under the standard zero forcing coloring rule.
In this paper, we investigate the expected time to complete the weighted zero forcing coloring process, known as the expected propagation time, as well as the time for the process to be completed with probability at least $\alpha$, known as the $\alpha$-confidence propagation time. We demonstrate how to find the expected and confidence propagation times of any edge-weighted graph using Markov matrices. We also determine the expected and confidence propagation times for various families of edge-weighted graphs including complete graphs, stars, paths, and cycles.
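To make the Markov-matrix idea concrete, here is a minimal sketch for one simple case: a path whose first vertex starts blue, where we assume each round the single possible force succeeds independently with probability equal to the weight of the corresponding edge. The state is the number of blue vertices, the transition matrix is upper triangular, and the last state is absorbing. The model details, function names, and use of exact fractions are assumptions for this illustration and may differ from the paper's definitions.

```python
from fractions import Fraction


def path_transition_matrix(weights):
    """Transition matrix for a weighted path; state i means i+1 blue vertices.

    weights[i] is the assumed success probability of the force across edge i
    and must be positive.
    """
    n = len(weights)
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i, w in enumerate(weights):
        P[i][i + 1] = Fraction(w)
        P[i][i] = 1 - Fraction(w)
    P[n][n] = Fraction(1)  # all vertices blue: absorbing state
    return P


def expected_propagation_time(P):
    """Expected number of rounds from state 0 to the last state.

    Assumes P is upper triangular, so the linear system for the expected
    hitting times can be solved by back substitution.
    """
    n = len(P) - 1
    t = [Fraction(0)] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail = sum(P[i][j] * t[j] for j in range(i + 1, n + 1))
        t[i] = (1 + tail) / (1 - P[i][i])
    return t[0]


def confidence_propagation_time(P, alpha):
    """Smallest number of rounds after which the process has finished with
    probability at least alpha (assumes alpha < 1 so the loop terminates)."""
    n = len(P) - 1
    dist = [Fraction(0)] * (n + 1)
    dist[0] = Fraction(1)
    rounds = 0
    while dist[n] < alpha:
        dist = [sum(dist[i] * P[i][j] for i in range(n + 1)) for j in range(n + 1)]
        rounds += 1
    return rounds
```

Under these assumptions the expected time on a path is the sum of the reciprocals of the edge weights, which the back substitution reproduces; richer graphs would need a state for every reachable set of blue vertices.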

213) P. A. Crowdmath, Applications of the abc conjecture to powerful numbers (arXiv.org, 15 May 2020)

The abc conjecture is one of the most famous unsolved problems in number theory. The conjecture claims for each real $\epsilon > 0$ that there are only a finite number of coprime positive integer solutions to the equation $a+b = c$ with $c > (rad(a b c))^{1+\epsilon}$. If true, the abc conjecture would imply many other famous theorems and conjectures as corollaries. In this paper, we discuss the abc conjecture and find new applications to powerful numbers, which are integers $n$ for which $p^2 | n$ for every prime $p$ such that $p | n$. We answer several questions from an earlier paper on this topic, assuming the truth of the abc conjecture.
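The two definitions used in this abstract, the radical $rad(n)$ (the product of the distinct primes dividing $n$) and powerful numbers, are easy to experiment with. The following sketch uses trial division, so it is only meant for small inputs; the helper names, including `abc_quality`, are chosen for this example and do not come from the paper.

```python
from math import gcd, log


def prime_factors(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def rad(n):
    """Product of the distinct primes dividing n."""
    result = 1
    for p in prime_factors(n):
        result *= p
    return result


def is_powerful(n):
    """True when p^2 divides n for every prime p dividing n."""
    return all(e >= 2 for e in prime_factors(n).values())


def abc_quality(a, b):
    """log(c) / log(rad(abc)) for coprime a, b with c = a + b."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    return log(c) / log(rad(a * b * c))
```

For the triple $1 + 8 = 9$ we have $rad(1 \cdot 8 \cdot 9) = 6 < 9$, so its quality exceeds 1; the conjecture asserts that for each $\epsilon > 0$ only finitely many coprime triples have quality above $1 + \epsilon$.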

212) Alin Tomescu (MIT CSAIL), Robert Chen (PRIMES), Yiming Zheng (PRIMES), Ittai Abraham (VMware Research), Benny Pinkas (VMware Research and Bar Ilan University), Guy Golan Gueta (VMware Research), and Srinivas Devadas (MIT CSAIL), Towards Scalable Threshold Cryptosystems (9 Mar 2020), published in Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, vol. 1, pp. 1242-1258.

The resurging interest in Byzantine fault tolerant systems will demand more scalable threshold cryptosystems. Unfortunately, current systems scale poorly, requiring time quadratic in the number of participants. In this paper, we present techniques that help scale threshold signature schemes (TSS), verifiable secret sharing (VSS) and distributed key generation (DKG) protocols to hundreds of thousands of participants and beyond. First, we use efficient algorithms for evaluating polynomials at multiple points to speed up computing Lagrange coefficients when aggregating threshold signatures. As a result, we can aggregate a 130,000 out of 260,000 BLS threshold signature in just 6 seconds (down from 30 minutes). Second, we show how "authenticating" such multipoint evaluations can speed up proving polynomial evaluations, a key step in communication-efficient VSS and DKG protocols. As a result, we reduce the asymptotic (and concrete) computational complexity of VSS and DKG protocols from quadratic time to quasilinear time, at a small increase in communication complexity. For example, using our DKG protocol, we can securely generate a key for the BLS scheme above in 2.3 hours (down from 8 days). Our techniques improve performance for thresholds as small as 255 and generalize to any Lagrange-based threshold scheme, not just threshold signatures. Our work has certain limitations: we require a trusted setup, we focus on synchronous VSS and DKG protocols and we do not address the worst-case complaint overhead in DKGs. Nonetheless, we hope it will spark new interest in designing large-scale distributed systems.
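For context on the quadratic cost mentioned in this abstract, the sketch below computes Lagrange coefficients at zero in the straightforward way, with a nested loop over the participants' evaluation points. This is the naive baseline, not the paper's quasilinear method based on fast multipoint evaluation. The modulus, function names, and the plain-integer secret-sharing setting are assumptions for illustration; real threshold signatures apply the coefficients in the exponent of a group. It requires Python 3.8 or later for modular inverses via `pow`.

```python
# Toy prime modulus (the Mersenne prime 2^61 - 1); an assumption for this sketch.
PRIME = 2**61 - 1


def lagrange_coefficients_at_zero(xs, p=PRIME):
    """Coefficients L_i(0) = prod_{j != i} (0 - x_j) / (x_i - x_j) mod p.

    The nested loop costs O(t^2) field operations for t points.
    """
    coeffs = []
    for i, xi in enumerate(xs):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        coeffs.append(num * pow(den, -1, p) % p)
    return coeffs


def interpolate_at_zero(shares, p=PRIME):
    """Recover f(0) from shares (x, f(x)) of a polynomial of degree < len(shares)."""
    xs = [x for x, _ in shares]
    lambdas = lagrange_coefficients_at_zero(xs, p)
    return sum(lam * y for lam, (_, y) in zip(lambdas, shares)) % p
```

As a usage example, three shares of the hypothetical polynomial $f(x) = 42 + 7x + 3x^2$ at $x = 1, 2, 5$ recover the constant term 42.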

211) Daniil Kalinov (MIT) and Lev Kruglyak (PRIMES), The Rational Cherednik Algebra of Type $A_1$ with Divided Powers (5 Mar 2020), published in New York Journal of Mathematics 27 (2021): 1328-1346

Motivated by the recent developments of the theory of Cherednik algebras in positive characteristic, we study rational Cherednik algebras with divided powers. In our research we have started with the simplest case, the rational Cherednik algebra of type $A_1$. We investigate its maximal divided power extensions over $R[c]$ and $R$ for arbitrary principal ideal domains $R$ of characteristic zero. In these cases, we prove that the maximal divided power extensions are free modules over the base rings, and construct an explicit basis in the case of $R[c]$. In addition, we provide an abstract construction of the rational Cherednik algebra of type $A_1$ over an arbitrary ring, and prove that this generalization expands the rational Cherednik algebra to include all of the divided powers.

210) Sebastian Jeon (PRIMES) and Tanya Khovanova (MIT), 3-Symmetric Graphs (arXiv.org, 8 Mar 2020)

An intuitive property of a random graph is that its subgraphs should also appear randomly distributed. We consider graphs whose subgraph densities exactly match their expected values. We call graphs with this property for all subgraphs with $k$ vertices $k$-symmetric. We discuss some properties and examples of such graphs. We construct 3-symmetric graphs and provide some statistics.

209) Lucy Cai, Espen Slettnes, and Jeremy Zhou, A Combinatorial Approach to Extracting Rooted Tree Statistics from the Order Quasisymmetric Function (3 Mar 2020)

The chromatic symmetric function defined by Stanley is a power series that is symmetric in an infinite number of variables and generalizes the chromatic polynomial. Shareshian and Wachs defined the chromatic quasisymmetric function, and Awan and Bernardi defined an analog of it for digraphs.
Three decades ago, Stanley posed a question equivalent to "Does the chromatic symmetric function distinguish between all trees?" A similar question can be raised for rooted trees: "Does the chromatic quasisymmetric function distinguish between all rooted trees?". Hasebe and Tsujie showed algebraically the stronger statement that the order quasisymmetric function distinguishes rooted trees. Here, we aim to directly extract useful statistics about a tree given only its order quasisymmetric function. This approach emphasizes the combinatorics of trees over the algebraic properties of quasisymmetric functions. We show that a rooted-tree-statistic we name the "co-height profile profile" is extractable, and that it distinguishes rooted 2-caterpillars.

208) Heidi Lei, On the Hausdorff Dimension of the Visible Koch Curve (28 Feb 2020)

In geometry, a point in a set is visible from another point if the line segment connecting two points does not contain other points in the set. We show that the Hausdorff dimension is 1 for the portion of the Koch curve that is visible from points at infinity and points in certain defined regions of the plane.

207) Aditya Saligrama (PRIMES) and Guillaume Leclerc (MIT), Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy (arXiv.org, 26 Feb 2020), presented at the ICLR 2020 Workshop on Towards Trustworthy ML: Rethinking Security and Privacy for ML (26 April 2020) (slides)

A necessary characteristic for the deployment of deep learning models in real world applications is resistance to small adversarial perturbations while maintaining accuracy on non-malicious inputs. While robust training provides models that exhibit better adversarial accuracy than standard models, there is still a significant gap in natural accuracy between robust and non-robust models which we aim to bridge. We consider a number of ensemble methods designed to mitigate this performance difference. Our key insight is that models trained to withstand small attacks, when ensembled, can often withstand significantly larger attacks, and this concept can in turn be leveraged to optimize natural accuracy. We consider two schemes, one that combines predictions from several randomly initialized robust models, and the other that fuses features from robust and standard models.

206) William Kuszmaul (MIT) and Alek Westover (PRIMES), In-Place Parallel-Partition Algorithms using Exclusive-Read-and-Write Memory (25 Feb 2020)

We present an in-place algorithm for the parallel partition problem that has linear work and polylogarithmic span. The algorithm uses only exclusive read/write shared variables, and can be implemented using parallel-for-loops without any additional concurrency considerations (i.e., the algorithm is EREW). A key feature of the algorithm is that it exhibits provably optimal cache behavior, up to small-order factors.
We also present a second in-place EREW algorithm that has linear work and span $O(\log n \cdot \log\log n)$, which is within an $O(\log\log n)$ factor of the optimal span. By using this low-span algorithm as a subroutine within the cache-friendly algorithm, we are able to obtain a single EREW algorithm that combines their theoretical guarantees: the algorithm achieves span $O(\log n \cdot \log\log n)$ and optimal cache behavior. As an immediate consequence, we also get an in-place EREW quicksort algorithm with work $O(n \log n)$ and span $O(\log^2 n \cdot \log\log n)$.

205) Justin Yu, On a rank game (22 Feb 2020)

We introduce a new game played by two players that generates a $(0,1)$-matrix of size $n$. The first player aims to maximize its resulting rank, while the second player aims to minimize it. We show that the first player can force almost full rank given additional power in move possibilities.

204) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions as Models for Trade Wars and Military Annexation (arXiv.org, 10 Feb 2020), published in Letters in Spatial and Resource Sciences (13 May 2022)

We explore an application of all-pay auctions to model trade wars and territorial annexation. Specifically, in the model we consider, the expected resource, production, and aggressive (military/tariff) power are public information, but actual resource levels are private knowledge. We consider the resource transfer at the end of such a competition, which deprives the weaker country of some fraction of its original resources. In particular, we derive the quasi-equilibria strategies for two-country conflicts under different scenarios. This work is relevant for the ongoing US-China trade war, and the recent Russian capture of Crimea, as well as historical and future conflicts.

203) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions with Different Forfeits (arXiv.org, 7 Feb 2020), forthcoming in the Yau Competition finalists compendium

In an auction each party bids a certain amount and the one which bids the highest is the winner. Interestingly, auctions can also be used as models for other real-world systems. In an all-pay auction all parties must pay a forfeit for bidding. In the most commonly studied all-pay auction, parties forfeit their entire bid, and this has been considered as a model for expenditure on political campaigns. Here we consider a number of alternative forfeits which might be used as models for different real-world competitions, such as preparing bids for defense or infrastructure contracts.

202) Victoria Zhang, Patterns and Symmetries in Spiking Neural Networks (11 Jan 2020)

Inspired by recent progress in computational neuroscience and artificial intelligence, this paper explores rich temporal patterns in networks of neurons that communicate via electric pulses known as spikes. In particular, we describe the attractors in small circuits of spiking neurons with different symmetries and connectivities. Using methods developed in the theory of dynamical systems, we extend an analytical approach to capture the phase-locked states and their stability for a general N-cell system. We then systematically explore attractors in reduced state spaces via Poincaré maps for both all-to-all coupled and star-like coupled networks. We identify a sequence of bifurcations when the coupling strengths vary from inhibition to excitation. Moreover, using high-precision numerical simulations, we find two novel states in star-like networks that are unobserved in all-to-all networks: the death of oscillation for inhibitory coupling and quasi-periodic behaviors for excitatory coupling. Our results elucidate the interplay between dynamical patterns and symmetries in the building blocks of real networks. Furthermore, as self-sustained oscillations with pulsatile couplings are ubiquitous, our analysis may clarify understanding of not only neural dynamics but also other pulse-coupled oscillator systems such as non-linear electric circuits, wireless sensor networks, and self-organizing chemical reactions.

201) Zander Hill, Upper Bound on the Distortion of Cabled Knots (8 Jan 2020)

The torus knots are a class of knots generated by ordered pairs $(p,q)$ of relatively prime integers, where the $(p,q)$-torus knot is the curve defined by a ray of slope $\frac{p}{q}$ emanating from the origin in the representation of the torus as a square with opposing sides identified. Furthermore, given a curve $K$, we can define the $(p,q)$-cabling of $K$ to be the $(p,q)$-torus knot living on an embedding of the torus which follows $K$, as opposed to the standard embedding of the torus which follows $S^1$ in $\mathbb{R}^3$. We show that for all $p$ and $q \gg p$, there exists a curve in the isotopy class of the $(p,q)$-torus knot whose supremal ratio of arc length to Euclidean distance, called the distortion of the curve, is bounded above by $\frac{7q}{\log(q)}$, and additionally show that this bound holds for the $(p,q)$-cabling of any knot. This extends a result of Studer establishing sublinear upper bounds for the distortion of the $(2,q)$-torus knots.

200) Oliver Hayman (PRIMES) and Ashwin Narayan (MIT), Analyzing Visualization and Dimensionality-Reduction Algorithms (9 Jan 2020)

In order to find patterns among high-dimensional data sets in scientific studies, scientists use mapping algorithms to produce representative two-dimensional or three-dimensional data sets that are easier to visualize. The most prominent of these algorithms is the t-Distributed Stochastic Neighbor Embedding algorithm (t-SNE). In this project, we create a metric for evaluating how clustered a data set is, and use it to measure how the perplexity parameter of the t-SNE algorithm affects the clustering of outputted data sets. Additionally, we propose a modification that improves how well randomness is preserved in outputted data sets. Finally, we create a separate metric to test whether a group of points contains one or multiple clusters in a data set of centered clusters.

199) Frank Wang, The integral shuffle algebra and the $K$-theory of the Hilbert scheme of points in $\mathbb{A}^2$ (8 Jan 2020; arXiv.org, 12 Feb 2020)

We examine the shuffle algebra defined over the ring $\mathbf{R} = \mathbb{C}[q_1^{\pm 1}, q_2^{\pm 1}]$, also called the integral shuffle algebra, which was found by Schiffmann and Vasserot to act on the equivariant $K$-theory of the Hilbert Scheme of points in the plane. We find that the modules of 2 and 3 variable elements of the shuffle algebra are finitely generated, and prove a necessary condition for an element to be in the integral shuffle algebra for arbitrarily many variables.

198) Tejas Gopalakrishna (PRIMES) and Yichi Zhang (MIT), Analysis of the One Line Factoring Algorithm (6 Jan 2020)

For integers that fit within $42$ bits, a competitive factoring algorithm is the so-called One Line Factoring Algorithm proposed by William B. Hart. We analyze this algorithm in special cases, in particular, for semiprimes $N = pq$, and look for optimizations. We first observe the cases in which the larger or smaller prime is returned. We then show that when $p$ and $q$ are sufficiently close, we always finish on the first iteration. An upper bound can be found for the first iteration that successfully factors an odd semiprime. Using this upper bound, we demonstrate some simplifications to the algorithm for odd semiprimes in particular. One of our observations is that we only need to iterate numbers $\{ 0,1,3,5,7 \}$ modulo $8$, as the other iterators are very rarely the first that successfully factor the semiprime. Finally, we inspect the performance of the optimized algorithm.
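
For orientation, here is a minimal Python sketch of the basic iteration behind Hart's One Line Factoring Algorithm as it is commonly stated: for $i = 1, 2, \dots$, take $s = \lceil \sqrt{Ni} \rceil$ and test whether $s^2 \bmod N$ is a perfect square $t^2$; if so, $\gcd(N, s - t)$ is a candidate factor. The function name `one_line_factor`, the `max_iter` cutoff, and the explicit nontrivial-factor check are assumptions made for illustration. The sketch omits the multiplier that is often applied to $N$ first, as well as the modulo-$8$ restriction on iterators discussed in the abstract.

```python
from math import gcd, isqrt

def one_line_factor(n, max_iter=1_000_000):
    """Return a nontrivial factor of n, or None if none is found in max_iter steps."""
    for i in range(1, max_iter + 1):
        s = isqrt(n * i)
        if s * s < n * i:
            s += 1                      # s = ceil(sqrt(n * i))
        m = (s * s) % n
        t = isqrt(m)
        if t * t == m:                  # s^2 = t^2 (mod n)
            g = gcd(n, s - t)
            if 1 < g < n:               # illustrative guard against trivial factors
                return g
    return None

print(one_line_factor(8051))            # 8051 = 83 * 97, primes that are close together
```

In the example, $p = 83$ and $q = 97$ are close, and the loop succeeds at $i = 1$ because $90^2 - 8051 = 49 = 7^2$, which matches the first-iteration behavior described above.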

197) Sunay Joshi, On the degenerate Turán problem and its variants (3 Jan 2020)

Given a family of graphs $\mathcal{F}$, a central problem in extremal graph theory is to determine the maximum number $\text{ex}(n,\mathcal{F})$ of edges in a graph on $n$ vertices that does not contain any member of $\mathcal{F}$ as a subgraph. The degenerate Turán problem regards the asymptotic behavior of $\text{ex}(n,\mathcal{F})$ for families $\mathcal{F}$ of bipartite graphs. In this paper, we prove four new theorems regarding the extremal number and its variants. We begin by investigating several notions central to providing lower bounds on extremal numbers, including balanced rooted graphs and the Erdős--Simonovits Reduction Theorem. In addition, we present new lower bounds on the asymmetric extremal number $\text{ex}(m,n,F)$ and the lopsided asymmetric extremal number $\text{ex}^*(m,n,F)$ when $F$ is a blowup of a bipartite graph or a theta graph.

196) Alexander J. Ding, An Evaluation of UPC++ by Porting Shared-Memory Parallel Graph Algorithms (1 Jan 2020)

Unified Parallel C++ (UPC++), a C++ library, attempts to address the programming difficulty introduced by distributed parallel systems and still take advantage of the model's high scalability by exposing an API that represents the distributed memory as a contiguous global address space, similar to that of a shared-memory parallel system. Though previous work, including the various benchmarks by UPC++ developers, has demonstrated the library's effectiveness in simple tasks and in porting distributed-memory parallel algorithms that are often implemented in OpenMPI, an assessment of the ease and effectiveness of porting shared-memory parallel algorithms into UPC++ is lacking. We implement a number of graph algorithms in OpenMP, a common shared-memory parallel library, and port them into UPC++ in a locality-aware, communication-averse manner to evaluate the convenience, scalability, and robustness of UPC++. Tests on both a single-node, multicore system and the NERSC supercomputer (a multi-node system), with a plethora of real and random input graphs, demonstrate a number of prerequisites for high scalability in our UPC++ implementation: large input graphs, dense input graphs, and dense operations. Similar tests on our OpenMP implementation function as control, proving the algorithms' performance in shared-memory systems. Despite the relatively straightforward and naive porting from OpenMP, we still achieve competitive performance and scalability in dense algorithms on large inputs. The porting demonstrates UPC++'s ease of use and good porting potential, especially when compared with other distributed libraries like OpenMPI. Finally, we extrapolate a distributed graph processing system on UPC++, optimized with a hybrid top-down/bottom-up approach, to simplify future distributed graph algorithm implementations.

195) Jason Yang (PRIMES), Martin Falk (MIT), and Sameer Abraham (MIT), The relationship between gene expression correlation and 3D genome organization (31 Dec 2019)

In some organisms such as E. coli and S. cerevisiae yeast, it is known that there is a relationship between the distance among genes and their coexpression (Pannier et al., Kruglyak and Tang). It is also known that in general there is a relationship between gene function and genome structure (Szabo et al.). One might also expect to find a relationship between gene expression and TADs, which are domains within the genome where loci inside contact each other more frequently than loci outside. However, by analyzing data from Mus musculus brain cells, we do not find a relationship between gene pair correlation of single-cell RNA-seq gene expression and gene pair distance. Furthermore, despite the body of work linking gene expression and TAD structure, we also find no difference between gene pairs within a single TAD and between two TADs in terms of the relationship between gene pair distance and correlation. Additionally, we find that gene pair correlation is not related to the biological functions of the genes. However, there is a relationship between highly negative gene pair correlation and the number of times both genes are expressed 0 times across different cells.

194) Sarah Chen (PRIMES), Karl Clauser, Travis Law, and Tamara Ouspenskaia (Broad Institute), Seeking Neoantigen Candidates within Retained Introns (28 Dec 2019)

Major histocompatibility complex class I (MHC I) molecules present peptides from cytosolic proteins on the surface of cells. Cytotoxic T cells can recognize the presented antigens, and infected or cancerous cells that present non-self antigens can elicit an immune response. The identification of cancer-specific peptides (neoantigens) produced by somatic mutations in tumor cells and presented by MHC I molecules enables immunotherapies such as personalized cancer vaccines and adoptive T cell transfer. The state-of-the-art approach searches for neoantigens derived from cancer-specific somatic variants and often falls short for cancers with few somatic mutations. Retained introns (RIs) resulting from splicing errors in cancer are an additional source of neoantigens. In this study, we identify RIs which are transcribed, translated, and contribute peptides to MHC I presentation. Using de novo transcriptome assembly of RNA-seq data, we identified 1799 RIs in B721.221 cells. Additionally, we detected 87 peptides from 83 RIs by liquid chromatography-tandem mass spectrometry of the MHC I immunopeptidome (LC-MS/MS). Finally, we use ribosome profiling (Ribo-seq), which provides a readout of mRNA translation, to identify RIs that are translated, a prerequisite for MHC I presentation. Previous studies have predicted thousands of RIs but have been able to validate only a handful through mass spectrometry. By distinguishing transcribed but untranslated versus translated candidates, Ribo-seq has the potential to improve RI predictions. We propose the use of a combination of RNA-seq and Ribo-seq, paired with mass spectrometry validation, to more accurately predict the contribution of RIs to the MHC I immunopeptidome, enabling the use of RI-derived neoantigens in future immunotherapies.

193) Kevin Edward Zhao and Vishnu Emani, The Role of Protein Occupancy in DNA Compartmentalization (23 Dec 2019)

The organization of DNA throughout the genome is a complex process to study. Analysis reveals a checker-board pattern of separation at a megabase-pair scale, called compartments, which are captured well by the largest eigenvector of the Hi-C contact matrix. The sign of the eigenvector correlates with active and repressed areas of the genome. These compartments have been characterized into two categories, called A and B compartments, which are hypothesized to be spatially separated based upon the protein occupancy in the region. This project explores the factors that govern DNA compartmentalization, including the relationship between compartments and protein occupancy. In order to analyze contacts within the genome, Hi-C data was loaded and the eigenvectors of the contact matrix were computed. Protein occupancy in murine cortical neurons and neural progenitor cells was measured via ChIP-Seq. Using this data, we calculated the influence of several proteins on the sign of the Hi-C eigenvector via regression and Support Vector Machines (SVMs). Based on our findings, we tried to develop a simple model for compartments and explored this via simulations. We developed simple simulations of compartments based on ChIP-Seq data, and compared the results to compartments identified in experimental Hi-C maps. The results demonstrate a high correlation between the eigenvectors of the simulated and experimental Hi-C maps. In conclusion, the computational methods are effective at determining the proteins which most significantly contribute to compartmentalization.

192) Neil Chowdhury, A method to recognize universal patterns in genome structure using Hi-C (22 Dec 2019)

The expression of genes in cells is a complicated process. Expression levels of a gene are determined not only by its local neighborhood but also by more distal regions, as is the case with enhancer-promoter interactions, which can connect regions millions of bases away. The large-scale organization of DNA within the cell nucleus plays a substantial role in gene expression and cell fate, with recent developments in biochemical assays (such as Hi-C) generating quantitative maps of the higher-order structure of DNA. The interactions captured by Hi-C have been attributed to several distinct physical processes. One of the processes is that of segregation of DNA into compartmental domains by phase separation. While the current consensus is that there are broadly two types of compartmental domains (A and B), there is some evidence for a larger number of compartmental domains. Here a methodology to determine the identity and number of such compartments is presented, and it is observed that there are four distinct compartments within the genome.

191) Yizhen Chen, Mobile Sensor Networks: Bounds on Capacity and Complexity of Realizability (22 Dec 2019; arXiv.org, 21 Jan 2020), submitted to Electronic Journal of Combinatorics

In a restricted combinatorial mobile sensor network (RCMSN), there are n sensors that continuously receive and store information from outside. Every two sensors communicate exactly once, and at an event when two sensors communicate, they receive and store additionally all information the other has stored. C. Gu, I. Downes, O. Gnawali, and L. Guibas proposed a capacity of information diffusion in mobile sensor networks. They collected all information received by two sensors between a communication event and the previous communication events for each of them into one information packet, and considered the number of sensors a packet eventually reaches. Then they defined the capacity of an RCMSN to be the ratio of the average number of sensors the packets reach and the total number of sensors. While they have studied the expected capacity of an RCMSN (when the order of communications is random), we found the RCMSNs with maximum and minimum capacities. We also found the maximum, minimum, and expected capacities for several related mobile sensor network constructions, such as ones generated from intersections of lines, as well as complexity results concerning when a mobile sensor network can be generated in such geometric ways.

190) Andrew Zhang, Antimicrobial resistance prediction using deep convolutional neural networks on whole genome sequence data (19 Dec 2019)

We propose a method to determine whether a bacterial strain is resistant to an antibiotic based on its whole genome sequence data using deep machine learning – deep convolutional neural networks (DCNN). The DCNN model developed in this research is shown to achieve an average AMR prediction accuracy of 94.7%. Each prediction takes less than a second. The model is verified with Klebsiella pneumoniae resistance to tetracycline data and Acinetobacter baumannii resistance to carbapenem data from the public database PATRIC. The DCNN model is further tested with clinically collected genomic data of 149 strains of Mycobacterium tuberculosis, and achieves a prediction accuracy of 93.1% for resistance to pyrazinamide (PZA). To find genes that harbor mutations of PZA resistance, we build a Support Vector Machine (SVM) model tailored for VCF format genomic data, which has revealed two novel genes, embB and gyrA, that harbor mutations associated with PZA resistance besides the well-known pncA gene. Our DCNN and SVM Machine Learning framework, if used together with the real-time genome sequencing machines, which are now already available, could make rapid AMR predictions, allowing for critical time to ensure good patient outcomes and preventing outbreaks of deadly AMR infections. Furthermore, the developed framework identifies pertinent resistance genes, helping researchers understand the mechanisms behind resistance. Finally, this research demonstrates how deep machine learning techniques can produce high accuracy predictive models accelerating the diagnosis of AMR.

189) Rupert Li, Pulses of Flow-firing Processes (8 Dec 2019)

Flow-firing is a natural generalization of chip-firing, or the abelian sandpile model, to higher dimensions, operating on infinite planar graphs. The edges of the graph have flow, which is rerouted through the faces of the graph. We investigate initial flow configurations which display terminating behavior and global confluence, meaning the terminating configuration is unique. The pulse configuration over a hole, or a configuration of flow going around a face that cannot redirect flow, is known to display global confluence, and we expand this result to initial configurations that have multiple pulses, identifying which terminating configurations are possible. We also generalize the analysis of the global confluence of pulses to configurations with flow outside of the hole, especially to the configuration of a pulse with radius, and prove under what conditions this displays global confluence. We conclude with a conjecture on the global confluence of a generalization of a pulse with radius, a uniform conservative configuration, or contour.

188) Yibo Gao (MIT) and Rupert Li (PRIMES), Compatible Recurrent Identities of the Sandpile Group and Maximal Stable Configurations (18 Nov 2019; arXiv.org, 23 Aug 2020), published in Discrete Applied Mathematics 288 (15 Jan 2021): 123-137

In the abelian sandpile model, recurrent chip configurations are of interest as they are a natural choice of coset representatives under the quotient of the reduced Laplacian. We investigate graphs whose recurrent identities with respect to different sinks are compatible with each other. The maximal stable configuration is the simplest recurrent chip configuration, and graphs whose recurrent identities equal the maximal stable configuration are of particular interest, and are said to have the complete maximal identity property. We prove that given any graph $G$ one can attach trees to the vertices of $G$ to yield a graph with the complete maximal identity property. We conclude with several intriguing conjectures about the complete maximal identity property of various graph products.
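
To make the objects in this abstract concrete, the following Python sketch stabilizes chip configurations on a small graph with a chosen sink and computes the recurrent identity via the standard formula $e = \big(2c_{\max} - (2c_{\max})^\circ\big)^\circ$, where $c_{\max}(v) = \deg(v) - 1$ is the maximal stable configuration and $^\circ$ denotes stabilization. The function names and the adjacency-list representation are assumptions chosen for illustration, and the sink is assumed to be reachable from every vertex so that stabilization terminates.

```python
def stabilize(adj, sink, config):
    """Fire unstable non-sink vertices until every vertex v has fewer than deg(v) chips."""
    c = dict(config)
    unstable = [v for v in c if c[v] >= len(adj[v])]
    while unstable:
        v = unstable.pop()
        deg = len(adj[v])
        if c[v] < deg:
            continue
        c[v] -= deg                      # v sends one chip along each incident edge
        for u in adj[v]:
            if u != sink:                # chips sent to the sink are discarded
                c[u] += 1
                if c[u] >= len(adj[u]):
                    unstable.append(u)
        if c[v] >= deg:
            unstable.append(v)
    return c

def max_stable(adj, sink):
    return {v: len(adj[v]) - 1 for v in adj if v != sink}

def recurrent_identity(adj, sink):
    cmax = max_stable(adj, sink)
    double = {v: 2 * k for v, k in cmax.items()}
    s = stabilize(adj, sink, double)
    return stabilize(adj, sink, {v: double[v] - s[v] for v in double})

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(recurrent_identity(triangle, 0) == max_stable(triangle, 0))  # True
print(recurrent_identity(cycle4, 0) == max_stable(cycle4, 0))      # False
```

With sink $0$, the triangle's recurrent identity coincides with its maximal stable configuration, whereas the 4-cycle's does not. The complete maximal identity property in the abstract concerns graphs where this coincidence holds for the relevant choices of sink.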

187) Andrew Weinfeld, Bases for Quotients of Symmetric Polynomials (arXiv.org, 17 Nov 2019)

We create several families of bases for the symmetric polynomials. From these bases we prove that certain Schur symmetric polynomials form a basis for quotients of symmetric polynomials that generalize the cohomology and the quantum cohomology of the Grassmannian. Our work also provides an alternative proof of a result due to Grinberg.

186) Yuyuan Luo (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Minimal percolating sets for mutating infectious diseases (arXiv.org, 5 Nov 2019), published in Physical Review Research, vol. 2 (1 April 2020), featured in the Coronavirus (COVID-19) Collection from Physical Review journals by the American Physical Society

This paper is dedicated to the study of the interaction between dynamical systems and percolation models, with views towards the study of viral infections whose viruses mutate with time. Recall that $r$-bootstrap percolation describes a deterministic process where vertices of a graph are infected once $r$ neighbors of it are infected. We generalize this by introducing $F(t)$-bootstrap percolation, a time-dependent process where the number of neighbouring vertices which need to be infected for a disease to be transmitted is determined by a percolation function $F(t)$ at each time $t$. After studying some of the basic properties of the model, we consider smallest percolating sets and construct a polynomial-time algorithm to find one smallest minimal percolating set on finite trees for certain $F(t)$-bootstrap percolation models.
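
The process described above can be sketched in a few lines of Python. This is an illustrative simulation under stated assumptions, not the paper's algorithm: updates are synchronous, infection is permanent, a healthy vertex becomes infected at time $t$ when at least $F(t)$ of its neighbors are infected, and the names `percolate`, `adj`, and `steps` are hypothetical. Taking $F(t) = r$ for all $t$ recovers ordinary $r$-bootstrap percolation.

```python
def percolate(adj, seeds, F, steps):
    """Run F(t)-bootstrap percolation for a fixed number of synchronous steps."""
    infected = set(seeds)
    for t in range(1, steps + 1):
        threshold = F(t)
        newly = {
            v for v in adj
            if v not in infected
            and sum(1 for u in adj[v] if u in infected) >= threshold
        }
        infected |= newly
    return infected

# A path on five vertices: 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# Constant threshold 2 (ordinary 2-bootstrap percolation) from alternating seeds.
print(sorted(percolate(path, {0, 2, 4}, lambda t: 2, steps=1)))

# A threshold that drops from 2 to 1 after time 2: the infection stalls, then spreads.
print(sorted(percolate(path, {0}, lambda t: 2 if t <= 2 else 1, steps=6)))
```

Because the threshold may change over time, the sketch runs for a fixed number of steps rather than stopping at the first step with no new infections.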

185) Christopher Zhu, Enumerating Permutations and Rim Hooks Characterized by Double Descent Sets (arXiv.org, 28 Oct 2019)

Let $dd(I;n)$ denote the number of permutations of $[n]$ with double descent set $I$. For singleton sets $I$, we present a recursive formula for $dd(I;n)$ and a method to estimate $dd(I;n)$. We also discuss the enumeration of certain classes of rim hooks. Let $\mathcal{R}_I(n)$ denote the set of all rim hooks of length $n$ with double descent set $I$, so that any tableau of one of these rim hooks corresponds to a permutation with double descent set $I$. We present a formula for the size of $\mathcal{R}_I(n)$ when $I$ is a singleton set, and we also present a formula for the size of $\mathcal{R}_I(n)$ when $I$ is the empty set. We additionally present several conjectures about the asymptotics of certain ratios of $dd(I;n)$.
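
For small $n$, the quantity $dd(I;n)$ can be checked by brute force. The sketch below assumes the usual convention that position $i$ (with $2 \le i \le n-1$, 1-indexed) is a double descent of $\pi$ when $\pi_{i-1} > \pi_i > \pi_{i+1}$; the abstract does not spell out the indexing, so this convention and the function names are assumptions.

```python
from itertools import permutations

def double_descent_set(perm):
    """1-indexed positions i with perm[i-1] > perm[i] > perm[i+1] (assumed convention)."""
    return frozenset(
        i for i in range(2, len(perm))
        if perm[i - 2] > perm[i - 1] > perm[i]
    )

def dd(I, n):
    """Count permutations of [n] whose double descent set is exactly I."""
    target = frozenset(I)
    return sum(
        1 for p in permutations(range(1, n + 1))
        if double_descent_set(p) == target
    )

print(dd(set(), 3), dd({2}, 3))   # only 321 has a double descent when n = 3
print(dd(set(), 4))
```

Enumeration of this kind takes $n!$ steps, so it is only useful as a sanity check against recursive formulas such as the ones the paper presents.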

184) Nithin Kavi, Cutting and Gluing Surfaces (arXiv.org, 25 Oct 2019)

We start with a disk with $2n$ vertices along its boundary where pairs of vertices are connected with $n$ strips with certain restrictions. This forms a pairing. To relate two pairings, we define an operator called a cut-and-glue operation. We show that this operation does not change an invariant of pairings known as the signature. Pairings with a signature of $0$ are special because they are closely related to a topological construction through cut-and-glue operations that have other applications in topology. We prove that all balanced pairings for a fixed $n$ are connected on a surface with any number of boundary components. As a topological application, combined with works of Li, this shows that a properly embedded surface induces a well-defined grading on the sutured monopole Floer homology defined by Kronheimer and Mrowka.

183) Alejandro H. Morales (UMass Amherst) and Daniel G. Zhu (PRIMES), On the Okounkov-Olshanski formula for standard tableaux of skew shapes (arXiv.org, 9 Jul 2020); published in FPSAC 2020 Proceedings of the 32nd Conference on Formal Power Series and Algebraic Combinatorics (Online) and forthcoming in Combinatorial Theory

The classical hook length formula counts the number of standard tableaux of straight shapes. In 1996, Okounkov and Olshanski found a positive formula for the number of standard Young tableaux of a skew shape. We prove various properties of this formula, including three determinantal formulas for the number of nonzero terms, an equivalence between the Okounkov-Olshanski formula and another skew tableaux formula involving Knutson-Tao puzzles, and two $q$-analogues for reverse plane partitions, which complements work by Stanley and Chen for semistandard tableaux. We also give several reformulations of the formula, including two in terms of the excited diagrams appearing in a more recent skew tableaux formula by Naruse. Lastly, for thick zigzag shapes we show that the number of nonzero terms is given by a determinant of the Genocchi numbers and improve on known upper bounds by Morales-Pak-Panova on the number of standard tableaux of these shapes.

182) Alin Tomescu (MIT), Vivek Bhupatiraju (PRIMES), Dimitrios Papadopoulos (Hong Kong University of Science and Technology), Charalampos Papamanthou (University of Maryland, College Park), Nikos Triandopoulos (Stevens Institute of Technology), and Srinivas Devadas (MIT), Transparency Logs via Append-Only Authenticated Dictionaries, published in CCS '19 Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, United Kingdom, November 11-15, 2019, pp. 1299-1316.

Transparency logs allow users to audit a potentially malicious service, paving the way towards a more accountable Internet. For example, Certificate Transparency (CT) enables domain owners to audit Certificate Authorities (CAs) and detect impersonation attacks. Yet, to achieve their full potential, transparency logs must be bandwidth-efficient when queried by users. Specifically, everyone should be able to efficiently look up log entries by their key and efficiently verify that the log remains append-only. Unfortunately, without additional trust assumptions, current transparency logs cannot provide both small-sized lookup proofs and small-sized append-only proofs. In fact, one of the proofs always requires bandwidth linear in the size of the log, making it expensive for everyone to query the log. In this paper, we address this gap with a new primitive called an append-only authenticated dictionary (AAD). Our construction is the first to achieve (poly)logarithmic size for both proof types and helps reduce bandwidth consumption in transparency logs. This comes at the cost of increased append times and high memory usage, both of which remain to be improved to make practical deployment possible.

181) Ezra Erives (PRIMES), Srinivasan Sathiamurthy (PRIMES), and Zarathustra Brady (MIT), Asymptotics of $d$-Dimensional Visibility (arXiv.org, 16 Sep 2019)

We consider the space $[0,n]^3$, imagined as a three dimensional, axis-aligned grid world partitioned into $n^3$ $1\times 1 \times 1$ unit cubes. Each cube is either considered to be empty, in which case a line of sight can pass through it, or obstructing, in which case no line of sight can pass through it. From a given position, some of these obstructing cubes block one's view of other obstructing cubes, leading to the following extremal problem: What is the largest number of obstructing cubes that can be simultaneously visible from the surface of an observer cube, over all possible choices of which cubes of $[0,n]^3$ are obstructing? We construct an example of a configuration in which $\Omega\big(n^\frac{8}{3}\big)$ obstructing cubes are visible, and generalize this to an example with $\Omega\big(n^{d-\frac{1}{d}}\big)$ visible obstructing hypercubes for dimension $d>3$. Using Fourier analytic techniques, we prove an $O\big(n^{d-\frac{1}{d}}\log n\big)$ upper bound in a reduced visibility setting.

180) Florian Naef (MIT) and Yuting Qin (PRIMES), The Elliptic Kashiwara-Vergne Lie algebra in low weights (arXiv.org, 7 Aug 2019)

In this paper, we study the elliptic Kashiwara-Vergne Lie algebra $\mathfrak{krv}$, which is a certain Lie subalgebra of the Lie algebra of derivations of the free Lie algebra in two generators. It has a natural bigrading, such that the Lie bracket is of bidegree $(-1,-1)$. After recalling the graphical interpretation of this Lie algebra, we examine low degree elements of $\mathfrak{krv}$. More precisely, we find that $\mathfrak{krv}^{(2,j)}$ is one-dimensional for even $j$ and zero for odd $j$. We also compute $\operatorname{dim}\left(\mathfrak{krv}^{(3,m)}\right) = \lfloor\frac{m-1}{2}\rfloor - \lfloor\frac{m-1}{3}\rfloor$. In particular, we show that in those degrees there are no odd elements and also confirm Enriquez' conjecture in those degrees.

179) Vincent Huang (PRIMES) and James Unwin (University of Illinois, Chicago), Markov Chain Models of Refugee Migration Data (arXiv.org, 19 March 2019), published in IMA Journal of Applied Mathematics (2020): 1-21

The application of Markov chains to modelling refugee crises is explored, focusing on local migration of individuals at the level of cities and days. As an explicit example we apply the Markov chains migration model developed here to UNHCR data on the Burundi refugee crisis. We compare our method to a state-of-the-art 'agent-based' model of Burundi refugee movements, and highlight that Markov chain approaches presented here can improve the match to data while simultaneously being more algorithmically efficient.

178) Sean Elliott, Anti-Ramsey Type Problems (14 March 2019)

A classical theorem due to Ramsey says the following: Given a finite number of colors and a positive integer p, any edge-coloring of the complete graph $K_n$ will contain a monochromatic copy of $K_p$ as long as n is sufficiently large. A related problem is to consider colorings of $K_n$ for which every copy of $K_4$ uses at least $3$ distinct colors, and ask for the minimum number of colors that can be used to produce such a coloring. Here we present an alternate proof of the best known upper bound, which is $2^{O(\sqrt{\log n})}$. We also consider the problem of covering a regular graph with regular bipartite subgraphs. The motivation for this problem comes from the example of covering $K_n$ with complete bipartite subgraphs, which can be done with $\log_{2} (n)$ many subgraphs. Here we show that with high probability, a random $d$-regular graph with an even number of vertices can be covered with $c {\log d}$ many regular bipartite subgraphs for an absolute constant $c$.
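
The motivating example, covering $K_n$ with about $\log_2 n$ complete bipartite subgraphs, has a short constructive explanation: label the vertices $0, \dots, n-1$ in binary and, for each bit position, split the vertices by the value of that bit. Any two distinct vertices differ in some bit, so every edge lies in at least one of the resulting complete bipartite graphs. The Python sketch below illustrates this standard construction; the function names are chosen here for illustration, and it uses $\lceil \log_2 n \rceil$ subgraphs for $n \geq 2$.

```python
from itertools import combinations

def bipartite_cover(n):
    """Return ceil(log2 n) bipartitions (left, right) of range(n), one per bit position."""
    bits = max(1, (n - 1).bit_length())
    return [
        ([v for v in range(n) if not (v >> b) & 1],
         [v for v in range(n) if (v >> b) & 1])
        for b in range(bits)
    ]

def covers_complete_graph(n, cover):
    """Check that every edge {u, v} of K_n is split by at least one bipartition."""
    return all(
        any((u in left) != (v in left) for left, _ in cover)
        for u, v in combinations(range(n), 2)
    )

cover = bipartite_cover(8)
print(len(cover), covers_complete_graph(8, cover))
```

For $n = 8$ this produces three bipartitions, one for each bit of the labels $0$ through $7$.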

177) Alan Yan, Asymptotic Counting in Dynamical Systems (4 March 2019)

We consider several dynamically generated sets with certain measurable properties such as the diameters or angles. We define various counting functions on these geometric objects which quantify these properties and explore the asymptotics of these functions. We conjecture that these functions grow like power functions with exponent the dimension of the residual set. The main objects that we examine are Fatou components of the quadratic family and limit sets of Schottky groups. Finally, we provide heuristic algorithms to compute the counting functions in these examples in an attempt to confirm this conjecture.

176) Archer Wang, Hilbert Series of Quasiinvariant Polynomials (19 Feb 2019)

The space of quasiinvariant polynomials generalizes that of symmetric polynomials: under the action of the symmetric group, the polynomials remain invariant to a certain order. We discern the structure and symmetries of quasiinvariant polynomials by way of examining the invariance of relevant polynomial spaces under certain specific group actions. Both pure and computational methods are employed in this pursuit. Felder and Veselov, when studying quasiinvariant polynomials, made a breakthrough discovery in computing their Hilbert series in fields of characteristic 0, and since then, quasiinvariant polynomials have been extensively studied due to their applications in representation theory, algebraic geometry, and mathematical physics. We investigate the Hilbert series of quasiinvariant polynomials that are divisible by a generic homogeneous polynomial. We also continue the previous work regarding their Hilbert series in fields of prime characteristic.

175) Sanjit Bhat (PRIMES), Dimitris Tsipras (MIT), and Aleksander Madry (MIT), Towards Efficient Methods for Training Robust Deep Neural Networks (13 Feb 2019).

In recent years, it has been shown that neural networks are vulnerable to adversarial examples, i.e., specially crafted inputs that look visually similar to humans yet cause machine learning models to make incorrect predictions. A lot of research has been focused on training robust models--models immune to adversarial examples. One such method is Adversarial Training, in which the model continuously trains on adversarially perturbed inputs. However, since these inputs require significant computation time to create, Adversarial Training is often much slower than vanilla training. In this work, we explore two approaches to increase the efficiency of Adversarial Training. First, we study whether faster yet less accurate methods for generating adversarially perturbed inputs suffice to train a robust model. Second, we devise a method for asynchronous parallel Adversarial Training and analyze a phenomenon of independent interest that arises--staleness. Taken together, these two techniques enable comparable robustness on the MNIST dataset to prior art with a 26× reduction in training time from 4 hours to just 9 minutes.

174) Jesse Geneson (Iowa State University), Carl Joshua Quines (PRIMES), Espen Slettnes (PRIMES), Shen-Fu Tsai (Google), Expected capture time and throttling number for cop versus gambler (arXiv.org, 10 Feb 2019)

We bound expected capture time and throttling number for the cop versus gambler game on a connected graph with $n$ vertices, a variant of the cop versus robber game that is played in darkness, where the adversary hops between vertices using a fixed probability distribution. The paper that originally defined the cop versus gambler game focused on two versions, a known gambler whose distribution the cop knows, and an unknown gambler whose distribution is secret. We define a new version of the gambler where the cop makes a fixed number of observations before the lights go out and the game begins. We show that the strategy that gives the best possible expected capture time of $n$ for the known gambler can also be used to achieve nearly the same expected capture time against the observed gambler when the cop makes a sufficiently large number of observations. We also show that even with only a single observation, the cop is able to achieve an expected capture time of approximately $1.5n$, which is much lower than the expected capture time of the best known strategy against the unknown gambler (approximately $1.95n$).

173) John Kuszmaul, Verkle Trees (5 Feb 2019)

We present Verkle Trees, a bandwidth-efficient alternative to Merkle Trees. Merkle Trees are currently employed in a variety of applications in which membership proofs are sent across a network, including consensus protocols, public-key directories, cryptocurrencies such as Bitcoin, and Secure File Systems. A Merkle Tree with n leaves has $O({\log_2 n})$-sized proofs. In large trees, sending the proofs can dominate bandwidth consumption. Vector Commitments (VCs) pose a potential alternative to Merkle Trees, with constant-sized proofs. Unfortunately, VC construction time is $O(n^2)$, which is too large for many applications. We present Verkle Trees, which are constructed similarly to Merkle Trees, but using Vector Commitments rather than cryptographic hash functions. In a Merkle Tree, a parent node is the hash of its children. In a Verkle Tree, a parent node is the Vector Commitment of its children. A Verkle Tree with branching factor k achieves $O(kn)$ construction time and $O({\log_k n})$ membership proof-size. This means that the branching factor, k, offers a tradeoff between computational power and bandwidth. The bandwidth reduction is independent of the depth of the tree; it depends only on the branching factor. We find experimentally that with a branching factor of k = 1024, which provides a factor of 10 reduction in bandwidth, it takes 110.1 milliseconds on average per leaf to construct a Verkle Tree with $2^{14}$ leaves. A branching factor of k = 32, which provides a bandwidth reduction factor of 5, yields a construction time of 8.4 milliseconds on average per leaf for a tree with $2^{14}$ leaves. (The performance on a tree with $2^{14}$ leaves is representative of larger trees because the asymptotics already dominate the computation costs.) My role in this research project has been proving the time complexities of Verkle Trees, implementing Verkle Trees, and testing and benchmarking the implementation.
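
The proof-size comparison above can be checked with a short calculation. The following Python sketch is illustrative only and is not taken from the paper: the helper names `tree_depth` and `asymptotic_reduction` are hypothetical, and it assumes that a proof contains one element per tree level and that proof elements are of comparable size in both tree types. Under those assumptions, the ratio $\log_2 n / \log_k n = \log_2 k$ reproduces the reduction factors quoted in the abstract (10 for k = 1024, 5 for k = 32).

```python
import math


def tree_depth(num_leaves, branching):
    """Number of levels needed for a tree with the given branching factor
    to hold num_leaves leaves (integer arithmetic, no rounding issues)."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth


def asymptotic_reduction(branching):
    """Ratio log_2(n) / log_k(n), which simplifies to log_2(k).

    Assumption: one proof element per level, equal element sizes."""
    return math.log2(branching)


if __name__ == "__main__":
    leaves = 2 ** 14
    for k in (2, 32, 1024):
        print(k, tree_depth(leaves, k), asymptotic_reduction(k))
```

For a concrete tree the depth is rounded up to a whole number of levels, so the realized saving for a particular n can differ from the asymptotic ratio.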

172) Andrew Ahn (MIT), Gopal Goel (PRIMES), and Andrew Yao (PRIMES), Derivative Asymptotics of Uniform Gelfand-Tsetlin Patterns (1 Feb 2019)

Bufetov and Gorin introduced the idea of applying differential operators which are diagonalized by the Schur functions to Schur generating functions, a generalization of probability generating functions to particle systems. This technology allowed the authors to access asymptotics of a variety of particle systems. We use this technique to analyze uniformly distributed Gelfand-Tsetlin patterns where the top row is fixed. In particular, we obtain limiting moments for the difference of empirical measure for two adjacent rows in uniformly random Gelfand-Tsetlin patterns.

171) William Fisher, Polynomial Wolff Axioms and Multilinear Kakeya-type Estimates for Bent Tubes in $R^n$ (31 Jan 2019)

In this paper we consider the applicability of Guth and Zahl's polynomial Wolff axioms to bent tubes. We demonstrate that Guth and Zahl's multilinear bounds hold for tubes defined by low degree algebraic curves with bounded $C^2$-norms. To show this we give an exposition of their proof in an n-dimensional, k-linear context. In considering the ability to obtain linear bounds using the multilinear bounds we utilize the strategy of Guth and Bourgain. We find that the multilinear bounds obtained from Guth and Zahl's technique break the inductive structure of this process and thus provide inferior bounds to the endpoint cases of Bennett, Carbery, and Tao's multilinear bounds. We discuss future research directions, which could eventually remedy this, that improve multilinear bounds by adding the assumption that the collection of tubes lies near a k-plane.

170) Rinni Bhansali (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), A Trust Model in Bootstrap Percolation (21 Jan 2019; arXiv.org, 23 May 2019), published in the Proceedings of the Royal Society A, vol. 476, no. 2235 (1 March 2020)

Bootstrap percolation is a class of monotone cellular automata describing an activation process which follows certain activation rules. In particular, in the classical r-neighbor bootstrap process on a graph G, a set A of initially infected vertices spreads by infecting vertices with at least r already-infected neighbors. Motivated by the study of social networks and biological interactions through graphs, where vertices represent people and edges represent the relations amongst them, we introduce here a novel model which we name T-bootstrap percolation (T-BP). In this new model, vertices of the graph G are assigned random labels, and the set of initially infected vertices spreads by infecting (at each time step) vertices with at least a fixed number of already-infected neighbors of each label. The Trust Model for Bootstrap Percolation allows one to impose a preset level of skepticism towards a rumor, as it requires a rumor to be validated by numerous groups in order for it to spread, hence imposing a predetermined level of trust needed for the rumor to spread. By considering different random and non-random networks, we describe various properties of this new model (e.g., the critical probability of infection and the confidence threshold), and compare it to other types of bootstrap percolation from the literature, such as U-bootstrap percolation. Ultimately, we describe its implications when applied to rumor spread, fake news, and marketing strategies, along with potential future applications in modeling the spread of genetic diseases.
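
As a point of reference, the classical r-neighbor process described above can be sketched in a few lines of Python. This is a minimal illustration of the classical rule only, not of the T-BP model introduced in the paper; the function name `bootstrap_percolation` and the adjacency-dictionary representation of the graph are assumptions made for the example.

```python
def bootstrap_percolation(adjacency, initially_infected, r):
    """Run classical r-neighbor bootstrap percolation to its fixed point.

    adjacency: dict mapping each vertex to an iterable of its neighbors.
    initially_infected: iterable of vertices in the starting set A.
    r: a vertex becomes infected once at least r neighbors are infected.
    Returns the final set of infected vertices.
    """
    infected = set(initially_infected)
    while True:
        # Synchronous update: collect every healthy vertex that currently
        # has at least r infected neighbors.
        newly_infected = {
            v
            for v, neighbors in adjacency.items()
            if v not in infected
            and sum(1 for u in neighbors if u in infected) >= r
        }
        if not newly_infected:
            return infected
        infected |= newly_infected


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 - 4.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(sorted(bootstrap_percolation(path, {0}, r=1)))
    print(sorted(bootstrap_percolation(path, {0, 2}, r=2)))
```

Because the process is monotone (infected vertices never recover), the loop terminates after at most as many rounds as there are vertices.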

169) Stanley Wang, Connectedness of the Moduli Space of Genus 1 Planar Tropical Curves (arXiv.org, 12 Jan 2019)

Tropical geometry is a relatively recent field in mathematics created as a simplified model for certain problems in algebraic geometry. We introduce the definition of abstract and planar tropical curves as well as their properties, including combinatorial type and degree. We also talk about the moduli space, a geometric object that parameterizes all possible types of abstract or planar tropical curves subject to certain conditions. Our research focuses on the moduli spaces of planar tropical curves of genus one, arbitrary degree d and any number of marked, unbounded edges. We prove that these moduli spaces are connected.

168) Aayush Karan, Generating Set for Nonzero Determinant Links Under Skein Relation (arXiv.org, 6 Jan 2019), published in Topology and its Applications, vol. 265 (15 September 2019)

Traditionally introduced in terms of advanced topological constructions, many link invariants may also be defined in much simpler terms given their values on a few initial links and a recursive formula on a skein triangle. Then the crucial question to ask is how many initial values are necessary to completely determine such a link invariant. We focus on a specific class of invariants known as nonzero determinant link invariants, defined only for links which do not evaluate to zero on the link determinant. We restate our objective by considering a set $\mathcal{S}$ of links subject to the condition that if any three nonzero determinant links belong to a skein triangle, any two of these belonging to $\mathcal{S}$ implies that the third also belongs to $\mathcal{S}$. Then we aim to determine a minimal set of initial generators so that $\mathcal{S}$ is the set of all links with nonzero determinant. We show that only the unknot is required as a generator if the skein triangle is unoriented. For oriented skein triangles, we show that the unknot and Hopf link orientations form a set of generators.

167) Jiwon Choi, Gromov-Hausdorff Distance Between Metric Graphs (2 Jan 2019)

In this paper we study the Gromov-Hausdorff distance between two metric graphs. We compute the precise value of the Gromov-Hausdorff distance between two path graphs. Moreover, we compute the precise value of the Gromov-Hausdorff distance between a cycle graph and a tree. Given a graph X, we consider a graph Y that results from adding an edge to X without changing the number of vertices. We compute the precise value of the Gromov-Hausdorff distance between X and Y.

166) Kaiying Hou, Agent-based Models for Conservation Equations (31 Dec 2018)

In this research, we use agent-based models to solve conservation equations. A conservation equation is a partial differential equation that describes any conserved quantity by establishing a relationship between the density and the flux. It is used in areas such as traffic flow and fluid dynamics. Past research on numerically solving conservation equations mainly tackles the problem by establishing discrete cells in the space and approximating the densities in the cells. In this research, we use an agent-based model, in which we describe the solution through the movement of particles in the space. We propose an agent-based model for conservation equation in 1-D space. We found a change of variables that transforms the original conservation equation to the specific volume conservation equation. This transform allows us to apply results in finite volume method to the agent-based model and find a condition for the agent-based solution to converge to the exact solution of scalar conservation equations.

165) Andy Xu, Approximating the Hurwitz Zeta Function (22 Dec 2018)

This project aims to implement a MATLAB function that approximates the Hurwitz zeta function $\zeta(s, a)$. This is necessary because the naive implementation fails for certain input near critical values for $s$ and for $a$. Other series representations of the Hurwitz zeta function converge rapidly but do not handle complex values of $s$ and/or $a$. We also consider existing forms for the Hurwitz zeta function, including one given by Bailey and Borwein, and evaluate their overall performance.
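
For context, the naive approach can be taken to mean direct truncation of the defining series $\zeta(s, a) = \sum_{n \ge 0} (n + a)^{-s}$. The Python sketch below is an illustration under that assumption and is not the project's MATLAB code; the names `hurwitz_zeta_naive` and `tail_bound` are hypothetical. It shows that the truncation error decays only like $N^{1-s}$, so the sum becomes impractical as $s$ approaches 1.

```python
import math

# Known value used for checking: zeta(2, 1) = pi^2 / 6.
BASEL_VALUE = math.pi ** 2 / 6


def hurwitz_zeta_naive(s, a, terms=100_000):
    """Truncated defining series sum_{n=0}^{terms-1} (n + a)^(-s).

    Only meaningful for Re(s) > 1 and a > 0; convergence is slow."""
    return sum((n + a) ** (-s) for n in range(terms))


def tail_bound(s, a, terms):
    """Integral upper bound on the neglected tail for real s > 1, a > 0:
    sum_{n >= terms} (n + a)^(-s) <= (terms - 1 + a)^(1 - s) / (s - 1)."""
    return (terms - 1 + a) ** (1 - s) / (s - 1)


if __name__ == "__main__":
    approx = hurwitz_zeta_naive(2.0, 1.0)
    print(approx, abs(approx - BASEL_VALUE), tail_bound(2.0, 1.0, 100_000))
    # Near s = 1 the same number of terms leaves a much larger tail.
    print(tail_bound(1.01, 1.0, 100_000))
```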

164) Allen Wang (PRIMES) and Guangyi Yue (MIT), Relationship Between Mullineux Involution and the Generalized Regularization (arXiv.org, 19 Dec 2018), published in European Journal of Combinatorics 85 (March 2020)

The Mullineux involution is an important map on $p$-regular partitions that originates from the modular representation theory of $\mathcal{S}_n$. In this paper we study the Mullineux transpose map and the generalized column regularization and prove a condition under which the two maps are exactly the same. Our results generalize the work of Bessenrodt, Olsson and Xu, and the combinatorial constructions are related to the Iwahori-Hecke algebra and the global crystal basis of the basic $U_q(\widehat{\mathfrak{sl}}_b)$-module. In the conclusion, we provide several conjectures regarding the $q$-decomposition numbers and generalizations of results due to Fayers.

163) Maximillian Guo, Behavior of Bar-Natan Homology under Conway Mutation (18 Dec 2018)

The Bar-Natan homology is a perturbation of the Khovanov homology of a knot. Previous work has shown that Khovanov homology remains unchanged under Conway mutation of the knot diagram. We give an exact triangle with three different resolutions of a link and prove several lemmas relating the dimensions of different Bar-Natan chain complexes and homologies. These allow us to prove that the dimension of the Bar-Natan homology $BN^k (L; \mathbb{Z}/2\mathbb{Z})$ is invariant under Conway mutation.

162) Nithin Kavi (PRIMES), Wendy Wu (PRIMES), and Zhenkun Li (MIT), Trunk of Satellite and Companion Knots (arXiv.org, 8 Dec 2018), published in Topology and its Applications, vol. 272 (1 March 2020)

We study the knot invariant called trunk, as defined by Ozawa, and the relation of the trunk of a satellite knot with the trunk of its companion knot. Our first result is ${\rm trunk}(K) \geq n \cdot {\rm trunk}(J)$ where ${\rm trunk}(\cdot)$ denotes the trunk of a knot, $K$ is a satellite knot with companion $J$, and $n$ is the winding number of $K$. To upgrade winding number to wrapping number, which we denote by $m$, we must include an extra factor of $\frac{1}{2}$ in our second result ${\rm trunk}(K) > \frac{1}{2} m \cdot {\rm trunk}(J)$ since $m \geq n$. We also discuss generalizations of the second result.

161) Merrick Cai (PRIMES) and Daniil Kalinov (MIT), The Hilbert Series of the Irreducible Quotient of the Polynomial Representation of the Rational Cherednik Algebra of Type $A_{n-1}$ in Characteristic $p$ for $p|n-1$ (arXiv.org, 12 Nov 2018)

We study the irreducible quotient $\mathcal{L}_{t,c}$ of the polynomial representation of the rational Cherednik algebra $\mathcal{H}_{t,c}(S_n,\mathfrak{h})$ of type $A_{n-1}$ over an algebraically closed field of positive characteristic $p$ where $p|n-1$. In the $t=0$ case, for all $c\ne 0$ we give a complete description of the polynomials in the maximal proper graded submodule $\ker \mathcal{B}$, the kernel of the contravariant form $\mathcal{B}$, and subsequently find the Hilbert series of the irreducible quotient $\mathcal{L}_{0,c}$. In the $t=1$ case, we give a complete description of the polynomials in $\ker \mathcal{B}$ when the characteristic $p=2$ and $c$ is transcendental over $\mathbb{F}_2$, and compute the Hilbert series of the irreducible quotient $\mathcal{L}_{1,c}$. In doing so, we prove a conjecture due to Etingof and Rains completely for $p=2$, and also for any $t=0$ and $n\equiv 1\pmod{p}$. Furthermore, for $t=1$, we prove a simple criterion to determine whether a given polynomial $f$ lies in $\ker \mathcal{B}$ for all $n=kp+r$ with $r$ and $p$ fixed.

160) Tanya Khovanova (MIT) and Eric Zhang (PRIMES), On 3-Inflatable Permutations (arXiv.org, 22 Sept 2018), published in The Electronic Journal of Combinatorics 28:1 (2021)

Call a permutation $k$-inflatable if it can be "blown up" into a convergent sequence of permutations by a uniform inflation construction, such that this sequence is symmetric with respect to densities of induced subpermutations of length $k$. We study properties of 3-inflatable permutations, finding a general formula for limit densities of pattern permutations in the uniform inflation of a given permutation. We also characterize and find examples of $3$-inflatable permutations of various lengths, including the shortest examples with length $17$.

159) Sathwik Karnik, Bounds on the Maximal Cardinality of an Acute Set in a Hypercube (7 Sept 2018)

The acute set problem asks the following question: what is the maximal cardinality of a $d$-dimensional set of points such that all angles formed between any three points are acute? In this paper, we consider an analogous problem with the condition that the acute set is a subset of a $d$-dimensional unit hypercube. We provide an explicit construction and proof to show that a lower bound for the maximum cardinality of an acute set in $\{0,1\}^d$ is $2^{2^{\lfloor \log_3 d \rfloor}}$. Using a similar construction, we improve this lower bound to $2^{d/3}$. Through a consideration of points diagonally opposite a particular point on 2-faces, we improve the upper bound to $\left(1 + \dfrac{2}{d}\right)\cdot 2^{d-2}$. We then seek to generalize these findings and a combinatorial interpretation of the problem in $\{0,1\}^d$.

158) Vincent Bian, Special Configurations in Anchored Rectangle Packings (arXiv.org, 6 Sept 2018)

Given a finite set S in $[0,1]^2$ including the origin, an anchored rectangle packing is a set of non-overlapping rectangles in the unit square where each rectangle has a point of S as its left-bottom corner and contains no point of S in its interior. Allen Freedman conjectured in the 1960s that one can always find an anchored rectangle packing with total area at least $1/2$. We verify the conjecture for point configurations whose relative positions belong to certain classes of permutations.

157) Tanya Khovanova (MIT) and Wayne Zhao (PRIMES), Mathematics of a Sudo-Kurve (arXiv.org, 20 Aug 2018), published in Recreational Mathematics Magazine, no. 10 (2018): 5-27.

We investigate a type of a Sudoku variant called Sudo-Kurve, which allows bent rows and columns, and develop a new, yet equivalent, variant we call a Sudo-Cube. We examine the total number of distinct solution grids for this type with or without symmetry. We study other mathematical aspects of this puzzle along with the minimum number of clues needed and the number of ways to place individual symbols.

156) Vinjai Vale, A new paradigm for computer vision based on compositional representation (14 May 2018)

Deep convolutional neural networks - the state-of-the-art technique in artificial intelligence for computer vision - achieve notable success rates at simple classification tasks, but are fundamentally lacking when it comes to representation. These neural networks encode fuzzy textural patterns into vast matrices of numbers which lack the semantically structured nature of human representations (e.g. "a table is a flat horizontal surface supported by an arrangement of identical legs"). This paper takes multiple important steps towards filling in these gaps. I first propose a series of tractable milestone problems set in the abstract two-dimensional ShapeWorld, thus isolating the challenge of object compositionality. Then I demonstrate the effectiveness of a new compositional representation approach based on identifying structure among the primitive elements comprising an image and representing this structure through an augmented primitive element tree and coincidence list. My approach outperforms Google's state-of-the-art Inception-v3 Convolutional Neural Network in accuracy, speed, and structural representation in my object representation milestone tasks. Finally, I present a mathematical framework for a probabilistic programming approach that can learn highly structured generative stochastic representations of compositional objects from just a handful of examples. This work is foundational for the future of general computer vision, and its applications are wide-reaching, ranging from autonomous vehicles to intelligent robotics to augmented and virtual reality.

155) Andrew Gritsevskiy (PRIMES) and Maksym Korablyov (MIT), Capsule networks for low-data transfer learning (arXiv.org, 26 Apr 2018)

We propose a capsule network-based architecture for generalizing learning to new data with few examples. Using both generative and non-generative capsule networks with intermediate routing, we are able to generalize to new information over 25 times faster than a similar convolutional neural network. We train the networks on the multiMNIST dataset lacking one digit. After the networks reach their maximum accuracy, we inject 1-100 examples of the missing digit into the training set, and measure the number of batches needed to return to a comparable level of accuracy. We then discuss the improvement in low-data transfer learning that capsule networks bring, and propose future directions for capsule research.

2017 Research Papers

154) Tanya Khovanova (MIT) and Joshua Lee (PRIMES), The 5-Way Scale (8 Mar 2019), published in Recreational Mathematics Magazine 11 (2019): 5-14

In this paper, we discuss coin-weighing problems that use a 5-way scale which has five different possible outcomes: MUCH LESS, LESS, EQUAL, MORE, and MUCH MORE. The 5-way scale provides more information than the regular 3-way scale. We study the problem of finding two fake coins from a pile of identically looking coins in a minimal number of weighings using a 5-way scale. We discuss similarities and differences between the 5-way and 3-way scale. We introduce a strategy for a 5-way scale that can find both counterfeit coins among $2^k$ coins in $k+1$ weighings, which is better than any strategy for a 3-way scale.

153) Grace Tian, Multi-Crossing Numbers for Knots (26 Jan 2019)

We study the projections of a knot K that have only n-crossings. The n-crossing number of K is the minimum number of n-crossings among all possible projections of K with only n-crossings. We obtain new results on the relation between n-crossing number and (2n − 1)-crossing number for every positive even integer n.

152) David Lu (PRIMES), Sanjit Bhat (PRIMES), Albert Kwon (MIT), and Srinivas Devadas (MIT), DynaFlow: An Efficient Website Fingerprinting Defense Based on Dynamically-Adjusting Flows (15 Oct 2018), published in Proceedings of the 2018 Workshop on Privacy in the Electronic Society (WPES 2018), pp. 109-113.

Website fingerprinting attacks enable a local adversary to determine which website a Tor user visits. In recent years, several researchers have proposed defenses to counter these attacks. However, these defenses have shortcomings: many do not provide formal guarantees of security, incur high latency and bandwidth overheads, and require a frequently-updated database of website traffic patterns. In this work, we introduce a new countermeasure, DynaFlow, based on dynamically-adjusting flows to protect against website fingerprinting. DynaFlow provides a similar level of security as current state-of-the-art while being over $40\%$ more efficient. At the same time, DynaFlow does not require a pre-established database and extends protection to dynamically-generated websites.

151) Mihir Singhal (PRIMES) and Christopher Ryba (MIT), Generalizations of Hall-Littlewood Polynomials (24 Sept 2018)

Hall-Littlewood polynomials are important functions in various fields of mathematics and quantum physics, and can be defined combinatorially using a model of path ensembles. Wheeler and Zinn-Justin applied a reflection construction to this model to obtain an expression for type BC Hall-Littlewood polynomials. Borodin applied a single-parameter deformation to the model and obtained a formula for generalized Hall-Littlewood polynomials. Borodin has asked whether a similar generalization could be applied to type BC Hall-Littlewood polynomials. We present the model incorporating Borodin's generalization. We also obtain expressions for polynomials that were previously studied by Borodin, in addition to an expression for generalized type BC Hall-Littlewood polynomials.

150) Gopal Goel (PRIMES) and Andrew Ahn (MIT), Discrete Derivative Asymptotics of the $\beta$-Hermite Eigenvalues (arXiv.org, 18 Sept 2018), published in Combinatorics, Probability and Computing (17 April 2019)

We consider the asymptotics of the difference between the empirical measures of the $\beta$-Hermite tridiagonal matrix and its minor. We prove that this difference has a deterministic limit and Gaussian fluctuations. Through a correspondence between measures and continual Young diagrams, this deterministic limit is identified with the Vershik-Kerov-Logan-Shepp curve. Moreover, the Gaussian fluctuations are identified with a sectional derivative of the Gaussian free field.

149) Franklyn Wang, Monodromy Groups of Indecomposable Rational Functions (10 Sept 2018)

The most important geometric invariant of a degree-$n$ complex rational function $f(X)$ is its monodromy group, which is a set of permutations of $n$ objects. This monodromy group determines several properties of $f(X)$. A fundamental problem is to classify all degree-$n$ rational functions which have special behavior, meaning that their monodromy group $G$ is not one of the two "typical" groups, namely $A_n$ or $S_n$. Many mathematicians have studied this problem, including Oscar Zariski, John Thompson, Robert Guralnick, and Michael Aschbacher. In this paper we bring this problem near completion by solving it when $G$ is in any of the classes of groups which previously seemed intractable. We introduce new techniques combining methods from algebraic geometry, Galois theory, group theory, representation theory, and combinatorics. The classification of rational functions with special behavior will have many consequences, including far-reaching generalizations of Mazur's theorem on uniform boundedness of rational torsion on elliptic curves and Nevanlinna's theorem on uniqueness of meromorphic functions with prescribed preimages of five points. This improved understanding of rational functions has potential significance in various fields of science and engineering where rational functions arise.

148) Michael Ma, New Results on Pattern-Replacement Equivalences: Generalizing a Classical Theorem and Revising a Recent Conjecture (6 Sept 2018)

In this paper we study pattern-replacement equivalence relations on the set $S_n$ of permutations of length $n$. Each equivalence relation is determined by a set of patterns, and equivalent permutations are connected by pattern-replacements in a manner similar to that of the Knuth relation. One of our main results generalizes the celebrated Erdős-Szekeres Theorem for permutation pattern-avoidance to a new result for permutation pattern-replacement. In particular, we show that under the $ \left \{ 123...k, k...321 \right \}$-equivalence, all permutations in $S_n$ are equivalent up to parity when $n \geq \Omega(k^2)$. Additionally, we extend the work of Kuszmaul and Zhou on an infinite family of pattern-replacement equivalences known as the rotational equivalences. Kuszmaul and Zhou proved that the rotational equivalences always yield either one or two nontrivial equivalence classes in $S_n$, and conjectured that the number of nontrivial classes depended only on the patterns involved in the rotational equivalence (rather than on $n$). We present a counterexample to their conjecture, and prove a new theorem fully classifying (for large $n$) when there is one nontrivial equivalence class and when there are two nontrivial equivalence classes. Finally, we computationally analyze the pattern-replacement equivalences given by sets of pairs of patterns of length four. We then focus on three cases, in which the number of nontrivial equivalence classes matches an OEIS sequence. For two of these we present full proofs of the enumeration and for the third we suggest a potential future method of proof.

147) Kyle Gatesman (PRIMES), James Unwin (University of Illinois at Chicago), Lattice Studies of Gerrymandering Strategies (arXiv.org, 8 Aug 2018), published in Political Analysis 29:2 (April 2021): 167-192

We propose three novel gerrymandering algorithms which incorporate the spatial distribution of voters with the aim of constructing gerrymandered, equal-population, connected districts. Moreover, we develop lattice models of voter distributions, based on analogies to electrostatic potentials, in order to compare different gerrymandering strategies. Due to the probabilistic population fluctuations inherent to our voter models, Monte Carlo methods can be applied to the districts constructed via our gerrymandering algorithms. Through Monte Carlo studies we quantify the effectiveness of each of our gerrymandering algorithms and we also argue that gerrymandering strategies which do not include spatial data lead to (legally prohibited) highly disconnected districts. Of the three algorithms we propose, two are based on different strategies for packing opposition voters, and the third is a new approach to algorithmic gerrymandering based on genetic algorithms, which automatically guarantees that all districts are connected. Furthermore, we use our lattice voter model to examine the effectiveness of isoperimetric quotient tests and our results provide further quantitative support for implementing compactness tests in real-world political redistricting.

146) William Zhang, Improved bounds on the extremal function of hypergraphs (arXiv.org, 5 Jul 2018)

A fundamental problem in pattern avoidance is describing the asymptotic behavior of the extremal function and its generalizations. We prove an equivalence between the asymptotics of the graph extremal function for a class of bipartite graphs and the asymptotics of the matrix extremal function. We use the equivalence to prove several new bounds on the extremal functions of graphs. We develop a new method to bound the extremal function of hypergraphs in terms of the extremal function of their associated multidimensional matrices, improving the bound of the extremal function of $d$-permutation hypergraphs of length $k$ from $O(n^{d-1})$ to $2^{O(k)}n^{d-1}$.

145) P. A. Crowdmath, The Broken Stick Project (arXiv.org, 16 May 2018)

The broken stick problem is the following classical question. You have a segment $[0,1]$. You choose two points on this segment at random. They divide the segment into three smaller segments. Show that the probability that the three segments form a triangle is $1/4$.
The MIT PRIMES program, together with Art of Problem Solving, organized a high school research project where participants worked on several variations of this problem. Participants were generally high school students who posted ideas and progress to the Art of Problem Solving forums over the course of an entire year, under the supervision of PRIMES mentors. This report summarizes the findings of this CrowdMath project.
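
The classical answer of $1/4$ can be sanity-checked numerically. The following Python sketch is an illustration added for the reader and is not part of the CrowdMath report; the function names are hypothetical. It uses the fact that three lengths summing to 1 form a (non-degenerate) triangle exactly when the longest is shorter than $1/2$.

```python
import random


def forms_triangle(x, y):
    """Return True if cutting [0, 1] at points x and y yields three pieces
    that satisfy the strict triangle inequality."""
    a, b = min(x, y), max(x, y)
    pieces = (a, b - a, 1.0 - b)
    # Pieces sum to 1, so the triangle inequality holds for all three
    # pieces exactly when the longest piece is shorter than 1/2.
    return max(pieces) < 0.5


def estimate_triangle_probability(trials=200_000, seed=0):
    """Monte Carlo estimate of the probability, using a seeded generator
    so that repeated runs give the same estimate."""
    rng = random.Random(seed)
    hits = sum(forms_triangle(rng.random(), rng.random()) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(estimate_triangle_probability())
```

With 200,000 trials the standard error of the estimate is about 0.001, so the printed value should land close to 0.25.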

144) Aaron Kaufer, Superalgebra in characteristic 2 (arXiv.org, 3 Apr 2018)

Following the work of Siddharth Venkatesh, we study the category $\textbf{sVec}_2$. This category is a proposed candidate for the category of supervector spaces over fields of characteristic $2$ (as the ordinary notion of a supervector space does not make sense in characteristic $2$). In particular, we study commutative algebras in $\textbf{sVec}_2$, known as $d$-algebras, which are ordinary associative algebras $A$ together with a linear derivation $d:A \to A$ satisfying the twisted commutativity rule: $ab = ba + d(b)d(a)$. In this paper, we generalize many results from standard commutative algebra to the setting of $d$-algebras; most notably, we give two proofs of the statement that Artinian $d$-algebras may be decomposed as a direct product of local $d$-algebras. In addition, we show that there exist no noncommutative $d$-algebras of dimension $\leq 7$, and that up to isomorphism there exists exactly one $d$-algebra of dimension $7$. Finally, we give the notion of a Lie algebra in the category $\textbf{sVec}_2$, and we state and prove the Poincare-Birkhoff-Witt theorem for this category.

143) Kaiying Hou and Brian Rhee, Continuum Modelling of Traffic Systems with Autonomous Vehicles (17 Mar 2018)

Describing the behavior of automobile traffic via mathematical modeling and computer simulation has been a field of study conducted by mathematicians throughout the last century. One of the oldest models in traffic flow theory casts the problem in terms of densities and fluxes in partial differential conservation laws. In the past few years, the rise of autonomous vehicles (driven by software without human intervention) presents a new problem for classical traffic modeling. Autonomous vehicles react very differently from the traditional human-driven vehicles, resulting in modifications to the underlying partial differential equation constitutive laws. In this paper, we aim to provide insight into some new proposed constitutive laws by using continuum modelling to study traffic flows with a mix of human and autonomous vehicles. We also introduce various existing traffic flow models and present a new model for traffic flow that is based on an interaction between human drivers and autonomous vehicles where each vehicle can only measure the total density of surrounding cars, regardless of human or autonomous status. By implementing the Lax-Friedrichs scheme in Octave, we test how these different constitutive laws perform in our model and analyze the density curves that form over time steps. We also analytically derive and implement a Roe solver for a class of coupled conservation equations in which the velocities of cars are polynomial functions of the total density of surrounding cars regardless of type. We hope that our results could help civil engineers bring forth real progress in implementing efficient road systems that integrate both human-operated and unmanned vehicles.
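
To make the numerical method concrete, here is a minimal Python sketch of the Lax-Friedrichs scheme for a single scalar conservation law $\rho_t + f(\rho)_x = 0$. It is not the authors' Octave code and does not implement their mixed human/autonomous model: the Greenshields flux $f(\rho) = \rho(1 - \rho)$, the periodic boundary conditions, and all function names are assumptions chosen for illustration.

```python
def greenshields_flux(density):
    """Assumed constitutive law: flux = density * (1 - density),
    with density normalized to lie in [0, 1]."""
    return density * (1.0 - density)


def lax_friedrichs_step(density, dt, dx, flux=greenshields_flux):
    """Advance cell densities by one time step on a periodic grid.

    Update rule: new[i] = (rho[i-1] + rho[i+1]) / 2
                          - dt / (2 dx) * (f(rho[i+1]) - f(rho[i-1]))
    """
    n = len(density)
    updated = []
    for i in range(n):
        left = density[i - 1]          # index -1 wraps around for i == 0
        right = density[(i + 1) % n]   # wrap around at the right edge
        average = 0.5 * (left + right)
        correction = dt / (2.0 * dx) * (flux(right) - flux(left))
        updated.append(average - correction)
    return updated


def simulate(density, steps, dt, dx):
    """Apply lax_friedrichs_step repeatedly and return the final densities."""
    for _ in range(steps):
        density = lax_friedrichs_step(density, dt, dx)
    return density


if __name__ == "__main__":
    # A dense platoon (0.8) surrounded by light traffic (0.2).
    initial = [0.8 if 20 <= i < 40 else 0.2 for i in range(100)]
    final = simulate(initial, steps=50, dt=0.005, dx=0.01)
    print(sum(initial), sum(final))
```

With this flux the characteristic speed $|f'(\rho)| = |1 - 2\rho|$ is at most 1, so the choice dt/dx = 0.5 satisfies the CFL condition; on a periodic grid the total density is conserved up to rounding.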

142) Michael Gintz, Classifying Graph Lie Algebras (14 Mar 2018)

A Lie algebra is a linear object which has a powerful homomorphism with a Lie group, an important object in differential geometry. In previous work a construction is given that builds a Lie algebra on a Dynkin diagram, a commonly studied structure in Lie theory. We expand this definition to construct a Lie algebra given any simple graph, and consider the problem of determining its structure. We begin by defining an alteration on a graph which preserves its underlying graph Lie algebra structure, and use it to simplify the general graph. We then provide a decomposition move which further simplifies the Lie algebra structure of the general graph. Finally, we combine these two moves to classify all graph Lie algebras.

141) Sanjit Bhat (PRIMES), David Lu (PRIMES), Albert Kwon (MIT), and Srinivas Devadas (MIT), Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning (arXiv.org, 28 Feb 2018), published in Proceedings on Privacy Enhancing Technologies (PETS 2019) (4): 292-310.

In recent years, there have been several works that use website fingerprinting techniques to enable a local adversary to determine which website a Tor user visits. While the current state-of-the-art attack, which uses deep learning, outperforms prior art with medium to large amounts of data, it attains marginal to no accuracy improvements when both use small amounts of training data. In this work, we propose Var-CNN, a website fingerprinting attack that leverages deep learning techniques along with novel insights specific to packet sequence classification. In open-world settings with large amounts of data, Var-CNN attains over $1\%$ higher true positive rate (TPR) than state-of-the-art attacks while achieving $4\times$ lower false positive rate (FPR). Var-CNN's improvements are especially notable in low-data scenarios, where it reduces the FPR of prior art by $3.12\%$ while increasing the TPR by $13\%$. Overall, insights used to develop Var-CNN can be applied to future deep learning based attacks, and substantially reduce the amount of training data needed to perform a successful website fingerprinting attack. This shortens the time needed for data collection and lowers the likelihood of having data staleness issues.

140) Richard Xu, Algebraicity regarding Graphs and Tilings (27 Jan 2018)

Given a planar graph G, we prove that there exists a tiling of a rectangle by squares such that each square corresponds to a face of the graph and the side lengths of the squares solve an extremal problem on the graph. Furthermore, we provide a practical algorithm for calculating the side lengths. Finally, we strengthen our theorem by restricting the centers and side lengths of the squares to algebraic numbers and explore the application of our technique in proving algebraicity in packing problems.

139) Anlin Zhang (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Modelling epidemics on d-cliqued graphs (published in Letters in Biomathematics 5:1, Jan 16, 2018)

Since social interactions have been shown to lead to symmetric clusters, we propose here that symmetries play a key role in epidemic modelling. Mathematical models on d-ary tree graphs were recently shown to be particularly effective for modelling epidemics in simple networks. To account for symmetric relations, we generalize this to a new type of networks modelled on d-cliqued tree graphs, which are obtained by adding edges to regular d-trees to form d-cliques. This setting gives a more realistic model for epidemic outbreaks originating within a family or classroom and which could reach a population by transmission via children in schools. Specifically, we quantify how an infection starting in a clique (e.g. family) can reach other cliques through the body of the graph (e.g. public places). Moreover, we propose and study the notion of a safe zone, a subset that has a negligible probability of infection.

138) Dylan Pentland, Coefficients of Gaussian polynomials modulo N (arXiv.org, 30 Dec 2017)

The $q$-analogue of the binomial coefficient, known as a $q$-binomial coefficient, is typically denoted $\left[{n \atop k}\right]_q$. These polynomials are important combinatorial objects, often appearing in generating functions related to permutations and in representation theory.
Stanley conjectured that the function $f_{k,R}(n) = \#\left\{i : [q^{i}] \left[{n \atop k}\right]_q \equiv R \pmod{N}\right\}$ is quasipolynomial for $N=2$. We generalize, showing that this is in fact true for any integer $N\in \mathbb{N}$ and determine a quasi-period $\pi'_N(k)$ derived from the minimal period $\pi_N(k)$ of partitions with at most $k$ parts modulo $N$.
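To make the counting function $f_{k,R}(n)$ concrete, here is a small Python sketch that expands $\left[{n \atop k}\right]_q$ with the $q$-Pascal recurrence $\left[{m \atop j}\right]_q = \left[{m-1 \atop j-1}\right]_q + q^{j}\left[{m-1 \atop j}\right]_q$ and tallies the coefficients by residue modulo $N$. The function names are hypothetical and the approach is plain brute force; it does not use the quasi-period results of the paper.

```python
from collections import Counter


def _trim(poly):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly


def gaussian_binomial(n, k):
    """Coefficients of the q-binomial [n choose k]_q, lowest degree first."""
    if k < 0 or k > n:
        return [0]
    # table[j] holds the coefficients of [m choose j]_q for the current m.
    table = [[1]] + [[0] for _ in range(k)]
    for m in range(1, n + 1):
        # Go downwards in j so table[j - 1] still refers to row m - 1.
        for j in range(min(m, k), 0, -1):
            shifted = [0] * j + table[j]          # q^j * [m-1 choose j]_q
            prev = table[j - 1]                   # [m-1 choose j-1]_q
            size = max(len(shifted), len(prev))
            table[j] = _trim([
                (shifted[i] if i < len(shifted) else 0)
                + (prev[i] if i < len(prev) else 0)
                for i in range(size)
            ])
    return table[k]


def residue_counts(n, k, modulus):
    """Map each residue R to f_{k,R}(n), the number of coefficients congruent to R."""
    return Counter(c % modulus for c in gaussian_binomial(n, k))


if __name__ == "__main__":
    print(gaussian_binomial(4, 2))      # 1 + q + 2q^2 + q^3 + q^4
    print(residue_counts(4, 2, 2))
```

For $n=4$, $k=2$ the polynomial is $1 + q + 2q^2 + q^3 + q^4$, so modulo $2$ there are four odd coefficients and one even coefficient.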

137) Andy Xu and Wendy Wu, Higher Gonalities of Erdös-Rényi Random Graphs (22 Dec 2017)

We consider the asymptotic behavior of the second and higher gonalities of an Erdös-Rényi random graph and provide upper bounds for both via the probabilistic method. Our results suggest that for sufficiently large $n$, the second gonality of an Erdös-Rényi random graph $G(n,p)$ is strictly less than and asymptotically equal to the number of vertices under a suitable restriction of the probability $p$. We also prove an asymptotic upper bound for all higher gonalities of large Erdös-Rényi random graphs that adapts and generalizes a similar result on complete graphs. We suggest another approach towards finding both upper and lower bounds for the second and higher gonalities for small $p=\frac{c}{n}$, using a special case of the Riemann-Roch Theorem, and fully determine the asymptotic behavior of arbitrary gonalities when $c\leq 1$.

136) Michael Ren (PRIMES) and Xiaomeng Xu (MIT), Quasi-invariants in characteristic p and twisted quasi-invariants (15 Nov 2017; arXiv.org, 31 Jul 2019)

The spaces of quasi-invariant polynomials were introduced by Feigin and Veselov, where their Hilbert series over fields of characteristic 0 were computed. In this paper, we show some partial results and make two conjectures on the Hilbert series of these spaces over fields of positive characteristic.
On the other hand, Braverman, Etingof, and Finkelberg introduced the spaces of quasi-invariant polynomials twisted by a monomial. We extend some of their results to the spaces twisted by a smooth function.

135) David Darrow, A Novel, Near-Optimal Spectral Method for Simulating Fluids in a Cylinder (13 Nov 2017)

Simulations of fluid flow offer theoretical insight into fluid dynamics and critical applications in industry, with implications ranging from blood flow to hurricanes. However, open problems in fluid dynamics require more accurate simulations and lower computational resource costs than current algorithms provide. Accordingly, we develop in this paper a novel, computationally efficient spectral method for computing solutions of the incompressible Navier–Stokes equations, which model incompressible fluid flow, on the cylinder. The method described addresses three major limitations of current methods. First, while current methods either underresolve the cylinder's boundary or overresolve its center (effectively overemphasizing less physically interesting non-boundary regions), this new method more evenly resolves all parts of the cylinder. Secondly, current simulation times scale proportionally as $N^{7/3}$ or higher (where $N$ is the number of discretization points), while the new method requires at most $\mathcal{O}(N\log N)$ operations per time step. For large $N$, this means that calculations that required weeks can now be run in minutes. Lastly, current practical methods offer only low order (algebraic) accuracy. The new method has spectral accuracy, which often represents an improvement of the accuracy of the results by 5–10 orders of magnitude or more.

134) Espen Slettnes, Carl Joshua Quines, Shen-Fu Tsai, and Jesse Geneson (CrowdMath-2017), Variations of the cop and robber game on graphs (arXiv.org, 31 Oct 2017)

We prove new theoretical results about several variations of the cop and robber game on graphs. First, we consider a variation of the cop and robber game which is more symmetric called the cop and killer game. We prove for all $c < 1$ that almost all random graphs are stalemate for the cop and killer game, where each edge occurs with probability $p$ such that $\frac{1}{n^{c}} \le p \le 1-\frac{1}{n^{c}}$. We prove that a graph can be killer-win if and only if it has exactly $k\ge 3$ triangles or none at all. We prove that graphs with multiple cycles longer than triangles permit cop-win and killer-win graphs. For $\left(m,n\right)\neq\left(1,5\right)$ and $n\geq4$, we show that there are cop-win and killer-win graphs with $m$ $C_n$s. In addition, we identify game outcomes on specific graph products.
Next, we find a generalized version of Dijkstra's algorithm that can be applied to find the minimal expected capture time and the minimal evasion probability for the cop and gambler game and other variations of graph pursuit.
Finally, we consider a randomized version of the killer that is similar to the gambler. We use the generalization of Dijkstra's algorithm to find optimal strategies for pursuing the random killer. We prove that if $G$ is a connected graph with maximum degree $d$, then the cop can win with probability at least $\frac{\sqrt d}{1+\sqrt d}$ after learning the killer's distribution. In addition, we prove that this bound is tight only on the $\left(d+1\right)$-vertex star, where the killer takes the center with probability $\frac1{1+\sqrt d}$ and each of the other vertices with equal probabilities.

133) Ayush Agarwal (PRIMES) and Christian Gaetz (MIT), Differential posets and restriction in critical groups (arXiv.org, 23 Oct 2017), published in Algebraic Combinatorics, vol. 2:6 (2019): 1311-1327.

In recent work, Benkart, Klivans, and Reiner defined the critical group of a faithful representation of a finite group $G$, which is analogous to the critical group of a graph. In this paper we study maps between critical groups induced by injective group homomorphisms and in particular the map induced by restriction of the representation to a subgroup. We show that in the abelian group case the critical groups are isomorphic to the critical groups of a certain Cayley graph and that the restriction map corresponds to a graph covering map. We also show that when $G$ is an element in a differential tower of groups, critical groups of certain representations are closely related to words of up-down maps in the associated differential poset. We use this to generalize an explicit formula for the critical group of the permutation representation of the symmetric group given by the second author, and to enumerate the factors in such critical groups.

132) Louis Golowich (PRIMES) and Chiheon Kim (MIT), New Classes of Set-Sequential Trees (arXiv.org, 14 Oct 2017), published in Discrete Mathematics, vol. 343:3 (March 2020)

A graph is called set-sequential if its vertices can be labeled with distinct nonzero vectors in $\mathbb{F}_2^n$ such that when each edge is labeled with the sum $\pmod{2}$ of its vertices, every nonzero vector in $\mathbb{F}_2^n$ is the label for either a single vertex or a single edge. We resolve certain cases of a conjecture of Balister, Gyori, and Schelp in order to show many new classes of trees to be set-sequential. We show that all caterpillars $T$ of diameter $k$ such that $k \leq 18$ or $|V(T)| \geq 2^{k-1}$ are set-sequential, where $T$ has only odd-degree vertices and $|T| = 2^{n-1}$ for some positive integer $n$. We also present a new method of recursively constructing set-sequential trees.
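The definition of a set-sequential labeling can be checked mechanically. In the sketch below, vectors in $\mathbb{F}_2^n$ are encoded as integers $1, \dots, 2^n - 1$ (bitmasks), so the sum modulo $2$ becomes bitwise XOR; the encoding, the function name, and the example labeling of the star $K_{1,3}$ are illustrative assumptions, not constructions taken from the paper.

```python
def is_set_sequential(n, vertex_labels, edges):
    """Check a labeling against the set-sequential definition.

    vertex_labels maps each vertex to an integer bitmask for a vector in F_2^n;
    edges is a list of (u, v) pairs. Each edge gets the XOR of its endpoint labels.
    """
    labels = list(vertex_labels.values())
    labels += [vertex_labels[u] ^ vertex_labels[v] for u, v in edges]
    # Every nonzero vector must appear exactly once among vertices and edges.
    return sorted(labels) == list(range(1, 2 ** n))


if __name__ == "__main__":
    # Star K_{1,3}: 4 = 2^(3-1) vertices, all of odd degree, with n = 3.
    star_edges = [("c", "a"), ("c", "b"), ("c", "d")]
    labeling = {"c": 0b001, "a": 0b010, "b": 0b100, "d": 0b110}
    print(is_set_sequential(3, labeling, star_edges))
```

Here the edge labels are $011$, $101$, and $111$, which together with the four vertex labels cover all seven nonzero vectors of $\mathbb{F}_2^3$.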

131) Zachary Steinberg, Automated Segmentation of 3D Punctate Neural Expansion Microscopy Data (30 Sept 2017)

The comprehensive study of multiple-neuron circuits, known as connectomics, has historically been hampered by the time-consuming process of obtaining data with perfect morphological reconstructions of neurons. Existing attempts to automate the reconstruction of synaptic connections have used electron microscope data to some success, but were limited due to the black-and-white nature of such data and the computational requirements of supervised learning. Now that multicolor data is available at 20nm resolution via Expansion Microscopy (ExM), creating an automated, reliable algorithm requiring minimal training that can process the future petabytes of neural tissue data in a reasonable amount of time is an open problem. Here, we outline an automated approach to segment neurons in a 20x expanded hippocampus slice expressing Brainbow fluorescent proteins. We first use a neural network as a mask to filter data, oversegment in color space to create supervoxels, and finally merge those supervoxels together to reconstruct the 3D volume for an individual neuron. The results demonstrate this approach shows promise to harness ExM data for 3D neural imaging. Our approach offers several insights that can guide future work.

130) Andrew Gritsevskiy, Towards Generative Drug Discovery: Metric Learning using Variational Autoencoders (30 Sept 2017)

We report a method for metric learning using an extended variational autoencoder. Our architecture, based on deep learning, provides the ability to learn a transformation-invariant metric on any set of data. Our architecture consists of a pair of encoding and decoding networks. The encoder network converts the data into differentiable latent representations, while the decoder network learns to convert these representations back into data. We then apply an additional set of losses to the encoder network, forcing it to learn codings that are independent of orientation and reflect the desired metric. Then, our architecture is able to predict the real metric for a set of data points, and can generate data points that match a set of requirements. We demonstrate our network's ability to calculate the maximum overlap area of any two shapes in one shot; we also demonstrate our network's success at matching halves of geometric shapes. We then propose the applications of our network to areas of biochemistry and medicine, especially generative drug discovery.

129) Kaan Dokmeci, Theorems on Field Extensions and Radical Denesting (26 Sept 2017)

The problem of radical denesting is the problem that looks into given nested radical expressions and ways to denest them, or decrease the number of layers of radicals. This is a fairly recent problem, with applications in mathematical software that performs algebraic manipulations like denesting given radical expressions. Current algorithms are either limited or inefficient.
We tackle the problem of denesting real radical expressions without the use of Galois Theory. This uses various theorems on field extensions formed by adjoining roots of elements of the original field. These theorems are proven via the roots of unity filter and degree arguments. These theorems culminate in proving a general theorem on denesting and lead to a general algorithm that does not require roots of unity. We optimize this algorithm further. Also, special cases of radical expressions are covered, giving more efficient algorithms in these cases, spanning many examples of radicals. Additionally, a condition for a radical not to denest is given. The results of denesting radicals over $\mathbb{Q}$ are extended to real extensions of $\mathbb{Q}$ and also transcendental extensions like $\mathbb{Q}(t)$. Finally, the case of denesting sums of radicals is explored as well.

2016 Research Papers

128) Piotr Suwara (MIT) and Albert Yue (PRIMES), An Index-Type Invariant of Knot Diagrams Giving Bounds for Unknotting Framed Unknots (arXiv.org, 7 Jul 2017)

We introduce a new knot diagram invariant called the Self-Crossing Index (SCI). Using SCI, we provide bounds for unknotting two families of framed unknots. For one of these families, unknotting using framed Reidemeister moves is significantly harder than unknotting using regular Reidemeister moves.
We also investigate the relation between SCI and Arnold's curve invariant St, as well as the relation with Hass and Nowik's invariant, which generalizes cowrithe. In particular, the change of SCI under $\Omega$3 moves depends only on the forward/backward character of the move, similar to how the change of St or cowrithe depends only on the positive/negative quality of the move.

127) P.A. CrowdMath, Results on Pattern Avoidance Games (arXiv.org, 18 Apr 2017)

A zero-one matrix $A$ contains another zero-one matrix $P$ if some submatrix of $A$ can be transformed to $P$ by changing some ones to zeros. $A$ avoids $P$ if $A$ does not contain $P$. The Pattern Avoidance Game is played by two players. Starting with an all-zero matrix, two players take turns changing zeros to ones while keeping $A$ avoiding $P$. We study the strategies of this game for some patterns $P$. We also study some generalizations of this game.
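The containment relation defined here can be tested by brute force: a submatrix of $A$ can be turned into $P$ by changing ones to zeros exactly when it has a one wherever $P$ does. The Python sketch below tries every choice of rows and columns (kept in their original order); the function name is hypothetical, and this exhaustive search is only meant to illustrate the definition, not any strategy or algorithm from the paper.

```python
from itertools import combinations


def contains(a, p):
    """Return True if the 0-1 matrix `a` contains the 0-1 pattern `p`.

    Both matrices are lists of equal-length rows. A submatrix is obtained by
    selecting rows and columns of `a` while preserving their order.
    """
    rows_a, cols_a = len(a), len(a[0])
    rows_p, cols_p = len(p), len(p[0])
    for rows in combinations(range(rows_a), rows_p):
        for cols in combinations(range(cols_a), cols_p):
            # The submatrix must dominate the pattern entrywise.
            if all(a[rows[i]][cols[j]] >= p[i][j]
                   for i in range(rows_p) for j in range(cols_p)):
                return True
    return False


def avoids(a, p):
    """Return True if `a` avoids `p`, i.e. does not contain it."""
    return not contains(a, p)


if __name__ == "__main__":
    identity2 = [[1, 0], [0, 1]]
    print(contains([[1, 0, 1], [0, 1, 0], [1, 0, 1]], identity2))
    print(avoids([[0, 1], [1, 0]], identity2))
```

In the game, a move that changes a zero to a one is legal only if `avoids` still holds for the resulting matrix.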

126) P.A. CrowdMath, Algorithms for Pattern Containment in 0-1 Matrices (arXiv.org, 18 Apr 2017)

We say a zero-one matrix $A$ avoids another zero-one matrix $P$ if no submatrix of $A$ can be transformed to $P$ by changing some ones to zeros. A fundamental problem is to study the extremal function $ex(n,P)$, the maximum number of nonzero entries in an $n \times n$ zero-one matrix $A$ which avoids $P$. To calculate exact values of $ex(n,P)$ for specific values of $n$, we need containment algorithms which tell us whether a given $n \times n$ matrix $A$ contains a given pattern matrix $P$. In this paper, we present optimal algorithms to determine when an $n \times n$ matrix $A$ contains a given pattern $P$ when $P$ is a column of all ones, an identity matrix, a tuple identity matrix, an $L$-shaped pattern, or a cross pattern. These algorithms run in $\Theta(n^2)$ time, which is the lowest possible order a containment algorithm can achieve. When $P$ is a rectangular all-ones matrix, we also obtain an improved running time algorithm, albeit with a higher order.

125) Malte Möser, Kyle Soska, Ethan Heilman, Kevin Lee, Henry Heffan (PRIMES), Shashvat Srivastava (PRIMES), Kyle Hogan, Jason Hennessey, Andrew Miller, Arvind Narayanan, and Nicolas Christin, An Empirical Analysis of Traceability in the Monero Blockchain (arXiv.org, 13 Apr 2017); to appear at PETS (Privacy Enhancing Technologies Symposium) 2018; an accompanying article about this paper appeared in Wired (March 27, 2018)

Monero is a privacy-centric cryptocurrency that allows users to obscure their transactions by including chaff coins, called "mixins," along with the actual coins they spend. In this paper, we empirically evaluate two weaknesses in Monero's mixin sampling strategy. First, about 62% of transaction inputs with one or more mixins are vulnerable to "chain-reaction" analysis -- that is, the real input can be deduced by elimination. Second, Monero mixins are sampled in such a way that they can be easily distinguished from the real coins by their age distribution; in short, the real input is usually the "newest" input. We estimate that this heuristic can be used to guess the real input with 80% accuracy over all transactions with 1 or more mixins. Next, we turn to the Monero ecosystem and study the importance of mining pools and the former anonymous marketplace AlphaBay on the transaction volume. We find that after removing mining pool activity, there remains a large amount of potentially privacy-sensitive transactions that are affected by these weaknesses. We propose and evaluate two countermeasures that can improve the privacy of future transactions.

124) Alec Leng, Independence of the Miller-Rabin and Lucas Probable Prime Tests (30 Mar 2017)

In the modern age, public-key cryptography has become a vital component for secure online communication. To implement these cryptosystems, rapid primality testing is necessary in order to generate keys. In particular, probabilistic tests are used for their speed, despite the potential for pseudoprimes. So, we examine the commonly used Miller-Rabin and Lucas tests, showing that numbers with many nonwitnesses are usually Carmichael or Lucas-Carmichael numbers in a specific form. We then use these categorizations, through a generalization of Korselt’s criterion, to prove that there are no numbers with many nonwitnesses for both tests, affirming the two tests’ relative independence. As Carmichael and Lucas-Carmichael numbers are in general more difficult for the two tests to deal with, we next search for numbers which are both Carmichael and Lucas-Carmichael numbers, experimentally finding none less than $10^{16}$. We thus conjecture that there are no such composites and, using multivariate calculus with symmetric polynomials, begin developing techniques to prove this.
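For readers who want to experiment with nonwitnesses, the sketch below counts the bases $a$ with $1 \le a \le n-1$ for which an odd number $n$ passes the standard Miller-Rabin strong probable prime test. Only the Miller-Rabin side is shown; the function names are hypothetical, and the paper's precise definitions of "nonwitness" and "many" may differ from this exhaustive count.

```python
def passes_miller_rabin(a, n):
    """True if odd n > 2 passes the Miller-Rabin test to base a."""
    d, s = n - 1, 0
    while d % 2 == 0:       # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


def count_nonwitnesses(n):
    """Number of bases 1 <= a <= n - 1 that fail to expose odd n as composite."""
    return sum(passes_miller_rabin(a, n) for a in range(1, n))


if __name__ == "__main__":
    for n in (9, 13, 561):
        print(n, count_nonwitnesses(n))
```

For an odd composite $n$, Rabin's theorem bounds this count by $\varphi(n)/4$, which is attained for $n = 9$ (the nonwitnesses are $1$ and $8$); for a prime such as $13$ every base passes.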

123) Ria Das, Exploring the Ant Mill: Numerical and Analytical Investigations of Mixed Memory-Reinforcement Systems (arXiv.org, 20 Mar 2017)

Under certain circumstances, a swarm of a species of trail-laying ants known as army ants can become caught in a doomed revolving motion known as the death spiral, in which each ant follows the one in front of it in a never-ending loop until they all drop dead from exhaustion. This phenomenon, as well as the ordinary motions of many ant species and certain slime molds, can be modeled using reinforced random walks and random walks with memory. In a reinforced random walk, the path taken by a moving particle is influenced by the previous paths taken by other particles. In a random walk with memory, a particle is more likely to continue along its line of motion than change its direction. Both memory and reinforcement have been studied independently in random walks with interesting results. However, real biological motion is a result of a combination of both memory and reinforcement. In this paper, we construct a continuous random walk model based on diffusion-advection partial differential equations that combine memory and reinforcement. We find an axi-symmetric, time-independent solution to the equations that resembles the death spiral. Finally, we prove numerically that the obtained steady-state solution is stable.

122) Andrew Gritsevskiy and Adithya Vellal, Development and Biological Analysis of a Neural Network Based Genomic Compression System (3 Mar 2017)

The advent of Next Generation Sequencing (NGS) technologies has resulted in a barrage of genomic data that is now available to the scientific community. This data contains information that is driving fields such as precision medicine and pharmacogenomics, where clinicians use a patient’s genetics in order to develop custom treatments. However, genomic data is immense in size, which makes it extremely costly to store, transport and process. A genomic compression system which takes advantage of intrinsic biological patterns can help reduce the costs associated with this data while also identifying important biological patterns. In this project, we aim to create a compression system which uses unsupervised neural networks to compress genomic data. The complete compression suite, GenComp, is compared to existing genomic data compression methods. The results are then analyzed to discover new biological features of genomic data. Testing showed that GenComp achieves at least 40 times more compression than existing variant compression solutions, while providing comparable decoding times in most applications. GenComp also provides some insight into genetic patterns, which has significant potential to aid in the fields of pharmacogenomics and precision medicine. Our results demonstrate that neural networks can be used to significantly compress genomic data while also assisting in better understanding genetic biology.

121) Vivek Bhupatiraju, John Kuszmaul, and Vinjai Vale, On the Viability of Distributed Consensus by Proof of Space (3 Mar 2017)

In this paper, we present our implementation of Proof of Space (PoS) and our study of its viability in distributed consensus. PoS is a new alternative to the commonly used Proof of Work, which is a protocol at the heart of distributed consensus systems such as Bitcoin. PoS resolves the two major drawbacks of Proof of Work: high energy cost and bias towards individuals with specialized hardware. In PoS, users must store large “hard-to-pebble” PTC graphs, which are recursively generated using subgraphs called superconcentrators. We implemented two types of superconcentrators to examine their differences in performance. Linear superconcentrators are about 1.8 times slower than butterfly superconcentrators, but provide a better lower bound on space consumption. Finally, we discuss our simulation of using PoS to reach consensus in a peer-to-peer network. We conclude that Proof of Space is indeed viable for distributed consensus. To the best of our knowledge, we are the first to implement linear superconcentrators and to simulate the use of PoS to reach consensus on a decentralized network.

120) Albert Yue, An Index-Type Invariant of Knot Diagrams and Bounds for Unknotting Framed Knots (3 Mar 2017)

We introduce a new knot diagram invariant called self-crossing index, or $\mathrm{SCI}$. We found that $\mathrm{SCI}$ changes by at most $\pm 1$ under framed Reidemeister moves, and specifically provides a lower bound for the number of $\Omega$3 moves. We also found that $\mathrm{SCI}$ is additive under connected sums, and is a Vassiliev invariant of order 1. We also conduct similar calculations with Hass and Nowik's diagram invariant and cowrithe, and present a relationship between forward/backward, ascending/descending, and positive/negative $\Omega$3 moves.

119) Valerie Zhang, Computer-Based Visualizations and Manipulations of Matching Paths (2 Mar 2017)

Given n points in the 2-D plane, a matching path is a path that starts at one of these n points and ends at a different one without going through any of the other n - 2 points. Matching paths, as well as an important operation called the Hurwitz move, come up naturally in the study of complex algebraic varieties. At the heart of the Hurwitz move is the twist operation, which “twists” one matching path along another to produce a new (third) matching path. Performing the twist operation by hand, however, is not only tedious but also prone to errors and unnecessary complications. Therefore, using computer-based methods to represent matching paths and perform the twist operation makes sense. In this project, which was coded in Java, computer-based methods are developed to perform the twist operation efficiently and accurately, providing a framework for visualizing and manipulating matching paths with computers. The computer program performs fast computations and represents matching paths as simply as possible in a simple visual interface. This program could be utilized when solving open problems in symplectic geometry: potential applications include characterizing the overtwistedness of contact manifolds, as well as better understanding braid group actions.

118) Harshal Sheth, Nihar Sheth, and Aashish Welling, Read-Copy Update in a Garbage Collected Environment (1 Mar 2017)

Read-copy update (RCU) is a synchronization mechanism that allows efficient parallelism when there are a high number of readers compared to writers. The primary use of RCU is in Linux, a highly popular operating system kernel. The Linux kernel is written in C, a language that is not garbage collected, and yet the functionality that RCU provides is effectively that of a “poor man’s garbage collector” (P. E. McKenney). RCU in C is also complicated to use, and this can lead to bugs. The purpose of this paper is to investigate whether RCU implemented in a garbage collected language (Go) is easier to use while delivering comparable performance to RCU in C. This is tested through the implementation and benchmarking of 4 linked lists, 2 using RCU and 2 using mutexes. One RCU linked list and one mutex linked list are implemented in each language. This paper finds that RCU in a garbage collected language is indeed significantly easier to use, has similar overall performance to, and on very high read loads, outperforms, RCU in C.

117) Xiangyao Yu (MIT), Siye Zhu (PRIMES), Justin Kaashoek (PRIMES), Andrew Pavlo (Carnegie Mellon University), and Srinivas Devadas (MIT), Taurus: A Parallel Transaction Recovery Method Based on Fine-Granularity Dependency Tracking (28 Feb 2017)

Logging is crucial to performance in modern multicore main-memory database management systems (DBMSs). Traditional data logging (ARIES) and command logging algorithms enforce a sequential order among log records using a global log sequence number (LSN). Log flushing and recovery after a crash are both performed in the LSN order. This serialization of transaction logging and recovery can limit the system performance at high core count. In this paper, we propose Taurus to break the LSN abstraction and enable parallel logging and recovery by tracking fine-grained dependencies among transactions. The dependency tracking lends Taurus three salient features. (1) Taurus decouples the transaction logging order from the commit order and allows transactions to be flushed to persistent storage in parallel independently. Transactions that are persistent before commit can be discovered and ignored by the recovery algorithm using the logged dependency information. (2) Taurus can leverage multiple persistent devices for logging. (3) Taurus can leverage multiple devices and multiple worker threads for parallel recovery. Taurus improves logging and recovery parallelism for both data and command logging.

116) Louis Golowich (PRIMES), Chiheon Kim (MIT), and Richard Zhou (PRIMES), Maximum Size of a Family of Pairwise Graph-Different Permutations (arXiv.org, 27 Feb 2017), published in The Electronic Journal of Combinatorics 24:4 (2017)

Two permutations of the vertices of a graph $G$ are called $G$-different if there exists an index $i$ such that the $i$-th entries of the two permutations form an edge in $G$. We bound or determine the maximum size of a family of pairwise $G$-different permutations for various graphs $G$. We show that for all balanced bipartite graphs $G$ of order $n$ with minimum degree $n/2 - o(n)$, the maximum number of pairwise $G$-different permutations of the vertices of $G$ is $2^{(1-o(1))n}$. We also present examples of bipartite graphs $G$ with maximum degree $O(\log n)$ that have this property. We explore the problem of bounding the maximum size of a family of pairwise graph-different permutations when an unlimited number of disjoint vertices is added to a given graph. We determine this exact value for the graph of 2 disjoint edges, and present some asymptotic bounds relating to this value for graphs consisting of the union of $n/2$ disjoint edges.
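The $G$-different relation is easy to state in code. The following Python sketch, with hypothetical function names, checks whether two permutations are $G$-different and whether a whole family is pairwise $G$-different; it only illustrates the definition and says nothing about the extremal bounds proved in the paper.

```python
from itertools import combinations


def are_g_different(perm1, perm2, edges):
    """True if some position holds two vertices that form an edge of G."""
    edge_set = {frozenset(e) for e in edges}
    return any(frozenset((x, y)) in edge_set for x, y in zip(perm1, perm2))


def is_pairwise_g_different(family, edges):
    """True if every two distinct permutations in the family are G-different."""
    return all(are_g_different(p, q, edges) for p, q in combinations(family, 2))


if __name__ == "__main__":
    # G is the graph of 2 disjoint edges on the vertices 1, 2, 3, 4.
    two_edges = [(1, 2), (3, 4)]
    print(are_g_different((1, 2, 3, 4), (2, 1, 4, 3), two_edges))
    print(are_g_different((1, 2, 3, 4), (3, 4, 1, 2), two_edges))
```

The first pair is $G$-different because position one holds the edge $\{1,2\}$; the second pair is not, since every position pairs a vertex of one edge with a vertex of the other.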

115) Sathwik Karnik, On the Classification and Algorithmic Analysis of Carmichael Numbers (arXiv.org, 26 Feb 2017)

In this paper, we study the properties of Carmichael numbers, false positives to several primality tests. We provide a classification for Carmichael numbers with a proportion of Fermat witnesses of less than 50%, based on if the smallest prime factor is greater than a determined lower bound. In addition, we conduct a Monte Carlo simulation as part of a probabilistic algorithm to detect if a given composite number is Carmichael. We modify this highly accurate algorithm with a deterministic primality test to create a novel, more efficient algorithm that differentiates between Carmichael numbers and prime numbers.
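As a point of comparison for the probabilistic detection described above, Carmichael numbers can also be recognized deterministically through Korselt's criterion: a composite $n$ is Carmichael exactly when it is squarefree and $p - 1$ divides $n - 1$ for every prime $p$ dividing $n$. The sketch below uses trial division, so it is only practical for small $n$; it is not the Monte Carlo algorithm of the paper, and the function names are hypothetical.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 2 using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael(n):
    """Korselt's criterion: composite, squarefree, and (p - 1) | (n - 1) for all p | n."""
    if n < 2:
        return False
    factors = prime_factorization(n)
    if len(factors) < 2:
        # Primes and prime powers are never Carmichael numbers.
        return False
    return all(exp == 1 and (n - 1) % (p - 1) == 0 for p, exp in factors.items())


if __name__ == "__main__":
    print([n for n in range(2, 3000) if is_carmichael(n)])
```

For example, $561 = 3 \cdot 11 \cdot 17$ is squarefree and $560$ is divisible by $2$, $10$, and $16$, so it satisfies the criterion.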

114) Felix Wang, Functional equations in Complex Analysis and Number Theory (26 Feb 2017)

We study the following questions:
(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ with $f,g,\hat{f},\hat{g}\in\mathbb{C}(X)$ being complex rational functions?
(2) For which rational functions $f(X)$ and $g(X)$ with rational coefficients does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in\mathbb{Q}$?
We utilize various algebraic, geometric and analytic results in order to resolve both (1) and a variant of (2) in case the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$. Our results have applications in various mathematical fields, such as complex analysis, number theory, and dynamical systems. Our work resolves a 1973 question of Fried, and makes significant progress on a 1924 question of Ritt and a 1997 question of Lyubich and Minsky. In addition, we prove a quantitative refinement of a 2015 conjecture of Cahn, Jones and Spear.

113) Laura Pierson, Signatures of Stable Multiplicity Spaces in Restrictions of Representations of Symmetric Groups (25 Feb 2017)

Representation theory is a way of studying complex mathematical structures such as groups and algebras by mapping them to linear actions on vector spaces. Recently, Deligne proposed a new way to study the representation theory of finite groups by generalizing the collection of representations of a sequence of groups indexed by positive integer rank to an arbitrary complex rank, creating an abelian tensor category. In this project, we focused on the case of the symmetric groups $S_n,$ the groups of permutations of $n$ objects. Elements of the Deligne category Rep $S_t$ can be constructed by taking a stable sequence of $S_n$ representations for increasing $n$ and interpolating the associated formulas to an arbitrary complex number $t.$ In this project, we studied the case of restriction multiplicity spaces $V_{\lambda,\rho}$, counting the number of copies of an irreducible representation $V_{\rho}$ of $S_{n-k}$ in the restriction $\text{Res}_{S_{n-k}}^{S_n} V_{\lambda}$ of an irreducible representation of $S_n.$ We found formulas for norms of orthogonal basis vectors in these spaces, and ultimately for signatures (the number of basis vectors with positive norm minus the number with negative norm), an invariant that multiplies over tensor products and has important combinatorial connections.

112) Albert Gerovitch, Automatically Improving 3D Neuron Segmentations for Expansion Microscopy Connectomics (25 Feb 2017)

Understanding the geometry of neurons and their connections is key to comprehending brain function. This is the goal of a new optical approach to brain mapping using expansion microscopy (ExM), developed in the Boyden Lab at MIT to replace the traditional approach of electron microscopy. A challenge here is to perform image segmentation to delineate the boundaries of individual neurons. Currently, however, there is no method implemented for assessing a segmentation algorithm’s accuracy in ExM. The aim of this project is to create automated assessment of neuronal segmentation algorithms, enabling their iterative improvement. By automating the process, I aim to devise powerful segmentation algorithms that reveal the “connectome” of a neural circuit. I created software, called SEV-3D, which uses the pixel error and warping error metrics to assess 3D segmentations of single neurons. To allow better assessment beyond a simple numerical score, I visualized the results as a multilayered image. My program runs in a closed loop with a segmentation algorithm, modifying its parameters until the algorithm yields an optimal segmentation. I am further developing my application to enable evaluation of multi-cell segmentations. In the future, I aim to further implement the principles of machine learning to automatically improve the algorithms, yielding even better accuracy.

111) Kevin Chang, Upper Bounds for Ordered Ramsey Numbers of Small 1-Orderings (arXiv.org, 7 Feb 2017)

A $k$-ordering of a graph $G$ assigns distinct order-labels from the set $\{1,\ldots,|G|\}$ to $k$ vertices in $G$. Given a $k$-ordering $H$, the ordered Ramsey number $R_{<} (H)$ is the minimum $n$ such that every edge-2-coloring of the complete graph on the vertex set $\{1, \ldots, n\}$ contains a copy of $H$, the $i$th smallest vertex of which either has order-label $i$ in $H$ or no order-label in $H$.
This paper conducts the first systematic study of ordered Ramsey numbers for $1$-orderings of small graphs. We provide upper bounds for $R_{<} (H)$ for each connected $1$-ordering $H$ on $4$ vertices. Additionally, for every $1$-ordering $H$ of the $n$-vertex path $P_n$, we prove that $R_{<} (H) \in O(n)$. Finally, we provide an upper bound for the generalized ordered Ramsey number $R_{<} (K_n, H)$ which can be applied to any $k$-ordering $H$ containing some vertex with order-label $1$.

110) Nikhil Marda, On Equal Point Separation by Planar Cell Decompositions (arXiv.org, 17 Jan 2017)

In this paper, we investigate the problem of separating a set $X$ of points in $\mathbb{R}^{2}$ with an arrangement of $K$ lines such that each cell contains an asymptotically equal number of points (up to a constant ratio). We consider a property of curves called the stabbing number, defined to be the maximum countable number of intersections possible between the curve and a line in the plane. We show that large subsets of $X$ lying on Jordan curves of low stabbing number are an obstacle to equal separation. We further discuss Jordan curves of minimal stabbing number containing $X$. Our results generalize recent bounds on the Erdős-Szekeres Conjecture, showing that for fixed $d$ and sufficiently large $n$, if $|X| \ge 2^{c_dn/d + o(n)}$ with $c_d = 1 + O(\frac{1}{\sqrt{d}})$, then there exists a subset of $n$ points lying on a Jordan curve with stabbing number at most $d$.

109) Samuel Cohen and Peter Rowley, Results of Triangles Under Discrete Curve Shortening Flow (7 Jan 2017)

In this paper, we analyze the results of triangles under discrete curve shortening flow, specifically isosceles triangles with top angles greater than $\frac{\pi}{3}$, and scalene triangles. By considering the location of the three vertices of the triangle after some small time $\epsilon$, we use the definition of the derivative to calculate a system of differential equations involving parameters that can describe the triangle. Constructing phase plane diagrams and then analyzing them, we find that the singular behavior of discrete curve shortening flow on isosceles triangles with top angles greater than $\frac{\pi}{3}$ is a point, and for scalene triangles is a line segment.

108) Matthew Hase-Liu (PRIMES) and Nicholas Triantafillou (MIT), Efficient Point-Counting Algorithms for Superelliptic Curves (7 Jan 2017; arXiv.org, 7 Sep 2017)

In this paper, we present efficient algorithms for computing the number of points and the order of the Jacobian group of a superelliptic curve over finite fields of prime order p. Our method employs the Hasse-Weil bounds in conjunction with the Hasse-Witt matrix for superelliptic curves, whose entries we express in terms of multinomial coefficients. We present a fast algorithm for counting points on specific trinomial superelliptic curves and a slower, more general method for all superelliptic curves. For the first case, we reduce the problem of simplifying the entries of the Hasse-Witt matrix modulo p to a problem of solving quadratic Diophantine equations. For the second case, we extend Bostan et al.'s method for hyperelliptic curves to general superelliptic curves. We believe the methods we describe are asymptotically the most efficient known point-counting algorithms for certain families of trinomial superelliptic curves.
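For orientation, the quantity being computed can be checked on small primes with a naive count. The sketch below counts affine points on $y^m = f(x)$ over $\mathbb{F}_p$ by tabulating $m$-th powers; it is a brute-force baseline running in time linear in $p$, not the Hasse-Witt-based method of the paper, and the function name and coefficient convention are assumptions made for illustration.

```python
from collections import Counter

def count_affine_points(coeffs, m, p):
    """Count pairs (x, y) in F_p x F_p with y^m = f(x).

    coeffs lists the coefficients of f from the constant term upward.
    The point(s) at infinity are not included.
    """
    # power_counts[v] = number of y in F_p with y^m = v
    power_counts = Counter(pow(y, m, p) for y in range(p))
    total = 0
    for x in range(p):
        fx = 0
        for c in reversed(coeffs):   # Horner evaluation of f(x) mod p
            fx = (fx * x + c) % p
        total += power_counts[fx]
    return total
```

For instance, `count_affine_points([0, 1, 0, 1], 2, 5)` counts the affine points of $y^2 = x^3 + x$ over $\mathbb{F}_5$, and a result like this can serve as a sanity check for faster algorithms.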

107) P.A. CrowdMath, Bounds on parameters of minimally non-linear patterns (arXiv.org, 31 Dec 2016), published in the Electronic Journal of Combinatorics 25:1 (2018)

Let $ex(n, P)$ be the maximum possible number of ones in any 0-1 matrix of dimensions $n \times n$ that avoids $P$. Matrix $P$ is called minimally non-linear if $ex(n, P) = \omega(n)$ but $ex(n, P') = O(n)$ for every strict subpattern $P'$ of $P$. We prove that the ratio between the length and width of any minimally non-linear 0-1 matrix is at most $4$, and that a minimally non-linear 0-1 matrix with $k$ rows has at most $5k-3$ ones. We also obtain an upper bound on the number of minimally non-linear 0-1 matrices with $k$ rows.
In addition, we prove corresponding bounds for minimally non-linear ordered graphs. The minimal non-linearity that we investigate for ordered graphs is for the extremal function $ex_{<}(n, G)$, which is the maximum possible number of edges in any ordered graph on $n$ vertices with no ordered subgraph isomorphic to $G$.

106) Seth Shelley-Abrahamson (MIT) and Alec Sun (PRIMES), Towards a Classification of Finite-Dimensional Representations of Rational Cherednik Algebras of Type D (arXiv.org, 15 Dec 2016)

Using a combinatorial description due to Jacon and Lecouvey of the wall crossing bijections for cyclotomic rational Cherednik algebras, we show that the irreducible representations $L_c(\lambda^\pm)$ of the rational Cherednik algebra $H_c(D_n, \mathbb{C}^n)$ of type $D$ for symmetric bipartitions $\lambda$ are infinite dimensional for all parameters $c$. In particular, all finite-dimensional irreducible representations of rational Cherednik algebras of type $D$ arise as restrictions of finite-dimensional irreducible representations of rational Cherednik algebras of type $B$.

105) Nicholas Guo (PRIMES) and Guangyi Yue (MIT), Counting Independent Sets in Graphs of Hyperplane Arrangements (arXiv.org, 13 Dec 2016), published in Discrete Mathematics, vol. 343:3 (March 2020)

In this paper, we count the number of independent sets of a type of graph $G(\mathcal{A},q)$ associated to some hyperplane arrangement $\mathcal{A}$, which is a generalization of the construction of graphical arrangements. We show that when the parameters of $\mathcal{A}$ satisfy certain conditions, the number of independent sets of the disjoint union $G(\mathcal{A},q_1)\cup\cdots\cup G(\mathcal{A},q_s)$ depends only on the coefficients of $\mathcal{A}$ and the total number of vertices $\sum_i q_i$ when $q_i$'s are powers of large enough prime numbers. In addition it is independent of the coefficients as long as $\mathcal{A}$ is central and the coefficients are multiplicatively independent.

104) Yatharth Agarwal (PRIMES), Vishnu Murale (PRIMES), Jason Hennessey (Boston University), Kyle Hogan (Boston University), and Mayank Varia (Boston University), Moving in Next Door: Network Flooding as a Side Channel in Cloud Environments (14-16 Nov 2016), published in Sara Foresti and Giuseppe Persiano, eds., Cryptology and Network Security: 15th International Conference Proceedings, CANS 2016, Milan, Italy, November 14–16, 2016, pp. 755-760.

Co-locating multiple tenants’ virtual machines (VMs) on the same host underpins public clouds’ affordability, but sharing physical hardware also exposes consumer VMs to side channel attacks from adversarial co-residents. We demonstrate passive bandwidth measurement to perform traffic analysis attacks on co-located VMs. Our attacks do not assume a privileged position in the network or require any communication between adversarial and victim VMs. Using a single feature in the observed bandwidth data, our algorithm can identify which of 3 potential YouTube videos a co-resident VM streamed with 66% accuracy. We discuss defense from both a cloud provider’s and a consumer’s perspective, showing that effective defense is difficult to achieve without costly under-utilization on the part of the cloud provider or over-utilization on the part of the consumer.

103) Dhruv Rohatgi, A Connection Between Vector Bundles over Smooth Projective Curves and Representations of Quivers (31 Oct 2016)

We create a partition bijection that yields a partial result on a recent conjecture by Schiffmann relating the problems of counting over a finite field (1) vector bundles over smooth projective curves, and (2) representations of quivers.

102) Aaron Yeiser (PRIMES) and Alex Townsend (Cornell University), A spectral element method for meshes with skinny elements (30 Oct 2016; arXiv.org, 27 Mar 2018)

When numerically solving partial differential equations (PDEs), the first step is often to discretize the geometry using a mesh and to solve a corresponding discretization of the PDE. Standard finite and spectral element methods require that the underlying mesh has no skinny elements for numerical stability. Here, we develop a novel spectral element method that is numerically stable on meshes that contain skinny elements, while also allowing for high degree polynomials on each element. Our method is particularly useful for PDEs for which anisotropic mesh elements are beneficial and we demonstrate it with a Navier–Stokes simulation. Code for our method can be found at this URL.

101) Tanya Khovanova (MIT) and Rafael Saavedra (PRIMES), Discreet Coin Weighings and the Sorting Strategy (arXiv.org, 23 Sep 2016)

In 2007, Alexander Shapovalov posed an old twist on the classical coin weighing problem by asking for strategies that manage to conceal the identities of specific coins while providing general information on the number of fake coins. In 2015, Diaco and Khovanova studied various cases of these "discreet strategies" and introduced the revealing factor, a measure of the information that is revealed.
In this paper we discuss a natural coin weighing strategy which we call the sorting strategy: divide the coins into equal piles and sort them by weight. We study the instances when the strategy is discreet, and given an outcome of the sorting strategy, the possible number of fake coins. We prove that in many cases, the number of fake coins can be any value in an arithmetic progression whose length depends linearly on the number of coins in each pile. We also show the strategy can be discreet when the number of fake coins is any value within an arithmetic subsequence whose length also depends linearly on the number of coins in each pile. We arrive at these results by connecting our work to the classic Frobenius coin problem. In addition, we calculate the revealing factor for the sorting strategy.

100) Kai-Siang Ang (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), On the geometry of regular icosahedral capsids containing disymmetrons (arXiv.org, 29 Aug 2016), published in Journal of Structural Biology (19 Jan 2017)

Icosahedral virus capsids are composed of symmetrons, organized arrangements of capsomers. There are three types of symmetrons: disymmetrons, trisymmetrons, and pentasymmetrons, which have different shapes and are centered on the icosahedral 2-fold, 3-fold and 5-fold axes of symmetry, respectively. In 2010 [Sinkovits & Baker] gave a classification of all possible ways of building an icosahedral structure solely from trisymmetrons and pentasymmetrons, which requires the triangulation number T to be odd. In the present paper we incorporate disymmetrons to obtain a geometric classification of icosahedral viruses formed by regular penta-, tri-, and disymmetrons. For every class of solutions, we further provide formulas for symmetron sizes and parity restrictions on h, k, and T numbers. We also present several methods in which invariants may be used to classify a given configuration.

99) Tanya Khovanova (MIT) and Shuheng (Nelson) Niu (PRIMES), m-Modular Wythoff (arXiv.org, 2 Aug 2016), published in Games of No Chance, vol. 6 (Cambridge University Press, 2025), pp. 491-506

We introduce a variant of Wythoff's game that we call $m$-modular Wythoff's game. In the original Wythoff's game, players can take a positive number of tokens from one pile, or they can take a positive number of tokens from both piles if the number of tokens they take from the first pile is equal to the number of tokens they take from the second. In our variant, we weaken this equality condition to one of equivalence modulo $m$. We characterize the $P$-positions of our $m$-modular variant as a finite subset of the known $P$-positions of the original Wythoff's game.
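The rules described above can be explored on small boards by backward induction: a position is a $P$-position exactly when no legal move leads to another $P$-position. The following sketch assumes the move set as stated in the abstract (removing $i \geq 1$ and $j \geq 1$ tokens from the two piles is allowed when $i \equiv j \pmod m$); the function name `p_positions` and the `limit` parameter are illustrative, and the search is a quartic-time brute force rather than the paper's characterization.

```python
def p_positions(limit, m=None):
    """List P-positions (a, b) with a <= b <= limit.

    m=None plays the original Wythoff's game (equal removals from both
    piles); an integer m allows removals i, j with i congruent to j mod m.
    """
    is_p = [[False] * (limit + 1) for _ in range(limit + 1)]
    for a in range(limit + 1):
        for b in range(limit + 1):
            # Single-pile moves.
            moves = [(a - i, b) for i in range(1, a + 1)]
            moves += [(a, b - j) for j in range(1, b + 1)]
            # Two-pile moves, restricted by equality or congruence mod m.
            moves += [(a - i, b - j)
                      for i in range(1, a + 1)
                      for j in range(1, b + 1)
                      if (i == j if m is None else (i - j) % m == 0)]
            # Every move strictly decreases (a, b) in iteration order,
            # so all targets have already been classified.
            is_p[a][b] = not any(is_p[x][y] for x, y in moves)
    return [(a, b) for a in range(limit + 1)
            for b in range(a, limit + 1) if is_p[a][b]]
```

Calling `p_positions(12)` recovers the familiar Wythoff pairs up to 12, and comparing it with `p_positions(12, m=2)` or other small moduli gives a quick empirical view of how the modular rule changes the set of $P$-positions.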

2015 Research Papers

98) Caleb Ji, Robin Park, and Angela Song, Combinatorial Games of No Strategy (20 Aug 2016)

In this paper, we study a particular class of combinatorial game motivated by previous research conducted by Professor James Propp, called Games of No Strategy, or games whose winners are predetermined. Finding the number of ways to play such games often leads to new combinatorial sequences and involves methods from analysis, number theory, and other fields. For the game Planted Brussel Sprouts, a variation on the well-known game Sprouts, we find a new proof that the number of ways to play is equal to the number of spanning trees on n vertices, and for Mozes’ Game of Numbers, a game studied for its interesting connections with other fields, we use prior work by Alon to calculate the number of ways to play the game for a certain case. Finally, in the game Binary Fusion, we show through both algebraic and combinatorial proofs that the number of ways to play generates Catalan’s triangle.

97) Meena Jagadeesan, The Exchange Graphs of Weakly Separated Collections (arXiv.org, 19 Aug 2016)

Weakly separated collections arise in the cluster algebra derived from the Plücker coordinates on the nonnegative Grassmannian. Oh, Postnikov, and Speyer studied weakly separated collections over a general Grassmann necklace $\mathcal{I}$ and proved the connectivity of every exchange graph. Oh and Speyer later introduced a generalization of exchange graphs that we call $\mathcal{C}$-constant graphs. They characterized these graphs in the smallest two cases. We prove an isomorphism between exchange graphs and a certain class of $\mathcal{C}$-constant graphs. We use this to extend Oh and Speyer's characterization of these graphs to the smallest four cases, and we present a conjecture on a bound on the maximal order of these graphs. In addition, we fully characterize certain classes of these graphs in the special cases of cycles and trees.

96) Nicholas Diaco, Counting Counterfeit Coins: A New Coin Weighing Problem (arXiv.org, 13 Jun 2016)

In 2007, a new variety of the well-known problem of identifying a counterfeit coin using a balance scale was introduced in the sixth International Kolmogorov Math Tournament. This paper offers a comprehensive overview of this new problem by presenting it in the context of the traditional coin weighing puzzle and then explaining what makes the new problem mathematically unique. Two weighing strategies described previously are used to derive lower bounds for the optimal number of admissible situations for given parameters. Additionally, a new weighing procedure is described that can be adapted to provide a solution for a broad spectrum of initial parameters by representing the number of counterfeit coins as a linear combination of positive integers. In closing, we offer a new form of the traditional counterfeit coin problem and provide a lower bound for the number of weighings necessary to solve it.

95) Jesse Geneson (MIT) and Meghal Gupta (PRIMES), Bounding extremal functions of forbidden 0-1 matrices using (r,s)-formations (19 Mar 2016)

First, we prove tight bounds of $n 2^{\frac{1}{(t-2)!}\alpha(n)^{t-2} \pm O(\alpha(n)^{t-3})}$ on the extremal function of the forbidden pair of ordered sequences $(1 2 3 \ldots k)^t$ and $(k \ldots 3 2 1)^t$ using bounds on a class of sequences called $(r,s)$-formations. Then, we show how an analogous method can be used to derive similar bounds on the extremal functions of forbidden pairs of $0-1$ matrices consisting of horizontal concatenations of identical identity matrices and their horizontal reflections.

94) Varun Jain, Novel Relationships Between Circular Planar Graphs and Electrical Networks (20 Feb 2016)

Circular planar graphs are used to model electrical networks, which arise in classical physics. Associated with such a network is a network response matrix, which carries information about how the network behaves in response to certain potential differences. Circular planar graphs can be organized into equivalence classes based upon these response matrices. In each equivalence class, certain fundamental elements are called critical. Additionally, it is known that equivalent graphs are related by certain local transformations. Using wiring diagrams, we first investigate the number of Y-∆ transformations required to transform one critical graph in an equivalence class into another, proving a quartic bound in the order of the graph. Next, we consider positivity phenomena, studying how testing the signs of certain circular minors can be used to determine if a given network response matrix is associated with a particular equivalence class. In particular, we prove a conjecture by Kenyon and Wilson for some cases.

93) Arthur Azvolinsky, Explicit Computations of the Frozen Boundaries of Rhombus Tilings of Polygonal Domains (12 Feb 2016)

Consider a polygonal domain $\Omega$ drawn on a regular triangular lattice. A rhombus tiling of $\Omega$ is defined as a complete covering of the domain with $60^{\circ}$-rhombi, where each one is obtained by gluing two neighboring triangles together. We consider a uniform measure on the set of all tilings of $\Omega$. As the mesh size of the lattice approaches zero while the polygon remains fixed, a random tiling approaches a deterministic limit shape. An important phenomenon that occurs with the convergence towards a limit shape is the formation of frozen facets; that is, areas where there are asymptotically tiles of only one particular type. The sharp boundary between these ordered facet formations and the disordered region is a curve inscribed in $\Omega$. This inscribed curve is defined as the frozen boundary. The goal of this project was to understand the purely algebraic approach, elaborated on in a paper by Kenyon and Okounkov, to the problem of explicitly computing the frozen boundary. We will present our results for a number of special cases we considered.

92) David Amirault, Better Bounds on the Rate of Non-Witnesses of Lucas Pseudoprimes (3 Feb 2016)

Efficient primality testing is fundamental to modern cryptography for the purpose of key generation. Different primality tests may be compared using their runtimes and rates of non-witnesses. With the Lucas primality test, we analyze the frequency of Lucas pseudoprimes using MATLAB. We prove that a composite integer n can be a strong Lucas pseudoprime to at most 1/6 of parameters P, Q unless n belongs to a short list of exception cases, thus improving the bound from the previous result of 4/15. We also explore the properties obeyed by such exceptions and how these cases may be handled by an extended version of the Lucas primality test.

91) Daniel Guo, An Infection Spreading Model on Binary Trees (26 Jan 2016)

An important and ongoing topic of research is the study of infectious diseases and the speed at which these diseases spread. Modeling the spread and growth of such diseases leads to a more precise understanding of the phenomenon and accurate predictions of spread in real life. We consider a long-range infection model on an infinite regular binary tree. Given a spreading coefficient $\alpha>1$, the time it takes for the infection to travel from one node to another node below it is exponentially distributed with specific rate functions such as $2^{-k}k^{-\alpha}$ or $\frac{1}{\alpha^k}$, where $k$ is the difference in layer number between the two nodes. We simulate and analyze the time needed for the infection to reach layer $m$ or below starting from the root node. The resulting time is recorded and graphed for different values of $\alpha$ and $m$. Finally, we prove rigorous lower and upper bounds for the infection time, both of which are approximately logarithmic with respect to $m$. The same techniques and results are valid for other regular $d$-ary trees, in which each node has exactly $d$ children where $d>2$.

90) Jacob Klegar, Bounded Tiling-Harmonic Functions on the Integer Lattice (25 Jan 2016)

Tiling-harmonic functions are a class of functions on square tilings that minimize a specific energy. These functions may provide a useful tool in studying square Sierpinski carpets. In this paper we show two new Maximum Modulus Principles for these functions, prove Harnack's Inequality, and give a proof that the set of tiling-harmonic functions is closed. One of these Maximum Modulus Principles is used to show that bounded infinite tiling-harmonic functions must have arbitrarily long constant lines. Additionally, we give three sufficient conditions for tiling-harmonic functions to be constant. Finally, we explore comparisons between tiling and graph-harmonic functions, especially with regard to oscillating boundary values.

89) Richard Yi, A Probability-Based Model of Traffic Flow (22 Jan 2016)

Describing the behavior of traffic via mathematical modeling and computer simulation has been a challenge confronted by mathematicians in various ways throughout the last century. In this project, we introduce various existing traffic flow models and present a new, probability-based model that is a hybrid of the microscopic and macroscopic views, drawing upon current ideas in traffic flow theory. We examine the correlations found in the data of our computer simulation. We hope that our results could help civil engineers implement efficient road systems that fit their needs, as well as contribute toward the design of safely operating unmanned vehicles.

88) Kenz Kallal, Matthew Lipman, and Felix Wang, Equal Compositions of Rational Functions (21 Jan 2016)

We study the following questions:
(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ in complex rational functions $f,g\in\mathbb{C}(X)$ and meromorphic functions $\hat{f}, \hat{g}$ on the complex plane?
(2) For which rational functions $f(X)$ and $g(X)$ with coefficients in an algebraic number field $K$ does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in K$?
We utilize various algebraic, geometric and analytic results in order to resolve both questions in the case that the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$ of sufficiently large degree. Our work answers a 1973 question of Fried in all but finitely many cases, and makes significant progress towards answering a 1924 question of Ritt and a 1997 question of Lyubich and Minsky.

87) Dhruv Medarametla, Bounding Norms of Locally Random Matrices (21 Jan 2016)

Recently, several papers proving lower bounds for the performance of the Sum Of Squares Hierarchy on the planted clique problem have come out. A crucial part of all four papers is probabilistically bounding the norms of certain “locally random” matrices. In these matrices, the entries are not completely independent of each other, but rather depend upon a few edges of the input graph. In this paper, we study the norms of these locally random matrices. We start by bounding the norms of simple locally random matrices, whose entries depend on a bipartite graph H and a random graph G; we then generalize this result by bounding the norms of complex locally random matrices, matrices based off of a much more general graph H and a random graph G. For both cases, we prove almost-tight probabilistic bounds on the asymptotic behavior of the norms of these matrices.

86) Rachel Zhang, Statistics of Intersections of Curves on Surfaces (19 Jan 2016)

Each orientable surface with nonempty boundary can be associated with a planar model, whose edges can then be labeled with letters that read out a surface word. Then, the curve word of a free homotopy class of closed curves on a surface is the minimal sequence of edges of the planar model through which a curve in the class passes. The length of a class of curves is defined to be the number of letters in its curve word. We fix a surface and its corresponding planar model.
Fix a free homotopy class of curves ω on the surface. For another class of curves c, let i(ω; c) be the minimal number of intersections of curves in ω and c. In this paper, we show that the mean of the distribution of i(ω; c), for random curve c of length n, grows proportionally with n and approaches μ(ω) ⋅ n for a constant μ(ω). We also give an algorithm to compute μ(ω) and have written a program that calculates μ(ω) for any curve ω on any surface. In addition, we prove that i(ω; c) approaches a Gaussian distribution as n → ∞ by viewing the generation of a random curve as a Markov Chain.

85) Cristian Gutu and Fengyao Ding, SecretRoom: An Anonymous Chat Client (16 Jan 2016)

While many people would like to be able to communicate anonymously, the few existing anonymous communication systems sacrifice anonymity for performance, or vice versa. The most popular such app is Tor, which relies on a series of relays to protect anonymity. Though proven to be efficient, Tor does not guarantee anonymity in the presence of strong adversaries like ISPs and government agencies who can conduct in-depth traffic analysis. In contrast, our messaging application, SecretRoom, implements an improved version of a secure messaging protocol called Dining Cryptographers Networks (DCNets) to guarantee true anonymity in moderately sized groups. However, unlike traditional DCNets, SecretRoom does not require direct communication between all participants and does not depend on the presence of honest clients for anonymity. By introducing an untrusted server that performs the DCNet protocol on behalf of the clients, SecretRoom manages to reduce the O(n²) communication associated with traditional DCNets to O(n) for n clients. Moreover, by introducing artificially intelligent clients, SecretRoom makes the anonymity set size independent of the number of “real” clients. Ultimately SecretRoom reduces the communication to O(n) and allows the DCNet protocol to scale to hundreds of clients compared to a few tens of clients in traditional DCNets.
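To make the O(n²) cost of a traditional DCNet concrete, here is a minimal sketch of one round of the classic XOR-based dining cryptographers protocol, in which every pair of participants shares a secret key. This illustrates only the traditional scheme that the abstract contrasts against, not SecretRoom's server-assisted protocol; the function names and the 32-bit message size are assumptions made for the example.

```python
import secrets

def dc_net_round(n, sender, message, bits=32):
    """Simulate one round of a traditional DC-net with n participants.

    Every pair shares a random key, so n*(n-1)/2 keys are needed.
    `sender` is the index of the participant transmitting `message`,
    or None if nobody transmits. Returns the public announcements.
    """
    keys = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            keys[i][j] = keys[j][i] = secrets.randbits(bits)

    announcements = []
    for i in range(n):
        word = 0
        for j in range(n):
            if j != i:
                word ^= keys[i][j]   # XOR of all keys shared by participant i
        if i == sender:
            word ^= message          # only the sender mixes in the message
        announcements.append(word)
    return announcements

def combine(announcements):
    """XOR all announcements; each pairwise key appears twice and cancels."""
    result = 0
    for word in announcements:
        result ^= word
    return result
```

Combining the announcements recovers the message without revealing which participant's announcement contained it, since each individual announcement is masked by keys shared with every other participant.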

84) Girishvar Venkat, Signatures of the Contravariant Form on Representations of the Hecke Algebra and Rational Cherednik Algebra associated to G(r,1,n) (15 Jan 2016)

The Hecke algebra and rational Cherednik algebra of the group G(r,1,n) are non-commutative algebras that are deformations of certain classical algebras associated to the group. These algebras have numerous applications in representation theory, number theory, algebraic geometry and integrable systems in quantum physics. Consequently, understanding their irreducible representations is important. If the deformation parameters are generic, then these irreducible representations, called Specht modules in the case of the Hecke algebra and Verma modules in the case of the Cherednik algebra, are in bijection with the irreducible representations of G(r,1,n). However, while every irreducible representation of G(r,1,n) is unitary, the Hermitian contravariant form on the Specht modules and Verma modules may only be non-degenerate. Thus, the signature of this form provides a great deal of information about the representations of the algebras that cannot be seen by looking at the group representations. In this paper, we compute the signature of arbitrary Specht modules of the Hecke algebra and use them to give explicit formulas of the parameter values for which these modules are unitary. We also compute asymptotic limits of existing formulas for the signature character of the polynomial representations of the Cherednik algebra which are vastly simpler than the full signature characters and show that these limits are rational functions in t. In addition, we show that for half of the parameter values, for each k, the degree k portion of the polynomial representation is unitary for large enough n.

83) Mehtaab Sawhney (PRIMES) and Jonathan Weed (MIT), Further results on arc and bar k-visibility graphs (arXiv.org, 6 Jan 2016)

We consider visibility graphs involving bars and arcs in which lines of sight can pass through up to k objects. We prove a new edge bound for arc k-visibility graphs, provide maximal constructions for arc and semi-arc k-visibility graphs, and give a complete characterization of semi-arc visibility graphs. We show that the family of arc i-visibility graphs is never contained in the family of bar j-visibility graphs for any i and j, and that the family of bar i-visibility graphs is not contained in the family of bar j-visibility graphs for $i \neq j$. We also give the first thickness bounds for arc and semi-arc k-visibility graphs. Finally, we introduce a model for random semi-bar and semi-arc k-visibility graphs and analyze its properties.

82) Harshal Sheth and Aashish Welling, An Implementation and Analysis of a Kernel Network Stack in Go with the CSP Style (30 Dec 2015; arXiv.org, 17 Mar 2016)

Modern operating system kernels are written in lower-level languages such as C. Although the low-level functionalities of C are often useful within kernels, they also give rise to several classes of bugs. Kernels written in higher level languages avoid many of these potential problems, at the possible cost of decreased performance. This research evaluates the advantages and disadvantages of a kernel written in a higher level language. To do this, the network stack subsystem of the kernel was implemented in Go with the Communicating Sequential Processes (CSP) style. Go is a high-level programming language that supports the CSP style, which recommends splitting large tasks into several smaller ones running in independent "threads". Modules for the major networking protocols, including Ethernet, ARP, IPv4, ICMP, UDP, and TCP, were implemented. In this study, the implemented Go network stack, called GoNet, was compared to a representative network stack written in C. The GoNet code is more readable and generally performs better than that of its C stack counterparts. From this, it can be concluded that Go with CSP style is a viable alternative to C for the language of kernel implementations.

81) Xiangyao Yu (MIT), Hongzhe Liu (PRIMES), Ethan Zou (PRIMES), and Srini Devadas (MIT), Tardis 2.0: An Optimized Time Traveling Coherence Protocol (arXiv.org, 27 Nov 2015), published in Proceedings of the 2016 International Conference on Parallel Architectures and Compilation (PACT '16), pp. 261-274.

The scalability of cache coherence protocols is a significant challenge in multicore and other distributed shared memory systems. Traditional snoopy and directory-based coherence protocols are difficult to scale up to many-core systems because of the overhead of broadcasting and storing sharers for each cacheline. Tardis, a recently proposed coherence protocol, shows potential in solving the scalability problem, since it only requires O(logN) storage per cacheline for an N-core system and needs no broadcasting support. The original Tardis protocol, however, only supports the sequential consistency memory model. This limits its applicability in real systems since most processors today implement relaxed consistency models like Total Store Order (TSO). Tardis also incurs large network traffic overhead on some benchmarks due to an excessive number of renew messages. Furthermore, the original Tardis protocol has suboptimal performance when the program uses spinning to communicate between threads. In this paper, we address these downsides of Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system and protocol changes required in order to implement TSO consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also propose optimizations for better leasing policies and to handle program spinning. Evaluated on 20 benchmarks, optimized Tardis at 64 (256) cores can achieve average performance improvement of 15.8% (8.4%) compared to the baseline Tardis and 1% (3.4%) compared to the baseline directory protocol. Our optimizations also reduce the average network traffic by 4.3% (6.1%) compared to the baseline directory protocol. On this set of benchmarks, optimized Tardis improves on a fullmap directory protocol in the metrics of energy, performance and storage, while being simpler to implement.

80) Allison Paul, Spectral Inference of a Directed Acyclic Graph Using Pairwise Similarities (11 Nov 2015)

A gene ontology graph is a directed acyclic graph (DAG) which represents relationships among biological processes. Inferring such a graph using a gene similarity matrix is NP-hard in general. Here, we propose an approximate algorithm to solve this problem efficiently by reducing the dimensionality of the problem using spectral clustering. We show that the original problem can be simplified to the inference problem of overlapping clusters in a network. We then solve the simplified problem in two steps: first we infer clusters using a spectral clustering technique. Then, we identify possible overlaps among the inferred clusters by identifying maximal cliques over the cluster similarity graph. We illustrate the effectiveness of our method over various synthetic networks in terms of both the performance and computational complexity compared to existing methods.

79) Niket Gowravaram, A Variation of nil-Temperley-Lieb Algebras of type A (26 Sep 2015)

We investigate a variation on the nil-Temperley-Lieb algebras of type A. This variation is formed by removing one of the relations and, in some sense, can be considered as a type B of the algebras. We give a general description of the structure of monomials formed by generators in the algebras. We also show that the dimension of these algebras is the sequence ${2n \choose n}$, by showing that the dimension is the Catalan transform of the sequence $2^n$.
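The dimension claim can be checked numerically for small $n$. The sketch below assumes the usual definition of the Catalan transform (substituting $x \mapsto x\,c(x)$ in the generating function, where $c(x)$ is the Catalan generating function), which gives $b_n = \sum_{k=0}^{n} \frac{k}{2n-k}\binom{2n-k}{n-k} a_k$ with $b_0 = a_0$. Whether this matches the paper's exact convention is an assumption, and the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def catalan_transform(a):
    """Return the Catalan transform of the finite sequence a (same length)."""
    b = []
    for n in range(len(a)):
        if n == 0:
            b.append(a[0])
            continue
        # The weights k/(2n-k) * C(2n-k, n-k) are the ballot numbers.
        total = sum(
            Fraction(k, 2 * n - k) * comb(2 * n - k, n - k) * a[k]
            for k in range(1, n + 1)
        )
        b.append(int(total))
    return b


powers_of_two = [2 ** k for k in range(8)]
print(catalan_transform(powers_of_two))
```

Under this definition the output agrees with the central binomial coefficients ${2n \choose n}$ for the terms computed.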

78) Caleb Ji, Tanya Khovanova (MIT), Robin Park, and Angela Song, Chocolate Numbers (arXiv.org, 21 Sep 2015), published in Journal of Integer Sequences, vol. 19 (2016)

In this paper, we consider a game played on a rectangular $m \times n$ gridded chocolate bar. Each move, a player breaks the bar along a grid line. Each move after that consists of taking any piece of chocolate and breaking it again along existing grid lines, until just $mn$ individual squares remain.
This paper enumerates the number of ways to break an $m \times n$ bar, which we call chocolate numbers, and introduces four new sequences related to these numbers. Using various techniques, we prove interesting divisibility results regarding these sequences.
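A short recursive count makes the definition concrete. The recurrence below is a reconstruction, not taken from the paper: it assumes two ways of breaking the bar are different whenever the ordered sequences of breaks differ, so after the first break the remaining breaks of the two pieces may be interleaved in any order. The function name `chocolate` is illustrative.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def chocolate(m, n):
    """Count ordered break sequences reducing an m x n bar to unit squares."""
    if m * n == 1:
        return 1
    # An m x n bar needs m*n - 1 breaks in total; one is used by the first move.
    rest = m * n - 2
    total = 0
    for i in range(1, m):
        # Split into i x n and (m - i) x n; the first piece needs i*n - 1 breaks.
        total += comb(rest, i * n - 1) * chocolate(i, n) * chocolate(m - i, n)
    for j in range(1, n):
        # Split into m x j and m x (n - j); the first piece needs m*j - 1 breaks.
        total += comb(rest, m * j - 1) * chocolate(m, j) * chocolate(m, n - j)
    return total


print([chocolate(2, n) for n in range(1, 5)])
```

The binomial factor counts the interleavings of the two independent break sequences; for a $1 \times n$ bar the recurrence reduces to $(n-1)!$.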

77) Albert Gerovitch, Andrew Gritsevskiy, and Gregory Barboy, Mobile Health Surveillance: The Development of Software Tools for Monitoring the Spread of Disease (21 Sep 2015)

Disease spread monitoring data often comes with a significant delay and low geospatial resolution. We aim to develop a software tool for data collection, which enables daily monitoring and prediction of the spread of disease in a small community. We have developed a crowdsourcing application that collects users' health statuses and locations. It allows users to update their daily status online, and, in return, provides a visual map of geospatial distribution of sick people in a community, outlining locations with increased disease incidence. Currently, due to the lack of a large user base, we substitute this information with simulated data, and demonstrate our program's capabilities on a hypothetical outbreak. In addition, we use analytical methods for predicting town-level disease spread in the future. We model the disease spread via interpersonal probabilistic interactions on an undirected social graph. The network structure is based on scale-free networks integrated with Census data. The epidemic is modeled using the Susceptible-Infected-Recovered (SIR) model and a set of parameters, including transmission rate and vaccination patterns. The developed application will provide better methods for early detection of epidemics, identify places with high concentrations of infected people, and predict localized disease spread.
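The kind of model described above can be sketched as a discrete-time SIR process on an undirected graph. This is a minimal illustration, not the authors' implementation: the per-contact transmission probability `beta`, the recovery probability `gamma`, the synchronous update rule, and the function name are all assumptions, and no Census data or scale-free structure is included.

```python
import random


def simulate_sir(adjacency, initially_infected, beta, gamma, steps, seed=0):
    """Run a synchronous SIR simulation; return (S, I, R) counts after each step."""
    rng = random.Random(seed)
    state = {v: "S" for v in adjacency}
    for v in initially_infected:
        state[v] = "I"

    history = []
    for _ in range(steps):
        new_state = dict(state)
        for v, s in state.items():
            if s == "S":
                # Each infected neighbour transmits independently with probability beta.
                for u in adjacency[v]:
                    if state[u] == "I" and rng.random() < beta:
                        new_state[v] = "I"
                        break
            elif s == "I" and rng.random() < gamma:
                new_state[v] = "R"
        state = new_state
        history.append(
            tuple(sum(1 for s in state.values() if s == c) for c in "SIR")
        )
    return history


# A path 0-1-2-3-4 with one initial case at vertex 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(simulate_sir(path, [0], beta=0.5, gamma=0.2, steps=10))
```

Vaccination could be modeled in the same framework by starting some vertices in state "R".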

76) Niket Gowravaram and Tanya Khovanova (MIT), On the Structure of nil-Temperley-Lieb Algebras of type A (arXiv.org, 1 Sep 2015)

We investigate nil-Temperley-Lieb algebras of type A. We give a general description of the structure of monomials formed by the generators. We also show that the dimensions of these algebras are the famous Catalan numbers by providing a bijection between the monomials and Dyck paths. We show that the distribution of these monomials by degree is the same as the distribution of Dyck paths by the sum of the heights of the peaks minus the number of peaks.
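The Dyck path statistic in the last sentence is easy to tabulate by brute force. The sketch below enumerates Dyck paths of a given semilength and computes the sum of peak heights minus the number of peaks; how the semilength corresponds to the index of the algebra is not specified here, and the function names are illustrative.

```python
from collections import Counter


def dyck_paths(n):
    """Yield all Dyck paths of semilength n as strings of 'U' and 'D'."""
    def extend(path, ups, downs):
        if ups == n and downs == n:
            yield path
            return
        if ups < n:
            yield from extend(path + "U", ups + 1, downs)
        if downs < ups:
            yield from extend(path + "D", ups, downs + 1)

    yield from extend("", 0, 0)


def peak_statistic(path):
    """Sum of the heights of the peaks minus the number of peaks."""
    height = 0
    total = 0
    for i, step in enumerate(path):
        height += 1 if step == "U" else -1
        # A peak is an up-step immediately followed by a down-step.
        if step == "U" and i + 1 < len(path) and path[i + 1] == "D":
            total += height - 1
    return total


def degree_distribution(n):
    """Count Dyck paths of semilength n by the peak statistic."""
    return Counter(peak_statistic(p) for p in dyck_paths(n))


print(sorted(degree_distribution(4).items()))
```

The values of each distribution sum to a Catalan number, as expected from the bijection.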

75) Tanya Khovanova (MIT) and Karan Sarkar, P-positions in Modular Extensions to Nim (arXiv.org, 27 Aug 2015), published in International Journal of Game Theory, vol. 46 (2017)

In this paper, we consider a modular extension to the game of Nim, which we call $m$-Modular Nim, and explore its optimal strategy. In $m$-Modular Nim, a player can either make a standard Nim move or remove a multiple of $m$ tokens in total. We develop a winning strategy for all $m$ with $2$ heaps and for odd $m$ with any number of heaps.
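For small positions, the P-positions of $m$-Modular Nim can be found by exhaustive search. The sketch below assumes that the modular move removes a positive multiple of $m$ tokens in total, taken from any combination of heaps, and that the player unable to move loses; both readings of the rules, and the function name, are assumptions.

```python
from functools import lru_cache
from itertools import product


def is_p_position(heaps, m):
    """Return True if the player to move loses m-Modular Nim from `heaps`."""
    def moves(pos):
        result = set()
        # Standard Nim move: remove at least one token from a single heap.
        for i, h in enumerate(pos):
            for take in range(1, h + 1):
                result.add(pos[:i] + (h - take,) + pos[i + 1:])
        # Modular move: remove a positive multiple of m tokens in total.
        for nxt in product(*(range(h + 1) for h in pos)):
            removed = sum(pos) - sum(nxt)
            if removed > 0 and removed % m == 0:
                result.add(nxt)
        return result

    @lru_cache(maxsize=None)
    def losing(pos):
        # A position is a P-position if no move leads to a P-position.
        return not any(losing(nxt) for nxt in moves(pos))

    return losing(tuple(heaps))


p_positions = [
    (a, b) for a in range(6) for b in range(a, 6) if is_p_position((a, b), 3)
]
print(p_positions)
```

The search is exponential in the number of heaps and is meant only for exploring small cases.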

74) Nicholas Diaco and Tanya Khovanova (MIT), Weighing Coins and Keeping Secrets (arXiv.org, 20 Aug 2015), published in Mathematical Intelligencer (September 2016)

In this expository paper we discuss a relatively new counterfeit coin problem with an unusual goal: maintaining the privacy of, rather than revealing, counterfeit coins in a set of both fake and real coins. We introduce two classes of solutions to this problem --- one that respects the privacy of all the coins and one that respects the privacy of only the fake coins --- and give several results regarding each. We describe and generalize 6 unique strategies that fall into these two categories. Furthermore, we explain conditions for the existence of a solution, as well as showing proof of a solution's optimality in select cases. In order to quantify exactly how much information is revealed by a given solution, we also define the revealing factor and revealing coefficient; these two values additionally act as a means of comparing the relative effectiveness of different solutions. Most importantly, by introducing an array of new concepts, we lay the foundation for future analysis of this very interesting problem, as well as many other problems related to privacy and the transfer of information.

73) Luke Sciarappa, Simple commutative algebras in Deligne's categories Rep($S_t$) (arXiv.org, 24 Jun 2015)

We show that in the Deligne categories $\mathrm{Rep}(S_t)$ for $t$ a transcendental number, the only simple algebra objects are images of simple algebras in the category of representations of a symmetric group under a canonical induction functor. They come in families which interpolate the families of algebras of functions on the cosets of $H\times S_{n-k}$ in $S_n$, for a fixed subgroup $H$ of $S_k$.

2014 Research Papers

72) Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), Carolyn Lu (PRIMES), Anton Goloborodko (MIT), Nezar Abdennur (MIT), and Leonid Mirny (MIT), Formation of Chromosomal Domains by Loop Extrusion (bioRxiv, 14 Aug 2015), published in Cell Reports 15:9 (31 May 2016): 2038–2049.

Characterizing how the three-dimensional organization of eukaryotic interphase chromosomes modulates regulatory interactions is an important contemporary challenge. Here we propose an active process underlying the formation of chromosomal domains observed in Hi-C experiments. In this process, cis-acting factors extrude progressively larger loops, but stall at domain boundaries; this dynamically forms loops of various sizes within but not between domains. We studied this mechanism using a polymer model of the chromatin fiber subject to loop extrusion dynamics. We find that systems of dynamically extruded loops can produce domains as observed in Hi-C experiments. Our results demonstrate the plausibility of the loop extrusion mechanism, and posit potential roles of cohesin complexes as a loop-extruding factor, and CTCF as an impediment to loop extrusion at domain boundaries.

71) Kavish Gandhi, Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube (4 April 2015)

A geodesic in the hypercube is the shortest possible path between two vertices. Leader and Long (2013) conjectured that, in every antipodal $2$-coloring of the edges of the hypercube, there exists a monochromatic geodesic between antipodal vertices. For this and an equivalent conjecture, we prove the cases $n = 2, 3, 4, 5$. We also examine the maximum number of monochromatic geodesics of length $k$ in an antipodal $2$-coloring and find it to be $2^{n-1}(n-k+1)\binom{n-1}{k-1}(k-1)!$. In this case, we classify all colorings in which this maximum occurs. Furthermore, we explore the maximum number of antipodal geodesics in a subgraph of the hypercube with a fixed proportion of edges, providing a conjectured optimal configuration as a lower bound, which, interestingly, contains a constant proportion of geodesics with respect to $n$. Finally, we present a series of smaller results that could be of use in finding an upper bound on the maximum number of antipodal geodesics in such a subgraph of the hypercube.

70) Jesse Geneson (MIT) and Peter M. Tian (PRIMES), Sequences of formation width $4$ and alternation length $5$ (arXiv.org, 13 Feb 2015)

Sequence pattern avoidance is a central topic in combinatorics. A sequence $s$ contains a sequence $u$ if some subsequence of $s$ can be changed into $u$ by a one-to-one renaming of its letters. If $s$ does not contain $u$, then $s$ avoids $u$. A widely studied extremal function related to pattern avoidance is $Ex(u, n)$, the maximum length of an $n$-letter sequence that avoids $u$ and has every $r$ consecutive letters pairwise distinct, where $r$ is the number of distinct letters in $u$.
We bound $Ex(u, n)$ using the formation width function, $fw(u)$, which is the minimum $s$ for which there exists $r$ such that any concatenation of $s$ permutations, each on the same $r$ letters, contains $u$. In particular, we identify every sequence $u$ such that $fw(u)=4$ and $u$ contains $ababa$. The significance of this result lies in its implication that, for every such sequence $u$, we have $Ex(u, n) = \Theta(n \alpha(n))$, where $\alpha(n)$ denotes the incredibly slow-growing inverse Ackermann function. We have thus identified the extremal function of many infinite classes of previously unidentified sequences.

69) William Wu (PRIMES), Nicolaas Kaashoek (PRIMES), Matthew Weinberg (MIT), Christos Tzamos (MIT), and Costis Daskalakis (MIT), Game Theory based Peer Grading Mechanisms for MOOCs, paper for the Learning at Scale 2015 conference, March 14-18, 2015, Vancouver, BC, Canada (4 February 2015)

An efficient peer grading mechanism is proposed for grading the multitude of assignments in online courses. This novel approach is based on game theory and mechanism design. A set of assumptions and a mathematical model are ratified to simulate the dominant strategy behavior of students in a given mechanism. A benchmark function accounting for grade accuracy and workload is established to quantitatively compare effectiveness and scalability of various mechanisms. After multiple iterations of mechanisms under increasingly realistic assumptions, three are proposed: Calibration, Improved Calibration, and Deduction. The Calibration mechanism performs as predicted by game theory when tested in an online crowd-sourced experiment, but fails when students are assumed to communicate. The Improved Calibration mechanism addresses this assumption, but at the cost of more effort spent grading. The Deduction mechanism performs relatively well in the benchmark, outperforming the Calibration, Improved Calibration, traditional automated, and traditional peer grading systems. The mathematical model and benchmark open the way for future derivative works to be performed and compared.

68) Alexandria Yu, Towards the classification of unital 7-dimensional commutative algebras (19 Jan 2015)

An algebra is a vector space with a compatible product operation. An algebra is called commutative if the product of any two elements is independent of the order in which they are multiplied. A basic problem is to determine how many unital commutative algebras exist in a given dimension and to find all of these algebras. This classification problem has its origin in number theory and algebraic geometry. For dimension less than or equal to 6, Poonen has completely classified all unital commutative algebras up to isomorphism. For dimension greater than or equal to 7, the situation is much more complicated due to the fact that there are infinitely many algebras up to isomorphism. The purpose of this work is to develop new techniques to classify unital 7-dimensional commutative algebras up to isomorphism. An algebra is called local if there exists a unique maximal ideal $m$. Local algebras are basic building blocks for general algebras as any finite dimensional unital commutative algebra is isomorphic to a direct sum of finite dimensional unital commutative local algebras. Hence, in order to classify all finite dimensional unital commutative algebras, it suffices to classify all finite dimensional unital commutative local algebras. In this article, we classify all unital 7-dimensional commutative local algebras up to isomorphism with the exception of the special case $k_1 = 3$ and $k_2 = 3$, where, for each positive integer $i$, $m^i$ is the subalgebra generated by products of $i$ elements in the maximal ideal $m$ and $k_i$ is the dimension of the quotient algebra $m^i / m^{i+1}$. When $k_2 = 1$, we classify all finite dimensional unital commutative local algebras up to isomorphism. As a byproduct of our classification theorems, we discover several new classes of unital finite dimensional commutative algebras.

67) Niket Gowravaram and Uma Roy, Diagrammatic Calculus of Coxeter and Braid Groups (arXiv.org, 15 Mar 2015)

We investigate a novel diagrammatic approach to examining strict actions of a Coxeter group or a braid group on a category. This diagrammatic language, which was developed in a series of papers by Elias, Khovanov and Williamson, provides new tools and methods to attack many problems of current interest in representation theory. In our research we considered a particular problem which arises in this context. To a Coxeter group $W$ one can associate a real hyperplane arrangement, and can consider the complement of these hyperplanes in the complexification $Y_W$. The celebrated $K(\pi,1)$ conjecture states that $Y_W$ should be a classifying space for the pure braid group, and thus a natural quotient ${Y_W}/{W}$ should be a classifying space for the braid group. Salvetti provided a cell complex realization of the quotient, which we refer to as the Salvetti complex. In this paper we investigate a part of the $K(\pi,1)$ conjecture, which we call the $K(\pi,1)$ conjecturette, that states that the second homotopy group of the Salvetti complex is trivial. In this paper we present a diagrammatic proof of the $K(\pi,1)$ conjecturette for a family of braid groups as well as an analogous result for several families of Coxeter groups.

66) Arjun Khandelwal, Compact dot representations in permutation avoidance (3 Mar 2015)

A paper by Eriksson et al. (2001) introduced a new form of representing a permutation, referred to as the compact dot representation, with the goal of constructing a smaller superpattern. We study this representation and give bounds on its size. We also consider a variant of the problem, where limitations on the alphabet size are imposed, and obtain lower bounds. Lastly, we consider the Möbius function of the poset of permutations ordered by containment.

65) Suzy Lou and Max Murin, On the Strongly Regular Graph of Parameters (99, 14, 1, 2) (9 Jan 2015)

In an attempt to find a strongly regular graph of parameters (99, 14, 1, 2) or to disprove its existence, we studied its possible substructure and constructions.
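As background, the parameter set (99, 14, 1, 2) passes the standard arithmetic feasibility tests for strongly regular graphs. The sketch below checks the counting identity $k(k-\lambda-1) = (v-k-1)\mu$ and the integrality of the eigenvalue multiplicities; these are textbook necessary conditions rather than results of the paper, the function name is illustrative, and passing them does not show that such a graph exists.

```python
from fractions import Fraction
from math import isqrt


def srg_feasibility(v, k, lam, mu):
    """Check basic necessary conditions for a strongly regular graph srg(v, k, lam, mu)."""
    counting_ok = k * (k - lam - 1) == (v - k - 1) * mu

    # The non-trivial eigenvalues are the roots of x^2 - (lam - mu) x - (k - mu) = 0.
    disc = (lam - mu) ** 2 + 4 * (k - mu)
    root = isqrt(disc)
    if root * root != disc:
        return {"counting": counting_ok, "eigenvalues": None,
                "multiplicities": None, "integral": False}

    r = Fraction(lam - mu + root, 2)
    s = Fraction(lam - mu - root, 2)
    # Multiplicities f, g solve f + g = v - 1 and k + f*r + g*s = 0.
    f = (-k - (v - 1) * s) / (r - s)
    g = (v - 1) - f
    integral = f.denominator == 1 and g.denominator == 1 and f >= 0 and g >= 0
    return {"counting": counting_ok, "eigenvalues": (r, s),
            "multiplicities": (f, g), "integral": integral}


print(srg_feasibility(99, 14, 1, 2))
```

For (99, 14, 1, 2) the eigenvalues come out as 3 and -4 with multiplicities 54 and 44, so none of these arithmetic conditions rules the graph out.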

64) Shashwat Kishore (PRIMES) and Augustus Lonergan (MIT), Signatures of Multiplicity Spaces in Tensor Products of $\mathfrak{sl}_2$ and $U_q(\mathfrak{sl}_2)$ Representations (9 Jan 2015; arXiv.org, 8 Jun 2015)

We study multiplicity space signatures in tensor products of $\mathfrak{sl}_2$ and $U_q(\mathfrak{sl}_2)$ representations and their applications. We completely classify definite multiplicity spaces for generic tensor products of $\mathfrak{sl}_2$ Verma modules. This provides a classification of a family of unitary representations of a basic quantized quiver variety, one of the first such classifications for any quantized quiver variety. We use multiplicity space signatures to provide the first real critical point lower bound for generic $\mathfrak{sl}_2$ master functions. As a corollary of this bound, we obtain a simple and asymptotically correct approximation for the number of real critical points of a generic $\mathfrak{sl}_2$ master function. We obtain a formula for multiplicity space signatures in tensor products of finite dimensional simple $U_q(\mathfrak{sl}_2)$ representations. Our formula also gives multiplicity space signatures in generic tensor products of $\mathfrak{sl}_2$ Verma modules and generic tensor products of real $U_q(\mathfrak{sl}_2)$ Verma modules. Our results have relations with knot theory, statistical mechanics, quantum physics, and geometric representation theory.

63) Joseph Zurier, Generalizations of the Joints Problem (9 Jan 2015)

In this paper we explore generalizations of the joints problem introduced by B. Chazelle et al.

62) Nathan Wolfe (PRIMES), Ethan Zou (PRIMES), Ling Ren (MIT), and Xiangyao Yu (MIT), Optimizing Path ORAM for Cloud Storage Applications (arXiv.org, 8 Jan 2015)

We live in a world where our personal data are both valuable and vulnerable to misappropriation through exploitation of security vulnerabilities in online services. For instance, Dropbox, a popular cloud storage tool, has certain security flaws that can be exploited to compromise a user's data, one of which being that a user's access pattern is unprotected. We have thus created an implementation of Path Oblivious RAM (Path ORAM) for Dropbox users to obfuscate path access information to patch this vulnerability. This implementation differs significantly from the standard usage of Path ORAM, in that we introduce several innovations, including a dynamically growing and shrinking tree architecture, multi-block fetching, block packing and the possibility for multi-client use. Our optimizations together produce about a 77% throughput increase and a 60% reduction in necessary tree size; these numbers vary with file size distribution.

61) Brice Huang, Monomization of Power Ideals and Generalized Parking Functions (8 Jan 2015)

A power ideal is an ideal in a polynomial ring generated by powers of homogeneous linear forms. Power ideals arise in many areas of mathematics, including the study of zonotopes, approximation theory, and fat point ideals; in particular, their applications in approximation theory are relevant to work on splines and pertinent to mathematical modeling, industrial design, and computer graphics. For this reason, understanding the structure of power ideals, especially their Hilbert series, is an important problem. Unfortunately, due to the computational complexity of power ideals, this is a difficult problem. Only a few cases of this problem have been solved; efficient ways to compute the Hilbert series of a power ideal are known only for power ideals of certain forms. In this paper, we find an efficient way to compute the Hilbert series of a class of power ideals.

60) Kyle Gettig, Linear Extensions of Acyclic Orientations (7 Jan 2015)

Given a graph, an acyclic orientation of the edges determines a partial ordering of the vertices. This partial ordering has a number of linear extensions, i.e. total orderings of the vertices that agree with the partial ordering. The purpose of this paper is twofold. Firstly, properties of the orientation that induces the maximum number of linear extensions are investigated. Due to similarities between the optimal orientation in simple cases and the solution to the Max-Cut Problem, the possibility of a correlation is explored, though with minimal success. Correlations are then explored between the optimal orientation of a graph G and the comparability graphs with the minimum number of edges that contain G as a subgraph, as well as to certain graphical colorings induced by the orientation. Specifically, small cases of non-comparability graphs are investigated and compared to the known results for comparability graphs. We then explore the optimal orientation for odd anti-cycles and related graphs, proving that the conjectured orientations are optimal in the odd anti-cycle case. In the second part of this paper, the above concepts are extended to random graphs, that is, graphs with probabilities associated with each edge. New definitions and theorems are introduced to create a more intuitive system that agrees with the discrete case when all probabilities are 0 or 1, though complete results for this new system would be much more difficult to prove.
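For very small graphs, the orientation with the most linear extensions can be found by brute force. The sketch below relies on the observation that every ordering of the vertices is a linear extension of exactly one acyclic orientation (each edge directed from the earlier vertex to the later one), so grouping orderings by the orientation they induce counts linear extensions. The function name is illustrative and the running time is factorial in the number of vertices.

```python
from collections import Counter
from itertools import permutations


def best_acyclic_orientation(n_vertices, edges):
    """Return (orientation, count) maximizing the number of linear extensions."""
    counts = Counter()
    for order in permutations(range(n_vertices)):
        position = {v: i for i, v in enumerate(order)}
        # Direct every edge from the earlier vertex to the later one.
        orientation = tuple(
            (u, v) if position[u] < position[v] else (v, u) for u, v in edges
        )
        counts[orientation] += 1
    return counts.most_common(1)[0]


# Star with centre 0 and three leaves.
print(best_acyclic_orientation(4, [(0, 1), (0, 2), (0, 3)]))
```

When several orientations tie for the maximum, only one of them is returned.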

59) Shyam Narayanan, Improving the Speed and Accuracy of the Miller-Rabin Primality Test (7 Jan 2015)

In this paper, we discuss the accuracy of the Miller-Rabin Primality Test and the number of nonwitnesses for a composite odd integer $n$.
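To make the notion of a nonwitness concrete, the sketch below implements the standard strong probable prime test for a single base and lists the bases $1 \le a \le n-1$ that fail to expose a composite $n$. It is a plain textbook implementation, not the improved method of the paper, and the function names are illustrative.

```python
def is_strong_probable_prime(n, a):
    """Return True if odd n > 2 passes the Miller-Rabin test to base a."""
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    # Square repeatedly, looking for -1 among a^(d * 2^r) for 0 < r < s.
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


def nonwitnesses(n):
    """List the bases in [1, n-1] for which odd n passes the test."""
    return [a for a in range(1, n) if is_strong_probable_prime(n, a)]


print(nonwitnesses(9), nonwitnesses(25))
```

For a prime $n$ every base passes, so the list has $n-1$ entries; for an odd composite $n$ the list is the set of nonwitnesses.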

58) Peter M. Tian, Extremal Functions of Forbidden Multidimensional Matrices (7 Jan 2015)

We advance the extremal theory of matrices in two directions. The methods that we use come from combinatorics, probability, and analysis.

57) Eric Neyman, Cylindric Young Tableaux and their Properties (7 Jan 2015; earlier version on arXiv.org, 19 Oct 2014)

Cylindric Young tableaux are combinatorial objects that first appeared in the 1990s. A natural extension of the classical notion of a Young tableau, they have since been used several times, most notably by Gessel and Krattenthaler and by Alexander Postnikov. Despite this, relatively little is known about cylindric Young tableaux. This paper is an investigation of the properties of this object. In this paper, we extend the Robinson-Schensted-Knuth Correspondence, a well-known and very useful bijection concerning regular Young tableaux, to be a correspondence between pairs of cylindric tableaux. We use this correspondence to reach further results about cylindric tableaux. We then establish an interpretation of cylindric tableaux in terms of a game involving marble-passing. Next, we demonstrate a generic method to use results concerning cylindric tableaux in order to prove results about skew Young tableaux. We finish with a note on Knuth equivalence and its analog for cylindric tableaux.

56) Yilun Du, On the Algorithmic and Theoretical Exploration of Tiling-Harmonic Functions (6 Jan 2015)

In this paper, we explore a new class of harmonic functions defined on a square tiling $T$ of a region $D$ in $\mathbb{C}$. We call these functions tiling harmonic functions. We develop an efficient algorithm for computing interior values of tiling harmonic functions and graph harmonic functions in a tiling. Using our algorithm, we find that tiling harmonic functions are not generally equivalent to graph harmonic functions. In addition, we prove some theoretical results on the structure of tiling harmonic functions and classify one type of tiling harmonic function.

55) Jessica Li, On the Modeling of Snowflake Growth Using Hexagonal Automata (2 Jan 2015; arXiv.org, 8 May 2015; published (with Laura P. Schaposnik) in Physical Review E 93:2 (Feb. 2016))

Snowflake growth is an example of crystallization, a basic phase transition in physics. Studying snowflake growth helps gain fundamental understanding of this basic process and may help produce better crystalline materials and benefit several major industries. The basic theoretical physical mechanisms governing the growth of snowflakes are not well understood: whilst current computer modeling methods can generate snowflake images that successfully capture some basic features of actual snowflakes, so far there has been no analysis of these computer models in the literature, and more importantly, certain fundamental features of snowflakes are not well understood. A key challenge of analysis is that the snowflake growth models consist of a large set of partial difference equations, and as in many chaos theory problems, rigorous study is difficult. In this paper we analyze a popular model (Reiter’s model) using a combined approach of mathematical analysis and numerical simulation. We divide a snowflake image into main branches and side branches and define two new variables (growth latency and growth direction) to characterize the growth patterns. We derive a closed form solution of the main branch growth latency using a one dimensional linear model, and compare it with the simulation results using the hexagonal automata. We discover a few interesting patterns of the growth latency and direction of side branches. On the basis of the analysis and the principle of surface free energy minimization, we propose a new geometric rule to incorporate interface control, a basic mechanism of crystallization that is not taken into account in Reiter’s original model.

54) Amy Chou and Justin Kaashoek, PuzzleJAR: Automated Constraint-based Generation of Puzzles of Varying Complexity (30 Sept 2014)

Engaging students in practicing a wide range of problems facilitates their learning. However, generating fresh problems that have specific characteristics, such as using a certain set of concepts or being of a given difficulty level, is a tedious task for a teacher. In this paper, we present PuzzleJAR, a system that is based on an iterative constraint-based technique for automatically generating problems. The PuzzleJAR system takes as parameters the problem definition, the complexity function, and domain-specific semantics-preserving transformations. We present an instantiation of our technique with automated generation of Sudoku and Fillomino puzzles, and we are currently extending our technique to generate Python programming problems. Since defining complexities of Sudoku and Fillomino puzzles is still an open research question, we developed our own mechanism to define complexity, using machine learning to generate a function for difficulty from puzzles with already known difficulties. Using this technique, PuzzleJAR generated over 200,000 Sudoku puzzles of different sizes (9x9, 16x16, 25x25) and over 10,000 Fillomino puzzles of sizes ranging from 2x2 to 16x16.

53) Tanya Khovanova, Eric Nie, and Alok Puranik, The Sierpinski Triangle and The Ulam-Warburton Automaton (arXiv.org, 25 Aug 2014), published in Math Horizons (September 2015), reprinted in The Best Writing on Mathematics 2016

This paper is about the beauty of fractals and the surprising connections between them. We will explain the pioneering role that the Sierpinski triangle plays in the Ulam-Warburton automata and show you a number of pictures along the way.

52) Tanya Khovanova and Joshua Xiong, Cookie Monster Plays Games (arXiv.org, 6 July 2014), published in College Mathematics Journal 46:4 (2015): 283-293

We research a combinatorial game based on the Cookie Monster problem called the Cookie Monster game that generalizes the games of Nim and Wythoff. We also propose several combinatorial games that are in between the Cookie Monster game and Nim. We discuss properties of P-positions of all of these games.
Each section consists of two parts. The first part is a story presented from the Cookie Monster's point of view, the second part is a more abstract discussion of the same ideas by the authors.

51) Tanya Khovanova and Joshua Xiong, Nim Fractals (arXiv.org, 23 May 2014), published in Journal of Integer Sequences, Vol. 17 (2014)

We enumerate P-positions in the game of Nim in two different ways. In one series of sequences we enumerate them by the maximum number of counters in a pile. In another series of sequences we enumerate them by the total number of counters. We show that the game of Nim can be viewed as a cellular automaton, where the total number of counters divided by 2 can be considered as a generation in which P-positions are born. We prove that the three-pile Nim sequence enumerated by the total number of counters is a famous toothpick sequence based on the Ulam-Warburton cellular automaton. We introduce 10 new sequences.
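As a small illustration of the second enumeration, the sketch below counts three-pile Nim P-positions (ordered triples with XOR equal to zero) by total number of counters. The function name `nim_p_positions_by_total` is hypothetical and this is not the authors' code. In a P-position every binary digit is set in either zero or two of the three piles, so the total is always even, and a total of 2n admits exactly 3^(number of 1-bits of n) ordered triples; the brute-force count can be checked against that closed form.

```python
def nim_p_positions_by_total(total):
    """Count ordered triples (a, b, c) with a + b + c == total and a ^ b ^ c == 0."""
    count = 0
    for a in range(total + 1):
        for b in range(total + 1 - a):
            c = total - a - b
            if a ^ b ^ c == 0:  # XOR zero characterizes P-positions in Nim
                count += 1
    return count


# Counts for totals 0, 2, 4, ..., 14 (odd totals never give P-positions).
counts = [nim_p_positions_by_total(2 * n) for n in range(8)]
print(counts)
```

The "generation" of the abstract corresponds to n, half the total number of counters.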

50) Noah Golowich, Resolving a Conjecture on Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 13 Apr 2014), published in The Electronic Journal of Combinatorics 21:3 (2014)

A linear equation is $r$-regular if, for every $r$-coloring of the positive integers, there exist positive integers of the same color which satisfy the equation. In 2005, Fox and Radoičić conjectured that the equation $x_1 + 2x_2 + \cdots + 2^{n-2}x_{n-1} - 2^{n-1}x_n = 0$, for any $n \geq 2$, has a degree of regularity of $n-1$, which would verify a conjecture of Rado from 1933. Rado's conjecture has since been verified with a different family of equations. In this paper, we show that each equation in Fox and Radoičić's family indeed has a degree of regularity of $n-1$. We also prove a few extensions of this result.

2013 Research Papers

49) Ritesh Ragavender, Odd Dunkl Operators and nilHecke Algebras (30 May 2014)

Symmetric functions appear in many areas of mathematics and physics, including enumerative combinatorics, the representation theory of symmetric groups, statistical mechanics, and the quantum statistics of ideal gases. In the commutative (or “even”) case of these symmetric functions, Kostant and Kumar introduced a nilHecke algebra that categorifies the quantum group U_q(sl₂). This categorification helps to better understand Khovanov homology, which has important applications in studying knot polynomials and gauge theory. Recently, Ellis and Khovanov initiated the program of “oddification” as an effort to create a representation theoretic understanding of a new “odd” Khovanov homology, which often yields more powerful results than regular Khovanov homology. In this paper, we contribute towards the project of oddification by studying the odd Dunkl operators of Khongsap and Wang in the setting of the odd nilHecke algebra. Specifically, we show that odd divided difference operators can be used to construct odd Dunkl operators, which we use to give a representation of sl₂ on the algebra of skew polynomials and evaluate the odd Dunkl Laplacian. We then investigate q-analogs of divided difference operators to introduce new algebras that are similar to the even and odd nilHecke algebras and act on q-symmetric polynomials. We describe such algebras for all previously unstudied values of q. We conclude by generalizing a diagrammatic method and developing the novel method of insertion in order to study q-symmetric polynomials from the perspective of bialgebras.

48) Gabriella Studt, Construction of the higher Bruhat order on the Weyl group of type B (27 May 2014)

Manin and Schechtman defined the Bruhat order on the type A Weyl group, which is closely associated to the symmetric group S_n, as the order of all pairs of numbers in {1, 2, ..., n}. They proceeded to define a series of higher orders. Each higher order is an order on the subsets of {1, 2, ..., n} of size k, and can be computed using an inductive argument. It is also possible to define each of these higher orders explicitly, and therefore know conclusively the lexicographic orders for all k. It is thought that a closely related concept of lexicographic order exists for the Weyl group of type B, and that a similar method can be used to compute this series of higher orders. The applicability of this method is demonstrated in the paper, and we are able to determine and characterize the higher Bruhat order explicitly for certain n and k. We therefore conjecture the existence of such an order for all n > k, as well as its accompanying properties.

47) Jeffrey Cai, Orbits of a fixed-point subgroup of the symplectic group on partial flag varieties of type A (24 May 2014)

In this paper we compute the orbits of the symplectic group $\mathrm{Sp}_{2n}$ on partial flag varieties $\mathrm{GL}_{2n}/P$ and on partial flag varieties enhanced by a vector space, $\mathbb{C}^{2n} \times \mathrm{GL}_{2n}/P$. This extends analogous results proved by Matsuki on full flags. The general technique used in this paper is to take the orbits in the full flag case and determine which orbits remain distinct when the full flag variety $\mathrm{GL}_{2n}/B$ is projected down to the partial flag variety $\mathrm{GL}_{2n}/P$.

The recent discovery of a connection between abstract algebra and the classical combinatorial Robinson-Schensted (RS) correspondence has sparked research on related algebraic structures and relationships to new combinatorial bijections, such as the Robinson-Schensted-Knuth (RSK) correspondence, the "mirabolic" RSK correspondence, and the "exotic" RS correspondence. We conjecture an exotic RSK correspondence between the orbits described in this paper and semistandard bi-tableaux, which would yield an extension to the exotic RS correspondence found in a paper of Henderson and Trapa.

46) John Long, Evidence of Purifying Selection in Mammals (9 May 2014)

The Human Genome Project, completed in 2003, gave us a reference genome for the human species. Before the project was completed, it was believed that the primary function of DNA was to code for protein. However, it was discovered that only 2% of the genome consists of regions that code for proteins. The remaining regions of the genome are either functional regions that regulate the coding regions or junk DNA regions that do nothing. The distinction between these two types of regions is not completely clear. Evidence of purifying selection, the decrease in frequency of deleterious mutations, is likely a sign that a region is functional. The goal of this project was to find evidence of purifying selection in newly acquired regions in the human genome that are hypothesized to be functional. The mean Derived Allele Frequency of the featured regions was compared to that of control regions to determine the likelihood of selection.

45) Ravi Jagadeesan, A new $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant of dessins d'enfants (arXiv.org, 30 March 2014)

We study the action of $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the category of Belyi functions (finite, étale covers of $\mathbb{P}^1_{\overline{\mathbb{Q}}}\setminus \{0,1,\infty\}$). We describe a new combinatorial $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant for a certain class of Belyi functions. As a corollary, we obtain that for all $k < 2^{\sqrt{\frac{2}{3}}}$ and all positive integers $N$, there is an $n \le N$ such that the set of degree $n$ Belyi functions of a particular rational Nielsen class must split into at least $\Omega\left(k^{\sqrt{N}}\right)$ Galois orbits. In addition, we define a new version of the Grothendieck-Teichmüller group $\widehat{GT}$ into which $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ embeds.

44) Andrey Grinshpun (MIT), Raj Raina (PRIMES), and Rik Sengupta (MIT), Minimum Degrees of Minimal Ramsey Graphs for Almost-Cliques (arXiv.org, 26 Jun 2014)

For graphs $F$ and $H$, we say $F$ is Ramsey for $H$ if every $2$-coloring of the edges of $F$ contains a monochromatic copy of $H$. The graph $F$ is Ramsey $H$-minimal if $F$ is Ramsey for $H$ and there is no proper subgraph $F'$ of $F$ so that $F'$ is Ramsey for $H$. Burr, Erdős, and Lovász defined $s(H)$ to be the minimum degree of $F$ over all Ramsey $H$-minimal graphs $F$. Define $H_{t,d}$ to be a graph on $t+1$ vertices consisting of a complete graph on $t$ vertices and one additional vertex of degree $d$. We show that $s(H_{t,d})=d^2$ for all values $1<d\le t$; it was previously known that $s(H_{t,1})=t-1$, so it is surprising that $s(H_{t,2})=4$ is much smaller.
We also make some further progress on some sparser graphs. Fox and Lin observed that $s(H)\ge 2\delta(H)-1$ for all graphs $H$, where $\delta(H)$ is the minimum degree of $H$; Szabó, Zumstein, and Zürcher investigated which graphs attain this bound with equality and conjectured that all bipartite graphs $H$ without isolated vertices satisfy $s(H)=2\delta(H)-1$. Fox, Grinshpun, Liebenau, Person, and Szabó further conjectured that all triangle-free graphs without isolated vertices satisfy this property. We show that $d$-regular $3$-connected triangle-free graphs $H$, with one extra technical constraint, satisfy $s(H) = 2\delta(H)-1$; the extra constraint is that $H$ has a vertex $v$ so that if one removes $v$ and its neighborhood from $H$, the remainder is connected.

43) Boryana Doyle (PRIMES), Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), and Leonid Mirny (MIT), Chromatin Loops as Allosteric Modulators of Enhancer-Promoter Interactions, published in PLoS Computational Biology (23 Oct 2014; earlier version in BioRxiv.org, 26 February 2014)

The classic model of eukaryotic gene expression requires direct spatial contact between a distal enhancer and a proximal promoter. Recent Chromosome Conformation Capture (3C) studies show that enhancers and promoters are embedded in a complex network of looping interactions. Here we use a polymer model of chromatin fiber to investigate whether, and to what extent, looping interactions between elements in the vicinity of an enhancer-promoter pair can influence their contact frequency. Our equilibrium polymer simulations show that a chromatin loop, formed by elements flanking either an enhancer or a promoter, suppresses enhancer-promoter interactions, working as an insulator. A loop formed by elements located in the region between an enhancer and a promoter, on the contrary, facilitates their interactions. We find that different mechanisms underlie insulation and facilitation; insulation occurs due to steric exclusion by the loop, and is a global effect, while facilitation occurs due to an effective shortening of the enhancer-promoter genomic distance, and is a local effect. Consistently, we find that these effects manifest quite differently for in silico 3C and microscopy. Our results show that looping interactions that do not directly involve an enhancer-promoter pair can nevertheless significantly modulate their interactions. This phenomenon is analogous to allosteric regulation in proteins, where a conformational change triggered by binding of a regulatory molecule to one site affects the state of another site.

42) William Kuszmaul, A New Approach to Enumerating Statistics Modulo n (arXiv.org, 16 February 2014)

We find a new approach to computing the remainder of a polynomial modulo $x^n-1$; such a computation is called modular enumeration. Given a polynomial with coefficients from a commutative $\mathbb{Q}$-algebra, our first main result constructs the remainder simply from the coefficients of residues of the polynomial modulo $\Phi_d(x)$ for each $d\mid n$. Since such residues can often be found to have nice values, this simplifies a number of modular enumeration problems; indeed in some cases, such residues are already known while the related modular enumeration problem has remained unsolved. We list six such cases which our technique makes easy to solve. Our second main result is a formula for the unique polynomial $a$ such that $a \equiv f \mod \Phi_n(x)$ and $a\equiv 0 \mod x^d-1$ for each proper divisor $d$ of $n$.

We find a formula for remainders of $q$-multinomial coefficients and for remainders of $q$-Catalan numbers modulo $q^n-1$, reducing each problem to a finite number of cases for any fixed $n$. In the prior case, we solve an open problem posed by Hartke and Radcliffe. In considering $q$-Catalan numbers modulo $q^n-1$, we discover a cyclic group operation on certain lattice paths which behaves predictably with regard to major index. We also make progress on a problem in modular enumeration on subset sums posed by Kitchloo and Pachter.
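To make the term "modular enumeration" concrete: since $x^n \equiv 1$ modulo $x^n-1$, the remainder of a polynomial is obtained by adding together all coefficients whose exponents agree modulo $n$. The sketch below does only this direct folding (the helper name `remainder_mod_xn_minus_1` is hypothetical); it is the naive baseline, not the cyclotomic-residue construction of the paper. The example input is the $q$-binomial coefficient $\binom{4}{2}_q = 1 + q + 2q^2 + q^3 + q^4$.

```python
def remainder_mod_xn_minus_1(coeffs, n):
    """Reduce a polynomial modulo x^n - 1.

    coeffs[i] is the coefficient of x^i. Because x^n = 1 in the quotient,
    the coefficient of x^i is added to position i mod n.
    """
    folded = [0] * n
    for exponent, coefficient in enumerate(coeffs):
        folded[exponent % n] += coefficient
    return folded


# q-binomial coefficient [4 choose 2]_q = 1 + q + 2q^2 + q^3 + q^4
q_binomial_4_2 = [1, 1, 2, 1, 1]
print(remainder_mod_xn_minus_1(q_binomial_4_2, 3))
```

In the folded list, entry $r$ counts the objects whose statistic is congruent to $r$ modulo $n$.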

41) Ajay Saini, Predictive Modeling of Opinion and Connectivity Dynamics in Social Networks (26 January 2014)

Social networks have been extensively studied in recent years with the aim of understanding how the connectivity of different societies and their subgroups influences the spread of innovations and opinions through human networks. Using data collected from real-world social networks, researchers are able to gain a better understanding of the dynamics of such networks and subsequently model the changes that occur in these networks over time. In our work, we use data from the Social Evolution dataset of the MIT Human Dynamics Lab to develop a data-driven model capable of predicting the trends and long term changes observed in a real-world social network. We demonstrate the effectiveness of the model by predicting changes in both opinion spread and connectivity that reflect the changes observed in our dataset. After validating the model, we use it to understand how different types of social networks behave over time by varying the conditions governing the change of opinions and connectivity. We conclude with a study of opinion propagation under different conditions in which we use the structure and opinion distribution of various networks to identify sets of agents capable of propagating their opinion throughout an entire network. Our results demonstrate the effectiveness of the proposed modeling approach in predicting the future state of social networks and provide further insight into the dynamics of interactions between agents in real-world social networks.

40) Rohil Prasad, Investigating GCD in Z[√2] (11 January 2014)

We attempt to optimize the time needed to calculate greatest common divisors in the Euclidean domain Z[√2].
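For orientation, here is a minimal sketch of the textbook Euclidean algorithm in Z[√2], the baseline that any optimization would be measured against; it is not the method of the paper. An element a + b√2 is stored as the pair `(a, b)`, and all function names are hypothetical. Division rounds the exact quotient in Q(√2) to the nearest element of Z[√2], which makes the absolute value of the norm a² − 2b² strictly decrease, so the loop terminates.

```python
from fractions import Fraction


def mul(x, y):
    """Multiply (a + b*sqrt2) by (c + d*sqrt2)."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)


def norm(x):
    """Field norm a^2 - 2*b^2 of a + b*sqrt2 (may be negative)."""
    a, b = x
    return a * a - 2 * b * b


def divmod_sqrt2(x, y):
    """Return (quotient, remainder) with |norm(remainder)| < |norm(y)|."""
    (a, b), (c, d) = x, y
    n = norm(y)
    # x / y = x * conj(y) / norm(y); round each coordinate to the nearest integer.
    p = round(Fraction(a * c - 2 * b * d, n))
    q = round(Fraction(b * c - a * d, n))
    prod = mul((p, q), y)
    return (p, q), (a - prod[0], b - prod[1])


def gcd_sqrt2(x, y):
    """Greatest common divisor in Z[sqrt2], determined up to multiplication by a unit."""
    while y != (0, 0):
        _, r = divmod_sqrt2(x, y)
        x, y = y, r
    return x


g = gcd_sqrt2((2, 0), (0, 1))  # gcd of 2 and sqrt2
print(g, norm(g))
```

Because Z[√2] has infinitely many units, the result is only defined up to a unit factor; comparing absolute norms is a simple way to check it.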

39) Jin-Woo Bryan Oh, Towards Generalizing Thrackles to Arbitrary Graphs (1 January 2014)

In the 1950s, John Conway came up with the notion of thrackles, graphs with embeddings in which no edge crosses itself, but every pair of distinct edges intersects each other exactly once. He conjectured that |E(G)| ≤ |V(G)| for any thrackle G, a question unsolved to this day. In this paper, we discuss some of the known properties of thrackles and contribute a few new ones.

Only a few sparse graphs can be thrackles, and so it is of interest to find an analogous notion that applies to denser graphs as well. In this paper we introduce a generalized version of thrackles called near-thrackles, and prove some of their properties. We also discuss a large number of conjectures about them which seem very obvious but nonetheless are hard to prove. In the final section, we introduce thrackleability, a number between 0 and 1 that turns out to be an accurate measure of how far away a graph is from being a thrackle.

38) Junho Won, Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle (1 January 2014)

The minimum number of crossings for all drawings of a given graph $G$ on a plane is called its crossing number, denoted $cr(G)$. Exact crossing numbers are known only for a few families of graphs, and even the crossing number of a complete graph $K_m$ is not known for all $m$. Wenping et al. showed that $cr(K_m\Box C_n)\geqslant n\cdot cr(K_{m+2})$ for $n\geqslant 4$ and $m\geqslant 4$. We adopt their method to find a lower bound for $cr(G\Box C_n)$ where $G$ is a vertex-transitive graph of degree at least 3. We also suggest some particular vertex-transitive graphs of interest, and give two corollaries that give lower bounds for $cr(G\Box C_n)$ in terms of $n$, $cr(G)$, the number of vertices of $G$, and the degree of $G$, which improve on Wenping et al.'s result.

37) Ying Gao, On an Extension of Stanley Depth for Refinement-Ordered Posets (30 December 2013)

The concept of Stanley depth was originally defined for graded modules over commutative rings in 1982 by Richard P. Stanley. However, in 2009 Herzog, Vladoiu, and Zheng found a property, ndepth, of posets analogous to the Stanley depths of certain modules, which provides an important link between combinatorics and commutative algebra. Due to this link, there arises the question of what this ndepth is for certain classes of posets.

Because ndepth was only recently defined, much remains to be discovered about it. In 2009, Biro, Howard, Keller, Trotter and Young found a lower bound for the ndepth of the poset of nonempty subsets of {1, 2, ..., n} ordered by inclusion. In 2010, Wang calculated the ndepth of the product of chains n^k \ 0. However, ndepth has yet to be studied in relation to many other commonly found classes of posets. We chose to research the properties of the ndepths of one such well-known class of posets: the posets which consist of non-empty partitions of sets ordered by refinement, which we denote as G_i.

We use combinatorial and algebraic methods to find the ndepths for small posets in G_i. We show that for posets of increasing size in G_i, ndepth is strictly non-decreasing, and furthermore we show that ndepth[G_i] ≥ [8i/29] for all i. We also find that for all i, ndepth[G_i] ≤ i through the proof that ndepth[G_{i+1}] ≤ ndepth[G_i] + 1.

36) Nihal Gowravaram, Enumeration of Subclasses of (2+2)-free Partially Ordered Sets (26 December 2013)

We investigate avoidance in (2+2)-free partially ordered sets, posets that do not contain any induced subposet isomorphic to the union of two disjoint chains of length two. In particular, we are interested in enumerating the number of partially ordered sets of size N avoiding both 2+2 and some other poset α. For any α of size 3, the results are already well-known. However, out of the 15 such α of size 4, only 2 were previously known. Through the course of this paper, we explicitly enumerate 7 other such α of size 4. Also, we consider the avoidance of three posets simultaneously, 2+2 along with some pair (α,β); it turns out that this enumeration is often clean, and has sometimes surprising results. Furthermore, we turn to the question of Wilf-equivalences in (2+2)-free posets. We show such an equivalence between the Y-shaped and chain posets of size 4 via a direct bijection, and in fact, we extend this to show a Wilf-equivalence between the general chain poset and a general Y-shaped poset of the same size. In this paper, while our focus is on enumeration, we also seek to develop an understanding of the structures of the posets in the subclasses we are studying.

35) Yael Fregier (MIT) and Isaac Xia, Lower Central Series Ideal Quotients Over $\mathbb{F}_p$ and $\mathbb{Z}$ (17 November 2013; arXiv.org, 28 Jun 2015)

Given a graded associative algebra $A$, its lower central series is defined by $L_1 = A$ and $L_{i+1} = [L_i, A]$. We consider successive quotients $N_i(A) = M_i(A) / M_{i+1}(A)$, where $M_i(A) = AL_i(A) A$. These quotients are direct sums of graded components. Our purpose is to describe the $\mathbb{Z}$-module structure of the components; i.e., their free and torsion parts. Following computer exploration using MAGMA, two main cases are studied. The first considers $A = A_n / (f_1,\dots, f_m)$, with $A_n$ the free algebra on $n$ generators $\{x_1, \ldots, x_n\}$ over a field of characteristic $p$. The relations $f_i$ are noncommutative polynomials in $x_j^{p^{n_j}}$, for some integers $n_j$. For primes $p > 2$, we prove that $p^{\sum n_j} \mid \text{dim}(N_i(A))$. Moreover, we determine polynomials dividing the Hilbert series of each $N_i(A)$. The second concerns $A = \mathbb{Z} \langle x_1, x_2 \rangle / (x_1^m, x_2^n)$. For $i = 2,3$, the bigraded structure of $N_i(A_2)$ is completely described.

34) Steven Homberg, Finding Enrichments of Functional Annotations for Disease-Associated Single-Nucleotide Polymorphisms (10 November 2013)

Computational analysis of SNP-disease associations from GWAS as well as functional annotations of the genome enables the calculation of a SNP set's enrichment for a disease. These statistical enrichments can be and are calculated with a variety of statistical techniques, but there is no standard statistical method for calculating enrichments. Several entirely different tests are used by different investigators in the field. These tests can also be conducted with several variations in parameters which also lack a standard. In our investigation, we develop a computational tool for conducting various enrichment calculations and, using breast cancer-associated SNPs from a GWAS catalog as a foreground against all GWAS SNPs as a background, test the tool and analyze the relative performance of the various tests. The computational tool will soon be released to the scientific community as a part of the Bioconductor package. Our analysis shows that, for the r² threshold in LD block construction, values around 0.8-0.9 are preferable to both laxer and stricter thresholds. We find that block-matching tests yield better results than peak-shifting tests. Finally, we find that, in block-matching tests, block tallying using binary scoring, noting only whether or not a block has an annotation, yields the most meaningful results, while weighting LD r² threshold has no influence.

33) Kavish Gandhi, Noah Golowich, and László Miklós Lovász, Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 27 Sept 2013), published in Journal of Combinatorics 5:2 (2014)

We define a linear homogeneous equation to be strongly r-regular if, when a finite number of inequalities is added to the equation, the system of the equation and inequalities is still r-regular. In this paper, we derive a constraint on the coefficients of a linear homogeneous equation that gives a sufficient condition for the equation to be strongly r-regular. In 2009, Alexeev and Tsimerman introduced a family of equations, each of which is (n-1)-regular but not n-regular, verifying a conjecture of Rado from 1933. We show that these equations are actually strongly (n-1)-regular as a corollary of our results.

32) Leigh Marie Braswell and Tanya Khovanova, On the Cookie Monster Problem (arXiv.org, 23 Sept 2013), published in Jennifer Beineke & Jason Rosenhouse, The Mathematics of Various Entertaining Subjects: Research in Recreational Math (Princeton University Press, 2015).

The Cookie Monster Problem supposes that the Cookie Monster wants to empty a set of jars filled with various numbers of cookies. On each of his moves, he may choose any subset of jars and take the same number of cookies from each of those jars. The Cookie Monster number of a set is the minimum number of moves the Cookie Monster must use to empty all of the jars. This number depends on the initial distribution of cookies in the jars. We discuss bounds of the Cookie Monster number and explicitly find the Cookie Monster number for jars containing cookies in the Fibonacci, Tribonacci, n-nacci, and Super-n-nacci sequences. We also construct sequences of k jars such that their Cookie Monster numbers are asymptotically rk, where r is any real number between 0 and 1 inclusive.
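The definition above can be checked on small inputs by exhaustive search. The sketch below is a brute-force breadth-first search over jar states, not the authors' method, and `cookie_monster_number` is a hypothetical name. It relies on one simplifying observation, stated here as an assumption of the sketch: jars holding equal numbers of cookies can always be treated identically and empty jars can be ignored, so a state is stored as the set of distinct nonzero jar sizes.

```python
from itertools import combinations


def cookie_monster_number(jars):
    """Minimum number of moves to empty all jars, found by breadth-first search."""
    start = frozenset(j for j in jars if j > 0)
    frontier = {start}
    seen = {start}
    moves = 0
    while frozenset() not in frontier:
        next_frontier = set()
        for state in frontier:
            values = sorted(state)
            for k in range(1, values[-1] + 1):           # cookies taken per chosen jar
                eligible = [v for v in values if v >= k]  # jars holding at least k
                for size in range(1, len(eligible) + 1):
                    for chosen in combinations(eligible, size):
                        untouched = state - set(chosen)
                        reduced = {v - k for v in chosen if v > k}
                        new_state = frozenset(untouched | reduced)
                        if new_state not in seen:
                            seen.add(new_state)
                            next_frontier.add(new_state)
        frontier = next_frontier
        moves += 1
    return moves


print(cookie_monster_number([1, 2, 3, 5, 8]))  # five Fibonacci jars
```

The search is exponential in the number of jars, so it is only useful for confirming small cases such as the first few Fibonacci jars.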

31) Vahid Fazel-Rezai, Equivalence Classes of Permutations Modulo Replacements Between 123 and Two-Integer Patterns (arXiv.org, 18 Sept 2013), published in The Electronic Journal of Combinatorics 21:2 (2014)

We explore a new type of replacement of patterns in permutations, suggested by James Propp, that does not preserve the length of permutations. In particular, we focus on replacements between 123 and a pattern of two integer elements. We apply these replacements in the classical sense; that is, the elements being replaced need not be adjacent in position or value. Given each replacement, the set of all permutations is partitioned into equivalence classes consisting of permutations reachable from one another through a series of bi-directional replacements. We break the eighteen replacements of interest into four categories by the structure of their classes and fully characterize all of their classes.

30) Jesse Geneson (MIT), Rohil Prasad (PRIMES), and Jonathan Tidor (PRIMES), Bounding sequence extremal functions with formations (arXiv.org, 17 Aug 2013), published in The Electronic Journal of Combinatorics 21:3 (2014)

An $(r, s)$-formation is a concatenation of $s$ permutations of $r$ letters. If $u$ is a sequence with $r$ distinct letters, then let $\mathit{Ex}(u, n)$ be the maximum length of any $r$-sparse sequence with $n$ distinct letters which has no subsequence isomorphic to $u$. For every sequence $u$ define $\mathit{fw}(u)$, the formation width of $u$, to be the minimum $s$ for which there exists $r$ such that there is a subsequence isomorphic to $u$ in every $(r, s)$-formation. We use $\mathit{fw}(u)$ to prove upper bounds on $\mathit{Ex}(u, n)$ for sequences $u$ such that $u$ contains an alternation with the same formation width as $u$.
We generalize Nivasch's bounds on $\mathit{Ex}((ab)^{t}, n)$ by showing that $\mathit{fw}((12 \ldots l)^{t})=2t-1$ and $\mathit{Ex}((12\ldots l)^{t}, n) =n2^{\frac{1}{(t-2)!}\alpha(n)^{t-2}\pm O(\alpha(n)^{t-3})}$ for every $l \geq 2$ and $t\geq 3$, such that $\alpha(n)$ denotes the inverse Ackermann function. Upper bounds on $\mathit{Ex}((12 \ldots l)^{t} , n)$ have been used in other papers to bound the maximum number of edges in $k$-quasiplanar graphs on $n$ vertices with no pair of edges intersecting in more than $O(1)$ points.
If $u$ is any sequence of the form $a v a v' a$ such that $a$ is a letter, $v$ is a nonempty sequence excluding $a$ with no repeated letters and $v'$ is obtained from $v$ by only moving the first letter of $v$ to another place in $v$, then we show that $\mathit{fw}(u)=4$ and $\mathit{Ex}(u, n) =\Theta(n\alpha(n))$. Furthermore we prove that $\mathit{fw}(abc(acb)^{t})=2t+1$ and $\mathit{Ex}(abc(acb)^{t}, n) = n2^{\frac{1}{(t-1)!}\alpha(n)^{t-1}\pm O(\alpha(n)^{t-2})}$ for every $t\geq 2$.

29) Jesse Geneson (MIT), Tanya Khovanova (MIT), and Jonathan Tidor (PRIMES), Convex geometric (k+2)-quasiplanar representations of semi-bar k-visibility graphs (arXiv.org, 3 Jul 2013), published in Discrete Mathematics 331 (2014)

We examine semi-bar visibility graphs in the plane and on a cylinder in which sightlines can pass through k objects. We show every semi-bar k-visibility graph has a (k+2)-quasiplanar representation in the plane with vertices drawn as points in convex position and edges drawn as segments. We also show that the graphs having cylindrical semi-bar k-visibility representations with semi-bars of different lengths are the same as the (2k+2)-degenerate graphs having edge-maximal (k+2)-quasiplanar representations in the plane with vertices drawn as points in convex position and edges drawn as segments.

28) Leigh Marie Braswell and Tanya Khovanova, Cookie Monster Devours Naccis (arXiv.org, 18 May 2013), published in the College Mathematics Journal 45:2 (2014)

In 2002, Cookie Monster appeared in The Inquisitive Problem Solver. The hungry monster wants to empty a set of jars filled with various numbers of cookies. On each of his moves, he may choose any subset of jars and take the same number of cookies from each of those jars. The Cookie Monster number is the minimum number of moves Cookie Monster must use to empty all of the jars. This number depends on the initial distribution of cookies in the jars. We discuss bounds of the Cookie Monster number and explicitly find the Cookie Monster number for Fibonacci, Tribonacci and other nacci sequences.

2012 Research Papers

27) William Kuszmaul and Ziling Zhou, Equivalence classes in S_n for three families of pattern-replacement relations (arXiv.org, 20 April 2013)

We study a family of equivalence relations in S_n, the group of permutations on n letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of S_c. In particular, we are interested in the number of classes created in S_n by each relation and in characterizing these classes. Imposing the condition that the partition of S_c has one nontrivial part containing the cyclic shifts of a single permutation, we find enumerations for the number of nontrivial classes. When the permutation is the identity, we are able to compare the sizes of these classes and connect parts of the problem to Young tableaux and Catalan lattice paths. Imposing the condition that the partition has one nontrivial part containing all of the permutations in S_c beginning with 1, we both enumerate and characterize the classes in S_n. We do the same for the partition that has two nontrivial parts, one containing all of the permutations in S_c beginning with 1, and one containing all of the permutations in S_c ending with 1.

26) William Kuszmaul, Counting permutations modulo pattern-replacement equivalences for three-letter patterns (arXiv.org, 20 April 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We study a family of equivalence relations in S_n, the group of permutations on n letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of S_c. When the partition is of S₃ and has one nontrivial part of size greater than two, we provide formulas for the number of classes created in all unresolved cases. When the partition is of S₃ and has two nontrivial parts, each of size two (as do the Knuth and forgotten relations), we enumerate the classes for 13 of the 14 unresolved cases. In two of these cases, enumerations arise which are the same as those yielded by the Knuth and forgotten relations. The reasons for this phenomenon are still largely a mystery.

25) Tanya Khovanova and Ziv Scully, Efficient Calculation of Determinants of Symbolic Matrices with Many Variables (arXiv.org, 13 April 2013)

Efficient matrix determinant calculations have been studied since the 19th century. Computers expand the range of determinants that are practically calculable to include matrices with symbolic entries. However, the fastest determinant algorithms for numerical matrices are often not the fastest for symbolic matrices with many variables. We compare the performance of two algorithms, fraction-free Gaussian elimination and minor expansion, on symbolic matrices with many variables. We show that, under a simplified theoretical model, minor expansion is faster in most situations. We then propose optimizations for minor expansion and demonstrate their effectiveness with empirical data.
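For readers unfamiliar with the two algorithms being compared, the sketch below implements both on small integer matrices. It is only an illustration under simplifying assumptions: the entries are plain integers rather than symbolic expressions, the fraction-free elimination shown is the standard Bareiss variant, and the function names are hypothetical, so nothing here reflects the authors' implementation or their proposed optimizations.

```python
def det_minor_expansion(m):
    """Determinant by cofactor (minor) expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue  # a zero entry contributes nothing; skip its minor
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_minor_expansion(minor)
    return total


def det_fraction_free(m):
    """Determinant by Bareiss fraction-free Gaussian elimination (n >= 1)."""
    a = [row[:] for row in m]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Find a row below with a nonzero pivot and swap it up.
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Division by the previous pivot is exact in Bareiss elimination.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


example = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det_minor_expansion(example), det_fraction_free(example))
```

With symbolic entries the exact division step becomes polynomial division, which is one reason the relative cost of the two methods can differ from the numerical case.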

24) Michael Zanger-Tishler and Saarik Kalia, On the Winning and Losing Parameters of Schmidt's Game (8 April 2013)

First introduced by Wolfgang Schmidt, the (α, β)-game and its modifications have been shown to be a powerful tool in Diophantine approximation, metric number theory, and dynamical systems. However, natural questions about the winning-losing parameters of most sets have not been studied thoroughly even after more than 40 years. There are a few results in the literature showing that some non-trivial points and small regions are winning or losing, but complete pictures remain largely unknown. Our main goal in this paper is to provide as much detail as possible about the global pictures of winning-losing parameters for some interesting families of sets.

23) Sheela Devadas and Steven Sam, Representations of Cherednik algebras of G(m, r, n) in positive characteristic (arXiv.org, 3 April 2013), published in Journal of Commutative Algebra (Winter 2014): 525-559

We study lowest-weight irreducible representations of rational Cherednik algebras attached to the complex reflection groups G(m, r, n) in characteristic p. Our approach is mostly from the perspective of commutative algebra. By studying the kernel of the contravariant bilinear form on Verma modules, we obtain formulas for Hilbert series of irreducible representations in a number of cases, and present conjectures in other cases. We observe that the form of the Hilbert series of the irreducible representations and the generators of the kernel tend to be determined by the value of n modulo p, and are related to special classes of subspace arrangements. Perhaps the most novel (conjectural) discovery from the commutative algebra perspective is that the kernel can be given the structure of a "matrix regular sequence" in some instances, which we prove in some small cases.

22) Christina Chen and Nan Li, Apollonian Equilateral Triangles (arXiv.org, 1 March 2013)

Given an equilateral triangle whose squared side length is a, and a point in its plane whose squared distances to the vertices of the triangle are b, c, d, it can be computed that a, b, c, d satisfy 3(a² + b² + c² + d²) = (a + b + c + d)². This paper derives properties of quadruples of nonnegative integers (a; b; c; d), called triangle quadruples, satisfying this equation. It is easy to verify that the operation generating (a; b; c; a + b + c - d) from (a; b; c; d) preserves this feature and that it and analogous ones for the other elements can be represented by four matrices. We examine in detail the triangle group, the group with these operations as generators, and completely classify the orbits of quadruples with respect to the triangle group action. We also compute the number of triangle quadruples generated after a certain number of operations and approximate the number of quadruples bounded by characteristics such as the maximal element. Finally, we prove that the triangle group is a hyperbolic Coxeter group and derive information about the elements of triangle quadruples by invoking Lie groups. We also generalize the problem to higher dimensions.
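The defining equation and the generating operation described in this abstract can be checked numerically. The sketch below uses hypothetical function names, and its sample quadruple (1; 0; 1; 1) is chosen here for illustration (a unit triangle with the point at one vertex), not taken from the paper.

```python
def is_triangle_quadruple(q):
    """Check 3(a^2 + b^2 + c^2 + d^2) == (a + b + c + d)^2."""
    return 3 * sum(x * x for x in q) == sum(q) ** 2


def flip(q, i):
    """Replace entry i by (sum of the other three entries) minus entry i.

    For i = 3 this is the operation (a; b; c; d) -> (a; b; c; a + b + c - d).
    """
    q = list(q)
    q[i] = sum(q) - 2 * q[i]
    return tuple(q)


start = (1, 0, 1, 1)
orbit_sample = [flip(start, i) for i in range(4)]
```

The operation preserves the equation because, with the other three entries held fixed, the equation is a quadratic in the remaining entry whose two roots sum to the total of the other three. Applying the same flip twice returns the original quadruple.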

21) Dhroova Aiylam, Modified Stern-Brocot sequences (arXiv.org, 29 January 2013), published in Integers: Electronic Journal of Combinatorics and Number Theory 17 (2017)

We present the classical Stern-Brocot tree and provide a new proof of the fact that every rational number between 0 and 1 appears in the tree. We then generalize the Stern-Brocot tree to allow for arbitrary choice of starting terms, and prove that in all cases the tree maintains the property that every rational number between the two starting terms appears exactly once.
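As a small illustration of the classical construction mentioned in this abstract, the sketch below builds successive Stern-Brocot rows between 0/1 and 1/1 by inserting mediants. The function name is hypothetical, and only the classical starting terms are covered; the paper's generalization to arbitrary starting terms is not reproduced here.

```python
from fractions import Fraction


def stern_brocot_row(levels):
    """Return the row obtained from [0/1, 1/1] after `levels` rounds of
    inserting the mediant (a + c)/(b + d) between neighbours a/b and c/d."""
    row = [Fraction(0, 1), Fraction(1, 1)]
    for _ in range(levels):
        nxt = [row[0]]
        for left, right in zip(row, row[1:]):
            mediant = Fraction(
                left.numerator + right.numerator,
                left.denominator + right.denominator,
            )
            nxt += [mediant, right]
        row = nxt
    return row


row = stern_brocot_row(3)
```

After three rounds the row contains every fraction in [0, 1] with denominator at most 4, each exactly once, which is a finite instance of the property the abstract describes.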

20) Nihal Gowravaram and Ravi Jagadeesan, Beyond alternating permutations: Pattern avoidance in Young diagrams and tableaux (arXiv.org, 28 January 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We investigate pattern avoidance in alternating permutations and generalizations thereof. First, we study pattern avoidance in an alternating analogue of Young diagrams. In particular, we extend Babson-West's notion of shape-Wilf equivalence to apply to alternating permutations and so generalize results of Backelin-West-Xin and Ouchterlony to alternating permutations. Second, we study pattern avoidance in the more general context of permutations with restricted ascents and descents. We consider a question of Lewis regarding permutations that are the reading words of thickened staircase Young tableaux, that is, permutations that have (k - 1) ascents followed by a descent, followed by (k - 1) ascents, et cetera. We determine the relative sizes of the sets of pattern-avoiding (k - 1)-ascent permutations in terms of the forbidden pattern. Furthermore, we give inequalities in the sizes of sets of pattern-avoiding permutations in this context that arise from further extensions of shape-equivalence type enumerations.

19) Rohil Prasad and Jonathan Tidor, Optimal Results in Staged Self-Assembly of Wang Tiles (22 January 2013)

The subject of self-assembly deals with the spontaneous creation of ordered systems from simple units and is most often applied in the field of nanotechnology. The self-assembly model of Winfree describes the assembly of Wang tiles, simulating assembly in real-world systems. We use an extension of this model, known as the staged self-assembly model introduced by Demaine et al. that allows for discrete steps to be implemented and permits more diverse constructions. Under this model, we resolve the problem of constructing segments, creating a method to produce them optimally. Generalizing this construction to squares gives a new flexible method for their construction. Changing a parameter of the model, we explore much simpler constructions of complex monotone shapes. Finally, we present an optimal method to build most arbitrary shapes.

18) Aaron Klein, On Rank Functions of Graphs (6 January 2013)

We study rank functions (also known as graph homomorphisms onto Z), ways of imposing graded poset structures on graphs. We first look at a variation on rank functions called discrete Lipschitz functions. We relate the number of Lipschitz functions of a graph G to the number of rank functions of both G and G X E. We then find generating functions that enable us to compute the number of rank or Lipschitz functions of a given graph. We look at a subset of graphs called squarely generated graphs, which are graphs whose cycle space has a basis consisting only of 4-cycles. We show that the number of rank functions of such a graph is proportional to the number of 3-colorings of the same graph, thereby connecting rank functions to the Potts model of statistical mechanics. Lastly, we look at some asymptotics of rank and Lipschitz functions for various types of graphs.

17) Andrew Xia, Integrated Gene Expression Probabilistic Models for Cancer Staging (1 January 2013)

The current system for classifying cancer patients' stages was introduced more than one hundred years ago. With modern advances in technology, many parts of the system have become outdated. Because the current staging system emphasizes surgical procedures that could be harmful to patients, there has been a movement to develop a new taxonomy, using molecular signatures to potentially avoid surgical testing. This project explores the issues of the current classification system and looks for a potentially better way to classify cancer patients' stages. Computerization has made a vast amount of cancer data available online. However, a significant portion of the data is incomplete; some crucial information is missing. It is logical to attempt to develop a system for recovering missing cancer data. Successful completion of this research would save costs and increase efficiency in cancer research and treatment. Using various methods, we have shown that cancer stages cannot be simply extrapolated from incomplete data. Furthermore, a new approach using RNA sequencing data is studied. RNA sequencing can potentially become a cost-efficient way to determine a cancer patient's stage. We have obtained promising results using RNA sequencing data in breast cancer staging.

16) Surya Bhupatiraju, On the Complexity of the Marginal Satisfiability Problem (18 November 2012)

The marginal satisfiability problem (MSP) asks: Given desired marginal distributions D_S for every subset S of c variable indices from {1, . . . , n}, does there exist a distribution D over n-tuples of values in {1, . . . , m} with those S-marginals D_S? Previous authors have studied MSP in fixed dimensions, and have classified the complexity up to certain upper bounds. However, when using general dimensions, it is known that the size of distributions grows exponentially, making brute force algorithms impractical. This presents an incentive to study more general, tractable variants, which in turn may shed light on the original problem's structure. Thus, our work seeks to explore MSP and its variants for arbitrary dimension, and pinpoint its complexity more precisely. We solve MSP for n = 2 and completely characterize the complexity of three closely related variants of MSP. In particular, we detail novel greedy and stochastic algorithms that handle exponentially-sized data structures in polynomial time, as well as generate accurate representative samples of these structures in polynomial time. These algorithms are also unique in that they represent possible protocols in data compression for communication purposes. Finally, we posit conjectures related to more generalized MSP variants, as well as the original MSP.

15) Fengning Ding and Aleksander Tsymbaliuk, Representations of Infinitesimal Cherednik Algebras (arXiv.org, 17 October 2012), published in Representation Theory 17 (2013)

Infinitesimal Cherednik algebras, first introduced by Etingof, Gan, and Ginzburg (2005), are continuous analogues of rational Cherednik algebras, and in the case of gl_n, are deformations of universal enveloping algebras of the Lie algebras sl_{n+1}. Despite these connections, infinitesimal Cherednik algebras are not widely studied, and basic questions of intrinsic algebraic and representation theoretical nature remain open. In the first half of this paper, we construct the complete center of H_ζ(gl_n) for the case of n = 2 and give one particular generator of the center, the Casimir operator, for general n. We find the action of this Casimir operator on the highest weight modules to prove the formula for the Shapovalov determinant, providing a criterion for the irreducibility of Verma modules. We classify all irreducible finite dimensional representations and compute their characters. In the second half, we investigate Poisson-analogues of the infinitesimal Cherednik algebras and use them to gain insight on the center of H_ζ(gl_n). Finally, we investigate H_ζ(sp_{2n}) and extend various results from the theory of H_ζ(gl_n), such as a generalization of Kostant's theorem.

14) Tanya Khovanova and Dai Yang, Halving Lines and Their Underlying Graphs (arXiv.org, 17 October 2012), published in Involve 11:1 (2018): 1–11

In this paper we study halving-edges graphs corresponding to a set of halving lines. In particular, we study the vertex degrees, paths, cycles, and cliques of such graphs. In doing so, we study a vertex-partition of such a graph into chains, which have interesting properties.

2011 Research Papers

13) Carl Lian, Representations of Cherednik Algebras Associated to Complex Reflection Groups in Positive Characteristic (arXiv.org, 1 July 2012)

We consider irreducible lowest-weight representations of Cherednik algebras associated to certain classes of complex reflection groups in characteristic p. In particular, we study maximal submodules of Verma modules associated to these algebras. Various results and conjectures are presented concerning generators of these maximal submodules, which are found by computing singular polynomials of Dunkl operators. This work represents progress toward the general problem of determining Hilbert series of irreducible lowest-weight representations of arbitrary Cherednik algebras in characteristic p.

12) Aaron Klein, Joel Brewster Lewis, and Alejandro Morales, Counting matrices over finite fields with support on skew Young and Rothe diagrams (arXiv.org, 26 March 2012); published in the Journal of Algebraic Combinatorics (May 2013)

We consider the problem of finding the number of matrices over a finite field with a certain rank and with support that avoids a subset of the entries. These matrices are a q-analogue of permutations with restricted positions (i.e., rook placements). For general sets of entries these numbers of matrices are not polynomials in q (Stembridge 98); however, when the set of entries is a Young diagram, the numbers, up to a power of q-1, are polynomials with nonnegative coefficients (Haglund 98). In this paper, we give a number of conditions under which these numbers are polynomials in q, or even polynomials with nonnegative integer coefficients. We extend Haglund's result to complements of skew Young diagrams, and we apply this result to the case when the set of entries is the Rothe diagram of a permutation. In particular, we give a necessary and sufficient condition on the permutation for its Rothe diagram to be the complement of a skew Young diagram up to rearrangement of rows and columns. We end by giving conjectures connecting invertible matrices whose support avoids a Rothe diagram and Poincaré polynomials of the strong Bruhat order.

11) Surya Bhupatiraju, Pavel Etingof, David Jordan, William Kuszmaul, and Jason Li, Lower central series of a free associative algebra over the integers and finite fields (arXiv.org, 8 March 2012), published in the Journal of Algebra (December 2012)

Consider the free algebra A_n generated over Q by n generators x_1, ..., x_n. Interesting objects attached to A = A_n are members of its lower central series, L_i = L_i(A), defined inductively by L_1 = A, L_{i+1} = [A,L_{i}], and their associated graded components B_i = B_i(A) defined as B_i=L_i/L_{i+1}. These quotients B_i, for i at least 2, as well as the reduced quotient \bar{B}_1=A/(L_2+A L_3), exhibit a rich geometric structure, as shown by Feigin and Shoikhet and later authors (Dobrovolska-Kim-Ma, Dobrovolska-Etingof, Arbesfeld-Jordan, Bapat-Jordan).
We study the same problem over the integers Z and finite fields F_p. New phenomena arise, namely, torsion in B_i over Z, and jumps in dimension over F_p. We describe the torsion in the reduced quotient \bar{B}_1 and B_2 geometrically in terms of the De Rham cohomology of Z^n. As a corollary we obtain a complete description of \bar{B}_1(A_n(Z)) and \bar{B}_1(A_n(F_p)), as well as of B_2(A_n(Z[1/2])) and B_2(A_n(F_p)), p>2. We also give theoretical and experimental results for B_i with i>2, formulating a number of conjectures and questions based on them. Finally, we discuss the supercase, when some of the generators are odd (fermionic) and some are even (bosonic), and provide some theoretical results and experimental data in this case.

10) David Jordan and Masahiro Namiki, Determinant formulas for the reflection equation algebra (19 Feb 2012)

In this note, we report on work in progress to explicitly describe generators of the center of the reflection equation algebra associated to the quantum GL(N) R-matrix. In particular, we conjecture a formula for the quantum determinant, and for the quadratic central element, both of which involve the excedance statistic on the symmetric group. Current efforts are directed at proving these formulas, and at finding formulas for the remaining central elements.

9) Ziv Scully, Yan Zhang, and Tian-Yi (Damien) Jiang, Firing Patterns in the Parallel Chip-Firing Game (arXiv.org, 29 Nov 2012), published in Discrete Mathematics and Theoretical Computer Science (DMTCS) proc., Nancy, France, 2014

The parallel chip-firing game is an automaton on graphs in which vertices “fire” chips to their neighbors. This simple model, analogous to sandpiles forming and collapsing, contains much emergent complexity and has connections to different areas of mathematics including self-organized criticality and the study of the sandpile group. In this work, we study firing sequences, which describe each vertex’s interaction with its neighbors in this game. Our main contribution is a complete characterization of the periodic firing sequences that can occur in a game, which have a surprisingly simple combinatorial description. We also obtain other results about local behavior of the game after introducing the concept of motors.
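For readers unfamiliar with the game, the following sketch simulates it on a small graph. The abstract does not spell out the firing rule, so the code assumes the usual one: in each round, every vertex holding at least as many chips as it has neighbors sends one chip to each neighbor, all simultaneously. The function names and the example graph are hypothetical.

```python
def parallel_step(adjacency, chips):
    """One synchronous round of parallel chip-firing.

    adjacency[v] lists the neighbours of vertex v; chips[v] is its chip count.
    Assumed rule: v fires iff chips[v] >= deg(v), sending one chip per edge.
    """
    firing = [
        v for v, nbrs in enumerate(adjacency)
        if nbrs and chips[v] >= len(nbrs)
    ]
    new = list(chips)
    for v in firing:
        new[v] -= len(adjacency[v])
        for w in adjacency[v]:
            new[w] += 1
    return new


def run(adjacency, chips, rounds):
    """Return the list of configurations visited, starting with `chips`."""
    history = [list(chips)]
    for _ in range(rounds):
        history.append(parallel_step(adjacency, history[-1]))
    return history


# Path graph 0 - 1 - 2 with two chips on the middle vertex.
path = [[1], [0, 2], [1]]
history = run(path, [0, 2, 0], 4)
```

On this path the configuration alternates between [0, 2, 0] and [1, 0, 1], a simple example of the periodic behavior the abstract refers to.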

8) Sheela Devadas, Lowest-weight representations of Cherednik algebras in positive characteristic (29 Jan 2012)

We study lowest-weight irreducible representations of rational Cherednik algebras attached to the complex reflection groups G(m, r, n) in characteristic p, focusing specifically on the case p ≤ n, which is more complicated than the case p > n. The goal of our work is to calculate characters (and in particular Hilbert series) of these representations. By studying the kernel of the contravariant bilinear form on Verma modules, we prove formulas for Hilbert series of irreducible modules in a number of cases, and also obtain extensive computer data suggesting a number of conjectures. Specifically, we find that the shape and form of the Hilbert series of the irreducible representations and the generators of the kernel tend to be determined by the value of n modulo p.

7) Christina Chen, Maximizing Volume Ratios for Shadow Covering by Tetrahedra (arXiv.org, 9 Jan 2012)

Define a body A to be able to hide behind a body B if the orthogonal projection of B contains a translation of the corresponding orthogonal projection of A in every direction. In two dimensions, it is easy to observe that there exist two objects such that one can hide behind another and have a larger area than the other. It was recently shown that similar examples exist in higher dimensions as well. However, the highest possible volume ratio for such bodies is still undetermined. We investigated two three-dimensional examples, one involving a tetrahedron and a ball and the other involving a tetrahedron and an inverted tetrahedron. We calculate the highest volume ratio known up to this date, 1.16, which is generated by our second example.

6) Yongyi Chen, Pavel Etingof, David Jordan, and Michael Zhang, Poisson traces in positive characteristic (arXiv.org, 29 Dec 2011)

We study Poisson traces of the structure algebra A of an affine Poisson variety X defined over a field of characteristic p. According to arXiv:0908.3868v4 , the dual space HP_0(A) to the space of Poisson traces arises as the space of coinvariants associated to a certain D-module M(X) on X. If X has finitely many symplectic leaves and the ground field has characteristic zero, then M(X) is holonomic, and thus HP_0(A) is finite dimensional. However, in characteristic p, the dimension of HP_0(A) is typically infinite. Our main results are complete computations of HP_0(A) for sufficiently large p when X is 1) a quasi-homogeneous isolated surface singularity in the three-dimensional space, 2) a quotient singularity V/G, for a symplectic vector space V by a finite subgroup G in Sp(V), and 3) a symmetric power of a symplectic vector space or a Kleinian singularity. In each case, there is a finite nonnegative grading, and we compute explicitly the Hilbert series. The proofs are based on the theory of D-modules in positive characteristic.

5) Saarik Kalia, The Generalizations of the Golden Ratio: Their Powers, Continued Fractions, and Convergents (23 Dec 2011)

The relationship between the golden ratio and continued fractions is well known: the convergents of its continued fraction are the ratios of consecutive Fibonacci numbers. The continued fractions for the powers of the golden ratio also exhibit an interesting relationship with the Lucas numbers. In this paper, we study the silver means and introduce the bronze means, which are generalizations of the golden ratio. We correspondingly introduce the silver and bronze Fibonacci and Lucas numbers, and we prove the relationship between the convergents of the continued fractions of the powers of the silver and bronze means and the silver and bronze Fibonacci and Lucas numbers. We further generalize this to the Lucas constants, a two-parameter generalization of the golden ratio.
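The classical fact that opens this abstract can be verified directly. The sketch below (function names are hypothetical) computes convergents of the golden ratio's continued fraction [1; 1, 1, 1, ...] with the standard convergent recurrence and compares them to ratios of consecutive Fibonacci numbers. It covers only the golden ratio itself, not the silver or bronze means studied in the paper.

```python
from fractions import Fraction


def golden_convergents(count):
    """First `count` convergents of [1; 1, 1, 1, ...].

    Uses h_n = a_n * h_{n-1} + h_{n-2} and k_n = a_n * k_{n-1} + k_{n-2}
    with every partial quotient a_n equal to 1.
    """
    h_prev, h = 0, 1  # h_{-2}, h_{-1}
    k_prev, k = 1, 0  # k_{-2}, k_{-1}
    out = []
    for _ in range(count):
        h_prev, h = h, h + h_prev
        k_prev, k = k, k + k_prev
        out.append(Fraction(h, k))
    return out


def fibonacci_ratios(count):
    """Ratios F_{n+1} / F_n for n = 1, ..., count, with F_1 = F_2 = 1."""
    a, b = 1, 1
    out = []
    for _ in range(count):
        out.append(Fraction(b, a))
        a, b = b, a + b
    return out
```

Both functions produce 1, 2, 3/2, 5/3, 8/5, and so on, and the two lists agree term by term.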

4) Caroline Ellison, The Number of Nonzero Coefficients of Powers of a Polynomial over a Finite Field (15 Nov 2011)

Coefficients of polynomials over finite fields often encode information that can be applied in various areas of science; for instance, computer science and representation theory. The purpose of this project is to investigate these coefficients over the finite field F_p. We find four exact results for the number of nonzero coefficients in special cases of n and p for the polynomial (1 + x + x²)ⁿ. More importantly, we use Amdeberhan and Stanley's matrices to find what we conjecture to be an approximation for the sum of the number of nonzero coefficients of P(x)ⁿ over F_p. We also relate the number of nonzero coefficients to the number of base p digits of n. These results lead to questions in representation theory and combinatorics.
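The quantity studied in this abstract is easy to compute by brute force for small cases. The sketch below (hypothetical function names) multiplies out (1 + x + x²)ⁿ with coefficients reduced modulo p and counts the nonzero ones. It is a direct computation for experimentation, not the matrix method the abstract mentions.

```python
def poly_mul_mod(f, g, p):
    """Multiply coefficient lists f and g (lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def nonzero_coefficients(n, p, base=(1, 1, 1)):
    """Number of nonzero coefficients of base(x)^n over F_p.

    The default base is 1 + x + x^2.
    """
    result = [1]
    for _ in range(n):
        result = poly_mul_mod(result, list(base), p)
    return sum(1 for c in result if c != 0)
```

For example, (1 + x + x²)³ has integer coefficients 1, 3, 6, 7, 6, 3, 1, so five of them survive modulo 2 and three survive modulo 3.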

3) Xiaoyu He, On the Classification of Universal Rotor-Routers (arXiv.org, 6 Nov 2011)

The combinatorial theory of rotor-routers has connections with problems of statistical mechanics, graph theory, chaos theory, and computer science. A rotor-router network defines a deterministic walk on a digraph G in which a particle walks from a source vertex until it reaches one of several target vertices. Motivated by recent results due to Giacaglia et al., we study rotor-router networks in which all non-target vertices have the same type. A rotor type r is universal if every hitting sequence can be achieved by a homogeneous rotor-router network consisting entirely of rotors of type r. We give a conjecture that completely classifies universal rotor types. Then, this problem is simplified by a theorem we call the Reduction Theorem that allows us to consider only two-state rotors. A rotor-router network called the compressor, because it tends to shorten rotor periods, is introduced along with an associated algorithm that determines the universality of almost all rotors. New rotor classes, including boppy rotors, balanced rotors, and BURD rotors, are defined to study this algorithm rigorously. Using the compressor, the universality of new rotor classes is proved, and empirical computer results are presented to support our conclusions. Prior to these results, fewer than 100 of the roughly 260,000 possible two-state rotor types of length up to 17 were known to be universal, while the compressor algorithm proves the universality of all but 272 of these rotor types.

2) Yongyi Chen and Michael Zhang, On zeroth Poisson homology in positive characteristic (30 Sept 2011)

A Poisson algebra is a commutative algebra with a Lie bracket {,} satisfying the Leibniz rule. An important invariant of a Poisson algebra A is its zeroth Poisson homology HP_0(A)=A/{A,A}. It characterizes densities on the phase space invariant under all Hamiltonian flows. Also, the dimension of HP_0(A) gives an upper bound for the number of irreducible representations of any quantization of A. We study HP_0(A) when A is the algebra of functions on an isolated quasihomogeneous surface singularity. Over C, it's known that HP_0(A) is the Jacobi ring of the singularity whose dimension is the Milnor number. We generalize this to characteristic p. In this case, HP_0(A) is a finite (although not finite dimensional) module over A^p. We give its conjectural Hilbert series for Kleinian singularities and for cones of smooth projective curves, and prove the conjecture in several cases. (The conjecture has now been proved in general in our follow-up paper with P. Etingof and D. Jordan.)

1) Christina Chen, Tanya Khovanova, and Daniel A. Klain, Volume bounds for shadow covering (arXiv.org, 8 Sep 2011), published in Transactions of the American Mathematical Society 366 (2014)

For n ≥ 2 a construction is given for a large family of compact convex sets K and L in n-dimensional Euclidean space such that the orthogonal projection L_u onto the subspace u^⊥ contains a translate of the corresponding projection K_u for every direction u, while the volumes of K and L satisfy V_n(K) > V_n(L). It is subsequently shown that, if the orthogonal projection L_u onto the subspace u^⊥ contains a translate of K_u for every direction u, then the set (n/(n−1))L contains a translate of K. It follows that V_n(K) ≤ (n/(n−1))ⁿ V_n(L). In particular, we derive a universal constant bound V_n(K) ≤ 2.942 V_n(L), independent of the dimension n of the ambient space. Related results are obtained for projections onto subspaces of some fixed intermediate co-dimension. Open questions and conjectures are also posed.

Contact

With questions, contact PRIMES Program Director Slava Gerovitch at