# PRIMES: Research Papers

2020 Research Papers

264) Quanlin Chen, The Center of the $q$-Weyl Algebra over Rings with Torsion (23 Jan 2021)

We compute the centers of the Weyl algebra, $q$-Weyl algebra, and the "first $q$-Weyl algebra" over the quotient of the ring $\mathbb{Z}/p^N \mathbb{Z}[q]$ by some polynomial $P(q)$. Through this, we generalize and "quantize" part of a result by Stewart and Vologodsky on the center of the ring of differential operators on a smooth variety over $\mathbb{Z}/p^N \mathbb{Z}$. We prove that a corresponding Witt vector structure appears for general $P(q)$ and compute the extra terms for special $P(q)$ with particular properties, answering a question by Bezrukavnikov of possible interpolation between two known results.

263) Tanisha Saxena and Daniel Xu, Graph Alignment-Based Protein Comparison (23 Jan 2021)

Inspired by the question of identifying mechanisms of viral infection, we are interested in the problem of comparing pairs of proteins, given by amino acid sequences and traces of their 3-dimensional structure. While it is true that the problem of predicting and comparing protein function is one of the most famous unsolved problems in computational biology, we propose a heuristic which poses it as a simple alignment problem, which - after some linear-algebraic pre-processing - is amenable to a dynamic programming solution.

262) Andrew Cai, Ratios of Naruse-Newton Coefficients Obtained from Descent Polynomials (arXiv.org, 20 Jan 2021)

We study Naruse-Newton coefficients, which are obtained from expanding descent polynomials in a Newton basis introduced by Jiradilok and McConville. These coefficients $C_0, C_1, \ldots$ form an integer sequence associated to each finite set of positive integers. For fixed nonnegative integers $a<b$, we examine the set $R_{a, b}$ of all ratios $\frac{C_a}{C_b}$ over finite sets of positive integers. We characterize finite sets for which $\frac{C_a}{C_b}$ is minimized and provide a construction to prove $R_{a, b}$ is unbounded above. We use this construction to obtain results on the closure of $R_{a, b}$. We also examine properties of Naruse-Newton coefficients associated with doubleton sets, such as unimodality and log-concavity. Finally, we find an explicit formula for all ratios $\frac{C_a}{C_b}$ of Naruse-Newton coefficients associated with ribbons of staircase shape.

261) Ishan Levy (MIT) and Justin Wu (PRIMES), Borel cohomology of $S^n$ mapping spaces (16 Jan 2021)

We produce an algebraic approximation of the $\mod 2$ cohomology of the homotopy quotient of mapping spaces from an odd dimensional sphere $S^n$ to an arbitrary space $Z$ by the action of the special orthogonal group $SO(n + 1)$. Our approximation is constructed using the structure of the cohomology of $Z$ as an algebra over the Steenrod algebra, and we prove that it agrees with the actual cohomology when $Z$ is an Eilenberg-MacLane space whose homotopy groups are finite type $\mathbb{F}_2$-modules. Our construction can be thought of as an analog of negative cyclic homology for higher dimensional spheres that takes into account an action of the Steenrod algebra, and it generalizes a construction of Ottosen and Bökstedt for $n = 1$. We also include an appendix where we give explicit formulas for computing a related algebraic approximation of the $\mod 2$ cohomology of arbitrary mapping spaces out of spaces with finite type $\mod 2$ cohomology.

260) Linda Chen, Reducing Round Complexity of Byzantine Broadcast (15 Jan 2021)

Byzantine Broadcast is an important topic in distributed systems and improving its round complexity has long been a focused challenge. Under honest majority, the state of the art for Byzantine Broadcast is 10 rounds for a static adversary and 16 rounds for an adaptive adversary. In this paper, we present a Byzantine Broadcast protocol with expected 8 rounds under a static adversary and expected 10 rounds under an adaptive adversary. We also generalize our idea to the dishonest majority setting and achieve an improvement over existing protocols.

259) Zarathustra Brady (MIT) and Holden Mui (PRIMES), Symmetric Operations on Domains of Size at Most 4 (15 Jan 2021)

To convert a fractional solution to an instance of a constraint satisfaction problem into a solution, a rounding scheme is needed, which can be described by a collection of symmetric operations with one of each arity. An intriguing possibility, raised in a recent paper by Carvalho and Krokhin, would imply that any clone of operations on a set $D$ which contains symmetric operations of arities $1, 2, \ldots, |D|$ contains symmetric operations of all arities in the clone. If true, then it is possible to check whether any given family of constraint satisfaction problems is solved by its linear programming relaxation. We characterize all idempotent clones containing symmetric operations of arities $1, 2, \ldots, |D|$ for all sets $D$ with size at most four and prove that each one contains symmetric operations of every arity, proving the conjecture above for $|D|{\leq}4$.

258) Yuxiao Wang, Asymptotics for Iterating the Lusztig-Vogan Bijection for $GL_n$ on Dominant Weights (15 Jan 2021)

In this paper, we iterate the explicit algorithm computing the Lusztig-Vogan bijection in Type $A$ ($GL_n$) on dominant weights, which was proposed by Achar and simplified by Rush. Our main result focuses on describing asymptotic behavior between the number of iterations for an input and the length of the input; we also present a recursive formula to compute the slope of the asymptote. This serves as another contribution to understanding the Lusztig-Vogan bijection from a combinatorial perspective and a first step in understanding the iterative behavior of the Lusztig-Vogan bijection in Type $A$.

257) Quanlin Chen, Tianze Jiang, and Yuxiao Wang, On the Generational Behavior of Gaussian Binomial Coefficients at Roots of Unity (15 Jan 2021)

The generational behavior of Gaussian binomial coefficients at roots of unity shadows the relationship between the reductive algebraic group in prime characteristic and the quantum group at roots of unity. In this paper, we study three ways of obtaining integer values from Gaussian binomial coefficients at roots of unity. We rigorously define the generations in this context and prove such behavior at prime power and two times prime power roots of unity. Moreover, we investigate and make conjectures on the vanishing, valuation, and sign behavior under the big picture of generations.

256) Fiona Abney-McPeek, Serena An, and Jakin Ng, The Stembridge Equality for Skew Stable Grothendieck Polynomials and Skew Dual Stable Grothendieck Polynomials (15 Jan 2021)

The Schur polynomials $s_{\lambda}$ are essential in understanding the representation theory of the general linear group. They also describe the cohomology ring of the Grassmannians. For $\rho = (n, n-1, \dots, 1)$ a staircase shape and $\mu \subseteq \rho$ a subpartition, the Stembridge equality states that $s_{\rho/\mu} = s_{\rho/\mu^T}$. This equality provides information about the symmetry of the cohomology ring. The stable Grothendieck polynomials $G_{\lambda}$, and the dual stable Grothendieck polynomials $g_{\lambda}$, developed by Buch, Lam, and Pylyavskyy, are variants of the Schur polynomials and describe the $K$-theory of the Grassmannians. Using the Hopf algebra structure of the ring of symmetric functions and a generalized Littlewood-Richardson rule, we prove that $G_{\rho/\mu} = G_{\rho/\mu^T}$ and $g_{\rho/\mu} = g_{\rho/\mu^T}$, the analogues of the Stembridge equality for the skew stable and skew dual stable Grothendieck polynomials.

255) Adithya Balachandran, Andrew Huang, and Siwen Sun, Product Expansions of *q*-Character Polynomials (15 Jan 2021)

We consider certain class functions defined simultaneously on the groups $Gl_n(\mathbb{F}_q)$ for all *n*, which we also interpret as statistics on matrices. It has been previously shown that these simultaneous class functions are closed under multiplication, and we work towards computing the structure constants of this ring of functions. We derive general criteria for determining which statistics have nonzero expansion coefficients in the product of two fixed statistics. To this end, we introduce an algorithm that computes expansion coefficients in general, which we furthermore use to give closed form expansions in some cases. We conjecture that certain indecomposable statistics generate the whole ring, and indeed prove this to be the case for statistics associated with matrices consisting of up to 2 Jordan blocks. The coefficients we compute exhibit surprising stability phenomena, which in turn reflect stabilizations of joint moments as well as multiplicities in the irreducible decomposition of tensor products of representations of finite general linear groups.

254) Daniel Hong, Hyunwoo Lee, and Alex Wei, Optimal solutions and ranks in the max-cut SDP (15 Jan 2021)

The max-cut problem is a classical graph theory problem which is NP-complete. The best polynomial time approximation scheme relies on *semidefinite programming* (SDP). We study the conditions under which graphs of certain classes have rank 1 solutions to the max-cut SDP. We apply these findings to look at how solutions to the max-cut SDP behave under simple combinatorial constructions. Our results determine when solutions to the max-cut SDP for cycle graphs are rank 1. We find the solutions to the max-cut SDP of the vertex sum of two graphs. We then characterize the SDP solutions upon joining two triangle graphs by an edge sum.
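
For reference, the max-cut SDP is usually written in the Goemans-Williamson form below. This is the standard textbook relaxation, assumed here rather than quoted from the paper; the edge set $E$, the weights $w_{ij}$, and the symmetric matrix variable $X$ are notation introduced for illustration.

$$
\max_{X \in \mathbb{R}^{n \times n}} \ \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij}\,(1 - X_{ij}) \quad \text{subject to} \quad X_{ii} = 1 \ (i = 1, \ldots, n), \quad X \succeq 0.
$$

A feasible $X$ of rank 1 has the form $X = xx^{T}$ with $x \in \{\pm 1\}^n$, so it encodes an actual cut of the graph; when an optimal solution has rank 1, the relaxation is therefore tight.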

253) Sam Florin, Matthew Ho, and Rahul Thomas, Group testing for two defectives and the zero-error channel capacity (14 Jan 2021)

The issue of identifying defects in a set with as few tests as possible has many applications, including in maximum efficiency pool testing during the COVID-19 pandemic. This research aims to determine the rate of growth of the number of tests required relative to the logarithm of the size of the set. In particular, we focus on the case where there are exactly two defects in the set, which is equivalent to the problem of determining the zero-error capacity of a two-user binary adder channel with complete feedback. The channel capacity is given by a non-linear optimization problem involving entropy functions, whose optimal value remains unknown. In this paper, using the linear dependence technique, we are able to reduce the complexity of the optimization problem significantly. We also gather numerical evidence for the conjectured optimal value.

252) Sarah Chen, In silico prediction of retained intron-derived neoantigens in leukemia (8 Jan 2021)

Alternative splicing is critical for the regulation and diversification of gene expression. Conversely, splicing dysregulation, caused by mutations in splicing machinery or splice junctions, is a hallmark of cancer. Tumor-specific isoforms are a potential source of neoantigens, cancer-specific peptides presented by human leukocyte antigen (HLA) class I molecules and potentially recognized by T cells. For cancers such as acute myeloid leukemia (AML) with a low mutation burden but widespread splicing aberrations, splice variants and retained introns (RIs) in particular, may broaden the number of suitable targets for immunotherapy. I developed a computational pipeline to predict AS-derived neoepitopes from tumor RNA-Seq. I first used the B721.221 B cell line as a model system, for which RNA-Seq, Ribo-Seq, and immunoproteome data from >90 HLA class I monoallelic lines were available. I performed de novo transcriptome assembly with StringTie, identifying on average 694±73 AS isoforms across 4 technical replicates. Using HLAthena, I identified 1,087 AS-derived neoepitopes predicted to bind across 4 frequent HLA alleles. Of them, 192 (18%) also displayed evidence of mRNA translation, measured as the alignment of ≥1 Ribo-Seq. To further increase prediction accuracy, I am currently analyzing the HLA I immunopeptidome to define the features of predicted AS isoforms more likely to be not only translated but also HLA presented. Finally, I applied my prediction pipeline to AML cell lines (*n* = 8) and primary samples (*n* = 7). I identified 682±113 AS isoforms in AML cell lines, similar to the 694 in B721, but the proportion of isoforms containing RIs (as opposed to alternative 5' and 3' splice sites or cassette exons) was 3.5x higher than in B721, in line with the biological relevance of RIs in particular in this disease setting.
Primary AML samples yielded 1496±294 AS isoforms, more than twofold the number in B721 or AML cell lines, thus reinforcing the significant contribution of AS to the cancer immunopeptidome. Accurate prediction of AS-derived neoantigens through this pipeline will contribute to the design of novel cancer immunotherapies.

251) Kenta Suzuki (PRIMES) and Michael E. Zieve (University of Michigan), Meromorphic functions with the same preimages at several finite sets (31 Dec 2020)

Let $p$ and $q$ be nonconstant meromorphic functions on $\mathbb{C}^m$. We show that if $p$ and $q$ have the same preimages as one another, counting multiplicities, at each of four nonempty pairwise disjoint subsets $S_1,\ldots,S_4$ of $\mathbb{C}$, then $p$ and $q$ have the same preimages as one another at each of infinitely many subsets of $\mathbb{C}$, and moreover $g(p)=g(q)$ for some nonconstant rational function $g(x)$ whose degree is bounded in terms of the sizes of the $S_i$'s. This result is new already when $m=1$, and it implies many previous results about the extent to which a meromorphic function is determined by its preimages of a few points or a few small sets.

250) Yavor Litchev and Abigail Thomas, Hybrid Privacy Scheme (31 Dec 2020)

Local Differential Privacy (LDP) is an approach that allows a central server to compute on data submitted by multiple users while maintaining the privacy of each user. LDP is a very efficient approach to security; however, as privacy increases, the accuracy of these computations decreases. Multi-Party Computation (MPC) is a process by which multiple parties work together to compute the output of a function without revealing their own information. MPC is highly secure and accurate for such computations, but it is very computationally expensive and slow. The proposed hybrid privacy model harnesses the benefits of both LDP and MPC to create a secure, accurate, and fast algorithm for machine learning.

249) Ho Tin Fan and Alvin Lu, Parallel Batch-Dynamic 3-Vertex Subgraph Maintenance (31 Dec 2020)

Counting certain subgraphs is a fundamental problem that is crucial in recognizing patterns in large graphs, such as social networks and biological interactomes. However, many real world graphs are constantly evolving and are subject to changes over time, and previous work on efficient parallel subgraph counting algorithms either does not support dynamic modifications or does not extend to general subgraphs. This paper presents a theoretically-efficient and demonstrably fast algorithm for parallel batch-dynamic 3-vertex subgraph counting, and the underlying data structure can be extended to counting 4-vertex subgraphs as well. The algorithm maintains the *h*-index of the graph, or the maximum *h* such that the graph contains *h* vertices with degree at least *h*, and uses this to update subgraph counts through an efficient traversal of two-paths, or wedges. For a batch of size *b*, the algorithm takes O(*bh*) expected amortized work and O(log(*bh*)) span with high probability.
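
The *h*-index definition in this abstract can be illustrated with a short sketch. It recomputes the value from scratch for a static edge list, so it is not the paper's batch-dynamic maintenance scheme; the function names `degrees_from_edges` and `h_index` are hypothetical and chosen only for this example.

```python
def degrees_from_edges(edges):
    """Return the list of vertex degrees of an undirected simple graph."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return list(deg.values())


def h_index(degrees):
    """Largest h such that at least h vertices have degree >= h."""
    h = 0
    # After sorting in decreasing order, the i-th largest degree must be
    # at least i for i vertices to have degree >= i.
    for i, d in enumerate(sorted(degrees, reverse=True), start=1):
        if d >= i:
            h = i
        else:
            break
    return h


# A triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to vertex 2.
triangle_with_tail = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(h_index(degrees_from_edges(triangle_with_tail)))
```

A static recomputation like this costs a full sort per call; the point of the paper's data structure is to avoid that when edges arrive and leave in batches.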

248) Kevin Zhao (PRIMES), Vladislav Lialin & Anna Rumshisky (UMass Lowell), Text Is an Image: Augmentation via Embedding Mixing (30 Dec 2020)

Data augmentation techniques are essential for computer vision, yielding significant accuracy improvements with little engineering cost. However, data augmentation for text has always been tricky. Synonym replacement techniques require a good thesaurus and domain-specific rules for synonym selection from the synset, while backtranslation techniques are computationally expensive and require a good translation model for the language of interest.

In this paper, we present simple text augmentation techniques on the embeddings level, inspired by mixing-based image augmentations. These techniques are language-agnostic and require little to no hyperparameter tuning. We evaluate the augmentation techniques on IMDB and GLUE tasks, and the results show that the augmentations significantly improve the score of the RoBERTa model.

247) Alvin Chen (PRIMES) and Kai Huang (MIT), Alpha invariants of $K$-semistable smooth toric Fano varieties (29 Dec 2020)

Jiang conjectured that the $\alpha$-invariant for $n$-dimensional $K$-semistable smooth Fano varieties has a gap between $\frac{1}{n}$ and $\frac{1}{n+1}$, where $\frac{1}{n+1}$ can only be achieved by projective $n$-space. Assuming a weaker version of Ewald's conjecture, we prove this gap conjecture in the toric case. We also prove a necessary and sufficient classification for all possible values of the $\alpha$-invariant for $K$-semistable smooth toric Fano varieties by providing an explicit construction of the polytopes that can achieve these values. This provides an important step towards understanding the types of polytopes that correspond to particular values of the $\alpha$-invariant; in particular, we show that $K$-semistable smooth Fano polytopes are centrally symmetric if and only if they have an $\alpha$-invariant of $\frac{1}{2}$. Lastly, we examine the effects of the Picard number on the $\alpha$-invariant, classifying the $K$-semistable smooth toric Fano varieties with Picard number 1 or 2 and their $\alpha$-invariants.

246) Vishnu Emani (PRIMES), Klaus Schmitz-Abe and Pankaj Agrawal (Boston Children's Hospital), Statistical Ranking Model for Candidate Genes in Rare Genetic Disorders (28 Dec 2020)

Genetic mutations are responsible for a significant number of rare diseases, and so investigating the genetic basis of various rare diseases has been a crucial area of study. More specifically, studying variants in the exome, the protein coding region which makes up approximately 1% of the human genome, has been proven effective at identifying the most likely pathogenic variants. The advent of whole exome and whole genome sequencing facilitates identification of the most likely pathogenic mutations much more efficiently and on a greater scale. Next-generation sequencing has been growing rapidly in the past decade and has led to numerous successful disease-detection pipelines. The pipeline involved in this study was the Variant Explorer Pipeline (VExP), developed by our laboratory to improve diagnostic yield. In the VExP pipeline, genetic variants are filtered based on a variety of criteria, which can be divided into the categories of genotype data and phenotype data (Figure 1). After the filtering process, the most likely variants are isolated, a process which requires meticulous examination of a large number of mutations. Furthermore, determining the strength of a phenotype match presents challenges because a number of resources need to be consulted to make an informed decision. The purpose of this project was to develop an automated algorithm, using a host of parameters, to rank mutation candidates based on the two computed scores for pathogenicity.

245) Neil Chowdhury, Modeling the Effect of Histone Methylation on Chromosomal Organization in Colon Cancer Cells (27 Dec 2020)

Loop extrusion and compartmentalization are the two most important processes regulating the high-level organization of DNA in the cell nucleus. These processes are largely believed to be independent and competing. Chromatin consists of nucleosomes, which contain coils of DNA wrapped around histone proteins. Besides packing DNA, nucleosomes contain an "epigenetic code" - tails of histone proteins are chemically modified at certain positions to leave certain "histone marks" on the chromatin fiber. This paper explores the effect of the H3K9me3 histone modification, which typically corresponds to inactive and repressed chromatin, on genome structure. Interestingly, in H3K9me3 domains, there are far fewer topologically associating domains (TADs) than in other domains, and there is a unique compartmentalization pattern. A high-resolution polymer model simulating both loop extrusion and compartmentalization is created to explore these differences.

244) Daniel Xu, Modeling of Network Based Digital Contact Tracing and Testing Strategies for the COVID-19 Pandemic (26 Dec 2020)

With more than 1.7 million COVID-19 deaths, identifying effective measures to prevent COVID-19 is a top priority. We developed a mathematical model to simulate the COVID-19 pandemic with digital contact tracing and testing strategies. The model uses a real-world social network generated from a high-resolution contact data set of 180 students. This model incorporates infectivity variations, test sensitivities, incubation period, and asymptomatic cases. We present a method to extend the weighted temporal social network and present simulations on a network of 5000 students. The purpose of this work is to investigate optimal quarantine rules and testing strategies with digital contact tracing. The results show that the traditional strategy of quarantining direct contacts reduces infections by less than 20% without sufficient testing. Periodic testing every 2 weeks without contact tracing reduces infections by less than 3%. A variety of strategies are discussed including testing second and third degree contacts and the pre-exposure notification system, which acts as a social radar warning users how far they are from COVID-19. The most effective strategy discussed in this work combined the pre-exposure notification system with testing second and third degree contacts. This strategy reduces infections by 18.3% when 30% of the population uses the app, 45.2% when 50% of the population uses the app, 72.1% when 70% of the population uses the app, and 86.8% when 95% of the population uses the app. When simulating the model on an extended network of 5000 students, the results are similar with the contact tracing app reducing infections by up to 79%.

243) Yongyi Chen (MIT) and Tae Kyu Kim (PRIMES), On Generalized Carmichael Numbers (15 Dec 2020; arXiv.org 5 Mar 2021)

Given an integer $k$, define $C_k$ as the set of integers $n > \max(k,0)$ such that $a^{n-k+1} \equiv a \pmod{n}$ holds for all integers $a$. We establish various multiplicative properties of the elements in $C_k$ and give a sufficient condition for the infinitude of $C_k$. Moreover, we prove that there are finitely many elements in $C_k$ with one and two prime factors if and only if $k>0$ and $k$ is prime. In addition, if all but two prime factors of $n \in C_k$ are fixed, then there are finitely many elements in $C_k$, excluding certain infinite families of $n$. We also give conjectures about the growth rate of $C_k$ with numerical evidence. We explore a similar question when both $a$ and $k$ are fixed and prove that for fixed integers $a \geq 2$ and $k$, there are infinitely many integers $n$ such that $a^{n-k} \equiv 1 \pmod{n}$ if and only if $(k,a) \neq (0,2)$ by building off the work of Kiss and Phong. Finally, we discuss the multiplicative properties of positive integers $n$ such that the Carmichael function $\lambda(n)$ divides $n-k$.
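
The definition of $C_k$ can be checked directly by brute force. The sketch below tests the congruence $a^{n-k+1} \equiv a \pmod{n}$ for every residue $a$ modulo $n$, which suffices because the congruence depends only on $a \bmod n$. The function name `in_C_k` is a hypothetical helper for illustration, and the loop is only practical for small $n$; it is not the method used in the paper.

```python
def in_C_k(n: int, k: int) -> bool:
    """Return True if n > max(k, 0) and a^(n-k+1) = a (mod n) for all a."""
    if n <= max(k, 0):
        return False
    exponent = n - k + 1  # at least 2, because n > k
    # Checking one representative of each residue class is enough.
    return all(pow(a, exponent, n) == a % n for a in range(n))


# For k = 1 the condition is a^n = a (mod n), so C_1 contains the primes
# and the classical Carmichael numbers such as 561 = 3 * 11 * 17.
print(in_C_k(561, 1), in_C_k(7, 1), in_C_k(4, 1))

# Small members of C_0, i.e. n with a^(n+1) = a (mod n) for all a.
print([n for n in range(1, 2000) if in_C_k(n, 0)])
```

Python's three-argument `pow` performs modular exponentiation, so each individual check is fast; the cost comes from looping over all residues.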

242) William Qin, HOMFLY Polynomials of Pretzel Knots (11 Dec 2020; arXiv.org, 3 Jan 2021)

HOMFLY polynomials are one of the major knot invariants being actively studied. They are difficult to compute in the general case but can be far more easily expressed in certain specific cases. In this paper, we examine two particular knots, as well as one more general infinite class of knots.

From our calculations, we see some apparent patterns in the polynomials for the knots $9_{35}$ and $9_{46}$, and in particular their $F$-factors. These properties are of a form that seems conducive to finding a general formula for them, which would yield a general formula for the HOMFLY polynomials of the two knots.

Motivated by these observations, we demonstrate and conjecture some properties both of the $F$-factors and HOMFLY polynomials of these knots and of the more general class that contains them, namely pretzel knots with 3 odd parameters. We make the first steps toward a matrix-less general formula for the HOMFLY polynomials of these knots.

241) Jonathan Yin (PRIMES), Hattie Chung (Broad Institute), and Aviv Regev (Broad Institute), A multi-view generative model for molecular representation improves prediction tasks (7 Dec 2020), accepted paper for LMRL2020 (Learning Meaningful Representations of Life) workshop at NeurIPS 2020 (Thirty-fourth Conference on Neural Information Processing Systems)

Unsupervised generative models have been a popular approach to representing molecules. These models extract salient molecular features to create compact vectors that can be used for downstream prediction tasks. However, current generative models for molecules rely mostly on structural features and do not fully capture global biochemical features. Here, we propose a multi-view generative model that integrates low-level structural features with global chemical properties to create a more holistic molecular representation. In proof-of-concept analyses, compared to purely structural latent representations, multi-view latent representations improve model accuracy on various tasks when used as input to feed-forward prediction networks. For some tasks, simple models trained on multi-view representations perform comparably to more complex supervised methods. Multi-view representations are an attractive method to improve representations in an unsupervised manner, and could be useful for prediction tasks, particularly in contexts where data is limited.

240) Yibo Gao (MIT), Joshua Guo (PRIMES), Karthik Seetharaman (PRIMES), and Ilaria Seidel (PRIMES), The Rank-Generating Functions of Upho Posets (3 Nov 2020)

Upper homogeneous finite type (upho) posets are a large class of partially ordered sets with the property that the principal order filter at every vertex is isomorphic to the whole poset. Well-known examples include $k$-ary trees, the grid graphs, and the Stern poset. Very little is known about upho posets in general. In this paper, we construct upho posets with Schur-positive Ehrenborg quasisymmetric functions, whose rank-generating functions have rational poles and zeros. We also categorize the rank-generating functions of all planar upho posets. Finally, we prove the existence of an upho poset with uncomputable rank-generating function.

239) Jason Yang (PRIMES) and Jun Wan (MIT), On Updating and Querying Submatrices (arXiv.org, 25 Oct 2020)

In this paper, we study the $d$-dimensional update-query problem. We provide lower bounds on update and query running times, assuming a long-standing conjecture on min-plus matrix multiplication, as well as algorithms that are close to the lower bounds. Given a $d$-dimensional matrix, an *update* changes each element in a given submatrix from $x$ to $x\bigtriangledown v$, where $v$ is a given constant. A *query* returns the $\bigtriangleup$ of all elements in a given submatrix. We study the cases where $\bigtriangledown$ and $\bigtriangleup$ are both commutative and associative binary operators. When $d = 1$, updates and queries can be performed in $O(\log N)$ worst-case time for many $(\bigtriangledown,\bigtriangleup)$ by using a segment tree with lazy propagation. However, when $d\ge 2$, similar techniques usually cannot be generalized. We show that if min-plus matrix multiplication cannot be computed in $O(N^{3-\varepsilon})$ time for any $\varepsilon>0$ (which is widely believed to be the case), then for $(\bigtriangledown,\bigtriangleup)=(+,\min)$, either updates and queries cannot both run in $O(N^{1-\varepsilon})$ time for any constant $\varepsilon>0$, or preprocessing cannot run in polynomial time. Finally, we show a special case where lazy propagation can be generalized for $d\ge 2$ and where updates and queries can run in $O(\log^d N)$ worst-case time. We present an algorithm that meets this running time and is simpler than similar algorithms of previous works.
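
The $d = 1$ case mentioned in this abstract can be sketched as a textbook segment tree with lazy propagation for $(\bigtriangledown,\bigtriangleup)=(+,\min)$: range add as the update, range minimum as the query. This is the standard one-dimensional technique, not the paper's higher-dimensional algorithm, and the class name `LazyMinSegmentTree` is hypothetical.

```python
class LazyMinSegmentTree:
    """Range-add update and range-min query on a 1-D array, O(log N) each."""

    def __init__(self, values):
        self.n = len(values)
        self.min = [0] * (4 * self.n)   # minimum of each node's segment
        self.lazy = [0] * (4 * self.n)  # add not yet pushed to the children
        if self.n:
            self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.min[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.min[node] = min(self.min[2 * node], self.min[2 * node + 1])

    def _push(self, node):
        """Move a pending add from an internal node down to its children."""
        if self.lazy[node]:
            for child in (2 * node, 2 * node + 1):
                self.min[child] += self.lazy[node]
                self.lazy[child] += self.lazy[node]
            self.lazy[node] = 0

    def update(self, l, r, v, node=1, lo=0, hi=None):
        """Add v to every element with index in [l, r] (inclusive)."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            self.min[node] += v
            self.lazy[node] += v
            return
        self._push(node)
        mid = (lo + hi) // 2
        self.update(l, r, v, 2 * node, lo, mid)
        self.update(l, r, v, 2 * node + 1, mid + 1, hi)
        self.min[node] = min(self.min[2 * node], self.min[2 * node + 1])

    def query(self, l, r, node=1, lo=0, hi=None):
        """Return the minimum of the elements with index in [l, r]."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return float("inf")
        if l <= lo and hi <= r:
            return self.min[node]
        self._push(node)
        mid = (lo + hi) // 2
        return min(self.query(l, r, 2 * node, lo, mid),
                   self.query(l, r, 2 * node + 1, mid + 1, hi))


tree = LazyMinSegmentTree([5, 3, 8, 6, 2, 7])
tree.update(0, 2, -4)      # array becomes [1, -1, 4, 6, 2, 7]
print(tree.query(0, 5))
```

Lazy propagation works here because adding $v$ to every element of a segment changes its minimum by exactly $v$, so an update can stop at a fully covered node and defer the work on its children.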

238) Vishaal Ram (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), A modified age-structured SIR model for COVID-19 type viruses (arXiv.org, 23 Sept 2020)

We present a modified age-structured SIR model based on known patterns of social contact and distancing measures within Washington, USA. We find that population age-distribution has a significant effect on disease spread and mortality rate, and contributes to the efficacy of age-specific contact and treatment measures. We consider the effect of relaxing restrictions across less vulnerable age-brackets, comparing results across selected groups of varying population parameters. Moreover, we analyze the mitigating effects of vaccinations and examine the effectiveness of age-targeted distributions. Lastly, we explore how our model can be applied to other states to reflect social-distancing policy based on different parameters and metrics.

237) Richard Chen (PRIMES), Feng Gui (MIT), Jason Tang (PRIMES), and Nathan Xiong (PRIMES), Few distance sets in $\ell_p$ spaces and $\ell_p$ product spaces (19 Sept 2020; arXiv.org, 26 Sept 2020)

Kusner asked if $n+1$ points is the maximum number of points in $\mathbb{R}^n$ such that the $\ell_p$ distance between any two points is $1$. We present an improvement to the best known upper bound when $p$ is large in terms of $n$, as well as a generalization of the bound to $s$-distance sets. We also study equilateral sets in the $\ell_p$ sums of Euclidean spaces, deriving upper bounds on the size of an equilateral set for when $p=\infty$, $p$ is even, and for any $1\le p<\infty$.

236) Tanya Khovanova (MIT) and Sean Li (PRIMES), The Penney's Game with Group Action (arXiv.org, 13 Sept 2020)

We generalize word avoidance theory by equipping the alphabet $\mathcal{A}$ with a group action. We call equivalence classes of words patterns. We extend the notion of word correlation to patterns using group stabilizers. We extend known word avoidance results to patterns. We use these results to answer standard questions for the Penney's game on patterns and show non-transitivity for the game on patterns as the length of the pattern tends to infinity. We also analyze bounds on the pattern-based Conway leading number and expected wait time, and further explore the game under the cyclic and symmetric group actions.
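
For orientation, the classical quantities that this abstract generalizes can be computed in a few lines. The sketch below covers only the ordinary Penney's game on words with uniformly random letters, with no group action; it uses Conway's leading number and Conway's odds formula. The function names are hypothetical, and the alphabet size `q` is a parameter introduced for this example.

```python
from fractions import Fraction


def leading_number(u, v, q=2):
    """Conway leading number of (u, v) for equal-length words.

    Adds q^(k-1) for every k such that the last k letters of u
    equal the first k letters of v.
    """
    n = len(u)
    assert len(v) == n, "words must have the same length"
    return sum(q ** (k - 1) for k in range(1, n + 1) if u[n - k:] == v[:k])


def expected_wait(w, q=2):
    """Expected number of uniform random letters until w first appears."""
    return q * leading_number(w, w, q)


def win_probability(b, a, q=2):
    """Probability that word b appears before word a (Conway's formula)."""
    favorable = leading_number(a, a, q) - leading_number(a, b, q)
    unfavorable = leading_number(b, b, q) - leading_number(b, a, q)
    return Fraction(favorable, favorable + unfavorable)


print(expected_wait("HH"), expected_wait("HT"))
print(win_probability("THH", "HHT"))
```

The overlap test `u[n - k:] == v[:k]` is the word correlation; in the paper's setting it is this notion that gets extended from words to patterns using group stabilizers.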

235) Ankit Bisain (PRIMES) and Eric J. Hanson (Brandeis University), The Bernardi Formula for Non-Transitive Deformations of the Braid Arrangement (7 Sept 2020; arXiv.org, 2 Oct 2020)

Bernardi has given a general formula to compute the number of regions of a deformation of the braid arrangement as a signed sum over *boxed trees*. We prove that the contribution to this sum of the set of boxed trees sharing an underlying rooted labeled tree is 0 or ±1 and give an algorithm for computing this value. We then restrict to arrangements which we call *almost transitive* and construct a sign-reversing involution which reduces Bernardi's signed sum to enumeration of a set of rooted labeled trees in this case. We conclude by explicitly enumerating the trees corresponding to the regions of certain nested Ish arrangements which we call *non-negative*, recovering their known counting formula.

234) Alejandro H. Morales (UMass Amherst) and William Shi (PRIMES), Refinements and Symmetries of the Morris identity for volumes of flow polytopes (7 Sept 2020; arXiv.org, 11 Feb 2021), forthcoming in *Comptes Rendus - Série Mathématique*

Flow polytopes are an important class of polytopes in combinatorics whose lattice points and volumes have interesting properties and relations. The Chan-Robbins-Yuen (CRY) polytope is a flow polytope with normalized volume equal to the product of consecutive Catalan numbers. Zeilberger proved this by evaluating the Morris constant term identity, but no combinatorial proof is known. There is a refinement of this formula that splits the largest Catalan number into Narayana numbers, which Mészáros gave an interpretation as the volume of a collection of flow polytopes. We introduce a new refinement of the Morris identity with combinatorial interpretations both in terms of lattice points and volumes of flow polytopes. Our results generalize Mészáros's construction and a recent flow polytope interpretation of the Morris identity by Corteel-Kim-Mészáros. We prove the product formula of our refinement following the strategy of the Baldoni-Vergne proof of the Morris identity. Lastly, we study a symmetry of the Morris identity bijectively using the Danilov-Karzanov-Koshevoy triangulation of flow polytopes and a bijection of Mészáros-Morales-Striker.

233) Vishaal Ram (PRIMES), Laura P. Schaposnik (University of Illinois at Chicago) et al., Extrapolating continuous color emotions through deep learning (2 Sept 2020), published in *Physical Review Research* 2:3 (September–November 2020)

By means of an experimental dataset, we use deep learning to implement an RGB (red, green, and blue) extrapolation of emotions associated to color, and do a mathematical study of the results obtained through this neural network. In particular, we see that males (type-$m$ individuals) typically associate a given emotion with darker colors, while females (type-$f$ individuals) associate it with brighter colors. A similar trend was observed with older people and associations to lighter colors. Moreover, through our classification matrix, we identify which colors have weak associations to emotions and which colors are typically confused with other colors.

232) Jesse Geneson, Suchir Kaustav, and Antoine Labelle (CrowdMath-2020), Extremal results for graphs of bounded metric dimension (arXiv.org, 31 Aug 2020)

Metric dimension is a graph parameter motivated by problems in robot navigation, drug design, and image processing. In this paper, we answer several open extremal problems on metric dimension and pattern avoidance in graphs from (Geneson, Metric dimension and pattern avoidance, Discrete Appl. Math. 284, 2020, 1-7). Specifically, we construct a new family of graphs that allows us to determine the maximum possible degree of a graph of metric dimension at most $k$, the maximum possible degeneracy of a graph of metric dimension at most $k$, the maximum possible chromatic number of a graph of metric dimension at most $k$, and the maximum $n$ for which there exists a graph of metric dimension at most $k$ that contains $K_{n, n}$.

We also investigate a variant of metric dimension called edge metric dimension and solve another problem from the same paper for $n$ sufficiently large by showing that the edge metric dimension of $P_n^{d}$ is $d$ for $n \geq d^{d-1}$. In addition, we use a probabilistic argument to make progress on another open problem from the same paper by showing that the maximum possible clique number of a graph of edge metric dimension at most $k$ is $2^{\Theta(k)}$. We also make progress on a problem from (N. Zubrilina, On the edge dimension of a graph, Discrete Math. 341, 2018, 2083-2088) by finding a family of new triples $(x, y, n)$ for which there exists a graph of metric dimension $x$, edge metric dimension $y$, and order $n$. In particular, we show that for each integer $k > 0$, there exist graphs $G$ with metric dimension $k$, edge metric dimension $3^k(1-o(1))$, and order $3^k(1+o(1))$.
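As a small illustration of the parameter studied above: the metric dimension of a connected graph is the minimum size of a set of "landmark" vertices such that every vertex is uniquely identified by its vector of distances to the landmarks. The sketch below is a brute-force check for tiny graphs only. It is not the construction from the paper, and the helper names (`metric_dimension`, `path_graph`, `complete_graph`) are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def metric_dimension(adj):
    """Brute-force metric dimension; assumes adj describes a connected graph."""
    vertices = sorted(adj)
    if len(vertices) <= 1:
        return 0
    dist = {v: bfs_distances(adj, v) for v in vertices}
    for size in range(1, len(vertices) + 1):
        for landmarks in combinations(vertices, size):
            # Each vertex gets a code: its distances to the chosen landmarks.
            codes = {tuple(dist[s][v] for s in landmarks) for v in vertices}
            if len(codes) == len(vertices):
                return size
    return len(vertices)


def path_graph(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def complete_graph(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}
```

The search is exponential in the number of vertices, so it is only useful for sanity checks on small examples such as paths and complete graphs.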

231) William Li, Lebesgue Measure Preserving Thompson's Monoid (30 Aug 2020)

This paper defines Lebesgue measure preserving Thompson's monoid, denoted by $\mathbb{G}$, which is modeled on Thompson's group $\mathbb{F}$ except that the elements of $\mathbb{G}$ are non-invertible. Moreover, it is required that the elements of $\mathbb{G}$ preserve Lebesgue measure. Monoid $\mathbb{G}$ exhibits very different properties from Thompson's group $\mathbb{F}$. The paper studies a number of algebraic (group-theoretic) and dynamical properties of $\mathbb{G}$ including approximation, mixing, periodicity, entropy, decomposition, generators, and topological conjugacy.

230) Srinath Mahankali, Velocity Inversion Using the Quadratic Wasserstein Metric (24 Aug 2020; arXiv.org, 26 Aug 2020)

Full-waveform inversion (FWI) is a method used to determine properties of the Earth from information on the surface. We use the squared Wasserstein distance (squared $W_2$ distance) as an objective function to invert for the velocity as a function of position in the Earth, and we discuss its convexity with respect to the velocity parameter. In one dimension, we consider constant, piecewise increasing, and linearly increasing velocity models as a function of position, and we show the convexity of the squared $W_2$ distance with respect to the velocity parameter on the interval from zero to the true value of the velocity parameter when the source function is a probability measure. Furthermore, we consider a two-dimensional model where velocity is linearly increasing as a function of depth and prove the convexity of the squared $W_2$ distance in the velocity parameter on large regions containing the true value. We discuss the convexity of the squared $W_2$ distance compared with the convexity of the squared $L^2$ norm, and we discuss the relationship between frequency and convexity of these respective distances. We also discuss multiple approaches to optimal transport for non-probability measures by first converting the wave data into probability measures.
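To make the objective function concrete, here is a minimal sketch of the squared $W_2$ distance in one dimension, under the assumption that both measures are empirical distributions with the same number of equally weighted points. In that special case the optimal transport plan matches sorted samples, so the distance reduces to a mean of squared differences. This is only the textbook one-dimensional formula, not the inversion procedure of the paper, and the function name is hypothetical.

```python
def squared_w2_1d(xs, ys):
    """Squared W_2 distance between two equal-size, equal-weight 1D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D the optimal coupling pairs the i-th smallest x with the i-th smallest y.
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)
```

For example, translating a sample by a shift $s$ gives a squared $W_2$ distance of $s^2$, which is convex in $s$.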

229) Michael Gerovitch, Environment-aware Pedestrian Trajectory Prediction for Autonomous Driving (21 Aug 2020)

People's safety is a primary concern in autonomous driving. There exist efficient methods for identifying static obstacles. However, the prediction of future trajectories of moving elements, such as pedestrians crossing a street, is a much more challenging problem. A promising direction of research is the use of machine learning algorithms with location bias maps. Our goal was to further explore this idea by training an interchangeable location bias map, a location-specific feature that is added into the middle of a convolutional neural network. For different locations, we used different location bias maps to allow the network to learn from different setting contexts without overfitting to a specific setting. Using pre-annotated video footage of pedestrians moving around in crowded areas, we implemented a pedestrian behavior encoding scheme to generate input and output volumes for the neural network. Using this encoding scheme, we trained our neural network and interchangeable location bias map. Our research demonstrates that the network with an interchangeable location bias map can predict realistic pedestrian trajectories even when trained simultaneously in multiple settings.

228) Andrew Shen, Towards Proving Application Isolation for Cryptocurrency Hardware Wallets (22 Jul 2020)

We often perform security-sensitive operations in our day-to-day lives such as performing monetary transactions. To perform these operations securely, we can isolate the confirmation of such operations to separate hardware devices. However, proving that these devices operate securely is still difficult given the complexity of their kernels, yet important given the rise in popularity of cryptocurrency transaction devices. To support multiple cryptocurrencies and other functionality, these devices must be able to run multiple applications that are isolated from one another as they could be potentially maliciously acting applications. We can simplify our device by modeling it as running applications sequentially in user mode. We seek to prove that these applications cannot tamper with the kernel memory and show that the kernel protection is set up correctly. To do this, we developed a RISC-V machine emulator in Rosette, which enables us to reason about the behaviour of symbolic machine states and symbolic applications. We make progress towards verifying application isolation for launching and running applications on a simple kernel.

227) Andrey Boris Khesin (MIT) and Alexander Lu Zhang (PRIMES), On Quasisymmetric Functions with Two Bordering Variables (arXiv.org, 23 Jul 2020)

We extend past results on a family of formal power series $K_{n, \Lambda}$, parameterized by $n$ and $\Lambda \subseteq [n]$, that largely resemble quasisymmetric functions. This family of functions was conjectured to have the property that the product $K_{n, \Lambda}K_{m, \Omega}$ of any two functions $K_{n, \Lambda}$ and $K_{m, \Omega}$ from the family can be expressed as a linear combination of other functions from the family. In this paper, we show that this is indeed the case and that the span of the $K_{n, \Lambda}$'s forms an algebra. We also provide techniques for examining similar families of functions and a formula for the product $K_{n, \Lambda}K_{m, \Omega}$ when $n=1$.

226) Neel Bhalla, Constructing Workflow-centric Traces in Close to Real Time for the Hadoop File System (22 Jul 2020)

Diagnosing problems in large scale systems using cloud based distributed services is a challenging problem. Workflow-centric tracing captures the workflow (work done to process requests) and dependency graph of causally-related events among the components of a distributed system. But, constructing traces has historically been performed offline in batch fashion, so trace data is not immediately available to engineers for their diagnosis efforts. In this work, we present an approach based on graph abstraction and streaming framework to construct workflow-centric traces in near real time for the Hadoop file system. This approach will provide the network operators with a real time understanding of the distributed system behavior.

225) Yunseo Choi (PRIMES) and James Unwin (University of Illinois at Chicago), Racial Impact on Infections and Deaths due to COVID-19 in New York City (11 Jul 2020; arXiv.org, 9 Jul 2020), forthcoming in *Harvard Technology Review*

Redlining is the discriminatory practice whereby institutions avoided investment in certain neighborhoods due to their demographics. Here we explore the lasting impacts of redlining on the spread of COVID-19 in New York City (NYC). Using data available through the Home Mortgage Disclosure Act, we construct a redlining index for each NYC census tract via a multi-level logistical model. We compare this redlining index with the COVID-19 statistics for each NYC Zip Code Tabulation Area. Accurate mappings of the pandemic would aid the identification of the most vulnerable areas and permit the most effective allocation of medical resources, while reducing ethnic health disparities.

224) Sanath Govindarajan (PRIMES) and William S. Moses (MIT), SyFER-MLIR: Integrating Fully Homomorphic Encryption Into the MLIR Compiler Framework (3 Jul 2020)

Fully homomorphic encryption opens up the possibility of secure computation on private data. However, fully homomorphic encryption is limited by its speed and the fact that arbitrary computations must be represented by combinations of primitive operations, such as addition, multiplication, and binary gates. Integrating FHE into the MLIR compiler infrastructure allows it to be automatically optimized at many different levels and will allow any program which compiles into MLIR to be modified to be encrypted by simply passing another flag into the compiler. The process of compiling into an intermediate representation and dynamically generating the encrypted program, rather than calling functions from a library, also allows for optimizations across multiple operations, such as rewriting a DAG of operations to run faster and removing unnecessary operations.

223) Ethan Mendes (PRIMES) and Kyle Hogan (MIT), Defending Against Imperceptible Audio Adversarial Examples Using Proportional Additive Gaussian Noise (30 Jun 2020)

Neural networks are susceptible to adversarial examples, which are specific inputs to a network that result in a misclassification or an incorrect output. While most past work has focused on methods to generate adversarial examples to fool image classification networks, recently, similar attacks on automatic speech recognition systems have been explored. Due to the relative novelty of these audio adversarial examples, there exist few robust defenses for these attacks. We present a robust defense for inaudible or imperceptible audio adversarial examples. This approach mimics the adversarial strategy to add targeted proportional additive Gaussian noise in order to revert an adversarial example back to its original transcription. Our defense performs similarly to other defenses yet is the first randomized or probabilistic strategy. Additionally, we demonstrate the challenges that arise when applying defenses against adversarial examples for images to audio adversarial examples.

222) Walden Yan (PRIMES) and William S. Moses (MIT), Token pairing to improve neural program synthesis models (30 Jun 2020)

In neural program synthesis (NPS), a network is trained to output or aid in the output of code that satisfies a given program specification. In our work, we make modifications upon the simple sequence-to-sequence (Seq2Seq) LSTM model. Extending the most successful techniques from previous works, we guide a beam search with an encoder-decoder scheme augmented with attention mechanisms and a specialized syntax layer. But one of the long-standing difficulties of NPS is the implicit tree structure of programs, which makes it inherently more difficult for linearly-structured models. To address this, we experiment with a novel technique we call *token pairing*. Our model is trained and evaluated on AlgoLisp, a dataset of English description-to-code programming problems paired with example solutions and test cases on which to evaluate programs. We also create a new interpreter for AlgoLisp that fixes the bugs present in the builtin executor. In the end, our model achieves 99.24% accuracy at evaluation, which greatly improves on the previous state-of-the-art of 95.80% while using fewer parameters.

221) Zhenkun Li (MIT) and Jessica Zhang (PRIMES), Classification of tight contact structures on a solid torus (arXiv.org, 30 Jun 2020)

It is a basic question in contact geometry to classify all non-isotopic tight contact structures on a given 3-manifold. If the manifold has a boundary, we need also specify the dividing set on the boundary. In this paper, we answer the classification question completely for the case of a solid torus by writing down a closed formula for the number of non-isotopic tight contact structures with any given dividing set on the boundary of the solid torus. Previously, only a few special cases were known due to work by Honda.

220) Christian Gaetz (MIT) and Katherine Tung (PRIMES), The Sperner property for $132$-avoiding intervals in the weak order (arXiv.org, 29 Jun 2020), published in *Bulletin of the London Mathematical Society* 53:2 (April 2021): 442-457.

A well-known result of Stanley implies that the weak order on a maximal parabolic quotient of the symmetric group $S_n$ has the Sperner property; this same property was recently established for the weak order on all of $S_n$ by Gaetz and Gao, resolving a long-open problem. In this paper we interpolate between these results by showing that the weak order on any parabolic quotient of $S_n$ (and more generally on any $132$-avoiding interval) has the Sperner property. This result is proven by exhibiting an action of $\mathfrak{sl}_2$ respecting the weak order on these intervals. As a corollary we obtain a new formula for principal specializations of Schubert polynomials. Our formula can be seen as a strong Bruhat order analogue of Macdonald's reduced word formula. This proof technique and formula generalize work of Hamaker, Pechenik, Speyer, and Weigandt and Gaetz and Gao.

219) Yuxuan (Jason) Chen, Real World Application of Event-based End to End Autonomous Driving (29 Jun 2020)

End-to-end autonomous driving has recently been a popular area of study for deep learning. This work studies the use of event cameras for real-world deep-learned driving tasks in comparison to traditional RGB cameras. In this work, we evaluate existing state-of-the-art event-based models on offline datasets, design a novel model that fuses the benefits from both event-based and traditional frame-based cameras, and integrate the trained models on board a full-scale vehicle. We conduct tests in a challenging track with features unseen to the model. Through our experiments and saliency visualization, we show that event-based models actually predict the existing motion of the car rather than the active control the car should take. Therefore, while event-based models excel at offline tasks such as motion estimation, our experiments reveal a fundamental challenge in applying event-based end-to-end learning to active control tasks, that the models need to learn reasoning about future actions with a feedback loop that impacts its future state.

218) Arun S. Kannan (MIT) and Honglin Zhu (PRIMES), Characters for Projective Modules in the BGG Category $\mathcal{O}$ for the Orthosymplectic Lie Superalgebra $\mathfrak{osp}(3|4)$ (arXiv.org, 11 Jun 2020)

We determine the Verma multiplicities of standard filtrations of projective modules for integral atypical blocks in the BGG category $\mathcal{O}$ for the orthosymplectic Lie superalgebras $\mathfrak{osp}(3|4)$ by way of translation functors. We then explicitly determine the composition factor multiplicities of Verma modules using BGG reciprocity.

2019 Research Papers

217) Zhengyang (Leo) Dong (PRIMES) and Gil Alterovitz (MIT), netAE: Semi-supervised dimensionality reduction of single-cell RNA sequencing to facilitate cell labeling, published in *Bioinformatics* (29 Jul 2020)

Single-cell RNA sequencing allows us to study cell heterogeneity at an unprecedented cell-level resolution and identify known and new cell populations. The current cell labeling pipeline uses unsupervised clustering and assigns labels to clusters by manual inspection. However, this pipeline does not utilize available gold-standard labels because there are usually too few of them to be useful to most computational methods. This paper aims to facilitate cell labeling with a semi-supervised method in an alternative pipeline, in which a few gold-standard labels are first identified and then extended to the rest of the cells computationally. We built a semi-supervised dimensionality reduction method, a network-enhanced autoencoder (netAE). Tested on three public datasets, netAE outperforms various dimensionality reduction baselines and achieves satisfactory classification accuracy even when the labeled set is very small, without disrupting the similarity structure of the original space.

216) Tanya Khovanova (MIT) and Kevin Wu (PRIMES), Base 3/2 and Greedily Partitioned Sequences (arXiv.org, 19 Jul 2020)

We delve into the connection between base $\frac{3}{2}$ and the greedy partition of non-negative integers into 3-free sequences. Specifically, we find a fractal structure on strings written with digits 0, 1, and 2. We use this structure to prove that the even non-negative integers written in base $\frac{3}{2}$ and then interpreted in base 3 form the Stanley cross-sequence, where the Stanley cross-sequence comprises the first terms of the infinitely many sequences that are formed by the greedy partition of non-negative integers into 3-free sequences.
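The statement above can be checked numerically for small cases. The sketch below assumes the common "exploding dots" convention for base $\frac{3}{2}$ (the last digit is $n \bmod 3$, and the carry is $2\lfloor n/3 \rfloor$), and builds the greedy partition by placing each integer into the first sequence where it does not complete a 3-term arithmetic progression. Both the convention and the function names are assumptions made for illustration, not taken from the paper.

```python
def base_three_halves(n):
    """Digits of n in base 3/2, most significant first (assumed convention)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 3))
        n = 2 * (n // 3)  # three units in one place become two in the next
    return "".join(reversed(digits))


def stanley_cross_from_base(count):
    """Even integers written in base 3/2, then read as base-3 numerals."""
    return [int(base_three_halves(2 * k), 3) for k in range(count)]


def greedy_first_terms(count):
    """First terms of the sequences in the greedy partition into 3-free sequences."""
    sequences = []  # each sequence is stored as a set of integers
    firsts = []
    n = 0
    while len(firsts) < count:
        for seq in sequences:
            # n is blocked if some a < b in seq satisfies a + n = 2b.
            if not any((2 * b - n) in seq for b in seq):
                seq.add(n)
                break
        else:
            sequences.append({n})
            firsts.append(n)
        n += 1
    return firsts
```

Under these assumptions both functions begin 0, 2, 7, 21; for instance, 6 is written as 210 in base $\frac{3}{2}$, and 210 read in base 3 is 21.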

215) Dmitry Kleinbock (Brandeis University), Anurag Rao (Brandeis University), and Srinivasan Sathiamurthy (PRIMES), Critical loci of convex domains in the plane (26 Mar 2020; arXiv.org, 30 Mar 2020), published in *Indagationes Mathematicae* 32:3 (May 2021): 719-728.

Let $K$ be a bounded convex domain in $\mathbb{R}^2$ symmetric about the origin. The critical locus of $K$ is defined to be the (non-empty compact) set of lattices $\Lambda$ in $\mathbb{R}^2$ of smallest possible covolume such that $\Lambda \cap K= \lbrace 0\rbrace$. These are classical objects in geometry of numbers; yet all previously known examples of critical loci were either finite sets or finite unions of closed curves. In this paper we give a new construction which, in particular, furnishes examples of domains having critical locus of arbitrary Hausdorff dimension between $0$ and $1$.

214) P. A. Crowdmath, Propagation time for weighted zero forcing (arXiv.org, 15 May 2020)

Zero forcing is a graph coloring process that was defined as a tool for bounding the minimum rank and maximum nullity of a graph. It has also been used for studying control of quantum systems and monitoring electrical power networks. One of the problems from the 2017 AIM workshop "Zero forcing and its applications" was to explore edge-weighted probabilistic zero forcing, where edges have weights that determine the probability of a successful force if forcing is possible under the standard zero forcing coloring rule.

In this paper, we investigate the expected time to complete the weighted zero forcing coloring process, known as the expected propagation time, as well as the time for the process to be completed with probability at least $\alpha$, known as the $\alpha$-confidence propagation time. We demonstrate how to find the expected and confidence propagation times of any edge-weighted graph using Markov matrices. We also determine the expected and confidence propagation times for various families of edge-weighted graphs including complete graphs, stars, paths, and cycles.

213) P. A. Crowdmath, Applications of the abc conjecture to powerful numbers (arXiv.org, 15 May 2020)

The abc conjecture is one of the most famous unsolved problems in number theory. The conjecture claims for each real $\epsilon > 0$ that there are only a finite number of coprime positive integer solutions to the equation $a+b = c$ with $c > (rad(a b c))^{1+\epsilon}$. If true, the abc conjecture would imply many other famous theorems and conjectures as corollaries. In this paper, we discuss the abc conjecture and find new applications to powerful numbers, which are integers $n$ for which $p^2 | n$ for every prime $p$ such that $p | n$. We answer several questions from an earlier paper on this topic, assuming the truth of the abc conjecture.
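The two definitions used above, the radical $rad(n)$ (the product of the distinct primes dividing $n$) and powerful numbers, are easy to experiment with. The following is a minimal sketch using trial division. The function names, including the "quality" ratio $\log c / \log rad(abc)$, are illustrative choices and not notation from the paper.

```python
from math import gcd, log


def prime_factors(n):
    """Set of distinct primes dividing n, by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def rad(n):
    """Radical of n: product of the distinct primes dividing n."""
    result = 1
    for p in prime_factors(n):
        result *= p
    return result


def is_powerful(n):
    """True if p^2 divides n for every prime p dividing n."""
    return all(n % (p * p) == 0 for p in prime_factors(n))


def abc_quality(a, b):
    """log(c) / log(rad(abc)) for the coprime triple a + b = c."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    return log(c) / log(rad(a * b * c))
```

For the triple $1 + 8 = 9$ we have $rad(72) = 6 < 9$, so the quality exceeds 1. The conjecture asserts that, for any fixed $\epsilon > 0$, only finitely many coprime triples have quality above $1 + \epsilon$.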

212) Alin Tomescu (MIT CSAIL), Robert Chen (PRIMES), Yiming Zheng (PRIMES), Ittai Abraham (VMware Research), Benny Pinkas (VMware Research and Bar Ilan University), Guy Golan Gueta (VMware Research), and Srinivas Devadas (MIT CSAIL), Towards Scalable Threshold Cryptosystems (9 Mar 2020), published in *Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP)*, San Francisco, CA, vol. 1, pp. 1242-1258.

The resurging interest in Byzantine fault tolerant systems will demand more scalable threshold cryptosystems. Unfortunately, current systems scale poorly, requiring time quadratic in the number of participants. In this paper, we present techniques that help scale threshold signature schemes (TSS), verifiable secret sharing (VSS) and distributed key generation (DKG) protocols to hundreds of thousands of participants and beyond. First, we use efficient algorithms for evaluating polynomials at multiple points to speed up computing Lagrange coefficients when aggregating threshold signatures. As a result, we can aggregate a 130,000 out of 260,000 BLS threshold signature in just 6 seconds (down from 30 minutes). Second, we show how "authenticating" such multipoint evaluations can speed up proving polynomial evaluations, a key step in communication-efficient VSS and DKG protocols. As a result, we reduce the asymptotic (and concrete) computational complexity of VSS and DKG protocols from quadratic time to quasilinear time, at a small increase in communication complexity. For example, using our DKG protocol, we can securely generate a key for the BLS scheme above in 2.3 hours (down from 8 days). Our techniques improve performance for thresholds as small as 255 and generalize to any Lagrange-based threshold scheme, not just threshold signatures. Our work has certain limitations: we require a trusted setup, we focus on synchronous VSS and DKG protocols and we do not address the worst-case complaint overhead in DKGs. Nonetheless, we hope it will spark new interest in designing large-scale distributed systems.
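For context, the quadratic-time baseline mentioned above can be sketched with plain Shamir secret sharing over a prime field: each of the $t$ Lagrange coefficients is computed with a product over the other $t-1$ points. The sketch below is this naive approach only, not the fast multipoint-evaluation technique of the paper, and it works in a field rather than in the exponent of a BLS group. The modulus and all function names are illustrative assumptions.

```python
import random

PRIME = 2**61 - 1  # illustrative field modulus (a Mersenne prime)


def share_secret(secret, threshold, num_shares, rng):
    """Shamir shares (x, f(x)) of a random degree-(threshold-1) polynomial with f(0) = secret."""
    coeffs = [secret % PRIME] + [rng.randrange(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def lagrange_coefficients_at_zero(xs):
    """Naive O(t^2) Lagrange coefficients for interpolating at x = 0."""
    coeffs = []
    for i, xi in enumerate(xs):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        coeffs.append(num * pow(den, -1, PRIME) % PRIME)  # modular inverse (Python 3.8+)
    return coeffs


def reconstruct(shares):
    """Recover f(0) from at least `threshold` shares with distinct x-coordinates."""
    lams = lagrange_coefficients_at_zero([x for x, _ in shares])
    return sum(lam * y for lam, (_, y) in zip(lams, shares)) % PRIME
```

Any `threshold` shares with distinct x-coordinates recover the secret. The nested loop in `lagrange_coefficients_at_zero` is the quadratic cost that becomes prohibitive at the scale of hundreds of thousands of participants.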

211) Daniil Kalinov (MIT) and Lev Kruglyak (PRIMES), The Rational Cherednik Algebra of Type $A_1$ with Divided Powers (5 Mar 2020)

Motivated by the recent developments of the theory of Cherednik algebras in positive characteristic, we study rational Cherednik algebras with divided powers. In our research we have started with the simplest case, the rational Cherednik algebra of type $A_1$. We investigate its maximal divided power extensions over $R[c]$ and $R$ for arbitrary principal ideal domains $R$ of characteristic zero. In these cases, we prove that the maximal divided power extensions are free modules over the base rings, and construct an explicit basis in the case of $R[c]$. In addition, we provide an abstract construction of the rational Cherednik algebra of type $A_1$ over an arbitrary ring, and prove that this generalization expands the rational Cherednik algebra to include all of the divided powers.

210) Sebastian Jeon (PRIMES) and Tanya Khovanova (MIT), 3-Symmetric Graphs (4 Mar 2020)

An intuitive property of a random graph is that its subgraphs should also appear randomly distributed. We consider graphs whose subgraph densities exactly match their expected values. We call a graph $k$-symmetric if it has this property for all subgraphs with $k$ vertices. We discuss some properties and examples of such graphs. We construct 3-symmetric graphs and provide some statistics.

209) Lucy Cai, Espen Slettnes, and Jeremy Zhou, A Combinatorial Approach to Extracting Rooted Tree Statistics from the Order Quasisymmetric Function (3 Mar 2020)

The chromatic symmetric function defined by Stanley is a power series that is symmetric in an infinite number of variables and generalizes the chromatic polynomial. Shareshian and Wachs defined the chromatic quasisymmetric function, and Awan and Bernardi defined an analog of it for digraphs.

Three decades ago, Stanley posed a question equivalent to "Does the chromatic symmetric function distinguish between all trees?" A similar question can be raised for rooted trees: "Does the chromatic quasisymmetric function distinguish between all rooted trees?". Hasebe and Tsujie showed algebraically the stronger statement that the order quasisymmetric function distinguishes rooted trees. Here, we aim to directly extract useful statistics about a tree given only its *order* quasisymmetric function. This approach emphasizes the combinatorics of trees over the algebraic properties of quasisymmetric functions. We show that a rooted-tree-statistic we name the "co-height profile" is extractable, and that it distinguishes rooted 2-caterpillars.

208) Heidi Lei, On the Hausdorff Dimension of the Visible Koch Curve (28 Feb 2020)

In geometry, a point in a set is visible from another point if the line segment connecting two points does not contain other points in the set. We show that the Hausdorff dimension is 1 for the portion of the Koch curve that is visible from points at infinity and points in certain defined regions of the plane.

207) Aditya Saligrama (PRIMES) and Guillaume Leclerc (MIT), Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy (arXiv.org, 26 Feb 2020), presented at the ICLR 2020 Workshop on Towards Trustworthy ML: Rethinking Security and Privacy for ML (26 April 2020) (slides)

A necessary characteristic for the deployment of deep learning models in real world applications is resistance to small adversarial perturbations while maintaining accuracy on non-malicious inputs. While robust training provides models that exhibit better adversarial accuracy than standard models, there is still a significant gap in natural accuracy between robust and non-robust models which we aim to bridge. We consider a number of ensemble methods designed to mitigate this performance difference. Our key insight is that models trained to withstand small attacks, when ensembled, can often withstand significantly larger attacks, and this concept can in turn be leveraged to optimize natural accuracy. We consider two schemes, one that combines predictions from several randomly initialized robust models, and the other that fuses features from robust and standard models.

206) William Kuszmaul (MIT) and Alek Westover (PRIMES), In-Place Parallel-Partition Algorithms using Exclusive-Read-and-Write Memory (25 Feb 2020)

We present an in-place algorithm for the parallel partition problem that has linear work and polylogarithmic span. The algorithm uses only exclusive read/write shared variables, and can be implemented using parallel-for-loops without any additional concurrency considerations (i.e., the algorithm is EREW). A key feature of the algorithm is that it exhibits provably optimal cache behavior, up to small-order factors.

We also present a second in-place EREW algorithm that has linear work and span $O(\log n \cdot \log\log n)$, which is within an $O(\log\log n)$ factor of the optimal span. By using this low-span algorithm as a subroutine within the cache-friendly algorithm, we are able to obtain a single EREW algorithm that combines their theoretical guarantees: the algorithm achieves span $O(\log n \cdot \log\log n)$ and optimal cache behavior. As an immediate consequence, we also get an in-place EREW quicksort algorithm with work $O(n \log n)$ and span $O(\log^{2} n \cdot \log\log n)$.

205) Justin Yu, On a rank game (22 Feb 2020)

We introduce a new game played by two players that generates a $(0,1)$-matrix of size $n$. The first player aims to maximize its resulting rank, while the second player aims to minimize it. We show that the first player can force almost full rank given additional power in move possibilities.

204) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions as Models for Trade Wars and Military Annexation (arXiv.org, 10 Feb 2020)

We explore an application of all-pay auctions to model trade wars and territorial annexation. Specifically, in the model we consider the expected resource, production, and aggressive (military/tariff) power are public information, but actual resource levels are private knowledge. We consider the resource transfer at the end of such a competition which deprives the weaker country of some fraction of its original resources. In particular, we derive the quasi-equilibria strategies for two country conflicts under different scenarios. This work is relevant for the ongoing US-China trade war, and the recent Russian capture of Crimea, as well as historical and future conflicts.

203) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions with Different Forfeits (arXiv.org, 7 Feb 2020), forthcoming in the Yau Competition finalists compendium

In an auction each party bids a certain amount and the one which bids the highest is the winner. Interestingly, auctions can also be used as models for other real-world systems. In an all-pay auction all parties must pay a forfeit for bidding. In the most commonly studied all-pay auction, parties forfeit their entire bid, and this has been considered as a model for expenditure on political campaigns. Here we consider a number of alternative forfeits which might be used as models for different real-world competitions, such as preparing bids for defense or infrastructure contracts.

202) Victoria Zhang, Patterns and Symmetries in Spiking Neural Networks (11 Jan 2020)

Inspired by recent progress in computational neuroscience and artificial intelligence, this paper explores rich temporal patterns in networks of neurons that communicate via electric pulses known as spikes. In particular, we describe the attractors in small circuits of spiking neurons with different symmetries and connectivities. Using methods developed in the theory of dynamical systems, we extend an analytical approach to capture the phase-locked states and their stability for a general *N*-cell system. We then systematically explore attractors in reduced state spaces via Poincaré maps for both all-to-all coupled and star-like coupled networks. We identify a sequence of bifurcations when the coupling strengths vary from inhibition to excitation. Moreover, using high-precision numerical simulations, we find two novel states in star-like networks that are unobserved in all-to-all networks: the death of oscillation for inhibitory coupling and quasi-periodic behaviors for excitatory coupling. Our results elucidate the interplay between dynamical patterns and symmetries in the building blocks of real networks. Furthermore, as self-sustained oscillations with pulsatile couplings are ubiquitous, our analysis may clarify understanding of not only neural dynamics but also other pulse-coupled oscillator systems such as non-linear electric circuits, wireless sensor networks, and self-organizing chemical reactions.

201) Zander Hill, Upper Bound on the Distortion of Cabled Knots (8 Jan 2020)

The torus knots are a class of knots generated by ordered pairs $(p,q)$ of relatively prime integers, where the $(p,q)$-torus knot is the curve defined by a ray of slope $\frac{p}{q}$ emanating from the origin in the representation of the torus as a square with opposing sides identified. Furthermore, given a curve $K$, we can define the $(p,q)$-cabling of $K$ to be the $(p,q)$-torus knot living on an embedding of the torus which follows $K$, as opposed to the standard embedding of the torus which follows $S^1$ in $\mathbb{R}^3$. We show that for all $p$ and $q \gg p$, there exists a curve in the isotopy class of the $(p,q)$-torus knot whose supremal ratio of arc length to Euclidean distance, called the distortion of the curve, is bounded above by $\frac{7q}{\log(q)}$, and additionally show that this bound holds for the $(p,q)$-cabling of any knot. This extends a result of Studer establishing sublinear upper bounds for the distortion of the $(2,q)$-torus knots.

200) Oliver Hayman (PRIMES) and Ashwin Narayan (MIT), Analyzing Visualization and Dimensionality-Reduction Algorithms (9 Jan 2020)

In order to find patterns among high-dimensional data sets in scientific studies, scientists use mapping algorithms to produce representative two-dimensional or three-dimensional data sets that are easier to visualize. The most prominent of these algorithms is the t-Distributed Stochastic Neighbor Embedding algorithm (t-SNE). In this project, we create a metric for evaluating how clustered a data set is, and use it to measure how the perplexity parameter of the t-SNE algorithm affects the clustering of outputted data sets. Additionally, we propose a modification that improves how well randomness is preserved in outputted data sets. Finally, we create a separate metric to test whether a group of points contains one or multiple clusters in a data set of centered clusters.

199) Frank Wang, The integral shuffle algebra and the $K$-theory of the Hilbert scheme of points in $\mathbb{A}^2$ (8 Jan 2020; arXiv.org, 12 Feb 2020)

We examine the shuffle algebra defined over the ring $\mathbf{R} = \mathbb{C}[q_1^{\pm 1}, q_2^{\pm 1}]$, also called the integral shuffle algebra, which was found by Schiffmann and Vasserot to act on the equivariant $K$-theory of the Hilbert Scheme of points in the plane. We find that the modules of 2 and 3 variable elements of the shuffle algebra are finitely generated, and prove a necessary condition for an element to be in the integral shuffle algebra for arbitrarily many variables.

198) Tejas Gopalakrishna (PRIMES) and Yichi Zhang (MIT), Analysis of the One Line Factoring Algorithm (6 Jan 2020)

For integers that fit within $42$ bits, a competitive factoring algorithm is the so-called One Line Factoring Algorithm proposed by William B. Hart. We analyze this algorithm in special cases, in particular, for semiprimes $N = pq$, and look for optimizations. We first observe the cases in which the larger or smaller prime is returned. We then show that when $p$ and $q$ are sufficiently close, we always finish on the first iteration. An upper bound can be found for the first iteration that successfully factors an odd semiprime. Using this upper bound, we demonstrate some simplifications to the algorithm for odd semiprimes in particular. One of our observations is that we only need to iterate numbers $\{ 0,1,3,5,7 \}$ modulo $8$, as the other iterators are very rarely the first that successfully factor the semiprime. Finally, we inspect the performance of the optimized algorithm.
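
The iteration analyzed above can be sketched as follows. This is a minimal illustration of Hart's published one-line loop, not the optimized variant studied in the paper; the function name `one_line_factor`, the `max_iter` cap, and the guard against trivial factors are assumptions added for the example.

```python
from math import gcd, isqrt

def one_line_factor(n, max_iter=1_000_000):
    """Return a nontrivial factor of n found by Hart's iteration, or None."""
    for i in range(1, max_iter + 1):
        # s = ceil(sqrt(n * i)), computed with exact integer arithmetic
        s = isqrt(n * i)
        if s * s < n * i:
            s += 1
        m = (s * s) % n
        t = isqrt(m)
        if t * t == m:  # m is a perfect square, so s^2 = t^2 (mod n)
            g = gcd(n, s - t)
            if 1 < g < n:  # guard (assumption): skip trivial factors
                return g
    return None
```

For the semiprime $8051 = 83 \cdot 97$, whose prime factors are close together, the loop succeeds at $i = 1$: $s = 90$, $s^2 \bmod 8051 = 49 = 7^2$, and $\gcd(8051, 90 - 7) = 83$.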

197) Sunay Joshi, On the degenerate Turán problem and its variants (3 Jan 2020)

Given a family of graphs $\mathcal{F}$, a central problem in extremal graph theory is to determine the maximum number $\text{ex}(n,\mathcal{F})$ of edges in a graph on $n$ vertices that does not contain any member of $\mathcal{F}$ as a subgraph. The degenerate Turán problem regards the asymptotic behavior of $\text{ex}(n,\mathcal{F})$ for families $\mathcal{F}$ of bipartite graphs. In this paper, we prove four new theorems regarding the extremal number and its variants. We begin by investigating several notions central to providing lower bounds on extremal numbers, including balanced rooted graphs and the Erdős-Simonovits Reduction Theorem. In addition, we present new lower bounds on the asymmetric extremal number $\text{ex}(m,n,F)$ and the lopsided asymmetric extremal number $\text{ex}^*(m,n,F)$ when $F$ is a blowup of a bipartite graph or a theta graph.

196) Alexander J. Ding, An Evaluation of UPC++ by Porting Shared-Memory Parallel Graph Algorithms (1 Jan 2020)

Unified Parallel C++ (UPC++), a C++ library, attempts to address the programming difficulty introduced by distributed parallel systems and still take advantage of the model's high scalability by exposing an API that represents the distributed memory as a contiguous global address space, similar to that of a shared-memory parallel system. Though previous work, including the various benchmarks by UPC++ developers, has demonstrated the library's effectiveness in simple tasks and in porting distributed-memory parallel algorithms that are often implemented in OpenMPI, an assessment of the ease and effectiveness of porting shared-memory parallel algorithms into UPC++ is still lacking. We implement a number of graph algorithms in OpenMP, a common shared-memory parallel library, and port them into UPC++ in a locality-aware, communication-averse manner to evaluate the convenience, scalability, and robustness of UPC++. Tests on both a single-node, multicore system and the NERSC supercomputer (a multi-node system), with a plethora of real and random input graphs, demonstrate a number of prerequisites for high scalability in our UPC++ implementation: large input graphs, dense input graphs, and dense operations. Similar tests on our OpenMP implementation function as control, proving the algorithms' performance in shared-memory systems. Despite the relatively straightforward and naive porting from OpenMP, we still achieve competitive performance and scalability in dense algorithms on large inputs. The porting demonstrates UPC++'s ease of use and good porting potential, especially when compared with other distributed libraries like OpenMPI. Finally, we extrapolate a distributed graph processing system on UPC++, optimized with a hybrid top-down/bottom-up approach, to simplify future distributed graph algorithm implementations.

195) Jason Yang (PRIMES), Martin Falk (MIT), and Sameer Abraham (MIT), The relationship between gene expression correlation and 3D genome organization (31 Dec 2019)

In some organisms such as *E. coli* and *S. cerevisiae* yeast, it is known that there is a relationship between the distance among genes and their coexpression (Pannier et al., Kruglyak and Tang). It is also known that in general there is a relationship between gene function and genome structure (Szabo et al.). One might also expect to find a relationship between gene expression and TADs, which are domains within the genome where loci inside contact each other more frequently than loci outside. However, by analyzing data from *Mus musculus* brain cells, we do not find a relationship between gene pair correlation of single-cell RNA-seq gene expression and gene pair distance. Furthermore, despite the body of work linking gene expression and TAD structure, we also find no difference between gene pairs within a single TAD and between two TADs in terms of the relationship between gene pair distance and correlation. Additionally, we find that gene pair correlation is not related to the biological functions of the genes. However, there is a relationship between highly negative gene pair correlation and the number of times both genes are expressed 0 times across different cells.

194) Sarah Chen (PRIMES), Karl Clauser, Travis Law, and Tamara Ouspenskaia (Broad Institute), Seeking Neoantigen Candidates within Retained Introns (28 Dec 2019)

Major histocompatibility complex class I (MHC I) molecules present peptides from cytosolic proteins on the surface of cells. Cytotoxic T cells can recognize the presented antigens, and infected or cancerous cells that present non-self antigens can elicit an immune response. The identification of cancer-specific peptides (neoantigens) produced by somatic mutations in tumor cells and presented by MHC I molecules enables immunotherapies such as personalized cancer vaccines and adoptive T cell transfer. The state-of-the-art approach searches for neoantigens derived from cancer-specific somatic variants and often falls short for cancers with few somatic mutations. Retained introns (RIs) resulting from splicing errors in cancer are an additional source of neoantigens. In this study, we identify RIs which are transcribed, translated, and contribute peptides to MHC I presentation. Using *de novo* transcriptome assembly of RNA-seq data, we identified 1799 RIs in B721.221 cells. Additionally, we detected 87 peptides from 83 RIs by liquid chromatography-tandem mass spectrometry of the MHC I immunopeptidome (LC-MS/MS). Finally, we use ribosome profiling (Ribo-seq), which provides a readout of mRNA translation, to identify RIs that are translated, a prerequisite for MHC I presentation. Previous studies have predicted thousands of RIs but have been able to validate only a handful through mass spectrometry. By distinguishing transcribed but untranslated versus translated candidates, Ribo-seq has the potential to improve RI predictions. We propose the use of a combination of RNA-seq and Ribo-seq, paired with mass spectrometry validation, to more accurately predict the contribution of RIs to the MHC I immunopeptidome, enabling the use of RI-derived neoantigens in future immunotherapies.

193) Kevin Zhao and Vishnu Emani, The Role of Protein Occupancy in DNA Compartmentalization (23 Dec 2019)

The organization of DNA throughout the genome is a complex process to study. Analysis reveals a checker-board pattern of separation at a megabase-pair scale, called compartments, which are captured well by the largest eigenvector of the Hi-C contact matrix. The sign of the eigenvector correlates with active and repressed areas of the genome. These compartments have been characterized into two categories, called A and B compartments, which are hypothesized to be spatially separated based upon the protein occupancy in the region. This project explores the factors that govern DNA compartmentalization, including the relationship between compartments and protein occupancy. In order to analyze contacts within the genome, Hi-C data was loaded and the eigenvectors of the contact matrix were computed. Protein occupancy in murine cortical neurons and neural progenitor cells was measured via ChIP-Seq. Using this data, we calculated the influence of several proteins on the sign of the Hi-C eigenvector via regression and Support Vector Machines (SVMs). Based on our findings, we tried to develop a simple model for compartments and explored this via simulations. We developed simple simulations of compartments based on ChIP-Seq data, and compared the results to compartments identified in experimental Hi-C maps. The results demonstrate a high correlation between the eigenvectors of the simulated and experimental Hi-C maps. In conclusion, the computational methods are effective at determining the proteins which most significantly contribute to compartmentalization.

192) Neil Chowdhury, A method to recognize universal patterns in genome structure using Hi-C (22 Dec 2019)

The expression of genes in cells is a complicated process. Expression levels of a gene are determined not only by its local neighborhood but also by more distal regions, as is the case with enhancer-promoter interactions, which can connect regions millions of bases away. The large-scale organization of DNA within the cell nucleus plays a substantial role in gene expression and cell fate, with recent developments in biochemical assays (such as Hi-C) generating quantitative maps of the higher-order structure of DNA. The interactions captured by Hi-C have been attributed to several distinct physical processes. One of the processes is that of segregation of DNA into compartmental domains by phase separation. While the current consensus is that there are broadly two types of compartmental domains (A and B), there is some evidence for a larger number of compartmental domains. Here a methodology to determine the identity and number of such compartments is presented, and it is observed that there are four distinct compartments within the genome.

191) Yizhen Chen, Mobile Sensor Networks: Bounds on Capacity and Complexity of Realizability (22 Dec 2019; arXiv.org, 21 Jan 2020), submitted to *Electronic Journal of Combinatorics*

In a restricted combinatorial mobile sensor network (RCMSN), there are n sensors that continuously receive and store information from outside. Every two sensors communicate exactly once, and at an event when two sensors communicate, they receive and store additionally all information the other has stored. C. Gu, I. Downes, O. Gnawali, and L. Guibas proposed a capacity of information diffusion in mobile sensor networks. They collected all information received by two sensors between a communication event and the previous communication events for each of them into one information packet, and considered the number of sensors a packet eventually reaches. Then they defined the capacity of an RCMSN to be the ratio of the average number of sensors the packets reach and the total number of sensors. While they have studied the expected capacity of an RCMSN (when the order of communications is random), we found the RCMSNs with maximum and minimum capacities. We also found the maximum, minimum, and expected capacities for several related mobile sensor network constructions, such as ones generated from intersections of lines, as well as complexity results concerning when a mobile sensor network can be generated in such geometric ways.

190) Andrew Zhang, Antimicrobial resistance prediction using deep convolutional neural networks on whole genome sequence data (19 Dec 2019)

We propose a method to determine whether a bacterial strain is resistant to an antibiotic based on its whole genome sequence data using deep machine learning – deep convolutional neural networks (DCNN). The DCNN model developed in this research is shown to achieve an average AMR prediction accuracy of 94.7%. Each prediction takes less than a second. The model is verified with Klebsiella pneumoniae resistance to tetracycline data and Acinetobacter baumannii resistance to carbapenem data from the public database PATRIC. The DCNN model is further tested with clinically collected genomic data of 149 strains of Mycobacterium tuberculosis, and achieves a prediction accuracy of 93.1% for resistance to pyrazinamide (PZA). To find genes that harbor mutations of PZA resistance, we build a Support Vector Machine (SVM) model tailored for VCF format genomic data, which has revealed two novel genes, embB and gyrA, that harbor mutations associated with PZA resistance besides the well-known pncA gene. Our DCNN and SVM Machine Learning framework, if used together with the real-time genome sequencing machines, which are now already available, could make rapid AMR predictions, allowing for critical time to ensure good patient outcomes and preventing outbreaks of deadly AMR infections. Furthermore, the developed framework identifies pertinent resistance genes, helping researchers understand the mechanisms behind resistance. Finally, this research demonstrates how deep machine learning techniques can produce high accuracy predictive models accelerating the diagnosis of AMR.

189) Rupert Li, Pulses of Flow-firing Processes (8 Dec 2019)

Flow-firing is a natural generalization of chip-firing, or the abelian sandpile model, to higher dimensions, operating on infinite planar graphs. The edges of the graph have flow, which is rerouted through the faces of the graph. We investigate initial flow configurations which display terminating behavior and global confluence, meaning the terminating configuration is unique. The pulse configuration over a hole, or a configuration of flow going around a face that cannot redirect flow, is known to display global confluence, and we expand this result to initial configurations that have multiple pulses, identifying which terminating configurations are possible. We also generalize the analysis of the global confluence of pulses to configurations with flow outside of the hole, especially to the configuration of a pulse with radius, and prove under what conditions this displays global confluence. We conclude with a conjecture on the global confluence of a generalization of a pulse with radius, a uniform conservative configuration, or contour.

188) Yibo Gao (MIT) and Rupert Li (PRIMES), Compatible Recurrent Identities of the Sandpile Group and Maximal Stable Configurations (18 Nov 2019; arXiv.org, 23 Aug 2020), published in Discrete Applied Mathematics 288 (15 Jan 2021): 123-137

In the abelian sandpile model, recurrent chip configurations are of interest as they are a natural choice of coset representatives under the quotient of the reduced Laplacian. We investigate graphs whose recurrent identities with respect to different sinks are compatible with each other. The maximal stable configuration is the simplest recurrent chip configuration, and graphs whose recurrent identities equal the maximal stable configuration are of particular interest, and are said to have the complete maximal identity property. We prove that given any graph $G$ one can attach trees to the vertices of $G$ to yield a graph with the complete maximal identity property. We conclude with several intriguing conjectures about the complete maximal identity property of various graph products.

187) Andrew Weinfeld, Bases for Quotients of Symmetric Polynomials (arXiv.org, 17 Nov 2019)

We create several families of bases for the symmetric polynomials. From these bases we prove that certain Schur symmetric polynomials form a basis for quotients of symmetric polynomials that generalize the cohomology and the quantum cohomology of the Grassmannian. Our work also provides an alternative proof of a result due to Grinberg.

186) Yuyuan Luo (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Minimal percolating sets for mutating infectious diseases (arXiv.org, 5 Nov 2019), published in *Physical Review Research*, vol. 2 (1 April 2020), featured in the Coronavirus (COVID-19) Collection from Physical Review journals by the American Physical Society

This paper is dedicated to the study of the interaction between dynamical systems and percolation models, with views towards the study of viral infections whose viruses mutate with time. Recall that $r$-bootstrap percolation describes a deterministic process where a vertex of a graph is infected once $r$ of its neighbors are infected. We generalize this by introducing $F(t)$-*bootstrap percolation*, a time-dependent process where the number of neighbouring vertices which need to be infected for a disease to be transmitted is determined by a percolation function $F(t)$ at each time $t$. After studying some of the basic properties of the model, we consider smallest percolating sets and construct a polynomial-time algorithm to find one smallest minimal percolating set on finite trees for certain $F(t)$-bootstrap percolation models.
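
The time-dependent process described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's algorithm for finding percolating sets: updates are taken to be synchronous, time starts at $t = 1$, and the names `f_bootstrap`, `adj`, `seeds`, and `steps` are hypothetical.

```python
def f_bootstrap(adj, seeds, F, steps):
    """Run F(t)-bootstrap percolation for a fixed number of time steps.

    adj   : dict mapping each vertex to a list of its neighbours
    seeds : initially infected vertices
    F     : function giving the infection threshold at time t
    steps : number of synchronous update rounds (t = 1, ..., steps)
    """
    infected = set(seeds)
    for t in range(1, steps + 1):
        threshold = F(t)
        # A healthy vertex becomes infected when at least F(t) of its
        # neighbours were infected at the end of the previous round.
        newly = {
            v for v in adj
            if v not in infected
            and sum(1 for u in adj[v] if u in infected) >= threshold
        }
        infected |= newly
    return infected

# Path graph 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

Taking `F = lambda t: r` for a constant `r` recovers ordinary $r$-bootstrap percolation. Because the threshold may drop at a later time, the sketch runs for a fixed number of rounds rather than stopping at the first round with no new infections.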

185) Christopher Zhu, Enumerating Permutations and Rim Hooks Characterized by Double Descent Sets (arXiv.org, 28 Oct 2019)

Let $dd(I;n)$ denote the number of permutations of $[n]$ with double descent set $I$. For singleton sets $I$, we present a recursive formula for $dd(I;n)$ and a method to estimate $dd(I;n)$. We also discuss the enumeration of certain classes of rim hooks. Let $\mathcal{R}_I(n)$ denote the set of all rim hooks of length $n$ with double descent set $I$, so that any tableau of one of these rim hooks corresponds to a permutation with double descent set $I$. We present a formula for the size of $\mathcal{R}_I(n)$ when $I$ is a singleton set, and we also present a formula for the size of $\mathcal{R}_I(n)$ when $I$ is the empty set. We additionally present several conjectures about the asymptotics of certain ratios of $dd(I;n)$.

184) Nithin Kavi, Cutting and Gluing Surfaces (arXiv.org, 25 Oct 2019)

We start with a disk with $2n$ vertices along its boundary where pairs of vertices are connected with $n$ strips with certain restrictions. This forms a *pairing*. To relate two pairings, we define an operator called a cut-and-glue operation. We show that this operation does not change an invariant of pairings known as the *signature*. Pairings with a signature of $0$ are special because they are closely related to a topological construction through cut-and-glue operations that have other applications in topology. We prove that all balanced pairings for a fixed $n$ are connected on a surface with any number of boundary components. As a topological application, combined with works of Li, this shows that a properly embedded surface induces a well-defined grading on the sutured monopole Floer homology defined by Kronheimer and Mrowka.

2018 Research Papers

183) Alejandro H. Morales (UMass Amherst) and Daniel G. Zhu (PRIMES), On the Okounkov-Olshanski formula for standard tableaux of skew shapes (arXiv.org, 9 Jul 2020); published in FPSAC 2020 *Proceedings of the 32nd Conference on Formal Power Series and Algebraic Combinatorics (Online)*

The classical hook length formula counts the number of standard tableaux of straight shapes. In 1996, Okounkov and Olshanski found a positive formula for the number of standard Young tableaux of a skew shape. We prove various properties of this formula, including three determinantal formulas for the number of nonzero terms, an equivalence between the Okounkov-Olshanski formula and another skew tableaux formula involving Knutson-Tao puzzles, and two $q$-analogues for reverse plane partitions, which complements work by Stanley and Chen for semistandard tableaux. We also give several reformulations of the formula, including two in terms of the excited diagrams appearing in a more recent skew tableaux formula by Naruse. Lastly, for thick zigzag shapes we show that the number of nonzero terms is given by a determinant of the Genocchi numbers and improve on known upper bounds by Morales-Pak-Panova on the number of standard tableaux of these shapes.

182) Alin Tomescu (MIT), Vivek Bhupatiraju (PRIMES), Dimitrios Papadopoulos (Hong Kong University of Science and Technology), Charalampos Papamanthou (University of Maryland, College Park), Nikos Triandopoulos (Stevens Institute of Technology), and Srinivas Devadas (MIT), Transparency Logs via Append-Only Authenticated Dictionaries, published in CCS '19 *Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security*, London, United Kingdom, November 11-15, 2019, pp. 1299-1316.

Transparency logs allow users to audit a potentially malicious service, paving the way towards a more accountable Internet. For example, Certificate Transparency (CT) enables domain owners to audit Certificate Authorities (CAs) and detect impersonation attacks. Yet, to achieve their full potential, transparency logs must be bandwidth-efficient when queried by users. Specifically, everyone should be able to efficiently look up log entries by their key and efficiently verify that the log remains append-only. Unfortunately, without additional trust assumptions, current transparency logs cannot provide both small-sized lookup proofs and small-sized append-only proofs. In fact, one of the proofs always requires bandwidth linear in the size of the log, making it expensive for everyone to query the log. In this paper, we address this gap with a new primitive called an append-only authenticated dictionary (AAD). Our construction is the first to achieve (poly)logarithmic size for both proof types and helps reduce bandwidth consumption in transparency logs. This comes at the cost of increased append times and high memory usage, both of which remain to be improved to make practical deployment possible.

181) Ezra Erives (PRIMES), Srinivasan Sathiamurthy (PRIMES), and Zarathustra Brady (MIT), Asymptotics of $d$-Dimensional Visibility (arXiv.org, 16 Sep 2019)

We consider the space $[0,n]^3$, imagined as a three dimensional, axis-aligned grid world partitioned into $n^3$ $1\times 1 \times 1$ unit cubes. Each cube is either considered to be empty, in which case a line of sight can pass through it, or obstructing, in which case no line of sight can pass through it. From a given position, some of these obstructing cubes block one's view of other obstructing cubes, leading to the following extremal problem: What is the largest number of obstructing cubes that can be simultaneously visible from the surface of an observer cube, over all possible choices of which cubes of $[0,n]^3$ are obstructing? We construct an example of a configuration in which $\Omega\big(n^\frac{8}{3}\big)$ obstructing cubes are visible, and generalize this to an example with $\Omega\big(n^{d-\frac{1}{d}}\big)$ visible obstructing hypercubes for dimension $d>3$. Using Fourier analytic techniques, we prove an $O\big(n^{d-\frac{1}{d}}\log n\big)$ upper bound in a reduced visibility setting.

180) Florian Naef (MIT) and Yuting Qin (PRIMES), The Elliptic Kashiwara-Vergne Lie algebra in low weights (arXiv.org, 7 Aug 2019)

In this paper, we study the elliptic Kashiwara-Vergne Lie Algebra $\mathfrak{krv}$, which is a certain Lie subalgebra of the Lie algebra of derivations of the free Lie algebra in two generators. It has a natural bigrading, such that the Lie bracket is of bidegree $(-1,-1)$. After recalling the graphical interpretation of this Lie algebra, we examine low degree elements of $\mathfrak{krv}$. More precisely, we find that $\mathfrak{krv}^{(2,j)}$ is one-dimensional for even $j$ and zero for odd $j$. We also compute $\operatorname{dim}(\mathfrak{krv})^{(3,m)} = \lfloor\frac{m-1}{2}\rfloor - \lfloor\frac{m-1}{3}\rfloor$. In particular, we show that in those degrees there are no odd elements and also confirm Enriquez' conjecture in those degrees.

179) Vincent Huang (PRIMES) and James Unwin (University of Illinois, Chicago), Markov Chain Models of Refugee Migration Data (arXiv.org, 19 March 2019), published in IMA Journal of Applied Mathematics (2020): 1-21

The application of Markov chains to modelling refugee crises is explored, focusing on local migration of individuals at the level of cities and days. As an explicit example we apply the Markov chain migration model developed here to UNHCR data on the Burundi refugee crisis. We compare our method to a state-of-the-art 'agent-based' model of Burundi refugee movements, and highlight that the Markov chain approaches presented here can improve the match to data while simultaneously being more algorithmically efficient.
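
As a rough illustration of the modelling idea, a Markov chain at the level of locations and days propagates a population vector through a daily transition matrix. The sketch below is a generic toy, not the model from the paper: the three locations, the transition probabilities, and the function names `step` and `simulate` are all made up for the example.

```python
def step(population, P):
    """Advance one day: population[j] <- sum_i population[i] * P[i][j]."""
    n = len(population)
    return [sum(population[i] * P[i][j] for i in range(n)) for j in range(n)]

def simulate(population, P, days):
    """Apply the daily transition matrix P for the given number of days."""
    for _ in range(days):
        population = step(population, P)
    return population

# Hypothetical locations: origin city, transit town, camp (absorbing).
# Row i holds the probabilities of moving from location i in one day.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
start = [1000, 0, 0]
```

With these toy numbers, `simulate(start, P, 2)` moves a quarter of the population into the absorbing camp state while conserving the total, since every row of `P` sums to 1.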

178) Sean Elliott, Anti-Ramsey Type Problems (14 March 2019)

A classical theorem due to Ramsey says the following: Given a finite number of colors and a positive integer p, any edge-coloring of the complete graph $K_n$ will contain a monochromatic copy of $K_p$ as long as n is sufficiently large. A related problem is to consider colorings of $K_n$ for which every copy of $K_4$ uses at least $3$ distinct colors, and ask for the minimum number of colors that can be used to produce such a coloring. Here we present an alternate proof of the best known upper bound, which is $2^{O(\sqrt{\log n})}$. We also consider the problem of covering a regular graph with regular bipartite subgraphs. The motivation for this problem comes from the example of covering $K_n$ with complete bipartite subgraphs, which can be done with $\log_{2} (n)$ many subgraphs. Here we show that with high probability, a random $d$-regular graph with an even number of vertices can be covered with $c {\log d}$ many regular bipartite subgraphs for an absolute constant $c$.
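
The motivating example of covering $K_n$ with about $\log_2 n$ complete bipartite subgraphs has a standard construction via binary labels, sketched below. This is the textbook construction rather than anything specific to the paper, and the function names are hypothetical. Vertex labels are $0, \ldots, n-1$ with $n \ge 2$; for each bit position, the vertices are split according to that bit.

```python
from itertools import combinations

def bipartite_cover(n):
    """Cover K_n (n >= 2) with ceil(log2 n) complete bipartite subgraphs."""
    bits = max(1, (n - 1).bit_length())  # number of bits needed for labels
    cover = []
    for b in range(bits):
        left = [v for v in range(n) if not (v >> b) & 1]
        right = [v for v in range(n) if (v >> b) & 1]
        cover.append((left, right))
    return cover

def covers_all_edges(n, cover):
    """Check that every edge {u, v} of K_n crosses some bipartition."""
    for u, v in combinations(range(n), 2):
        if not any((u in L and v in R) or (u in R and v in L)
                   for L, R in cover):
            return False
    return True
```

Two distinct labels differ in at least one bit, so every edge of $K_n$ lies in the bipartite subgraph for that bit position.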

177) Alan Yan, Asymptotic Counting in Dynamical Systems (4 March 2019)

We consider several dynamically generated sets with certain measurable properties such as the diameters or angles. We define various counting functions on these geometric objects which quantify these properties and explore the asymptotics of these functions. We conjecture that these functions grow like power functions with exponent the dimension of the residual set. The main objects that we examine are Fatou components of the quadratic family and limit sets of Schottky groups. Finally, we provide heuristic algorithms to compute the counting functions in these examples in an attempt to confirm this conjecture.

176) Archer Wang, Hilbert Series of Quasiinvariant Polynomials (19 Feb 2019)

The space of quasiinvariant polynomials generalizes that of symmetric polynomials: under the action of the symmetric group, the polynomials remain invariant to a certain order. We discern the structure and symmetries of quasiinvariant polynomials by way of examining the invariance of relevant polynomial spaces under certain specific group actions. Both pure and computational methods are employed in this pursuit. Felder and Veselov, when studying quasiinvariant polynomials, made a breakthrough discovery in computing their Hilbert series in fields of characteristic 0, and since then, quasiinvariant polynomials have been extensively studied due to their applications in representation theory, algebraic geometry, and mathematical physics. We investigate the Hilbert series of quasiinvariant polynomials that are divisible by a generic homogeneous polynomial. We also continue the previous work regarding their Hilbert series in fields of prime characteristic.

175) Sanjit Bhat (PRIMES), Dimitris Tsipras (MIT), and Aleksander Madry (MIT), Towards Efficient Methods for Training Robust Deep Neural Networks (13 Feb 2019).

In recent years, it has been shown that neural networks are vulnerable to adversarial examples, i.e., specially crafted inputs that look visually similar to humans yet cause machine learning models to make incorrect predictions. A lot of research has been focused on training robust models--models immune to adversarial examples. One such method is Adversarial Training, in which the model continuously trains on adversarially perturbed inputs. However, since these inputs require significant computation time to create, Adversarial Training is often much slower than vanilla training. In this work, we explore two approaches to increase the efficiency of Adversarial Training. First, we study whether faster yet less accurate methods for generating adversarially perturbed inputs suffice to train a robust model. Second, we devise a method for asynchronous parallel Adversarial Training and analyze a phenomenon of independent interest that arises--staleness. Taken together, these two techniques enable comparable robustness on the MNIST dataset to prior art with a 26× reduction in training time from 4 hours to just 9 minutes.

174) Jesse Geneson (Iowa State University), Carl Joshua Quines (PRIMES), Espen Slettnes (PRIMES), Shen-Fu Tsai (Google), Expected capture time and throttling number for cop versus gambler (arXiv.org, 10 Feb 2019)

We bound expected capture time and throttling number for the cop versus gambler game on a connected graph with $n$ vertices, a variant of the cop versus robber game that is played in darkness, where the adversary hops between vertices using a fixed probability distribution. The paper that originally defined the cop versus gambler game focused on two versions, a known gambler whose distribution the cop knows, and an unknown gambler whose distribution is secret. We define a new version of the gambler where the cop makes a fixed number of observations before the lights go out and the game begins. We show that the strategy that gives the best possible expected capture time of $n$ for the known gambler can also be used to achieve nearly the same expected capture time against the observed gambler when the cop makes a sufficiently large number of observations. We also show that even with only a single observation, the cop is able to achieve an expected capture time of approximately $1.5n$, which is much lower than the expected capture time of the best known strategy against the unknown gambler (approximately $1.95n$).

173) John Kuszmaul, Verkle Trees (5 Feb 2019)

We present Verkle Trees, a bandwidth-efficient alternative to Merkle Trees. Merkle Trees are currently employed in a variety of applications in which membership proofs are sent across a network, including consensus protocols, public-key directories, cryptocurrencies such as Bitcoin, and Secure File Systems. A Merkle Tree with n leaves has $O({\log_2 n})$-sized proofs. In large trees, sending the proofs can dominate bandwidth consumption. Vector Commitments (VCs) pose a potential alternative to Merkle Trees, with constant-sized proofs. Unfortunately, VC construction time is $O(n^2)$, which is too large for many applications. We present Verkle Trees, which are constructed similarly to Merkle Trees, but using Vector Commitments rather than cryptographic hash functions. In a Merkle Tree, a parent node is the hash of its children. In a Verkle Tree, a parent node is the Vector Commitment of its children. A Verkle Tree with branching factor k achieves $O(kn)$ construction time and $O({\log_k n})$ membership proof-size. This means that the branching factor, k, offers a tradeoff between computational power and bandwidth. The bandwidth reduction is independent of the depth of the tree; it depends only on the branching factor. We find experimentally that with a branching factor of k = 1024, which provides a factor of 10 reduction in bandwidth, it takes 110.1 milliseconds on average per leaf to construct a Verkle Tree with $2^{14}$ leaves. A branching factor of k = 32, which provides a bandwidth reduction factor of 5, yields a construction time of 8.4 milliseconds on average per leaf for a tree with $2^{14}$ leaves. (The performance on a tree with $2^{14}$ leaves is representative of larger trees because the asymptotics already dominate the computation costs.) My role in this research project has been proving the time complexities of Verkle Trees, implementing Verkle Trees, and testing and benchmarking the implementation.
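
The bandwidth factors quoted above can be checked by comparing tree depths. The sketch below is an illustration only, not the author's implementation: `tree_depth` and `depth_reduction` are hypothetical helpers, and the sketch assumes a membership proof carries one constant-size element per tree level, so that proof size is proportional to depth.

```python
def tree_depth(num_leaves: int, branching: int) -> int:
    """Smallest depth d with branching**d >= num_leaves (integer arithmetic only)."""
    if num_leaves < 1 or branching < 2:
        raise ValueError("need num_leaves >= 1 and branching >= 2")
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth


def depth_reduction(num_leaves: int, branching: int) -> float:
    """Ratio of binary-tree depth to k-ary-tree depth for the same leaf count."""
    if num_leaves < 2:
        raise ValueError("need at least two leaves to compare depths")
    return tree_depth(num_leaves, 2) / tree_depth(num_leaves, branching)
```

For $n = 2^{20}$ leaves, `depth_reduction` gives 10 for $k = 1024$ and 5 for $k = 32$, which equals $\log_2 k$ in both cases. For leaf counts that are not exact powers of $k$ the ratio is only approximate.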

172) Andrew Ahn (MIT), Gopal Goel (PRIMES), and Andrew Yao (PRIMES), Derivative Asymptotics of Uniform Gelfand-Tsetlin Patterns (1 Feb 2019)

Bufetov and Gorin introduced the idea of applying differential operators which are diagonalized by the Schur functions to Schur generating functions, a generalization of probability generating functions to particle systems. This technology allowed the authors to access asymptotics of a variety of particle systems. We use this technique to analyze uniformly distributed Gelfand-Tsetlin patterns where the top row is fixed. In particular, we obtain limiting moments for the difference of empirical measure for two adjacent rows in uniformly random Gelfand-Tsetlin patterns.

171) William Fisher, Polynomial Wolff Axioms and Multilinear Kakeya-type Estimates for Bent Tubes in $R^n$ (31 Jan 2019)

In this paper we consider the applicability of Guth and Zahl's polynomial Wolff axioms to bent tubes. We demonstrate that Guth and Zahl's multilinear bounds hold for tubes defined by low degree algebraic curves with bounded $C^2$-norms. To show this we give an exposition of their proof in an *n*-dimensional, *k*-linear context. In considering the ability to obtain linear bounds using the multilinear bounds we utilize the strategy of Guth and Bourgain. We find that the multilinear bounds obtained from Guth and Zahl's technique break the inductive structure of this process and thus provide inferior bounds to the endpoint cases of Bennett, Carbery, and Tao's multilinear bounds. We discuss future research directions, which could eventually remedy this, that improve multilinear bounds by adding the assumption that the collection of tubes lie near a *k*-plane.

170) Rinni Bhansali (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), A Trust Model in Bootstrap Percolation (21 Jan 2019; arXiv.org, 23 May 2019), published in the *Proceedings of the Royal Society A*, vol. 476, no. 2235 (1 March 2020)

Bootstrap percolation is a class of monotone cellular automata describing an activation process which follows certain activation rules. In particular, in the classical *r*-neighbor bootstrap process on a graph *G*, a set *A* of initially infected vertices spreads by infecting vertices with at least *r* already-infected neighbors. Motivated by the study of social networks and biological interactions through graphs, where vertices represent people and edges represent the relations amongst them, we introduce here a novel model which we name *T-bootstrap percolation* (*T*-BP). In this new model, vertices of the graph *G* are assigned random labels, and the set of initially infected vertices spreads by infecting (at each time step) vertices with at least a fixed number of already-infected neighbors of each label. The Trust Model for Bootstrap Percolation allows one to impose a preset level of skepticism towards a rumor, as it requires a rumor to be validated by numerous groups in order for it to spread, hence imposing a predetermined level of trust needed for the rumor to spread. By considering different random and non-random networks, we describe various properties of this new model (e.g., the critical probability of infection and the confidence threshold), and compare it to other types of bootstrap percolation from the literature, such as *U*-bootstrap percolation. Ultimately, we describe its implications when applied to rumor spread, fake news, and marketing strategies, along with potential future applications in modeling the spread of genetic diseases.
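
As a point of reference, the classical *r*-neighbor process described above can be sketched in a few lines. This is a minimal illustration of the classical rule only; it does not implement the labelled *T*-BP model, and the function name and example graphs are made up for this sketch.

```python
def r_neighbor_bootstrap(adjacency, initially_infected, r):
    """Run the classical r-neighbor bootstrap process until no vertex changes.

    adjacency: dict mapping each vertex to an iterable of its neighbors.
    Returns the final set of infected vertices.
    """
    infected = set(initially_infected)
    while True:
        # A healthy vertex becomes infected once it has at least r infected neighbors.
        newly_infected = {
            v for v, neighbors in adjacency.items()
            if v not in infected
            and sum(1 for u in neighbors if u in infected) >= r
        }
        if not newly_infected:
            return infected
        infected |= newly_infected


# Example graphs (made up for illustration): a path on 5 vertices and a 4-cycle.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

With `r = 1` a single infected endpoint eventually infects the whole path, while with `r = 2` it infects nothing further. On the 4-cycle, two opposite infected vertices suffice to infect everything when `r = 2`.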

169) Stanley Wang, Connectedness of the Moduli Space of Genus 1 Planar Tropical Curves (arXiv.org, 12 Jan 2019)

Tropical geometry is a relatively recent field in mathematics created as a simplified model for certain problems in algebraic geometry. We introduce the definition of abstract and planar tropical curves as well as their properties, including combinatorial type and degree. We also talk about the moduli space, a geometric object that parameterizes all possible types of abstract or planar tropical curves subject to certain conditions. Our research focuses on the moduli spaces of planar tropical curves of genus one, arbitrary degree d and any number of marked, unbounded edges. We prove that these moduli spaces are connected.

168) Aayush Karan, Generating Set for Nonzero Determinant Links Under Skein Relation (arXiv.org, 6 Jan 2019), published in Topology and its Applications, vol. 265 (15 September 2019)

Traditionally introduced in terms of advanced topological constructions, many link invariants may also be defined in much simpler terms given their values on a few initial links and a recursive formula on a skein triangle. Then the crucial question to ask is how many initial values are necessary to completely determine such a link invariant. We focus on a specific class of invariants known as nonzero determinant link invariants, defined only for links which do not evaluate to zero on the link determinant. We restate our objective by considering a set $\mathcal{S}$ of links subject to the condition that if any three nonzero determinant links belong to a skein triangle, any two of these belonging to $\mathcal{S}$ implies that the third also belongs to $\mathcal{S}$. Then we aim to determine a minimal set of initial generators so that $\mathcal{S}$ is the set of all links with nonzero determinant. We show that only the unknot is required as a generator if the skein triangle is unoriented. For oriented skein triangles, we show that the unknot and Hopf link orientations form a set of generators.

167) Jiwon Choi, Gromov-Hausdorff Distance Between Metric Graphs (2 Jan 2019)

In this paper we study the Gromov-Hausdorff distance between two metric graphs. We compute the precise value of the Gromov-Hausdorff distance between two path graphs. Moreover, we compute the precise value of the Gromov-Hausdorff distance between a cycle graph and a tree. Given a graph *X*, we consider a graph *Y* that results from adding an edge to *X* without changing the number of vertices. We compute the precise value of the Gromov-Hausdorff distance between *X* and *Y*.

166) Kaiying Hou, Agent-based Models for Conservation Equations (31 Dec 2018)

In this research, we use agent-based models to solve conservation equations. A conservation equation is a partial differential equation that describes any conserved quantity by establishing a relationship between the density and the flux. It is used in areas such as traffic flow and fluid dynamics. Past research on numerically solving conservation equations mainly tackles the problem by establishing discrete cells in the space and approximating the densities in the cells. In this research, we use an agent-based model, in which we describe the solution through the movement of particles in the space. We propose an agent-based model for conservation equations in 1-D space. We found a change of variables that transforms the original conservation equation to the specific volume conservation equation. This transform allows us to apply results from the finite volume method to the agent-based model and find a condition for the agent-based solution to converge to the exact solution of scalar conservation equations.

165) Andy Xu, Approximating the Hurwitz Zeta Function (22 Dec 2018)

This project aims to implement a MATLAB function that approximates the Hurwitz zeta function $\zeta(s, a)$. This is necessary because the naive implementation fails for certain input near critical values for $s$ and for $a$. Other series representations of the Hurwitz zeta function converge rapidly but do not handle complex values of $s$ and/or $a$. We also consider existing forms for the Hurwitz zeta function, including one given by Bailey and Borwein, and evaluate their overall performance.
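
For orientation, a direct evaluation of the defining series $\zeta(s, a) = \sum_{n \ge 0} (n + a)^{-s}$ might look like the sketch below. It is written in Python rather than MATLAB, it is not the project's implementation, and it assumes real $s > 1$ and $a > 0$ only; the function name and the choice of a three-term Euler-Maclaurin tail estimate are assumptions of this sketch.

```python
import math


def hurwitz_zeta_real(s: float, a: float, terms: int = 1000) -> float:
    """Approximate zeta(s, a) = sum_{n>=0} (n + a)**(-s) for real s > 1, a > 0."""
    if s <= 1.0 or a <= 0.0:
        raise ValueError("this sketch only handles real s > 1 and a > 0")
    # Sum the first `terms` terms of the series directly.
    head = sum((n + a) ** (-s) for n in range(terms))
    x = terms + a
    # Euler-Maclaurin estimate of the tail sum_{n>=terms} (n + a)**(-s):
    # integral term, half of the first omitted term, first derivative correction.
    tail = x ** (1.0 - s) / (s - 1.0) + 0.5 * x ** (-s) + s * x ** (-s - 1.0) / 12.0
    return head + tail
```

A quick sanity check is `hurwitz_zeta_real(2.0, 1.0)`, which should agree with $\pi^2/6$ to many digits. The restriction to $s > 1$ is exactly where a direct sum is safe; inputs outside that range need the more careful representations the abstract refers to.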

164) Allen Wang (PRIMES) and Guangyi Yue (MIT), Relationship Between Mullineux Involution and the Generalized Regularization (arXiv.org, 19 Dec 2018), published in *European Journal of Combinatorics* 85 (March 2020)

The Mullineux involution is an important map on $p$-regular partitions that originates from the modular representation theory of $\mathcal{S}_n$. In this paper we study the Mullineux transpose map and the generalized column regularization and prove a condition under which the two maps are exactly the same. Our results generalize the work of Bessenrodt, Olsson and Xu, and the combinatorial constructions are related to the Iwahori-Hecke algebra and the global crystal basis of the basic $U_q(\widehat{\mathfrak{sl}}_b)$-module. In the conclusion, we provide several conjectures regarding the $q$-decomposition numbers and generalizations of results due to Fayers.

163) Maximillian Guo, Behavior of Bar-Natan Homology under Conway Mutation (18 Dec 2018)

The Bar-Natan homology is a perturbation of the Khovanov homology of a knot. Previous work has shown that Khovanov homology remains unchanged under Conway mutation of the knot diagram. We give an exact triangle with three different resolutions of a link and prove several lemmas relating the dimensions of different Bar-Natan chain complexes and homologies. These allow us to prove that the dimension of the Bar-Natan homology $BN^k (L; \mathbb{Z}/2\mathbb{Z})$ is invariant under Conway mutation.

162) Nithin Kavi (PRIMES), Wendy Wu (PRIMES), and Zhenkun Li (MIT), Trunk of Satellite and Companion Knots (arXiv.org, 8 Dec 2018), published in Topology and its Applications, vol. 272 (1 March 2020)

We study the knot invariant called trunk, as defined by Ozawa, and the relation of the trunk of a satellite knot with the trunk of its companion knot. Our first result is ${\rm trunk}(K) \geq n \cdot {\rm trunk}(J)$ where ${\rm trunk}(\cdot)$ denotes the trunk of a knot, $K$ is a satellite knot with companion $J$, and $n$ is the winding number of $K$. To upgrade winding number to wrapping number, which we denote by $m$, we must include an extra factor of $\frac{1}{2}$ in our second result ${\rm trunk}(K) > \frac{1}{2} m \cdot {\rm trunk}(J)$ since $m \geq n$. We also discuss generalizations of the second result.

161) Merrick Cai (PRIMES) and Daniil Kalinov (MIT), The Hilbert Series of the Irreducible Quotient of the Polynomial Representation of the Rational Cherednik Algebra of Type $A_{n-1}$ in Characteristic $p$ for $p|n-1$ (arXiv.org, 12 Nov 2018)

We study the irreducible quotient $\mathcal{L}_{t,c}$ of the polynomial representation of the rational Cherednik algebra $\mathcal{H}_{t,c}(S_n,\mathfrak{h})$ of type $A_{n-1}$ over an algebraically closed field of positive characteristic $p$ where $p|n-1$. In the $t=0$ case, for all $c\ne 0$ we give a complete description of the polynomials in the maximal proper graded submodule $\ker \mathcal{B}$, the kernel of the contravariant form $\mathcal{B}$, and subsequently find the Hilbert series of the irreducible quotient $\mathcal{L}_{0,c}$. In the $t=1$ case, we give a complete description of the polynomials in $\ker \mathcal{B}$ when the characteristic $p=2$ and $c$ is transcendental over $\mathbb{F}_2$, and compute the Hilbert series of the irreducible quotient $\mathcal{L}_{1,c}$. In doing so, we prove a conjecture due to Etingof and Rains completely for $p=2$, and also for any $t=0$ and $n\equiv 1\pmod{p}$. Furthermore, for $t=1$, we prove a simple criterion to determine whether a given polynomial $f$ lies in $\ker \mathcal{B}$ for all $n=kp+r$ with $r$ and $p$ fixed.

160) Tanya Khovanova (MIT) and Eric Zhang (PRIMES), On 3-Inflatable Permutations (arXiv.org, 22 Sept 2018), published in The Electronic Journal of Combinatorics 28:1 (2021)

Call a permutation $k$-inflatable if it can be "blown up" into a convergent sequence of permutations by a uniform inflation construction, such that this sequence is symmetric with respect to densities of induced subpermutations of length $k$. We study properties of 3-inflatable permutations, finding a general formula for limit densities of pattern permutations in the uniform inflation of a given permutation. We also characterize and find examples of $3$-inflatable permutations of various lengths, including the shortest examples with length $17$.
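
The densities of induced subpermutations mentioned above can be computed by brute force for a small permutation. The sketch below only illustrates that notion of density for a finite permutation; it does not perform the inflation construction or take limits, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pattern_of(values):
    """Relabel a sequence of distinct numbers to its order pattern, e.g. (5, 9, 2) -> (2, 3, 1)."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(values))}
    return tuple(ranks[v] for v in values)


def pattern_densities(perm, k):
    """Fraction of the k-element subsequences of perm that realize each pattern of length k."""
    if k < 1 or k > len(perm):
        raise ValueError("need 1 <= k <= len(perm)")
    # combinations() keeps elements in their original left-to-right order.
    counts = Counter(pattern_of(sub) for sub in combinations(perm, k))
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}
```

For the permutation `[1, 3, 2, 4]` and `k = 3`, the four subsequences give pattern 123 twice and patterns 132 and 213 once each, so the densities are 1/2, 1/4 and 1/4.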

159) Sathwik Karnik, Bounds on the Maximal Cardinality of an Acute Set in a Hypercube (7 Sept 2018)

The acute set problem asks the following question: what is the maximal cardinality of a $d$-dimensional set of points such that all angles formed between any three points are acute? In this paper, we consider an analogous problem with the condition that the acute set is a subset of a $d$-dimensional unit hypercube. We provide an explicit construction and proof to show that a lower bound for the maximum cardinality of an acute set in $\{0,1\}^d$ is $2^{2^{\lfloor \log_3 d \rfloor}}$. Using a similar construction, we improve this lower bound to $2^{d/3}$. Through a consideration of points diagonally opposite a particular point on 2-faces, we improve the upper bound to $\left(1 + \dfrac{2}{d}\right)\cdot 2^{d-2}$. We then seek to generalize these findings and a combinatorial interpretation of the problem in $\{0,1\}^d$.
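
The defining condition is easy to test by brute force on small point sets: the angle at a vertex $b$ between points $a$ and $c$ is acute exactly when $(a - b) \cdot (c - b) > 0$. The sketch below is an illustrative checker, not the paper's construction; the function name is hypothetical and the points are assumed to be distinct.

```python
from itertools import permutations


def is_acute_set(points):
    """Return True if every angle determined by three distinct points is strictly acute."""
    for a, b, c in permutations(points, 3):
        # The angle at vertex b is acute exactly when (a - b) . (c - b) > 0.
        dot = sum((ai - bi) * (ci - bi) for ai, bi, ci in zip(a, b, c))
        if dot <= 0:
            return False
    return True


# Four vertices of the unit cube forming a regular tetrahedron.
tetrahedron = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

The four cube vertices in `tetrahedron` form an acute set in $\{0,1\}^3$, which matches the value $2^{2^{\lfloor \log_3 3 \rfloor}} = 4$ of the first lower bound at $d = 3$. Adding the vertex $(1, 1, 1)$ creates a right angle, so the enlarged set fails the check.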

158) Vincent Bian, Special Configurations in Anchored Rectangle Packings (arXiv.org, 6 Sept 2018)

Given a finite set $S$ in $[0,1]^2$ including the origin, an anchored rectangle packing is a set of non-overlapping rectangles in the unit square where each rectangle has a point of $S$ as its left-bottom corner and contains no point of $S$ in its interior. Allen Freedman conjectured in the 1960s that one can always find an anchored rectangle packing with total area at least $1/2$. We verify the conjecture for point configurations whose relative positions belong to certain classes of permutations.

157) Tanya Khovanova (MIT) and Wayne Zhao (PRIMES), Mathematics of a Sudo-Kurve (arXiv.org, 20 Aug 2018), published in Recreational Mathematics Magazine, no. 10 (2018): 5-27.

We investigate a type of Sudoku variant called Sudo-Kurve, which allows bent rows and columns, and develop a new, yet equivalent, variant we call a Sudo-Cube. We examine the total number of distinct solution grids for this type with or without symmetry. We study other mathematical aspects of this puzzle along with the minimum number of clues needed and the number of ways to place individual symbols.

156) Vinjai Vale, A new paradigm for computer vision based on compositional representation (14 May 2018)

Deep convolutional neural networks - the state-of-the-art technique in artificial intelligence for computer vision - achieve notable success rates at simple classification tasks, but are fundamentally lacking when it comes to representation. These neural networks encode fuzzy textural patterns into vast matrices of numbers which lack the semantically structured nature of human representations (e.g. "a table is a flat horizontal surface supported by an arrangement of identical legs"). This paper takes multiple important steps towards filling in these gaps. I first propose a series of tractable milestone problems set in the abstract two-dimensional ShapeWorld, thus isolating the challenge of object compositionality. Then I demonstrate the effectiveness of a new compositional representation approach based on identifying structure among the primitive elements comprising an image and representing this structure through an augmented primitive element tree and coincidence list. My approach outperforms Google's state-of-the-art Inception-v3 Convolutional Neural Network in accuracy, speed, and structural representation in my object representation milestone tasks. Finally, I present a mathematical framework for a probabilistic programming approach that can learn highly structured generative stochastic representations of compositional objects from just a handful of examples. This work is foundational for the future of general computer vision, and its applications are wide-reaching, ranging from autonomous vehicles to intelligent robotics to augmented and virtual reality.

155) Andrew Gritsevskiy (PRIMES) and Maksym Korablyov (MIT), Capsule networks for low-data transfer learning (arXiv.org, 26 Apr 2018)

We propose a capsule network-based architecture for generalizing learning to new data with few examples. Using both generative and non-generative capsule networks with intermediate routing, we are able to generalize to new information over 25 times faster than a similar convolutional neural network. We train the networks on the multiMNIST dataset lacking one digit. After the networks reach their maximum accuracy, we inject 1-100 examples of the missing digit into the training set, and measure the number of batches needed to return to a comparable level of accuracy. We then discuss the improvement in low-data transfer learning that capsule networks bring, and propose future directions for capsule research.

2017 Research Papers

154) Tanya Khovanova (MIT) and Joshua Lee (PRIMES), The 5-Way Scale (8 Mar 2019)

In this paper, we discuss coin-weighing problems that use a 5-way scale which has five different possible outcomes: MUCH LESS, LESS, EQUAL, MORE, and MUCH MORE. The 5-way scale provides more information than the regular 3-way scale. We study the problem of finding two fake coins from a pile of identically looking coins in a minimal number of weighings using a 5-way scale. We discuss similarities and differences between the 5-way and 3-way scale. We introduce a strategy for a 5-way scale that can find both counterfeit coins among $2^k$ coins in $k+1$ weighings, which is better than any strategy for a 3-way scale.

153) Grace Tian, Multi-Crossing Numbers for Knots (26 Jan 2019)

We study the projections of a knot *K* that have only *n*-crossings. The *n*-crossing number of *K* is the minimum number of *n*-crossings among all possible projections of *K* with only *n*-crossings. We obtain new results on the relation between *n*-crossing number and (2*n* − 1)-crossing number for every positive even integer *n*.

152) David Lu (PRIMES), Sanjit Bhat (PRIMES), Albert Kwon (MIT), and Srinivas Devadas (MIT), DynaFlow: An Efficient Website Fingerprinting Defense Based on Dynamically-Adjusting Flows (15 Oct 2018), published in *Proceedings of the 2018 Workshop on Privacy in the Electronic Society* (WPES 2018), pp. 109-113.

Website fingerprinting attacks enable a local adversary to determine which website a Tor user visits. In recent years, several researchers have proposed defenses to counter these attacks. However, these defenses have shortcomings: many do not provide formal guarantees of security, incur high latency and bandwidth overheads, and require a frequently-updated database of website traffic patterns. In this work, we introduce a new countermeasure, DynaFlow, based on dynamically-adjusting flows to protect against website fingerprinting. DynaFlow provides a similar level of security as current state-of-the-art while being over $40\%$ more efficient. At the same time, DynaFlow does not require a pre-established database and extends protection to dynamically-generated websites.

151) Mihir Singhal (PRIMES) and Christopher Ryba (MIT), Generalizations of Hall-Littlewood Polynomials (24 Sept 2018)

Hall-Littlewood polynomials are important functions in various fields of mathematics and quantum physics, and can be defined combinatorially using a model of path ensembles. Wheeler and Zinn-Justin applied a reflection construction to this model to obtain an expression for type *BC* Hall-Littlewood polynomials. Borodin applied a single-parameter deformation to the model and obtained a formula for generalized Hall-Littlewood polynomials. Borodin has asked whether a similar generalization could be applied to type *BC* Hall-Littlewood polynomials. We present the model incorporating Borodin's generalization. We also obtain expressions for polynomials that were previously studied by Borodin, in addition to an expression for generalized type *BC* Hall-Littlewood polynomials.

150) Gopal Goel (PRIMES) and Andrew Ahn (MIT), Discrete Derivative Asymptotics of the $\beta$-Hermite Eigenvalues (arXiv.org, 18 Sept 2018), published in Combinatorics, Probability and Computing (17 April 2019)

We consider the asymptotics of the difference between the empirical measures of the $\beta$-Hermite tridiagonal matrix and its minor. We prove that this difference has a deterministic limit and Gaussian fluctuations. Through a correspondence between measures and continual Young diagrams, this deterministic limit is identified with the Vershik-Kerov-Logan-Shepp curve. Moreover, the Gaussian fluctuations are identified with a sectional derivative of the Gaussian free field.

149) Franklyn Wang, Monodromy Groups of Indecomposable Rational Functions (10 Sept 2018)

The most important geometric invariant of a degree-$n$ complex rational function $f(X)$ is its *monodromy group*, which is a set of permutations of $n$ objects. This monodromy group determines several properties of $f(X)$. A fundamental problem is to classify all degree-$n$ rational functions which have special behavior, meaning that their monodromy group $G$ is not one of the two "typical" groups, namely $A_n$ or $S_n$. Many mathematicians have studied this problem, including Oscar Zariski, John Thompson, Robert Guralnick, and Michael Aschbacher. In this paper we bring this problem near completion by solving it when $G$ is in any of the classes of groups which previously seemed intractable. We introduce new techniques combining methods from algebraic geometry, Galois theory, group theory, representation theory, and combinatorics. The classification of rational functions with special behavior will have many consequences, including far-reaching generalizations of Mazur's theorem on uniform boundedness of rational torsion on elliptic curves and Nevanlinna's theorem on uniqueness of meromorphic functions with prescribed preimages of five points. This improved understanding of rational functions has potential significance in various fields of science and engineering where rational functions arise.

148) Michael Ma, New Results on Pattern-Replacement Equivalences: Generalizing a Classical Theorem and Revising a Recent Conjecture (6 Sept 2018)

In this paper we study pattern-replacement equivalence relations on the set $S_n$ of permutations of length $n$. Each equivalence relation is determined by a set of patterns, and equivalent permutations are connected by pattern-replacements in a manner similar to that of the Knuth relation. One of our main results generalizes the celebrated Erdős-Szekeres Theorem for permutation pattern-avoidance to a new result for permutation pattern-replacement. In particular, we show that under the $\{123\cdots k, k\cdots 321\}$-equivalence, all permutations in $S_n$ are equivalent up to parity when $n \geq \Omega(k^2)$. Additionally, we extend the work of Kuszmaul and Zhou on an infinite family of pattern-replacement equivalences known as the rotational equivalences. Kuszmaul and Zhou proved that the rotational equivalences always yield either one or two nontrivial equivalence classes in $S_n$, and conjectured that the number of nontrivial classes depended only on the patterns involved in the rotational equivalence (rather than on $n$). We present a counterexample to their conjecture, and prove a new theorem fully classifying (for large $n$) when there is one nontrivial equivalence class and when there are two nontrivial equivalence classes. Finally, we computationally analyze the pattern-replacement equivalences given by sets of pairs of patterns of length four. We then focus on three cases, in which the number of nontrivial equivalence classes matches an OEIS sequence. For two of these we present full proofs of the enumeration and for the third we suggest a potential future method of proof.

147) Kyle Gatesman (PRIMES), James Unwin (University of Illinois at Chicago), Lattice Studies of Gerrymandering Strategies (arXiv.org, 8 Aug 2018)

We propose three novel gerrymandering algorithms which incorporate the spatial distribution of voters with the aim of constructing gerrymandered, equal-population, connected districts. Moreover, we develop lattice models of voter distributions, based on analogies to electrostatic potentials, in order to compare different gerrymandering strategies. Due to the probabilistic population fluctuations inherent to our voter models, Monte Carlo methods can be applied to the districts constructed via our gerrymandering algorithms. Through Monte Carlo studies we quantify the effectiveness of each of our gerrymandering algorithms and we also argue that gerrymandering strategies which do not include spatial data lead to (legally prohibited) highly disconnected districts. Of the three algorithms we propose, two are based on different strategies for packing opposition voters, and the third is a new approach to algorithmic gerrymandering based on genetic algorithms, which automatically guarantees that all districts are connected. Furthermore, we use our lattice voter model to examine the effectiveness of isoperimetric quotient tests and our results provide further quantitative support for implementing compactness tests in real-world political redistricting.

146) William Zhang, Improved bounds on the extremal function of hypergraphs (arXiv.org, 5 Jul 2018)

A fundamental problem in pattern avoidance is describing the asymptotic behavior of the extremal function and its generalizations. We prove an equivalence between the asymptotics of the graph extremal function for a class of bipartite graphs and the asymptotics of the matrix extremal function. We use the equivalence to prove several new bounds on the extremal functions of graphs. We develop a new method to bound the extremal function of hypergraphs in terms of the extremal function of their associated multidimensional matrices, improving the bound of the extremal function of $d$-permutation hypergraphs of length $k$ from $O(n^{d-1})$ to $2^{O(k)}n^{d-1}$.

145) P. A. Crowdmath, The Broken Stick Project (arXiv.org, 16 May 2018)

The broken stick problem is the following classical question. You have a segment $[0,1]$. You choose two points on this segment at random. They divide the segment into three smaller segments. Show that the probability that the three segments form a triangle is $1/4$.

The MIT PRIMES program, together with Art of Problem Solving, organized a high school research project where participants worked on several variations of this problem. Participants were generally high school students who posted ideas and progress to the Art of Problem Solving forums over the course of an entire year, under the supervision of PRIMES mentors. This report summarizes the findings of this CrowdMath project.
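
The classical value $1/4$ can be checked numerically. Three lengths that sum to 1 form a triangle exactly when each is shorter than $1/2$, so a simulation only needs to test the longest piece. The sketch below is an illustrative Monte Carlo estimate, not part of the CrowdMath report; the function name, sample count, and seed are arbitrary choices.

```python
import random


def triangle_probability(samples: int, seed: int = 0) -> float:
    """Estimate the chance that two uniform cuts of [0, 1] give three triangle sides."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = sorted((rng.random(), rng.random()))
        pieces = (x, y - x, 1.0 - y)
        # Three lengths summing to 1 form a triangle iff each is below 1/2.
        if max(pieces) < 0.5:
            hits += 1
    return hits / samples
```

With a few hundred thousand samples the estimate should settle close to 0.25, with the usual Monte Carlo error of order $1/\sqrt{\text{samples}}$.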

144) Aaron Kaufer, Superalgebra in characteristic 2 (arXiv.org, 3 Apr 2018)

Following the work of Siddharth Venkatesh, we study the category $\textbf{sVec}_2$. This category is a proposed candidate for the category of supervector spaces over fields of characteristic $2$ (as the ordinary notion of a supervector space does not make sense in characteristic $2$). In particular, we study commutative algebras in $\textbf{sVec}_2$, known as $d$-algebras, which are ordinary associative algebras $A$ together with a linear derivation $d:A \to A$ satisfying the twisted commutativity rule: $ab = ba + d(b)d(a)$. In this paper, we generalize many results from standard commutative algebra to the setting of $d$-algebras; most notably, we give two proofs of the statement that Artinian $d$-algebras may be decomposed as a direct product of local $d$-algebras. In addition, we show that there exist no noncommutative $d$-algebras of dimension $\leq 7$, and that up to isomorphism there exists exactly one $d$-algebra of dimension $7$. Finally, we give the notion of a Lie algebra in the category $\textbf{sVec}_2$, and we state and prove the Poincare-Birkhoff-Witt theorem for this category.

143) Kaiying Hou and Brian Rhee, Continuum Modelling of Traffic Systems with Autonomous Vehicles (17 Mar 2018)

Describing the behavior of automobile traffic via mathematical modeling and computer simulation has been a field of study conducted by mathematicians throughout the last century. One of the oldest models in traffic flow theory casts the problem in terms of densities and fluxes in partial differential conservation laws. In the past few years, the rise of autonomous vehicles (driven by software without human intervention) presents a new problem for classical traffic modeling. Autonomous vehicles react very differently from the traditional human-driven vehicles, resulting in modifications to the underlying partial differential equation constitutive laws. In this paper, we aim to provide insight into some new proposed constitutive laws by using continuum modelling to study traffic flows with a mix of human and autonomous vehicles. We also introduce various existing traffic flow models and present a new model for traffic flow that is based on an interaction between human drivers and autonomous vehicles where each vehicle can only measure the total density of surrounding cars, regardless of human or autonomous status. By implementing the Lax-Friedrichs scheme in Octave, we test how these different constitutive laws perform in our model and analyze the density curves that form over time steps. We also analytically derive and implement a Roe solver for a class of coupled conservation equations in which the velocities of cars are polynomial functions of the total density of surrounding cars regardless of type. We hope that our results could help civil engineers bring forth real progress in implementing efficient road systems that integrate both human-operated and unmanned vehicles.
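
To make the numerical method concrete, a single-class Lax-Friedrichs update for a scalar traffic conservation law $\rho_t + q(\rho)_x = 0$ can be sketched as follows. This is written in Python rather than Octave and is not the authors' code: the Greenshields flux $q(\rho) = \rho(1 - \rho)$, the periodic road, the initial densities, and the step ratio are all assumptions chosen for illustration, and the mixed human/autonomous models of the paper are not represented.

```python
def greenshields_flux(density: float) -> float:
    """Flux q = rho * (1 - rho): an assumed textbook law, not one taken from the paper."""
    return density * (1.0 - density)


def lax_friedrichs_step(density, dt_over_dx, flux=greenshields_flux):
    """Advance cell densities by one Lax-Friedrichs step on a periodic road."""
    n = len(density)
    updated = []
    for i in range(n):
        # Index -1 wraps to the last cell, giving periodic boundaries.
        left, right = density[i - 1], density[(i + 1) % n]
        updated.append(
            0.5 * (left + right) - 0.5 * dt_over_dx * (flux(right) - flux(left))
        )
    return updated


# Hypothetical initial condition: a dense platoon on an otherwise light road.
road = [0.8 if 20 <= i < 40 else 0.2 for i in range(100)]
state = road
for _ in range(50):
    state = lax_friedrichs_step(state, dt_over_dx=0.5)
```

The ratio `dt_over_dx = 0.5` respects the CFL condition for this flux, since $|q'(\rho)| = |1 - 2\rho| \le 1$ for densities in $[0, 1]$. On a periodic road the scheme conserves the total number of cars, which is a convenient check when swapping in other constitutive laws.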

142) Michael Gintz, Classifying Graph Lie Algebras (14 Mar 2018)

A Lie algebra is a linear object which has a powerful homomorphism with a Lie group, an important object in differential geometry. In previous work a construction is given that builds a Lie algebra on a Dynkin diagram, a commonly studied structure in Lie theory. We expand this definition to construct a Lie algebra given any simple graph, and consider the problem of determining its structure. We begin by defining an alteration on a graph which preserves its underlying graph Lie algebra structure, and use it to simplify the general graph. We then provide a decomposition move which further simplifies the Lie algebra structure of the general graph. Finally, we combine these two moves to classify all graph Lie algebras.

141) Sanjit Bhat (PRIMES), David Lu (PRIMES), Albert Kwon (MIT), and Srinivas Devadas (MIT), Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning (arXiv.org, 28 Feb 2018), published in Proceedings on Privacy Enhancing Technologies (PETS 2019) (4): 292-310.

In recent years, there have been several works that use website fingerprinting techniques to enable a local adversary to determine which website a Tor user visits. While the current state-of-the-art attack, which uses deep learning, outperforms prior art with medium to large amounts of data, it attains marginal to no accuracy improvements when both use small amounts of training data. In this work, we propose Var-CNN, a website fingerprinting attack that leverages deep learning techniques along with novel insights specific to packet sequence classification. In open-world settings with large amounts of data, Var-CNN attains over $1\%$ higher true positive rate (TPR) than state-of-the-art attacks while achieving $4\times$ lower false positive rate (FPR). Var-CNN's improvements are especially notable in low-data scenarios, where it reduces the FPR of prior art by $3.12\%$ while increasing the TPR by $13\%$. Overall, insights used to develop Var-CNN can be applied to future deep learning based attacks, and substantially reduce the amount of training data needed to perform a successful website fingerprinting attack. This shortens the time needed for data collection and lowers the likelihood of having data staleness issues.

140) Richard Xu, Algebraicity regarding Graphs and Tilings (27 Jan 2018)

Given a planar graph *G*, we prove that there exists a tiling of a rectangle by squares such that each square corresponds to a face of the graph and the side lengths of the squares solve an extremal problem on the graph. Furthermore, we provide a practical algorithm for calculating the side lengths. Finally, we strengthen our theorem by restricting the centers and side lengths of the squares to algebraic numbers and explore the application of our technique in proving algebraicity in packing problems.

139) Anlin Zhang (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Modelling epidemics on *d*-cliqued graphs, published in *Letters in Biomathematics* 5:1 (Jan 16, 2018)

Since social interactions have been shown to lead to symmetric clusters, we propose here that symmetries play a key role in epidemic modelling. Mathematical models on *d*-ary tree graphs were recently shown to be particularly effective for modelling epidemics in simple networks. To account for symmetric relations, we generalize this to a new type of networks modelled on *d*-cliqued tree graphs, which are obtained by adding edges to regular *d*-trees to form *d*-cliques. This setting gives a more realistic model for epidemic outbreaks originating within a family or classroom and which could reach a population by transmission via children in schools. Specifically, we quantify how an infection starting in a clique (e.g. family) can reach other cliques through the body of the graph (e.g. public places). Moreover, we propose and study the notion of a *safe zone*, a subset that has a negligible probability of infection.

138) Dylan Pentland, Coefficients of Gaussian polynomials modulo *N* (arXiv.org, 30 Dec 2017)

The $q$-analogue of the binomial coefficient, known as a $q$-binomial coefficient, is typically denoted $\left[{n \atop k}\right]_q$. These polynomials are important combinatorial objects, often appearing in generating functions related to permutations and in representation theory.

Stanley conjectured that the function $f_{k,R}(n) = \#\left\{i : [q^{i}] \left[{n \atop k}\right]_q \equiv R \pmod{N}\right\}$ is quasipolynomial for $N=2$. We generalize, showing that this is in fact true for any integer $N\in \mathbb{N}$ and determine a quasi-period $\pi'_N(k)$ derived from the minimal period $\pi_N(k)$ of partitions with at most $k$ parts modulo $N$.
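To make the counting function $f_{k,R}(n)$ concrete, here is a small Python sketch that computes the coefficients of $\left[{n \atop k}\right]_q$ with the $q$-Pascal recurrence $\left[{n \atop k}\right]_q = \left[{n-1 \atop k-1}\right]_q + q^k \left[{n-1 \atop k}\right]_q$ and then counts those congruent to $R$ modulo $N$. The function names are hypothetical, and the sketch assumes that $i$ ranges over $0 \le i \le k(n-k)$, the degree range of the polynomial.

```python
def gaussian_binomial_coeffs(n, k):
    """Coefficients of the q-binomial [n choose k]_q, lowest degree first."""
    if k < 0 or k > n:
        return [0]
    # table[j] holds the coefficients of [m choose j]_q for the current m.
    table = [[1]] + [[0] for _ in range(k)]
    for m in range(1, n + 1):
        # Go downwards in j so table[j - 1] still refers to row m - 1.
        for j in range(min(m, k), 0, -1):
            shifted = [0] * j + table[j]      # q^j * [m-1 choose j]_q
            prev = table[j - 1]               # [m-1 choose j-1]_q
            length = max(len(shifted), len(prev))
            table[j] = [
                (shifted[i] if i < len(shifted) else 0)
                + (prev[i] if i < len(prev) else 0)
                for i in range(length)
            ]
    # The polynomial has degree k * (n - k); drop padding zeros beyond it.
    return table[k][: k * (n - k) + 1]


def count_residue(n, k, residue, modulus):
    """Number of coefficients of [n choose k]_q congruent to residue mod modulus."""
    coeffs = gaussian_binomial_coeffs(n, k)
    return sum(1 for c in coeffs if c % modulus == residue % modulus)
```

For example, $\left[{4 \atop 2}\right]_q = 1 + q + 2q^2 + q^3 + q^4$, so `count_residue(4, 2, 1, 2)` counts the four odd coefficients.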

137) Andy Xu and Wendy Wu, Higher Gonalities of Erdős-Rényi Random Graphs (22 Dec 2017)

We consider the asymptotic behavior of the second and higher gonalities of an Erdős-Rényi random graph and provide upper bounds for both via the probabilistic method. Our results suggest that for sufficiently large $n$, the second gonality of an Erdős-Rényi random graph $G(n,p)$ is strictly less than and asymptotically equal to the number of vertices under a suitable restriction of the probability $p$. We also prove an asymptotic upper bound for all higher gonalities of large Erdős-Rényi random graphs that adapts and generalizes a similar result on complete graphs. We suggest another approach towards finding both upper and lower bounds for the second and higher gonalities for small $p=\frac{c}{n}$, using a special case of the Riemann-Roch Theorem, and fully determine the asymptotic behavior of arbitrary gonalities when $c\leq 1$.

136) Michael Ren (PRIMES) and Xiaomeng Xu (MIT), Quasi-invariants in characteristic *p* and twisted quasi-invariants (15 Nov 2017; arXiv.org, 31 Jul 2019)

The spaces of quasi-invariant polynomials were introduced by Feigin and Veselov, where their Hilbert series over fields of characteristic 0 were computed. In this paper, we show some partial results and make two conjectures on the Hilbert series of these spaces over fields of positive characteristic.

On the other hand, Braverman, Etingof, and Finkelberg introduced the spaces of quasi-invariant polynomials twisted by a monomial. We extend some of their results to the spaces twisted by a smooth function.

135) David Darrow, A Novel, Near-Optimal Spectral Method for Simulating Fluids in a Cylinder (13 Nov 2017)

Simulations of fluid flow offer theoretical insight into fluid dynamics and critical applications in industry, with implications ranging from blood flow to hurricanes. However, open problems in fluid dynamics require more accurate simulations and lower computational resource costs than current algorithms provide. Accordingly, we develop in this paper a novel, computationally efficient spectral method for computing solutions of the incompressible Navier–Stokes equations, which model incompressible fluid flow, on the cylinder. The method described addresses three major limitations of current methods. First, while current methods either underresolve the cylinder's boundary or overresolve its center (effectively overemphasizing less physically interesting non-boundary regions), this new method more evenly resolves all parts of the cylinder. Secondly, current simulation times scale proportionally as $N^{7/3}$ or higher (where $N$ is the number of discretization points), while the new method requires at most $\mathcal{O}(N\log N)$ operations per time step. For large $N$, this means that calculations that required weeks can now be run in minutes. Lastly, current practical methods offer only low order (algebraic) accuracy. The new method has *spectral* accuracy, which often represents an improvement of the accuracy of the results by 5–10 orders of magnitude or more.

134) Espen Slettnes, Carl Joshua Quines, Shen-Fu Tsai, and Jesse Geneson (CrowdMath-2017), Variations of the cop and robber game on graphs (arXiv.org, 31 Oct 2017)

We prove new theoretical results about several variations of the cop and robber game on graphs. First, we consider a variation of the cop and robber game which is more symmetric called the cop and killer game. We prove for all $c < 1$ that almost all random graphs are stalemate for the cop and killer game, where each edge occurs with probability $p$ such that $\frac{1}{n^{c}} \le p \le 1-\frac{1}{n^{c}}$. We prove that a graph can be killer-win if and only if it has exactly $k\ge 3$ triangles or none at all. We prove that graphs with multiple cycles longer than triangles permit cop-win and killer-win graphs. For $\left(m,n\right)\neq\left(1,5\right)$ and $n\geq4$, we show that there are cop-win and killer-win graphs with $m$ $C_n$s. In addition, we identify game outcomes on specific graph products.

Next, we find a generalized version of Dijkstra's algorithm that can be applied to find the minimal expected capture time and the minimal evasion probability for the cop and gambler game and other variations of graph pursuit.

Finally, we consider a randomized version of the killer that is similar to the gambler. We use the generalization of Dijkstra's algorithm to find optimal strategies for pursuing the random killer. We prove that if $G$ is a connected graph with maximum degree $d$, then the cop can win with probability at least $\frac{\sqrt d}{1+\sqrt d}$ after learning the killer's distribution. In addition, we prove that this bound is tight only on the $\left(d+1\right)$-vertex star, where the killer takes the center with probability $\frac1{1+\sqrt d}$ and each of the other vertices with equal probabilities.

133) Ayush Agarwal (PRIMES) and Christian Gaetz (MIT), Differential posets and restriction in critical groups (arXiv.org, 23 Oct 2017), published in *Algebraic Combinatorics*, vol. 2:6 (2019): 1311-1327.

In recent work, Benkart, Klivans, and Reiner defined the critical group of a faithful representation of a finite group $G$, which is analogous to the critical group of a graph. In this paper we study maps between critical groups induced by injective group homomorphisms and in particular the map induced by restriction of the representation to a subgroup. We show that in the abelian group case the critical groups are isomorphic to the critical groups of a certain Cayley graph and that the restriction map corresponds to a graph covering map. We also show that when $G$ is an element in a differential tower of groups, critical groups of certain representations are closely related to words of up-down maps in the associated differential poset. We use this to generalize an explicit formula for the critical group of the permutation representation of the symmetric group given by the second author, and to enumerate the factors in such critical groups.

132) Louis Golowich (PRIMES) and Chiheon Kim (MIT), New Classes of Set-Sequential Trees (arXiv.org, 14 Oct 2017), published in *Discrete Mathematics*, vol. 343:3 (March 2020)

A graph is called set-sequential if its vertices can be labeled with distinct nonzero vectors in $\mathbb{F}_2^n$ such that when each edge is labeled with the sum $\pmod{2}$ of its vertices, every nonzero vector in $\mathbb{F}_2^n$ is the label for either a single vertex or a single edge. We resolve certain cases of a conjecture of Balister, Győri, and Schelp in order to show many new classes of trees to be set-sequential. We show that all caterpillars $T$ of diameter $k$ such that $k \leq 18$ or $|V(T)| \geq 2^{k-1}$ are set-sequential, where $T$ has only odd-degree vertices and $|T| = 2^{n-1}$ for some positive integer $n$. We also present a new method of recursively constructing set-sequential trees.
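The definition of a set-sequential labeling can be checked mechanically. The Python sketch below is only a verifier for a given labeling, not the construction from the paper; it encodes vectors in $\mathbb{F}_2^n$ as integer bitmasks so that addition modulo 2 becomes XOR. The function name and the input format are assumptions.

```python
def is_set_sequential_labeling(n, vertex_labels, edges):
    """Check whether a labeling is set-sequential.

    vertex_labels maps each vertex to an integer in 1 .. 2**n - 1, read as a
    nonzero bit vector in F_2^n. edges is a list of (u, v) vertex pairs.
    """
    labels = list(vertex_labels.values())
    # The sum mod 2 of two bit vectors is the XOR of their bitmasks.
    labels += [vertex_labels[u] ^ vertex_labels[v] for u, v in edges]
    # Every nonzero vector must appear exactly once among vertices and edges.
    return sorted(labels) == list(range(1, 2 ** n))


# Star with center "c" and three leaves: all four vertices have odd degree.
star_edges = [("c", "x"), ("c", "y"), ("c", "z")]
good = {"c": 0b001, "x": 0b010, "y": 0b100, "z": 0b110}
bad = {"c": 0b001, "x": 0b010, "y": 0b011, "z": 0b100}
```

Here `good` produces the edge labels `011`, `101`, and `111`, which together with the vertex labels cover all seven nonzero vectors of $\mathbb{F}_2^3$, while `bad` repeats the label `011` on both a vertex and an edge.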

131) Zachary Steinberg, Automated Segmentation of 3D Punctate Neural Expansion Microscopy Data (30 Sept 2017)

The comprehensive study of multiple-neuron circuits, known as connectomics, has historically been hampered by the time-consuming process of obtaining data with perfect morphological reconstructions of neurons. Existing attempts to automate the reconstruction of synaptic connections have used electron microscope data to some success, but were limited due to the black-and-white nature of such data and the computational requirements of supervised learning. Now that multicolor data is available at 20nm resolution via Expansion Microscopy (ExM), creating an automated, reliable algorithm requiring minimal training that can process the future petabytes of neural tissue data in a reasonable amount of time is an open problem. Here, we outline an automated approach to segment neurons in a 20x expanded hippocampus slice expressing Brainbow fluorescent proteins. We first use a neural network as a mask to filter data, oversegment in color space to create supervoxels, and finally merge those supervoxels together to reconstruct the 3D volume for an individual neuron. The results demonstrate this approach shows promise to harness ExM data for 3D neural imaging. Our approach offers several insights that can guide future work.

130) Andrew Gritsevskiy, Towards Generative Drug Discovery: Metric Learning using Variational Autoencoders (30 Sept 2017)

We report a method for metric learning using an extended variational autoencoder. Our architecture, based on deep learning, provides the ability to learn a transformation-invariant metric on any set of data. Our architecture consists of a pair of encoding and decoding networks. The encoder network converts the data into differentiable latent representations, while the decoder network learns to convert these representations back into data. We then apply an additional set of losses to the encoder network, forcing it to learn codings that are independent of orientation and reflect the desired metric. Then, our architecture is able to predict the real metric for a set of data points, and can generate data points that match a set of requirements. We demonstrate our network's ability to calculate the maximum overlap area of any two shapes in one shot; we also demonstrate our network's success at matching halves of geometric shapes. We then propose the applications of our network to areas of biochemistry and medicine, especially generative drug discovery.

129) Kaan Dokmeci, Theorems on Field Extensions and Radical Denesting (26 Sept 2017)

Radical denesting is the problem of taking a nested radical expression and finding ways to denest it, that is, to decrease the number of layers of radicals. This is a fairly recent problem, with applications in mathematical software that performs algebraic manipulations such as denesting given radical expressions. Current algorithms are either limited or inefficient.

We tackle the problem of denesting real radical expressions without the use of Galois Theory. This uses various theorems on field extensions formed by adjoining roots of elements of the original field. These theorems are proven via the roots of unity filter and degree arguments. These theorems culminate in proving a general theorem on denesting and lead to a general algorithm that does not require roots of unity. We optimize this algorithm further. Also, special cases of radical expressions are covered, giving more efficient algorithms in these cases, spanning many examples of radicals. Additionally, a condition for a radical not to denest is given. The results of denesting radicals over $\mathbb{Q}$ are extended to real extensions of $\mathbb{Q}$ and also transcendental extensions like $\mathbb{Q}(t)$. Finally, the case of denesting sums of radicals is explored as well.
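For readers unfamiliar with denesting, the simplest case can be sketched in a few lines of Python. The code below uses the classical identity $\sqrt{a + \sqrt{b}} = \sqrt{\tfrac{a+d}{2}} + \sqrt{\tfrac{a-d}{2}}$, valid when $d = \sqrt{a^2 - b}$ is an integer; it is a textbook special case for positive integers $a$ and nonnegative integers $b$, not the general algorithm developed in the paper, and the function name is hypothetical.

```python
from fractions import Fraction
from math import isqrt, sqrt


def denest_sqrt(a, b):
    """Try to write sqrt(a + sqrt(b)) as sqrt(x) + sqrt(y) with rational x, y.

    Assumes a is a positive integer and b a nonnegative integer. Returns the
    pair (x, y), or None when a*a - b is not a perfect square.
    """
    disc = a * a - b
    if disc < 0:
        return None
    d = isqrt(disc)
    if d * d != disc:
        return None
    return Fraction(a + d, 2), Fraction(a - d, 2)


# sqrt(3 + 2*sqrt(2)) = sqrt(3 + sqrt(8)) denests to sqrt(2) + sqrt(1).
x, y = denest_sqrt(3, 8)
nested_value = sqrt(3 + sqrt(8))
denested_value = sqrt(x) + sqrt(y)
```

The identity follows from squaring the right-hand side: $(\sqrt{x} + \sqrt{y})^2 = x + y + 2\sqrt{xy} = a + \sqrt{a^2 - d^2} = a + \sqrt{b}$.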

2016 Research Papers

128) Piotr Suwara (MIT) and Albert Yue (PRIMES), An Index-Type Invariant of Knot Diagrams Giving Bounds for Unknotting Framed Unknots (arXiv.org, 7 Jul 2017)

We introduce a new knot diagram invariant called the *Self-Crossing Index* (SCI). Using SCI, we provide bounds for unknotting two families of framed unknots. For one of these families, unknotting using framed Reidemeister moves is significantly harder than unknotting using regular Reidemeister moves.

We also investigate the relation between SCI and Arnold's curve invariant St, as well as the relation with Hass and Nowik's invariant, which generalizes cowrithe. In particular, the change of SCI under $\Omega$3 moves depends only on the forward/backward character of the move, similar to how the change of St or cowrithe depends only on the positive/negative quality of the move.

127) P.A. CrowdMath, Results on Pattern Avoidance Games (arXiv.org, 18 Apr 2017)

A zero-one matrix $A$ contains another zero-one matrix $P$ if some submatrix of $A$ can be transformed to $P$ by changing some ones to zeros. $A$ avoids $P$ if $A$ does not contain $P$. The Pattern Avoidance Game is played by two players. Starting with an all-zero matrix, two players take turns changing zeros to ones while keeping $A$ avoiding $P$. We study the strategies of this game for some patterns $P$. We also study some generalizations of this game.

126) P.A. CrowdMath, Algorithms for Pattern Containment in 0-1 Matrices (arXiv.org, 18 Apr 2017)

We say a zero-one matrix $A$ avoids another zero-one matrix $P$ if no submatrix of $A$ can be transformed to $P$ by changing some ones to zeros. A fundamental problem is to study the extremal function $ex(n,P)$, the maximum number of nonzero entries in an $n \times n$ zero-one matrix $A$ which avoids $P$. To calculate exact values of $ex(n,P)$ for specific values of $n$, we need containment algorithms which tell us whether a given $n \times n$ matrix $A$ contains a given pattern matrix $P$. In this paper, we present optimal algorithms to determine when an $n \times n$ matrix $A$ contains a given pattern $P$ when $P$ is a column of all ones, an identity matrix, a tuple identity matrix, an $L$-shaped pattern, or a cross pattern. These algorithms run in $\Theta(n^2)$ time, which is the lowest possible order a containment algorithm can achieve. When $P$ is a rectangular all-ones matrix, we also obtain an improved running time algorithm, albeit with a higher order.
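The easiest of the listed patterns gives a feel for what a $\Theta(n^2)$ containment check looks like. If $P$ is a column of $k$ ones, then $A$ contains $P$ exactly when some column of $A$ has at least $k$ ones, so a single pass over the entries suffices. The Python sketch below is an illustration under that observation, with a hypothetical function name; it does not reproduce the paper's algorithms for the other patterns.

```python
def contains_ones_column(matrix, k):
    """Return True if the 0-1 matrix contains a k x 1 all-ones pattern.

    A submatrix matching the pattern uses one column and any k rows in which
    that column holds a one, so it is enough to count ones per column.
    """
    if k == 0:
        return True
    if not matrix:
        return False
    column_counts = [0] * len(matrix[0])
    for row in matrix:                      # one pass over all n * n entries
        for j, entry in enumerate(row):
            column_counts[j] += entry
    return any(count >= k for count in column_counts)


example = [
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 1],
]
```

In `example` the first and third columns each hold two ones, so the matrix contains the column pattern of height 2 but avoids the one of height 3.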

125) Malte Möser, Kyle Soska, Ethan Heilman, Kevin Lee, Henry Heffan (PRIMES), Shashvat Srivastava (PRIMES), Kyle Hogan, Jason Hennessey, Andrew Miller, Arvind Narayanan, and Nicolas Christin, An Empirical Analysis of Traceability in the Monero Blockchain (arXiv.org, 13 Apr 2017); to appear at PETS (Privacy Enhancing Technologies Symposium) 2018; an accompanying article about this paper appeared in Wired (March 27, 2018)

Monero is a privacy-centric cryptocurrency that allows users to obscure their transactions by including chaff coins, called "mixins," along with the actual coins they spend. In this paper, we empirically evaluate two weaknesses in Monero's mixin sampling strategy. First, about 62% of transaction inputs with one or more mixins are vulnerable to "chain-reaction" analysis -- that is, the real input can be deduced by elimination. Second, Monero mixins are sampled in such a way that they can be easily distinguished from the real coins by their age distribution; in short, the real input is usually the "newest" input. We estimate that this heuristic can be used to guess the real input with 80% accuracy over all transactions with 1 or more mixins. Next, we turn to the Monero ecosystem and study the importance of mining pools and the former anonymous marketplace AlphaBay on the transaction volume. We find that after removing mining pool activity, there remains a large amount of potentially privacy-sensitive transactions that are affected by these weaknesses. We propose and evaluate two countermeasures that can improve the privacy of future transactions.
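The elimination idea behind "chain-reaction" analysis can be sketched on toy data. In the Python sketch below, each transaction input is represented by the set of coins it references (the real coin plus its mixins); whenever an input has only one candidate left, that coin must be the one spent, so it can be struck from every other input. The data format and function name are assumptions for illustration and do not reflect the paper's actual tooling or the Monero data model.

```python
def chain_reaction(inputs):
    """Deduce real inputs by repeated elimination.

    inputs maps an input identifier to the set of coins it references.
    Returns a dict from input identifier to the coin deduced to be real.
    """
    remaining = {name: set(coins) for name, coins in inputs.items()}
    resolved = {}
    queue = [name for name, coins in remaining.items() if len(coins) == 1]
    while queue:
        name = queue.pop()
        if name in resolved or len(remaining[name]) != 1:
            continue
        (coin,) = remaining[name]
        resolved[name] = coin
        # A coin can be spent only once, so it is a mixin everywhere else.
        for other, candidates in remaining.items():
            if other != name and other not in resolved and coin in candidates:
                candidates.discard(coin)
                if len(candidates) == 1:
                    queue.append(other)
    return resolved


toy_inputs = {
    "in1": {"A"},            # no mixins: the real coin is evident
    "in2": {"A", "B"},
    "in3": {"B", "C"},
    "in4": {"D", "E"},       # stays ambiguous
}
deduced = chain_reaction(toy_inputs)
```

On the toy data, resolving `in1` exposes `in2`, which in turn exposes `in3`, while `in4` remains untraceable by this method.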

124) Alec Leng, Independence of the Miller-Rabin and Lucas Probable Prime Tests (30 Mar 2017)

In the modern age, public-key cryptography has become a vital component for secure online communication. To implement these cryptosystems, rapid primality testing is necessary in order to generate keys. In particular, probabilistic tests are used for their speed, despite the potential for pseudoprimes. So, we examine the commonly used Miller-Rabin and Lucas tests, showing that numbers with many nonwitnesses are usually Carmichael or Lucas-Carmichael numbers in a specific form. We then use these categorizations, through a generalization of Korselt’s criterion, to prove that there are no numbers with many nonwitnesses for both tests, affirming the two tests’ relative independence. As Carmichael and Lucas-Carmichael numbers are in general more difficult for the two tests to deal with, we next search for numbers which are both Carmichael and Lucas-Carmichael numbers, experimentally finding none less than $10^{16}$. We thus conjecture that there are no such composites and, using multivariate calculus with symmetric polynomials, begin developing techniques to prove this.
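To illustrate what a nonwitness is on the Miller-Rabin side, the following Python sketch tests a single base and counts the nonwitnesses of a given odd number by brute force. It assumes the standard formulation of the test (write $n - 1 = 2^s d$ with $d$ odd) and uses hypothetical function names; the Lucas test and the classification results from the paper are not covered.

```python
def is_mr_nonwitness(a, n):
    """True if base a does not reveal odd n > 2 as composite under Miller-Rabin."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = 2**s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):     # look for -1 among a**(2**r * d), 0 < r < s
        x = x * x % n
        if x == n - 1:
            return True
    return False


def count_nonwitnesses(n):
    """Count bases 1 <= a < n that fail to prove odd n composite."""
    return sum(1 for a in range(1, n) if is_mr_nonwitness(a, n))
```

For a prime such as 13 every base is a nonwitness, while the composite 9 has only the two trivial nonwitnesses 1 and 8. The number 2047 = 23 · 89 is a classic example of a composite that base 2 fails to detect.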

123) Ria Das, Exploring the Ant Mill: Numerical and Analytical Investigations of Mixed Memory-Reinforcement Systems (arXiv.org, 20 Mar 2017)

Under certain circumstances, a swarm of a species of trail-laying ants known as army ants can become caught in a doomed revolving motion known as the death spiral, in which each ant follows the one in front of it in a never-ending loop until they all drop dead from exhaustion. This phenomenon, as well as the ordinary motions of many ant species and certain slime molds, can be modeled using reinforced random walks and random walks with memory. In a reinforced random walk, the path taken by a moving particle is influenced by the previous paths taken by other particles. In a random walk with memory, a particle is more likely to continue along its line of motion than change its direction. Both memory and reinforcement have been studied independently in random walks with interesting results. However, real biological motion is a result of a combination of both memory and reinforcement. In this paper, we construct a continuous random walk model based on diffusion-advection partial differential equations that combine memory and reinforcement. We find an axi-symmetric, time-independent solution to the equations that resembles the death spiral. Finally, we prove numerically that the obtained steady-state solution is stable.

122) Andrew Gritsevskiy and Adithya Vellal, Development and Biological Analysis of a Neural Network Based Genomic Compression System (3 Mar 2017)

The advent of Next Generation Sequencing (NGS) technologies has resulted in a barrage of genomic data that is now available to the scientific community. This data contains information that is driving fields such as precision medicine and pharmacogenomics, where clinicians use a patient’s genetics in order to develop custom treatments. However, genomic data is immense in size, which makes it extremely costly to store, transport and process. A genomic compression system which takes advantage of intrinsic biological patterns can help reduce the costs associated with this data while also identifying important biological patterns. In this project, we aim to create a compression system which uses unsupervised neural networks to compress genomic data. The complete compression suite, GenComp, is compared to existing genomic data compression methods. The results are then analyzed to discover new biological features of genomic data. Testing showed that GenComp achieves at least 40 times more compression than existing variant compression solutions, while providing comparable decoding times in most applications. GenComp also provides some insight into genetic patterns, which has significant potential to aid in the fields of pharmacogenomics and precision medicine. Our results demonstrate that neural networks can be used to significantly compress genomic data while also assisting in better understanding genetic biology.

121) Vivek Bhupatiraju, John Kuszmaul, and Vinjai Vale, On the Viability of Distributed Consensus by Proof of Space (3 Mar 2017)

In this paper, we present our implementation of Proof of Space (PoS) and our study of its viability in distributed consensus. PoS is a new alternative to the commonly used Proof of Work, which is a protocol at the heart of distributed consensus systems such as Bitcoin. PoS resolves the two major drawbacks of Proof of Work: high energy cost and bias towards individuals with specialized hardware. In PoS, users must store large “hard-to-pebble” PTC graphs, which are recursively generated using subgraphs called superconcentrators. We implemented two types of superconcentrators to examine their differences in performance. Linear superconcentrators are about 1.8 times slower than butterfly superconcentrators, but provide a better lower bound on space consumption. Finally, we discuss our simulation of using PoS to reach consensus in a peer-to-peer network. We conclude that Proof of Space is indeed viable for distributed consensus. To the best of our knowledge, we are the first to implement linear superconcentrators and to simulate the use of PoS to reach consensus on a decentralized network.

120) Albert Yue, An Index-Type Invariant of Knot Diagrams and Bounds for Unknotting Framed Knots (3 Mar 2017)

We introduce a new knot diagram invariant called self-crossing index, or $\mathrm{SCI}$. We found that $\mathrm{SCI}$ changes by at most $\pm 1$ under framed Reidemeister moves, and specifically provides a lower bound for the number of $\Omega$3 moves. We also found that $\mathrm{SCI}$ is additive under connected sums, and is a Vassiliev invariant of order 1. We also conduct similar calculations with Hass and Nowik's diagram invariant and cowrithe, and present a relationship between forward/backward, ascending/descending, and positive/negative $\Omega$3 moves.

119) Valerie Zhang, Computer-Based Visualizations and Manipulations of Matching Paths (2 Mar 2017)

Given n points in the 2-D plane, a matching path is a path that starts at one of these n points and ends at a different one without going through any of the other n - 2 points. Matching paths, as well as an important operation called the Hurwitz move, come up naturally in the study of complex algebraic varieties. At the heart of the Hurwitz move is the twist operation, which “twists” one matching path along another to produce a new (third) matching path. Performing the twist operation by hand, however, is not only tedious but also prone to errors and unnecessary complications. Therefore, using computer-based methods to represent matching paths and perform the twist operation makes sense. In this project, which was coded in Java, computer-based methods are developed to perform the twist operation efficiently and accurately, providing a framework for visualizing and manipulating matching paths with computers. The computer program performs fast computations and represents matching paths as simply as possible in a simple visual interface. This program could be utilized when solving open problems in symplectic geometry: potential applications include characterizing the overtwistedness of contact manifolds, as well as better understanding braid group actions.

118) Harshal Sheth, Nihar Sheth, and Aashish Welling, Read-Copy Update in a Garbage Collected Environment (1 Mar 2017)

Read-copy update (RCU) is a synchronization mechanism that allows efficient parallelism when there are a high number of readers compared to writers. The primary use of RCU is in Linux, a highly popular operating system kernel. The Linux kernel is written in C, a language that is not garbage collected, and yet the functionality that RCU provides is effectively that of a “poor man’s garbage collector” (P. E. McKenney). RCU in C is also complicated to use, and this can lead to bugs. The purpose of this paper is to investigate whether RCU implemented in a garbage collected language (Go) is easier to use while delivering comparable performance to RCU in C. This is tested through the implementation and benchmarking of 4 linked lists, 2 using RCU and 2 using mutexes. One RCU linked list and one mutex linked list are implemented in each language. This paper finds that RCU in a garbage collected language is indeed significantly easier to use, has similar overall performance to, and on very high read loads, outperforms, RCU in C.

117) Xiangyao Yu (MIT), Siye Zhu (PRIMES), Justin Kaashoek (PRIMES), Andrew Pavlo (Carnegie Mellon University), and Srinivas Devadas (MIT), Taurus: A Parallel Transaction Recovery Method Based on Fine-Granularity Dependency Tracking (28 Feb 2017)

Logging is crucial to performance in modern multicore main-memory database management systems (DBMSs). Traditional data logging (ARIES) and command logging algorithms enforce a sequential order among log records using a global log sequence number (LSN). Log flushing and recovery after a crash are both performed in the LSN order. This serialization of transaction logging and recovery can limit the system performance at high core count. In this paper, we propose Taurus to break the LSN abstraction and enable parallel logging and recovery by tracking fine-grained dependencies among transactions. The dependency tracking lends Taurus three salient features. (1) Taurus decouples the transaction logging order from the commit order and allows transactions to be flushed to persistent storage in parallel independently. Transactions that are persistent before commit can be discovered and ignored by the recovery algorithm using the logged dependency information. (2) Taurus can leverage multiple persistent devices for logging. (3) Taurus can leverage multiple devices and multiple worker threads for parallel recovery. Taurus improves logging and recovery parallelism for both data and command logging.

116) Louis Golowich (PRIMES), Chiheon Kim (MIT), and Richard Zhou (PRIMES), Maximum Size of a Family of Pairwise Graph-Different Permutations (arXiv.org, 27 Feb 2017), published in The Electronic Journal of Combinatorics 24:4 (2017)

Two permutations of the vertices of a graph $G$ are called $G$-different if there exists an index $i$ such that the $i$-th entries of the two permutations form an edge in $G$. We bound or determine the maximum size of a family of pairwise $G$-different permutations for various graphs $G$. We show that for all balanced bipartite graphs $G$ of order $n$ with minimum degree $n/2 - o(n)$, the maximum number of pairwise $G$-different permutations of the vertices of $G$ is $2^{(1-o(1))n}$. We also present examples of bipartite graphs $G$ with maximum degree $O(\log n)$ that have this property. We explore the problem of bounding the maximum size of a family of pairwise graph-different permutations when an unlimited number of disjoint vertices is added to a given graph. We determine this exact value for the graph of 2 disjoint edges, and present some asymptotic bounds relating to this value for graphs consisting of the union of $n/2$ disjoint edges.
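The definition of $G$-different permutations translates directly into a short check. The Python sketch below tests a pair of permutations and then a whole family, using the graph of 2 disjoint edges as a toy example; the function names and the vertex numbering are assumptions, and the sketch does not attempt to compute the maximum family sizes studied in the paper.

```python
from itertools import combinations


def are_g_different(perm_a, perm_b, edges):
    """True if some position holds two entries that form an edge of the graph."""
    edge_set = {frozenset(edge) for edge in edges}
    return any(frozenset((x, y)) in edge_set for x, y in zip(perm_a, perm_b))


def is_pairwise_g_different(family, edges):
    """True if every pair of permutations in the family is G-different."""
    return all(are_g_different(p, q, edges) for p, q in combinations(family, 2))


# Graph of 2 disjoint edges on the vertices 0, 1, 2, 3.
two_edges = [(0, 1), (2, 3)]
family = [(0, 1, 2, 3), (1, 0, 3, 2)]
```

The two permutations in `family` are $G$-different because their first entries, 0 and 1, form an edge. Adding `(2, 3, 0, 1)` breaks the property, since it never lines up with `(0, 1, 2, 3)` along an edge.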

115) Sathwik Karnik, On the Classification and Algorithmic Analysis of Carmichael Numbers (arXiv.org, 26 Feb 2017)

In this paper, we study the properties of Carmichael numbers, false positives to several primality tests. We provide a classification for Carmichael numbers with a proportion of Fermat witnesses of less than 50%, based on if the smallest prime factor is greater than a determined lower bound. In addition, we conduct a Monte Carlo simulation as part of a probabilistic algorithm to detect if a given composite number is Carmichael. We modify this highly accurate algorithm with a deterministic primality test to create a novel, more efficient algorithm that differentiates between Carmichael numbers and prime numbers.
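As background for the terms used in this abstract, the Python sketch below checks whether a number is Carmichael using Korselt's criterion (a composite $n$ is Carmichael exactly when it is squarefree and $p - 1$ divides $n - 1$ for every prime $p$ dividing $n$) and counts Fermat witnesses by brute force. It relies on trial division and is meant only for small inputs; it is not the Monte Carlo algorithm described in the paper, and the function names are hypothetical.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 2 using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael(n):
    """Korselt's criterion: composite, squarefree, and p - 1 divides n - 1."""
    if n < 4:
        return False
    factors = prime_factorization(n)
    if len(factors) == 1 and n in factors:
        return False                       # n is prime
    if any(exponent > 1 for exponent in factors.values()):
        return False                       # not squarefree
    return all((n - 1) % (p - 1) == 0 for p in factors)


def count_fermat_witnesses(n):
    """Count bases 1 <= a < n with a**(n - 1) not congruent to 1 mod n."""
    return sum(1 for a in range(1, n) if pow(a, n - 1, n) != 1)
```

For the smallest Carmichael number, 561 = 3 · 11 · 17, the Fermat witnesses are exactly the 240 bases sharing a factor with 561, a proportion of about 43% of the 560 candidate bases.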

114) Felix Wang, Functional equations in Complex Analysis and Number Theory (26 Feb 2017)

We study the following questions:

(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ with $f,g,\hat{f},\hat{g}\in\mathbb{C}(X)$ being complex rational functions?

(2) For which rational functions $f(X)$ and $g(X)$ with rational coefficients does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in\mathbb{Q}$?

We utilize various algebraic, geometric and analytic results in order to resolve both (1) and a variant of (2) in case the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$. Our results have applications in various mathematical fields, such as complex analysis, number theory, and dynamical systems. Our work resolves a 1973 question of Fried, and makes significant progress on a 1924 question of Ritt and a 1997 question of Lyubich and Minsky. In addition, we prove a quantitative refinement of a 2015 conjecture of Cahn, Jones and Spear.

113) Laura Pierson, Signatures of Stable Multiplicity Spaces in Restrictions of Representations of Symmetric Groups (25 Feb 2017)

Representation theory is a way of studying complex mathematical structures such as groups and algebras by mapping them to linear actions on vector spaces. Recently, Deligne proposed a new way to study the representation theory of finite groups by generalizing the collection of representations of a sequence of groups indexed by positive integer rank to an arbitrary complex rank, creating an abelian tensor category. In this project, we focused on the case of the symmetric groups $S_n,$ the groups of permutations of $n$ objects. Elements of the Deligne category Rep $S_t$ can be constructed by taking a stable sequence of $S_n$ representations for increasing $n$ and interpolating the associated formulas to an arbitrary complex number $t.$ In this project, we studied the case of restriction multiplicity spaces $V_{\lambda,\rho}$, counting the number of copies of an irreducible representation $V_{\rho}$ of $S_{n-k}$ in the restriction $\text{Res}_{S_{n-k}}^{S_n} V_{\lambda}$ of an irreducible representation of $S_n.$ We found formulas for norms of orthogonal basis vectors in these spaces, and ultimately for signatures (the number of basis vectors with positive norm minus the number with negative norm), an invariant that multiplies over tensor products and has important combinatorial connections.

112) Albert Gerovitch, Automatically Improving 3D Neuron Segmentations for Expansion Microscopy Connectomics (25 Feb 2017)

Understanding the geometry of neurons and their connections is key to comprehending brain function. This is the goal of a new optical approach to brain mapping using expansion microscopy (ExM), developed in the Boyden Lab at MIT to replace the traditional approach of electron microscopy. A challenge here is to perform image segmentation to delineate the boundaries of individual neurons. Currently, however, there is no method implemented for assessing a segmentation algorithm’s accuracy in ExM. The aim of this project is to create automated assessment of neuronal segmentation algorithms, enabling their iterative improvement. By automating the process, I aim to devise powerful segmentation algorithms that reveal the “connectome” of a neural circuit. I created software, called SEV-3D, which uses the pixel error and warping error metrics to assess 3D segmentations of single neurons. To allow better assessment beyond a simple numerical score, I visualized the results as a multilayered image. My program runs in a closed loop with a segmentation algorithm, modifying its parameters until the algorithm yields an optimal segmentation. I am further developing my application to enable evaluation of multi-cell segmentations. In the future, I aim to further implement the principles of machine learning to automatically improve the algorithms, yielding even better accuracy.

111) Kevin Chang, Upper Bounds for Ordered Ramsey Numbers of Small 1-Orderings (arXiv.org, 7 Feb 2017)

A $k$-ordering of a graph $G$ assigns distinct order-labels from the set $\{1,\ldots,|G|\}$ to $k$ vertices in $G$. Given a $k$-ordering $H$, the ordered Ramsey number $R_{<} (H)$ is the minimum $n$ such that every edge-2-coloring of the complete graph on the vertex set $\{1, \ldots, n\}$ contains a copy of $H$, the $i$th smallest vertex of which either has order-label $i$ in $H$ or no order-label in $H$.

This paper conducts the first systematic study of ordered Ramsey numbers for $1$-orderings of small graphs. We provide upper bounds for $R_{<} (H)$ for each connected $1$-ordering $H$ on $4$ vertices. Additionally, for every $1$-ordering $H$ of the $n$-vertex path $P_n$, we prove that $R_{<} (H) \in O(n)$. Finally, we provide an upper bound for the generalized ordered Ramsey number $R_{<} (K_n, H)$ which can be applied to any $k$-ordering $H$ containing some vertex with order-label $1$.

110) Nikhil Marda, On Equal Point Separation by Planar Cell Decompositions (arXiv.org, 17 Jan 2017)

In this paper, we investigate the problem of separating a set $X$ of points in $\mathbb{R}^{2}$ with an arrangement of $K$ lines such that each cell contains an asymptotically equal number of points (up to a constant ratio). We consider a property of curves called the stabbing number, defined to be the maximum countable number of intersections possible between the curve and a line in the plane. We show that large subsets of $X$ lying on Jordan curves of low stabbing number are an obstacle to equal separation. We further discuss Jordan curves of minimal stabbing number containing $X$. Our results generalize recent bounds on the Erdős-Szekeres Conjecture, showing that for fixed $d$ and sufficiently large $n$, if $|X| \ge 2^{c_dn/d + o(n)}$ with $c_d = 1 + O(\frac{1}{\sqrt{d}})$, then there exists a subset of $n$ points lying on a Jordan curve with stabbing number at most $d$.

109) Samuel Cohen and Peter Rowley, Results of Triangles Under Discrete Curve Shortening Flow (7 Jan 2017)

In this paper, we analyze the results of triangles under discrete curve shortening flow, specifically isosceles triangles with top angles greater than $\frac{\pi}{3}$, and scalene triangles. By considering the location of the three vertices of the triangle after some small time $\epsilon$, we use the definition of the derivative to calculate a system of differential equations involving parameters that can describe the triangle. Constructing phase plane diagrams and then analyzing them, we find that the singular behavior of discrete curve shortening flow on isosceles triangles with top angles greater than $\frac{\pi}{3}$ is a point, and for scalene triangles is a line segment.

108) Matthew Hase-Liu (PRIMES) and Nicholas Triantafillou (MIT), Efficient Point-Counting Algorithms for Superelliptic Curves (7 Jan 2017; arXiv.org, 7 Sep 2017)

In this paper, we present efficient algorithms for computing the number of points and the order of the Jacobian group of a superelliptic curve over finite fields of prime order p. Our method employs the Hasse-Weil bounds in conjunction with the Hasse-Witt matrix for superelliptic curves, whose entries we express in terms of multinomial coefficients. We present a fast algorithm for counting points on specific trinomial superelliptic curves and a slower, more general method for all superelliptic curves. For the first case, we reduce the problem of simplifying the entries of the Hasse-Witt matrix modulo *p* to a problem of solving quadratic Diophantine equations. For the second case, we extend Bostan et al.'s method for hyperelliptic curves to general superelliptic curves. We believe the methods we describe are asymptotically the most efficient known point-counting algorithms for certain families of trinomial superelliptic curves.
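For orientation, the quantity being computed can be illustrated with a brute-force baseline. The sketch below is not the paper's method (it uses neither the Hasse-Weil bounds nor the Hasse-Witt matrix); it simply counts affine solutions of $y^m = f(x)$ over $\mathbb{F}_p$ in time roughly linear in $p$. The function name `count_affine_points` and the coefficient ordering are assumptions made for this illustration.

```python
from collections import Counter


def count_affine_points(coeffs, m, p):
    """Count pairs (x, y) in F_p x F_p with y^m = f(x) (mod p).

    coeffs lists the coefficients of f from the constant term upward
    (a hypothetical convention chosen for this sketch).
    """
    # For each residue v, record how many y satisfy y^m = v (mod p).
    power_counts = Counter(pow(y, m, p) for y in range(p))
    total = 0
    for x in range(p):
        # Horner evaluation of f(x) modulo p.
        fx = 0
        for c in reversed(coeffs):
            fx = (fx * x + c) % p
        total += power_counts[fx]
    return total


# y^2 = x^3 + x over F_5 has 3 affine points: (0, 0), (2, 0), (3, 0).
print(count_affine_points([0, 1, 0, 1], 2, 5))
```

Such a baseline is only practical for small $p$, but it is a convenient check on any faster routine.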

107) P.A. CrowdMath, Bounds on parameters of minimally non-linear patterns (arXiv.org, 31 Dec 2016), published in the *Electronic Journal of Combinatorics* 25:1 (2018)

Let $ex(n, P)$ be the maximum possible number of ones in any 0-1 matrix of dimensions $n \times n$ that avoids $P$. Matrix $P$ is called minimally non-linear if $ex(n, P) = \omega(n)$ but $ex(n, P') = O(n)$ for every strict subpattern $P'$ of $P$. We prove that the ratio between the length and width of any minimally non-linear 0-1 matrix is at most $4$, and that a minimally non-linear 0-1 matrix with $k$ rows has at most $5k-3$ ones. We also obtain an upper bound on the number of minimally non-linear 0-1 matrices with $k$ rows.

In addition, we prove corresponding bounds for minimally non-linear ordered graphs. The minimal non-linearity that we investigate for ordered graphs is for the extremal function $ex_{<}(n, G)$, which is the maximum possible number of edges in any ordered graph on $n$ vertices with no ordered subgraph isomorphic to $G$.
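The extremal function $ex(n, P)$ in the abstract above can be made concrete for very small cases. The sketch below assumes the usual notion of containment (a 0-1 matrix contains $P$ if some choice of rows and columns, kept in order, has a one wherever $P$ has a one) and computes $ex(n, P)$ by exhaustive search. The helper names `contains` and `ex_bruteforce` are hypothetical, and the search is exponential in $n^2$, so it is only usable for tiny $n$.

```python
from itertools import combinations


def contains(A, P):
    """Return True if the 0-1 matrix A contains the pattern P."""
    k, l = len(P), len(P[0])
    for rows in combinations(range(len(A)), k):
        for cols in combinations(range(len(A[0])), l):
            # Every one of P must sit on a one of the chosen submatrix.
            if all(A[rows[i]][cols[j]] == 1
                   for i in range(k) for j in range(l) if P[i][j] == 1):
                return True
    return False


def ex_bruteforce(n, P):
    """Maximum number of ones in an n x n 0-1 matrix avoiding P."""
    best = 0
    for mask in range(1 << (n * n)):
        A = [[(mask >> (i * n + j)) & 1 for j in range(n)] for i in range(n)]
        if not contains(A, P):
            best = max(best, bin(mask).count("1"))
    return best


# The 2 x 2 identity pattern: each diagonal can hold at most one 1.
print(ex_bruteforce(3, [[1, 0], [0, 1]]))
```

For the $2 \times 2$ identity pattern the search agrees with the known value $2n-1$, a linear extremal function.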

106) Seth Shelley-Abrahamson (MIT) and Alec Sun (PRIMES), Towards a Classification of Finite-Dimensional Representations of Rational Cherednik Algebras of Type D (arXiv.org, 15 Dec 2016)

Using a combinatorial description due to Jacon and Lecouvey of the wall crossing bijections for cyclotomic rational Cherednik algebras, we show that the irreducible representations $L_c(\lambda^\pm)$ of the rational Cherednik algebra $H_c(D_n, \mathbb{C}^n)$ of type $D$ for symmetric bipartitions $\lambda$ are infinite dimensional for all parameters $c$. In particular, all finite-dimensional irreducible representations of rational Cherednik algebras of type $D$ arise as restrictions of finite-dimensional irreducible representations of rational Cherednik algebras of type $B$.

105) Nicholas Guo (PRIMES) and Guangyi Yue (MIT), Counting Independent Sets in Graphs of Hyperplane Arrangements (arXiv.org, 13 Dec 2016), published in *Discrete Mathematics*, vol. 343:3 (March 2020)

In this paper, we count the number of independent sets of a type of graph $G(\mathcal{A},q)$ associated to some hyperplane arrangement $\mathcal{A}$, which is a generalization of the construction of graphical arrangements. We show that when the parameters of $\mathcal{A}$ satisfy certain conditions, the number of independent sets of the disjoint union $G(\mathcal{A},q_1)\cup\cdots\cup G(\mathcal{A},q_s)$ depends only on the coefficients of $\mathcal{A}$ and the total number of vertices $\sum_i q_i$ when $q_i$'s are powers of large enough prime numbers. In addition it is independent of the coefficients as long as $\mathcal{A}$ is central and the coefficients are multiplicatively independent.

104) Yatharth Agarwal (PRIMES), Vishnu Murale (PRIMES), Jason Hennessey (Boston University), Kyle Hogan (Boston University), and Mayank Varia (Boston University), Moving in Next Door: Network Flooding as a Side Channel in Cloud Environments (14-16 Nov 2016), published in Sara Foresti and Giuseppe Persiano, eds., *Cryptology and Network Security: 15th International Conference Proceedings, CANS 2016, Milan, Italy, November 14–16, 2016*, pp. 755-760.

Co-locating multiple tenants’ virtual machines (VMs) on the same host underpins public clouds’ affordability, but sharing physical hardware also exposes consumer VMs to side channel attacks from adversarial co-residents. We demonstrate passive bandwidth measurement to perform traffic analysis attacks on co-located VMs. Our attacks do not assume a privileged position in the network or require any communication between adversarial and victim VMs. Using a single feature in the observed bandwidth data, our algorithm can identify which of 3 potential YouTube videos a co-resident VM streamed with 66% accuracy. We discuss defense from both a cloud provider’s and a consumer’s perspective, showing that effective defense is difficult to achieve without costly under-utilization on the part of the cloud provider or over-utilization on the part of the consumer.

103) Dhruv Rohatgi, A Connection Between Vector Bundles over Smooth Projective Curves and Representations of Quivers (31 Oct 2016)

We create a partition bijection that yields a partial result on a recent conjecture by Schiffmann relating the problems of counting over a finite field (1) vector bundles over smooth projective curves, and (2) representations of quivers.

102) Aaron Yeiser (PRIMES) and Alex Townsend (Cornell University), A spectral element method for meshes with skinny elements (30 Oct 2016; arXiv.org, 27 Mar 2018)

When numerically solving partial differential equations (PDEs), the first step is often to discretize the geometry using a mesh and to solve a corresponding discretization of the PDE. Standard finite and spectral element methods require that the underlying mesh has no skinny elements for numerical stability. Here, we develop a novel spectral element method that is numerically stable on meshes that contain skinny elements, while also allowing for high degree polynomials on each element. Our method is particularly useful for PDEs for which anisotropic mesh elements are beneficial and we demonstrate it with a Navier--Stokes simulation. Code for our method can be found at this URL.

101) Tanya Khovanova (MIT) and Rafael Saavedra (PRIMES), Discreet Coin Weighings and the Sorting Strategy (arXiv.org, 23 Sep 2016)

In 2007, Alexander Shapovalov posed an old twist on the classical coin weighing problem by asking for strategies that manage to conceal the identities of specific coins while providing general information on the number of fake coins. In 2015, Diaco and Khovanova studied various cases of these "discreet strategies" and introduced the revealing factor, a measure of the information that is revealed.

In this paper we discuss a natural coin weighing strategy which we call the sorting strategy: divide the coins into equal piles and sort them by weight. We study the instances when the strategy is discreet, and given an outcome of the sorting strategy, the possible number of fake coins. We prove that in many cases, the number of fake coins can be any value in an arithmetic progression whose length depends linearly on the number of coins in each pile. We also show the strategy can be discreet when the number of fake coins is any value within an arithmetic subsequence whose length also depends linearly on the number of coins in each pile. We arrive at these results by connecting our work to the classic Frobenius coin problem. In addition, we calculate the revealing factor for the sorting strategy.

100) Kai-Siang Ang (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), On the geometry of regular icosahedral capsids containing disymmetrons (arXiv.org, 29 Aug 2016), published in Journal of Structural Biology (19 Jan 2017)

Icosahedral virus capsids are composed of symmetrons, organized arrangements of capsomers. There are three types of symmetrons: disymmetrons, trisymmetrons, and pentasymmetrons, which have different shapes and are centered on the icosahedral 2-fold, 3-fold and 5-fold axes of symmetry, respectively. In 2010 [Sinkovits & Baker] gave a classification of all possible ways of building an icosahedral structure solely from trisymmetrons and pentasymmetrons, which requires the triangulation number T to be odd. In the present paper we incorporate disymmetrons to obtain a geometric classification of icosahedral viruses formed by regular penta-, tri-, and disymmetrons. For every class of solutions, we further provide formulas for symmetron sizes and parity restrictions on h, k, and T numbers. We also present several methods in which invariants may be used to classify a given configuration.

99) Tanya Khovanova (MIT) and Shuheng Niu (PRIMES), *m*-Modular Wythoff (arXiv.org, 2 Aug 2016)

We discuss a variant of Wythoff's Game, $m$-Modular Wythoff's Game, and identify the winning and losing positions for this game.
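As background, the losing positions (P-positions) of the classical Wythoff's Game can be generated with a short mex-style recurrence. The sketch below covers only the classical game; the rules of the $m$-modular variant are not stated in this abstract, so nothing here should be read as the paper's result. The function name `wythoff_p_positions` is an assumption for this illustration.

```python
def wythoff_p_positions(count):
    """First `count` P-positions (a, b) with a <= b of classical Wythoff's Game.

    The k-th pair takes a as the smallest nonnegative integer not yet
    used in any earlier pair, and b = a + k.
    """
    used = set()
    positions = []
    a = 0
    for k in range(count):
        while a in used:
            a += 1
        b = a + k
        positions.append((a, b))
        used.update((a, b))
    return positions


print(wythoff_p_positions(6))
```

The pairs produced this way coincide with $(\lfloor k\varphi \rfloor, \lfloor k\varphi^2 \rfloor)$, where $\varphi$ is the golden ratio.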

2015 Research Papers

98) Caleb Ji, Robin Park, and Angela Song, Combinatorial Games of No Strategy (20 Aug 2016)

In this paper, we study a particular class of combinatorial game motivated by previous research conducted by Professor James Propp, called *Games of No Strategy*, or games whose winners are predetermined. Finding the number of ways to play such games often leads to new combinatorial sequences and involves methods from analysis, number theory, and other fields. For the game *Planted Brussel Sprouts*, a variation on the well-known game Sprouts, we find a new proof that the number of ways to play is equal to the number of spanning trees on n vertices, and for *Mozes’ Game of Numbers*, a game studied for its interesting connections with other fields, we use prior work by Alon to calculate the number of ways to play the game for a certain case. Finally, in the game *Binary Fusion*, we show through both algebraic and combinatorial proofs that the number of ways to play generates Catalan’s triangle.

97) Meena Jagadeesan, The Exchange Graphs of Weakly Separated Collections (arXiv.org, 19 Aug 2016)

Weakly separated collections arise in the cluster algebra derived from the Plücker coordinates on the nonnegative Grassmannian. Oh, Postnikov, and Speyer studied weakly separated collections over a general Grassmann necklace $\mathcal{I}$ and proved the connectivity of every exchange graph. Oh and Speyer later introduced a generalization of exchange graphs that we call $\mathcal{C}$-constant graphs. They characterized these graphs in the smallest two cases. We prove an isomorphism between exchange graphs and a certain class of $\mathcal{C}$-constant graphs. We use this to extend Oh and Speyer's characterization of these graphs to the smallest four cases, and we present a conjecture on a bound on the maximal order of these graphs. In addition, we fully characterize certain classes of these graphs in the special cases of cycles and trees.

96) Nicholas Diaco, Counting Counterfeit Coins: A New Coin Weighing Problem (arXiv.org, 13 Jun 2016)

In 2007, a new variety of the well-known problem of identifying a counterfeit coin using a balance scale was introduced in the sixth International Kolmogorov Math Tournament. This paper offers a comprehensive overview of this new problem by presenting it in the context of the traditional coin weighing puzzle and then explaining what makes the new problem mathematically unique. Two weighing strategies described previously are used to derive lower bounds for the optimal number of admissible situations for given parameters. Additionally, a new weighing procedure is described that can be adapted to provide a solution for a broad spectrum of initial parameters by representing the number of counterfeit coins as a linear combination of positive integers. In closing, we offer a new form of the traditional counterfeit coin problem and provide a lower bound for the number of weighings necessary to solve it.

95) Jesse Geneson (MIT) and Meghal Gupta (PRIMES), Bounding extremal functions of forbidden 0-1 matrices using *(r,s)*-formations (19 Mar 2016)

First, we prove tight bounds of $n 2^{\frac{1}{(t-2)!}\alpha(n)^{t-2} \pm O(\alpha(n)^{t-3})}$ on the extremal function of the forbidden pair of ordered sequences $(1 2 3 \ldots k)^t$ and $(k \ldots 3 2 1)^t$ using bounds on a class of sequences called $(r,s)$-formations. Then, we show how an analogous method can be used to derive similar bounds on the extremal functions of forbidden pairs of $0-1$ matrices consisting of horizontal concatenations of identical identity matrices and their horizontal reflections.

94) Varun Jain, Novel Relationships Between Circular Planar Graphs and Electrical Networks (20 Feb 2016)

Circular planar graphs are used to model electrical networks, which arise in classical physics. Associated with such a network is a network response matrix, which carries information about how the network behaves in response to certain potential differences. Circular planar graphs can be organized into equivalence classes based upon these response matrices. In each equivalence class, certain fundamental elements are called critical. Additionally, it is known that equivalent graphs are related by certain local transformations. Using wiring diagrams, we first investigate the number of Y-∆ transformations required to transform one critical graph in an equivalence class into another, proving a quartic bound in the order of the graph. Next, we consider positivity phenomena, studying how testing the signs of certain circular minors can be used to determine if a given network response matrix is associated with a particular equivalence class. In particular, we prove a conjecture by Kenyon and Wilson for some cases.

93) Arthur Azvolinsky, Explicit Computations of the Frozen Boundaries of Rhombus Tilings of Polygonal Domains (12 Feb 2016)

Consider a polygonal domain $\Omega$ drawn on a regular triangular lattice. A *rhombus tiling* of $\Omega$ is defined as a complete covering of the domain with $60^{\circ}$-rhombi, where each one is obtained by gluing two neighboring triangles together.
We consider a uniform measure on the set of all tilings of $\Omega$. As the mesh size of the lattice approaches zero while the polygon remains fixed, a random tiling approaches a deterministic limit shape. An important phenomenon that occurs with the convergence towards a limit shape is the formation of *frozen facets*; that is, areas where there are asymptotically tiles of only one particular type. The sharp boundary between these ordered facet formations and the disordered region is a curve inscribed in $\Omega$. This inscribed curve is defined as the *frozen boundary*. The goal of this project was to understand the purely algebraic approach, elaborated on in a paper by Kenyon and Okounkov, to the problem of explicitly computing the frozen boundary. We will present our results for a number of special cases we considered.

92) David Amirault, Better Bounds on the Rate of Non-Witnesses of Lucas Pseudoprimes (3 Feb 2016)

Efficient primality testing is fundamental to modern cryptography for the purpose of key generation. Different primality tests may be compared using their runtimes and rates of non-witnesses. With the Lucas primality test, we analyze the frequency of Lucas pseudoprimes using MATLAB. We prove that a composite integer *n* can be a strong Lucas pseudoprime to at most 1/6 of parameters *P*, *Q* unless *n* belongs to a short list of exception cases, thus improving the bound from the previous result of 4/15. We also explore the properties obeyed by such exceptions and how these cases may be handled by an extended version of the Lucas primality test.

91) Daniel Guo, An Infection Spreading Model on Binary Trees (26 Jan 2016)

An important and ongoing topic of research is the study of infectious diseases and the speed at which these diseases spread. Modeling the spread and growth of such diseases leads to a more precise understanding of the phenomenon and accurate predictions of spread in real life. We consider a long-range infection model on an infinite regular binary tree. Given a spreading coefficient $\alpha>1$, the time it takes for the infection to travel from one node to another node below it is exponentially distributed with specific rate functions such as $2^{-k}k^{-\alpha}$ or $\frac{1}{\alpha^k}$, where $k$ is the difference in layer number between the two nodes. We simulate and analyze the time needed for the infection to reach layer $m$ or below starting from the root node. The resulting time is recorded and graphed for different values of $\alpha$ and $m$. Finally, we prove rigorous lower and upper bounds for the infection time, both of which are approximately logarithmic with respect to $m$. The same techniques and results are valid for other regular $d$-ary trees, in which each node has exactly $d$ children where $d>2$.

90) Jacob Klegar, Bounded Tiling-Harmonic Functions on the Integer Lattice (25 Jan 2016)

Tiling-harmonic functions are a class of functions on square tilings that minimize a specific energy. These functions may provide a useful tool in studying square Sierpinski carpets. In this paper we show two new Maximum Modulus Principles for these functions, prove Harnack's Inequality, and give a proof that the set of tiling-harmonic functions is closed. One of these Maximum Modulus Principles is used to show that bounded infinite tiling-harmonic functions must have arbitrarily long constant lines. Additionally, we give three sufficient conditions for tiling-harmonic functions to be constant. Finally, we explore comparisons between tiling and graph-harmonic functions, especially with regard to oscillating boundary values.

89) Richard Yi, A Probability-Based Model of Traffic Flow (22 Jan 2016)

Describing the behavior of traffic via mathematical modeling and computer simulation has been a challenge confronted by mathematicians in various ways throughout the last century. In this project, we introduce various existing traffic flow models and present a new, probability-based model that is a hybrid of the microscopic and macroscopic views, drawing upon current ideas in traffic flow theory. We examine the correlations found in the data of our computer simulation. We hope that our results could help civil engineers implement efficient road systems that fit their needs, as well as contribute toward the design of safely operating unmanned vehicles.

88) Kenz Kallal, Matthew Lipman, and Felix Wang, Equal Compositions of Rational Functions (21 Jan 2016)

We study the following questions:

(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ in complex rational functions $f,g\in\mathbb{C}(X)$ and meromorphic functions $\hat{f}, \hat{g}$ on the complex plane?

(2) For which rational functions $f(X)$ and $g(X)$ with coefficients in an algebraic number field $K$ does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in K$?

We utilize various algebraic, geometric and analytic results in order to resolve both questions in the case that the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$ of sufficiently large degree. Our work answers a 1973 question of Fried in all but finitely many cases, and makes significant progress towards answering a 1924 question of Ritt and a 1997 question of Lyubich and Minsky.

87) Dhruv Medarametla, Bounding Norms of Locally Random Matrices (21 Jan 2016)

Recently, several papers proving lower bounds for the performance of the Sum Of Squares Hierarchy on the planted clique problem have come out. A crucial part of all four papers is probabilistically bounding the norms of certain “locally random” matrices. In these matrices, the entries are not completely independent of each other, but rather depend upon a few edges of the input graph. In this paper, we study the norms of these locally random matrices. We start by bounding the norms of simple locally random matrices, whose entries depend on a bipartite graph *H* and a random graph *G*; we then generalize this result by bounding the norms of complex locally random matrices, matrices based off of a much more general graph *H* and a random graph *G*. For both cases, we prove almost-tight probabilistic bounds on the asymptotic behavior of the norms of these matrices.

86) Rachel Zhang, Statistics of Intersections of Curves on Surfaces (19 Jan 2016)

Each orientable surface with nonempty boundary can be associated with a planar model, whose edges can then be labeled with letters that read out a surface word. Then, the curve word of a free homotopy class of closed curves on a surface is the minimal sequence of edges of the planar model through which a curve in the class passes. The length of a class of curves is defined to be the number of letters in its curve word. We fix a surface and its corresponding planar model.

Fix a free homotopy class of curves ω on the surface. For another class of curves *c*, let *i*(ω; *c*) be the minimal number of intersections of curves in ω and *c*. In this paper, we show that the mean of the distribution of *i*(ω; *c*), for a random curve *c* of length *n*, grows proportionally with *n* and approaches μ(ω) ⋅ *n* for a constant μ(ω). We also give an algorithm to compute μ(ω) and have written a program that calculates μ(ω) for any curve ω on any surface. In addition, we prove that *i*(ω; *c*) approaches a Gaussian distribution as *n* → ∞ by viewing the generation of a random curve as a Markov chain.

85) Cristian Gutu and Fengyao Ding, SecretRoom: An Anonymous Chat Client (16 Jan 2016)

While many people would like to be able to communicate anonymously, the few existing anonymous communication systems sacrifice anonymity for performance, or vice versa. The most popular such app is Tor, which relies on a series of relays to protect anonymity. Though proven to be efficient, Tor does not guarantee anonymity in the presence of strong adversaries like ISPs and government agencies who can conduct in-depth traffic analysis. In contrast, our messaging application, SecretRoom, implements an improved version of a secure messaging protocol called Dining Cryptographers Networks (DCNets) to guarantee true anonymity in moderately sized groups. However, unlike traditional DCNets, SecretRoom does not require direct communication between all participants and does not depend on the presence of honest clients for anonymity. By introducing an untrusted server that performs the DCNet protocol on behalf of the clients, SecretRoom manages to reduce the O(*n*²) communication associated with traditional DCNets to O(*n*) for *n* clients. Moreover, by introducing artificially intelligent clients, SecretRoom makes the anonymity set size independent of the number of “real” clients. Ultimately SecretRoom reduces the communication to O(*n*) and allows the DCNet protocol to scale to hundreds of clients compared to a few tens of clients in traditional DCNets.
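To make the O(*n*²) cost of a traditional DCNet concrete, here is a minimal sketch of one round of the classic dining-cryptographers protocol, in which every pair of participants shares a fresh random pad. It does not model SecretRoom's untrusted server or its artificial clients; the function names `dcnet_round` and `combine` are assumptions made for this illustration.

```python
import secrets


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def dcnet_round(n, sender, message):
    """Return the n public announcements of one classic DC-net round."""
    size = len(message)
    # One shared pad per unordered pair: n * (n - 1) / 2 secrets in total.
    pads = {(i, j): secrets.token_bytes(size)
            for i in range(n) for j in range(i + 1, n)}
    announcements = []
    for i in range(n):
        out = bytes(size)
        for j in range(n):
            if j != i:
                out = xor_bytes(out, pads[(min(i, j), max(i, j))])
        if i == sender:
            # Only the sender folds the message into its announcement.
            out = xor_bytes(out, message)
        announcements.append(out)
    return announcements


def combine(announcements):
    """XOR all announcements; each pad appears twice and cancels out."""
    result = bytes(len(announcements[0]))
    for a in announcements:
        result = xor_bytes(result, a)
    return result


print(combine(dcnet_round(4, sender=2, message=b"hello")))
```

Because every pad is XORed in exactly twice, the combined output equals the message, while a single announcement on its own reveals nothing about who sent it.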

84) Girishvar Venkat, Signatures of the Contravariant Form on Representations of the Hecke Algebra and Rational Cherednik Algebra associated to *G*(*r*,1,*n*) (15 Jan 2016)

The Hecke algebra and rational Cherednik algebra of the group *G*(*r*,1,*n*) are non-commutative algebras that are deformations of certain classical algebras associated to the group. These algebras have numerous applications in representation theory, number theory, algebraic geometry and integrable systems in quantum physics. Consequently, understanding their irreducible representations is important. If the deformation parameters are generic, then these irreducible representations, called Specht modules in the case of the Hecke algebra and Verma modules in the case of the Cherednik algebra, are in bijection with the irreducible representations of *G*(*r*,1,*n*). However, while every irreducible representation of *G*(*r*,1,*n*) is unitary, the Hermitian contravariant form on the Specht modules and Verma modules may only be non-degenerate. Thus, the signature of this form provides a great deal of information about the representations of the algebras that cannot be seen by looking at the group representations. In this paper, we compute the signature of arbitrary Specht modules of the Hecke algebra and use them to give explicit formulas of the parameter values for which these modules are unitary. We also compute asymptotic limits of existing formulas for the signature character of the polynomial representations of the Cherednik algebra which are vastly simpler than the full signature characters and show that these limits are rational functions in *t*. In addition, we show that for half of the parameter values, for each *k*, the degree *k* portion of the polynomial representation is unitary for large enough *n*.

83) Mehtaab Sawhney (PRIMES) and Jonathan Weed (MIT), Further results on arc and bar k-visibility graphs (arXiv.org, 6 Jan 2016)

We consider visibility graphs involving bars and arcs in which lines of sight can pass through up to k objects. We prove a new edge bound for arc k-visibility graphs, provide maximal constructions for arc and semi-arc k-visibility graphs, and give a complete characterization of semi-arc visibility graphs. We show that the family of arc i-visibility graphs is never contained in the family of bar j-visibility graphs for any i and j, and that the family of bar i-visibility graphs is not contained in the family of bar j-visibility graphs for $i \neq j$. We also give the first thickness bounds for arc and semi-arc k-visibility graphs. Finally, we introduce a model for random semi-bar and semi-arc k-visibility graphs and analyze its properties.

82) Harshal Sheth and Aashish Welling, An Implementation and Analysis of a Kernel Network Stack in Go with the CSP Style (30 Dec 2015; arXiv.org, 17 Mar 2016)

Modern operating system kernels are written in lower-level languages such as C. Although the low-level functionalities of C are often useful within kernels, they also give rise to several classes of bugs. Kernels written in higher level languages avoid many of these potential problems, at the possible cost of decreased performance. This research evaluates the advantages and disadvantages of a kernel written in a higher level language. To do this, the network stack subsystem of the kernel was implemented in Go with the Communicating Sequential Processes (CSP) style. Go is a high-level programming language that supports the CSP style, which recommends splitting large tasks into several smaller ones running in independent "threads". Modules for the major networking protocols, including Ethernet, ARP, IPv4, ICMP, UDP, and TCP, were implemented. In this study, the implemented Go network stack, called GoNet, was compared to a representative network stack written in C. The GoNet code is more readable and generally performs better than that of its C stack counterparts. From this, it can be concluded that Go with CSP style is a viable alternative to C for the language of kernel implementations.

81) Xiangyao Yu (MIT), Hongzhe Liu (PRIMES), Ethan Zou (PRIMES), and Srini Devadas (MIT), Tardis 2.0: An Optimized Time Traveling Coherence Protocol (arXiv.org, 27 Nov 2015), published in Proceedings of the 2016 International Conference on Parallel Architectures and Compilation (PACT '16), pp. 261-274.

The scalability of cache coherence protocols is a significant challenge in multicore and other distributed shared memory systems. Traditional snoopy and directory-based coherence protocols are difficult to scale up to many-core systems because of the overhead of broadcasting and storing sharers for each cacheline. Tardis, a recently proposed coherence protocol, shows potential in solving the scalability problem, since it only requires O(log N) storage per cacheline for an N-core system and needs no broadcasting support. The original Tardis protocol, however, only supports the sequential consistency memory model. This limits its applicability in real systems since most processors today implement relaxed consistency models like Total Store Order (TSO). Tardis also incurs large network traffic overhead on some benchmarks due to an excessive number of renew messages. Furthermore, the original Tardis protocol has suboptimal performance when the program uses spinning to communicate between threads. In this paper, we address these downsides of the Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system and protocol changes required in order to implement the TSO consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also propose optimizations for better leasing policies and to handle program spinning. Evaluated on 20 benchmarks, optimized Tardis at 64 (256) cores can achieve average performance improvement of 15.8% (8.4%) compared to the baseline Tardis and 1% (3.4%) compared to the baseline directory protocol. Our optimizations also reduce the average network traffic by 4.3% (6.1%) compared to the baseline directory protocol. On this set of benchmarks, optimized Tardis improves on a fullmap directory protocol in the metrics of energy, performance and storage, while being simpler to implement.

80) Allison Paul, Spectral Inference of a Directed Acyclic Graph Using Pairwise Similarities (11 Nov 2015)

A gene ontology graph is a directed acyclic graph (DAG) which represents relationships among biological processes. Inferring such a graph using a gene similarity matrix is NP-hard in general. Here, we propose an approximate algorithm to solve this problem efficiently by reducing the dimensionality of the problem using spectral clustering. We show that the original problem can be simplified to the inference problem of overlapping clusters in a network. We then solve the simplified problem in two steps: first, we infer clusters using a spectral clustering technique. Then, we identify possible overlaps among the inferred clusters by identifying maximal cliques over the cluster similarity graph. We illustrate the effectiveness of our method over various synthetic networks in terms of both the performance and computational complexity compared to existing methods.

79) Niket Gowravaram, A Variation of nil-Temperley-Lieb Algebras of type A (26 Sep 2015)

We investigate a variation on the nil-Temperley-Lieb algebras of type A. This variation is formed by removing one of the relations and, in some sense, can be considered as a type B of the algebras. We give a general description of the structure of monomials formed by generators in the algebras. We also show that the dimension of these algebras is the sequence ${2n \choose n}$, by showing that the dimension is the Catalan transform of the sequence $2^n$.
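The identity mentioned at the end of this abstract can be checked numerically. The sketch below assumes one common definition of the Catalan transform, $b_0 = a_0$ and $b_n = \sum_{k=1}^{n} \frac{k}{2n-k}\binom{2n-k}{n-k} a_k$; the paper's own convention and indexing may differ, and the function name `catalan_transform` is hypothetical.

```python
from math import comb


def catalan_transform(a):
    """Catalan transform of a finite sequence a, under the assumed definition
    b_0 = a_0 and b_n = sum_{k=1..n} k/(2n-k) * C(2n-k, n-k) * a_k."""
    b = []
    for n in range(len(a)):
        if n == 0:
            b.append(a[0])
            continue
        total = 0
        for k in range(1, n + 1):
            # The coefficient is a ballot number, so the division is exact.
            coeff = k * comb(2 * n - k, n - k) // (2 * n - k)
            total += coeff * a[k]
        b.append(total)
    return b


powers_of_two = [2 ** n for n in range(6)]
central_binomials = [comb(2 * n, n) for n in range(6)]
transformed = catalan_transform(powers_of_two)
```

Under this definition, the first few terms of the transform of $2^n$ agree with the central binomial coefficients ${2n \choose n}$.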

78) Caleb Ji, Tanya Khovanova (MIT), Robin Park, and Angela Song, Chocolate Numbers (arXiv.org, 21 Sep 2015), published in Journal of Integer Sequences, vol. 19 (2016)

In this paper, we consider a game played on a rectangular $m \times n$ gridded chocolate bar. Each move, a player breaks the bar along a grid line. Each move after that consists of taking any piece of chocolate and breaking it again along existing grid lines, until just $mn$ individual squares remain.

This paper enumerates the number of ways to break an $m \times n$ bar, which we call chocolate numbers, and introduces four new sequences related to these numbers. Using various techniques, we prove interesting divisibility results regarding these sequences.
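A minimal sketch of how such a count might be computed, under the assumption (not spelled out in the abstract) that two ways of breaking the bar are different whenever their ordered sequences of breaks differ. The function name `chocolate_number` is hypothetical. After the first break, the remaining breaks of the two pieces can be interleaved freely, which gives the binomial factor.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def chocolate_number(m: int, n: int) -> int:
    """Count ordered sequences of breaks that reduce an m x n bar to unit squares."""
    if m * n == 1:
        return 1  # a single square needs no breaks
    remaining = m * n - 2  # breaks still to make after the first one
    total = 0
    # First break along a horizontal grid line: pieces i x n and (m - i) x n.
    for i in range(1, m):
        first_piece_breaks = i * n - 1
        total += (comb(remaining, first_piece_breaks)
                  * chocolate_number(i, n)
                  * chocolate_number(m - i, n))
    # First break along a vertical grid line: pieces m x j and m x (n - j).
    for j in range(1, n):
        first_piece_breaks = m * j - 1
        total += (comb(remaining, first_piece_breaks)
                  * chocolate_number(m, j)
                  * chocolate_number(m, n - j))
    return total
```

With this convention a $1 \times n$ bar can be broken in $(n-1)!$ ways, since every break splits one remaining strip and the breaks may come in any order.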

77) Albert Gerovitch, Andrew Gritsevskiy, and Gregory Barboy, Mobile Health Surveillance: The Development of Software Tools for Monitoring the Spread of Disease (21 Sep 2015)

Disease spread monitoring data often comes with a significant delay and low geospatial resolution. We aim to develop a software tool for data collection, which enables daily monitoring and prediction of the spread of disease in a small community. We have developed a crowdsourcing application that collects users' health statuses and locations. It allows users to update their daily status online, and, in return, provides a visual map of geospatial distribution of sick people in a community, outlining locations with increased disease incidence. Currently, due to the lack of a large user base, we substitute this information with simulated data, and demonstrate our program's capabilities on a hypothetical outbreak. In addition, we use analytical methods for predicting town-level disease spread in the future. We model the disease spread via interpersonal probabilistic interactions on an undirected social graph. The network structure is based on scale-free networks integrated with Census data. The epidemic is modeled using the Susceptible-Infected-Recovered (SIR) model and a set of parameters, including transmission rate and vaccination patterns. The developed application will provide better methods for early detection of epidemics, identify places with high concentrations of infected people, and predict localized disease spread.

76) Niket Gowravaram and Tanya Khovanova (MIT), On the Structure of nil-Temperley-Lieb Algebras of type A (arXiv.org, 1 Sep 2015)

We investigate nil-Temperley-Lieb algebras of type A. We give a general description of the structure of monomials formed by the generators. We also show that the dimensions of these algebras are the famous Catalan numbers by providing a bijection between the monomials and Dyck paths. We show that the distribution of these monomials by degree is the same as the distribution of Dyck paths by the sum of the heights of the peaks minus the number of peaks.

75) Tanya Khovanova (MIT) and Karan Sarkar, P-positions in Modular Extensions to Nim (arXiv.org, 27 Aug 2015), published in International Journal of Game Theory, vol. 46 (2017)

In this paper, we consider a modular extension to the game of Nim, which we call $m$-Modular Nim, and explore its optimal strategy. In $m$-Modular Nim, a player can either make a standard Nim move or remove a multiple of $m$ tokens in total. We develop a winning strategy for all $m$ with $2$ heaps and for odd $m$ with any number of heaps.

74) Nicholas Diaco and Tanya Khovanova (MIT), Weighing Coins and Keeping Secrets (arXiv.org, 20 Aug 2015), published in Mathematical Intelligencer (September 2016)

In this expository paper we discuss a relatively new counterfeit coin problem with an unusual goal: maintaining the privacy of, rather than revealing, counterfeit coins in a set of both fake and real coins. We introduce two classes of solutions to this problem --- one that respects the privacy of all the coins and one that respects the privacy of only the fake coins --- and give several results regarding each. We describe and generalize 6 unique strategies that fall into these two categories. Furthermore, we explain conditions for the existence of a solution, as well as showing proof of a solution's optimality in select cases. In order to quantify exactly how much information is revealed by a given solution, we also define the revealing factor and revealing coefficient; these two values additionally act as a means of comparing the relative effectiveness of different solutions. Most importantly, by introducing an array of new concepts, we lay the foundation for future analysis of this very interesting problem, as well as many other problems related to privacy and the transfer of information.

73) Luke Sciarappa, Simple commutative algebras in Deligne's categories Rep($S_t$) (arXiv.org, 24 Jun 2015)

We show that in the Deligne categories $\mathrm{Rep}(S_t)$ for $t$ a transcendental number, the only simple algebra objects are images of simple algebras in the category of representations of a symmetric group under a canonical induction functor. They come in families which interpolate the families of algebras of functions on the cosets of $H\times S_{n-k}$ in $S_n$, for a fixed subgroup $H$ of $S_k$.

2014 Research Papers

72) Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), Carolyn Lu (PRIMES), Anton Goloborodko (MIT), Nezar Abdennur (MIT), and Leonid Mirny (MIT), Formation of Chromosomal Domains by Loop Extrusion (bioRxiv, 14 Aug 2015), published in Cell Reports 15:9 (31 May 2016): 2038–2049.

Characterizing how the three-dimensional organization of eukaryotic interphase chromosomes modulates regulatory interactions is an important contemporary challenge. Here we propose an active process underlying the formation of chromosomal domains observed in Hi-C experiments. In this process, cis-acting factors extrude progressively larger loops, but stall at domain boundaries; this dynamically forms loops of various sizes within but not between domains. We studied this mechanism using a polymer model of the chromatin fiber subject to loop extrusion dynamics. We find that systems of dynamically extruded loops can produce domains as observed in Hi-C experiments. Our results demonstrate the plausibility of the loop extrusion mechanism, and posit potential roles of cohesin complexes as a loop-extruding factor, and CTCF as an impediment to loop extrusion at domain boundaries.

71) Kavish Gandhi, Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube (4 April 2015)

A geodesic in the hypercube is the shortest possible path between two vertices. Leader and Long (2013) conjectured that, in every antipodal $2$-coloring of the edges of the hypercube, there exists a monochromatic geodesic between antipodal vertices. For this and an equivalent conjecture, we prove the cases $n = 2, 3, 4, 5$. We also examine the *maximum* number of monochromatic geodesics of length $k$ in an antipodal $2$-coloring and find it to be $2^{n-1}(n-k+1)\binom{n-1}{k-1}(k-1)!$. In this case, we classify all colorings in which this maximum occurs. Furthermore, we explore the maximum number of antipodal geodesics in a subgraph of the hypercube with a fixed proportion of edges, providing a conjectured optimal configuration as a lower bound, which, interestingly, contains a constant proportion of geodesics with respect to $n$. Finally, we present a series of smaller results that could be of use in finding an upper bound on the maximum number of antipodal geodesics in such a subgraph of the hypercube.

70) Jesse Geneson (MIT) and Peter M. Tian (PRIMES), Sequences of formation width $4$ and alternation length $5$ (arXiv.org, 13 Feb 2015)

Sequence pattern avoidance is a central topic in combinatorics. A sequence
$s$ contains a sequence $u$ if some subsequence of $s$ can be changed into $u$
by a one-to-one renaming of its letters. If $s$ does not contain $u$, then $s$
avoids $u$. A widely studied extremal function related to pattern avoidance is
$Ex(u, n)$, the maximum length of an $n$-letter sequence that avoids $u$ and
has every $r$ consecutive letters pairwise distinct, where $r$ is the number of
distinct letters in $u$.

We bound $Ex(u, n)$ using the formation width function, $fw(u)$, which is the
minimum $s$ for which there exists $r$ such that any concatenation of $s$
permutations, each on the same $r$ letters, contains $u$. In particular, we
identify every sequence $u$ such that $fw(u)=4$ and $u$ contains $ababa$. The
significance of this result lies in its implication that, for every such
sequence $u$, we have $Ex(u, n) = \Theta(n \alpha(n))$, where $\alpha(n)$
denotes the incredibly slow-growing inverse Ackermann function. We have thus
identified the extremal function of many infinite classes of previously
unidentified sequences.

69) William Wu (PRIMES), Nicolaas Kaashoek (PRIMES), Matthew Weinberg (MIT), Christos Tzamos (MIT), and Costis Daskalakis (MIT), Game Theory based Peer Grading Mechanisms for MOOCs, paper for the Learning at Scale 2015 conference, March 14-18, 2015, Vancouver, BC, Canada (4 February 2015)

An efficient peer grading mechanism is proposed for grading the multitude of assignments in online courses. This novel approach is based on game theory and mechanism design. A set of assumptions and a mathematical model is ratified to simulate the dominant strategy behavior of students in a given mechanism. A benchmark function accounting for grade accuracy and workload is established to quantitatively compare effectiveness and scalability of various mechanisms. After multiple iterations of mechanisms under increasingly realistic assumptions, three are proposed: Calibration, Improved Calibration, and Deduction. The Calibration mechanism performs as predicted by game theory when tested in an online crowd-sourced experiment, but fails when students are assumed to communicate. The Improved Calibration mechanism addresses this assumption, but at the cost of more effort spent grading. The Deduction mechanism performs relatively well in the benchmark, outperforming the Calibration, Improved Calibration, traditional automated, and traditional peer grading systems. The mathematical model and benchmark opens the way for future derivative works to be performed and compared.

68) Alexandria Yu, Towards the classification of unital 7-dimensional commutative algebras (19 Jan 2015)

An *algebra* is a vector space with a compatible product operation. An algebra
is called *commutative* if the product of any two elements is independent of the order
in which they are multiplied. A basic problem is to determine how many unital
commutative algebras exist in a given dimension and to find all of these algebras. This
classification problem has its origin in number theory and algebraic geometry. For dimension
less than or equal to 6, Poonen has completely classified all unital commutative
algebras up to isomorphism. For dimension greater than or equal to 7, the situation is
much more complicated due to the fact that there are infinitely many algebras up to
isomorphism. The purpose of this work is to develop new techniques to classify unital
7-dimensional commutative algebras up to isomorphism. An algebra is called *local* if there exists a unique maximal ideal m. Local algebras are basic building blocks for
general algebras as any finite dimensional unital commutative algebra is isomorphic to
a direct sum of finite dimensional unital commutative local algebras. Hence, in order
to classify all finite dimensional unital commutative algebras, it suffices to classify all
finite dimensional unital commutative local algebras. In this article, we classify all unital
7-dimensional commutative local algebras up to isomorphism with the exception
of the special case $k_1 = 3$ and $k_2 = 3$, where, for each positive integer $i$, $\mathbf{m}^i$ is the
subalgebra generated by products of $i$ elements in the maximal ideal $\mathbf{m}$ and $k_i$ is the
dimension of the quotient algebra $\mathbf{m}^i/\mathbf{m}^{i+1}$. When $k_2 = 1$, we classify all finite dimensional
unital commutative local algebras up to isomorphism. As a byproduct of our classification
theorems, we discover several new classes of unital finite dimensional commutative
algebras.

67) Niket Gowravaram and Uma Roy, Diagrammatic Calculus of Coxeter and Braid Groups (arXiv.org, 15 Mar 2015)

We investigate a novel diagrammatic approach to examining strict actions of a Coxeter group or a braid group on a category. This diagrammatic language, which was developed in a series of papers by Elias, Khovanov and Williamson, provides new tools and methods to attack many problems of current interest in representation theory. In our research we considered a particular problem which arises in this context. To a Coxeter group $W$ one can associate a real hyperplane arrangement, and can consider the complement of these hyperplanes in the complexification $Y_W$. The celebrated $K(\pi,1)$ conjecture states that $Y_W$ should be a classifying space for the pure braid group, and thus a natural quotient ${Y_W}/{W}$ should be a classifying space for the braid group. Salvetti provided a cell complex realization of the quotient, which we refer to as the Salvetti complex. In this paper we investigate a part of the $K(\pi,1)$ conjecture, which we call the $K(\pi,1)$ conjecturette, that states that the second homotopy group of the Salvetti complex is trivial. In this paper we present a diagrammatic proof of the $K(\pi,1)$ conjecturette for a family of braid groups as well as an analogous result for several families of Coxeter groups.

66) Arjun Khandelwal, Compact dot representations in permutation avoidance (3 Mar 2015)

A paper by Eriksson et al. (2001) introduced a new form of representing a permutation, referred to as the compact dot representation, with the goal of constructing a smaller superpattern. We study this representation and give bounds on its size. We also consider a variant of the problem, where limitations on the alphabet size are imposed, and obtain lower bounds. Lastly, we consider the Möbius function of the poset of permutations ordered by containment.

65) Suzy Lou and Max Murin, On the Strongly Regular Graph of Parameters (99, 14, 1, 2) (9 Jan 2015)

In an attempt to find a strongly regular graph of parameters (99, 14, 1, 2) or to disprove its existence, we studied its possible substructure and constructions.

64) Shashwat Kishore (PRIMES) and Augustus Lonergan (MIT), Signatures of Multiplicity Spaces in Tensor Products of $sl_2$ and $U_q(sl_2)$ Representations (9 Jan 2015; arXiv.org, 8 Jun 2015)

We study multiplicity space signatures in tensor products of $sl_2$ and $U_q(sl_2)$ representations and their applications. We completely classify definite multiplicity spaces for generic tensor products of $sl_2$ Verma modules. This provides a classification of a family of unitary representations of a basic quantized quiver variety, one of the first such classifications for any quantized quiver variety. We use multiplicity space signatures to provide the first real critical point lower bound for generic $sl_2$ master functions. As a corollary of this bound, we obtain a simple and asymptotically correct approximation for the number of real critical points of a generic $sl_2$ master function. We obtain a formula for multiplicity space signatures in tensor products of finite dimensional simple $U_q(sl_2)$ representations. Our formula also gives multiplicity space signatures in generic tensor products of $sl_2$ Verma modules and generic tensor products of real $U_q(sl_2)$ Verma modules. Our results have relations with knot theory, statistical mechanics, quantum physics, and geometric representation theory.

63) Joseph Zurier, Generalizations of the Joints Problem (9 Jan 2015)

In this paper we explore generalizations of the joints problem introduced by B. Chazelle et al.

62) Nathan Wolfe (PRIMES), Ethan Zou (PRIMES), Ling Ren (MIT), and Xiangyao Yu (MIT), Optimizing Path ORAM for Cloud Storage Applications (arXiv.org, 8 Jan 2015)

We live in a world where our personal data are both valuable and vulnerable to misappropriation through exploitation of security vulnerabilities in online services. For instance, Dropbox, a popular cloud storage tool, has certain security flaws that can be exploited to compromise a user's data, one of which being that a user's access pattern is unprotected. We have thus created an implementation of Path Oblivious RAM (Path ORAM) for Dropbox users to obfuscate path access information to patch this vulnerability. This implementation differs significantly from the standard usage of Path ORAM, in that we introduce several innovations, including a dynamically growing and shrinking tree architecture, multi-block fetching, block packing and the possibility for multi-client use. Our optimizations together produce about a 77% throughput increase and a 60% reduction in necessary tree size; these numbers vary with file size distribution.

61) Brice Huang, Monomization of Power Ideals and Generalized Parking Functions (8 Jan 2015)

A power ideal is an ideal in a polynomial ring generated by powers of homogeneous linear forms. Power ideals arise in many areas of mathematics, including the study of zonotopes, approximation theory, and fat point ideals; in particular, their applications in approximation theory are relevant to work on splines and pertinent to mathematical modeling, industrial design, and computer graphics. For this reason, understanding the structure of power ideals, especially their Hilbert series, is an important problem. Unfortunately, due to the computational complexity of power ideals, this is a difficult problem. Only a few cases of this problem have been solved; efficient ways to compute the Hilbert series of a power ideal are known only for power ideals of certain forms. In this paper, we find an efficient way to compute the Hilbert series of a class of power ideals.

60) Kyle Gettig, Linear Extensions of Acyclic Orientations (7 Jan 2015)

Given a graph, an acyclic orientation of the edges determines a partial ordering of the vertices. This partial ordering has a number of linear extensions, *i.e.*, total orderings of the vertices that agree with the partial ordering. The purpose of this paper is twofold. Firstly, properties of the orientation that induces the maximum number of linear extensions are investigated. Due to similarities between the optimal orientation in simple cases and the solution to the Max-Cut Problem, the possibility of a correlation is explored, though with minimal success. Correlations are then explored between the optimal orientation of a graph *G* and the comparability graphs with the minimum number of edges that contain *G* as a subgraph, as well as to certain graphical colorings induced by the orientation. Specifically, small cases of non-comparability graphs are investigated and compared to the known results for comparability graphs. We then explore the optimal orientation for odd anti-cycles and related graphs, proving that the conjectured orientations are optimal in the odd anti-cycle case. In the second part of this paper, the above concepts are extended to random graphs, that is, graphs with probabilities associated with each edge. New definitions and theorems are introduced to create a more intuitive system that agrees with the discrete case when all probabilities are 0 or 1, though complete results for this new system would be much more difficult to prove.

59) Shyam Narayanan, Improving the Speed and Accuracy of the Miller-Rabin Primality Test (7 Jan 2015)

In this paper, we discuss the accuracy of the Miller-Rabin Primality Test and the number of nonwitnesses for a composite odd integer *n*.
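For readers unfamiliar with the terminology, the sketch below counts nonwitnesses by brute force. It uses the standard Miller-Rabin condition: writing $n - 1 = 2^s d$ with $d$ odd, a base $a$ is a nonwitness for $n$ if $a^d \equiv 1 \pmod n$ or $a^{2^r d} \equiv -1 \pmod n$ for some $0 \le r < s$. Counting bases over the range $1 \le a \le n-1$ is an assumption made here, and the function names are hypothetical; this is not the method of the paper.

```python
def is_nonwitness(a: int, n: int) -> bool:
    """Return True if base a fails to prove that the odd integer n > 2 is composite."""
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    # Square repeatedly, looking for -1 among a^(2^r * d) with 1 <= r < s.
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False


def count_nonwitnesses(n: int) -> int:
    """Count bases a in [1, n - 1] that are nonwitnesses for odd n > 2."""
    return sum(1 for a in range(1, n) if is_nonwitness(a, n))
```

For example, the only nonwitnesses for $n = 9$ are $1$ and $8$, while $n = 25$ has the four nonwitnesses $1, 7, 18, 24$.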

58) Peter M. Tian, Extremal Functions of Forbidden Multidimensional Matrices (7 Jan 2015)

We advance the extremal theory of matrices in two directions. The methods that we use come from combinatorics, probability, and analysis.

57) Eric Neyman, Cylindric Young Tableaux and their Properties (7 Jan 2015; earlier version on arXiv.org, 19 Oct 2014)

Cylindric Young tableaux are combinatorial objects that first appeared in the 1990s. A natural extension of the classical notion of a Young tableau, they have since been used several times, most notably by Gessel and Krattenthaler and by Alexander Postnikov. Despite this, relatively little is known about cylindric Young tableaux. This paper is an investigation of the properties of this object. In this paper, we extend the Robinson-Schensted-Knuth Correspondence, a well-known and very useful bijection concerning regular Young tableaux, to be a correspondence between pairs of cylindric tableaux. We use this correspondence to reach further results about cylindric tableaux. We then establish an interpretation of cylindric tableaux in terms of a game involving marble-passing. Next, we demonstrate a generic method to use results concerning cylindric tableaux in order to prove results about skew Young tableaux. We finish with a note on Knuth equivalence and its analog for cylindric tableaux.

56) Yilun Du, On the Algorithmic and Theoretical Exploration of Tiling-Harmonic Functions (6 Jan 2015)

In this paper, we explore a new class of harmonic functions defined on a tiling *T*, a square tiling of a region *D*, in *C*. We define these functions as tiling harmonic functions. We develop an efficient algorithm for computing interior values of tiling harmonic functions and graph harmonic functions in a tiling. Using our algorithm, we find that tiling harmonic functions are not, in general, equivalent to graph harmonic functions. In addition, we prove some theoretical results on the structure of tiling harmonic functions and classify one type of tiling harmonic function.

55) Jessica Li, On the Modeling of Snowflake Growth Using Hexagonal Automata (2 Jan 2015; arXiv.org, 8 May 2015; published (with Laura P. Schaposnik) in Physical Review E 93:2 (Feb. 2016))

Snowflake growth is an example of crystallization, a basic phase transition in physics. Studying snowflake growth helps gain fundamental understanding of this basic process and may help produce better crystalline materials and benefit several major industries. The basic theoretical physical mechanisms governing the growth of snowflakes are not well understood: whilst current computer modeling methods can generate snowflake images that successfully capture some basic features of actual snowflakes, so far there has been no analysis of these computer models in the literature, and more importantly, certain fundamental features of snowflakes are not well understood. A key challenge of analysis is that the snowflake growth models consist of a large set of partial difference equations, and as in many chaos theory problems, rigorous study is difficult. In this paper we analyze a popular model (Reiter’s model) using a combined approach of mathematical analysis and numerical simulation. We divide a snowflake image into main branches and side branches and define two new variables (growth latency and growth direction) to characterize the growth patterns. We derive a closed form solution of the main branch growth latency using a one dimensional linear model, and compare it with the simulation results using the hexagonal automata. We discover a few interesting patterns of the growth latency and direction of side branches. On the basis of the analysis and the principle of surface free energy minimization, we propose a new geometric rule to incorporate interface control, a basic mechanism of crystallization that is not taken into account in the original Reiter’s model.

54) Amy Chou and Justin Kaashoek, PuzzleJAR: Automated Constraint-based Generation of Puzzles of Varying Complexity (30 Sept 2014)

Engaging students in practicing a wide range of problems facilitates their learning. However, generating fresh problems that have specific characteristics, such as using a certain set of concepts or being of a given difficulty level, is a tedious task for a teacher. In this paper, we present PuzzleJAR, a system that is based on an iterative constraint-based technique for automatically generating problems. The PuzzleJAR system takes as parameters the problem definition, the complexity function, and domain-specific semantics-preserving transformations. We present an instantiation of our technique with automated generation of Sudoku and Fillomino puzzles, and we are currently extending our technique to generate Python programming problems. Since defining complexities of Sudoku and Fillomino puzzles is still an open research question, we developed our own mechanism to define complexity, using machine learning to generate a function for difficulty from puzzles with already known difficulties. Using this technique, PuzzleJAR generated over 200,000 Sudoku puzzles of different sizes (9x9, 16x16, 25x25) and over 10,000 Fillomino puzzles of sizes ranging from 2x2 to 16x16.

53) Tanya Khovanova, Eric Nie, and Alok Puranik, The Sierpinski Triangle and The Ulam-Warburton Automaton (arXiv.org, 25 Aug 2014), published in Math Horizons (September 2015), reprinted in The Best Writing on Mathematics 2016

This paper is about the beauty of fractals and the surprising connections between them. We will explain the pioneering role that the Sierpinski triangle plays in the Ulam-Warburton automata and show you a number of pictures along the way.

52) Tanya Khovanova and Joshua Xiong, Cookie Monster Plays Games (arXiv.org, 6 July 2014)

We research a combinatorial game based on the Cookie
Monster problem called the Cookie Monster game that
generalizes the games of Nim and Wythoff. We also propose
several combinatorial games that are in between the Cookie
Monster game and Nim. We discuss properties of P-positions
of all of these games.

Each section consists of two parts. The first part is a
story presented from the Cookie Monster's point of view, the
second part is a more abstract discussion of the same ideas
by the authors.

51) Tanya Khovanova and Joshua Xiong, Nim Fractals (arXiv.org, 23 May 2014), published in Journal of Integer Sequences, Vol. 17 (2014)

We enumerate P-positions in the game of Nim in two different ways. In one series of sequences we enumerate them by the maximum number of counters in a pile. In another series of sequences we enumerate them by the total number of counters. We show that the game of Nim can be viewed as a cellular automaton, where the total number of counters divided by 2 can be considered as a generation in which P-positions are born. We prove that the three-pile Nim sequence enumerated by the total number of counters is a famous toothpick sequence based on the Ulam-Warburton cellular automaton. We introduce 10 new sequences.
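The second kind of enumeration can be illustrated by brute force. The sketch below relies on the standard fact that a Nim position is a P-position exactly when the bitwise XOR (nim-sum) of its pile sizes is zero. Counting ordered triples, with empty piles allowed, is an assumption made here; the paper's sequences may use a different convention. The function name `three_pile_p_positions` is hypothetical.

```python
def three_pile_p_positions(total: int) -> int:
    """Count ordered triples (a, b, c) of pile sizes with a + b + c == total
    and nim-sum a ^ b ^ c == 0, i.e. three-pile Nim P-positions."""
    count = 0
    for a in range(total + 1):
        for b in range(total - a + 1):
            c = total - a - b
            if a ^ b ^ c == 0:
                count += 1
    return count


# P-positions occur only for even totals, so index by total // 2.
by_generation = [three_pile_p_positions(2 * g) for g in range(8)]
```

Indexing by half the total number of counters matches the abstract's description of that quantity as the generation in which P-positions are born.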

50) Noah Golowich, Resolving a Conjecture on Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 13 Apr 2014), published in The Electronic Journal of Combinatorics 21:3 (2014)

A linear equation is $r$-regular, if, for every $r$-coloring of the positive integers, there exist positive integers of the same color which satisfy the equation. In 2005, Fox and Radoičić conjectured that the equation $x_1 + 2x_2 + \cdots + 2^{n-2}x_{n-1} - 2^{n-1}x_n = 0$, for any $n \geq 2$, has a degree of regularity of $n-1$, which would verify a conjecture of Rado from 1933. Rado's conjecture has since been verified with a different family of equations. In this paper, we show that Fox and Radoičić's family of equations indeed have a degree of regularity of $n-1$. We also prove a few extensions of this result.

2013 Research Papers

49) Ritesh Ragavender, Odd Dunkl Operators and nilHecke Algebras (30 May 2014)

Symmetric functions appear in many areas of mathematics
and physics, including enumerative combinatorics, the
representation theory of symmetric groups, statistical
mechanics, and the quantum statistics of ideal gases. In the
commutative (or “even”) case of these symmetric functions,
Kostant and Kumar introduced a nilHecke algebra that
categorifies the quantum group $U_q(sl_2)$. This categorification helps to better understand Khovanov
homology, which has important applications in studying knot
polynomials and gauge theory. Recently, Ellis and Khovanov
initiated the program of “oddification” as an effort to
create a representation theoretic understanding of a new
“odd” Khovanov homology, which often yields more powerful
results than regular Khovanov homology. In this paper, we
contribute towards the project of oddification by studying
the odd Dunkl operators of Khongsap and Wang in the setting
of the odd nilHecke algebra. Specifically, we show that odd
divided difference operators can be used to construct odd
Dunkl operators, which we use to give a representation of $sl_2$ on the algebra of skew polynomials and
evaluate the odd Dunkl Laplacian. We then investigate *q*-analogs
of divided difference operators to introduce new algebras
that are similar to the even and odd nilHecke algebras and
act on *q*-symmetric polynomials. We describe such
algebras for all previously unstudied values of *q*. We
conclude by generalizing a diagrammatic method and
developing the novel method of insertion in order to study *q*-symmetric polynomials from the perspective of
bialgebras.

48) Gabriella Studt, Construction of the higher Bruhat order on the Weyl group of type B (27 May 2014)

Manin and Schechtman defined the Bruhat order on the type A Weyl group, which is closely associated to the symmetric group $S_n$, as the order of all pairs of numbers in {1, 2, ..., n}. They proceeded to define a series of higher orders. Each higher order is an order on the subsets of {1, 2, ..., n} of size $k$, and can be computed using an inductive argument. It is also possible to define each of these higher orders explicitly, and therefore know conclusively the lexicographic orders for all $k$. It is thought that a closely related concept of lexicographic order exists for the Weyl group of type B, and that a similar method can be used to compute this series of higher orders. The applicability of this method is demonstrated in the paper, and we are able to determine and characterize the higher Bruhat order explicitly for certain $n$ and $k$. We therefore conjecture the existence of such an order for all $n > k$, as well as its accompanying properties.

47) Jeffrey Cai, Orbits of a fixed-point subgroup of the symplectic group on partial flag varieties of type A (24 May 2014)

In this paper we compute the orbits of the symplectic
group Sp_{2n} on partial flag varieties GL_{2n}/*P* and on partial flag varieties enhanced
by a vector space, C^{2n} × GL_{2n}/*P*. This extends analogous results proved by
Matsuki on full flags. The general technique used in this
paper is to take the orbits in the full flag case and
determine which orbits remain distinct when the full flag
variety GL_{2n}/*B* is projected down
to the partial flag variety GL_{2n}/*P*.

The recent discovery of a connection between abstract algebra and the classical combinatorial Robinson-Schensted (RS) correspondence has sparked research on related algebraic structures and relationships to new combinatorial bijections, such as the Robinson-Schensted-Knuth (RSK) correspondence, the "mirabolic" RSK correspondence, and the "exotic" RS correspondence. We conjecture an exotic RSK correspondence between the orbits described in this paper and semistandard bi-tableaux, which would yield an extension to the exotic RS correspondence found in a paper of Henderson and Trapa.

46) John Long, Evidence of Purifying Selection in Mammals (9 May 2014)

The Human Genome Project completed in 2003 gave us a reference genome for the human species. Before the project was completed, it was believed that the primary function of DNA was to code for protein. However, it was discovered that only 2% of the genome consists of regions that code for proteins. The remaining regions of the genome are either functional regions that regulate the coding regions or junk DNA regions that do nothing. The distinction between these two types of regions is not completely clear. Evidence of purifying selection, the decrease in frequency of deleterious mutations, is likely a sign that a region is functional. The goal of this project was to find evidence of purifying selection in newly acquired regions in the human genome that are hypothesized to be functional. The mean Derived Allele Frequency of the featured regions was compared to that of control regions to determine the likelihood of selection.

45) Ravi Jagadeesan, A new $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant of dessins d'enfants (arXiv.org, 30 March 2014)

We study the action of $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the category of Belyi functions (finite, \'{e}tale covers of $\mathbb{P}^1_{\overline{\mathbb{Q}}}\setminus \{0,1,\infty\}$). We describe a new combinatorial $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant for a certain class of Belyi functions. As a corollary, we obtain that for all $k < 2^{\sqrt{\frac{2}{3}}}$ and all positive integers $N$, there is an $n \le N$ such that the set of degree $n$ Belyi functions of a particular rational Nielsen class must split into at least $\Omega\left(k^{\sqrt{N}}\right)$ Galois orbits. In addition, we define a new version of the Grothendieck-Teichm\"{u}ller group $\widehat{GT}$ into which $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ embeds.

44) Andrey Grinshpun (MIT), Raj Raina (PRIMES), and Rik Sengupta (MIT), Minimum Degrees of Minimal Ramsey Graphs for Almost-Cliques (arXiv.org, 26 Jun 2014)

For graphs $F$ and $H$, we say $F$ is Ramsey for $H$ if every $2$-coloring of
the edges of $F$ contains a monochromatic copy of $H$. The graph $F$ is Ramsey
$H$-minimal if $F$ is Ramsey for $H$ and there is no proper subgraph $F'$ of
$F$ so that $F'$ is Ramsey for $H$. Burr, Erdős, and Lovász defined $s(H)$ to
be the minimum degree of $F$ over all Ramsey $H$-minimal graphs $F$. Define
$H_{t,d}$ to be a graph on $t+1$ vertices consisting of a complete graph on $t$
vertices and one additional vertex of degree $d$. We show that $s(H_{t,d})=d^2$
for all values $1<d\le t$; it was previously known that $s(H_{t,1})=t-1$, so it
is surprising that $s(H_{t,2})=4$ is much smaller.

We also make some further progress on some sparser graphs. Fox and Lin
observed that $s(H)\ge 2\delta(H)-1$ for all graphs $H$, where $\delta(H)$ is
the minimum degree of $H$; Szabo, Zumstein, and Zurcher investigated which
graphs have this property and conjectured that all bipartite graphs $H$ without
isolated vertices satisfy $s(H)=2\delta(H)-1$. Fox, Grinshpun, Liebenau,
Person, and Szabo further conjectured that all triangle-free graphs without
isolated vertices satisfy this property. We show that $d$-regular $3$-connected
triangle-free graphs $H$, with one extra technical constraint, satisfy $s(H) =
2\delta(H)-1$; the extra constraint is that $H$ has a vertex $v$ so that if one
removes $v$ and its neighborhood from $H$, the remainder is connected.

43) Boryana Doyle (PRIMES), Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), and Leonid Mirny (MIT), Chromatin Loops as Allosteric Modulators of Enhancer-Promoter Interactions, published in *PLoS Computational Biology* (23 Oct 2014; earlier version in BioRxiv.org, 26 February 2014)

The classic model of eukaryotic gene expression requires direct spatial contact between a distal enhancer and a proximal promoter. Recent Chromosome Conformation Capture (3C) studies show that enhancers and promoters are embedded in a complex network of looping interactions. Here we use a polymer model of chromatin fiber to investigate whether, and to what extent, looping interactions between elements in the vicinity of an enhancer-promoter pair can influence their contact frequency. Our equilibrium polymer simulations show that a chromatin loop, formed by elements flanking either an enhancer or a promoter, suppresses enhancer-promoter interactions, working as an insulator. A loop formed by elements located in the region between an enhancer and a promoter, on the contrary, facilitates their interactions. We find that different mechanisms underlie insulation and facilitation; insulation occurs due to steric exclusion by the loop, and is a global effect, while facilitation occurs due to an effective shortening of the enhancer-promoter genomic distance, and is a local effect. Consistently, we find that these effects manifest quite differently for *in silico* 3C and microscopy. Our results show that looping interactions that do not directly involve an enhancer-promoter pair can nevertheless significantly modulate their interactions. This phenomenon is analogous to allosteric regulation in proteins, where a conformational change triggered by binding of a regulatory molecule to one site affects the state of another site.

42) William Kuszmaul, A New Approach to Enumerating Statistics Modulo *n* (arXiv.org, 16 February 2014)

We find a new approach to computing the remainder of a polynomial modulo $x^n-1$; such a computation is called modular enumeration. Given a polynomial with coefficients from a commutative $\mathbb{Q}$-algebra, our first main result constructs the remainder simply from the coefficients of residues of the polynomial modulo $\Phi_d(x)$ for each $d\mid n$. Since such residues can often be found to have nice values, this simplifies a number of modular enumeration problems; indeed in some cases, such residues are already known while the related modular enumeration problem has remained unsolved. We list six such cases which our technique makes easy to solve. Our second main result is a formula for the unique polynomial $a$ such that $a \equiv f \mod \Phi_n(x)$ and $a\equiv 0 \mod x^d-1$ for each proper divisor $d$ of $n$.

We find a formula for remainders of $q$-multinomial coefficients and for remainders of $q$-Catalan numbers modulo $q^n-1$, reducing each problem to a finite number of cases for any fixed $n$. In the prior case, we solve an open problem posed by Hartke and Radcliffe. In considering $q$-Catalan numbers modulo $q^n-1$, we discover a cyclic group operation on certain lattice paths which behaves predictably with regard to major index. We also make progress on a problem in modular enumeration on subset sums posed by Kitchloo and Pachter.
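To make the term "modular enumeration" concrete, here is a minimal Python sketch of the naive baseline: reducing a polynomial modulo $x^n-1$ by folding together coefficients whose exponents agree modulo $n$. It is not the paper's technique via residues modulo $\Phi_d(x)$. The function names and the subset-sum example are assumptions chosen for illustration.

```python
def reduce_mod_xn_minus_1(coeffs, n):
    """Remainder of sum(coeffs[k] * x**k) modulo x**n - 1.

    Because x**n is congruent to 1, the coefficient of x**k
    folds onto the coefficient of x**(k % n).
    """
    remainder = [0] * n
    for k, c in enumerate(coeffs):
        remainder[k % n] += c
    return remainder


def subset_sum_polynomial(elements):
    """Coefficients of prod(1 + x**e); coefficient k counts subsets with sum k."""
    coeffs = [1]
    for e in elements:
        padded = coeffs + [0] * e    # the factor 1
        shifted = [0] * e + coeffs   # the factor x**e
        coeffs = [a + b for a, b in zip(padded, shifted)]
    return coeffs


# Hypothetical example: count subsets of {1, 2, 3, 4} by their sum modulo 3.
counts = reduce_mod_xn_minus_1(subset_sum_polynomial([1, 2, 3, 4]), 3)
print(counts)  # counts[r] = number of subsets whose sum is r modulo 3
```

The folding step costs time linear in the degree, so the difficulty in modular enumeration lies in finding closed forms for the folded coefficients rather than in computing them for one small input.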

41) Ajay Saini, Predictive Modeling of Opinion and Connectivity Dynamics in Social Networks (26 January 2014)

Social networks have been extensively studied in recent years with the aim of understanding how the connectivity of different societies and their subgroups influences the spread of innovations and opinions through human networks. Using data collected from real-world social networks, researchers are able to gain a better understanding of the dynamics of such networks and subsequently model the changes that occur in these networks over time. In our work, we use data from the Social Evolution dataset of the MIT Human Dynamics Lab to develop a data-driven model capable of predicting the trends and long term changes observed in a real-world social network. We demonstrate the effectiveness of the model by predicting changes in both opinion spread and connectivity that reflect the changes observed in our dataset. After validating the model, we use it to understand how different types of social networks behave over time by varying the conditions governing the change of opinions and connectivity. We conclude with a study of opinion propagation under different conditions in which we use the structure and opinion distribution of various networks to identify sets of agents capable of propagating their opinion throughout an entire network. Our results demonstrate the effectiveness of the proposed modeling approach in predicting the future state of social networks and provide further insight into the dynamics of interactions between agents in real-world social networks.

40) Rohil Prasad, Investigating GCD in Z[√2] (11 January 2014)

We attempt to optimize the time needed to calculate greatest common divisors in the Euclidean domain Z[√2].
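For orientation, the baseline that such an optimization would start from is the plain Euclidean algorithm in Z[√2]. The sketch below is an assumption about that baseline, not the paper's optimized method. Elements a + b√2 are stored as integer pairs `(a, b)`, the norm is a² − 2b², and each quotient is rounded coordinate-wise to the nearest element, which keeps the absolute norm of the remainder strictly below that of the divisor. All function names are hypothetical.

```python
def norm(x):
    """Norm of a + b*sqrt(2), namely a^2 - 2*b^2 (may be negative)."""
    a, b = x
    return a * a - 2 * b * b


def mul(x, y):
    """Product (a + b*sqrt(2)) * (c + d*sqrt(2))."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)


def round_div(p, n):
    """floor(p/n + 1/2) using exact integer arithmetic; n may be negative."""
    return (2 * p + n) // (2 * n)


def divmod_zsqrt2(x, y):
    """Quotient q and remainder r with x = q*y + r and |norm(r)| < |norm(y)|."""
    (a, b), (c, d) = x, y
    n = norm(y)  # nonzero for y != 0 because sqrt(2) is irrational
    # x / y = x * conjugate(y) / norm(y), rounded in each coordinate.
    q = (round_div(a * c - 2 * b * d, n), round_div(b * c - a * d, n))
    qy = mul(q, y)
    return q, (x[0] - qy[0], x[1] - qy[1])


def gcd_zsqrt2(x, y):
    """A greatest common divisor of x and y, determined only up to a unit."""
    while y != (0, 0):
        _, r = divmod_zsqrt2(x, y)
        x, y = y, r
    return x


# Hypothetical example: both inputs are multiples of 2 + sqrt(2).
print(gcd_zsqrt2((6, 3), (6, 5)))
```

Because Z[√2] has infinitely many units, the returned element is only one representative of the gcd; comparing absolute norms is a simple way to check a result.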

39) Jin-Woo Bryan Oh, Towards Generalizing Thrackles to Arbitrary Graphs (1 January 2014)

In the 1950s, John Conway came up with the notion of *thrackles*, graphs with embeddings in which no edge
crosses itself, but every pair of distinct edges intersects
each other exactly once. He conjectured that |E(G)| ≤ |V(G)|
for any thrackle G, a question unsolved to this day. In this
paper, we discuss some of the known properties of thrackles
and contribute a few new ones.

Only a few sparse graphs can be thrackles, and so it is
of interest to find an analogous notion that applies to
denser graphs as well. In this paper we introduce a
generalized version of thrackles called *near-thrackles*,
and prove some of their properties. We also discuss a large
number of conjectures about them which seem very obvious but
nonetheless are hard to prove. In the final section, we
introduce *thrackleability*, a number between 0 and 1
that turns out to be an accurate measure of how far away a
graph is from being a thrackle.

38) Junho Won, Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle (1 January 2014)

The minimum number of crossings for all drawings of a given graph $G$ on a plane is called its crossing number, denoted $cr(G)$. Exact crossing numbers are known only for a few families of graphs, and even the crossing number of a complete graph $K_m$ is not known for all $m$. Wenping et al. showed that $cr(K_m\Box C_n)\geqslant n\cdot cr(K_{m+2})$ for $n\geqslant 4$ and $m\geqslant 4$. We adopt their method to find a lower bound for $cr(G\Box C_n)$ where $G$ is a vertex-transitive graph of degree at least 3. We also suggest some particular vertex-transitive graphs of interest, and give two corollaries that give lower bounds for $cr(G\Box C_n)$ in terms of $n$, $cr(G)$, the number of vertices of $G$, and the degree of $G$, which improve on Wenping et al.'s result.

37) Ying Gao, On an Extension of Stanley Depth for Refinement-Ordered Posets (30 December 2013)

The concept of Stanley depth was originally defined for graded modules over commutative rings in 1982 by Richard P. Stanley. However, in 2009 Herzog, Vladoiu, and Zheng found a property, ndepth, of posets analogous to the Stanley depths of certain modules, which provides an important link between combinatorics and commutative algebra. Due to this link, there arises the question of what this ndepth is for certain classes of posets.

Because ndepth was only recently defined, much remains to
be discovered about it. In 2009, Biro, Howard, Keller,
Trotter and Young found a lower bound for the ndepth of the
poset of nonempty subsets of {1, 2, ..., n} ordered
by inclusion. In 2010, Wang calculated the ndepth of the
product of chains n^{k} \ 0. However, ndepth
has yet to be studied in relation to many other commonly
found classes of posets. We chose to research the properties
of the ndepths of one such well-known class of posets - the
posets which consist of non-empty partitions of sets ordered
by refinement, which we denote as G_{i}.

We use combinatorial and algebraic methods to find the
ndepths for small posets in G_{i}. We show
that for posets of increasing size in G_{i},
ndepth is strictly non-decreasing, and furthermore we
show that ndepth[G_{i}] ≥ [8i/29]
for all i. We also find that for all i,
ndepth[G_{i}] ≤ i through the
proof that ndepth[G_{i+1}] ≤ ndepth[G_{i}]
+ 1.

36) Nihal Gowravaram, Enumeration of Subclasses of (2+2)-free Partially Ordered Sets (26 December 2013)

We investigate avoidance in (2+2)-free partially ordered sets, posets that do not contain any induced subposet isomorphic to the union of two disjoint chains of length two. In particular, we are interested in enumerating the number of partially ordered sets of size N avoiding both 2+2 and some other poset α. For any α of size 3, the results are already well-known. However, out of the 15 such α of size 4, only 2 were previously known. Through the course of this paper, we explicitly enumerate 7 other such α of size 4. Also, we consider the avoidance of three posets simultaneously, 2+2 along with some pair (α,β); it turns out that this enumeration is often clean, and has sometimes surprising results. Furthermore, we turn to the question of Wilf-equivalences in (2+2)-free posets. We show such an equivalence between the Y-shaped and chain posets of size 4 via a direct bijection, and in fact, we extend this to show a Wilf-equivalence between the general chain poset and a general Y-shaped poset of the same size. In this paper, while our focus is on enumeration, we also seek to develop an understanding of the structures of the posets in the subclasses we are studying.

35) Yael Fregier (MIT) and Isaac Xia, Lower Central Series Ideal Quotients Over $\mathbb{F}_p$ and $\mathbb{Z}$ (17 November 2013; arXiv.org, 28 Jun 2015)

Given a graded associative algebra $A$, its lower central series is defined by $L_1 = A$ and $L_{i+1} = [L_i, A]$. We consider successive quotients $N_i(A) = M_i(A) / M_{i+1}(A)$, where $M_i(A) = AL_i(A) A$. These quotients are direct sums of graded components. Our purpose is to describe the $\mathbb{Z}$-module structure of the components; i.e., their free and torsion parts. Following computer exploration using *MAGMA*, two main cases are studied. The first considers $A = A_n / (f_1,\dots, f_m)$, with $A_n$ the free algebra on $n$ generators $\{x_1, \ldots, x_n\}$ over a field of characteristic $p$. The relations $f_i$ are noncommutative polynomials in $x_j^{p^{n_j}},$ for some integers $n_j$. For primes $p > 2$, we prove that $p^{\sum n_j} \mid \text{dim}(N_i(A))$. Moreover, we determine polynomials dividing the Hilbert series of each $N_i(A)$. The second concerns $A = \mathbb{Z} \langle x_1, x_2 \rangle / (x_1^m, x_2^n)$. For $i = 2,3$, the bigraded structure of $N_i(A_2)$ is completely described.

34) Steven Homberg, Finding Enrichments of Functional Annotations for Disease-Associated Single-Nucleotide Polymorphisms (10 November 2013)

Computational analysis of SNP-disease associations from GWAS as well as functional annotations of the genome enables the calculation of a SNP set's enrichment for a disease. These statistical enrichments can be and are calculated with a variety of statistical techniques, but there is no standard statistical method for calculating enrichments. Several entirely different tests are used by different investigators in the field. These tests can also be conducted with several variations in parameters which also lack a standard. In our investigation, we develop a computational tool for conducting various enrichment calculations and, using breast cancer-associated SNPs from a GWAS catalog as a foreground against all GWAS SNPs as a background, test the tool and analyze the relative performance of the various tests. The computational tool will soon be released to the scientific community as a part of the Bioconductor package. Our analysis shows that, for R2 threshold in LD block construction, values around 0.8-0.9 are preferable to those with more lax and more strict thresholds respectively. We find that block-matching tests yield better results than peak-shifting tests. Finally, we find that, in block-matching tests, block tallying using binary scoring, noting whether or not a block has an annotation only, yields the most meaningful results, while weighting LD r2 threshold has no influence.

33) Kavish Gandhi, Noah Golowich, and László Miklós Lovász, Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 27 Sept 2013), published in Journal of Combinatorics 5:2 (2014)

We define a linear homogeneous equation to be strongly r-regular if, when a finite number of inequalities is added to the equation, the system of the equation and inequalities is still r-regular. In this paper, we derive a constraint on the coefficients of a linear homogeneous equation that gives a sufficient condition for the equation to be strongly r-regular. In 2009, Alexeev and Tsimerman introduced a family of equations, each of which is (n-1)-regular but not n-regular, verifying a conjecture of Rado from 1933. We show that these equations are actually strongly (n-1)-regular as a corollary of our results.

32) Leigh Marie Braswell and Tanya Khovanova, __On the Cookie Monster Problem__ (arXiv.org, 23 Sept 2013), published in Jennifer Beineke & Jason Rosenhouse, The Mathematics of Various Entertaining Subjects: Research in Recreational Math (Princeton University Press, 2015).

The Cookie Monster Problem supposes that the Cookie Monster wants to empty a set of jars filled with various numbers of cookies. On each of his moves, he may choose any subset of jars and take the same number of cookies from each of those jars. The Cookie Monster number of a set is the minimum number of moves the Cookie Monster must use to empty all of the jars. This number depends on the initial distribution of cookies in the jars. We discuss bounds of the Cookie Monster number and explicitly find the Cookie Monster number for jars containing cookies in the Fibonacci, Tribonacci, n-nacci, and Super-n-nacci sequences. We also construct sequences of k jars such that their Cookie Monster numbers are asymptotically rk, where r is any real number between 0 and 1 inclusive.
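The definition above can be checked on small inputs with an exhaustive search. The following Python sketch is a brute-force breadth-first search over jar states, not the paper's analysis. It assumes that jars holding equal numbers of cookies can be merged, since the same moves empty both, so a state is the set of distinct nonzero jar sizes. The function name is hypothetical.

```python
from collections import deque
from itertools import combinations


def cookie_monster_number(jars):
    """Minimum number of moves to empty all jars, found by breadth-first search."""
    start = frozenset(j for j in jars if j > 0)
    if not start:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, moves = queue.popleft()
        values = sorted(state)
        for size in range(1, len(values) + 1):
            for chosen in combinations(values, size):
                # chosen is sorted, so chosen[0] is the smallest chosen jar
                for amount in range(1, chosen[0] + 1):
                    nxt = frozenset(
                        v - amount if v in chosen else v for v in values
                    ) - {0}
                    if not nxt:
                        return moves + 1
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, moves + 1))
    raise AssertionError("unreachable: taking every cookie from each jar always works")


print(cookie_monster_number([1, 2, 4]))        # powers of two
print(cookie_monster_number([1, 2, 3, 5, 8]))  # consecutive Fibonacci numbers
```

The search space grows quickly with the number and size of jars, so this is only practical for sanity-checking small cases against closed-form results.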

31) Vahid Fazel-Rezai, Equivalence Classes of Permutations Modulo Replacements Between 123 and Two-Integer Patterns (arXiv.org, 18 Sept 2013), published in The Electronic Journal of Combinatorics 21:2 (2014)

We explore a new type of replacement of patterns in permutations, suggested by James Propp, that does not preserve the length of permutations. In particular, we focus on replacements between 123 and a pattern of two integer elements. We apply these replacements in the classical sense; that is, the elements being replaced need not be adjacent in position or value. Given each replacement, the set of all permutations is partitioned into equivalence classes consisting of permutations reachable from one another through a series of bi-directional replacements. We break the eighteen replacements of interest into four categories by the structure of their classes and fully characterize all of their classes.

30) Jesse Geneson (MIT), Rohil Prasad (PRIMES), and Jonathan Tidor (PRIMES), __Bounding sequence extremal functions with formations__ (arXiv.org, 17 Aug 2013), published in The Electronic Journal of Combinatorics 21:3 (2014)

An $(r, s)$-formation is a concatenation of $s$ permutations of $r$ letters.
If $u$ is a sequence with $r$ distinct letters, then let $\mathit{Ex}(u, n)$ be
the maximum length of any $r$-sparse sequence with $n$ distinct letters which
has no subsequence isomorphic to $u$. For every sequence $u$ define
$\mathit{fw}(u)$, the formation width of $u$, to be the minimum $s$ for which
there exists $r$ such that there is a subsequence isomorphic to $u$ in every
$(r, s)$-formation. We use $\mathit{fw}(u)$ to prove upper bounds on
$\mathit{Ex}(u, n)$ for sequences $u$ such that $u$ contains an alternation
with the same formation width as $u$.

We generalize Nivasch's bounds on $\mathit{Ex}((ab)^{t}, n)$ by showing that
$\mathit{fw}((12 \ldots l)^{t})=2t-1$ and $\mathit{Ex}((12\ldots l)^{t}, n)
=n2^{\frac{1}{(t-2)!}\alpha(n)^{t-2}\pm O(\alpha(n)^{t-3})}$ for every $l \geq
2$ and $t\geq 3$, such that $\alpha(n)$ denotes the inverse Ackermann function.
Upper bounds on $\mathit{Ex}((12 \ldots l)^{t} , n)$ have been used in other
papers to bound the maximum number of edges in $k$-quasiplanar graphs on $n$
vertices with no pair of edges intersecting in more than $O(1)$ points.

If $u$ is any sequence of the form $a v a v' a$ such that $a$ is a letter,
$v$ is a nonempty sequence excluding $a$ with no repeated letters and $v'$ is
obtained from $v$ by only moving the first letter of $v$ to another place in
$v$, then we show that $\mathit{fw}(u)=4$ and $\mathit{Ex}(u, n)
=\Theta(n\alpha(n))$. Furthermore we prove that
$\mathit{fw}(abc(acb)^{t})=2t+1$ and $\mathit{Ex}(abc(acb)^{t}, n) =
n2^{\frac{1}{(t-1)!}\alpha(n)^{t-1}\pm O(\alpha(n)^{t-2})}$ for every $t\geq
2$.

29) Jesse Geneson (MIT), Tanya Khovanova (MIT), and Jonathan Tidor (PRIMES), __Convex geometric (k+2)-quasiplanar representations of semi-bar k-visibility graphs__ (arXiv.org, 3 Jul 2013), published in Discrete Mathematics 331 (2014)

We examine semi-bar visibility graphs in the plane and on a cylinder in which sightlines can pass through k objects. We show every semi-bar k-visibility graph has a (k+2)-quasiplanar representation in the plane with vertices drawn as points in convex position and edges drawn as segments. We also show that the graphs having cylindrical semi-bar k-visibility representations with semi-bars of different lengths are the same as the (2k+2)-degenerate graphs having edge-maximal (k+2)-quasiplanar representations in the plane with vertices drawn as points in convex position and edges drawn as segments.

28) Leigh Marie Braswell and Tanya Khovanova, __Cookie Monster Devours Naccis__ (arXiv.org, 18 May 2013), published in the College Mathematics Journal 45:2 (2014)

In 2002, Cookie Monster appeared in *The Inquisitive
Problem Solver*. The hungry monster wants to empty a set
of jars filled with various numbers of cookies. On each of
his moves, he may choose any subset of jars and take the
same number of cookies from each of those jars. The Cookie
Monster number is the minimum number of moves Cookie Monster
must use to empty all of the jars. This number depends on
the initial distribution of cookies in the jars. We discuss
bounds of the Cookie Monster number and explicitly find the
Cookie Monster number for Fibonacci, Tribonacci and other
nacci sequences.

2012 Research Papers

27) William Kuszmaul and Ziling Zhou, __Equivalence classes in S_{n} for three families of pattern-replacement relations__ (arXiv.org, 20 April 2013)

We study a family of equivalence relations in *S*_{n}, the group of permutations on *n* letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of *S*_{c}. In particular, we are interested in the number of classes created in *S*_{n} by each relation and in characterizing these classes. Imposing the condition that the partition of *S*_{c} has one nontrivial part containing the cyclic shifts of a single permutation, we find enumerations for the number of nontrivial classes. When the permutation is the identity, we are able to compare the sizes of these classes and connect parts of the problem to Young tableaux and Catalan lattice paths. Imposing the condition that the partition has one nontrivial part containing all of the permutations in *S*_{c} beginning with 1, we both enumerate and characterize the classes in *S*_{n}. We do the same for the partition that has two nontrivial parts, one containing all of the permutations in *S*_{c} beginning with 1, and one containing all of the permutations in *S*_{c} ending with 1.

26) William Kuszmaul, __Counting permutations modulo pattern-replacement equivalences for three-letter patterns__ (arXiv.org, 20 April 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We study a family of equivalence relations in *S*_{n}, the group of permutations on *n* letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of *S*_{c}. When the partition is of *S*_{3} and has one nontrivial part of size greater than two, we provide formulas for the number of classes created in all unresolved cases. When the partition is of *S*_{3} and has two nontrivial parts, each of size two (as do the Knuth and forgotten relations), we enumerate the classes for 13 of the 14 unresolved cases. In two of these cases, enumerations arise which are the same as those yielded by the Knuth and forgotten relations. The reasons for this phenomenon are still largely a mystery.

25) Tanya Khovanova and Ziv Scully, __Efficient Calculation of Determinants of Symbolic Matrices with Many Variables__ (arXiv.org, 13 April 2013)

Efficient matrix determinant calculations have been studied since the 19th century. Computers expand the range of determinants that are practically calculable to include matrices with symbolic entries. However, the fastest determinant algorithms for numerical matrices are often not the fastest for symbolic matrices with many variables. We compare the performance of two algorithms, fraction-free Gaussian elimination and minor expansion, on symbolic matrices with many variables. We show that, under a simplified theoretical model, minor expansion is faster in most situations. We then propose optimizations for minor expansion and demonstrate their effectiveness with empirical data.
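The two algorithms compared above can be sketched side by side. The Python below is a minimal illustration on integer matrices, under the assumption that a symbolic implementation would substitute polynomial arithmetic for the integer operations. It is not the authors' code and includes none of their proposed optimizations beyond caching repeated minors. Function names are hypothetical.

```python
from functools import lru_cache


def det_minor_expansion(matrix):
    """Determinant by expansion along rows, caching minors by their column set."""
    n = len(matrix)

    @lru_cache(maxsize=None)
    def minor(row, cols):
        # Determinant of the submatrix on rows row..n-1 and the columns in cols.
        if row == n:
            return 1
        total = 0
        for k, c in enumerate(cols):
            entry = matrix[row][c]
            if entry != 0:  # zero entries contribute nothing, so skip the minor
                sign = -1 if k % 2 else 1
                total += sign * entry * minor(row + 1, cols[:k] + cols[k + 1:])
        return total

    return minor(0, tuple(range(n)))


def det_fraction_free(matrix):
    """Determinant by fraction-free (Bareiss) elimination; every division is exact."""
    m = [list(row) for row in matrix]
    n = len(m)
    if n == 0:
        return 1
    sign, prev = 1, 1
    for k in range(n - 1):
        if m[k][k] == 0:
            # Swap in a row with a nonzero pivot, flipping the sign.
            swap = next((i for i in range(k + 1, n) if m[i][k] != 0), None)
            if swap is None:
                return 0
            m[k], m[swap] = m[swap], m[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
        prev = m[k][k]
    return sign * m[n - 1][n - 1]


example = [[2, 1, 3], [0, 4, 1], [5, 2, 6]]
print(det_minor_expansion(example), det_fraction_free(example))
```

On integers the elimination approach does far fewer operations for large matrices; the abstract's point is that this ranking can reverse when entries are polynomials in many variables, where intermediate expressions in elimination grow large.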

24) Michael Zanger-Tishler and Saarik Kalia, __On the Winning and Losing Parameters of Schmidt's Game__ (8 April 2013)

First introduced by Wolfgang Schmidt, the (*α*, *β*)-game and
its modifications have been shown to be a powerful tool in
Diophantine approximation, metric number theory, and
dynamical systems. However, natural questions about the
winning-losing parameters of most sets have not been studied
thoroughly even after more than 40 years. There are a few
results in the literature showing that some non-trivial
points and small regions are winning or losing, but complete
pictures remain largely unknown. Our main goal in this paper
is to provide as much detail as possible about the global
pictures of winning-losing parameters for some interesting
families of sets.

23) Sheela Devadas and Steven Sam, __Representations of Cherednik algebras of G(m, r, n) in positive characteristic__ (arXiv.org, 3 April 2013; forthcoming in Journal of Commutative Algebra)

We study lowest-weight irreducible representations of
rational Cherednik algebras attached to the complex
reflection groups *G(m, r, n)* in characteristic *p*.
Our approach is mostly from the perspective of commutative
algebra. By studying the kernel of the contravariant
bilinear form on Verma modules, we obtain formulas for
Hilbert series of irreducible representations in a number of
cases, and present conjectures in other cases. We observe
that the form of the Hilbert series of the irreducible
representations and the generators of the kernel tend to be
determined by the value of *n* modulo *p*, and are
related to special classes of subspace arrangements. Perhaps
the most novel (conjectural) discovery from the commutative
algebra perspective is that the kernel can be given the
structure of a "matrix regular sequence" in some instances,
which we prove in some small cases.

22) Christina Chen and Nan Li, __Apollonian Equilateral Triangles__ (arXiv.org, 1 March 2013)

Given an equilateral triangle with *a* the square of its
side length and a point in its plane with *b, c, d* the
squares of the distances from the point to the vertices of
the triangle, it can be computed that *a, b, c, d* satisfy 3(*a*^{2}+*b*^{2}+*c*^{2}+*d*^{2})
= (*a*+*b*+*c*+*d*)^{2}. This
paper derives properties of quadruples of nonnegative
integers (*a, b, c, d*), called triangle quadruples,
satisfying this equation. It is easy to verify that the
operation generating (*a, b, c, a*+*b*+*c*-*d*)
from (*a, b, c, d*) preserves this feature and that it
and analogous ones for the other elements can be represented
by four matrices. We examine in detail the triangle group,
the group with these operations as generators, and
completely classify the orbits of quadruples with respect to
the triangle group action. We also compute the number of
triangle quadruples generated after a certain number of
operations and approximate the number of quadruples bounded
by characteristics such as the maximal element. Finally, we
prove that the triangle group is a hyperbolic Coxeter group
and derive information about the elements of triangle
quadruples by invoking Lie groups. We also generalize the
problem to higher dimensions.
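The invariance claim can be checked directly. With the other three entries fixed, the defining equation is a quadratic in the chosen entry whose two roots sum to the sum of the other three, so replacing the entry by that sum minus itself moves to the other root. The Python sketch below applies this to a small example; the function names and the starting quadruple (the point placed at a vertex of a unit triangle) are assumptions for illustration.

```python
def is_triangle_quadruple(q):
    """Check 3(a^2 + b^2 + c^2 + d^2) == (a + b + c + d)^2."""
    return 3 * sum(x * x for x in q) == sum(q) ** 2


def flip(q, i):
    """Replace q[i] by (sum of the other three entries) - q[i]."""
    others = sum(q) - q[i]
    return q[:i] + (others - q[i],) + q[i + 1:]


# Unit triangle with the point at a vertex: squared distances 0, 1, 1.
start = (1, 0, 1, 1)
step1 = flip(start, 1)   # replaces the entry 0
step2 = flip(step1, 0)   # replaces the first entry
for q in (start, step1, step2):
    print(q, is_triangle_quadruple(q))
```

Each flip is an involution, so applying `flip` twice at the same index returns the original quadruple, consistent with these operations generating a Coxeter group.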

21) Dhroova Aiylam, __Modified Stern-Brocot sequences__ (arXiv.org, 29 January 2013)

We present the classical Stern-Brocot tree and provide a new proof of the fact that every rational number between 0 and 1 appears in the tree. We then generalize the Stern-Brocot tree to allow for arbitrary choice of starting terms, and prove that in all cases the tree maintains the property that every rational number between the two starting terms appears exactly once.
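The construction can be sketched by repeatedly inserting mediants between neighbouring terms. In the Python below, fractions are stored as `(numerator, denominator)` pairs and the mediant of a/b and c/d is (a+c)/(b+d). For starting terms other than 0/1 and 1/1, the sketch takes mediants of the stored pairs without reducing them, which is an assumption and may differ from the paper's convention. The function name is hypothetical.

```python
from fractions import Fraction


def stern_brocot_rows(start_left=(0, 1), start_right=(1, 1), depth=3):
    """Rows obtained by inserting the mediant between each pair of neighbours."""
    row = [start_left, start_right]
    rows = [row]
    for _ in range(depth):
        nxt = [row[0]]
        for (a, b), (c, d) in zip(row, row[1:]):
            nxt.append((a + c, b + d))  # mediant of the two neighbours
            nxt.append((c, d))
        row = nxt
        rows.append(row)
    return rows


rows = stern_brocot_rows(depth=3)
for row in rows:
    print(" ".join(f"{a}/{b}" for a, b in row))

# Classical case: no value repeats within a row.
values = [Fraction(a, b) for a, b in rows[-1]]
print(len(values) == len(set(values)))
```

Calling `stern_brocot_rows((1, 3), (3, 4), depth=4)` gives a quick way to explore other starting terms under the stated assumption.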

20) Nihal Gowravaram and Ravi Jagadeesan, __Beyond alternating permutations: Pattern avoidance in Young diagrams and tableaux__ (arXiv.org, 28 January 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We investigate pattern avoidance in alternating permutations and generalizations thereof. First, we study pattern avoidance in an alternating analogue of Young diagrams. In particular, we extend Babson-West's notion of shape-Wilf equivalence to apply to alternating permutations and so generalize results of Backelin-West-Xin and Ouchterlony to alternating permutations. Second, we study pattern avoidance in the more general context of permutations with restricted ascents and descents. We consider a question of Lewis regarding permutations that are the reading words of thickened staircase Young tableaux, that is, permutations that have (k - 1) ascents followed by a descent, followed by (k - 1) ascents, et cetera. We determine the relative sizes of the sets of pattern-avoiding (k - 1)-ascent permutations in terms of the forbidden pattern. Furthermore, we give inequalities in the sizes of sets of pattern-avoiding permutations in this context that arise from further extensions of shape-equivalence type enumerations.

19) Rohil Prasad and Jonathan Tidor, __Optimal Results in Staged Self-Assembly of Wang Tiles__ (22 January 2013)

The subject of self-assembly deals with the spontaneous creation of ordered systems from simple units and is most often applied in the field of nanotechnology. The self-assembly model of Winfree describes the assembly of Wang tiles, simulating assembly in real-world systems. We use an extension of this model, known as the staged self-assembly model introduced by Demaine et al. that allows for discrete steps to be implemented and permits more diverse constructions. Under this model, we resolve the problem of constructing segments, creating a method to produce them optimally. Generalizing this construction to squares gives a new flexible method for their construction. Changing a parameter of the model, we explore much simpler constructions of complex monotone shapes. Finally, we present an optimal method to build most arbitrary shapes.

18) Aaron Klein, __On Rank Functions of
Graphs__ (6 January 2013)

We study *rank functions* (also known as graph
homomorphisms onto Z), ways of imposing graded poset
structures on graphs. We first look at a variation on rank
functions called discrete *Lipschitz functions*. We
relate the number of Lipschitz functions of a graph *G* to the number of rank functions of both *G* and *G* × *E*. We then find generating functions that enable us
to compute the number of rank or Lipschitz functions of a
given graph. We look at a subset of graphs called *squarely generated graphs*, which are graphs whose cycle
space has a basis consisting only of 4-cycles. We show that
the number of rank functions of such a graph is proportional
to the number of 3-colorings of the same graph, thereby
connecting rank functions to the Potts model of statistical
mechanics. Lastly, we look at some asymptotics of rank and
Lipschitz functions for various types of graphs.
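The proportionality between rank functions and 3-colorings can be checked by hand on the smallest squarely generated example, a single 4-cycle. The sketch below is a brute-force count under two assumptions that are not taken from the paper: a rank function is treated as an integer labelling in which adjacent vertices differ by exactly 1, and labellings are normalised so that the first vertex gets 0. The function names are hypothetical.

```python
from itertools import product

def count_rank_functions(vertices, edges):
    """Count labellings f: V -> Z with |f(u) - f(v)| = 1 on every edge,
    normalised so that f(vertices[0]) = 0 (assumes a connected graph)."""
    n = len(vertices)
    root, rest = vertices[0], vertices[1:]
    count = 0
    # In a connected graph no label can be further than n - 1 from the root.
    for values in product(range(-(n - 1), n), repeat=n - 1):
        f = dict(zip(rest, values))
        f[root] = 0
        if all(abs(f[u] - f[v]) == 1 for u, v in edges):
            count += 1
    return count

def count_proper_colorings(vertices, edges, colors=3):
    """Count colourings in which adjacent vertices receive different colours."""
    count = 0
    for values in product(range(colors), repeat=len(vertices)):
        c = dict(zip(vertices, values))
        if all(c[u] != c[v] for u, v in edges):
            count += 1
    return count

square_vertices = [0, 1, 2, 3]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rank_count = count_rank_functions(square_vertices, square_edges)
coloring_count = count_proper_colorings(square_vertices, square_edges)
```

For the 4-cycle the two counts differ by a factor of 3, which corresponds to the three possible colours of the normalised vertex.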

17) Andrew Xia, __Integrated Gene Expression
Probabilistic Models for Cancer Staging__ (1 January 2013)

The current system for classifying cancer patients' stages was introduced more than one hundred years ago. With modern advances in technology, many parts of the system have become outdated. Because the current staging system emphasizes surgical procedures that could be harmful to patients, there has been a movement to develop a new taxonomy, using molecular signatures to potentially avoid surgical testing. This project explores the issues of the current classification system and looks for a potentially better way to classify cancer patients’ stages. Computerization has made a vast amount of cancer data available online. However, a significant portion of the data is incomplete; some crucial information is missing. It is logical to attempt to develop a system for recovering missing cancer data. Successful completion of this research would save costs and increase efficiency in cancer research and treatment. Using various methods, we have shown that cancer stages cannot be simply extrapolated from incomplete data. Furthermore, a new approach using RNA sequencing data is studied. RNA sequencing can potentially become a cost-efficient way to determine a cancer patient’s stage. We have obtained promising results using RNA sequencing data in breast cancer staging.

16) Surya
Bhupatiraju, __On the Complexity
of the Marginal Satisfiability Problem__ (18 November
2012)

The marginal satisfiability problem (MSP) asks: Given desired marginal distributions $D_S$ for every subset $S$ of $c$ variable indices from $\{1, \ldots, n\}$, does there exist a distribution $D$ over $n$-tuples of values in $\{1, \ldots, m\}$ with those $S$-marginals $D_S$? Previous authors have studied MSP in fixed dimensions, and have classified the complexity up to certain upper bounds. However, when using general dimensions, it is known that the size of distributions grows exponentially, making brute force algorithms impractical. This presents an incentive to study more general, tractable variants, which in turn may shed light on the original problem's structure. Thus, our work seeks to explore MSP and its variants for arbitrary dimension, and pinpoint its complexity more precisely. We solve MSP for $n = 2$ and completely characterize the complexity of three closely related variants of MSP. In particular, we detail novel greedy and stochastic algorithms that handle exponentially-sized data structures in polynomial time, as well as generate accurate representative samples of these structures in polynomial time. These algorithms are also unique in that they represent possible protocols in data compression for communication purposes. Finally, we posit conjectures related to more generalized MSP variants, as well as the original MSP.

15) Fengning Ding and Aleksander Tsymbaliuk, __Representations
of Infinitesimal Cherednik Algebras__ (arXiv.org, 17 October
2012), published in Representation Theory 17 (2013)

Infinitesimal Cherednik algebras, first introduced by
Etingof, Gan, and Ginzburg (2005), are continuous analogues
of rational Cherednik algebras, and in the case of gl_{n},
are deformations of universal enveloping algebras of the Lie
algebras sl_{n+1}. Despite these connections, infinitesimal Cherednik algebras are not widely studied, and basic
questions of intrinsic algebraic and representation
theoretical nature remain open. In the first half of this
paper, we construct the complete center of H_{ζ}(gl_{n}) for the
case of n = 2 and give one particular generator of the
center, the Casimir operator, for general n. We find the
action of this Casimir operator on the highest weight
modules to prove the formula for the Shapovalov determinant,
providing a criterion for the irreducibility of Verma
modules. We classify all irreducible finite dimensional
representations and compute their characters. In the second
half, we investigate Poisson-analogues of the infinitesimal
Cherednik algebras and use them to gain insight on the
center of H_{ζ}(gl_{n}). Finally, we investigate H_{ζ}(sp_{2n}) and
extend various results from the theory of H_{ζ}(gl_{n}), such as
a generalization of Kostant's theorem.

14) Tanya Khovanova and Dai Yang, __Halving Lines and
Their Underlying Graphs__ (arXiv.org, 17 October
2012), published in Involve 11:1 (2018): 1–11

In this paper we study halving-edges graphs corresponding to a set of halving lines. In particular, we study the vertex degrees, paths, cycles and cliques of such graphs. In doing so, we study a vertex-partition of such a graph into what we call chains, which have interesting properties.

2011 Research Papers

13) Carl Lian, __Representations of Cherednik Algebras Associated to
Complex Reflection Groups in Positive Characteristic__ (arXiv.org, 1
July
2012)

We consider irreducible lowest-weight representations of
Cherednik algebras associated to certain classes of complex
reflection groups in characteristic *p*. In particular,
we study maximal submodules of Verma modules associated to
these algebras. Various results and conjectures are
presented concerning generators of these maximal submodules,
which are found by computing singular polynomials of Dunkl
operators. This work represents progress toward the general
problem of determining Hilbert series of irreducible
lowest-weight representations of arbitrary Cherednik
algebras in characteristic *p*.

12) Aaron Klein, Joel Brewster Lewis, and Alejandro Morales, __Counting matrices over finite fields with support on skew
Young and Rothe diagrams__ (arXiv.org, 26 March 2012);
published in the Journal of Algebraic Combinatorics (May 2013)

We consider the problem of finding the number of matrices over a finite field with a certain rank and with support that avoids a subset of the entries. These matrices are a q-analogue of permutations with restricted positions (i.e., rook placements). For general sets of entries these numbers of matrices are not polynomials in q (Stembridge 98); however, when the set of entries is a Young diagram, the numbers, up to a power of q-1, are polynomials with nonnegative coefficients (Haglund 98). In this paper, we give a number of conditions under which these numbers are polynomials in q, or even polynomials with nonnegative integer coefficients. We extend Haglund's result to complements of skew Young diagrams, and we apply this result to the case when the set of entries is the Rothe diagram of a permutation. In particular, we give a necessary and sufficient condition on the permutation for its Rothe diagram to be the complement of a skew Young diagram up to rearrangement of rows and columns. We end by giving conjectures connecting invertible matrices whose support avoids a Rothe diagram and Poincaré polynomials of the strong Bruhat order.

11) Surya
Bhupatiraju, Pavel Etingof, David Jordan, William Kuszmaul, and Jason
Li, __Lower central series of a free associative algebra over
the integers and finite fields__ (arXiv.org, 8 March
2012), published in the Journal of Algebra (December 2012)

Consider the free algebra A_n generated over Q by n
generators x_1, ..., x_n. Interesting objects attached to A
= A_n are members of its lower central series, L_i = L_i(A),
defined inductively by L_1 = A, L_{i+1} = [A,L_{i}], and
their associated graded components B_i = B_i(A) defined as
B_i=L_i/L_{i+1}. These quotients B_i, for i at least 2, as
well as the reduced quotient \bar{B}_1=A/(L_2+A L_3),
exhibit a rich geometric structure, as shown by Feigin and
Shoikhet and later authors (Dobrovolska-Kim-Ma, Dobrovolska-Etingof,
Arbesfeld-Jordan, Bapat-Jordan).

We study the same problem over the integers Z and finite
fields F_p. New phenomena arise, namely, torsion in B_i over
Z, and jumps in dimension over F_p. We describe the torsion
in the reduced quotient \bar{B}_1 and B_2 geometrically in terms
of the De Rham cohomology of Z^n. As a corollary we obtain a
complete description of \bar{B}_1(A_n(Z)) and
\bar{B}_1(A_n(F_p)), as well as of B_2(A_n(Z[1/2])) and
B_2(A_n(F_p)), p>2. We also give theoretical and
experimental results for B_i with i>2, formulating a number
of conjectures and questions based on them. Finally, we
discuss the supercase, when some of the generators are odd (fermionic)
and some are even (bosonic), and provide some theoretical
results and experimental data in this case.

10) David Jordan and Masahiro Namiki, __Determinant
formulas for the reflection equation algebra__ (19
Feb 2012)

In this note, we report on work in progress to explicitly describe generators of the center of the reflection equation algebra associated to the quantum GL(N) R-matrix. In particular, we conjecture a formula for the quantum determinant, and for the quadratic central element, both of which involve the excedance statistic on the symmetric group. Current efforts are directed at proving these formulas, and at finding formulas for the remaining central elements.

9) Ziv Scully, Yan Zhang, and Tian-Yi (Damien) Jiang, __Firing Patterns in the Parallel Chip-Firing Game__ (arXiv.org, 29 Nov 2012), published in *Discrete Mathematics and Theoretical Computer Science (DMTCS)* proc., Nancy, France, 2014

The *parallel chip-firing* game is an automaton on graphs in which vertices “fire” chips to their neighbors. This simple model, analogous to sandpiles forming and collapsing, contains much emergent complexity and has connections to different areas of mathematics including self-organized criticality and the study of the sandpile group. In this work, we study *firing sequences*, which describe each vertex’s interaction with its neighbors in this game. Our main contribution is a complete characterization of the periodic firing sequences that can occur in a game, which have a surprisingly simple combinatorial description. We also obtain other results about local behavior of the game after introducing the concept of *motors*.
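To make the notion of a firing sequence concrete, here is a minimal simulation sketch. It assumes the usual rule for this game, which the abstract does not spell out: at each step every vertex holding at least as many chips as it has neighbors fires, sending one chip to each neighbor, and all firings happen simultaneously. The function names and the triangle example are illustrative only.

```python
def parallel_step(graph, chips):
    """One round: every vertex with chips >= degree fires simultaneously."""
    firing = {v for v, neighbors in graph.items() if chips[v] >= len(neighbors)}
    updated = dict(chips)
    for v in firing:
        updated[v] -= len(graph[v])      # v gives away one chip per neighbor
        for w in graph[v]:
            updated[w] += 1              # each neighbor receives one chip
    return updated, firing

def firing_sequences(graph, chips, steps):
    """Record, for each vertex, a 0/1 list saying whether it fired at each step."""
    sequences = {v: [] for v in graph}
    for _ in range(steps):
        chips, firing = parallel_step(graph, chips)
        for v in graph:
            sequences[v].append(1 if v in firing else 0)
    return sequences, chips

# A triangle given as an adjacency list, with a hypothetical starting position.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
start = {0: 2, 1: 1, 2: 0}
sequences, final = firing_sequences(triangle, start, 6)
```

In this example exactly one vertex fires per round and the position returns to the start after three rounds, so each vertex has a periodic firing sequence of period 3.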

8) Sheela Devadas, __Lowest-weight representations of Cherednik algebras in
positive characteristic__ (29 Jan 2012)

We study lowest-weight irreducible representations of
rational Cherednik algebras attached to the complex
reflection groups *G(m, r, n)* in characteristic *p*,
focusing specifically on the case *p* ≤ *n*,
which is more complicated than the case *p* > *n*.
The goal of our work is to calculate characters (and in
particular Hilbert series) of these representations. By
studying the kernel of the contravariant bilinear form on
Verma modules, we proved formulas for Hilbert series of
irreducible modules in a number of cases, and also obtained
a lot of computer data which suggests a number of
conjectures. Specifically, we find that the shape and form
of the Hilbert series of the irreducible representations and
the generators of the kernel tend to be determined by the
value of *n* modulo *p*.

7) Christina Chen, __Maximizing Volume Ratios for Shadow Covering by
Tetrahedra__ (arXiv.org, 9 Jan 2012)

Define a body A to be able to hide behind a body B if the orthogonal projection of B contains a translation of the corresponding orthogonal projection of A in every direction. In two dimensions, it is easy to observe that there exist two objects such that one can hide behind the other and yet have a larger area. It was recently shown that similar examples exist in higher dimensions as well. However, the highest possible volume ratio for such bodies is still undetermined. We investigate two three-dimensional examples, one involving a tetrahedron and a ball and the other involving a tetrahedron and an inverted tetrahedron. We calculate the highest volume ratio known to date, 1.16, which is generated by our second example.

6) Yongyi Chen, Pavel Etingof, David
Jordan, and Michael Zhang, __Poisson traces in positive characteristic__ (arXiv.org,
29 Dec
2011)

We study Poisson traces of the structure algebra A of an affine Poisson variety X defined over a field of characteristic p. According to arXiv:0908.3868v4, the dual space HP_0(A) to the space of Poisson traces arises as the space of coinvariants associated to a certain D-module M(X) on X. If X has finitely many symplectic leaves and the ground field has characteristic zero, then M(X) is holonomic, and thus HP_0(A) is finite dimensional. However, in characteristic p, the dimension of HP_0(A) is typically infinite. Our main results are complete computations of HP_0(A) for sufficiently large p when X is 1) a quasi-homogeneous isolated surface singularity in the three-dimensional space, 2) a quotient singularity V/G, for a symplectic vector space V by a finite subgroup G in Sp(V), and 3) a symmetric power of a symplectic vector space or a Kleinian singularity. In each case, there is a finite nonnegative grading, and we compute explicitly the Hilbert series. The proofs are based on the theory of D-modules in positive characteristic.

5) Saarik Kalia, __The Generalizations of the Golden Ratio: Their Powers,
Continued Fractions, and Convergents__ (23 Dec
2011)

The relationship between the golden ratio and continued fractions is well known: the convergents of its continued fraction are the ratios of consecutive Fibonacci numbers. The continued fractions for the powers of the golden ratio also exhibit an interesting relationship with the Lucas numbers. In this paper, we study the silver means and introduce the bronze means, which are generalizations of the golden ratio. We correspondingly introduce the silver and bronze Fibonacci and Lucas numbers, and we prove the relationship between the convergents of the continued fractions of the powers of the silver and bronze means and the silver and bronze Fibonacci and Lucas numbers. We further generalize this to the Lucas constants, a two-parameter generalization of the golden ratio.
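The classical fact mentioned at the start of this abstract can be verified numerically. The sketch below computes convergents with the standard recurrence and compares those of the golden ratio, whose continued fraction is [1; 1, 1, ...], with ratios of consecutive Fibonacci numbers. The function names are illustrative, and the silver and bronze generalizations from the paper are not reproduced here.

```python
from fractions import Fraction

def convergents(partial_quotients):
    """Return the convergents p_k/q_k of [a0; a1, a2, ...] as exact fractions."""
    p_prev, q_prev = 1, 0                 # conventional values p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1        # p_0, q_0
    result = [Fraction(p, q)]
    for a in partial_quotients[1:]:
        # Standard recurrence: p_k = a_k p_{k-1} + p_{k-2}, same for q_k.
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

def fibonacci(count):
    """First `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    fibs = [1, 1]
    while len(fibs) < count:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:count]

golden_convergents = convergents([1] * 8)
fibs = fibonacci(9)
fibonacci_ratios = [Fraction(fibs[k + 1], fibs[k]) for k in range(8)]
```

Using `Fraction` keeps the comparison exact, so the two lists can be tested for equality rather than approximate agreement.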

4) Caroline
Ellison, __The Number of Nonzero Coefficients of Powers of a
Polynomial over a Finite Field__ (15 Nov 2011)

Coefficients of polynomials over finite fields often
encode information that can be applied in various areas of
science; for instance, computer science and representation
theory. The purpose of this project is to investigate these
coefficients over the finite field $\mathbb{F}_p$. We
find four exact results for the number of nonzero
coefficients in special cases of $n$ and $p$ for the polynomial $(1 + x + x^2)^n$. More importantly, we use Amdeberhan and Stanley's matrices to find what we conjecture to be an approximation for the sum of the number of nonzero coefficients of $P(x)^n$ over $\mathbb{F}_p$. We also relate the number of nonzero coefficients to the number of base $p$ digits of $n$. These results lead to questions in representation theory and combinatorics.
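The quantity studied here, the number of nonzero coefficients of a power of a polynomial over the field with p elements, is easy to compute directly for small cases. The following sketch does so by repeated multiplication with coefficients reduced modulo p; the function names are hypothetical, p is assumed prime, and none of the paper's exact or conjectural formulas are reproduced.

```python
def power_mod_p(poly, n, p):
    """Coefficients (lowest degree first) of poly**n, each reduced modulo p."""
    result = [1]
    for _ in range(n):
        product = [0] * (len(result) + len(poly) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(poly):
                product[i + j] = (product[i + j] + a * b) % p
        result = product
    return result

def count_nonzero(poly, n, p):
    """Number of coefficients of poly**n that are nonzero modulo p."""
    return sum(1 for c in power_mod_p(poly, n, p) if c != 0)

trinomial = [1, 1, 1]   # 1 + x + x^2
cube_mod_2 = count_nonzero(trinomial, 3, 2)
cube_mod_3 = count_nonzero(trinomial, 3, 3)
```

Over the integers the cube of 1 + x + x^2 has coefficients 1, 3, 6, 7, 6, 3, 1, so five of them survive modulo 2 and three survive modulo 3.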

3) Xiaoyu He, __On the
Classification of Universal Rotor-Routers__ (arXiv.org, 6
Nov 2011)

The combinatorial theory of rotor-routers has connections with problems of statistical mechanics, graph theory, chaos theory, and computer science. A rotor-router network defines a deterministic walk on a digraph G in which a particle walks from a source vertex until it reaches one of several target vertices. Motivated by recent results due to Giacaglia et al., we study rotor-router networks in which all non-target vertices have the same type. A rotor type r is universal if every hitting sequence can be achieved by a homogeneous rotor-router network consisting entirely of rotors of type r. We give a conjecture that completely classifies universal rotor types. Then, this problem is simplified by a theorem we call the Reduction Theorem that allows us to consider only two-state rotors. We introduce a rotor-router network called the compressor, so named because it tends to shorten rotor periods, along with an associated algorithm that determines the universality of almost all rotors. New rotor classes, including boppy rotors, balanced rotors, and BURD rotors, are defined to study this algorithm rigorously. Using the compressor, the universality of new rotor classes is proved, and empirical computer results are presented to support our conclusions. Prior to these results, fewer than 100 of the roughly 260,000 possible two-state rotor types of length up to 17 were known to be universal, while the compressor algorithm proves the universality of all but 272 of these rotor types.

2) Yongyi Chen and Michael Zhang, __On zeroth Poisson homology in positive characteristic__ (30
Sept
2011)

A Poisson algebra is a commutative algebra with a Lie bracket {,} satisfying the Leibniz rule. An important invariant of a Poisson algebra A is its zeroth Poisson homology HP_0(A)=A/{A,A}. It characterizes densities on the phase space invariant under all Hamiltonian flows. Also, the dimension of HP_0(A) gives an upper bound for the number of irreducible representations of any quantization of A. We study HP_0(A) when A is the algebra of functions on an isolated quasihomogeneous surface singularity. Over C, it's known that HP_0(A) is the Jacobi ring of the singularity whose dimension is the Milnor number. We generalize this to characteristic p. In this case, HP_0(A) is a finite (although not finite dimensional) module over A^p. We give its conjectural Hilbert series for Kleinian singularities and for cones of smooth projective curves, and prove the conjecture in several cases. (The conjecture has now been proved in general in our follow-up paper with P. Etingof and D. Jordan.)

1) Christina Chen, Tanya Khovanova, and
Daniel A. Klain, __Volume bounds for shadow covering__ (arXiv.org, 8 Sep
2011), published in Transactions of the American Mathematical Society 366 (2014)

For $n \geq 2$ a construction is given for a large family of compact convex sets $K$ and $L$ in $n$-dimensional Euclidean space such that the orthogonal projection $L_u$ onto the subspace $u^{\perp}$ contains a translate of the corresponding projection $K_u$ for every direction $u$, while the volumes of $K$ and $L$ satisfy $V_n(K) > V_n(L)$. It is subsequently shown that, if the orthogonal projection $L_u$ onto the subspace $u^{\perp}$ contains a translate of $K_u$ for every direction $u$, then the set $\frac{n}{n-1} L$ contains a translate of $K$. It follows that $V_n(K) \leq \left(\frac{n}{n-1}\right)^n V_n(L)$. In particular, we derive a universal constant bound $V_n(K) \leq 2.942\, V_n(L)$, independent of the dimension $n$ of the ambient space. Related results are obtained for projections onto subspaces of some fixed intermediate co-dimension. Open questions and conjectures are also posed.
