PRIMES: Research Papers

2022 Research Papers

320) Nitya Mani (MIT) and Edward Yu (PRIMES), Turán Problems for Mixed Graphs (arXiv.org, 23 Oct 2022)

We investigate natural Turán problems for mixed graphs, generalizations of graphs where edges can be either directed or undirected. We study a natural Turán density coefficient that measures how large a fraction of directed edges an $F$-free mixed graph can have; we establish an analogue of the Erdős-Stone-Simonovits theorem and give a variational characterization of the Turán density coefficient of any mixed graph (along with an associated extremal $F$-free family). This characterization enables us to highlight an important divergence between classical extremal numbers and the Turán density coefficient. We show that Turán density coefficients can be irrational, but are always algebraic; for every $k \in \mathbb N$, we construct a family of mixed graphs whose Turán density coefficient has algebraic degree $k$.

319) Alan Bu, Joseph Vulakh, and Alex Zhao, Length-Factoriality and Pure Irreducibility (arXiv.org, 13 Oct 2022)

An atomic monoid $M$ is called length-factorial if for every non-invertible element $x \in M$, no two distinct factorizations of $x$ into irreducibles have the same length (i.e., number of irreducible factors, counting repetitions). The notion of length-factoriality was introduced by J. Coykendall and W. Smith in 2011 under the term 'other-half-factoriality': they used length-factoriality to provide a characterization of unique factorization domains. In this paper, we study length-factoriality in the more general context of commutative, cancellative monoids. In addition, we study factorization properties related to length-factoriality, namely, the PLS property (recently introduced by Chapman et al.) and bi-length-factoriality in the context of semirings.
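
As a small illustration of the definition above (not taken from the paper), the sketch below works in an additive submonoid of $\mathbb{N}_0$ generated by a finite list of atoms, an assumption made only to keep the search finite. It enumerates the factorizations of each element up to a bound and reports the first element that has two distinct factorizations of the same length. The function names are hypothetical, and a bounded search can only refute length-factoriality, never prove it.

```python
def factorizations(x, atoms):
    """All ways to write x as a sum of atoms, as tuples of multiplicities."""
    if not atoms:
        return [()] if x == 0 else []
    first, rest = atoms[0], atoms[1:]
    result = []
    for count in range(x // first + 1):
        for tail in factorizations(x - count * first, rest):
            result.append((count,) + tail)
    return result


def same_length_witness(atoms, bound):
    """Smallest x <= bound with two distinct equal-length factorizations, or None."""
    for x in range(1, bound + 1):
        seen_lengths = set()
        for f in factorizations(x, atoms):
            length = sum(f)  # number of atoms used, counting repetitions
            if length in seen_lengths:
                return x
            seen_lengths.add(length)
    return None


# In the monoid generated by 3, 4, 5 we have 8 = 3 + 5 = 4 + 4 (both of length 2).
print(same_length_witness([3, 4, 5], 60))
# No witness exists for the monoid generated by 2 and 3.
print(same_length_witness([2, 3], 60))
```

For the atoms 2 and 3, two distinct factorizations of the same element always differ in length, so the search finds nothing at any bound.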

318) Alan Lee, Connectedness in Friends-and-Strangers Graphs of Spiders and Complements (arXiv.org, 5 Oct 2022)

Let $X$ and $Y$ be two graphs with vertex set $[n]$. Their friends-and-strangers graph $\mathsf{FS}(X,Y)$ is a graph with vertex set $S_n$, and two permutations $σ$ and $σ'$ are adjacent if they are separated by a transposition $\{a,b\}$ such that $a$ and $b$ are adjacent in $X$ and $σ(a)$ and $σ(b)$ are adjacent in $Y$. Specific friends-and-strangers graphs such as $\mathsf{FS}(\mathsf{Path}_n,Y)$ and $\mathsf{FS}(\mathsf{Cycle}_n,Y)$ have been researched, and their connected components have been enumerated using various equivalence relations such as double-flip equivalence. A spider graph is a collection of path graphs that are all connected to a single center point. In this paper, we delve deeper into the question of when $\mathsf{FS}(X,Y)$ is connected when $X$ is a spider and $Y$ is the complement of a spider or a tadpole.
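
The definition of $\mathsf{FS}(X,Y)$ above can be explored by brute force for small $n$. The following sketch is an illustration only, not the paper's method: it labels vertices $0, \dots, n-1$ instead of $1, \dots, n$, stores a permutation $σ$ as a tuple with `sigma[a]` equal to $σ(a)$, and counts connected components by depth-first search. The function name is hypothetical.

```python
from itertools import permutations


def fs_components(n, x_edges, y_edges):
    """Count connected components of FS(X, Y) for graphs on vertices 0..n-1."""
    y = {frozenset(e) for e in y_edges}
    seen = set()
    components = 0
    for start in permutations(range(n)):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            sigma = stack.pop()
            for a, b in x_edges:
                # The swap is allowed only if sigma(a) and sigma(b) are adjacent in Y.
                if frozenset((sigma[a], sigma[b])) in y:
                    nxt = list(sigma)
                    nxt[a], nxt[b] = nxt[b], nxt[a]
                    nxt = tuple(nxt)
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return components


path3 = [(0, 1), (1, 2)]
complete3 = [(0, 1), (0, 2), (1, 2)]
print(fs_components(3, path3, complete3))  # adjacent transpositions generate S_3
print(fs_components(3, path3, []))         # no swap is ever allowed
```

The search visits all $n!$ permutations, so it is only practical for very small $n$.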

317) Paula Bergero, Laura P. Schaposnik, and Grace Wang (PRIMES), Correlations Between COVID-19 and Dengue (arXiv.org, 27 Jul 2022)

A dramatic increase in the number of outbreaks of Dengue has recently been reported, and climate change is likely to extend the geographical spread of the disease. In this context, this paper shows how a neural network approach can incorporate Dengue and COVID-19 data as well as external factors (such as social behaviour or climate variables), to develop predictive models that could improve our knowledge and provide useful tools for health policy makers. Through the use of neural networks with different social and natural parameters, in this paper we define a Correlation Model through which we show that the number of cases of COVID-19 and Dengue have very similar trends. We then illustrate the relevance of our model by extending it to a Long short-term memory model (LSTM) that incorporates both diseases, and using this to estimate Dengue infections via COVID-19 data in countries that lack sufficient Dengue data.

316) Zifan (Carl) Guo (PRIMES) and William S. Moses (MIT), Understanding High-Level Properties of Low-Level Programs Through Transformers (8 July 2022)

Transformer models have enabled breakthroughs in the field of natural language processing largely because, unlike other models, Transformers can be trained on a large corpus of unlabeled data. One can then perform fine-tuning on the model to fit a specific task. Unlike natural language, which is somewhat tolerant of minor differences in word choices or ordering, the structured nature of programming languages means that program meaning can be completely redefined or be invalid if even one token is altered. In comparison to high-level languages, low-level languages are less expressive and more repetitive with more details from the computer microarchitecture. Whereas recent literature has examined how to effectively use Transformer models on high-level programming semantics, this project explores the effectiveness of applying Transformer models on low-level representations of programs that can shed light on better optimizing compilers. In this paper, we show that Transformer models can translate C to LLVM-IR with high accuracy, by training on a parallel corpus of functions extracted from 1 million compilable, open-sourced C programs (AnghaBench) and its corresponding LLVM-IR after compiling with Clang. Our model shows a $49.57\%$ verbatim match when performed on the AnghaBench dataset and a high BLEU score of 87.68. We also present another case study that analyzes x86_64 basic blocks for estimating their throughput and match the state of the art. We show through ablation studies that a collection of preprocessing simplifications of the low-level programs especially improves the model’s ability to generate low-level programs and discuss data selection, network architecture, as well as limitations to the use of Transformers on low-level programs.

315) Tanisha Saxena (PRIMES) and Jun Wan (MIT), A Systematic Study on the Difference and Conversion Between Synchronous and Asynchronous Protocols (1 July 2022)

In this paper, we provide a fundamental analysis of the similarities and differences between synchronous and asynchronous distributed systems. Specifically, we define a special and normal adversary such that any protocol for a synchronous system that is resilient to the special adversary can be replicated by a protocol for an asynchronous system that is resilient to the normal adversary. Protocols for the synchronous model are less complex, as the guarantee that messages will be delivered within a bounded time makes it easy to determine the sequence of events in the system. But, this is unrealistic in the real world, as systems tend to be asynchronous where messages are not guaranteed to be delivered in a timely manner. Protocols for the asynchronous model, on the other hand, are more complex as there are many edge cases to account for. Our adversaries help to create intermediary models that allow us to replicate protocol outputs across both synchronous and asynchronous systems, allowing for simpler creation of protocols that remain functional under the asynchronous model.

2021 Research Papers

314) Tanya Khovanova (MIT) and Atharva Pathak (PRIMES), Combinatorial Aspects of the Card Game War (arXiv.org, 28 Jan 2022)

This paper studies a single-suit version of the card game War on a finite deck of cards. There are varying methods of how players put the cards that they win back into their hands, but we primarily consider randomly putting the cards back and deterministically always putting the winning card before the losing card. The concept of a $\textit{passthrough}$ is defined, which refers to a player playing through all cards in their hand from a particular point in the game. We consider games in which the second player wins during their first passthrough. We introduce several combinatorial objects related to the game: game graphs, win-loss sequences, win-loss binary trees, and game posets. We show how these objects relate to each other. We enumerate states depending on the number of rounds and the number of passthroughs.

313) Luke Robitaille, Topological Entropy of Simple Braids (22 Jan 2022)

Mathematical objects called $\textit{braids}$ are formed from “strands” (like string or yarn) that intertwine. A certain collection of braids, called $\textit{simple braids}$, correspond to permutations, depending on how the strands get permuted. We can think of braids as maps from a disc with some “punctures” to itself; using this idea, we can consider the $\textit{topological entropy}$ of a braid, which can be zero or positive. What proportion of simple braids have positive topological entropy? The main theorem of this project is that, in the limit as the number of strands increases, the proportion of simple braids that have positive topological entropy approaches 1. This can be proved by showing that we can almost always find a long cycle in the permutation that will enable us to get a braid with three strands that has positive topological entropy, yielding the theorem. Topological entropy of braids can have use beyond just being interesting mathematics, such as for considering how to stir fluids.

312) Andrew Gu, On LU Matrices and Springer Theory (19 Jan 2022)

In this paper, we investigate and find the number of LU matrices in $GL_n(\mathbb{F}_q)$ that are similar to a regular semisimple $s$ in $GL_n(\mathbb{F}_q)$. Linking our results with M.-T. Trinh's study of certain ``generalized Steinberg varieties,'' we expand on his work. Trinh has established certain numerical identities coming from a $P=W$ conjecture of Cataldo-Hausel-Migliorini between affine Springer fibers and these generalized Steinberg varieties. The results of this paper provide numerical evidence of the relation between Springer fibers and LU matrices. Using a linear-algebraic approach, we find a direct relation between LU matrices and Trinh's spaces. Consequently, we derive a closed formula for a point count of LU matrices that is a constant factor from the point count of Trinh's spaces. Furthermore, we identify a common point count among these sets. From this we propose a conjecture that generalizes our results.

311) Zifan (Carl) Guo, The Effectiveness of Transformer Models for Analyzing Low-Level Programs (18 Jan 2022)

Recently, transformer networks have enabled breakthroughs in the field of natural language processing. This is partially due to the fact that transformer models can be first trained on a large corpus of unlabeled data prior to fine-tuning on a downstream task. Unlike natural language, which is somewhat tolerant of minor differences in word choices or ordering, the structured nature of programming languages means that program meaning can be completely redefined or be invalid if even one token is altered. In comparison to high-level languages, low-level languages are less expressive and more repetitive with more details from the computer microarchitecture. Whereas recent literature has examined how to effectively use transformer models on high-level programming semantics, this project explores the effectiveness of applying transformer models on low-level representations of programs that can shed light on better optimizing compilers. In this paper, we show that transformer models can translate C to LLVM-IR with high accuracy, by training on a parallel corpus of functions extracted from 1 million compilable, open-sourced C programs (AnghaBench) and its corresponding LLVM-IR after compiling with Clang. We also present another case study that analyzes x86_64 basic blocks for estimating their throughput. We discuss various changes in data selection, program representation, network architecture, and other modifications that influence the effectiveness of transformer models on low-level programs.

310) Arun S. Kannan and Zifan (Atticus) Wang (PRIMES), Representation Stability and Finite Orthogonal Groups (17 Jan 2022)

In this paper, we prove stability results about orthogonal groups over finite commutative rings where 2 is a unit. Inspired by Putman and Sam (2017), we construct a category $\mathbf{OrI}(R)$ and prove a Noetherianity theorem for the category of $\mathbf{OrI}(R)$-modules. This implies an asymptotic structure theorem for orthogonal groups. In addition, we show general homological stability theorems for orthogonal groups, with both untwisted and twisted coefficients, partially generalizing a result of Charney (1987).

309) Ilaria Seidel, Bounds on Generalized Symmetric Numerical Semigroups (16 Jan 2022)

Numerical semigroups are combinatorial objects that are easy to define, but have rich connections to other fields. Certain families of numerical semigroups are of particular interest because of their connections to algebraic geometry. We focus on one such family known as symmetric semigroups, and analyze the rate of growth of the number of symmetric semigroups $S(g)$ with genus $g$. Then, we partition semigroups of genus $g$ by their Frobenius number, and denote by $N(g, F)$ the number of semigroups with genus $g$ and Frobenius number $F$. We extend results from $S(g)$ to $N(g, 2g-k)$ for $k$ fixed in the range $1 \leq k \leq g$. We state a conjecture about the local behavior of the ratio $\frac{S(g+1)}{S(g)}$, depending on the residue of $g \pmod 3$. Finally, we generalize this conjecture to include $N(g, 2g-k)$ for fixed $k$.
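
For small genus, the counts $N(g, F)$ and $S(g)$ discussed above can be reproduced by brute force. The sketch below is an illustration, not the paper's technique. It relies on two standard facts that the abstract does not state: the Frobenius number of a semigroup of genus $g$ is at most $2g-1$, and a semigroup is symmetric exactly when $F = 2g-1$. The function names are hypothetical, and the code assumes $g \geq 1$.

```python
from itertools import combinations


def gap_sets(g):
    """Yield the gap set of every numerical semigroup of genus g >= 1."""
    top = 2 * g - 1  # the Frobenius number never exceeds 2g - 1
    for gaps in combinations(range(1, top + 1), g):
        gap_set = set(gaps)
        small = [s for s in range(1, top + 1) if s not in gap_set]
        # Closure under addition: no sum of two non-gaps may be a gap.
        if all(a + b not in gap_set for a in small for b in small):
            yield gap_set


def N(g, F):
    """Number of numerical semigroups with genus g and Frobenius number F."""
    return sum(1 for gaps in gap_sets(g) if max(gaps) == F)


def S(g):
    """Number of symmetric numerical semigroups of genus g."""
    return N(g, 2 * g - 1)


print([sum(1 for _ in gap_sets(g)) for g in range(1, 6)])
print([S(g) for g in range(1, 4)])
```

The enumeration checks every $g$-element subset of $\{1, \dots, 2g-1\}$, so it grows quickly and is meant only for experimenting with small cases.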

308) Kevin Cong, Square Tilings of Translation Surfaces (16 Jan 2022)

Translation surfaces are obtained by identifying opposite edges of a polygon with an even number of sides, paired together. We explore the question of tiling translation surfaces including the torus and the surfaces generated by the regular octagon with squares. Given any tiling, we identify its contacts graph, a triangulation formed by assigning one vertex to each square and drawing edges between vertices corresponding to adjacent squares. In particular, we prove that under certain conditions, there is exactly one torus tiling whose contacts graph is a given torus triangulation. We then provide a method to approximately construct this tiling. We also show that the regular octagon translation surface cannot be tiled with squares. However, we give constructive tilings of translation surfaces corresponding to certain affine transformations of the octagon.

307) Akhil Kammila, Proposed Improvements to the Tor Handshake (15 Jan 2022)

Tor is the world’s largest anonymous communication network. It conceals its users’ identities by sending their traffic through three successive Tor relays. To establish connections between users, relays, and destinations, Tor uses a unique two-staged handshake. The first stage is a modified version of TLS 1.2 and the second stage is a fully encrypted exchange of Tor cells. The two-stage process enables both parties to authenticate while masking the differences that the Tor’s handshake has from standard TLS. The Tor handshake has multiple shortcomings when compared to widely-used cryptographic protocols like TLS and QUIC. It has high latency that detracts from the user experience and increased complexity that makes maintenance challenging. The first stage of the handshake also only supports TLS 1.2 despite TLS 1.3’s release in 2018. Our work presents an analysis of Tor’s handshake and proposes improvements. We find messages in the second stage of the Tor handshake that are redundant. Most notably, the responder sends a certificate that is not necessary for authentication. Removing these messages reduces the data transferred in the handshake without compromising the key exchange or authentication. Further, we find that removing backward compatibility from the Tor handshake allows for the trivial use of TLS 1.3 in the first stage. This reduces the round-trips and improves the security of the Tor handshake.

306) Abigail Thomas, The Implementation of Model Pruning to Optimize zk-SNARKs (15 Jan 2022)

Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARKs) are used to convince a verifier that a server possesses certain information without revealing these private inputs. Thus, zk-SNARKs can be useful when outsourcing computations for cloud computing. The proofs returned by the server must be less computationally intensive than the given task, but the more complex the task, the more expensive the proof. We present a method that involves model pruning to decrease the complexity of the given task and thus the proof as well, to allow clients to outsource more complex programs. The proposed method harnesses the benefits of producing accurate results using a lower number of constraints, while remaining secure.

305) Vishnu Emani (PRIMES), Vijay Govindarajan, and David Hoganson (Boston Children's Hospital), Computational Fluid Modeling for Surgical Planning of Single Ventricle Congenital Heart Defects (15 Jan 2022)

Single ventricle defects (SVD) refer to the collection of congenital heart defects in which one chamber of the heart remains weak or underdeveloped. The most common palliative treatment for SVD physiologies involves a 3-stage surgical intervention, ending with the Fontan procedure. For patients with bilateral Superior Vena Cavae (SVC), the bilateral bidirectional Glenn (BBDG) procedure is typically employed. The primary goal of this study was to examine the effects of various physiological factors, such as vascular sizes, hepatic vein angle, curvature and position of the Fontan conduit, and the construction of a neo-innominate vein on the distribution of hepatic flow to the lungs in BBDG geometries.

304) Tanisha Saxena (PRIMES) and Jun Wan (MIT), A Compromise Between Synchronous and Asynchronous Systems (15 Jan 2022)

In this paper, we introduce a partially synchronous model for distributed systems such that any protocol for our model can be transformed to a corresponding protocol for the asynchronous model. Given a distributed system with $n$ users, we define a normal adversary as one that allows up to $f$ (with $f < n/2$) users to send any arbitrary message at any time, and a special adversary that can, additionally, block up to $f$ message channels for any number of users. We prove that, for any synchronous protocol that is resilient to the special adversary, there is an equivalent protocol for the asynchronous model that is resilient to the normal adversary. The special adversary helps us relax the restriction of time-bounded delivery and provides a model that is useful in analyzing if a synchronous protocol can be modified to work correctly in an asynchronous distributed system. Our model provides a basis to use synchronous protocols to function on asynchronous systems such as electronic banking and Blockchain systems distributed across the Internet.

303) Yavor Litchev, Signature Scheme with Access Control (15 Jan 2022)

A wide variety of digital signature schemes currently exist, from RSA to El-Gamal to Schnorr. More recently, multi-party signature schemes have been developed, including distributed signature schemes and threshold signature schemes. In particular, threshold signature schemes provide useful functionality, in that they require the number of participating parties to pass a threshold in order to generate a valid signature. However, they are limited in their complexity, as they can only model a threshold function. The proposed signature scheme (monotonic signature scheme) allows for the modeling of complex functions, so long as they are monotonic. This would allow for a much greater degree of access control, all while security and correctness are preserved.
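
The gap between threshold functions and general monotonic functions can be seen on a toy access policy. The sketch below is an illustration only and contains no cryptography: it models a policy as a predicate on sets of signers, checks monotonicity by brute force, and tests whether the policy coincides with a plain (unweighted) threshold. The party names, policies, and function names are all hypothetical.

```python
from itertools import combinations


def subsets(parties):
    """Yield every subset of the parties as a frozenset."""
    for r in range(len(parties) + 1):
        for combo in combinations(parties, r):
            yield frozenset(combo)


def is_monotone(policy, parties):
    """True if adding a signer never turns an accepted set into a rejected one."""
    return all(
        policy(s | {p}) for s in subsets(parties) if policy(s) for p in parties
    )


def matching_threshold(policy, parties):
    """Return t if policy(s) == (len(s) >= t) for every subset s, else None."""
    for t in range(len(parties) + 2):
        if all(policy(s) == (len(s) >= t) for s in subsets(parties)):
            return t
    return None


parties = ("A", "B", "C")
two_of_three = lambda s: len(s) >= 2
a_and_b_or_c = lambda s: {"A", "B"} <= s or "C" in s  # (A and B) or C

print(is_monotone(two_of_three, parties), matching_threshold(two_of_three, parties))
print(is_monotone(a_and_b_or_c, parties), matching_threshold(a_and_b_or_c, parties))
```

The policy "(A and B) or C" is monotone but matches no unweighted threshold: a threshold of 1 would wrongly accept A alone, and a threshold of 2 would wrongly reject C alone.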

302) Jack Wang, Exploration of Capabilities and Limitations in View Change of the X-Fields Model (15 Jan 2022)

Generating images of the same scenes from different perspectives (whether from different points, from different angles, under varying illumination, or with other parameters) has a myriad of use cases, stretching from creating debug models to producing smooth videos. In the X-Fields model, hard-coded graphics tricks like lighting, 3D projection, and albedo are used to supplement neural networks in creating a differentiable map for the image parameters and the actual pixels using sample images and their corresponding coordinate values. Although X-Fields performs well on datasets of images concentrated on a 2D (x, y) plane relative to alternative interpolation methods, the original model cannot support broader, practical use cases like the interpolation of images in different 3D (x, y, z) positions. In this paper, we use 3D images and coordinates generated by the 3DB framework in our dimensionally expanded X-Fields model. We find that the new model can generate promising interpolation results with relatively sparse datasets and with large view angle changes; that parameters such as learning rate, the bandwidth parameter in soft blending, and others affect the interpolation quality and create trade-offs between training cost and interpolation quality; and that adding certain backgrounds (like the ocean) to reference images can pose challenges for interpolation.

301) Garrett Heller (PRIMES) and Chengyang Shao (MIT), Strichartz and Multi-linear Estimates for the One-dimensional Periodic Dysthe equation (11 Jan 2022)

This paper presents Strichartz estimates for the linearized 1D periodic Dysthe equation on the torus, namely estimate of the $L^6_{x,t}(\mathbb{T}^2)$ norm of the solution in terms of the initial data, and estimate of the $L^4_{x,t}(\mathbb{T}^2)$ norm in terms of the Bourgain space norm. The paper also presents other results such as bilinear and trilinear estimates pertaining to local well-posedness of the 1-dimensional periodic Dysthe equation in a suitable Bourgain space, and ill-posedness results in Sobolev spaces.

300) Neil Chowdhury, Interplay Between Loop Extrusion and Compartmentalization During Mitosis (10 Jan 2022)

During mitosis, DNA changes its physical structure from diffuse chromatin spread throughout the cell nucleus to discrete, compacted, cylindrical chromatids. This process is essential for cells to be able to transfer replicated chromosomes to the daughter nuclei. During interphase, chromatin is compartmentalized into heterochromatin and euchromatin, resulting in a visible signal in Hi-C contact maps. However, as the cell enters mitosis, this signal is disrupted, only to reappear after the cell divides. This paper explores the interphase and mitotic states by modeling DNA using polymer simulations. It is shown that loop extrusion, the mechanism underlying mitotic chromosome formation, can simultaneously be responsible for disrupting compartmentalization.

299) Nathan Xiong (PRIMES) and Pu Yu (MIT), The Master Field and Free Brownian Motions (10 Jan 2022)

The master field on the plane is the large $N$ limit of the Wilson loop functionals from the two-dimensional Yang–Mills holonomy process. In this paper, we redefine the master field purely through free Brownian motions, so that its definition is independent from finite $N$ Yang–Mills theory. From this aspect, we prove that the master field does not depend on the lasso basis chosen on a graph. We also give a new, elementary proof for the Makeenko–Migdal equations, which allow us to efficiently calculate the master field of any loop via a system of differential equations. While previous work in this field is mostly differential geometric in nature, our proofs all use combinatorial techniques, heavily utilizing the moment-cumulant relation from free probability.

298) Sushanth Sathish Kumar, The Restricted Lie Algebra Structure on the Bar Spectral Sequence of an Iterated Loop Space (8 Jan 2022)

There is a rich algebraic structure in the mod $p$ homology of the iterated loop space $H_*(\Omega^n X; \mathbb{F}_p)$. It admits a Lie bracket called the Browder bracket that is compatible with the Dyer-Lashof operations $Q_0, Q_1,\ldots, Q_{n-1}$. Furthermore, the top Dyer-Lashof operation $Q_{n-1}$ is a restriction for the Browder bracket. Ni proved that the Browder bracket on the homology $H_*(\Omega^n X)$ converges to the bracket on $H_*(\Omega^{n-1} X)$ in the bar spectral sequence, making it a spectral sequence of Poisson-Hopf algebras. Our goal is to use the bar spectral sequence to relate the restricted Lie algebra structure given by the top Dyer-Lashof operation on $H_*(\Omega^n X; \mathbb{F}_2)$ to that of $H_*(\Omega^{n-1} X; \mathbb{F}_2)$.

297) Nancy Jiang, Bangzheng Li, and Sophie Zhu, On the primality and elasticity of algebraic valuations of cyclic free semirings (arXiv.org, 4 Jan 2022)

A cancellative commutative monoid is atomic if every non-invertible element factors into irreducibles. Under certain mild conditions on a positive algebraic number $\alpha$, the additive monoid $M_\alpha$ of the evaluation semiring $\mathbb{N}_0[\alpha]$ is atomic. The atomic structure of both the additive and the multiplicative monoids of $\mathbb{N}_0[\alpha]$ has been the subject of several recent papers. Here we focus on the monoids $M_\alpha$, and we study their omega-primality and elasticity, aiming to better understand some fundamental questions about their atomic decompositions. We prove that when $\alpha$ is less than 1, the atoms of $M_\alpha$ are as far from being prime as they can possibly be. Then we establish some results about the elasticity of $M_\alpha$, including that when $\alpha$ is rational, the elasticity of $M_\alpha$ is full (this was previously conjectured by S. T. Chapman, F. Gotti, and M. Gotti).

296) Kunal Kapoor (PRIMES) and Jun Wan (MIT), Consensus under a Dynamic Synchronous Model (3 Jan 2022)

With the advance of blockchain and cryptocurrency, the need for efficient and practical consensus algorithms is growing. However, most existing works only consider protocols under the synchronous setting. It is usually assumed that there exist at least $h$ users who are always honest and online. This is impractical as honest users might alternate between online and offline states. In this paper, we adapt Byzantine Broadcast protocols to a dynamic synchronous model which features sleepy/offline users as well as information gaps. We do this by building off an approach centered around a Trust Graph, modifying key algorithms from previous works such as the post-processing algorithm to ensure correctness with the dynamic model. This allows the creation of a more fault-tolerant protocol.

295) Andrew Du, Quaternion-Based Analytical Inverse Dynamics for the Human Body (31 Dec 2021)

The human body provides unique challenges to study from a dynamical perspective, due to its mechanical complexity and the difficulty of obtaining measurements of internal dynamic quantities. Thus, it is essential to create models that both simplify analysis and account for important anatomical details, the two of which must necessarily be balanced into a sufficiently accurate-yet-manageable framework. A number of critical applications require accurate inverse dynamic models of the human body, including medical treatment and virtual simulation of human motion. A recent general technique was developed by Dumas et al. that used a quaternion screw algebra to make computation of inverse dynamic quantities more practical and more efficient. In this paper, we adapt their technique to the case of human anatomy, integrating these computational improvements within a novel framework for modeling human musculature.

294) Tanmay Gupta and Anshul Rastogi, Threshold-Based Inference of Dependencies in Distributed Systems (31 Dec 2021)

Many current online services rely on the interaction between different components that form a distributed system. Analyzing distributed systems is important in performance analysis (e.g. critical path analysis), debugging, and testing new features. However, the analysis of these systems can be difficult due to limited knowledge of how components work and the variety of services and applications that are usually instrumented. The Mystery Machine, introduced by Chow et al. in 2014, takes a “big data” approach, using logged events across many traces to generate and refine a causal model. We introduce Scooby Systems, our extension of The Mystery Machine’s algorithm. We introduce thresholds to increase the tolerance to violations in the formation of causal relationships. In the future, we hope to improve Scooby Systems’s scalability with a Hadoop MapReduce implementation.
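
The idea of tolerating a bounded share of violations when inferring causal relationships can be sketched as follows. This is a hypothetical simplification, not the Scooby Systems algorithm itself: each trace is assumed to be a dictionary from event name to timestamp, the only relationship considered is "a happens before b", and a pair is kept when the fraction of traces that contradict it does not exceed `threshold`. All names are illustrative.

```python
from itertools import permutations


def infer_happens_before(traces, threshold=0.0):
    """Return pairs (a, b) where a precedes b in all but a `threshold` share of traces."""
    events = set().union(*traces)  # every event name seen in any trace
    relations = set()
    for a, b in permutations(sorted(events), 2):
        together = [t for t in traces if a in t and b in t]
        if not together:
            continue  # no evidence either way
        violations = sum(1 for t in together if t[a] >= t[b])
        if violations / len(together) <= threshold:
            relations.add((a, b))
    return relations


traces = [
    {"A": 0, "B": 1, "C": 2},
    {"A": 0, "B": 2, "C": 3},
    {"A": 1, "B": 2, "C": 5},
    {"A": 0, "B": 4, "C": 3},  # one trace where C precedes B
]
print(sorted(infer_happens_before(traces)))                  # strict: one violation refutes B -> C
print(sorted(infer_happens_before(traces, threshold=0.25)))  # tolerant: B -> C survives
```

With a threshold of zero, a single contradicting trace removes a relationship; raising the threshold keeps relationships that hold in most traces, at the cost of possibly admitting false ones.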

293) Yihao (Michael) Huang and Claire Wang, Efficient Algorithms for Parallel Bi-core Decomposition (31 Dec 2021)

Graphs are used in the modeling of social networks, biological networks, user-product networks, and many other real-world relationships. Identifying dense regions within these graphs can often aid in applications including product-recommendation, spam identification, and protein-function discovery. A fundamental dense substructure discovery problem in graph theory is the $k$-core decomposition. However, the $k$-core decomposition does not directly apply to bipartite graphs, which are graphs that model the connections between two disjoint sets of entities, such as book-authorship, affiliation, and gene-disease association. Given the prevalence of bipartite graphs, solving the dense subgraph discovery problem on bipartite graphs has wide-reaching real-world impacts.
In this paper, we solve the bipartite analogue of the $k$-core decomposition problem, which is the bi-core decomposition problem. Existing sequential bi-core decomposition algorithms are not scalable to large-scale bipartite graphs with hundreds of millions of edges. Therefore, we develop a theoretically efficient parallel bi-core decomposition algorithm. Our algorithm improves the theoretical bounds of existing algorithms, reducing the length of the computation graph’s longest dependency path, which asymptotically bounds the runtime of a parallel algorithm when there are sufficiently many processors. We prove the problem of bi-core decomposition to be P-complete. We also devise a parallel bi-core index structure to allow for fast queries of the computed cores. Finally, we provide optimized parallel implementations of our algorithms that are scalable and fast. Using 30 threads, our parallel bi-core decomposition algorithm achieves up to a 44x speedup over the best existing sequential algorithm and up to a 2.9x speedup over the best existing parallel algorithm. Our parallel query implementation is up to 22.3x faster than the existing sequential query implementation.
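
For readers unfamiliar with bi-cores, a simple sequential peeling procedure gives the flavor of the problem. The sketch below is not the parallel algorithm from the paper. It assumes the usual definition from the literature, which the abstract does not state: the $(\alpha, \beta)$-core of a bipartite graph with sides $U$ and $V$ is the maximal subgraph in which every remaining $U$-vertex has degree at least $\alpha$ and every remaining $V$-vertex has degree at least $\beta$. The function name is hypothetical, and $\alpha, \beta \geq 1$ is assumed.

```python
def alpha_beta_core(edges, alpha, beta):
    """Return (U', V'), the vertex sets of the (alpha, beta)-core.

    `edges` is an iterable of (u, v) pairs with u in U and v in V.
    """
    adj_u, adj_v = {}, {}
    for u, v in edges:
        adj_u.setdefault(u, set()).add(v)
        adj_v.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        # Peel U-vertices whose degree has dropped below alpha.
        for u in [u for u, nbrs in adj_u.items() if len(nbrs) < alpha]:
            for v in adj_u.pop(u):
                adj_v[v].discard(u)
            changed = True
        # Peel V-vertices whose degree has dropped below beta.
        for v in [v for v, nbrs in adj_v.items() if len(nbrs) < beta]:
            for u in adj_v.pop(v):
                adj_u[u].discard(v)
            changed = True
    return set(adj_u), set(adj_v)


# A 2x2 complete bipartite graph plus one pendant edge (u3, v1).
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
print(alpha_beta_core(edges, 2, 2))
```

Each peeling round depends on the previous one, which hints at why long dependency chains are the obstacle a parallel algorithm has to address.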

292) Raymond Feng, Andrew Lee, and Espen Slettnes, Results on Various Models of Mistake-Bounded Online Learning (29 Dec 2021)

We determine bounds for several variations of the mistake-bound model. The first half of our paper presents various bounds on the weak reinforcement model and the delayed, ambiguous reinforcement model. In both models, the adversary gives $r$ inputs in one round and only indicates a correct answer if all $r$ guesses are correct. The only difference between the two models is that in the delayed, ambiguous model, the learner must answer each input before receiving the next input of the round, while the learner receives all $r$ inputs at once in the modified weak reinforcement model. We also prove generalizations for multi-class functions.
Then, we prove a lower and upper bound of the maximum factor gap that are tight up to a factor of $r$ between the modified weak reinforcement model and the standard model.
Lastly, we also introduce several related models for learning with permutation patterns: the order model, the relative position model, and the delayed relative position model. In these models, a learner attempts to learn a permutation from a set of permutations $F$ by guessing statistics related to sub-permutations. We similarly define the notions of weak versus strong reinforcement and of delayed, ambiguous reinforcement, and determine some sharp bounds by mimicking sorting algorithms.

291) Fenghuan (Linda) He, A Topological Centrality Measure for Directed Networks (24 Dec 2021; arXiv.org, 30 Jan 2022)

Given a directed network $G$, we are interested in studying the qualitative features of $G$ which govern how perturbations propagate across $G$. Various classical centrality measures have been already developed and proven useful to capture qualitative features and behaviors for undirected networks. In this paper, we use topological data analysis (TDA) to adapt measures of centrality to capture both directedness and non-local propagating behaviors in networks. We introduce a new metric for computing centrality in directed weighted networks, namely the quasi-centrality measure. We compute these metrics on trade networks to illustrate that our measure successfully captures propagating effects in the network and can also be used to identify sources of shocks that can disrupt the topology of directed networks. Moreover, we introduce a method that gives a hierarchical representation of the topological influences of nodes in a directed network.

290) Joshua Guo (PRIMES) and Kevin Chang (MIT), On the Gauss-Epple homomorphism of the braid group $B_n$, and generalizations to Artin groups of crystallographic type (24 Dec 2021)

In this paper, we introduce a broad family of group homomorphisms that we name the Gauss-Epple homomorphisms. In the setting of braid groups, the Gauss-Epple invariant was originally defined by Epple based on a note of Gauss as an action of the braid group $B_n$ on the set $\{1, \dots, n\}\times\mathbb{Z}$; we prove that it is well-defined. We consider the associated group homomorphism from $B_n$ to the symmetric group $\text{Sym}(\{1, \dots, n\}\times\mathbb{Z})$. We prove that this homomorphism factors through $\mathbb{Z}^n\rtimes S_n$ (in fact, its image is an order 2 subgroup of the previous group). We also describe the kernel of the homomorphism and calculate the asymptotic probability that it contains a random braid of a given length. Furthermore, we discuss the super-Gauss-Epple homomorphism, a homomorphism which extends the generalization of the Gauss-Epple homomorphism and describe a related 1-cocycle of the symmetric group $S_n$ on the set of antisymmetric $n\times n$ matrices over the integers. We then generalize the super-Gauss-Epple homomorphism and the associated 1-cocycle to Artin groups of finite type. For future work, we suggest studying possible generalizations to complex reflection groups and computing the vector spaces of Gauss-Epple analogues.

289) Valeri Frumkin (MIT) and Rishabh Das (PRIMES), Thermal modulation of fluidic lenses in microgravity (22 Dec 2021)

The fluidic shaping method is an exciting new technology that allows one to rapidly shape liquids into a wide range of optical topographies with sub-nanometer surface quality. The scale-invariance of the method makes it well suited for space-based fabrication of large fluidic optics. However, in microgravity, the resulting optical topographies are limited to constant mean curvature surfaces. Here we study how variations in surface tension result in deviations from constant mean curvature topographies, allowing one to introduce optical corrections which would not be obtainable otherwise. Under the assumption of small thermal Peclet number, we derive a differential equation governing the steady-state shape of the liquid surface under the effect of spatially varying surface tension. This equation allows us to formulate an inverse problem of finding the required surface-tension distribution for a desired correction. Lastly, we provide several examples for surface tension distributions yielding required aspheric topographies.

288) Yi Liang (PRIMES) and James Unwin (University of Illinois at Chicago), COVID-19 Forecasts via Stock Market Indicators (arXiv.org, 13 Dec 2021)

Reliable short-term forecasting can provide potentially lifesaving insights into logistical planning, and in particular, into the optimal allocation of resources such as hospital staff and equipment. By reinterpreting COVID-19 daily cases in terms of candlesticks, we are able to apply some of the most popular stock market technical indicators to obtain predictive power over the course of the pandemic. By providing a quantitative assessment of MACD, RSI, and candlestick analyses, we show their statistical significance in making predictions for both stock market data and WHO COVID-19 data. In particular, we show the utility of this novel approach by considering the identification of the beginnings of subsequent waves of the pandemic. Finally, our new methods are used to assess whether current health policies are impacting the growth in new COVID-19 cases.
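As a rough illustration of one of the indicators named above, the following sketch computes MACD in its textbook form on a daily series. It is not the authors' procedure: the 12/26/9 spans are the conventional stock-market defaults, the function names are chosen here for illustration, and the case counts are made up.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(values, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) for a numeric series."""
    fast_ema = ema(values, fast)
    slow_ema = ema(values, slow)
    # MACD line: fast EMA minus slow EMA.
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    # Signal line: EMA of the MACD line itself.
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Hypothetical daily case counts (invented for illustration only).
daily_cases = [100 + 5 * day for day in range(60)]
macd_line, signal_line, histogram = macd(daily_cases)
print("last MACD value is positive:", macd_line[-1] > 0)
```

On a steadily rising series the fast average stays above the slow one, so the MACD line is positive; a sign change of the histogram is the usual crossover signal in technical analysis.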

287) Anuj Sakarda, Jerry Tan, and Armaan Tipirneni, On the Distance Spectra of Extended Double Stars (arXiv.org, 6 Dec 2021)

The distance matrix of a connected graph is defined as the matrix in which the entries are the pairwise distances between vertices. The distance spectrum of a graph is the set of eigenvalues of its distance matrix. A graph is said to be determined by its distance spectrum if there does not exist a non-isomorphic graph with the same spectrum. The question of which graphs are determined by their spectrum has been raised in the past, but it remains largely unresolved. In this paper, we prove that extended double stars are determined by their distance spectra.
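The definitions above can be made concrete with a small sketch. It builds the distance matrix of a connected graph by breadth-first search and checks one eigenvalue by hand. The example graph (a star with three leaves) and all names are chosen here for illustration; it is not one of the extended double stars studied in the paper.

```python
from collections import deque


def distance_matrix(adj):
    """Pairwise shortest-path distances of a connected graph (adjacency lists)."""
    n = len(adj)
    dist = [[0] * n for _ in range(n)]
    for source in range(n):
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        if len(seen) != n:
            raise ValueError("graph must be connected")
        for v, d in seen.items():
            dist[source][v] = d
    return dist


def mat_vec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Star with centre 0 and leaves 1, 2, 3.
star = [[1, 2, 3], [0], [0], [0]]
D = distance_matrix(star)

# v is an eigenvector of D with eigenvalue -2, so -2 lies in the distance spectrum.
v = [0, 1, -1, 0]
print(D)
print(mat_vec(D, v))
```

Computing the full spectrum would need an eigenvalue routine from a numerical library; the check above only confirms a single eigenpair.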

286) Daniel Xia (PRIMES) and Pei-Ken Hung (University of Minnesota), A Minkowski-type inequality in the AdS-Melvin space (arXiv.org, 19 Nov 2021)

The AdS-Melvin spacetime was introduced by Astorino and models the AdS soliton with electromagnetic charge. It is a static spacetime with a time-symmetric Cauchy hypersurface, which we refer to as the AdS-Melvin space. In this paper, we study a sharp Minkowski-type inequality for surfaces embedded in the AdS-Melvin space. We first prove the inequality for special cases in which the surface enjoys axisymmetry or is a small perturbation of a coordinate torus. We then use a weighted normal flow to show that the inequality holds for general surfaces.

285) Jeremy Yu (PRIMES), Lu Lu (MIT), Xuhui Meng, and George Em Karniadakis, Gradient-enhanced physics-informed neural networks for forward and inverse PDE problems (arXiv.org, 1 Nov 2021), published in Computer Methods in Applied Mechanics and Engineering, vol. 393 (1 April 2022)

Deep learning has been shown to be an effective tool in solving partial differential equations (PDEs) through physics-informed neural networks (PINNs). PINNs embed the PDE residual into the loss function of the neural network, and have been successfully employed to solve diverse forward and inverse PDE problems. However, one disadvantage of the first generation of PINNs is that they usually have limited accuracy even with many training points. Here, we propose a new method, gradient-enhanced physics-informed neural networks (gPINNs), for improving the accuracy and training efficiency of PINNs. gPINNs leverage gradient information of the PDE residual and embed the gradient into the loss function. We tested gPINNs extensively and demonstrated the effectiveness of gPINNs in both forward and inverse PDE problems. Our numerical results show that gPINN performs better than PINN with fewer training points. Furthermore, we combined gPINN with the method of residual-based adaptive refinement (RAR), a method for improving the distribution of training points adaptively during training, to further improve the performance of gPINN, especially in PDEs with solutions that have steep gradients.

284) Felix Gotti (MIT) and Bangzheng Li (PRIMES), Atomic semigroup rings and the ascending chain condition on principal ideals (arXiv.org, 30 Oct 2021)

An integral domain is called atomic if every nonzero nonunit element factors into irreducibles. On the other hand, an integral domain is said to satisfy the ascending chain condition on principal ideals (ACCP) if every ascending chain of principal ideals terminates. It was asserted by Cohn back in the sixties that every atomic domain satisfies the ACCP, but such an assertion was refuted by Grams in the seventies with an explicit construction of a neat example. Still, atomic domains without the ACCP are notoriously elusive, and just a few classes have been found since Grams' first construction. In the first part of this paper, we generalize Grams' construction to provide new classes of atomic domains without the ACCP. In the second part of this paper, we construct what seems to be the first atomic semigroup ring without the ACCP in the existing literature.

283) Karthik Seetharaman, William Yue, and Isaac Zhu, Patterns in the Lattice Homology of Seifert Homology Spheres (arXiv.org, 26 Oct 2021)

In this paper, we study various homology cobordism invariants for Seifert fibered integral homology 3-spheres derived from Heegaard Floer homology. Our main tool is lattice homology, a combinatorial invariant defined by Ozsváth-Szabó and Némethi. We reprove the fact that the $d$-invariants of Seifert homology spheres $\Sigma(a_1,a_2,\dots,a_n)$ and $\Sigma(a_1,a_2,\dots,a_n+a_1a_2\cdots a_{n-1})$ are the same using an explicit understanding of the behavior of the numerical semigroup minimally generated by $a_1a_2\cdots a_n/a_i$ for $i\in[1,n]$. We also study the maximal monotone subroots of the lattice homologies, another homology cobordism invariant introduced by Dai and Manolescu. We show that the maximal monotone subroots of the lattice homologies of Seifert homology spheres $\Sigma(a_1,a_2,\dots,a_n)$ and $\Sigma(a_1,a_2,\dots,a_n+2a_1a_2\cdots a_{n-1})$ are the same.

282) Christian Gaetz (MIT) and Ram K. Goel (PRIMES), Products of reflections in smooth Bruhat intervals (arXiv.org, 25 Oct 2021)

A permutation is called smooth if the corresponding Schubert variety is smooth. Gilboa and Lapid prove that in the symmetric group, multiplying the reflections below a smooth element $w$ in Bruhat order in a compatible order yields back the element $w$. We strengthen this result by showing that such a product in fact determines a saturated chain $e \to w$ in Bruhat order, and that this property characterizes smooth elements.

281) Yash Agarwal (PRIMES) and Sarah Greer (MIT), Convolutional encoder decoder network for the removal of coherent seismic noise (arXiv.org, 25 Oct 2021)

Seismologists often need to gather information about the subsurface structure of a location to determine if it is fit to be drilled for oil. However, there may be electrical noise in seismic data which is often removed by disregarding certain portions of the data with the use of a notch filter. Instead, we use a convolutional encoder decoder network to remove such noise by training the network to take the noisy shot record as input and remove the noise from the shot record as output. In this way, we retain important information about the data collected while still removing coherent noise in seismic data.

280) Sophia Benjamin, Arushi Mantri, and Quinn Perian, On the Wasserstein Distance Between $k$-Step Probability Measures on Finite Graphs (arXiv.org, 20 Oct 2021)

We consider random walks $X,Y$ on a finite graph $G$ with respective lazinesses $\alpha, \beta \in [0,1]$. Let $\mu_k$ and $\nu_k$ be the $k$-step transition probability measures of $X$ and $Y$. In this paper, we study the Wasserstein distance between $\mu_k$ and $\nu_k$ for general $k$. We consider the sequence formed by the Wasserstein distance at odd values of $k$ and the sequence formed by the Wasserstein distance at even values of $k$. We first establish that these sequences always converge, and then we characterize the possible values for the sequences to converge to. We further show that each of these sequences is either eventually constant or converges at an exponential rate. By analyzing the cases of different convergence values separately, we are able to partially characterize when the Wasserstein distance is constant for sufficiently large $k$.
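The objects in this abstract can be illustrated on a very small case. The sketch below computes $k$-step measures of lazy random walks and their Wasserstein distance on a path graph, where the distance reduces to a sum of absolute differences of cumulative distributions. The laziness convention used (stay put with probability equal to the laziness, otherwise move to a uniformly random neighbor), the choice of starting vertices, and all function names are assumptions made here, not taken from the paper.

```python
def lazy_step(measure, adj, laziness):
    """One step of a lazy random walk applied to a probability vector."""
    new = [0.0] * len(adj)
    for u, mass in enumerate(measure):
        new[u] += laziness * mass
        for v in adj[u]:
            new[v] += (1 - laziness) * mass / len(adj[u])
    return new


def k_step_measure(adj, start, laziness, k):
    """Distribution after k steps of the lazy walk started at `start`."""
    measure = [0.0] * len(adj)
    measure[start] = 1.0
    for _ in range(k):
        measure = lazy_step(measure, adj, laziness)
    return measure


def wasserstein_on_path(mu, nu):
    """1-Wasserstein distance when vertices 0..n-1 form a path graph.

    On a path the graph distance is |i - j|, so the distance equals the
    sum of absolute differences of the two cumulative distributions.
    """
    total, gap = 0.0, 0.0
    for m, n_ in zip(mu[:-1], nu[:-1]):
        gap += m - n_
        total += abs(gap)
    return total


# Path on 5 vertices: 0 - 1 - 2 - 3 - 4.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
for k in range(0, 7):
    mu_k = k_step_measure(path, 0, 0.5, k)   # laziness alpha = 0.5 (illustrative)
    nu_k = k_step_measure(path, 3, 0.25, k)  # laziness beta = 0.25 (illustrative)
    print(k, wasserstein_on_path(mu_k, nu_k))
```

For general graphs the Wasserstein distance requires solving a transport problem; the cumulative-sum shortcut is specific to paths.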

279) Sheryl Hsu (PRIMES), Fidel I. Schaposnik Massolo (Université Libre de Bruxelles), and Laura P. Schaposnik (University of Illinois at Chicago), The Power of Many: A Physarum Swarm Steiner Tree Algorithm (arXiv.org, 15 Oct 2021)

We create a novel Physarum Steiner algorithm designed to solve the Euclidean Steiner tree problem. Physarum is a unicellular slime mold with the ability to form networks and fuse with other Physarum organisms. We use the simplicity and fusion of Physarum to create large swarms which independently operate to solve the Steiner problem. The Physarum Steiner tree algorithm then utilizes a swarm of Physarum organisms which gradually find terminals and fuse with each other, sharing intelligence. The algorithm is also highly capable of solving the obstacle avoidance Steiner tree problem and is a strong alternative to the current leading algorithm. The algorithm is of particular interest due to its novel approach, rectilinear properties, and ability to run on varying shapes and topological surfaces.

278) Alexander Tianlin Hu (PRIMES) and Andrey Boris Khesin (MIT), Improved Graph Formalism for Quantum Circuit Simulation (arXiv.org, 20 Sep 2021)

Improving the simulation of quantum circuits on classical computers is important for understanding quantum advantage and increasing development speed. In this paper, we explore a new way to express stabilizer states and further improve the speed of simulating stabilizer circuits with a current existing approach. First, we discover a unique and elegant canonical form for stabilizer states based on graph states to better represent stabilizer states and show how to efficiently simplify stabilizer states to canonical form. Second, we develop an improved algorithm for graph state stabilizer simulation and establish limitations on reducing the quadratic runtime of applying controlled-Pauli $Z$ gates. We do so by creating a simpler formula for combining two Pauli-related stabilizer states into one. Third, to better understand the linear dependence of stabilizer states, we characterize all linearly dependent triplets, revealing symmetries in the inner products. Using our novel controlled-Pauli $Z$ algorithm, we improve runtime for inner product computation from $O(n^3)$ to $O(nd^2)$ where $d$ is the maximum degree of the graph.

277) Sophie Zhu, Factorizations in evaluation monoids of Laurent semirings (arXiv.org, 26 Aug 2021), published in Communications in Algebra 50:6 (2022): 2719-2730

For a positive real number $α$, let $\mathbb{N}_0[α,α^{-1}]$ be the semiring of all real numbers $f(α)$ for $f(x)$ lying in $\mathbb{N}_0[x,x^{-1}]$, which is the semiring of all Laurent polynomials over the set of nonnegative integers $\mathbb{N}_0$. In this paper, we study various factorization properties of the additive structure of $\mathbb{N}_0[α, α^{-1}]$. We characterize when $\mathbb{N}_0[α, α^{-1}]$ is atomic. Then we characterize when $\mathbb{N}_0[α, α^{-1}]$ satisfies the ascending chain condition on principal ideals in terms of certain well-studied factorization properties. Finally, we characterize when $\mathbb{N}_0[α, α^{-1}]$ satisfies the unique factorization property and show that, when this is not the case, $\mathbb{N}_0[α, α^{-1}]$ has infinite elasticity.

276) Felix Gotti (MIT) and Bangzheng Li (PRIMES), Divisibility in rings of integer-valued polynomials (arXiv.org, 25 July 2021), published in The New York Journal of Mathematics 28 (2022): 117–139

In this paper, we address various aspects of divisibility by irreducibles in rings consisting of integer-valued polynomials. An integral domain is called atomic if every nonzero nonunit factors into irreducibles. Atomic domains that do not satisfy the ascending chain condition on principal ideals (ACCP) have proved to be elusive, and not many of them have been found since the first one was constructed by A. Grams in 1974. Here we exhibit the first class of atomic rings of integer-valued polynomials without the ACCP. An integral domain is called a finite factorization domain (FFD) if it is simultaneously atomic and an idf-domain (i.e., every nonzero element is divisible by only finitely many irreducibles up to associates). We prove that a ring is an FFD if and only if its ring of integer-valued polynomials is an FFD. In addition, we show that neither being atomic nor being an idf-domain transfers, in general, from an integral domain to its ring of integer-valued polynomials. In the same class of rings of integer-valued polynomials, we consider further properties that are defined in terms of divisibility by irreducibles, including being Cohen-Kaplansky and being Furstenberg.

275) Beining Zhou, High-Order Sensor Array Geometries for Improved Direction of Arrival Estimation in Signal Processing (9 July 2021)

In signal processing, direction of arrival (DOA) estimation is a central problem in locating the source of a signal. It is applied extensively in wireless communication systems such as radar and GPS, in medical imaging, in telescopes, etc. Devising a signal sensor array geometry that achieves a higher degree of freedom (DOF) has been a crucial challenge in improving the efficiency of DOA estimation. Recently, high-order cumulants have been used extensively to construct high-order sensor arrays, but the state-of-the-art high-order arrays are not optimal. This paper proposes novel sensor array geometries, the high-order embedded arrays (HOEA), for the 4th and 6th orders and then extends those arrays to the 2$q$th order by layering. Compared to previous methods, the proposed HOEA significantly improves the DOF generation from $O(2^{q}N^{2q})$ to $O(17^{q/3}N^{2q})$, which increases the theoretical efficiency by $25\%$ in the 4th order, $113\%$ in the 6th, and $352\%$ in the 12th order.
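The 6th- and 12th-order percentages quoted above are consistent with the ratio of the two leading constants, $17^{q/3}/2^{q}$, evaluated at $q = 3$ and $q = 6$. The short check below is only that arithmetic, with a function name chosen here for illustration; it assumes the constants hidden in the $O(\cdot)$ notation can be compared directly, and it does not attempt to derive the 4th-order figure.

```python
def dof_gain_percent(q):
    """Percentage by which 17**(q/3) exceeds 2**q (leading constants only)."""
    return (17 ** (q / 3) / 2 ** q - 1) * 100


for q in (3, 6):
    # Order of the array is 2q: q = 3 is the 6th order, q = 6 the 12th.
    print(f"order {2 * q}: about {dof_gain_percent(q):.1f}% more DOF")
```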

274) Benjamin Chen, Practical Anonymity Sets in a Pseudonymous Forum Setting (6 July 2021)

Pseudonymous forums are websites where users can post publicly visible content and participate in discussions under a pseudonym. Such forums are not perfectly private, as their privacy can be compromised by traffic analysis attacks. However, many methods of providing perfect privacy to such a system come with a heavy performance cost, whether in bandwidth or latency. We examine the practicality of anonymity sets, a defense against such attacks that can still provide a formal privacy guarantee with smaller performance losses, and attempt to simulate their implementation in a real-world setting using real data scraped from Reddit, a popular pseudonymous forum. We try various methods of creating these anonymity sets, finding that K-means with some dimensionality compression yields decent results; we also propose a method of defining a common traffic budget for members of a set. We find that anonymity sets are a feasible defense against such attacks in the pseudonymous forum setting.

273) Matthew Ding, An Analysis of Multi-hop Iterative Approximate Byzantine Consensus with Local Communication (27 June 2021)

Iterative Approximate Byzantine Consensus (IABC) is a fundamental problem of fault-tolerant distributed computing where machines seek to achieve approximate consensus to arbitrary exactness in the presence of Byzantine failures. We present a novel algorithm for this problem, named Relay-IABC, which relies on the usage of a multi-hop relayed messaging system and cryptographically secure message signatures. The use of signatures and relays allows the strict necessary network conditions of traditional IABC algorithms to be circumvented. In addition, we show evidence that Relay-IABC achieves faster convergence than traditional algorithms even under these strict network conditions with both theoretical analysis and experimental results.

272) Jason Yang (PRIMES), Jun Wan (MIT), and Hanshen Xiao (MIT), Decentralized Gradient Descent: how network structure affects convergence (26 June 2021)

We investigate decentralized gradient descent among a network of nodes where an adversary has corrupted certain nodes. We focus on the case where the utility functions of all nodes are 1-dimensional quadratics, and where each corrupted node is connected to all honest nodes.
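For orientation, here is a minimal sketch of plain decentralized gradient descent with 1-dimensional quadratic utilities and no corrupted nodes. It is the standard averaging-plus-gradient update, not the paper's adversarial setting; the quadratic form $f_i(x) = \tfrac{a_i}{2}(x - b_i)^2$, the uniform mixing weights, the step size, and all names are assumptions made here.

```python
def decentralized_gradient_descent(neighbors, curvatures, minimizers,
                                   step=0.1, rounds=200):
    """Run DGD where node i holds f_i(x) = a_i / 2 * (x - b_i) ** 2.

    Each round, node i averages its value with its neighbors' values
    (uniform weights) and then takes a gradient step on its own f_i.
    """
    n = len(neighbors)
    x = [0.0] * n
    for _ in range(rounds):
        mixed = [
            sum(x[j] for j in neighbors[i] + [i]) / (len(neighbors[i]) + 1)
            for i in range(n)
        ]
        # Gradient of f_i at the node's current value is a_i * (x_i - b_i).
        x = [
            mixed[i] - step * curvatures[i] * (x[i] - minimizers[i])
            for i in range(n)
        ]
    return x


# Complete graph on 4 honest nodes; the minimizers b_i average to 3.
complete = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
final = decentralized_gradient_descent(complete, [1.0] * 4, [1.0, 2.0, 3.0, 6.0])
print(final, sum(final) / len(final))
```

With a constant step size the individual values settle near, not exactly at, the global minimizer; in this symmetric example their average converges to the minimizer of the summed objective. Uniform weights of this kind are doubly stochastic only on regular graphs.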

271) Sheryl Hsu (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), Cell fusion through slime mold network dynamics (arXiv.org, 21 June 2021)

Physarum Polycephalum is a unicellular slime mold that has been intensely studied due to its ability to solve mazes, find shortest paths, generate Steiner trees, share knowledge, remember past events, and its applications to unconventional computing. The CELL model is a unicellular automaton introduced in the recent work of Gunji et al. in 2008, that models Physarum's amoeboid motion, tentacle formation, maze solving, and network creation. In the present paper, we extend the CELL model by spawning multiple CELLs, allowing us to understand the interactions between multiple cells, and in particular, their mobility, merge speed, and cytoplasm mixing. We conclude the paper with some notes about applications of our work to modeling the rise of present day civilization from the early nomadic humans and the spread of trends and information around the world. Our study of the interactions of this unicellular organism should further the understanding of how Physarum Polycephalum communicates and shares information.

270) Linda Chen, Communication Complexity of Byzantine Broadcast (19 June 2021)

Byzantine Broadcast is a fundamental problem in distributed computing, with communication complexity being an important aspect of Byzantine Broadcast protocols. In Byzantine Broadcast, a designated leader must ensure that all honest users in a distributed system reach a consensus, even in the presence of some dishonest users. Previous works have shown an $\Omega(n^2)$ lower bound on communication complexity, as well as protocols with $O(n^2)$ communication complexity for the honest majority scenario. In this paper, we review the previous work and provide various methods and intuition towards a possible $\Omega(n^3)$ communication complexity lower bound for dishonest majority Byzantine Broadcast.

2020 Research Papers

269) Varun Suraj (PRIMES), Catherine Del Vecchio Fitz, Laura Kleiman, Suresh Bhavnani, Chinmay Jani, Surbhi Shah, Rana McKay, Jeremy Warner, and Gil Alterovitz, SMART COVID Navigator, a Clinical Decision Support Tool for COVID-19 Treatment: Design and Development Study, published in Journal of Medical Internet Research 24, no. 2 (18 Feb 2022)

COVID-19 caused by SARS-CoV-2 has infected 219 million individuals at the time of writing of this paper. A large volume of research findings from observational studies about disease interactions with COVID-19 is being produced almost daily, making it difficult for physicians to keep track of the latest information on COVID-19’s effect on patients with certain pre-existing conditions.

268) Ayshwarya Subramanian (Broad Institute), Mikhail Alperovich (PRIMES), Yiming Yang, and Bo Li, Biology-inspired data-driven quality control for scientific discovery in single-cell transcriptomics (bioRxiv.org, 28 Oct 2021)

Quality control (QC) of cells, a critical step in single-cell RNA sequencing data analysis, has largely relied on arbitrarily fixed data-agnostic thresholds on QC metrics such as gene complexity and fraction of reads mapping to mitochondrial genes. The few existing data-driven approaches perform QC at the level of samples or studies without accounting for biological variation in the commonly used QC criteria. We demonstrate that the QC metrics vary both at the tissue and cell state level across technologies, study conditions, and species. We propose data-driven QC (ddqc), an unsupervised adaptive quality control framework that performs flexible and data-driven quality control at the level of cell states while retaining critical biological insights and improved power for downstream analysis. On applying ddqc to 6,228,212 cells and 835 mouse and human samples, we retain a median of 39.7% more cells when compared to conventional data-agnostic QC filters. With ddqc, we recover biologically meaningful trends in gene complexity and ribosomal expression among cell-types enabling exploration of cell states with minimal transcriptional diversity or maximum ribosomal protein expression. Moreover, ddqc allows us to retain cell-types often lost by conventional QC such as metabolically active parenchymal cells, and specialized cells such as neutrophils or gastric chief cells. Taken together, our work proposes a revised paradigm to quality filtering best practices - iterative QC, providing a data-driven quality control framework compatible with observed biological diversity.

267) Robert H. Dolin, Shaileshbhai R. Gothi, Aziz Boxwala, Bret S. E. Heale, Ammar Husami, James Jones, Himanshu Khangar, Shubham Londhe, Frank Naeymi-Rad, Soujanya Rao, Barbara Rapchak, James Shalaby, Varun Suraj (PRIMES), Ning Xie, Srikar Chamala & Gil Alterovitz, vcf2fhir: a utility to convert VCF files into HL7 FHIR format for genomics-EHR integration, published in BMC Bioinformatics 22, article No. 104 (2 Mar 2021)

VCF formatted files are the lingua franca of next-generation sequencing, whereas HL7 FHIR is emerging as a standard language for electronic health record interoperability. A growing number of FHIR-based clinical genomics applications are emerging. Here, we describe an open source utility for converting variants from VCF format into HL7 FHIR format.

266) Quanlin Chen, The Center of the $q$-Weyl Algebra over Rings with Torsion (23 Jan 2021)

We compute the centers of the Weyl algebra, $q$-Weyl algebra, and the "first $q$-Weyl algebra" over the quotient of the ring $\mathbb{Z}/p^N \mathbb{Z}[q]$ by some polynomial $P(q)$. Through this, we generalize and "quantize" part of a result by Stewart and Vologodsky on the center of the ring of differential operators on a smooth variety over $\mathbb{Z}/p^N \mathbb{Z}$. We prove that a corresponding Witt vector structure appears for general $P(q)$ and compute the extra terms for special $P(q)$ with particular properties, answering a question by Bezrukavnikov of possible interpolation between two known results.

265) Tanisha Saxena and Daniel Xu, Graph Alignment-Based Protein Comparison (23 Jan 2021)

Inspired by the question of identifying mechanisms of viral infection, we are interested in the problem of comparing pairs of proteins, given by amino acid sequences and traces of their 3-dimensional structure. While it is true that the problem of predicting and comparing protein function is one of the most famous unsolved problems in computational biology, we propose a heuristic which poses it as a simple alignment problem, which - after some linear-algebraic pre-processing - is amenable to a dynamic programming solution.

264) Andrew Cai, Ratios of Naruse-Newton Coefficients Obtained from Descent Polynomials (arXiv.org, 20 Jan 2021)

We study Naruse-Newton coefficients, which are obtained from expanding descent polynomials in a Newton basis introduced by Jiradilok and McConville. These coefficients $C_0, C_1, \ldots$ form an integer sequence associated to each finite set of positive integers. For fixed nonnegative integers $a<b$, we examine the set $R_{a, b}$ of all ratios $\frac{C_a}{C_b}$ over finite sets of positive integers. We characterize finite sets for which $\frac{C_a}{C_b}$ is minimized and provide a construction to prove $R_{a, b}$ is unbounded above. We use this construction to obtain results on the closure of $R_{a, b}$. We also examine properties of Naruse-Newton coefficients associated with doubleton sets, such as unimodality and log-concavity. Finally, we find an explicit formula for all ratios $\frac{C_a}{C_b}$ of Naruse-Newton coefficients associated with ribbons of staircase shape.

263) Ishan Levy (MIT) and Justin Wu (PRIMES), The Borel Cohomology of Free Iterated Loop Spaces (16 Jan 2021; arXiv.org, 28 May 2021)

We compute the $\text{SO}(n+1)$-equivariant mod $2$ Borel cohomology of the free iterated loop space $Z^{S^n}$ when $n$ is odd and $Z$ is a product of mod $2$ Eilenberg-Mac Lane spaces. When $n=1$, this recovers Ottosen and Bökstedt's computation for the free loop space. The highlight of our computation is a construction of cohomology classes using an $O(n)$-equivariant evaluation map and a pushforward map. We then reinterpret our computation as giving a presentation of the zeroth derived functor of the Borel cohomology of $Z^{S^n}$ for arbitrary $Z$. We also include an appendix where we give formulas for computing the zeroth derived functor of the cohomology of mapping spaces, and study the dependence of such derived functors on the Steenrod operations.

262) Linda Chen, Reducing Round Complexity of Byzantine Broadcast (15 Jan 2021)

Byzantine Broadcast is an important topic in distributed systems and improving its round complexity has long been a focused challenge. Under honest majority, the state of the art for Byzantine Broadcast is 10 rounds for a static adversary and 16 rounds for an adaptive adversary. In this paper, we present a Byzantine Broadcast protocol with expected 8 rounds under a static adversary and expected 10 rounds under an adaptive adversary. We also generalize our idea to the dishonest majority setting and achieve an improvement over existing protocols.

261) Zarathustra Brady (MIT) and Holden Mui (PRIMES), Symmetric Operations on Domains of Size at Most 4 (15 Jan 2021)

To convert a fractional solution to an instance of a constraint satisfaction problem into a solution, a rounding scheme is needed, which can be described by a collection of symmetric operations with one of each arity. An intriguing possibility, raised in a recent paper by Carvalho and Krokhin, would imply that any clone of operations on a set $D$ which contains symmetric operations of arities $1, 2, \ldots, |D|$ contains symmetric operations of all arities in the clone. If true, then it is possible to check whether any given family of constraint satisfaction problems is solved by its linear programming relaxation. We characterize all idempotent clones containing symmetric operations of arities $1, 2, \ldots, |D|$ for all sets $D$ with size at most four and prove that each one contains symmetric operations of every arity, proving the conjecture above for $|D|{\leq}4$.

260) Yuxiao Wang, Asymptotics for Iterating the Lusztig-Vogan Bijection for $GL_n$ on Dominant Weights (15 Jan 2021)

In this paper, we iterate the explicit algorithm computing the Lusztig-Vogan bijection in Type $A$ ($GL_n$) on dominant weights, which was proposed by Achar and simplified by Rush. Our main result focuses on describing asymptotic behavior between the number of iterations for an input and the length of the input; we also present a recursive formula to compute the slope of the asymptote. This serves as another contribution to understanding the Lusztig-Vogan bijection from a combinatorial perspective and a first step in understanding the iterative behavior of the Lusztig-Vogan bijection in Type $A$.

259) Quanlin Chen, Tianze Jiang, and Yuxiao Wang, On the Generational Behavior of Gaussian Binomial Coefficients at Roots of Unity (15 Jan 2021)

The generational behavior of Gaussian binomial coefficients at roots of unity shadows the relationship between the reductive algebraic group in prime characteristic and the quantum group at roots of unity. In this paper, we study three ways of obtaining integer values from Gaussian binomial coefficients at roots of unity. We rigorously define the generations in this context and prove such behavior at prime-power and two-times-prime-power roots of unity. Moreover, we investigate and make conjectures on the vanishing, valuation, and sign behavior under the big picture of generations.
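To make the basic object concrete, the sketch below computes a Gaussian binomial coefficient as a polynomial in $q$ using the standard $q$-Pascal recurrence $\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q$, then evaluates it at $q = -1$, a primitive second root of unity. This is only the textbook definition; the paper's notion of generations is not modeled, and the function names are chosen here for illustration.

```python
def gaussian_binomial(n, k):
    """Coefficients (lowest degree first) of the q-binomial [n choose k]_q."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    first = gaussian_binomial(n - 1, k - 1)
    # Multiplying by q**k shifts the coefficient list by k positions.
    second = [0] * k + gaussian_binomial(n - 1, k)
    size = max(len(first), len(second))
    first += [0] * (size - len(first))
    second += [0] * (size - len(second))
    return [a + b for a, b in zip(first, second)]


def evaluate(coeffs, q):
    """Evaluate a polynomial given by its coefficient list at q."""
    return sum(c * q ** i for i, c in enumerate(coeffs))


poly = gaussian_binomial(4, 2)   # 1 + q + 2q^2 + q^3 + q^4
print(poly)
print(evaluate(poly, 1))         # ordinary binomial coefficient C(4, 2)
print(evaluate(poly, -1))        # value at the root of unity q = -1
```

The plain recursion is exponential in $n$ and is meant for small examples only; evaluating at other roots of unity would use complex arithmetic in place of the integer $-1$.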

258) Fiona Abney-McPeek, Serena An, and Jakin Ng, The Stembridge Equality for Skew Stable Grothendieck Polynomials and Skew Dual Stable Grothendieck Polynomials (15 Jan 2021; arXiv.org, 9 Feb 2021)

The Schur polynomials $s_{\lambda}$ are essential in understanding the representation theory of the general linear group. They also describe the cohomology ring of the Grassmannians. For $\rho = (n, n-1, \dots, 1)$ a staircase shape and $\mu \subseteq \rho$ a subpartition, the Stembridge equality states that $s_{\rho/\mu} = s_{\rho/\mu^T}$. This equality provides information about the symmetry of the cohomology ring. The stable Grothendieck polynomials $G_{\lambda}$, and the dual stable Grothendieck polynomials $g_{\lambda}$, developed by Buch, Lam, and Pylyavskyy, are variants of the Schur polynomials and describe the $K$-theory of the Grassmannians. Using the Hopf algebra structure of the ring of symmetric functions and a generalized Littlewood-Richardson rule, we prove that $G_{\rho/\mu} = G_{\rho/\mu^T}$ and $g_{\rho/\mu} = g_{\rho/\mu^T}$, the analogues of the Stembridge equality for the skew stable and skew dual stable Grothendieck polynomials.

257) Samuel H. Florin (PRIMES), Matthew H. Ho (PRIMES), and Zilin Jiang (MIT), On the binary adder channel with complete feedback, with an application to quantitative group testing (arXiv.org, 25 Jan 2021), published in IEEE Transactions on Information Theory 68:5 (May 2022): 2839-2856

We determine the exact value of the optimal symmetric rate point in the Dueck zero-error capacity region of the binary adder channel with complete feedback. Our motivation is a problem in quantitative group testing. Given a set of $n$ elements two of which are defective, the quantitative group testing problem asks for the identification of these two defectives through a series of tests. Each test gives the number of defectives contained in the tested subset, and the outcomes of previous tests are assumed known at the time of designing the current test. We establish that the minimum number of tests is asymptotic to $(\log_2 n) / r$, where the constant $r \approx 0.78974$ lies strictly between the lower bound $5/7 \approx 0.71428$ due to Gargano et al. and the information-theoretic upper bound $(\log_2 3) / 2 \approx 0.79248$.
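As a rough numerical illustration of the asymptotic $(\log_2 n)/r$ above, the following sketch compares the test counts implied by the three rate constants quoted in the abstract. The function name `approx_tests` and the sample population size are illustrative assumptions, and since the formula is only asymptotic, the printed numbers are indicative rather than exact.

```python
import math

# Rate constants quoted in the abstract.
R_LOWER = 5 / 7                 # lower bound due to Gargano et al.
R_OPTIMAL = 0.78974             # approximate optimal symmetric rate r
R_UPPER = math.log2(3) / 2      # information-theoretic upper bound


def approx_tests(n: int, rate: float) -> float:
    """Leading-order estimate (log2 n) / rate of the number of tests.

    Hypothetical helper: it ignores lower-order terms entirely.
    """
    return math.log2(n) / rate


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of elements, two of them defective
    for name, rate in [("lower", R_LOWER), ("optimal", R_OPTIMAL), ("upper", R_UPPER)]:
        print(f"{name:8s} rate={rate:.5f}  tests~{approx_tests(n, rate):.1f}")
```

A larger rate means fewer tests, so the estimate obtained from $r$ falls between the estimates obtained from the two bounds.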

256) Adithya Balachandran, Andrew Huang, and Siwen Sun, Product Expansions of q-Character Polynomials (15 Jan 2021)

We consider certain class functions defined simultaneously on the groups $Gl_n(\mathbb{F}_q)$ for all $n$, which we also interpret as statistics on matrices. It has been previously shown that these simultaneous class functions are closed under multiplication, and we work towards computing the structure constants of this ring of functions. We derive general criteria for determining which statistics have nonzero expansion coefficients in the product of two fixed statistics. To this end, we introduce an algorithm that computes expansion coefficients in general, which we furthermore use to give closed form expansions in some cases. We conjecture that certain indecomposable statistics generate the whole ring, and indeed prove this to be the case for statistics associated with matrices consisting of up to 2 Jordan blocks. The coefficients we compute exhibit surprising stability phenomena, which in turn reflect stabilizations of joint moments as well as multiplicities in the irreducible decomposition of tensor products of representations of finite general linear groups.

255) Daniel Hong, Hyunwoo Lee, and Alex Wei, Optimal solutions and ranks in the max-cut SDP (15 Jan 2021)

The max-cut problem is a classical graph theory problem which is NP-complete. The best polynomial time approximation scheme relies on semidefinite programming (SDP). We study the conditions under which graphs of certain classes have rank 1 solutions to the max-cut SDP. We apply these findings to look at how solutions to the max-cut SDP behave under simple combinatorial constructions. Our results determine when solutions to the max-cut SDP for cycle graphs are rank 1. We find the solutions to the max-cut SDP of the vertex sum of two graphs. We then characterize the SDP solutions upon joining two triangle graphs by an edge sum.

254) Sam Florin, Matthew Ho, and Rahul Thomas, Group testing for two defectives and the zero-error channel capacity (14 Jan 2021)

The issue of identifying defects in a set with as few tests as possible has many applications, including in maximum efficiency pool testing during the COVID-19 pandemic. This research aims to determine the rate of growth of the number of tests required relative to the logarithm of the size of the set. In particular, we focus on the case where there are exactly two defects in the set, which is equivalent to the problem of determining the zero-error capacity of a two-user binary adder channel with complete feedback. The channel capacity is given by a non-linear optimization problem involving entropy functions, whose optimal value remains unknown. In this paper, using the linear dependence technique, we are able to reduce the complexity of the optimization problem significantly. We also gather numerical evidence for the conjectured optimal value.

253) Sarah Chen, In silico prediction of retained intron-derived neoantigens in leukemia (8 Jan 2021)

Alternative splicing is critical for the regulation and diversification of gene expression. Conversely, splicing dysregulation, caused by mutations in splicing machinery or splice junctions, is a hallmark of cancer. Tumor-specific isoforms are a potential source of neoantigens, cancer-specific peptides presented by human leukocyte antigen (HLA) class I molecules and potentially recognized by T cells. For cancers such as acute myeloid leukemia (AML) with a low mutation burden but widespread splicing aberrations, splice variants and retained introns (RIs) in particular, may broaden the number of suitable targets for immunotherapy. I developed a computational pipeline to predict AS-derived neoepitopes from tumor RNA-Seq. I first used the B721.221 B cell line as a model system, for which RNA-Seq, Ribo-Seq, and immunoproteome data from >90 HLA class I monoallelic lines were available. I performed de novo transcriptome assembly with StringTie, identifying on average 694±73 AS isoforms across 4 technical replicates. Using HLAthena, I identified 1,087 AS-derived neoepitopes predicted to bind across 4 frequent HLA alleles. Of them, 192 (18%) also displayed evidence of mRNA translation, measured as the alignment of ≥1 Ribo-Seq. To further increase prediction accuracy, I am currently analyzing the HLA I immunopeptidome to define the features of predicted AS isoforms more likely to be not only translated but also HLA presented. Finally, I applied my prediction pipeline to AML cell lines ($n=8$) and primary samples ($n=7$). I identified 682±113 AS isoforms in AML cell lines, similar to the 694 in B721, but the proportion of isoforms containing RIs (as opposed to alternative 5' and 3' splice sites or cassette exons) was 3.5x higher than in B721, in line with the biological relevance of RIs in particular in this disease setting.
Primary AML samples yielded 1496±294 AS isoforms, more than twofold the number in B721 or AML cell lines, thus reinforcing the significant contribution of AS to the cancer immunopeptidome. Accurate prediction of AS-derived neoantigens through this pipeline will contribute to the design of novel cancer immunotherapies.

252) Kenta Suzuki (PRIMES) and Michael E. Zieve (University of Michigan), Meromorphic functions with the same preimages at several finite sets (31 Dec 2020)

Let $p$ and $q$ be nonconstant meromorphic functions on $\mathbb{C}^m$. We show that if $p$ and $q$ have the same preimages as one another, counting multiplicities, at each of four nonempty pairwise disjoint subsets $S_1,\ldots,S_4$ of $ \mathbb{C}$, then $p$ and $q$ have the same preimages as one another at each of infinitely many subsets of $ \mathbb{C}$, and moreover $g(p)=g(q)$ for some nonconstant rational function $g(x)$ whose degree is bounded in terms of the sizes of the $S_i$'s. This result is new already when $m=1$, and it implies many previous results about the extent to which a meromorphic function is determined by its preimages of a few points or a few small sets.

251) Yavor Litchev and Abigail Thomas, Hybrid Privacy Scheme (31 Dec 2020)

Local Differential Privacy (LDP) is an approach that allows a central server to compute on data submitted by multiple users while maintaining the privacy of each user. LDP is a very efficient approach to security; however, as privacy increases, the accuracy of these computations decreases. Multi-Party Computation (MPC) is a process by which multiple parties work together to compute the output of a function without revealing their own information. MPC is highly secure and accurate for such computations, but it is very computationally expensive and slow. The proposed hybrid privacy model harnesses the benefits of both LDP and MPC to create a secure, accurate, and fast algorithm for machine learning.

250) Ho Tin Fan and Alvin Lu, Parallel Batch-Dynamic 3-Vertex Subgraph Maintenance (31 Dec 2020)

Counting certain subgraphs is a fundamental problem that is crucial in recognizing patterns in large graphs, such as social networks and biological interactomes. However, many real-world graphs are constantly evolving and are subject to changes over time, and previous work on efficient parallel subgraph counting algorithms either does not support dynamic modifications or does not extend to general subgraphs. This paper presents a theoretically-efficient and demonstrably fast algorithm for parallel batch-dynamic 3-vertex subgraph counting, and the underlying data structure can be extended to counting 4-vertex subgraphs as well. The algorithm maintains the $h$-index of the graph, or the maximum $h$ such that the graph contains $h$ vertices with degree at least $h$, and uses this to update subgraph counts through an efficient traversal of two-paths, or wedges. For a batch of size $b$, the algorithm takes $O(bh)$ expected amortized work and $O(\log(bh))$ span with high probability.
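To make the two quantities named above concrete, here is a minimal static sketch that computes the $h$-index and the number of wedges (two-paths) of a small graph from its edge list. It is not the paper's parallel batch-dynamic algorithm; the function names `h_index` and `count_wedges` are hypothetical, and the input is assumed to be a simple undirected graph with no repeated edges.

```python
from collections import defaultdict
from math import comb


def vertex_degrees(edges):
    """Map each vertex to its degree, given an undirected simple edge list."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def h_index(edges):
    """Largest h such that at least h vertices have degree at least h."""
    degrees = sorted(vertex_degrees(edges).values(), reverse=True)
    h = 0
    for rank, d in enumerate(degrees, start=1):
        if d >= rank:
            h = rank   # the top `rank` vertices all have degree >= rank
        else:
            break
    return h


def count_wedges(edges):
    """Number of two-paths: each vertex of degree d is the center of C(d, 2)."""
    return sum(comb(d, 2) for d in vertex_degrees(edges).values())


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(h_index(triangle), count_wedges(triangle))
```

Recomputing from scratch like this costs time proportional to the whole graph per change, which is the cost a batch-dynamic structure is meant to avoid.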

249) Kevin Edward Zhao (PRIMES), Vladislav Lialin & Anna Rumshisky (UMass Lowell), Text Is an Image: Augmentation via Embedding Mixing (30 Dec 2020)

Data augmentation techniques are essential for computer vision, yielding significant accuracy improvements with little engineering cost. However, data augmentation for text has always been tricky. Synonym replacement techniques require a good thesaurus and domain-specific rules for synonym selection from the synset, while backtranslation techniques are computationally expensive and require a good translation model for the language of interest.
In this paper, we present simple text augmentation techniques on the embeddings level, inspired by mixing-based image augmentations. These techniques are language-agnostic and require little to no hyperparameter tuning. We evaluate the augmentation techniques on IMDB and GLUE tasks, and the results show that the augmentations significantly improve the score of the RoBERTa model.

248) Alvin Chen (PRIMES) and Kai Huang (MIT), Alpha invariants of $K$-semistable smooth toric Fano varieties (29 Dec 2020)

Jiang conjectured that the $\alpha$-invariant for $n$-dimensional $K$-semistable smooth Fano varieties has a gap between $\frac{1}{n}$ and $\frac{1}{n+1}$, where $\frac{1}{n+1}$ can only be achieved by projective $n$-space. Assuming a weaker version of Ewald's conjecture, we prove this gap conjecture in the toric case. We also prove a necessary and sufficient classification for all possible values of the $\alpha$-invariant for $K$-semistable smooth toric Fano varieties by providing an explicit construction of the polytopes that can achieve these values. This provides an important step towards understanding the types of polytopes that correspond to particular values of the $\alpha$-invariant; in particular, we show that $K$-semistable smooth Fano polytopes are centrally symmetric if and only if they have an $\alpha$-invariant of $\frac{1}{2}$. Lastly, we examine the effects of the Picard number on the $\alpha$-invariant, classifying the $K$-semistable smooth toric Fano varieties with Picard number 1 or 2 and their $\alpha$-invariants.

247) Vishnu Emani (PRIMES), Klaus Schmitz-Abe and Pankaj Agrawal (Boston Children's Hospital), Statistical Ranking Model for Candidate Genes in Rare Genetic Disorders (28 Dec 2020)

Genetic mutations are responsible for a significant number of rare diseases, and so investigating the genetic basis of various rare diseases has been a crucial area of study. More specifically, studying variants in the exome, the protein coding region which makes up approximately 1% of the human genome, has been proven effective at identifying the most likely pathogenic variants. The advent of whole exome and whole genome sequencing facilitates identification of the most likely pathogenic mutations much more efficiently and on a greater scale. Next-generation sequencing has been growing rapidly in the past decade and has led to numerous successful disease-detection pipelines. The pipeline involved in this study was the Variant Explorer Pipeline (VExP), developed by our laboratory to improve diagnostic yield. In the VExP pipeline, genetic variants are filtered based on a variety of criteria, which can be divided into the categories of genotype data and phenotype data (Figure 1). After the filtering process, the most likely variants are isolated, a process which requires meticulous examination of a large number of mutations. Furthermore, determining the strength of a phenotype match presents challenges because a number of resources need to be consulted to make an informed decision. The purpose of this project was to develop an automated algorithm, using a host of parameters, to rank mutation candidates based on the two computed scores for pathogenicity.

246) Neil Chowdhury, Modeling the Effect of Histone Methylation on Chromosomal Organization in Colon Cancer Cells (27 Dec 2020)

Loop extrusion and compartmentalization are the two most important processes regulating the high-level organization of DNA in the cell nucleus. These processes are largely believed to be independent and competing. Chromatin consists of nucleosomes, which contain coils of DNA wrapped around histone proteins. Besides packing DNA, nucleosomes contain an "epigenetic code": tails of histone proteins are chemically modified at certain positions to leave certain "histone marks" on the chromatin fiber. This paper explores the effect of the H3K9me3 histone modification, which typically corresponds to inactive and repressed chromatin, on genome structure. Interestingly, in H3K9me3 domains, there are far fewer topologically associating domains (TADs) than in other domains, and there is a unique compartmentalization pattern. A high-resolution polymer model simulating both loop extrusion and compartmentalization is created to explore these differences.

245) Daniel Xu, Modeling of Network Based Digital Contact Tracing and Testing Strategies for the COVID-19 Pandemic (26 Dec 2020; arXiv.org, 28 Dec 2020), published in Mathematical Biosciences, vol. 338 (August 2021)

With more than 1.7 million COVID-19 deaths, identifying effective measures to prevent COVID-19 is a top priority. We developed a mathematical model to simulate the COVID-19 pandemic with digital contact tracing and testing strategies. The model uses a real-world social network generated from a high-resolution contact data set of 180 students. This model incorporates infectivity variations, test sensitivities, incubation period, and asymptomatic cases. We present a method to extend the weighted temporal social network and present simulations on a network of 5000 students. The purpose of this work is to investigate optimal quarantine rules and testing strategies with digital contact tracing. The results show that the traditional strategy of quarantining direct contacts reduces infections by less than 20% without sufficient testing. Periodic testing every 2 weeks without contact tracing reduces infections by less than 3%. A variety of strategies are discussed including testing second and third degree contacts and the pre-exposure notification system, which acts as a social radar warning users how far they are from COVID-19. The most effective strategy discussed in this work combined the pre-exposure notification system with testing second and third degree contacts. This strategy reduces infections by 18.3% when 30% of the population uses the app, 45.2% when 50% of the population uses the app, 72.1% when 70% of the population uses the app, and 86.8% when 95% of the population uses the app. When simulating the model on an extended network of 5000 students, the results are similar with the contact tracing app reducing infections by up to 79%.

244) Yongyi Chen (MIT) and Tae Kyu Kim (PRIMES), On Generalized Carmichael Numbers (15 Dec 2020; arXiv.org, 5 Mar 2021)

Given an integer $k$, define $C_k$ as the set of integers $n > \max(k,0)$ such that $a^{n-k+1} \equiv a \pmod{n}$ holds for all integers $a$. We establish various multiplicative properties of the elements in $C_k$ and give a sufficient condition for the infinitude of $C_k$. Moreover, we prove that there are finitely many elements in $C_k$ with one and two prime factors if and only if $k>0$ and $k$ is prime. In addition, if all but two prime factors of $n \in C_k$ are fixed, then there are finitely many elements in $C_k$, excluding certain infinite families of $n$. We also give conjectures about the growth rate of $C_k$ with numerical evidence. We explore a similar question when both $a$ and $k$ are fixed and prove that for fixed integers $a \geq 2$ and $k$, there are infinitely many integers $n$ such that $a^{n-k} \equiv 1 \pmod{n}$ if and only if $(k,a) \neq (0,2)$ by building off the work of Kiss and Phong. Finally, we discuss the multiplicative properties of positive integers $n$ such that the Carmichael function $\lambda(n)$ divides $n-k$.
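The definition of $C_k$ can be checked directly for small $n$. The sketch below is a brute-force membership test, not a method from the paper: the names `in_C_k` and `elements_of_C_k` are hypothetical, and the test relies only on the fact that the congruence depends on $a$ modulo $n$, so it suffices to try $a = 0, 1, \dots, n-1$.

```python
def in_C_k(n: int, k: int) -> bool:
    """Return True if n > max(k, 0) and a**(n-k+1) == a (mod n) for all a."""
    if n <= max(k, 0):
        return False
    exponent = n - k + 1  # at least 2, because n > k
    # Both sides depend only on a mod n, so one full residue system is enough.
    return all(pow(a, exponent, n) == a % n for a in range(n))


def elements_of_C_k(k: int, limit: int) -> list:
    """All members of C_k that are at most `limit`, found by exhaustive search."""
    return [n for n in range(1, limit + 1) if in_C_k(n, k)]


if __name__ == "__main__":
    # For k = 1 the condition reads a**n == a (mod n), which holds for
    # primes and for Carmichael numbers such as 561.
    print(elements_of_C_k(1, 20))
    print(in_C_k(561, 1))
```

The search takes on the order of $n$ modular exponentiations per candidate, so it is only suitable for small experiments.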

243) William Qin, HOMFLY Polynomials of Pretzel Knots (11 Dec 2020; arXiv.org, 3 Jan 2021)

HOMFLY polynomials are one of the major knot invariants being actively studied. They are difficult to compute in the general case but can be far more easily expressed in certain specific cases. In this paper, we examine two particular knots, as well as one more general infinite class of knots.
From our calculations, we see some apparent patterns in the polynomials for the knots $9_{35}$ and $9_{46}$, and in particular their $F$-factors. These properties are of a form that seems conducive to finding a general formula for them, which would yield a general formula for the HOMFLY polynomials of the two knots.
Motivated by these observations, we demonstrate and conjecture some properties both of the $F$-factors and HOMFLY polynomials of these knots and of the more general class that contains them, namely pretzel knots with 3 odd parameters. We make the first steps toward a matrix-less general formula for the HOMFLY polynomials of these knots.

242) Jonathan Yin (PRIMES), Hattie Chung (Broad Institute), and Aviv Regev (Broad Institute), A multi-view generative model for molecular representation improves prediction tasks (7 Dec 2020), accepted paper for LMRL2020 (Learning Meaningful Representations of Life) workshop at NeurIPS 2020 (Thirty-fourth Conference on Neural Information Processing Systems)

Unsupervised generative models have been a popular approach to representing molecules. These models extract salient molecular features to create compact vectors that can be used for downstream prediction tasks. However, current generative models for molecules rely mostly on structural features and do not fully capture global biochemical features. Here, we propose a multi-view generative model that integrates low-level structural features with global chemical properties to create a more holistic molecular representation. In proof-of-concept analyses, compared to purely structural latent representations, multi-view latent representations improve model accuracy on various tasks when used as input to feed-forward prediction networks. For some tasks, simple models trained on multi-view representations perform comparably to more complex supervised methods. Multi-view representations are an attractive method to improve representations in an unsupervised manner, and could be useful for prediction tasks, particularly in contexts where data is limited.

241) Yibo Gao (MIT), Joshua Guo (PRIMES), Karthik Seetharaman (PRIMES), and Ilaria Seidel (PRIMES), The Rank-Generating Functions of Upho Posets (arXiv.org, 3 Nov 2020), published in Discrete Mathematics 345:1 (Jan 2022)

Upper homogeneous finite type (upho) posets are a large class of partially ordered sets with the property that the principal order filter at every vertex is isomorphic to the whole poset. Well-known examples include $k$-ary trees, the grid graphs, and the Stern poset. Very little is known about upho posets in general. In this paper, we construct upho posets with Schur-positive Ehrenborg quasisymmetric functions, whose rank-generating functions have rational poles and zeros. We also categorize the rank-generating functions of all planar upho posets. Finally, we prove the existence of an upho poset with uncomputable rank-generating function.

240) Jason Yang (PRIMES) and Jun Wan (MIT), On Updating and Querying Submatrices (arXiv.org, 25 Oct 2020)

In this paper, we study the $d$-dimensional update-query problem. We provide lower bounds on update and query running times, assuming a long-standing conjecture on min-plus matrix multiplication, as well as algorithms that are close to the lower bounds. Given a $d$-dimensional matrix, an \textit{update} changes each element in a given submatrix from $x$ to $x\bigtriangledown v$, where $v$ is a given constant. A \textit{query} returns the $\bigtriangleup$ of all elements in a given submatrix. We study the cases where $\bigtriangledown$ and $\bigtriangleup$ are both commutative and associative binary operators. When $d = 1$, updates and queries can be performed in $O(\log N)$ worst-case time for many $(\bigtriangledown,\bigtriangleup)$ by using a segment tree with lazy propagation. However, when $d\ge 2$, similar techniques usually cannot be generalized. We show that if min-plus matrix multiplication cannot be computed in $O(N^{3-\varepsilon})$ time for any $\varepsilon>0$ (which is widely believed to be the case), then for $(\bigtriangledown,\bigtriangleup)=(+,\min)$, either updates and queries cannot both run in $O(N^{1-\varepsilon})$ time for any constant $\varepsilon>0$, or preprocessing cannot run in polynomial time. Finally, we show a special case where lazy propagation can be generalized for $d\ge 2$ and where updates and queries can run in $O(\log^d N)$ worst-case time. We present an algorithm that meets this running time and is simpler than similar algorithms of previous works.
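For the one-dimensional case mentioned above, the standard technique can be sketched as follows for $(\bigtriangledown,\bigtriangleup)=(+,\min)$: a segment tree with lazy propagation supporting range-add updates and range-min queries in $O(\log N)$ time each. This is a textbook illustration rather than the paper's algorithm, and the class name `LazySegmentTree` and its method names are assumptions made for the example.

```python
class LazySegmentTree:
    """Range-add update and range-min query over a fixed-length array."""

    def __init__(self, values):
        self.n = len(values)
        self.min = [0] * (4 * self.n)    # minimum of each node's interval
        self.lazy = [0] * (4 * self.n)   # addition not yet pushed to children
        if self.n:
            self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.min[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.min[node] = min(self.min[2 * node], self.min[2 * node + 1])

    def _push(self, node):
        # Hand the pending addition down to both children.
        pending = self.lazy[node]
        if pending:
            for child in (2 * node, 2 * node + 1):
                self.min[child] += pending
                self.lazy[child] += pending
            self.lazy[node] = 0

    def update(self, left, right, v, node=1, lo=0, hi=None):
        """Add v to every element with index in [left, right] (inclusive)."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return
        if left <= lo and hi <= right:
            self.min[node] += v
            self.lazy[node] += v
            return
        self._push(node)
        mid = (lo + hi) // 2
        self.update(left, right, v, 2 * node, lo, mid)
        self.update(left, right, v, 2 * node + 1, mid + 1, hi)
        self.min[node] = min(self.min[2 * node], self.min[2 * node + 1])

    def query(self, left, right, node=1, lo=0, hi=None):
        """Return the minimum over indices in [left, right] (inclusive)."""
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return float("inf")
        if left <= lo and hi <= right:
            return self.min[node]
        self._push(node)
        mid = (lo + hi) // 2
        return min(self.query(left, right, 2 * node, lo, mid),
                   self.query(left, right, 2 * node + 1, mid + 1, hi))
```

Lazy propagation works here because adding $v$ to every element of an interval shifts that interval's minimum by exactly $v$, so an update can stop at any node whose interval is fully covered.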

239) Vishaal Ram (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), A modified age-structured SIR model for COVID-19 type viruses (arXiv.org, 23 Sept 2020), published in Nature Scientific Reports (2021) 11:15194

We present a modified age-structured SIR model based on known patterns of social contact and distancing measures within Washington, USA. We find that population age-distribution has a significant effect on disease spread and mortality rate, and contributes to the efficacy of age-specific contact and treatment measures. We consider the effect of relaxing restrictions across less vulnerable age-brackets, comparing results across selected groups of varying population parameters. Moreover, we analyze the mitigating effects of vaccinations and examine the effectiveness of age-targeted distributions. Lastly, we explore how our model can be applied to other states to reflect social-distancing policy based on different parameters and metrics.

238) Richard Chen (PRIMES), Feng Gui (MIT), Jason Tang (PRIMES), and Nathan Xiong (PRIMES), Few distance sets in $\ell_p$ spaces and $\ell_p$ product spaces (19 Sept 2020; arXiv.org, 26 Sept 2020), published in European Journal of Combinatorics 102 (May 2022)

Kusner asked if $n+1$ points is the maximum number of points in $\mathbb{R}^n$ such that the $\ell_p$ distance between any two points is $1$. We present an improvement to the best known upper bound when $p$ is large in terms of $n$, as well as a generalization of the bound to $s$-distance sets. We also study equilateral sets in the $\ell_p$ sums of Euclidean spaces, deriving upper bounds on the size of an equilateral set for when $p=\infty$, $p$ is even, and for any $1\le p<\infty$.

237) Tanya Khovanova (MIT) and Sean Li (PRIMES), The Penney's Game with Group Action (arXiv.org, 13 Sept 2020), published in Annals of Combinatorics (15 Jan 2022)

We generalize word avoidance theory by equipping the alphabet $\mathcal{A}$ with a group action. We call equivalence classes of words patterns. We extend the notion of word correlation to patterns using group stabilizers. We extend known word avoidance results to patterns. We use these results to answer standard questions for the Penney's game on patterns and show non-transitivity for the game on patterns as the length of the pattern tends to infinity. We also analyze bounds on the pattern-based Conway leading number and expected wait time, and further explore the game under the cyclic and symmetric group actions.
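As background for the pattern-based results, the classical word-based quantities can be computed in a few lines. The sketch below covers only the standard setting without a group action: the Conway leading number (correlation) of two words, the expected wait time for a word, and Conway's odds formula for a two-player Penney's game. The function names are hypothetical, letters are assumed uniform over an alphabet of size `q`, and the two words are assumed distinct, of equal length, and therefore not substrings of one another.

```python
from fractions import Fraction


def correlation(a: str, b: str, q: int = 2) -> int:
    """Conway leading number of (a, b).

    Adds q**(k-1) for every k such that the last k letters of a
    equal the first k letters of b.
    """
    total = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if a[-k:] == b[:k]:
            total += q ** (k - 1)
    return total


def expected_wait(a: str, q: int = 2) -> int:
    """Expected number of uniform random letters until word a first appears."""
    return q * correlation(a, a, q)


def prob_second_wins(a: str, b: str, q: int = 2) -> Fraction:
    """Probability that word b appears before word a (Conway's formula)."""
    odds_b = correlation(a, a, q) - correlation(a, b, q)
    odds_a = correlation(b, b, q) - correlation(b, a, q)
    return Fraction(odds_b, odds_a + odds_b)


if __name__ == "__main__":
    print(expected_wait("HH"), expected_wait("HT"))
    print(prob_second_wins("HH", "TH"))
```

In the paper's setting, words are replaced by equivalence classes under the group action, so these functions should be read as the baseline that the pattern versions generalize.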

236) Ankit Bisain (PRIMES) and Eric J. Hanson (Brandeis University), The Bernardi Formula for Non-Transitive Deformations of the Braid Arrangement (7 Sept 2020; arXiv.org, 2 Oct 2020), published in The Electronic Journal of Combinatorics 28:4 (2021)

Bernardi has given a general formula to compute the number of regions of a deformation of the braid arrangement as a signed sum over boxed trees. We prove that the contribution to this sum of the set of boxed trees sharing an underlying rooted labeled tree is 0 or ±1 and give an algorithm for computing this value. We then restrict to arrangements which we call almost transitive and construct a sign-reversing involution which reduces Bernardi's signed sum to enumeration of a set of rooted labeled trees in this case. We conclude by explicitly enumerating the trees corresponding to the regions of certain nested Ish arrangements which we call non-negative, recovering their known counting formula.

235) Alejandro H. Morales (UMass Amherst) and William Shi (PRIMES), Refinements and Symmetries of the Morris identity for volumes of flow polytopes (7 Sept 2020; arXiv.org, 11 Feb 2021), published in Comptes Rendus Mathématique 359 (2021): 823-851

Flow polytopes are an important class of polytopes in combinatorics whose lattice points and volumes have interesting properties and relations. The Chan-Robbins-Yuen (CRY) polytope is a flow polytope with normalized volume equal to the product of consecutive Catalan numbers. Zeilberger proved this by evaluating the Morris constant term identity, but no combinatorial proof is known. There is a refinement of this formula that splits the largest Catalan number into Narayana numbers, which Mészáros interpreted as the volume of a collection of flow polytopes. We introduce a new refinement of the Morris identity with combinatorial interpretations both in terms of lattice points and volumes of flow polytopes. Our results generalize Mészáros's construction and a recent flow polytope interpretation of the Morris identity by Corteel-Kim-Mészáros. We prove the product formula of our refinement following the strategy of the Baldoni-Vergne proof of the Morris identity. Lastly, we study a symmetry of the Morris identity bijectively using the Danilov-Karzanov-Koshevoy triangulation of flow polytopes and a bijection of Mészáros-Morales-Striker.

234) Vishaal Ram (PRIMES), Laura P. Schaposnik (University of Illinois at Chicago) et al., Extrapolating continuous color emotions through deep learning (2 Sept 2020), published in Physical Review Research 2:3 (September–November 2020)

By means of an experimental dataset, we use deep learning to implement an RGB (red, green, and blue) extrapolation of emotions associated to color, and do a mathematical study of the results obtained through this neural network. In particular, we see that males (type-$m$ individuals) typically associate a given emotion with darker colors, while females (type-$f$ individuals) associate it with brighter colors. A similar trend was observed with older people and associations to lighter colors. Moreover, through our classification matrix, we identify which colors have weak associations to emotions and which colors are typically confused with other colors.

233) Jesse Geneson, Suchir Kaustav, and Antoine Labelle (CrowdMath-2020), Extremal results for graphs of bounded metric dimension (arXiv.org, 31 Aug 2020), published in Discrete Applied Mathematics 309 (15 March 2022): 123-129

Metric dimension is a graph parameter motivated by problems in robot navigation, drug design, and image processing. In this paper, we answer several open extremal problems on metric dimension and pattern avoidance in graphs from (Geneson, Metric dimension and pattern avoidance, Discrete Appl. Math. 284, 2020, 1-7). Specifically, we construct a new family of graphs that allows us to determine the maximum possible degree of a graph of metric dimension at most $k$, the maximum possible degeneracy of a graph of metric dimension at most $k$, the maximum possible chromatic number of a graph of metric dimension at most $k$, and the maximum $n$ for which there exists a graph of metric dimension at most $k$ that contains $K_{n, n}$.
We also investigate a variant of metric dimension called edge metric dimension and solve another problem from the same paper for $n$ sufficiently large by showing that the edge metric dimension of $P_n^{d}$ is $d$ for $n \geq d^{d-1}$. In addition, we use a probabilistic argument to make progress on another open problem from the same paper by showing that the maximum possible clique number of a graph of edge metric dimension at most $k$ is $2^{\Theta(k)}$. We also make progress on a problem from (N. Zubrilina, On the edge dimension of a graph, Discrete Math. 341, 2018, 2083-2088) by finding a family of new triples $(x, y, n)$ for which there exists a graph of metric dimension $x$, edge metric dimension $y$, and order $n$. In particular, we show that for each integer $k > 0$, there exist graphs $G$ with metric dimension $k$, edge metric dimension $3^k(1-o(1))$, and order $3^k(1+o(1))$.
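For readers unfamiliar with the parameter, the metric dimension of a small connected graph can be found by exhaustive search: it is the smallest number of landmark vertices such that the vector of distances to the landmarks is different for every vertex. The sketch below is a brute-force illustration only, exponential in the number of vertices and unrelated to the paper's constructions; the function names are hypothetical, and the graph is assumed connected with at least two vertices.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def metric_dimension(adj):
    """Smallest landmark set size that gives all vertices distinct distance vectors.

    `adj` maps each vertex to an iterable of its neighbors.
    """
    vertices = list(adj)
    dist = {v: bfs_distances(adj, v) for v in vertices}
    for size in range(1, len(vertices) + 1):
        for landmarks in combinations(vertices, size):
            vectors = {tuple(dist[s][v] for s in landmarks) for v in vertices}
            if len(vectors) == len(vertices):
                return size
    return 0  # only reached for an empty graph


if __name__ == "__main__":
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
    print(metric_dimension(path))
```

A path needs only one landmark (an endpoint), while a complete graph on $n$ vertices needs $n-1$, which gives a feel for how the parameter constrains the structure of a graph.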

232) William Li, Lebesgue Measure Preserving Thompson's Monoid (30 Aug 2020)

This paper defines Lebesgue measure preserving Thompson's monoid, denoted by $\mathbb{G}$, which is modeled on Thompson's group $\mathbb{F}$ except that the elements of $\mathbb{G}$ are non-invertible. Moreover, it is required that the elements of $\mathbb{G}$ preserve Lebesgue measure. Monoid $\mathbb{G}$ exhibits very different properties from Thompson's group $\mathbb{F}$. The paper studies a number of algebraic (group-theoretic) and dynamical properties of $\mathbb{G}$ including approximation, mixing, periodicity, entropy, decomposition, generators, and topological conjugacy.

231) Srinath Mahankali, Velocity Inversion Using the Quadratic Wasserstein Metric (24 Aug 2020; arXiv.org, 26 Aug 2020)

Full-waveform inversion (FWI) is a method used to determine properties of the Earth from information on the surface. We use the squared Wasserstein distance (squared $W_2$ distance) as an objective function to invert for the velocity as a function of position in the Earth, and we discuss its convexity with respect to the velocity parameter. In one dimension, we consider constant, piecewise increasing, and linearly increasing velocity models as a function of position, and we show the convexity of the squared $W_2$ distance with respect to the velocity parameter on the interval from zero to the true value of the velocity parameter when the source function is a probability measure. Furthermore, we consider a two-dimensional model where velocity is linearly increasing as a function of depth and prove the convexity of the squared $W_2$ distance in the velocity parameter on large regions containing the true value. We discuss the convexity of the squared $W_2$ distance compared with the convexity of the squared $L^2$ norm, and we discuss the relationship between frequency and convexity of these respective distances. We also discuss multiple approaches to optimal transport for non-probability measures by first converting the wave data into probability measures.

230) Michael Gerovitch, Environment-aware Pedestrian Trajectory Prediction for Autonomous Driving (21 Aug 2020)

People's safety is a primary concern in autonomous driving. There exist efficient methods for identifying static obstacles. However, the prediction of future trajectories of moving elements, such as pedestrians crossing a street, is a much more challenging problem. A promising direction of research is the use of machine learning algorithms with location bias maps. Our goal was to further explore this idea by training an interchangeable location bias map, a location-specific feature that is added into the middle of a convolutional neural network. For different locations, we used different location bias maps to allow the network to learn from different setting contexts without overfitting to a specific setting. Using pre-annotated video footage of pedestrians moving around in crowded areas, we implemented a pedestrian behavior encoding scheme to generate input and output volumes for the neural network. Using this encoding scheme, we trained our neural network and interchangeable location bias map. Our research demonstrates that the network with an interchangeable location bias map can predict realistic pedestrian trajectories even when trained simultaneously in multiple settings.

229) Andrew Shen, Towards Proving Application Isolation for Cryptocurrency Hardware Wallets (22 Jul 2020)

We often perform security-sensitive operations in our day-to-day lives, such as monetary transactions. To perform these operations securely, we can isolate the confirmation of such operations to separate hardware devices. However, proving that these devices operate securely is still difficult given the complexity of their kernels, yet important given the rise in popularity of cryptocurrency transaction devices. To support multiple cryptocurrencies and other functionality, these devices must be able to run multiple applications that are isolated from one another, since any of them could act maliciously. We can simplify our device by modeling it as running applications sequentially in user mode. We seek to prove that these applications cannot tamper with the kernel memory and show that the kernel protection is set up correctly. To do this, we developed a RISC-V machine emulator in Rosette, which enables us to reason about the behaviour of symbolic machine states and symbolic applications. We make progress towards verifying application isolation for launching and running applications on a simple kernel.

228) Andrey Boris Khesin (MIT) and Alexander Lu Zhang (PRIMES), On Quasisymmetric Functions with Two Bordering Variables (arXiv.org, 23 Jul 2020)

We extend past results on a family of formal power series $K_{n, \Lambda}$, parameterized by $n$ and $\Lambda \subseteq [n]$, that largely resemble quasisymmetric functions. This family of functions was conjectured to have the property that the product $K_{n, \Lambda}K_{m, \Omega}$ of any two functions $K_{n, \Lambda}$ and $K_{m, \Omega}$ from the family can be expressed as a linear combination of other functions from the family. In this paper, we show that this is indeed the case and that the span of the $K_{n, \Lambda}$'s forms an algebra. We also provide techniques for examining similar families of functions and a formula for the product $K_{n, \Lambda}K_{m, \Omega}$ when $n=1$.

227) Neel Bhalla, Constructing Workflow-centric Traces in Close to Real Time for the Hadoop File System (22 Jul 2020)

Diagnosing problems in large-scale systems using cloud-based distributed services is a challenging problem. Workflow-centric tracing captures the workflow (work done to process requests) and dependency graph of causally-related events among the components of a distributed system. However, constructing traces has historically been performed offline in batch fashion, so trace data is not immediately available to engineers for their diagnosis efforts. In this work, we present an approach based on graph abstraction and a streaming framework to construct workflow-centric traces in near real time for the Hadoop file system. This approach will provide network operators with a real-time understanding of the distributed system behavior.

226) Yunseo Choi (PRIMES) and James Unwin (University of Illinois at Chicago), Racial Impact on Infections and Deaths due to COVID-19 in New York City (11 Jul 2020; arXiv.org, 9 Jul 2020), forthcoming in Harvard Technology Review

Redlining is the discriminatory practice whereby institutions avoided investment in certain neighborhoods due to their demographics. Here we explore the lasting impacts of redlining on the spread of COVID-19 in New York City (NYC). Using data available through the Home Mortgage Disclosure Act, we construct a redlining index for each NYC census tract via a multi-level logistical model. We compare this redlining index with the COVID-19 statistics for each NYC Zip Code Tabulation Area. Accurate mappings of the pandemic would aid the identification of the most vulnerable areas and permit the most effective allocation of medical resources, while reducing ethnic health disparities.

225) Sanath Govindarajan (PRIMES) and William S. Moses (MIT), SyFER-MLIR: Integrating Fully Homomorphic Encryption Into the MLIR Compiler Framework (3 Jul 2020)

Fully homomorphic encryption opens up the possibility of secure computation on private data. However, fully homomorphic encryption is limited by its speed and the fact that arbitrary computations must be represented by combinations of primitive operations, such as addition, multiplication, and binary gates. Integrating FHE into the MLIR compiler infrastructure allows it to be automatically optimized at many different levels and will allow any program which compiles into MLIR to be modified to be encrypted by simply passing another flag into the compiler. The process of compiling into an intermediate representation and dynamically generating the encrypted program, rather than calling functions from a library, also allows for optimizations across multiple operations, such as rewriting a DAG of operations to run faster and removing unnecessary operations.

224) Ethan Mendes (PRIMES) and Kyle Hogan (MIT), Defending Against Imperceptible Audio Adversarial Examples Using Proportional Additive Gaussian Noise (30 Jun 2020)

Neural networks are susceptible to adversarial examples, which are specific inputs to a network that result in a misclassification or an incorrect output. While most past work has focused on methods to generate adversarial examples to fool image classification networks, recently, similar attacks on automatic speech recognition systems have been explored. Due to the relative novelty of these audio adversarial examples, there exist few robust defenses for these attacks. We present a robust defense for inaudible or imperceptible audio adversarial examples. This approach mimics the adversarial strategy to add targeted proportional additive Gaussian noise in order to revert an adversarial example back to its original transcription. Our defense performs similarly to other defenses yet is the first randomized or probabilistic strategy. Additionally, we demonstrate the challenges that arise when applying defenses against adversarial examples for images to audio adversarial examples.

223) Walden Yan (PRIMES) and William S. Moses (MIT), Token pairing to improve neural program synthesis models (30 Jun 2020)

In neural program synthesis (NPS), a network is trained to output or aid in the output of code that satisfies a given program specification. In our work, we make modifications upon the simple sequence-to-sequence (Seq2Seq) LSTM model. Extending the most successful techniques from previous works, we guide a beam search with an encoder-decoder scheme augmented with attention mechanisms and a specialized syntax layer. But one of the outstanding difficulties of NPS is the implicit tree structure of programs, which makes it inherently more difficult for linearly-structured models. To address this, we experiment with a novel technique we call token pairing. Our model is trained and evaluated on AlgoLisp, a dataset of English description-to-code programming problems paired with example solutions and test cases on which to evaluate programs. We also create a new interpreter for AlgoLisp that fixes the bugs present in the builtin executor. In the end, our model achieves 99.24% accuracy at evaluation, which greatly improves on the previous state-of-the-art of 95.80% while using fewer parameters.

222) Zhenkun Li (MIT) and Jessica Zhang (PRIMES), Classification of tight contact structures on a solid torus (arXiv.org, 30 Jun 2020)

It is a basic question in contact geometry to classify all non-isotopic tight contact structures on a given 3-manifold. If the manifold has a boundary, we also need to specify the dividing set on the boundary. In this paper, we answer the classification question completely for the case of a solid torus by writing down a closed formula for the number of non-isotopic tight contact structures with any given dividing set on the boundary of the solid torus. Previously, only a few special cases were known due to work by Honda.

221) Christian Gaetz (MIT) and Katherine Tung (PRIMES), The Sperner property for $132$-avoiding intervals in the weak order (arXiv.org, 29 Jun 2020), published in Bulletin of the London Mathematical Society 53:2 (April 2021): 442-457.

A well-known result of Stanley implies that the weak order on a maximal parabolic quotient of the symmetric group $S_n$ has the Sperner property; this same property was recently established for the weak order on all of $S_n$ by Gaetz and Gao, resolving a long-open problem. In this paper we interpolate between these results by showing that the weak order on any parabolic quotient of $S_n$ (and more generally on any $132$-avoiding interval) has the Sperner property. This result is proven by exhibiting an action of $\mathfrak{sl}_2$ respecting the weak order on these intervals. As a corollary we obtain a new formula for principal specializations of Schubert polynomials. Our formula can be seen as a strong Bruhat order analogue of Macdonald's reduced word formula. This proof technique and formula generalize work of Hamaker, Pechenik, Speyer, and Weigandt and Gaetz and Gao.

220) Yuxuan (Jason) Chen, Real World Application of Event-based End to End Autonomous Driving (29 Jun 2020)

End-to-end autonomous driving has recently been a popular area of study for deep learning. This work studies the use of event cameras for real-world deep learned driving tasks in comparison to traditional RGB cameras. In this work, we evaluate existing state-of-the-art event-based models on offline datasets, design a novel model that fuses the benefits from both event-based and traditional frame-based cameras, and integrate the trained models on board a full-scale vehicle. We conduct tests in a challenging track with features unseen to the model. Through our experiments and saliency visualization, we show that event-based models actually predict the existing motion of the car rather than the active control the car should take. Therefore, while event-based models excel at offline tasks such as motion estimation, our experiments reveal a fundamental challenge in applying event-based end-to-end learning to active control tasks: the models need to learn to reason about future actions with a feedback loop that impacts their future state.

219) Arun S. Kannan (MIT) and Honglin Zhu (PRIMES), Characters for Projective Modules in the BGG Category $\mathcal{O}$ for the Orthosymplectic Lie Superalgebra $\mathfrak{osp}(3|4)$ (arXiv.org, 11 Jun 2020), published in Journal of Algebra 569 (1 March 2021): 723-757

We determine the Verma multiplicities of standard filtrations of projective modules for integral atypical blocks in the BGG category $\mathcal{O}$ for the orthosymplectic Lie superalgebra $\mathfrak{osp}(3|4)$ by way of translation functors. We then explicitly determine the composition factor multiplicities of Verma modules using BGG reciprocity.

2019 Research Papers

218) Espen Slettnes, Minimal Embedding Dimensions of Rectangle k-Visibility Graphs, published in Journal of Graph Algorithms and Applications 25:1 (January 2021): 59-96.

Bar visibility graphs were adopted in the 1980s as a model to represent traces, e.g., on circuit boards and in VLSI chip designs. Two generalizations of bar visibility graphs, rectangle visibility graphs and bar $k$-visibility graphs, were subsequently introduced. Here, we combine bar $k$- and rectangle visibility graphs to form rectangle $k$-visibility graphs (R$k$VGs), and further generalize these to higher dimensions. A graph is a $d$-dimensional R$k$VG if and only if it can be represented with vertices as disjoint axis-aligned hyperrectangles in $d$-space, such that there is an axis-parallel line of sight between two hyperrectangles that intersects at most $k$ other hyperrectangles if and only if there is an edge between the two corresponding vertices. For any graph $G$ and a fixed $k$, we prove that given enough spatial dimensions, $G$ has a rectangle $k$-visibility representation, and thus we define the minimal embedding dimension (MED) with $k$-visibility of $G$ to be the smallest $d$ such that $G$ is a $d$-dimensional R$k$VG. We study the properties of MEDs and find upper bounds on the MEDs of various types of graphs. In particular, we find that the $k$-visibility MED of the complete graph on $m$ vertices $K_m$ is at most $\lceil{m/(2(k+1))}\rceil,$ of complete $r$-partite graphs is at most $r+1,$ and of the $m^{\rm th}$ hypercube graph $Q_m$ is at most $\lceil{2m/3}\rceil$ in general, and at most $\lfloor{\sqrt{m}\,}\rceil$ for $k=0,~ m \ne 2.$

217) Zhengyang (Leo) Dong (PRIMES) and Gil Alterovitz (MIT), netAE: Semi-supervised dimensionality reduction of single-cell RNA sequencing to facilitate cell labeling, published in Bioinformatics (29 Jul 2020)

Single-cell RNA sequencing allows us to study cell heterogeneity at an unprecedented cell-level resolution and identify known and new cell populations. The current cell labeling pipeline uses unsupervised clustering and assigns labels to clusters by manual inspection. However, this pipeline does not utilize available gold-standard labels because there are usually too few of them to be useful to most computational methods. This paper aims to facilitate cell labeling with a semi-supervised method in an alternative pipeline, in which a few gold-standard labels are first identified and then extended to the rest of the cells computationally. We built a semi-supervised dimensionality reduction method, a network-enhanced autoencoder (netAE). Tested on three public datasets, netAE outperforms various dimensionality reduction baselines and achieves satisfactory classification accuracy even when the labeled set is very small, without disrupting the similarity structure of the original space.

216) Tanya Khovanova (MIT) and Kevin Wu (PRIMES), Base 3/2 and Greedily Partitioned Sequences (arXiv.org, 19 Jul 2020)

We delve into the connection between base $\frac{3}{2}$ and the greedy partition of non-negative integers into 3-free sequences. Specifically, we find a fractal structure on strings written with digits 0, 1, and 2. We use this structure to prove that the even non-negative integers written in base $\frac{3}{2}$ and then interpreted in base 3 form the Stanley cross-sequence, where the Stanley cross-sequence comprises the first terms of the infinitely many sequences that are formed by the greedy partition of non-negative integers into 3-free sequences.
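The statement above can be checked by brute force on small cases. The following sketch is only an illustration, not the paper's argument: the function names are hypothetical, and the base $\frac{3}{2}$ digits are produced with the usual rule (write $n = 3q + r$, emit the digit $r$, continue with $2q$). The greedy partition places each integer, in increasing order, into the first sequence in which it does not complete a 3-term arithmetic progression.

```python
def to_base_three_halves(n):
    """Digits of a non-negative integer n in base 3/2, most significant first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        q, r = divmod(n, 3)
        digits.append(str(r))
        n = 2 * q          # carry two units to the next place for every three
    return "".join(reversed(digits))


def greedy_3free_partition(limit):
    """Greedily partition 0, 1, ..., limit - 1 into 3-free sequences."""
    sequences = []          # list of (ordered list, set) pairs
    for n in range(limit):
        for members, member_set in sequences:
            # n completes a progression a < b < n exactly when a = 2b - n is present
            if not any((2 * b - n) in member_set for b in members):
                members.append(n)
                member_set.add(n)
                break
        else:
            sequences.append(([n], {n}))
    return [members for members, _ in sequences]


def stanley_cross_sequence(limit):
    """First terms of the greedy sequences that start below `limit`."""
    return [seq[0] for seq in greedy_3free_partition(limit)]


if __name__ == "__main__":
    print(stanley_cross_sequence(24))
    print([int(to_base_three_halves(2 * i), 3) for i in range(5)])
```

Both printed lists are `[0, 2, 7, 21, 23]`, which agrees with the abstract's claim for the first five even integers.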

215) Dmitry Kleinbock (Brandeis University), Anurag Rao (Brandeis University), and Srinivasan Sathiamurthy (PRIMES), Critical loci of convex domains in the plane (26 Mar 2020; arXiv.org, 30 Mar 2020), published in Indagationes Mathematicae 32:3 (May 2021): 719-728.

Let $K$ be a bounded convex domain in $\mathbb{R}^2$ symmetric about the origin. The critical locus of $K$ is defined to be the (non-empty compact) set of lattices $\Lambda$ in $\mathbb{R}^2$ of smallest possible covolume such that $\Lambda \cap K= \lbrace 0\rbrace$. These are classical objects in geometry of numbers; yet all previously known examples of critical loci were either finite sets or finite unions of closed curves. In this paper we give a new construction which, in particular, furnishes examples of domains having critical locus of arbitrary Hausdorff dimension between $0$ and $1$.

214) P. A. Crowdmath, Propagation time for weighted zero forcing (arXiv.org, 15 May 2020)

Zero forcing is a graph coloring process that was defined as a tool for bounding the minimum rank and maximum nullity of a graph. It has also been used for studying control of quantum systems and monitoring electrical power networks. One of the problems from the 2017 AIM workshop "Zero forcing and its applications" was to explore edge-weighted probabilistic zero forcing, where edges have weights that determine the probability of a successful force if forcing is possible under the standard zero forcing coloring rule.
In this paper, we investigate the expected time to complete the weighted zero forcing coloring process, known as the expected propagation time, as well as the time for the process to be completed with probability at least $\alpha$, known as the $\alpha$-confidence propagation time. We demonstrate how to find the expected and confidence propagation times of any edge-weighted graph using Markov matrices. We also determine the expected and confidence propagation times for various families of edge-weighted graphs including complete graphs, stars, paths, and cycles.
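For orientation, here is a minimal sketch of the standard (unweighted, deterministic) zero forcing rule on which the weighted process is built: a blue vertex with exactly one white neighbor forces that neighbor to become blue. It does not model the edge weights or the Markov-matrix computation from the paper. The function name and the adjacency-dictionary representation are assumptions for this example.

```python
def propagation_time(adjacency, initially_blue):
    """Number of synchronous rounds needed to color every vertex blue under
    the standard zero forcing rule, or None if the process stalls."""
    blue = set(initially_blue)
    rounds = 0
    while len(blue) < len(adjacency):
        forced = set()
        for u in blue:
            white_neighbors = [v for v in adjacency[u] if v not in blue]
            if len(white_neighbors) == 1:      # u can force its only white neighbor
                forced.add(white_neighbors[0])
        if not forced:
            return None                        # the starting set is not a zero forcing set
        blue |= forced
        rounds += 1
    return rounds


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(propagation_time(path, {0}))      # one endpoint of a path
    print(propagation_time(cycle, {0}))     # a single vertex of a cycle stalls
    print(propagation_time(cycle, {0, 1}))  # two adjacent vertices of a cycle
```

In the probabilistic setting described above, each such force would succeed only with some probability, so the number of rounds becomes a random variable.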

213) P. A. Crowdmath, Applications of the abc conjecture to powerful numbers (arXiv.org, 15 May 2020)

The abc conjecture is one of the most famous unsolved problems in number theory. The conjecture claims for each real $\epsilon > 0$ that there are only a finite number of coprime positive integer solutions to the equation $a+b = c$ with $c > (rad(a b c))^{1+\epsilon}$. If true, the abc conjecture would imply many other famous theorems and conjectures as corollaries. In this paper, we discuss the abc conjecture and find new applications to powerful numbers, which are integers $n$ for which $p^2 | n$ for every prime $p$ such that $p | n$. We answer several questions from an earlier paper on this topic, assuming the truth of the abc conjecture.
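The two definitions used above, the radical $rad(n)$ and powerful numbers, are easy to compute for small inputs. The sketch below uses trial division and hypothetical function names. It is meant only to make the definitions concrete, not to reproduce any result of the paper.

```python
from math import prod


def prime_factorization(n):
    """Map each prime divisor of n >= 1 to its exponent (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def rad(n):
    """Product of the distinct primes dividing n."""
    return prod(prime_factorization(n))


def is_powerful(n):
    """True when p^2 divides n for every prime p dividing n."""
    return all(exponent >= 2 for exponent in prime_factorization(n).values())


if __name__ == "__main__":
    a, b, c = 1, 8, 9                  # a coprime triple with a + b = c
    print(rad(a * b * c), c)           # the radical of 72 is smaller than c
    print([n for n in range(1, 101) if is_powerful(n)])
```

For the triple $1 + 8 = 9$, $rad(abc) = rad(72) = 6 < 9 = c$, which is the kind of rare behavior that the conjecture limits.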

212) Alin Tomescu (MIT CSAIL), Robert Chen (PRIMES), Yiming Zheng (PRIMES), Ittai Abraham (VMware Research), Benny Pinkas (VMware Research and Bar Ilan University), Guy Golan Gueta (VMware Research), and Srinivas Devadas (MIT CSAIL), Towards Scalable Threshold Cryptosystems (9 Mar 2020), published in Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, vol. 1, pp. 1242-1258.

The resurging interest in Byzantine fault tolerant systems will demand more scalable threshold cryptosystems. Unfortunately, current systems scale poorly, requiring time quadratic in the number of participants. In this paper, we present techniques that help scale threshold signature schemes (TSS), verifiable secret sharing (VSS) and distributed key generation (DKG) protocols to hundreds of thousands of participants and beyond. First, we use efficient algorithms for evaluating polynomials at multiple points to speed up computing Lagrange coefficients when aggregating threshold signatures. As a result, we can aggregate a 130,000 out of 260,000 BLS threshold signature in just 6 seconds (down from 30 minutes). Second, we show how "authenticating" such multipoint evaluations can speed up proving polynomial evaluations, a key step in communication-efficient VSS and DKG protocols. As a result, we reduce the asymptotic (and concrete) computational complexity of VSS and DKG protocols from quadratic time to quasilinear time, at a small increase in communication complexity. For example, using our DKG protocol, we can securely generate a key for the BLS scheme above in 2.3 hours (down from 8 days). Our techniques improve performance for thresholds as small as 255 and generalize to any Lagrange-based threshold scheme, not just threshold signatures. Our work has certain limitations: we require a trusted setup, we focus on synchronous VSS and DKG protocols and we do not address the worst-case complaint overhead in DKGs. Nonetheless, we hope it will spark new interest in designing large-scale distributed systems.
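To make the quadratic cost mentioned above concrete, here is a minimal sketch of the naive way to compute Lagrange coefficients at $0$ over a prime field, as used when recombining Shamir-style shares. It illustrates the baseline that the paper improves on, not the paper's fast multipoint-evaluation technique. The function names, the tiny modulus 97, and the sample polynomial are assumptions chosen for readability.

```python
def lagrange_coefficients_at_zero(xs, modulus):
    """Naive computation of the Lagrange coefficients L_i(0) for distinct
    evaluation points xs over the integers modulo a prime. The nested loop
    makes this quadratic in len(xs)."""
    coefficients = []
    for i, x_i in enumerate(xs):
        numerator, denominator = 1, 1
        for j, x_j in enumerate(xs):
            if i != j:
                numerator = numerator * (-x_j) % modulus
                denominator = denominator * (x_i - x_j) % modulus
        # pow(d, -1, m) is the modular inverse (Python 3.8+)
        coefficients.append(numerator * pow(denominator, -1, modulus) % modulus)
    return coefficients


def reconstruct_secret(shares, modulus):
    """Recover f(0) from shares (x, f(x)) of a polynomial f."""
    xs = [x for x, _ in shares]
    coefficients = lagrange_coefficients_at_zero(xs, modulus)
    return sum(c * y for c, (_, y) in zip(coefficients, shares)) % modulus


if __name__ == "__main__":
    p = 97
    # Shares of the toy polynomial f(x) = 5 + 3x + 2x^2 at x = 1, 2, 3.
    shares = [(1, 10), (2, 19), (3, 32)]
    print(lagrange_coefficients_at_zero([1, 2, 3], p))
    print(reconstruct_secret(shares, p))
```

With $t$ shares the inner loop runs about $t^2$ times, which is the cost that becomes prohibitive at the scales discussed in the abstract.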

211) Daniil Kalinov (MIT) and Lev Kruglyak (PRIMES), The Rational Cherednik Algebra of Type $A_1$ with Divided Powers (5 Mar 2020), published in New York Journal of Mathematics 27 (2021): 1328-1346

Motivated by the recent developments of the theory of Cherednik algebras in positive characteristic, we study rational Cherednik algebras with divided powers. In our research we have started with the simplest case, the rational Cherednik algebra of type $A_1$. We investigate its maximal divided power extensions over $R[c]$ and $R$ for arbitrary principal ideal domains $R$ of characteristic zero. In these cases, we prove that the maximal divided power extensions are free modules over the base rings, and construct an explicit basis in the case of $R[c]$. In addition, we provide an abstract construction of the rational Cherednik algebra of type $A_1$ over an arbitrary ring, and prove that this generalization expands the rational Cherednik algebra to include all of the divided powers.

210) Sebastian Jeon (PRIMES) and Tanya Khovanova (MIT), 3-Symmetric Graphs (arXiv.org, 8 Mar 2020)

An intuitive property of a random graph is that its subgraphs should also appear randomly distributed. We consider graphs whose subgraph densities exactly match their expected values. We call graphs with this property for all subgraphs with $k$ vertices $k$-symmetric. We discuss some properties and examples of such graphs. We construct 3-symmetric graphs and provide some statistics.

209) Lucy Cai, Espen Slettnes, and Jeremy Zhou, A Combinatorial Approach to Extracting Rooted Tree Statistics from the Order Quasisymmetric Function (3 Mar 2020)

The chromatic symmetric function defined by Stanley is a power series that is symmetric in an infinite number of variables and generalizes the chromatic polynomial. Shareshian and Wachs defined the chromatic quasisymmetric function, and Awan and Bernardi defined an analog of it for digraphs.
Three decades ago, Stanley posed a question equivalent to "Does the chromatic symmetric function distinguish between all trees?" A similar question can be raised for rooted trees: "Does the chromatic quasisymmetric function distinguish between all rooted trees?" Hasebe and Tsujie showed algebraically the stronger statement that the order quasisymmetric function distinguishes rooted trees. Here, we aim to directly extract useful statistics about a tree given only its order quasisymmetric function. This approach emphasizes the combinatorics of trees over the algebraic properties of quasisymmetric functions. We show that a rooted-tree-statistic we name the "co-height profile" is extractable, and that it distinguishes rooted 2-caterpillars.

208) Heidi Lei, On the Hausdorff Dimension of the Visible Koch Curve (28 Feb 2020)

In geometry, a point in a set is visible from another point if the line segment connecting the two points does not contain other points in the set. We show that the Hausdorff dimension is 1 for the portion of the Koch curve that is visible from points at infinity and points in certain defined regions of the plane.

207) Aditya Saligrama (PRIMES) and Guillaume Leclerc (MIT), Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy (arXiv.org, 26 Feb 2020), presented at the ICLR 2020 Workshop on Towards Trustworthy ML: Rethinking Security and Privacy for ML (26 April 2020) (slides)

A necessary characteristic for the deployment of deep learning models in real world applications is resistance to small adversarial perturbations while maintaining accuracy on non-malicious inputs. While robust training provides models that exhibit better adversarial accuracy than standard models, there is still a significant gap in natural accuracy between robust and non-robust models which we aim to bridge. We consider a number of ensemble methods designed to mitigate this performance difference. Our key insight is that models trained to withstand small attacks, when ensembled, can often withstand significantly larger attacks, and this concept can in turn be leveraged to optimize natural accuracy. We consider two schemes, one that combines predictions from several randomly initialized robust models, and the other that fuses features from robust and standard models.

206) William Kuszmaul (MIT) and Alek Westover (PRIMES), In-Place Parallel-Partition Algorithms using Exclusive-Read-and-Write Memory (25 Feb 2020)

We present an in-place algorithm for the parallel partition problem that has linear work and polylogarithmic span. The algorithm uses only exclusive read/write shared variables, and can be implemented using parallel-for-loops without any additional concurrency considerations (i.e., the algorithm is EREW). A key feature of the algorithm is that it exhibits provably optimal cache behavior, up to small-order factors.
We also present a second in-place EREW algorithm that has linear work and span $O(\log n \cdot \log\log n)$, which is within an $O(\log\log n)$ factor of the optimal span. By using this low-span algorithm as a subroutine within the cache-friendly algorithm, we are able to obtain a single EREW algorithm that combines their theoretical guarantees: the algorithm achieves span $O(\log n \cdot \log\log n)$ and optimal cache behavior. As an immediate consequence, we also get an in-place EREW quicksort algorithm with work $O(n \log n)$ and span $O(\log^2 n \cdot \log\log n)$.

205) Justin Yu, On a rank game (22 Feb 2020)

We introduce a new game played by two players that generates a $(0,1)$-matrix of size $n$. The first player aims to maximize its resulting rank, while the second player aims to minimize it. We show that the first player can force almost full rank given additional power in move possibilities.

204) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions as Models for Trade Wars and Military Annexation (arXiv.org, 10 Feb 2020), published in Letters in Spatial and Resource Sciences (13 May 2022)

We explore an application of all-pay auctions to model trade wars and territorial annexation. Specifically, in the model we consider, the expected resource, production, and aggressive (military/tariff) power are public information, but actual resource levels are private knowledge. We consider the resource transfer at the end of such a competition which deprives the weaker country of some fraction of its original resources. In particular, we derive the quasi-equilibria strategies for two-country conflicts under different scenarios. This work is relevant for the ongoing US-China trade war, and the recent Russian capture of Crimea, as well as historical and future conflicts.

203) Benjamin Kang (PRIMES) and James Unwin (University of Illinois at Chicago), All-Pay Auctions with Different Forfeits (arXiv.org, 7 Feb 2020), forthcoming in the Yau Competition finalists compendium

In an auction each party bids a certain amount and the one which bids the highest is the winner. Interestingly, auctions can also be used as models for other real-world systems. In an all-pay auction all parties must pay a forfeit for bidding. In the most commonly studied all-pay auction, parties forfeit their entire bid, and this has been considered as a model for expenditure on political campaigns. Here we consider a number of alternative forfeits which might be used as models for different real-world competitions, such as preparing bids for defense or infrastructure contracts.
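The payoff rule described above can be written down in a few lines. This is a minimal sketch under stated assumptions: the function name, the tie-breaking rule (lowest index wins), and the half-bid forfeit in the second call are all hypothetical and are not taken from the paper. The default forfeit is the entire bid, which matches the most commonly studied case.

```python
def all_pay_payoffs(bids, prize, forfeit=lambda bid: bid):
    """Payoff of each bidder: the highest bidder receives the prize and pays
    their bid; every other bidder pays forfeit(bid) and receives nothing.
    Ties go to the lowest index (an assumption made for this sketch)."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    payoffs = []
    for i, bid in enumerate(bids):
        if i == winner:
            payoffs.append(prize - bid)
        else:
            payoffs.append(-forfeit(bid))
    return payoffs


if __name__ == "__main__":
    bids = [3, 5, 2]
    print(all_pay_payoffs(bids, prize=10))                               # full forfeit
    print(all_pay_payoffs(bids, prize=10, forfeit=lambda bid: bid / 2))  # hypothetical half forfeit
```

Passing the forfeit as a function keeps the winner-determination logic fixed while the penalty for losing varies, which is the dimension along which the abstract says the auctions differ.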

202) Victoria Zhang, Patterns and Symmetries in Spiking Neural Networks (11 Jan 2020)

Inspired by recent progress in computational neuroscience and artificial intelligence, this paper explores rich temporal patterns in networks of neurons that communicate via electric pulses known as spikes. In particular, we describe the attractors in small circuits of spiking neurons with different symmetries and connectivities. Using methods developed in the theory of dynamical systems, we extend an analytical approach to capture the phase-locked states and their stability for a general $N$-cell system. We then systematically explore attractors in reduced state spaces via Poincaré maps for both all-to-all coupled and star-like coupled networks. We identify a sequence of bifurcations when the coupling strengths vary from inhibition to excitation. Moreover, using high-precision numerical simulations, we find two novel states in star-like networks that are unobserved in all-to-all networks: the death of oscillation for inhibitory coupling and quasi-periodic behaviors for excitatory coupling. Our results elucidate the interplay between dynamical patterns and symmetries in the building blocks of real networks. Furthermore, as self-sustained oscillations with pulsatile couplings are ubiquitous, our analysis may clarify understanding of not only neural dynamics but also other pulse-coupled oscillator systems such as non-linear electric circuits, wireless sensor networks, and self-organizing chemical reactions.

201) Zander Hill, Upper Bound on the Distortion of Cabled Knots (8 Jan 2020)

The torus knots are a class of knots generated by ordered pairs $(p,q)$ of relatively prime integers, where the $(p,q)$-torus knot is the curve defined by a ray of slope $\frac{p}{q}$ emanating from the origin in the representation of the torus as a square with opposing sides identified. Furthermore, given a curve $K$, we can define the $(p,q)$-cabling of $K$ to be the $(p,q)$-torus knot living on an embedding of the torus which follows $K$, as opposed to the standard embedding of the torus which follows $S^1$ in $\mathbb{R}^3$. We show that for all $p$ and $q \gg p$, there exists a curve in the isotopy class of the $(p,q)$-torus knot whose supremal ratio of arc length to Euclidean distance, called the distortion of the curve, is bounded above by $\frac{7q}{\log(q)}$, and additionally show that this bound holds for the $(p,q)$-cabling of any knot. This extends a result of Studer establishing sublinear upper bounds for the distortion of the $(2,q)$-torus knots.

200) Oliver Hayman (PRIMES) and Ashwin Narayan (MIT), Analyzing Visualization and Dimensionality-Reduction Algorithms (9 Jan 2020)

In order to find patterns among high dimensional data sets in scientific studies, scientists use mapping algorithms to produce representative two-dimensional or three-dimensional data sets that are easier to visualize. The most prominent of these algorithms is the t-Distributed Stochastic Neighbor Embedding algorithm (t-SNE). In this project, we create a metric for evaluating how clustered a data set is, and use it to measure how the perplexity parameter of the t-SNE algorithm affects the clustering of outputted data sets. Additionally, we propose a modification that improves how well randomness is preserved in outputted data sets. Finally, we create a separate metric to test whether a group of points contains one or multiple clusters in a data set of centered clusters.

199) Frank Wang, The integral shuffle algebra and the $K$-theory of the Hilbert scheme of points in $\mathbb{A}^2$ (8 Jan 2020; arXiv.org, 12 Feb 2020)

We examine the shuffle algebra defined over the ring $\mathbf{R} = \mathbb{C}[q_1^{\pm 1}, q_2^{\pm 1}]$, also called the integral shuffle algebra, which was found by Schiffmann and Vasserot to act on the equivariant $K$-theory of the Hilbert Scheme of points in the plane. We find that the modules of 2 and 3 variable elements of the shuffle algebra are finitely generated, and prove a necessary condition for an element to be in the integral shuffle algebra for arbitrarily many variables.

198) Tejas Gopalakrishna (PRIMES) and Yichi Zhang (MIT), Analysis of the One Line Factoring Algorithm (6 Jan 2020)

For integers that fit within $42$ bits, a competitive factoring algorithm is the so-called One Line Factoring Algorithm proposed by William B. Hart. We analyze this algorithm in special cases, in particular, for semiprimes $N = pq$, and look for optimizations. We first observe the cases in which the larger or smaller prime is returned. We then show that when $p$ and $q$ are sufficiently close, we always finish on the first iteration. An upper bound can be found for the first iteration that successfully factors an odd semiprime. Using this upper bound, we demonstrate some simplifications to the algorithm for odd semiprimes in particular. One of our observations is that we only need to iterate numbers $\{ 0,1,3,5,7 \}$ modulo $8$, as the other iterators are very rarely the first that successfully factor the semiprime. Finally, we inspect the performance of the optimized algorithm.
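For readers unfamiliar with the algorithm, the following is a minimal sketch of the basic One Line Factoring loop as published by Hart: for $i = 1, 2, \dots$, take $s = \lceil \sqrt{Ni} \rceil$ and test whether $s^2 \bmod N$ is a perfect square $t^2$, in which case $\gcd(N, s - t)$ is returned. It does not include the optimizations studied in the paper. The function name, the iteration cap, and the extra check that the divisor is nontrivial are assumptions added for this example.

```python
from math import gcd, isqrt


def one_line_factor(n, max_iterations=1_000_000):
    """Return (factor, iteration) for the first iteration i that yields a
    nontrivial divisor of n, or None if the iteration cap is reached."""
    for i in range(1, max_iterations + 1):
        s = isqrt(n * i)
        if s * s < n * i:
            s += 1                      # s = ceil(sqrt(n * i))
        m = (s * s) % n
        t = isqrt(m)
        if t * t == m:                  # m is a perfect square, so s^2 = t^2 (mod n)
            g = gcd(n, s - t)
            if 1 < g < n:               # guard against trivial divisors (added here)
                return g, i
    return None


if __name__ == "__main__":
    # 8051 = 83 * 97, a semiprime whose prime factors are close together.
    print(one_line_factor(8051))
```

For $N = 8051$ the first iteration already succeeds, since $\lceil \sqrt{8051} \rceil = 90$ and $90^2 - 8051 = 49 = 7^2$, which is consistent with the observation above about semiprimes whose factors are sufficiently close.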

197) Sunay Joshi, On the degenerate Turán problem and its variants (3 Jan 2020)

Given a family of graphs $\mathcal{F}$, a central problem in extremal graph theory is to determine the maximum number $\text{ex}(n,\mathcal{F})$ of edges in a graph on $n$ vertices that does not contain any member of $\mathcal{F}$ as a subgraph. The degenerate Turán problem regards the asymptotic behavior of $\text{ex}(n,\mathcal{F})$ for families $\mathcal{F}$ of bipartite graphs. In this paper, we prove four new theorems regarding the extremal number and its variants. We begin by investigating several notions central to providing lower bounds on extremal numbers, including balanced rooted graphs and the Erdős-Simonovits Reduction Theorem. In addition, we present new lower bounds on the asymmetric extremal number $\text{ex}(m,n,F)$ and the lopsided asymmetric extremal number $\text{ex}^*(m,n,F)$ when $F$ is a blowup of a bipartite graph or a theta graph.

196) Alexander J. Ding, An Evaluation of UPC++ by Porting Shared-Memory Parallel Graph Algorithms (1 Jan 2020)

Unified Parallel C++ (UPC++), a C++ library, attempts to address the programming difficulty introduced by distributed parallel systems and still take advantage of the model's high scalability by exposing an API that represents the distributed memory as a contiguous global address space, similar to that of a shared-memory parallel system. Though previous work, including the various benchmarks by UPC++ developers, has demonstrated the library's effectiveness in simple tasks and in porting distributed-memory parallel algorithms that are often implemented in OpenMPI, an assessment of the ease and effectiveness of porting shared-memory parallel algorithms into UPC++ is still lacking. We implement a number of graph algorithms in OpenMP, a common shared-memory parallel library, and port them into UPC++ in a locality-aware, communication-averse manner to evaluate the convenience, scalability, and robustness of UPC++. Tests on both a single-node, multicore system and the NERSC supercomputer (a multi-node system), with a plethora of real and random input graphs, demonstrate a number of prerequisites for high scalability in our UPC++ implementation: large input graphs, dense input graphs, and dense operations. Similar tests on our OpenMP implementation function as control, proving the algorithms' performance in shared-memory systems. Despite the relatively straightforward and naive porting from OpenMP, we still achieve competitive performance and scalability in dense algorithms on large inputs. The porting demonstrates UPC++'s ease of usage and good porting potential, especially when compared with other distributed libraries like OpenMPI. Finally, we extrapolate a distributed graph processing system on UPC++, optimized with a hybrid top-down/bottom-up approach, to simplify future distributed graph algorithm implementations.

195) Jason Yang (PRIMES), Martin Falk (MIT), and Sameer Abraham (MIT), The relationship between gene expression correlation and 3D genome organization (31 Dec 2019)

In some organisms such as E. coli and S. cerevisiae yeast, it is known that there is a relationship between the distance among genes and their coexpression (Pannier et al., Kruglyak and Tang). It is also known that in general there is a relationship between gene function and genome structure (Szabo et al.). One might also expect to find a relationship between gene expression and TADs, which are domains within the genome where loci inside contact each other more frequently than loci outside. However, by analyzing data from Mus musculus brain cells, we do not find a relationship between gene pair correlation of single-cell RNA-seq gene expression and gene pair distance. Furthermore, despite the body of work linking gene expression and TAD structure, we also find no difference between gene pairs within a single TAD and between two TADs in terms of the relationship between gene pair distance and correlation. Additionally, we find that gene pair correlation is not related to the biological functions of the genes. However, there is a relationship between highly negative gene pair correlation and the number of times both genes are expressed 0 times across different cells.

194) Sarah Chen (PRIMES), Karl Clauser, Travis Law, and Tamara Ouspenskaia (Broad Institute), Seeking Neoantigen Candidates within Retained Introns (28 Dec 2019)

Major histocompatibility complex class I (MHC I) molecules present peptides from cytosolic proteins on the surface of cells. Cytotoxic T cells can recognize the presented antigens, and infected or cancerous cells that present non-self antigens can elicit an immune response. The identification of cancer-specific peptides (neoantigens) produced by somatic mutations in tumor cells and presented by MHC I molecules enables immunotherapies such as personalized cancer vaccines and adoptive T cell transfer. The state-of-the-art approach searches for neoantigens derived from cancer-specific somatic variants and often falls short for cancers with few somatic mutations. Retained introns (RIs) resulting from splicing errors in cancer are an additional source of neoantigens. In this study, we identify RIs which are transcribed, translated, and contribute peptides to MHC I presentation. Using de novo transcriptome assembly of RNA-seq data, we identified 1799 RIs in B721.221 cells. Additionally, we detected 87 peptides from 83 RIs by liquid chromatography-tandem mass spectrometry of the MHC I immunopeptidome (LC-MS/MS). Finally, we use ribosome profiling (Ribo-seq), which provides a readout of mRNA translation, to identify RIs that are translated, a prerequisite for MHC I presentation. Previous studies have predicted thousands of RIs but have been able to validate only a handful through mass spectrometry. By distinguishing transcribed but untranslated versus translated candidates, Ribo-seq has the potential to improve RI predictions. We propose the use of a combination of RNA-seq and Ribo-seq, paired with mass spectrometry validation, to more accurately predict the contribution of RIs to the MHC I immunopeptidome, enabling the use of RI-derived neoantigens in future immunotherapies.

193) Kevin Edward Zhao and Vishnu Emani, The Role of Protein Occupancy in DNA Compartmentalization (23 Dec 2019)

The organization of DNA throughout the genome is a complex process to study. Analysis reveals a checker-board pattern of separation at a megabase-pair scale, called compartments, which are captured well by the largest eigenvector of the Hi-C contact matrix. The sign of the eigenvector correlates with active and repressed areas of the genome. These compartments have been characterized into two categories, called A and B compartments, which are hypothesized to be spatially separated based upon the protein occupancy in the region. This project explores the factors that govern DNA compartmentalization, including the relationship between compartments and protein occupancy. In order to analyze contacts within the genome, Hi-C data was loaded and the eigenvectors of the contact matrix were computed. Protein occupancy in murine cortical neurons and neural progenitor cells was measured via ChIP-Seq. Using this data, we calculated the influence of several proteins on the sign of the Hi-C eigenvector via regression and Support Vector Machines (SVMs). Based on our findings, we tried to develop a simple model for compartments and explored this via simulations. We developed simple simulations of compartments based on ChIP-Seq data, and compared the results to compartments identified in experimental Hi-C maps. The results demonstrate a high correlation between the eigenvectors of the simulated and experimental Hi-C maps. In conclusion, the computational methods are effective at determining the proteins which most significantly contribute to compartmentalization.

192) Neil Chowdhury, A method to recognize universal patterns in genome structure using Hi-C (22 Dec 2019)

The expression of genes in cells is a complicated process. Expression levels of a gene are determined not only by its local neighborhood but also by more distal regions, as is the case with enhancer-promoter interactions, which can connect regions millions of bases away. The large-scale organization of DNA within the cell nucleus plays a substantial role in gene expression and cell fate, with recent developments in biochemical assays (such as Hi-C) generating quantitative maps of the higher-order structure of DNA. The interactions captured by Hi-C have been attributed to several distinct physical processes. One of the processes is that of segregation of DNA into compartmental domains by phase separation. While the current consensus is that there are broadly two types of compartmental domains (A and B), there is some evidence for a larger number of compartmental domains. Here a methodology to determine the identity and number of such compartments is presented, and it is observed that there are four distinct compartments within the genome.

191) Yizhen Chen, Mobile Sensor Networks: Bounds on Capacity and Complexity of Realizability (22 Dec 2019; arXiv.org, 21 Jan 2020), submitted to Electronic Journal of Combinatorics

In a restricted combinatorial mobile sensor network (RCMSN), there are n sensors that continuously receive and store information from outside. Every two sensors communicate exactly once, and at an event when two sensors communicate, they receive and store additionally all information the other has stored. C. Gu, I. Downes, O. Gnawali, and L. Guibas proposed a capacity of information diffusion in mobile sensor networks. They collected all information received by two sensors between a communication event and the previous communication events for each of them into one information packet, and considered the number of sensors a packet eventually reaches. Then they defined the capacity of an RCMSN to be the ratio of the average number of sensors the packets reach and the total number of sensors. While they have studied the expected capacity of an RCMSN (when the order of communications is random), we found the RCMSNs with maximum and minimum capacities. We also found the maximum, minimum, and expected capacities for several related mobile sensor network constructions, such as ones generated from intersections of lines, as well as complexity results concerning when a mobile sensor network can be generated in such geometric ways.

124) Alec Leng, Independence of the Miller-Rabin and Lucas Probable Prime Tests (30 Mar 2017)

In the modern age, public-key cryptography has become a vital component for secure online communication. To implement these cryptosystems, rapid primality testing is necessary in order to generate keys. In particular, probabilistic tests are used for their speed, despite the potential for pseudoprimes. So, we examine the commonly used Miller-Rabin and Lucas tests, showing that numbers with many nonwitnesses are usually Carmichael or Lucas-Carmichael numbers in a specific form. We then use these categorizations, through a generalization of Korselt’s criterion, to prove that there are no numbers with many nonwitnesses for both tests, affirming the two tests’ relative independence. As Carmichael and Lucas-Carmichael numbers are in general more difficult for the two tests to deal with, we next search for numbers which are both Carmichael and Lucas-Carmichael numbers, experimentally finding none less than $10^{16}$. We thus conjecture that there are no such composites and, using multivariate calculus with symmetric polynomials, begin developing techniques to prove this.
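
For readers unfamiliar with the terminology, the following Python sketch shows the standard Miller-Rabin (strong probable prime) check to a single base and a brute-force count of nonwitnesses, i.e. bases for which a composite number passes. It is only an illustration of the definitions: the function names are assumptions made here, and neither the Lucas test nor the paper's classification of numbers with many nonwitnesses is reproduced.

```python
def is_strong_probable_prime(n, a):
    """True if odd n > 2 passes the Miller-Rabin test to base a."""
    d, r = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2^r with d odd
        d //= 2
        r += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(r - 1):     # square repeatedly, looking for -1 mod n
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False

def count_nonwitnesses(n):
    """Number of bases 1 <= a < n for which odd n > 2 passes the test."""
    return sum(is_strong_probable_prime(n, a) for a in range(1, n))

# 561 is a Carmichael number, yet base 2 exposes it as composite.
print(is_strong_probable_prime(561, 2))   # False
# 2047 = 23 * 89 is composite but passes to base 2.
print(is_strong_probable_prime(2047, 2))  # True
```

Counting nonwitnesses by brute force is only practical for small $n$; it is included to make the notion concrete, not as a method used in the paper.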

123) Ria Das, Exploring the Ant Mill: Numerical and Analytical Investigations of Mixed Memory-Reinforcement Systems (arXiv.org, 20 Mar 2017)

Under certain circumstances, a swarm of a species of trail-laying ants known as army ants can become caught in a doomed revolving motion known as the death spiral, in which each ant follows the one in front of it in a never-ending loop until they all drop dead from exhaustion. This phenomenon, as well as the ordinary motions of many ant species and certain slime molds, can be modeled using reinforced random walks and random walks with memory. In a reinforced random walk, the path taken by a moving particle is influenced by the previous paths taken by other particles. In a random walk with memory, a particle is more likely to continue along its line of motion than change its direction. Both memory and reinforcement have been studied independently in random walks with interesting results. However, real biological motion is a result of a combination of both memory and reinforcement. In this paper, we construct a continuous random walk model based on diffusion-advection partial differential equations that combine memory and reinforcement. We find an axi-symmetric, time-independent solution to the equations that resembles the death spiral. Finally, we prove numerically that the obtained steady-state solution is stable.

122) Andrew Gritsevskiy and Adithya Vellal, Development and Biological Analysis of a Neural Network Based Genomic Compression System (3 Mar 2017)

The advent of Next Generation Sequencing (NGS) technologies has resulted in a barrage of genomic data that is now available to the scientific community. This data contains information that is driving fields such as precision medicine and pharmacogenomics, where clinicians use a patient’s genetics in order to develop custom treatments. However, genomic data is immense in size, which makes it extremely costly to store, transport and process. A genomic compression system which takes advantage of intrinsic biological patterns can help reduce the costs associated with this data while also identifying important biological patterns. In this project, we aim to create a compression system which uses unsupervised neural networks to compress genomic data. The complete compression suite, GenComp, is compared to existing genomic data compression methods. The results are then analyzed to discover new biological features of genomic data. Testing showed that GenComp achieves at least 40 times more compression than existing variant compression solutions, while providing comparable decoding times in most applications. GenComp also provides some insight into genetic patterns, which has significant potential to aid in the fields of pharmacogenomics and precision medicine. Our results demonstrate that neural networks can be used to significantly compress genomic data while also assisting in better understanding genetic biology.

121) Vivek Bhupatiraju, John Kuszmaul, and Vinjai Vale, On the Viability of Distributed Consensus by Proof of Space (3 Mar 2017)

In this paper, we present our implementation of Proof of Space (PoS) and our study of its viability in distributed consensus. PoS is a new alternative to the commonly used Proof of Work, which is a protocol at the heart of distributed consensus systems such as Bitcoin. PoS resolves the two major drawbacks of Proof of Work: high energy cost and bias towards individuals with specialized hardware. In PoS, users must store large “hard-to-pebble” PTC graphs, which are recursively generated using subgraphs called superconcentrators. We implemented two types of superconcentrators to examine their differences in performance. Linear superconcentrators are about 1.8 times slower than butterfly superconcentrators, but provide a better lower bound on space consumption. Finally, we discuss our simulation of using PoS to reach consensus in a peer-to-peer network. We conclude that Proof of Space is indeed viable for distributed consensus. To the best of our knowledge, we are the first to implement linear superconcentrators and to simulate the use of PoS to reach consensus on a decentralized network.

120) Albert Yue, An Index-Type Invariant of Knot Diagrams and Bounds for Unknotting Framed Knots (3 Mar 2017)

We introduce a new knot diagram invariant called self-crossing index, or $\mathrm{SCI}$. We found that $\mathrm{SCI}$ changes by at most $\pm 1$ under framed Reidemeister moves, and specifically provides a lower bound for the number of 3 moves. We also found that $\mathrm{SCI}$ is additive under connected sums, and is a Vassiliev invariant of order 1. We also conduct similar calculations with Hass and Nowik's diagram invariant and cowrithe, and present a relationship between forward/backward, ascending/descending, and positive/negative 3 moves.

119) Valerie Zhang, Computer-Based Visualizations and Manipulations of Matching Paths (2 Mar 2017)

Given n points in the 2-D plane, a matching path is a path that starts at one of these n points and ends at a different one without going through any of the other n - 2 points. Matching paths, as well as an important operation called the Hurwitz move, come up naturally in the study of complex algebraic varieties. At the heart of the Hurwitz move is the twist operation, which “twists” one matching path along another to produce a new (third) matching path. Performing the twist operation by hand, however, is not only tedious but also prone to errors and unnecessary complications. Therefore, using computer-based methods to represent matching paths and perform the twist operation makes sense. In this project, which was coded in Java, computer-based methods are developed to perform the twist operation efficiently and accurately, providing a framework for visualizing and manipulating matching paths with computers. The computer program performs fast computations and represents matching paths as simply as possible in a simple visual interface. This program could be utilized when solving open problems in symplectic geometry: potential applications include characterizing the overtwistedness of contact manifolds, as well as better understanding braid group actions.

118) Harshal Sheth, Nihar Sheth, and Aashish Welling, Read-Copy Update in a Garbage Collected Environment (1 Mar 2017)

Read-copy update (RCU) is a synchronization mechanism that allows efficient parallelism when there are a high number of readers compared to writers. The primary use of RCU is in Linux, a highly popular operating system kernel. The Linux kernel is written in C, a language that is not garbage collected, and yet the functionality that RCU provides is effectively that of a “poor man’s garbage collector” (P. E. McKenney). RCU in C is also complicated to use, and this can lead to bugs. The purpose of this paper is to investigate whether RCU implemented in a garbage collected language (Go) is easier to use while delivering comparable performance to RCU in C. This is tested through the implementation and benchmarking of 4 linked lists, 2 using RCU and 2 using mutexes. One RCU linked list and one mutex linked list are implemented in each language. This paper finds that RCU in a garbage collected language is indeed significantly easier to use, has similar overall performance to, and on very high read loads, outperforms, RCU in C.

117) Xiangyao Yu (MIT), Siye Zhu (PRIMES), Justin Kaashoek (PRIMES), Andrew Pavlo (Carnegie Mellon University), and Srinivas Devadas (MIT), Taurus: A Parallel Transaction Recovery Method Based on Fine-Granularity Dependency Tracking (28 Feb 2017)

Logging is crucial to performance in modern multicore main-memory database management systems (DBMSs). Traditional data logging (ARIES) and command logging algorithms enforce a sequential order among log records using a global log sequence number (LSN). Log flushing and recovery after a crash are both performed in the LSN order. This serialization of transaction logging and recovery can limit the system performance at high core count. In this paper, we propose Taurus to break the LSN abstraction and enable parallel logging and recovery by tracking fine-grained dependencies among transactions. The dependency tracking lends Taurus three salient features. (1) Taurus decouples the transaction logging order from commit order and allows transactions to be flushed to persistent storage in parallel independently. Transactions that are persistent before commit can be discovered and ignored by the recovery algorithm using the logged dependency information. (2) Taurus can leverage multiple persistent devices for logging. (3) Taurus can leverage multiple devices and multiple worker threads for parallel recovery. Taurus improves logging and recovery parallelism for both data and command logging.

116) Louis Golowich (PRIMES), Chiheon Kim (MIT), and Richard Zhou (PRIMES), Maximum Size of a Family of Pairwise Graph-Different Permutations (arXiv.org, 27 Feb 2017), published in The Electronic Journal of Combinatorics 24:4 (2017)

Two permutations of the vertices of a graph $G$ are called $G$-different if there exists an index $i$ such that the $i$-th entries of the two permutations form an edge in $G$. We bound or determine the maximum size of a family of pairwise $G$-different permutations for various graphs $G$. We show that for all balanced bipartite graphs $G$ of order $n$ with minimum degree $n/2 - o(n)$, the maximum number of pairwise $G$-different permutations of the vertices of $G$ is $2^{(1-o(1))n}$. We also present examples of bipartite graphs $G$ with maximum degree $O(\log n)$ that have this property. We explore the problem of bounding the maximum size of a family of pairwise graph-different permutations when an unlimited number of disjoint vertices is added to a given graph. We determine this exact value for the graph of 2 disjoint edges, and present some asymptotic bounds relating to this value for graphs consisting of the union of $n/2$ disjoint edges.
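
The definition of $G$-different permutations can be made concrete with a short Python sketch. The representation chosen here (vertices as integers, permutations as tuples, edges as pairs) and the function names are assumptions for illustration only; the sketch checks the definition and does not attempt to compute maximum families.

```python
from itertools import combinations

def g_different(p, q, edges):
    """True if some position i has {p[i], q[i]} equal to an edge of G."""
    edge_set = {frozenset(e) for e in edges}
    return any(frozenset((a, b)) in edge_set for a, b in zip(p, q))

def is_pairwise_g_different(family, edges):
    """True if every two permutations in the family are G-different."""
    return all(g_different(p, q, edges) for p, q in combinations(family, 2))

# G has vertices 0, 1, 2 and the single edge {0, 1}.
edges = [(0, 1)]
print(g_different((0, 1, 2), (1, 0, 2), edges))  # True: position 0 gives {0, 1}
print(g_different((0, 1, 2), (2, 1, 0), edges))  # False: no position gives an edge
```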

115) Sathwik Karnik, On the Classification and Algorithmic Analysis of Carmichael Numbers (arXiv.org, 26 Feb 2017)

In this paper, we study the properties of Carmichael numbers, false positives to several primality tests. We provide a classification for Carmichael numbers with a proportion of Fermat witnesses of less than 50%, based on if the smallest prime factor is greater than a determined lower bound. In addition, we conduct a Monte Carlo simulation as part of a probabilistic algorithm to detect if a given composite number is Carmichael. We modify this highly accurate algorithm with a deterministic primality test to create a novel, more efficient algorithm that differentiates between Carmichael numbers and prime numbers.
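
For context, Carmichael numbers can be recognized deterministically from their factorization via Korselt's criterion: a composite $n$ is Carmichael exactly when it is squarefree and $p - 1$ divides $n - 1$ for every prime $p$ dividing $n$. The Python sketch below applies that criterion using trial division. It is not the Monte Carlo algorithm described in the abstract, and the function names are chosen only for this example.

```python
def prime_factors(n):
    """Return {prime: exponent} for n >= 2 using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_carmichael(n):
    """Korselt's criterion: composite, squarefree, and (p - 1) | (n - 1) for all p | n."""
    if n < 2:
        return False
    factors = prime_factors(n)
    if len(factors) < 2:                       # primes and prime powers are excluded
        return False
    if any(e > 1 for e in factors.values()):   # must be squarefree
        return False
    return all((n - 1) % (p - 1) == 0 for p in factors)

print([n for n in range(2, 2000) if is_carmichael(n)])  # [561, 1105, 1729]
```

Trial division limits this sketch to small inputs, which is one reason probabilistic approaches such as the one studied in the paper are of interest.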

114) Felix Wang, Functional equations in Complex Analysis and Number Theory (26 Feb 2017)

We study the following questions:
(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ with $f,g,\hat{f},\hat{g}\in\mathbb{C}(X)$ being complex rational functions?
(2) For which rational functions $f(X)$ and $g(X)$ with rational coefficients does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in\mathbb{Q}$?
We utilize various algebraic, geometric and analytic results in order to resolve both (1) and a variant of (2) in case the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$. Our results have applications in various mathematical fields, such as complex analysis, number theory, and dynamical systems. Our work resolves a 1973 question of Fried, and makes significant progress on a 1924 question of Ritt and a 1997 question of Lyubich and Minsky. In addition, we prove a quantitative refinement of a 2015 conjecture of Cahn, Jones and Spear.

113) Laura Pierson, Signatures of Stable Multiplicity Spaces in Restrictions of Representations of Symmetric Groups (25 Feb 2017)

Representation theory is a way of studying complex mathematical structures such as groups and algebras by mapping them to linear actions on vector spaces. Recently, Deligne proposed a new way to study the representation theory of finite groups by generalizing the collection of representations of a sequence of groups indexed by positive integer rank to an arbitrary complex rank, creating an abelian tensor category. In this project, we focused on the case of the symmetric groups $S_n,$ the groups of permutations of $n$ objects. Elements of the Deligne category Rep $S_t$ can be constructed by taking a stable sequence of $S_n$ representations for increasing $n$ and interpolating the associated formulas to an arbitrary complex number $t.$ In this project, we studied the case of restriction multiplicity spaces $V_{\lambda,\rho}$, counting the number of copies of an irreducible representation $V_{\rho}$ of $S_{n-k}$ in the restriction $\text{Res}_{S_{n-k}}^{S_n} V_{\lambda}$ of an irreducible representation of $S_n.$ We found formulas for norms of orthogonal basis vectors in these spaces, and ultimately for signatures (the number of basis vectors with positive norm minus the number with negative norm), an invariant that multiplies over tensor products and has important combinatorial connections.

112) Albert Gerovitch, Automatically Improving 3D Neuron Segmentations for Expansion Microscopy Connectomics (25 Feb 2017)

Understanding the geometry of neurons and their connections is key to comprehending brain function. This is the goal of a new optical approach to brain mapping using expansion microscopy (ExM), developed in the Boyden Lab at MIT to replace the traditional approach of electron microscopy. A challenge here is to perform image segmentation to delineate the boundaries of individual neurons. Currently, however, there is no method implemented for assessing a segmentation algorithm’s accuracy in ExM. The aim of this project is to create automated assessment of neuronal segmentation algorithms, enabling their iterative improvement. By automating the process, I aim to devise powerful segmentation algorithms that reveal the “connectome” of a neural circuit. I created software, called SEV-3D, which uses the pixel error and warping error metrics to assess 3D segmentations of single neurons. To allow better assessment beyond a simple numerical score, I visualized the results as a multilayered image. My program runs in a closed loop with a segmentation algorithm, modifying its parameters until the algorithm yields an optimal segmentation. I am further developing my application to enable evaluation of multi-cell segmentations. In the future, I aim to further implement the principles of machine learning to automatically improve the algorithms, yielding even better accuracy.

111) Kevin Chang, Upper Bounds for Ordered Ramsey Numbers of Small 1-Orderings (arXiv.org, 7 Feb 2017)

A $k$-ordering of a graph $G$ assigns distinct order-labels from the set $\{1,\ldots,|G|\}$ to $k$ vertices in $G$. Given a $k$-ordering $H$, the ordered Ramsey number $R_{<} (H)$ is the minimum $n$ such that every edge-2-coloring of the complete graph on the vertex set $\{1, \ldots, n\}$ contains a copy of $H$, the $i$th smallest vertex of which either has order-label $i$ in $H$ or no order-label in $H$.
This paper conducts the first systematic study of ordered Ramsey numbers for $1$-orderings of small graphs. We provide upper bounds for $R_{<} (H)$ for each connected $1$-ordering $H$ on $4$ vertices. Additionally, for every $1$-ordering $H$ of the $n$-vertex path $P_n$, we prove that $R_{<} (H) \in O(n)$. Finally, we provide an upper bound for the generalized ordered Ramsey number $R_{<} (K_n, H)$ which can be applied to any $k$-ordering $H$ containing some vertex with order-label $1$.

110) Nikhil Marda, On Equal Point Separation by Planar Cell Decompositions (arXiv.org, 17 Jan 2017)

In this paper, we investigate the problem of separating a set $X$ of points in $\mathbb{R}^{2}$ with an arrangement of $K$ lines such that each cell contains an asymptotically equal number of points (up to a constant ratio). We consider a property of curves called the stabbing number, defined to be the maximum countable number of intersections possible between the curve and a line in the plane. We show that large subsets of $X$ lying on Jordan curves of low stabbing number are an obstacle to equal separation. We further discuss Jordan curves of minimal stabbing number containing $X$. Our results generalize recent bounds on the Erdős-Szekeres Conjecture, showing that for fixed $d$ and sufficiently large $n$, if $|X| \ge 2^{c_dn/d + o(n)}$ with $c_d = 1 + O(\frac{1}{\sqrt{d}})$, then there exists a subset of $n$ points lying on a Jordan curve with stabbing number at most $d$.

109) Samuel Cohen and Peter Rowley, Results of Triangles Under Discrete Curve Shortening Flow (7 Jan 2017)

In this paper, we analyze the results of triangles under discrete curve shortening flow, specifically isosceles triangles with top angles greater than $\frac{\pi}{3}$, and scalene triangles. By considering the location of the three vertices of the triangle after some small time $\epsilon$, we use the definition of the derivative to calculate a system of differential equations involving parameters that can describe the triangle. Constructing phase plane diagrams and then analyzing them, we find that the singular behavior of discrete curve shortening flow on isosceles triangles with top angles greater than $\frac{\pi}{3}$ is a point, and for scalene triangles is a line segment.

108) Matthew Hase-Liu (PRIMES) and Nicholas Triantafillou (MIT), Efficient Point-Counting Algorithms for Superelliptic Curves (7 Jan 2017; arXiv.org, 7 Sep 2017)

In this paper, we present efficient algorithms for computing the number of points and the order of the Jacobian group of a superelliptic curve over finite fields of prime order p. Our method employs the Hasse-Weil bounds in conjunction with the Hasse-Witt matrix for superelliptic curves, whose entries we express in terms of multinomial coefficients. We present a fast algorithm for counting points on specific trinomial superelliptic curves and a slower, more general method for all superelliptic curves. For the first case, we reduce the problem of simplifying the entries of the Hasse-Witt matrix modulo p to a problem of solving quadratic Diophantine equations. For the second case, we extend Bostan et al.'s method for hyperelliptic curves to general superelliptic curves. We believe the methods we describe are asymptotically the most efficient known point-counting algorithms for certain families of trinomial superelliptic curves.

107) P.A. CrowdMath, Bounds on parameters of minimally non-linear patterns (arXiv.org, 31 Dec 2016), published in the Electronic Journal of Combinatorics 25:1 (2018)

Let $ex(n, P)$ be the maximum possible number of ones in any 0-1 matrix of dimensions $n \times n$ that avoids $P$. Matrix $P$ is called minimally non-linear if $ex(n, P) = \omega(n)$ but $ex(n, P') = O(n)$ for every strict subpattern $P'$ of $P$. We prove that the ratio between the length and width of any minimally non-linear 0-1 matrix is at most $4$, and that a minimally non-linear 0-1 matrix with $k$ rows has at most $5k-3$ ones. We also obtain an upper bound on the number of minimally non-linear 0-1 matrices with $k$ rows.
In addition, we prove corresponding bounds for minimally non-linear ordered graphs. The minimal non-linearity that we investigate for ordered graphs is for the extremal function $ex_{<}(n, G)$, which is the maximum possible number of edges in any ordered graph on $n$ vertices with no ordered subgraph isomorphic to $G$.

106) Seth Shelley-Abrahamson (MIT) and Alec Sun (PRIMES), Towards a Classification of Finite-Dimensional Representations of Rational Cherednik Algebras of Type D (arXiv.org, 15 Dec 2016)

Using a combinatorial description due to Jacon and Lecouvey of the wall crossing bijections for cyclotomic rational Cherednik algebras, we show that the irreducible representations $L_c(\lambda^\pm)$ of the rational Cherednik algebra $H_c(D_n, \mathbb{C}^n)$ of type $D$ for symmetric bipartitions $\lambda$ are infinite dimensional for all parameters $c$. In particular, all finite-dimensional irreducible representations of rational Cherednik algebras of type $D$ arise as restrictions of finite-dimensional irreducible representations of rational Cherednik algebras of type $B$.

105) Nicholas Guo (PRIMES) and Guangyi Yue (MIT), Counting Independent Sets in Graphs of Hyperplane Arrangements (arXiv.org, 13 Dec 2016), published in Discrete Mathematics, vol. 343:3 (March 2020)

In this paper, we count the number of independent sets of a type of graph $G(\mathcal{A},q)$ associated to some hyperplane arrangement $\mathcal{A}$, which is a generalization of the construction of graphical arrangements. We show that when the parameters of $\mathcal{A}$ satisfy certain conditions, the number of independent sets of the disjoint union $G(\mathcal{A},q_1)\cup\cdots\cup G(\mathcal{A},q_s)$ depends only on the coefficients of $\mathcal{A}$ and the total number of vertices $\sum_i q_i$ when $q_i$'s are powers of large enough prime numbers. In addition it is independent of the coefficients as long as $\mathcal{A}$ is central and the coefficients are multiplicatively independent.

112) Albert Gerovitch, Automatically Improving 3D Neuron Segmentations for Expansion Microscopy Connectomics (25 Feb 2017)

Understanding the geometry of neurons and their connections is key to comprehending brain function. This is the goal of a new optical approach to brain mapping using expansion microscopy (ExM), developed in the Boyden Lab at MIT to replace the traditional approach of electron microscopy. A challenge here is to perform image segmentation to delineate the boundaries of individual neurons. Currently, however, there is no method implemented for assessing a segmentation algorithm’s accuracy in ExM. The aim of this project is to create automated assessment of neuronal segmentation algorithms, enabling their iterative improvement. By automating the process, I aim to devise powerful segmentation algorithms that reveal the “connectome” of a neural circuit. I created software, called SEV-3D, which uses the pixel error and warping error metrics to assess 3D segmentations of single neurons. To allow better assessment beyond a simple numerical score, I visualized the results as a multilayered image. My program runs in a closed loop with a segmentation algorithm, modifying its parameters until the algorithm yields an optimal segmentation. I am further developing my application to enable evaluation of multi-cell segmentations. In the future, I aim to further implement the principles of machine learning to automatically improve the algorithms, yielding even better accuracy.

111) Kevin Chang, Upper Bounds for Ordered Ramsey Numbers of Small 1-Orderings (arXiv.org, 7 Feb 2017)

A $k$-ordering of a graph $G$ assigns distinct order-labels from the set $\{1,\ldots,|G|\}$ to $k$ vertices in $G$. Given a $k$-ordering $H$, the ordered Ramsey number $R_{<} (H)$ is the minimum $n$ such that every edge-2-coloring of the complete graph on the vertex set $\{1, \ldots, n\}$ contains a copy of $H$, the $i$th smallest vertex of which either has order-label $i$ in $H$ or no order-label in $H$.
This paper conducts the first systematic study of ordered Ramsey numbers for $1$-orderings of small graphs. We provide upper bounds for $R_{<} (H)$ for each connected $1$-ordering $H$ on $4$ vertices. Additionally, for every $1$-ordering $H$ of the $n$-vertex path $P_n$, we prove that $R_{<} (H) \in O(n)$. Finally, we provide an upper bound for the generalized ordered Ramsey number $R_{<} (K_n, H)$ which can be applied to any $k$-ordering $H$ containing some vertex with order-label $1$.

110) Nikhil Marda, On Equal Point Separation by Planar Cell Decompositions (arXiv.org, 17 Jan 2017)

In this paper, we investigate the problem of separating a set $X$ of points in $\mathbb{R}^{2}$ with an arrangement of $K$ lines such that each cell contains an asymptotically equal number of points (up to a constant ratio). We consider a property of curves called the stabbing number, defined to be the maximum countable number of intersections possible between the curve and a line in the plane. We show that large subsets of $X$ lying on Jordan curves of low stabbing number are an obstacle to equal separation. We further discuss Jordan curves of minimal stabbing number containing $X$. Our results generalize recent bounds on the Erdős-Szekeres Conjecture, showing that for fixed $d$ and sufficiently large $n$, if $|X| \ge 2^{c_dn/d + o(n)}$ with $c_d = 1 + O(\frac{1}{\sqrt{d}})$, then there exists a subset of $n$ points lying on a Jordan curve with stabbing number at most $d$.

109) Samuel Cohen and Peter Rowley, Results of Triangles Under Discrete Curve Shortening Flow (7 Jan 2017)

In this paper, we analyze the results of triangles under discrete curve shortening flow, specifically isosceles triangles with top angles greater than $\frac{\pi}{3}$, and scalene triangles. By considering the location of the three vertices of the triangle after some small time $\epsilon$, we use the definition of the derivative to calculate a system of differential equations involving parameters that can describe the triangle. Constructing phase plane diagrams and then analyzing them, we find that the singular behavior of discrete curve shortening flow on isosceles triangles with top angles greater than $\frac{\pi}{3}$ is a point, and for scalene triangles is a line segment.

108) Matthew Hase-Liu (PRIMES) and Nicholas Triantafillou (MIT), Efficient Point-Counting Algorithms for Superelliptic Curves (7 Jan 2017; arXiv.org, 7 Sep 2017)

In this paper, we present efficient algorithms for computing the number of points and the order of the Jacobian group of a superelliptic curve over finite fields of prime order p. Our method employs the Hasse-Weil bounds in conjunction with the Hasse-Witt matrix for superelliptic curves, whose entries we express in terms of multinomial coefficients. We present a fast algorithm for counting points on specific trinomial superelliptic curves and a slower, more general method for all superelliptic curves. For the first case, we reduce the problem of simplifying the entries of the Hasse-Witt matrix modulo p to a problem of solving quadratic Diophantine equations. For the second case, we extend Bostan et al.'s method for hyperelliptic curves to general superelliptic curves. We believe the methods we describe are asymptotically the most efficient known point-counting algorithms for certain families of trinomial superelliptic curves.
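For orientation, the quantity being computed can be sketched with a naive baseline. This is not the Hasse-Witt method of the paper: it is a brute-force count of affine solutions of $y^a = f(x)$ over $\mathbb{F}_p$, and the function name `count_affine_points` and its argument layout are assumptions made for illustration only.

```python
def count_affine_points(a, coeffs, p):
    """Count pairs (x, y) in F_p x F_p with y**a == f(x) (mod p).

    coeffs lists the coefficients of f from the constant term upward,
    so coeffs = [0, 1, 0, 1] means f(x) = x + x**3.
    This is an O(p) brute-force baseline, usable only for small p.
    """
    # power_counts[r] = number of y in F_p whose a-th power is r.
    power_counts = [0] * p
    for y in range(p):
        power_counts[pow(y, a, p)] += 1

    total = 0
    for x in range(p):
        fx = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod p
            fx = (fx * x + c) % p
        total += power_counts[fx]
    return total
```

For example, `count_affine_points(2, [0, 1, 0, 1], 5)` counts the affine points of $y^2 = x^3 + x$ over $\mathbb{F}_5$; points at infinity are not included in this count.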

107) P.A. CrowdMath, Bounds on parameters of minimally non-linear patterns (arXiv.org, 31 Dec 2016), published in the Electronic Journal of Combinatorics 25:1 (2018)

Let $ex(n, P)$ be the maximum possible number of ones in any 0-1 matrix of dimensions $n \times n$ that avoids $P$. Matrix $P$ is called minimally non-linear if $ex(n, P) = \omega(n)$ but $ex(n, P') = O(n)$ for every strict subpattern $P'$ of $P$. We prove that the ratio between the length and width of any minimally non-linear 0-1 matrix is at most $4$, and that a minimally non-linear 0-1 matrix with $k$ rows has at most $5k-3$ ones. We also obtain an upper bound on the number of minimally non-linear 0-1 matrices with $k$ rows.
In addition, we prove corresponding bounds for minimally non-linear ordered graphs. The minimal non-linearity that we investigate for ordered graphs is for the extremal function $ex_{<}(n, G)$, which is the maximum possible number of edges in any ordered graph on $n$ vertices with no ordered subgraph isomorphic to $G$.
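The definition of $ex(n, P)$ can be made concrete with a small brute-force sketch. It assumes the usual notion of containment (some choice of rows and columns of the larger matrix, kept in order, has a one wherever $P$ has a one); the function names `contains` and `ex` are illustrative, and the search is exponential, so it is only meant for very small $n$.

```python
from itertools import combinations

def contains(A, P):
    """Return True if the 0-1 matrix A contains the pattern P.

    A contains P when some submatrix of A, obtained by selecting rows and
    columns in order, has a 1 in every position where P has a 1.
    """
    k, l = len(P), len(P[0])
    for rows in combinations(range(len(A)), k):
        for cols in combinations(range(len(A[0])), l):
            if all(A[rows[i]][cols[j]] >= P[i][j]
                   for i in range(k) for j in range(l)):
                return True
    return False

def ex(n, P):
    """Brute-force ex(n, P): the most ones in an n x n 0-1 matrix avoiding P."""
    best = 0
    for mask in range(1 << (n * n)):
        ones = bin(mask).count("1")
        if ones <= best:
            continue  # cannot improve on the current best
        A = [[(mask >> (i * n + j)) & 1 for j in range(n)] for i in range(n)]
        if not contains(A, P):
            best = ones
    return best
```

On the $2 \times 2$ identity pattern this reproduces the known linear value $2n - 1$ for small $n$, which is the kind of $O(n)$ behaviour that a minimally non-linear pattern must fail to have.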

106) Seth Shelley-Abrahamson (MIT) and Alec Sun (PRIMES), Towards a Classification of Finite-Dimensional Representations of Rational Cherednik Algebras of Type D (arXiv.org, 15 Dec 2016)

Using a combinatorial description due to Jacon and Lecouvey of the wall crossing bijections for cyclotomic rational Cherednik algebras, we show that the irreducible representations $L_c(\lambda^\pm)$ of the rational Cherednik algebra $H_c(D_n, \mathbb{C}^n)$ of type $D$ for symmetric bipartitions $\lambda$ are infinite dimensional for all parameters $c$. In particular, all finite-dimensional irreducible representations of rational Cherednik algebras of type $D$ arise as restrictions of finite-dimensional irreducible representations of rational Cherednik algebras of type $B$.

105) Nicholas Guo (PRIMES) and Guangyi Yue (MIT), Counting Independent Sets in Graphs of Hyperplane Arrangements (arXiv.org, 13 Dec 2016), published in Discrete Mathematics, vol. 343:3 (March 2020)

In this paper, we count the number of independent sets of a type of graph $G(\mathcal{A},q)$ associated to some hyperplane arrangement $\mathcal{A}$, which is a generalization of the construction of graphical arrangements. We show that when the parameters of $\mathcal{A}$ satisfy certain conditions, the number of independent sets of the disjoint union $G(\mathcal{A},q_1)\cup\cdots\cup G(\mathcal{A},q_s)$ depends only on the coefficients of $\mathcal{A}$ and the total number of vertices $\sum_i q_i$ when $q_i$'s are powers of large enough prime numbers. In addition it is independent of the coefficients as long as $\mathcal{A}$ is central and the coefficients are multiplicatively independent.

104) Yatharth Agarwal (PRIMES), Vishnu Murale (PRIMES), Jason Hennessey (Boston University), Kyle Hogan (Boston University), and Mayank Varia (Boston University), Moving in Next Door: Network Flooding as a Side Channel in Cloud Environments (14-16 Nov 2016), published in Sara Foresti and Giuseppe Persiano, eds., Cryptology and Network Security: 15th International Conference Proceedings, CANS 2016, Milan, Italy, November 14–16, 2016, pp. 755-760.

Co-locating multiple tenants’ virtual machines (VMs) on the same host underpins public clouds’ affordability, but sharing physical hardware also exposes consumer VMs to side channel attacks from adversarial co-residents. We demonstrate passive bandwidth measurement to perform traffic analysis attacks on co-located VMs. Our attacks do not assume a privileged position in the network or require any communication between adversarial and victim VMs. Using a single feature in the observed bandwidth data, our algorithm can identify which of 3 potential YouTube videos a co-resident VM streamed with 66% accuracy. We discuss defense from both a cloud provider’s and a consumer’s perspective, showing that effective defense is difficult to achieve without costly under-utilization on the part of the cloud provider or over-utilization on the part of the consumer.

103) Dhruv Rohatgi, A Connection Between Vector Bundles over Smooth Projective Curves and Representations of Quivers (31 Oct 2016)

We create a partition bijection that yields a partial result on a recent conjecture by Schiffmann relating the problems of counting over a finite field (1) vector bundles over smooth projective curves, and (2) representations of quivers.

102) Aaron Yeiser (PRIMES) and Alex Townsend (Cornell University), A spectral element method for meshes with skinny elements (30 Oct 2016; arXiv.org, 27 Mar 2018)

When numerically solving partial differential equations (PDEs), the first step is often to discretize the geometry using a mesh and to solve a corresponding discretization of the PDE. Standard finite and spectral element methods require that the underlying mesh has no skinny elements for numerical stability. Here, we develop a novel spectral element method that is numerically stable on meshes that contain skinny elements, while also allowing for high degree polynomials on each element. Our method is particularly useful for PDEs for which anisotropic mesh elements are beneficial and we demonstrate it with a Navier--Stokes simulation. Code for our method can be found at this URL.

101) Tanya Khovanova (MIT) and Rafael Saavedra (PRIMES), Discreet Coin Weighings and the Sorting Strategy (arXiv.org, 23 Sep 2016)

In 2007, Alexander Shapovalov posed an old twist on the classical coin weighing problem by asking for strategies that manage to conceal the identities of specific coins while providing general information on the number of fake coins. In 2015, Diaco and Khovanova studied various cases of these "discreet strategies" and introduced the revealing factor, a measure of the information that is revealed.
In this paper we discuss a natural coin weighing strategy which we call the sorting strategy: divide the coins into equal piles and sort them by weight. We study the instances when the strategy is discreet, and given an outcome of the sorting strategy, the possible number of fake coins. We prove that in many cases, the number of fake coins can be any value in an arithmetic progression whose length depends linearly on the number of coins in each pile. We also show the strategy can be discreet when the number of fake coins is any value within an arithmetic subsequence whose length also depends linearly on the number of coins in each pile. We arrive at these results by connecting our work to the classic Frobenius coin problem. In addition, we calculate the revealing factor for the sorting strategy.
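The classic Frobenius coin problem mentioned in this abstract asks for the largest integer that cannot be written as $ax + by$ with non-negative integers $x, y$, for coprime $a, b > 1$. The sketch below illustrates only that classical problem, not the paper's connection to the sorting strategy; the function names are hypothetical and the search is a plain brute force.

```python
from math import gcd

def representable(n, a, b):
    """True if n = a*x + b*y for some non-negative integers x and y."""
    return any((n - a * x) % b == 0 for x in range(n // a + 1))

def frobenius_number(a, b):
    """Largest integer not representable as a*x + b*y (x, y >= 0).

    Requires coprime a, b > 1. Every integer n >= (a-1)*(b-1) is
    representable, so it is enough to search below a*b.
    """
    if a < 2 or b < 2 or gcd(a, b) != 1:
        raise ValueError("a and b must be coprime integers greater than 1")
    return max(n for n in range(a * b) if not representable(n, a, b))
```

The brute-force answer agrees with the closed form $ab - a - b$ for two denominations, for instance $7$ for the pair $(3, 5)$.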

100) Kai-Siang Ang (PRIMES) and Laura P. Schaposnik (University of Illinois at Chicago), On the geometry of regular icosahedral capsids containing disymmetrons (arXiv.org, 29 Aug 2016), published in Journal of Structural Biology (19 Jan 2017)

Icosahedral virus capsids are composed of symmetrons, organized arrangements of capsomers. There are three types of symmetrons: disymmetrons, trisymmetrons, and pentasymmetrons, which have different shapes and are centered on the icosahedral 2-fold, 3-fold and 5-fold axes of symmetry, respectively. In 2010 [Sinkovits & Baker] gave a classification of all possible ways of building an icosahedral structure solely from trisymmetrons and pentasymmetrons, which requires the triangulation number T to be odd. In the present paper we incorporate disymmetrons to obtain a geometric classification of icosahedral viruses formed by regular penta-, tri-, and disymmetrons. For every class of solutions, we further provide formulas for symmetron sizes and parity restrictions on h, k, and T numbers. We also present several methods in which invariants may be used to classify a given configuration.

99) Tanya Khovanova (MIT) and Shuheng Niu (PRIMES), m-Modular Wythoff (arXiv.org, 2 Aug 2016)

We discuss a variant of Wythoff's Game, $m$-Modular Wythoff's Game, and identify the winning and losing positions for this game.
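As background, the losing positions of the classical Wythoff's Game (two piles; a move removes any positive number of tokens from one pile, or the same number from both; the player who cannot move loses) can be sketched as follows. The rules of the $m$-modular variant are not given in the abstract, so this illustrates only the classical game; the function names are illustrative.

```python
from functools import lru_cache

def wythoff_p_positions(limit):
    """Losing positions (a, b) with a <= b and a < limit, in increasing order.

    Uses the standard construction: the k-th position has a equal to the
    smallest non-negative integer not used so far, and b = a + k.
    """
    used, positions = set(), []
    a, k = 0, 0
    while True:
        while a in used:
            a += 1
        if a >= limit:
            return positions
        positions.append((a, a + k))
        used.update((a, a + k))
        k += 1

@lru_cache(maxsize=None)
def is_losing(a, b):
    """Brute-force check: True if the player to move from (a, b) loses."""
    moves = [(a - t, b) for t in range(1, a + 1)]
    moves += [(a, b - t) for t in range(1, b + 1)]
    moves += [(a - t, b - t) for t in range(1, min(a, b) + 1)]
    return not any(is_losing(x, y) for x, y in moves)
```

The second function is an independent game-tree search, included only so that the construction in the first can be checked against it on small piles.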

2015 Research Papers

98) Caleb Ji, Robin Park, and Angela Song, Combinatorial Games of No Strategy (20 Aug 2016)

In this paper, we study a particular class of combinatorial game motivated by previous research conducted by Professor James Propp, called Games of No Strategy, or games whose winners are predetermined. Finding the number of ways to play such games often leads to new combinatorial sequences and involves methods from analysis, number theory, and other fields. For the game Planted Brussel Sprouts, a variation on the well-known game Sprouts, we find a new proof that the number of ways to play is equal to the number of spanning trees on n vertices, and for Mozes’ Game of Numbers, a game studied for its interesting connections with other fields, we use prior work by Alon to calculate the number of ways to play the game for a certain case. Finally, in the game Binary Fusion, we show through both algebraic and combinatorial proofs that the number of ways to play generates Catalan’s triangle.

97) Meena Jagadeesan, The Exchange Graphs of Weakly Separated Collections (arXiv.org, 19 Aug 2016)

Weakly separated collections arise in the cluster algebra derived from the Plücker coordinates on the nonnegative Grassmannian. Oh, Postnikov, and Speyer studied weakly separated collections over a general Grassmann necklace $\mathcal{I}$ and proved the connectivity of every exchange graph. Oh and Speyer later introduced a generalization of exchange graphs that we call $\mathcal{C}$-constant graphs. They characterized these graphs in the smallest two cases. We prove an isomorphism between exchange graphs and a certain class of $\mathcal{C}$-constant graphs. We use this to extend Oh and Speyer's characterization of these graphs to the smallest four cases, and we present a conjecture on a bound on the maximal order of these graphs. In addition, we fully characterize certain classes of these graphs in the special cases of cycles and trees.

96) Nicholas Diaco, Counting Counterfeit Coins: A New Coin Weighing Problem (arXiv.org, 13 Jun 2016)

In 2007, a new variety of the well-known problem of identifying a counterfeit coin using a balance scale was introduced in the sixth International Kolmogorov Math Tournament. This paper offers a comprehensive overview of this new problem by presenting it in the context of the traditional coin weighing puzzle and then explaining what makes the new problem mathematically unique. Two weighing strategies described previously are used to derive lower bounds for the optimal number of admissible situations for given parameters. Additionally, a new weighing procedure is described that can be adapted to provide a solution for a broad spectrum of initial parameters by representing the number of counterfeit coins as a linear combination of positive integers. In closing, we offer a new form of the traditional counterfeit coin problem and provide a lower bound for the number of weighings necessary to solve it.

95) Jesse Geneson (MIT) and Meghal Gupta (PRIMES), Bounding extremal functions of forbidden 0-1 matrices using (r,s)-formations (19 Mar 2016)

First, we prove tight bounds of $n 2^{\frac{1}{(t-2)!}\alpha(n)^{t-2} \pm O(\alpha(n)^{t-3})}$ on the extremal function of the forbidden pair of ordered sequences $(1 2 3 \ldots k)^t$ and $(k \ldots 3 2 1)^t$ using bounds on a class of sequences called $(r,s)$-formations. Then, we show how an analogous method can be used to derive similar bounds on the extremal functions of forbidden pairs of $0-1$ matrices consisting of horizontal concatenations of identical identity matrices and their horizontal reflections.

94) Varun Jain, Novel Relationships Between Circular Planar Graphs and Electrical Networks (20 Feb 2016)

Circular planar graphs are used to model electrical networks, which arise in classical physics. Associated with such a network is a network response matrix, which carries information about how the network behaves in response to certain potential differences. Circular planar graphs can be organized into equivalence classes based upon these response matrices. In each equivalence class, certain fundamental elements are called critical. Additionally, it is known that equivalent graphs are related by certain local transformations. Using wiring diagrams, we first investigate the number of Y-∆ transformations required to transform one critical graph in an equivalence class into another, proving a quartic bound in the order of the graph. Next, we consider positivity phenomena, studying how testing the signs of certain circular minors can be used to determine if a given network response matrix is associated with a particular equivalence class. In particular, we prove a conjecture by Kenyon and Wilson for some cases.

93) Arthur Azvolinsky, Explicit Computations of the Frozen Boundaries of Rhombus Tilings of Polygonal Domains (12 Feb 2016)

Consider a polygonal domain $\Omega$ drawn on a regular triangular lattice. A rhombus tiling of $\Omega$ is defined as a complete covering of the domain with $60^{\circ}$-rhombi, where each one is obtained by gluing two neighboring triangles together. We consider a uniform measure on the set of all tilings of $\Omega$. As the mesh size of the lattice approaches zero while the polygon remains fixed, a random tiling approaches a deterministic limit shape. An important phenomenon that occurs with the convergence towards a limit shape is the formation of frozen facets; that is, areas where there are asymptotically tiles of only one particular type. The sharp boundary between these ordered facet formations and the disordered region is a curve inscribed in $\Omega$. This inscribed curve is defined as the frozen boundary. The goal of this project was to understand the purely algebraic approach, elaborated on in a paper by Kenyon and Okounkov, to the problem of explicitly computing the frozen boundary. We will present our results for a number of special cases we considered.

92) David Amirault, Better Bounds on the Rate of Non-Witnesses of Lucas Pseudoprimes (3 Feb 2016)

Efficient primality testing is fundamental to modern cryptography for the purpose of key generation. Different primality tests may be compared using their runtimes and rates of non-witnesses. With the Lucas primality test, we analyze the frequency of Lucas pseudoprimes using MATLAB. We prove that a composite integer $n$ can be a strong Lucas pseudoprime to at most $\frac{1}{6}$ of parameters $P$, $Q$ unless $n$ belongs to a short list of exception cases, thus improving the bound from the previous result of $\frac{4}{15}$. We also explore the properties obeyed by such exceptions and how these cases may be handled by an extended version of the Lucas primality test.
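The strong Lucas test referred to here can be sketched in a few lines. The version below assumes the standard definition: with $D = P^2 - 4Q$ and $n - (D/n) = d \cdot 2^s$ for odd $d$, the odd number $n$ passes for parameters $(P, Q)$ if $U_d \equiv 0$ or $V_{d \cdot 2^r} \equiv 0 \pmod n$ for some $0 \le r < s$. The function names, the condition $\gcd(n, 2QD) = 1$, and the slow $O(k)$ recurrence are simplifying assumptions for illustration (the paper's analysis used MATLAB).

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def lucas_uv(k, P, Q, n):
    """(U_k, V_k) mod n from the recurrence X_j = P*X_{j-1} - Q*X_{j-2}."""
    u0, u1 = 0, 1          # U_0, U_1
    v0, v1 = 2, P          # V_0, V_1
    for _ in range(k):
        u0, u1 = u1, (P * u1 - Q * u0) % n
        v0, v1 = v1, (P * v1 - Q * v0) % n
    return u0 % n, v0 % n

def is_strong_lucas_probable_prime(n, P, Q):
    """True if odd n passes the strong Lucas test for parameters (P, Q)."""
    D = P * P - 4 * Q
    if n < 3 or n % 2 == 0 or gcd(n, 2 * Q * D) != 1:
        raise ValueError("need odd n >= 3 with gcd(n, 2*Q*D) = 1")
    d, s = n - jacobi(D, n), 0
    while d % 2 == 0:
        d //= 2
        s += 1
    u, v = lucas_uv(d, P, Q, n)
    if u == 0 or v == 0:
        return True
    # Check V at d*2, d*4, ..., d*2^(s-1).
    return any(lucas_uv(d << r, P, Q, n)[1] == 0 for r in range(1, s))
```

Every prime passes for every admissible pair $(P, Q)$; a composite $n$ that passes is a strong Lucas pseudoprime for that pair, and the bound in the abstract concerns how many such pairs a composite can have.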

91) Daniel Guo, An Infection Spreading Model on Binary Trees (26 Jan 2016)

An important and ongoing topic of research is the study of infectious diseases and the speed at which these diseases spread. Modeling the spread and growth of such diseases leads to a more precise understanding of the phenomenon and accurate predictions of spread in real life. We consider a long-range infection model on an infinite regular binary tree. Given a spreading coefficient $\alpha>1$, the time it takes for the infection to travel from one node to another node below it is exponentially distributed with specific rate functions such as $2^{-k}k^{-\alpha}$ or $\frac{1}{\alpha^k}$, where $k$ is the difference in layer number between the two nodes. We simulate and analyze the time needed for the infection to reach layer $m$ or below starting from the root node. The resulting time is recorded and graphed for different values of $\alpha$ and $m$. Finally, we prove rigorous lower and upper bounds for the infection time, both of which are approximately logarithmic with respect to $m$. The same techniques and results are valid for other regular $d$-ary trees, in which each node has exactly $d$ children where $d>2$.

90) Jacob Klegar, Bounded Tiling-Harmonic Functions on the Integer Lattice (25 Jan 2016)

Tiling-harmonic functions are a class of functions on square tilings that minimize a specific energy. These functions may provide a useful tool in studying square Sierpinski carpets. In this paper we show two new Maximum Modulus Principles for these functions, prove Harnack's Inequality, and give a proof that the set of tiling-harmonic functions is closed. One of these Maximum Modulus Principles is used to show that bounded infinite tiling-harmonic functions must have arbitrarily long constant lines. Additionally, we give three sufficient conditions for tiling-harmonic functions to be constant. Finally, we explore comparisons between tiling and graph-harmonic functions, especially with regard to oscillating boundary values.

89) Richard Yi, A Probability-Based Model of Traffic Flow (22 Jan 2016)

Describing the behavior of traffic via mathematical modeling and computer simulation has been a challenge confronted by mathematicians in various ways throughout the last century. In this project, we introduce various existing traffic flow models and present a new, probability-based model that is a hybrid of the microscopic and macroscopic views, drawing upon current ideas in traffic flow theory. We examine the correlations found in the data of our computer simulation. We hope that our results could help civil engineers implement efficient road systems that fit their needs, as well as contribute toward the design of safely operating unmanned vehicles.

88) Kenz Kallal, Matthew Lipman, and Felix Wang, Equal Compositions of Rational Functions (21 Jan 2016)

We study the following questions:
(1) What are all solutions to $f\circ \hat{f} = g\circ \hat{g}$ in complex rational functions $f,g\in\mathbb{C}(X)$ and meromorphic functions $\hat{f}, \hat{g}$ on the complex plane?
(2) For which rational functions $f(X)$ and $g(X)$ with coefficients in an algebraic number field $K$ does the equation $f(a)=g(b)$ have infinitely many solutions with $a,b\in K$?
We utilize various algebraic, geometric and analytic results in order to resolve both questions in the case that the numerator of $f(X)-g(Y)$ is an irreducible polynomial in $\mathbb{C}[X,Y]$ of sufficiently large degree. Our work answers a 1973 question of Fried in all but finitely many cases, and makes significant progress towards answering a 1924 question of Ritt and a 1997 question of Lyubich and Minsky.

87) Dhruv Medarametla, Bounding Norms of Locally Random Matrices (21 Jan 2016)

Recently, several papers proving lower bounds for the performance of the Sum Of Squares Hierarchy on the planted clique problem have come out. A crucial part of all four papers is probabilistically bounding the norms of certain “locally random” matrices. In these matrices, the entries are not completely independent of each other, but rather depend upon a few edges of the input graph. In this paper, we study the norms of these locally random matrices. We start by bounding the norms of simple locally random matrices, whose entries depend on a bipartite graph H and a random graph G; we then generalize this result by bounding the norms of complex locally random matrices, matrices based off of a much more general graph H and a random graph G. For both cases, we prove almost-tight probabilistic bounds on the asymptotic behavior of the norms of these matrices.

86) Rachel Zhang, Statistics of Intersections of Curves on Surfaces (19 Jan 2016)

Each orientable surface with nonempty boundary can be associated with a planar model, whose edges can then be labeled with letters that read out a surface word. Then, the curve word of a free homotopy class of closed curves on a surface is the minimal sequence of edges of the planar model through which a curve in the class passes. The length of a class of curves is defined to be the number of letters in its curve word. We fix a surface and its corresponding planar model.
Fix a free homotopy class of curves ω on the surface. For another class of curves c, let i(ω; c) be the minimal number of intersections of curves in ω and c. In this paper, we show that the mean of the distribution of i(ω; c), for random curve c of length n, grows proportionally with n and approaches μ(ω) ⋅ n for a constant μ(ω). We also give an algorithm to compute μ(ω) and have written a program that calculates μ(ω) for any curve ω on any surface. In addition, we prove that i(ω; c) approaches a Gaussian distribution as n → ∞ by viewing the generation of a random curve as a Markov Chain.

85) Cristian Gutu and Fengyao Ding, SecretRoom: An Anonymous Chat Client (16 Jan 2016)

While many people would like to be able to communicate anonymously, the few existing anonymous communication systems sacrifice anonymity for performance, or vice versa. The most popular such app is Tor, which relies on a series of relays to protect anonymity. Though proven to be efficient, Tor does not guarantee anonymity in the presence of strong adversaries like ISPs and government agencies who can conduct in-depth traffic analysis. In contrast, our messaging application, SecretRoom, implements an improved version of a secure messaging protocol called Dining Cryptographers Networks (DC-Nets) to guarantee true anonymity in moderately sized groups. However, unlike traditional DC-Nets, SecretRoom does not require direct communication between all participants and does not depend on the presence of honest clients for anonymity. By introducing an untrusted server that performs the DC-Net protocol on behalf of the clients, SecretRoom manages to reduce the $O(n^2)$ communication associated with traditional DC-Nets to $O(n)$ for $n$ clients. Moreover, by introducing artificially intelligent clients, SecretRoom makes the anonymity set size independent of the number of “real” clients. Ultimately SecretRoom reduces the communication to $O(n)$ and allows the DC-Net protocol to scale to hundreds of clients compared to a few tens of clients in traditional DC-Nets.
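A single round of a traditional DC-Net, the all-pairs scheme whose quadratic cost the abstract refers to, can be sketched as follows. This is not SecretRoom's server-assisted protocol; the function name `dc_net_round`, the use of integers as messages, and the 32-bit key size are assumptions made for illustration.

```python
import secrets
from itertools import combinations

def dc_net_round(n, sender, message, bits=32):
    """Simulate one round of a traditional DC-Net with n participants.

    Every pair of participants shares a fresh random key, which is why the
    traditional scheme needs on the order of n**2 pairwise channels. Each
    participant announces the XOR of all keys it shares; the sender also
    XORs in the message. Returns (announcements, recovered_message).
    """
    # One shared secret per unordered pair of participants.
    keys = {pair: secrets.randbits(bits) for pair in combinations(range(n), 2)}

    announcements = []
    for i in range(n):
        value = 0
        for (a, b), key in keys.items():
            if i in (a, b):
                value ^= key
        if i == sender:
            value ^= message
        announcements.append(value)

    # Each key appears in exactly two announcements, so all keys cancel.
    recovered = 0
    for value in announcements:
        recovered ^= value
    return announcements, recovered
```

The XOR of all announcements reveals the message, while an individual announcement is masked by that participant's shared keys and so does not by itself identify the sender.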

84) Girishvar Venkat, Signatures of the Contravariant Form on Representations of the Hecke Algebra and Rational Cherednik Algebra associated to G(r,1,n) (15 Jan 2016)

The Hecke algebra and rational Cherednik algebra of the group G(r,1,n) are non-commutative algebras that are deformations of certain classical algebras associated to the group. These algebras have numerous applications in representation theory, number theory, algebraic geometry and integrable systems in quantum physics. Consequently, understanding their irreducible representations is important. If the deformation parameters are generic, then these irreducible representations, called Specht modules in the case of the Hecke algebra and Verma modules in the case of the Cherednik algebra, are in bijection with the irreducible representations of G(r,1,n). However, while every irreducible representation of G(r,1,n) is unitary, the Hermitian contravariant form on the Specht modules and Verma modules may only be non-degenerate. Thus, the signature of this form provides a great deal of information about the representations of the algebras that cannot be seen by looking at the group representations. In this paper, we compute the signature of arbitrary Specht modules of the Hecke algebra and use them to give explicit formulas of the parameter values for which these modules are unitary. We also compute asymptotic limits of existing formulas for the signature character of the polynomial representations of the Cherednik algebra which are vastly simpler than the full signature characters and show that these limits are rational functions in t. In addition, we show that for half of the parameter values, for each k, the degree k portion of the polynomial representation is unitary for large enough n.

83) Mehtaab Sawhney (PRIMES) and Jonathan Weed (MIT), Further results on arc and bar k-visibility graphs (arXiv.org, 6 Jan 2016)

We consider visibility graphs involving bars and arcs in which lines of sight can pass through up to k objects. We prove a new edge bound for arc k-visibility graphs, provide maximal constructions for arc and semi-arc k-visibility graphs, and give a complete characterization of semi-arc visibility graphs. We show that the family of arc i-visibility graphs is never contained in the family of bar j-visibility graphs for any i and j, and that the family of bar i-visibility graphs is not contained in the family of bar j-visibility graphs for $i \neq j$. We also give the first thickness bounds for arc and semi-arc k-visibility graphs. Finally, we introduce a model for random semi-bar and semi-arc k-visibility graphs and analyze its properties.

82) Harshal Sheth and Aashish Welling, An Implementation and Analysis of a Kernel Network Stack in Go with the CSP Style (30 Dec 2015; arXiv.org, 17 Mar 2016)

Modern operating system kernels are written in lower-level languages such as C. Although the low-level functionalities of C are often useful within kernels, they also give rise to several classes of bugs. Kernels written in higher level languages avoid many of these potential problems, at the possible cost of decreased performance. This research evaluates the advantages and disadvantages of a kernel written in a higher level language. To do this, the network stack subsystem of the kernel was implemented in Go with the Communicating Sequential Processes (CSP) style. Go is a high-level programming language that supports the CSP style, which recommends splitting large tasks into several smaller ones running in independent "threads". Modules for the major networking protocols, including Ethernet, ARP, IPv4, ICMP, UDP, and TCP, were implemented. In this study, the implemented Go network stack, called GoNet, was compared to a representative network stack written in C. The GoNet code is more readable and generally performs better than that of its C stack counterparts. From this, it can be concluded that Go with CSP style is a viable alternative to C for the language of kernel implementations.

81) Xiangyao Yu (MIT), Hongzhe Liu (PRIMES), Ethan Zou (PRIMES), and Srini Devadas (MIT), Tardis 2.0: An Optimized Time Traveling Coherence Protocol (arXiv.org, 27 Nov 2015), published in Proceedings of the 2016 International Conference on Parallel Architectures and Compilation (PACT '16), pp. 261-274.

The scalability of cache coherence protocols is a significant challenge in multicore and other distributed shared memory systems. Traditional snoopy and directory-based coherence protocols are difficult to scale up to many-core systems because of the overhead of broadcasting and storing sharers for each cacheline. Tardis, a recently proposed coherence protocol, shows potential in solving the scalability problem, since it only requires O(log N) storage per cacheline for an N-core system and needs no broadcasting support. The original Tardis protocol, however, only supports the sequential consistency memory model. This limits its applicability in real systems since most processors today implement relaxed consistency models like Total Store Order (TSO). Tardis also incurs large network traffic overhead on some benchmarks due to an excessive number of renew messages. Furthermore, the original Tardis protocol has suboptimal performance when the program uses spinning to communicate between threads. In this paper, we address these downsides of the Tardis protocol and make it significantly more practical. Specifically, we discuss the architectural, memory system and protocol changes required in order to implement the TSO consistency model on Tardis, and prove that the modified protocol satisfies TSO. We also propose optimizations for better leasing policies and to handle program spinning. Evaluated on 20 benchmarks, optimized Tardis at 64 (256) cores can achieve average performance improvement of 15.8% (8.4%) compared to the baseline Tardis and 1% (3.4%) compared to the baseline directory protocol. Our optimizations also reduce the average network traffic by 4.3% (6.1%) compared to the baseline directory protocol. On this set of benchmarks, optimized Tardis improves on a fullmap directory protocol in the metrics of energy, performance and storage, while being simpler to implement.

80) Allison Paul, Spectral Inference of a Directed Acyclic Graph Using Pairwise Similarities (11 Nov 2015)

A gene ontology graph is a directed acyclic graph (DAG) which represents relationships among biological processes. Inferring such a graph using a gene similarity matrix is NP-hard in general. Here, we propose an approximate algorithm to solve this problem efficiently by reducing the dimensionality of the problem using spectral clustering. We show that the original problem can be simplified to the inference problem of overlapping clusters in a network. We then solve the simplified problem in two steps: first we infer clusters using a spectral clustering technique. Then, we identify possible overlaps among the inferred clusters by identifying maximal cliques over the cluster similarity graph. We illustrate the effectiveness of our method over various synthetic networks in terms of both the performance and computational complexity compared to existing methods.

79) Niket Gowravaram, A Variation of nil-Temperley-Lieb Algebras of type A (26 Sep 2015)

We investigate a variation on the nil-Temperley-Lieb algebras of type A. This variation is formed by removing one of the relations and, in some sense, can be considered as a type B of the algebras. We give a general description of the structure of monomials formed by generators in the algebras. We also show that the dimension of these algebras is the sequence ${2n \choose n}$, by showing that the dimension is the Catalan transform of the sequence $2^n$.

78) Caleb Ji, Tanya Khovanova (MIT), Robin Park, and Angela Song, Chocolate Numbers (arXiv.org, 21 Sep 2015), published in Journal of Integer Sequences, vol. 19 (2016)

In this paper, we consider a game played on a rectangular $m \times n$ gridded chocolate bar. On the first move, a player breaks the bar along a grid line. Each move after that consists of taking any piece of chocolate and breaking it again along existing grid lines, until just $mn$ individual squares remain.
This paper enumerates the number of ways to break an $m \times n$ bar, which we call chocolate numbers, and introduces four new sequences related to these numbers. Using various techniques, we prove interesting divisibility results regarding these sequences.
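
The count described above can be sketched with a short recursion. This is only an illustrative sketch, not the paper's method: it assumes that two breaking sequences count as different whenever they differ at any step, and the function name `chocolate_number` is hypothetical. An $m \times n$ piece needs $mn - 1$ breaks, so after the first break the remaining $mn - 2$ breaks of the two pieces can be interleaved freely.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def chocolate_number(m: int, n: int) -> int:
    """Count ordered sequences of breaks reducing an m x n bar to unit squares."""
    if m * n == 1:
        return 1  # a single square needs no breaks
    remaining = m * n - 2  # breaks still to make after the first one
    total = 0
    # First break is horizontal: pieces of size i x n and (m - i) x n.
    for i in range(1, m):
        interleavings = comb(remaining, i * n - 1)
        total += interleavings * chocolate_number(i, n) * chocolate_number(m - i, n)
    # First break is vertical: pieces of size m x j and m x (n - j).
    for j in range(1, n):
        interleavings = comb(remaining, m * j - 1)
        total += interleavings * chocolate_number(m, j) * chocolate_number(m, n - j)
    return total


if __name__ == "__main__":
    print([chocolate_number(2, n) for n in range(1, 4)])
```

Under these assumptions a $1 \times n$ bar gives $(n-1)!$, since every break order is allowed, and a $2 \times 2$ bar gives 4.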

77) Albert Gerovitch, Andrew Gritsevskiy, and Gregory Barboy, Mobile Health Surveillance: The Development of Software Tools for Monitoring the Spread of Disease (21 Sep 2015)

Disease spread monitoring data often comes with a significant delay and low geospatial resolution. We aim to develop a software tool for data collection, which enables daily monitoring and prediction of the spread of disease in a small community. We have developed a crowdsourcing application that collects users' health statuses and locations. It allows users to update their daily status online, and, in return, provides a visual map of geospatial distribution of sick people in a community, outlining locations with increased disease incidence. Currently, due to the lack of a large user base, we substitute this information with simulated data, and demonstrate our program's capabilities on a hypothetical outbreak. In addition, we use analytical methods for predicting town-level disease spread in the future. We model the disease spread via interpersonal probabilistic interactions on an undirected social graph. The network structure is based on scale-free networks integrated with Census data. The epidemic is modeled using the Susceptible-Infected-Recovered (SIR) model and a set of parameters, including transmission rate and vaccination patterns. The developed application will provide better methods for early detection of epidemics, identify places with high concentrations of infected people, and predict localized disease spread.

76) Niket Gowravaram and Tanya Khovanova (MIT), On the Structure of nil-Temperley-Lieb Algebras of type A (arXiv.org, 1 Sep 2015)

We investigate nil-Temperley-Lieb algebras of type A. We give a general description of the structure of monomials formed by the generators. We also show that the dimensions of these algebras are the famous Catalan numbers by providing a bijection between the monomials and Dyck paths. We show that the distribution of these monomials by degree is the same as the distribution of Dyck paths by the sum of the heights of the peaks minus the number of peaks.
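
The Dyck path statistic mentioned above (sum of the peak heights minus the number of peaks) can be tabulated by brute force. The sketch below is an illustration only; the helper names are hypothetical, and it makes no claim about which semilength corresponds to which algebra in the paper.

```python
from collections import Counter


def dyck_paths(n):
    """Yield all Dyck paths of semilength n as strings of 'U' (up) and 'D' (down)."""
    def extend(path, ups, downs):
        if ups == n and downs == n:
            yield path
            return
        if ups < n:
            yield from extend(path + "U", ups + 1, downs)
        if downs < ups:  # never go below the starting level
            yield from extend(path + "D", ups, downs + 1)
    yield from extend("", 0, 0)


def peak_statistic(path):
    """Return the sum of the peak heights minus the number of peaks."""
    height = 0
    total = 0
    for step, following in zip(path, path[1:]):
        if step == "U":
            height += 1
            if following == "D":  # an up step followed by a down step is a peak
                total += height - 1
        else:
            height -= 1
    return total


def statistic_distribution(n):
    """Map each value of the statistic to the number of Dyck paths attaining it."""
    return Counter(peak_statistic(p) for p in dyck_paths(n))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, sorted(statistic_distribution(n).items()))
```

The total number of paths for each semilength is a Catalan number (1, 2, 5, 14, ...), and for semilength 3 the statistic takes the values 0, 1, 2 with multiplicities 1, 2, 2.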

75) Tanya Khovanova (MIT) and Karan Sarkar, P-positions in Modular Extensions to Nim (arXiv.org, 27 Aug 2015), published in International Journal of Game Theory, vol. 46 (2017)

In this paper, we consider a modular extension to the game of Nim, which we call $m$-Modular Nim, and explore its optimal strategy. In $m$-Modular Nim, a player can either make a standard Nim move or remove a multiple of $m$ tokens in total. We develop a winning strategy for all $m$ with $2$ heaps and for odd $m$ with any number of heaps.

74) Nicholas Diaco and Tanya Khovanova (MIT), Weighing Coins and Keeping Secrets (arXiv.org, 20 Aug 2015), published in Mathematical Intelligencer (September 2016)

In this expository paper we discuss a relatively new counterfeit coin problem with an unusual goal: maintaining the privacy of, rather than revealing, counterfeit coins in a set of both fake and real coins. We introduce two classes of solutions to this problem --- one that respects the privacy of all the coins and one that respects the privacy of only the fake coins --- and give several results regarding each. We describe and generalize 6 unique strategies that fall into these two categories. Furthermore, we explain conditions for the existence of a solution, as well as showing proof of a solution's optimality in select cases. In order to quantify exactly how much information is revealed by a given solution, we also define the revealing factor and revealing coefficient; these two values additionally act as a means of comparing the relative effectiveness of different solutions. Most importantly, by introducing an array of new concepts, we lay the foundation for future analysis of this very interesting problem, as well as many other problems related to privacy and the transfer of information.

73) Luke Sciarappa, Simple commutative algebras in Deligne's categories Rep($S_t$) (arXiv.org, 24 Jun 2015)

We show that in the Deligne categories $\mathrm{Rep}(S_t)$ for $t$ a transcendental number, the only simple algebra objects are images of simple algebras in the category of representations of a symmetric group under a canonical induction functor. They come in families which interpolate the families of algebras of functions on the cosets of $H\times S_{n-k}$ in $S_n$, for a fixed subgroup $H$ of $S_k$.

2014 Research Papers

72) Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), Carolyn Lu (PRIMES), Anton Goloborodko (MIT), Nezar Abdennur (MIT), and Leonid Mirny (MIT), Formation of Chromosomal Domains by Loop Extrusion (bioRxiv, 14 Aug 2015), published in Cell Reports 15:9 (31 May 2016): 2038–2049.

Characterizing how the three-dimensional organization of eukaryotic interphase chromosomes modulates regulatory interactions is an important contemporary challenge. Here we propose an active process underlying the formation of chromosomal domains observed in Hi-C experiments. In this process, cis-acting factors extrude progressively larger loops, but stall at domain boundaries; this dynamically forms loops of various sizes within but not between domains. We studied this mechanism using a polymer model of the chromatin fiber subject to loop extrusion dynamics. We find that systems of dynamically extruded loops can produce domains as observed in Hi-C experiments. Our results demonstrate the plausibility of the loop extrusion mechanism, and posit potential roles of cohesin complexes as a loop-extruding factor, and CTCF as an impediment to loop extrusion at domain boundaries.

71) Kavish Gandhi, Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube (4 April 2015)

A geodesic in the hypercube is the shortest possible path between two vertices. Leader and Long (2013) conjectured that, in every antipodal $2$-coloring of the edges of the hypercube, there exists a monochromatic geodesic between antipodal vertices. For this and an equivalent conjecture, we prove the cases $n = 2, 3, 4, 5$. We also examine the maximum number of monochromatic geodesics of length $k$ in an antipodal $2$-coloring and find it to be $2^{n-1}(n-k+1)\binom{n-1}{k-1}(k-1)!$. In this case, we classify all colorings in which this maximum occurs. Furthermore, we explore the maximum number of antipodal geodesics in a subgraph of the hypercube with a fixed proportion of edges, providing a conjectured optimal configuration as a lower bound, which, interestingly, contains a constant proportion of geodesics with respect to $n$. Finally, we present a series of smaller results that could be of use in finding an upper bound on the maximum number of antipodal geodesics in such a subgraph of the hypercube.

70) Jesse Geneson (MIT) and Peter M. Tian (PRIMES), Sequences of formation width $4$ and alternation length $5$ (arXiv.org, 13 Feb 2015)

Sequence pattern avoidance is a central topic in combinatorics. A sequence $s$ contains a sequence $u$ if some subsequence of $s$ can be changed into $u$ by a one-to-one renaming of its letters. If $s$ does not contain $u$, then $s$ avoids $u$. A widely studied extremal function related to pattern avoidance is $Ex(u, n)$, the maximum length of an $n$-letter sequence that avoids $u$ and has every $r$ consecutive letters pairwise distinct, where $r$ is the number of distinct letters in $u$.
We bound $Ex(u, n)$ using the formation width function, $fw(u)$, which is the minimum $s$ for which there exists $r$ such that any concatenation of $s$ permutations, each on the same $r$ letters, contains $u$. In particular, we identify every sequence $u$ such that $fw(u)=4$ and $u$ contains $ababa$. The significance of this result lies in its implication that, for every such sequence $u$, we have $Ex(u, n) = \Theta(n \alpha(n))$, where $\alpha(n)$ denotes the incredibly slow-growing inverse Ackermann function. We have thus identified the extremal function of many infinite classes of previously unidentified sequences.
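
The notion of containment defined above can be made concrete with a brute-force check. This is a minimal sketch for small inputs only (it tries every subsequence), and the function names are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def is_renaming(candidate, pattern):
    """True if candidate becomes pattern under a one-to-one renaming of letters."""
    pairs = set(zip(candidate, pattern))
    # The letter map is well defined and injective exactly when the number of
    # distinct (letter, image) pairs equals the number of distinct letters on each side.
    return len(pairs) == len(set(candidate)) == len(set(pattern))


def contains(s, u):
    """True if some subsequence of s can be renamed one-to-one into u."""
    return any(
        is_renaming([s[i] for i in positions], u)
        for positions in combinations(range(len(s)), len(u))
    )


def avoids(s, u):
    """True if s does not contain u."""
    return not contains(s, u)


if __name__ == "__main__":
    print(contains("xyxzy", "aba"), avoids("abc", "aba"))
```

For example, `xyxzy` contains `aba` through the subsequence `xyx`, while `abc` avoids `aba` because it has no repeated letter.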

69) William Wu (PRIMES), Nicolaas Kaashoek (PRIMES), Matthew Weinberg (MIT), Christos Tzamos (MIT), and Costis Daskalakis (MIT), Game Theory based Peer Grading Mechanisms for MOOCs, paper for the Learning at Scale 2015 conference, March 14-18, 2015, Vancouver, BC, Canada (4 February 2015)

An efficient peer grading mechanism is proposed for grading the multitude of assignments in online courses. This novel approach is based on game theory and mechanism design. A set of assumptions and a mathematical model is ratified to simulate the dominant strategy behavior of students in a given mechanism. A benchmark function accounting for grade accuracy and workload is established to quantitatively compare effectiveness and scalability of various mechanisms. After multiple iterations of mechanisms under increasingly realistic assumptions, three are proposed: Calibration, Improved Calibration, and Deduction. The Calibration mechanism performs as predicted by game theory when tested in an online crowd-sourced experiment, but fails when students are assumed to communicate. The Improved Calibration mechanism addresses this assumption, but at the cost of more effort spent grading. The Deduction mechanism performs relatively well in the benchmark, outperforming the Calibration, Improved Calibration, traditional automated, and traditional peer grading systems. The mathematical model and benchmark open the way for future derivative works to be performed and compared.

68) Alexandria Yu, Towards the classification of unital 7-dimensional commutative algebras (19 Jan 2015)

An algebra is a vector space with a compatible product operation. An algebra is called commutative if the product of any two elements is independent of the order in which they are multiplied. A basic problem is to determine how many unital commutative algebras exist in a given dimension and to find all of these algebras. This classification problem has its origin in number theory and algebraic geometry. For dimension less than or equal to 6, Poonen has completely classified all unital commutative algebras up to isomorphism. For dimension greater than or equal to 7, the situation is much more complicated due to the fact that there are infinitely many algebras up to isomorphism. The purpose of this work is to develop new techniques to classify unital 7-dimensional commutative algebras up to isomorphism. An algebra is called local if there exists a unique maximal ideal $m$. Local algebras are basic building blocks for general algebras as any finite dimensional unital commutative algebra is isomorphic to a direct sum of finite dimensional unital commutative local algebras. Hence, in order to classify all finite dimensional unital commutative algebras, it suffices to classify all finite dimensional unital commutative local algebras. In this article, we classify all unital 7-dimensional commutative local algebras up to isomorphism with the exception of the special case $k_1 = 3$ and $k_2 = 3$, where, for each positive integer $i$, $m^i$ is the subalgebra generated by products of $i$ elements in the maximal ideal $m$ and $k_i$ is the dimension of the quotient algebra $m^i / m^{i+1}$. When $k_2 = 1$, we classify all finite dimensional unital commutative local algebras up to isomorphism. As a byproduct of our classification theorems, we discover several new classes of unital finite dimensional commutative algebras.

67) Niket Gowravaram and Uma Roy, Diagrammatic Calculus of Coxeter and Braid Groups (arXiv.org, 15 Mar 2015)

We investigate a novel diagrammatic approach to examining strict actions of a Coxeter group or a braid group on a category. This diagrammatic language, which was developed in a series of papers by Elias, Khovanov and Williamson, provides new tools and methods to attack many problems of current interest in representation theory. In our research we considered a particular problem which arises in this context. To a Coxeter group $W$ one can associate a real hyperplane arrangement, and can consider the complement of these hyperplanes in the complexification $Y_W$. The celebrated $K(\pi,1)$ conjecture states that $Y_W$ should be a classifying space for the pure braid group, and thus a natural quotient ${Y_W}/{W}$ should be a classifying space for the braid group. Salvetti provided a cell complex realization of the quotient, which we refer to as the Salvetti complex. In this paper we investigate a part of the $K(\pi,1)$ conjecture, which we call the $K(\pi,1)$ conjecturette, that states that the second homotopy group of the Salvetti complex is trivial. In this paper we present a diagrammatic proof of the $K(\pi,1)$ conjecturette for a family of braid groups as well as an analogous result for several families of Coxeter groups.

66) Arjun Khandelwal, Compact dot representations in permutation avoidance (3 Mar 2015)

A paper by Eriksson et al. (2001) introduced a new form of representing a permutation, referred to as the compact dot representation, with the goal of constructing a smaller superpattern. We study this representation and give bounds on its size. We also consider a variant of the problem, where limitations on the alphabet size are imposed, and obtain lower bounds. Lastly, we consider the Möbius function of the poset of permutations ordered by containment.

65) Suzy Lou and Max Murin, On the Strongly Regular Graph of Parameters (99, 14, 1, 2) (9 Jan 2015)

In an attempt to find a strongly regular graph of parameters (99, 14, 1, 2) or to disprove its existence, we studied its possible substructure and constructions.
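
As background on why these parameters are a natural candidate, the standard necessary conditions for a strongly regular graph with parameters $(v, k, \lambda, \mu)$ can be checked numerically. The sketch below is not taken from the paper; it only tests the usual counting identity $k(k - \lambda - 1) = (v - k - 1)\mu$ and the integrality of the eigenvalue multiplicities, and the function name `srg_checks` is hypothetical. Passing these checks does not show that a graph exists.

```python
from fractions import Fraction
from math import isqrt


def srg_checks(v, k, lam, mu):
    """Return (counting_ok, eigenvalues, multiplicities) for putative parameters.

    Eigenvalues and multiplicities are None when the discriminant is not a
    perfect square (a case this sketch does not handle).
    """
    # Count paths of length 2 from a fixed vertex to its non-neighbours in two ways.
    counting_ok = k * (k - lam - 1) == (v - k - 1) * mu
    disc = (lam - mu) ** 2 + 4 * (k - mu)
    root = isqrt(disc)
    if root == 0 or root * root != disc:
        return counting_ok, None, None
    # Non-trivial eigenvalues of the adjacency matrix.
    r = Fraction(lam - mu + root, 2)
    s = Fraction(lam - mu - root, 2)
    # Multiplicities f, g solve f + g = v - 1 and k + f*r + g*s = 0 (trace zero).
    f = (-k - (v - 1) * s) / (r - s)
    g = (v - 1) - f
    return counting_ok, (r, s), (f, g)


if __name__ == "__main__":
    print(srg_checks(99, 14, 1, 2))
```

For (99, 14, 1, 2) the identity holds (168 on both sides), and the eigenvalues 3 and -4 come out with integer multiplicities 54 and 44, so these elementary tests do not rule the graph out.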

64) Shashwat Kishore (PRIMES) and Augustus Lonergan (MIT), Signatures of Multiplicity Spaces in Tensor Products of $\mathfrak{sl}_2$ and $U_q(\mathfrak{sl}_2)$ Representations (9 Jan 2015; arXiv.org, 8 Jun 2015)

We study multiplicity space signatures in tensor products of $\mathfrak{sl}_2$ and $U_q(\mathfrak{sl}_2)$ representations and their applications. We completely classify definite multiplicity spaces for generic tensor products of $\mathfrak{sl}_2$ Verma modules. This provides a classification of a family of unitary representations of a basic quantized quiver variety, one of the first such classifications for any quantized quiver variety. We use multiplicity space signatures to provide the first real critical point lower bound for generic $\mathfrak{sl}_2$ master functions. As a corollary of this bound, we obtain a simple and asymptotically correct approximation for the number of real critical points of a generic $\mathfrak{sl}_2$ master function. We obtain a formula for multiplicity space signatures in tensor products of finite dimensional simple $U_q(\mathfrak{sl}_2)$ representations. Our formula also gives multiplicity space signatures in generic tensor products of $\mathfrak{sl}_2$ Verma modules and generic tensor products of real $U_q(\mathfrak{sl}_2)$ Verma modules. Our results have relations with knot theory, statistical mechanics, quantum physics, and geometric representation theory.

63) Joseph Zurier, Generalizations of the Joints Problem (9 Jan 2015)

In this paper we explore generalizations of the joints problem introduced by B. Chazelle et al.

62) Nathan Wolfe (PRIMES), Ethan Zou (PRIMES), Ling Ren (MIT), and Xiangyao Yu (MIT), Optimizing Path ORAM for Cloud Storage Applications (arXiv.org, 8 Jan 2015)

We live in a world where our personal data are both valuable and vulnerable to misappropriation through exploitation of security vulnerabilities in online services. For instance, Dropbox, a popular cloud storage tool, has certain security flaws that can be exploited to compromise a user's data, one of which is that a user's access pattern is unprotected. We have thus created an implementation of Path Oblivious RAM (Path ORAM) for Dropbox users to obfuscate path access information to patch this vulnerability. This implementation differs significantly from the standard usage of Path ORAM, in that we introduce several innovations, including a dynamically growing and shrinking tree architecture, multi-block fetching, block packing and the possibility for multi-client use. Our optimizations together produce about a 77% throughput increase and a 60% reduction in necessary tree size; these numbers vary with file size distribution.

61) Brice Huang, Monomization of Power Ideals and Generalized Parking Functions (8 Jan 2015)

A power ideal is an ideal in a polynomial ring generated by powers of homogeneous linear forms. Power ideals arise in many areas of mathematics, including the study of zonotopes, approximation theory, and fat point ideals; in particular, their applications in approximation theory are relevant to work on splines and pertinent to mathematical modeling, industrial design, and computer graphics. For this reason, understanding the structure of power ideals, especially their Hilbert series, is an important problem. Unfortunately, due to the computational complexity of power ideals, this is a difficult problem. Only a few cases of this problem have been solved; efficient ways to compute the Hilbert series of a power ideal are known only for power ideals of certain forms. In this paper, we find an efficient way to compute the Hilbert series of a class of power ideals.

60) Kyle Gettig, Linear Extensions of Acyclic Orientations (7 Jan 2015)

Given a graph, an acyclic orientation of the edges determines a partial ordering of the vertices. This partial ordering has a number of linear extensions, i.e. total orderings of the vertices that agree with the partial ordering. The purpose of this paper is twofold. Firstly, properties of the orientation that induces the maximum number of linear extensions are investigated. Due to similarities between the optimal orientation in simple cases and the solution to the Max-Cut Problem, the possibility of a correlation is explored, though with minimal success. Correlations are then explored between the optimal orientation of a graph G and the comparability graphs with the minimum number of edges that contain G as a subgraph, as well as to certain graphical colorings induced by the orientation. Specifically, small cases of non-comparability graphs are investigated and compared to the known results for comparability graphs. We then explore the optimal orientation for odd anti-cycles and related graphs, proving that the conjectured orientations are optimal in the odd anti-cycle case. In the second part of this paper, the above concepts are extended to random graphs, that is, graphs with probabilities associated with each edge. New definitions and theorems are introduced to create a more intuitive system that agrees with the discrete case when all probabilities are 0 or 1, though complete results for this new system would be much more difficult to prove.

59) Shyam Narayanan, Improving the Speed and Accuracy of the Miller-Rabin Primality Test (7 Jan 2015)

In this paper, we discuss the accuracy of the Miller-Rabin Primality Test and the number of nonwitnesses for a composite odd integer $n$.
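
To make the notion of a nonwitness concrete, the following sketch counts them by brute force for a small odd $n$. It uses the standard textbook definition (writing $n - 1 = 2^s d$ with $d$ odd, a base $a$ is a nonwitness if $a^d \equiv 1$ or $a^{2^r d} \equiv -1 \pmod{n}$ for some $0 \le r < s$); the function names are hypothetical, and this is not the paper's improved method.

```python
def is_nonwitness(a: int, n: int) -> bool:
    """True if base a fails to prove that the odd integer n > 2 is composite."""
    d, s = n - 1, 0
    while d % 2 == 0:  # write n - 1 = 2**s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):  # square repeatedly, looking for -1 mod n
        x = x * x % n
        if x == n - 1:
            return True
    return False


def count_nonwitnesses(n: int) -> int:
    """Count the bases a in [1, n - 1] that are nonwitnesses for odd n > 2."""
    return sum(is_nonwitness(a, n) for a in range(1, n))


if __name__ == "__main__":
    for n in (9, 15, 25):
        print(n, count_nonwitnesses(n))
```

For instance, the only nonwitnesses for 9 and for 15 are $\pm 1$, while 25 has four (1, 7, 18 and 24). For a prime $n$ every base in the range is a nonwitness.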

58) Peter M. Tian, Extremal Functions of Forbidden Multidimensional Matrices (7 Jan 2015)

We advance the extremal theory of matrices in two directions. The methods that we use come from combinatorics, probability, and analysis.

57) Eric Neyman, Cylindric Young Tableaux and their Properties (7 Jan 2015; earlier version on arXiv.org, 19 Oct 2014)

Cylindric Young tableaux are combinatorial objects that first appeared in the 1990s. A natural extension of the classical notion of a Young tableau, they have since been used several times, most notably by Gessel and Krattenthaler and by Alexander Postnikov. Despite this, relatively little is known about cylindric Young tableaux. This paper is an investigation of the properties of this object. In this paper, we extend the Robinson-Schensted-Knuth Correspondence, a well-known and very useful bijection concerning regular Young tableaux, to be a correspondence between pairs of cylindric tableaux. We use this correspondence to reach further results about cylindric tableaux. We then establish an interpretation of cylindric tableaux in terms of a game involving marble-passing. Next, we demonstrate a generic method to use results concerning cylindric tableaux in order to prove results about skew Young tableaux. We finish with a note on Knuth equivalence and its analog for cylindric tableaux.

56) Yilun Du, On the Algorithmic and Theoretical Exploration of Tiling-Harmonic Functions (6 Jan 2015)

In this paper, we explore a new class of harmonic functions defined on a tiling $T$, a square tiling of a region $D$ in $\mathbb{C}$. We define these functions as tiling harmonic functions. We develop an efficient algorithm for computing interior values of tiling harmonic functions and graph harmonic functions in a tiling. Using our algorithm, we find that tiling harmonic functions are not generally equivalent to graph harmonic functions. In addition, we prove some theoretical results on the structure of tiling harmonic functions and classify one type of tiling harmonic function.

55) Jessica Li, On the Modeling of Snowflake Growth Using Hexagonal Automata (2 Jan 2015; arXiv.org, 8 May 2015; published (with Laura P. Schaposnik) in Physical Review E 93:2 (Feb. 2016))

Snowflake growth is an example of crystallization, a basic phase transition in physics. Studying snowflake growth helps gain fundamental understanding of this basic process and may help produce better crystalline materials and benefit several major industries. The basic theoretical physical mechanisms governing the growth of snowflakes are not well understood: whilst current computer modeling methods can generate snowflake images that successfully capture some basic features of actual snowflakes, so far there has been no analysis of these computer models in the literature, and more importantly, certain fundamental features of snowflakes are not well understood. A key challenge of analysis is that the snowflake growth models consist of a large set of partial difference equations, and as in many chaos theory problems, rigorous study is difficult. In this paper we analyze a popular model (Reiter’s model) using a combined approach of mathematical analysis and numerical simulation. We divide a snowflake image into main branches and side branches and define two new variables (growth latency and growth direction) to characterize the growth patterns. We derive a closed form solution of the main branch growth latency using a one dimensional linear model, and compare it with the simulation results using the hexagonal automata. We discover a few interesting patterns of the growth latency and direction of side branches. On the basis of the analysis and the principle of surface free energy minimization, we propose a new geometric rule to incorporate interface control, a basic mechanism of crystallization that is not taken into account in Reiter’s original model.

54) Amy Chou and Justin Kaashoek, PuzzleJAR: Automated Constraint-based Generation of Puzzles of Varying Complexity (30 Sept 2014)

Engaging students in practicing a wide range of problems facilitates their learning. However, generating fresh problems that have specific characteristics, such as using a certain set of concepts or being of a given difficulty level, is a tedious task for a teacher. In this paper, we present PuzzleJAR, a system that is based on an iterative constraint-based technique for automatically generating problems. The PuzzleJAR system takes as parameters the problem definition, the complexity function, and domain-specific semantics-preserving transformations. We present an instantiation of our technique with automated generation of Sudoku and Fillomino puzzles, and we are currently extending our technique to generate Python programming problems. Since defining complexities of Sudoku and Fillomino puzzles is still an open research question, we developed our own mechanism to define complexity, using machine learning to generate a function for difficulty from puzzles with already known difficulties. Using this technique, PuzzleJAR generated over 200,000 Sudoku puzzles of different sizes (9x9, 16x16, 25x25) and over 10,000 Fillomino puzzles of sizes ranging from 2x2 to 16x16.

53) Tanya Khovanova, Eric Nie, and Alok Puranik, The Sierpinski Triangle and The Ulam-Warburton Automaton (arXiv.org, 25 Aug 2014), published in Math Horizons (September 2015), reprinted in The Best Writing on Mathematics 2016

This paper is about the beauty of fractals and the surprising connections between them. We will explain the pioneering role that the Sierpinski triangle plays in the Ulam-Warburton automata and show you a number of pictures along the way.

52) Tanya Khovanova and Joshua Xiong, Cookie Monster Plays Games (arXiv.org, 6 July 2014), published in College Mathematics Journal 46:4 (2015): 283-293

We research a combinatorial game based on the Cookie Monster problem called the Cookie Monster game that generalizes the games of Nim and Wythoff. We also propose several combinatorial games that are in between the Cookie Monster game and Nim. We discuss properties of P-positions of all of these games.
Each section consists of two parts. The first part is a story presented from the Cookie Monster's point of view; the second part is a more abstract discussion of the same ideas by the authors.

51) Tanya Khovanova and Joshua Xiong, Nim Fractals (arXiv.org, 23 May 2014), published in Journal of Integer Sequences, Vol. 17 (2014)

We enumerate P-positions in the game of Nim in two different ways. In one series of sequences we enumerate them by the maximum number of counters in a pile. In another series of sequences we enumerate them by the total number of counters. We show that the game of Nim can be viewed as a cellular automaton, where the total number of counters divided by 2 can be considered as a generation in which P-positions are born. We prove that the three-pile Nim sequence enumerated by the total number of counters is a famous toothpick sequence based on the Ulam-Warburton cellular automaton. We introduce 10 new sequences.
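
The enumeration by total number of counters can be reproduced by brute force for three piles. This sketch assumes that piles are distinguishable (ordered triples) and that empty piles are allowed, which may differ from the paper's exact conventions; it relies only on the classical fact that a Nim position is a P-position exactly when the bitwise XOR of the pile sizes is zero. The function name is hypothetical.

```python
def count_three_pile_p_positions(total: int) -> int:
    """Count ordered triples (a, b, c) with a + b + c == total and a ^ b ^ c == 0."""
    count = 0
    for a in range(total + 1):
        for b in range(total - a + 1):
            c = total - a - b
            if a ^ b ^ c == 0:  # zero Nim-sum means the previous player wins
                count += 1
    return count


if __name__ == "__main__":
    # Only even totals can occur, so index the sequence by total // 2.
    print([count_three_pile_p_positions(2 * n) for n in range(8)])
```

Under these assumptions the counts for totals 0, 2, 4, ..., 14 are 1, 3, 3, 9, 3, 9, 9, 27, that is, 3 raised to the number of ones in the binary expansion of half the total: in every binary position either no pile or exactly two of the three piles have a 1, and each such position offers three choices.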

50) Noah Golowich, Resolving a Conjecture on Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 13 Apr 2014), published in The Electronic Journal of Combinatorics 21:3 (2014)

A linear equation is $r$-regular if, for every $r$-coloring of the positive integers, there exist positive integers of the same color which satisfy the equation. In 2005, Fox and Radoičić conjectured that the equation $x_1 + 2x_2 + \cdots + 2^{n-2}x_{n-1} - 2^{n-1}x_n = 0$, for any $n \geq 2$, has a degree of regularity of $n-1$, which would verify a conjecture of Rado from 1933. Rado's conjecture has since been verified with a different family of equations. In this paper, we show that each equation in Fox and Radoičić's family indeed has a degree of regularity of $n-1$. We also prove a few extensions of this result.

2013 Research Papers

49) Ritesh Ragavender, Odd Dunkl Operators and nilHecke Algebras (30 May 2014)

Symmetric functions appear in many areas of mathematics and physics, including enumerative combinatorics, the representation theory of symmetric groups, statistical mechanics, and the quantum statistics of ideal gases. In the commutative (or “even”) case of these symmetric functions, Kostant and Kumar introduced a nilHecke algebra that categorifies the quantum group $U_q(\mathfrak{sl}_2)$. This categorification helps to better understand Khovanov homology, which has important applications in studying knot polynomials and gauge theory. Recently, Ellis and Khovanov initiated the program of “oddification” as an effort to create a representation theoretic understanding of a new “odd” Khovanov homology, which often yields more powerful results than regular Khovanov homology. In this paper, we contribute towards the project of oddification by studying the odd Dunkl operators of Khongsap and Wang in the setting of the odd nilHecke algebra. Specifically, we show that odd divided difference operators can be used to construct odd Dunkl operators, which we use to give a representation of $\mathfrak{sl}_2$ on the algebra of skew polynomials and evaluate the odd Dunkl Laplacian. We then investigate $q$-analogs of divided difference operators to introduce new algebras that are similar to the even and odd nilHecke algebras and act on $q$-symmetric polynomials. We describe such algebras for all previously unstudied values of $q$. We conclude by generalizing a diagrammatic method and developing the novel method of insertion in order to study $q$-symmetric polynomials from the perspective of bialgebras.

48) Gabriella Studt, Construction of the higher Bruhat order on the Weyl group of type B (27 May 2014)

Manin and Schechtman defined the Bruhat order on the type A Weyl group, which is closely associated to the symmetric group $S_n$, as the order of all pairs of numbers in $\{1, 2, \ldots, n\}$. They proceeded to define a series of higher orders. Each higher order is an order on the subsets of $\{1, 2, \ldots, n\}$ of size $k$, and can be computed using an inductive argument. It is also possible to define each of these higher orders explicitly, and therefore know conclusively the lexicographic orders for all $k$. It is thought that a closely related concept of lexicographic order exists for the Weyl group of type B, and that a similar method can be used to compute this series of higher orders. The applicability of this method is demonstrated in the paper, and we are able to determine and characterize the higher Bruhat order explicitly for certain $n$ and $k$. We therefore conjecture the existence of such an order for all $n > k$, as well as its accompanying properties.

47) Jeffrey Cai, Orbits of a fixed-point subgroup of the symplectic group on partial flag varieties of type A (24 May 2014)

In this paper we compute the orbits of the symplectic group Sp 2 n on partial flag varieties GL 2 n / P and on partial flag varieties enhanced by a vector space, C 2 n x GL 2 n / P . This extends analogous results proved by Matsuki on full flags. The general technique used in this paper is to take the orbits in the full flag case and determine which orbits remain distinct when the full flag variety GL 2 n / B is projected down to the partial flag variety GL 2 n / P .

The recent discovery of a connection between abstract algebra and the classical combinatorial Robinson-Schensted (RS) correspondence has sparked research on related algebraic structures and relationships to new combinatorial bijections, such as the Robinson-Schensted-Knuth (RSK) correspondence, the "mirabolic" RSK correspondence, and the "exotic" RS correspondence. We conjecture an exotic RSK correspondence between the orbits described in this paper and semistandard bi-tableaux, which would yield an extension to the exotic RS correspondence found in a paper of Henderson and Trapa.

46) John Long, Evidence of Purifying Selection in Mammals (9 May 2014)

The Human Genome Project, completed in 2003, gave us a reference genome for the human species. Before the project was completed, it was believed that the primary function of DNA was to code for protein. However, it was discovered that only 2% of the genome consists of regions that code for proteins. The remaining regions of the genome are either functional regions that regulate the coding regions or junk DNA regions that do nothing. The distinction between these two types of regions is not completely clear. Evidence of purifying selection, the decrease in frequency of deleterious mutations, is likely a sign that a region is functional. The goal of this project was to find evidence of purifying selection in newly acquired regions in the human genome that are hypothesized to be functional. The mean Derived Allele Frequency of the featured regions was compared to that of control regions to determine the likelihood of selection.

45) Ravi Jagadeesan, A new $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant of dessins d'enfants (arXiv.org, 30 March 2014)

We study the action of $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ on the category of Belyi functions (finite, étale covers of $\mathbb{P}^1_{\overline{\mathbb{Q}}}\setminus \{0,1,\infty\}$). We describe a new combinatorial $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$-invariant for a certain class of Belyi functions. As a corollary, we obtain that for all $k < 2^{\sqrt{\frac{2}{3}}}$ and all positive integers $N$, there is an $n \le N$ such that the set of degree $n$ Belyi functions of a particular rational Nielsen class must split into at least $\Omega\left(k^{\sqrt{N}}\right)$ Galois orbits. In addition, we define a new version of the Grothendieck-Teichmüller group $\widehat{GT}$ into which $\operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$ embeds.

44) Andrey Grinshpun (MIT), Raj Raina (PRIMES), and Rik Sengupta (MIT), Minimum Degrees of Minimal Ramsey Graphs for Almost-Cliques (arXiv.org, 26 Jun 2014)

For graphs $F$ and $H$, we say $F$ is Ramsey for $H$ if every $2$-coloring of the edges of $F$ contains a monochromatic copy of $H$. The graph $F$ is Ramsey $H$-minimal if $F$ is Ramsey for $H$ and there is no proper subgraph $F'$ of $F$ so that $F'$ is Ramsey for $H$. Burr, Erdős, and Lovász defined $s(H)$ to be the minimum degree of $F$ over all Ramsey $H$-minimal graphs $F$. Define $H_{t,d}$ to be a graph on $t+1$ vertices consisting of a complete graph on $t$ vertices and one additional vertex of degree $d$. We show that $s(H_{t,d})=d^2$ for all values $1<d\le t$; it was previously known that $s(H_{t,1})=t-1$, so it is surprising that $s(H_{t,2})=4$ is much smaller.
We also make some further progress on some sparser graphs. Fox and Lin observed that $s(H)\ge 2\delta(H)-1$ for all graphs $H$, where $\delta(H)$ is the minimum degree of $H$; Szabó, Zumstein, and Zürcher investigated which graphs attain this bound with equality and conjectured that all bipartite graphs $H$ without isolated vertices satisfy $s(H)=2\delta(H)-1$. Fox, Grinshpun, Liebenau, Person, and Szabó further conjectured that all triangle-free graphs without isolated vertices satisfy this property. We show that $d$-regular $3$-connected triangle-free graphs $H$, with one extra technical constraint, satisfy $s(H) = 2\delta(H)-1$; the extra constraint is that $H$ has a vertex $v$ so that if one removes $v$ and its neighborhood from $H$, the remainder is connected.

43) Boryana Doyle (PRIMES), Geoffrey Fudenberg (Harvard), Maxim Imakaev (MIT), and Leonid Mirny (MIT), Chromatin Loops as Allosteric Modulators of Enhancer-Promoter Interactions, published in PLoS Computational Biology (23 Oct 2014; earlier version in BioRxiv.org, 26 February 2014)

The classic model of eukaryotic gene expression requires direct spatial contact between a distal enhancer and a proximal promoter. Recent Chromosome Conformation Capture (3C) studies show that enhancers and promoters are embedded in a complex network of looping interactions. Here we use a polymer model of chromatin fiber to investigate whether, and to what extent, looping interactions between elements in the vicinity of an enhancer-promoter pair can influence their contact frequency. Our equilibrium polymer simulations show that a chromatin loop, formed by elements flanking either an enhancer or a promoter, suppresses enhancer-promoter interactions, working as an insulator. A loop formed by elements located in the region between an enhancer and a promoter, on the contrary, facilitates their interactions. We find that different mechanisms underlie insulation and facilitation; insulation occurs due to steric exclusion by the loop, and is a global effect, while facilitation occurs due to an effective shortening of the enhancer-promoter genomic distance, and is a local effect. Consistently, we find that these effects manifest quite differently for in silico 3C and microscopy. Our results show that looping interactions that do not directly involve an enhancer-promoter pair can nevertheless significantly modulate their interactions. This phenomenon is analogous to allosteric regulation in proteins, where a conformational change triggered by binding of a regulatory molecule to one site affects the state of another site.

42) William Kuszmaul, A New Approach to Enumerating Statistics Modulo n (arXiv.org, 16 February 2014)

We find a new approach to computing the remainder of a polynomial modulo $x^n-1$; such a computation is called modular enumeration. Given a polynomial with coefficients from a commutative $\mathbb{Q}$-algebra, our first main result constructs the remainder simply from the coefficients of residues of the polynomial modulo $\Phi_d(x)$ for each $d\mid n$. Since such residues can often be found to have nice values, this simplifies a number of modular enumeration problems; indeed in some cases, such residues are already known while the related modular enumeration problem has remained unsolved. We list six such cases which our technique makes easy to solve. Our second main result is a formula for the unique polynomial $a$ such that $a \equiv f \mod \Phi_n(x)$ and $a\equiv 0 \mod x^d-1$ for each proper divisor $d$ of $n$.

We find a formula for remainders of $q$-multinomial coefficients and for remainders of $q$-Catalan numbers modulo $q^n-1$, reducing each problem to a finite number of cases for any fixed $n$. In the former case, we solve an open problem posed by Hartke and Radcliffe. In considering $q$-Catalan numbers modulo $q^n-1$, we discover a cyclic group operation on certain lattice paths which behaves predictably with regard to major index. We also make progress on a problem in modular enumeration on subset sums posed by Kitchloo and Pachter.
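To make the notion of modular enumeration concrete, here is a minimal Python sketch that computes the remainder of a polynomial modulo $x^n-1$ by direct coefficient folding (since $x^n \equiv 1$, the coefficient of $x^i$ is added to position $i \bmod n$). This is the naive computation, not the cyclotomic-residue technique of the paper; the function names `remainder_mod_xn_minus_1` and `q_binomial` are illustrative choices, and the $q$-binomial is built from the standard $q$-Pascal recurrence.

```python
from functools import lru_cache


def remainder_mod_xn_minus_1(coeffs, n):
    """Remainder of sum(coeffs[i] * x**i) modulo x**n - 1, as a list of n coefficients."""
    rem = [0] * n
    for i, c in enumerate(coeffs):
        rem[i % n] += c  # x**i is congruent to x**(i mod n)
    return rem


@lru_cache(maxsize=None)
def q_binomial(m, k):
    """Coefficients (lowest degree first) of the Gaussian binomial [m choose k]_q.

    Uses the q-Pascal recurrence [m, k] = [m-1, k-1] + q**k * [m-1, k].
    """
    if k < 0 or k > m:
        return ()
    if k == 0 or k == m:
        return (1,)
    a = q_binomial(m - 1, k - 1)
    b = (0,) * k + q_binomial(m - 1, k)  # shifting by k multiplies by q**k
    length = max(len(a), len(b))
    a = a + (0,) * (length - len(a))
    b = b + (0,) * (length - len(b))
    return tuple(x + y for x, y in zip(a, b))


if __name__ == "__main__":
    poly = q_binomial(4, 2)                     # 1 + q + 2q^2 + q^3 + q^4
    print(remainder_mod_xn_minus_1(poly, 4))    # coefficients of the remainder mod q^4 - 1
```

The folded coefficients always sum to the value of the polynomial at $q=1$, which gives a quick sanity check on any modular enumeration.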

41) Ajay Saini, Predictive Modeling of Opinion and Connectivity Dynamics in Social Networks (26 January 2014)

Social networks have been extensively studied in recent years with the aim of understanding how the connectivity of different societies and their subgroups influences the spread of innovations and opinions through human networks. Using data collected from real-world social networks, researchers are able to gain a better understanding of the dynamics of such networks and subsequently model the changes that occur in these networks over time. In our work, we use data from the Social Evolution dataset of the MIT Human Dynamics Lab to develop a data-driven model capable of predicting the trends and long-term changes observed in a real-world social network. We demonstrate the effectiveness of the model by predicting changes in both opinion spread and connectivity that reflect the changes observed in our dataset. After validating the model, we use it to understand how different types of social networks behave over time by varying the conditions governing the change of opinions and connectivity. We conclude with a study of opinion propagation under different conditions in which we use the structure and opinion distribution of various networks to identify sets of agents capable of propagating their opinion throughout an entire network. Our results demonstrate the effectiveness of the proposed modeling approach in predicting the future state of social networks and provide further insight into the dynamics of interactions between agents in real-world social networks.

40) Rohil Prasad, Investigating GCD in Z[√2] (11 January 2014)

We attempt to optimize the time needed to calculate greatest common divisors in the Euclidean domain Z[√2].
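As a baseline for what such a computation looks like, the following Python sketch runs the plain Euclidean algorithm in Z[√2], using the norm $N(a+b\sqrt{2}) = a^2 - 2b^2$ and nearest-integer rounding of the exact quotient. It is not the optimized method of the paper; the pair representation `(a, b)` for $a + b\sqrt{2}$ and all function names are assumptions made for illustration. Rounding each coordinate leaves an error of at most 1/2, so the remainder has norm at most half that of the divisor in absolute value and the loop terminates.

```python
from fractions import Fraction


def mul(x, y):
    """Product of x = a + b*sqrt(2) and y = c + d*sqrt(2), as a pair."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)


def norm(x):
    """Field norm a^2 - 2*b^2 (may be negative)."""
    a, b = x
    return a * a - 2 * b * b


def divmod_zsqrt2(x, y):
    """Quotient and remainder of x by nonzero y, with |norm(rem)| < |norm(y)|."""
    c, d = y
    n = norm(y)
    p, q = mul(x, (c, -d))  # x / y = x * conj(y) / norm(y)
    quot = (round(Fraction(p, n)), round(Fraction(q, n)))
    prod = mul(quot, y)
    rem = (x[0] - prod[0], x[1] - prod[1])
    return quot, rem


def gcd_zsqrt2(x, y):
    """A greatest common divisor of x and y, determined up to a unit."""
    while y != (0, 0):
        _, r = divmod_zsqrt2(x, y)
        x, y = y, r
    return x


if __name__ == "__main__":
    print(gcd_zsqrt2((2, 0), (0, 1)))  # gcd of 2 and sqrt(2)
```

Because units such as $1+\sqrt{2}$ have norm $\pm 1$, the result is only defined up to multiplication by a unit; comparing absolute norms is a convenient way to check answers.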

39) Jin-Woo Bryan Oh, Towards Generalizing Thrackles to Arbitrary Graphs (1 January 2014)

In the 1950s, John Conway came up with the notion of thrackles, graphs with embeddings in which no edge crosses itself, but every pair of distinct edges intersects each other exactly once. He conjectured that |E(G)| ≤ |V(G)| for any thrackle G, a question unsolved to this day. In this paper, we discuss some of the known properties of thrackles and contribute a few new ones.

Only a few sparse graphs can be thrackles, and so it is of interest to find an analogous notion that applies to denser graphs as well. In this paper we introduce a generalized version of thrackles called near-thrackles, and prove some of their properties. We also discuss a large number of conjectures about them which seem very obvious but nonetheless are hard to prove. In the final section, we introduce thrackleability, a number between 0 and 1 that turns out to be an accurate measure of how far away a graph is from being a thrackle.

38) Junho Won, Lower bounds for the Crossing Number of the Cartesian Product of a Vertex-transitive Graph with a Cycle (1 January 2014)

The minimum number of crossings for all drawings of a given graph $G$ on a plane is called its crossing number, denoted $cr(G)$. Exact crossing numbers are known only for a few families of graphs, and even the crossing number of a complete graph $K_m$ is not known for all $m$. Wenping et al. showed that $cr(K_m\Box C_n)\geqslant n\cdot cr(K_{m+2})$ for $n\geqslant 4$ and $m\geqslant 4$. We adopt their method to find a lower bound for $cr(G\Box C_n)$ where $G$ is a vertex-transitive graph of degree at least 3. We also suggest some particular vertex-transitive graphs of interest, and give two corollaries that give lower bounds for $cr(G\Box C_n)$ in terms of $n$, $cr(G)$, the number of vertices of $G$, and the degree of $G$, which improve on Wenping et al.'s result.

37) Ying Gao, On an Extension of Stanley Depth for Refinement-Ordered Posets (30 December 2013)

The concept of Stanley depth was originally defined for graded modules over commutative rings in 1982 by Richard P. Stanley. However, in 2009 Herzog, Vladoiu, and Zheng found a property, ndepth, of posets analogous to the Stanley depths of certain modules, which provides an important link between combinatorics and commutative algebra. Due to this link, there arises the question of what this ndepth is for certain classes of posets.

Because ndepth was only recently defined, much remains to be discovered about it. In 2009, Biró, Howard, Keller, Trotter and Young found a lower bound for the ndepth of the poset of nonempty subsets of $\{1, 2, \ldots, n\}$ ordered by inclusion. In 2010, Wang calculated the ndepth of the product of chains $n^k \setminus 0$. However, ndepth has yet to be studied in relation to many other commonly found classes of posets. We chose to research the properties of the ndepths of one such well-known class of posets: the posets which consist of non-empty partitions of sets ordered by refinement, which we denote as $G_i$.

We use combinatorial and algebraic methods to find the ndepths for small posets in $G_i$. We show that for posets of increasing size in $G_i$, ndepth is non-decreasing, and furthermore we show that ndepth[$G_i$] ≥ [8i/29] for all $i$. We also find that for all $i$, ndepth[$G_i$] ≤ $i$ through the proof that ndepth[$G_{i+1}$] ≤ ndepth[$G_i$] + 1.

36) Nihal Gowravaram, Enumeration of Subclasses of (2+2)-free Partially Ordered Sets (26 December 2013)

We investigate avoidance in (2+2)-free partially ordered sets, posets that do not contain any induced subposet isomorphic to the union of two disjoint chains of length two. In particular, we are interested in enumerating the number of partially ordered sets of size N avoiding both 2+2 and some other poset α. For any α of size 3, the results are already well-known. However, out of the 15 such α of size 4, only 2 were previously known. Through the course of this paper, we explicitly enumerate 7 other such α of size 4. Also, we consider the avoidance of three posets simultaneously, 2+2 along with some pair (α,β); it turns out that this enumeration is often clean, and has sometimes surprising results. Furthermore, we turn to the question of Wilf-equivalences in (2+2)-free posets. We show such an equivalence between the Y-shaped and chain posets of size 4 via a direct bijection, and in fact, we extend this to show a Wilf-equivalence between the general chain poset and a general Y-shaped poset of the same size. In this paper, while our focus is on enumeration, we also seek to develop an understanding of the structures of the posets in the subclasses we are studying.

35) Yael Fregier (MIT) and Isaac Xia, Lower Central Series Ideal Quotients Over $\mathbb{F}_p$ and $\mathbb{Z}$ (17 November 2013; arXiv.org, 28 Jun 2015)

Given a graded associative algebra $A$, its lower central series is defined by $L_1 = A$ and $L_{i+1} = [L_i, A]$. We consider successive quotients $N_i(A) = M_i(A) / M_{i+1}(A)$, where $M_i(A) = AL_i(A) A$. These quotients are direct sums of graded components. Our purpose is to describe the $\mathbb{Z}$-module structure of the components; i.e., their free and torsion parts. Following computer exploration using MAGMA, two main cases are studied. The first considers $A = A_n / (f_1,\dots, f_m)$, with $A_n$ the free algebra on $n$ generators $\{x_1, \ldots, x_n\}$ over a field of characteristic $p$. The relations $f_i$ are noncommutative polynomials in $x_j^{p^{n_j}},$ for some integers $n_j$. For primes $p > 2$, we prove that $p^{\sum n_j} \mid \text{dim}(N_i(A))$. Moreover, we determine polynomials dividing the Hilbert series of each $N_i(A)$. The second concerns $A = \mathbb{Z} \langle x_1, x_2 \rangle / (x_1^m, x_2^n)$. For $i = 2,3$, the bigraded structure of $N_i(A_2)$ is completely described.

34) Steven Homberg, Finding Enrichments of Functional Annotations for Disease-Associated Single-Nucleotide Polymorphisms (10 November 2013)

Computational analysis of SNP-disease associations from GWAS as well as functional annotations of the genome enables the calculation of a SNP set's enrichment for a disease. These enrichments are calculated with a variety of statistical techniques, but there is no standard statistical method for calculating them. Several entirely different tests are used by different investigators in the field. These tests can also be conducted with several variations in parameters, which also lack a standard. In our investigation, we develop a computational tool for conducting various enrichment calculations and, using breast cancer-associated SNPs from a GWAS catalog as a foreground against all GWAS SNPs as a background, test the tool and analyze the relative performance of the various tests. The computational tool will soon be released to the scientific community as a part of the Bioconductor package. Our analysis shows that, for the $r^2$ threshold in LD block construction, values around 0.8-0.9 are preferable to laxer or stricter thresholds. We find that block-matching tests yield better results than peak-shifting tests. Finally, we find that, in block-matching tests, block tallying using binary scoring, which notes only whether or not a block has an annotation, yields the most meaningful results, while weighting by the LD $r^2$ threshold has no influence.

33) Kavish Gandhi, Noah Golowich, and László Miklós Lovász, Degree of Regularity of Linear Homogeneous Equations (arXiv.org, 27 Sept 2013), published in Journal of Combinatorics 5:2 (2014)

We define a linear homogeneous equation to be strongly r-regular if, when a finite number of inequalities is added to the equation, the system of the equation and inequalities is still r-regular. In this paper, we derive a constraint on the coefficients of a linear homogeneous equation that gives a sufficient condition for the equation to be strongly r-regular. In 2009, Alexeev and Tsimerman introduced a family of equations, each of which is (n-1)-regular but not n-regular, verifying a conjecture of Rado from 1933. We show that these equations are actually strongly (n-1)-regular as a corollary of our results.

32) Leigh Marie Braswell and Tanya Khovanova, On the Cookie Monster Problem (arXiv.org, 23 Sept 2013), published in Jennifer Beineke & Jason Rosenhouse, The Mathematics of Various Entertaining Subjects: Research in Recreational Math (Princeton University Press, 2015).

The Cookie Monster Problem supposes that the Cookie Monster wants to empty a set of jars filled with various numbers of cookies. On each of his moves, he may choose any subset of jars and take the same number of cookies from each of those jars. The Cookie Monster number of a set is the minimum number of moves the Cookie Monster must use to empty all of the jars. This number depends on the initial distribution of cookies in the jars. We discuss bounds of the Cookie Monster number and explicitly find the Cookie Monster number for jars containing cookies in the Fibonacci, Tribonacci, n-nacci, and Super-n-nacci sequences. We also construct sequences of k jars such that their Cookie Monster numbers are asymptotically rk, where r is any real number between 0 and 1 inclusive.
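The definition above can be explored by brute force for small inputs. The Python sketch below relies on a reformulation that is not stated in the abstract and should be read as an assumption of this example: a set of jars can be emptied in $m$ moves exactly when there are $m$ positive amounts such that every jar size is a sum of some subset of them (each jar joins precisely the moves in its subset). The function names are illustrative, and the search is exponential, so it is only suitable for a handful of small jars.

```python
from itertools import combinations_with_replacement


def subset_sums(amounts):
    """All sums obtainable from sub-multisets of the given move amounts."""
    sums = {0}
    for a in amounts:
        sums |= {s + a for s in sums}
    return sums


def cookie_monster_number(jars):
    """Minimum number of moves needed to empty all jars (brute-force search)."""
    targets = {j for j in jars if j > 0}  # equal jars can always be treated together
    if not targets:
        return 0
    top = max(targets)
    m = 1
    while True:
        # Try every multiset of m move amounts between 1 and the largest jar.
        for amounts in combinations_with_replacement(range(1, top + 1), m):
            if targets <= subset_sums(amounts):
                return m
        m += 1


if __name__ == "__main__":
    print(cookie_monster_number([1, 2, 3, 5]))  # first few Fibonacci-sized jars
```

The loop always terminates, because taking the distinct jar sizes themselves as the move amounts works with one move per distinct size.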

31) Vahid Fazel-Rezai, Equivalence Classes of Permutations Modulo Replacements Between 123 and Two-Integer Patterns (arXiv.org, 18 Sept 2013), published in The Electronic Journal of Combinatorics 21:2 (2014)

We explore a new type of replacement of patterns in permutations, suggested by James Propp, that does not preserve the length of permutations. In particular, we focus on replacements between 123 and a pattern of two integer elements. We apply these replacements in the classical sense; that is, the elements being replaced need not be adjacent in position or value. Given each replacement, the set of all permutations is partitioned into equivalence classes consisting of permutations reachable from one another through a series of bi-directional replacements. We break the eighteen replacements of interest into four categories by the structure of their classes and fully characterize all of their classes.

30) Jesse Geneson (MIT), Rohil Prasad (PRIMES), and Jonathan Tidor (PRIMES), Bounding sequence extremal functions with formations (arXiv.org, 17 Aug 2013), published in The Electronic Journal of Combinatorics 21:3 (2014)

An $(r, s)$-formation is a concatenation of $s$ permutations of $r$ letters. If $u$ is a sequence with $r$ distinct letters, then let $\mathit{Ex}(u, n)$ be the maximum length of any $r$-sparse sequence with $n$ distinct letters which has no subsequence isomorphic to $u$. For every sequence $u$ define $\mathit{fw}(u)$, the formation width of $u$, to be the minimum $s$ for which there exists $r$ such that there is a subsequence isomorphic to $u$ in every $(r, s)$-formation. We use $\mathit{fw}(u)$ to prove upper bounds on $\mathit{Ex}(u, n)$ for sequences $u$ such that $u$ contains an alternation with the same formation width as $u$.
We generalize Nivasch's bounds on $\mathit{Ex}((ab)^{t}, n)$ by showing that $\mathit{fw}((12 \ldots l)^{t})=2t-1$ and $\mathit{Ex}((12\ldots l)^{t}, n) =n2^{\frac{1}{(t-2)!}\alpha(n)^{t-2}\pm O(\alpha(n)^{t-3})}$ for every $l \geq 2$ and $t\geq 3$, such that $\alpha(n)$ denotes the inverse Ackermann function. Upper bounds on $\mathit{Ex}((12 \ldots l)^{t} , n)$ have been used in other papers to bound the maximum number of edges in $k$-quasiplanar graphs on $n$ vertices with no pair of edges intersecting in more than $O(1)$ points.
If $u$ is any sequence of the form $a v a v' a$ such that $a$ is a letter, $v$ is a nonempty sequence excluding $a$ with no repeated letters and $v'$ is obtained from $v$ by only moving the first letter of $v$ to another place in $v$, then we show that $\mathit{fw}(u)=4$ and $\mathit{Ex}(u, n) =\Theta(n\alpha(n))$. Furthermore we prove that $\mathit{fw}(abc(acb)^{t})=2t+1$ and $\mathit{Ex}(abc(acb)^{t}, n) = n2^{\frac{1}{(t-1)!}\alpha(n)^{t-1}\pm O(\alpha(n)^{t-2})}$ for every $t\geq 2$.

29) Jesse Geneson (MIT), Tanya Khovanova (MIT), and Jonathan Tidor (PRIMES), Convex geometric (k+2)-quasiplanar representations of semi-bar k-visibility graphs (arXiv.org, 3 Jul 2013), published in Discrete Mathematics 331 (2014)

We examine semi-bar visibility graphs in the plane and on a cylinder in which sightlines can pass through k objects. We show every semi-bar k-visibility graph has a (k+2)-quasiplanar representation in the plane with vertices drawn as points in convex position and edges drawn as segments. We also show that the graphs having cylindrical semi-bar k-visibility representations with semi-bars of different lengths are the same as the (2k+2)-degenerate graphs having edge-maximal (k+2)-quasiplanar representations in the plane with vertices drawn as points in convex position and edges drawn as segments.

28) Leigh Marie Braswell and Tanya Khovanova, Cookie Monster Devours Naccis (arXiv.org, 18 May 2013), published in the College Mathematics Journal 45:2 (2014)

In 2002, Cookie Monster appeared in The Inquisitive Problem Solver. The hungry monster wants to empty a set of jars filled with various numbers of cookies. On each of his moves, he may choose any subset of jars and take the same number of cookies from each of those jars. The Cookie Monster number is the minimum number of moves Cookie Monster must use to empty all of the jars. This number depends on the initial distribution of cookies in the jars. We discuss bounds of the Cookie Monster number and explicitly find the Cookie Monster number for Fibonacci, Tribonacci and other nacci sequences.

2012 Research Papers

27) William Kuszmaul and Ziling Zhou, Equivalence classes in $S_n$ for three families of pattern-replacement relations (arXiv.org, 20 April 2013)

We study a family of equivalence relations in $S_n$, the group of permutations on $n$ letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of $S_c$. In particular, we are interested in the number of classes created in $S_n$ by each relation and in characterizing these classes. Imposing the condition that the partition of $S_c$ has one nontrivial part containing the cyclic shifts of a single permutation, we find enumerations for the number of nontrivial classes. When the permutation is the identity, we are able to compare the sizes of these classes and connect parts of the problem to Young tableaux and Catalan lattice paths. Imposing the condition that the partition has one nontrivial part containing all of the permutations in $S_c$ beginning with 1, we both enumerate and characterize the classes in $S_n$. We do the same for the partition that has two nontrivial parts, one containing all of the permutations in $S_c$ beginning with 1, and one containing all of the permutations in $S_c$ ending with 1.

26) William Kuszmaul, Counting permutations modulo pattern-replacement equivalences for three-letter patterns (arXiv.org, 20 April 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We study a family of equivalence relations in $S_n$, the group of permutations on $n$ letters, created in a manner similar to that of the Knuth relation and the forgotten relation. For our purposes, two permutations are in the same equivalence class if one can be reached from the other through a series of pattern-replacements using patterns whose order permutations are in the same part of a predetermined partition of $S_c$. When the partition is of $S_3$ and has one nontrivial part of size greater than two, we provide formulas for the number of classes created in all unresolved cases. When the partition is of $S_3$ and has two nontrivial parts, each of size two (as do the Knuth and forgotten relations), we enumerate the classes for 13 of the 14 unresolved cases. In two of these cases, enumerations arise which are the same as those yielded by the Knuth and forgotten relations. The reasons for this phenomenon are still largely a mystery.

25) Tanya Khovanova and Ziv Scully, Efficient Calculation of Determinants of Symbolic Matrices with Many Variables (arXiv.org, 13 April 2013)

Efficient matrix determinant calculations have been studied since the 19th century. Computers expand the range of determinants that are practically calculable to include matrices with symbolic entries. However, the fastest determinant algorithms for numerical matrices are often not the fastest for symbolic matrices with many variables. We compare the performance of two algorithms, fraction-free Gaussian elimination and minor expansion, on symbolic matrices with many variables. We show that, under a simplified theoretical model, minor expansion is faster in most situations. We then propose optimizations for minor expansion and demonstrate their effectiveness with empirical data.
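For readers unfamiliar with the two algorithms being compared, here is a minimal Python sketch of each. It works on integer matrices as a stand-in for symbolic entries (an assumption of this example, chosen to avoid a computer-algebra dependency), and it does not include the optimizations proposed in the paper. `det_bareiss` is the Bareiss form of fraction-free Gaussian elimination, where every division is exact; `det_minors` is minor expansion along rows with memoized sub-determinants. Both function names are illustrative.

```python
from functools import lru_cache


def det_bareiss(matrix):
    """Determinant by fraction-free (Bareiss) Gaussian elimination."""
    m = [list(row) for row in matrix]
    n = len(m)
    if n == 0:
        return 1
    sign, prev = 1, 1
    for k in range(n - 1):
        if m[k][k] == 0:
            # Swap in a lower row with a nonzero pivot, if one exists.
            swap = next((i for i in range(k + 1, n) if m[i][k] != 0), None)
            if swap is None:
                return 0
            m[k], m[swap] = m[swap], m[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is always exact.
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
        prev = m[k][k]
    return sign * m[n - 1][n - 1]


def det_minors(matrix):
    """Determinant by minor expansion along rows, caching repeated minors."""
    n = len(matrix)

    @lru_cache(maxsize=None)
    def minor(row, cols):
        # Determinant of the submatrix on rows row..n-1 and the columns in cols.
        if row == n:
            return 1
        total = 0
        for idx, c in enumerate(cols):
            entry = matrix[row][c]
            if entry != 0:  # zero entries contribute nothing
                total += (-1) ** idx * entry * minor(row + 1, cols[:idx] + cols[idx + 1:])
        return total

    return minor(0, tuple(range(n)))


if __name__ == "__main__":
    a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
    print(det_bareiss(a), det_minors(a))
```

With symbolic entries the trade-off changes, because intermediate expressions in elimination can grow large while minor expansion only multiplies and adds original entries; the integer version here only illustrates the control flow.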

24) Michael Zanger-Tishler and Saarik Kalia, On the Winning and Losing Parameters of Schmidt's Game (8 April 2013)

First introduced by Wolfgang Schmidt, the (α, β)-game and its modifications have been shown to be a powerful tool in Diophantine approximation, metric number theory, and dynamical systems. However, natural questions about the winning-losing parameters of most sets have not been studied thoroughly even after more than 40 years. There are a few results in the literature showing that some non-trivial points and small regions are winning or losing, but complete pictures remain largely unknown. Our main goal in this paper is to provide as much detail as possible about the global pictures of winning-losing parameters for some interesting families of sets.

23) Sheela Devadas and Steven Sam, Representations of Cherednik algebras of G(m, r, n) in positive characteristic (arXiv.org, 3 April 2013), published in Journal of Commutative Algebra (Winter 2014): 525-559

We study lowest-weight irreducible representations of rational Cherednik algebras attached to the complex reflection groups G(m, r, n) in characteristic p. Our approach is mostly from the perspective of commutative algebra. By studying the kernel of the contravariant bilinear form on Verma modules, we obtain formulas for Hilbert series of irreducible representations in a number of cases, and present conjectures in other cases. We observe that the form of the Hilbert series of the irreducible representations and the generators of the kernel tend to be determined by the value of n modulo p, and are related to special classes of subspace arrangements. Perhaps the most novel (conjectural) discovery from the commutative algebra perspective is that the kernel can be given the structure of a "matrix regular sequence" in some instances, which we prove in some small cases.

22) Christina Chen and Nan Li, Apollonian Equilateral Triangles (arXiv.org, 1 March 2013)

Given an equilateral triangle with $a$ the square of its side length and a point in its plane with $b$, $c$, $d$ the squares of the distances from the point to the vertices of the triangle, it can be computed that $a$, $b$, $c$, $d$ satisfy $3(a^2 + b^2 + c^2 + d^2) = (a + b + c + d)^2$. This paper derives properties of quadruples of nonnegative integers $(a, b, c, d)$, called triangle quadruples, satisfying this equation. It is easy to verify that the operation generating $(a, b, c, a + b + c - d)$ from $(a, b, c, d)$ preserves this feature and that it and analogous ones for the other elements can be represented by four matrices. We examine in detail the triangle group, the group with these operations as generators, and completely classify the orbits of quadruples with respect to the triangle group action. We also compute the number of triangle quadruples generated after a certain number of operations and approximate the number of quadruples bounded by characteristics such as the maximal element. Finally, we prove that the triangle group is a hyperbolic Coxeter group and derive information about the elements of triangle quadruples by invoking Lie groups. We also generalize the problem to higher dimensions.
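The invariance claimed above can be checked numerically. Viewed as a quadratic in $d$, the defining equation becomes $d^2 - (a+b+c)\,d + \tfrac{1}{2}\bigl(3(a^2+b^2+c^2) - (a+b+c)^2\bigr) = 0$, so its two roots sum to $a+b+c$, which is why replacing $d$ by $a+b+c-d$ gives another solution. The Python sketch below tests this on a small example; the function names `is_triangle_quadruple` and `flip` are illustrative and not taken from the paper.

```python
def is_triangle_quadruple(q):
    """Check 3(a^2 + b^2 + c^2 + d^2) == (a + b + c + d)^2 for q = (a, b, c, d)."""
    return 3 * sum(x * x for x in q) == sum(q) ** 2


def flip(q, i):
    """Replace entry i of the quadruple by (sum of the other three) minus entry i."""
    others = sum(q) - q[i]
    return q[:i] + (others - q[i],) + q[i + 1:]


if __name__ == "__main__":
    # Side length 1 with the point placed at a vertex: squared distances 1, 1, 0.
    start = (1, 1, 1, 0)
    print(is_triangle_quadruple(start))
    step = flip(start, 3)   # (1, 1, 1, 3)
    print(step, is_triangle_quadruple(step))
    print(flip(step, 0), is_triangle_quadruple(flip(step, 0)))
```

Each flip is an involution, which is consistent with the operations generating a Coxeter group as described in the abstract.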

21) Dhroova Aiylam, Modified Stern-Brocot sequences (arXiv.org, 29 January 2013), published in Integers: Electronic Journal of Combinatorics and Number Theory 17 (2017)

We present the classical Stern-Brocot tree and provide a new proof of the fact that every rational number between 0 and 1 appears in the tree. We then generalize the Stern-Brocot tree to allow for arbitrary choice of starting terms, and prove that in all cases the tree maintains the property that every rational number between the two starting terms appears exactly once.
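The mediant construction behind the tree is easy to experiment with. The sketch below builds successive rows by inserting the mediant $(p_1+p_2)/(q_1+q_2)$ between each pair of neighbours, starting from two arbitrary terms. Two points are assumptions of this example rather than details confirmed by the abstract: fractions are kept in lowest terms at every step (Python's `Fraction` reduces automatically), and the output is organized as cumulative rows rather than as a tree.

```python
from fractions import Fraction


def mediant(p, q):
    """Mediant of two fractions, computed from their reduced forms."""
    return Fraction(p.numerator + q.numerator, p.denominator + q.denominator)


def stern_brocot_rows(left, right, depth):
    """Rows 0..depth of the mediant construction starting from [left, right]."""
    row = [left, right]
    rows = [row]
    for _ in range(depth):
        nxt = [row[0]]
        for p, q in zip(row, row[1:]):
            nxt.extend([mediant(p, q), q])  # insert the mediant between neighbours
        row = nxt
        rows.append(row)
    return rows


if __name__ == "__main__":
    for r in stern_brocot_rows(Fraction(0), Fraction(1), 3):
        print(" ".join(str(x) for x in r))
```

Changing the starting pair, for example to `Fraction(1, 3)` and `Fraction(3, 4)`, gives a quick way to explore the generalized sequences empirically.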

20) Nihal Gowravaram and Ravi Jagadeesan, Beyond alternating permutations: Pattern avoidance in Young diagrams and tableaux (arXiv.org, 28 January 2013), published in the Electronic Journal of Combinatorics 20:4 (2013)

We investigate pattern avoidance in alternating permutations and generalizations thereof. First, we study pattern avoidance in an alternating analogue of Young diagrams. In particular, we extend Babson-West's notion of shape-Wilf equivalence to apply to alternating permutations and so generalize results of Backelin-West-Xin and Ouchterlony to alternating permutations. Second, we study pattern avoidance in the more general context of permutations with restricted ascents and descents. We consider a question of Lewis regarding permutations that are the reading words of thickened staircase Young tableaux, that is, permutations that have (k - 1) ascents followed by a descent, followed by (k - 1) ascents, et cetera. We determine the relative sizes of the sets of pattern-avoiding (k - 1)-ascent permutations in terms of the forbidden pattern. Furthermore, we give inequalities in the sizes of sets of pattern-avoiding permutations in this context that arise from further extensions of shape-equivalence type enumerations.

19) Rohil Prasad and Jonathan Tidor, Optimal Results in Staged Self-Assembly of Wang Tiles (22 January 2013)

The subject of self-assembly deals with the spontaneous creation of ordered systems from simple units and is most often applied in the field of nanotechnology. The self-assembly model of Winfree describes the assembly of Wang tiles, simulating assembly in real-world systems. We use an extension of this model, known as the staged self-assembly model introduced by Demaine et al. that allows for discrete steps to be implemented and permits more diverse constructions. Under this model, we resolve the problem of constructing segments, creating a method to produce them optimally. Generalizing this construction to squares gives a new flexible method for their construction. Changing a parameter of the model, we explore much simpler constructions of complex monotone shapes. Finally, we present an optimal method to build most arbitrary shapes.

18) Aaron Klein, On Rank Functions of Graphs (6 January 2013)

We study rank functions (also known as graph homomorphisms onto Z), ways of imposing graded poset structures on graphs. We first look at a variation on rank functions called discrete Lipschitz functions. We relate the number of Lipschitz functions of a graph G to the number of rank functions of both G and G × E. We then find generating functions that enable us to compute the number of rank or Lipschitz functions of a given graph. We look at a subset of graphs called squarely generated graphs, which are graphs whose cycle space has a basis consisting only of 4-cycles. We show that the number of rank functions of such a graph is proportional to the number of 3-colorings of the same graph, thereby connecting rank functions to the Potts model of statistical mechanics. Lastly, we look at some asymptotics of rank and Lipschitz functions for various types of graphs.

17) Andrew Xia, Integrated Gene Expression Probabilistic Models for Cancer Staging (1 January 2013)

The current system for classifying cancer patients' stages was introduced more than one hundred years ago. With modern advances in technology, many parts of the system have become outdated. Because the current staging system emphasizes surgical procedures that could be harmful to patients, there has been a movement to develop a new taxonomy, using molecular signatures to potentially avoid surgical testing. This project explores the issues of the current classification system and looks for a potentially better way to classify cancer patients’ stages. Computerization has made a vast amount of cancer data available online. However, a significant portion of the data is incomplete; some crucial information is missing. It is logical to attempt to develop a system for recovering missing cancer data. Successful completion of this research would save costs and increase efficiency in cancer research and curing. Using various methods, we have shown that cancer stages cannot be simply extrapolated from incomplete data. Furthermore, a new approach using RNA sequencing data is studied. RNA sequencing can potentially become a cost-efficient way to determine a cancer patient’s stage. We have obtained promising results using RNA sequencing data in breast cancer staging.

16) Surya Bhupatiraju, On the Complexity of the Marginal Satisfiability Problem (18 November 2012)

The marginal satisfiability problem (MSP) asks: Given desired marginal distributions D_S for every subset S of c variable indices from {1, . . . , n}, does there exist a distribution D over n-tuples of values in {1, . . . , m} with those S-marginals D_S? Previous authors have studied MSP in fixed dimensions, and have classified the complexity up to certain upper bounds. However, when using general dimensions, it is known that the size of distributions grows exponentially, making brute force algorithms impractical. This presents an incentive to study more general, tractable variants, which in turn may shed light on the original problem's structure. Thus, our work seeks to explore MSP and its variants for arbitrary dimension, and pinpoint its complexity more precisely. We solve MSP for n = 2 and completely characterize the complexity of three closely related variants of MSP. In particular, we detail novel greedy and stochastic algorithms that handle exponentially-sized data structures in polynomial time, as well as generate accurate representative samples of these structures in polynomial time. These algorithms are also unique in that they represent possible protocols in data compression for communication purposes. Finally, we posit conjectures related to more generalized MSP variants, as well as the original MSP.

15) Fengning Ding and Aleksander Tsymbaliuk, Representations of Infinitesimal Cherednik Algebras (arXiv.org, 17 October 2012), published in Representation Theory 17 (2013)

Infinitesimal Cherednik algebras, first introduced by Etingof, Gan, and Ginzburg (2005), are continuous analogues of rational Cherednik algebras, and in the case of gl_n, are deformations of universal enveloping algebras of the Lie algebras sl_{n+1}. Despite these connections, infinitesimal Cherednik algebras are not widely studied, and basic questions of intrinsic algebraic and representation theoretical nature remain open. In the first half of this paper, we construct the complete center of H_ζ(gl_n) for the case of n = 2 and give one particular generator of the center, the Casimir operator, for general n. We find the action of this Casimir operator on the highest weight modules to prove the formula for the Shapovalov determinant, providing a criterion for the irreducibility of Verma modules. We classify all irreducible finite dimensional representations and compute their characters. In the second half, we investigate Poisson-analogues of the infinitesimal Cherednik algebras and use them to gain insight on the center of H_ζ(gl_n). Finally, we investigate H_ζ(sp_{2n}) and extend various results from the theory of H_ζ(gl_n), such as a generalization of Kostant's theorem.

14) Tanya Khovanova and Dai Yang, Halving Lines and Their Underlying Graphs (arXiv.org, 17 October 2012), published in Involve 11:1 (2018): 1–11

In this paper we study halving-edges graphs corresponding to a set of halving lines. In particular, we study the vertex degrees, paths, cycles and cliques of such graphs. In doing so, we study a vertex partition of such a graph into parts called chains, which have interesting properties.

2011 Research Papers

13) Carl Lian, Representations of Cherednik Algebras Associated to Complex Reflection Groups in Positive Characteristic (arXiv.org, 1 July 2012)

We consider irreducible lowest-weight representations of Cherednik algebras associated to certain classes of complex reflection groups in characteristic p. In particular, we study maximal submodules of Verma modules associated to these algebras. Various results and conjectures are presented concerning generators of these maximal submodules, which are found by computing singular polynomials of Dunkl operators. This work represents progress toward the general problem of determining Hilbert series of irreducible lowest-weight representations of arbitrary Cherednik algebras in characteristic p.

12) Aaron Klein, Joel Brewster Lewis, and Alejandro Morales, Counting matrices over finite fields with support on skew Young and Rothe diagrams (arXiv.org, 26 March 2012); published in the Journal of Algebraic Combinatorics (May 2013)

We consider the problem of finding the number of matrices over a finite field with a certain rank and with support that avoids a subset of the entries. These matrices are a q-analogue of permutations with restricted positions (i.e., rook placements). For general sets of entries these numbers of matrices are not polynomials in q (Stembridge 98); however, when the set of entries is a Young diagram, the numbers, up to a power of q-1, are polynomials with nonnegative coefficients (Haglund 98). In this paper, we give a number of conditions under which these numbers are polynomials in q, or even polynomials with nonnegative integer coefficients. We extend Haglund's result to complements of skew Young diagrams, and we apply this result to the case when the set of entries is the Rothe diagram of a permutation. In particular, we give a necessary and sufficient condition on the permutation for its Rothe diagram to be the complement of a skew Young diagram up to rearrangement of rows and columns. We end by giving conjectures connecting invertible matrices whose support avoids a Rothe diagram and Poincaré polynomials of the strong Bruhat order.

11) Surya Bhupatiraju, Pavel Etingof, David Jordan, William Kuszmaul, and Jason Li, Lower central series of a free associative algebra over the integers and finite fields (arXiv.org, 8 March 2012), published in the Journal of Algebra (December 2012)

Consider the free algebra A_n generated over Q by n generators x_1, ..., x_n. Interesting objects attached to A = A_n are members of its lower central series, L_i = L_i(A), defined inductively by L_1 = A, L_{i+1} = [A,L_{i}], and their associated graded components B_i = B_i(A) defined as B_i=L_i/L_{i+1}. These quotients B_i, for i at least 2, as well as the reduced quotient \bar{B}_1=A/(L_2+A L_3), exhibit a rich geometric structure, as shown by Feigin and Shoikhet and later authors (Dobrovolska-Kim-Ma, Dobrovolska-Etingof, Arbesfeld-Jordan, Bapat-Jordan).
We study the same problem over the integers Z and finite fields F_p. New phenomena arise, namely, torsion in B_i over Z, and jumps in dimension over F_p. We describe the torsion in the reduced quotient \bar{B}_1 and B_2 geometrically in terms of the De Rham cohomology of Z^n. As a corollary we obtain a complete description of \bar{B}_1(A_n(Z)) and \bar{B}_1(A_n(F_p)), as well as of B_2(A_n(Z[1/2])) and B_2(A_n(F_p)), p>2. We also give theoretical and experimental results for B_i with i>2, formulating a number of conjectures and questions based on them. Finally, we discuss the supercase, when some of the generators are odd (fermionic) and some are even (bosonic), and provide some theoretical results and experimental data in this case.

10) David Jordan and Masahiro Namiki, Determinant formulas for the reflection equation algebra (19 Feb 2012)

In this note, we report on work in progress to explicitly describe generators of the center of the reflection equation algebra associated to the quantum GL(N) R-matrix. In particular, we conjecture a formula for the quantum determinant, and for the quadratic central element, both of which involve the excedance statistic on the symmetric group. Current efforts are directed at proving these formulas, and at finding formulas for the remaining central elements.

9) Ziv Scully, Yan Zhang, and Tian-Yi (Damien) Jiang, Firing Patterns in the Parallel Chip-Firing Game (arXiv.org, 29 Nov 2012), published in Discrete Mathematics and Theoretical Computer Science (DMTCS) proc., Nancy, France, 2014

The parallel chip-firing game is an automaton on graphs in which vertices “fire” chips to their neighbors. This simple model, analogous to sandpiles forming and collapsing, contains much emergent complexity and has connections to different areas of mathematics including self-organized criticality and the study of the sandpile group. In this work, we study firing sequences, which describe each vertex’s interaction with its neighbors in this game. Our main contribution is a complete characterization of the periodic firing sequences that can occur in a game, which have a surprisingly simple combinatorial description. We also obtain other results about local behavior of the game after introducing the concept of motors.
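As a rough illustration of the game described above, the following Python sketch simulates one common formulation of parallel chip-firing. The firing rule used here is an assumption, since the abstract does not state it: at every step, each vertex holding at least as many chips as it has neighbors sends one chip to each neighbor, and all such vertices fire simultaneously. The function names `step` and `find_period` and the adjacency-list representation are illustrative choices, not taken from the paper.

```python
def step(adjacency, chips):
    """Run one parallel round; return (new chip counts, which vertices fired).

    Assumed rule: a vertex fires when its chip count is at least its degree.
    """
    fires = [chips[v] >= len(adjacency[v]) for v in range(len(adjacency))]
    new_chips = list(chips)
    for v, neighbors in enumerate(adjacency):
        if fires[v]:
            new_chips[v] -= len(neighbors)   # one chip leaves per neighbor
            for u in neighbors:
                new_chips[u] += 1            # each neighbor receives one chip
    return new_chips, fires


def find_period(adjacency, chips):
    """Return the length of the cycle the game eventually enters.

    The total number of chips is conserved, so only finitely many
    configurations exist and some configuration must repeat.
    """
    seen = {}
    state = tuple(chips)
    t = 0
    while state not in seen:
        seen[state] = t
        state = tuple(step(adjacency, list(state))[0])
        t += 1
    return t - seen[state]


# A 4-cycle 0-1-2-3-0, given as adjacency lists.
cycle4 = [[1, 3], [0, 2], [1, 3], [0, 2]]
```

On `cycle4`, starting from `[2, 0, 2, 0]`, vertices 0 and 2 fire in the first round and vertices 1 and 3 in the second, so the configuration alternates with period 2. The list of booleans returned by `step`, collected over time for a single vertex, is one way to record that vertex's firing sequence.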

8) Sheela Devadas, Lowest-weight representations of Cherednik algebras in positive characteristic (29 Jan 2012)

We study lowest-weight irreducible representations of rational Cherednik algebras attached to the complex reflection groups G(m, r, n) in characteristic p, focusing specifically on the case p ≤ n, which is more complicated than the case p > n. The goal of our work is to calculate characters (and in particular Hilbert series) of these representations. By studying the kernel of the contravariant bilinear form on Verma modules, we proved formulas for Hilbert series of irreducible modules in a number of cases, and also obtained a lot of computer data which suggests a number of conjectures. Specifically, we find that the shape and form of the Hilbert series of the irreducible representations and the generators of the kernel tend to be determined by the value of n modulo p.

7) Christina Chen, Maximizing Volume Ratios for Shadow Covering by Tetrahedra (arXiv.org, 9 Jan 2012)

Define a body A to be able to hide behind a body B if the orthogonal projection of B contains a translation of the corresponding orthogonal projection of A in every direction. In two dimensions, it is easy to observe that there exist two objects such that one can hide behind the other and yet have a larger area. It was recently shown that similar examples exist in higher dimensions as well. However, the highest possible volume ratio for such bodies is still undetermined. We investigate two three-dimensional examples, one involving a tetrahedron and a ball and the other involving a tetrahedron and an inverted tetrahedron. We calculate the highest volume ratio known to date, 1.16, which is generated by our second example.

6) Yongyi Chen, Pavel Etingof, David Jordan, and Michael Zhang, Poisson traces in positive characteristic (arXiv.org, 29 Dec 2011)

We study Poisson traces of the structure algebra A of an affine Poisson variety X defined over a field of characteristic p. According to arXiv:0908.3868v4, the dual space HP_0(A) to the space of Poisson traces arises as the space of coinvariants associated to a certain D-module M(X) on X. If X has finitely many symplectic leaves and the ground field has characteristic zero, then M(X) is holonomic, and thus HP_0(A) is finite dimensional. However, in characteristic p, the dimension of HP_0(A) is typically infinite. Our main results are complete computations of HP_0(A) for sufficiently large p when X is 1) a quasi-homogeneous isolated surface singularity in the three-dimensional space, 2) a quotient singularity V/G, for a symplectic vector space V by a finite subgroup G in Sp(V), and 3) a symmetric power of a symplectic vector space or a Kleinian singularity. In each case, there is a finite nonnegative grading, and we compute explicitly the Hilbert series. The proofs are based on the theory of D-modules in positive characteristic.

5) Saarik Kalia, The Generalizations of the Golden Ratio: Their Powers, Continued Fractions, and Convergents (23 Dec 2011)

The relationship between the golden ratio and continued fractions is well known throughout the mathematical world: the convergents of the continued fraction are the ratios of consecutive Fibonacci numbers. The continued fractions for the powers of the golden ratio also exhibit an interesting relationship with the Lucas numbers. In this paper, we study the silver means and introduce the bronze means, which are generalizations of the golden ratio. We correspondingly introduce the silver and bronze Fibonacci and Lucas numbers, and we prove the relationship between the convergents of the continued fractions of the powers of the silver and bronze means and the silver and bronze Fibonacci and Lucas numbers. We further generalize this to the Lucas constants, a two-parameter generalization of the golden ratio.
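The classical fact mentioned at the start of this abstract can be checked numerically. The sketch below is only an illustration of that known relationship, not of the paper's results on silver and bronze means. It computes convergents of the continued fraction [1; 1, 1, 1, ...] of the golden ratio with the standard recurrence and compares them with ratios of consecutive Fibonacci numbers. The function names and the indexing F_1 = F_2 = 1 are choices made here.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Yield the convergents h_k / k_k of a simple continued fraction.

    Uses the standard recurrence h_k = a_k * h_{k-1} + h_{k-2}
    (and the same for the denominators k_k).
    """
    h_prev, h = 1, partial_quotients[0]
    k_prev, k = 0, 1
    yield Fraction(h, k)
    for a in partial_quotients[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield Fraction(h, k)


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    fibs = [1, 1]
    while len(fibs) < count:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:count]


# The golden ratio has continued fraction [1; 1, 1, 1, ...].
golden_convergents = list(convergents([1] * 10))

fibs = fibonacci(12)
fib_ratios = [Fraction(fibs[i + 1], fibs[i]) for i in range(10)]
```

Here `golden_convergents` and `fib_ratios` are the same list, beginning 1, 2, 3/2, 5/3, 8/5. Passing a different list of partial quotients to `convergents` is one way to experiment with other quadratic irrationals.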

4) Caroline Ellison, The Number of Nonzero Coefficients of Powers of a Polynomial over a Finite Field (15 Nov 2011)

Coefficients of polynomials over finite fields often encode information that can be applied in various areas of science; for instance, computer science and representation theory. The purpose of this project is to investigate these coefficients over the finite field F_p. We find four exact results for the number of nonzero coefficients in special cases of n and p for the polynomial (1 + x + x^2)^n. More importantly, we use Amdeberhan and Stanley's matrices to find what we conjecture to be an approximation for the sum of the number of nonzero coefficients of P(x)^n over F_p. We also relate the number of nonzero coefficients to the number of base p digits of n. These results lead to questions in representation theory and combinatorics.
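The quantity studied here can be computed directly for small cases. The following sketch is a brute-force illustration, not the matrix method of Amdeberhan and Stanley that the abstract refers to. It raises a polynomial to the n-th power with coefficients reduced modulo a prime p and counts the nonzero coefficients. The function names and the coefficient-list representation (lowest degree first) are assumptions made for this example.

```python
def poly_mul_mod(a, b, p):
    """Multiply two polynomials given as coefficient lists, reducing mod p."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            result[i + j] = (result[i + j] + ai * bj) % p
    return result


def nonzero_coefficients(n, p, base=(1, 1, 1)):
    """Count nonzero coefficients of base(x)^n over F_p.

    The default base (1, 1, 1) is the polynomial 1 + x + x^2.
    """
    power = [1]  # the constant polynomial 1
    for _ in range(n):
        power = poly_mul_mod(power, list(base), p)
    return sum(1 for c in power if c != 0)
```

For instance, over F_2 we have (1 + x + x^2)^2 = 1 + x^2 + x^4, so `nonzero_coefficients(2, 2)` is 3, and over F_3 the cube collapses to 1 + x^3 + x^6, so `nonzero_coefficients(3, 3)` is also 3. Repeated multiplication takes time polynomial in n, which is adequate for small experiments but not for large exponents.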

3) Xiaoyu He, On the Classification of Universal Rotor-Routers (arXiv.org, 6 Nov 2011)

The combinatorial theory of rotor-routers has connections with problems of statistical mechanics, graph theory, chaos theory, and computer science. A rotor-router network defines a deterministic walk on a digraph G in which a particle walks from a source vertex until it reaches one of several target vertices. Motivated by recent results due to Giacaglia et al., we study rotor-router networks in which all non-target vertices have the same type. A rotor type r is universal if every hitting sequence can be achieved by a homogeneous rotor-router network consisting entirely of rotors of type r. We give a conjecture that completely classifies universal rotor types. Then, this problem is simplified by a theorem we call the Reduction Theorem that allows us to consider only two-state rotors. A rotor-router network called the compressor, because it tends to shorten rotor periods, is introduced along with an associated algorithm that determines the universality of almost all rotors. New rotor classes, including boppy rotors, balanced rotors, and BURD rotors, are defined to study this algorithm rigorously. Using the compressor, the universality of new rotor classes is proved, and empirical computer results are presented to support our conclusions. Prior to these results, fewer than 100 of the roughly 260,000 possible two-state rotor types of length up to 17 were known to be universal, while the compressor algorithm proves the universality of all but 272 of these rotor types.

2) Yongyi Chen and Michael Zhang, On zeroth Poisson homology in positive characteristic (30 Sept 2011)

A Poisson algebra is a commutative algebra with a Lie bracket {,} satisfying the Leibniz rule. An important invariant of a Poisson algebra A is its zeroth Poisson homology HP_0(A) = A/{A,A}. It characterizes densities on the phase space invariant under all Hamiltonian flows. Also, the dimension of HP_0(A) gives an upper bound for the number of irreducible representations of any quantization of A. We study HP_0(A) when A is the algebra of functions on an isolated quasihomogeneous surface singularity. Over C, it's known that HP_0(A) is the Jacobi ring of the singularity whose dimension is the Milnor number. We generalize this to characteristic p. In this case, HP_0(A) is a finite (although not finite dimensional) module over A^p. We give its conjectural Hilbert series for Kleinian singularities and for cones of smooth projective curves, and prove the conjecture in several cases. (The conjecture has now been proved in general in our follow-up paper with P. Etingof and D. Jordan.)

1) Christina Chen, Tanya Khovanova, and Daniel A. Klain, Volume bounds for shadow covering (arXiv.org, 8 Sep 2011), published in Transactions of the American Mathematical Society 366 (2014)

For n ≥ 2 a construction is given for a large family of compact convex sets K and L in n-dimensional Euclidean space such that the orthogonal projection L_u onto the subspace u^⊥ contains a translate of the corresponding projection K_u for every direction u, while the volumes of K and L satisfy V_n(K) > V_n(L). It is subsequently shown that, if the orthogonal projection L_u onto the subspace u^⊥ contains a translate of K_u for every direction u, then the set (n/(n−1))L contains a translate of K. It follows that V_n(K) ≤ (n/(n−1))^n V_n(L). In particular, we derive a universal constant bound V_n(K) ≤ 2.942 V_n(L), independent of the dimension n of the ambient space. Related results are obtained for projections onto subspaces of some fixed intermediate co-dimension. Open questions and conjectures are also posed.