David Spivak

Research Scientist
Department of Mathematics
MIT

Email: dspivak--math/mit/edu


"Category theory is a universal modeling language."

Background.

Success is founded on information. A tight connection between success (in anything) and information. It follows that we should (if we want to be more successful) study what information is.

Grant proposals. These are several grant proposals, some funded, some in the pipeline, others not funded, that explain various facets of my research project.

Introductory talk (video, slides).

Blog post, on John Baez's blog Azimuth, about my motivations for studying this subject. (Here's a .pdf version.)


Book.

Category Theory for Scientists. This book attempts to show that category theory can be applied throughout the sciences as a framework for modeling observed phenomena and for communicating results. In order to target a scientific audience, the book is example-based rather than proof-based. On the arXiv.


Papers.

Pixel Array Method. Here we give a fast new method for solving systems of equations. The method returns all solutions for a subset of variables in a given bounding box and can be more than 10x faster than other solvers (e.g. quasi-Newton methods).

Algebraic databases. Joint with Patrick Schultz, Christina Vasilakopoulou, Ryan Wisnesky. Here we understand attributes and null values in a database, using algebraic theories. The theoretical contributions include a notion of algebraic profunctor, a development of the collage construction, and a new proarrow equipment of "variable algebras" of a theory.

Functorial Data Migration. This was published in the journal Information and Computation in 2012. In this paper I show some of the power of the database=category idea: a morphism of schemas induces data migration functors that transmit states on one schema to states on the other. This creates such functionality as views, unions, joins, projects, selects, etc. An older version is here, and both are available on the arXiv. A 5-part video series (totalling ~1 hour) on roughly this subject can be found here.

On The Relational Foundations Of Functorial Data Migration. Joint with Ryan Wisnesky. A syntactic version of the above Functorial Data Migration paper, with more details about query composition, equivalences, and containment. On the arXiv.

Simplicial databases. A geometric model of databases. In this model, a "schema" for a database is a labeled simplicial set, whose n-simplicies correspond to tables with n+1 columns. We form a simplicial set by taking the union of such simplices, and doing so reflects the links between objects of differing types. The data itself is then a sheaf on the corresponding site. This category of databases is closed under limits and colimits and these functors correspond to common operations on databases, such as joins, selects, and unions. On the arXiv.
I gave a talk on this subject at the 2008 ATMCS meeting in Paris, and wrote a more brief survey of the ideas here. written out as a .pdf document.
A primitive version of this talk in which I only discuss the category of tables, was given as a colloquium in the University of Oregon computer science department. It can be found here.

Table manipulation in simplicial databases. This short note exemplifies the fact that the geometric structure of simplicial databases has concrete meaning. On the arXiv.

Type theoretical databases. Simplicial databases from the perspective of type theory. Joint with H. Forssell and H.R. Gylterud.

Higher-dimensional models of networks. In this paper, I make a case that many networks involve interactions between more than two entities, and as such should not be modeled by graphs but by simplicial sets. I give a variety of other models and suggest ways to determine which model is most applicable for a given situation, and how to transform models from one type into another. The paper is written mainly for a CS audience familiar with the basics of category theory. It can also be found on the arXiv. A more mathematical paper on networks and information processing is in the works.

Database queries and constraints via lifting problems. (Published by Mathematical structures in computer science.) In this paper I show that database queries and constraints, especially so-called "graph pattern queries", are nicely modeled in terms of the algebro-topological notion of lifting problems (which one encounters, for example, in Quillen's theory of model categories). The lifting problem is the query and the lifts are the solutions. Also available on the arXiv.

Kleisli database instances. Here we use monads to allow non-atomic values in database fields. For example depending on the choice of monad, a column may contain lists, probability distributions, or vectors of entries in some foreign table. As an interesting aside, we show that there is a schema Loop, whose Kleisli T-instances for different monads T completely capture classical structures of interest, such as directed graphs, discrete dynamical systems, Markov chains, multi-graphs, finite state automata, Turing Machines, and square matrices. Also available on the arXiv.

The operad of wiring diagrams. Here I discuss a symmetric colored operad S of wiring diagrams, which can be used as a graphical language for database queries. I also show that the same formalism fits the classical theory of digital circuits, as well as the notion of plug-and-play and recursion. Also available on the arXiv.

The operad of temporal wiring diagrams. Joint with Dylan Rupel. We discuss an operad of wiring diagrams, whose purpose is to formalize the hierarchical nature of processes. We prove that there is an algebra of stream processors. This may have applications to computer hardware design and neuroscience. Also available on the arXiv.

Algebras of open dynamical systems on the operad of wiring diagrams. Joint with Dmitry Vagner and Eugene Lerman. We show that open multivariable dynamical systems form an algebra on an operad of wiring diagrams. Also available on the arXiv.

Nesting of dynamic systems and mode-dependent networks. I show that you can build networks of dynamic systems, in which the dynamics can change the network architecture in real time. Also available on the arXiv.

Ologs: a category-theoretic foundation for knowledge representation. Joint with Robert E. Kent. In this paper we present an application of basic category theory to the field of knowledge representation. Ontology is the study of what something is, the intrinsic nature of a subject. Ologs, or ontology logs, are a framework for recording the results of such a study by breaking it down into types (objects), aspects (arrows), and facts (commutative diagrams). Functors between ologs can create translating dictionaries between various perspectives on a subject, and set-valued functors amount to instance data or "real world examples" for the system of concepts an olog. This work was published in the open-access journal PLoS ONE in January 2012. It is also available on the arXiv. An older version, with a simplified section on communication, may also be useful.

Category-theoretic analysis of hierarchical protein materials and social networks. Joint with Tristan Giesa, Elizabeth Wood, and Markus Buehler, published (2011) in PLoS ONE. In this paper we use an olog to rigorously compare and contrast the structure of two proteins and their resulting behaviors under strain. We then compare this whole setup with that of a social network, contrived to have the same olog-theoretic description. The examples reviewed here demonstrate that the intrinsic nature of these complex systems, which in particular includes a precise relationship between their structure and their function, can be effectively represented by an olog. This, in turn, allows for comparative studies between disparate materials or fields of application, and result in novel approaches to derive functionality in hierarchical systems. We discuss opportunities and challenges associated with the description of complex biological materials by using ologs as a powerful tool for analysis and design in the context of materiomics, and we present the potential impact of this approach for engineering, life sciences, and medicine. On the arXiv.

Reoccurring patterns in hierarchical protein materials and music: The power of analogies. (Joint with Tristan Giesa and Markus Buehler.) In this paper, we use ologs to display an analogy between hierarchical protein materials and music. Published (2011) in BioNanoScience.

Category theory based solution for the building block replacement problem in materials design. (Joint with Tristan Giesa and Markus Buehler.) In this paper we show that ologs provide a foundation from which large scale collaborative efforts in materials science can solve the building block replacement problem. Published (2012) in Advanced Engineering Materials.

Materials by design: merging proteins and music. (Joint with Joyce Wong, John McDonald, Micki Taylor-Pinney, David L. Kaplan, and Markus J. Buehler.) In this paper we discuss a way to translate musical insight into engineering insight for building materials. Published (2012) in Nano Today.

Experience Implementing a Performant Category-Theory Library in Coq. (Joint with Jason Gross and Adam Chlipala.) In this paper we discuss some difficulties and interesting approaches to implementing basic categorical constructions (e.g. Kan extensions) in the proof assistant Coq. On the arXiv.

Matriarch: A python library for materials architecture" Joint with Ravi Jagadessan, Tristan Giesa, and Markus Buehler. Published in ACS Biomaterials. The Matriarch software, as well as a user's guide, is available for free online at the Matriarch website.


Notes, talks, and student involvement.

Notes. Unfinished work I've begun. Some appears elsewhere, some is abandoned, some I hope to reinvigorate in the future.

Talks and posters. Some outlines of chalk-talks and slides of beamer-talks I have given on this subject.

Informatics seminar. This is the course webpage for my Informatics seminar, given at the University of Oregon in Winter 2010.

Agent-based complex systems. Here are a scheule and slides from a conference relating to these ideas at IPAM in October 2009.

UROP in Categorical Information Theory. A chance for students to become involved in research.




Creative Commons License
This work by David I. Spivak is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.