# Alan Edelman

• Professor of Applied Mathematics
• Computer Science and AI Laboratories
• Julia Lab Research Group Leader
• Fellow of SIAM, ACM, IEEE, AMS

## Scientific Computing

 Oceananigans.jl: Fast and friendly geophysical fluid dynamics on GPUs Ramadhan, A., et al. (2020, January). The Journal of Open Source Software. DOI: 10.21105/joss.02018. PDFAbstract Abstract: Oceananigans.jl is a fast and friendly software package for the numerical simulation of incompressible, stratified, rotating fluid flows on CPUs and GPUs. Universal Differential Equations for Scientific Machine Learning Rackauckas, C., Ma, Y., Martensen, J., Warner, C., Zubov, K., Supekar, R., ... & Ramadhan, A. (2020). arXiv preprint arXiv:2001.04385. PDFarXivAbstract Abstract: In the context of science, the well-known adage "a picture is worth a thousand words" might well be "a model is worth a thousand datasets." Scientific models, such as Newtonian physics or biological gene regulatory networks, are human-driven simplifications of complex phenomena that serve as surrogates for the countless experiments that validated the models. Recently, machine learning has been able to overcome the inaccuracies of approximate modeling by directly learning the entire set of nonlinear interactions from data. However, without any predetermined structure from the scientific basis behind the problem, machine learning approaches are flexible but data-expensive, requiring large databases of homogeneous labeled training data. A central challenge is reconciling data that is at odds with simplified models without requiring "big data".In this work we develop a new methodology, universal differential equations (UDEs), which augments scientific models with machine-learnable structures for scientifically-based learning. We show how UDEs can be utilized to discover previously unknown governing equations, accurately extrapolate beyond the original data, and accelerate model simulation, all in a time and data-efficient manner. This advance is coupled with open-source software that allows for training UDEs which incorporate physical constraints, delayed interactions, implicitly-defined events, and intrinsic stochasticity in the model. Our examples show how a diverse set of computationally-difficult modeling issues across scientific disciplines, from automatically discovering biological mechanisms to accelerating the training of physics-informed neural networks and large-eddy simulations, can all be transformed into UDE training problems that are efficiently solved by a single software methodology. Generalized Physics-Informed Learning Through Language-Wide Differentiable Programming Rackauckas C., Edelman A., Fischer K., Innes M., Saba E., Shah V.B., & Tebbutt W. (2020). In Proceedings of the AAAI 2020 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physical Sciences. PDFAbstract Abstract: Scientific computing is increasingly incorporating the advancements in machine learning to allow for data-driven physicsinformed modeling approaches. However, re-targeting existing scientific computing workloads to machine learning frameworks is both costly and limiting, as scientific simulations tend to use the full feature set of a general purpose programming language. In this manuscript we develop an infrastructure for incorporating deep learning into existing scientific computing code through Differentiable Programming (∂P). We describe a ∂P system that is able to take gradients of full Julia programs, making Automatic Differentiation a first class language feature and compatibility with deep learning pervasive. Our system utilizes the one-language nature of Julia package development to augment the existing package ecosystem with deep learning, supporting almost all language constructs (control flow, recursion, mutation, etc.) while generating highperformance code without requiring any user intervention or refactoring to stage computations. We showcase several examples of physics-informed learning which directly utilizes this extension to existing simulation code: neural surrogate models, machine learning on simulated quantum hardware, and data-driven stochastic dynamical model discovery with neural stochastic differential equations. Signal Enhancement for Magnetic Navigation Challenge Problem Gnadt, A. R., Belarge, J., Canciani, A., Conger, L., Curro, J., Edelman, A., ... & Rackauckas, C. (2020). arXiv preprint arXiv:2007.12158. PDFarXivAbstract Abstract: Harnessing the magnetic field of the earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field data from the magnetometer on the navigation system encompass the magnetic field from not just the earth, but also from the vehicle on which it is mounted. It is difficult to separate the earth magnetic anomaly field magnitude, which is crucial for navigation, from the total magnetic field magnitude reading from the sensor. The purpose of this challenge problem is to decouple the earth and aircraft magnetic signals in order to derive a clean signal from which to perform magnetic navigation. Baseline testing on the dataset shows that the earth magnetic field can be extracted from the total magnetic field using machine learning (ML). The challenge is to remove the aircraft magnetic field from the total magnetic field using a trained neural network. These challenges offer an opportunity to construct an effective neural network for removing the aircraft magnetic field from the dataset, using an ML algorithm integrated with physics of magnetic navigation. Rapid software prototyping for heterogeneous and distributed platforms Besard, T., Churavy, V., Edelman, E., & De Sutter, B. (2019, June). Advances in Engineering Software, 132, 29-46 Abstract Abstract: The software needs of scientists and engineers are growing and their programs are becoming more compute-heavy and problem-specific. This has led to an influx of non-expert programmers, who need to use and program high-performance computing platforms.With the continued stagnation of single-threaded performance, using hardware accelerators such as GPUs or FPGAs is necessary. Adapting software to these compute platforms is a difficult task, especially for non-expert programmers, leading to applications being unable to take advantage of new hardware or requiring extensive rewrites.We propose a programming model that allows non-experts to benefit from high-performance computing, while enabling expert programmers to take full advantage of the underlying hardware. In this model, programs are generically typed, the location of the data is encoded in the type system, and multiple dispatch is used to select functionality based on the type of the data. This enables rapid prototyping, retargeting and reuse of existing software, while allowing for hardware specific optimization if required.Our approach allows development to happen in one source language enabling domain experts and performance engineers to jointly develop a program, without the overhead, friction, and challenges associated with developing in multiple programming languages for the same project.We demonstrate the viability and the core principles of this programming model in Julia using realistic examples, showing the potential of this approach for rapid prototyping, and its applicability for real-life engineering. We focus on usability for non-expert programmers and demonstrate that the potential of the underlying hardware can be fully exploited. A Differentiable Programming System to Bridge Machine Learning and Scientific Computing Innes, M., Edelman, A., Fischer, K., Rackauckas, C., Saba, E., Shah, V. B., & Tebbutt, W. (2019). CoRR, abs/1907.07587. PDFarXivAbstract Abstract: Scientific computing is increasingly incorporating the advancements in machine learning and the ability to work with large amounts of data. At the same time, machine learning models are becoming increasingly sophisticated and exhibit many features often seen in scientific computing, stressing the capabilities of machine learning frameworks. Just as the disciplines of scientific computing and machine learning have shared common underlying infrastructure in the form of numerical linear algebra, we now have the opportunity to further share new computational infrastructure, and thus ideas, in the form of Differentiable Programming. We describe Zygote, a Differentiable Programming system that is able to take gradients of general program structures. We implement this system in the Julia programming language. Our system supports almost all language constructs (control flow, recursion, mutation, etc.) and compiles high-performance code without requiring any user intervention or refactoring to stage computations. This enables an expressive programming model for deep learning, but more importantly, it enables us to incorporate a large ecosystem of libraries in our models in a straightforward way. We discuss our approach to automatic differentiation, including its support for advanced techniques such as mixed-mode, complex and checkpointed differentiation, and present several examples of differentiating programs. Fast computation of the principal components of genotype matrices in Julia J Chen, A Noack, A Edelman. (2018) PDFarXivAbstract Abstract: Finding the largest few principal components of a matrix of genetic data is a common task in genome-wide association studies (GWASs), both for dimensionality reduction and for identifying unwanted factors of variation. We describe a simple random matrix model for matrices that arise in GWASs, showing that the singular values have a bulk behavior that obeys a Marchenko-Pastur distributed with a handful of large outliers. We also implement Golub-Kahan-Lanczos (GKL) bidiagonalization in the Julia programming language, providing thick restarting and a choice between full and partial reorthogonalization strategies to control numerical roundoff. Our implementation of GKL bidiagonalization is up to 36 times faster than software tools used commonly in genomics data analysis for computing principal components, such as EIGENSOFT and FlashPCA, which use dense LAPACK routines and randomized subspace iteration respectively. On Machine Learning and Programming Languages Innes, M., Karpinski, S., Shah, V., Barber, D., Saito Stenetorp, P. L. E. P. S., Besard, T., ... & Malmaud, J. (2018, February). Association for Computing Machinery (ACM). PDFAbstract Abstract: The complexity of Machine Learning (ML) models and the frameworks people are using to build them has exploded along with ML itself. State-of-the-art models are increasingly programs, with support for programming constructs like loops and recursion, and this brings out many interesting issues in the tools we use to create them — that is, programming languages (PL). This paper1, discusses the necessity for a first class language for machine learning, and what such a language might look like. A more open efficient future for AI development and data science with an introduction to Julia. Edelman, A. (2017). Big Data (Big Data), 2017 IEEE International Conference on (pp. 2-2). IEEE. Julia: A fresh approach to numerical computing Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). SIAM review, 59(1), 65-98. PDFarXivAbstract Abstract: Bridging cultures that have often been distant, Julia combines expertise from the diverse fields of computer science and computational science to create a new approach to numerical computing. Julia is designed to be easy and fast. Julia questions notions generally held as “laws of nature” by practitioners of numerical computing: High-level dynamic programs have to be slow, One must prototype in one language and then rewrite in another language for speed or deployment, and There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts. We introduce the Julia programming language and its design — a dance between specialization and abstraction. Specialization allows for custom treatment. Multiple dispatch, a technique from computer science, picks the right algorithm for the right circumstance. Abstraction, what good computation is really about, recognizes what remains the same after differences are stripped away. Abstractions in mathematics are captured as code through another technique from computer science, generic programming. Julia shows that one can have machine performance without sacrificing human convenience. Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia Amini, A., Horn, B., & Edelman, A. (2016). arXiv preprint arXiv:1612.08825. PDFarXivAbstract Abstract: Convolutions have long been regarded as fundamental to applied mathematics, physics and engineering. Their mathematical elegance allows for common tasks such as numerical differentiation to be computed efficiently on large data sets. Efficient computation of convolutions is critical to artificial intelligence in real-time applications, like machine vision, where convolutions must be continuously and efficiently computed on tens to hundreds of kilobytes per second. In this paper, we explore how convolutions are used in fundamental machine vision applications. We present an accelerated n-dimensional convolution package in the high performance computing language, Julia, and demonstrate its efficacy in solving the time to contact problem for machine vision. Results are measured against synthetically generated videos and quantitatively assessed according to their mean squared error from the ground truth. We achieve over an order of magnitude decrease in compute time and allocated memory for comparable machine vision applications. All code is packaged and integrated into the official Julia Package Manager to be used in various other scenarios. Julia implementation of the dynamic distributed dimensional data model Chen, A., Edelman, A., Kepner, J., Gadepally, V., & Hutchison, D. (2016). High Performance Extreme Computing Conference (HPEC), (pp. 1-7). IEEE. PDFarXivAbstract Abstract: Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative arrays. In this work, we present an implementation of D4M in Julia and describe how it enables and facilitates data analysis. Several experiments showcase scalable performance in our new Julia version as compared to the original Matlab implementation. Circuitscape: modeling landscape connectivity to promote conservation and human health. McRae, B., Shah, V., & Edelman, A. (2016). The Nature Conservancy, 14. PDFAbstract Abstract: Circuitscape is an award-winning connectivity analysis software package designed to model species movement and gene flow across fragmented landscapes, and to identify areas important for connectivity conservation. It uses algorithms from electrical circuit theory to model connectivity, taking advantage of links between circuit and random walk theories. Its ability to incorporate all possible pathways across a landscape simultaneously has made Circuitscape particularly useful for connectivity analysis. Robust benchmarking in noisy environments. Chen, J., Edelman, A., & Revels, J. (2016). PDFarXivAbstract Abstract: We propose a benchmarking strategy that is robust in the presence of timer error, OS jitter and other environmental fluctuations, and is insensitive to the highly nonideal statistics produced by timing measurements. We construct a model that explains how these strongly nonideal statistics can arise from environmental fluctuations, and also justifies our proposed strategy. We implement this strategy in the BenchmarkTools Julia package, where it is used in production continuous integration (CI) pipelines for developing the Julia language and its ecosystem. Julia Introduction Edelman, A. (2015, May). 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW) (pp. 1271-1271). IEEE. Abstract Abstract: Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. The Julia developers believe there are many good reasons to prefer dynamic languages for these applications and do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. Integral geometry for Markov chain Monte Carlo: overcoming the curse of search-subspace dimensionality O Mangoubi, A Edelman. (2015). PDFarXivAbstract Abstract: We introduce a method that uses the Cauchy-Crofton formula and a new curvature formula from integral geometry to reweight the sampling probabilities of Metropolis-within-Gibbs algorithms in order to increase their convergence speed. We consider algorithms that sample from a probability density conditioned on a manifold $\mathcal{M}$. Our method exploits the symmetries of the algorithms' isotropic random search-direction subspaces to analytically average out the variance in the intersection volume caused by the orientation of the search-subspace with respect to the manifold $\mathcal{M}$ it intersects. This variance can grow exponentially with the dimension of the search-subspace, greatly slowing down the algorithm. Eliminating this variance allows us to use search-subspaces of dimensions many times greater than would otherwise be possible, allowing us to sample very rare events that a lower-dimensional search-subspace would be unlikely to intersect. To extend this method to events that are rare for reasons other than their support $\mathcal{M}$ having a lower dimension, we formulate and prove a new theorem in integral geometry that makes use of the curvature form of the Chern-Gauss-Bonnet theorem to reweight sampling probabilities. On the side, we also apply our theorem to obtain new theoretical bounds for the volumes of real algebraic manifolds. Finally, we demonstrate the computational effectiveness and speedup of our method by numerically applying it to the conditional stochastic Airy operator sampling problem in random matrix theory. Julia Language Documentation Bezanson, J., Karpinski, S., Shah, V., & Edelman, A. (2014).The Julia Manual. http://docs. julialang. org/en/release-0.2/manual, 1-261. PDFAbstract Abstract: Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. We believe there are many good reasons to prefer dynamic languages for these applications, and we do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. Because Julia’s compiler is different from the interpreters used for languages like Python or R, you may find that Julia’s performance is unintuitive at first. If you find that something is slow, we highly recommend reading through the Performance Tips (page 279) section before trying anything else. Once you understand how Julia works, it’s easy to write code that’s nearly as fast as C. An Efficient Partitioning Oracle for Bounded-Treewidth Graphs Edelman, A., Hassidim, A., Nguyen, H.N., & Onak, K. (2011). In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (pp. 530-541). Springer Berlin Heidelberg. PDFarXivAbstract Abstract: Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph. However, the oracle of Hassidimet al. makes an exponential number of queries to the input graph to answer every query about the partition. In this paper, we construct an efficient partitioning oracle for graphs with constant treewidth. The oracle makes only O(poly(1/epsilon)) queries to the input graph to answer each query about the partition. Examples of bounded-treewidth graph classes include k-outerplanar graphs for fixed k, series-parallel graphs, cactus graphs, and pseudoforests. Our oracle yields poly(1/epsilon)-time property testing algorithms for membership in these classes of graphs. Another application of the oracle is a poly(1/epsilon)-time algorithm that approximates the maximum matching size, the minimum vertex cover size, and the minimum dominating set size up to an additive epsilon*n in graphs with bounded treewidth. Finally, the oracle can be used to test in poly(1/epsilon) time whether the input bounded-treewidth graph is k-colorable or perfect. Enhancing the acquisition efficiency of fast magnetic resonance imaging via broadband encoding of signal content Mitsouras, D., Zientara, G. P., Edelman, A., & Rybicki, F. J. (2006). Magnetic resonance imaging, 24(9), 1209-1227. PDFAbstract Abstract: Current efficient magnetic resonance imaging (MRI) methods such as parallel-imaging and k−t methods encode MR signals using a set of effective encoding functions other than the Fourier basis. This work revisits the proposition of directly manipulating the set of effective encoding functions at the radiofrequency excitation step in order to increase MRI efficiency. This approach, often termed “broadband encoding,” enables the application of algebraic matrix factorization technologies to extract efficiency by representing and encoding MR signal content in a compacted form. Broadband imaging equivalents of fast multiecho, parallel and k−t MRI are developed and analyzed. The potential of these techniques to increase the time efficiency of data acquisition is experimentally verified on a commercial MRI scanner using simple spin-echo imaging. A three-dimensional gradient-echo dynamic imaging application that demonstrates the potential benefits of this approach compared to the present state of the art for certain applications is also presented. Building Blocks and Excluded Sums Demaine, E. D., Demaine, M. L., Edelman, A., Leiserson, C. E., & Persson, P. O. (2005). SIAM News, 38(1), 1-5. PDF Fast Multipole: It's all about Adding Functions in Finite Precision Edelman, A., & Persson, P. 0. (2004, never published). PDF Non-Fourier Encoded Parallel MRI Using Multiple Receiver Coils Mitsouras, D., Hoge, W. S., Rybicki, F. J., Kyriakos, W. E., Edelman, A., & Zientara, G. P. (2004). Magnetic resonance in medicine, 52(2), 321-328. PDFAbstract Abstract: This paper describes a general theoretical framework that combines non-Fourier (NF) spatially-encoded MRI with multichannel acquisition parallel MRI. The two spatial-encoding mechanisms are physically and analytically separable, which allows NF encoding to be expressed as complementary to the inherent encoding imposed by RF receiver coil sensitivities. Consequently, the number of NF spatial-encoding steps necessary to fully encode an FOV is reduced. Furthermore, by casting the FOV reduction of parallel imaging techniques as a dimensionality reduction of the k-space that is NF-encoded, one can obtain a speed-up of each digital NF spatial excitation in addition to accelerated imaging. Images acquired at speed-up factors of 2× to 8× with a four-element RF receiver coil array demonstrate the utility of this framework and the efficiency afforded by it. A Physically Based Numerical Color Calibration Algorithm for Film Edelman, A., Wang, F., & Rao, A. (2003) PDF A Fast Projected Conjugate Gradient Algorithm for Training Support Vector Machines Wen, T., Edelman, A., & Gorsich, D. (2001, August). In Contemporary mathematics (pp. 245-263). American Mathematical Society. PDFAbstract Abstract: Support Vector Machines (SVMs) are of current interest in the solution of classification problems. However, serious challenges appear in the training problem when the training set is large. Training SVMs involves solving a linearly constrained quadratic programming problem. In this paper, we present a fast and easy-to-implement projected Conjugate Gradient algorithm for solving this quadratic programming problem.Compared with the exiting ones, this algorithm tries to be adaptive to each training problem and each computer's memory hierarchy. Although written in a high-level programming language, numerical experiments show that the performance of its MATLAB implementation is competitive with that of benchmark C/C++ codes such as SVMlight and SvmFu. The parallelism of this algorithm is also discussed in this paper. Nonlinear eigenvalue problems with orthogonality constraints Edelman, A., & Lippert, R. (2000). (section 8.3). Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. and Vorst, H. van der, editors: Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Philidelphia: SIAM. PDF The geometry of algorithms with orthogonality constraints Edelman, A., Arias, T. A., & Smith, S. T. (1998). SIAM journal on Matrix Analysis and Applications, 20(2), 303-353. PDFarXivAbstract Abstract: In this paper we develop new Newton and conjugate gradient algorithms on the Grassmann and Stiefel manifolds. These manifolds represent the constraints that arise in such areas as the symmetric eigenvalue problem, nonlinear eigenvalue problems, electronic structures computations, and signal processing. In addition to the new algorithms, we show how the geometrical framework gives penetrating new insights allowing us to create, understand, and compare algorithms. The theory proposed here provides a taxonomy for numerical linear algebra algorithms that provide a top level mathematical view of previously unrelated algorithms. It is our hope that developers of new algorithms and perturbation theories will benefit from the theory, methods, and examples in this paper. Multiscale Computation with Interpolating Wavelets Lippert, R. A., Arias, T. A., & Edelman, A. (1998). Journal of Computational Physics, 140(2), 278-310. PDFarXivAbstract Abstract: Multiresolution analyses based upon interpolets, interpolating scaling functions introduced by Deslauriers and Dubuc, are particularly well-suited to physical applications because they allow exact recovery of the multiresolution representation of a function from its sample values on a finite set of points in space. We present a detailed study of the application of wavelet concepts to physical problems expressed in such bases. The manuscript describes algorithms for the associated transforms which, for properly constructed grids of variable resolution, compute correctly without having to introduce extra grid points. We demonstrate that for the application of local homogeneous operators in such bases, the non-standard multiply of Beylkin, Coifman and Rokhlin also proceeds exactly for inhomogeneous grids of appropriate form. To obtain less stringent conditions on the grids, we generalize the non-standard multiply so that communication may proceed between non-adjacent levels. The manuscript concludes with timing comparisons against naive algorithms and an illustration of the scale-independence of the convergence rate of the conjugate gradient solution of Poisson's equation using a simple preconditioning, suggesting that this approach leads to an O(n) solution of this equation. The Mathematics of the Pentium Division Bug Edelman, A. (1997). SIAM Review, 39(1), 54-67. PDF On Conjugate Gradient-Like Methods for Eigen-Like Problems Edelman, A., & Smith, S. T. (1996). BIT Numerical Mathematics, 36(3), 494-508. PDFAbstract Abstract: Numerical analysts, physicists, and signal processing engineers have proposed algorithms that might be called conjugate gradient for problems associated with the computation of eigenvalues. There are many variations, mostly one eigenvalue at a time though sometimes block algorithms are proposed. Is there a correct “conjugate gradient” algorithm for the eigenvalue problem? How are the algorithms related amongst themselves and with other related algorithms such as Lanczos, the Newton method, and the Rayleigh quotient? Reflections on the Large Scale Matrix Computations in Lattice QCD Conference Brower, R., Demmel, J., Edelman, A., & Hockney, G. (1995, never published). This is a little note whipped up after a very nice conference held in Lexington Kentucky PDF Curvature in Conjugate Gradient Eigenvalue Computation with Applications to Materials and Chemistry Calculations Edelman, A., Arias, T. A., & Smith, S. T. (1994). In In Proceedings of the 1994 SIAM Applied Linear Algebra Conference. PDFAbstract Abstract: We illustrate the importance of using curvature information when performing constrained conjugate gradient minimization on manifolds and identify certain common and useful submanifolds in numerical analysis. This minimization approach promises to be useful in materials and chemistry calculations. When is x*(1/x) not equal to 1? Edelman, A. (1994). Never published except for the web -- just for fun. PDF

## Random Eigenvalues etc.

 Random Hyperplanes, Generalized Singular Values & “What’s My β?” A Edelman, Y Wang. (2018) IEEE Statistical Signal Processing Workshop (SSP), 458-462. doi: 10.1109/SSP.2018.8450833 PDFAbstract Abstract: We streamline the treatment of the Jacobi ensemble from random matrix theory by providing a succinct geometric characterization which may be used directly to compute the Jacobi ensemble distribution without unnecessary matrix baggage traditionally seen in the MANOVA formulation. Algebraically the Jacobi ensemble naturally corresponds to the Generalized Singular Value Decomposition from the field of Numerical Linear Algebra. We further provide a clear geometric interpretation for the Selberg constant in front of the distribution which may sensibly be defined even beyond the reals, complexes, and quaternions. On the application side, we propose a new learning problem where one estimates a β that best fits the sample eigenvalues from the Jacobi ensemble. Eigenvalue approximation of sums of Hermitian matrices from eigenvector localization/delocalization Movassagh, R., & Edelman, A. (2017). arXiv preprint arXiv:1710.09400. PDFarXivAbstract Abstract: We propose a technique for calculating and understanding the eigenvalue distribution of sums of random matrices from the known distribution of the summands. The exact problem is formidably hard. One extreme approximation to the true density amounts to classical probability, in which the matrices are assumed to commute; the other extreme is related to free probability, in which the eigenvectors are assumed to be in generic positions and sufficiently large. In practice, free probability theory can give a good approximation of the density. We develop a technique based on eigenvector localization/delocalization that works very well for important problems of interest where free probability is not sufficient, but certain uniformity properties apply. The localization/delocalization property appears in a convex combination parameter that notably, is independent of any eigenvalue properties and yields accurate eigenvalue density approximations. We demonstrate this technique on a number of examples as well as discuss a more general technique when the uniformity properties fail to apply. On Computing Schur Functions and Series Thereof Chan, C., Drensky, V., Edelman, A., Kan, R., & Koev, P. (2017). Lawrence Berkeley National Laboratory, eScholarship.org. PDF Beyond Universality in Random Matrix Theory Edelman, A., Guionnet, A., & Péché, S. (2016). The Annals of Applied Probability, 26(3), 1659-1697. PDFarXivAbstract Abstract: In order to have a better understanding of finite random matrices with non-Gaussian entries, we study the 1/N expansion of local eigenvalue statistics in both the bulk and at the hard edge of the spectrum of random matrices. This gives valuable information about the smallest singular value not seen in universality laws. In particular, we show the dependence on the fourth moment (or the kurtosis) of the entries. This work makes use of the so-called complex deformed GUE and Laguerre ensembles. Beta Jacobi Ensembles and the GSVD Edelman, A. Banff International Research Station for Mathematical Innovation and Discovery (2016) Infinite random matrix theory, tridiagonal bordered Toeplitz matrices, and the moment problem Dubbs, A., & Edelman, A. (2015). Linear Algebra and its Applications, 467, 188-201. PDFAbstract Abstract: The four major asymptotic level density laws of random matrix theory may all be showcased through their Jacobi parameter representation as having a bordered Toeplitz form. We compare and contrast these laws, completing and exploring their representations in one place. Inspired by the bordered Toeplitz form, we propose an algorithm for the finite moment problem by proposing a solution whose density has a bordered Toeplitz form. The singular values of the GUE (less is more) Edelman, A., & La Croix, M. (2015, October). Random Matrices:Theory and Applications, 4(4), Issue 04. PDFarXivAbstract Abstract: Some properties that nominally involve the eigenvalues of Gaussian Unitary Ensemble (GUE) can instead be phrased in terms of singular values. By discarding the signs of the eigenvalues, we gain access to a surprising decomposition: the singular values of the GUE are distributed as the union of the singular values of two independent ensembles of Laguerre type. This independence is remarkable given the well known phenomenon of eigenvalue repulsion. The structure of this decomposition reveals that several existing observations about large n limits of the GUE are in fact manifestations of phenomena that are already present for finite random matrices. We relate the semicircle law to the quarter-circle law by connecting Hermite polynomials to generalized Laguerre polynomials with parameter ±1/2. Similarly, we write the absolute value of the determinant of the n×n GUE as a product n independent random variables to gain new insight into its asymptotic log-normality. The decomposition also provides a description of the distribution of the smallest singular value of the GUE, which in turn permits the study of the leading order behavior of the condition number of GUE matrices. The study is motivated by questions involving the enumeration of orientable maps, and is related to questions involving powers of complex Ginibre matrices. The inescapable conclusion of this work is that the singular values of the GUE play an unpredictably important role that had gone unnoticed for decades even though, in hindsight, so many clues had been around. Condition Numbers of Indefinite Rank 2 Ghost Wishart Matrices Edelman, A., & Movassagh, R. (2015, October). Linear Algebra and its Applications, 483, 342-351. PDFarXivAbstract Abstract: We define an indefinite Wishart matrix as a matrix of the form $A=W^{T}W\Sigma$, where $\Sigma$ is an indefinite diagonal matrix and $W$ is a matrix of independent standard normals. We focus on the case where $W$ is $L$ by 2 which has engineering applications. We obtain the distribution of the ratio of the eigenvalues of $A$. This distribution can be "folded" to give the distribution of the condition number. We calculate formulas for $W$ real $(\beta=1)$, complex $(\beta=2)$, quaternionic $(\beta=4)$ or any ghost 0<$\beta$<$\infty$. We then corroborate our work by comparing them against numerical experiments. Random Matrix Theory, Numerical Computation and Applications Edelman, A., Sutton, B. D., & Wang, Y. (2014). Modern Aspects of Random Matrix Theory, 72, 53. PDF Low-temperature random matrix theory at the soft edge Edelman, A., Persson, P. O., & Sutton, B. D. (2014). Journal of Mathematical Physics, 55(6), 063302. PDFAbstract Abstract: "Low temperature" random matrix theory is the study of random eigenvalues as energy is removed. In standard notation, β is identified with inverse temperature, and low temperatures are achieved through the limit β → ∞. In this paper, we derive statistics for low-temperature random matrices at the "soft edge," which describes the extreme eigenvalues for many random matrix distributions. Specifically, new asymptotics are found for the expected value and standard deviation of the general-β Tracy-Widom distribution. The new techniques utilize beta ensembles, stochastic differential operators, and Riccati diffusions. The asymptotics fit known high-temperature statistics curiously well and contribute to the larger program of general-β random matrix theory. Computing with Beta Ensembles and Hypergeometric Functions Drensky, V., Edelman, A., Genoar, K., Kan, R., & Koev, P. (2014). RMTA. PDF The Beta-MANOVA Ensemble with General Covariance Dubbs, A., & Edelman, A. (2014). Random Matrices: Theory and Applications, 3(01), 1450002. PDFarXivAbstract Abstract: We find the joint generalized singular value distribution and largest generalized singular value distributions of the $\beta$-MANOVA ensemble with positive diagonal covariance, which is general. This has been done for the continuous $\beta > 0$ case for identity covariance (in eigenvalue form), and by setting the covariance to $I$ in our model we get another version. For the diagonal covariance case, it has only been done for $\beta = 1,2,4$ cases (real, complex, and quaternion matrix entries). This is in a way the first second-order $\beta$-ensemble, since the sampler for the generalized singular values of the $\beta$-MANOVA with diagonal covariance calls the sampler for the eigenvalues of the $\beta$-Wishart with diagonal covariance of Forrester and Dubbs-Edelman-Koev-Venkataramana. We use a conjecture of MacDonald proven by Baker and Forrester concerning an integral of a hypergeometric function and a theorem of Kaneko concerning an integral of Jack polynomials to derive our generalized singular value distributions. In addition we use many identities from Forrester's {\it Log-Gases and Random Matrices}. We supply numerical evidence that our theorems are correct. Eigenvalue distributions of Beta-Wishart matrices Edelman, A., & Koev, P. (2014). Random Matrices: Theory and Applications, 3(02), 1450009. PDFAbstract Abstract: We derive explicit expressions for the distributions of the extreme eigenvalues of the beta-Wishart random matrices in terms of the hypergeometric function of a matrix argument. These results generalize the classical results for the real (β = 1), complex (β = 2), and quaternion (β = 4) Wishart matrices to any β > 0. Random Matrix Theory and its Innovative Applications Edelman, A., & Wang, Y. (2013). In Advances in Applied Mathematics, Modeling, and Computational Science (pp. 91-116). Springer US. PDFAbstract Abstract: Recently more and more disciplines of science and engineering have found Random Matrix Theory valuable. Some disciplines use the limiting densities to indicate the cutoff between "noise" and "signal." Other disciplines are finding eigenvalue repulsions a compelling model of reality. This survey introduces both the theory behind these applications and MATLAB experiments allowing a reader immediate access to the ideas The beta-Wishart ensemble Dubbs, A., Edelman, A., Koev, P., & Venkataramana, P. (2013). Journal of Mathematical Physics, 54(8), 083507. PDFarXivAbstract Abstract: This paper proves a matrix model for the Wishart Ensemble with general covariance and general dimension parameter beta. In so doing, we introduce a new and elegant definition of Jack polynomials. Error Analysis of Free Probability Approximations to the Density of States of Disordered Systems Chen, J., Hontz, E., Moix, J., Welborn, M., Van Voorhis, T., Suárez, A., ... & Edelman, A. (2012). Physical review letters, 109(3), 036403. PDFarXivAbstract Abstract: Theoretical studies of localization, anomalous diffusion and ergodicity breaking require solving the electronic structure of disordered systems. We use free probability to approximate the ensemble- averaged density of states without exact diagonalization. We present an error analysis that quantifies the accuracy using a generalized moment expansion, allowing us to distinguish between different approximations. We identify an approximation that is accurate to the eighth moment across all noise strengths, and contrast this with the perturbation theory and isotropic entanglement theory. Partial Freeness of Random Matrices Chen, J., Van Voorhis, T., & Edelman, A. (2012). arXiv preprint arXiv:1204.2257. PDFarXivAbstract Abstract: We investigate the implications of free probability for random matrices. From rules for calculating all possible joint moments of two free random matrices, we develop a notion of partial freeness which is quantified by the breakdown of these rules. We provide a combinatorial interpretation for partial freeness as the presence of closed paths in Hilbert space defined by particular joint moments. We also discuss how asymptotic moment expansions provide an error term on the density of states. We present MATLAB code for the calculation of moments and free cumulants of arbitrary random matrices. Isotropic Entanglement Edelman, A., & Movassagh, R. (2010, revised 2012). Isotropic entanglement. arXiv preprint arXiv:1012.5039. PDFarXivAbstract Abstract: The method of "Isotropic Entanglement" (IE), inspired by Free Probability Theory and Random Matrix Theory, predicts the eigenvalue distribution of quantum many-body (spin) systems with generic interactions. At the heart is a "Slider", which interpolates between two extrema by matching fourth moments.The first extreme treats the non-commuting terms classically and the second treats them isotropically. Isotropic means that the eigenvectors are in generic positions. We prove Matching Three Moments and Slider Theorems and further prove that the interpolation is universal, i.e., independent of the choice of local terms. Our examples show that IE provides an accurate picture well beyond what one expects from the first four moments alone. Random Triangle Theory with Geometry and Applications Edelman, A., & Strang, G. (2012). Foundations of Computational Mathematics, 15(3), 681-713.Accompanying Julia Notebook [html] [ipynb] PDFAbstract Abstract: What is the probability that a random triangle is acute? We explore this old question from a modern viewpoint, taking into account linear algebra, shape theory, numerical analysis, random matrix theory, the Hopf fibration, and much much more. One of the best distributions, all six vertex coordinates being independent standard Gaussians, can be reduced to four by translation of the center to (0, 0) or reformulation as a 2x2 random matrix problem.In this note, we develop shape theory in its historical context for a wide audience. We hope to encourage others to look again (and differently) at triangles.We provide a new constructive proof, using the geometry of parallelians, of a key result of shape theory: that triangle shapes naturally fall on a hemisphere. We give several proofs of uniformity when the normal distribution is transferred to the hemisphere, including a new proof related to the distribution of random condition numbers. Generalizing to higher dimensions, we obtain the “square root ellipticity statistic” of random matrix theory. Another proof connects the Hopf map to the SVD of 2 by 2 matrices. A new theorem decribes three similar triangles hidden in the hemisphere. Many triangle properties are reformulated as matrix theorems, providing insight to both. This paper argues for a shift of viewpoint to the modern approaches of random matrix theory.As one example, we propose that the smallest singular value is an effective test for uniformity. New software is developed and applications are proposed. Density of States of Quantum Spin Systems from Isotropic Entanglement Edelman, A., & Movassagh, R. (2011). Physical review letters, 107(9), 097205. PDFarXivAbstract Abstract: We propose a method which we call "Isotropic Entanglement" (IE), that predicts the eigenvalue distribution of quantum many body (spin) systems (QMBS) with generic interactions. We interpolate between two known approximations by matching fourth moments. Though, such problems can be QMA-complete, our examples show that IE provides an accurate picture of the spectra well beyond what one expects from the first four moments alone. We further show that the interpolation is universal, i.e., independent of the choice of local terms. The Random Matrix Technique of Ghosts and Shadows Edelman, A. (2010). Markov Processes and Related Fields, 16(4), 783-790. PDF Sturm Sequences and Random Eigenvalue distributions Albrecht, J. T., Chan, C. P., & Edelman, A. (2009). Foundations of Computational Mathematics, 9(4), 461-483. PDFAbstract Abstract: This paper proposes that the study of Sturm sequences is invaluable in the numerical computation and theoretical derivation of eigenvalue distributions of random matrix ensembles. We first explore the use of Sturm sequences to efficiently compute histograms of eigenvalues for symmetric tridiagonal matrices and apply these ideas to random matrix ensembles such as the β-Hermite ensemble. Using our techniques, we reduce the time to compute a histogram of the eigenvalues of such a matrix from O(n 2+m) to O(mn) time where n is the dimension of the matrix and m is the number of bins (with arbitrary bin centers and widths) desired in the histogram (m is usually much smaller than n). Second, we derive analytic formulas in terms of iterated multivariate integrals for the eigenvalue distribution and the largest eigenvalue distribution for arbitrary symmetric tridiagonal random matrix models. As an example of the utility of this approach, we give a derivation of both distributions for the β-Hermite random matrix ensemble (for general β). Third, we explore the relationship between the Sturm sequence of a random matrix and its shooting eigenvectors. We show using Sturm sequences that assuming the eigenvector contains no zeros, the number of sign changes in a shooting eigenvector of parameter λ is equal to the number of eigenvalues greater than λ. Finally, we use the techniques presented in the first section to experimentally demonstrate a O(log n) growth relationship between the variance of histogram bin values and the order of the β-Hermite matrix ensemble. Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples Nadakuditi, R. R., & Edelman, A. (2008). Signal Processing, IEEE Transactions on, 56(7), 2625-2638. PDFarXivAbstract Abstract: We present a mathematically justifiable, computationally simple, sample eigenvalue based procedure for estimating the number of high-dimensional signals in white noise using relatively few samples. The main motivation for considering a sample eigenvalue based scheme is the computational simplicity and the robustness to eigenvector modelling errors which are can adversely impact the performance of estimators that exploit information in the sample eigenvectors. There is, however, a price we pay by discarding the information in the sample eigenvectors; we highlight a fundamental asymptotic limit of sample eigenvalue based detection of weak/closely spaced high-dimensional signals from a limited sample size. This motivates our heuristic definition of the effective number of identifiable signals which is equal to the number of "signal" eigenvalues of the population covariance matrix which exceed the noise variance by a factor strictly greater than 1+sqrt(Dimensionality of the system/Sample size). The fundamental asymptotic limit brings into sharp focus why, when there are too few samples available so that the effective number of signals is less than the actual number of signals, underestimation of the model order is unavoidable (in an asymptotic sense) when using any sample eigenvalue based detection scheme, including the one proposed herein. The analysis reveals why adding more sensors can only exacerbate the situation. Numerical simulations are used to demonstrate that the proposed estimator consistently estimates the true number of signals in the dimension fixed, large sample size limit and the effective number of identifiable signals in the large dimension, large sample size limit. Statistical eigen-inference from large Wishart Matrices Rao, N. R., Mingo, J. A., Speicher, R., & Edelman, A. (2008). The Annals of Statistics, 2850-2885. PDFarXivAbstract Abstract: We consider settings where the observations are drawn from a zero-mean multivariate (real or complex) normal distribution with the population covariance matrix having eigenvalues of arbitrary multiplicity. We assume that the eigenvectors of the population covariance matrix are unknown and focus on inferential procedures that are based on the sample eigenvalues alone (i.e., "eigen-inference"). Results found in the literature establish the asymptotic normality of the fluctuation in the trace of powers of the sample covariance matrix. We develop concrete algorithms for analytically computing the limiting quantities and the covariance of the fluctuations. We exploit the asymptotic normality of the trace of powers of the sample covariance matrix to develop eigenvalue-based procedures for testing and estimation. Specifically, we formulate a simple test of hypotheses for the population eigenvalues and a technique for estimating the population eigenvalues in settings where the cumulative distribution function of the (nonrandom) population eigenvalues has a staircase structure. Monte Carlo simulations are used to demonstrate the superiority of the proposed methodologies over classical techniques and the robustness of the proposed techniques in high-dimensional, (relatively) small sample size settings. The improved performance results from the fact that the proposed inference procedures are "global" (in a sense that we describe) and exploit "global" information thereby overcoming the inherent biases that cripple classical inference procedures which are "local" and rely on "local" information. The Polynomial Method for Random Matrices Rao, N. R., & Edelman, A. (2008). Foundations of Computational Mathematics, 8(6), 649-702. PDFarXivAbstract Abstract: We define a class of "algebraic" random matrices. These are random matrices for which the Stieltjes transform of the limiting eigenvalue distribution function is algebraic, i.e., it satisfies a (bivariate) polynomial equation. The Wigner and Wishart matrices whose limiting eigenvalue distributions are given by the semi-circle law and the Marcenko-Pastur law are special cases. Algebraicity of a random matrix sequence is shown to act as a certificate of the computability of the limiting eigenvalue density function. The limiting moments of algebraic random matrix sequences, when they exist, are shown to satisfy a finite depth linear recursion so that they may often be efficiently enumerated in closed form. In this article, we develop the mathematics of the polynomial method which allows us to describe the class of algebraic matrices by its generators and map the constructive approach we employ when proving algebraicity into a software implementation that is available for download in the form of the RMTool random matrix "calculator" package. Our characterization of the closure of algebraic probability distributions under free additive and multiplicative convolution operations allows us to simultaneously establish a framework for computational (non-commutative) "free probability" theory. We hope that the tools developed allow researchers to finally harness the power of the infinite random matrix theory. The beta-Jacobi matrix model, the CS decomposition, and generalized singular value problems Edelman, A., & Sutton, B. D. (2008). Foundations of Computational Mathematics, 8(2), 259-285. PDF From Random Matrices to Stochastic Operators Edelman, A., & Sutton, B. D. (2007). Journal of Statistical Physics, 127(6), 1121-1165. PDFarXivAbstract Abstract: We propose that classical random matrix models are properly viewed as finite difference schemes for stochastic differential operators. Three particular stochastic operators commonly arise, each associated with a familiar class of local eigenvalue behavior. The stochastic Airy operator displays soft edge behavior, associated with the Airy kernel. The stochastic Bessel operator displays hard edge behavior, associated with the Bessel kernel. The article concludes with suggestions for a stochastic sine operator, which would display bulk behavior, associated with the sine kernel. MOPS: Multivariate Orthogonal Polynomials (symbolically) Dumitriu, I., Edelman, A., & Shuman, G. (2007). Journal of symbolic computation, 42(6), 587-620. PDFarXivAbstract Abstract: In this paper we present a Maple library (MOPs) for computing Jack, Hermite, Laguerre, and Jacobi multivariate polynomials, as well as eigenvalue statistics for the Hermite, Laguerre, and Jacobi ensembles of Random Matrix theory. We also compute multivariate hypergeometric functions, and offer both symbolic and numerical evaluations for all these quantities. We prove that all algorithms are well-defined, analyze their complexity, and illustrate their performance in practice. Finally, we also present a few of the possible applications of this library. Sample Size Cognizant Detection of Signals in White Noise Nadakuditi, R. R., & Edelman, A. (2007, June). In Signal Processing Advances in Wireless Communications, 2007. SPAWC 2007. IEEE 8th Workshop on (pp. 1-5). IEEE. PDFarXivAbstract Abstract: The detection and estimation of signals in noisy, limited data is a problem of interest to many scientific and engineering communities. We present a computationally simple, sample eigenvalue based procedure for estimating the number of high-dimensional signals in white noise when there are relatively few samples. We highlight a fundamental asymptotic limit of sample eigenvalue based detection of weak high-dimensional signals from a limited sample size and discuss its implication for the detection of two closely spaced signals. This motivates our heuristic definition of the 'effective number of identifiable signals.' Numerical simulations are used to demonstrate the consistency of the algorithm with respect to the effective number of signals and the superior performance of the algorithm with respect to Wax and Kailath's "asymptotically consistent" MDL based estimator. On the Largest Principal Angle between Random Subspaces Absil, P. A., Edelman, A., & Koev, P. (2006). Linear Algebra and its applications, 414(1), 288-294. PDFAbstract Abstract: Formulas are derived for the probability density function and the probability distribution function of the largest canonical angle between two p -dimensional subspaces of Rn chosen from the uniform distribution on the Grassmann manifold (which is the unique distribution invariant by orthogonal transformations of Rn). The formulas involve the gamma function and the hypergeometric function of a matrix argument. On the Efficient Evaluation of the Hypergeometric Function of a Matrix Argument Koev, P., & Edelman, A. (2006). Mathematics of Computation, 75(254), 833-846. PDFarXivAbstract Abstract: We present new algorithms that efficiently approximate the hypergeometric function of a matrix argument through its expansion as a series of Jack functions. Our algorithms exploit the combinatorial properties of the Jack function, and have complexity that is only linear in the size of the matrix. Global Spectrum Fluctuations for the beta-Hermite and beta-Laguerre ensembles via matrix models Dumitriu, I., & Edelman, A. (2006). Journal of Mathematical Physics, 47(6), 063302. PDFarXivAbstract Abstract: We study the global spectrum fluctuations for beta-Hermite and beta-Laguerre ensembles via the tridiagonal matrix models introduced in [11], and prove that the fluctuations describe a Gaussian process on monomials. We extend our results to slightly larger classes of random matrices. Eigenvalues of Hermite and Laguerre ensembles: Large Beta Asymptotics Dumitriu, I., & Edelman, A. (2005). In Annales de l'IHP Probabilités et statistiques, 41(6), 1083-1099. PDFarXivAbstract Abstract: In this paper we examine the zero and first order eigenvalue fluctuations for the $\beta$-Hermite and $\beta$-Laguerre ensembles, using the matrix models we described in [5], in the limit as $\beta \to \infty$. We find that the fluctuations are described by Gaussians of variance $O(1/\beta)$, centered at the roots of a corresponding Hermite (Laguerre) polynomial. We also show that the approximation is very good, even for small values of $\beta$, by plotting exact level densities versus sum of Gaussians approximations. Tails of Condition Number Distributions Edelman, A., & Sutton, B. D. (2005). SIAM journal on matrix analysis and applications, 27(2), 547-560. PDFAbstract Abstract: Let $\kappa$ be the condition number of an m-by-n matrix with independent standard Gaussian entries, either real ($\beta = 1$) or complex ($\beta = 2$). The major result is the existence of a constant C (depending on m, n, and $\beta$) such that $P[\kappa > x] < C \, x^{-\beta}$ for all x. As $x \rightarrow \infty$, the bound is asymptotically tight. An analytic expression is given for the constant C, and simple estimates are given, one involving a Tracy--Widom largest eigenvalue distribution. All of the results extend beyond real and complex entries to general $\beta$. On the Probability Distribution of the Outputs of the Diagonally Loaded Capon-MVDR Processor Nadakuditi, R. R., & Edelman, A. (2005, October). In Signals, Systems and Computers, 2005. Conference Record of the Thirty-Ninth Asilomar Conference on (pp. 1717-1723). IEEE. Numerical Methods for Eigenvalue Distributions of Random Matrices Edelman, A., & Persson, P. O. (2005). arXiv preprint math-ph/0501068. PDFarXivAbstract Abstract: We present efficient numerical techniques for calculation of eigenvalue distributions of random matrices in the beta-ensembles. We compute histograms using direct simulations on very large matrices, by using tridiagonal matrices with appropriate simplifications. The distributions are also obtained by numerical solution of the Painleve II and V equations with high accuracy. For the spacings we show a technique based on the Prolate matrix and Richardson extrapolation, and we compare the distributions with the zeros of the Riemann zeta function. Matrix Models for Beta Ensembles Dumitriu, I., & Edelman E. (2002). Journal of Mathematical Physics, 43, 5830--5847. PDFarXivAbstract Abstract: This paper constructs tridiagonal random matrix models for general ($\beta>0$) $\beta$-Hermite (Gaussian) and $\beta$-Laguerre (Wishart) ensembles. These generalize the well-known Gaussian and Wishart models for $\beta = 1,2,4$. Furthermore, in the cases of the $\beta$-Laguerre ensembles, we eliminate the exponent quantization present in the previously known models. We further discuss applications for the new matrix models, and present some open problems. On the determinant of a uniformly distributed complex matrix Edelman, A. (1995). Journal of Complexity, 11(3), 352-357. PDF How many zeros of a random polynomial are real? Edelman, A., & Kostlan, E. (1995). Bulletin of the American Mathematical Society, 32(1), 1-37. PDFarXivAbstract Abstract: We give an elementary derivation of the Kac integral formula for the expected number of real roots of a random polynomial with independent standard normally distributed coefficients. We show that the expected number of real roots is the length of the moment curve $(1,t,\ldots,t^n)$ projected onto the surface of the unit sphere, divided by $\pi$. The density of the root is proportional to how fast this curve is traced out. We generalize the Kac formula to polynomials with coefficients that have an arbitrary multivariate normal distribution. We show, for example, that for a particularly nice definition of random polynomial, the expected number of real roots is exactly the square root of the degree.When the random polynomials have an arbitrary density function, we express the expected number of roots as a weighted length of the curve. We also calculate the the distribution of the real zeros of random power series and Fourier series, random sums of orthogonal polynomials, and random Dirichlet series. How many eigenvalues of a random matrix are real? Edelman, A., Kostlan, E., & Shub, M. (1994). Journal of the American Mathematical Society, 7(1), 247-267. PDFAbstract Abstract: Let $A$ be an $n \times n$ matrix whose elements are independent random variables with standard normal distributions. As $n \to \infty$, the expected number of real eigenvalues is asymptotic to $\sqrt {2n/\pi }$. We obtain a closed form expression for the expected number of real eigenvalues for finite $n$, and a formula for the density of a real eigenvalue for finite $n$. Asymptotically, a real normalized eigenvalue $\lambda /\sqrt n$ of such a random matrix is uniformly distributed on the interval [-1, 1]. Analogous, but strikingly different, results are presented for the real generalized eigenvalues. We report on numerical experiments confirming these results and suggesting that the assumption of normality is not important for the asymptotic results. The road from Kac's matrix to Kac's random polynomials Edelman, A. & Kostlan, E. (1994). Proceedings of the 1994 SIAM Applied Linear Algebra Conference J.G. Lewis, ed., SIAM, Philadelphia, 503--507 PDF The Circular Law and the Probability that a Random Matrix Has $k$ Real Eigenvalues Edelman, A. (1993). preprint. PDFAbstract Abstract: Let $A$ be an $n$ by $n$ matrix whose elements are independent random variables with standard normal distributions. Girko's controversial circular law states that the distribution of appropriately normalized eigenvalues is asymptotically uniform in the unit disk in the complex plane. We derive the exact distribution of the complex eigenvalues for finite $n$, from which convergence in distribution to the circular law for normally distributed matrices may be derived. Similar methodology allows us to derive a joint distribution formula for the real Schur decomposition of $A$. Integration of this distribution yields the probability that $A$ has exactly $k$ real eigenvalues. For example, we show that the probability that $A$ has all real eigenvalues is exactly $2^{-n(n-1)/4}$. Eigenvalue Roulette and Random Test Matrices Edelman, A. (1993). In Linear Algebra for Large Scale and Real-Time Applications (pp. 365-368). Springer Netherlands. PDFAbstract Abstract: Random matrices may not be mentioned in research abstracts, but second only to Matlab, the most widely used tool of numerical linear algebraists is the “random test matrix.” Algorithmic developers in need of guinea pigs nearly always take random matrices with standard normal entries or perhaps close cousins, such as the uniform distribution on [-1,1]. The choice is highly reasonable: these matrices are generated effortlessly and might very well catch programming errors. What is a mistake is to psychologically link a random matrix with the intuitive notion of a "typical" matrix or the vague concept of "any old matrix." On the distribution of a scaled condition number Edelman, A. (1992). Mathematics of Computation, 58(197), 185-190. PDF The distribution and moments of the smallest eigenvalue of a random matrix of Wishart type Edelman, A. (1991). Linear algebra and its applications, 159, 55-80. Abstract Abstract: Given a random rectangular m × n matrix with elements from a normal distribution, what is the distribution of the smallest singular value? To pose an equivalent question in the language of multivariate statistics, what is the distribution of the smallest eigenvalue of a matrix from the central Wishart distribution in the null case? We give new results giving the distribution as a simple recursion. This includes the more difficult case when n – m is an even integer, without resorting to zonal polynomials and hypergeometric functions of matrix arguments. With the recursion, one can obtain exact expressions for the density and the moments of the distribution in terms of functions usually no more complicated than polynomials, exponentials, and at worst ordinary hypergeometric functions. We further elaborate on the special cases when n – m = 0, 1, 2, and 3 and give a numerical table of the expected values for 2 ⩽ m ⩽ 25 and 0 ⩽ n – m ⩽ 25. Eigenvalues and Condition Numbers of Random Matrices Edelman, A. (1989). MIT PhD Dissertation(4).(As of 2002, the figures are now included in this file. They were all recomputed!) PDF Eigenvalues and condition numbers of random matrices Edelman, A. (1988). SIAM Journal on Matrix Analysis and Applications, 9(4), 543-560. PDFAbstract Abstract: Given a random matrix, what condition number should be expected? This paper presents a proof that for real or complex $n \times n$ matrices with elements from a standard normal distribution, the expected value of the log of the 2-norm condition number is asymptotic to $\log n$ as $n \to \infty$. In fact, it is roughly $\log n + 1.537$ for real matrices and $\log n + 0.982$ for complex matrices as $n \to \infty$. The paper discusses how the distributions of the condition numbers behave for large n for real or complex and square or rectangular matrices. The exact distributions of the condition numbers of $2 \times n$ matrices are also given.Intimately related to this problem is the distribution of the eigenvalues of Wishart matrices. This paper studies in depth the largest and smallest eigenvalues, giving exact distributions in some cases. It also describes the behavior of all the eigenvalues, giving an exact formula for the expected characteristic polynomial. A linear-time algorithm for evaluating series of Schur functions Chan, C., Drensky, V., Edelman, A., & Koev, P. PDF

## Parallel Computing

 Pascal Matrices Edelman, A., & Strang, G. (2004). American Mathematical Monthly, 189-197. PDF Staircase failures explained by orthogonal versal forms Edelman, A., & Ma, Y. (2000). SIAM Journal on Matrix Analysis and Applications, 21(3), 1004-1025. PDFAbstract Abstract: Treating matrices as points in n2 -dimensional space, we apply geometry to study and explain algorithms for the numerical determination of the Jordan structure of a matrix. Traditional notions such as sensitivity of subspaces are replaced with angles between tangent spaces of manifolds in n2 -dimensional space. We show that the subspace sensitivity is associated with a small angle between complementary subspaces of a tangent space on a manifold in n2 -dimensional space. We further show that staircase algorithm failure is related to a small angle between what we call staircase invariant space and this tangent space. The matrix notions in n2 -dimensional space are generalized to pencils in 2mn-dimensional space. We apply our theory to special examples studied by Boley, Demmel, and Kågström. A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part II: A Stratification Enhanced Staircase Algorithm Edelman, A., Elmroth, E., & Kågström, B. (1999). SIAM Journal on Matrix Analysis and Applications, 20(3), 667-699. PDFAbstract Abstract: Computing the Jordan form of a matrix or the Kronecker structure of a pencil is a well-known ill-posed problem. We propose that knowledge of the closure relations, i.e., the stratification, of the orbits and bundles of the various forms may be applied in the staircase algorithm. Here we discuss and complete the mathematical theory of these relationships and show how they may be applied to the staircase algorithm. This paper is a continuation of our Part I paper on versal deformations, but it may also be read independently. The Computation and Sensitivity of Double Eigenvalues Lippert, R. A., & Edelman, A. (1999). Advances in Computational Mathematics, Lecture Notes in Pure and Appl. Math, 202, 353-393. PDF A Counterexample to a Hadamard Matrix Pivot Conjecture Edelman, A., & Friendman, D. (1998). Linear and Multilinear Algebra, 44(1), 53-56. Abstract Abstract: In the study of the growth factor of completely pivoted Hadamard matrices, it becomes natural to study the possible pivots. Very little is known or provable about these pivots other than a few cases at the beginning and end. For example it is known that the first four pivots must be 1,2,2 and 4 and the last three pivots in backwards order must be n/2, and n/2. Based on computational experiments, it was conjectured by Day and Peterson, that the n—3rd pivot must always be n/4. This conjecture would have suggested a kind of symmetry with the first four pivots. In this note we demonstrate a matrix whose n–3rd pivot is n/2 showing that the conjecture is false. Non-generic eigenvalue perturbations of Jordan blocks Ma, Y., & Edelman, A. (1998). Linear algebra and its applications, 273(1), 45-63. PDFAbstract Abstract: We show that if an n × n Jordan block is perturbed by an O(ε) upper k-Hessenberg matrix (k subdiagonals including the main diagonal), then generically the eigenvalues split into p rings of size k and one of size r (if r ≠ 0), where n = pk + r. This generalizes the familiar result (k = n, p = 1, r = 0) that generically the eigenvalues split into a ring of size n. We compute the radii of the rings to first order and generalize the result in a number of directions involving multiple Jordan blocks of the same size. A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part I: Versal Deformation Edelman, A., Elmroth, E., & Kågström, B. (1997). SIAM Journal on Matrix Analysis and Applications, 18(3), 653-692. Abstract Abstract: Computing the Jordan form of a matrix or the Kronecker structure of a pencil is a well-known ill-posed problem. We propose that knowledge of the closure relations, i.e., the stratification, of the orbits and bundles of the various forms may be applied in the staircase algorithm. Here we discuss and complete the mathematical theory of these relationships and show how they may be applied to the staircase algorithm. This paper is a continuation of our Part I paper on versal deformations, but may also be read independently. On the complete pivoting conjecture for a Hadamard matrix of order 12 Edelman, A., & Mascarenhas, W. (1995). Linear and Multilinear Algebra, 38(3), 181-187. PDFAbstract Abstract: This paper settles a conjecture by Day and Peterson that if Gaussian elimination with complete pivoting is performed on a 12 by 12 Hadamard matrix, then (1,2,2,4,3,10/3,18/5,4,3,6,6,12) must be the (absolute) pivots. Our proof uses the idea of symmetric block designs to reduce the complexity that would be found in enumerating cases.In contrast, at least 30 patterns for the absolute values of the pivots have been observed for 16 by 16 Hadamard matrices. This problem is non-trivial because row and column permutations do not preserve pivots. A naive computer search would require (12!)2 trials. A simple estimate of the condition number of a linear system Guggenheimer, H. W., Edelman, A. S., & Johnson, C. R. (1995). College Mathematics Journal, 2-5. Polynomial roots from companion matrix eigenvalues Edelman, A., & Murakami, H. (1995). Mathematics of Computation, 64(210), 763-776. PDFAbstract Abstract: In classical linear algebra, the eigenvalues of a matrix are sometimes defined as the roots of the characteristic polynomial. An algorithm to compute the roots of a polynomial by computing the eigenvalues of the corresponding companion matrix turns the tables on the usual definition. We derive a first-order error analysis of this algorithm that sheds light on both the underlying geometry of the problem as well as the numerical error of the algorithm. Our error analysis expands on work by Van Dooren and Dewilde in that it states that the algorithm is backward normwise stable in a sense that must be defined carefully. Regarding the stronger concept of a small componentwise backward error, our analysis predicts a small such error on a test suite of eight random polynomials suggested by Toh and Trefethen. However, we construct examples for which a small componentwise relative backward error is neither predicted nor obtained in practice. We extend our results to polynomial matrices, where the result is essentially the same, but the geometry becomes more complicated. On Parlett's matrix norm inequality for the Cholesky decomposition Edelman, A., & Mascarenhas, W. F. (1995). Numerical linear algebra with applications, 2(3), 243-250. PDFAbstract Abstract: We show that a certain matrix norm ratio studied by Parlett has a supermum that is O(equation image) when the chosen norm is the Frobenius norm, while it is O(log n) for the 2-norm. This ratio arises in Parlett's analysis of the Cholesky decomposition of an n by n matrix. The dimension of matrices (matrix pencils) with given Jordan (Kronecker) canonical forms Demmel, J. W., & Edelman, A. (1995). Linear algebra and its applications, 230, 61-87. PDFAbstract Abstract: The set of $n$ by $n$ matrices with a given Jordan canonical form defines a subset of matrices in complex $n^2$ dimensional space. We analyze one classical approach and one new approach to count the dimension of this set. The new approach is based upon and meant to give insight into the staircase algorithm for the computation of the Jordan Canonical Form as well as the occasional failures of this algorithm. We extend both techniques to count the dimension of the more complicated set defined by the Kronecker canonical form of an arbitrary rectangular matrix pencil $A-\lambda B$. Scaling for orthogonality Edelman, A., & Stewart, G. W. (1993). Signal Processing, IEEE Transactions on, 41(4), 1676-1677. PDFAbstract Abstract: In updating algorithms where orthogonal transformations are accumulated, it is important to preserve the orthogonality of the product in the presence of rounding error. Moonen et al. (ibid., vol.39, p.1911-13, 1991) have pointed out that simply normalizing the columns of the product tends to preserve orthogonality-though not, as DeGroat (ibid., vol.39, p.1913-14, 1991) points out, to working precision. The authors discuss the previous work and give an analysis of the phenomenon. The complete pivoting conjecture for Gaussian Elimination is false Edelman, A. (1992). Mathematica Journal, 2(2), 58-61. PDF Snapshots of Mobile Jacobi Edelman, A. (1991). In Numerical Linear Algebra, Digital Signal Processing and Parallel Algorithms (pp. 485-488). Springer Berlin Heidelberg.(What a shame that I do not have the pictures available. Even better would be to put the original video on line. Maybe some day!) PDF