- Scientific Computing
- Random Eigenvalues etc.
- Parallel Computing
- Numerical Linear Algebra
- Approximation Theory

## Scientific Computing

Oceananigans.jl: Fast and friendly geophysical fluid dynamics on GPUs Ramadhan, A., et al. (2020, January). The Journal of Open Source Software. DOI: 10.21105/joss.02018. | PDF Abstract | |

Abstract: Oceananigans.jl is a fast and friendly software package for the numerical simulation of incompressible, stratified, rotating fluid flows on CPUs and GPUs. | ||

Universal Differential Equations for Scientific Machine Learning Rackauckas, C., Ma, Y., Martensen, J., Warner, C., Zubov, K., Supekar, R., ... & Ramadhan, A. (2020). arXiv preprint arXiv:2001.04385. | PDF arXiv Abstract | |

Abstract: In the context of science, the well-known adage "a picture is worth a thousand words" might well be "a model is worth a thousand datasets." Scientific models, such as Newtonian physics or biological gene regulatory networks, are human-driven simplifications of complex phenomena that serve as surrogates for the countless experiments that validated the models. Recently, machine learning has been able to overcome the inaccuracies of approximate modeling by directly learning the entire set of nonlinear interactions from data. However, without any predetermined structure from the scientific basis behind the problem, machine learning approaches are flexible but data-expensive, requiring large databases of homogeneous labeled training data. A central challenge is reconciling data that is at odds with simplified models without requiring "big data".In this work we develop a new methodology, universal differential equations (UDEs), which augments scientific models with machine-learnable structures for scientifically-based learning. We show how UDEs can be utilized to discover previously unknown governing equations, accurately extrapolate beyond the original data, and accelerate model simulation, all in a time and data-efficient manner. This advance is coupled with open-source software that allows for training UDEs which incorporate physical constraints, delayed interactions, implicitly-defined events, and intrinsic stochasticity in the model. Our examples show how a diverse set of computationally-difficult modeling issues across scientific disciplines, from automatically discovering biological mechanisms to accelerating the training of physics-informed neural networks and large-eddy simulations, can all be transformed into UDE training problems that are efficiently solved by a single software methodology. | ||

Generalized Physics-Informed Learning Through Language-Wide Differentiable Programming Rackauckas C., Edelman A., Fischer K., Innes M., Saba E., Shah V.B., & Tebbutt W. (2020). In Proceedings of the AAAI 2020 Spring Symposium on Combining Artificial Intelligence and Machine Learning with Physical Sciences. | PDF Abstract | |

Abstract: Scientific computing is increasingly incorporating the advancements in machine learning to allow for data-driven physicsinformed modeling approaches. However, re-targeting existing scientific computing workloads to machine learning frameworks is both costly and limiting, as scientific simulations tend to use the full feature set of a general purpose programming language. In this manuscript we develop an infrastructure for incorporating deep learning into existing scientific computing code through Differentiable Programming (∂P). We describe a ∂P system that is able to take gradients of full Julia programs, making Automatic Differentiation a first class language feature and compatibility with deep learning pervasive. Our system utilizes the one-language nature of Julia package development to augment the existing package ecosystem with deep learning, supporting almost all language constructs (control flow, recursion, mutation, etc.) while generating highperformance code without requiring any user intervention or refactoring to stage computations. We showcase several examples of physics-informed learning which directly utilizes this extension to existing simulation code: neural surrogate models, machine learning on simulated quantum hardware, and data-driven stochastic dynamical model discovery with neural stochastic differential equations. | ||

Signal Enhancement for Magnetic Navigation Challenge Problem Gnadt, A. R., Belarge, J., Canciani, A., Conger, L., Curro, J., Edelman, A., ... & Rackauckas, C. (2020). arXiv preprint arXiv:2007.12158. | PDF arXiv Abstract | |

Abstract: Harnessing the magnetic field of the earth for navigation has shown promise as a viable alternative to other navigation systems. A magnetic navigation system collects its own magnetic field data using a magnetometer and uses magnetic anomaly maps to determine the current location. The greatest challenge with magnetic navigation arises when the magnetic field data from the magnetometer on the navigation system encompass the magnetic field from not just the earth, but also from the vehicle on which it is mounted. It is difficult to separate the earth magnetic anomaly field magnitude, which is crucial for navigation, from the total magnetic field magnitude reading from the sensor. The purpose of this challenge problem is to decouple the earth and aircraft magnetic signals in order to derive a clean signal from which to perform magnetic navigation. Baseline testing on the dataset shows that the earth magnetic field can be extracted from the total magnetic field using machine learning (ML). The challenge is to remove the aircraft magnetic field from the total magnetic field using a trained neural network. These challenges offer an opportunity to construct an effective neural network for removing the aircraft magnetic field from the dataset, using an ML algorithm integrated with physics of magnetic navigation. | ||

Rapid software prototyping for heterogeneous and distributed platforms Besard, T., Churavy, V., Edelman, E., & De Sutter, B. (2019, June). Advances in Engineering Software, 132, 29-46 | Abstract | |

Abstract: The software needs of scientists and engineers are growing and their programs are becoming more compute-heavy and problem-specific. This has led to an influx of non-expert programmers, who need to use and program high-performance computing platforms.With the continued stagnation of single-threaded performance, using hardware accelerators such as GPUs or FPGAs is necessary. Adapting software to these compute platforms is a difficult task, especially for non-expert programmers, leading to applications being unable to take advantage of new hardware or requiring extensive rewrites. We propose a programming model that allows non-experts to benefit from high-performance computing, while enabling expert programmers to take full advantage of the underlying hardware. In this model, programs are generically typed, the location of the data is encoded in the type system, and multiple dispatch is used to select functionality based on the type of the data. This enables rapid prototyping, retargeting and reuse of existing software, while allowing for hardware specific optimization if required. Our approach allows development to happen in one source language enabling domain experts and performance engineers to jointly develop a program, without the overhead, friction, and challenges associated with developing in multiple programming languages for the same project. We demonstrate the viability and the core principles of this programming model in Julia using realistic examples, showing the potential of this approach for rapid prototyping, and its applicability for real-life engineering. We focus on usability for non-expert programmers and demonstrate that the potential of the underlying hardware can be fully exploited. | ||

A Differentiable Programming System to Bridge Machine Learning and Scientific Computing Innes, M., Edelman, A., Fischer, K., Rackauckas, C., Saba, E., Shah, V. B., & Tebbutt, W. (2019). CoRR, abs/1907.07587. | PDF arXiv Abstract | |

Abstract: Scientific computing is increasingly incorporating the advancements in machine learning and the ability to work with large amounts of data. At the same time, machine learning models are becoming increasingly sophisticated and exhibit many features often seen in scientific computing, stressing the capabilities of machine learning frameworks. Just as the disciplines of scientific computing and machine learning have shared common underlying infrastructure in the form of numerical linear algebra, we now have the opportunity to further share new computational infrastructure, and thus ideas, in the form of Differentiable Programming. We describe Zygote, a Differentiable Programming system that is able to take gradients of general program structures. We implement this system in the Julia programming language. Our system supports almost all language constructs (control flow, recursion, mutation, etc.) and compiles high-performance code without requiring any user intervention or refactoring to stage computations. This enables an expressive programming model for deep learning, but more importantly, it enables us to incorporate a large ecosystem of libraries in our models in a straightforward way. We discuss our approach to automatic differentiation, including its support for advanced techniques such as mixed-mode, complex and checkpointed differentiation, and present several examples of differentiating programs. | ||

Fast computation of the principal components of genotype matrices in Julia J Chen, A Noack, A Edelman. (2018) | PDF arXiv Abstract | |

Abstract: Finding the largest few principal components of a matrix of genetic data is a common task in genome-wide association studies (GWASs), both for dimensionality reduction and for identifying unwanted factors of variation. We describe a simple random matrix model for matrices that arise in GWASs, showing that the singular values have a bulk behavior that obeys a Marchenko-Pastur distributed with a handful of large outliers. We also implement Golub-Kahan-Lanczos (GKL) bidiagonalization in the Julia programming language, providing thick restarting and a choice between full and partial reorthogonalization strategies to control numerical roundoff. Our implementation of GKL bidiagonalization is up to 36 times faster than software tools used commonly in genomics data analysis for computing principal components, such as EIGENSOFT and FlashPCA, which use dense LAPACK routines and randomized subspace iteration respectively. | ||

On Machine Learning and Programming Languages Innes, M., Karpinski, S., Shah, V., Barber, D., Saito Stenetorp, P. L. E. P. S., Besard, T., ... & Malmaud, J. (2018, February). Association for Computing Machinery (ACM). | PDF Abstract | |

Abstract: The complexity of Machine Learning (ML) models and the frameworks people are using to build them has exploded along with ML itself. State-of-the-art models are increasingly programs, with support for programming constructs like loops and recursion, and this brings out many interesting issues in the tools we use to create them — that is, programming languages (PL). This paper1, discusses the necessity for a first class language for machine learning, and what such a language might look like. | ||

A more open efficient future for AI development and data science with an introduction to Julia. Edelman, A. (2017). Big Data (Big Data), 2017 IEEE International Conference on (pp. 2-2). IEEE. | ||

Julia: A fresh approach to numerical computing Bezanson, J., Edelman, A., Karpinski, S., & Shah, V. B. (2017). SIAM review, 59(1), 65-98. | PDF arXiv Abstract | |

Abstract: Bridging cultures that have often been distant, Julia combines expertise from the diverse fields of computer science and computational science to create a new approach to numerical computing. Julia is designed to be easy and fast. Julia questions notions generally held as “laws of nature” by practitioners of numerical computing:
- High-level dynamic programs have to be slow,
- One must prototype in one language and then rewrite in another language for speed or deployment, and
- There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts.
Julia shows that one can have machine performance without sacrificing human convenience. | ||

Accelerated Convolutions for Efficient Multi-Scale Time to Contact Computation in Julia Amini, A., Horn, B., & Edelman, A. (2016). arXiv preprint arXiv:1612.08825. | PDF arXiv Abstract | |

Abstract: Convolutions have long been regarded as fundamental to applied mathematics, physics and engineering. Their mathematical elegance allows for common tasks such as numerical differentiation to be computed efficiently on large data sets. Efficient computation of convolutions is critical to artificial intelligence in real-time applications, like machine vision, where convolutions must be continuously and efficiently computed on tens to hundreds of kilobytes per second. In this paper, we explore how convolutions are used in fundamental machine vision applications. We present an accelerated n-dimensional convolution package in the high performance computing language, Julia, and demonstrate its efficacy in solving the time to contact problem for machine vision. Results are measured against synthetically generated videos and quantitatively assessed according to their mean squared error from the ground truth. We achieve over an order of magnitude decrease in compute time and allocated memory for comparable machine vision applications. All code is packaged and integrated into the official Julia Package Manager to be used in various other scenarios. | ||

Julia implementation of the dynamic distributed dimensional data model Chen, A., Edelman, A., Kepner, J., Gadepally, V., & Hutchison, D. (2016). High Performance Extreme Computing Conference (HPEC), (pp. 1-7). IEEE. | PDF arXiv Abstract | |

Abstract: Julia is a new language for writing data analysis programs that are easy to implement and run at high performance. Similarly, the Dynamic Distributed Dimensional Data Model (D4M) aims to clarify data analysis operations while retaining strong performance. D4M accomplishes these goals through a composable, unified data model on associative arrays. In this work, we present an implementation of D4M in Julia and describe how it enables and facilitates data analysis. Several experiments showcase scalable performance in our new Julia version as compared to the original Matlab implementation. | ||

Circuitscape: modeling landscape connectivity to promote conservation and human health. McRae, B., Shah, V., & Edelman, A. (2016). The Nature Conservancy, 14. | PDF Abstract | |

Abstract: Circuitscape is an award-winning connectivity analysis software package designed to model species movement and gene flow across fragmented landscapes, and to identify areas important for connectivity conservation. It uses algorithms from electrical circuit theory to model connectivity, taking advantage of links between circuit and random walk theories. Its ability to incorporate all possible pathways across a landscape simultaneously has made Circuitscape particularly useful for connectivity analysis. | ||

Robust benchmarking in noisy environments. Chen, J., Edelman, A., & Revels, J. (2016). | PDF arXiv Abstract | |

Abstract: We propose a benchmarking strategy that is robust in the presence of timer error, OS jitter and other environmental fluctuations, and is insensitive to the highly nonideal statistics produced by timing measurements. We construct a model that explains how these strongly nonideal statistics can arise from environmental fluctuations, and also justifies our proposed strategy. We implement this strategy in the BenchmarkTools Julia package, where it is used in production continuous integration (CI) pipelines for developing the Julia language and its ecosystem. | ||

Julia Introduction Edelman, A. (2015, May). 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW) (pp. 1271-1271). IEEE. | Abstract | |

Abstract: Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. The Julia developers believe there are many good reasons to prefer dynamic languages for these applications and do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. | ||

Integral geometry for Markov chain Monte Carlo: overcoming the curse of search-subspace dimensionality O Mangoubi, A Edelman. (2015). | PDF arXiv Abstract | |

Abstract: We introduce a method that uses the Cauchy-Crofton formula and a new curvature formula from integral geometry to reweight the sampling probabilities of Metropolis-within-Gibbs algorithms in order to increase their convergence speed. We consider algorithms that sample from a probability density conditioned on a manifold $\mathcal{M}$. Our method exploits the symmetries of the algorithms' isotropic random search-direction subspaces to analytically average out the variance in the intersection volume caused by the orientation of the search-subspace with respect to the manifold $\mathcal{M}$ it intersects. This variance can grow exponentially with the dimension of the search-subspace, greatly slowing down the algorithm. Eliminating this variance allows us to use search-subspaces of dimensions many times greater than would otherwise be possible, allowing us to sample very rare events that a lower-dimensional search-subspace would be unlikely to intersect.To extend this method to events that are rare for reasons other than their support $\mathcal{M}$ having a lower dimension, we formulate and prove a new theorem in integral geometry that makes use of the curvature form of the Chern-Gauss-Bonnet theorem to reweight sampling probabilities. On the side, we also apply our theorem to obtain new theoretical bounds for the volumes of real algebraic manifolds. Finally, we demonstrate the computational effectiveness and speedup of our method by numerically applying it to the conditional stochastic Airy operator sampling problem in random matrix theory. | ||

Julia Language Documentation Bezanson, J., Karpinski, S., Shah, V., & Edelman, A. (2014). The Julia Manual. http://docs. julialang. org/en/release-0.2/manual, 1-261. | PDF Abstract | |

Abstract: Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. We believe there are many good reasons to prefer dynamic languages for these applications, and we do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.Because Julia’s compiler is different from the interpreters used for languages like Python or R, you may find that Julia’s performance is unintuitive at first. If you find that something is slow, we highly recommend reading through the Performance Tips (page 279) section before trying anything else. Once you understand how Julia works, it’s easy to write code that’s nearly as fast as C. | ||

An Efficient Partitioning Oracle for Bounded-Treewidth Graphs Edelman, A., Hassidim, A., Nguyen, H.N., & Onak, K. (2011). In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (pp. 530-541). Springer Berlin Heidelberg. | PDF arXiv Abstract | |

Abstract: Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph.
However, the oracle of Hassidimet al. makes an exponential number of queries to the input graph to answer every query about the partition. In this paper, we construct an efficient partitioning oracle for graphs with constant treewidth. The oracle makes only O(poly(1/epsilon)) queries to the input graph to answer each query about the partition.
Examples of bounded-treewidth graph classes include k-outerplanar graphs for fixed k, series-parallel graphs, cactus graphs, and pseudoforests. Our oracle yields poly(1/epsilon)-time property testing algorithms for membership in these classes of graphs. Another application of the oracle is a poly(1/epsilon)-time algorithm that approximates the maximum matching size, the minimum vertex cover size, and the minimum dominating set size up to an additive epsilon*n in graphs with bounded treewidth. Finally, the oracle can be used to test in poly(1/epsilon) time whether the input bounded-treewidth graph is k-colorable or perfect. | ||

Enhancing the acquisition efficiency of fast magnetic resonance imaging via broadband encoding of signal content Mitsouras, D., Zientara, G. P., Edelman, A., & Rybicki, F. J. (2006). Magnetic resonance imaging, 24(9), 1209-1227. | PDF Abstract | |

Abstract: Current efficient magnetic resonance imaging (MRI) methods such as parallel-imaging and k−t methods encode MR signals using a set of effective encoding functions other than the Fourier basis. This work revisits the proposition of directly manipulating the set of effective encoding functions at the radiofrequency excitation step in order to increase MRI efficiency. This approach, often termed “broadband encoding,” enables the application of algebraic matrix factorization technologies to extract efficiency by representing and encoding MR signal content in a compacted form. Broadband imaging equivalents of fast multiecho, parallel and k−t MRI are developed and analyzed. The potential of these techniques to increase the time efficiency of data acquisition is experimentally verified on a commercial MRI scanner using simple spin-echo imaging. A three-dimensional gradient-echo dynamic imaging application that demonstrates the potential benefits of this approach compared to the present state of the art for certain applications is also presented. | ||

Building Blocks and Excluded Sums Demaine, E. D., Demaine, M. L., Edelman, A., Leiserson, C. E., & Persson, P. O. (2005). SIAM News, 38(1), 1-5. | PDF | |

Fast Multipole: It's all about Adding Functions in Finite Precision Edelman, A., & Persson, P. 0. (2004, never published). | PDF | |

Non-Fourier Encoded Parallel MRI Using Multiple Receiver Coils Mitsouras, D., Hoge, W. S., Rybicki, F. J., Kyriakos, W. E., Edelman, A., & Zientara, G. P. (2004). Magnetic resonance in medicine, 52(2), 321-328. | PDF Abstract | |

Abstract: This paper describes a general theoretical framework that combines non-Fourier (NF) spatially-encoded MRI with multichannel acquisition parallel MRI. The two spatial-encoding mechanisms are physically and analytically separable, which allows NF encoding to be expressed as complementary to the inherent encoding imposed by RF receiver coil sensitivities. Consequently, the number of NF spatial-encoding steps necessary to fully encode an FOV is reduced. Furthermore, by casting the FOV reduction of parallel imaging techniques as a dimensionality reduction of the k-space that is NF-encoded, one can obtain a speed-up of each digital NF spatial excitation in addition to accelerated imaging. Images acquired at speed-up factors of 2× to 8× with a four-element RF receiver coil array demonstrate the utility of this framework and the efficiency afforded by it. | ||

A Physically Based Numerical Color Calibration Algorithm for Film Edelman, A., Wang, F., & Rao, A. (2003) | PDF | |

A Fast Projected Conjugate Gradient Algorithm for Training Support Vector Machines Wen, T., Edelman, A., & Gorsich, D. (2001, August). In Contemporary mathematics (pp. 245-263). American Mathematical Society. | PDF Abstract | |

Abstract: Support Vector Machines (SVMs) are of current interest in the solution of classification problems. However, serious challenges appear in the training problem when the training set is large. Training SVMs involves solving a linearly constrained quadratic programming problem. In this paper, we present a fast and easy-to-implement projected Conjugate Gradient algorithm for solving this quadratic programming problem.Compared with the exiting ones, this algorithm tries to be adaptive to each training problem and each computer's memory hierarchy. Although written in a high-level programming language, numerical experiments show that the performance of its MATLAB implementation is competitive with that of benchmark C/C++ codes such as SVMlight and SvmFu. The parallelism of this algorithm is also discussed in this paper. | ||

Nonlinear eigenvalue problems with orthogonality constraints Edelman, A., & Lippert, R. (2000). (section 8.3). Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. and Vorst, H. van der, editors: Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Philidelphia: SIAM. | PDF | |

The geometry of algorithms with orthogonality constraints Edelman, A., Arias, T. A., & Smith, S. T. (1998). SIAM journal on Matrix Analysis and Applications, 20(2), 303-353. | PDF arXiv Abstract | |

Abstract: In this paper we develop new Newton and conjugate gradient algorithms on the Grassmann and Stiefel manifolds. These manifolds represent the constraints that arise in such areas as the symmetric eigenvalue problem, nonlinear eigenvalue problems, electronic structures computations, and signal processing. In addition to the new algorithms, we show how the geometrical framework gives penetrating new insights allowing us to create, understand, and compare algorithms. The theory proposed here provides a taxonomy for numerical linear algebra algorithms that provide a top level mathematical view of previously unrelated algorithms. It is our hope that developers of new algorithms and perturbation theories will benefit from the theory, methods, and examples in this paper. | ||

Multiscale Computation with Interpolating Wavelets Lippert, R. A., Arias, T. A., & Edelman, A. (1998). Journal of Computational Physics, 140(2), 278-310. | PDF arXiv Abstract | |

Abstract: Multiresolution analyses based upon interpolets, interpolating scaling functions introduced by Deslauriers and Dubuc, are particularly well-suited to physical applications because they allow exact recovery of the multiresolution representation of a function from its sample values on a finite set of points in space. We present a detailed study of the application of wavelet concepts to physical problems expressed in such bases. The manuscript describes algorithms for the associated transforms which, for properly constructed grids of variable resolution, compute correctly without having to introduce extra grid points. We demonstrate that for the application of local homogeneous operators in such bases, the non-standard multiply of Beylkin, Coifman and Rokhlin also proceeds exactly for inhomogeneous grids of appropriate form. To obtain less stringent conditions on the grids, we generalize the non-standard multiply so that communication may proceed between non-adjacent levels. The manuscript concludes with timing comparisons against naive algorithms and an illustration of the scale-independence of the convergence rate of the conjugate gradient solution of Poisson's equation using a simple preconditioning, suggesting that this approach leads to an O(n) solution of this equation. | ||

The Mathematics of the Pentium Division Bug Edelman, A. (1997). SIAM Review, 39(1), 54-67. | PDF | |

On Conjugate Gradient-Like Methods for Eigen-Like Problems Edelman, A., & Smith, S. T. (1996). BIT Numerical Mathematics, 36(3), 494-508. | PDF Abstract | |

Abstract: Numerical analysts, physicists, and signal processing engineers have proposed algorithms that might be called conjugate gradient for problems associated with the computation of eigenvalues. There are many variations, mostly one eigenvalue at a time though sometimes block algorithms are proposed. Is there a correct “conjugate gradient” algorithm for the eigenvalue problem? How are the algorithms related amongst themselves and with other related algorithms such as Lanczos, the Newton method, and the Rayleigh quotient? | ||

Reflections on the Large Scale Matrix Computations in Lattice QCD Conference Brower, R., Demmel, J., Edelman, A., & Hockney, G. (1995, never published). This is a little note whipped up after a very nice conference held in Lexington Kentucky | PDF | |

Curvature in Conjugate Gradient Eigenvalue Computation with Applications to Materials and Chemistry Calculations Edelman, A., Arias, T. A., & Smith, S. T. (1994). In In Proceedings of the 1994 SIAM Applied Linear Algebra Conference. | PDF Abstract | |

Abstract: We illustrate the importance of using curvature information when performing constrained conjugate gradient minimization on manifolds and identify certain common and useful submanifolds in numerical analysis. This minimization approach promises to be useful in materials and chemistry calculations. | ||

When is x*(1/x) not equal to 1? Edelman, A. (1994). Never published except for the web -- just for fun. | PDF |

## Random Eigenvalues etc.

Random Hyperplanes, Generalized Singular Values & “What’s My β?” A Edelman, Y Wang. (2018) IEEE Statistical Signal Processing Workshop (SSP), 458-462. doi: 10.1109/SSP.2018.8450833 | PDF Abstract | |

Abstract: We streamline the treatment of the Jacobi ensemble from random matrix theory by providing a succinct geometric characterization which may be used directly to compute the Jacobi ensemble distribution without unnecessary matrix baggage traditionally seen in the MANOVA formulation. Algebraically the Jacobi ensemble naturally corresponds to the Generalized Singular Value Decomposition from the field of Numerical Linear Algebra. We further provide a clear geometric interpretation for the Selberg constant in front of the distribution which may sensibly be defined even beyond the reals, complexes, and quaternions. On the application side, we propose a new learning problem where one estimates a β that best fits the sample eigenvalues from the Jacobi ensemble. | ||

Eigenvalue approximation of sums of Hermitian matrices from eigenvector localization/delocalization Movassagh, R., & Edelman, A. (2017). arXiv preprint arXiv:1710.09400. | PDF arXiv Abstract | |

Abstract: We propose a technique for calculating and understanding the eigenvalue distribution of sums of random matrices from the known distribution of the summands. The exact problem is formidably hard. One extreme approximation to the true density amounts to classical probability, in which the matrices are assumed to commute; the other extreme is related to free probability, in which the eigenvectors are assumed to be in generic positions and sufficiently large. In practice, free probability theory can give a good approximation of the density.We develop a technique based on eigenvector localization/delocalization that works very well for important problems of interest where free probability is not sufficient, but certain uniformity properties apply. The localization/delocalization property appears in a convex combination parameter that notably, is independent of any eigenvalue properties and yields accurate eigenvalue density approximations. We demonstrate this technique on a number of examples as well as discuss a more general technique when the uniformity properties fail to apply. | ||

On Computing Schur Functions and Series Thereof Chan, C., Drensky, V., Edelman, A., Kan, R., & Koev, P. (2017). Lawrence Berkeley National Laboratory, eScholarship.org. | PDF | |

Beyond Universality in Random Matrix Theory Edelman, A., Guionnet, A., & Péché, S. (2016). The Annals of Applied Probability, 26(3), 1659-1697. | PDF arXiv Abstract | |

Abstract: In order to have a better understanding of finite random matrices with non-Gaussian entries, we study the 1/N expansion of local eigenvalue statistics in both the bulk and at the hard edge of the spectrum of random matrices. This gives valuable information about the smallest singular value not seen in universality laws. In particular, we show the dependence on the fourth moment (or the kurtosis) of the entries. This work makes use of the so-called complex deformed GUE and Laguerre ensembles. | ||

Beta Jacobi Ensembles and the GSVD Edelman, A. Banff International Research Station for Mathematical Innovation and Discovery (2016) | ||

Infinite random matrix theory, tridiagonal bordered Toeplitz matrices, and the moment problem Dubbs, A., & Edelman, A. (2015). Linear Algebra and its Applications, 467, 188-201. | PDF Abstract | |

Abstract: The four major asymptotic level density laws of random matrix theory may all be showcased through their Jacobi parameter representation as having a bordered Toeplitz form. We compare and contrast these laws, completing and exploring their representations in one place. Inspired by the bordered Toeplitz form, we propose an algorithm for the finite moment problem by proposing a solution whose density has a bordered Toeplitz form. | ||

The singular values of the GUE (less is more) Edelman, A., & La Croix, M. (2015, October). Random Matrices:Theory and Applications, 4(4), Issue 04. | PDF arXiv Abstract | |

Abstract: Some properties that nominally involve the eigenvalues of Gaussian Unitary Ensemble (GUE) can instead be phrased in terms of singular values. By discarding the signs of the eigenvalues, we gain access to a surprising decomposition: the singular values of the GUE are distributed as the union of the singular values of two independent ensembles of Laguerre type. This independence is remarkable given the well known phenomenon of eigenvalue repulsion. The structure of this decomposition reveals that several existing observations about large n limits of the GUE are in fact manifestations of phenomena that are already present for finite random matrices. We relate the semicircle law to the quarter-circle law by connecting Hermite polynomials to generalized Laguerre polynomials with parameter ±1/2. Similarly, we write the absolute value of the determinant of the n×n GUE as a product n independent random variables to gain new insight into its asymptotic log-normality. The decomposition also provides a description of the distribution of the smallest singular value of the GUE, which in turn permits the study of the leading order behavior of the condition number of GUE matrices. The study is motivated by questions involving the enumeration of orientable maps, and is related to questions involving powers of complex Ginibre matrices. The inescapable conclusion of this work is that the singular values of the GUE play an unpredictably important role that had gone unnoticed for decades even though, in hindsight, so many clues had been around. | ||

Condition Numbers of Indefinite Rank 2 Ghost Wishart Matrices Edelman, A., & Movassagh, R. (2015, October). Linear Algebra and its Applications, 483, 342-351. | PDF arXiv Abstract | |

Abstract: We define an indefinite Wishart matrix as a matrix of the form $A=W^{T}W\Sigma$, where $\Sigma$ is an indefinite diagonal matrix and $W$ is a matrix of independent standard normals. We focus on the case where $W$ is $L$ by 2 which has engineering applications. We obtain the distribution of the ratio of the eigenvalues of $A$. This distribution can be "folded" to give the distribution of the condition number. We calculate formulas for $W$ real $(\beta=1)$, complex $(\beta=2)$, quaternionic $(\beta=4)$ or any ghost 0<$\beta$<$\infty$. We then corroborate our work by comparing them against numerical experiments. | ||

Random Matrix Theory, Numerical Computation and Applications Edelman, A., Sutton, B. D., & Wang, Y. (2014). Modern Aspects of Random Matrix Theory, 72, 53. | PDF | |

Low-temperature random matrix theory at the soft edge Edelman, A., Persson, P. O., & Sutton, B. D. (2014). Journal of Mathematical Physics, 55(6), 063302. | PDF Abstract | |

Abstract: "Low temperature" random matrix theory is the study of random eigenvalues as energy is removed. In standard notation, β is identified with inverse temperature, and low temperatures are achieved through the limit β → ∞. In this paper, we derive statistics for low-temperature random matrices at the "soft edge," which describes the extreme eigenvalues for many random matrix distributions. Specifically, new asymptotics are found for the expected value and standard deviation of the general-β Tracy-Widom distribution. The new techniques utilize beta ensembles, stochastic differential operators, and Riccati diffusions. The asymptotics fit known high-temperature statistics curiously well and contribute to the larger program of general-β random matrix theory. | ||

Computing with Beta Ensembles and Hypergeometric Functions Drensky, V., Edelman, A., Genoar, K., Kan, R., & Koev, P. (2014). RMTA. | PDF | |

The Beta-MANOVA Ensemble with General Covariance Dubbs, A., & Edelman, A. (2014). Random Matrices: Theory and Applications, 3(01), 1450002. | PDF arXiv Abstract | |

Abstract: We find the joint generalized singular value distribution and largest generalized singular value distributions of the $\beta$-MANOVA ensemble with positive diagonal covariance, which is general. This has been done for the continuous $\beta > 0$ case for identity covariance (in eigenvalue form), and by setting the covariance to $I$ in our model we get another version. For the diagonal covariance case, it has only been done for $\beta = 1,2,4$ cases (real, complex, and quaternion matrix entries). This is in a way the first second-order $\beta$-ensemble, since the sampler for the generalized singular values of the $\beta$-MANOVA with diagonal covariance calls the sampler for the eigenvalues of the $\beta$-Wishart with diagonal covariance of Forrester and Dubbs-Edelman-Koev-Venkataramana. We use a conjecture of MacDonald proven by Baker and Forrester concerning an integral of a hypergeometric function and a theorem of Kaneko concerning an integral of Jack polynomials to derive our generalized singular value distributions. In addition we use many identities from Forrester's {\it Log-Gases and Random Matrices}. We supply numerical evidence that our theorems are correct. | ||

Eigenvalue distributions of Beta-Wishart matrices Edelman, A., & Koev, P. (2014). Random Matrices: Theory and Applications, 3(02), 1450009. | PDF Abstract | |

Abstract: We derive explicit expressions for the distributions of the extreme eigenvalues of the beta-Wishart random matrices in terms of the hypergeometric function of a matrix argument. These results generalize the classical results for the real (β = 1), complex (β = 2), and quaternion (β = 4) Wishart matrices to any β > 0. | ||

Random Matrix Theory and its Innovative Applications Edelman, A., & Wang, Y. (2013). In Advances in Applied Mathematics, Modeling, and Computational Science (pp. 91-116). Springer US. | PDF Abstract | |

Abstract: Recently more and more disciplines of science and engineering have found Random Matrix Theory valuable. Some disciplines use the limiting densities to indicate the cutoff between "noise" and "signal." Other disciplines are finding eigenvalue repulsions a compelling model of reality. This survey introduces both the theory behind these applications and MATLAB experiments allowing a reader immediate access to the ideas | ||

The beta-Wishart ensemble Dubbs, A., Edelman, A., Koev, P., & Venkataramana, P. (2013). Journal of Mathematical Physics, 54(8), 083507. | PDF arXiv Abstract | |

Abstract: This paper proves a matrix model for the Wishart Ensemble with general covariance and general dimension parameter beta. In so doing, we introduce a new and elegant definition of Jack polynomials. | ||

Error Analysis of Free Probability Approximations to the Density of States of Disordered Systems Chen, J., Hontz, E., Moix, J., Welborn, M., Van Voorhis, T., Suárez, A., ... & Edelman, A. (2012). Physical review letters, 109(3), 036403. | PDF arXiv Abstract | |

Abstract: Theoretical studies of localization, anomalous diffusion and ergodicity breaking require solving the electronic structure of disordered systems. We use free probability to approximate the ensemble- averaged density of states without exact diagonalization. We present an error analysis that quantifies the accuracy using a generalized moment expansion, allowing us to distinguish between different approximations. We identify an approximation that is accurate to the eighth moment across all noise strengths, and contrast this with the perturbation theory and isotropic entanglement theory. | ||

Partial Freeness of Random Matrices Chen, J., Van Voorhis, T., & Edelman, A. (2012). arXiv preprint arXiv:1204.2257. | PDF arXiv Abstract | |

Abstract: We investigate the implications of free probability for random matrices. From rules for calculating all possible joint moments of two free random matrices, we develop a notion of partial freeness which is quantified by the breakdown of these rules. We provide a combinatorial interpretation for partial freeness as the presence of closed paths in Hilbert space defined by particular joint moments. We also discuss how asymptotic moment expansions provide an error term on the density of states. We present MATLAB code for the calculation of moments and free cumulants of arbitrary random matrices. | ||

Isotropic Entanglement Edelman, A., & Movassagh, R. (2010, revised 2012). Isotropic entanglement. arXiv preprint arXiv:1012.5039. | PDF arXiv Abstract | |

Abstract: The method of "Isotropic Entanglement" (IE), inspired by Free Probability Theory and Random Matrix Theory, predicts the eigenvalue distribution of quantum many-body (spin) systems with generic interactions. At the heart is a "Slider", which interpolates between two extrema by matching fourth moments.The first extreme treats the non-commuting terms classically and the second treats them isotropically. Isotropic means that the eigenvectors are in generic positions. We prove Matching Three Moments and Slider Theorems and further prove that the interpolation is universal, i.e., independent of the choice of local terms. Our examples show that IE provides an accurate picture well beyond what one expects from the first four moments alone. | ||

Random Triangle Theory with Geometry and Applications Edelman, A., & Strang, G. (2012). Foundations of Computational Mathematics, 15(3), 681-713. | PDF Abstract | |

Abstract: What is the probability that a random triangle is acute? We explore this old question from a modern viewpoint, taking into account linear algebra, shape theory, numerical analysis, random matrix theory, the Hopf fibration, and much much more. One of the best distributions, all six vertex coordinates being independent standard Gaussians, can be reduced to four by translation of the center to (0, 0) or reformulation as a 2x2 random matrix problem.In this note, we develop shape theory in its historical context for a wide audience. We hope to encourage others to look again (and differently) at triangles. We provide a new constructive proof, using the geometry of parallelians, of a key result of shape theory: that triangle shapes naturally fall on a hemisphere. We give several proofs of uniformity when the normal distribution is transferred to the hemisphere, including a new proof related to the distribution of random condition numbers. Generalizing to higher dimensions, we obtain the “square root ellipticity statistic” of random matrix theory. Another proof connects the Hopf map to the SVD of 2 by 2 matrices. A new theorem decribes three similar triangles hidden in the hemisphere. Many triangle properties are reformulated as matrix theorems, providing insight to both. This paper argues for a shift of viewpoint to the modern approaches of random matrix theory. As one example, we propose that the smallest singular value is an effective test for uniformity. New software is developed and applications are proposed. | ||

Density of States of Quantum Spin Systems from Isotropic Entanglement Edelman, A., & Movassagh, R. (2011). Physical review letters, 107(9), 097205. | PDF arXiv Abstract | |

Abstract: We propose a method which we call "Isotropic Entanglement" (IE), that predicts the eigenvalue distribution of quantum many body (spin) systems (QMBS) with generic interactions. We interpolate between two known approximations by matching fourth moments. Though, such problems can be QMA-complete, our examples show that IE provides an accurate picture of the spectra well beyond what one expects from the first four moments alone. We further show that the interpolation is universal, i.e., independent of the choice of local terms. | ||

The Random Matrix Technique of Ghosts and Shadows Edelman, A. (2010). Markov Processes and Related Fields, 16(4), 783-790. | PDF | |

Sturm Sequences and Random Eigenvalue distributions Albrecht, J. T., Chan, C. P., & Edelman, A. (2009). Foundations of Computational Mathematics, 9(4), 461-483. | PDF Abstract | |

Abstract: This paper proposes that the study of Sturm sequences is invaluable in the numerical computation and theoretical derivation of eigenvalue distributions of random matrix ensembles. We first explore the use of Sturm sequences to efficiently compute histograms of eigenvalues for symmetric tridiagonal matrices and apply these ideas to random matrix ensembles such as the β-Hermite ensemble. Using our techniques, we reduce the time to compute a histogram of the eigenvalues of such a matrix from O(n 2+m) to O(mn) time where n is the dimension of the matrix and m is the number of bins (with arbitrary bin centers and widths) desired in the histogram (m is usually much smaller than n). Second, we derive analytic formulas in terms of iterated multivariate integrals for the eigenvalue distribution and the largest eigenvalue distribution for arbitrary symmetric tridiagonal random matrix models. As an example of the utility of this approach, we give a derivation of both distributions for the β-Hermite random matrix ensemble (for general β). Third, we explore the relationship between the Sturm sequence of a random matrix and its shooting eigenvectors. We show using Sturm sequences that assuming the eigenvector contains no zeros, the number of sign changes in a shooting eigenvector of parameter λ is equal to the number of eigenvalues greater than λ. Finally, we use the techniques presented in the first section to experimentally demonstrate a O(log n) growth relationship between the variance of histogram bin values and the order of the β-Hermite matrix ensemble. | ||

Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples Nadakuditi, R. R., & Edelman, A. (2008). Signal Processing, IEEE Transactions on, 56(7), 2625-2638. | PDF arXiv Abstract | |

Abstract: We present a mathematically justifiable, computationally simple, sample eigenvalue based procedure for estimating the number of high-dimensional signals in white noise using relatively few samples. The main motivation for considering a sample eigenvalue based scheme is the computational simplicity and the robustness to eigenvector modelling errors which are can adversely impact the performance of estimators that exploit information in the sample eigenvectors. There is, however, a price we pay by discarding the information in the sample eigenvectors; we highlight a fundamental asymptotic limit of sample eigenvalue based detection of weak/closely spaced high-dimensional signals from a limited sample size. This motivates our heuristic definition of the effective number of identifiable signals which is equal to the number of "signal" eigenvalues of the population covariance matrix which exceed the noise variance by a factor strictly greater than 1+sqrt(Dimensionality of the system/Sample size). The fundamental asymptotic limit brings into sharp focus why, when there are too few samples available so that the effective number of signals is less than the actual number of signals, underestimation of the model order is unavoidable (in an asymptotic sense) when using any sample eigenvalue based detection scheme, including the one proposed herein. The analysis reveals why adding more sensors can only exacerbate the situation. Numerical simulations are used to demonstrate that the proposed estimator consistently estimates the true number of signals in the dimension fixed, large sample size limit and the effective number of identifiable signals in the large dimension, large sample size limit. | ||

Statistical eigen-inference from large Wishart Matrices Rao, N. R., Mingo, J. A., Speicher, R., & Edelman, A. (2008). The Annals of Statistics, 2850-2885. | PDF arXiv Abstract | |

Abstract: We consider settings where the observations are drawn from a zero-mean multivariate (real or complex) normal distribution with the population covariance matrix having eigenvalues of arbitrary multiplicity. We assume that the eigenvectors of the population covariance matrix are unknown and focus on inferential procedures that are based on the sample eigenvalues alone (i.e., "eigen-inference"). Results found in the literature establish the asymptotic normality of the fluctuation in the trace of powers of the sample covariance matrix. We develop concrete algorithms for analytically computing the limiting quantities and the covariance of the fluctuations. We exploit the asymptotic normality of the trace of powers of the sample covariance matrix to develop eigenvalue-based procedures for testing and estimation. Specifically, we formulate a simple test of hypotheses for the population eigenvalues and a technique for estimating the population eigenvalues in settings where the cumulative distribution function of the (nonrandom) population eigenvalues has a staircase structure. Monte Carlo simulations are used to demonstrate the superiority of the proposed methodologies over classical techniques and the robustness of the proposed techniques in high-dimensional, (relatively) small sample size settings. The improved performance results from the fact that the proposed inference procedures are "global" (in a sense that we describe) and exploit "global" information thereby overcoming the inherent biases that cripple classical inference procedures which are "local" and rely on "local" information. | ||

The Polynomial Method for Random Matrices Rao, N. R., & Edelman, A. (2008). Foundations of Computational Mathematics, 8(6), 649-702. | PDF arXiv Abstract | |

Abstract: We define a class of "algebraic" random matrices. These are random matrices for which the Stieltjes transform of the limiting eigenvalue distribution function is algebraic, i.e., it satisfies a (bivariate) polynomial equation. The Wigner and Wishart matrices whose limiting eigenvalue distributions are given by the semi-circle law and the Marcenko-Pastur law are special cases. Algebraicity of a random matrix sequence is shown to act as a certificate of the computability of the limiting eigenvalue density function. The limiting moments of algebraic random matrix sequences, when they exist, are shown to satisfy a finite depth linear recursion so that they may often be efficiently enumerated in closed form. In this article, we develop the mathematics of the polynomial method which allows us to describe the class of algebraic matrices by its generators and map the constructive approach we employ when proving algebraicity into a software implementation that is available for download in the form of the RMTool random matrix "calculator" package. Our characterization of the closure of algebraic probability distributions under free additive and multiplicative convolution operations allows us to simultaneously establish a framework for computational (non-commutative) "free probability" theory. We hope that the tools developed allow researchers to finally harness the power of the infinite random matrix theory. | ||

The beta-Jacobi matrix model, the CS decomposition, and generalized singular value problems Edelman, A., & Sutton, B. D. (2008). Foundations of Computational Mathematics, 8(2), 259-285. | PDF | |

From Random Matrices to Stochastic Operators Edelman, A., & Sutton, B. D. (2007). Journal of Statistical Physics, 127(6), 1121-1165. | PDF arXiv Abstract | |

Abstract: We propose that classical random matrix models are properly viewed as finite difference schemes for stochastic differential operators. Three particular stochastic operators commonly arise, each associated with a familiar class of local eigenvalue behavior. The stochastic Airy operator displays soft edge behavior, associated with the Airy kernel. The stochastic Bessel operator displays hard edge behavior, associated with the Bessel kernel. The article concludes with suggestions for a stochastic sine operator, which would display bulk behavior, associated with the sine kernel. | ||

MOPS: Multivariate Orthogonal Polynomials (symbolically) Dumitriu, I., Edelman, A., & Shuman, G. (2007). Journal of symbolic computation, 42(6), 587-620. | PDF arXiv Abstract | |

Abstract: In this paper we present a Maple library (MOPs) for computing Jack, Hermite, Laguerre, and Jacobi multivariate polynomials, as well as eigenvalue statistics for the Hermite, Laguerre, and Jacobi ensembles of Random Matrix theory. We also compute multivariate hypergeometric functions, and offer both symbolic and numerical evaluations for all these quantities. We prove that all algorithms are well-defined, analyze their complexity, and illustrate their performance in practice. Finally, we also present a few of the possible applications of this library. | ||

Sample Size Cognizant Detection of Signals in White Noise Nadakuditi, R. R., & Edelman, A. (2007, June). In Signal Processing Advances in Wireless Communications, 2007. SPAWC 2007. IEEE 8th Workshop on (pp. 1-5). IEEE. | PDF arXiv Abstract | |

Abstract: The detection and estimation of signals in noisy, limited data is a problem of interest to many scientific and engineering communities. We present a computationally simple, sample eigenvalue based procedure for estimating the number of high-dimensional signals in white noise when there are relatively few samples. We highlight a fundamental asymptotic limit of sample eigenvalue based detection of weak high-dimensional signals from a limited sample size and discuss its implication for the detection of two closely spaced signals. This motivates our heuristic definition of the 'effective number of identifiable signals.' Numerical simulations are used to demonstrate the consistency of the algorithm with respect to the effective number of signals and the superior performance of the algorithm with respect to Wax and Kailath's "asymptotically consistent" MDL based estimator. | ||

On the Largest Principal Angle between Random Subspaces Absil, P. A., Edelman, A., & Koev, P. (2006). Linear Algebra and its applications, 414(1), 288-294. | PDF Abstract | |

Abstract: Formulas are derived for the probability density function and the probability distribution function of the largest canonical angle between two p -dimensional subspaces of Rn chosen from the uniform distribution on the Grassmann manifold (which is the unique distribution invariant by orthogonal transformations of Rn). The formulas involve the gamma function and the hypergeometric function of a matrix argument. | ||

On the Efficient Evaluation of the Hypergeometric Function of a Matrix Argument Koev, P., & Edelman, A. (2006). Mathematics of Computation, 75(254), 833-846. | PDF arXiv Abstract | |

Abstract: We present new algorithms that efficiently approximate the hypergeometric function of a matrix argument through its expansion as a series of Jack functions. Our algorithms exploit the combinatorial properties of the Jack function, and have complexity that is only linear in the size of the matrix. | ||

Global Spectrum Fluctuations for the beta-Hermite and beta-Laguerre ensembles via matrix models Dumitriu, I., & Edelman, A. (2006). Journal of Mathematical Physics, 47(6), 063302. | PDF arXiv Abstract | |

Abstract: We study the global spectrum fluctuations for beta-Hermite and beta-Laguerre ensembles via the tridiagonal matrix models introduced in [11], and prove that the fluctuations describe a Gaussian process on monomials. We extend our results to slightly larger classes of random matrices. | ||

Eigenvalues of Hermite and Laguerre ensembles: Large Beta Asymptotics Dumitriu, I., & Edelman, A. (2005). In Annales de l'IHP Probabilités et statistiques, 41(6), 1083-1099. | PDF arXiv Abstract | |

Abstract: In this paper we examine the zero and first order eigenvalue fluctuations for the $\beta$-Hermite and $\beta$-Laguerre ensembles, using the matrix models we described in [5], in the limit as $\beta \to \infty$. We find that the fluctuations are described by Gaussians of variance $O(1/\beta)$, centered at the roots of a corresponding Hermite (Laguerre) polynomial. We also show that the approximation is very good, even for small values of $\beta$, by plotting exact level densities versus sum of Gaussians approximations. | ||

Tails of Condition Number Distributions Edelman, A., & Sutton, B. D. (2005). SIAM journal on matrix analysis and applications, 27(2), 547-560. | PDF Abstract | |

Abstract: Let $\kappa$ be the condition number of an m-by-n matrix with independent standard Gaussian entries, either real ($\beta = 1$) or complex ($\beta = 2$). The major result is the existence of a constant C (depending on m, n, and $\beta$) such that $P[\kappa > x] < C \, x^{-\beta}$ for all x. As $x \rightarrow \infty$, the bound is asymptotically tight. An analytic expression is given for the constant C, and simple estimates are given, one involving a Tracy--Widom largest eigenvalue distribution. All of the results extend beyond real and complex entries to general $\beta$. | ||

On the Probability Distribution of the Outputs of the Diagonally Loaded Capon-MVDR Processor Nadakuditi, R. R., & Edelman, A. (2005, October). In Signals, Systems and Computers, 2005. Conference Record of the Thirty-Ninth Asilomar Conference on (pp. 1717-1723). IEEE. | ||

Numerical Methods for Eigenvalue Distributions of Random Matrices Edelman, A., & Persson, P. O. (2005). arXiv preprint math-ph/0501068. | PDF arXiv Abstract | |

Abstract: We present efficient numerical techniques for calculation of eigenvalue distributions of random matrices in the beta-ensembles. We compute histograms using direct simulations on very large matrices, by using tridiagonal matrices with appropriate simplifications. The distributions are also obtained by numerical solution of the Painleve II and V equations with high accuracy. For the spacings we show a technique based on the Prolate matrix and Richardson extrapolation, and we compare the distributions with the zeros of the Riemann zeta function. | ||

Matrix Models for Beta Ensembles Dumitriu, I., & Edelman E. (2002). Journal of Mathematical Physics, 43, 5830--5847. | PDF arXiv Abstract | |

Abstract: This paper constructs tridiagonal random matrix models for general ($\beta>0$) $\beta$-Hermite (Gaussian) and $\beta$-Laguerre (Wishart) ensembles. These generalize the well-known Gaussian and Wishart models for $\beta = 1,2,4$. Furthermore, in the cases of the $\beta$-Laguerre ensembles, we eliminate the exponent quantization present in the previously known models. We further discuss applications for the new matrix models, and present some open problems. | ||

On the determinant of a uniformly distributed complex matrix Edelman, A. (1995). Journal of Complexity, 11(3), 352-357. | PDF | |

How many zeros of a random polynomial are real? Edelman, A., & Kostlan, E. (1995). Bulletin of the American Mathematical Society, 32(1), 1-37. | PDF arXiv Abstract | |

Abstract: We give an elementary derivation of the Kac integral formula for the expected number of real roots of a random polynomial with independent standard normally distributed coefficients. We show that the expected number of real roots is the length of the moment curve $(1,t,\ldots,t^n)$ projected onto the surface of the unit sphere, divided by $\pi$. The density of the root is proportional to how fast this curve is traced out. We generalize the Kac formula to polynomials with coefficients that have an arbitrary multivariate normal distribution. We show, for example, that for a particularly nice definition of random polynomial, the expected number of real roots is exactly the square root of the degree.When the random polynomials have an arbitrary density function, we express the expected number of roots as a weighted length of the curve. We also calculate the the distribution of the real zeros of random power series and Fourier series, random sums of orthogonal polynomials, and random Dirichlet series. | ||

How many eigenvalues of a random matrix are real? Edelman, A., Kostlan, E., & Shub, M. (1994). Journal of the American Mathematical Society, 7(1), 247-267. | PDF Abstract | |

Abstract: Let $ A$ be an $ n \times n$ matrix whose elements are independent random variables with standard normal distributions. As $ n \to \infty $, the expected number of real eigenvalues is asymptotic to $ \sqrt {2n/\pi } $. We obtain a closed form expression for the expected number of real eigenvalues for finite $ n$, and a formula for the density of a real eigenvalue for finite $ n$. Asymptotically, a real normalized eigenvalue $ \lambda /\sqrt n $ of such a random matrix is uniformly distributed on the interval [-1, 1]. Analogous, but strikingly different, results are presented for the real generalized eigenvalues. We report on numerical experiments confirming these results and suggesting that the assumption of normality is not important for the asymptotic results. | ||

The road from Kac's matrix to Kac's random polynomials Edelman, A. & Kostlan, E. (1994). Proceedings of the 1994 SIAM Applied Linear Algebra Conference J.G. Lewis, ed., SIAM, Philadelphia, 503--507 | PDF | |

The Circular Law and the Probability that a Random Matrix Has $k$ Real Eigenvalues Edelman, A. (1993). preprint. | PDF Abstract | |

Abstract: Let $A$ be an $n$ by $n$ matrix whose elements are independent random variables with standard normal distributions. Girko's controversial circular law states that the distribution of appropriately normalized eigenvalues is asymptotically uniform in the unit disk in the complex plane. We derive the exact distribution of the complex eigenvalues for finite $n$, from which convergence in distribution to the circular law for normally distributed matrices may be derived.Similar methodology allows us to derive a joint distribution formula for the real Schur decomposition of $A$. Integration of this distribution yields the probability that $A$ has exactly $k$ real eigenvalues. For example, we show that the probability that $A$ has all real eigenvalues is exactly $2^{-n(n-1)/4}$. | ||

Eigenvalue Roulette and Random Test Matrices Edelman, A. (1993). In Linear Algebra for Large Scale and Real-Time Applications (pp. 365-368). Springer Netherlands. | PDF Abstract | |

Abstract: Random matrices may not be mentioned in research abstracts, but second only to Matlab, the most widely used tool of numerical linear algebraists is the “random test matrix.” Algorithmic developers in need of guinea pigs nearly always take random matrices with standard normal entries or perhaps close cousins, such as the uniform distribution on [-1,1]. The choice is highly reasonable: these matrices are generated effortlessly and might very well catch programming errors. What is a mistake is to psychologically link a random matrix with the intuitive notion of a "typical" matrix or the vague concept of "any old matrix." | ||

On the distribution of a scaled condition number Edelman, A. (1992). Mathematics of Computation, 58(197), 185-190. | PDF | |

The distribution and moments of the smallest eigenvalue of a random matrix of Wishart type Edelman, A. (1991). Linear algebra and its applications, 159, 55-80. | Abstract | |

Abstract: Given a random rectangular m × n matrix with elements from a normal distribution, what is the distribution of the smallest singular value? To pose an equivalent question in the language of multivariate statistics, what is the distribution of the smallest eigenvalue of a matrix from the central Wishart distribution in the null case? We give new results giving the distribution as a simple recursion. This includes the more difficult case when n – m is an even integer, without resorting to zonal polynomials and hypergeometric functions of matrix arguments. With the recursion, one can obtain exact expressions for the density and the moments of the distribution in terms of functions usually no more complicated than polynomials, exponentials, and at worst ordinary hypergeometric functions. We further elaborate on the special cases when n – m = 0, 1, 2, and 3 and give a numerical table of the expected values for 2 ⩽ m ⩽ 25 and 0 ⩽ n – m ⩽ 25. | ||

Eigenvalues and Condition Numbers of Random Matrices Edelman, A. (1989). MIT PhD Dissertation(4).(As of 2002, the figures are now included in this file. They were all recomputed!) | PDF | |

Eigenvalues and condition numbers of random matrices Edelman, A. (1988). SIAM Journal on Matrix Analysis and Applications, 9(4), 543-560. | PDF Abstract | |

Abstract: Given a random matrix, what condition number should be expected? This paper presents a proof that for real or complex $n \times n$ matrices with elements from a standard normal distribution, the expected value of the log of the 2-norm condition number is asymptotic to $\log n$ as $n \to \infty$. In fact, it is roughly $\log n + 1.537$ for real matrices and $\log n + 0.982$ for complex matrices as $n \to \infty$. The paper discusses how the distributions of the condition numbers behave for large n for real or complex and square or rectangular matrices. The exact distributions of the condition numbers of $2 \times n$ matrices are also given.Intimately related to this problem is the distribution of the eigenvalues of Wishart matrices. This paper studies in depth the largest and smallest eigenvalues, giving exact distributions in some cases. It also describes the behavior of all the eigenvalues, giving an exact formula for the expected characteristic polynomial. | ||

A linear-time algorithm for evaluating series of Schur functions Chan, C., Drensky, V., Edelman, A., & Koev, P. | PDF |

## Parallel Computing

Julia: A fresh approach to parallel programming Edelman, A. (2015, May). =Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International (pp. 517-517). IEEE. | Abstract | |

Abstract: Summary form only given. The Julia programming language is gaining enormous popularity. Julia was designed to be easy and fast. Most importantly, Julia shatters deeply established notions widely held in the applied community. Julia shows the fascinating dance between specialization and abstraction. Specialization allows for custom treatment. We can pick just the right algorithm for the right circumstance and this can happen at runtime based on argument types (code selection via multiple dispatch). Abstraction recognizes what remains the same after differences are stripped away and ignored as irrelevant. The recognition of abstraction allows for code reuse (generic programming). A simple idea that yields incredible power. Julia is many things to many people. In this talk we describe how Julia was built on the heels of our parallel computing experience with Star-P which began as an MIT research project and was a software product of Interactive Supercomputing. Our experience taught us that bolting parallelism onto an existing language that was not designed for performance or parallelism is difficult at best, and impossible at worst. One of our (not so secret) motivations to build Julia was to have the language we wanted for parallel numerical computing. | ||

Parallel Prefix Polymorphism Permits Parallelization, Presentation & Proof Chen, J., & Edelman, A. (2014, November). In Proceedings of the 1st First Workshop for High Performance Technical Computing in Dynamic Languages (pp. 47-56). IEEE Press. | PDF Abstract | |

Abstract: Polymorphism in programming languages enables code reuse. Here, we show that polymorphism has broad applicability far beyond computations for technical computing: parallelism in distributed computing, presentation of visualizations of runtime data flow, and proofs for formal verification of correctness. The ability to reuse a single codebase for all these purposes provides new ways to understand and verify parallel programs. | ||

Array Operators Using Multiple Dispatch: A design methodology for array implementations in dynamic languages Bezanson, J., Chen, J., Karpinski, S., Shah, V., & Edelman, A. (2014, June). In Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (p. 56). ACM. | PDF Abstract | |

Abstract: Arrays are such a rich and fundamental data type that they tend to be built into a language, either in the compiler or in a large lowlevel library. Defining this functionality at the user level instead provides greater flexibility for application domains not envisioned by the language designer. Only a few languages, such as C++ and Haskell, provide the necessary power to define n-dimensional arrays, but these systems rely on compile-time abstraction, sacrificing some flexibility. In contrast, dynamic languages make it straightforward for the user to define any behavior they might want, but at the possible expense of performance.As part of the Julia language project, we have developed an approach that yields a novel trade-off between flexibility and compiletime analysis. The core abstraction we use is multiple dispatch. We have come to believe that while multiple dispatch has not been especially popular in most kinds of programming, technical computing is its killer application. By expressing key functions such as array indexing using multi-method signatures, a surprising range of behaviors can be obtained, in a way that is both relatively easy to write and amenable to compiler analysis. The compact factoring of concerns provided by these methods makes it easier for userdefined types to behave consistently with types in the standard library. | ||

Julia: A Fast Dynamic Language for Technical Computing Bezanson, J., Karpinski, S., Shah, V. B., & Edelman, A. (2012). arXiv preprint arXiv:1209.5145. | PDF arXiv Abstract | |

Abstract: Dynamic languages have become popular for scientific computing. They are generally considered highly productive, but lacking in performance. This paper presents Julia, a new dynamic language for technical computing, designed for performance from the beginning by adapting and extending modern programming language techniques. A design based on generic functions and a rich type system simultaneously enables an expressive programming model and successful type inference, leading to good performance for a wide range of programs. This makes it possible for much of the Julia library to be written in Julia itself, while also incorporating best-of-breed C and Fortran libraries. | ||

Language and Compiler Support for Auto-Tuning Variable-Accuracy Algorithms Ansel, J., Wong, Y. L., Chan, C., Olszewski, M., Edelman, A., & Amarasinghe, S. (2011, April). In Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (pp. 85-96). IEEE Computer Society. | PDF Abstract | |

Abstract: Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad-hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose trade-offs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and installtime autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques. Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1× to orders of magnitude of speedup. | ||

Petabricks: A Language and Compiler for Algorithmic Choice Ansel, J., Chan, C., Wong, Y. L., Olszewski, M., Zhao, Q., Edelman, A., & Amarasinghe, S. (2009). (Vol. 44, No. 6, pp. 38-49). ACM. | PDF Abstract | |

Abstract: It is often impossible to obtain a one-size-fits-all solution for high performance algorithms when considering different choices for data distributions, parallelism, transformations, and blocking. The best solution to these choices is often tightly coupled to different architectures, problem sizes, data, and available system resources. In some cases, completely different algorithms may provide the best performance. Current compiler and programming language techniques are able to change some of these parameters, but today there is no simple way for the programmer to express or the compiler to choose different algorithms to handle different parts of the data. Existing solutions normally can handle only coarse-grained, library level selections or hand coded cutoffs between base cases and recursive cases. We present PetaBricks, a new implicitly parallel language and compiler where having multiple implementations of multiple algorithms to solve a problem is the natural way of programming. We make algorithmic choice a first class construct of the language. Choices are provided in a way that also allows our compiler to tune at a finer granularity. The PetaBricks compiler autotunes programs by making both fine-grained as well as algorithmic choices. Choices also include different automatic parallelization techniques, data distributions, algorithmic parameters, transformations, and blocking. Additionally, we introduce novel techniques to autotune algorithms for different convergence criteria. When choosing between various direct and iterative methods, the PetaBricks compiler is able to tune a program in such a way that delivers near-optimal efficiency for any desired level of accuracy. The compiler has the flexibility of utilizing different convergence criteria for the various components within a single algorithm, providing the user with accuracy choice alongside algorithmic choice. | ||

Autotuning Multigrid with PetaBricks Chan, C., Ansel, J., Wong, Y. L., Amarasinghe, S., & Edelman, A. (2009, November). In High Performance Computing Networking, Storage and Analysis, Proceedings of the Conference on (pp. 1-12). IEEE. | PDF Abstract | |

Abstract: Algorithmic choice is essential in any problem domain to realizing optimal computational performance. Multigrid is a prime example: not only is it possible to make choices at the highest grid resolution, but a program can switch techniques as the problem is recursively attacked on coarser grid levels to take advantage of algorithms with different scaling behaviors. Additionally, users with different convergence criteria must experiment with parameters to yield a tuned algorithm that meets their accuracy requirements. Even after a tuned algorithm has been found, users often have to start all over when migrating from one machine to another. We present an algorithm and autotuning methodology that address these issues in a near-optimal and efficient manner. The freedom of independently tuning both the algorithm and the number of iterations at each recursion level results in an exponential search space of tuned algorithms that have different accuracies and performances. To search this space efficiently, our autotuner utilizes a novel dynamic programming method to build efficient tuned algorithms from the bottom up. The results are customized multigrid algorithms that invest targeted computational power to yield the accuracy required by the user. The techniques we describe allow the user to automatically generate tuned multigrid cycles of different shapes targeted to the user's specific combination of problem, hardware, and accuracy requirements. These cycle shapes dictate the order in which grid coarsening and grid refinement are interleaved with both iterative methods, such as Jacobi or Successive Over-Relaxation, as well as direct methods, which tend to have superior performance for small problem sizes. The need to make choices between all of these methods brings the issue of variable accuracy to the forefront. Not only must the autotuning framework compare different possible multigrid cycle shapes against each other, but it also needs the ability to compare tuned cy- les against both direct and (non-multigrid) iterative methods. We address this problem by using an accuracy metric for measuring the effectiveness of tuned cycle shapes and making comparisons over all algorithmic types based on this common yardstick. In our results, we find that the flexibility to trade performance versus accuracy at all levels of recursive computation enables us to achieve excellent performance on a variety of platforms compared to algorithmically static implementations of multigrid. Our implementation uses PetaBricks, an implicitly parallel programming language where algorithmic choices are exposed in the language. The PetaBricks compiler uses these choices to analyze, autotune, and verify the PetaBricks program. These language features, most notably the autotuner, were key in enabling our implementation to be clear, correct, and fast. | ||

Interactive Supercomputing’s Star-P Platform: Parallel MATLAB and MPI Homework Classroom Study on High Level Language Productivity Edelman, A., Husbands, P., & Leibman, S. (2006). Massachusetts Institute of Technology, Cambridge, MA. (No. DOE/FE/25631-1). | PDF | |

Star-P User Guide (2006). Interactive Supercomputing. | PDF | |

Getting Started with Star-P: Taking Your First Test-Drive (2006). Interactive Supercomputing | PDF | |

Parallel MATLAB: doing it right Choy, R., & Edelman, A. (2005). Proceedings of the IEEE, 93(2), 331-341. | PDF Abstract | |

Abstract: MATLAB [20] is one of the most widely used mathematical computing environments in technical computing. It is an interactive environment that provides high performance computational routines and an easy-to-use, C-like scripting language. It has started out as an interactive interface to EISPACK [31] and LINPACK [13], and has remained a serial program. In 1995, Cleve Moler of Mathworks argued that there was no market at the time for a parallel MATLAB [26]. But times have changed and we are seeing increasing interest in developing a parallel MATLAB, from both academic and commercial sectors. In a recent survey, [10] 27 parallel MATLAB projects have been identiﬁed. In this paper we will expand upon that survey and discuss the approaches the projects have taken to parallelize MATLAB. Also we will describe innovative features in some of the parallel MATLAB projects. Then we will conclude with an idea of a ‘right’ parallel MATLAB.Finally we will give an example of what we think is a ‘right’ parallel MATLAB: MATLAB*P [11] | ||

Star-P: High productivity parallel computing Choy, R., Edleman, A., Gilbert, J. R., Shah, V., & Cheng, D. (2004). Massachusetts Institute of Technology, Cambridge, MA. | PDF Abstract | |

Abstract: Star-P is an interactive parallel scientific computing environment. It aims to make parallel programming more accessible. Star-P borrows ideas from Matlab*P [3], but is a new development. Currently only a Matlab interface for Star-P is available, but it is not limited to being a parallel Matlab. It combines all four parallel Matlab approaches in one environment, as described in the parallel Matlab survey [2]: embarrassingly parallel, message passing, backend support and compilation. It also integrates several parallel numerical libraries into one single easy-to-use piece of software. The focus of Star-P is to improve user productivity in parallel programming. We believe that Star-P can dramatically reduce the difficulty of programming parallel computers by reducing the time needed for development and debugging. To achieve productivity, it is important that the user interface is intuitive to the user. For example, a large class of scientific users are already familiar with the Matlab language. So it is beneficial to use Matlab as a parallel programming language. Additions to the language are minimal in keeping with the philosophy to avoid re-learning. Also, as a design goal, our system does not distinguish between serial data and parallel data. C = op(A,B) print(C) C will be the same whether A and B are distributed or not. This will allow the same piece of code to run sequentially (when all the arguments are serial) or in parallel (when at least one of the arguments is distributed). | ||

Interactive Supercomputing with MITMatlab Husbands, P., Isbell Jr, C. L., & Edelman, A. (1998). | PDF | |

The Future Fast Fourier Transform? Edelman, A., McCorquodale, P., & Toledo, S. (1998). SIAM Journal on Scientific Computing, 20(3), 1094-1114. | PDF Abstract | |

Abstract: It seems likely that improvements in arithmetic speed will continue to outpace advances in communication bandwidth. Furthermore, as more and more problems are working on huge datasets, it is becoming increasingly likely that data will be distributed across many processors because one processor does not have suﬃcient storage capacity. For these reasons, we propose that an inexact DFT such as an approximate matrix-vector approach based on singular values or a variation of the Dutt–Rokhlin fast-multipole-based algorithm may outperform any exact parallel FFT. The speedup may be as large as a factor of three in situations where FFT run time is dominated by communication. For the multipole idea we further propose that a method of “virtual charges” may improve accuracy, and we provide an analysis of the singular values that are needed for the approximate matrix-vector approaches. | ||

Index Transformation Algorithms in a Linear Algebra Framework Edelman, A., Heller, S., & Johnsson, S. L. (1994). Parallel and Distributed Systems, IEEE Transactions on, 5(12), 1302-1309. | PDF Abstract | |

Abstract: We present a linear algebraic formulation for a class of index transformations such as Gray code encoding and decoding, matrix transpose, bit reversal, vector reversal, shues, and other index or dimension permutations. This formulation unies, simplies, and can be used to derive algorithms for hypercube multiprocessors. We show how all the widely known properties of Gray codes and some not so well-known properties as well, can be derived using this framework. Using this framework, we relate hypercube communications algorithms to Gauss-Jordan elimination on a matrix of 0's and 1's. | ||

Large Numerical Linear Algebra in 1994: The Continuing Influence of Parallel Computing Edelman, A. (1994, May). In Scalable High-Performance Computing Conference, 1994., Proceedings of the (pp. 781-787). IEEE. | PDF | |

Hypercube algorithms for direct N-body solvers for different granularities Brunet, J. P., Mesirov, J. P., & Edelman, A. (1993). SIAM Journal on Scientific Computing, 14(5), 1143-1158. | Abstract | |

Abstract: Algorithms for the N-body problem are compared and contrasted, particularly those where N is in the range for which direct methods outperform approximation methods. With fewer bodies than processors, the so-called “replicated orrery” on a three-dimensional grid has been used successfully on the Connection Machine CM-2 architecture. With more bodies, the "rotated and translated Gray codes" is an ideal direct algorithm for machines such as the CM-2 in that it takes optimal advantage of the communications bandwidth of the machine. A classical Latin square can be used to abstractly denote any direct N-body calculation. Computational windows superimposed on this square illustrate the granularity of the computation. This point of view naturally illustrates a sequence of algorithms ranging along a granularity scale in the following order: "massively parallel," "replicated orrery," "orrery," "rotated/translated Gray codes," and "serial." | ||

Large Dense Numerical Linear Algebra in 1993: The Parallel Computing Influence Edelman, A. (1993). International Journal of High Performance Computing Applications, 7(2), 113-128. | PDF Abstract | |

Abstract: This article surveys the current state of applications of large dense numerical linear algebra and the influence of parallel computing. Furthermore, it attempts to crys tallize many important ideas that are sometimes mis understood in the rush to write fast programs. | ||

The first annual large dense linear system survey Edelman, A. (1991). ACM SIGNUM Newsletter, 26(4), 6-12. | PDF | |

Optimal matrix transposition and bit reversal on hypercubes: All--to--all personalized communication Edelman, A. (1991). Journal of Parallel and Distributed Computing<, 11(4), 328-331. | PDF Abstract | |

Abstract: In a hypercube multiprocessor with distributed memory, messages have a street address and an apartment number, i.e., a hypercube node address and a local memory address. Here we describe an optimal algorithm for performing the communication described by exchanging the bits of the node address with that of the local address. These exchanges occur typically in both matrix transposition and bit reversal for the fast Fourier transform. | ||

Matrix multiplication on hypercubes using full bandwidth and constant storage Ho, C. T., & Johnsson, L. (1991). In Sixth Distributed Memory Computing Conference (pp. 447-451). | PDF | |

Guess what? We have a hypercube Edelman, A. (1989, not published ) Thinking Machines Corporation Semi-Internal Report. | PDF Abstract | |

Abstract: Here I try my best to explain how to use the hypercube wires on the Connection Machine. At one time, direct use of the wires was considered beyond the reach of reasonable programming, but this is no longer true. We now have the primitives and expertise that make self routing reasonable and actually quite fun. No previous experience is assumed, all you have to do is be willing to learn about the capabilities now available in CMIS. In this document I emphasize less what algorithms can be implemented using the hypercube wires, and more the mechanics of how to use them. It might be nice to have another document that shows off how the hypercube can be used in non-trivial ways. The CM-2’s power is being terribly under-utilized. It is my hope that several people will use this document as a departure point for implementing fast communications routines. There is no doubt that fixed communications patterns can be much faster. This document is not at the level of a textbook yet, but I have tried my best to explain as much as possible. I strongly encourage people using the wires to send questions to the “wires” mailing list and work with me (while I am still around) , Mark Bromley, or Steve Heller to take advantage of the expertise that we have gained. Just lately I have been hearing talk about implementing stencils as a program product. Another product that would run beautifully fast on the machine is an n-body package. Other primitives that can and should be written are fast power of 2 news, spread row and column simultaneously (perfect for LU), spread from diagonal accross row and column simultaneously (perfect for Jacobi eigenvalues and SVD), scans etc. There is plenty of work to go around, all of which we have the know-how to solve. There have been many changes in the use of the wires in the past months. It went from impossible (the days of Steve Vavasis) to requiring microcode trickery and speed hacking (this is where I started working) to being more or less straightforward and convenient (Mark Bromley deserves tons of credit for this). Benchmarks based on news being fast and everything else being slow should be thrown away. AE’s hearing about code being communication bound should be asking probing questions about why. |

## Numerical Linear Algebra

Pascal Matrices Edelman, A., & Strang, G. (2004). American Mathematical Monthly, 189-197. | PDF | |

Staircase failures explained by orthogonal versal forms Edelman, A., & Ma, Y. (2000). SIAM Journal on Matrix Analysis and Applications, 21(3), 1004-1025. | PDF Abstract | |

Abstract: Treating matrices as points in n2 -dimensional space, we apply geometry to study and explain algorithms for the numerical determination of the Jordan structure of a matrix. Traditional notions such as sensitivity of subspaces are replaced with angles between tangent spaces of manifolds in n2 -dimensional space. We show that the subspace sensitivity is associated with a small angle between complementary subspaces of a tangent space on a manifold in n2 -dimensional space. We further show that staircase algorithm failure is related to a small angle between what we call staircase invariant space and this tangent space. The matrix notions in n2 -dimensional space are generalized to pencils in 2mn-dimensional space. We apply our theory to special examples studied by Boley, Demmel, and Kågström. | ||

A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part II: A Stratification Enhanced Staircase Algorithm Edelman, A., Elmroth, E., & Kågström, B. (1999). SIAM Journal on Matrix Analysis and Applications, 20(3), 667-699. | PDF Abstract | |

Abstract: Computing the Jordan form of a matrix or the Kronecker structure of a pencil is a well-known ill-posed problem. We propose that knowledge of the closure relations, i.e., the stratification, of the orbits and bundles of the various forms may be applied in the staircase algorithm. Here we discuss and complete the mathematical theory of these relationships and show how they may be applied to the staircase algorithm. This paper is a continuation of our Part I paper on versal deformations, but it may also be read independently. | ||

The Computation and Sensitivity of Double Eigenvalues Lippert, R. A., & Edelman, A. (1999). Advances in Computational Mathematics, Lecture Notes in Pure and Appl. Math, 202, 353-393. | PDF | |

A Counterexample to a Hadamard Matrix Pivot Conjecture Edelman, A., & Friendman, D. (1998). Linear and Multilinear Algebra, 44(1), 53-56. | Abstract | |

Abstract: In the study of the growth factor of completely pivoted Hadamard matrices, it becomes natural to study the possible pivots. Very little is known or provable about these pivots other than a few cases at the beginning and end. For example it is known that the first four pivots must be 1,2,2 and 4 and the last three pivots in backwards order must be n/2, and n/2. Based on computational experiments, it was conjectured by Day and Peterson, that the n—3rd pivot must always be n/4. This conjecture would have suggested a kind of symmetry with the first four pivots. In this note we demonstrate a matrix whose n–3rd pivot is n/2 showing that the conjecture is false. | ||

Non-generic eigenvalue perturbations of Jordan blocks Ma, Y., & Edelman, A. (1998). Linear algebra and its applications, 273(1), 45-63. | PDF Abstract | |

Abstract: We show that if an n × n Jordan block is perturbed by an O(ε) upper k-Hessenberg matrix (k subdiagonals including the main diagonal), then generically the eigenvalues split into p rings of size k and one of size r (if r ≠ 0), where n = pk + r. This generalizes the familiar result (k = n, p = 1, r = 0) that generically the eigenvalues split into a ring of size n. We compute the radii of the rings to first order and generalize the result in a number of directions involving multiple Jordan blocks of the same size. | ||

A Geometric Approach to Perturbation Theory of Matrices and Matrix Pencils. Part I: Versal Deformation Edelman, A., Elmroth, E., & Kågström, B. (1997). SIAM Journal on Matrix Analysis and Applications, 18(3), 653-692. | PDF Abstract | |

Abstract: Computing the Jordan form of a matrix or the Kronecker structure of a pencil is a well-known ill-posed problem. We propose that knowledge of the closure relations, i.e., the stratification, of the orbits and bundles of the various forms may be applied in the staircase algorithm. Here we discuss and complete the mathematical theory of these relationships and show how they may be applied to the staircase algorithm. This paper is a continuation of our Part I paper on versal deformations, but may also be read independently. | ||

On the complete pivoting conjecture for a Hadamard matrix of order 12 Edelman, A., & Mascarenhas, W. (1995). Linear and Multilinear Algebra, 38(3), 181-187. | PDF Abstract | |

Abstract: This paper settles a conjecture by Day and Peterson that if Gaussian elimination with complete pivoting is performed on a 12 by 12 Hadamard matrix, then (1,2,2,4,3,10/3,18/5,4,3,6,6,12) must be the (absolute) pivots. Our proof uses the idea of symmetric block designs to reduce the complexity that would be found in enumerating cases.In contrast, at least 30 patterns for the absolute values of the pivots have been observed for 16 by 16 Hadamard matrices. This problem is non-trivial because row and column permutations do not preserve pivots. A naive computer search would require (12!)2 trials. | ||

A simple estimate of the condition number of a linear system Guggenheimer, H. W., Edelman, A. S., & Johnson, C. R. (1995). College Mathematics Journal, 2-5. | ||

Polynomial roots from companion matrix eigenvalues Edelman, A., & Murakami, H. (1995). Mathematics of Computation, 64(210), 763-776. | PDF Abstract | |

Abstract: In classical linear algebra, the eigenvalues of a matrix are sometimes defined as the roots of the characteristic polynomial. An algorithm to compute the roots of a polynomial by computing the eigenvalues of the corresponding companion matrix turns the tables on the usual definition. We derive a first-order error analysis of this algorithm that sheds light on both the underlying geometry of the problem as well as the numerical error of the algorithm. Our error analysis expands on work by Van Dooren and Dewilde in that it states that the algorithm is backward normwise stable in a sense that must be defined carefully. Regarding the stronger concept of a small componentwise backward error, our analysis predicts a small such error on a test suite of eight random polynomials suggested by Toh and Trefethen. However, we construct examples for which a small componentwise relative backward error is neither predicted nor obtained in practice. We extend our results to polynomial matrices, where the result is essentially the same, but the geometry becomes more complicated. | ||

On Parlett's matrix norm inequality for the Cholesky decomposition Edelman, A., & Mascarenhas, W. F. (1995). Numerical linear algebra with applications, 2(3), 243-250. | PDF Abstract | |

Abstract: We show that a certain matrix norm ratio studied by Parlett has a supermum that is O(equation image) when the chosen norm is the Frobenius norm, while it is O(log n) for the 2-norm. This ratio arises in Parlett's analysis of the Cholesky decomposition of an n by n matrix. | ||

The dimension of matrices (matrix pencils) with given Jordan (Kronecker) canonical forms Demmel, J. W., & Edelman, A. (1995). Linear algebra and its applications, 230, 61-87. | PDF Abstract | |

Abstract: The set of $n$ by $n$ matrices with a given Jordan canonical form defines a subset of matrices in complex $n^2$ dimensional space. We analyze one classical approach and one new approach to count the dimension of this set. The new approach is based upon and meant to give insight into the staircase algorithm for the computation of the Jordan Canonical Form as well as the occasional failures of this algorithm. We extend both techniques to count the dimension of the more complicated set defined by the Kronecker canonical form of an arbitrary rectangular matrix pencil $A-\lambda B$. | ||

Scaling for orthogonality Edelman, A., & Stewart, G. W. (1993). Signal Processing, IEEE Transactions on, 41(4), 1676-1677. | PDF Abstract | |

Abstract: In updating algorithms where orthogonal transformations are accumulated, it is important to preserve the orthogonality of the product in the presence of rounding error. Moonen et al. (ibid., vol.39, p.1911-13, 1991) have pointed out that simply normalizing the columns of the product tends to preserve orthogonality-though not, as DeGroat (ibid., vol.39, p.1913-14, 1991) points out, to working precision. The authors discuss the previous work and give an analysis of the phenomenon. | ||

The complete pivoting conjecture for Gaussian Elimination is false Edelman, A. (1992). Mathematica Journal, 2(2), 58-61. | PDF | |

Snapshots of Mobile Jacobi Edelman, A. (1991). In Numerical Linear Algebra, Digital Signal Processing and Parallel Algorithms (pp. 485-488). Springer Berlin Heidelberg.(What a shame that I do not have the pictures available. Even better would be to put the original video on line. Maybe some day!) | PDF |

## Approximation Theory

Nonnegativity-, monotonicity-, or convexity-preserving cubic and quintic Hermite interpolation Dougherty, R. L., Edelman, A. S., & Hyman, J. M. (1989). Mathematics of Computation, 52(186), 471-494. | Abstract | |

Abstract: The Hermite polynomials are simple, effective interpolants of discrete data. These interpolants can preserve local positivity, monotonicity, and convexity of the data if we restrict their derivatives to satisfy constraints at the data points. This paper describes the conditions that must be satisfied for cubic and quintic Hermite interpolants to preserve these properties when they exist in the discrete data. We construct algorithms to ensure that these constraints are satisfied and give numerical examples to illustrate the effectiveness of the algorithms on locally smooth and rough data. | ||

On locally supported basis functions for the representation of geometrically continuous curves Dyn, N., Edelman, A., & Micchelli, C. A. (1987). Analysis, 7(3-4), 313-342. | ||

Admissible slopes for monotone and convex interpolation Edelman, A., & Micchelli, C. A. (1987). Numerische Mathematik, 51(4), 441-458. | PDF Abstract | |

Abstract: In many applications, interpolation of experimental data exhibiting some geometric property such as nonnegativity, monotonicity or convexity is unacceptable unless the interpolant reflects these characteristics. This paper identifies admissible slopes at data points of variousC 1 interpolants which ensure a desirable shape. We discuss this question, in turn for the following function classes commonly used for shape preserving interpolations: monotone polynomials,C 1 monotone piecewise polynomials, convex polynomials, parametric cubic curves and rational functions. |

TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines Kepner, J., Brightwell, R., Edelman, A., Gadepally, V., Jananthan, H., Jones, M., ... & Reuther, A. (2018, September). IEEE High Performance extreme Computing Conference (HPEC) (pp. 1-8). IEEE. | PDF arXiv Abstract | |

Abstract: The rise in computing hardware choices is driving a reevaluation of operating systems. The traditional role of an operating system controlling the execution of its own hardware is evolving toward a model whereby the controlling processor is distinct from the compute engines that are performing most of the computations. In this context, an operating system can be viewed as software that brokers and tracks the resources of the compute engines and is akin to a database management system. To explore the idea of using a database in an operating system role, this work defines key operating system functions in terms of rigorous mathematical semantics (associative array algebra) that are directly translatable into database operations. These operations possess a number of mathematical properties that are ideal for parallel operating systems by guaranteeing correctness over a wide range of parallel operations. The resulting operating system equations provide a mathematical specification for a Tabular Operating System Architecture (TabulaROSA) that can be implemented on any platform. Simulations of forking in TabularROSA are performed using an associative array implementation and compared to Linux on a 32,000+ core supercomputer. Using over 262,000 forkers managing over 68,000,000,000 processes, the simulations show that TabulaROSA has the potential to perform operating system functions on a massively parallel scale. The TabulaROSA simulations show 20x higher performance as compared to Linux while managing 2000x more processes in fully searchable tables. | ||

Novel Algebras for Advanced Analytics in Julia Shah, V. B., Edelman, A., Karpinski, S., Bezanson, J., & Kepner, J. (2013). | PDF Abstract | |

Abstract: A linear algebraic approach to graph algorithms that exploits the sparse adjacency matrix representation of graphs can provide a variety of benefits. These benefits include syntactic simplicity, easier implementation, and higher performance. One way to employ linear algebra techniques for graph algorithms is to use a broader definition of matrix and vector multiplication. We demonstrate through the use of the Julia language system how easy it is to explore semirings using linear algebraic methodologies. | ||

Visualizing Large Kronecker Graphs Nguyen, H., Kepner, J., & Edelman, A. (2011). Graph Algorithms in the Language of Linear Algebra (pp. 241-250). Society for Industrial and Applied Mathematics. | PDF Abstract | |

Abstract: Kronecker graphs have been shown to be one of the most promising models for real-world networks. Visualization of Kronecker graphs is an important challenge. This chapter describes an interactive framework to assist scientists and engineers in generating, analyzing, and visualizing Kronecker graphs with as little effort as possible. | ||

Performance of sample covariance based capon bearing only tracker Richmond, C. D., Geddes, R. L., Movassagh, R., & Edelman, A. (2011, November). Signals, Systems and Computers (ASILOMAR), 2011 Conference Record of the Forty Fifth Asilomar Conference (pp. 2036-2039). IEEE. | Abstract | |

Abstract: Bearing estimates input to a tracking algorithm require a concomitant measurement error to convey confidence. When Capon algorithm based bearing estimates are derived from low signal-to-noise ratio (SNR) data, the method of interval errors (MIE) provides a representation of measurement error improved over high SNR metrics like the Cramér-Rao bound or Taylor series. A corresponding improvement in overall tracker performance is had. These results have been demonstrated [4] assuming MIE has perfect knowledge of the true data covariance. Herein this assumption is weakened to explore the potential performance of a practical implementation that must address the challenges of non-stationarity and finite sample effects. Comparisons with known non-linear smoothing techniques designed to reject outlier measurements is also explored. | ||

Leveraging high performance computation for statistical wind prediction Chan, C. P., Stalker, J., Edelman, A., & Connors, S. (2010). | PDF Abstract | |

Abstract: This paper presents a new application of a particular machine learning technique for improving wind forecasting. The technique, known as kernel regression, is somewhat similar to fuzzy logic in that both make predictions based on the similarity of the current state to historical training states. Unlike fuzzy logic systems, kernel regression relaxes the requirement for explicit event classifications and instead leverages the training set to form an implicit multidimensional joint density and compute a conditional expectation given any available data.The need for faster, highly accurate, and cost-effective predictive techniques for wind power forecasting is becoming imperative as wind energy becomes a larger contributor to the energy mix in places throughout the world. Several approaches that depend on large computing resources are already in use today; however, high performance computing can help us not only solve existing computational problems faster or with larger inputs, but also create and implement new real-time forecasting mechanisms. In wind power forecasting, like in many other scientific domains, it is often important to be able to tune the tradeoff between accuracy and computational efficiency. The work presented here represents the first steps toward building a portable, parallel, auto-tunable forecasting program where the user can select a desired level of accuracy, and the program will respond with the fastest machine-specific parallel algorithm that achieves the accuracy target. | ||

A request: the Painlevé project Bornemann, F., Clarkson, P., Deift, P., Edelman, A., Its, A., & Lozier, D. (2010). AMSTAT News, 57, 1389. | PDF Abstract | |

Abstract: In recent years the Painlevé equations, particularly the six Painlevé transcendents PI,…, PVI, have emerged as the core of modern special function theory. In the eighteenth and nineteenth centuries, the classical special functions such as the Bessel functions, the Airy function, the Legendre functions, the hypergeometric functions, and so on, were recognized and developed in response to the problems of the day in electromagnetism, acoustics, hydrodynamics, elasticity, and many other areas. In the same way, around the middle of the twentieth century, as science and engineering continued to expand in new directions, a new class of functions, the Painlevé functions, started to appear in applications. The list of problems now known to be described by the Painlevé equations is large, varied, and expanding rapidly. | ||

A Scalable Parallel Sorting Algorithm for Contemporary Architectures Cheng, D. R., Shah, V. B., Gilbert, J. R., & Edelman, A. (2009). | PDF Abstract | |

Abstract: Modern scientic codes often combine numerical methods with combinatorial methods. Sorting, a widely studied problem in computer science, is an important primitive for combinatorial scientic computing. Increasingly, scientic codes target parallel computers. Scientic programming environments such as Matlab, Python, R, and Star-P provide sorting as a built-in function. We present a parallel sorting algorithm that is close to optimal in computation and communication, along with scaling results for shared and distributed memory architectures. One of the core motivations for developing this code was the lack of a quality scalable parallel sorting library. Our sorting library is available as open source and is written to be compatible with the C++ Standard Template Library. The code is available for download at http://gauss. cs. ucsb. edu/code/index. shtml | ||

The Star-P High Performance Computing Platform Edelman, Alan. Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on. Vol. 4. IEEE, 2007 | Abstract | |

Abstract: The high performance MATLAB® user now has more choices than ever. Interactive Supercomputing's Star-P embraces this new world where, as an example, a MATLAB user who never wants to leave MATLAB might sit next to a C++ programmer at the office and both surf the Internet for the latest high speed FFT written in yet another language. The MATLAB of the past now becomes one browser into a bigger computational world. HPC users need this bigger world. Other "browsers" can be imagined. The open Star-P platform gives users options never before available to programmers who have traditionally enjoyed living exclusively inside a MATLAB environment. | ||

Semi-automated parallelization using Star-P Schaa, D., Kaeli, D., & Edelman, A. (2007). | PDF Abstract | |

Abstract: The purpose of this research is to determine the effectiveness of Star-P as a tool for exploiting parallelism in high-performance scientific computing, and eventually contributing to its functionality—perhaps by porting it to run on other high performance platforms such as graphics processors. | ||

So What's innovative and exotic about star-P for MATLAB and other clients? Edelman, A. (2006, November). Proceedings of the 2006 ACM/IEEE conference on Supercomputing (p. 278). ACM. | Abstract | |

Abstract: Star-P is a unique technology offered by Interactive Supercomputing after nurturing at MIT. Star-P through its clever abstractions is solving the ease of use problem that has plagued supercomputing. Given that there have been around 30 parallel MATLABs including three other major offerings, Star-P must be way ahead to compete in the marketplace.Some of the innovative features of Star-P are the ability to program in MATLAB, hook in task parallel codes written using a processor free abstraction, hook in existing parallel codes, and obtain the performance that represents the HPC promise. All this is through a client/server interface. Other clients such as Python or R could be possible. The MATLAB, Python, or R becomes the "browser." If we make it look easy, it is because decades of parallel computing experience has taught us that it is not.This talk demonstrates the abstractions and innovations that make this possible. | ||

Free Probability, Sample Covariance Matrices, and Signal Processing Rao, N. R., & Edelman, A. (2006, May). Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on (Vol. 5, pp. V-V). IEEE. | Abstract | |

Abstract: Free probability provides tools and techniques for analyzing the eigen-spectra of large Hermitian random matrices. These stochastic eigen-analysis techniques have been invaluable in providing insight into the structure of sample covariance matrices. We briefly outline how these techniques can be used to analytically predict the spectrum of large sample covariance matrices. An eigen-inference application is briefly discussed | ||

A Novel Parallel Sorting Algorithm for Contemporary Architectures Cheng, D. R., Shah, V. B., Gilbert, J. R., & Edelman, A. (2006). Submitted to ALENEX. | PDF Abstract | |

Abstract: Traditionally, the field of scientific computing has been dominated by numerical methods. However, modern scientific codes often combine numerical methods with combinatorial methods. Sorting, a widely studied problem in computer science, is an important primitive for combinatorial scientific computing. As high performance computers become more affordable due to multi-core CPUs and commodity clustering, more and more scientific codes are written for parallel computers. Scientific programming environments such as Matlab and Star-P provide sorting as a built-in function. Parallel sorting can also form a basic building block to implement higher level combinatorial algorithms and computations with irregular communication patterns and workloads-such as parallel sparse matrix computations [15]. We describe the design and implementation of an algorithm for parallel sorting on contemporary architectures. Distributed memory architectures are widely in use today. The cost of communication is an order of magnitude larger than the cost of computation on such architectures. Often, it is not enough to tune existing algorithms. Newer architectures demand a fresh look at the problems being solved and new algorithms to yield good performance. We propose a parallel sorting algorithm which moves a minimal amount of data over the network. Our algorithm is close to optimal in both the computation and communication required. It moves lesser data than widely used sample sorting algorithms, and is computationally a lot more efficient on distributed and shared memory architectures. | ||

An Interactive Approach to Parallel Combinatorial Algorithms with Star-P Gilbert, J., Shah, V., Letsche, T., Reinhardt, S., & Edelman, A. (2005). Proc. 9th HPEC. | PDF Abstract | |

Abstract: In the context of combinatorial and graph algorithms, we wish to offer the programmer such a rich set of primitives and tools. The Star-P system, now a commercial product of Interactive Supercomputing, extends MATLAB style interactive programming to a high performance machine through a backend server [1, 2, 3, 4, 5, 6]. Our conclusion is that this system can offer highly tuned kernels for combinatorial and graph algorithms with the above core beliefs in mind. This is far more preferable and practical than asking users to learn to obtain speedups from scratch, reinventing the wheel each time. | ||

Fast Sorting on a Distributed-Memory Architecture Cheng, D. R., Shah, V., Gilbert, J. R., & Edelman, A. (2005). | PDF Abstract | |

Abstract: We consider the often-studied problem of sorting, for a parallel computer. Given an input array distributed evenly over p processors, the task is to compute the sorted output array, also distributed over the p processors. Many existing algorithms take the approach of approximately load-balancing the output, leaving each processor with Θ(n/p) elements. However, in many cases, approximate load-balancing leads to inefficiencies in both the sorting itself and in further uses of the data after sorting. We provide a deterministic parallel sorting algorithm that uses parallel selection to produce any output distribution exactly, particularly one that is perfectly load-balanced. Furthermore, when using a comparison sort, this algorithm is 1-optimal in both computation and communication. We provide an empirical study that illustrates the efficiency of exact data splitting, and shows an improvement over two sample sort algorithms. | ||

The bias of the MVDR beamformer outputs under diagonal loading Edelman, A., Nadakuditi, R. (2005) Acoustics, Speech, and Signal Processing, 2005. Proceedings.(ICASSP'05). IEEE International Conference | Abstract | |

Abstract: The MVDR beamformer is the most extensively used array processing algorithm and involves inverting the sample covariance matrix. In the snapshot deficient scenario, when the number of sensors is greater than or approximately equal to the number of snapshots, the eigenvalues of the resulting sample covariance matrix are poorly conditioned. Diagonal loading is then applied to the sample covariance matrix. Expressions for the bias of the resulting MVDR beamformer outputs in the sidelobe region are presented that are exact for asymptotically large arrays. Numerical simulations confirm the accuracy of these asymptotic expressions when predicting the bias of the outputs of moderately large arrays. | ||

Random matrix theory Edelman, A., & Rao, N. R. (2005). Acta Numerica, 14, 233-297. | PDF Abstract | |

Abstract: Random matrix theory is now a big subject with applications in many disciplines of science, engineering and finance. This article is a survey specifically oriented towards the needs and interests of a numerical analyst. This survey includes some original material not found anywhere else. We include the important mathematics which is a very modern development, as well as the computational software that is transforming the theory into useful practice. | ||

Advances in random matrix theory Edelman, A. (2005) IEEE 6th Workshop on Signal Processing Advances in Wireless Communications | Abstract | |

Abstract: Random matrix theory is now a big subject with applications in many fields of science, finance and engineering. This talk surveys some of the important mathematics that is a very modern development as well as the computational software that is transforming theory into useul practice. | ||

Modeling and rendering of weathered stone Dorsey, J., Edelman, A., Jensen, H. W., Legakis, J., & Pedersen, H. K. (2005, July). ACM SIGGRAPH 2005 Courses (p. 4). ACM. | PDF Abstract | |

Abstract: Stone is widespread in its use as a building material and artistic medium. One of its most remarkable qualities is that it changes appearance as it interacts with the environment. These changes are mainly confined to the surface but involve complex volumetric effects such as erosion and mineral dissolution. This paper presents an approach for the modeling and rendering of changes in the shape and appearance of stone.To represent stone, we introduce a slab data structure, which is a surface-aligned volume confined to a narrow region around the boundary of the stone. Our weathering model employs a simulation of the flow of moisture and the transport, dissolution, and recrystallization of minerals within the porous stone volume. In addition, this model governs the erosion of material from the surface. To render the optical effects of translucency and coloration due to the composition of minerals near the surface, we simulate the scattering of light inside the stone using a general subsurface Monte Carlo ray tracer. These techniques can capture many aspects of the time-dependent appearance of stone. We demonstrate the approach with models of granite and marble statues, as well as a sandstone column. | ||

Asymptotic mean squared error performance of diagonally loaded Capon-MVDR processor Richmond, C. D., Nadakuditi, R. R., & Edelman, A. (2005, November). Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers (Vol. 2005, pp. 1711-1716). | Abstract | |

Abstract: The asymptotic local mean squared error (MSE) performance of the Capon algorithm, aka the minimum variance distortionless response (MVDR) spectral estimator, has been studied extensively by several authors. Stoica et al.[21], Vaidyanathan and Buckley [23], and Hawkes and Nehorai [11] have exploited Taylor’s theorem and complex gradient methods to provide accurate prediction of the Capon algorithm signal parameter estimate MSE performance. These predictions are valid (i) above the estimation threshold signal-to-noise ratio (SNR) and (ii) provided a sufficient number of training samples is available for covariance estimation. The goal of this present analysis is to extend these results to the case in which the sample covariance matrix is diagonally loaded, as is often done in practice for regularization, stabilizing matrix inversion, and white noise gain control [7]. | ||

On the largest principal angle between random subspaces P-A Absil, A Edelman, P Koev (2004) | Abstract | |

Abstract: Formulas are derived for the probability density function and the probability distribution of the largest canonical angle between two p-dimensional subspaces of R chosen from the uniform distribution on the Grassmann manifold endowed with its canonical metric (which is the essentially unique isotropic metric, ie, invariant under rotations). | ||

Solving Multiple Classes of Problems in Parallel with MATLAB* P Choy, R., & Edelman, A. (2004). | PDF Abstract | |

Abstract: MATLAB [7] is one of the most widely used mathematical computing environments in technical computing. It is an interactive environment that provides high performance computational routines and an easy-to-use, C-like scripting language. Mathworks, the company that develops MATLAB, currently does not provide a version of MATLAB that can utilize parallel computing [9]. This has led to academic and commercial efforts outside Mathworks to build a parallel MATLAB, using a variety of approaches. MATLAB*P is a parallel MATLAB that focus on enhancing productivity by providing an easy to use parallel computing tool. Using syntaxes identical to regular MATLAB, it can be used to solve large scale algebraic problems as well as multiple small problems in parallel. This paper describes how the innovative combination of ’*p mode’ and ’MultiMATLAB/MultiOctave mode’ in MATLAB*P can be used to solve a large range of real world problems. | ||

An application of NNMF in a color calibration algorithm of film F Wang, A Edelman, A Rao. (2003) Technical report, MIT Laboratory for Computer Science | Abstract | |

Abstract: Introduction: Faithful reproduction of color is of great interest to animation film makers and manufacturers of color producing devices. Improvements to the conventional color calibration practice for color systems can have a wide range impact in the color industry. With the aim of achieving better color accuracy in the making and showing of animation movies, we propose a physically based numerical color calibration algorithm for motion picture film. | ||

Support Vector Machine Lagrange Multipliers and Simplex Volume Decompositions T Wen, A Edelman. (2000)submitted to SIAM journal on Matrix Analysis and Applications | PDF Abstract | |

Abstract: The Support Vector Machine (SVM) idea has attracted recent attention in solving classification and regression problems. As an example based method, SVMs distinguish two point classes by finding a separating boundary layer, which is determined by points that become known as Support Vectors (SVs). While the computation of the separating boundary layer is formulated as a linearly constrained Quadratic Programming (QP) problem, in practice the corresponding dual problem is computed. This paper investigates how the solution to the dual problem depends on the geometry. When examples are separable, we will show that the Lagrange multipliers (the unknowns of the dual problem) associated with SVs can be interpreted geometrically as a normalized ratio of simplex volumes, and at the same time a simplex volume decomposition relation must be satisfied. Examples for the two and three dimensional cases are given during the discussion. Besides showing geometric properties of SVMs, we also suggest a way to investigate the distribution of the Lagrange multipliers based on a random matrix model. We finish this paper with a further analysis of how the Lagrange multipliers depend on three critical angles using the Singular Value and CS decompositions. | ||

Multiscale computations with interpolating scaling R Lippert, T Arias, A Edelman. (1999) | Abstract | |

Abstract: Wavelets o er a means of approximating functions that allows selective re nement. If regions of an image or a signal have exceptionally large variations, one need only store a set of coe cients, determined by function values in neighborhoods of those regions, in order to reproduce these variations accurately. In this way, one can have approximations of functions in terms of a basis that has spatially varying resolution. This approach reduces the memory storage required to represent functions and may be used for data compression. Physical applications often involve multiple physical elds which interact in space with non-linear couplings. E cient implementations must minimize not only storage to represent the these elds but also the processing required to describe their interactions. | ||

Akamai Technologies: A Mathematical Success Story Edelman, A. (1999) SIAM News, 32-1. (pp. 12-13). | Abstract | |

Abstract: What a success story for mathematics! No, it’s not a proof of the Riemann hypothesis, but a multibillion-dollar mathematical event. It’s the story of a research problem that blossomed into one of the hottest recent IPOs on Wall Street. And it’s all due to the tremendous impact of the Internet.The mathematicians are Tom Leighton (of the applied mathematics group and the Laboratory for Computer Science at MIT) and Danny Lewin (often referred to as “Tom’s graduate student” but a visionary in his own right). Together, they created Akamai Technologies. Akamai is now a large company, with hundreds of employees (many of whom are SIAM members; Leighton himself is a SICOMP editor). | ||

The probability that a random real gaussian matrix haskreal eigenvalues, related distributions, and the circular law Edelman, A. (1997). Journal of Multivariate Analysis, 60-2. (pp. 203-232). | PDF Abstract | |

Abstract: This paper investigates the eigenvalues of a real random n by n matrix of standard normals. Our primary question is, "What is the probability pn, k that exactly k eigenvalues are real?" A simpler question that has been studied in much stronger form in the literature is, "Why do the complex eigenvalues when properly normalized roughly fall uniformly in the unit disk as in Fig. 1?"Both questions can be answered by first factoring the matrix into some form of the real Schur decomposition, then interpreting this decomposition as a change of variables, and finally performing a wedge product derivation of the Jacobian of this change of variables. We demonstrate the power of these techniques by obtaining exact answers to these questions. | ||

Scalable parallel numerical methods and software tools for material design Bylaskay, E. J., Kohnz, S. R., Badenz, S. B., Edelmanx, A., Kawai, R., Ongk, M. E. G., & Weare, J. H. (1995). SIAM, 75. (pp.219). | PDF Abstract | |

Abstract: A new method of solution to the local spin density approximation to the electronic Schrodinger equation is presented. The method is based on an efficient, parallel, adaptive multigrid eigenvalue solver. It is shown that adaptivity is both necessary and sufficient to accurately solve the eigenvalue problem near the singularities at the atomic centers. While preliminary, these results suggest that direct real space methods may provide a much needed method for efficiently computing the forces in complex materials. | ||

An optional hypercube direct N-body solver on the connection machine JP Brunet, A Edelman, JP Mesirov. (1990). Proceedings of the 1990 ACM/IEEE conference on Supercomputing. IEEE Computer Society Press. (pp. 748-752). | PDF Abstract | |

Abstract: The direct method for solving N-body problems maps perfectly onto hypercube architectures. Unlike other hypercube implementations, we have implemented a direct N-body solver on the Connection Machine CM-2 which makes optimum use of the full bandwidth of the hypercube. When N ≫ P, where P is the number of floating-point processors, the communication time of the algorithm is negligible, and the execution time is that of the arithmetic time giving a P-fold speed-up for real problems. To obtain this performance, we use “rotated and translated Gray codes” which result in time-wise edge disjoint Hamiltonian paths on the hypercube. We further propose that this communication pattern has unexplored potential for other types of algorithms. Timings are presented for a collection of interacting point vortices in two dimensions. | ||

The Toeplitz-circulant eigenvalue problem Ax= λCx G Strang, A Edelmank. (1987). Oakland Conference on PDE’s, Longmans, London |