Optimal Outlier RemovalJohn DunaganMIT
May 9,

ABSTRACT


Given a set of points in $n$dimensional Euclidean space, we define a point $x$ to be a $\gamma$outlier if there exists some direction $w$ in which it is more than $\gamma$ standard deviations from the mean along $w$. Outlier removal is the problem of removing an $\epsilon$ fraction of points so that the remaining subset has no $\gamma$outliers. We characterize the optimal tradeoff between $\epsilon$ and $\gamma$ using a simple algorithm for outlier removal. The algorithm suggests a natural linear ordering of the point set with interesting properties. It is our hope that this work will be of interest to experimental scientists as well as the mathematics community. For a demonstration, please visit http://theory.lcs.mit.edu/~jdunagan This is joint work with Santosh Vempala. 
Combinatorics Seminar, Mathematics Department, MIT, sara(atsign)math.mit.edu 

