We present geometric tools for designing algorithms, illustrating that a
geometric perspective can be an insightful one in algorithm discovery.
(i) Outlier Removal: A set of points in n-dimensional space (data with
multiple attributes) may contain "outliers". These could be due to errors
in collecting the data (experimental error) or could correspond to an
interesting pattern. In either case, it would be useful to find (and
separate) the outliers.
What precisely is an outlier? We present a robust notion of outliers and
a polynomial-time algorithm to remove a small fraction of any set of points
(or, more generally, of any distribution in n-dimensional space) so that the
remaining set has no outliers. As an application, we give a polynomial-time
guarantee for the classical perceptron algorithm (for learning a half-space
in n-dimensional space).
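The perceptron algorithm mentioned above is classical and can be sketched in a few lines. The following is a minimal illustration (not the outlier-removal procedure itself, which the text only summarizes): it learns a half-space through the origin by repeatedly correcting misclassified points, and it terminates on linearly separable data with a number of updates bounded by the inverse square of the margin.

```python
import numpy as np

def perceptron(points, labels, max_iters=10000):
    """Classical perceptron: find w with sign(w . x) == label for all points.

    points: (m, n) array of m points in n dimensions.
    labels: array of +1 / -1 labels.
    Converges when the data are separable by a half-space through the
    origin with positive margin.
    """
    _, n = points.shape
    w = np.zeros(n)
    for _ in range(max_iters):
        updated = False
        for x, y in zip(points, labels):
            if y * np.dot(w, x) <= 0:   # misclassified (or on the boundary)
                w += y * x              # standard additive correction
                updated = True
        if not updated:
            return w                    # every point is correctly classified
    return w
```

On data with a tiny margin the update bound blows up, which is exactly why removing outliers (points that force a small margin) yields the polynomial-time guarantee discussed above.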
(ii) Random Projection: Our second tool is a simple method for reducing the
dimensionality of a set of points. Choose a low-dimensional subspace
uniformly at random (a plane through the origin), and project the set of
points onto this subspace.
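A minimal sketch of this projection step, under the standard construction: orthonormalizing a random Gaussian matrix gives a basis of a uniformly random k-dimensional subspace, and rescaling by sqrt(n/k) approximately preserves Euclidean lengths (the Johnson-Lindenstrauss phenomenon).

```python
import numpy as np

def random_projection(points, k, rng=None):
    """Project (m, n) points onto a uniformly random k-dimensional subspace.

    The QR factorization of an n-by-k Gaussian matrix yields an
    orthonormal basis of a random subspace; the sqrt(n/k) rescaling
    makes pairwise distances approximately preserved.
    """
    rng = rng or np.random.default_rng(0)
    _, n = points.shape
    g = rng.standard_normal((n, k))
    q, _ = np.linalg.qr(g)              # orthonormal basis, shape (n, k)
    return points @ q * np.sqrt(n / k)  # coordinates in the random subspace
```

With k on the order of (log m)/eps^2, all pairwise distances among m points are preserved up to a factor of 1 +/- eps with high probability, which is what makes the reduced instance faithful to the original.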
Does this way of reducing dimensionality retain the relevant properties of
the point set? We show that in several different contexts, random projection
is the key to finding efficient algorithms. The first is an old problem from
learning theory: given a set of points labelled "red" or "blue", find a set
of k half-spaces such that the "red" points all lie in the intersection of
the half-spaces, and the "blue" points all lie outside it. Two other problems
we consider are text retrieval and minimizing graph bandwidth.
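To make the learning-theory problem concrete, here is a small illustration of the target concept only (not the learning algorithm, which the text does not spell out): membership in an intersection of k half-spaces, each given as a pair (w, b) with the constraint w . x >= b. The particular half-spaces below are a hypothetical example.

```python
import numpy as np

def in_intersection(x, halfspaces):
    """True iff x satisfies w . x >= b for every half-space (w, b).

    "Red" points are those inside the intersection; "blue" points
    violate at least one of the constraints.
    """
    return all(np.dot(w, x) >= b for w, b in halfspaces)

# Hypothetical example: k = 2 half-spaces whose intersection is the
# non-negative quadrant of the plane.
halfspaces = [(np.array([1.0, 0.0]), 0.0),
              (np.array([0.0, 1.0]), 0.0)]
```

The difficulty of the learning problem lies in recovering such a set of k half-spaces from labelled examples alone; random projection reduces the dimension in which that search must be carried out.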