Estimating Manifolds: Methods and Surrogates

Christopher R. Genovese, a professor of statistics at Carnegie Mellon University, will speak at the Applied and Interdisciplinary Mathematics Seminar on Monday, February 4 at 4:30 p.m. in 509 Lake Hall at Northeastern University.

ABSTRACT: Spatial data and high-dimensional data, such as collections of images, often contain high-density regions that concentrate around some lower-dimensional structure. In many cases, these structures are well modeled by smooth manifolds, or collections of such manifolds. For example, the distribution of matter in the universe at large scales forms a web of intersecting clusters (0-dimensional manifolds), filaments (1-dimensional manifolds), and walls (2-dimensional manifolds), and the shape and distribution of these structures have cosmological implications.

I will discuss new theory and methods for the problem of estimating manifolds (and collections of manifolds) from noisy data in the embedding space. The noise distribution has a dramatic effect on the performance (e.g., minimax rates) of estimators, an effect that is related to but distinct from what happens in measurement-error problems. Some variants of the problem are “hard” in the sense that no estimator can achieve a practically useful level of performance. I will show that in the “hard” case, it is still possible to construct accurate estimators for a suitable surrogate of the unknown manifold, one that captures many of the key features of the object. I will also describe efficient methods for estimating surrogates and characterizing “hyper-ridges” in many dimensions.
