2015 QPRC

An Experimental Design Algorithm for Large-Scale Statistical Computation

Peter Qian
Department of Statistics
University of Wisconsin-Madison, Madison, WI
peterq@stat.wisc.edu

We propose an experimental design algorithm that is highly efficient in large-scale statistical computation settings with observational data.  Our generalized OEM (GOLEM) algorithm, a modification of Orthogonalizing EM (OEM), enjoys fast convergence for ordinary least squares (OLS) problems and is also easily extended to regularized least squares problems including penalties such as the lasso, ridge, and SCAD. In our reformulation, we provide a different geometric interpretation of the OEM algorithm and a computationally efficient solver based on the new geometric perspective. An optimal convergence rate is also proved based on this perspective. In addition, we extend the OEM technique to more general settings such generalized linear models, resulting in an algorithm that is state-of-the-art for logistic regression and regularized logistic regression in settings where n >> p. In general, our alternate formulation of OEM for OLS has computational complexity O(np). We prove that GOLEM converges for convex losses and converges to a local solution when the loss function is nonconvex.  We demonstrate the state-of-the-art performance of GOLEM on several large-scale datasets, both real and synthetic.