Sufficient dimension folding theory and methods

by Yuan Xue

Institution: University of Georgia
Department: Statistics
Degree: PhD
Year: 2012
Keywords: Central Folding Subspace, Central Mean Folding Subspace, Central Quantile Folding Subspace, Folded Minimum Average Variance Estimation, Folded MAVE ensemble, Modi
Record ID: 1986185
Full text PDF: http://purl.galileo.usg.edu/uga_etd/xue_yuan_201212_phd


This dissertation develops the theory and methods of sufficient dimension folding for matrix- and array-valued objects. Traditionally, researchers have reduced the dimension of such data by first vectorizing it. Analysis based on the vectorized data, however, loses the structural information carried by the original objects, and preserving that structure is critical in many fields. Dimension folding captures the essential information in structured data, reducing its dimensions as much as possible while preserving its intrinsic structure. First, we consider sufficient dimension folding for the regression mean function when the predictors are matrix- or array-valued. We propose a new concept, the central mean folding subspace, together with two local methods for estimating it: folded outer product of gradients estimation (folded-OPG) and folded minimum average variance estimation (folded-MAVE). The asymptotic property of folded-MAVE is established. A modified BIC criterion is used to determine the dimensions of the central mean folding subspace. The finite-sample performance of the two local estimation methods is evaluated on simulated examples. The folded-MAVE method is then applied to a primary biliary cirrhosis data set. Second, we focus on sufficient dimension folding for robust regression with matrix- or array-valued objects. We introduce the central functional dimension folding subspace and a class of estimation methods based on robust estimators. Special attention is paid to the central quantile dimension folding subspace, a special case of the central functional folding subspace that is of wide interest. The performance of the proposed methods in estimating the central quantile dimension folding subspace is evaluated on simulated models. We also apply our method to the primary biliary cirrhosis data for quantile regression.
Third, we outline our future work. A class of dimension folding estimators based on an ensemble of folded-MAVE is introduced to characterize the central folding subspace (CFS). The ensemble estimators can exhaustively estimate the CFS without imposing restrictive conditions on the predictors. A cross-validation criterion is proposed to determine the dimensions of the CFS. Theoretical properties and numerical performance of the proposed method will be studied in the future.
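
As a rough illustration of the contrast between folding and vectorizing, the following minimal sketch (not taken from the dissertation) reduces a matrix predictor X from the left and the right, giving A^T X B, and checks the standard identity vec(A^T X B) = (B ⊗ A)^T vec(X). The helper names (fold, vec, kron, matvec) and the toy matrices are assumptions made for this example only. Folding is the special case of vectorized reduction in which the direction has Kronecker structure, so the row and column roles of X are kept.

```python
# Illustrative sketch only: helper names and toy data are hypothetical,
# not the estimators (folded-OPG, folded-MAVE) described in the abstract.

def transpose(P):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    """Plain matrix product P Q."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matvec(P, v):
    """Matrix-vector product P v, with v a flat list."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in P]

def vec(P):
    """Stack the columns of P into one flat list (column-major order)."""
    return [P[i][j] for j in range(len(P[0])) for i in range(len(P))]

def kron(P, Q):
    """Kronecker product of P and Q."""
    return [[P[i][j] * Q[k][l]
             for j in range(len(P[0])) for l in range(len(Q[0]))]
            for i in range(len(P)) for k in range(len(Q))]

def fold(X, A, B):
    """Folded predictor A^T X B: reduce rows with A and columns with B."""
    return matmul(matmul(transpose(A), X), B)

# Toy 3 x 2 matrix predictor with one left and one right direction.
X = [[1, 2], [3, 4], [5, 6]]
A = [[1], [0], [-1]]   # 3 x 1 left (row-side) direction
B = [[2], [1]]         # 2 x 1 right (column-side) direction

folded = fold(X, A, B)                              # 1 x 1 reduced predictor
vectorized = matvec(transpose(kron(B, A)), vec(X))  # same value via vec(X)

print(folded, vectorized)
```

In this toy case the folded form uses 3 + 2 = 5 direction entries, whereas an unstructured direction on vec(X) would need 3 × 2 = 6; the gap widens quickly as the matrix dimensions grow.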