Regression, especially Nonparametric Regression
07 Jan 2010 08:46
"Regression", in statistical jargon, is the problem of guessing the average level of some quantitative response variable from various predictor variables.
Linear regression is perhaps the single most common quantitative tool in economics, sociology, and many other fields; it's certainly the most common use of statistics. (Analysis of variance, arguably more common in psychology and biology, is a disguised form of regression.) While linear regression deserves a place in statistics, that place should be nowhere near as large and prominent as it currently is. There are very few situations where we actually have scientific support for linear models. Fortunately, very flexible nonlinear regression methods now exist, and from the user's point of view are just as easy as linear regression, and at least as insightful. (Regression trees and additive models, in particular, are just as interpretable.) At the very least, if you do have a particular functional form in mind for the regression, linear or otherwise, you should use a nonparametric regression to test the adequacy of that form.
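As a rough illustration of that last point, here is a minimal Python sketch that compares a straight-line fit with a Gaussian-kernel smoother on held-out data. It is an informal check, not a formal lack-of-fit test. Everything specific in it is an assumption made for illustration: the simulated sine-curve data, the bandwidth of 0.2, and the helper names fit_line, kernel_smooth and simulate. If the smoother predicts clearly better out of sample, the linear form is suspect.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kernel_smooth(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0, Gaussian kernel with bandwidth h."""
    ws = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


def mse(preds, ys):
    """Mean squared prediction error."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


rng = random.Random(0)


def simulate(n):
    """Hypothetical data: a sine curve plus Gaussian noise (not linear)."""
    xs = [rng.uniform(0, 3) for _ in range(n)]
    ys = [math.sin(2 * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


train_x, train_y = simulate(300)
test_x, test_y = simulate(300)

# Fit both models on the training half, score them on the held-out half.
a, b = fit_line(train_x, train_y)
linear_mse = mse([a + b * x for x in test_x], test_y)
smooth_mse = mse(
    [kernel_smooth(x, train_x, train_y, h=0.2) for x in test_x], test_y
)

print("held-out MSE, linear fit:     ", round(linear_mse, 3))
print("held-out MSE, kernel smoother:", round(smooth_mse, 3))
```

In practice the bandwidth would be picked by cross-validation rather than fixed by hand, and the gap between the two errors would be judged against its sampling variability.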
From a technical point of view, the main drawback of modern regression methods is that their extra flexibility comes at the price of less "efficiency": estimates converge more slowly, so you have less precision for the same amount of data. There are some situations where you'd prefer to have more precise estimates from a bad model than less precise estimates from a model which doesn't make systematic errors, but I don't think that's what most users of linear regression are choosing to do; they're just taught to type lm rather than gam. In this day and age, though, I don't understand why not.
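The efficiency cost can be seen in a small simulation. The sketch below assumes the truth really is linear, so that the linear model is correctly specified, and compares the Monte Carlo mean squared error of the two estimates of the regression function at x = 0.5. The bandwidth rule 0.3 * n^(-1/5), the noise level, the sample sizes, and the function names are all illustrative choices, not anything canonical.

```python
import math
import random


def linear_at(x0, xs, ys):
    """Least-squares line fitted to (xs, ys), evaluated at x0."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def kernel_at(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0, Gaussian kernel with bandwidth h."""
    ws = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


def monte_carlo_mse(n, reps, rng, x0=0.5):
    """Average squared error of both estimators at x0 over repeated samples."""
    truth = 1.0 + 2.0 * x0          # assumed true regression function: 1 + 2x
    h = 0.3 * n ** (-0.2)           # bandwidth shrinking like n^(-1/5)
    lin_err = 0.0
    ker_err = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
        lin_err += (linear_at(x0, xs, ys) - truth) ** 2
        ker_err += (kernel_at(x0, xs, ys, h) - truth) ** 2
    return lin_err / reps, ker_err / reps


rng = random.Random(1)
results = {n: monte_carlo_mse(n, reps=300, rng=rng) for n in (100, 400)}

for n, (lin, ker) in results.items():
    print(f"n={n}: linear MSE={lin:.4f}, kernel MSE={ker:.4f}")
```

When the linear model is right, it is the more precise of the two at every sample size. The case for the smoother rests on the situations where the linear model is wrong, because then the error of the linear fit stops shrinking as n grows.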
(Of course, for the statistician, a lot of the more flexible regression methods look more or less like linear regression in some disguised form, because fundamentally all it does is projection. So it's not crazy to make it a foundational topic for statisticians. We should not, however, give the rest of the world the impression that the hat matrix is the source of all knowledge.)
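To make the "disguised linear regression" remark concrete: both least squares and a kernel smoother produce fitted values of the form (matrix) times (responses). The sketch below builds both matrices for a single predictor in plain Python. The function names, the toy data, and the bandwidth are hypothetical choices for illustration. Only the least-squares matrix is a true projection, in that applying it twice changes nothing; the kernel smoother's matrix is linear in the responses but not idempotent in general.

```python
import math


def ols_hat_matrix(xs):
    """Hat matrix for simple linear regression with an intercept."""
    n = len(xs)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return [[1.0 / n + (xi - mx) * (xj - mx) / sxx for xj in xs] for xi in xs]


def kernel_smoother_matrix(xs, h):
    """Row i holds the normalized Gaussian-kernel weights used to predict at xs[i]."""
    rows = []
    for xi in xs:
        ws = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in xs]
        total = sum(ws)
        rows.append([w / total for w in ws])
    return rows


def apply_matrix(m, ys):
    """Fitted values: matrix times the response vector."""
    return [sum(w * y for w, y in zip(row, ys)) for row in m]


def trace(m):
    """Sum of the diagonal, often read as 'effective degrees of freedom'."""
    return sum(m[i][i] for i in range(len(m)))


xs = [i / 10 for i in range(11)]
ys = [math.sin(3 * x) for x in xs]

H = ols_hat_matrix(xs)
S = kernel_smoother_matrix(xs, h=0.15)

fitted_linear = apply_matrix(H, ys)
fitted_smooth = apply_matrix(S, ys)

print("trace of hat matrix (intercept + slope):", round(trace(H), 6))
print("trace of smoother matrix:", round(trace(S), 3))
```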
The use of regression, linear or otherwise, for causal inference, rather than prediction, is a different, and far more sordid, story.
See also: Computational Statistics; Data Mining; Learning Theory; Model Selection; Neural Nets; Social Science Methodology; What Is the Right Null Model for Linear Regression?
- Recommended, more general:
- Richard A. Berk
  - Regression Analysis: A Constructive Critique [Mini-review]
  - Statistical Learning from a Regression Perspective
- Julian J. Faraway, Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models
- Andrew Gelman and Iain Pardoe, "Average predictive comparisons for models with nonlinearity, interactions, and variance components", Sociological Methodology forthcoming (2007) [PDF preprint, Gelman's comments]
- Jeffrey D. Hart, Nonparametric Smoothing and Lack-of-Fit Tests [Mini-review]
- Trevor Hastie and Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction [This is a corner-stone book, but is about much, much more than just regression.]
- Jeffrey S. Racine, "Nonparametric Econometrics: A Primer", Foundations and Trends in Econometrics 3 (2008): 1--88 [Good primer of nonparametric techniques for regression, density estimation and hypothesis testing; next to no economic content (except for examples). PDF reprint]
- Grace Wahba, Spline Models for Observational Data
- Larry Wasserman
  - All of Statistics
  - All of Nonparametric Statistics
  - Notes for 36-707, Regression Analysis
- Weisberg, Applied Linear Regression
- Recommended, more specialized:
- Norman H. Anderson and James Shanteau, "Weak inference with linear models", Psychological Bulletin 84 (1977): 1155--1170 [A demonstration of why you should not rely on R^2 to back up your claims]
- Raymond J. Carroll, Aurore Delaigle, and Peter Hall, "Nonparametric Prediction in Measurement Error Models", Journal of the American Statistical Association 104 (2009): 993--1003
- Kevin A. Clarke, "The Phantom Menace: Omitted Variables Bias in Econometric Research" [PDF. Or: Kitchen-sink regressions considered harmful. Including extra variables in your linear regression may or may not reduce the bias in your estimate of any particular coefficients of interest, depending on the correlations between the added variables, the predictors of interest, the response, and omitted relevant variables. Adding more variables always increases the variance of your estimates.]
- Berthold R. Haag, "Non-parametric Regression Tests Using Dimension Reduction Techniques", Scandinavian Journal of Statistics 35 (2008): 719--738
- Jon Lafferty and Larry Wasserman, "Rodeo: Sparse Nonparametric Regression in High Dimensions", math.ST/0506342 ["We present a method for simultaneously performing bandwidth selection and variable selection in nonparametric regression."]
- Lukas Meier, Sara van de Geer and Peter Bühlmann, "High-Dimensional Additive Modeling", arxiv:0806.4115 = Annals of Statistics 37 (2009): 3779--3821
- Pradeep Ravikumar, John Lafferty, Han Liu, Larry Wasserman, "Sparse Additive Models", arxiv:0711.4555
- Sara van de Geer, Empirical Process Theory in M-Estimation
- Modesty forbids me to recommend:
- My Lecture notes on data mining
- To read:
- Sylvain Arlot and Pascal Massart, "Data-driven Calibration of Penalties for Least-Squares Regression", Journal of Machine Learning Research 10 (2009): 245--279
- Gilles Blanchard, Nicole Kraemer, "Kernel Conjugate Gradient is Universally Consistent", arxiv:0902.4380 ["approximate solutions are constructed by projections onto a nested set of data-dependent subspaces"]
- Borowiak, Model Discrimination for Nonlinear Regression Models
- Adrian W. Bowman and Adelchi Azzalini, Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations
- Thomas Brambor, William Roberts Clark and Matt Golder, "Understanding Interaction Models: Improving Empirical Analyses", Political Analysis 14 (2006): 63--82
- Lawrence D. Brown and Mark G. Low, "Asymptotic Equivalence of Nonparametric Regression and White Noise", Annals of Statistics 24 (1996): 2384--2398 [JSTOR]
- T. Tony Cai, Harrison H. Zhou, "Asymptotic equivalence and adaptive estimation for robust nonparametric regression", Annals of Statistics 37 (2009): 3204--3235 = arxiv:0909.0343
- Arnak Dalalyan and Alexandre B. Tsybakov, "Sparse Regression Learning by Aggregation and Langevin Monte-Carlo", arxiv:0903.1223
- Sam Efromovich, Nonparametric Curve Estimation
- Andrew Gelman and Jennifer Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models
- Christopher R. Genovese and Larry Wasserman
  - "Confidence sets for nonparametric wavelet regression", math.ST/0505632 = Annals of Statistics 33 (2005): 698--729
  - "Adaptive Confidence Bands", math.ST/0701513
- Jose M. Gonzalez-Barrios and Silvia Ruiz-Velasco, "Regression analysis and dependence", Metrika 61 (2005): 73--87
- Emmanuel Guerre and Pascal Lavergne, "Data-driven rate-optimal specification testing in regression models", math.ST/0505640 = Annals of Statistics 33 (2005): 840--870
- Laszlo Gyorfi et al., A Distribution-Free Theory of Nonparametric Regression
- Peter Hall, "On Bootstrap Confidence Intervals in Nonparametric Regression", Annals of Statistics 20 (1992): 695--711
- Bruce E. Hansen
  - "Uniform Convergence Rates for Kernel Estimation with Dependent Data", Econometric Theory 24 (2008): 726--748 [abstract with link to free PDF]
  - Econometrics [Textbook draft. Free online.]
- Wolfgang Härdle, Applied Nonparametric Regression [blurb; online]
- Wolfgang Härdle, Marlene Müller, Stefan Sperlich and Axel Werwatz, Nonparametric and Semiparametric Models: An Introduction [Full text online]
- Salvatore Ingrassia, Simona C. Minotti, Giorgio Vittadini, "Local statistical modeling by cluster-weighted" [sic], arxiv:0911.2634 [Revisiting Gershenfeld et al.'s "cluster-weighted modeling" from a more properly statistical perspective]
- Sameer M. Jalnapurkar, "Learning a regression function via Tikhonov regularization", math.ST/0509420
- Bo Kai, Runze Li and Hui Zou, "Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression", Journal of the Royal Statistical Society B 72 (2010): 49--69
- Estate V. Khmaladze, Hira L. Koul, "Goodness-of-fit problem for errors in nonparametric regression: Distribution free approach", Annals of Statistics 37 (2009): 3165--3185 = arxiv:0909.0170
- Michael R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference [partial PDF preprint]
- Nicole Kraemer, Anne-Laure Boulesteix, Gerhard Tutz, "Penalized Partial Least Squares Based on B-Splines Transformations", math.ST/0608576
- Qi Li and Jeffrey Scott Racine, Nonparametric Econometrics: Theory and Practice
- Oliver Linton and Zhijie Xiao, "A Nonparametric Regression Estimator That Adapts To Error Distribution of Unknown Form", Econometric Theory 23 (2007): 371--413
- Abdelkader Mokkadem, Mariane Pelletier, Yousri Slaoui, "Revisiting Révész's stochastic approximation method for the estimation of a regression function", arxiv:0812.3973
- Philippe Rigollet, "Maximum likelihood aggregation and misspecified generalized linear models", arxiv:0911.2919
- Cynthia Rudin, "Stability Analysis for Regularized Least Squares Regression", cs.LG/0502016
- George A. F. Seber and C. J. Wild, Nonlinear Regression
- David Shilane, Richard H. Liang and Sandrine Dudoit, "Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation", UC Berkeley Biostatistics Working Paper 227 [Abstract, PDF]
- Jeffrey S. Simonoff, Smoothing Methods in Statistics
- Aris Spanos, "Revisiting the Omitted Variables Argument: Substantive vs. Statistical Adequacy" [PDF preprint]
- Liangjun Su and Aman Ullah, "Local polynomial estimation of nonparametric simultaneous equations models", Journal of Econometrics 144 (2008): 193--218
- Gerhard Tutz and Jan Ulbricht, "Penalized regression with correlation-based penalty", Statistics and Computing 19 (2008): 239--253
- Daniela M. Witten and Robert Tibshirani, "Covariance-regularized regression and classification for high dimensional problems", Journal of the Royal Statistical Society B 71 (2009): 615--636
- Peng Zhao and Bin Yu, "On Model Selection Consistency of Lasso", Journal of Machine Learning Research 7 (2006): 2541--2563
