<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Notebooks</title>
    <link>http://bactra.org/notebooks</link>
    <description>Cosma's Notebooks</description>
    <language>en</language>

  <item>
    <title>Bootstrapping, and Other Resampling Methods</title>
    <link>http://bactra.org/notebooks/2012/04/18#bootstrap</link>
    <description>
&lt;P&gt;Bootstrapping is a way of figuring out the properties of statistical
estimators (and other procedures, like hypothesis tests) by simulation.  What
we would really like to know is how different our answers could have been, if
we re-ran our experiment.  We can't actually do this, but we can fit a model to
our data and simulate from it, and see what answer we'd get from the
simulations.  We can even do this from exceedingly general non-parametric
estimates, like re-sampling the original data.  This is a brilliant idea, and
my default way of handling the uncertainty of estimation in complex models or
with complex systems.  But &lt;a href=&quot;../weblog/652.html&quot;&gt;having just written
3500 words on this for a magazine&lt;/a&gt;, I feel absolutely no inclination to
explain myself further.
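
&lt;P&gt;A minimal sketch of the re-sampling version described above, in Python.
The assumptions here are illustrative and not part of the original note: the
observations are treated as independent and identically distributed, the
estimator is the sample median, and the names &lt;code&gt;bootstrap_se&lt;/code&gt; and
&lt;code&gt;n_boot&lt;/code&gt; are hypothetical.  The sketch re-samples the data with
replacement, re-computes the estimate on each re-sample, and reports the
standard deviation of those re-computed estimates as a standard error.

&lt;pre&gt;
```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=2000, seed=0):
    """Nonparametric bootstrap standard error of estimator(data).

    Assumes the observations are independent and identically distributed.
    """
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations from the original data, with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    # The spread of the re-computed estimates stands in for the spread we
    # would see if we could re-run the experiment.
    return statistics.stdev(replicates)


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated stand-in for real data (an assumption of this sketch).
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print("median:", statistics.median(sample))
    print("bootstrap s.e.:", bootstrap_se(sample, statistics.median))
```
&lt;/pre&gt;

&lt;P&gt;Swapping &lt;code&gt;statistics.median&lt;/code&gt; for any other function of the
data gives the corresponding standard error; a model-based bootstrap would
replace the re-sampling line with a draw from the fitted model.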

&lt;P&gt;I am most interested in resampling techniques for dependent data, and would be
ecstatic if I could figure out a non-parametric bootstrap for
&lt;a href=&quot;network-data-analysis.html&quot;&gt;networks&lt;/a&gt;.  &amp;mdash; Presumably
&lt;a href=&quot;universal-prediction.html&quot;&gt;universal prediction algorithms&lt;/a&gt;
could be used for this purpose?
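
&lt;P&gt;To make the dependent-data issue concrete, here is a hedged sketch of one
standard technique for time series, the moving-block bootstrap, which re-samples
contiguous blocks so that short-range dependence survives inside each block.
Everything specific in it is an assumption chosen for illustration: the
first-order autoregressive test series, the block length of 25, and the
hypothetical names &lt;code&gt;simulate_ar1&lt;/code&gt; and
&lt;code&gt;block_bootstrap_se&lt;/code&gt;.  It does nothing for networks.

&lt;pre&gt;
```python
import random
import statistics


def simulate_ar1(n, phi, seed=0):
    """Simulate n steps of x[t] = phi * x[t-1] + standard Gaussian noise."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def block_bootstrap_se(series, block_len, n_boot=1000, seed=0):
    """Moving-block bootstrap standard error of the sample mean.

    With block_len=1 this is ordinary re-sampling of single points,
    which treats the series as if it were independent.
    """
    rng = random.Random(seed)
    n = len(series)
    n_starts = n - block_len + 1      # admissible block start positions
    n_blocks = -(-n // block_len)     # ceiling of n / block_len
    means = []
    for _ in range(n_boot):
        resample = []
        for _ in range(n_blocks):
            start = rng.randrange(n_starts)
            resample.extend(series[start:start + block_len])
        resample = resample[:n]       # trim back to the original length
        means.append(sum(resample) / n)
    return statistics.stdev(means)


if __name__ == "__main__":
    series = simulate_ar1(n=1000, phi=0.8, seed=1)
    print("s.e. of mean, blocks of 1: ", block_bootstrap_se(series, 1))
    print("s.e. of mean, blocks of 25:", block_bootstrap_se(series, 25))
```
&lt;/pre&gt;

&lt;P&gt;On a positively autocorrelated series like this one, the single-point
version should report a noticeably smaller standard error than the block
version, because it throws away the dependence.  The block length is a tuning
choice, and 25 is arbitrary here.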

&lt;P&gt;See also:
	&lt;a href=&quot;cross-validation.html&quot;&gt;Cross-Validation&lt;/a&gt;;
	&lt;a href=&quot;statistics.html&quot;&gt;Statistics&lt;/a&gt;


&lt;ul&gt;Recommended, big picture:
	&lt;li&gt;A. C. Davison and D. V. Hinkley, &lt;cite&gt;Bootstrap Methods and
their Applications&lt;/cite&gt;
	&lt;li&gt;Bradley Efron
		&lt;ul&gt;
		&lt;li&gt;&quot;Bootstrap Methods: Another Look at the Jackknife&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.aos/1176344552&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;7&lt;/strong&gt; (1979): 1--26&lt;/a&gt; [The original paper;
staggeringly understandable]
		&lt;li&gt;&lt;cite&gt;The Bootstrap, the Jackknife, and Other
Resampling Plans&lt;/cite&gt; [1982 notes volume]
		&lt;/ul&gt;
	&lt;/ul&gt;

&lt;ul&gt;Recommended, close-ups:
	&lt;li&gt;Peter B&amp;uuml;hlmann, &quot;Sieve Bootstrap with Variable Length Markov Chains for Stationary Categorical Time Series&quot;, &lt;cite&gt;Journal of the American Statistical Association&lt;/cite&gt; &lt;strong&gt;97&lt;/strong&gt; (2002): 443--456
[&lt;a href=&quot;http://e-collection.ethbib.ethz.ch/view/eth:24072&quot;&gt;PDF preprint&lt;/a&gt;]
	&lt;li&gt;Paul Doukhan, Silika Prohl, and Christian Y. Robert, &quot;Subsampling
weakly dependent time series and application to extremes&quot;, &lt;a href=&quot;http://arxiv.org/abs/1009.0805&quot;&gt;arxiv:1009.0805&lt;/a&gt; [Thanks to Dr. Prohl
for a pre-pre-print]
	&lt;li&gt;A. C. Field and A. H. Welsh, &quot;Bootstrapping clustered data&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9868.2007.00593.x&quot;&gt;&lt;cite&gt;Journal of
the Royal Statistical Society&lt;/cite&gt; B &lt;strong&gt;69&lt;/strong&gt; (2007): 369--390&lt;/a&gt;
	&lt;li&gt;Silvia Goncalves and Halbert White, &quot;Maximum likelihood and the
bootstrap for nonlinear dynamic models&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1016/S0304-4076(03)00204-5&quot;&gt;&lt;cite&gt;Journal of
Econometrics&lt;/cite&gt; &lt;strong&gt;119&lt;/strong&gt; (2004): 199--219&lt;/a&gt;
	&lt;li&gt;Hans R. K&amp;uuml;nsch, &quot;The Jackknife and the Bootstrap for General
Stationary
Observations&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1176347265&quot;&gt;&lt;cite&gt;Annals
of Statistics&lt;/cite&gt; &lt;strong&gt;17&lt;/strong&gt; (1989): 1217--1241&lt;/a&gt;
	&lt;li&gt;S. N. Lahiri, &lt;cite&gt;Resampling Methods for Dependent Data&lt;/cite&gt;
[&lt;a href=&quot;../weblog/algae-2010-05.html#lahiri&quot;&gt;Mini-review&lt;/a&gt;]
	&lt;li&gt;Elizaveta Levina and Peter J. Bickel, &quot;Texture synthesis and
nonparametric resampling of random
fields&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1162567632&quot;&gt;&lt;cite&gt;Annals
of Statistics&lt;/cite&gt;
&lt;strong&gt;34&lt;/strong&gt; (2006): 1751--1773&lt;/a&gt;
	&lt;li&gt;Peter McCullagh, &quot;Resampling and exchangeable arrays&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.bj/1081788029&quot;&gt;&lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;6&lt;/strong&gt; (2000): 285--301&lt;/a&gt;
[Well, half-recommended.  Everything he says here is right, but I think one
could construct an exactly parallel argument to show that resampling could not
get correct standard errors for the mean of a stationary sequence; and
of course it can't if you insist on resampling dependent data as though
it were independent.  See the papers by Owen and by Owen and Eckles.]
	&lt;li&gt;Art B. Owen, &quot;The pigeonhole bootstrap&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aoas/1196438024&quot;&gt;&lt;cite&gt;Annals
of Applied Statistics&lt;/cite&gt; &lt;strong&gt;1&lt;/strong&gt; (2007): 386--411&lt;/a&gt;
	&lt;li&gt;Art B. Owen and Dean G. Eckles, &quot;Bootstrapping data arrays of arbitrary order&quot;, &lt;a href=&quot;http://arxiv.org/abs/1106.2125&quot;&gt;arxiv:1106.2125&lt;/a&gt;
	&lt;/ul&gt;

&lt;ul&gt;Modesty forbids me to recommend:
	&lt;li&gt;CRS, &quot;The
Bootstrap&quot;, &lt;a href=&quot;http://www.americanscientist.org/issues/pub/2010/3/the-bootstrap/1&quot;&gt;&lt;cite&gt;American
Scientist&lt;/cite&gt; &lt;strong&gt;98&lt;/strong&gt; (2010): 186--190&lt;/a&gt;
[&lt;a href=&quot;../weblog/652.html&quot;&gt;Self-commentary&lt;/a&gt;]
	&lt;/ul&gt;

&lt;ul&gt;To read:
	&lt;li&gt;Sylvain Arlot, Gilles Blanchard, and Etienne Roquain, &quot;Some nonasymptotic results on resampling in high dimension, I: Confidence regions&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.aos/1262271609&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt;
(2010): 51--82&lt;/a&gt;
	&lt;li&gt;Peter B&amp;uuml;hlmann, &quot;Bootstraps for Time Series&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.ss/1023798998&quot;&gt;&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;17&lt;/strong&gt; (2002): 52--72&lt;/a&gt;
	&lt;li&gt;Snigdhansu Chatterjee and Arup Bose, &quot;Generalized bootstrap for
estimating equations&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0504515&quot;&gt;math.ST/0504515&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10.1214/009053604000000904&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 414--436&lt;/a&gt;
	&lt;li&gt;Guang Cheng and Jianhua Z. Huang, &quot;Bootstrap consistency for general semiparametric M-estimation&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1279638543&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt;
&lt;strong&gt;38&lt;/strong&gt; (2010): 2884--2915&lt;/a&gt;
	&lt;li&gt;Michael R. Chernick, &lt;cite&gt;Bootstrap Methods: A Practitioner's
Guide&lt;/cite&gt;
	&lt;li&gt;Andreas Christmann, Matias Salibian-Barrera, Stefan Van Aelst, &quot;On the stability of bootstrap estimators&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.1876&quot;&gt;arxiv:1111.1876&lt;/a&gt;
	&lt;li&gt;Herold Dehling, Martin Wendler, &quot;Central Limit Theorem and the Bootstrap for U-Statistics of Strongly Mixing Data&quot;, &lt;a href=&quot;http://arxiv.org/abs/0811.1888&quot;&gt;arxiv:0811.1888&lt;/a&gt;
	&lt;li&gt;Mathias Drton and Benjamin Williams, &quot;Quantifying the failure of bootstrap likelihood ratio tests&quot;, &lt;a href=&quot;http://dx.doi.org/10.1093/biomet/asr033&quot;&gt;&lt;cite&gt;Biometrika&lt;/cite&gt;
&lt;strong&gt;98&lt;/strong&gt; (2011): 919--934&lt;/a&gt;
	&lt;li&gt;Bradley Efron and Robert J. Tibshirani, &lt;cite&gt;An Introduction to the Bootstrap&lt;/cite&gt;
	&lt;li&gt;Yanqin Fan, Qi Li and Insik Min, &quot;A Nonparametric Bootstrap Test of
Conditional Distributions&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1017/S0266466606060294&quot;&gt;&lt;cite&gt;Econometric
Theory&lt;/cite&gt; &lt;strong&gt;22&lt;/strong&gt; (2006): 587--613&lt;/a&gt;
	&lt;li&gt;Cheng-Der Fuh and Inchi Hu, &quot;Estimation in hidden Markov models via efficient importance sampling&quot;, &lt;cite&gt;Bernoulli&lt;/cite&gt;
&lt;strong&gt;13&lt;/strong&gt; (2007): 492--513, &lt;a href=&quot;http://arxiv.org/abs/0708.4152&quot;&gt;arxiv:0708.4152&lt;/a&gt;
	&lt;li&gt;Jurgen Franke, Jens-Peter Kreiss and Enno Mammen,
&quot;Bootstrap of Kernel Smoothing in Nonlinear Time Series&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.bj/1078951087&quot;&gt;&lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;8&lt;/strong&gt; (2002): 1--37&lt;/a&gt;
	&lt;li&gt;Axel Gandy, Patrick Rubin-Delanchy, &quot;An algorithm to compute the power of Monte Carlo tests with guaranteed precision&quot;, &lt;a href=&quot;http://arxiv.org/abs/1110.1248&quot;&gt;arxiv:1110.1248&lt;/a&gt;
	&lt;li&gt;Philip Good
		&lt;ul&gt;
		&lt;li&gt;&lt;cite&gt;Permutation, Parametric, and Bootstrap Tests of
Hypotheses&lt;/cite&gt;
		&lt;li&gt;&lt;cite&gt;Resampling Methods: A Practical Guide to
Data Analysis&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;li&gt;Peter G. Hall, &lt;cite&gt;The Bootstrap and Edgeworth Expansion&lt;/cite&gt;
	&lt;li&gt;Peter Hall and Hugh Miller, &quot;Bootstrap confidence intervals and hypothesis tests for extrema of parameters&quot;, &lt;a href=&quot;http://dx.doi.org/10.1093/biomet/asq045&quot;&gt;&lt;cite&gt;Biometrika&lt;/cite&gt; &lt;strong&gt;97&lt;/strong&gt; (2010): 881--892&lt;/a&gt; [E.g., looking at the largest (or smallest) of a bunch of regression coefficients]
	&lt;li&gt;Eunju Hwang, Dong Wan Shin, &quot;Stationary bootstrapping for
non-parametric estimator of nonlinear autoregressive
model&quot;, &lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9892.2010.00699.x&quot;&gt;&lt;cite&gt;Journal
of Time Series Analysis&lt;/cite&gt; forthcoming (2011)&lt;/a&gt;
	&lt;li&gt;Arnold Janssen and Thorsten Pauls, &quot;How Do Bootstrap and
Permutation Tests Work?&quot;, &lt;cite&gt;The Annals of Statistics&lt;/cite&gt; &lt;strong&gt;31&lt;/strong&gt; (2003): 768--806
	&lt;li&gt;Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar, Michael I. Jordan, &quot;A Scalable Bootstrap for Massive Data&quot;, &lt;a href=&quot;http://arxiv.org/abs/1112.5016&quot;&gt;arxiv:1112.5016&lt;/a&gt;
	&lt;li&gt;Jens-Peter Kreiss, Efstathios Paparoditis, Dimitris N. Politis, &quot;On the range of validity of the autoregressive sieve bootstrap&quot;, &lt;cite&gt;Annals
of Statistics&lt;/cite&gt; &lt;strong&gt;39&lt;/strong&gt; (2011): 2103--2130, &lt;a href=&quot;http://arxiv.org/abs/1201.6211&quot;&gt;arxiv:1201.6211&lt;/a&gt;
	&lt;li&gt;S. N. Lahiri, &quot;Edgeworth expansions for studentized statistics under weak dependence&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1262271619&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt; (2010): 388--434&lt;/a&gt;
	&lt;li&gt;Stephen M. S. Lee and P. Y. Lai, &quot;Improving coverage accuracy
of block bootstrap confidence intervals&quot;, &lt;a href=&quot;http://arxiv.org/abs/0804.4361&quot;&gt;arxiv:0804.4361&lt;/a&gt;
	&lt;li&gt;Anne Leucht, &quot;Degenerate U- and V-statistics under weak dependence: Asymptotic theory and bootstrap consistency&quot;, &lt;a href=&quot;http://dx.doi.org/10.3150/11-BEJ354&quot;&gt;&lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;18&lt;/strong&gt; (2012): 552--585&lt;/a&gt;
	&lt;li&gt;W. S. Lok and Stephen M. S. Lee, &quot;Robustness Diagnosis for Bootstrap Inference&quot;, &lt;a href=&quot;http://dx.doi.org/&quot;&gt;&lt;cite&gt;Journal of Computational and Graphical Statistics&lt;/cite&gt; &lt;strong&gt;20&lt;/strong&gt; (2011): 448--460&lt;/a&gt;
	&lt;li&gt;Michael H. Neumann, Efstathios Paparoditis, &quot;Goodness-of-fit tests for Markovian time series models: Central limit theory and bootstrap approximations&quot;, &lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;14&lt;/strong&gt; (2008): 14--46,
&lt;a href=&quot;http://arxiv.org/abs/0803.0835&quot;&gt;arxiv:0803.0835&lt;/a&gt;
	&lt;li&gt;Daniel J. Nordman, &quot;A note on the stationary bootstrap's
variance&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2009):
359--370, &lt;a href=&quot;http://arxiv.org/abs/0903.0474&quot;&gt;arxiv:0903.0474&lt;/a&gt;
	&lt;li&gt;Dimitris N. Politis, &quot;The Impact of Bootstrap Methods on
Time Series Analysis&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1063994977&quot;&gt;&lt;cite&gt;Statistical Science&lt;/cite&gt;
&lt;strong&gt;18&lt;/strong&gt; (2003): 219--230&lt;/a&gt;
	&lt;li&gt;Dimitris N. Politis, Joseph P. Romano and Michael Wolf, &lt;cite&gt;Subsampling&lt;/cite&gt;
	&lt;li&gt;Zacharias Psaradakis
		&lt;ul&gt;
		&lt;li&gt;&quot;A sieve bootstrap test for stationarity&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1016/S0167-7152(03)00012-9&quot;&gt;&lt;cite&gt;Statistics and
Probability Letters&lt;/cite&gt; &lt;strong&gt;62&lt;/strong&gt; (2003): 263--274&lt;/a&gt;
		&lt;li&gt;&quot;Blockwise bootstrap testing for stationarity&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1016/j.spl.2005.09.001&quot;&gt;&lt;cite&gt;Statistics and
Probability Letters&lt;/cite&gt; &lt;strong&gt;76&lt;/strong&gt; (2006): 562--570&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Joseph P. Romano, Azeem M. Shaikh, &quot;On the Uniform Asymptotic Validity of Subsampling and the Bootstrap&quot;, &lt;a href=&quot;http://arxiv.org/abs/1204.2762&quot;&gt;arxiv:1204.2762&lt;/a&gt;
	&lt;li&gt;Matias Salibian-Barrera, Stefan van Aelst and Gert Willems,
&quot;Fast and robust bootstrap&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s10260-007-0048-6&quot;&gt;&lt;cite&gt;Statistical Methods and Applications&lt;/cite&gt; &lt;strong&gt;17&lt;/strong&gt;
(2009): 41--71&lt;/a&gt;
	&lt;li&gt;Jun Shao and Dongsheng Tu, &lt;cite&gt;The Jackknife and the Bootstrap&lt;/cite&gt;
	&lt;li&gt;Xiaofeng Shao, &quot;The Dependent Wild
Bootstrap&quot;, &lt;a href=&quot;http://dx.doi.org/10.1198/jasa.2009.tm08744&quot;&gt;&lt;cite&gt;Journal
of the American Statistical Association&lt;/cite&gt; &lt;strong&gt;105&lt;/strong&gt; (2010):
218--235&lt;/a&gt;
	&lt;li&gt;Olimjon Sh. Sharipov and Martin Wendler, &quot;Bootstrap
for the Sample Mean and for U-Statistics of Stationary Processes&quot;,
&lt;a href=&quot;http://arxiv.org/abs/0911.3083&quot;&gt;arxiv:0911.3083&lt;/a&gt;
	&lt;li&gt;Jan Sprenger, &quot;Science without (parametric) models: the case of bootstrap resampling&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s11229-009-9567-z&quot;&gt;&lt;cite&gt;Synthese&lt;/cite&gt; &lt;strong&gt;180&lt;/strong&gt; (2011): 65--76&lt;/a&gt;
	&lt;li&gt;Jos&amp;eacute; Trashorras, Olivier Wintenberger, &quot;Large deviations for bootstrapped empirical measures&quot;, &lt;a href=&quot;http://arxiv.org/abs/1110.4620&quot;&gt;arxiv:1110.4620&lt;/a&gt;
	&lt;li&gt;Lionel Truquet, &quot;On a nonparametric resampling scheme for Markov random fields&quot;, &lt;a href=&quot;http://dx.doi.org/10.1214/11-EJS644&quot;&gt;&lt;cite&gt;Electronic Journal of Statistics&lt;/cite&gt; &lt;strong&gt;5&lt;/strong&gt; (2011): 1503--1536&lt;/a&gt;
	&lt;li&gt;Herwig Wendt, Patrice Abry and Stephane Jaffard, &quot;Bootstrap for
Empirical Multifractal Analysis&quot;, &lt;cite&gt;IEEE Signal Processing Magazine&lt;/cite&gt;
July 2007, pp. 38--48 [+ technical papers by these authors]
	&lt;li&gt;Abdelhak M. Zoubir and D. Robert Iskander, &lt;cite&gt;Bootstrap
Techniques for Signal Processing&lt;/cite&gt;
	&lt;/ul&gt;
</description>
  </item>
  </channel>
</rss>
