<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Notebooks</title>
    <link>http://bactra.org/notebooks</link>
    <description>Cosma's Notebooks</description>
    <language>en</language>

  <item>
    <title>Statistics</title>
    <link>http://bactra.org/notebooks/2012/04/15#statistics</link>
    <description>


&lt;P&gt;An application of &lt;a href=&quot;probability.html&quot;&gt;probability&lt;/a&gt;, with intimate
ties to &lt;a href=&quot;learning-inference-induction.html&quot;&gt;machine learning,
non-demonstrative inference and induction&lt;/a&gt;.

&lt;P&gt;Since June 2005, I have been a (very, very junior)
&lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/&quot;&gt;professor of statistics&lt;/a&gt;.  This
made me interested in &lt;a href=&quot;teaching-statistics.html&quot;&gt;how to teach it&lt;/a&gt;.

&lt;P&gt;&lt;em&gt;See
also&lt;/em&gt;: &lt;a href=&quot;properties-vs-principles-for-statistics.html&quot;&gt;Properties
vs. principles in defining &quot;good statistics&quot;&lt;/a&gt;


&lt;dl&gt;&lt;em&gt;Things I need to learn more about&lt;/em&gt;:
	&lt;dt&gt;Dependent data&lt;/dt&gt;
	&lt;dd&gt;&lt;a href=&quot;time-series.html&quot;&gt;Statistical inference for stochastic
processes, a.k.a. time-series analysis&lt;/a&gt;.  &lt;a href=&quot;filtering.html&quot;&gt;Signal
processing and filtering&lt;/a&gt;.
&lt;a href=&quot;spatial-statistics.html&quot;&gt;Spatial statistics&lt;/a&gt;.&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;model-selection.html&quot;&gt;Model selection&lt;/a&gt;&lt;/dt&gt;
	&lt;dd&gt;Especially: adapting to unknown characteristics of the data, like
unknown noise distributions, or unknown smoothness of
the &lt;a href=&quot;regression.html&quot;&gt;regression function&lt;/a&gt;.&lt;/dd&gt;
	&lt;dt&gt;Model discrimination&lt;/dt&gt;
	&lt;dd&gt;That is, designing experiments so as to discriminate between
competing classes of model.  Adaptation to the data is an issue here, too.&lt;/dd&gt;
	&lt;dt&gt;Rates of convergence of estimators to true values&lt;/dt&gt;
	&lt;dd&gt;&lt;a href=&quot;empirical-process-theory.html&quot;&gt;Empirical process
theory&lt;/a&gt;.  (Cf. some questions in
&lt;a href=&quot;ergodic-theory.html&quot;&gt;ergodic theory&lt;/a&gt;).&lt;/dd&gt;
	&lt;dt&gt;Estimating distribution functions&lt;/dt&gt;
	&lt;dd&gt;And estimating &lt;a href=&quot;information-theory.html&quot;&gt;entropies&lt;/a&gt;, or
other functionals of distributions.&lt;/dd&gt;
	&lt;dt&gt;Non-parametric methods&lt;/dt&gt;
	&lt;dd&gt;Both those that are genuinely distribution-free, and
those that would more accurately be mega-parametric (even
infinitely-parametric) methods, such as &lt;a href=&quot;neural-nets.html&quot;&gt;neural
networks&lt;/a&gt;&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;regression.html&quot;&gt;Regression&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;bootstrap.html&quot;&gt;Bootstrapping and other resampling methods&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;A href=&quot;cross-validation.html&quot;&gt;Cross-validation&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;sufficient-statistics.html&quot;&gt;Sufficient statistics&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;exponential-families.html&quot;&gt;Exponential families&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;info-geo.html&quot;&gt;Information Geometry&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;partial-identification.htm&quot;&gt;Partial identification
of parametric statistical models&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;causality.html&quot;&gt;Causal Inference&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;Decision theory&lt;/dt&gt;
	&lt;dd&gt;&lt;a href=&quot;decision-theory.html&quot;&gt;Conventional&lt;/a&gt;, and the sorts with
some connection to &lt;a href=&quot;judgment.html&quot;&gt;how real decisions are
made&lt;/a&gt;.&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;graphical-models.html&quot;&gt;Graphical models&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;monte-carlo.html&quot;&gt;Monte Carlo and other simulation
methods&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&quot;De-Bayesing&quot;&lt;/dt&gt;
	&lt;dd&gt;Ways of taking Bayesian procedures and eliminating dependence on
priors, either by replacing them by initial point-estimates, or by showing the
prior doesn't matter, asymptotically or hopefully sooner.
See: &lt;a href=&quot;bayesian-consistency.html&quot;&gt;Frequentist consistency of Bayesian
procedures&lt;/a&gt;.&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;computational-statistics.html&quot;&gt;Computational Statistics&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;structured-data.html&quot;&gt;Statistics of structured data&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;&lt;a href=&quot;grammatical-inference.html&quot;&gt;Grammatical Inference&lt;/a&gt;&lt;/dt&gt;
	&lt;dt&gt;Factor analysis&lt;/dt&gt;
	&lt;dt&gt;Multiple testing&lt;/dt&gt;
	&lt;dt&gt;Predictive distributions&lt;/dt&gt;
	&lt;dd&gt;... especially if they have confidence/coverage properties&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;density-estimation.html&quot;&gt;Density estimation&lt;/a&gt;&lt;/dt&gt;
	&lt;dd&gt;especially &lt;em&gt;conditional&lt;/em&gt; density estimation; and
&lt;a href=&quot;density-estimation-on-graphs.html&quot;&gt;density estimation on
graphical models&lt;/a&gt;&lt;/dd&gt;
	&lt;dt&gt;&lt;a href=&quot;indirect-inference.html&quot;&gt;Indirect inference&lt;/a&gt;&lt;/dt&gt;
	&lt;/dl&gt;

&lt;ul&gt;Recommended, non-technical:
	&lt;li&gt;Francis Galton, &quot;Statistical Inquiries into the Efficacy of
Prayer,&quot; &lt;cite&gt;Fortnightly Review&lt;/cite&gt; &lt;strong&gt;12&lt;/strong&gt; (1872): 125--135
[&lt;a href=&quot;http://www.mugu.com/galton/essays/efficacy-prayer.html&quot;&gt;online&lt;/a&gt;]
	&lt;li&gt;Larry Gonick and Woollcott Smith, &lt;cite&gt;The Cartoon Guide to
Statistics&lt;/cite&gt;
	&lt;li&gt;Ian Hacking, &lt;cite&gt;The Taming of Chance&lt;/cite&gt; [Putting chance to
work in the 19th century]
	&lt;li&gt;D. Huff, &lt;cite&gt;How to Lie with Statistics&lt;/cite&gt;
	&lt;li&gt;Theodore Porter, &lt;cite&gt;The Rise of Statistical Thinking,
1820--1900&lt;/cite&gt;
	&lt;li&gt;Constance Reid, &lt;cite&gt;Neyman from Life&lt;/cite&gt; [Biography of Jerzy
Neyman, one of &lt;em&gt;the&lt;/em&gt; makers of modern statistical theory, and, I am
happy to say, among the brighter lights of my alma mater.  Reid does an
excellent job of &lt;em&gt;explaining&lt;/em&gt; Neyman's work in terms accessible to the
general reader.  There is a new edition, titled simply &lt;cite&gt;Neyman,&lt;/cite&gt; but
otherwise unchanged.  &lt;a href=&quot;http://stevereads.com/weblog/category/books/reid-constance/neyman-from-life/&quot;&gt;Review by Steve Laniel&lt;/a&gt;]
	&lt;li&gt;Edward R. Tufte
		&lt;ul&gt;
		&lt;li&gt;&lt;cite&gt;The Visual Display of Quantitative Information&lt;/cite&gt;
		&lt;li&gt;&lt;cite&gt;Visual Explanations&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;/ul&gt;

&lt;ul&gt;Recommended, technical, big pictures:
	&lt;li&gt;Ole E. Barndorff-Nielsen and David R. Cox, &lt;cite&gt;Inference
and Asymptotics&lt;/cite&gt;
	&lt;li&gt;M. S. Bartlett
		&lt;ul&gt;
		&lt;li&gt;&quot;Inference and Stochastic Processes&quot;,
&lt;cite&gt;Journal of the Royal Statistical Society&lt;/cite&gt; A &lt;strong&gt;130&lt;/strong&gt;
(1967): 457--478 [&lt;a href=&quot;http://www.jstor.org/stable/2982519&quot;&gt;JSTOR&lt;/a&gt;]
		&lt;li&gt;&quot;Chance or Chaos?&quot;, &lt;cite&gt;Journal of the Royal
Statistical Society&lt;/cite&gt; A &lt;strong&gt;153&lt;/strong&gt; (1990): 321--347 [&lt;a href=&quot;http://www.jstor.org/pss/2982976&quot;&gt;JSTOR&lt;/a&gt;]
		&lt;/ul&gt;
	&lt;li&gt;Richard A. Berk, &lt;cite&gt;Regression Analysis: A Constructive
Critique&lt;/cite&gt; [&lt;a href=&quot;../weblog/algae-2007-11.html&quot;&gt;My comments&lt;/a&gt;]
	&lt;li&gt;Leo Breiman, &quot;Statistical Modeling: The Two Cultures&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.ss/1009213726&quot;&gt;&lt;cite&gt;Statistical
Science&lt;/cite&gt; &lt;strong&gt;16&lt;/strong&gt; (2001): 199--231&lt;/a&gt; [very much including
the discussion by others and the reply by Breiman.  Thanks to Chris Wiggins for
alerting me to this.]
	&lt;li&gt;D. R. Cox and Christl A. Donnelly, &lt;cite&gt;Principles of Applied
Statistics&lt;/cite&gt; [Review forthcoming]
	&lt;li&gt;Harald Cram&amp;eacute;r, &lt;cite&gt;Mathematical Methods of
Statistics&lt;/cite&gt; [&lt;a href=&quot;../reviews/cramer-on-math-stat/&quot;&gt;Review&lt;/a&gt;]
	&lt;li&gt;C. David Garson, &lt;cite&gt;&lt;a href=&quot;http://www2.chass.ncsu.edu/garson/pa765/statnote.htm&quot;&gt;Statnotes: An Online Textbook&lt;/a&gt;&lt;/cite&gt;
	&lt;li&gt;Peter Guttorp, &lt;cite&gt;Stochastic Modeling of Scientific Data&lt;/cite&gt;
[Good introduction to using dependent data]
	&lt;li&gt;Trevor Hastie, Robert Tibshirani and Jerome Friedman, &lt;cite&gt;The
Elements of Statistical Learning: Data Mining, Inference, and Prediction&lt;/cite&gt;
[&lt;a href=&quot;http://www-stat.stanford.edu/~tibs/ElemStatLearn/&quot;&gt;Website&lt;/a&gt;, with full text free in PDF]
	&lt;li&gt;Robert E. Kass, &quot;Statistical Inference: The Big Picture&quot;,
&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;26&lt;/strong&gt; (2011): 1--19,
&lt;a href=&quot;http://arxiv.org/abs/1106.2895&quot;&gt;arxiv:1106.2895&lt;/a&gt;
	&lt;li&gt;Tony Lin [Prof. Dr. Lin was working on his doctorate when I was an
undergrad at Berkeley; we became friends at the &lt;a
href=&quot;http://www.ihouse.org/&quot;&gt;I-House&lt;/a&gt;, if that is the word I want for
someone who offered to keep my brain alive in a jigger-glass and subject it to
random electrical shocks (&quot;Jzzt!  Jzzt!&quot;).  But despite his questionable
tastes in acquaintances, he's a damn good statistician and a model teacher.]
		&lt;ul&gt;
		&lt;li&gt;&lt;a href=&quot;http://www.math.ucla.edu/~tlin/50.html&quot;&gt;Virtual
Statistics 50&lt;/a&gt; [Intro. statistics]
		&lt;li&gt;&lt;a href=&quot;http://www.math.ucla.edu/~tlin/154a.html&quot;&gt;Virtual
Statistics 154A&lt;/a&gt; [Intro. statistics with algebra and calculus]
		&lt;/ul&gt;
	&lt;li&gt;Deborah Mayo, &lt;cite&gt;Error and the Growth of Experimental
Knowledge&lt;/cite&gt; [Review: &lt;a href=&quot;../reviews/error/&quot;&gt;We Have Ways of Making
You Talk, or, Long Live Peircism-Popperism-Neyman-Pearson Thought!&lt;/a&gt;]
	&lt;li&gt;NIST, &lt;cite&gt;Electronic Handbook of Statistical Methods&lt;/cite&gt;
[&lt;a href=&quot;http://www.itl.nist.gov/div898/handbook/&quot;&gt;Full text
free online&lt;/a&gt;]
	&lt;li&gt;E. J. G. Pitman, &lt;cite&gt;Some Basic Theory for Statistical
Inference&lt;/cite&gt; [Review: &lt;a href=&quot;../reviews/pitman-basic-theory/&quot;&gt;Intermediate Statistics from an Advanced Point of View&lt;/a&gt;]
	&lt;li&gt;Jorma Rissanen, &lt;cite&gt;Stochastic Complexity in Statistical
Inquiry&lt;/cite&gt; [Review: &lt;a
href=&quot;../reviews/stochastic-complexity-in-statistical-inquiry/&quot;&gt;Less Is
More, or, &lt;em&gt;Ecce data!&lt;/em&gt;&lt;/a&gt;]
	&lt;li&gt;Mark Schervish, &lt;citE&gt;Theory of Statistics&lt;/cite&gt;
	&lt;li&gt;John R. Taylor, &lt;cite&gt;An Introduction to Error Analysis: The Study
of Uncertainties in Physical Measurements&lt;/cite&gt; [a.k.a. &quot;the book with the
train-wreck on the cover&quot;]
	&lt;li&gt;&lt;a href=&quot;http://www.stat.cmu.edu/~larry/&quot;&gt;Larry Wasserman&lt;/a&gt;
		&lt;ul&gt;
		&lt;li&gt;&lt;cite&gt;All of Statistics&lt;/cite&gt;
		&lt;li&gt;&lt;cite&gt;All of Nonparametric Statistics&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;/ul&gt;

&lt;ul&gt;Recommended, technical, close-ups:
	&lt;li&gt;A. C. Atkinson and A. N. Donev, &lt;cite&gt;Optimum Experimental
Design&lt;/cite&gt; [&lt;a href=&quot;../reviews/atkinson-donev/&quot;&gt;Review&lt;/a&gt;]
	&lt;li&gt;F. Bacchus, H. E. Kyburg and M. Thalos, &quot;Against
Conditionalization,&quot; &lt;cite&gt;Synthese&lt;/cite&gt; &lt;strong&gt;85&lt;/strong&gt; (1990):
475--506 [Why &quot;Dutch book&quot; arguments do not, in fact, mean that rational
agents must be Bayesian reasoners]
	&lt;li&gt;Andrew Barron and Nicolas Hengartner, &quot;Information theory and superefficiency&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1024691358&quot;&gt;&lt;citE&gt;Annals of Statistics&lt;/citE&gt; &lt;strong&gt;26&lt;/strong&gt; (1998):
1800--1825&lt;/a&gt;
	&lt;li&gt;M. S. Bartlett, &quot;The Statistical Significance of Odd Bits of
Information&quot;, &lt;cite&gt;Biometrika&lt;/cite&gt; &lt;strong&gt;39&lt;/strong&gt; (1952): 228--237 [A
goodness-of-fit test based on fluctuations of the
entropy.  &lt;a href=&quot;http://www.jstor.org/2334019&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;M. J. Bayarri and James O. Berger, &quot;&lt;em&gt;P&lt;/em&gt; Values for Composite
Null Models&quot;, &lt;cite&gt;Journal of the American Statistical
Association&lt;/cite&gt; &lt;strong&gt;95&lt;/strong&gt; (2000): 1127--1142 [To be read in
conjunction with Robins, van der Vaart and Ventura,
below.  &lt;a href=&quot;http://www.jstor.org/stable/2669749&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;Anil K. Bera and Aurobindo Ghosh, &quot;Neyman's Smooth Test and Its
Applications in Econometrics&quot;, pp. 177--230 in Aman Ullah, Alan T. K. Wan and
Anoop Chaturvedi (eds.), &lt;cite&gt;Handbook of Applied Econometrics and Statistical
Inference&lt;/cite&gt;, &lt;a href=&quot;http://ssrn.com/abstract=272888&quot;&gt;SSRN/272888&lt;/a&gt;
	&lt;li&gt;Julian Besag, &quot;A Candidate's Formula: A Curious Result in
Bayesian Prediction&quot;, &lt;cite&gt;Biometrika&lt;/cite&gt; &lt;strong&gt;76&lt;/strong&gt; (1989): 183
[A wonderful and bizarre expression for the Bayesian predictive density, in
terms of how adding a new data point would change the posterior.  &lt;a href=&quot;http://www.jstor.org/pss/2336383&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;David Blackwell and M. A. Girshick, &lt;cite&gt;Theory of Games and
Statistical Decisions&lt;/cite&gt;
	&lt;li&gt;Leo Breiman, &quot;No Bayesians in Foxholes&quot;, &lt;cite&gt;IEEE Expert:
Intelligent Systems and Their Applications&lt;/cite&gt; &lt;strong&gt;12&lt;/strong&gt; (1997):
21--24
[&lt;a href=&quot;http://www.stat.columbia.edu/~gelman/stuff_for_blog/breiman.pdf&quot;&gt;PDF
reprint&lt;/a&gt;; &lt;a
href=&quot;http://www.stat.columbia.edu/~cook/movabletype/archives/2006/12/no_bayesians_in.html&quot;&gt;comments&lt;/a&gt;
by Andy Gelman]
	&lt;li&gt;Jochen Brocker, &quot;A Lower Bound on Arbitrary &lt;em&gt;f&lt;/em&gt;-Divergences in
Terms of the Total Variation&quot; &lt;a href=&quot;http://arxiv.org/abs/0903.1765&quot;&gt;arxiv:0903.1765&lt;/a&gt;
	&lt;li&gt;Peter B&amp;uuml;hlmann and Sara van de Geer, &lt;citE&gt;Statistics for High-Dimensional Data: Methods, Theory and Applications&lt;/citE&gt; [&lt;a href=&quot;../weblog/algae-2011-12.html#buhlmann-van-de-geer&quot;&gt;Mini-review&lt;/a&gt;]
	&lt;li&gt;Ronald W. Butler, &quot;Predictive Likelihood Inference with
Applications&quot;, &lt;cite&gt;Journal of the Royal Statistical Society&lt;/cite&gt;
B &lt;strong&gt;48&lt;/strong&gt; (1986): 1--38 [&quot;in the predictive setting, all parameters
are nuisance
parameters&quot;.  &lt;a href=&quot;http://www.jstor.org/stable/2345635&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;Hwan-sik Choi and Nicholas M. Kiefer, &quot;Differential Geometry and
Bias Correction in Nonnested Hypothesis Testing&quot;
[&lt;a href=&quot;http://www.arts.cornell.edu/econ/kiefer/GeometryMS6.pdf&quot;&gt;PDF preprint
via Kiefer&lt;/a&gt;]
	&lt;li&gt;A. C. Davison and D. V. Hinkley, &lt;cite&gt;Bootstrap Methods and
their Application&lt;/cite&gt;
	&lt;li&gt;J. Bradford DeLong and Kevin Lang, &quot;Are All Economic Hypotheses
False?&quot;, &lt;cite&gt;Journal of Political Economy&lt;/cite&gt; &lt;strong&gt;100&lt;/strong&gt; (1992):
1257--1272 [&lt;a href=&quot;http://www.j-bradford-delong.net/pdf_files/False.pdf&quot;&gt;PDF
preprint&lt;/a&gt;.  The point is about abuses of hypothesis testing, not economic
hypotheses as such.]
	&lt;li&gt;John Earman, &lt;cite&gt;Bayes or Bust? A Critical Account of Bayesian
Confirmation Theory&lt;/cite&gt;
	&lt;li&gt;Bradley Efron
		&lt;ul&gt;
		&lt;li&gt;&quot;Bootstrap Methods: Another Look at the Jackknife&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.aos/1176344552&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;7&lt;/strong&gt; (1979): 1--26&lt;/a&gt; [The original paper;
staggeringly understandable]
		&lt;li&gt;&lt;cite&gt;The Jackknife, the Bootstrap, and Other Resampling Plans&lt;/cite&gt;
		&lt;li&gt;&quot;Maximum Likelihood and Decision Theory&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.aos/1176345778&quot;&gt;&lt;citE&gt;The Annals of Statistics&lt;/cite&gt; &lt;strong&gt;10&lt;/strong&gt; (1982): 340--356&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Steve Fienberg, &lt;cite&gt;The Analysis of Cross-Classified Categorical
Data&lt;/cite&gt;
	&lt;li&gt;Don Fraser, &quot;Is Bayes posterior just quick and dirty confidence?&quot;,
&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;26&lt;/strong&gt; (2011): 299--316,
&lt;a href=&quot;http://arxiv.org/abs/1112.5582&quot;&gt;arxiv:1112.5582&lt;/a&gt; [See also the
discussions by others, and Fraser's reply.  My answer to the question posed in
Fraser's title is &quot;yes&quot;, or rather &quot;YES!&quot;]
	&lt;li&gt;Andrew Gelman, Jennifer Hill and Masanao Yajima, &quot;Why we (usually)
don't have to worry about multiple comparisons&quot;
[&lt;a
href=&quot;http://www.stat.columbia.edu/~gelman/research/unpublished/multiple2.pdf&quot;&gt;PDF
preprint&lt;/a&gt;]
	&lt;li&gt;Andrew Gelman and Iain Pardoe, &quot;Average predictive comparisons
for models with nonlinearity, interactions, and variance components&quot;,
&lt;cite&gt;Sociological Methodology&lt;/cite&gt; forthcoming (2007)
[&lt;a
href=&quot;http://www.stat.columbia.edu/~gelman/research/published/ape17.pdf&quot;&gt;PDF
preprint&lt;/a&gt;,
Gelman's &lt;a
href=&quot;http://www.stat.columbia.edu/~cook/movabletype/archives/2007/08/average_predict.html&quot;&gt;comments&lt;/a&gt;]
	&lt;li&gt;Christopher Genovese, Peter Freeman, Larry Wasserman, Robert C. Nichol and Christopher Miller, &quot;Inference for the Dark Energy Equation of State
Using Type IA Supernova Data&quot;, &lt;cite&gt;Annals of Applied Statistics&lt;/cite&gt;
&lt;strong&gt;3&lt;/strong&gt; (2009): 144--178, &lt;a href=&quot;http://arxiv.org/abs/0805.4136&quot;&gt;arxiv:0805.4136&lt;/a&gt; [I am biased, because Genovese and Wasserman are friends, but this seems to me a model of a modern applied statistics paper: use interesting
statistical ideas to say something helpful about an important scientific
problem &lt;em&gt;on its own terms&lt;/em&gt;, rather than distorting the problem
until it &quot;looks like a nail&quot;.]
	&lt;li&gt;Charles J. Geyer, &quot;Le Cam Made Simple: Asymptotics of Maximum
Likelihood without the LLN or CLT or Sample Size Going to Infinity&quot;
[&lt;a href=&quot;http://www.stat.umn.edu/geyer/lecam/&quot;&gt;PDF preprint&lt;/a&gt;, via
Prof. Geyer.  &amp;mdash; There are two separable points here.  One is that much of
the usual asymptotic theory of maximum likelihood follows from the quadratic
form of the likelihood &lt;em&gt;alone&lt;/em&gt;; whenever and however that is reached,
those consequences follow.  Approximately quadratic likelihoods imply
approximations to the usual asymptotics.  This is unquestionably correct.  The
other is some bashing of results like the law of large numbers and central
limit theorem, which seems misguided to me.]
	&lt;li&gt;Tilmann Gneiting, &quot;Making and Evaluating Point Forecasts&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1198/jasa.2011.r10138&quot;&gt;&lt;cite&gt;Journal of the American Statistical Association&lt;/cite&gt; &lt;strong&gt;106&lt;/strong&gt; (2011): 746--762&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/0912.0902&quot;&gt;arxiv:0912.0902&lt;/a&gt;
	&lt;li&gt;Trygve Haavelmo, &quot;The Probability Approach in Econometrics&quot;,
&lt;citE&gt;Econometrica&lt;/cite&gt; &lt;strong&gt;12&lt;/strong&gt; (1944, supplement): iii--115
[&lt;a href=&quot;http://www.jstor.org/stable/info/1906935&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;Mark S. Handcock and Martina Morris, &lt;cite&gt;Relative Distribution
Methods in the Social Sciences&lt;/cite&gt;
[Review: &lt;a href=&quot;../reviews/relative-distribution/&quot;&gt;Beyond Mean and
Deviance&lt;/a&gt;]
	&lt;li&gt;Bruce E. Hansen
		&lt;ul&gt;
		&lt;li&gt;&quot;The Likelihood Ratio Test Under Nonstandard
Conditions: Testing the Markov Switching Model of GNP&quot;, &lt;cite&gt;Journal of
Applied Econometrics&lt;/cite&gt;
&lt;strong&gt;7&lt;/strong&gt; (1992): S61--S82 [I very much like the approach of treating
the likelihood ratio as an &lt;a href=&quot;empirical-process-theory.html&quot;&gt;empirical
process&lt;/a&gt;; why haven't I seen it before?  (Also, the state-of-the-art in
simulating Gaussian processes must be much better now than what Hansen had in
'92, which would make this even more
practical.)  &lt;a href=&quot;http://www.ssc.wisc.edu/~bhansen/papers/jae_92.pdf&quot;&gt;PDF
reprint&lt;/a&gt;.]
		&lt;li&gt;&quot;Inference when a nuisance parameter is not identified under the null hypothesis&quot;, &lt;a href=&quot;http://www.ssc.wisc.edu/~bhansen/papers/ecnmt_96.html&quot;&gt;&lt;cite&gt;Econometrica&lt;/cite&gt; &lt;strong&gt;64&lt;/strong&gt; (1996): 413--430&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Jeffrey D. Hart, &lt;cite&gt;Nonparametric Smoothing and Lack-of-Fit
Tests&lt;/cite&gt; [&lt;a href=&quot;../weblog/algae-2009-10.html#hart-on-smoothing&quot;&gt;comments&lt;/a&gt;]
	&lt;li&gt;Nils Lid Hjort and David Pollard, &quot;Asymptotics for minimisers of convex processes&quot;, &lt;a href=&quot;http://arxiv.org/abs/1107.3806&quot;&gt;arxiv:1107.3806&lt;/a&gt;
[Very elegant]
	&lt;li&gt;Peter J. Huber, &quot;On the Non-Optimality of Optimal Procedures&quot;
[&lt;a href=&quot;http://projecteuclid.org/euclid.lnms/1249305323&quot;&gt;Full text&lt;/a&gt;]
	&lt;li&gt;Wilbert C. M. Kallenberg and Teresa Ledwina, &quot;Data-driven
smooth tests when the hypothesis is composite&quot;, &lt;cite&gt;Journal of the
American Statistical Association&lt;/cite&gt; &lt;strong&gt;92&lt;/strong&gt; (1997): 1094--1104
[&lt;a href=&quot;http://purl.utwente.nl/publications/62408&quot;&gt;Abstract, PDF reprint&lt;/a&gt;; &lt;a href=&quot;http://www.jstor.org/stable/2965574&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;Gary King, &lt;cite&gt;A Solution to the Ecological Inference Problem:
Reconstructing Individual Behavior from Aggregate Data&lt;/cite&gt; [&lt;a
href=&quot;../reviews/king-on-ecological-inference/&quot;&gt;Review&lt;/a&gt;]
	&lt;li&gt;Solomon W. Kullback, &lt;cite&gt;Information Theory and Statistics&lt;/cite&gt;
	&lt;li&gt;Michael Lavine and Mark J. Schervish, &quot;Bayes Factors: What They
Are and What They Are Not&quot; [&lt;a href=&quot;http://www.stat.cmu.edu/tr/tr652/tr652.html&quot;&gt;PS preprint&lt;/a&gt;]
	&lt;li&gt;Steffen Lauritzen, &lt;cite&gt;Extremal Families and Systems of
Sufficient Statistics&lt;/cite&gt; [See comments
under &lt;a href=&quot;sufficient-statistics.html&quot;&gt;sufficient statistics&lt;/a&gt;]
	&lt;li&gt;J. F. Lawless and Marc Fredette, &quot;Frequentist prediction intervals
and predictive distributions&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1093/biomet/92.3.529&quot;&gt;&lt;cite&gt;Biometrika&lt;/cite&gt;
&lt;strong&gt;92&lt;/strong&gt; (2005): 529--542&lt;/a&gt; [&quot;Frequentist predictive distributions
are defined as confidence distributions .... A simple pivotal-based approach
that produces prediction intervals and predictive distributions with
well-calibrated frequentist probability interpretations is introduced, and
efficient simulation methods for producing predictive distributions are
considered. Properties related to an average Kullback-Leibler measure of
goodness for predictive or estimated distributions are given.&quot;]
	&lt;li&gt;Lucien Le Cam
		&lt;ul&gt;
		&lt;li&gt;&quot;Neyman and Stochastic Models&quot;
[&lt;a
href=&quot;http://stat-www.berkeley.edu/users/rice/LeCam/papers/94ms_neyman.pdf&quot;&gt;PDF&lt;/a&gt;.
Some vignettes of Neyman putting together models, and his model-building
process.]
		&lt;li&gt;&quot;Maximum Likelihood: An Introduction&quot;
[&lt;a
href=&quot;http://stat-www.berkeley.edu/users/rice/LeCam/papers/tech168.pdf&quot;&gt;PDF&lt;/a&gt;.
Not really an introduction, but rather a collection of examples of where it
just &lt;em&gt;does not work&lt;/em&gt;, or at least doesn't work well.]
		&lt;/ul&gt;
	&lt;li&gt;Erich L. Lehmann, &quot;On likelihood ratio tests&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0610835&quot;&gt;math.ST/0610835&lt;/a&gt;
	&lt;li&gt;Jing Lei, James Robins, Larry Wasserman, &quot;Efficient Nonparametric Conformal Prediction Regions&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.1418&quot;&gt;arxiv:1111.1418&lt;/a&gt;
	&lt;li&gt;&lt;a href=&quot;http://www.stat.psu.edu/~bing/&quot;&gt;Bing Li&lt;/a&gt;, &quot;A minimax
approach to consistency and efficiency for estimating equations,&quot; &lt;cite&gt;Annals
of Statistics&lt;/cite&gt; &lt;strong&gt;24&lt;/strong&gt; (1996): 1283--1297
[&lt;a href=&quot;http://www.stat.psu.edu/~bing/papers/aos96.ps&quot;&gt;online version&lt;/a&gt;]
	&lt;li&gt;Bruce Lindsay and Jiawei Liu, &quot;Model Assessment Tools for a Model
False
World&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1270041257&quot;&gt;&lt;cite&gt;Statistical
Science&lt;/cite&gt;
&lt;strong&gt;24&lt;/strong&gt; (2009):
303--318&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1010.0304&quot;&gt;arxiv:1010.0304&lt;/a&gt;
[Their model-adequacy index is, essentially, the number of samples needed to
detect the falsity of the model with some reasonable, pre-set level of power,
with fixed size/significance level.  This is a very natural quantity.  In fact,
by results which go back to Kullback's book, the power grows exponentially,
with a rate equal to the Kullback-Leibler divergence rate.  (More exactly, one
minus the power goes to zero exponentially at that rate, but you know what I
meant.)  &lt;a href=&quot;large-deviations.html&quot;&gt;Large deviations theory&lt;/a&gt; includes
generalizations of this result.  Many statisticians, I'd guess, would prefer
the Lindsay-Liu index because they will find it more natural to gauge error
in terms of a sample size rather than bits, but to each their own.]
	&lt;li&gt;Brad Luen and Philip B. Stark, &quot;Testing earthquake predictions&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.imsc/1207580090&quot;&gt;pp. 302--315 in
Deborah Nolan and Terry Speed (eds.), &lt;cite&gt;Probability and Statistics: Essays
in Honor of David A. Freedman&lt;/cite&gt;&lt;/a&gt; [The issues arise, however, not just for earthquakes, but for all sorts of clustered events]
	&lt;li&gt;Charles Manski, &lt;cite&gt;Identification for Prediction
and Decision&lt;/cite&gt; [&lt;a href=&quot;../weblog/algae-2009-08.html#manski&quot;&gt;Mini-review&lt;/a&gt;]
	&lt;li&gt;Deborah G. Mayo and D. R. Cox, &quot;Frequentist statistics as a theory
of inductive
inference&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0610846&quot;&gt;math.ST/0610846&lt;/a&gt;
	&lt;li&gt;M. B. Nevel'son and R. Z. Has'minskii, &lt;cite&gt;Stochastic
Approximation and Recursive Estimation&lt;/cite&gt;
	&lt;li&gt;Andrey Novikov, &quot;Optimal sequential multiple
hypothesis tests&quot;, &lt;a href=&quot;http://arxiv.org/abs/0811.1297&quot;&gt;arxiv:0811.1297&lt;/a&gt;
	&lt;li&gt;David Pollard
		&lt;ul&gt;
		&lt;li&gt;&quot;Asymptotics via Empirical Processes&quot;,
&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;4&lt;/strong&gt; (1989): 341--354
		&lt;li&gt;&lt;cite&gt;Empirical Processes: Theory and Applications&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;li&gt;Jeffrey S. Racine, &quot;Nonparametric Econometrics: A Primer&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1561/0800000009&quot;&gt;&lt;cite&gt;Foundations and Trends in Econometrics&lt;/cite&gt;
&lt;strong&gt;3&lt;/strong&gt; (2008): 1--88&lt;/a&gt; [Good primer of nonparametric techniques
for regression, density estimation and hypothesis testing; next to no economic
content (except for examples).  Presumes reasonable familiarity with parametric
statistics.  &lt;a href=&quot;http://socserv.mcmaster.ca/racine/ECO0301.pdf&quot;&gt;PDF
reprint&lt;/a&gt;]
	&lt;li&gt;James M. Robins and Ya'acov Ritov, &quot;Toward a Curse of Dimensionality
Appropriate (CODA) Asymptotic Theory for Semi-Parametric Models&quot;,
&lt;cite&gt;Statistics in Medicine&lt;/citE&gt; &lt;strong&gt;16&lt;/strong&gt; (1997): 285--319
[&lt;a href=&quot;http://www.biostat.harvard.edu/~robins/coda.pdf&quot;&gt;PDF reprint&lt;/a&gt; via Prof. Robins]
	&lt;li&gt;James M. Robins, Aad van der Vaart and Val&amp;eacute;rie Ventura,
&quot;Asymptotic Distribution of &lt;em&gt;P&lt;/em&gt; Values in Composite Null Models&quot;, &lt;cite&gt;Journal
of the American Statistical Association&lt;/cite&gt; &lt;strong&gt;95&lt;/strong&gt; (2000):
1143--1156 [&lt;a href=&quot;http://www.jstor.org/stable/2669750&quot;&gt;JSTOR&lt;/a&gt;.  Paired
article with Bayarri and Berger, above.  The discussions and rejoinders
(pp. 1157--1172) are valuable.]
	&lt;li&gt;George G. Roussas, &lt;cite&gt;Contiguity of Probability Measures: Some Applications in Statistics&lt;/cite&gt; [&lt;a href=&quot;../weblog/algae-2010-10.html#roussas&quot;&gt;Mini-review&lt;/a&gt;]
	&lt;li&gt;C. Scott and R. Nowak, &quot;A Neyman-Pearson Approach to Statistical
Learning&quot;, &lt;a href=&quot;http://dx.doi.org/10.1109/TIT.2005.856955&quot;&gt;&lt;cite&gt;IEEE
Transactions on Information Theory&lt;/cite&gt; &lt;strong&gt;51&lt;/strong&gt; (2005):
3806--3819&lt;/a&gt; [Comments: &lt;a href=&quot;../weblog/645.html&quot;&gt;Learning Your Way to
Maximum Power&lt;/a&gt;]
	&lt;li&gt;Jeffrey S. Simonoff, &lt;cite&gt;Smoothing Methods in Statistics&lt;/cite&gt;
	&lt;li&gt;Spyros Skouras, &quot;Decisionmetrics: Towards a Decision-Based
Approach to Econometrics,&quot; &lt;a
href=&quot;http://www.santafe.edu/sfi/publications/Abstracts/01-11-064abs.html&quot;&gt;SFI
Working Paper 2001-11-064&lt;/a&gt; [Applies far outside econometrics.  If what you
really want to do is to minimize a &lt;em&gt;known&lt;/em&gt; loss function, optimizing a
conventional accuracy measure, e.g. least squares, can be highly
counterproductive.]
	&lt;li&gt;Aris Spanos
		&lt;ul&gt;
		&lt;li&gt;&quot;The Curve-Fitting Problem, Akaike-type Model
Selection, and the Error Statistical Approach&quot; [Or: could &lt;em&gt;your&lt;/em&gt; model
selection tell you that Kepler is better than Ptolemy?  Technical report,
economics dept., Virginia Tech,
2006.  &lt;a
href=&quot;http://www.econ.vt.edu/Seminars/Seminarpapers/Spanos-curvefitting.pdf&quot;&gt;PDF&lt;/a&gt;]
		&lt;li&gt;&quot;Where do statistical models come from? Revisiting the
problem of specification&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0610849&quot;&gt;math.ST/0610849&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Sara van de Geer, &lt;cite&gt;Empirical Process Theory in
M-Estimation&lt;/cite&gt; [Finding non-asymptotic rates of convergence for common
estimators]
	&lt;li&gt;Quang H. Vuong, &quot;Likelihood Ratio Tests for Model Selection and
Non-Nested Hypotheses&quot;, &lt;cite&gt;Econometrica&lt;/cite&gt; &lt;strong&gt;57&lt;/strong&gt; (1989):
307--333
	&lt;li&gt;&lt;a href=&quot;http://www.stat.wisc.edu/~wahba/&quot;&gt;Grace Wahba&lt;/a&gt;,
&lt;cite&gt;Spline Models for Observational Data&lt;/cite&gt;
	&lt;li&gt;Michael E. Wall, Andreas Rechtsteiner and Luis M. Rocha, &quot;Singular
Value Decomposition and Principal Component Analysis,&quot;
&lt;a href=&quot;http://arxiv.org/abs/physics/0208101&quot;&gt;physics/0208101&lt;/a&gt;
	&lt;li&gt;Larry Wasserman, &quot;Low Assumptions, High Dimensions&quot;, &lt;a href=&quot;http://www.rmm-journal.de/downloads/Article_Wasserman.pdf&quot;&gt;&lt;cite&gt;RMM&lt;/cite&gt; &lt;strong&gt;2&lt;/strong&gt; (2011): 201--209&lt;/a&gt;
	&lt;li&gt;Halbert White, &lt;citE&gt;Estimation, Inference and Specification
Analysis&lt;/cite&gt; [&lt;a href=&quot;../weblog/algae-2009-09.html#white&quot;&gt;mini-review&lt;/a&gt;]
	&lt;li&gt;Achilleas Zapranis and Apostolos-Paul Refenes, &lt;cite&gt;Principles of
Neural Model Identification, Selection and Adequacy, with Applications to
Financial Econometrics&lt;/cite&gt;
	&lt;li&gt;Sven Zenker, Jonathan Rubin, Gilles Clermont, &quot;From Inverse Problems in Mathematical Physiology to Quantitative Differential Diagnoses&quot;, &lt;a href=&quot;http://dx.doi.org/10.1371/journal.pcbi.0030204&quot;&gt;&lt;cite&gt;PLoS Computational Biology&lt;/cite&gt; &lt;strong&gt;3&lt;/strong&gt; (2007): e205&lt;/a&gt;
	&lt;/ul&gt;

&lt;ul&gt;Recommended, historical:
	&lt;li&gt;Nicola Giocoli, &quot;From Wald to Savage: homo economicus becomes a Bayesian statistician&quot; [&lt;a href=&quot;http://mpra.ub.uni-muenchen.de/34117/&quot;&gt;preprint&lt;/a&gt;]
	&lt;li&gt;Erich L. Lehmann, &quot;On the history and use of some standard statistical models&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.imsc/1207580081&quot;&gt;pp. 114--126 in Deborah Nolan and Terry Speed (eds.), &lt;cite&gt;Probability and Statistics: Essays in Honor of David A. Freedman&lt;/cite&gt;&lt;/a&gt;
	&lt;li&gt;Henry Scheffe, &quot;Statistical Inference in the Non-Parametric Case&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aoms/1177731355&quot;&gt;&lt;cite&gt;Annals of Mathematical Statistics&lt;/cite&gt; &lt;strong&gt;14&lt;/strong&gt;
(1943): 305--332&lt;/a&gt; [Recommended not as a historical study, but as a historical
document]
	&lt;/ul&gt;

&lt;ul&gt;Modesty forbids me to recommend:
	&lt;li&gt;Andrew Gelman and CRS, &quot;Philosophy and the practice of Bayesian
statistics&quot;, submitted to the &lt;cite&gt;Journal of the American Statistical
Association&lt;/cite&gt;, &lt;a href=&quot;http://arxiv.org/abs/1006.3868&quot;&gt;arxiv:1006.3868&lt;/a&gt;
	&lt;li&gt;CRS, &quot;The Bootstrap&quot;, &lt;a href=&quot;http://www.americanscientist.org/issues/pub/2010/3/the-bootstrap/1&quot;&gt;&lt;cite&gt;American Scientist&lt;/cite&gt; &lt;strong&gt;98&lt;/strong&gt; (2010): 186--190&lt;/a&gt; [&lt;a href=&quot;http://bactra.org/weblog/652.html&quot;&gt;Further&lt;/a&gt;]
	&lt;/ul&gt;

&lt;ul&gt;To read, textbooks, reviews, etc.:
	&lt;li&gt;R. Harald Baayen, &lt;cite&gt;Analyzing Linguistic Data: A Practical Introduction to Statistics Using R&lt;/cite&gt; [&lt;a href=&quot;http://cambridge.org/9780521709187&quot;&gt;blurb&lt;/a&gt;]
	&lt;li&gt;Vic Barnett, &lt;cite&gt;Comparative Statistical Inference&lt;/cite&gt;
	&lt;li&gt;Bucklew, &lt;cite&gt;Large Deviation Techniques in Decision, Simulation,
and Estimation&lt;/cite&gt;
	&lt;li&gt;A. C. Davison, &lt;cite&gt;Statistical Models&lt;/cite&gt;
	&lt;li&gt;Alan Julian Izenman, &lt;citE&gt;Modern Multivariate Statistical
Techniques: Regression, Classification, and Manifold Learning&lt;/cite&gt;
	&lt;li&gt;Erich L. Lehmann,  &lt;cite&gt;Elements of Large-Sample Theory&lt;/cite&gt;
	&lt;li&gt;National Institute of Standards and Technology, &lt;cite&gt;&lt;a
href=&quot;http://www.itl.nist.gov/div898/handbook/index.htm&quot;&gt;Engineering Statistics
Handbook&lt;/a&gt;&lt;/cite&gt; [All the sections I've looked at have been quite good.]
	&lt;li&gt;Yudi Pawitan, &lt;cite&gt;In All Likelihood: Statistical Modeling
and Inference Using Likelihood&lt;/cite&gt;
	&lt;li&gt;Aris Spanos, &lt;cite&gt;Probability Theory and Statistical Inference:
Econometric Modeling with Observational Data&lt;/cite&gt;
	&lt;/ul&gt;

&lt;ul&gt;To read, history and philosophy:
	&lt;li&gt;Odd O. Aalen, Per Kragh Andersen, &amp;Oslash;rnulf Borgan, Richard D. Gill, Niels Keiding, &quot;History of applications of martingales in survival analysis&quot;,
&lt;cite&gt;Electronic Journal for History of Probability and Statistics&lt;/cite&gt;
&lt;strong&gt;5&lt;/strong&gt; (2009), &lt;a href=&quot;http://arxiv.org/abs/1003.0188&quot;&gt;arxiv:1003.0188&lt;/a&gt;
	&lt;li&gt;Carolina Armenteros, &quot;From Human Nature to Normal Humanity: Joseph de Maistre, Rousseau, and the Origins of Moral Statistics&quot;, &lt;cite&gt;Journal of the
History of Ideas&lt;/cite&gt; &lt;strong&gt;68&lt;/strong&gt; (2007): 107--130 [&lt;a href=&quot;http://dx.doi.org/10.1353/jhi.2007.0000&quot;&gt;Abstract, text links&lt;/a&gt;]
	&lt;li&gt;Ian Hacking, &quot;The Theory of Probable Inference: Neyman, Peirce and
Braithwaite,&quot; in &lt;cite&gt;Science, Belief and Behavior: Essays in Honor of
R. B. Braithwaite&lt;/cite&gt; ed. D. H. Mellor
	&lt;li&gt;Anders Hald, &lt;cite&gt;A History of Parametric Statistical Inference from Bernoulli to Fisher, 1713--1935&lt;/cite&gt; [&lt;a href=&quot;http://www.springer.com/978-0-387-46408-4&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Kendall and Plackett (eds.), &lt;cite&gt;Studies in the History of
Statistics and Probability&lt;/cite&gt;
	&lt;li&gt;Kyburg, &lt;cite&gt;Uncertain Inference&lt;/cite&gt;
	&lt;li&gt;Erich Lehmann and Juliet Shaffer, &lt;cite&gt;Fisher, Neyman, and the Creation of Classical Statistics&lt;/cite&gt; [&lt;a href=&quot;http://www.springer.com/book/978-1-4419-9499-8&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Mayo and Hollander (eds.), &lt;cite&gt;Acceptable Evidence: Science and
Values in Risk Management&lt;/cite&gt;
	&lt;li&gt;Leland Gerson Neuberg, &lt;cite&gt;Conceptual Anomalies in Economics and
Statistics: Lessons from the Social Experiment&lt;/cite&gt;
[&lt;a href=&quot;http://cambridge.org/052130444X&quot;&gt;blurb&lt;/a&gt;]
	&lt;li&gt;Theodore Porter, &lt;cite&gt;Trust in Numbers&lt;/cite&gt;
	&lt;li&gt;Stephen M. Stigler
		&lt;ul&gt;
		&lt;li&gt;&lt;cite&gt;The History of Statistics: The Measure of Uncertainty
before 1900&lt;/cite&gt;
		&lt;li&gt;&lt;cite&gt;Statistics on the Table: The History of Statistical
Concepts and Methods&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;li&gt;&lt;a href=&quot;http://engl.iastate.edu/directory/gdwilson&quot;&gt;Gregory
D. Wilson&lt;/a&gt;, &lt;cite&gt;Articulation Theory and Disciplinary Change: Unpacking the
Bayesian-Frequentist Paradigm Conflict in Statistical Science&lt;/cite&gt;
[Ph.D. thesis, New Mexico State University, 2001]
	&lt;li&gt;S. L. Zabell, &lt;cite&gt;Symmetry and Its Discontents: Essays on the
History of Inductive Probability&lt;/cite&gt;
[&lt;a href=&quot;http://cambridge.org/052144912X&quot;&gt;blurb&lt;/a&gt;]
	&lt;/ul&gt;

&lt;ul&gt;To read, research literature:
	&lt;li&gt;Felix Abramovich, Yoav Benjamini, David L. Donoho and Iain
M. Johnstone, &quot;Adapting to Unknown Sparsity by controlling the False Discovery
Rate&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0505374&quot;&gt;math.ST/0505374&lt;/a&gt; [I
don't really care about sparsity, but they promise novel relations among FDR
control, asymptotic minimaxity and complexity-penalized model
selection.]
	&lt;li&gt;Sophie Achard, &quot;A quadratic measure of
dependence&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0609259&quot;&gt;math.ST/0609259&lt;/a&gt;
[&quot;Asymptotic properties of a dimension-robust dependence measure are
investigated. It is related to those used in independence tests, but is
derivable, thus suitable for independent component analysis. An adjustable
kernel allows to accelerate the convergence of the estimator without affecting
the bias.&quot;]
	&lt;li&gt;Gianfranco Adimari and Annamaria Guolo, &quot;A note on the asymptotic behaviour of empirical likelihood statistics&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s10260-010-0137-9&quot;&gt;&lt;cite&gt;Statistical Methods and Applications&lt;/cite&gt; &lt;strong&gt;19&lt;/strong&gt; (2010): 463--476&lt;/a&gt;
	&lt;li&gt;Shotaro Akaho, &quot;A kernel method for canonical correlation
analysis&quot;, &lt;a href=&quot;http://arxiv.org/abs/cs.LG/0609071&quot;&gt;cs.LG/0609071&lt;/a&gt;
	&lt;li&gt;Elizabeth S. Allman, Catherine Matias, John A. Rhodes, &quot;Identifiability of parameters in latent structure models with many observed variables&quot;,
&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2009): 3099--3132,
&lt;a href=&quot;http://arxiv.org/abs/0809.5032&quot;&gt;arxiv:0809.5032&lt;/a&gt;
	&lt;li&gt;Miguel A. Arcones, &quot;Bahadur Efficiency of the Likelihood Ratio
Test&quot; [&lt;a href=&quot;http://www.math.binghamton.edu/arcones/prep/pv.pdf&quot;&gt;PDF preprint&lt;/a&gt; from 2005, presumably since published...]
	&lt;li&gt;Barry C. Arnold et al., &lt;cite&gt;Conditional Specification of Statistical Models&lt;/cite&gt;
	&lt;li&gt;R. A. Bailey, &lt;cite&gt;Design of Comparative Experiments&lt;/citE&gt;
[&lt;a href=&quot;http://cambridge.org/9780521683579&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Roger Barlow, &quot;Asymmetric Errors&quot;, &lt;a
href=&quot;http://arxiv.org/abs/physics/0401042&quot;&gt;physics/0401042&lt;/a&gt;
	&lt;li&gt;Ole E. Barndorff-Nielsen and David R. Cox, &quot;Prediction and
Asymptotics&quot;, &lt;cite&gt;Bernoulli&lt;/cite&gt;
&lt;strong&gt;2&lt;/strong&gt; (1996): 319--340
	&lt;li&gt;Ole E. Barndorff-Nielsen, David R. Cox and Claudia Kl&amp;uuml;ppelberg
(eds.), &lt;cite&gt;Complex Stochastic Systems&lt;/cite&gt;
	&lt;li&gt;Zvika Ben-Haim and Yonina C. Eldar, &quot;The Cramer-Rao Bound for
Sparse Estimation&quot;, &lt;a href=&quot;http://arxiv.org/abs/0905.4378&quot;&gt;arxiv:0905.4378&lt;/a&gt;
	&lt;li&gt;Yoav Benjamini, Marina Bogomolov, &quot;Adjusting for selection bias in testing multiple families of hypotheses&quot;, &lt;a href=&quot;http://arxiv.org/abs/1106.3670&quot;&gt;arxiv:1106.3670&lt;/a&gt;
	&lt;li&gt;Patrice Bertail, Paul Doukhan and Philippe Soulier
(eds.), &lt;cite&gt;Dependence in Probability and Statistics&lt;/cite&gt; [&quot;recent
developments in ... probability and statistics for dependent data... from
Markov chain theory and weak dependence with an emphasis on ...  dynamical
systems, to strong dependence in times series and random fields. ... section on
statistical estimation problems and specific
applications&quot;. &lt;a href=&quot;http://www.springer.com/0-387-31741-4&quot;&gt;Full blurb,
contents&lt;/a&gt;]
	&lt;li&gt;Rabi Bhattacharya and Vic Patrangenaru, &quot;Large sample theory of
intrinsic and extrinsic sample means on manifolds--II&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0507423&quot;&gt;math.ST/0507423&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053605000000093&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 1225--1259&lt;/a&gt; [I need a notebook
on statistics on manifolds, as opposed to &lt;a href=&quot;info-geo.html&quot;&gt;statistical
manifolds&lt;/a&gt;]
	&lt;li&gt;G. Biau and L. Gyorfi, &quot;On the Asymptotic Properties of a
Nonparametric L1-Test Statistic of
Homogeneity&quot;, &lt;a href=&quot;http://dx.doi.org/10.1109/TIT.2005.856979&quot;&gt;&lt;cite&gt;IEEE
Transactions on Information Theory&lt;/cite&gt; &lt;strong&gt;51&lt;/strong&gt; (2005):
3965--3973&lt;/a&gt;
	&lt;li&gt;David R. Bickel
		&lt;ul&gt;
		&lt;li&gt;&quot;The Strength of Statistical Evidence for Composite Hypotheses: Inference to the Best Explanation&quot; [&lt;a href=&quot;http://biostats.bepress.com/cobra/ps/art71/&quot;&gt;Preprint&lt;/a&gt;]
		&lt;li&gt;&quot;Resolving conflicts between statistical methods by probability combination: Application to empirical Bayes analyses of genomic data&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.6174&quot;&gt;arxiv:1111.6174&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Peter J. Bickel, C. A. J. Klaassen, Y. Ritov and J. A. Wellner,
&lt;cite&gt;Efficient and Adaptive Estimation for Semiparametric Models&lt;/cite&gt;
	&lt;li&gt;Peter J. Bickel and Bo Li, &quot;Regularization in Statistics&quot;,
&lt;cite&gt;Test&lt;/cite&gt; &lt;strong&gt;15&lt;/strong&gt; (2006): 271--344
[&lt;a href=&quot;http://www.stat.berkeley.edu/~bickel/Test_BickelLi.pdf&quot;&gt;PDF
reprint&lt;/a&gt;]
	&lt;li&gt;Peter J. Bickel and Y. Ritov, &quot;Non-Parametric Estimators Which Can
Be `Plugged-In' &quot; UCB Stat. Tech. Rep. 602 [&lt;a
href=&quot;http://www.stat.berkeley.edu/tech-reports/602.abstract&quot;&gt;abstract&lt;/a&gt;, &lt;a
href=&quot;http://www.stat.berkeley.edu/~bickel/602.pdf&quot;&gt;pdf&lt;/a&gt;]
	&lt;li&gt;L. Birge, &quot;A New Lower Bound for Multiple Hypothesis Testing&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1109/TIT.2005.844101&quot;&gt;&lt;cite&gt;IEEE Transactions on
Information Theory&lt;/cite&gt; &lt;strong&gt;51&lt;/strong&gt; (2005): 1611--1615&lt;/a&gt;
	&lt;li&gt;Gilles Blanchard, Sylvain Delattre, Etienne Roquain, &quot;Testing over a continuum of null hypotheses&quot;, &lt;a href=&quot;http://arxiv.org/abs/1110.3599&quot;&gt;arxiv:1110.3599&lt;/a&gt;
	&lt;li&gt;Michael Blum, &quot;Approximate Bayesian Computation: a non-parametric
perspective&quot;, &lt;a href=&quot;http://arxiv.org/abs/0904.0635&quot;&gt;arxiv:0904.0635&lt;/a&gt;
	&lt;li&gt;Ingwer Borg and Patrick J. F. Groenen, &lt;cite&gt;Modern Multidimensional Scaling: Theory and Application&lt;/cite&gt;
	&lt;li&gt;A. R. Brazzale and A. C. Davison, &quot;Accurate Parametric Inference for Small Samples&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1242049390&quot;&gt;&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;23&lt;/strong&gt;
(2008): 465--484&lt;/a&gt; [Apparently, a preview for the book.]
	&lt;li&gt;A. R. Brazzale, A. C. Davison and N. Reid, &lt;cite&gt;Applied
Asymptotics: Case Studies in Small-Sample Statistics&lt;/cite&gt;
[&lt;a href=&quot;http://cambridge.org/9780521847032&quot;&gt;blurb&lt;/a&gt;]
	&lt;li&gt;Trevor S. Breusch, &quot;Hypothesis Testing in Unidentified Models&quot;,
&lt;cite&gt;Review of Economic Studies&lt;/cite&gt; &lt;strong&gt;53&lt;/strong&gt; (1986): 635--651
[&lt;a href=&quot;http://www.jstor.org/pss/2297609&quot;&gt;JSTOR&lt;/a&gt;]
	&lt;li&gt;Adam D. Bull, &quot;Honest adaptive confidence bands and self-similar functions&quot;, &lt;a href=&quot;http://arxiv.org/abs/1110.4985&quot;&gt;arxiv:1110.4985&lt;/a&gt;
	&lt;li&gt;Florentina Bunea, Alexandre B. Tsybakov, Marten H. Wegkamp, Adrian Barbu, &quot;Spades and Mixture Models&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt; (2010): 2525--2558, &lt;a href=&quot;http://arxiv.org/abs/0901.2044&quot;&gt;arxiv:0901.2044&lt;/a&gt;
	&lt;li&gt;Dizza Bursztyn and David M. Steinberg, &quot;Comparison of designs for
computer experiments&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1016/j.jspi.2004.08.007&quot;&gt;&lt;cite&gt;Journal
of Statistical Planning and Inference&lt;/cite&gt; &lt;strong&gt;136&lt;/strong&gt; (2006):
1103--1119&lt;/a&gt;
	&lt;li&gt;T. Tony Cai, &quot;Minimax and Adaptive Inference in Nonparametric Function Estimation&quot;, &lt;a href=&quot;http://dx.doi.org/10.1214/11-STS355&quot;&gt;&lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;Strong&gt;27&lt;/strong&gt; (2012): 31--50&lt;/a&gt;
	&lt;li&gt;T. Tony Cai and Mark G. Low, &quot;An adaptation theory for
nonparametric confidence intervals&quot;, &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053604000000049&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;32&lt;/strong&gt; (2004): 1805--1840&lt;/a&gt; = &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0503662&quot;&gt;math.ST/0503662&lt;/a&gt;
	&lt;li&gt;Emmanuel Candes and Terence Tao, &quot;Near Optimal Signal Recovery from
Random Projections and Universal Encoding Strategies&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.CA/0410542&quot;&gt;math.CA/0410542&lt;/a&gt;
	&lt;li&gt;Herve Cardot, Andre Mas and Pascal Sarda, &quot;CLT in Functional Linear
Regression Models&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0508073&quot;&gt;math.ST/0508073&lt;/a&gt;
	&lt;li&gt;Djalil Chafai and Didier Concordet, &quot;On the strong consistency of
approximated M-estimators&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0507102&quot;&gt;math.ST/0507102&lt;/a&gt; [Sounds
cool...]
	&lt;li&gt;In Hong Chang and Rahul Mukerjee, &quot;Asymptotic results on the
frequentist mean squared error of generalized Bayes point predictors&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1016/j.spl.2003.11.015&quot;&gt;&lt;cite&gt;Statistics and
Probability Letters&lt;/cite&gt; &lt;strong&gt;67&lt;/strong&gt; (2004): 65--71&lt;/a&gt; [Note to
self: file this one under &quot;de-Bayesing&quot;.]
	&lt;li&gt;Sandra Chapman, George Rowlands and Nicholas Watkins
		&lt;ul&gt;
		&lt;li&gt;&quot;Extremum statistics: A framework for data analysis,&quot; &lt;a
href=&quot;http://arxiv.org/abs/cond-mat/0106015&quot;&gt;cond-mat/0106015&lt;/a&gt;
		&lt;li&gt;&quot;Extremum Statistics and Signatures of Long Range
Correlations,&quot; &lt;a
href=&quot;http://arxiv.org/abs/cond-mat/0106015&quot;&gt;cond-mat/0106015&lt;/a&gt;
		&lt;li&gt;&quot;The relationship between extremum statistics and
universal fluctuations,&quot; &lt;a
href=&quot;http://arxiv.org/abs/cond-mat/0007275&quot;&gt;cond-mat/0007275&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Jiahua Chen and Xianming Tan, &quot;Inference for Multivariate Normal
Mixtures&quot;, &lt;a href=&quot;http://arxiv.org/abs/0805.3906&quot;&gt;arxiv:0805.3906&lt;/a&gt; [Little
of the machinery looks like it depends on using mixtures of &lt;em&gt;normals&lt;/em&gt;]
	&lt;li&gt;Xiaohong Chen, Markus Reiss, &quot;On rate optimality for ill-posed
inverse problems in
econometrics&quot;, &lt;a href=&quot;http://arxiv.org/abs/0709.2003&quot;&gt;arxiv:0709.2003&lt;/a&gt;
[Non-parametric instrumental variables?]
	&lt;li&gt;Xinjia Chen, &quot;Sequential Tests of Statistical Hypotheses with Confidence Limits&quot;, &lt;a href=&quot;http://arxiv.org/abs/1007.4278&quot;&gt;arxiv:1007.4278&lt;/a&gt;
	&lt;li&gt;N. N. Chentsov, &lt;citE&gt;Statistical Decision Rules and Optimal
Inference&lt;/cite&gt;
	&lt;li&gt;Zhiyi Chi, &quot;Effects of statistical dependence on multiple testing under a hidden Markov model&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1297779853&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt;
&lt;strong&gt;39&lt;/strong&gt; (2011): 439--473&lt;/a&gt;
	&lt;li&gt;Bertrand Clarke, &quot;Desiderata for a Predictive Theory of Statistics&quot;,
&lt;a href=&quot;http://ba.stat.cmu.edu/abstracts/Clarke.php&quot;&gt;&lt;cite&gt;Bayesian Analysis&lt;/cite&gt; &lt;strong&gt;5&lt;/strong&gt; (2010): 1--36&lt;/a&gt;
	&lt;li&gt;Sandy Clarke, Peter Hall, &quot;Robustness of multiple testing procedures against dependence&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt;
(2009): 332--358, &lt;a href=&quot;http://arxiv.org/abs/0903.0464&quot;&gt;arxiv:0903.0464&lt;/a&gt;
	&lt;li&gt;Arthur Cohen and Harold B. Sackrowitz, &quot;Decision theory results for
one-sided multiple comparison procedures&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0504505&quot;&gt;math.ST/0504505&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053604000000968&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 126--144&lt;/a&gt;
	&lt;li&gt;Arthur Cohen, Harold B. Sackrowitz, Minya Xu, &quot;A new multiple
testing method in the dependent
case&quot;, &lt;a href=&quot;http://arxiv.org/abs/0906.3082&quot;&gt;arxiv:0906.3082&lt;/a&gt;
= &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2009): 1518--1544
	&lt;li&gt;Daniel Commenges, &quot;Statistical models: Conventional, penalized and hierarchical likelihood&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ssu/1239113309&quot;&gt;&lt;cite&gt;Statistics Surveys&lt;/cite&gt; &lt;strong&gt;3&lt;/strong&gt;
(2009): 1--17&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/0808.4042&quot;&gt;arxiv:0808.4042&lt;/a&gt;
	&lt;li&gt;Daniel Commenges, Helene Jacqmin-Gadda, Cecile Proust, and Jeremie
Guedj, &quot;A Newton-Like Algorithm for Likelihood Maximization: The
Robust-Variance Scoring
Algorithm&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0610402&quot;&gt;math.ST/0610402&lt;/a&gt;
	&lt;li&gt;Cox and Wermuth, &lt;cite&gt;Multivariate Dependencies: Models, Analysis and Interpretation&lt;/cite&gt;
	&lt;li&gt;Anirban DasGupta, &lt;cite&gt;Asymptotic Theory of Statistics and
Probability&lt;/cite&gt; [&lt;a href=&quot;http://www.springer.com/978-0-387-75970-8&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Alexandre d'Aspremont, Onureena Banerjee, Laurent El Ghaoui,
&quot;First-order methods for sparse covariance
selection&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.OC/0609812&quot;&gt;math.OC/0609812&lt;/a&gt;
[&quot;Given a sample covariance matrix, we solve a maximum likelihood problem
penalized by the number of nonzero coefficients in the inverse covariance
matrix. Our objective is to find a sparse representation of the sample data and
to highlight conditional independence relationships between the sample
variables.&quot;]
	&lt;li&gt;I. Dattner, A. Goldenshluger, A. Juditsky, &quot;On deconvolution of distribution functions&quot;, &lt;a href=&quot;http://arxiv.org/abs/1006.3918&quot;&gt;arxiv:1006.3918&lt;/a&gt; [&quot;nonparametric estimation of a continuous distribution function from observations with measurement errors...  rate optimal estimators based on direct inversion of empirical characteristic function&quot;]
	&lt;li&gt;P. L. Davies
		&lt;ul&gt;
		&lt;li&gt;&quot;Data Features&quot;, &lt;cite&gt;Statistica Neerlandica&lt;/cite&gt; &lt;strong&gt;49&lt;/strong&gt; (1995): 185--245
		&lt;li&gt;&quot;Approximating Data&quot;, &lt;cite&gt;Journal of the Korean Statistical
Society&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2008): 191--211 [With discussion
and rejoinder.  &lt;a href=&quot;http://ocean.kisti.re.kr/IS_mvpopo212L.do?method=list&amp;poid=kss&amp;kojic=GCGHB5&amp;sVnc=v37n3&amp;sFree=&quot;&gt;Open access?&lt;/a&gt;]
		&lt;/ul&gt;
	&lt;li&gt;P. L. Davies, A. Kovac and M. Meise, &quot;Nonparametric Regression, Confidence Regions and Regularization&quot;, &lt;a href=&quot;http://arxiv.org/abs/0711.0690&quot;&gt;arxiv:0711.0690&lt;/a&gt;
	&lt;li&gt;A. Philip Dawid, Steven de Rooij, Glenn Shafer, Alexander Shen,
Nikolai Vereshchagin, Vladimir Vovk, &quot;Martingales and p-values as measures of
evidence&quot;, &lt;a href=&quot;http://arxiv.org/abs/0912.4269&quot;&gt;arxiv:0912.4269&lt;/a&gt;
	&lt;li&gt;Aurore Delaigle, Peter Hall and Jiashun Jin, &quot;Robustness and accuracy of methods for high dimensional data analysis based on Student's t-statistic&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9868.2010.00761.x&quot;&gt;&lt;cite&gt;Journal of the Royal Statistical Society&lt;/cite&gt; B &lt;strong&gt;forthcoming&lt;/strong&gt; (2011)&lt;/a&gt;
	&lt;li&gt;Joshua V Dillon, Guy Lebanon, &quot;Stochastic Composite Likelihood&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/papers/v11/dillon10a.html&quot;&gt;&lt;cite&gt;Journal
of Machine Learning Research&lt;/cite&gt; &lt;strong&gt;11&lt;/strong&gt; (2010): 2597--2633&lt;/a&gt;, apparently the final version of &lt;a href=&quot;http://arxiv.org/abs/1003.0691&quot;&gt;arxiv:1003.0691&lt;/a&gt;
	&lt;li&gt;David L. Donoho, &quot;Estimation by epsilon-nets&quot; (Le Cam
Lecture, 2003; find citation)
	&lt;li&gt;David L. Donoho and Richard C. Liu, &quot;The ``Automatic'' Robustness
of Minimum Distance
Functionals&quot;, &lt;a href=&quot;http://dx.doi.org/10.1214/aos/1176350820&quot;&gt;&lt;cite&gt;Annals
of Statistics&lt;/cite&gt; &lt;strong&gt;16&lt;/strong&gt; (1988): 552--586&lt;/a&gt;
	&lt;li&gt;David L. Donoho and Jared Tanner, &quot;Observed Universality of Phase
Transitions in High-Dimensional Geometry, with Implications for Modern Data
Analysis and Signal
Processing&quot;, &lt;a href=&quot;http://arxiv.org/abs/0906.2530&quot;&gt;arxiv:0906.2530&lt;/a&gt;
	&lt;li&gt;Mathias Drton, &quot;Likelihood ratio tests and singularities&quot;,
&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2009): 979--1012,
&lt;a href=&quot;http://arxiv.org/abs/math.ST/0703360&quot;&gt;arxiv:math.ST/0703360&lt;/a&gt;
	&lt;li&gt;Mathias Drton and Seth Sullivant, &quot;Algebraic statistical
models&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0703609&quot;&gt;math.ST/0703609&lt;/a&gt;
	&lt;li&gt;Jin-Chuan Duan and Andras Fulop, &quot;A stable estimator of the
information matrix under EM for dependent data&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s11222-009-9149-4&quot;&gt;&lt;cite&gt;Statistics and Computing&lt;/cite&gt; &lt;strong&gt;21&lt;/strong&gt; (2011): 83--91&lt;/a&gt;
	&lt;li&gt;Morris L. Eaton, &lt;cite&gt;Multivariate Statistics: A Vector Space
Approach&lt;/cite&gt; [&quot;The purpose of this book is to present a version of
multivariate statistical theory in which vector space and invariance methods
replace, to a large extent, more traditional multivariate methods. The book is
a text. Over the past ten years, various versions have been used for graduate
multivariate courses at the University of Chicago, the University of
Copenhagen, and the University of Minnesota. Designed for a one year lecture
course or for independent study, the book contains a full complement of
problems and problem solutions.&quot;
&lt;a href=&quot;http://projecteuclid.org/euclid.lnms/1196285102&quot;&gt;Full text free
online&lt;/a&gt;.]
	&lt;li&gt;Sam Efromovich
		&lt;ul&gt;
		&lt;li&gt;&quot;Distribution estimation for biased data&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1016/S0378-3758(03)00202-7&quot;&gt;&lt;cite&gt;Journal of
Statistical Planning and Inference&lt;/cite&gt; &lt;strong&gt;124&lt;/strong&gt; (2004):
1--43&lt;/a&gt;
		&lt;li&gt;&quot;Dimension Reduction and Adaptation in Conditional Density
Estimation&quot;, &lt;a href=&quot;http://dx.doi.org/10.1198/jasa.2010.tm09426&quot;&gt;&lt;cite&gt;Journal
of the American Statistical Association&lt;/cite&gt; &lt;strong&gt;105&lt;/strong&gt; (2010):
761--774&lt;/a&gt;
		&lt;li&gt;&lt;cite&gt;Nonparametric Curve Estimation&lt;/cite&gt;
		&lt;/ul&gt;
	&lt;li&gt;Bradley Efron, &quot;Size, power and false discovery rates&quot;,
&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;35&lt;/strong&gt; (2007): 1351--1377,
&lt;a href=&quot;http://arxiv.org/abs/0710.2245&quot;&gt;arxiv:0710.2245&lt;/a&gt;
	&lt;li&gt;Werner Ehm, J&amp;uuml;rgen Kornmeier, and Sven P. Heinrich, &quot;Multiple testing along a tree&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ejs/1274794929&quot;&gt;&lt;cite&gt;Electronic Journal of Statistics&lt;/cite&gt;
&lt;strong&gt;4&lt;/strong&gt; (2010): 461--471&lt;/a&gt; =? &lt;a href=&quot;http://arxiv.org/abs/0902.2296&quot;&gt;arxiv:0902.2296&lt;/a&gt;
	&lt;li&gt;Michael Evans and Gun Ho Jang, &quot;Invariant P-values for model checking&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1262271622&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt;
(2010): 512--525&lt;/a&gt;
	&lt;li&gt;S. N. Evans and P. B. Stark, &quot;Inverse Problems as Statistics&quot;
[&lt;a href=&quot;http://www.stat.berkeley.edu/tech-reports/609.abstract&quot;&gt;Abstract&lt;/a&gt;, &lt;a href=&quot;http://www.stat.berkeley.edu/~stark/Preprints/609.pdf&quot;&gt;PDF&lt;/a&gt;]
	&lt;li&gt;Jianqing Fan and Jian Zhang, &quot;Sieve empirical likelihood ratio
tests for nonparametric functions&quot;, &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053604000000210&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt;
&lt;strong&gt;32&lt;/strong&gt; (2004): 1858--1907&lt;/a&gt; = &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0503667&quot;&gt;math.ST/0503667&lt;/a&gt;
	&lt;li&gt;Thomas S. Ferguson, &lt;cite&gt;A Course in Large Sample Theory&lt;/cite&gt;
	&lt;li&gt;Jean-David Fermanian and Bernard Salani&amp;eacute;, &quot;A Nonparametric
Simulated Maximum Likelihood Estimation Method&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1017/S0266466604204054&quot;&gt;&lt;cite&gt;Econometric
Theory&lt;/citE&gt; &lt;strong&gt;20&lt;/strong&gt; (2004): 701--734&lt;/a&gt;
	&lt;li&gt;Ana K. Fermin and Carenne Ludena, &quot;A Statistical view of Iterative
Methods for Linear Inverse Problems&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0504064&quot;&gt;math.ST/0504064&lt;/a&gt;
	&lt;li&gt;Luisa Turrin Fernholz, &lt;citE&gt;von Mises Calculus for Statistical Functionals&lt;/cite&gt;
	&lt;li&gt;S. E. Fienberg, P. Hersh, A. Rinaldo and Y. Zhou, &quot;Maximum Likelihood Estimation in Latent Class Models For Contingency Table Data&quot;, &lt;a href=&quot;http://arxiv.org/abs/0709.3535&quot;&gt;arxiv:0709.3535&lt;/a&gt;
	&lt;li&gt;D. A. S. Fraser, N. Reid, E. Marras and G. Y. Yi, &quot;Default priors for Bayesian and frequentist inference&quot;, &lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9868.2010.00750.x&quot;&gt;&lt;cite&gt;Journal of the
Royal Statistical Society B&lt;/cite&gt; &lt;strong&gt;72&lt;/strong&gt; (2010): 631--654&lt;/a&gt;
	&lt;li&gt;A. Fraysse, &quot;Why minimax is not that pessimistic&quot;, &lt;a href=&quot;http://arxiv.org/abs/0902.3311&quot;&gt;arxiv:0902.3311&lt;/a&gt; [Because, apparently, learning
a generic function is just as hard as minimax leads you to think.  Bummer if
true.]
	&lt;li&gt;Magalie Fromont and B&amp;eacute;atrice Laurent, &quot;Adaptive
goodness-of-fit tests in a density model&quot;, &lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;34&lt;/strong&gt; (2006): 680--720
= &lt;a href=&quot;http://arxiv.org/abs/math.ST/0607013&quot;&gt;math.ST/0607013&lt;/a&gt;
	&lt;li&gt;Kenji Fukumizu, Le Song, Arthur Gretton, &quot;Kernel Bayes' rule&quot;,
&lt;a href=&quot;http://arxiv.org/abs/1009.5736&quot;&gt;arxiv:1009.5736&lt;/a&gt;
[&quot;A kernel method is proposed for realizing Bayes' rule, based on representations of probability distributions in reproducing kernel Hilbert spaces (RKHS). The empirical RKHS embeddings of the conditional probabilities and prior are expressed as feature mappings of samples, and an RKHS embedding of the posterior distribution is computed, again based on a feature mapping of a sample.&quot;]
	&lt;li&gt;Surya Ganguli and Haim Sompolinsky, &quot;Statistical Mechanics of
Compressed
Sensing&quot;, &lt;a href=&quot;http://dx.doi.org/10.1103/PhysRevLett.104.188701&quot;&gt;&lt;cite&gt;Physical
Review Letters&lt;/cite&gt; &lt;strong&gt;104&lt;/strong&gt; (2010): 188701&lt;/a&gt;
	&lt;li&gt;Seymour Geisser, &lt;cite&gt;Predictive Inference&lt;/cite&gt;
	&lt;li&gt;Christopher R. Genovese and Larry Wasserman
		&lt;ul&gt;
		&lt;li&gt;&quot;Confidence sets for
nonparametric wavelet regression&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0505632&quot;&gt;math.ST/0505632&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053605000000011&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 698--729&lt;/a&gt;
		&lt;li&gt;&quot;Adaptive Confidence
Bands&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0701513&quot;&gt;math.ST/0701513&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Christopher Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman, &quot;Minimax Manifold Estimation&quot;, &lt;a href=&quot;http://arxiv.org/abs/1007.0549&quot;&gt;arxiv:1007.0549&lt;/a&gt;
	&lt;li&gt;Josep Ginebra, &quot;On the Measure of the Information in a Statistical
Experiment&quot;, &lt;a
href=&quot;http://ba.stat.cmu.edu/journal/2007/vol02/issue01/ginebra.pdf&quot;&gt;&lt;cite&gt;Bayesian
Analysis&lt;/cite&gt; &lt;strong&gt;2&lt;/strong&gt; (2007): 167--212&lt;/a&gt;
	&lt;li&gt;Tilmann Gneiting, Fadoua Balabdaoui and Adrian E. Raftery,
&quot;Probabilistic forecasts, calibration and sharpness&quot;, &lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9868.2007.00587.x&quot;&gt;&lt;cite&gt;Journal
of the Royal Statistical Society&lt;/cite&gt; &lt;strong&gt;69&lt;/strong&gt; (2007): 243--268&lt;/a&gt;
	&lt;li&gt;Yuri Golubev, Vladimir Spokoiny, &quot;Exponential bounds for minimum contrast estimators&quot;, &lt;a href=&quot;http://arxiv.org/abs/0901.0655&quot;&gt;arxiv:0901.0655&lt;/a&gt;
	&lt;li&gt;Grassberger and Nadal (eds.), &lt;cite&gt;From Statistical Physics to
Statistical Inference and Back&lt;/cite&gt;
	&lt;li&gt;Ulf Grenander, &lt;cite&gt;Abstract Inference&lt;/cite&gt;
	&lt;li&gt;Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, Alexander Smola, &quot;A Kernel Two-Sample Test&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/papers/v13/gretton12a.html&quot;&gt;&lt;cite&gt;Journal of Machine Learning Research&lt;/cite&gt; &lt;strong&gt;13&lt;/strong&gt; (2012): 723--773&lt;/a&gt;
	&lt;li&gt;Arthur Gretton, L&amp;aacute;szl&amp;oacute; Gy&amp;ouml;rfi, &quot;Consistent
Nonparametric Tests of
Independence&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/papers/v11/gretton10a.html&quot;&gt;&lt;cite&gt;Journal
of Machine Learning Research&lt;/cite&gt; &lt;strong&gt;11&lt;/strong&gt; (2010): 1391--1423&lt;/a&gt;
	&lt;li&gt;Robert Hable, &quot;Asymptotic Normality of Support Vector Machines for Classification and Regression&quot;, &lt;a href=&quot;http://arxiv.org/abs/1010.0535&quot;&gt;arxiv:1010.0535&lt;/a&gt;
	&lt;li&gt;Peter Hall, Hans-Georg M&amp;uuml;ller, Fang Yao, &quot;Estimation of functional derivatives&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt;
(2009): 3307--3329, &lt;a href=&quot;http://arxiv.org/abs/0909.1157&quot;&gt;arxiv:0909.1157&lt;/a&gt;
	&lt;li&gt;Marc Hallin, Davy Paindaveine, and Miroslav Siman, &quot;Multivariate
quantiles and multiple-output regression quantiles: From L1 optimization to
halfspace
depth&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1266586607&quot;&gt;&lt;cite&gt;Annals
of Statistics&lt;/citE&gt; &lt;strong&gt;38&lt;/strong&gt; (2010): 635--669&lt;/a&gt; [with discussion]
	&lt;li&gt;Wolfgang H&amp;auml;rdle, Marlene M&amp;uuml;ller, Stefan Sperlich and
Axel Werwatz, &lt;cite&gt;Nonparametric and Semiparametric Models: An
Introduction&lt;/cite&gt; [&lt;a href=&quot;http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/spm/&quot;&gt;Full text online&lt;/a&gt;]
	&lt;li&gt;Matthew T. Harrison, &quot;Valid p-Values using Importance Sampling&quot;, &lt;a href=&quot;http://arxiv.org/abs/1004.2910&quot;&gt;arxiv:1004.2910&lt;/a&gt;
	&lt;li&gt;Heng Lian, &quot;Empirical Likelihood Confidence Intervals for Nonparametric Functional Data Analysis&quot;, &lt;a href=&quot;http://arxiv.org/abs/0904.0843&quot;&gt;arxiv:0904.0843&lt;/a&gt;
	&lt;li&gt;David A. Hensher et al. &lt;cite&gt;Applied Choice Analysis: A
Primer&lt;/cite&gt; [&quot;Application of quantitative statistical methods to study
choices made by individuals. This primer provides an introduction to the main
techniques of choice analysis and also includes details on data collection and
preparation, model estimation and interpretation and the design of choice
experiments&quot;. &lt;a href=&quot;http://cambridge.org/0521605776&quot;&gt;Full blurb&lt;/a&gt;]
	&lt;li&gt;Tim Hesterberg, Nam Hee Choi, Lukas Meier, Chris Fraley, &quot;Least angle and $\ell_1$ penalized regression: A review&quot;, &lt;cite&gt;Statistics Surveys&lt;/cite&gt;
&lt;strong&gt;2&lt;/strong&gt; (2008): 61--93, &lt;a href=&quot;http://arxiv.org/abs/0802.0964&quot;&gt;arxiv:0802.0964&lt;/a&gt;
	&lt;li&gt;Nils Lid Hjort, Ian W. McKeague, Ingrid Van Keilegom, &quot;Extending
the scope of empirical
likelihood&quot;, &lt;a href=&quot;http://arxiv.org/abs/0904.2949&quot;&gt;arxiv:0904.2949&lt;/a&gt;
= &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;37&lt;/strong&gt; (2009): 1079--1111
	&lt;li&gt;Peter D. Hoff, &quot;A hierarchical eigenmodel for pooled covariance estimation&quot;, &lt;a href=&quot;http://dx.doi.org/10.1111/j.1467-9868.2009.00716.x&quot;&gt;&lt;citE&gt;Journal of the Royal
Statistical Society&lt;/cite&gt; B &lt;strong&gt;71&lt;/strong&gt; (2009): 971--992&lt;/a&gt;
	&lt;li&gt;Marc Hoffmann and Richard Nickl, &quot;On adaptive inference and confidence bands&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1322663462&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;39&lt;/strong&gt; (2011): 2383--2409&lt;/a&gt;
	&lt;li&gt;Joel L. Horowitz, &lt;citE&gt;Semiparametric and Nonparametric Methods in
Econometrics&lt;/cite&gt;
[&lt;a href=&quot;http://www.springer.com/978-0-387-92869-2&quot;&gt;blurb&lt;/a&gt;]
	&lt;li&gt;Serkan Hosten, Amit Khetan and Bernd Sturmfels, &quot;Solving the
Likelihood Equations&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0408270&quot;&gt;math.ST/0408270&lt;/a&gt;
	&lt;li&gt;Tzee-Ming Huang, &quot;Testing conditional independence using
maximal nonlinear conditional correlation&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1278861242&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt; (2010): 2047--2091&lt;/a&gt;
	&lt;li&gt;Peter J. Huber, &quot;The behavior of maximum likelihood estimates under nonstandard conditions&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.bsmsp/1200512988&quot;&gt;&lt;cite&gt;Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability&lt;/cite&gt;, Vol. 1 (Univ. of Calif. Press, 1967), pp. 221--233&lt;/a&gt;
	&lt;li&gt;Mia Hubert, Peter J. Rousseeuw, Stefan Van Aelst, &quot;High-Breakdown Robust Multivariate Methods&quot;, &lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;23&lt;/strong&gt; (2008): 92--119, &lt;a href=&quot;http://arxiv.org/abs/0808.0657&quot;&gt;arxiv:0808.0657&lt;/a&gt;
	&lt;li&gt;Alexander Ilin, Tapani Raiko, &quot;Practical Approaches to Principal Component Analysis in the Presence of Missing Values&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/papers/v11/ilin10a.html&quot;&gt;&lt;cite&gt;Journal
of Machine Learning Research&lt;/cite&gt; &lt;strong&gt;11&lt;/strong&gt; (2010): 1957--2000&lt;/a&gt;
	&lt;li&gt;Stefano M. Iacus and Davide La Torre
	     &lt;ul&gt;
	     &lt;li&gt;&quot;Approximating Distribution Functions by Iterated Function
Systems,&quot; &lt;a href=&quot;http://arxiv.org/abs/math.PR/0111152&quot;&gt;math.PR/0111152&lt;/a&gt;
	     &lt;li&gt;&quot;Nonparametric estimation of distribution and density
functions in presence of missing data: an IFS approach,&quot;
&lt;a href=&quot;http://arxiv.org/abs/math.PR/0302016&quot;&gt;math.PR/0302016&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Ian H. Jermyn, &quot;Invariant Bayesian estimation on manifolds&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0506296&quot;&gt;math.ST/0506296&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053604000001273&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 583--605&lt;/a&gt;
	&lt;li&gt;Adam M. Johansen, Arnaud Doucet and Manuel Davy, &quot;Particle methods for maximum likelihood estimation in latent variable models&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s11222-007-9037-8&quot;&gt;&lt;cite&gt;Statistics and Computing&lt;/cite&gt; &lt;strong&gt;18&lt;/strong&gt;
(2008): 47--57&lt;/a&gt;
	&lt;li&gt;Ana Justel, Daniel Pena, Ruben Zamar, &quot;A multivariate
Kolmogorov-Smirnov test of goodness of fit&quot;, &lt;a href=&quot;http://dx.doi.org/10.1016/S0167-7152(97)00020-5&quot;&gt;&lt;CitE&gt;Statistics and Probability
Letters&lt;/citE&gt; &lt;strong&gt;35&lt;/strong&gt; (1997): 251--259&lt;/a&gt; [&lt;a href=&quot;http://halweb.uc3m.es/esp/Personal/personas/dpena/articles/kolpl97.pdf&quot;&gt;PDF reprint&lt;/a&gt;
via Prof. Pena]
	&lt;li&gt;Paul Kabaila and Kreshna Syuhada, &quot;The Asymptotic Efficiency of
Improved Prediction Intervals&quot;, &lt;a href=&quot;http://arxiv.org/abs/0901.1911&quot;&gt;arxiv:0901.1911&lt;/a&gt;
	&lt;li&gt;Ata Kaban, &quot;Non-parametric detection of meaningless distances
in high dimensional data&quot;, &lt;a href=&quot;http://dx.doi.org/10.1007/s11222-011-9229-0&quot;&gt;&lt;cite&gt;Statistics and Computing&lt;/cite&gt; &lt;strong&gt;22&lt;/strong&gt; (2011): 375--385&lt;/a&gt;
	&lt;li&gt;Sham Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari, &quot;Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/proceedings/papers/v9/kakade10a.html&quot;&gt;&lt;cite&gt;Journal of Machine Learning Research&lt;/cite&gt; Proceedings &lt;strong&gt;9&lt;/strong&gt; (2010): 381--388&lt;/a&gt;
	&lt;li&gt;Oscar Kempthorne, &quot;The classical problem of inference--goodness of fit&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.bsmsp/1200512989&quot;&gt;&lt;cite&gt;Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability&lt;/cite&gt;, Vol. 1, 235--249&lt;/a&gt;
	&lt;li&gt;D. F. Kerridge, &quot;Inaccuracy and Inference&quot;, &lt;cite&gt;Journal of
the Royal Statistical Society B&lt;/cite&gt; &lt;strong&gt;23&lt;/strong&gt; (1961): 184--194
	&lt;li&gt;Yuichi Kitamura, &quot;Empirical likelihood methods in econometrics: Theory and Practice&quot;, &lt;a href=&quot;http://cowles.econ.yale.edu/P/cd/d15b/d1569.pdf&quot;&gt;Cowles Foundation Discussion Paper No. 1569&lt;/a&gt; (2006)
	&lt;li&gt;Ioannis Kontoyiannis and S. P. Meyn, &quot;Computable exponential bounds for screened estimation and simulation&quot;, &lt;cite&gt;Annals of Applied Probability&lt;/cite&gt; &lt;strong&gt;18&lt;/strong&gt; (2008): 1491--1518, &lt;a href=&quot;http://arxiv.org/abs/math/0612040&quot;&gt;arxiv:math/0612040&lt;/a&gt;
	&lt;li&gt;Mikhail Kovtun, Igor Akushevich, Kenneth G. Manton and H. Dennis
Tolley
		&lt;ul&gt;
		&lt;li&gt;&quot;Linear Latent Structure Analysis: Mixture Distribution
Models with Linear Constraints&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.PR/0507025&quot;&gt;math.PR/0507025&lt;/a&gt;
		&lt;li&gt;&quot;A New Efficient Algorithm for Construction of LLS
Models&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.PR/0507021&quot;&gt;math.PR/0507021&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Jim Kuelbs and Anand N. Vidyashankar, &quot;Asymptotic inference for high-dimensional data&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1266586616&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt;
&lt;strong&gt;38&lt;/strong&gt; (2010): 836--869&lt;/a&gt;
	&lt;li&gt;Solomon Kullback, &quot;Probability densities with given marginals,&quot;
&lt;cite&gt;Annals of Mathematical Statistics&lt;/cite&gt; &lt;strong&gt;39&lt;/strong&gt; (1968):
1236--1243
	&lt;li&gt;Masayuki Kumon and Akimichi Takemura, &quot;On a simple strategy weakly
forcing the strong law of large numbers in the bounded forecasting
game&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.PR/0508190&quot;&gt;math.PR/0508190&lt;/a&gt; [&quot;In
the framework of the game-theoretic probability of Shafer and Vovk (2001) ...
construct an explicit strategy weakly forcing the strong law of large numbers
(SLLN) in the bounded forecasting game. ...  simple finite-memory strategy
based on the past average of Reality's moves, which weakly forces the strong
law of large numbers with the convergence rate of $O(\sqrt{\log n/n})$.... We
show that if Reality violates SLLN, then the exponential growth rate of
Skeptic's capital process is explicitly described in terms of the Kullback
divergence between the average of Reality's moves when she violates SLLN and
the average when she observes SLLN.&quot;]
	&lt;li&gt;Tze Leung Lai, Shulamith T. Gross, and David Bo Shen, &quot;Evaluating probability forecasts&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1322663461&quot;&gt;&lt;citE&gt;Annals of Statistics&lt;/cite&gt;
&lt;strong&gt;39&lt;/strong&gt; (2011): 2356--2382&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1202.5140&quot;&gt;arxiv:1202.5140&lt;/a&gt;
	&lt;li&gt;Mikhail Langovoy, &quot;Data-driven goodness-of-fit tests&quot;,
&lt;a href=&quot;http://arxiv.org/abs/0708.0169&quot;&gt;arxiv:0708.0169&lt;/a&gt;
	&lt;li&gt;Youngjo Lee and John A. Nelder, &quot;Likelihood Inference for Models with Unobservables: Another View&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1270041252&quot;&gt;&lt;cite&gt;Statistical Science&lt;/citE&gt;
&lt;strong&gt;24&lt;/strong&gt; (2009): 255--269&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1010.0303&quot;&gt;arxiv:1010.0303&lt;/a&gt; [with discussion and replies following]
	&lt;li&gt;E. L. Lehmann and Joseph P. Romano, &quot;Generalizations of the
Familywise Error Rate&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0507420&quot;&gt;math.ST/0507420&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053605000000084&quot;&gt;&lt;citE&gt;Annals of
Statistics&lt;/citE&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 1138--1154&lt;/a&gt;
	&lt;li&gt;Matthieu Lerasle, &quot;Adaptive non-asymptotic confidence balls in density estimation&quot;, &lt;a href=&quot;http://arxiv.org/abs/1007.4528&quot;&gt;arxiv:1007.4528&lt;/a&gt;
	&lt;li&gt;M. Lerasle, R. I. Oliveira, &quot;Robust Empirical Mean Estimators&quot;,
&lt;a href=&quot;http://arxiv.org/abs/1112.3914&quot;&gt;arxiv:1112.3914&lt;/a&gt;
	&lt;li&gt;Feng Liang, Sayan Mukherjee, Mike West, &quot;The Use of Unlabeled Data
in Predictive
Modeling&quot;, &lt;a href=&quot;http://arxiv.org/abs/0710.4618&quot;&gt;arxiv:0710.4618&lt;/a&gt;
= &lt;cite&gt;Statistical Science&lt;/cite&gt; &lt;strong&gt;22&lt;/strong&gt; (2007): 189--205
	&lt;li&gt;Percy Liang, Francis Bach, Guillaume Bouchard and Michael
I. Jordan, &quot;Asymptotically Optimal Regularization in Smooth Parametric
Models&quot; [&lt;a href=&quot;http://www.cs.berkeley.edu/~jordan/papers/liang-etal-nips10.pdf&quot;&gt;PDF preprint&lt;/a&gt; via Prof. Jordan]
	&lt;li&gt;Victor S. L'vov, Anna Pomyalov and Itamar Procaccia, &quot;Outliers,
Extreme Events and Multiscaling,&quot; &lt;a
href=&quot;http://arxiv.org/abs/nlin.CD/0009049&quot;&gt;nlin.CD/0009049&lt;/a&gt;
	&lt;li&gt;Christian K. Machens, &quot;Adaptive sampling by information
maximization,&quot; &lt;a
href=&quot;http://arxiv.org/abs/physics/0112070&quot;&gt;physics/0112070&lt;/a&gt;
	&lt;li&gt;Robert Mariano, Til Schuermann and Melvyn J. Weeks
(eds.), &lt;cite&gt;Simulation-Based Inference in Econometrics: Methods and
Applications&lt;/cite&gt;
	&lt;li&gt;McCabe and Tremayne, &lt;cite&gt;Modern Asymptotic Theory&lt;/cite&gt;
	&lt;li&gt;P. McCullagh
		&lt;ul&gt;
		&lt;li&gt;&lt;cite&gt;Tensor Methods in Statistics&lt;/cite&gt;
		&lt;li&gt;&quot;What is a statistical
model?&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1035844977&quot;&gt;&lt;cite&gt;Annals
of Statistics&lt;/cite&gt; &lt;strong&gt;30&lt;/strong&gt; (2002): 1225--1310&lt;/a&gt; [With
commentary and rejoinder.  Not sure, from a superficial glance, if there's
really anything to this or it's just what one of my mentors calls &quot;algebraic
noodling&quot;.]
		&lt;/ul&gt;
	&lt;li&gt;Nicolai Meinshausen and John Rice, &quot;Estimating the proportion of
false null hypotheses among a large number of independently tested
hypotheses&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.ST/0501289&quot;&gt;math.ST/0501289&lt;/a&gt;
	&lt;li&gt;Alexander Meister, &lt;cite&gt;Deconvolution Problems in Nonparametric
Statistics&lt;/cite&gt; [&quot;e.g., density estimation based on contaminated data,
errors-in-variables regression, and image reconstruction&quot;.  &lt;a href=&quot;http://www.springer.com/978-3-540-87556-7&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Vladimir N. Minin, John D. O'Brien, Arseni Seregin, &quot;Empirically
corrected estimation of complete-data population summaries under model
misspecification&quot;, &lt;a href=&quot;http://arxiv.org/abs/0911.0930&quot;&gt;arxiv:0911.0930&lt;/a&gt;
	&lt;li&gt;David Mumford  and Agnes Desolneux, &lt;citE&gt;Pattern Theory:
The Stochastic Analysis of Real-World Signals&lt;/cite&gt;
	&lt;li&gt;Richard Nickl, &quot;Donsker-type theorems for nonparametric maximum
likelihood estimators&quot;, &lt;a
href=&quot;http://dx.doi.org/10.1007/s00440-006-0031-4&quot;&gt;&lt;cite&gt;Probability Theory and
Related Fields&lt;/cite&gt; &lt;strong&gt;138&lt;/strong&gt; (2007): 411--449&lt;/a&gt;
	&lt;li&gt;Andrey Novikov, &quot;Sequential multiple hypothesis testing in presence of control variables&quot;, &lt;cite&gt;Kybernetika&lt;/cite&gt; &lt;strong&gt;45&lt;/strong&gt;
(2009): 507--528, &lt;a href=&quot;http://arxiv.org/abs/0812.2712&quot;&gt;arxiv:0812.2712&lt;/a&gt;
	&lt;li&gt;Wojciech Olszewski, Alvaro Sandroni, &quot;A nonmanipulable test&quot;,
&lt;a href=&quot;http://arxiv.org/abs/0904.0338&quot;&gt;arxiv:0904.0338&lt;/a&gt; =
&lt;cite&gt;Annals of Statistics&lt;/citE&gt; &lt;strong&gt;37&lt;/strong&gt; (2009): 1013--1039
	&lt;li&gt;Giulio Palombo, &quot;Multivariate Goodness of Fit Procedures for Unbinned Data: An Annotated Bibliography&quot;, &lt;a href=&quot;http://arxiv.org/abs/1102.2407&quot;&gt;arxiv:1102.2407&lt;/a&gt;
	&lt;li&gt;Leandro Pardo, &lt;cite&gt;Statistical Inference Based on
Divergence Measures&lt;/cite&gt;
	&lt;li&gt;Donald A. Pierce and Dawn Peters, &quot;Improving on exact tests by
approximate conditioning&quot;, &lt;citE&gt;Biometrika&lt;/citE&gt; &lt;strong&gt;86&lt;/strong&gt; (1999):
265--277
[Via &lt;a
href=&quot;http://www.stat.columbia.edu/~gelman/stuff_for_blog/pierce.pdf&quot;&gt;Andrew
Gelman's blog&lt;/a&gt;; &lt;a
href=&quot;http://www.stat.columbia.edu/~gelman/stuff_for_blog/pierce.pdf&quot;&gt;PDF
reprint&lt;/a&gt;]
	&lt;li&gt;Tomaz Podobnik and Tomi Zivko, &quot;On Consistent and Calibrated
Inference about the Parameters of Sampling Distributions&quot;, &lt;a
href=&quot;http://arxiv.org/abs/physics/0508017&quot;&gt;physics/0508017&lt;/a&gt;
	&lt;li&gt;Thorsten Poeschel, Werner Ebeling, and Helge Rose, &quot;Guessing
probability distributions from small samples,&quot; &lt;a
href=&quot;http://arxiv.org/abs/cond-mat/0203467&quot;&gt;cond-mat/0203467&lt;/a&gt; =
&lt;cite&gt;Journal of Statistical Physics&lt;/cite&gt; &lt;strong&gt;80&lt;/strong&gt; (1995): 1443
	&lt;li&gt;David Pollard, &quot;Some thoughts on Le Cam's statistical decision theory&quot;, &lt;a href=&quot;http://arxiv.org/abs/1107.3811&quot;&gt;arxiv:1107.3811&lt;/a&gt;
	&lt;li&gt;Benedikt M. P&amp;ouml;tscher, &quot;Confidence Sets Based on
Sparse Estimators are Necessarily Large&quot;, &lt;a href=&quot;http://arxiv.org/abs/0711.1036&quot;&gt;arxiv:0711.1036&lt;/a&gt;
	&lt;li&gt;Joel Predd, Robert Seiringer, Elliott H. Lieb, Daniel Osherson, Vincent Poor, Sanjeev Kulkarni, &quot;Probabilistic coherence and proper scoring rules&quot;,
&lt;cite&gt;IEEE Transactions on Information Theory&lt;/cite&gt; &lt;strong&gt;55&lt;/strong&gt; (2009): 4786, &lt;a href=&quot;http://arxiv.org/abs/0710.3183&quot;&gt;arxiv:0710.3183&lt;/a&gt;
	&lt;li&gt;Ramsay and Silverman, &lt;cite&gt;Functional Data Analysis&lt;/cite&gt;
	&lt;li&gt;C. Radhakrishna Rao, &quot;Diversity: Its measurement, decomposition,
apportionment and analysis&quot;, &lt;cite&gt;Sankhya: The Indian Journal of
Statistics&lt;/cite&gt; &lt;strong&gt;44(A)&lt;/strong&gt; (1982): 1--22 [&lt;cite&gt;Sankhya&lt;/cite&gt; is
not in JSTOR!  Why is &lt;cite&gt;Sankhya&lt;/cite&gt; not in JSTOR?!?!]
	&lt;li&gt;J. C. W. Rayner and D. J. Best, &lt;cite&gt;Smooth Tests of Goodness of
Fit&lt;/cite&gt;
	&lt;li&gt;R.-D. Reiss and M. Thomas, &lt;cite&gt;Statistical Analysis of Extreme
Values: With Applications to Insurance, Finance, Hydrology and Other
Fields&lt;/cite&gt;
	&lt;li&gt;Irina Rish, &lt;cite&gt;Sparse Modeling: Theory, Algorithms, and
Applications&lt;/cite&gt;
	&lt;li&gt;James Robins and Aad van der Vaart, &quot;Adaptive
nonparametric confidence sets&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt;
&lt;strong&gt;34&lt;/strong&gt; (2006):
229--253, &lt;a href=&quot;http://arxiv.org/abs/math/0605473&quot;&gt;arxiv:math/0605473&lt;/a&gt;
[&quot;We construct honest confidence regions for a Hilbert space-valued parameter
in various statistical models. The confidence sets can be centered at arbitrary
adaptive estimators, and have diameter which adapts optimally to a given
selection of models.&quot;]
	&lt;li&gt;Sylvain Rubenthaler, Tobias Ryden and Magnus Wiktorsson, &quot;Fast
simulated annealing in $\R^d$ and an application to maximum likelihood
estimation&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.PR/0609353&quot;&gt;math.PR/0609353&lt;/a&gt;
	&lt;li&gt;Birgit Rudloff, Ioannis Karatzas, &quot;Testing composite hypotheses via convex duality&quot;, &lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;16&lt;/strong&gt; (2010): 1224--1239,
&lt;a href=&quot;http://arxiv.org/abs/0809.4297&quot;&gt;arxiv:0809.4297&lt;/a&gt;
	&lt;li&gt;Emilio Seijo and Bodhisattva Sen, &quot;A continuous mapping theorem for the smallest argmax functional&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ejs/1305034909&quot;&gt;&lt;cite&gt;Electronic Journal of Statistics&lt;/cite&gt;
&lt;strong&gt;5&lt;/strong&gt; (2011): 421--439&lt;/a&gt;
	&lt;li&gt;Galit Shmueli, &quot;To Explain or to Predict?&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1294167961&quot;&gt;&lt;cite&gt;Statistical
Science&lt;/cite&gt; &lt;strong&gt;25&lt;/strong&gt; (2010): 289--310&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1101.0891&quot;&gt;arxiv:1101.0891&lt;/a&gt;
	&lt;li&gt;Ricardo Silva, Robert B. Gramacy, &quot;Gaussian Process Structural Equation Models with Latent Variables&quot;, &lt;a href=&quot;http://arxiv.org/abs/1002.4802&quot;&gt;arxiv:1002.4802&lt;/a&gt; [Heard the talk at UAI 2010, but I want the details.]
	&lt;li&gt;Kesar Singh, Minge Xie, William E. Strawderman, &quot;Confidence
distribution (CD) -- distribution estimator of a
parameter&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.lnms/1196794948&quot;&gt;pp. 132--150
in Regina Liu, William Strawderman and Cun-Hui Zhang (eds.), &lt;cite&gt;Complex
Datasets and Inverse Problems: Tomography, Networks and Beyond&lt;/cite&gt;&lt;/a&gt;
	&lt;li&gt;Jascha Sohl-Dickstein, Peter Battaglino, Michael R. DeWeese, &quot;Minimum Probability Flow Learning&quot;, &lt;a href=&quot;http://arxiv.org/abs/0906.4779&quot;&gt;arxiv:0906.4779&lt;/a&gt;
	&lt;li&gt;Christopher G. Small, &lt;cite&gt;The Statistical Theory of Shape&lt;/cite&gt;
 	&lt;li&gt;Aris Spanos
		&lt;ul&gt;
		&lt;li&gt;&quot;Revisiting the Omitted Variables Argument: 
Substantive vs. Statistical Adequacy&quot; [&lt;a href=&quot;http://www.error06.econ.vt.edu/Spanosb.pdf&quot;&gt;PDF preprint&lt;/a&gt;]
		&lt;li&gt;&quot;Is Frequentist Testing Vulnerable to the Base-Rate Fallacy?&quot;, &lt;a href=&quot;http://dx.doi.org/10.1086/656009&quot;&gt;&lt;cite&gt;Philosophy of Science&lt;/cite&gt; &lt;strong&gt;77&lt;/strong&gt;
(2010): 565--583&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Vladimir Spokoiny
		&lt;ul&gt;
		&lt;li&gt;&quot;A penalized exponential risk bound in parametric estimation&quot;, &lt;a href=&quot;http://arxiv.org/abs/0903.1721&quot;&gt;arxiv:0903.1721&lt;/a&gt;
		&lt;li&gt;&quot;Parametric estimation. Finite sample theory&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.3029&quot;&gt;arxiv:1111.3029&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Pablo Sprechmann, Ignacio Ram&amp;iacute;rez, Guillermo Sapiro, Yonina Eldar, &quot;C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework&quot;, &lt;a href=&quot;http://arxiv.org/abs/1006.1346&quot;&gt;arxiv:1006.1346&lt;/a&gt;
	&lt;li&gt;Johan A.K. Suykens, Carlos Alzate, and Kristiaan Pelckmans, &quot;Primal and dual model representations in kernel-based learning&quot;, 
&lt;a href=&quot;http://projecteuclid.org/euclid.ssu/1282746475&quot;&gt;&lt;cite&gt;Statistics Surveys&lt;/cite&gt; &lt;strong&gt;4&lt;/strong&gt; (2010):
148--183&lt;/a&gt;
	&lt;li&gt;G&amp;aacute;bor J. Sz&amp;eacute;kely and Maria L. Rizzo, &quot;Brownian
Distance
Covariance&quot;, &lt;a href=&quot;http://dx.doi.org/10.1214/09-AOAS312&quot;&gt;&lt;cite&gt;Annals of
Applied Statistics&lt;/cite&gt; &lt;strong&gt;3&lt;/strong&gt; (2009): 1236--1265&lt;/a&gt;,
&lt;a href=&quot;http://arxiv.org/abs/1010.0297&quot;&gt;arxiv:1010.0297&lt;/a&gt;
	&lt;li&gt;Olivier Thas, &lt;cite&gt;Comparing Distributions&lt;/cite&gt; [mostly
about goodness-of-fit tests]
	&lt;li&gt;F. V. Tkachov, &quot;Quasi-optimal observables: Attaining the quality
of maximal likelihood in parameter estimation when only a MC event generator is
available,&quot; &lt;a href=&quot;http://arxiv.org/abs/physics/0108030&quot;&gt;physics/0108030&lt;/a&gt;
	&lt;li&gt;Mark J. van der Laan and Sherri Rose, &lt;cite&gt;Targeted Learning:
Causal Inference for Observational and Experimental Data&lt;/cite&gt; [&lt;a href=&quot;http://www.springer.com/book/978-1-4419-9781-4&quot;&gt;Blurb&lt;/a&gt;]
	&lt;li&gt;Guenther Walther, &quot;The Average Likelihood Ratio for Large-scale Multiple Testing and Detecting Sparse Mixtures&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.0328&quot;&gt;arxiv:1111.0328&lt;/a&gt;
	&lt;li&gt;Xiaogang Wang and James V. Zidek, &quot;Selecting likelihood weights by
cross-validation&quot;, &lt;a
href=&quot;http://arxiv.org/abs/math.ST/0505599&quot;&gt;math.ST/0505599&lt;/a&gt; = &lt;a
href=&quot;http://dx.doi.org/10%2E1214/009053604000001309&quot;&gt;&lt;cite&gt;Annals of
Statistics&lt;/cite&gt; &lt;strong&gt;33&lt;/strong&gt; (2005): 463--500&lt;/a&gt; [&quot;The (relevance)
weighted likelihood was introduced to formally embrace a variety of statistical
procedures that trade bias for precision. Unlike its classical counterpart, the
weighted likelihood combines all relevant information while inheriting many of
its desirable features including good asymptotic properties. However, in order
to be effective, the weights involved in its construction need to be
judiciously chosen. Choosing those weights is the subject of this article in
which we demonstrate the use of cross-validation. We prove the resulting
weighted likelihood estimator (WLE) to be weakly consistent and asymptotically
normal. An application to disease mapping data is demonstrated.&quot; Sounds
interesting...]
	&lt;li&gt;Fabian L. Wauthier, Michael I. Jordan, &quot;Heavy-Tailed Processes for Selective Shrinkage&quot;, &lt;a href=&quot;http://arxiv.org/abs/1006.3901&quot;&gt;arxiv:1006.3901&lt;/a&gt;
	&lt;li&gt;Holger Wendland, &lt;cite&gt;Scattered Data Approximation&lt;/cite&gt;
	&lt;li&gt;Halbert White, &lt;cite&gt;Asymptotic Theory for Econometricians&lt;/cite&gt; [Useful
source, it seems, for non-IID central limit theorems]
	&lt;li&gt;Christopher K. I. Williams, &quot;How to Pretend That Correlated
Variables Are Independent by Using Difference
Observations&quot;, &lt;a
href=&quot;http://neco.mitpress.org/cgi/content/abstract/17/1/1&quot;&gt;&lt;cite&gt;Neural
Computation&lt;/cite&gt; &lt;strong&gt;17&lt;/strong&gt; (2005): 1--6&lt;/a&gt; [&quot;In many areas of data
modeling, observations at different locations (e.g., time frames or pixel
locations) are augmented by differences of nearby observations.... These
augmented observations are then often modeled as being independent. How can
this make sense? We provide two interpretations, showing (1) that the
likelihood of data generated from an autoregressive process can be computed in
terms of 'independent' augmented observations and (2) that the augmented
observations can be given a coherent treatment in terms of the products of
experts model...&quot;]
	&lt;li&gt;Wei Biao Wu, &quot;On false discovery control under dependence&quot;, &lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;36&lt;/strong&gt; (2008): 364--380, &lt;a href=&quot;http://arxiv.org/abs/0903.1971&quot;&gt;arxiv:0903.1971&lt;/a&gt;
	&lt;/ul&gt;
</description>
  </item>
  </channel>
</rss>
