<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Notebooks</title>
    <link>http://bactra.org/notebooks</link>
    <description>Cosma's Notebooks</description>
    <language>en</language>

  <item>
    <title>Concentration of Measure</title>
    <link>http://bactra.org/notebooks/2011/12/26#concentration-of-measure</link>
    <description>
&lt;P&gt;Roughly speaking, one says that a probability measure is &quot;concentrated&quot; if
any set of positive probability can be expanded very slightly to
contain &lt;em&gt;most&lt;/em&gt; of the probability.  A classic example (which I learned
in &lt;a href=&quot;stat-mech.html&quot;&gt;statistical mechanics&lt;/a&gt;) is the uniform
distribution on the interior of unit spheres in high dimensions.  A little bit
of calculus shows that, as the number of dimensions grows, the ratio of the
surface area to the volume does too, so a shell of thin, constant thickness on
the interior of the sphere ends up containing almost all of its volume, and
overlaps heavily with any set of positive probability.  Equivalently, all
well-behaved functionals of a concentrated measure are very close to their
expectation values with very high probability.  (This is equivalent because one
can specify a probability measure either through giving the probabilities of
sets in the sigma-field, or by giving the expectation values of a sufficiently
rich set of test functions, e.g., all bounded and continuous functions.  [At
least, bounded and continuous functions are enough for the weak topology on
measures, but if you understand that qualification then you know what I'm
saying anyway.])
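
&lt;P&gt;To make the &quot;little bit of calculus&quot; concrete: volume in &lt;i&gt;d&lt;/i&gt;
dimensions scales as the &lt;i&gt;d&lt;/i&gt;-th power of the radius, so the shell within
distance &lt;i&gt;t&lt;/i&gt; of the surface of the unit ball holds a fraction
1 - (1 - &lt;i&gt;t&lt;/i&gt;)&lt;sup&gt;&lt;i&gt;d&lt;/i&gt;&lt;/sup&gt; of the volume.  Here is a minimal sketch of
that arithmetic; the function name and the thickness of 0.01 are illustrative
choices only.

&lt;pre&gt;
```python
def shell_fraction(dim, thickness):
    """Fraction of the unit ball's volume lying within `thickness` of its surface."""
    # Volume scales as radius**dim, so the inner ball of radius (1 - thickness)
    # holds (1 - thickness)**dim of the total volume; the shell holds the rest.
    return 1.0 - (1.0 - thickness) ** dim


# Hold the shell thickness fixed and let the dimension grow.
for dim in (1, 10, 100, 1000):
    print(dim, shell_fraction(dim, 0.01))
```
&lt;/pre&gt;

&lt;P&gt;With the thickness held fixed, the printed fraction climbs towards 1 as the
dimension increases, which is the sense in which the thin shell ends up with
almost all of the volume.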

&lt;P&gt;In applications to probability, one is typically concerned with
&quot;concentration bounds&quot;, which are statements of the form &quot;for samples of
size &lt;i&gt;n&lt;/i&gt;, the probability that any function &lt;i&gt;f&lt;/i&gt; in a well-behaved
class &lt;i&gt;F&lt;/i&gt; differs from its expectation value by &lt;i&gt;h&lt;/i&gt; or more is at
most &lt;i&gt;Ce&lt;/i&gt;&lt;sup&gt;-&lt;i&gt;nr&lt;/i&gt;(&lt;i&gt;h&lt;/i&gt;)&lt;/sup&gt;&quot;, with explicit values of the
prefactor &lt;i&gt;C&lt;/i&gt; and the rate function &lt;i&gt;r&lt;/i&gt;; these might depend on the
true distribution and on the function class, but not on the specific
function &lt;i&gt;f&lt;/i&gt;.
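
&lt;P&gt;One standard instance of this template (picked here purely as an
illustration) is the bounded-differences, or McDiarmid, inequality: if
&lt;i&gt;f&lt;/i&gt; is a function of &lt;i&gt;n&lt;/i&gt; independent coordinates, and changing any
one coordinate changes &lt;i&gt;f&lt;/i&gt; by at most 1/&lt;i&gt;n&lt;/i&gt;, then the probability
that &lt;i&gt;f&lt;/i&gt; differs from its expectation by &lt;i&gt;h&lt;/i&gt; or more is at most
2&lt;i&gt;e&lt;/i&gt;&lt;sup&gt;-2&lt;i&gt;nh&lt;/i&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/sup&gt;, so &lt;i&gt;C&lt;/i&gt; = 2
and &lt;i&gt;r&lt;/i&gt;(&lt;i&gt;h&lt;/i&gt;) = 2&lt;i&gt;h&lt;/i&gt;&lt;sup&gt;2&lt;/sup&gt;.  The sketch below
just evaluates bounds of this shape; the function names and default arguments
are hypothetical, and other function classes would need their own prefactor and
rate function.

&lt;pre&gt;
```python
import math


def concentration_bound(n, h, prefactor=2.0, rate=lambda h: 2.0 * h * h):
    """Evaluate C * exp(-n * r(h)).

    The defaults (C = 2, r(h) = 2 h^2) are the bounded-differences case
    described in the text; they are an assumption, not a general fact.
    """
    return prefactor * math.exp(-n * rate(h))


def samples_needed(h, delta):
    """Smallest n for which the default bound 2 exp(-2 n h^2) is at most delta."""
    # Solve 2 exp(-2 n h^2) = delta for n, then round up to an integer.
    return math.ceil(math.log(2.0 / delta) / (2.0 * h * h))


n = samples_needed(0.05, 0.01)
print(n, concentration_bound(n, 0.05))
```
&lt;/pre&gt;

&lt;P&gt;Note that the answer depends only on &lt;i&gt;h&lt;/i&gt; and the target probability,
not on which function in the class one happens to be looking at.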

&lt;P&gt;These finite-sample upper bounds are to be distinguished from asymptotic
results giving matching upper and lower bounds
(&lt;a href=&quot;large-deviations.html&quot;&gt;large deviations theory&lt;/a&gt; proper).  They are
also distinct from &lt;a href=&quot;deviation-inequalities.html&quot;&gt;deviation
inequalities&lt;/a&gt;, which may depend on the specific function &lt;i&gt;f&lt;/i&gt;, though
obviously finding such deviation inequalities is often a first step to proving
concentration-of-measure results.

&lt;P&gt;(Most of what I know about this subject I've learned from my former
student &lt;a href=&quot;http://www.cs.bgu.ac.il/~karyeh/&quot;&gt;Aryeh Kontorovich&lt;/a&gt;, but
don't blame him for my misapprehensions.)

&lt;P&gt;See also:
	&lt;a href=&quot;empirical-process-theory.html&quot;&gt;Empirical Process Theory&lt;/a&gt;;
	&lt;a href=&quot;learning-theory.html&quot;&gt;Learning Theory&lt;/a&gt;;
	&lt;a href=&quot;model-selection.html&quot;&gt;Model Selection&lt;/a&gt;;
	&lt;a href=&quot;stat-mech.html&quot;&gt;Statistical Mechanics&lt;/a&gt;

&lt;ul&gt;Recommended:
	&lt;li&gt;Sergey Bobkov, Mokshay Madiman, &quot;Concentration of the information in data with log-concave distributions&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aop/1312555807&quot;&gt;&lt;cite&gt;Annals of Probability&lt;/cite&gt; &lt;strong&gt;39&lt;/strong&gt; (2011): 1528--1543&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1012.5457&quot;&gt;arxiv:1012.5457&lt;/a&gt;
	&lt;li&gt;Leonid (Aryeh) Kontorovich
[&lt;a href=&quot;../weblog/464.html&quot;&gt;Comments&lt;/a&gt;]
		&lt;ul&gt;
		&lt;li&gt;&quot;Obtaining Measure Concentration from Markov Contraction&quot;, &lt;a href=&quot;http://arxiv.org/abs/0711.0987&quot;&gt;arxiv:0711.0987&lt;/a&gt; [I really don't know why Leo bothered to take my &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/754/&quot;&gt;class&lt;/a&gt;]
		&lt;li&gt;&quot;Measure Concentration of Hidden Markov
Processes&quot;, &lt;a href=&quot;http://arxiv.org/abs/math.PR/0608064&quot;&gt;math.PR/0608064&lt;/a&gt;
		&lt;li&gt;&quot;Measure concentration of Markov Tree Processes&quot;,
&lt;a href=&quot;http://arxiv.org/abs/math.PR/0608511&quot;&gt;math.PR/0608511&lt;/a&gt;
		&lt;/ul&gt;
	&lt;li&gt;Leonid Kontorovich and Kavita Ramanan, &quot;Concentration Inequalities for Dependent Random Variables via the Martingale Method&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aop/1229696598&quot;&gt;&lt;cite&gt;Annals of Probability&lt;/cite&gt; &lt;strong&gt;36&lt;/strong&gt; (2008): 2126--2158&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs//math/0609835&quot;&gt;arxiv:math/0609835&lt;/a&gt;
	&lt;li&gt;&lt;a href=&quot;http://web.me.com/pascal.massart/Site/Home.html&quot;&gt;Pascal Massart&lt;/a&gt;, &lt;cite&gt;Concentration Inequalities and Model
Selection&lt;/cite&gt; [Using empirical process theory to get finite-sample, i.e.,
non-asymptotic, risk bounds for various forms
of &lt;a href=&quot;model-selection.html&quot;&gt;model selection&lt;/a&gt;.  Available for free as
a &lt;a href=&quot;http://eprints.pascal-network.org/archive/00002827/&quot;&gt;large PDF
preprint&lt;/a&gt;.]
	&lt;li&gt;Andreas Maurer, &quot;Thermodynamics and Concentration&quot;,
&lt;cite&gt;Bernoulli&lt;/cite&gt; submitted (2011) [Deriving concentration bounds
from &lt;a href=&quot;stat-mech.html&quot;&gt;statistical-mechanical arguments&lt;/a&gt;; very nice.
&lt;a href=&quot;http://www.andreas-maurer.eu/TermoConc.pdf&quot;&gt;PDF preprint&lt;/a&gt; via
Dr. Maurer]
	&lt;/ul&gt;

&lt;ul&gt;To read:
	&lt;li&gt;Sylvain Arlot, Gilles Blanchard, and Etienne Roquain, &quot;Some nonasymptotic results on resampling in high dimension, I: Confidence regions&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aos/1262271609&quot;&gt;&lt;cite&gt;Annals of Statistics&lt;/cite&gt; &lt;strong&gt;38&lt;/strong&gt; (2010): 51--82&lt;/a&gt;
	&lt;li&gt;Alexander Barvinok, &lt;a href=&quot;http://www.math.lsa.umich.edu/~barvinok/total710.pdf&quot;&gt;lecture notes on Measure Concentration&lt;/a&gt;
	&lt;li&gt;S.G. Bobkov and F. G&amp;ouml;tze, &quot;Concentration of empirical distribution functions with applications to non-i.i.d. models&quot;,
&lt;a href=&quot;http://projecteuclid.org/euclid.bj/1290092911&quot;&gt;&lt;cite&gt;Bernoulli&lt;/cite&gt; &lt;strong&gt;16&lt;/strong&gt; (2010): 1385--1414&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1011.6165&quot;&gt;arxiv:1011.6165&lt;/a&gt;
	&lt;li&gt;St&amp;eacute;phane Boucheron, G&amp;aacute;bor Lugosi, and Pascal Massart, &quot;Concentration inequalities using the entropy method&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.aop/1055425791&quot;&gt;&lt;cite&gt;Annals
of Probability&lt;/cite&gt; &lt;strong&gt;31&lt;/strong&gt; (2003): 1583--1614&lt;/a&gt;
	&lt;li&gt;Sourav Chatterjee and Partha S. Dey, &quot;Applications of Stein's method for concentration inequalities&quot;, &lt;cite&gt;Annals of Probability&lt;/cite&gt;
&lt;strong&gt;38&lt;/strong&gt; (2010): 2443--2485, &lt;a href=&quot;http://arxiv.org/abs/0906.1034&quot;&gt;arxiv:0906.1034&lt;/a&gt;
	&lt;li&gt;J.-R. Chazottes, P. Collet, C. Kuelske, and F. Redig,
&quot;Concentration inequalities for random fields via coupling&quot;,
&lt;a href=&quot;http://dx.doi.org/10.1007/s00440-006-0026-1&quot;&gt;&lt;cite&gt;Probability Theory and Related Fields&lt;/cite&gt; &lt;strong&gt;137&lt;/strong&gt; (2007):
201--225&lt;/a&gt;,
&lt;a href=&quot;http://arxiv.org/abs/math.PR/0503483&quot;&gt;math.PR/0503483&lt;/a&gt;
	&lt;li&gt;Nathael Gozlan, Christian L&amp;eacute;onard, &quot;Transport Inequalities. A Survey&quot;, &lt;a href=&quot;http://arxiv.org/abs/1003.3852&quot;&gt;arxiv:1003.3852&lt;/a&gt;
	&lt;li&gt;Russell Impagliazzo, Valentine Kabanets, &quot;Constructive Proofs of Concentration Bounds&quot;, &lt;a href=&quot;http://eccc.hpi-web.de/report/2010/072/&quot;&gt;&lt;cite&gt;Electronic Colloquium on Computational
Complexity&lt;/cite&gt; TR10-072&lt;/a&gt; [From my superficial and biased inspection, the
most impenetrably non-probabilistic proof of a probabilistic result I have ever seen.  But potentially interesting.]
	&lt;li&gt;Michel Ledoux, &lt;cite&gt;The Concentration of Measure Phenomenon&lt;/cite&gt;
	&lt;li&gt;Johannes C. Lederer, Sara A. van de Geer, &quot;New Concentration Inequalities for Suprema of Empirical Processes&quot;, &lt;a href=&quot;http://arxiv.org/abs/1111.3486&quot;&gt;arxiv:1111.3486&lt;/a&gt;
	&lt;li&gt;K. Marton, &quot;Bounding \bar{d}-distance by informational
divergence: a method to prove measure
concentration&quot;, &lt;a href=&quot;http://dx.doi.org/10.1214/aop/1039639365&quot;&gt;&lt;cite&gt;Annals
of Probability&lt;/cite&gt; &lt;strong&gt;24&lt;/strong&gt; (1996): 857--866&lt;/a&gt; [Thanks to Leo
Kontorovich for the pointer]
	&lt;li&gt;Timothy John Sullivan, Houman Owhadi, &quot;Equivalence of concentration inequalities for linear and non-linear functions&quot;, &lt;a href=&quot;http://arxiv.org/abs/1009.4913&quot;&gt;arxiv:1009.4913&lt;/a&gt;
	&lt;li&gt;Michel Talagrand, &lt;cite&gt;The Generic Chaining&lt;/cite&gt;
	&lt;/ul&gt;
</description>
  </item>
  </channel>
</rss>
