<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Notebooks</title>
    <link>http://bactra.org/notebooks</link>
    <description>Cosma's Notebooks</description>
    <language>en</language>

  <item>
    <title>Partial Identification of Parametric Statistical Models</title>
    <link>http://bactra.org/notebooks/2009/10/24#partial-identification</link>
    <description>
&lt;P&gt;A parametric statistical model is said to be &quot;identifiable&quot; if no two
parameter settings give rise to the same distribution of observations.  This
means that there is always &lt;em&gt;some&lt;/em&gt; way to test whether the parameters
take one value rather than another.  If this is not the case, then the model is
said to be &quot;unidentifiable&quot;.  Sometimes models are unidentifiable because they
are bad models, specified in a stupid way which leads to redundancies.
Sometimes, however, models are unidentifiable because the data are bad --- if
you could measure certain variables, or measure them more precisely, the model
would be identifiable, but in fact you have to put up with noisy, missing,
aggregated, etc. data.  (Technically: the information we get from observations
is represented by a sigma algebra, or, over time, a filtration.  If two
distributions differ on the full filtration, their restrictions to some smaller
filtration might coincide.)  Presumably then you could still &lt;em&gt;partially&lt;/em&gt;
identify the model, up to, say, some notion of observational
equivalence.  &lt;em&gt;Query&lt;/em&gt;: how to make this precise?
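
&lt;P&gt;A minimal sketch of what unidentifiability looks like, under an assumed
toy model which is not taken from any of the references below: observations are
Gaussian with unit variance and mean &lt;code&gt;a + b&lt;/code&gt;, so only the sum can be
learned from data.  The names &lt;code&gt;normal_pdf&lt;/code&gt; and
&lt;code&gt;model_pdf&lt;/code&gt; are hypothetical.

&lt;pre&gt;
```python
import math

def normal_pdf(y, mean, sd=1.0):
    """Density of a Normal(mean, sd^2) distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def model_pdf(y, a, b):
    """Toy model (an assumption for illustration): Y ~ Normal(a + b, 1)."""
    return normal_pdf(y, a + b)

# Points at which to compare the densities.
grid = [-3.0 + 0.5 * i for i in range(13)]

# Two different parameter settings with the same sum a + b = 1:
# the densities agree everywhere on the grid.
same = all(
    math.isclose(model_pdf(y, 0.0, 1.0), model_pdf(y, 2.0, -1.0))
    for y in grid
)

# A setting with a different sum gives a different distribution:
# the densities disagree at one or more grid points.
differs = any(
    not math.isclose(model_pdf(y, 0.0, 1.0), model_pdf(y, 0.0, 2.0))
    for y in grid
)

print(same, differs)
```
&lt;/pre&gt;

&lt;P&gt;Here the sum &lt;code&gt;a + b&lt;/code&gt; plays the role of the partially identified
quantity: settings with the same sum are observationally equivalent, while
settings with different sums are not.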

&lt;P&gt;If the distribution predicted by the model depends, in a reasonably smooth
way, on the parameters, then we can form the Fisher information matrix, which
is basically the negative of the matrix of expected second derivatives of the
log-likelihood with respect to all the parameters.  (I realize that's not a very helpful statement
if you haven't at least forgotten the real definition of the Fisher
information.)  Suppose one or more of the eigenvalues of the Fisher information
matrix is zero.  Any vector orthogonal to the span of the eigenvectors
corresponding to the non-zero eigenvalues then gives a linear combination of
the original parameters which is unidentifiable, at least in the vicinity of
the point at which you're taking derivatives.  This suggests at least two
avenues of approach here.
	&lt;ol&gt;
	&lt;li&gt; Re-parameterization.  Perform some kind of rotation of the
coordinate system in parameter space so that the parameters separate into two
orthogonal groups, identifiable and non-identifiable, and inference for the
former can proceed in total ignorance of the values of the latter.
	&lt;li&gt; Equivalent models.  Say (as I implied above) that two parameter
settings are observationally equivalent if they yield the same distribution over
observations.  The set of all parameter values observationally equivalent to a
given value should, plausibly, form a sub-manifold of parameter space.  The
zero eigenvectors of the Fisher information matrix give the directions in which
we could move in parameter space and (locally) stay on this sub-manifold.  Is
this enough to actually &lt;em&gt;define&lt;/em&gt; that sub-manifold?  It sounds
plausible.  (I am thinking of how, in dynamical systems, we can go from knowing
the stable/unstable/neutral directions in the neighborhood of a fixed point to
the stable/unstable/neutral manifolds, extending, potentially, arbitrarily far
away.)  Could this actually be used, in an exercise in computational
differential geometry, to &lt;em&gt;calculate&lt;/em&gt; the sub-manifold?
	&lt;/ol&gt;
(2) is actually more ambitious, because implicitly it would let us accomplish
(1), re-parameterization, not just locally, in the vicinity of our favored
parameter value, but globally --- moves along the sub-manifold, by
construction, do not affect the likelihood, while moves from one sub-manifold
to another must.
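
&lt;P&gt;A rough numerical sketch of the idea, under stated assumptions rather than
a worked-out method.  The toy model is hypothetical: observations are Gaussian
with unit variance and mean &lt;code&gt;a * b&lt;/code&gt;, so the observationally
equivalent sets are the hyperbolas on which &lt;code&gt;a * b&lt;/code&gt; is constant.  For
a unit-variance Gaussian the Fisher information is the Jacobian of the mean,
transposed, times itself; the code approximates that Jacobian by finite
differences, takes the eigenvector of the smallest eigenvalue, and follows it
with crude Euler steps.  All function names are made up for illustration.

&lt;pre&gt;
```python
import numpy as np

def mean_fn(theta):
    """Hypothetical toy model: Y ~ Normal(a * b, 1), so only a * b matters."""
    a, b = theta
    return np.array([a * b])

def fisher_information(theta, eps=1e-6):
    """Fisher information of a unit-variance Gaussian with mean mean_fn(theta).

    For this family the information equals J^T J, where J is the Jacobian of
    the mean; J is approximated here by central finite differences.
    """
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((mean_fn(theta + step) - mean_fn(theta - step)) / (2.0 * eps))
    jac = np.stack(cols, axis=1)  # shape: (n_outputs, n_parameters)
    return jac.T @ jac

def null_direction(theta):
    """Eigenvalues (ascending) and the unit eigenvector of the smallest one."""
    eigvals, eigvecs = np.linalg.eigh(fisher_information(theta))
    return eigvals, eigvecs[:, 0]

def trace_equivalence_class(theta0, step=1e-3, n_steps=500):
    """Follow the (near-)zero eigenvector with Euler steps from theta0."""
    theta = np.asarray(theta0, dtype=float)
    direction = None
    path = [theta.copy()]
    for _ in range(n_steps):
        _, v = null_direction(theta)
        if direction is not None:
            # Eigenvectors have arbitrary sign; keep heading the same way.
            v = v * np.copysign(1.0, v @ direction)
        theta = theta + step * v
        direction = v
        path.append(theta.copy())
    return np.array(path)

eigvals, v = null_direction([2.0, 1.0])
path = trace_equivalence_class([2.0, 1.0])
products = path[:, 0] * path[:, 1]

print("eigenvalues:", eigvals)
print("unidentifiable direction:", v)
print("range of a * b along the path:", products.min(), products.max())
```
&lt;/pre&gt;

&lt;P&gt;At the starting point (2, 1) the eigenvalues should come out close to 0 and
5, and the product &lt;code&gt;a * b&lt;/code&gt; should stay close to 2 along the path.
Euler steps drift off the sub-manifold at second order in the step size, so
anything serious would need a better integrator or a projection back onto the
level set; this is only meant to show that the local directions can be chained
together.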

&lt;P&gt;&amp;mdash; Since writing the first version of this, I've run across the work of
Charles Manski, which is centrally concerned with &quot;partial identification&quot;, but
not quite in the sense I had in mind.  Rather than reparameterizing to get some
totally identifiable parameters and ignore the rest, he wants to take the
parameters as given, and put &lt;em&gt;bounds&lt;/em&gt; on the ones which can't be totally
identified.  This is only natural for the kinds of parameters he has in mind,
like the effects of policy interventions.
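
&lt;P&gt;To make the idea of bounds concrete, here is a small sketch of the familiar
worst-case bounds on a mean when some outcomes are missing.  It is a standard
illustration of bounding, not something worked out in this notebook, and it
assumes the outcome is known to lie in an interval &lt;code&gt;[lo, hi]&lt;/code&gt;.  The
function name and the data are hypothetical, and sampling error is ignored.

&lt;pre&gt;
```python
def worst_case_mean_bounds(observed, n_missing, lo=0.0, hi=1.0):
    """Bounds on the population mean when n_missing outcomes are unobserved.

    Assumes every outcome lies in [lo, hi] and that `observed` is non-empty.
    The mean splits into an observed part and a missing part; the missing
    part is unknown, so it is replaced by its smallest and largest values.
    """
    n_obs = len(observed)
    p_obs = n_obs / (n_obs + n_missing)
    mean_obs = sum(observed) / n_obs
    lower = mean_obs * p_obs + lo * (1.0 - p_obs)
    upper = mean_obs * p_obs + hi * (1.0 - p_obs)
    return lower, upper

# Made-up data: six observed binary outcomes, four missing.
bounds = worst_case_mean_bounds([1, 0, 1, 1, 0, 1], n_missing=4)
print(bounds)
```
&lt;/pre&gt;

&lt;P&gt;The width of the interval is the missing fraction times
&lt;code&gt;hi - lo&lt;/code&gt;, so the mean is point-identified only when nothing is
missing.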

&lt;P&gt;(Thanks to Gustavo Lacerda for corrections.)


&lt;P&gt;See also:
	&lt;a href=&quot;info-geo.html&quot;&gt;Information Geometry&lt;/a&gt;;
	&lt;a href=&quot;statistics.html&quot;&gt;Statistics&lt;/a&gt;

&lt;ul&gt;Recommended:
	&lt;li&gt;Charles Manski, &lt;cite&gt;Identification for Prediction
and Decision&lt;/cite&gt; [&lt;a href=&quot;../reviews/manski-on-identification/&quot;&gt;Review:
Better Roughly Right Than Exactly Wrong&lt;/a&gt;]
	&lt;/ul&gt;

&lt;ul&gt;To read:
	&lt;li&gt;Paul Gustafson, &quot;On Model Expansion, Model Contraction, Identifiability and Prior Information: Two Illustrative Scenarios Involving Mismeasured Variables&quot;, &lt;a href=&quot;http://projecteuclid.org/euclid.ss/1121347636&quot;&gt;&lt;cite&gt;Statistical
Science&lt;/cite&gt; &lt;strong&gt;20&lt;/strong&gt; (2005): 111--140&lt;/a&gt; [Thanks to Gustavo Lacerda for the pointer]
	&lt;li&gt;Charles Manski, &lt;cite&gt;Partial Identification of Probability
Distributions&lt;/cite&gt;
	&lt;/ul&gt;
</description>
  </item>
  </channel>
</rss>