<?xml version="1.0"?>
<!-- name="generator" content="blosxom/2.0.2" -->
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>Three-Toed Sloth   </title>
    <link>http://bactra.org/weblog/</link>
    <description>Slow Takes from the Canopy (My Very Own Internet Tradition)</description>
    <language>en</language>

  <item>
    <title>Cognitive Democracy</title>
    <link>http://bactra.org/weblog/917.html</link>
    <description>
&lt;blockquote&gt;&lt;em&gt;Attention conservation notice&lt;/em&gt;: 8000+ words of political
theory by geeks.
&lt;/blockquote&gt;

&lt;P&gt;For quite a while now, &lt;a href=&quot;http://www.henryfarrell.net/&quot;&gt;Henry
Farrell&lt;/a&gt; and I have been worrying at a skein of ideas running between
&lt;a href=&quot;../notebooks/institutions.html&quot;&gt;institutions&lt;/a&gt;, &lt;a href=&quot;cat_networks.html&quot;&gt;networks&lt;/a&gt;,
evolutionary
games, &lt;a href=&quot;../notebooks/democracy.html&quot;&gt;democracy&lt;/a&gt;, &lt;a href=&quot;../notebooks/collective-cognition.html&quot;&gt;collective
cognition&lt;/a&gt;, the Internet and inequality.  Turning these threads into a
single seamless garment is beyond our hopes (and not our style anyway), but we
think we've been getting somewhere in at least usefully disentangling them.  An
upcoming workshop gave us an excuse to set out part (but not all) of the
pattern, and so here's a draft.  To the extent you like it, thank Henry; the
remaining snarls are my fault.

&lt;P&gt;This &lt;em&gt;is&lt;/em&gt; a draft, and anyway it's part of the spirit of the piece
that feedback would be appropriate.  I do not have comments, for
my &lt;a href=&quot;http://dashes.com/anil/2011/07/if-your-websites-full-of-assholes-its-your-fault.html&quot;&gt;usual
reasons&lt;/a&gt;, but
it's &lt;a href=&quot;http://crookedtimber.org/2012/05/23/cognitive-democracy/&quot;&gt;cross-posted
at Crooked Timber&lt;/a&gt;, and you can always send an e-mail.

&lt;P&gt;(&lt;em&gt;Very&lt;/em&gt; Constant Readers will now see the context for a lot of my
non-statistical writing over the last few years, including the previous post.)


&lt;h3 id=&quot;cognitive-democracy&quot;&gt;Cognitive Democracy&lt;/h3&gt;
&lt;h4 id=&quot;authors&quot;&gt;Henry Farrell (George Washington University) and Cosma Rohilla Shalizi (Carnegie Mellon University/The Santa Fe Institute)&lt;/h4&gt;

&lt;p&gt;In this essay, we outline a cognitive approach to democracy. Specifically, we argue that democracy has unique benefits as a form of collective problem solving in that it potentially allows people with highly diverse perspectives to come together in order collectively to solve problems. Democracy can do this better than either markets or hierarchies, because it brings these diverse perceptions into direct contact with each other, allowing forms of learning that are unlikely either through the price mechanism of markets or the hierarchical arrangements of bureaucracy. Furthermore, democracy can, by experimenting, take advantage of novel forms of collective cognition that are facilitated by new media.&lt;/p&gt;

&lt;p&gt;Much of what we say is synthetic --- our normative arguments build on both
the academic literature (Joshua Cohen's
and &lt;a href=&quot;../reviews/ober-democracy-and-knowledge.html&quot;&gt;Josiah Ober's&lt;/a&gt;
arguments about epistemic
democracy; &lt;a href=&quot;../reviews/knight-johnson.html&quot;&gt;Jack Knight and James
Johnson's pragmatist account of the benefits of a radically egalitarian
democracy&lt;/a&gt; and Elster
and &lt;a href=&quot;http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1845709&quot;&gt;Landemore's&lt;/a&gt;
forthcoming collection on &lt;em&gt;Collective Wisdom&lt;/em&gt;), and on arguments by
public intellectuals such
as &lt;a href=&quot;../reviews/where-good-ideas-come-from/&quot;&gt;Steven Berlin Johnson&lt;/a&gt;,
Clay Shirky, Tom Slee and Chris Hayes. We also seek to contribute to new
debates on the sources of collective wisdom. Throughout, we emphasize
the &lt;em&gt;cognitive&lt;/em&gt; benefits of democracy, building on important results
from cognitive science, from sociology, from machine learning and from network
theory.&lt;/p&gt;

&lt;p&gt;We start by explaining what social institutions &lt;em&gt;should&lt;/em&gt; do. Next, we examine sophisticated arguments that have been made in defense of markets (Hayek's theories about catallaxy) and hierarchy (Richard Thaler and Cass Sunstein's &quot;libertarian paternalism&quot;) and discuss their inadequacies. The subsequent section lays out our arguments in favor of democracy, illustrating how democratic procedures have cognitive benefits that other social forms do not. The penultimate section discusses how democracy can learn from new forms of collective consensus formation on the Internet, treating these forms not as ideals to be approximated, but as imperfect experiments, whose successes &lt;em&gt;and failures&lt;/em&gt; can teach us about the conditions for better decision making; this is part of a broader agenda for cross-disciplinary research involving computer scientists and democratic theorists.&lt;/p&gt;

&lt;h4 id=&quot;justifying-social-institutions&quot;&gt;Justifying Social Institutions&lt;/h4&gt;

&lt;p&gt;What are broad macro-institutions such as politics, markets and hierarchies
good for? Different theorists have given very different answers to this
question. The dominant tradition in political theory tends to evaluate them in
terms of &lt;em&gt;justice&lt;/em&gt; --- whether institutions use procedures, or give
results, that can be seen as just according to some reasonable normative
criterion. Others, perhaps more cynically, have focused on their potential
contribution to &lt;em&gt;stability&lt;/em&gt; --- whether they produce an acceptable level
of social order, which minimizes violence and provides some modicum of
predictability. In this essay, we analyze these institutions according to a
different criterion. We start with a pragmatist question --- whether these
institutions are &lt;em&gt;useful&lt;/em&gt; in helping us to &lt;em&gt;solve difficult social
problems.&lt;/em&gt;&lt;sup&gt;&lt;a href=&quot;#fn1&quot; class=&quot;footnoteRef&quot;
id=&quot;fnref1&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Some of the problems that we face in politics are simple ones (not in the
sense that solutions are easy, but in the sense that they are simple to
analyze). However, the most vexing problems are usually ones without any very
obvious solutions. How do we change legal rules and social norms in order to
mitigate the problems of global warming? How do we regulate financial markets
so as to minimize the risk of new crises emerging, and limit the harm of those
that happen? How do we best encourage the spread of human rights
internationally?&lt;/p&gt;

&lt;p&gt;These problems are pressing --- yet they are difficult to think about
systematically, let alone solve. They all share two important features. First,
they are all &lt;em&gt;social&lt;/em&gt; problems. That is, they are problems which involve
the interaction of large numbers of human beings, with different interests,
desires, needs and perspectives. Second, as a result, they are &lt;em&gt;complex&lt;/em&gt;
problems, in the sense that scholars of complexity understand the term. To
borrow Scott Page's (2011, p. 25) definition, they involve &quot;&lt;em&gt;diverse&lt;/em&gt;
entities that interact in a &lt;em&gt;network&lt;/em&gt; or &lt;em&gt;contact
structure.&lt;/em&gt;&quot;&lt;sup&gt;&lt;a href=&quot;#fn2&quot; class=&quot;footnoteRef&quot; id=&quot;fnref2&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;
They are a result of behavior that is difficult to predict, so that
consequences of changing behavior are extremely hard to map out in
advance. Finding solutions is difficult, and even when we find one, it is hard
to know whether it is good in comparison to other possible solutions, let alone
the best.&lt;/p&gt;

&lt;p&gt;We argue that macro-institutions will best be able to tackle these problems
if they have two features. First, they should foster &lt;em&gt;a high degree of
direct communication&lt;/em&gt; between individuals with diverse viewpoints. This
kind of intellectual diversity is crucial to identifying good solutions to
complex problems. Second, we argue that they should provide &lt;em&gt;relative
equality&lt;/em&gt; among affected actors in decision-making processes, so as to
prevent socially or politically powerful groups from blocking socially
beneficial changes that would be to the detriment of their own particular interests.&lt;/p&gt;

&lt;p&gt;We base these contentions on two sets of arguments, one from work on
collective problem solving, the other from theories of political power. Both
are clarified if we think of the possible solutions to a difficult problem as
points on a landscape, where we seek the highest point. Difficult problems
present many peaks, solutions that are better than the points close to
them. Such landscapes are &lt;em&gt;rugged&lt;/em&gt; --- they have some degree of
organization, but are not so structured that simple algorithms can quickly find
the best solution. There is no guarantee that any particular peak
is &lt;em&gt;globally optimal&lt;/em&gt; (i.e. the best solution across the entire
landscape) rather than &lt;em&gt;locally optimal&lt;/em&gt; (the best solution within a
smaller subset of the landscape).&lt;/p&gt;

&lt;p&gt;Solving a complex problem involves a search across this landscape for the
best visible solutions. Individual agents have limited cognitive abilities, and
(usually) limited knowledge of the landscape. Both of these make them likely to
get stuck at local optima, which may be much worse than even other local peaks,
let alone the global optimum. Less abstractly, people may settle for bad
solutions, because they do not know better (they cannot perceive other, better
solutions), or because they have difficulty in reaching these solutions
(e.g. because of coordination problems, or because of the ability of powerful
actors to veto possible changes).&lt;/p&gt;

&lt;p&gt;Lu Hong and Scott Page (2004) &lt;a href=&quot;362.html&quot;&gt;use mathematical models to
argue that &lt;em&gt;diversity of viewpoints&lt;/em&gt; helps groups find better
solutions&lt;/a&gt; (higher peaks on the landscape). The intuition is that different
individuals, when confronting a problem, &quot;see&quot; different landscapes --- they
organize the set of possible solutions in different ways, some of which are
useful in identifying good peaks, some of which less so. Very smart individuals
(those with many mental tools) have better organized landscapes than less smart
individuals, and so are less likely to get trapped at inferior local
optima. However, at the group level, diversity of viewpoints matters a
lot. Page and Hong find that &quot;diversity trumps ability&quot;. Groups with high
diversity of internal viewpoints are better able to identify optima than groups
composed of much smarter individuals with more homogeneous viewpoints. By
putting their diverse views together, the former are able to map out more of
the landscape and identify possible solutions that would be invisible to groups
of individuals with more similar perspectives.&lt;/p&gt;

&lt;p&gt;Page and Hong do not model the social processes through which individuals
can bring their diverse points of view together into a common
framework. However, their arguments surely suggest that actors' different
points of view need to be &lt;em&gt;exposed directly&lt;/em&gt; to each other, in order to
identify the benefits and drawbacks of different points of view, the ways in
which viewpoints can be combined to better advantage, and so on. These
arguments are supported by a plethora of work in sociology and elsewhere (Burt,
Rossman, etc.). As we explain at length below, some degree of clumping is also
beneficial, so that individuals with divergent viewpoints do not converge
too quickly.&lt;/p&gt;

&lt;p&gt;The second issue for collective problem solving is more obvious. Even when
groups are able to identify good solutions (relatively high peaks in the
solution landscape), they may not be able to reach them. In particular, actors
who benefit from the status quo (or who would prefer less generally-beneficial
solutions) may be able to use political and social power to block movement
towards such peaks, and instead compel movement towards solutions that have
lower social and greater individual benefits. Research on problem solving
typically does not talk about differences in actors' interests, or in actors'
ability successfully to pursue their interests. While different individuals
initially perceive different aspects of the landscape, researchers assume that
once they are able to communicate with each other, they will all agree on how
to rank visible solutions from best to worst. But actors may have diverse
interests as well as diverse understandings of the world (and the two may
indeed be systematically linked). They may even be working in such different
landscapes, in terms of personal advantage, that one actor's peak is another's
valley, and vice versa. Moreover, actors may differ in their ability to ensure
that their interests are prosecuted. Recent work in political theory (Knight
1992, Johnson and Knight 2011), economics (Bowles and Naidu 2008), political
science (Hacker and Pierson 2010) and sociology details how powerful actors may
be able to compel weaker ones to accept solutions that are to the advantage of
the former, but that have lower overall social benefits.&lt;/p&gt;

&lt;p&gt;Here, relative equality of power can have important
consequences. Individuals in settings with relatively equal power relations
are, &lt;em&gt;ceteris paribus&lt;/em&gt;, more likely to converge on solutions with broad
social benefits, and less likely to converge on solutions that benefit smaller
groups of individuals at the expense of the majority. Furthermore, equal power
relations may not only make it easier to converge on &quot;good&quot; solutions when they
have been identified, but may stimulate the process of search for such
solutions. Participating in the search for solutions and in decision-making
demands resources (at a minimum, time), and if those resources are concentrated
in a small set of actors, with similar interests and perspectives, the
solutions they will find will be fewer and worse than if a wide variety of
actors can also search.&lt;/p&gt;

&lt;p&gt;With this in mind, we ask whether different macro-institutions are better
or worse at solving the complex problems that confront modern economies and
societies. Institutions will tend to do better to the extent that they both (i)
bring together people with different perspectives, and (ii) share
decision-making power relatively equally. Our arguments are, obviously, quite
broad. We do not speak much to the specifics of how macro-institutions work,
instead focusing on the broad logics of these different
macro-institutions. Furthermore, we do not look at the ways in which our
desiderata interact with other reasonable desiderata (such as social stability,
justice and so on). Even so, we think that it is worth clarifying the ways in
which different institutions can, or cannot, solve complex problems. In recent
decades, for example, many scholars and policy makers have devoted time and
energy to advocating markets as &lt;em&gt;the&lt;/em&gt; way to address social problems
that are too complex to be solved by top-down authority. As we show below,
markets, to the extent that they imply substantial power inequalities, and
increasingly homogenize human relations, are unlikely to possess the virtues
attributed to them, though they can have more particular benefits under
specific circumstances. Similarly, hierarchy suffers from dramatic
informational flaws. This prompts us to reconsider democracy, not for the sake
of justice or stability, but as a tool for solving the complex problems faced
by modern societies.&lt;/p&gt;

&lt;h4 id=&quot;markets-and-hierarchies-as-ways-to-solve-complex-problems&quot;&gt;Markets and Hierarchies as Ways to Solve Complex Problems&lt;/h4&gt;

&lt;p&gt;Many scholars and public intellectuals believe that markets or hierarchies
provide better ways to solve complex problems than democracy. Advocates of
markets usually build on the groundbreaking work of F. A. von Hayek, to argue
that market-based forms of organization do a better job of eliciting
information and putting it to good work than does collective
organization. Advocates of hierarchy do not write from any such unified
tradition. However, Richard Thaler and Cass Sunstein have recently made a
sophisticated case for the benefits of hierarchy. They advocate a combination
of top-down mechanism design and institutions designed to guide choices rather
than to constrain them --- what they call libertarian paternalism --- as a way to
solve difficult social problems. Hayek's arguments are not the only case for
markets, and Thaler and Sunstein's are not the only justification for
hierarchy. They are, however, among the &lt;em&gt;best&lt;/em&gt; such arguments, and hence
provide a good initial way to test the respective benefits of markets,
hierarchies and democracies in solving complex problems. If there are better
arguments, which do not fall victim to the kinds of problems we point to, we
are not aware of them (but would be very happy to be told of them).&lt;/p&gt;

&lt;p&gt;Hayek's account of the informational benefits of markets is
groundbreaking. Although it builds on the insights of others (particularly
Michael Polanyi), it is arguably the first real effort to analyze how social
institutions work as information-processors. Hayek reasons as follows. Much of
human knowledge (as Polanyi argues) is practical, and cannot be fully
articulated (&quot;tacit&quot;). This knowledge is nonetheless crucial to economic
life. Hence, if we are to allocate resources well, we must somehow gather this
dispersed, fragmentary, informal knowledge, and make it useful.&lt;/p&gt;

&lt;p&gt;Hayek is explicit that no one person can know all that is required to
allocate resources properly, so there must be a &lt;em&gt;social&lt;/em&gt; mechanism for
such information processing. Hayek identifies three possible mechanisms:
central planning, planning by monopolistic industries, and decentralized
planning by individuals. He argues that the first and second of these break
down when we take account of the vast amount of tacit knowledge, which cannot
be conveyed to any centralized authority. Centralized or semi-centralized
planning is especially poor at dealing with the constant flows of major and
minor changes through which an economy (or, as Hayek would prefer, a catallaxy)
approaches balance. To deal with such changes, we need people to make the
necessary decisions on the spot --- but we also need some way to convey the
appropriate information about changes in the larger economic system to
them. The virtue of the price system, for Hayek, is to compress diffuse, even
tacit, knowledge about specific changes in specific circumstances into a single
index, which can guide individuals as to how they ought to respond to changes
elsewhere. I do not need to grasp the intimate local knowledge of the farmer
who sells me tomatoes in order to decide whether to buy their products. The
farmer needs to know the price of fertilizer, not how it is made, or what it
could be used for other than tomatoes, or the other uses of the fertilizer's
ingredients. (I do not even need to know the price of fertilizer.) The
information that we need, to decide whether to buy tomatoes or to buy
fertilizer, is conveyed through prices, which may go up or down, depending on
the aggregate action of many buyers or suppliers, each working with her own
tacit understandings.&lt;/p&gt;

&lt;p&gt;This insight is both crucial and beautiful&lt;sup&gt;&lt;a href=&quot;#fn3&quot;
class=&quot;footnoteRef&quot; id=&quot;fnref3&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, yet it has stark limits. It
suggests that markets will be best at conveying a particular kind of
information about a particular kind of underlying fact, i.e., the relative
scarcity of different goods. As Stiglitz (2000) argues, market signals about
relative scarcity are always distorted, because prices embed information about
many other economically important factors. More importantly, although
information about relative scarcity surely helps markets approach some kind of
balance, it is little help in solving more complicated social problems, which
may depend not on allocating existing stocks of goods in a useful way, given
people's dispersed local knowledge, so much as discovering new goods or new
forms of allocation. More generally, Hayek's well-known detestation of
projects with collective goals leads him systematically to discount the ways in
which aggregate knowledge might work to solve collective rather than individual
problems.&lt;/p&gt;

&lt;p&gt;This is unfortunate. To the extent that markets fulfil Hayek's criteria, and
mediate all relevant interactions through the price mechanism, they foreclose
other forms of exchange that are more intellectually fruitful. In particular,
Hayek's reliance on arguments about inarticulable tacit knowledge means that he
leaves no place for reasoned discourse or the useful exchange of views. In
Hayek's markets, people communicate only through prices. The advantage of
prices, for Hayek, is that they inform individuals about what others want (or
don't want), without requiring anyone to know anything about anyone else's
plans or understandings. But there are many useful forms of knowledge that
cannot readily be conveyed in this way.&lt;/p&gt;

&lt;p&gt;Individuals may learn something about those understandings as
a &lt;em&gt;by-product&lt;/em&gt; of market interactions. In John Stuart Mill's
description:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;But the economical advantages of commerce are surpassed in importance by
those of its effects which are intellectual and moral. It is hardly possible to
overrate the value, in the present low state of human improvement, of placing
human beings in contact with persons dissimilar to themselves, and with modes
of thought and action unlike those with which they are familiar. Commerce is
now what war once was, the principal source of this contact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;However, such contact is largely incidental --- people engage in market
activities to buy or to sell to best advantage, not to learn. As markets become
purer, in both the Hayekian and neo-classical senses, they produce ever less of
the contact between different modes of life that Mill regards as salutary. The
resurgence of globalization; the creation of an Internet where people who will
only ever know each other by their account names buy and sell from each other;
the replacement of local understandings with global standards; all these
provide enormous efficiency gains and allow information about supply and demand
to flow more smoothly. Yet each of them undermines the Millian benefits of
commerce, by making it less likely that individuals with different points of
view will have those perspectives directly exposed to each other. More
tentatively, markets may themselves have a homogenizing impact on differences
between individuals and across societies, again reducing diversity. As Albert
Hirschman shows, there is a rich, if not unambiguous, literature on the global
consequences of market society. Sociologists such as John Meyer and his
colleagues find evidence of increased cultural and social convergence across
different national contexts, as a result of exposure to common market and
political forces.&lt;/p&gt;

&lt;p&gt;In addition, it is unclear whether markets in general reduce power
inequalities or reinforce them in modern democracies. It is almost certainly
true that the spread of markets helped undermine some historical forms of
hierarchy, such as feudalism (Marx). It is &lt;em&gt;not&lt;/em&gt; clear that they
continue to do so in modern democracies. On the one hand, free market
participation provides individuals with some ability (presuming equal market
access, etc.) to break away from abusive relationships. On the other, markets
provide greater voice and choice to those with more money; if money talks in
politics, it shouts across the agora. Nor are these effects limited to the
marketplace. The market facilitates and fosters asymmetries of wealth which in
turn may be directly or indirectly translated into asymmetries of political
influence (Lindblom). Untrammeled markets are associated with gross income
inequalities, which in turn infect politics with a variety of
pathologies. This suggests that markets fail in the broader task of exposing
individuals' differing perspectives to each other. Furthermore, markets
are at best indifferent levelers of unequal power relations.&lt;/p&gt;

&lt;p&gt;Does hierarchy do better? In an influential recent book, Richard Thaler and
Cass Sunstein suggest that it does. They argue that &quot;choice architects&quot;, people
who have &quot;responsibility for organizing the context in which people make
decisions,&quot; can design institutions so as to spur people to make better choices
rather than worse ones. Thaler and Sunstein are self-consciously paternalist,
claiming that flawed behavior and thinking consistently stop people from making
the choices that are in their best interests. However, they also find direct
control of people's choices morally objectionable. Libertarian paternalism seeks
to guide but not eliminate choice, so that the easiest option is the &quot;best&quot;
choice that individuals &lt;em&gt;would&lt;/em&gt; make, if they only had sufficient
attention and discipline. It provides paternalistic guidance through
libertarian means, shaping choice contexts to make it more likely that
individuals will make the right choices rather than the wrong ones.&lt;/p&gt;

&lt;p&gt;This is, in Thaler and Sunstein's words, a politics of &quot;nudging&quot; choices
rather than dictating them. Although Thaler and Sunstein do not put it this
way, it is also a plea for the benefits of hierarchy in organizations and, in
particular, in government. Thaler and Sunstein's &quot;choice architects&quot; are
hierarchical superiors, specifically empowered to create broad schemes that
will shape the choices of many other individuals. Their power to do this does
not flow from, e.g., accountability to those whose choices get shaped. Instead,
it flows from positions of authority within a firm or government, which allow
them to craft pension contribution schemes within firms, environmental policy
within the government, and so on.&lt;/p&gt;

&lt;p&gt;Thaler and Sunstein's recommendations have outraged libertarians, who
believe that a nudge is merely a well-aimed shove --- that individuals' freedom
will be reduced nearly as much by Thaler and Sunstein's choice architecture, as
it would be by direct coercion. &lt;a href=&quot;838.html&quot;&gt;We are also unenthusiastic
about libertarian paternalism, but for different reasons&lt;/a&gt;. While we do not
talk, here, about coercion, we have no particular normative objection to it,
provided that it is proportionate, directed towards legitimate ends, and
constrained by well-functioning democratic controls. Instead, we worry that the
kinds of hierarchy that Thaler and Sunstein presume actively inhibit the
unconstrained exchange of views that we see as essential to solving complex
problems.&lt;/p&gt;

&lt;p&gt;Bureaucratic hierarchy is an extraordinary political achievement. States
with clear, accountable hierarchies can achieve vast and intricate projects,
and businesses use hierarchies to coordinate highly complex chains of
production and distribution.&lt;sup&gt;&lt;a href=&quot;#fn4&quot; class=&quot;footnoteRef&quot;
id=&quot;fnref4&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; Even so, there are reasons why bureaucracies have few
modern defenders. Hierarchies rely on power asymmetries to work. Inferiors take
orders from superiors, in a chain of command leading up to the chief executive
officer (in firms) or some appointed or non-appointed political actor (in
government). This is good for pushing orders &lt;em&gt;down&lt;/em&gt; the chain, but
notoriously poor at transmitting useful information &lt;em&gt;up&lt;/em&gt;, especially
kinds of information superiors did not anticipate wanting. As scholars from Max
Weber on have emphasized, bureaucracies systematically encourage a culture of
conformity in order to increase predictability and static efficiency.&lt;/p&gt;

&lt;p&gt;Thaler and Sunstein presume a hierarchy in which orders are followed and
policies are implemented, but ignore what this implies about feedback. They
imagine hierarchically-empowered architects shaping the choices of a less
well-informed and less rational general population. They discuss ordinary
people's bad choices at length. However, they have remarkably little to say
about how it is that the architects housed atop the hierarchy can figure out
better choices on these individuals' behalf, or how the architects can
actually design choice systems that will encourage these choices. Sometimes,
Thaler and Sunstein suggest that choice architects can rely on introspection:
&quot;Libertarian paternalists would like to set the default by asking what
reflective employees in Janet's position would actually want.&quot; At other times,
they imply that choice architects can use experimental techniques. The book's
opening analogy proposes a set of experiments, in which the director of food
services for a system &quot;with hundreds of schools&quot; (p. 1), &quot;who likes to think
about things in non-traditional ways,&quot; experiments with different arrangements
of food in order to discover which displays encourage kids to pick the
healthier options. Finally, Thaler and Sunstein sometimes argue that choice
architects can use results from the social sciences to find optima.&lt;/p&gt;

&lt;p&gt;One mechanism of information gathering that they systematically ignore is
active feedback from citizens. Although they argue in passing that feedback
from choice architects can help guide &lt;em&gt;consumers&lt;/em&gt;, e.g., giving
information about the content of food, or by shaping online interactions to
ensure that people are exposed to others' points of view, they have no place
for feedback from the individuals whose choices are being manipulated to help
guide the choice architects, let alone to constrain them. As Suzanne Mettler
(2011) has pointed out, Thaler and Sunstein depict citizens as passive
consumers, who need to be guided to the desired outcomes, rather than active
participants in democratic decision making.&lt;/p&gt;

&lt;p&gt;This also means that Thaler and Sunstein's proposals don't take advantage of
diversity. Choice architects, located within hierarchies which tend generically
to promote conformity, are likely to have a much more limited range of ways of
understanding problems than the population whose choices they are seeking to
structure. In Scott Page's terms, these actors may be very &quot;able&quot; --- they
will have sophisticated and complex heuristics, so that each individual choice
architect is better able than each individual member of the population to see a
large portion of the landscape of possible choices and outcomes. However, the
architects will be very similar to each other in background and training, so
that &lt;em&gt;as a group&lt;/em&gt; they will see a far more limited set of possibilities
than a group of randomly selected members of the population (who are likely to
have less sophisticated but far more diverse heuristics). Cultural homogeneity
among hierarchical elites helps create policy disasters (the &quot;best and
brightest&quot; problem). Direct involvement of a wider selection of actors with
more diverse heuristics would alleviate this problem.&lt;/p&gt;

&lt;p&gt;However, precisely because choice architects rely on hierarchical power to
create their architectures, they will have difficulty in eliciting feedback,
even if they want to. Inequalities of power notoriously dampen real exchanges
of viewpoints. Hierarchical inferiors within organizations worry about
contradicting their bosses. Ordinary members of the public are uncomfortable
when asked to contradict experts or officials. Work on group decision making
(including, e.g., Sunstein 2003) is full of examples of how perceived power
inequalities lead less powerful actors either to remain silent, or merely to
affirm the views of more powerful actors, even when they have independently
valuable perspectives or knowledge.&lt;/p&gt;

&lt;p&gt;In short, libertarian paternalism is flawed, not because it restricts
people's choices, but because it makes heroic assumptions about choice
architects' ability to figure out what the actual default choices should be,
and blocks their channels for learning better. Choice architects will be likely
to share a narrow range of sophisticated heuristics, and to have difficulty in
soliciting feedback from others with more diverse heuristics, because of their
hierarchical superiority and the unequal power relations that this
entails. Libertarian paternalism may still have value in situations of
individual choice, where people likely do &quot;want&quot; e.g. to save more or take more
exercise, but face commitment problems, or when other actors have an incentive
to misinform these people or to structure their choices in perverse ways in the
absence of a 'good' default choice. However, it will be far less useful, or
even actively pernicious, in complex situations, where many actors with
different interests make interdependent choices. Indeed, Thaler and Sunstein
are far more convincing when they discuss how to encourage people to choose
appropriate pension schemes than when they suggest that environmental problems
are the &quot;outcome of a global choice architecture system&quot; that could be usefully
rejiggered via a variety of voluntaristic mechanisms.&lt;/p&gt;

&lt;h4 id=&quot;democracy-as-a-way-to-solve-complex-problems&quot;&gt;Democracy as a way to solve complex problems&lt;/h4&gt;

&lt;p&gt;Is democracy better at identifying solutions to complex problems? Many ---
even on the left --- doubt that it is. They point to problems of finding common
ground and of partisanship, and despair of finding answers to hard
questions. The dominant tradition of American liberalism actually has
considerable distaste for the less genteel aspects of democracy. The early 20th
century Progressives and their modern heirs deplore partisanship and political
rivalry, instead preferring technocracy, moderation and deliberation (Rosenblum
2008). Some liberals (e.g., Thaler and Sunstein) are attracted to Hayekian
arguments for markets and libertarian paternalist arguments for hierarchy
exactly because they seem better than the partisan rancor of democratic
competition.&lt;/p&gt;

&lt;p&gt;We believe that they are wrong, and democracy offers a better way of solving
complex problems. Since, as we've argued, power asymmetries inhibit
problem-solving, democracy has a large advantage over both markets and
technocratic hierarchy. The fundamental democratic commitment is to equality of
power over political decision making. Real democracies do not deliver on this
commitment any more than real markets deliver perfect competition, or real
hierarchies deliver an abstractly benevolent interpretation of rules. But a
commitment to &lt;em&gt;democratic&lt;/em&gt; improvements is a commitment to making power
relations more equal, just as a commitment to markets is to improving
competition, and a commitment to hierarchy (in its positive aspects) is a
commitment to greater disinterestedness. This implies that a genuine commitment
to democracy is a commitment to political radicalism. We embrace this.&lt;/p&gt;

&lt;p&gt;Democracy, then, is committed to equality of power; it is also well-suited
to exposing points of view to each other in a way that leads to identifying
better solutions. This is because democracy also involves &lt;em&gt;debate&lt;/em&gt;. In
competitive elections and in more intimate discussions, democratic actors argue
over which proposals are better or worse, exposing their different perspectives
to each other.&lt;/p&gt;

&lt;p&gt;Yet at first glance, this interchange of perspectives looks ugly: it is partisan, rancorous and vexatious, and people seem to never change their minds. This leads some on the left to argue that we need to replace traditional democratic forms with ones that involve genuine deliberation, where people will strive to be open-minded, and to transcend their interests. These aspirations are hopelessly utopian. Such impartiality can only be achieved fleetingly at best, and clashes of interest and perception are intrinsic to democratic politics.&lt;/p&gt;

&lt;p&gt;Here, we concur with Jack Knight and Jim Johnson's important recent book
(2011), which argues that politics is a response to the problem of
diversity. Actors with differing --- indeed conflicting --- interests and
perceptions find that their fates are bound together, and that they must make
the best of this. Yet, Knight and Johnson argue, politics is also a matter of
seeking to &lt;em&gt;harness&lt;/em&gt; diversity so as to generate useful knowledge. They
specifically do not argue that democracy requires impartial
deliberation. Instead, they claim that partial and self-interested debate can
have epistemological benefits. As they describe it, &quot;democratic decision
processes make better use of the distributed knowledge that exists in a society
than do their rivals&quot; such as market coordination or judicial decision making
(p. 151). Knight and Johnson suggest that approaches based on diversity, such
as those of Scott Page and Elizabeth Anderson, provide a better foundation for
thinking about the epistemic benefits of democracy than the arguments of
Condorcet and his intellectual heirs.&lt;/p&gt;

&lt;p&gt;We agree. Unlike Hayek's account of markets, and Thaler and Sunstein's
account of hierarchy, this argument suggests that democracy can foster
communication among individuals with highly diverse viewpoints. This is an
argument for &lt;em&gt;cognitive democracy&lt;/em&gt;, for democratic arrangements that
take best advantage of the cognitive diversity of their population. Like us,
Knight and Johnson stress the pragmatic benefits of equality. Harnessing the
benefits of diversity means ensuring that actors with a very wide range of
viewpoints have the opportunity to express their views and to influence
collective choice. Unequal societies will select only over a much smaller range
of viewpoints --- those of powerful people. Yet Knight and Johnson do not
really talk about the mechanisms through which clashes between different actors
with different viewpoints result in better decision making. Without such a
theory, it could be that conflict between perspectives results in worse rather
than better problem solving. To make a good case for democracy, we not only
need to bring diverse points of view to the table, but show that
the &lt;em&gt;specific ways&lt;/em&gt; in which they are exposed to each other have
beneficial consequences for problem solving.&lt;/p&gt;

&lt;p&gt;There is micro-level work which speaks to this issue. Hugo Mercier and Dan
Sperber (2011) advance a purely 'argumentative' account of reasoning, on which
reasoning is not intended to reach right answers, but rather to evaluate the
weaknesses of others' arguments and come up with good arguments to support
one's own position. This explains both why confirmation bias and motivated
reasoning are rife, and why the quality of argument is significantly better
when actors engage in real debates. Experimentally, individual performance when
reasoning in non-argumentative settings is 'abysmal,' but is 'good' in
argumentative settings. This, in turn, means that groups are typically better
at solving problems than is the best individual within the group. Indeed,
where there is diversity of opinion, confirmation bias can have positive
consequences in pushing people to evaluate and improve their arguments in a
competitive setting.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When one is alone or with people who hold similar views, one's arguments will not be critically evaluated. This is when the confirmation bias is most likely to lead to poor outcomes. However, when reasoning is used in a more felicitous context &amp;mdash; that is, in arguments among people who disagree but have a common interest in the truth &amp;mdash; the confirmation bias contributes to an efficient form of &lt;em&gt;division of cognitive labor.&lt;/em&gt; When a group has to solve a problem, it is much more efficient if each individual looks mostly for arguments supporting a given solution. They can then present these arguments to the group, to be tested by the other members. This method will work as long as people can be swayed by good arguments, and the results reviewed ... show that this is generally the case. This joint dialogic approach is much more efficient than one where each individual on his or her own has to examine all possible solutions carefully (p. 65).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A separate line of research in experimental social psychology (Nemeth et al., 2004; Nemeth and Ormiston, 2007; Nemeth, 2012) indicates that problem-solving groups produce more solutions, which outsiders assess as better and more innovative, when they contain persistent dissenting minorities and are encouraged to engage in, rather than refrain from, mutual criticism. (Such effects can even be seen in school-children: see Mercer, 2000.) This, of course, makes a great deal of sense from Mercier and Sperber's perspective.&lt;/p&gt;

&lt;p&gt;This provides micro-level evidence that political argument will improve
problem solving, even if we are skeptical about human beings' ability to
abstract away from their specific circumstances and interests. Neither a
commitment to deliberation, nor even standard rationality is required for
argument to help solve problems. This has clear implications for democracy,
which forces actors with very different perspectives to engage with each
other's viewpoints. Even the most homogeneous-seeming societies contain great
diversity of opinion and of interest (the two are typically related) within
them. In a democracy, no single set of interests or perspectives is likely to
prevail on its own. Sometimes, political actors have to build coalitions with
others holding dissimilar views, a process which requires engagement between
these views. Sometimes, they have to publicly contend with others holding
opposed perspectives in order to persuade uncommitted others to favor their
interpretation, rather than another. Sometimes, as new issues arise, they have
to persuade even their old allies of how their shared perspectives should be
reinterpreted anew.&lt;/p&gt;

&lt;p&gt;More generally, many of the features of democracy that skeptical liberals
deplore are actually of considerable benefit. Mercier and Sperber's work
provides microfoundations for arguments about the benefits of political
contention, such as John Stuart Mill's, and of arguments for the benefits of
partisanship, such as Nancy Rosenblum's (2008) sympathetic critique and
reconstruction of Mill. Their findings suggest that the confirmation bias that
political advocates are subject to can have crucial benefits, so long as
it is tempered by the ability to evaluate good arguments in context.&lt;/p&gt;

&lt;p&gt;Other work suggests that the macro-structures of democracies too can have
benefits. Lazer and Friedman (2007) find on the basis of simulations that
problem solvers connected via linear networks (in which there are few links)
will find better solutions over the long run than problem solvers connected via
totally connected networks (in which all nodes are linked to each
other). In a totally connected network, actors copy the best immediately
visible solution quickly, driving out diversity from the system, while in a
linear network, different groups explore the space around different solutions
for a much longer period, making it more likely that they will identify better
solutions that were not immediately apparent. Here, the macro-level structure
of the network does the same kind of work that confirmation bias does in
Mercier and Sperber's work --- it preserves diversity and encourages actors to
keep exploring solutions that may not have immediate
payoffs.&lt;sup&gt;&lt;a href=&quot;#fn5&quot; class=&quot;footnoteRef&quot; id=&quot;fnref5&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
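&lt;p&gt;To make the mechanism concrete, here is a minimal Python sketch of this kind of simulation. It is only loosely in the spirit of Lazer and Friedman's model and does not reproduce their setup or parameters: the fully random landscape, the synchronous updating, and every name in the code (make_landscape, line_network, complete_network, simulate) are illustrative assumptions. Each agent copies its best-scoring neighbour when that neighbour is doing better, and otherwise tries flipping a single bit of its own solution.&lt;/p&gt;

```python
import random


def make_landscape(n_bits, rng):
    """Give every bit string an independent random payoff (a very rugged landscape)."""
    return [rng.random() for _ in range(2 ** n_bits)]


def line_network(n_agents):
    """Linear network: each agent sees only its immediate neighbours."""
    return [[j for j in (i - 1, i + 1) if j in range(n_agents)]
            for i in range(n_agents)]


def complete_network(n_agents):
    """Totally connected network: each agent sees every other agent."""
    return [[j for j in range(n_agents) if j != i] for i in range(n_agents)]


def simulate(neighbours, landscape, n_bits, rounds, rng):
    """Return the average payoff after each round (requires at least two agents)."""
    n_agents = len(neighbours)
    solutions = [rng.randrange(2 ** n_bits) for _ in range(n_agents)]
    history = []
    for _ in range(rounds):
        updated = []
        for i in range(n_agents):
            own = solutions[i]
            best_seen = max((solutions[j] for j in neighbours[i]),
                            key=lambda s: landscape[s])
            if landscape[best_seen] > landscape[own]:
                updated.append(best_seen)          # imitate a better neighbour
            else:
                trial = own ^ (2 ** rng.randrange(n_bits))  # flip one random bit
                better = landscape[trial] > landscape[own]
                updated.append(trial if better else own)
        solutions = updated                        # synchronous update
        history.append(sum(landscape[s] for s in solutions) / n_agents)
    return history


if __name__ == "__main__":
    rng = random.Random(0)
    landscape = make_landscape(10, rng)
    for name, net in [("line", line_network(20)), ("complete", complete_network(20))]:
        final = simulate(net, landscape, 10, 50, rng)[-1]
        print(name, round(final, 3))
```

&lt;p&gt;Running simulate with the same landscape on line_network(n) and complete_network(n), over many random seeds, is one way to compare how quickly each network settles and on what average score; a single run settles nothing either way.&lt;/p&gt;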

&lt;p&gt;This work offers a cognitive justification for the macro-level organization
of democratic life around political parties. Party politics tends to organize
debate into intense clusters of argument among people (partisans for the one or
the other party) who agree in broad outline about how to solve problems, but
who disagree vigorously about the specifics. Links between these clusters are
much rarer than links within them, and are usually mediated by
competition. Under a cognitive account, one might see each of these different
clusters as engaged in exploring the space of possibilities around a particular
solution, maintaining some limited awareness of other searches being performed
within other clusters, and sometimes discreetly borrowing from them in order to
improve competitiveness, but nonetheless preserving an essential level of
diversity (cf. Huckfeldt et al., 2004). Such very general considerations do not
justify any &lt;em&gt;specific&lt;/em&gt; partisan arrangement, as there may be better (or
worse) arrangements available. What they do is highlight how party organization
and party competition can have benefits that are hard or impossible to match in
a less clustered and more homogeneous social setting. Specifically, they show how
partisan arrangements can be better at solving complex problems than
non-partisan institutions, because they better preserve and better harness
diversity.&lt;/p&gt;

&lt;p&gt;This leads us to argue that democracy will be better able to solve complex
problems than either markets or hierarchy, for two reasons. First, democracy
embodies a commitment to political equality that the other two
macro-institutions do not. Clearly, actual democracies achieve political
equality more or less imperfectly. Yet if we are right, the better a democracy
is at achieving political equality, the better it will be, &lt;em&gt;ceteris
paribus&lt;/em&gt;, at solving complex problems. Second,
democratic &lt;em&gt;argument&lt;/em&gt;, which people use either to ally with or to attack
those with other points of view, is better suited to exposing different
perspectives to each other, and hence capturing the benefits of diversity, than
either markets or hierarchies. Notably, we do not make heroic claims about
people's ability to deliberate in some context that is free from faction and
self-interest. Instead, even under realistic accounts of how people argue,
democratic argument will have cognitive benefits, and indeed can transform
private vices (confirmation bias) into public virtues (the preservation of
cognitive diversity)&lt;sup&gt;&lt;a href=&quot;#fn6&quot; class=&quot;footnoteRef&quot;
id=&quot;fnref6&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;. Democratic structures --- such as political parties
--- that are often deplored turn out to have important cognitive
advantages.&lt;/p&gt;

&lt;h4 id=&quot;democratic-experimentalism-and-the-internet&quot;&gt;Democratic experimentalism and the Internet&lt;/h4&gt;

&lt;p&gt;As we have emphasized several times, we have no reason to think that
actually-existing democratic structures are as good as they could be, or even
close. If nothing else, designing institutions is, itself, a highly complex
problem, where even the most able decision-makers have little ability to
foresee the consequences of their actions. Even when an institution works well
at one time, the array of other institutions, social and physical conditions in
which it must function is constantly changing. Institutional design and reform,
then, is unavoidably a matter of more or less ambitious &quot;piecemeal social
experiments&quot;, to use the phrase of Popper (1957). As emphasized by Popper, and
independently by Knight and Johnson, one of the strengths of democracy is
its ability to make, monitor, and learn from such
experiments.&lt;sup&gt;&lt;a href=&quot;#fn7&quot;
class=&quot;footnoteRef&quot; id=&quot;fnref7&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; (Knight and Johnson particularly
emphasize the difficulty markets have in this task.) Democracies can, in fact,
experiment with their own arrangements.&lt;/p&gt;

&lt;p&gt;For several reasons, the rise of the Internet makes this an especially
propitious time for experimenting with democratic structures themselves. The
means available for communication and information-processing are obviously
going to change the possibilities for collective decision-making. (Bureaucracy
was not an option in the Old Stone Age, nor representative democracy without
something like cheap printing.) We do not yet &lt;em&gt;know&lt;/em&gt; the possibilities
of Internet-mediated communication for gathering dispersed knowledge, for
generating new knowledge, for complex problem-solving, or for collective
decision-making, but we really ought to find out.&lt;/p&gt;

&lt;p&gt;In fact, we are already starting to find out. People are building systems to
accomplish all of these tasks, in narrower or broader domains, for their own
reasons. Wikipedia is, of course, a famous example of allowing lots of
more-or-less anonymous people to concentrate dispersed information about an
immense range of subjects, and to do so both cheaply and
reliably&lt;sup&gt;&lt;a href=&quot;#fn8&quot;
class=&quot;footnoteRef&quot; id=&quot;fnref8&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;. Crucially, however, it is not
unique. News-sharing sites like Digg, Reddit, etc. are ways of focusing
collective attention and filtering vast quantities of information. Sites like
StackExchange have become a vital part of programming practice, because they
encourage the sharing of know-how about programming, with the same system
spreading to many other technical domains. The knowledge being aggregated
through such systems is not &lt;em&gt;tacit&lt;/em&gt;; rather, it is articulated and
discursive, but it was dispersed and is now shared. Similar systems are even
being used to develop new knowledge. One mode of this is open-source software
development, but it is also being used in experiments like the Polymath Project
for doing original mathematics
collaboratively&lt;sup&gt;&lt;a href=&quot;#fn9&quot;
class=&quot;footnoteRef&quot; id=&quot;fnref9&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;At a more humble level, there are the ubiquitous phenomena of mailing lists,
discussion forums, etc., etc., where people with similar interests discuss
them, on basically all topics of interest to people with enough resources to
get on-line. These are, largely inadvertently, experiments in developing
collective understandings, or at least shared and structured disagreements,
about these topics.&lt;/p&gt;

&lt;p&gt;All such systems have to face tricky problems of coordinating their
computational architecture, their social organization, and their cognitive
functions (Shalizi, 2007; Farrell and Schwartzberg, 2008). They need ways of
making findings (or claims) accessible, of keeping discussion productive, and
so forth and so on. (Often, participants are otherwise strangers to each other,
which is at the least suggestive of the problems of trust and motivation which
will face efforts to make mass democracy more participative.) This opens up an
immense design space, which is still very poorly understood --- but almost
certainly presents a rugged search landscape, with an immense number of local
maxima and no very obvious path to the true peaks. (It is even possible that
the landscape, and so the peaks, could vary with the subject under debate.) One
of the great aspects of the current moment, for cognitive democracy, is that it
has become (comparatively) very cheap and easy for such experiments to be made
online, so that this design space can be explored.&lt;/p&gt;

&lt;p&gt;There are also online ventures which are failures, and these, too, are
informative. They range from poorly-designed sites which never attract (or
actively repel) a user base, or produce much of value, to online groupings
which are very successful in their own terms, but are, cognitively, full of
fail, such as thriving communities dedicated to conspiracy theories. These are
not just random, isolated eccentrics,
but &lt;A href=&quot;../notebooks/psychoceramics.html&quot;&gt;highly structured communities
engaged in sharing and developing ideas, which just so happen to be &lt;em&gt;very
bad&lt;/em&gt; ideas&lt;/a&gt;. (See, for instance, Bell et al. (2006)
on &lt;a href=&quot;514.html&quot;&gt;the networks of those who share delusions that their
minds are being controlled by outside forces&lt;/a&gt;.) If we want to understand
what makes successful online institutions work, and perhaps even draw lessons
for institutional design more generally, it will help tremendously to contrast
the successes with such failures.&lt;/p&gt;

&lt;p&gt;The other great aspect for learning right now is that all these experiments
are leaving incredibly detailed records. People who use these sites or systems
leave detailed, machine-accessible traces of their interactions with each
other, even ones which &lt;em&gt;tell us about what they were thinking&lt;/em&gt;. This is
an unprecedented flood of detail about experiments with collective cognition,
and indeed with all kinds of institutions, and about how well they served
various functions. Not only can we &lt;em&gt;observe&lt;/em&gt; successes
and failures, but we can probe the mechanisms behind those outcomes.&lt;/p&gt;

&lt;p&gt;This points, we think, to a very clear constructive agenda. To exaggerate a
little, it is to see how far the Internet enables modern democracies to make as
much use of their citizens' minds as did Ober's Athens. We want to learn from
existing online ventures in collective cognition and decision-making. We want
to treat these ventures as, more or less, spontaneous
experiments&lt;sup&gt;&lt;a href=&quot;#fn10&quot; class=&quot;footnoteRef&quot; id=&quot;fnref10&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;,
and compare the successes and failures (including &lt;em&gt;partial&lt;/em&gt; successes and
failures) to learn about institutional mechanisms which work well at harnessing
the cognitive diversity of large numbers of people who do not know each other
well (or at all), and meet under conditions of relative equality, not
hierarchy. If this succeeds, what we learn from this will provide the basis for
experimenting with the re-design of democratic institutions themselves.&lt;/p&gt;

&lt;p&gt;We have, implicitly, been viewing institutions through the lens of
information-processing. To be explicit, the human actions and interactions
which instantiate an institution also implement abstract computations
(&lt;a href=&quot;../reviews/cognition-in-the-wild/&quot;&gt;Hutchins, 1995&lt;/a&gt;). Especially when designing institutions for collective
cognition and decision-making, it is important to understand them as
computational processes. This brings us to our concluding suggestions about
some of the ways social science and computer science can help each other.&lt;/p&gt;

&lt;p&gt;Hong and Page's work provides a particularly clear, if abstract,
formalization of the way in which diverse individual perspectives or heuristics
can combine for better problem-solving. This observation is highly familiar in
machine learning, where the large and rapidly-growing class of &quot;ensemble
methods&quot; works, explicitly, by combining multiple imperfect models, which helps
only because the models are different (Domingos, 1999) --- in some cases it
helps &lt;em&gt;exactly to the extent&lt;/em&gt; that the models are different (Krogh and
Vedelsby, 1995). Different ensemble techniques correspond to different
assumptions about the capacities of individual learners, and how to combine or
communicate their predictions. The latter are typically extremely simplistic,
and understanding the possibilities of non-trivial organizations for learning
seems like a crucial question for both machine learning and for social
science.&lt;/p&gt;
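&lt;p&gt;The &quot;exactly to the extent&quot; claim can be checked numerically. For an ensemble that predicts with a weighted average of its members, and with error measured as squared error, the ensemble's error equals the weighted average of the members' errors minus the weighted average of their squared disagreement with the ensemble (the &quot;ambiguity&quot; term of Krogh and Vedelsby). The sketch below is only an illustration of that identity for a single prediction; the function name and the toy numbers are assumptions, not anything taken from the cited papers.&lt;/p&gt;

```python
def ambiguity_decomposition(predictions, target, weights=None):
    """Return (ensemble_error, average_member_error, ambiguity) for squared error.

    The weights are assumed to be non-negative and to sum to one;
    equal weights are used when none are given.
    """
    n = len(predictions)
    if weights is None:
        weights = [1.0 / n] * n
    pairs = list(zip(weights, predictions))
    # The ensemble prediction is the weighted average of the members.
    ensemble = sum(w * f for w, f in pairs)
    # How wrong the members are on average, taken one at a time.
    average_member_error = sum(w * (f - target) ** 2 for w, f in pairs)
    # How much the members disagree with the ensemble (their diversity).
    ambiguity = sum(w * (f - ensemble) ** 2 for w, f in pairs)
    ensemble_error = (ensemble - target) ** 2
    return ensemble_error, average_member_error, ambiguity


if __name__ == "__main__":
    # Two models that err in opposite directions: disagreement cancels the error.
    print(ambiguity_decomposition([1.0, 3.0], target=2.0))   # (0.0, 1.0, 1.0)
    # Two identical models: no disagreement, so averaging gains nothing.
    print(ambiguity_decomposition([3.0, 3.0], target=2.0))   # (1.0, 1.0, 0.0)
```

&lt;p&gt;In both toy cases the first number equals the second minus the third: the members are individually just as wrong in each case, and only their disagreement separates a perfect ensemble from a useless one.&lt;/p&gt;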

&lt;h4 id=&quot;conclusions-cognitive-democracy&quot;&gt;Conclusions: Cognitive Democracy&lt;/h4&gt;

&lt;p&gt;Democracy, we have argued, has a capacity unmatched among other macro-structures to actually experiment, and to make use of cognitive diversity in solving complex problems. To make the best use of these potentials, democratic structures must themselves be shaped so that social interaction and cognitive function reinforce each other. But the cleverest institutional design in the world will not help unless the resources --- material, social, cultural --- needed for participation are actually broadly shared. This is not, or not just, about being nice or equitable; cognitive diversity is itself a resource, a source of power, and not something we can afford to waste.&lt;/p&gt;

&lt;h4 id=&quot;partial-bibliography&quot;&gt;[Partial] Bibliography&lt;/h4&gt;

&lt;p&gt;Badii, Remo and Antonio Politi (1997). &lt;em&gt;Complexity: Hierarchical Structures and Scaling in Physics&lt;/em&gt;, Cambridge, England: Cambridge University Press.&lt;/p&gt;
&lt;p&gt;Bell, Vaughan, Cara Maiden, Antonio Mu&amp;ntilde;oz-Solomando and Venu Reddy (2006). &quot;'Mind Control' Experience on the Internet: Implications for the Psychiatric Diagnosis of Delusions&quot;, &lt;em&gt;Psychopathology&lt;/em&gt;, vol. 39, pp. 87--91, http://arginine.spc.org/vaughan/Bell_et_al_2006_Preprint.pdf&lt;/p&gt;
&lt;p&gt;Blume, Lawrence and David Easley (2006). &quot;If You're so Smart, Why Aren't You Rich? Belief Selection in Complete and Incomplete Markets&quot;, &lt;em&gt;Econometrica&lt;/em&gt;, vol. 74, pp. 929--966, http://www.santafe.edu/media/workingpapers/01-06-031.pdf&lt;/p&gt;
&lt;p&gt;Bowles, Samuel and Suresh Naidu (2008). &quot;Persistent Institutions&quot;, working paper 08-04-015, Santa Fe Institute, http://tuvalu.santafe.edu/~bowles/PersistentInst.pdf&lt;/p&gt;
&lt;p&gt;Domingos, Pedro (1999). &quot;The Role of Occam's Razor in Knowledge Discovery&quot;, &lt;em&gt;Data Mining and Knowledge Discovery&lt;/em&gt;, vol. 3, pp. 409--425, http://www.cs.washington.edu/homes/pedrod/papers/dmkd99.pdf&lt;/p&gt;
&lt;p&gt;Farrell, Henry and Melissa Schwartzberg (2008). &quot;Norms, Minorities, and Collective Choice Online&quot;, &lt;em&gt;Ethics and International Affairs&lt;/em&gt; 22:4, http://www.carnegiecouncil.org/resources/journal/22_4/essays/002.html&lt;/p&gt;
&lt;p&gt;Hacker, Jacob S. and Paul Pierson (2010). &lt;em&gt;Winner-Take-All Politics: How Washington Made the Rich Richer --- And Turned Its Back on the Middle Class&lt;/em&gt;. New York: Simon and Schuster.&lt;/p&gt;
&lt;p&gt;Hayek, Friedrich A. (1948). &lt;em&gt;Individualism and Economic Order&lt;/em&gt;. Chicago: University of Chicago Press.&lt;/p&gt;
&lt;p&gt;Hong, Lu and Scott E. Page (2004). &quot;Groups of diverse problem solvers can outperform groups of high-ability problem solvers&quot;, &lt;em&gt;Proceedings of the National Academy of Sciences&lt;/em&gt;, vol. 101, pp. 16385--16389, http://www.cscs.umich.edu/~spage/pnas.pdf&lt;/p&gt;
&lt;p&gt;Huckfeldt, Robert, Paul E. Johnson and John Sprague (2004). &lt;em&gt;Political Disagreement: The Survival of Diverse Opinions within Communication Networks&lt;/em&gt;, Cambridge, England: Cambridge University Press.&lt;/p&gt;
&lt;p&gt;Hutchins, Edwin (1995). &lt;em&gt;Cognition in the Wild&lt;/em&gt;. Cambridge, Massachusetts: MIT Press.&lt;/p&gt;
&lt;p&gt;Judd, Stephen, Michael Kearns and Yevgeniy Vorobeychik (2010). &quot;Behavioral dynamics and influence in networked coloring and consensus&quot;, &lt;em&gt;Proceedings of the National Academy of Sciences&lt;/em&gt;, vol. 107, pp. 14978--14982, doi:10.1073/pnas.1001280107&lt;/p&gt;
&lt;p&gt;Knight, Jack (1992). &lt;em&gt;Institutions and Social Conflict&lt;/em&gt;, Cambridge, England: Cambridge University Press.&lt;/p&gt;
&lt;p&gt;Knight, Jack and James Johnson (2011). &lt;em&gt;The Priority of Democracy: Political Consequences of Pragmatism&lt;/em&gt;, Princeton: Princeton University Press.&lt;/p&gt;
&lt;p&gt;Krogh, Anders and Jesper Vedelsby (1995). &quot;Neural Network Ensembles, Cross Validation, and Active Learning&quot;, pp. 231--238 in G. Tesauro et al. (eds.), &lt;em&gt;Advances in Neural Information Processing Systems&lt;/em&gt; 7 [NIPS 1994], http://books.nips.cc/papers/files/nips07/0231.pdf&lt;/p&gt;
&lt;p&gt;Laughlin, Patrick R. (2011). &lt;em&gt;Group Problem Solving&lt;/em&gt;. Princeton: Princeton University Press.&lt;/p&gt;
&lt;p&gt;Lazer, David and Allan Friedman (2007). &quot;The Network Structure of Exploration and Exploitation&quot;, &lt;em&gt;Administrative Science Quarterly&lt;/em&gt;, vol. 52, pp. 667--694, http://www.hks.harvard.edu/davidlazer/files/papers/Lazer_Friedman_ASQ.pdf&lt;/p&gt;
&lt;p&gt;Lindblom, Charles (1982). &quot;The Market as Prison&quot;, &lt;em&gt;The Journal of Politics&lt;/em&gt;, vol. 44, pp. 324--336, http://www.jstor.org/pss/2130588&lt;/p&gt;
&lt;p&gt;Mason, Winter A., Andy Jones and Robert L. Goldstone (2008). &quot;Propagation of Innovations in Networked Groups&quot;, &lt;em&gt;Journal of Experimental Psychology: General&lt;/em&gt;, vol. 137, pp. 427--433, doi:10.1037/a0012798&lt;/p&gt;
&lt;p&gt;Mason, Winter A. and Duncan J. Watts (2012). &quot;Collaborative Learning in Networks&quot;, &lt;em&gt;Proceedings of the National Academy of Sciences&lt;/em&gt;, vol. 109, pp. 764--769, doi:10.1073/pnas.1110069108&lt;/p&gt;
&lt;p&gt;Mercer, Neil (2000). &lt;em&gt;Words and Minds: How We Use Language to Think Together&lt;/em&gt;. London: Routledge.&lt;/p&gt;
&lt;p&gt;Mitchell, Melanie (1996). &lt;em&gt;An Introduction to Genetic Algorithms&lt;/em&gt;, Cambridge, Massachusetts: MIT Press.&lt;/p&gt;
&lt;p&gt;Moore, Cristopher and Stephan Mertens (2011). &lt;em&gt;The Nature of Computation&lt;/em&gt;. Oxford: Oxford University Press.&lt;/p&gt;
&lt;p&gt;Nelson, Richard R. and Sidney G. Winter (1982). &lt;em&gt;An Evolutionary Theory of Economic Change&lt;/em&gt;. Cambridge, Massachusetts: Harvard University Press.&lt;/p&gt;
&lt;p&gt;Page, Scott E. (2011). &lt;em&gt;Diversity and Complexity&lt;/em&gt;, Princeton: Princeton University Press.&lt;/p&gt;
&lt;p&gt;Pfeffer, Jeffrey and Robert I. Sutton (2006). &lt;em&gt;Hard Facts, Dangerous Half-Truths and Total Nonsense: Profiting from Evidence-Based Management&lt;/em&gt;. Boston: Harvard Business School Press.&lt;/p&gt;
&lt;p&gt;Popper, Karl R. (1957). &lt;em&gt;The Poverty of Historicism&lt;/em&gt;, London: Routledge.&lt;/p&gt;
&lt;p&gt;Popper, Karl R. (1963). &lt;em&gt;Conjectures and Refutations: The Growth of Scientific Knowledge&lt;/em&gt;, London: Routledge.&lt;/p&gt;
&lt;p&gt;Salganik, Matthew J., Peter S. Dodds and Duncan J. Watts (2006). &quot;Experimental study of inequality and unpredictability in an artificial cultural market&quot;, &lt;em&gt;Science&lt;/em&gt;, vol. 311, pp. 854--856, http://www.princeton.edu/~mjs3/musiclab.shtml&lt;/p&gt;
&lt;p&gt;Shalizi, Cosma Rohilla (2007). &quot;Social Media as Windows on the Social Life of the Mind&quot;, in &lt;em&gt;AAAI 2008 Spring Symposia: Social Information Processing&lt;/em&gt;, http://arxiv.org/abs/0710.4911&lt;/p&gt;
&lt;p&gt;Shalizi, Cosma Rohilla, Kristina Lisa Klinkner and Robert Haslinger (2004). &quot;Quantifying Self-Organization with Optimal Predictors&quot;, &lt;em&gt;Physical Review Letters&lt;/em&gt;, vol. 93, art. 118701, http://arxiv.org/abs/nlin.AO/0409024&lt;/p&gt;
&lt;p&gt;Shalizi, Cosma Rohilla and Andrew C. Thomas (2011). &quot;Homophily and Contagion Are Generically Confounded in Observational Social Network Studies&quot;, &lt;em&gt;Sociological Methods and Research&lt;/em&gt;, vol. 40, pp. 211--239, http://arxiv.org/abs/1004.4704&lt;/p&gt;
&lt;p&gt;Shapiro, Carl and Hal R. Varian (1998). &lt;em&gt;Information Rules: A Strategic Guide to the Network Economy&lt;/em&gt;, Boston: Harvard Business School Press.&lt;/p&gt;
&lt;p&gt;Stiglitz, Joseph E. (2000). &quot;The Contributions of the Economics of Information to Twentieth Century Economics&quot;, &lt;em&gt;The Quarterly Journal of Economics&lt;/em&gt;, vol. 115, pp. 1441--1478, http://www2.gsb.columbia.edu/faculty/jstiglitz/download/papers/2000_Contributions_of_Economics.pdf&lt;/p&gt;
&lt;p&gt;Sunstein, Cass R. (2003). &lt;em&gt;Why Societies Need Dissent&lt;/em&gt;. Cambridge, Massachusetts: Harvard University Press.&lt;/p&gt;
&lt;p&gt;Swartz, Aaron (2006). &quot;Who Writes Wikipedia?&quot;, http://www.aaronsw.com/weblog/whowriteswikipedia&lt;/p&gt;

&lt;div class=&quot;footnotes&quot;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&quot;fn1&quot;&gt;&lt;p&gt;Two qualifications are in order. First, we don't think that justice and social order are unimportant. If our arguments imply social institutions that are either profoundly unjust or likely to cause socially devastating instability, they are open to challenge on these alternative normative criteria. Second, our normative arguments about what these institutions are &lt;em&gt;good for&lt;/em&gt; should not be taken as an empirical statement about how these institutions have &lt;em&gt;come into being.&lt;/em&gt; Making institutions, like making sausages and making laws, is usually an unpleasant process.&lt;a href=&quot;#fnref1&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn2&quot;&gt;&lt;p&gt;Much more could of course be said about the meaning of the term &quot;complexity&quot;. In particular, it may later be useful to look at formal measures of the intrinsic complexity of problems in terms of the resources required to solve them (&quot;computational complexity&quot; theory, see Moore and Mertens), or the degree of behavioral flexibility of systems, such as interacting decision-makers (Badii and Politi; Shalizi, Klinkner and Haslinger). We should also note here that several decades of work in experimental psychology indicates that groups are better at problem-solving than the best individuals within the group (Laughlin, 2011). We do not emphasize this interesting experimental tradition, however, because it is largely concerned with problems which are, in our terms, rather simple, and so suitable to the psychology laboratory.&lt;a href=&quot;#fnref2&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn3&quot;&gt;&lt;p&gt;Imagine trying to discover whether a locally-grown tomato in Pittsburgh is better, from the point of view of greenhouse-gas emission, than one imported from Florida. After working out the differences in emissions from transport, one has to consider the emissions involved in growing the tomatoes in the first place, the emissions-cost of producing different fertilizers, farm machinery, etc., etc. The problem quickly becomes intractable --- and this is before a consumer with limited funds must decide how much a ton of emitted carbon dioxide is worth to them. Let there be a price on greenhouse-gas emission, however, and the whole informational problem disappears, or rather gets solved implicitly by ordinary market interactions.&lt;a href=&quot;#fnref3&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn4&quot;&gt;&lt;p&gt;&quot;Thus bridges are built; harbours open'd; ramparts rais'd; canals form'd; fleets equip'd; and armies disciplin'd every where, by the care of government, which, tho' compos'd of men subject to all human infirmities, becomes, by one of the finest and most subtle inventions imaginable, a composition, which is, in some measure, exempted from all these infirmities.&quot; --- Hume, &lt;em&gt;Treatise of Human Nature&lt;/em&gt;, book III, part II, sect. vii.&lt;a href=&quot;#fnref4&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn5&quot;&gt;&lt;p&gt;Broadly similar results have come from experiments on learning and problem-solving in controlled networks of human subjects in the laboratory (Mason et al., 2008; Judd et al., 2010; Mason and Watts, 2012). However, we are not aware of experiments on human subjects which have deliberately varied network structure in a way directly comparable to Lazer and Friedman's simulations. We also note that using multiple semi-isolated sub-populations (&quot;islands&quot;) is a common trick in evolutionary optimization, precisely to prevent premature convergence on sub-optimal solutions (Mitchell, 1996).&lt;a href=&quot;#fnref5&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn6&quot;&gt;&lt;p&gt;This resonates with Karl Popper's insistence (1957, 1963) that, to the extent science is rational and objective, it is not because individual scientists are disinterested, rational, etc. --- he knew perfectly well that individual scientists are often pig-headed and blinkered --- but because of the way the social organization of scientific communities channels scientists' ambition and contentiousness. The reliability of science is an emergent property of scientific institutions, not of scientists.&lt;a href=&quot;#fnref6&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn7&quot;&gt;&lt;p&gt;Bureaucracies can do experiments, such as field trials of new policies, or &quot;A/B&quot; tests of new procedures, now quite common with Internet companies. (See, e.g., the discussion of such experiments in Pfeffer and Sutton.) Power hierarchies, however, are big obstacles to experimenting with options which would upset those power relations, or threaten the interests of those high in the hierarchy. Market-based selection of variants (explored by Nelson and Winter, 1982) also has serious limits (see e.g., Blume and Easley). There are, after all, many reasons why there are no markets in alternative institutions. E.g., even if such a market could get started, it would be a prime candidate for efficiency-destroying network externalities, leading at best to monopolistic competition. (Cf.&amp;nbsp;Shapiro and Varian's advice to businesses about manipulating standards-setting processes.)&lt;a href=&quot;#fnref7&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn8&quot;&gt;&lt;p&gt;Empirically, most of the &lt;em&gt;content&lt;/em&gt; of Wikipedia seems to come from a large number of users each of whom makes a substantial contribution or contributions to a very small number of articles. The needed formatting, clean-up, coordination, etc., on the other hand, comes disproportionately from a rather small number of users very dedicated to Wikipedia (see Swartz, 2006). On the role of internal norms and power in the way Wikipedia works, see Farrell and Schwartzberg (2008).&lt;a href=&quot;#fnref8&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn9&quot;&gt;&lt;p&gt;For an enthusiastic and intelligent account of ways in which the Internet might be used to enhance the practice of science, see Nielsen. (We cannot adequately explore, here, how scientific disciplines fit into our account of institutions and democratic processes.)&lt;a href=&quot;#fnref9&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&quot;fn10&quot;&gt;&lt;p&gt;Obviously, the institutions people volunteer to participate in on-line will depend on their pre-existing characteristics, and it would be naive to ignore this. We cannot here go into strategies for causal inference in the face of such endogenous selection bias, which is pretty much inescapable in social networks (Shalizi and Thomas, 2011). &lt;em&gt;Deliberate&lt;/em&gt; experimentation with online institutional arrangements is attractive, if it could be done effectively and ethically (cf. Salganik et al., 2006).&lt;a href=&quot;#fnref10&quot;&gt;[return to main text]&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_the_collective_use-and_evolution_of_concepts.html&quot;&gt;The Collective Use and Evolution of Concepts&lt;/a&gt;;
&lt;a href=&quot;cat_the_progressive_forces.html&quot;&gt;The Progressive Forces&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>If Peer Review Did Not Exist, We Would Have to Invent Something Very Like It to Serve Highly Similar Ends</title>
    <link>http://bactra.org/weblog/916.html</link>
    <description>
&lt;blockquote&gt;&lt;em&gt;Attention conservation notice&lt;/em&gt;: 1400 words on a friend's
proposal to do away with peer review, written many weeks ago when there was
actually some debate about this.&lt;/blockquote&gt;

&lt;P&gt;&lt;a href=&quot;http://www.stat.cmu.edu/~larry/&quot;&gt;Larry&lt;/a&gt; is writing about peer
review (&lt;a href=&quot;395.html&quot;&gt;again&lt;/a&gt;), this time to advocate
&quot;&lt;a href=&quot;http://www.stat.cmu.edu/~larry/Peer-Review.pdf&quot;&gt;A World Without
Referees&lt;/a&gt;&quot;.  Every scientist, of course, has day-dreamed about this, in a
first-lets-kill-all-the-lawyers way, but Larry is serious, so let's treat this
seriously.  I'm not going to summarize his argument; it's short and you can and
should go read it yourself.

&lt;P&gt;I think it helps, when thinking about this, to separate two functions
peer-reviewed journals and conferences have traditionally served.  One is
spreading claims (dissemination), and the other is letting readers know about
claims worthy of their attention (certification).

&lt;P&gt;Arxiv, or something like it, can take over dissemination handily.  Making
copies of papers is now very cheap and very fast, so we no longer have to be
choosy about which ones we disseminate.  In physics, this use of Arxiv is just
as well-established as Larry says.  In fact, one reason Arxiv was able to
establish itself so rapidly and thoroughly among physicists was that they
already had a well-entrenched culture of circulating preprints long before
journal publication.  What Arxiv did was make this public and universally
accessible.

&lt;P&gt;But physicists still rely on journals for certification.  People pay more
attention to papers which come out in &lt;cite&gt;Physical Review Letters&lt;/cite&gt;, or
even just &lt;cite&gt;Physical Review E&lt;/cite&gt;, than ones which are only on Arxiv.
&quot;Could it make it past peer review?&quot; is used by many people as a filter to weed
out the stuff which is too wrong or too unimportant to bother with.  This
doesn't work so well for those directly concerned with a particular research
topic, but if something is only peripherally of interest, it makes a lot of
sense.

&lt;P&gt;Even within a specialized research community, consisting entirely of experts
who can evaluate new contributions on their own, there is a
rankling &lt;em&gt;inefficiency&lt;/em&gt; to the world without referees.  Larry talks
about spending a minute or two looking at new stats. papers on Arxiv every day.
But everyone filtering Arxiv for themselves is going to get harder and harder
as more potentially-relevant stuff gets put on it.  I'm interested in
information theory, so I've long looked
at &lt;a href=&quot;http://arxiv.org/list/cs.IT/recent&quot;&gt;cs.IT&lt;/a&gt;, and it's become
notably more time-consuming as that community has embraced the Arxiv.  Yet
within any given epistemic community, lots of people are going to be applying
very similar filters.  So the world-without-referees has an increasing amount of
work being done by individuals, but a lot of that work is redundant.
Efficiency, the division of labor, points to having a few people put their time
into filtering, and the rest of us relying on it, even when in principle we
could do the filtering ourselves.  To be fair, of course, we should probably
take this job in turns...

&lt;P&gt;So: if all papers get put on Arxiv, filtering becomes a big job, so
efficiency pushes us towards having only some members of the research community
do the filtering for the rest.  We have re-invented something &lt;em&gt;very
much&lt;/em&gt; like peer review, purely so that our lives are not completely
consumed by evaluating new papers, and we can actually get some work done.

&lt;P&gt;Larry's proposal for a world without referees also doesn't seem to take into
account the needs of researchers to rely on findings in fields in which they
are not experts, and so can't act as their own filters.  (Or they could if they
put in a few years in something else first.)  If I need some result from
neuroscience, or for that matter from topology, I do not have the time to spend
becoming a neuroscientist or topologist, and it is an immense benefit to have
institutions I can trust to tell me &quot;these claims about cortical columns, or
locally compact Hausdorff spaces, are at least not crazy&quot;.  This is also a kind
of filtering, and there is the same push, based on the division of labor, to
rely on only &lt;em&gt;some&lt;/em&gt; neuroscientists or topologists to do the filtering
for outsiders (or all of them only some of the time), and again we have
re-created something very much like refereeing.

&lt;P&gt;So: &lt;em&gt;some&lt;/em&gt; form or forms of filtering is inevitable, and the forces
pushing for a division of labor in filtering are very strong.  I don't know of
any reason to think that the current, historically-evolved peer review system
is the best way of organizing this cognitive triage, but we're not going to
avoid having some such system, nor should we want to.  Different ways of
organizing the work of filtering will have different costs and benefits, but we
should be talking about those and those trade-offs, not hoping that we can just
wish the problem away now that making copies is cheap&lt;sup&gt;&lt;a href=&quot;#copies&quot;
name=&quot;copies-back&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.  It's not at all obvious, for instance, that
attention-filtering for the internal benefit of members of a research community
should be done in the same way as reliability-filtering for outsiders.  But, to
repeat, we &lt;em&gt;are&lt;/em&gt; going to have filters and they are almost certainly
going to involve a division of labor.

&lt;P&gt;Lenin, supposedly, said that &quot;small production engenders capitalism and the
bourgeoisie daily, hourly, spontaneously and on a mass scale&quot;
(Nove, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/0044460155&quot;&gt;The
Economics of Feasible Socialism Revisited&lt;/a&gt;&lt;/cite&gt;, p. 46).  Whether he was
right about the bourgeoisie or not, the rate of production of the scientific
literature, the similarity of interests and standards within a community, and the
need to rely on other fields' findings are all going to engender
refereeing-like institutions, &quot;daily, hourly, spontaneously and on a mass
scale&quot;.  I don't &lt;em&gt;think&lt;/em&gt; Larry would go to the same lengths to get rid
of referees that Lenin went to get rid of the bourgeoisie, but in any case the
truly progressive course is not to suppress the old system by force, but to
provide a superior alternative.

&lt;P&gt;Speaking personally, I am attracted to a scenario we might call &quot;peer review
among consenting adults&quot;.  Let anyone put anything on Arxiv (modulo the usual
crank-screen). But then let others create filtered versions, applying such
standards of topic, rigor, applicability, writing quality, etc., as they please
--- and be explicit about what those standards are.  These can be layered as
deep as their audience can support.  Presumably the later filters would be
intended for those further from active research in the area, and so would be
less tolerant of false alarms, and more tolerant of missing possible
discoveries, than the filters for those close to the work.  But this could be
an area for experiment, and for seeing what people actually find useful.  This
is, I take it, more or less what &lt;a href=&quot;459.html&quot;&gt;Paul Ginsparg&lt;/a&gt; proposes,
and it has a lot to recommend it.  Every contribution is available if anyone
wants to read it, but no one is compelled to try to filter the whole flow of
the scholarly literature unaided, and human intelligence can still be used to
amplify interesting signals, or even to improve papers.

&lt;P&gt;Attractive as I find this idea, I am not saying it is historically
inevitable, or even the best possible way of ordering these matters.  The main
point is that peer review does some very important jobs for the community of
inquirers (whether or not it evolved to do them), and that if we want to get
rid of it, it would be a good idea to have something else ready to do those
jobs.


&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a name=&quot;copies&quot;&gt;[1]&lt;/a&gt;: For instance, many people have suggested that
referees should have to take responsibility, in some way, for their reports, so
that those who do sloppy or ignorant or merely-partisan work will be at least
shamed.  There is genuinely a lot to be said for this.  But it does run into
the conflicting demand that science should not be a respecter of persons --- if
Grand Poo-Bah X writes a crappy paper, people should be able to call X on it,
without fear of retribution or considering the (inevitable) internal politics
of the discipline and the job-market.  I do not know if there is a way to
reconcile these, but that's one of the kinds of trade-offs we have to consider
as we try to re-design this institution. &lt;a href=&quot;#copies-back&quot;&gt;^&lt;/a&gt;&lt;/span&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_learned_folly.html&quot;&gt;Learned Folly&lt;/a&gt;;
&lt;a href=&quot;cat_incestuous_amplification.html&quot;&gt;Kith and Kin&lt;/a&gt;;
&lt;a href=&quot;cat_the_collective_use-and_evolution_of_concepts.html&quot;&gt;The Collective Use and Evolution of Concepts&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Ten Years of Monster Raving Egomania and Utter Batshit Insanity</title>
    <link>http://bactra.org/weblog/915.html</link>
    <description>
&lt;P&gt;Sometimes, all you can do is quote verbatim* from your inbox:


&lt;blockquote&gt;&lt;pre&gt;
Date: Tue, 17 Apr 2012 09:31:57 -0400
From: Stephen Wolfram
To: Cosma Shalizi
Subject: 10-year followup on &quot;A New Kind of Science&quot;

Next month it'll be 10 years since I published &quot;A New Kind of Science&quot;
... and I'm planning to take stock of the decade of commentary, feedback and
follow-on work about the book that's appeared.

My archives show that you wrote an early review of the book:
http://www.cscs.umich.edu/~crshalizi/reviews/wolfram/

At the time reviews like yours appeared, most of the modern web apparatus
for response and public discussion had not yet developed.  But now it has,
and there seems to be considerable interest in the community in me using
that venue to give my responses and comments to early reviews.

I'm writing to ask if there's more you'd like to add before I embark on my
analysis in the next week or so.

I'd like to take this opportunity to thank you for the work you put into
writing a review of my book.  I know it was a challenge to review a book of
its size, especially quickly.  I plan to read all reviews with forbearance,
and hope that---especially leavened by the passage of a decade---useful
intellectual points can be derived from discussing them.

If you don't have anything to add to your early review, it'd be very helpful
to know that as soon as possible.

Thanks in advance for your help.

-- Stephen Wolfram

P.S. Nowadays you can find the whole book online at
http://www.wolframscience.com/nksonline/toc.html  If you'd like a new
physical copy, just let me know and I can have it shipped...


&lt;/pre&gt;&lt;/blockquote&gt;

&lt;P&gt;I wrote &lt;a href=&quot;http://bactra.org/reviews/wolfram/&quot;&gt;my review&lt;/a&gt; in
2002 (though I didn't &lt;a href=&quot;387.html&quot;&gt;put it out until 2005&lt;/a&gt;).  The idea
that complex patterns can arise from simple rules was already old then, and has
only become more commonplace since.  A lot of interesting, substantive,
specific science has been done on that theme in the ensuing decade.  To this
effort, neither Wolfram nor his book has contributed anything of any note.
The one respect in which I was overly pessimistic is that I have not, in fact,
had to spend much time &quot;de-programming students [who] read &lt;cite&gt;A New Kind of
Science&lt;/cite&gt; before knowing any better&quot; &amp;mdash; but I get a rather different
class of students these days than I did in 2002.

&lt;P&gt;Otherwise, and for the record, I do indeed still stand behind the review.

&lt;P&gt;&lt;em&gt;Manual trackback&lt;/em&gt;: &lt;a href=&quot;http://news.ycombinator.com/item?id=3926956&quot;&gt;Hacker News&lt;/a&gt;; &lt;a href=&quot;http://wbmh.blogspot.com/2012/05/nice-guys-in-physics.html&quot;&gt;Wolfgang&lt;/a&gt;; &lt;A href=&quot;http://andrewgelman.com/2012/05/picking-on-stephen-wolfram/&quot;&gt;Andrew Gelman&lt;/a&gt;


&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt; *: I removed our e-mail addresses, because no one deserves spam.&lt;/span&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_selfcentered.html&quot;&gt;Self-Centered&lt;/a&gt;;
&lt;a href=&quot;cat_complexity.html&quot;&gt;Complexity&lt;/a&gt;;
&lt;a href=&quot;cat_psychoceramica.html&quot;&gt;Psychoceramica&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Installing &lt;tt&gt;pcalg&lt;/tt&gt;</title>
    <link>http://bactra.org/weblog/914.html</link>
    <description>
&lt;blockquote&gt;&lt;em&gt;Attention conservation notice&lt;/em&gt;: Boring details about
getting finicky statistical software to work; or, please read the friendly
manual.&lt;/blockquote&gt;

&lt;P&gt;Some of my students are finding it difficult to install
the &lt;a href=&quot;ftp://ftp.math.ethz.ch/sfs/Manuscripts/buhlmann/pcalg-software.pdf&quot;&gt;R
package &lt;tt&gt;pcalg&lt;/tt&gt;&lt;/a&gt;; I share these instructions in case others are also
in difficulty.

&lt;ol&gt;
&lt;li&gt; For representing graphs, &lt;tt&gt;pcalg&lt;/tt&gt; relies on two packages
called &lt;tt&gt;&lt;a href=&quot;http://bioconductor.org/packages/release/bioc/html/RBGL.html&quot;&gt;RBGL&lt;/a&gt;&lt;/tt&gt;
and &lt;tt&gt;&lt;a href=&quot;http://bioconductor.org/packages/release/bioc/html/graph.html&quot;&gt;graph&lt;/a&gt;&lt;/tt&gt;.
These are not available on &lt;a href=&quot;http://cran.r-project.org/&quot;&gt;CRAN&lt;/a&gt;, but
rather are on the &lt;em&gt;other&lt;/em&gt; R software
repository, &lt;a href=&quot;http://bioconductor.org/&quot;&gt;BioConductor&lt;/a&gt;.  To install
them, follow the instructions at those links; to summarize, run this:

&lt;blockquote&gt;
&lt;tt&gt;source(&quot;http://bioconductor.org/biocLite.R&quot;)
&lt;br&gt;biocLite(&quot;RBGL&quot;)&lt;/tt&gt;
&lt;/blockquote&gt;

(Since &lt;tt&gt;RBGL&lt;/tt&gt; depends on &lt;tt&gt;graph&lt;/tt&gt;, this should automatically also
install &lt;tt&gt;graph&lt;/tt&gt;; if not, run &lt;tt&gt;biocLite(&quot;graph&quot;)&lt;/tt&gt;,
then &lt;tt&gt;biocLite(&quot;RBGL&quot;)&lt;/tt&gt;.)

&lt;li&gt;Now
install &lt;a href=&quot;http://cran.r-project.org/web/packages/pcalg/index.html&quot;&gt;&lt;tt&gt;pcalg&lt;/tt&gt;
from CRAN&lt;/a&gt;, along with the packages it depends on.  You will get a warning
about not having the &lt;tt&gt;Rgraphviz&lt;/tt&gt; package.  However, you will be able to
load &lt;tt&gt;pcalg&lt;/tt&gt; and run it.  You should be able to step through the example
labeled &quot;Using Gaussian Data&quot; at the end of &lt;tt&gt;help(pc)&lt;/tt&gt;, though it will &lt;em&gt;not&lt;/em&gt; produce any plots.

&lt;P&gt;You can still extract the graph by hand from the fitted models returned by
functions like &lt;tt&gt;pc&lt;/tt&gt; --- if one of those objects is &lt;tt&gt;fit&lt;/tt&gt;,
then &lt;tt&gt;fit@graph@edgeL&lt;/tt&gt; is a list of lists, where each node has its
own list, naming the other nodes it has arrows to (not from).  If you are doing
this for the final in ADA, you don't actually &lt;em&gt;need&lt;/em&gt; anything beyond
this to do the assignment, as explained in question A1a.

&lt;li&gt;&lt;tt&gt;&lt;a href=&quot;http://www.bioconductor.org/packages/release/bioc/html/Rgraphviz.html&quot;&gt;Rgraphviz&lt;/a&gt;&lt;/tt&gt;
is what &lt;tt&gt;pcalg&lt;/tt&gt; relies on for drawing pictures of causal graphs.  Its installation is somewhat tricky, so there is a &lt;a href=&quot;http://www.bioconductor.org/packages/release/bioc/readmes/Rgraphviz/README&quot;&gt;README
file&lt;/a&gt;, which you should read.
&lt;br&gt;The key point is that &lt;tt&gt;Rgraphviz&lt;/tt&gt; itself relies on a &lt;em&gt;non-R&lt;/em&gt;
suite of programs called &lt;tt&gt;graphviz&lt;/tt&gt;.  You will want to install these.
Go to &lt;a href=&quot;http://www.graphviz.org/&quot;&gt;&lt;tt&gt;graphviz.org&lt;/tt&gt;&lt;/a&gt;, and
download and install the software.  (If you use a Mac, the &lt;a href=&quot;http://www.graphviz.org/Download_macos.php&quot;&gt;standard download&lt;/a&gt; also includes
&lt;tt&gt;Graphviz.app&lt;/tt&gt;, which is a nice visual interface to the actual
graph-drawing functions, and what I use for drawing the DAGs in the lecture
notes.)

&lt;li&gt; You have to make sure that your operating system will let other software
(like R) call on &lt;tt&gt;graphviz&lt;/tt&gt;.  The way to do this is to add the directory
(or folder) where you installed &lt;tt&gt;graphviz&lt;/tt&gt; to the list of places your
computer recognizes as containing executable programs --- the system's &quot;command
path&quot;.  The README for installing &lt;tt&gt;Rgraphviz&lt;/tt&gt; explains what you have to
add to the path.  (If you are a Windows user and do not know how to alter the
command path, &lt;a href=&quot;http://www.computerhope.com/issues/ch000549.htm&quot;&gt;read
this&lt;/a&gt;.)

&lt;li&gt; If you have R open, close it.  (If you do not, it will probably not know
about the new software you've just gotten the system to recognize.)  Re-open R,
and install &lt;tt&gt;Rgraphviz&lt;/tt&gt;.  The basic installation command is just
&lt;blockquote&gt;

&lt;tt&gt;source(&quot;http://bioconductor.org/biocLite.R&quot;)
&lt;br&gt;biocLite(&quot;Rgraphviz&quot;)&lt;/tt&gt;
&lt;/blockquote&gt;

The README for &lt;tt&gt;Rgraphviz&lt;/tt&gt; gives some checks which you should be able to
run if everything is working; try them.

&lt;li&gt; You should now be able to generate pictures of DAGs with &lt;tt&gt;pc&lt;/tt&gt; and
the other functions in &lt;tt&gt;pcalg&lt;/tt&gt;; try stepping through all the examples at
the end of &lt;tt&gt;help(pc)&lt;/tt&gt;.
&lt;/ol&gt;
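
&lt;P&gt;The by-hand extraction mentioned in step 2 might look something like the
following sketch.  It runs on a small mock list rather than a real fit, so it
needs nothing beyond base R.  The helper &lt;tt&gt;edge_table&lt;/tt&gt; is hypothetical
(it is not part of &lt;tt&gt;pcalg&lt;/tt&gt;), and the layout of the mock, with each
node's targets stored as integer positions under a component
called &lt;tt&gt;edges&lt;/tt&gt;, is an assumption about how the &lt;tt&gt;graph&lt;/tt&gt;
package stores things; run &lt;tt&gt;str(fit@graph@edgeL)&lt;/tt&gt; on your own fit to
confirm before relying on it.

```r
# Mock of the structure described in step 2: one entry per node, each
# holding the nodes it has arrows to.
# ASSUMPTION: targets are integer positions in the node list, stored under
# a component named "edges"; verify with str(fit@graph@edgeL).
edgeL = list(
  A = list(edges = c(2L, 3L)),   # A points to B and to C
  B = list(edges = 3L),          # B points to C
  C = list(edges = integer(0))   # C has no outgoing arrows
)

# Hypothetical helper (not from pcalg): flatten the list of lists into a
# two-column data frame, one row per arrow.
edge_table = function(edgeL) {
  node_names = names(edgeL)
  # Number of outgoing arrows per node, so each source name repeats enough
  out_degree = vapply(edgeL, function(e) length(e$edges), integer(1))
  from = rep(node_names, times = out_degree)
  # Translate integer positions back into node names
  to = node_names[unlist(lapply(edgeL, function(e) e$edges))]
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

arrows = edge_table(edgeL)
print(arrows)
```

&lt;P&gt;If the assumption about the layout holds, calling
&lt;tt&gt;edge_table(fit@graph@edgeL)&lt;/tt&gt; on a fitted object should give the same
kind of table, with each row read as an arrow from the first column to the second.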

&lt;P&gt;When I installed &lt;tt&gt;pcalg&lt;/tt&gt; on my laptop two weeks ago, it was painless,
because (1) I already had &lt;tt&gt;graphviz&lt;/tt&gt;, and (2) I knew about BioConductor.
(In fact, the R graphical interface on the Mac will switch between installing
packages from CRAN and from BioConductor.)  To check these instructions, I just
now deleted all the packages from my computer and re-installed them, and
everything worked; elapsed time, ten minutes, mostly downloading.
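
&lt;P&gt;If something does not work, a quick check of which pieces are in place can
narrow things down.  What follows is only a sketch, using base R functions; the
one assumption is that the &lt;tt&gt;graphviz&lt;/tt&gt; program to look for on the
command path is called &lt;tt&gt;dot&lt;/tt&gt;, so defer to the &lt;tt&gt;Rgraphviz&lt;/tt&gt;
README if it says otherwise.

```r
# Which of the packages discussed above can R actually load?
needed = c("graph", "RBGL", "pcalg", "Rgraphviz")
installed = vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Sys.which() returns an empty string when a program is not on the
# command path.
# ASSUMPTION: "dot" is the graphviz executable that should be visible to R.
dot_path = Sys.which("dot")
if (nzchar(dot_path)) {
  cat("graphviz found at", dot_path, "\n")
} else {
  cat("graphviz is not on the command path; revisit the path step\n")
}
```

&lt;P&gt;A &lt;tt&gt;FALSE&lt;/tt&gt; for a package points back to the corresponding install
step; an empty path for &lt;tt&gt;dot&lt;/tt&gt; suggests the command path was not
updated, or that R was not restarted afterwards.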

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Final Exam (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/913.html</link>
    <description>
&lt;P&gt;In which we are devoted to two problems of political economy, viz., strikes,
and macroeconomic forecasting.

&lt;P&gt;&lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/exams/3/exam-3.pdf&quot;&gt;Assignment&lt;/a&gt;; &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/exams/3/macro.csv&quot;&gt;macro.csv&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Time Series I (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/912.html</link>
    <description>
What time series are.  Properties: autocorrelation or serial correlation; other
notions of serial dependence; strong and weak stationarity.  The correlation
time and the world's simplest ergodic theorem; effective sample size.  The
meaning of ergodicity: a single increasingly long time series becomes
representative of the whole process.  Conditional probability estimates; Markov
models; the meaning of the Markov property.  Autoregressive models, especially
additive autoregressions; conditional variance estimates.  Bootstrapping time
series.  Trends and de-trending.

&lt;dd&gt;&lt;em&gt;Reading&lt;/em&gt;: Notes, &lt;A href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch26.pdf&quot;&gt;chapter 26&lt;/a&gt;;
&lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/time-series.R&quot;&gt;R
for
examples&lt;/a&gt;; &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/gdp-pc.csv&quot;&gt;&lt;tt&gt;gdp-pc.csv&lt;/tt&gt;&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Books to Read While the Algae Grow in Your Fur, April 2012</title>
    <link>http://bactra.org/weblog/algae-2012-04.html</link>
    <description>
&lt;P&gt;&lt;em&gt;Attention conservation notice&lt;/em&gt;: I have no taste.

&lt;dl&gt;
&lt;dt&gt;&lt;a href=&quot;http://www.bl.uk/researchregister/1.10/?app_cd=RR&amp;page_cd=RESEARCHER&amp;l_researcher_id=34&quot;&gt;Susan Whitfield&lt;/a&gt;, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780520232143&quot; name=&quot;silk-road&quot;&gt;Life along the Silk Road&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Not-quite-historical fiction: life stories of sundry Silk Road characters
&amp;mdash; merchants, monks, soldiers, artists, ordinary widows &amp;mdash;
distributed from Samarkand to Chang-an, and from 700 to 900 AD.  These are all
more or less composites of actual people, glimpsed from the archaeological
record, and especially through the manuscripts
&lt;a href=&quot;../reviews/on-ancient-central-asian-tracks/&quot;&gt;preserved at Dunhuang&lt;/a&gt;
and saved/stolen by &lt;a href=&quot;../reviews/lives-of-aurel-stein/&quot;&gt;Aurel Stein&lt;/a&gt;.
(In fact the whole book owes a great deal to Stein, with a lot of input from
&lt;a href=&quot;http://dannyreviews.com/h/Tibetan_Empire.html&quot;&gt;Beckwith's &lt;cite&gt;The Tibetan Empire in Central Asia&lt;/cite&gt;&lt;/a&gt;.)  The lack of
references makes it hard to know how much is stitched together from sources and
how much is Whitfield's invention, but at the very least it's well-told.

&lt;dt&gt;&lt;a href=&quot;http://www.sabrepunk.com/&quot;&gt;Nathan
Long&lt;/a&gt;, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9781597803960&quot;
name=&quot;waar&quot;&gt;Jane Carver of Waar&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Mind candy.  This is at once a parody of, and homage to, Barsoom.  Unlike
Burroughs, Long's book can be enjoyed after the Golden Age of Science Fiction
(i.e., by those over the age of sixteen): his characters are all &lt;em&gt;at
least&lt;/em&gt; two-dimensional (Jane herself is an engaging narrator, though
definitely at the Hill end of
the &lt;a href=&quot;http://pulllist.comixology.com/articles/167/Moby-vs-Hill&quot;&gt;Moby-Hill
spectrum&lt;/a&gt;), his style is decent, and the plot is actually interesting.  I
think it would be enjoyable even if you &lt;em&gt;hadn't&lt;/em&gt; dosed up on planetary
romances as a kid.&lt;/dd&gt;


&lt;dt&gt;&lt;a href=&quot;http://the-expanse.com/&quot;&gt;James S. A. Corey&lt;/a&gt; (i.e., Daniel Abraham and Ty Franck), &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780316129084&quot; name=&quot;leviathan-wakes&quot;&gt;Leviathan Wakes&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Mind candy.  Space opera, confined to the solar system a few centuries
hence.  This has gotten a lot of favorable attention, but I found it merely OK;
perhaps I'd have enjoyed it more if my expectations had been lower.  It's split
between two plot lines, with two point-of-view characters; I enjoyed (but
wasn't blown away by) one of them, but found the other both over-predictable
and irritating.  It does some things well (a reasonably-sized solar system!
minimal handwavium! a
non-&lt;a href=&quot;http://www.jwz.org/blog/2005/09/the-grim-meathook-future/&quot;&gt;grim-meathook-future&lt;/a&gt;
future! some decent characterization!), but it never really managed to grab me.
It's definitely nowhere near as good as,
say, &lt;a href=&quot;algae-2010-11.html#quiet-war&quot;&gt;McAuley's &lt;cite&gt;The Quiet
War&lt;/cite&gt;&lt;/a&gt;, to name a recent and thematically-similar book.  The sequel
will be out soon, and seems like it will be continuing along the better of the
two narrative threads here, so I might pick it up, but I won't rush to do
so.&lt;/dd&gt;
&lt;dd&gt;Spoiler-laden griping: Bar bs gur gjb cybg yvarf vf n uneq-obvyrq vairfgvtngvba, pbzcyrgr jvgu na nypbubyvp zvqqyr-ntrq qrgrpgvir, pbeehcg vagevthrf, naq n zlfgrevbhf qnzr jub gur qrgrpgvir snyyf va ybir jvgu. V qba'g yvxr gur uneq-obvyrq traer, orpnhfr, juvyr V nz irel fragvzragny, vgf cnegvphyne pbzovangvba bs fragvzragnyvgl naq plavpvfz vf bss-chggvat. Fb onfvpnyyl V jnagrq gb fxvc nyy gur puncgref sebz Zvyyre'f cbvag bs ivrj, naq whfg sbyybj gubfr jvgu Ubyqra naq uvf perj. Yrff crefbanyyl (v.r., nf n engvbanyvmngvba), abve cerfhccbfrf fhpu n irel cnegvphyne, uvfgbevpnyyl-yvzvgrq phygheny frggvat gung frrvat vg fvzcyl qhzcrq vagb jung fubhyq or n enqvpnyyl arj xvaq bs fbpvrgl jnf wneevat. (Rirelguvat ba Prerf jbexf yvxr Puvpntb pvepn 1940 orpnhfr ubj ryfr?).&lt;/dd&gt;
&lt;dd&gt;Ba n qvssrerag cynar nygbtrgure, Cebgbtra'f ernfbaf sbe jnagvat gb gel bhg gur nyvra ivehf/znpuvar ba gur jubyr cbchyngvba bs Rebf ner jrnx. Vs gur cbvag bs gur znpuvar vf gb gnxr bire rkvfgvat ovbznff naq erfuncr vg nppbeqvat gb fbzr cebtenz, vg jbhyq frrz vasvavgryl rnfvre gb tvir vg hzcgrra gbaf bs lrnfg gb cynl jvgu, guna gb fcraq lrnef bepurfgengvat gur gnxr-bire bs n pbybal jvgu bire n zvyyvba crbcyr, gb fnl abguvat bs gur erqhprq cbffvovyvgl sbe oybj-onpx, frphevgl oernpurf, rgp. Ab qbhog gurl'q jnag gb gel vg ba crbcyr riraghnyyl, ohg fgnegvat gurer, jvgu ab pbageby bire rssrpgf, vf whfg onq rkcrevzragny qrfvta. Cyhf &quot;tvir gur napvrag fhcre-nqinaprq nyvra jne znpuvar pbageby bire na nfgrebvq&quot; qbrf abg fbhaq yvxr n cyna juvpu jbhyq qrirybc gb n fbpvbcngu'f nqinagntr. (Gurl jbhyqa'g pner nobhg gur qnzntr gb bguref, ohg gurzfryirf?)&lt;/dd&gt;
&lt;dd&gt;In conclusion, bring me back my cane and then get off my lawn, you're
trampling the lilies.&lt;/dd&gt;

&lt;dt&gt;&lt;a href=&quot;http://www.zatrikion.blogspot.com/&quot; name=&quot;fall-from-earth&quot;&gt;Matthew Johnson&lt;/a&gt;, &lt;citE&gt;Fall from Earth&lt;/cite&gt; [buying: &lt;a href=&quot;http://store.bundoranpress.com/science-fiction/fall-from-earth.html&quot;&gt;publisher&lt;/a&gt;, &lt;a href=&quot;http://iambik.com/books/fall-earth-by-matthew-johnson/&quot;&gt;audio&lt;/a&gt;]&lt;/dt&gt;
&lt;dd&gt;Mind candy. Scheme-laden first-contact space opera with a social setting I
can only call &quot;The Ming Dynasty IN SPAAAAACE&quot;.  Good enough that I will keep a look out for more from Johnson.&lt;/dd&gt;
&lt;dd&gt;&lt;span class=&quot;blognotes&quot;&gt;It's a small thing, but Johnson shows no
appreciation of the energy required to move food from planet to planet, which
makes his &quot;equitable marketing system&quot; a complete non-starter.  (But he shares
this flaw with
Cherryh's &lt;a href=&quot;http://www.tor.com/blogs/2008/12/norway-has-her-standards-cj-cherryhs-downbelow-station&quot;&gt;deservedly-admired&lt;/a&gt; &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780756405502&quot;&gt;Downbelow
Station&lt;/a&gt;&lt;/cite&gt;.)  If, however, the magistracy wants to make sure that no
world can become self-sufficient, the way to do it would be to restrict
their &lt;em&gt;manufacturing&lt;/em&gt;, since any colony would be dependent for survival
on a complex industrial infrastructure.&lt;/span&gt;&lt;/dd&gt;

&lt;dt&gt;Bernard Williams, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780691117911&quot; name=&quot;williams-on-truthfulness&quot;&gt;Truth and Truthfulness: An Essay in Genealogy&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Shorter Williams: &quot;Say what you mean. Bear witness.  Iterate.&quot;  (The late
John M. Ford, &lt;a href=&quot;http://nielsenhayden.com/electrolite/archives/003789.html#29472&quot;&gt;in a different context&lt;/a&gt;.)&lt;/dd&gt;
&lt;dd&gt;Slightly longer: You can get a decent sense of what the book is about from the
&lt;a href=&quot;http://press.princeton.edu/titles/7328.html&quot;&gt;publishers&lt;/a&gt;, so I'll
comment without much exposition.&lt;/dd&gt;
&lt;dd&gt;When Williams talks about a &quot;genealogy&quot; of some idea or practice, he means
an account of why, if it did not exist, we would have to invent it.
Specifically, he spins a state-of-nature story about how if, in the state of
nature, human beings did not have an idea of truth, but nonetheless were social
and rational animals, and so dependent on a division of epistemic labor, they
would have to form one, and two &quot;virtues of truthfulness&quot;, namely &quot;sincerity&quot;
(Ford's &quot;say what you mean&quot;) and &quot;accuracy&quot; (Ford's &quot;bear witness&quot;) to make it
effective.  This is not intended as history or pre-history (Williams: &quot;the
state of nature is not the Pleistocene&quot;), but it is a bit mysterious to me how
then it is supposed to &lt;em&gt;explain&lt;/em&gt; our notions of truth, truthfulness,
sincerity, accuracy, etc., much less explain them &quot;non-reductively&quot;.  Perhaps
&amp;mdash; this is suggested by his section on &quot;Shameful Origins&quot; &amp;mdash; it is
just supposed to make us feel better about having them, by convincing us that
we could have acquired such ideas in a way which doesn't discredit them.  (We
are not suckers.)&lt;/dd&gt;
&lt;dd&gt;It may sound odd to describe &quot;accuracy&quot; as a virtue, but being accurate ---
bearing &lt;em&gt;good&lt;/em&gt; witness --- means things like checking tendencies to leap to
conclusions, choosing appropriate methods of inquiry, taking pains to secure all
the relevant facts (Williams is especially good on the notion of &quot;facts&quot;), etc.
Williams is indeed eloquent on how the virtues of accuracy are one of the
things which have made the pursuit of science a source of human values,
especially in circumstances where honesty otherwise was hard.&lt;/dd&gt;
&lt;dd&gt;As this last suggests, culture lets us articulate the raw virtues of
sincerity and accuracy into incredibly elaborate and interlocking complexes of
attitudes and practices (Ford's &quot;iterate&quot;).  From the inside, these have, or at
least seem to have, intrinsic as well as instrumental value, and indeed they
would not work at all if their value was &lt;em&gt;just&lt;/em&gt; instrumental.  I confess
that I do not fully follow Williams's attempt to try to explain when or why or
how the virtues of truth become &quot;intrinsic values&quot;.  It seems to be something
like: people find these values &lt;em&gt;compelling&lt;/em&gt;, in a way which they would
not if they saw them just as handy tools for achieving selfish ends; this in
turn makes these values successful commitment devices &lt;a href=&quot;#n1&quot; name=&quot;b1&quot;&gt;[1]&lt;/a&gt;. Williams seems
to me to equivocate as to whether these virtues &lt;em&gt;really do&lt;/em&gt; have such
intrinsic value, but on balance I am just as happy that he strayed no deeper
into the swamp of meta-ethics, and wisely turned back to the sounder terrain of
looking at certain episodes in the articulation of these virtues.  The two main
case-studies he gives are contrasts of Thucydides and Herodotus on history, and
of Rousseau and Diderot on authenticity and the self.  Both of these really
have a wider, philosophical import, and as such they would both have been
stronger for a more comparative, cross-cultural perspective &amp;mdash; not in the
service of the small virtue of courtesy (Williams has mercifully few &quot;what you
mean 'we', white man?&quot;  moments), but rather in the service of the great virtue
of accuracy &lt;a href=&quot;#n2&quot; name=&quot;b2&quot;&gt;[2]&lt;/a&gt;.&lt;/dd&gt;
&lt;dd&gt;But I see that I am descending into my usual quibbling.  This is a
profoundly thoughtful and profoundly learned book, which has interesting
things to say about some of the deepest and most humanly-important problems in
philosophy, and says them elegantly.  Go read.&lt;/dd&gt;

&lt;dd&gt;&lt;span class=&quot;blognotes&quot;&gt;&lt;a name=&quot;n1&quot;&gt;[1]&lt;/a&gt; I cannot help but be reminded of &lt;a href=&quot;http://psychclassics.yorku.ca/James/Principles/prin24.htm&quot;&gt;William James&lt;/a&gt;:
&lt;blockquote&gt;
&lt;em&gt;Now, why do the various animals do what seem to us such strange things&lt;/em&gt;, in the presence of such outlandish stimuli? Why does the hen, for example, submit herself to the tedium of incubating such a fearfully uninteresting set of objects as a nestful of eggs, unless she have some sort of a prophetic inkling of the result?  The only answer is &lt;em&gt;ad hominem&lt;/em&gt;.  We can only interpret the instincts of brutes by what we know of instincts in ourselves.  Why do men always lie down, when they can, on soft beds rather than on hard floors?  Why do they sit round the stove on a cold day?  Why, in a room, do they place themselves, ninety-nine times out of a hundred, with their faces towards its middle rather than to the wall?  Why do they prefer saddle of mutton and champagne to hard-tack and ditch-water?  Why does the maiden interest the youth so that everything about her seems more important and significant than anything else in the world?  Nothing more can be said than that these are human ways, and that every creature likes its own ways, and takes to the following them as a matter of course.  Science may come and consider these ways, and find that most of them are useful.  But it is not for the sake of their utility that they are followed, but because at the moment of following them we feel that that is the only appropriate and natural thing to do.  Not one man in a billion, when taking his dinner, ever thinks of utility.  He eats because the food tastes good and makes him want more.  If you ask him why he should want to eat more of what tastes like that, instead of revering you as a philosopher he will probably laugh at you for a fool.  The connection between the savory sensation and the act it awakens is for him absolute and &lt;em&gt;selbstverst&amp;auml;ndlich&lt;/em&gt;, an &quot;a priori synthesis&quot; of the most perfect sort, needing no proof but its own evidence.  
It takes, in short, what Berkeley calls a mind debauched by learning to carry the process of making the natural seem strange, so far as to ask for the why of any instinctive human act.  To the metaphysician alone can such questions occur as: Why do we smile, when pleased, and not scowl?  Why are we unable to talk to a crowd as we talk to a single friend?  Why does a particular maiden turn our wits so upside-down?  The common man can only say, &quot;Of course we smile, of course our heart palpitates at the sight of the crowd, of course we love the maiden, that beautiful soul clad in that perfect form, so palpably and flagrantly made from all eternity to be loved!&quot;

&lt;br&gt;And so, probably, does each animal feel about the particular things it tends to do in presence of particular objects.  They, too, are a priori syntheses.  To the lion it is the lioness which is made to be loved; to the bear, the she-bear.  To the broody hen the notion would probably seem monstrous that there should be a creature in the world to whom a nestful of eggs was not the utterly fascinating and precious and never-to-be-too-much-sat-upon object which it is to her.

&lt;br&gt;Thus we may be sure that, however mysterious some animals' instincts may appear to us, our instincts will appear no less mysterious to them.  And we may conclude that, to the animal which obeys it, every impulse and every step of every instinct shines with its own sufficient light, and seems at the moment the only eternally right and proper thing to do.  It is done for its own sake exclusively.  What voluptuous thrill may not shake a fly, when she at last discovers the one particular leaf, or carrion, or bit of dung, that out of all the world can stimulate her ovipositor to its discharge?  Does not the discharge then seem to her the only fitting thing?  And need she care or know anything about the future maggot and its food?&lt;/blockquote&gt;

More soberly, or at least with fewer hens and maggots, this is highly
reminiscent of Robert
Frank's &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780393960228&quot;&gt;Passion
within Reason&lt;/a&gt;&lt;/cite&gt;, which I do not believe Williams mentions.
&lt;a href=&quot;#b1&quot;&gt;^&lt;/a&gt;&lt;/span&gt;

&lt;dd&gt;&lt;span class=&quot;blognotes&quot;&gt;&lt;a name=&quot;n2&quot;&gt;[2]&lt;/a&gt; Williams claims, quite
plausibly, that Thucydides had different ideas about historical explanation and
historical evidence than did Herodotus &amp;mdash; ones which are both stricter
about what counts as acceptable history, and which are supported by compelling
rationales even within the older framework.  He also claims, more sketchily,
that Herodotus was immersed in a culture which was still partly oral and partly
literate, while Thucydides was &lt;em&gt;not&lt;/em&gt;.  If all this were right, should
not the same contrast show up in the historical traditions of China, the
Islamic world, etc.?  Why does such a tradition not seem to be indigenous to
India?  (Cf., on all this,
Brown's &lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780816510603&quot;&gt;&lt;cite&gt;History,
Hierarchy, and Human Nature&lt;/cite&gt;&lt;/a&gt;.) Western Europe, after the fall of the
western Roman Empire, never lost literacy, but it certainly didn't produce
histories like Thucydides's for many centuries: why, on Williams's account,
not?  (Actually, outside of Italy, did western Europe ever produce such
histories &lt;em&gt;before&lt;/em&gt; the fall of the empire?)  If there are important
distinctions between these cases, such that Williams's account applies only in
the special circumstances of the Aegean around 500--300 BC, what are those
circumstances?  &amp;mdash; Let me add that it was &lt;em&gt;Williams&lt;/em&gt; who made all
these considerations relevant, not me. &lt;a href=&quot;#b2&quot;&gt;^&lt;/a&gt;&lt;/span&gt;&lt;/dd&gt;






&lt;dt&gt;J. C. W. Rayner and D. J. Best, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/0-19-505610-8&quot; name=&quot;rayner-best&quot;&gt;Smooth Tests of Goodness of Fit&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Suppose a random variable \( Y \) is confined to the unit interval \( [0,1]
\), and we want to test whether it is uniformly distributed.  One way to do
this would be to construct alternative distributions which are in some sense
smooth departures from uniformity, with densities \( g(y;\theta) =
e^{\sum_{j=1}^{d}{\theta_j h_j(y)}}/z(\theta) \), where it is convenient to
choose the \( h_j \) functions to be an orthonormal basis --- the cosine basis,
say, or the Legendre polynomials.  (That is, they are orthonormal in \( L_2 \),
the space of square-integrable functions on the unit interval.)  Uniformity is
then the special case \( \theta = 0 \), and we can test it against the
alternative that \( \theta \neq 0 \) by the usual devices of a likelihood-ratio
test, a score test, etc., which will all, under the null hypothesis, have an
asymptotic \( \chi^2_d \) distribution.  This is Neyman's original smooth test,
which &lt;a href=&quot;http://ssrn.com/abstract=272888&quot;&gt;seems to have originated&lt;/a&gt;
from the problem of how to combine &lt;i&gt;p&lt;/i&gt;-values from independent
experiments, which should all be uniformly distributed under the null
hypothesis.  One nice feature of this test is that if we reject the null, we
immediately have an alternative, namely our maximum likelihood estimate of \(
\theta \), for what the actual distribution is --- it tells us not just that
the null model is wrong, but how, and what a better one would be like.&lt;/dd&gt;
&lt;dd&gt;The real power of this comes from the following observation.  If \( X \) is
distributed according to some continuous CDF \( F \), then \( Y=F(X) \) is
uniformly distributed on \( [0,1] \).  The smooth alternatives for \( Y \)
translate into smooth alternatives for \( X \), with densities \( g_X(x;\theta)
= f(x) e^{\sum_{j=1}^{d}{\theta_j h_j(F(x))}}/z(\theta) \), where \( f \) is the
density of \( F \).  We can test
whether \( X \sim F \) by, once again, testing whether \( \theta = 0 \), and the
theory works just as before.  If \( F \) is not fixed but involves some
parameters \( \beta \), then we consider the smooth alternative densities \(
g_{X}(x;\beta,\theta) = f(x;\beta) e^{\sum_{j=1}^{d}{\theta_j
h_j(F(x;\beta))}}/z(\theta) \), and again we test the specification by testing
\( \theta = 0 \).  Since this always involves fixing \( d \) parameters, we
always get a \( \chi^2_d \) asymptotic distribution under the null.&lt;/dd&gt;
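&lt;dd&gt;As a rough illustration of the score version of this test, here is a
minimal Python sketch.  It assumes the cosine basis \( h_j(y) =
\sqrt{2}\cos(j\pi y) \), and it calibrates the null distribution by simulation
rather than by the \( \chi^2_d \) limit.  The function names, the exponential
example and the sample sizes are hypothetical choices made for illustration;
none of them come from Rayner and Best.
&lt;pre&gt;
```python
import numpy as np

def smooth_statistic(y, d=4):
    """Neyman-style score statistic for H0: y ~ Uniform(0, 1), cosine basis."""
    y = np.asarray(y, dtype=float)
    j = np.arange(1, d + 1)
    # h_j(y) = sqrt(2) cos(j pi y): orthonormal on [0, 1], mean zero under H0
    h = np.sqrt(2.0) * np.cos(np.pi * np.outer(y, j))   # shape (n, d)
    u = h.sum(axis=0) / np.sqrt(y.size)                 # normalized scores
    return float(np.sum(u ** 2))                        # approx. chi^2_d under H0

def null_statistics(n, d=4, n_sim=2000, seed=0):
    """Simulate the statistic under the null, for samples of size n."""
    rng = np.random.default_rng(seed)
    return np.array([smooth_statistic(rng.uniform(size=n), d)
                     for _ in range(n_sim)])

def simulated_p_value(y, d=4, n_sim=2000, seed=0):
    """Fraction of simulated null statistics at least as large as the observed one."""
    observed = smooth_statistic(y, d)
    null = null_statistics(len(y), d, n_sim, seed)
    return float(np.mean(np.greater_equal(null, observed)))

# Testing X ~ Exponential(1) through Y = F(X) = 1 - exp(-X)
rng = np.random.default_rng(1)
x_good = rng.exponential(scale=1.0, size=500)        # the null is true
x_bad = rng.gamma(shape=2.0, scale=1.0, size=500)    # the null is false
for name, x in [("exponential", x_good), ("gamma", x_bad)]:
    y = 1.0 - np.exp(-x)
    print(name, smooth_statistic(y), simulated_p_value(y))
```
&lt;/pre&gt;
Because each \( h_j(Y) \) has mean zero and variance one under the null, the
statistic has expectation exactly \( d \) there, which gives a quick sanity
check on the simulated null distribution.&lt;/dd&gt;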
&lt;dd&gt;Rayner and Best's monograph is a clear, if now somewhat old-fashioned,
exposition of Neyman's smooth test and its relatives and extensions.  They
actually begin with Pearson's \( X^2 \) or \( \chi^2 \) test, which can be seen
as a smooth test for multinomial (rather than continuous) data, before going on
to consider the general theory of likelihood ratio and score tests, and
Neyman's smooth tests.  Much of the book is taken up with various permutations
of discretizing continuous variables and/or allowing estimation of the
parameters I have written \( \beta \); the latter concern seems less important
these days.&lt;/dd&gt;
&lt;dd&gt;An important set of developments which does not get as much attention here
as a more recent treatment would give is that of picking the order of the
alternatives \( d \).  Neyman suggested \( d = 4 \) but emphasized it was a
guess; some later workers guessed \( d = 2 \) should be enough.  Really,
however, this is a problem of &lt;a href=&quot;../reviews/claeskens-hjort.html&quot;&gt;model
selection&lt;/a&gt; or capacity control, and so all the usual tools, like
cross-validation or information criteria, can be applied.  This is one place
where &lt;a href=&quot;http://doc.utwente.nl/62408/&quot;&gt;BIC has proved particularly
useful&lt;/a&gt;, leading
to &lt;a href=&quot;http://CRAN.R-project.org/package=ddst&quot;&gt;&quot;data-driven&quot; smooth
tests&lt;/a&gt;.  These no longer have nice \( \chi^2 \) asymptotics, but it's pretty
easy to get their sampling distributions from simulation.&lt;/dd&gt;
&lt;dd&gt;Despite these limits, this is still a useful reference for people interested in specification checking.&lt;/dd&gt;




&lt;dt&gt;&lt;a href=&quot;http://aliettedebodard.com/&quot;&gt;Aliette De Bodard&lt;/a&gt;, &lt;cite&gt;&lt;a href=&quot;http://www.powells.com/partner/35751/biblio/9780857660312&quot; name=&quot;servant-of-the-underworld&quot;&gt;Servant of the
Underworld&lt;/a&gt;&lt;/cite&gt;&lt;/dt&gt;
&lt;dd&gt;Mind candy: historical fantasy/mystery set in Tenochtitlan (a few
generations before what would be the Conquest), only with the mythology of the
Aztecs being literally true and magic very much a part of actual life.  It had
some typical first-novel flaws (too much exposition, the plot drags in places),
but overall decent.&lt;/dd&gt;

&lt;/dl&gt;


&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_algae.html&quot;&gt;Books to Read While the Algae Grow in Your Fur&lt;/a&gt;;
&lt;a href=&quot;cat_scientifiction.html&quot;&gt;Scientifiction and Fantastica&lt;/a&gt;;
&lt;a href=&quot;cat_detection.html&quot;&gt;Pleasures of Detection, Portraits of Crime&lt;/a&gt;;
&lt;a href=&quot;cat_enigmas_of_chance.html&quot;&gt;Enigmas of Chance&lt;/a&gt;;
&lt;a href=&quot;cat_central_asia.html&quot;&gt;Central Asia&lt;/a&gt;;
&lt;a href=&quot;cat_philosophy.html&quot;&gt;Philosophy&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Brought to You by the Letters D, A, and G (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/911.html</link>
    <description>
&lt;P&gt;In which the arts of estimating causal effects from observational data are practiced on &lt;cite&gt;Sesame Street&lt;/cite&gt;.

&lt;P&gt;&lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/hw/11/hw-11.pdf&quot;&gt;Assignment&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Estimating Causal Effects from Observations (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/910.html</link>
    <description>
&lt;P&gt;Estimating graphical models: substituting consistent estimators into the
formulas for front and back door identification; average effects and
regression; tricks to avoid estimating marginal distributions; matching
and propensity scores as computational short-cuts in
back-door adjustment.  Instrumental variables estimation: the Wald estimator,
two-stage least-squares.  Summary recommendations for estimating causal
effects.

&lt;P&gt;&lt;em&gt;Reading&lt;/em&gt;:
Notes, &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch24.pdf&quot;&gt;chapter
24&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Separated at Birth (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/909.html</link>
    <description>
&lt;P&gt;In which we use graphical causal models to understand twin studies and variance components.

&lt;P&gt;&lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/hw/10/hw-10.pdf&quot;&gt;Assignment&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Identifying Causal Effects from Observations (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/908.html</link>
    <description>
&lt;P&gt;Reprise of causal effects vs. probabilistic conditioning.  &quot;Why think, when
you can do the experiment?&quot;  Experimentation by controlling everything
(Galileo) and by randomizing (Fisher).  Confounding and identifiability.  The
back-door criterion for identifying causal effects: condition on covariates
which block undesired paths.  The front-door criterion for identification: find
isolated and exhaustive causal mechanisms.  Deciding how many black boxes to
open up.  Instrumental variables for identification: finding some exogenous
source of variation and tracing its effects.  Critique of instrumental
variables: vital role of theory, its fragility, consequences of weak
instruments.  Irremovable confounding: an example with the detection of social
influence; the possibility of bounding unidentifiable effects.  Summary
recommendations for identifying causal effects.

&lt;P&gt;&lt;em&gt;Reading&lt;/em&gt;:
Notes, &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch23.pdf&quot;&gt;chapter
23&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Just How Quickly Do We Forget?</title>
    <link>http://bactra.org/weblog/907.html</link>
    <description>
&lt;!-- 
&lt;script type=&quot;text/javascript&quot;
   src=&quot;http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;&lt;/script&gt;
!--&gt;

&lt;blockquote&gt;&lt;em&gt;Attention conservation notice&lt;/em&gt;: 2500+ words on estimating
how quickly time series forget their own history.  Only of interest if you care
about the intersection of stochastic processes and statistical learning theory.
Full of jargon, equations, log-rolling and self-promotion, yet utterly
abstract.&lt;/blockquote&gt;

&lt;P&gt;I &lt;a href=&quot;902.html&quot;&gt;promised to say something
about the content of Daniel's thesis&lt;/a&gt;, so let me talk about two of his
papers, which go into chapter 4; there is a short conference version and a long
journal version.
&lt;dl&gt;
&lt;dt&gt;Daniel J. McDonald, Cosma Rohilla Shalizi and Mark Schervish, &quot;Estimating beta-mixing coefficients&quot;, &lt;a href=&quot;http://jmlr.csail.mit.edu/proceedings/papers/v15/mcdonald11a.html&quot;&gt;AIStats 2011&lt;/a&gt;, &lt;a href=&quot;http://arxiv.org/abs/1103.0941&quot;&gt;arxiv:1103.0941&lt;/a&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;em&gt;Abstract&lt;/em&gt;: The literature on statistical learning for time series
assumes the asymptotic independence or &quot;mixing&quot; of the data-generating
process. These mixing assumptions are never tested, nor are there methods for
estimating mixing rates from data. We give an estimator for the \( \beta
\)-mixing rate based on a single stationary sample path and show it is \( L_1
\)-risk consistent.&lt;/dd&gt;
&lt;dt&gt;----, &quot;Estimating beta-mixing coefficients via histograms&quot;, &lt;a href=&quot;http://arxiv.org/abs/1109.5998&quot;&gt;arxiv:1109.5998&lt;/a&gt;&lt;/dt&gt;
&lt;dd&gt;&lt;em&gt;Abstract&lt;/em&gt;: The literature on statistical learning for time series often assumes asymptotic independence or &quot;mixing&quot; of data sources. Beta-mixing has long been important in establishing the central limit theorem and invariance principle for stochastic processes; recent work has identified it as crucial to extending results from empirical processes and statistical learning theory to dependent data, with quantitative risk bounds involving the actual beta coefficients. There is, however, presently no way to actually estimate those coefficients from data; while general functional forms are known for some common classes of processes (Markov processes, ARMA models, etc.), specific coefficients are generally beyond calculation. We present an \( L_1 \)-risk consistent estimator for the beta-mixing coefficients, based on a single stationary sample path. Since mixing coefficients involve infinite-order dependence, we use an order-d Markov approximation. We prove high-probability concentration results for the Markov approximation and show that as \( d \rightarrow \infty \), the Markov approximation converges to the true mixing coefficient. Our estimator is constructed using d dimensional histogram density estimates. Allowing asymptotics in the bandwidth as well as the dimension, we prove \( L_1 \) concentration for the histogram as an intermediate step.&lt;/dd&gt;
&lt;/dl&gt;

&lt;P&gt;&lt;a href=&quot;668.html&quot;&gt;Recall&lt;/a&gt; the world's simplest ergodic theorem: suppose \( X_t \) is a sequence
of random variables with common expectation \( m \) and variance \( v \), and
stationary covariance \( \mathrm{Cov}[X_t, X_{t+h}] = c_h \).  Then
the time average \( \overline{X}_n \equiv \frac{1}{n}\sum_{i=1}^{n}{X_i} \) also has expectation \( m \), and the question is whether it converges on that expectation.
The world's simplest ergodic theorem asserts that
if the correlation time
\[
T = \frac{\sum_{h=1}^{\infty}{|c_h|}}{v} &lt; \infty
\]
then
\[
\mathrm{Var}\left[ \overline{X}_n \right] \leq \frac{v}{n}(1+2T)
\]

&lt;P&gt;Since, as I said, the expectation of \( \overline{X}_n \) is \( m \) and its variance is
going to zero, we say that \( \overline{X}_n \rightarrow m \) &quot;in mean square&quot;.

&lt;P&gt;From this, we can get a crude but often effective &lt;a href=&quot;http://bactra.org/notebooks/deviation-inequalities.html&quot;&gt;deviation
inequality&lt;/a&gt;, using &lt;a href=&quot;http://en.wikipedia.org/wiki/Chebyshev's_inequality&quot;&gt;Chebyshev's inequality&lt;/a&gt;:

\[
\Pr{\left(|\overline{X}_n - m| &gt; \epsilon\right)} \leq \frac{v}{\epsilon^2}\frac{1+2T}{n}
\]


&lt;P&gt;The meaning of the condition that the correlation time \( T \) be finite is
that the correlations themselves have to trail off as we consider events which
are widely separated in time &amp;mdash; they don't ever have to be zero, but they
do need to get smaller and smaller as the separation \( h \) grows.  (One can
actually weaken the requirement on the covariance function to just \(
\lim_{n\rightarrow \infty}{\frac{1}{n}\sum_{h=1}^{n}{c_h}} = 0 \), but this
would take us too far afield.)  In fact, as these formulas show, the
convergence looks just like what we'd see for independent data, only with \(
\frac{n}{1+2T} \) samples instead of \( n \), so we call the former the
effective sample size.
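
&lt;P&gt;To see these formulas at work, here is a minimal simulation sketch in
Python.  It assumes a Gaussian AR(1) process \( X_t = \phi X_{t-1} + \epsilon_t
\), chosen purely for illustration, for which \( c_h = v\phi^h \) and so \( T =
|\phi|/(1-|\phi|) \).  The parameter values and variable names are arbitrary.
&lt;pre&gt;
```python
import numpy as np

def ar1_paths(phi, n, n_paths, rng):
    """Simulate stationary AR(1) paths X_t = phi X_{t-1} + eps_t, eps ~ N(0, 1)."""
    v = 1.0 / (1.0 - phi ** 2)                 # stationary variance
    x = np.empty((n_paths, n))
    x[:, 0] = rng.normal(scale=np.sqrt(v), size=n_paths)   # start in stationarity
    for t in range(1, n):
        x[:, t] = phi * x[:, t - 1] + rng.normal(size=n_paths)
    return x, v

phi, n, n_paths = 0.8, 1000, 2000
rng = np.random.default_rng(42)
x, v = ar1_paths(phi, n, n_paths, rng)
means = x.mean(axis=1)                         # one time average per path

T = abs(phi) / (1.0 - abs(phi))                # correlation time, sum of |phi|^h over h from 1
bound = (v / n) * (1.0 + 2.0 * T)              # bound on the variance of the time average
n_eff = n / (1.0 + 2.0 * T)                    # effective sample size
var_of_mean = means.var()                      # Monte Carlo estimate of that variance

eps = 0.5
chebyshev = bound / eps ** 2                   # deviation bound (here the true mean m = 0)
freq = float(np.mean(np.greater(np.abs(means), eps)))

print("correlation time:", T, "effective sample size:", n_eff)
print("variance bound:", bound, "simulated variance:", var_of_mean)
print("Chebyshev bound:", chebyshev, "observed frequency:", freq)
```
&lt;/pre&gt;
With \( \phi = 0.8 \) we get \( T = 4 \), so 1000 correlated observations are
worth about 111 independent ones.  The simulated variance of the time average
should come out close to the bound, while the observed frequency of large
deviations should fall well below the Chebyshev bound, which is crude.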

&lt;P&gt;All of this is about the convergence of averages of \( X_t \), and based on
its covariance function \( c_h \).  What if we care not about \( X \) but about
\( f(X) \)?  The same idea would apply, but unless \( f \) is linear, we can't
easily get its covariance function from \( c_h \).  The mathematicians'
solution to this has been to invent stronger notions of decay-of-correlations,
called &quot;mixing&quot;.  Very roughly speaking, we say that \( X \) is mixing when, if
you pick any two (nice) functions \( f \) and \( g \), I can always show that

\[
\lim_{h\rightarrow\infty}{\mathrm{Cov}\left[ f(X_t), g(X_{t+h}) \right]} = 0
\]

&lt;P&gt;Note (or believe) that this is &quot;convergence in distribution&quot;; it happens if,
and only if, the distribution of events up to time \( t \) is becoming
independent of the distribution of events from time \( t+h \) onwards.

&lt;P&gt;To get useful results, it is necessary to quantify mixing, which is usually
done through somewhat stronger notions of dependence.  (Unfortunately, none of
these have meaningful names.
The &lt;a href=&quot;http://arxiv.org/abs/math/0511078&quot;&gt;review by Bradley&lt;/a&gt; ought to
be the standard reference.)  For instance, the &quot;total variation&quot; or \( L_1 \)
distance between probability measures \( P \) and \( Q \), with densities \( p
\) and \( q \), is

\[
d_{TV}(P,Q) = \frac{1}{2}\int{|p(u) - q(u)| du}
\]

This has several interpretations, but the easiest to grasp is that it says how
much \( P \) and \( Q \) can differ in the probability they give to any one
event: for any \( E \), \( d_{TV}(P,Q) \geq |P(E) - Q(E)| \).  One use of this
distance is to measure the dependence between random variables, by seeing how
far their joint distribution is from the product of their marginal
distributions.  Abusing notation a little to write \( P(U,V) \) for the joint
distribution of \( U \) and \( V \), we measure dependence as

\[
\beta(U,V) \equiv d_{TV}(P(U,V), P(U) \otimes P(V)) = \frac{1}{2}\int{|p(u,v)-p(u)p(v)|du dv}
\]

This will be zero just when \( U \) and \( V \) are statistically independent,
and one when, on average, conditioning on \( U \) confines \( V \) to a set
which would otherwise have probability zero.  (For instance if \( U \) has a
continuous distribution and \( V \) is a function of \( U \) &amp;mdash; or one of
two randomly chosen functions of \( U \).)
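
&lt;P&gt;For discrete variables the integral becomes a sum, and the two extreme
cases can be checked directly.  What follows is only a minimal sketch in
Python, not code from the papers; the function name and the toy distributions
are invented for illustration.

&lt;pre&gt;
```python
# Beta-dependence for discrete variables: half the L1 distance between the
# joint distribution and the product of its marginals.

def beta_dependence(joint):
    """joint[i][j] = Pr(U = i, V = j); rows index U, columns index V."""
    p_u = [sum(row) for row in joint]            # marginal of U
    p_v = [sum(col) for col in zip(*joint)]      # marginal of V
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_uv in enumerate(row):
            total += abs(p_uv - p_u[i] * p_v[j])
    return 0.5 * total

# Independent fair coins: the joint equals the product of the marginals.
independent = [[0.25, 0.25], [0.25, 0.25]]

# V is a copy of U, with U uniform on k values.
k = 10
copied = [[1.0 / k if i == j else 0.0 for j in range(k)] for i in range(k)]

beta_independent = beta_dependence(independent)
beta_copied = beta_dependence(copied)
```
&lt;/pre&gt;

&lt;P&gt;When \( V \) copies a \( U \) that is uniform on \( k \) values, the sum
works out to \( 1 - 1/k \), which approaches one as \( k \) grows, in line with
the continuous case described above.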

&lt;P&gt;We can relate this back to the earlier idea of correlations between functions by realizing that

\[
\beta(U,V) = \frac{1}{2}\sup_{|r|\leq 1}{\left|\int{r(u,v) dP(U,V)} - \int{r(u,v)dP(U)dP(V)}\right|} ~,
\]

that is, \( \beta \) says how much the expected value of a bounded function \( r \)
could change between the dependent and the independent distributions.  (There
is no assumption that the test function \( r \) factorizes, and in fact it's
important to allow \( r(u,v) \neq f(u)g(v) \).)





&lt;P&gt;We apply these ideas to time series by looking at the dependence between the past and the future:

\[
\begin{eqnarray*}
\beta(h) &amp; \equiv &amp; d_{TV}(P(X^t_{-\infty}, X_{t+h}^{\infty}), P(X^t_{-\infty}) \otimes P(X_{t+h}^{\infty})) \\
&amp; = &amp; \frac{1}{2}\int{|p(x^t_{-\infty},x_{t+h}^{\infty})-p(x^t_{-\infty})p(x^{\infty}_{t+h})|dx^t_{-\infty}dx^{\infty}_{t+h}}
\end{eqnarray*}
\]

(By stationarity, the integral actually does not depend on \( t \).)  When \(
\beta(h) \rightarrow 0 \) as \( h \rightarrow \infty \), we have a
&quot;beta-mixing&quot; process.  (These are also called
&quot;&lt;a href=&quot;http://absolutely-regular.blogspot.com/&quot;&gt;absolutely regular&lt;/a&gt;&quot;.)
Convergence in total variation implies convergence in distribution, but not
vice versa, so beta-mixing is stronger than common-or-garden mixing.

&lt;P&gt;Notions like beta-mixing were originally introduced purely for probabilistic
convenience, to handle questions
like &lt;a href=&quot;http://www.pnas.org/cgi/reprint/42/1/43&quot;&gt;&quot;when does the central
limit theorem hold for stochastic processes?&quot;&lt;/a&gt;  These are interesting for
people who like stochastic processes, or indeed for those who want to do Markov
chain Monte Carlo and want to know how long to let the chain run.  For our
purposes, though, what's important is that when people in statistical learning
theory have
given &lt;a href=&quot;http://bactra.org/notebooks/dependent-learning.html&quot;&gt;serious
attention to dependent data&lt;/a&gt;, they have usually relied on a beta-mixing
assumption.

&lt;P&gt;The reason for this focus on beta-mixing is that it &quot;plays nicely&quot; with
approximating dependent processes by independent ones.  The usual form of such
arguments is as follows.  We want to prove a result about our dependent but
mixing process \( X \).  For instance, we realize that our favorite prediction
model will tend to do worse out-of-sample than on the data used to fit it, and
we might want to bound the probability that this over-fitting will exceed \(
\epsilon \).  If we know the beta-mixing coefficients \( \beta(h) \), we can
pick a separation, call it \( a \), where \( \beta(a) \) is reasonably small.
Now we divide \( X \) up into \( \mu = n/a \) blocks of length \( a \).  If we
take every other block, they're nearly independent of each other (because \(
\beta(a) \) is small) but not quite (because \( \beta(a) \neq 0 \)).  Introduce
a (fictitious) random sequence \( Y \), where blocks of length \( a \) have the
same distribution as the blocks in \( X \), but there's no dependence between
blocks.  Since the blocks of \( Y \) are IID, it is easy for us to prove that, for
instance, the probability of over-fitting \( Y \) by more than \( \epsilon \)
is at most some small \( \delta(\epsilon,\mu/2) \).  Since \( \beta \) tells us
about how well dependent probabilities are approximated by independent ones,
the probability of the bad event happening with the dependent data is at most
\( \delta(\epsilon,\mu/2) + (\mu/2)\beta(a) \).  We can make this as small as
we like by letting \( \mu \) and \( a \) both grow as the time series gets
longer.  Basically, any result which holds for an IID process will also
hold for a beta-mixing one, with a penalty in the probability that depends on
\( \beta \).  There are some details to fill in here (how to pick the
separation \( a \)?  should the blocks always be the same length as the
&quot;filler&quot; between blocks?), but this is the basic frame.
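
&lt;P&gt;The bookkeeping of the blocking argument can be sketched in a few lines of
Python.  This is only an illustration: the function names are made up, and the
values used for the IID bound and for \( \beta(a) \) are arbitrary placeholders
rather than estimates of anything.

&lt;pre&gt;
```python
def alternate_blocks(series, a):
    """Cut `series` into consecutive blocks of length `a` (dropping any
    incomplete block at the end) and keep every other block."""
    mu = len(series) // a                        # number of complete blocks
    blocks = [series[i * a:(i + 1) * a] for i in range(mu)]
    return blocks[::2]                           # blocks 0, 2, 4, ...

def dependent_bound(delta_iid, n_kept, beta_a):
    """Bound for the dependent data: the IID bound plus a mixing penalty of
    (number of kept blocks) * beta(a), as in the text."""
    return delta_iid + n_kept * beta_a

x = list(range(12))                              # stand-in for a time series
kept = alternate_blocks(x, a=2)                  # mu = 6, so mu/2 = 3 blocks
bound = dependent_bound(delta_iid=0.01, n_kept=len(kept), beta_a=0.002)
```
&lt;/pre&gt;

&lt;P&gt;Here the skipped blocks play the role of the &quot;filler&quot;, so consecutive kept
blocks are separated by \( a \) time steps.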

&lt;P&gt;What it leaves open, however, is how to &lt;em&gt;estimate&lt;/em&gt; the mixing
coefficients \( \beta(h) \).  For Markov models, one could in principle
calculate them from the transition probabilities.  For more general processes,
though, calculating beta from the known distribution is not easy.  In fact, we
are not aware of any previous work on &lt;em&gt;estimating&lt;/em&gt; the \( \beta(h) \)
coefficients from observational data.  (References welcome!)  Because of this,
even in learning theory, people have just assumed that the mixing coefficients
were known, or that it was known they went to zero at a certain rate.  This was
not enough for what we wanted to do, which was actually calculate bounds on
error from data.
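
&lt;P&gt;To make the Markov case concrete, here is a minimal Python sketch for a
finite-state chain.  It assumes the chain is stationary and uses the identity
that \( \beta(h) \) is then the average, over the stationary distribution of the
current state, of the total variation distance between the \( h \)-step
transition probabilities and the stationary distribution.  The two-state chain
and the function names are hypothetical.

&lt;pre&gt;
```python
def mat_mult(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_beta(trans, stationary, h):
    """beta(h), for h = 1, 2, ..., of a stationary finite-state Markov chain
    with transition matrix `trans` and stationary distribution `stationary`."""
    power = trans
    for _ in range(h - 1):
        power = mat_mult(power, trans)           # power = h-step transitions
    beta = 0.0
    for i, pi_i in enumerate(stationary):
        tv = 0.5 * sum(abs(power[i][j] - pi_j)
                       for j, pi_j in enumerate(stationary))
        beta += pi_i * tv                        # weight by Pr(current state)
    return beta

# Hypothetical two-state chain that switches state with probability 0.1.
trans = [[0.9, 0.1], [0.1, 0.9]]
stationary = [0.5, 0.5]
betas = [markov_beta(trans, stationary, h) for h in range(1, 6)]
```
&lt;/pre&gt;

&lt;P&gt;For this particular chain the coefficients decay geometrically, \( \beta(h)
= 0.5 \times 0.8^h \), since 0.8 is the second eigenvalue of the transition
matrix.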

&lt;P&gt;There were two tricks to actually coming up with an estimator.  The first
was to reduce the ambitions a little bit.  If you look at the equation for \(
\beta(h) \) above, you'll see that it involves integrating over the
infinite-dimensional distribution.  This is daunting, so instead of looking at
the whole past and future, we'll introduce a horizon, \( d \) steps away, and
cut things off there:

\[
\begin{eqnarray*}
\beta^{(d)}(h) &amp; \equiv &amp; d_{TV}(P(X^t_{t-d}, X_{t+h}^{t+h+d}), P(X^t_{t-d}) \otimes P(X_{t+h}^{t+h+d})) \\
&amp; = &amp; \frac{1}{2}\int{|p(x^t_{t-d},x_{t+h}^{t+h+d})-p(x^t_{t-d})p(x^{t+h+d}_{t+h})|dx^t_{t-d}dx^{t+h+d}_{t+h}}
\end{eqnarray*}
\]

If \( X \) is a Markov process, then there's no difference between \(
\beta^{(d)}(h) \) and \( \beta(h) \).  If \( X \) is a Markov process of order
\( p \), then \( \beta^{(d)}(h) = \beta(h) \) once \( d \geq p \).  If \( X \)
is not Markov at any order, it is still the case that \( \beta^{(d)}(h)
\rightarrow \beta(h) \) as \( d \) grows.  So we have an approximation to \(
\beta \) which only involves finite-dimensional integrals, which we might have
some hope of doing.

&lt;P&gt;The other trick is to get rid of those integrals.  Another way of writing
the beta-dependence between the random variables \( U \) and \( V \) is

\[
\beta(U,V) =
\sup_{\mathcal{A},\mathcal{B}}{\frac{1}{2}\sum_{a\in\mathcal{A}}{\sum_{b\in\mathcal{B}}{\left| \Pr{(a
\cap b)} - \Pr{(a)}\Pr{(b)} \right|}}}
\]

where \( \mathcal{A} \) runs over finite partitions of values of \( U \), and
\( \mathcal{B} \) likewise runs over finite partitions of values of \( V \).  I
won't try to show that this formula is equivalent to the earlier definition,
but I will contend that if you think about how that integral gets cashed out as
a sum, you can sort of see how it would be.  If we want \( \beta^{(d)}(h) \),
we can take \( U = X^{t}_{t-d} \) and \( V = X^{t+h+d}_{t+h} \), and we could
find the dependence by taking the supremum over partitions of those two
variables.

&lt;P&gt;Now, suppose that the joint density \( p(x^t_{t-d},x_{t+h}^{t+h+d}) \) was
piecewise constant, with those pieces being rectangles parallel to the
coordinate axes.  Then sub-dividing those rectangles would not change the sum,
and the \( \sup \) would actually be attained for that particular partition.
Most densities are not of course piecewise constant, but we
can &lt;em&gt;approximate&lt;/em&gt; them by such piecewise-constant functions, and make
the approximation arbitrarily close (in total variation).  More, we
can &lt;a href=&quot;algae-2009-11.html#combinatorial-methods&quot;&gt;&lt;em&gt;estimate&lt;/em&gt;
those piecewise-constant approximating densities&lt;/a&gt; from a time series.  Those
estimates are, simply, histograms, which are about the oldest form of
&lt;a href=&quot;../notebooks/density-estimation.html&quot;&gt;density estimation&lt;/a&gt;.  We show
that histogram density estimates converge in total variation on the true
densities, when the bin-width is allowed to shrink as we get more data.

&lt;P&gt;Because the total variation distance is in fact a metric, we can use the
triangle inequality to get an upper bound on the true beta coefficient, in
terms of the beta coefficients of the estimated histograms, and the expected
error of the histogram estimates.  All of the error terms shrink to zero as the
time series gets longer, so we end up with consistent estimates of \(
\beta^{(d)}(h) \).  That's enough if we have a Markov process, but in general
we don't.  So we can let \( d \) grow as \( n \) does, and that (after a
surprisingly long measure-theoretic argument) turns out to do the job: our
histogram estimates of \( \beta^{(d)}(h) \), with suitably-growing \( d \),
converge on the true \( \beta(h) \).
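
&lt;P&gt;A stripped-down version of the plug-in idea, for a scalar series with \( d
= 0 \) and a fixed number of equal-width bins, might look like the Python below.
This is a sketch under those simplifying assumptions, not the estimator from the
papers, which uses histograms over blocks of observations and lets the bin-width
shrink with \( n \); the function and argument names are invented.

&lt;pre&gt;
```python
from collections import Counter

def histogram_beta(series, h, n_bins, lo, hi):
    """Plug-in estimate of beta^(0)(h): bin X_t and X_{t+h}, then take half
    the L1 distance between the joint histogram and the product of its
    marginal histograms."""
    width = (hi - lo) / n_bins

    def bin_of(x):
        # Clamp so that values at or beyond the ends fall in the end bins.
        return max(0, min(int((x - lo) / width), n_bins - 1))

    pairs = [(bin_of(series[t]), bin_of(series[t + h]))
             for t in range(len(series) - h)]
    n = len(pairs)
    joint = Counter(pairs)                       # counts of (past, future)
    past = Counter(u for u, _ in pairs)          # marginal counts of X_t
    future = Counter(v for _, v in pairs)        # marginal counts of X_{t+h}
    total = 0.0
    for u in range(n_bins):
        for v in range(n_bins):
            total += abs(joint[(u, v)] / n - (past[u] / n) * (future[v] / n))
    return 0.5 * total

# Deterministic toy series 0, 1, 0, 1, ...: each value fixes the next one.
x = [t % 2 for t in range(101)]
beta_hat_1 = histogram_beta(x, h=1, n_bins=2, lo=0.0, hi=1.0)
```
&lt;/pre&gt;

&lt;P&gt;On this toy series the estimate at \( h = 1 \) is 0.5, the value for
perfect dependence between two equally likely bins.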

&lt;P&gt;To confirm that this works, the papers go through some simulation examples,
where it's possible to cross-check our estimates.  We can of course also do
this for empirical time series.  For instance, in his thesis Daniel took four
standard macroeconomic time series for the US (GDP, consumption, investment,
and hours worked, all de-trended in the usual way).  This data goes back to
1948, and is measured four times a year, so there are 255 quarterly
observations.  Daniel estimated a \( \beta \) of 0.26 at one quarter's
separation, \( \widehat{\beta}(2) = 0.15 \), \( \widehat{\beta}(3) = 0.02 \),
and somewhere between 0 and 0.11 for \(\widehat{\beta}(4) \).  (That last is a
sign that we don't have enough data to go beyond \( h = 4 \).)  Optimistically
assuming no dependence beyond a year, one can calculate the effective number of
independent data points, which is not 255 but 31.  This has morals for
macroeconomics which are worth dwelling on, but that will have to wait for
another time.  (Spoiler: \( \sqrt{\frac{1}{31}} \approx 0.18 \), and that's if
you're lucky.)
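
&lt;P&gt;To spell out the arithmetic behind that spoiler, assume the usual \(
1/\sqrt{n} \) scaling for the standard error of a sample mean, measured in units
of the series' standard deviation, and compare the nominal with the effective
sample size:

\[
\sqrt{\frac{1}{255}} \approx 0.063 ~, \qquad \sqrt{\frac{1}{31}} \approx 0.18 ~, \qquad \sqrt{\frac{255}{31}} \approx 2.9
\]

That is, on this rough reckoning the uncertainty is nearly three times what a
naive count of the 255 observations would suggest.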


&lt;P&gt;It's inelegant to have to construct histograms when all we want is a single
number, so it wouldn't surprise us if there were a slicker way of doing this.
(For &lt;a href=&quot;http://bactra.org/notebooks/entropy-estimation.html&quot;&gt;estimating
mutual information&lt;/a&gt;, which is in many ways analogous, estimating the joint
distribution as an intermediate step is neither necessary nor desirable.)  But
for now, we &lt;em&gt;can&lt;/em&gt; do it, when we couldn't before.

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_enigmas_of_chance.html&quot;&gt;Enigmas of Chance&lt;/a&gt;;
&lt;a href=&quot;cat_incestuous_amplification.html&quot;&gt;Kith and Kin&lt;/a&gt;;
&lt;a href=&quot;cat_selfcentered.html&quot;&gt;Self-Centered&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Graphical Causal Models (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/906.html</link>
    <description>
&lt;P&gt;Probabilistic prediction is about passively selecting a sub-ensemble,
leaving all the mechanisms in place, and seeing what turns up after applying
that filter.  Causal prediction is about actively &lt;em&gt;producing&lt;/em&gt; a new
ensemble, and seeing what would happen if something were to change
(&quot;counterfactuals&quot;).  Graphical causal models are a way of reasoning about
causal prediction; their algebraic counterparts are structural equation models
(generally nonlinear and non-Gaussian).  The causal Markov property.
Faithfulness.  Performing causal prediction by &quot;surgery&quot; on causal graphical
models.  The d-separation criterion.  Path diagram rules for linear models.

&lt;P&gt;&lt;em&gt;Reading&lt;/em&gt;:
Notes, &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch22.pdf&quot;&gt;chapter
22&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Exam: Is This Test Really Necessary? (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/905.html</link>
    <description>
&lt;P&gt;In which the analysis of multivariate data is recursively applied.

&lt;P&gt;&lt;em&gt;Reading&lt;/em&gt;:
Notes, &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/exams/exam-2.pdf&quot;&gt;assignment&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  <item>
    <title>Graphical Models (Advanced Data Analysis from an Elementary Point of View)</title>
    <link>http://bactra.org/weblog/904.html</link>
    <description>

&lt;P&gt;Conditional independence and dependence properties in factor models.  The
generalization to graphical models.  Directed acyclic graphs.  DAG models.
Factor, mixture, and Markov models as DAGs.  The graphical Markov property.
Reading conditional independence properties from a DAG.  Creating conditional
dependence properties from a DAG.  Statistical aspects of DAGs.  Reasoning with
DAGs; does asbestos whiten teeth?


&lt;P&gt;&lt;em&gt;Reading&lt;/em&gt;:
Notes, &lt;a href=&quot;http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch21.pdf&quot;&gt;chapter
21&lt;/a&gt;

&lt;P&gt;&lt;span class=&quot;blognotes&quot;&gt;
&lt;a href=&quot;cat_36-402.html&quot;&gt;Advanced Data Analysis from an Elementary Point of View&lt;/a&gt;
&lt;/span&gt;
</description>
  </item>
  </channel>
</rss>

