Universal Prediction Algorithms
16 Feb 2008 23:09
Given: a single time series, perhaps a very long one, from a stochastic process which is basically unknown; perhaps merely that it is stationary and ergodic.
Desired: a forecast which will converge on the best possible forecast, as the series becomes longer and longer. Or: the best possible forecast from within a fixed class of forecasting algorithms.
A solution is called a universal prediction algorithm because it applied equally to all the processes within class, and is not tailored to any one of them.
This has connections to information theory (via universal compression algorithms), to the problem of finding Markovian representations, and to many other topics.
See also: Ergodic Theory; Learning Theory; Machine Learning, Statistical Inference and Induction; Time series
- Recommended:
- Paul H. Algoet
- "Universal Schemes for Prediction, Gambling, and Portfolio Selection," Annals of Probability 20 (1992): 901--941 [JSTOR] and an important Correction, 23 (1995): 474--478
- "Universal Schemes for Learning the Best Nonlinear Predictor Given the Infinite Past and Side Information," IEEE Transactions on Information Theory 45 (1999): 1165--1185
- Nicolo Cesa-Bianchi and Gabor Lugosi, Prediction, Learning, and Games
- Shane Legg, "Is There an Elegant Universal Theory of Prediction?", cs.AI/0606070 [A nice set of diagonalization arguments against the hope of a universal prediction scheme which has the nice features of Solomonoff-style induction, but is actually computable.]
- To read:
- L. Gyorfi, G. Morvai, S. Yakowitz, "Limits to consistent on-line forecasting for ergodic time series", IEEE Transactions on Information Theory 44 (1998): 886-892 = arxiv:0712.2430
- Marcus Hutter
- "Convergence and Error Bounds for Universal Prediction of Nonbinary Sequences," cs.LG/0106036
- "Convergence and Loss Bounds for Bayesian Sequence Prediction," cs.LG/0301014
- "General Loss Bounds for Universal Sequence Prediction," cs.AI/0101019
- "Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet", cs.LG/0311014
- Yuri Kalnishkan, Vladimir Vovk and Michael V. Vyugin, "How many strings are easy to predict?", Information and Computation 201 (2005): 55--71 ["It is well known in the theory of Kolmogorov complexity that most strings cannot be compressed; more precisely, only exponentially few (O(2^n-m)) binary strings of length n can be compressed by m bits. This paper extends the 'incompressibility' property of Kolmogorov complexity to the 'unpredictability' property of predictive complexity. The 'unpredictability' property states that predictive complexity (defined as the loss suffered by a universal prediction algorithm working infinitely long) of most strings is close to a trivial upper bound (the loss suffered by a trivial minimax constant prediction strategy). We show that only exponentially few strings can be successfully predicted and find the base of the exponent."]
- Gusztav Morvai, "Guessing the output of a stationary binary time series", arxiv:0710.3760
- Gusztav Morvai, Sanjeev R. Kulkarni, Andrew B. Nobel, "Regression estimation from an individual stable sequence", Statistics 33 (1999): 99--118 = arxiv:0710.2496
- Gusztáv Morvai and Benjamin Weiss
- "Forecasting for Stationary Binary Times Series", Atca Appl. Math. 79 (2003): 25--34 = arxiv:0710.5144
- "Inferring the conditional mean", Theory of Stochastic Processes 11 (2005): 112--120 = arxiv:0710.3757
- "Intermittent estimation of stationary time series", Test 13 (2004): 525--542 = arxiv:0711.0350
- "Limitations on intermittent forecasting", Statistics and Probability Letters 72 (2005): 285--290 = arxiv:0710.3773
- "Prediction for discrete time series", Probability Theory and Related Fields 132 (2005): 1--12 = arxiv:0711.0471
- G. Morvai, S. Yakowitz, L. Gyorfi, "Nonparametric inference for ergodic, stationary time series", Annals of Statistics 24 (1996): 370--379 = arxiv:0711.0367
- Andrew B. Nobel, Gusztav Morvai, Sanjeev R. Kulkarni, "Density estimation from an individual numerical sequence", IEEE Transactions on Information Theory 44 (1998): 537--541 = arxiv:0710.2500
- Donald Ornstein and Benjamin Weiss, "How Sampling Reveals a Process", Annals of Probability 18 (1990): 905--930
- Boris Ryabko and Jaakko Astola
- "Prediction of Large Alphabet Processes and Its Application to Adaptive Source Coding", cs.IT/0504079
- "Universal Codes as a Basis for Time Series Testing", cs.IT/0602084
- "Universal Codes as a Basis for Nonparametric Testing of Serial Independence for Time Series", cs.IT/0506094
- Daniil Ryabko and Marcus Hutter, "On Sequence Prediction for Arbitrary Measures", cs.LG/0606077
- S. Yakowitz, L. Gyorfi, J. Kieffer, G. Morvai, "Strongly consistent nonparametric forecasting and regression for stationary ergodic sequences", J. Multivariate Analysis 71 (1999): 24--41 = arxiv:0712.2592
