Header logo is

Coherent Inference on Optimal Play in Game Trees


Conference Paper



Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in linear time.

Author(s): Hennig, P. and Stern, D. and Graepel, T.
Book Title: JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010
Pages: 326-333
Year: 2010
Month: May
Day: 0
Editors: Teh, Y.W. , M. Titterington
Publisher: JMLR

Department(s): Empirical Inference, Probabilistic Numerics
Bibtex Type: Conference Paper (inproceedings)

Event Name: Thirteenth International Conference on Artificial Intelligence and Statistics
Event Place: Chia Laguna Resort, Italy

Address: Cambridge, MA, USA
Digital: 0

Links: PDF


  title = {Coherent Inference on Optimal Play in Game Trees},
  author = {Hennig, P. and Stern, D. and Graepel, T.},
  booktitle = {JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010},
  pages = {326-333},
  editors = {Teh, Y.W. , M. Titterington },
  publisher = {JMLR},
  address = {Cambridge, MA, USA},
  month = may,
  year = {2010},
  month_numeric = {5}