Related Papers & Resources

This section lists the research papers, articles, and resources that form the theoretical foundation of AZ-Go and give context for AlphaZero-style reinforcement learning.

Foundational Papers

AlphaGo Zero (2017)

“Mastering the game of Go without human knowledge”
David Silver, Julian Schrittwieser, Karen Simonyan, et al.
Nature 550, 354–359 (2017)

Key contributions:

  • Tabula rasa reinforcement learning
  • Dual-headed neural network (policy + value)
  • Self-play training without human games
  • Simplified MCTS without rollouts
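The rollout-free search listed above picks moves with a PUCT-style rule that combines the network's policy prior with visit statistics. The following is a minimal sketch of that selection step, not AZ-Go's actual code: the function name `puct_select`, the flat-list data layout, the default `c_puct`, and the choice of Q = 0 for unvisited children are all assumptions made for illustration.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child that maximizes Q + U.

    priors       -- policy-head probabilities P(s, a) for each child
    visit_counts -- N(s, a) for each child
    value_sums   -- accumulated backed-up values W(s, a) for each child
    c_puct       -- exploration constant (hypothetical default)
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (p, n, w) in enumerate(zip(priors, visit_counts, value_sums)):
        # Mean action value; unvisited children default to 0 (an assumption).
        q = w / n if n > 0 else 0.0
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + u
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With equal priors the rule prefers the child with the higher mean value; with few visits the prior dominates, which is how the policy head focuses the search without rollouts.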

AlphaZero (2018)

“A general reinforcement learning algorithm that masters chess, shogi and Go through self-play”
David Silver, Thomas Hubert, Julian Schrittwieser, et al.
Science 362, 1140–1144 (2018)

Key contributions:

  • Generalization to multiple games
  • Unified algorithm architecture
  • No domain-specific augmentations
  • Superhuman play after hours of training in chess and shogi, and little more than a day in Go

MuZero (2020)

“Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model”
Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, et al.
Nature 588, 604–609 (2020)

Key contributions:

  • Model-based reinforcement learning
  • Learning without knowing game rules
  • Planning in learned latent space

Technical Deep Dives

Neural Network Architecture

“Residual Networks Behave Like Ensembles of Relatively Shallow Networks”
Andreas Veit, Michael Wilber, Serge Belongie
NeurIPS 2016

Background for understanding the residual (ResNet) architectures used in AlphaZero.

“Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”
Sergey Ioffe, Christian Szegedy
ICML 2015

Critical for stable training of deep networks.
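As a rough illustration of what batch normalization computes at training time, the sketch below normalizes one feature across a mini-batch and then applies a scale and shift. It is plain Python written for readability, not an excerpt from AZ-Go; the function name `batch_norm` and the default values of `gamma`, `beta`, and `eps` are assumptions.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    batch -- list of activations for a single feature
    gamma -- learnable scale (hypothetical default)
    beta  -- learnable shift (hypothetical default)
    eps   -- small constant for numerical stability
    """
    mean = sum(batch) / len(batch)
    # Biased (population) variance, as used for the training-time statistics.
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

Frameworks additionally keep running averages of the mean and variance for use at inference time, which this sketch omits.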

Monte Carlo Tree Search

“A Survey of Monte Carlo Tree Search Methods”
Cameron Browne, Edward Powley, Daniel Whitehouse, et al.
IEEE TCIAIG 2012

Comprehensive overview of MCTS variants and applications.

“Bandit based Monte-Carlo Planning”
Levente Kocsis, Csaba Szepesvári
ECML 2006

Introduces UCT, which applies the UCB1 bandit rule to tree search and underlies most modern MCTS variants.
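For readers new to UCT, the selection step can be sketched as follows: each child is scored by its mean reward plus an exploration bonus that shrinks as the child is visited more often. This is an illustrative sketch only; the function name `uct_select`, the list-based layout, and the default exploration constant are assumptions, and practical implementations tune that constant per domain.

```python
import math


def uct_select(visit_counts, value_sums, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1-style score.

    visit_counts -- number of times each child has been tried
    value_sums   -- total reward accumulated through each child
    c            -- exploration constant (hypothetical default)
    """
    parent_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (n, w) in enumerate(zip(visit_counts, value_sums)):
        if n == 0:
            # Try every action once before comparing scores.
            return i
        score = w / n + c * math.sqrt(math.log(parent_visits) / n)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

AlphaZero-style programs replace this bonus with one weighted by a learned policy prior, but the trade-off between exploiting high mean values and exploring rarely visited moves is the same.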

Implementation Details

Distributed Training

“ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero”
Yuandong Tian, Jerry Ma, Qucheng Gong, et al.
ICML 2019

Facebook’s ELF OpenGo implementation insights.

Optimization Techniques

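“Decoupled Weight Decay Regularization”
Ilya Loshchilov, Frank Hutter
ICLR 2019

Introduces AdamW, the optimizer used in many modern implementations. Earlier preprint versions were titled “Fixing Weight Decay Regularization in Adam”.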
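The idea behind AdamW is that weight decay is applied directly to the parameter instead of being added to the gradient, so it is not rescaled by Adam's adaptive step size. The single-parameter sketch below illustrates this under stated assumptions: the function name `adamw_step` and all default hyperparameters are illustrative and are not taken from AZ-Go's configuration.

```python
import math


def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a scalar parameter.

    Returns the updated (theta, m, v). The step counter t starts at 1.
    """
    # Standard Adam moment estimates, computed from the raw gradient only.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Decoupled decay: weight_decay * theta bypasses the adaptive scaling.
    theta = theta - lr * (m_hat / (math.sqrt(v_hat) + eps)
                          + weight_decay * theta)
    return theta, m, v
```

With a zero gradient the parameter still shrinks by `lr * weight_decay * theta`, which shows the decay acting independently of the gradient statistics.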

Go-Specific Research

Computer Go History

“Deep Blue and Beyond: Chess and Go”
Murray Campbell
AI Magazine 2002

Historical context of computer game playing.

Position Evaluation

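“Teaching Deep Convolutional Neural Networks to Play Go”
Christopher Clark, Amos Storkey
arXiv 2014; published at ICML 2015 as “Training Deep Convolutional Neural Networks to Play Go”

Early work on using CNNs to predict expert moves from Go positions.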

Open Source Projects

  1. Leela Zero
    Community implementation of AlphaGo Zero
    GitHub

  2. KataGo
    AlphaZero-style engine with training-efficiency improvements and auxiliary targets such as ownership and score prediction
    GitHub
    Paper

  3. MiniGo
    Simplified Python implementation
    GitHub

Theoretical Background

Reinforcement Learning

“Reinforcement Learning: An Introduction”
Richard Sutton, Andrew Barto
MIT Press, 2nd edition (2018)

Foundational RL concepts and algorithms.

Game Theory

“Combinatorial Game Theory”
Aaron N. Siegel
AMS (2013)

Mathematical foundations of perfect information games.

Recent Advances

Efficient Training

“Accelerating Self-Play Learning in Go”
David J. Wu
arXiv 2019

KataGo’s improvements to training efficiency.

Analysis Tools

“Analyzing Deep Neural Networks for Game AI”
Tom Schaul, et al.
ICML 2022

Methods for understanding learned representations.

Practical Resources

Tutorials and Guides

  1. “AlphaZero from Scratch”
    Step-by-step implementation guide
    Blog Series

  2. “Understanding AlphaGo”
    Visual explanations of key concepts
    Interactive Guide

Video Lectures

  1. David Silver’s RL Course
    UCL Course on Reinforcement Learning
    YouTube Playlist

  2. AlphaGo Documentary
    Behind the scenes of the historic match
    DeepMind Film

Community and Discussion

Forums and Communities

Competitions

Citation Format

When referencing AZ-Go in academic work:

@software{azgo2025,
  author = {Nguyen, Toan and Good, Blake and Leath, Harrison},
  title = {AZ-Go: Distributed AlphaZero Implementation for Go},
  year = {2025},
  url = {https://github.com/yourusername/AZ-Go}
}

Contributing to Research

If you use AZ-Go for research:

  1. Consider sharing your findings
  2. Submit improvements via pull requests
  3. Report interesting discoveries in Issues
  4. Join our research discussion forum

Further Reading

For those wanting to dive deeper:

  1. Mathematical Foundations
    • Markov Decision Processes
    • Multi-armed Bandit Problems
    • Function Approximation in RL
  2. Advanced Topics
    • Curriculum Learning
    • Transfer Learning in Games
    • Explainable AI for Game Playing
  3. Related Domains
    • General Game Playing (GGP)
    • Real-time Strategy Games
    • Imperfect Information Games