Reinforcement Learning
Borahan Tümer
Lecture Slides
  1. Introduction
  2. Evaluative Feedback
  3. Reinforcement Learning Problem
  4. Dynamic Programming
  5. Monte Carlo Methods
  6. Temporal-Difference Learning
  7. Eligibility Traces (ETs)
  8. Generalization and Function Approximation (will be added)
  9. Planning and Learning
  10. Hierarchical RL (HRL) (will be added)

Main References
  • Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
  • Sutton, Richard S., Doina Precup, and Satinder Singh. "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning." Artificial Intelligence 112.1-2 (1999): 181-211.
Reference Books
  • Sutton, Richard S., and Andrew G. Barto. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998. (The second edition is in progress; a draft is available online.)
Related Papers
  • RL in Non-Stationary Environments
    • Da Silva, Bruno C., et al. "Dealing with non-stationary environments using context detection." Proceedings of the 23rd International Conference on Machine Learning. ACM, 2006.
    • Hadoux, Emmanuel, Aurélie Beynier, and Paul Weng. "Sequential decision-making under non-stationary environments via sequential change-point detection." Learning over Multiple Contexts (LMCE), 2014.
    • Jaulmes, Robin, Joelle Pineau, and Doina Precup. "Learning in non-stationary partially observable Markov decision processes." ECML Workshop on Reinforcement Learning in Non-Stationary Environments. Vol. 25. 2005.
    • Choi, Samuel P. M., Dit-Yan Yeung, and Nevin L. Zhang. "Hidden-mode Markov decision processes for nonstationary sequential decision making." Sequence Learning. Springer, Berlin, Heidelberg, 2000. 264-287.
    • Choi, Samuel P. M., Dit-Yan Yeung, and Nevin L. Zhang. "An environment model for nonstationary reinforcement learning." Advances in Neural Information Processing Systems, 2000.
  • Hierarchical RL
    • Sutton, Richard S., Doina Precup, and Satinder Singh. "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning." Artificial Intelligence 112.1-2 (1999): 181-211.
    • Yücesoy, Yigit E., and M. Borahan Tümer. "Hierarchical Reinforcement Learning with Context Detection (HRL-CD)." International Journal of Machine Learning and Computing 5.5 (2015): 353.
    • Stolle, Martin, and Doina Precup. "Learning options in reinforcement learning." International Symposium on Abstraction, Reformulation, and Approximation. Springer, Berlin, Heidelberg, 2002.
    • Botvinick, Matthew M., Yael Niv, and Andrew C. Barto. "Hierarchically organized behavior and its neural foundations: a reinforcement learning perspective." Cognition 113.3 (2009): 262-280.
    • Şimşek, Özgür, and Andrew G. Barto. "Skill characterization based on betweenness." Advances in Neural Information Processing Systems, 2009.
    • McGovern, Amy, and Andrew G. Barto. "Automatic discovery of subgoals in reinforcement learning using diverse density." ICML. Vol. 1. 2001.
  • Transfer Learning in RL
    • Taylor, Matthew E., and Peter Stone. "Transfer learning for reinforcement learning domains: A survey." Journal of Machine Learning Research 10 (2009): 1633-1685.
  • Multi-Agent RL
    • Goncu, Burak, and M. Borahan Tümer. "Reinforcement learning in non-stationary environments using spatiotemporal analysis." (Under review.)
  • RL with Continuous State Spaces
    • Whiteson, Shimon Azariah. Adaptive Representations for Reinforcement Learning. PhD dissertation, The University of Texas at Austin, 2007.
  • Other
    • Moore, Andrew W., and Christopher G. Atkeson. "Prioritized sweeping: Reinforcement learning with less data and less time." Machine Learning 13.1 (1993): 103-130.