https://zhuanlan.zhihu.com/p/24392239
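I. Lua packages (framework: Torch 7):
1. Related paper: Human-level control through deep reinforcement learning
Code link (requires a VPN). Alternative link (no VPN needed): kuz/DeepMind-Atari-Deep-Q-Learner
Algorithm implemented: Deep Q-Networks (DQN)
Rating: ★★★★★
Why recommended: this is the first deep reinforcement learning package open-sourced by Google, so its importance speaks for itself.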
2. Package: ehrenbrav/DeepQNetwork
Algorithm implemented: DQN
Application: playing Super Mario
Rating: ★★★
Related paper: Human-Level Control through Deep Reinforcement Learning
Algorithms implemented: DQN, persistent advantage learning, dueling network, double DQN, A3C
Rating: ★★★★
4. Package: iassael/learning-to-communicate
Algorithms implemented: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL)
Rating: ★★★
Related paper: [1605.06676] Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Rating: ★★★
Why recommended: a simple environment for creating very simple 2D games and training neural network models to perform tasks within them
Related paper: A Sandbox for Learning from Games
6. Package: eparisotto/ActorMimic
Rating: ★★★
Algorithm implemented: ActorMimic
Related paper: Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
II. Python DRL packages:
Based on TensorFlow:
1. Package: devsisters/DQN-tensorflow
Algorithm implemented: DQN
Rating: ★★★
Related paper: Human-Level Control through Deep Reinforcement Learning
2. Package: gliese581gg/DQN_tensorflow
Algorithm implemented: DQN
Rating: ★★
3. Package: nivwusquorum/tensorflow-deepq
Algorithm implemented: DQN
Rating: ★★★★
Why recommended: it can be run from a Jupyter Notebook
4. Package: deep-rl-tensorflow
Algorithms implemented: DQN, DDQN, Dueling Network
Related papers:
[1] Playing Atari with Deep Reinforcement Learning
[2] Human-Level Control through Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Dueling Network Architectures for Deep Reinforcement Learning
Rating: ★★★★★
Why recommended: it implements several DRL algorithms on TensorFlow and is a good base to extend.
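The three algorithms named in this entry differ mainly in how the learning target and the Q-values are formed. The following is a minimal pure-Python sketch of those formulas as given in papers [2] to [4]; it is not code from any of the listed packages, and the function names (`dqn_target`, `double_dqn_target`, `dueling_q_values`) are hypothetical names chosen for illustration.

```python
from typing import List, Sequence


def dqn_target(reward: float, done: bool, gamma: float,
               q_target_next: Sequence[float]) -> float:
    """DQN target: r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def dueling_q_values(state_value: float,
                     advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Illustrative numbers only: Q-value estimates for the next state s'.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [5.0, 0.5, 4.0]

plain = dqn_target(1.0, False, 0.5, q_target_next)
double = double_dqn_target(1.0, False, 0.5, q_online_next, q_target_next)
```

With these made-up numbers the plain DQN target uses the target network's own maximum, while the Double DQN target uses the action preferred by the online network, which is how Double DQN reduces overestimation.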
5. Package: coreylynch/async-rl
Algorithm implemented: A3C
Rating: ★★★★
Related paper: Asynchronous Methods for Deep Reinforcement Learning
Why recommended: it combines TensorFlow, Keras, and OpenAI Gym
Based on Keras:
1. Package: matthiasplappert/keras-rl
Algorithms implemented:
- Deep Q Learning (DQN) [1], [2]
- Double DQN [3]
- Deep Deterministic Policy Gradient (DDPG) [4]
- Continuous DQN (CDQN or NAF) [6]
- Cross-Entropy Method (CEM) [7], [8]
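The related papers are listed under the bracketed reference links.
Rating: ★★★★★
Why recommended: in my view the best Keras-based DRL package, with fairly complete algorithm coverage (both discrete and continuous action spaces)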
Based on Theano:
Algorithm implemented: DQN
Rating: ★★★
Why recommended: built on the Theano framework
Based on the neon deep learning library:
1. Package: tambetm/simple_dqn
Algorithm implemented: DQN
Related paper: Human-Level Control through Deep Reinforcement Learning
Rating: ★★
Other Python DRL packages:
Main algorithms implemented: DQN, prioritized experience replay, double Q-learning, DDPG
Rating: ★★★
Algorithm implemented: A3C
Rating: ★★
Related paper: Asynchronous Methods for Deep Reinforcement Learning
3. Package: miyosuda/async_deep_reinforce
Algorithm implemented: A3C
Rating: ★★
Related paper: Asynchronous Methods for Deep Reinforcement Learning
Algorithms implemented:
- REINFORCE
- Truncated Natural Policy Gradient
- Reward-Weighted Regression
- Relative Entropy Policy Search
- Trust Region Policy Optimization
- Cross Entropy Method
- Covariance Matrix Adaption Evolution Strategy
- Deep Deterministic Policy Gradient
Rating: ★★★★★
Why recommended: it comes from OpenAI, so the quality is a given.
Rating: ★★★★★
Why recommended: OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.
Rating: ★★★★★
Why recommended: A software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications. (What a bright future!)
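Of the algorithms in the list above, the Cross Entropy Method is the easiest to show in a few lines. The following is a minimal sketch on a one-dimensional toy objective, not code from the package itself; the function `cross_entropy_method`, its parameters, and the quadratic score function are all assumptions made for illustration.

```python
import random
import statistics
from typing import Callable


def cross_entropy_method(score: Callable[[float], float],
                         init_mean: float = 0.0, init_std: float = 5.0,
                         population: int = 50, n_elite: int = 10,
                         iterations: int = 30, seed: int = 0) -> float:
    """Maximise `score` by repeatedly refitting a Gaussian to the best samples."""
    rng = random.Random(seed)
    mean, std = init_mean, init_std
    for _ in range(iterations):
        # 1. Sample candidate parameters from the current Gaussian.
        samples = [rng.gauss(mean, std) for _ in range(population)]
        # 2. Keep the highest-scoring candidates (the "elite" set).
        elite = sorted(samples, key=score, reverse=True)[:n_elite]
        # 3. Refit the Gaussian to the elite set.
        mean = statistics.mean(elite)
        std = statistics.pstdev(elite)
    return mean


# Toy objective (an assumption for this sketch): maximum at x = 3.
best_x = cross_entropy_method(lambda x: -(x - 3.0) ** 2)
```

In an RL setting the sampled value would typically be a vector of policy parameters and `score` the return of an episode run with those parameters; the toy objective only keeps the sketch self-contained.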
7. Package: joschu/modular_rl
Algorithms implemented: TRPO, Proximal Policy Optimization, CEM
Rating: ★★★
Algorithm implemented: Variational Information Maximizing Exploration (VIME)
Rating: ★★
Related paper: VIME: Variational Information Maximizing Exploration
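C and C++ DRL packages:
1. Related paper: Human-level control through deep reinforcement learning
Code link
Algorithm implemented: DQN
Rating: ★★★★
Why recommended: the first attempt to tackle deep reinforcement learning on the Caffe deep learning framework.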
2. Package: Replicating-DeepMind
Main algorithm implemented: DQN
Rating: ★★
3. Package: xbpeng/DeepTerrainRL
Related paper: Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning
Rating: ★★
Rating: ★★★★★
Why recommended: A customisable 3D platform for agent-based AI research (meant to compete with OpenAI's Universe?)
5. Package: junhyukoh/nips2015-action-conditional-video-prediction
Rating: ★★
Related paper: Action-Conditional Video Prediction using Deep Networks in Atari Games
Algorithm implemented: DRQN (Recurrent DQN)
Rating: ★★★
Related paper: Deep Recurrent Q-Learning for Partially Observable MDPs
JavaScript DRL packages:
1. Package: karpathy/reinforcejs
Algorithms implemented:
- Dynamic Programming methods
- (Tabular) Temporal Difference Learning (SARSA/Q-Learning)
- Deep Q-Learning
- Stochastic/Deterministic Policy Gradients and Actor Critic architectures for dealing with continuous action spaces. (very alpha, likely buggy or at the very least finicky and inconsistent)
Rating: ★★★★
Why recommended: doing all this in plain JavaScript is impressive enough for me.
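For readers new to the tabular methods mentioned in this entry, the following is a minimal Q-learning sketch in Python. It is not taken from reinforcejs; the five-state chain environment, the `step` and `q_learning` functions, and all hyperparameters are assumptions made for illustration.

```python
import random
from typing import List, Tuple

N_STATES = 5          # states 0..4; state 4 is terminal
LEFT, RIGHT = 0, 1


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Deterministic chain: reward 1.0 only on reaching the terminal state."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes: int = 500, alpha: float = 0.5,
               gamma: float = 0.9, seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a uniformly random behaviour
            # policy is enough to explore this tiny problem.
            action = rng.choice((LEFT, RIGHT))
            next_state, reward, done = step(state, action)
            # Bootstrap from the best action in the next state.
            bootstrap = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + bootstrap - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy action in each non-terminal state.
greedy_policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
```

SARSA differs only in the bootstrap term: it uses the Q-value of the action actually taken in the next state instead of the maximum. Deep Q-Learning replaces the table `q` with a neural network.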
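Java DRL packages:
1. Package: deeplearning4j/rl4j
Algorithms implemented: DQN, A3C
Rating: ★★★★★
Why recommended: Java is my favorite language, and in my view the one with the most commercial value right now.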
Some other fun DRL projects:
1. Packages: yenchenlin/DeepLearningFlappyBird and songrotek/DRL-FlappyBird
Algorithm implemented: DQN
Application: playing Flappy Bird
Rating: ★★★★
2. Package: bitwise-ben/Snake
Algorithm implemented: DQN
Application: playing Snake
Rating: ★★★
3. Package: yanpanlau/DDPG-Keras-Torcs
Algorithm implemented: DDPG
Framework: Keras
Application: TORCS car racing
Related paper: Deep Deterministic Policy Gradient
Rating: ★★★★★
Why recommended: most guys are probably into cars
Algorithm implemented: DSR
Application: the Doom shooter
Rating: ★★★★
Related paper: Deep Successor Reinforcement Learning (DSR)
In the two years since, many more valuable reinforcement learning projects have been open-sourced:
Summary: Python replication for Sutton & Barto's book Reinforcement Learning: An Introduction (2nd Edition)
2. dennybritz/reinforcement-learning
Summary: Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
3. MorvanZhou/Reinforcement-learning-with-tensorflow
Summary: Simple reinforcement learning tutorials, well suited to beginners.
Summary: TRFL (pronounced "truffle") is a library built on top of TensorFlow that exposes several useful building blocks for implementing Reinforcement Learning agents.
Summary: Deep Learning and Reinforcement Learning Library for Scientists
Summary: OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Summary: Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
Summary: Deep Reinforcement Learning for Keras.
Summary: a TensorFlow library for applied reinforcement learning
Summary: rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.
11. NervanaSystems/coach
Summary: Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
12. tensorflow/agents
Summary: TF-Agents is a library for Reinforcement Learning in TensorFlow