Deep Q-Learning (DQN)
As an extension of Q-learning, DQN's main technical contributions are the use of a replay buffer and a target network, both of which help improve the stability of the algorithm.
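The sketch below illustrates these two ideas in a minimal, hedged form; it is not the exact code in `dqn.py`. The names (`ReplayBuffer`, `dqn_update`, `sync_target`) and the hyperparameter values are illustrative assumptions: the buffer stores transitions for uniform off-policy sampling, and the bootstrapped TD target is computed with a separate target network that is only synchronized periodically.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn


class ReplayBuffer:
    """Fixed-size buffer of (obs, action, reward, next_obs, done) transitions."""

    def __init__(self, capacity=10_000):
        self.storage = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self.storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size=128):
        # Uniformly sample past transitions, breaking temporal correlation.
        indices = random.sample(range(len(self.storage)), batch_size)
        obs, actions, rewards, next_obs, dones = zip(*(self.storage[i] for i in indices))
        as_t = lambda x, dt: torch.as_tensor(np.array(x), dtype=dt)
        return (
            as_t(obs, torch.float32),
            as_t(actions, torch.int64),
            as_t(rewards, torch.float32),
            as_t(next_obs, torch.float32),
            as_t(dones, torch.float32),
        )


def dqn_update(q_net, target_net, optimizer, buffer, gamma=0.99, batch_size=128):
    """One gradient step on the TD error, bootstrapping from the target network."""
    obs, actions, rewards, next_obs, dones = buffer.sample(batch_size)
    with torch.no_grad():
        # The target uses the frozen target network, not the online network.
        next_q = target_net(next_obs).max(dim=1).values
        td_target = rewards + gamma * next_q * (1.0 - dones)
    q_pred = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q_pred, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def sync_target(q_net, target_net):
    """Periodically copy the online network's weights into the target network."""
    target_net.load_state_dict(q_net.state_dict())
```

In a full training loop, the gradient update typically runs every few environment steps and the target sync every few thousand steps; the exact schedules are hyperparameters.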
Original papers:
- Playing Atari with Deep Reinforcement Learning
- Human-level control through deep reinforcement learning

Our single-file implementations of DQN (rough Q-network sketches follow the list):

- `dqn.py`
  - Works with the `Box` observation space of low-level features
  - Works with the `Discrete` action space
  - Works with envs like `CartPole-v1`
- `dqn_atari.py`
  - For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
  - Works with Atari's pixel `Box` observation space of shape `(210, 160, 3)`
  - Works with the `Discrete` action space
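As a rough illustration of the two observation-space regimes above, here is a hedged sketch of the kinds of Q-networks these variants correspond to: an MLP over low-level `Box` features (e.g. `CartPole-v1`) and a Nature-DQN-style CNN over pre-processed Atari frames (commonly resized to 84x84 grayscale and frame-stacked, rather than the raw `(210, 160, 3)` pixels). The layer sizes are illustrative assumptions, not necessarily the exact architectures in `dqn.py` and `dqn_atari.py`.

```python
import torch
import torch.nn as nn


class MLPQNetwork(nn.Module):
    """Q-network for low-level feature observations (e.g. CartPole-v1's 4-dim Box)."""

    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, n_actions),  # one Q-value per discrete action
        )

    def forward(self, x):
        return self.net(x)


class AtariQNetwork(nn.Module):
    """CNN Q-network for stacked 84x84 grayscale Atari frames."""

    def __init__(self, n_actions, frame_stack=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(frame_stack, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        # Pixel observations are usually scaled from [0, 255] to [0, 1].
        return self.net(x / 255.0)
```

Both networks output one Q-value per action, so action selection is an argmax over the final layer's outputs.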