| ✅ Proximal Policy Optimization (PPO) | `ppo.py`, docs |
| | `ppo_atari.py`, docs |
| | `ppo_continuous_action.py`, docs |
| | `ppo_atari_lstm.py` |
| | `ppo_procgen.py` |
| ✅ Deep Q-Learning (DQN) | `dqn.py` |
| | `dqn_atari.py` |
| ✅ Categorical DQN (C51) | `c51.py` |
| | `c51_atari.py` |
| ✅ Apex Deep Q-Learning (Apex-DQN) | `apex_dqn_atari.py` |
| ✅ Soft Actor-Critic (SAC) | `sac_continuous_action.py` |
| ✅ Deep Deterministic Policy Gradient (DDPG) | `ddpg_continuous_action.py` |
| ✅ Twin Delayed Deep Deterministic Policy Gradient (TD3) | `td3_continuous_action.py` |