RL Algorithms
Below are the implemented algorithms and their brief descriptions; minimal, illustrative sketches of each algorithm's core update follow the list.
- [x] Deep Q-Learning (DQN)
    - `dqn.py`
        - For discrete action space.
    - `dqn_atari.py`
        - For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
- [x] Categorical DQN (C51)
    - `c51.py`
        - For discrete action space.
    - `c51_atari.py`
        - For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
    - `c51_atari_visual.py`
        - Adds return and q-values visualization for `c51_atari.py`.
- [x] Proximal Policy Optimization (PPO)
    - All of the PPO implementations below are augmented with some code-level optimizations. See https://costa.sh/blog-the-32-implementation-details-of-ppo.html for more details.
    - `ppo.py`
        - For discrete action space.
    - `ppo_continuous_action.py`
        - For continuous action space. Also implements Mujoco-specific code-level optimizations.
    - `ppo_atari.py`
        - For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
- [x] Soft Actor-Critic (SAC)
    - `sac_continuous_action.py`
        - For continuous action space.
- [x] Deep Deterministic Policy Gradient (DDPG)
    - `ddpg_continuous_action.py`
        - For continuous action space.
- [x] Twin Delayed Deep Deterministic Policy Gradient (TD3)
    - `td3_continuous_action.py`
        - For continuous action space.
- [x] Ape-X Deep Q-Learning (Ape-X DQN)
    - `apex_dqn_atari_visual.py`
        - For playing Atari games. It uses convolutional layers and common Atari-based pre-processing techniques.
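
To make the list concrete, here is a minimal sketch of the DQN temporal-difference update in PyTorch. The `QNetwork`, `dqn_loss`, and `batch` names are hypothetical, not taken from `dqn.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Small MLP Q-network for a discrete action space (hypothetical)."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """One-step TD loss: regress Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    obs, actions, rewards, next_obs, dones = batch
    # Q-values of the actions actually taken in the batch.
    q_values = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrap from the target network; zero out terminal transitions.
        next_q = target_net(next_obs).max(dim=1).values
        td_target = rewards + gamma * (1.0 - dones) * next_q
    return F.mse_loss(q_values, td_target)
```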
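
For C51, the distinctive step is projecting the bootstrapped return distribution back onto a fixed support of atoms. A sketch under assumed defaults (`n_atoms=51`, `v_min=-10`, `v_max=10`); the function name and arguments are illustrative, not the actual `c51.py` code.

```python
import torch

def project_distribution(next_probs, rewards, dones, gamma=0.99,
                         n_atoms=51, v_min=-10.0, v_max=10.0):
    """Project r + gamma * Z(s', a*) onto the fixed support of `n_atoms` atoms.

    `next_probs` (batch, n_atoms): target-network probabilities for the
    greedy next action.
    """
    delta_z = (v_max - v_min) / (n_atoms - 1)
    support = torch.linspace(v_min, v_max, n_atoms, device=rewards.device)
    # Apply the Bellman operator to every atom, then clamp to the support.
    tz = (rewards.unsqueeze(1)
          + gamma * (1.0 - dones).unsqueeze(1) * support).clamp(v_min, v_max)
    b = (tz - v_min) / delta_z              # fractional atom positions
    lower, upper = b.floor().long(), b.ceil().long()
    # When b lands exactly on an atom, lower == upper; nudge so no mass is lost.
    lower[(upper > 0) & (lower == upper)] -= 1
    upper[(lower < n_atoms - 1) & (lower == upper)] += 1
    # Split each atom's probability between its two nearest neighbours.
    proj = torch.zeros_like(next_probs)
    proj.scatter_add_(1, lower, next_probs * (upper.float() - b))
    proj.scatter_add_(1, upper, next_probs * (b - lower.float()))
    return proj
```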
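
The core of every PPO variant above is the clipped surrogate objective. A minimal sketch, assuming advantages are already computed (e.g. with GAE); `clip_coef` is an illustrative name for the clipping hyperparameter.

```python
import torch

def ppo_policy_loss(new_logprobs, old_logprobs, advantages, clip_coef=0.2):
    # Probability ratio between the current policy and the one that collected the data.
    ratio = (new_logprobs - old_logprobs).exp()
    # PPO takes the pessimistic minimum of the unclipped and clipped objectives.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_coef, 1.0 + clip_coef) * advantages
    return -torch.min(unclipped, clipped).mean()
```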
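
SAC's actor maximizes the soft (entropy-regularized) Q-value. A sketch assuming a squashed-Gaussian `actor` that returns a reparameterized action and its log-probability, and twin critics `q1`/`q2`; all names are hypothetical.

```python
import torch

def sac_actor_loss(actor, q1, q2, obs, alpha=0.2):
    # Reparameterized sample so gradients flow through the action.
    actions, log_probs = actor(obs)
    # Minimum of the twin critics mitigates Q-value overestimation.
    q = torch.min(q1(obs, actions), q2(obs, actions))
    # Maximize Q plus entropy  <=>  minimize alpha * log_pi - Q.
    return (alpha * log_probs - q).mean()
```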
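
DDPG pairs a deterministic actor with a single critic trained toward a target computed from slowly-updated target networks. A minimal sketch; the function and tensor names are illustrative.

```python
import torch
import torch.nn.functional as F

def ddpg_losses(actor, critic, target_actor, target_critic, batch, gamma=0.99):
    obs, actions, rewards, next_obs, dones = batch
    with torch.no_grad():
        # Target: r + gamma * Q_target(s', mu_target(s')).
        next_q = target_critic(next_obs, target_actor(next_obs)).squeeze(-1)
        td_target = rewards + gamma * (1.0 - dones) * next_q
    critic_loss = F.mse_loss(critic(obs, actions).squeeze(-1), td_target)
    # The actor ascends the critic's estimate of its own action's value.
    actor_loss = -critic(obs, actor(obs)).mean()
    return critic_loss, actor_loss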
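
TD3 modifies DDPG with twin critics, target policy smoothing, and delayed actor updates. The sketch below shows the smoothed, clipped double-Q target; `noise_std`, `noise_clip`, and `max_action` are illustrative hyperparameter names.

```python
import torch

def td3_target(target_actor, target_q1, target_q2,
               next_obs, rewards, dones, gamma=0.99,
               noise_std=0.2, noise_clip=0.5, max_action=1.0):
    with torch.no_grad():
        mu = target_actor(next_obs)
        # Target policy smoothing: clipped noise on the target action.
        noise = (torch.randn_like(mu) * noise_std).clamp(-noise_clip, noise_clip)
        next_actions = (mu + noise).clamp(-max_action, max_action)
        # Clipped double-Q: the smaller of the twin target critics.
        next_q = torch.min(target_q1(next_obs, next_actions),
                           target_q2(next_obs, next_actions)).squeeze(-1)
        return rewards + gamma * (1.0 - dones) * next_q
```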
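
Ape-X decouples many parallel actors from a single learner through a shared prioritized replay buffer. A real implementation uses a sum-tree and multiple processes; the sketch below shows only proportional prioritized sampling with importance-sampling weights, with illustrative names throughout.

```python
import numpy as np

def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4):
    """Sample indices with probability proportional to priority^alpha."""
    probs = priorities ** alpha
    probs /= probs.sum()
    idx = np.random.choice(len(priorities), size=batch_size, p=probs)
    # Importance-sampling weights correct for the non-uniform sampling,
    # normalized by the maximum weight for stability.
    weights = (len(priorities) * probs[idx]) ** (-beta)
    weights /= weights.max()
    return idx, weights
```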