This project uses reinforcement learning (RL) to solve a navigation task in a defined environment, a "Lawnmower Grid World". The environment consists of 16 states, and the goal is to maximize collected rewards while avoiding penalties. The primary task is to move from the initial state at the top-left corner of the grid to the goal state at the bottom-right corner, collecting batteries (positive rewards) and avoiding rocks (negative rewards).
- States: The states are defined as tuples representing the row and column indices on a 4x4 grid, ranging from (0, 0) to (3, 3).
- Actions: Four possible actions include moving Up, Down, Right, and Left.
- Rewards: Rocks represent negative rewards (-5, -6), batteries positive rewards (+5, +6), and the goal state offers the highest reward (+10).
- Objective: Navigate the grid to reach the goal state at (3, 3) while maximizing the total reward (see the environment sketch below).
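
The following is a minimal Python sketch of this environment. The exact positions of the rocks and batteries are not specified here, so the placements in the `rewards` dictionary are illustrative placeholders, as are the class and method names (`LawnmowerGridWorld`, `reset`, `step`).

```python
# Minimal sketch of the 4x4 Lawnmower Grid World.
# Rock and battery positions below are assumed for illustration only.

class LawnmowerGridWorld:
    ACTIONS = ["Up", "Down", "Right", "Left"]

    def __init__(self):
        self.size = 4
        self.start = (0, 0)
        self.goal = (3, 3)
        # Cell rewards: rocks negative, batteries positive, goal highest.
        self.rewards = {
            (1, 1): -5, (2, 2): -6,   # rocks (assumed positions)
            (0, 2): +5, (3, 1): +6,   # batteries (assumed positions)
            self.goal: +10,           # goal state
        }
        self.state = self.start

    def reset(self):
        """Return the agent to the start state."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Apply an action, clipping moves at the grid border."""
        row, col = self.state
        if action == "Up":
            row = max(row - 1, 0)
        elif action == "Down":
            row = min(row + 1, self.size - 1)
        elif action == "Right":
            col = min(col + 1, self.size - 1)
        elif action == "Left":
            col = max(col - 1, 0)
        self.state = (row, col)
        reward = self.rewards.get(self.state, 0)
        done = self.state == self.goal
        return self.state, reward, done
```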
- Approach: SARSA is an on-policy learning algorithm: the agent learns from the action it actually takes under its current policy.
- Update Formula: Q(s, a) = Q(s, a) + α * (r + γ * Q(s', a') - Q(s, a)), where α is the learning rate, r the immediate reward, γ the discount factor, and Q(s, a), Q(s', a') the Q-values of the current and next state-action pairs.
- Features: Learns directly from actual experience while balancing exploration and exploitation, for example with an ε-greedy policy (see the training-loop sketch after this list).
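
A minimal sketch of a SARSA training loop over the grid world above might look as follows. It assumes the `LawnmowerGridWorld` sketch defined earlier and an ε-greedy behaviour policy; the hyperparameter values (`alpha`, `gamma`, `epsilon`, `episodes`) are illustrative placeholders, not the project's tuned settings.

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, state, actions, epsilon):
    """Pick a random action with probability epsilon, else the greedy one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def sarsa(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)  # Q[(state, action)] -> value
    for _ in range(episodes):
        state = env.reset()
        action = epsilon_greedy(Q, state, env.ACTIONS, epsilon)
        done = False
        while not done:
            next_state, reward, done = env.step(action)
            # On-policy: the next action comes from the same policy
            # and is used directly in the update target.
            next_action = epsilon_greedy(Q, next_state, env.ACTIONS, epsilon)
            td_target = reward + gamma * Q[(next_state, next_action)] * (not done)
            Q[(state, action)] += alpha * (td_target - Q[(state, action)])
            state, action = next_state, next_action
    return Q
```

Because the target evaluates the action the policy will actually take, the learned values reflect exploration noise, which typically makes SARSA more conservative around penalty cells.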
- Approach: Q-Learning is an off-policy algorithm that learns the optimal policy independently of the behaviour policy by using a greedy update target.
- Update Formula: Q(s, a) = Q(s, a) + α * (r + γ * max_a' Q(s', a') - Q(s, a)), where the maximum is taken over all actions a' available in the next state s'.
- Features: The maximization step in the target tends to overestimate Q-values, but the method remains robust in large state spaces (see the sketch below).
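
For comparison, a Q-Learning loop under the same assumptions (the `LawnmowerGridWorld` sketch and illustrative hyperparameters) differs only in its update target, which maximizes over the next state's actions instead of evaluating the action the policy actually chooses next.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)  # Q[(state, action)] -> value
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Behaviour policy: epsilon-greedy exploration.
            if random.random() < epsilon:
                action = random.choice(env.ACTIONS)
            else:
                action = max(env.ACTIONS, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Off-policy target: greedy max over the next state's actions,
            # regardless of what the behaviour policy does next.
            best_next = max(Q[(next_state, a)] for a in env.ACTIONS)
            td_target = reward + gamma * best_next * (not done)
            Q[(state, action)] += alpha * (td_target - Q[(state, action)])
            state = next_state
    return Q
```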
The project showcases the application of SARSA and Q-Learning to a simple yet illustrative RL problem, emphasizing the importance of hyperparameter tuning and algorithm selection based on the specific characteristics of the environment and task objectives.