gridworld | Gridworlds and Markov Decision Processes in Python

by stober | Python Version: Current | License: BSD-2-Clause

kandi X-RAY | gridworld Summary

gridworld is a Python library. gridworld has no bugs, no reported vulnerabilities, and a permissive license; a build file is available, and it has low support. You can download it from GitHub.

Gridworlds and Markov Decision Processes in Python. Author: Jeremy Stober. Contact: stober@gmail.com. License: BSD (see LICENSE). This package contains implementations of discrete MDPs known as gridworlds that are convenient for testing reinforcement learning algorithms. Some visualization is provided, along with a few other MDPs drawn from the reinforcement learning literature. Note: this package now has an additional dependency for sparse feature generation.

            kandi-support Support

              gridworld has a low active ecosystem.
              It has 10 star(s) with 5 fork(s). There are 3 watchers for this library.
              It had no major release in the last 6 months.
              gridworld has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of gridworld is current.

            kandi-Quality Quality

              gridworld has 0 bugs and 0 code smells.

            kandi-Security Security

              gridworld has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              gridworld code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              gridworld is licensed under the BSD-2-Clause License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              gridworld releases are not available. You will need to build from source code and install.
              Build file is available. You can build the component from source.
              It has 1699 lines of code, 196 functions and 13 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

kandi has reviewed gridworld and identified the functions below as its top functions. This is intended to give you an instant insight into gridworld's implemented functionality, and to help you decide if it suits your requirements.
            • Get shortest path from start to end
            • Return the coordinates of an action
            • Return the action at i
• Return the next state
            • Compute the feature probability of an action
            • Number of features
• Return the observation associated with state s
            • Compute the feature matrix of a feature
            • Calculate the rfc function of s
            • Return the coords of the grid
            • Finds the perfect policy for a given state
            • Calculate the coordinates of the grid
            • Simulate the game
            • Move the trace back to the trace
            • Return the vphi of the state
            • Check if the given state is terminal
            • Compute the feature matrix
            • Calculate the r function of s
            • Compute the feature probability density of a feature
            • Compute the indices of the bin indices for a given s
            • Generate a list of wall patterns
            • Determine the coordinates of the grid
            • Returns the neighbors of the given state

            gridworld Key Features

            No Key Features are available at this moment for gridworld.

            gridworld Examples and Code Snippets

            No Code Snippets are available at this moment for gridworld.

            Community Discussions

            QUESTION

            Multidimensional Action Space in Reinforcement Learning
            Asked 2022-Apr-17 at 22:05

My goal is to train an agent (a ship) that, for now, takes two actions: 1. choosing its heading angle (where to go next), and 2. choosing its acceleration (whether or not to change its speed).

However, it seems that I cannot understand how to properly construct my action space and state space. I keep getting an error which I do not know how to fix. I have been trying to make it work using the Space wrapper.

            I use the following code.

            ...

            ANSWER

            Answered 2022-Apr-17 at 15:05

            I think the error message already explained it clearly.
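The original error message is not reproduced in the thread. As a general sketch, a two-component continuous action (heading angle plus acceleration) is usually modeled as a single 2-dimensional box rather than two separate spaces; with OpenAI Gym that would be one `gym.spaces.Box`. A minimal, dependency-free illustration of the idea (the bounds here are hypothetical):

```python
import numpy as np

# Hypothetical bounds for the two continuous actions: heading angle
# in [-pi, pi] and acceleration in [-1, 1]. With Gym this would be
# a single space, e.g.:
#   gym.spaces.Box(low=np.array([-np.pi, -1.0]),
#                  high=np.array([np.pi, 1.0]), dtype=np.float32)
low = np.array([-np.pi, -1.0], dtype=np.float32)
high = np.array([np.pi, 1.0], dtype=np.float32)

def sample_action(rng=np.random.default_rng()):
    """Draw one 2-dimensional action uniformly from the box."""
    return rng.uniform(low, high).astype(np.float32)

def contains(action):
    """Check that an action lies inside the box bounds."""
    action = np.asarray(action, dtype=np.float32)
    return action.shape == (2,) and bool(np.all(low <= action) and np.all(action <= high))
```

Treating both actions as one vector keeps the agent's output a single array, which is what most RL libraries expect.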

            Source https://stackoverflow.com/questions/71901031

            QUESTION

            Python Erroneously Modifying Values of Another Array
            Asked 2021-Nov-23 at 07:01

            I am working on codifying the policy iteration for a Gridworld task using Python.

            My idea was to have two arrays holding the Gridworld, one that holds the results of the previous iteration, and one that holds the results of the current iteration; however, once I wrote the code for it I noticed my values in my results were off because the array that holds the previous iteration was also being modified.

            ...

            ANSWER

            Answered 2021-Nov-23 at 07:01

            I believe the comment from @TimRoberts above correctly diagnoses the issue.

            These arrays currently reference the same object, so any update to one updates the other.

When you initialize arr2 = arr1, it creates a reference to the same object in memory, so when one object is updated, so are the values of the other.

            To create an array without this pointer style reference you can use:
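The snippet itself is elided in the thread; a sketch of the usual fixes for a nested grid, either copying each row or using `copy.deepcopy` (variable names here are illustrative):

```python
import copy

grid = [[0.0, 0.0], [0.0, 0.0]]

alias = grid                        # same object: changes show up in both names
shallow = [row[:] for row in grid]  # new outer list, new inner rows
deep = copy.deepcopy(grid)          # fully independent copy

grid[0][0] = 1.0
print(alias[0][0])    # 1.0 - alias points at the very same list
print(shallow[0][0])  # 0.0 - unaffected
print(deep[0][0])     # 0.0 - unaffected
```

For a two-level grid of plain floats, the per-row copy and `deepcopy` behave the same; `deepcopy` is the safer choice when the nesting is deeper.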

            Source https://stackoverflow.com/questions/70075832

            QUESTION

Initialising 2D array/list in Python for 2 different variables
            Asked 2021-Jun-09 at 00:39

I'm attempting to port something from Java to Python and was curious how I would go about converting this method. It is used to initialise a 2D array with 2 different variables. Here is the code in Java:

            ...

            ANSWER

            Answered 2021-Mar-21 at 13:13

            In order to initialize a list of n duplicate values, you can use the multiplication operator, like this:
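The answer's snippet is elided; a sketch of the multiplication-operator idiom, including the aliasing pitfall to avoid in the 2D case:

```python
n, m = 3, 2

# One-dimensional case: the multiplication operator duplicates the value.
row = [0] * n
print(row)  # [0, 0, 0]

# Pitfall for 2D: [[0] * n] * m duplicates *references* to one row.
bad = [[0] * n] * m
bad[0][0] = 1
print(bad)  # [[1, 0, 0], [1, 0, 0]] - both rows changed!

# Correct 2D initialisation: build a fresh row per iteration.
good = [[0] * n for _ in range(m)]
good[0][0] = 1
print(good)  # [[1, 0, 0], [0, 0, 0]]
```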

            Source https://stackoverflow.com/questions/66732578

            QUESTION

            Why is this tensorflow training taking so long?
            Asked 2021-May-13 at 12:42

            I'm learning DRL with the book Deep Reinforcement Learning in Action. In chapter 3, they present the simple game Gridworld (instructions here, in the rules section) with the corresponding code in PyTorch.

I've experimented with the code, and it takes less than 3 minutes to train the network to an 89% win rate (89 of 100 games won after training).

            As an exercise, I have migrated the code to tensorflow. All the code is here.

The problem is that with my tensorflow port it takes nearly 2 hours to train the network, with a win rate of 84%. Both versions train using only the CPU (I don't have a GPU).

Training loss figures seem correct, and so does the win rate (we have to take into consideration that the game is random and can have impossible states). The problem is the performance of the overall process.

I must be doing something terribly wrong, but what?

            The main differences are in the training loop, in torch is this:

            ...

            ANSWER

            Answered 2021-May-13 at 12:42
            Why is TensorFlow slow

TensorFlow has 2 execution modes: eager execution and graph mode. Since version 2, TensorFlow defaults to eager execution. Eager execution is great, as it enables you to write code close to standard Python: it's easier to write and easier to debug. Unfortunately, it's really not as fast as graph mode.

            So the idea is, once the function is prototyped in eager mode, to make TensorFlow execute it in graph mode. For that you can use tf.function. tf.function compiles a callable into a TensorFlow graph. Once the function is compiled into a graph, the performance gain is usually quite important. The recommended approach when developing in TensorFlow is the following:

            • Debug in eager mode, then decorate with @tf.function.
            • Don't rely on Python side effects like object mutation or list appends.
            • tf.function works best with TensorFlow ops; NumPy and Python calls are converted to constants.

            I would add: think about the critical parts of your program, and which ones should be converted first into graph mode. It's usually the parts where you call a model to get a result. It's where you will see the best improvements.

            You can find more information in the following guides:

            Applying tf.function to your code

            So, there are at least two things you can change in your code to make it run quite faster:

            1. The first one is to not use model.predict on a small amount of data. The function is made to work on a huge dataset or on a generator. (See this comment on Github). Instead, you should call the model directly, and for performance enhancement, you can wrap the call to the model in a tf.function.

Model.predict is a top-level API designed for batch prediction outside of any loops, with the full features of the Keras APIs.

2. The second one is to make your training step a separate function, and to decorate that function with @tf.function.

            So, I would declare the following things before your training loop:
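The declarations the answer goes on to show were elided; a sketch of the pattern it describes, with the training step wrapped in @tf.function and the model called directly instead of via model.predict (the model, optimizer, and shapes here are hypothetical stand-ins for the question's Q-network):

```python
import tensorflow as tf

# Hypothetical stand-ins for the question's Q-network and targets.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4),
])
optimizer = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.MeanSquaredError()

@tf.function  # traced into a graph on the first call, reused afterwards
def train_step(states, targets):
    with tf.GradientTape() as tape:
        q_values = model(states, training=True)  # direct call, not model.predict
        loss = loss_fn(targets, q_values)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

The first call to train_step pays a one-time tracing cost; every later call with the same input shapes runs the compiled graph, which is where the speedup over eager per-step Python comes from.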

            Source https://stackoverflow.com/questions/67383458

            QUESTION

            Create a new Actor with Location in Gridworld
            Asked 2020-Dec-06 at 17:51

            I want to create a Gridworld with one "Car"-actor and a fix spawn location:

            ...

            ANSWER

            Answered 2020-Dec-06 at 17:51

            You can use the ActorWorld.add() method with the Location argument to place the actor at a specific location:

            add

            public void add(Location loc, Actor occupant)

            Adds an actor to this world at a given location.

            In your case it would be something like:

            Source https://stackoverflow.com/questions/65168134

            QUESTION

            Why does initialising the variable inside or outside of the loop change the code behaviour?
            Asked 2020-Jun-05 at 22:58

            I am implementing policy iteration in python for the gridworld environment as a part of my learning. I have written the following code:

            ...

            ANSWER

            Answered 2020-Jun-05 at 22:38

            On your first time through the loop, policy_converged is set to False. After that, nothing will ever set it to True, so the break is never reached, and it loops forever.
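The questioner's loop was elided; a sketch of the fix the answer implies, re-initialising the flag at the top of each sweep and setting it to False only when the policy actually changes (function and variable names are hypothetical):

```python
# Sketch: re-set the convergence flag every iteration, not once before the loop.
def policy_iteration_skeleton(improve_policy, policy, max_iters=1000):
    for _ in range(max_iters):
        policy_converged = True          # re-initialised every sweep
        new_policy = improve_policy(policy)
        if new_policy != policy:
            policy_converged = False     # a change means we must keep going
        policy = new_policy
        if policy_converged:
            break                        # now reachable once the policy is stable
    return policy

# Toy improvement step that stabilises after reaching "c".
steps = {"a": "b", "b": "c", "c": "c"}
print(policy_iteration_skeleton(steps.get, "a"))  # "c"
```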

            Source https://stackoverflow.com/questions/62225037

            QUESTION

            How to use DeepQLearning in Julia for very large states?
            Asked 2020-May-28 at 04:56

            I would like to use the DeepQLearning.jl package from https://github.com/JuliaPOMDP/DeepQLearning.jl. In order to do so, we have to do something similar to

            ...

            ANSWER

            Answered 2020-May-28 at 04:56

DeepQLearning does not require enumerating the state space and can handle continuous-space problems. DeepQLearning.jl only uses the generative interface of POMDPs.jl. As such, you do not need to implement the states function, but just gen and initialstate (see the link on how to implement the generative interface).

            However, due to the discrete action nature of DQN you also need POMDPs.actions(mdp::YourMDP) which should return an iterator over the action space.

            By making those modifications to your implementation you should be able to use the solver.

            The neural network in DQN takes as input a vector representation of the state. If your state is a m dimensional vector, the neural network input will be of size m. The output size of the network will be equal to the number of actions in your model.

            In the case of the grid world example, the input size of the Flux model is 2 (x, y positions) and the output size is length(actions(mdp))=4.

            Source https://stackoverflow.com/questions/62055175

            QUESTION

            Linker Error "unresolved external symbol" when using constructor of other class
            Asked 2020-May-07 at 17:57

            I've been struggling to find why my linker gets an unresolved external symbol error. The error looks like this:

            ...

            ANSWER

            Answered 2020-May-07 at 17:57

            Provide default constructor of the class as well.

            Source https://stackoverflow.com/questions/61664275

            QUESTION

            explain the: self.stateSpace=[i for i range(self.m*self.n)]
            Asked 2020-Apr-29 at 12:30

This code is supposed to be the beginning of a custom gym environment, but I can't work out the syntax of the following for loop. I don't understand what it does.

            ...

            ANSWER

            Answered 2020-Apr-29 at 02:17

While it is not clear which lines of code you need explained, I believe your question is about list comprehensions in Python, as that is the focus of the question title.
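The title's snippet drops the `in` keyword (`for i range(...)` is a syntax error); with it restored, the comprehension simply enumerates every cell index of an m-by-n grid. A sketch of the equivalence:

```python
m, n = 3, 4

# The comprehension from the title (with the missing "in" restored):
stateSpace = [i for i in range(m * n)]

# It is equivalent to this explicit loop...
expanded = []
for i in range(m * n):
    expanded.append(i)

# ...and, for this simple case, to list(range(...)).
print(stateSpace == expanded == list(range(12)))  # True
```

Each grid cell (row, col) gets the flat index row * n + col, so the state space is just the integers 0 through m*n - 1.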

            Source https://stackoverflow.com/questions/61493082

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install gridworld

            You can download it from GitHub.
            You can use gridworld like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            Find more information at:

            CLONE
          • HTTPS

            https://github.com/stober/gridworld.git

          • CLI

            gh repo clone stober/gridworld

          • sshUrl

            git@github.com:stober/gridworld.git
