A3C-Continuous | Tensorflow implementation | Reinforcement Learning library
kandi X-RAY | A3C-Continuous Summary
Tensorflow implementation of the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm (paper) for continuous action space. Code is mostly based on Morvan Zhou (github).
Top functions reviewed by kandi - BETA
- Run the action loop
- Choose a single action
- Pulls all parameters from the global variables
- Update global variables
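The action-selection step listed above can be sketched as follows. This is a minimal NumPy illustration, not the repository's actual code: the real implementation builds a TensorFlow Normal distribution from the network's predicted mean and std, but the sample-then-clip logic is the same.

```python
import numpy as np

A_BOUND = [-240, 240]  # action bounds for the mouse game from the discussion below

def choose_action(mean, std, rng=np.random.default_rng()):
    """Sample a continuous action from the policy's Gaussian and clip it
    to the allowed range (illustrative sketch of A3C continuous control)."""
    action = rng.normal(mean, std)
    return float(np.clip(action, A_BOUND[0], A_BOUND[1]))

a = choose_action(mean=0.0, std=30.0)
assert A_BOUND[0] <= a <= A_BOUND[1]
```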
A3C-Continuous Key Features
A3C-Continuous Examples and Code Snippets
Community Discussions
Trending Discussions on A3C-Continuous
QUESTION
I want to implement reinforcement learning for a game that is played with the mouse. The game only cares about the x-axis of the mouse.
My first try was to make it discrete. The game had 3 actions: two to move the mouse 30 pixels to the left or right, and one to stand still. That worked, but now I want to make it continuous.
What I have done is make the neural network output a mean and std, exactly like this code: https://github.com/stefanbo92/A3C-Continuous/blob/master/a3c.py (I even used this code on a second try). The width of the game is 480, so A_BOUND is [-240, 240]. To keep the mouse position non-negative, I add 240 to the predicted action and set the mouse position to the result.
For example: if the action is -240, the mouse x position becomes 240 + (-240) = 0. The problem is that my neural network outputs only the extremes, 240 or -240, consistently within seconds of the start.
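The mapping described in the question can be written as a small helper (the name `action_to_mouse_x` is ours, for illustration):

```python
GAME_WIDTH = 480
A_BOUND = [-240, 240]  # action bounds from the question

def action_to_mouse_x(action):
    """Shift an action in [-240, 240] to a valid x pixel in [0, 480]."""
    return action + GAME_WIDTH // 2

print(action_to_mouse_x(-240))  # 0
print(action_to_mouse_x(240))   # 480
```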
...ANSWER
Answered 2018-Sep-14 at 17:05
The reason for your problem is that the output of your neural network is being squashed by an activation function. This is a problem because very few input values produce an output that is not the max or min value.
[Plot of the hyperbolic tangent (tanh) activation function.]
As the plot shows, the output is only non-max/min when the input is roughly between -3 and 3; any value outside that range is squashed to the max or min.
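The saturation behaviour is easy to verify numerically:

```python
import math

# tanh saturates quickly: inputs outside roughly [-3, 3] map to nearly ±1
for x in [-5.0, -3.0, 0.0, 3.0, 5.0]:
    print(f"tanh({x:+.1f}) = {math.tanh(x):+.4f}")
```

With bounds of ±240, even a modestly large pre-activation drives the scaled output straight to one of the extremes.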
To overcome this, you must initialize your neural network with very small weights. You can initialize the weights with random uniform values between -0.003 and 0.003; those are the values I use. This way your network initially outputs values close to 0, and as the weights are updated the learning is more stable.
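A NumPy sketch of the effect (the layer sizes here are made up for illustration; the repository itself would do this with a TensorFlow weight initializer):

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical output layer: 200 hidden units -> 1 action;
# weights drawn uniformly from [-0.003, 0.003] as the answer suggests
W_out = rng.uniform(-0.003, 0.003, size=(200, 1))
b_out = rng.uniform(-0.003, 0.003, size=(1,))

h = rng.standard_normal(200)          # some hidden activation
pre_activation = h @ W_out + b_out    # tiny magnitude, so tanh stays near 0
print(np.tanh(pre_activation))        # output close to 0, not saturated
```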
To further correct for this, add a small penalty for large changes in state.
For example: penalty = (state * 0.01) ^ 2, where state ∈ [-240, 240].
This way your neural network learns that there is a higher loss associated with large changes, so it will make them sparingly, and only when necessary.
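The suggested penalty term, written out as code (the function name is ours; in practice this term would be added to the policy loss):

```python
def action_penalty(state_x, coeff=0.01):
    """Quadratic penalty that discourages large mouse displacements,
    per the answer's penalty = (state * 0.01) ** 2 with state in [-240, 240]."""
    return (state_x * coeff) ** 2

# the penalty is zero at rest and largest at the extremes
print(action_penalty(0))
print(action_penalty(240))
```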
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install A3C-Continuous
You can use A3C-Continuous like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.
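The setup described above might look like this in practice (the dependency names are assumptions; check the repository for its actual requirements):

```shell
# create an isolated virtual environment, as recommended above
python3 -m venv venv
source venv/bin/activate

# keep the packaging tools up to date
pip install --upgrade pip setuptools wheel

# install likely dependencies (assumed, not confirmed by the repo)
pip install tensorflow gym

# fetch the code itself
git clone https://github.com/stefanbo92/A3C-Continuous.git
```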