A3C-Continuous | Tensorflow implementation | Reinforcement Learning library

by stefanbo92 | Python | Version: Current | License: MIT

kandi X-RAY | A3C-Continuous Summary

A3C-Continuous is a Python library typically used in Artificial Intelligence, Reinforcement Learning, Deep Learning, and Tensorflow applications. A3C-Continuous has no bugs, no reported vulnerabilities, a permissive license, and low support. However, no build file is available. You can download it from GitHub.

Tensorflow implementation of the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm (paper) for continuous action spaces. The code is largely based on Morvan Zhou's implementation (github).
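
For orientation, here is a minimal sketch of the core idea: for continuous actions, the actor network outputs the mean and standard deviation of a Gaussian distribution from which actions are sampled. It is written against TensorFlow 1.x (the era of this repository); the layer sizes, names, and bounds are illustrative assumptions, not the repository's exact code.

    # Sketch of a continuous-action (Gaussian) policy head; illustrative only.
    import tensorflow as tf  # TensorFlow 1.x

    N_S, N_A = 3, 1                 # example state/action dimensions (assumed)
    A_BOUND = [-2.0, 2.0]           # example action bounds (assumed)

    s = tf.placeholder(tf.float32, [None, N_S], 'state')
    hidden = tf.layers.dense(s, 200, tf.nn.relu6, name='actor_hidden')
    mu = tf.layers.dense(hidden, N_A, tf.nn.tanh, name='mu')            # mean in [-1, 1]
    sigma = tf.layers.dense(hidden, N_A, tf.nn.softplus, name='sigma')  # std > 0

    # Scale the mean to the action bounds and sample a bounded action.
    dist = tf.distributions.Normal(mu * A_BOUND[1], sigma + 1e-4)
    action = tf.clip_by_value(tf.squeeze(dist.sample(1), axis=0),
                              A_BOUND[0], A_BOUND[1])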

Support

              A3C-Continuous has a low active ecosystem.
It has 33 stars, 15 forks, and 7 watchers.
              It had no major release in the last 6 months.
              A3C-Continuous has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of A3C-Continuous is current.

Quality

              A3C-Continuous has 0 bugs and 0 code smells.

Security

              A3C-Continuous has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              A3C-Continuous code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

License

              A3C-Continuous is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

              A3C-Continuous releases are not available. You will need to build from source code and install.
A3C-Continuous has no build file. You will need to build the component from source yourself.
              A3C-Continuous saves you 64 person hours of effort in developing the same functionality from scratch.
It has 168 lines of code, 7 functions, and 1 file.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

kandi has reviewed A3C-Continuous and identified the following as its top functions. This is intended to give you an instant insight into the functionality A3C-Continuous implements and help you decide whether it suits your requirements. A sketch of the pull/push pattern behind these functions follows the list.
            • Run the action loop
            • Choose a single action
            • Pulls all parameters from the global variables
            • Update global variables
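
As a rough illustration of that pull/push pattern, the sketch below shows how an A3C worker network copies the global parameters down ("pull") and applies its locally computed gradients back up ("push"). This is a hedged reconstruction in TensorFlow 1.x; the class, attribute, and scope names are assumptions, not verified signatures from this repository.

    # Hedged sketch of the A3C pull/push pattern (TensorFlow 1.x); names assumed.
    import tensorflow as tf

    class ACNet(object):
        def __init__(self, scope, globalAC=None):
            with tf.variable_scope(scope):
                self.s = tf.placeholder(tf.float32, [None, 3], 'state')
                out = tf.layers.dense(self.s, 1, name='out')
                self.params = tf.get_collection(
                    tf.GraphKeys.TRAINABLE_VARIABLES, scope=scope)
            if globalAC is not None:
                # pull: overwrite local weights with the current global weights
                self.pull_op = [l.assign(g) for l, g in
                                zip(self.params, globalAC.params)]
                # push: apply locally computed gradients to the global weights
                loss = tf.reduce_mean(tf.square(out))   # stand-in loss
                grads = tf.gradients(loss, self.params)
                opt = tf.train.RMSPropOptimizer(0.001)
                self.push_op = opt.apply_gradients(zip(grads, globalAC.params))

    global_net = ACNet('global')
    worker_net = ACNet('worker_0', globalAC=global_net)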

            A3C-Continuous Key Features

            No Key Features are available at this moment for A3C-Continuous.

            A3C-Continuous Examples and Code Snippets

            No Code Snippets are available at this moment for A3C-Continuous.

            Community Discussions

            Trending Discussions on A3C-Continuous

            QUESTION

A3C continuous action problem
            Asked 2018-Sep-14 at 17:39

I want to implement reinforcement learning for a game which uses the mouse to move. This game only cares about the x-axis of the mouse.

My first try was to make it discrete. The game has 3 actions: two move the mouse 30 pixels to the left or right, and one stands still. It worked, but now I want to make it continuous.

What I have done is make the neural network output a mean and std, exactly like this code: https://github.com/stefanbo92/A3C-Continuous/blob/master/a3c.py. I even used this code on a second try. The width of the game is 480, so A_BOUND is [-240, 240]. To always get a positive mouse position, I add 240 to the predicted action and then set the mouse to the resulting position.

For example: if the action is -240, then 240 + (-240) gives a mouse x position of 0. The problem is that my neural network consistently outputs only the extremes, 240 or -240, within seconds of starting.
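
To make the setup concrete, here is that mapping in plain Python. The numbers come from the question; the helper name is hypothetical, added only for illustration.

    A_BOUND = [-240, 240]   # action bounds from the question
    GAME_WIDTH = 480

    def action_to_mouse_x(action):       # hypothetical helper, for illustration
        # clamp to the bounds, then shift into screen coordinates
        action = max(A_BOUND[0], min(A_BOUND[1], action))
        return action + GAME_WIDTH // 2

    print(action_to_mouse_x(-240))  # 0   (left edge)
    print(action_to_mouse_x(0))     # 240 (center)
    print(action_to_mouse_x(240))   # 480 (right edge)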

            ...

            ANSWER

            Answered 2018-Sep-14 at 17:05

The reason for your problem is that the output of your neural network is being squashed by an activation function. This is a problem because very few input values produce an output that is not the max or min value.

[Figure: the hyperbolic tangent (tanh) activation curve]

As the curve shows, the output is away from the max/min only when the input lies between roughly -3 and 3; any input outside that range yields essentially the max or min value.
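
A quick numeric check of that claim, in plain Python:

    import math

    # tanh is effectively pinned at -1 or 1 outside roughly [-3, 3]
    for x in [0.0, 1.0, 3.0, 5.0, 10.0]:
        print(x, round(math.tanh(x), 4))
    # 0.0  0.0
    # 1.0  0.7616
    # 3.0  0.9951
    # 5.0  0.9999
    # 10.0 1.0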

To overcome this, you must initialize your neural network with very small weights. You can initialize the weights with random uniform values between -0.003 and 0.003; those are the values I use. This way your network initially outputs values close to 0, and as the weights are updated, learning is more stable.
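
In TensorFlow 1.x, that initialization could look like the sketch below; the layer name and sizes are illustrative assumptions, not the answerer's exact code.

    # Small uniform initialization so the tanh output layer starts near zero.
    import tensorflow as tf  # TensorFlow 1.x

    init = tf.random_uniform_initializer(minval=-0.003, maxval=0.003)
    hidden = tf.placeholder(tf.float32, [None, 200], 'hidden')
    mu = tf.layers.dense(hidden, 1, activation=tf.nn.tanh,
                         kernel_initializer=init, name='mu_small_init')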

To further correct for this error, you can add a small penalty for making large changes in state.

For example, penalty = (state * 0.01)^2, where state is in [-240, 240].

This way, your neural network learns that large changes carry a higher loss, so it will make them sparingly and only when necessary.
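
Written out in plain Python, the suggested penalty behaves like this:

    def movement_penalty(state):
        # state is the signed change, in [-240, 240] as above
        return (state * 0.01) ** 2

    print(movement_penalty(240))  # 5.76 -> large moves cost the most
    print(movement_penalty(30))   # 0.09
    print(movement_penalty(0))    # 0.0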

            Source https://stackoverflow.com/questions/52334685

Community Discussions and Code Snippets include sources from the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install A3C-Continuous

            You can download it from GitHub.
You can use A3C-Continuous like any standard Python library. You will need a development environment with a Python distribution (including header files), a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

For any new features, suggestions, or bugs, create an issue on GitHub. If you have questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/stefanbo92/A3C-Continuous.git

          • CLI

            gh repo clone stefanbo92/A3C-Continuous

          • SSH

            git@github.com:stefanbo92/A3C-Continuous.git



Try Top Libraries by stefanbo92

            color-detector by stefanbo92 (C++)
            Tensorflow-Classifier by stefanbo92 (Python)
            Visual-Odometry by stefanbo92 (C++)
            Image-Analyser by stefanbo92 (Python)
            headtrackerApp by stefanbo92 (JavaScript)