a3c | Tensorflow implementation of Asynchronous Advantage Actor Critic | Reinforcement Learning library

 by hongzimao | Python | Version: Current | License: No License

kandi X-RAY | a3c Summary

a3c is a Python library typically used in Artificial Intelligence, Reinforcement Learning, Deep Learning, and Tensorflow applications. a3c has no bugs and no reported vulnerabilities, and it has low support. However, a3c's build file is not available. You can download it from GitHub.

Tensorflow implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Support
    Quality
      Security
        License
          Reuse

            kandi-support Support

              a3c has a low active ecosystem.
              It has 24 star(s) with 14 fork(s). There are 3 watchers for this library.
              It had no major release in the last 6 months.
              There are 0 open issues and 2 have been closed. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of a3c is current.

            kandi-Quality Quality

              a3c has 0 bugs and 0 code smells.

            kandi-Security Security

              a3c has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              a3c code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              a3c does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              Without a license, all rights are reserved, and you may not be able to use the library in your applications without the author's permission.

            kandi-Reuse Reuse

              a3c releases are not available. You will need to build from source code and install.
              a3c has no build file. You will need to create the build yourself to build the component from source.
              a3c saves you 133 person hours of effort in developing the same functionality from scratch.
              It has 334 lines of code, 25 functions and 3 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed a3c and discovered the below as its top functions. This is intended to give you an instant insight into a3c implemented functionality, and help decide if they suit your requirements.
            • Create a central agent.
            • Run an agent.
            • Initialize the critic.
            • Compute the gradients of the actor.
            • Start the unit training process.
            • Discount the values of a signal.
            • Compute the entropy of an array.
            • Build summaries.
            • Create the critic network.
            • Create the actor network.
            Get all kandi verified functions for this library.
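            Two of the helpers above, discounting a reward signal and computing entropy, are standard A3C building blocks. A minimal plain-Python sketch of what such helpers typically do (hypothetical, not the repository's actual code):

```python
import math

def discount(rewards, gamma=0.99):
    # Discounted returns, computed backwards: G_t = r_t + gamma * G_{t+1}
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return list(reversed(out))

def entropy(probs):
    # Policy entropy: -sum(p * log p), skipping zero entries
    return -sum(p * math.log(p) for p in probs if p > 0)

print(discount([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

            In A3C, the discounted returns feed the critic's target, and the entropy term is added to the actor's loss to encourage exploration.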

            a3c Key Features

            No Key Features are available at this moment for a3c.

            a3c Examples and Code Snippets

            No Code Snippets are available at this moment for a3c.

            Community Discussions

            QUESTION

            RLLib tunes PPOTrainer but not A2CTrainer
            Asked 2021-Feb-11 at 18:29

            I am making a comparison between both kind of algorithms against the CartPole environment. Having the imports as:

            ...

            ANSWER

            Answered 2021-Feb-11 at 18:29

            The A2C run fails because of the configuration you copied from the PPO trial: "sgd_minibatch_size", "kl_coeff", and several others are PPO-specific settings, which cause the problem when running with A2C.

            The error is explained in the "error.txt" in the logdir.

            Source https://stackoverflow.com/questions/65668160
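            The cleanup the answer describes can be sketched in plain Python; the PPO-only key names below are illustrative (drawn from the question), not an exhaustive list for any particular RLlib version.

```python
# Hypothetical sketch: reuse a PPO config for A2C by dropping keys that
# only PPO understands. Key names follow the answer; the exact PPO-only
# set varies across RLlib versions.
ppo_config = {
    "env": "CartPole-v1",
    "num_workers": 2,
    "lr": 0.0005,
    "sgd_minibatch_size": 128,  # PPO-specific
    "kl_coeff": 0.2,            # PPO-specific
    "clip_param": 0.3,          # PPO-specific
}

PPO_ONLY_KEYS = {"sgd_minibatch_size", "kl_coeff", "clip_param",
                 "kl_target", "num_sgd_iter", "vf_clip_param"}

# Keep only the keys that are algorithm-agnostic
a2c_config = {k: v for k, v in ppo_config.items() if k not in PPO_ONLY_KEYS}
print(sorted(a2c_config))  # ['env', 'lr', 'num_workers']
```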

            QUESTION

            How much deep a Neural Network Required for 12 inputs of ranging from -5000 to 5000 in a3c Reinforcement Learning
            Asked 2020-Nov-13 at 13:46

            I am trying to use A3C with an LSTM for an environment where the state has 12 inputs ranging from -5000 to 5000. I am using an LSTM layer of size 12, then 2 fully connected hidden layers of size 256, then one fully connected layer for the 3-dimensional action output and one for the value function. The reward is in the range (-1, 1).

            However during initial training I am unable to get good results.

            My question is: is this neural network good enough for this kind of environment?

            Below is the code for Actor Critic

            ...

            ANSWER

            Answered 2020-Nov-13 at 13:46

            Since you have only 12 inputs, make sure you don't use too many parameters; also try changing the activation function. I don't use Torch, so I can't follow the model architecture in detail. Why is your first layer an LSTM? Is your data a time series? Try using only Dense layers:

            • 1 Dense layer with 12 neurons plus the output layer
            • 2 Dense layers with 12 neurons each plus the output layer

            As for the activation function, use leaky ReLU, since your data goes down to -5000; alternatively, you can make your data non-negative by adding 5000 to all data samples.

            Source https://stackoverflow.com/questions/63195873
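            As a sketch of the preprocessing idea (an alternative to the +5000 shift the answer suggests), inputs spanning [-5000, 5000] can be linearly rescaled to [-1, 1] before they reach the network, which usually trains more stably than raw thousand-scale values:

```python
# Hypothetical preprocessing helper: map the environment's raw input
# range [lo, hi] linearly onto [-1, 1].
def scale(x, lo=-5000.0, hi=5000.0):
    return 2.0 * (x - lo) / (hi - lo) - 1.0

print([scale(v) for v in (-5000.0, 0.0, 5000.0)])  # [-1.0, 0.0, 1.0]
```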

            QUESTION

            sqrtf() throws a domain error, even though I guard against negative numbers
            Asked 2020-Aug-14 at 02:32

            I am normalizing a 3D vector, and the clang-9 generated code throws a SIGFPE on the sqrtf() even though I do a test before calling it.

            Note that I run with FP exceptions enabled.

            ...

            ANSWER

            Answered 2020-Aug-14 at 00:57

            throws a domain error, even though I guard against negative numbers

            But if (lensq > FLT_EPSILON) comes too late: the earlier dx*dx + dy*dy + dz*dz already caused a signed int overflow ("and indeed overflow, causing lensq to be negative"), which is undefined behavior (UB).

            The compiler can assume that sqrtf(lensq) always works, because it can assume dx*dx + dy*dy + dz*dz >= 0 and therefore lensq >= 0.0f.

            Get rid of the UB.

            Source https://stackoverflow.com/questions/63404720

            QUESTION

            Multiprocessing a function from main forking main as well in python
            Asked 2020-Aug-05 at 23:13

            I am trying to run a function 5 times in parallel using Python's multiprocessing.Process library.

            To my surprise, I am also seeing the prints I added 5 extra times; my intuition is that main is also getting called 5 times.

            I added prints in the train function that I need to run 5 times, and I can see the multiprocessing happening from those prints. But I am unable to figure out why main is also getting called 5 times.

            Here is my code. Can someone please check what's wrong with it?

            ...

            ANSWER

            Answered 2020-Aug-05 at 23:13

            Short answer

            Since you are running this on Windows, all code that is not a function or class definition should be inside the if __name__ == "__main__" block.

            Long answer

            On POSIX operating systems, the multiprocessing module is implemented using the fork() system call, which creates a copy of a process. This is very handy because the second process is completely initialized out of the box.

            Microsoft Windows does not have this system call. So Python tries to mimic it by starting a new Python interpreter and importing your program as a module. For this to work well, importing your program should not have side effects. The best way to achieve that is to put anything that is not a class or function definition inside the if __name__ == "__main__" block.

            Since part of your code is outside the main block, it will be executed when your program is imported in the newly created Python processes. That is why you are seeing the multiple "epoch" prints.

            Source https://stackoverflow.com/questions/63274284
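            The fix can be illustrated with a minimal sketch; train here is a stand-in for the asker's function, with a Queue replacing the prints so the workers' activity is observable:

```python
import multiprocessing as mp

def train(worker_id, queue):
    # Each worker reports completion instead of relying on prints
    queue.put(worker_id)

# Everything below is guarded, so a spawned child importing this module
# will not re-run the process-launching code.
if __name__ == "__main__":
    queue = mp.Queue()
    procs = [mp.Process(target=train, args=(i, queue)) for i in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    finished = sorted(queue.get() for _ in range(5))
    print(finished)  # [0, 1, 2, 3, 4]
```

            Without the guard, each spawned child would execute the module-level code again, launching more processes (or failing), which is exactly the symptom described in the question.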

            QUESTION

            RuntimeError("grad can be implicitly created only for scalar outputs")
            Asked 2020-Aug-03 at 23:08

            I have the following code for the train function for training an A3C. I am stuck with the following error.

            ...

            ANSWER

            Answered 2020-Aug-03 at 23:08

            The PyTorch error you get means "you can only call backward on scalars, i.e. 0-dimensional tensors". Here, according to your prints, policy_loss is not a scalar; it's a 1x4 tensor. As a consequence, so is policy_loss + 0.5 * value_loss. Thus your call to backward yields an error.

            You probably forgot to reduce your losses to a scalar (with functions like norm or MSELoss ...). See example here.

            The reason it does not work is the way the gradient propagation works internally (it's basically a Jacobian multiplication engine). You can call backward on a non-scalar tensor, but then you have to provide a gradient yourself, like :

            Source https://stackoverflow.com/questions/63218710
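            The fix the answer describes can be sketched with a toy loss; the tensors below are stand-ins for the asker's actual network outputs (assumes PyTorch is installed):

```python
import torch

pred = torch.randn(1, 4, requires_grad=True)
target = torch.zeros(1, 4)

per_element = (pred - target) ** 2  # shape (1, 4): backward() here would fail
loss = per_element.mean()           # 0-dim scalar: backward() works

loss.backward()
print(pred.grad.shape)  # torch.Size([1, 4])
```

            Replacing .mean() with .sum() also works; which reduction is appropriate depends on how the loss is defined.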

            QUESTION

            is there any way to merging values inside array of objects with same key value pair in javascript
            Asked 2020-Jul-13 at 15:31

            This is the JSON I have, and I'm trying to merge the objects with respect to toolName.

            ...

            ANSWER

            Answered 2020-Jul-13 at 15:31

            Simple implementation using reduce

            Source https://stackoverflow.com/questions/62879048
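            The question is about JavaScript, but the same reduce-based merge can be sketched in Python (the a3c repo's language); the field names and data shapes below are hypothetical:

```python
from functools import reduce

# Hypothetical input: objects sharing a toolName key to be merged
tools = [
    {"toolName": "hammer", "values": [1]},
    {"toolName": "hammer", "values": [2]},
    {"toolName": "saw", "values": [3]},
]

def merge(acc, item):
    # Group values under their toolName, extending existing groups
    acc.setdefault(item["toolName"], []).extend(item["values"])
    return acc

merged = reduce(merge, tools, {})
print(merged)  # {'hammer': [1, 2], 'saw': [3]}
```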

            QUESTION

            How to add a new Orderer Organization to existing Hyperledger Fabric network
            Asked 2020-Apr-11 at 07:52

            I am trying to add a new Orderer Organization to a RAFT-based existing ordering service. I am using the first-network from fabric-samples as the base network. While generating crypto-material, I have modified it to generate crypto-material for 1 more orderer organization. The crypto-config.yaml looks like:

            ...

            ANSWER

            Answered 2020-Apr-11 at 07:52

            I was able to extend the first-network by adding a new Orderer Organization as follows:

            1. Start the first-network through the byfn.sh script in the fabric-samples repo in the etcdraft mode.
            2. I generated crypto-material like described in the crypto-config.yaml in the question above.
            3. Use the configtxgen tool to print the new orderer organization's MSP into JSON format.
            4. Mount or docker cp this JSON file to the running cli container.
            5. Set the environment inside the cli container corresponding to existing ordering node. Import the latest system-channel configuration. Decode it to JSON format.
            6. Edit the system channel configuration block's Orderer section to add the new orderer organization's MSP as follows:

              jq -s '.[0] * {"channel_group":{"groups":{"Orderer":{"groups": {"Orderer1Org":.[1]}}}}}' config.json orderer1org.json > config1.json

            7. Edit the system channel configuration block's Consortiums section to add the new orderer organization's MSP as follows:

              jq -s '.[0] * {"channel_group":{"groups":{"Consortiums":{"groups":{"SampleConsortium":{"groups": {"Orderer1MSP":.[1]}}}}}}}' config1.json orderer1org.json > config2.json

            8. Edit the system channel configuration block's Consenters section to add the TLS credentials for the new orderer organization's orderer.example1.com node as follows:

              cert=`base64 ../crypto/ordererOrganizations/example1.com/orderers/orderer.example1.com/tls/server.crt | sed ':a;N;$!ba;s/\n//g'`

              cat config2.json | jq '.channel_group.groups.Orderer.values.ConsensusType.value.metadata.consenters += [{"client_tls_cert": "'$cert'", "host": "orderer.example1.com", "port": 7050, "server_tls_cert": "'$cert'"}] ' > modified_config.json

            9. Encode the block, find delta, create channel update transaction, encode it as protobuf envelope and submit the channel update transaction.

            10. Fetch the latest system channel configuration block.
            11. Start one of the orderers (the one that was added to the consenters list previously) using this latest fetched system channel configuration block as its genesis.block file.
            12. Perform docker exec into the cli container. Using the environment of an existing orderer node, fetch the latest system channel configuration.
            13. Edit the system channel configuration block to add the new orderer's endpoint in the OrdererAddresses section as follows:

              cat config.json | jq '.channel_group.values.OrdererAddresses.value.addresses += ["orderer.example1.com:7050"] ' > modified_config.json

            14. Encode the block, find the delta, create the channel update transaction, encode it as a protobuf envelope, and get the block signed by the Orderer1Org admin to satisfy the mod_policy for the /Channel/OrdererAddresses resource, which is set to the Admins policy. This implicit meta policy expects the signature of a MAJORITY of Admins at that level of update. So, as the number of orderer organizations is now 2, we need both organizations' admins to sign this system channel update transaction. Set the environment corresponding to the Orderer1Org admin and run the following command:

              peer channel signconfigtx -f ordorg_update_in_envelope.pb

            15. Set the environment back to OrdererOrg admin and submit the channel update transaction. The peer channel update will automatically sign the transaction on behalf of OrdererOrg admin.

              peer channel update -f ordorg_update_in_envelope.pb -c $CHANNEL_NAME -o orderer.example.com:7050 --tls true --cafile $ORDERER_CA

            For updating any application channel, just replace the step 7 by updating the application channel configuration block's Application section to add the new orderer organization's MSP there.

            Hope this helps!

            Source https://stackoverflow.com/questions/61122349

            QUESTION

            Glide not Loading Image from the cloud storage
            Asked 2020-Apr-07 at 00:40

            I am having an issue displaying images from my cloud storage. Glide keeps throwing very long errors about exifinterface. I have done some research on the issue, but my problem doesn't go away. I have attached my stack trace below.

            ...

            ANSWER

            Answered 2019-Nov-06 at 09:33

            Use this dependency for Glide.

            Source https://stackoverflow.com/questions/58726801

            QUESTION

            Is Tensorflow 2.0 with the Keras API thread-safe?
            Asked 2020-Mar-28 at 13:18
            Is Tensorflow 2.0 thread-safe?

            More specifically, is calling fit/predict or other methods on the same model from different threads safe in Tensorflow 2.0 (using Keras API)?

            I couldn't find a clear answer from the documentation or from looking online.

            I saw this question from 2017 which says that Keras (although the question mentioned a Theano backend) is thread-safe, but you have to call private method model._make_predict_function() before you call predict() (which I believe has been deprecated). However, I read this blog post from 2019 which says that it's not thread-safe.

            I also found this question from 2018 that says that Tensorflow (pre-Keras) is thread-safe, but you have to make sure you use the default graph explicitly (which I believe is irrelevant for Tensorflow 2.* because of eager execution). When I looked up thread-safety in eager execution, I saw this article in the documentation that does mention thread-safety of eager execution, but in relation to Java.

            And to make things more confusing, I saw an A3C implementation with Keras on GitHub from this year (2020) that was using locks before training the shared policy/value networks, hinting that Keras is not thread-safe and you have to acquire a lock before training a shared model. However, it looks to me like the implementation is flawed, because each worker was creating and using its own unique lock, which defeats the purpose of having a lock. My conclusion is that either his code ran successfully regardless of the "lock" because Keras is thread-safe, or he has a bug.

            I did my own final test where I ran two threads fitting the same model to different outputs (for the same constant input) and tried calling predict during the training and it seems to be working, but I'm asking this question because I want to make sure. Are there any cases where Tensorflow 2.0/Keras are not thread safe?

            ...

            ANSWER

            Answered 2020-Mar-28 at 13:18

            According to a Keras contributor in this GitHub issue:

            Keras models can't be guaranteed to be thread-safe. Consider having independent copies of the model in each thread for CPU inference.

            Source https://stackoverflow.com/questions/60848263
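            One way to follow the contributor's advice is to keep one model per thread via threading.local; build_model below is a hypothetical stand-in for whatever constructs the real Keras model:

```python
import threading

thread_local = threading.local()

def build_model():
    # Placeholder: in real code this would construct a fresh Keras model
    return object()

def get_model():
    # Lazily create one independent model per thread
    if not hasattr(thread_local, "model"):
        thread_local.model = build_model()
    return thread_local.model

models = []
threads = [threading.Thread(target=lambda: models.append(get_model()))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(models[0] is models[1])  # False: each thread built its own copy
```

            Each thread pays the cost of building its own model, but no lock is needed for inference, which matches the "independent copies" recommendation.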

            QUESTION

            Why can't Open Graph checkers detect Open Graph data?
            Asked 2019-Aug-26 at 09:22

            My page, after adding an SSL certificate (Let's Encrypt), cannot have its preview fetched by Facebook or Twitter when sharing the link. I have followed the Open Graph protocol and included the following Open Graph tags:

            ...

            ANSWER

            Answered 2019-Aug-26 at 09:22

            In my case, it seems that the crawler just has a bug. I have my own canonical answer to this question; hope it helps someone: FB OpenGraph og:image not pulling images (possibly https?)

            Source https://stackoverflow.com/questions/57529664

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install a3c

            You can download it from GitHub.
            You can use a3c like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/hongzimao/a3c.git

          • CLI

            gh repo clone hongzimao/a3c

          • SSH

            git@github.com:hongzimao/a3c.git


            Try Top Libraries by hongzimao

            • pensieve (JavaScript)
            • deeprm (Python)
            • decima-sim (Python)
            • input_driven_rl_example (Python)
            • shapeFromShading (C++)