a3c | Tensorflow implementation of Asynchronous Advantage Actor | Reinforcement Learning library
kandi X-RAY | a3c Summary
Tensorflow implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Top functions reviewed by kandi - BETA
- Create a central agent.
- Run an agent.
- Initialize the critic.
- Compute the gradients of the actor.
- Start the unit training process.
- Discount the value of a signal.
- Compute the entropy of an array.
- Build summaries.
- Create the critic network.
- Create the actor network.
Community Discussions
Trending Discussions on a3c
QUESTION
I am comparing both kinds of algorithms on the CartPole environment, with the following imports:
...ANSWER
Answered 2021-Feb-11 at 18:29
The A2C code fails because of the configuration you copied from the PPO trial: "sgd_minibatch_size", "kl_coeff" and many others are PPO-specific settings, and they cause the problem when you run A2C.
The error is explained in the "error.txt" in the logdir.
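One way to act on this answer is to drop the PPO-only keys before reusing a configuration for A2C. The sketch below uses plain dictionaries and no RL library. Only "sgd_minibatch_size" and "kl_coeff" come from the answer; the other keys, all values, and the helper name strip_ppo_only are hypothetical placeholders, and the real list of PPO-specific keys is longer.

```python
# Keys the answer names as PPO-specific. The real list is longer ("and many others").
PPO_ONLY_KEYS = {"sgd_minibatch_size", "kl_coeff"}


def strip_ppo_only(config):
    """Return a copy of config without the known PPO-specific keys."""
    return {key: value for key, value in config.items() if key not in PPO_ONLY_KEYS}


# Hypothetical PPO trial configuration; values are placeholders.
ppo_config = {
    "env": "CartPole-v0",
    "lr": 0.0005,
    "sgd_minibatch_size": 128,
    "kl_coeff": 0.2,
}

# Configuration that is safe to hand to the A2C trial, as far as these two keys go.
a2c_config = strip_ppo_only(ppo_config)
print(a2c_config)
```

The original dictionary is left untouched, so the same base configuration can still be used for the PPO run.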
QUESTION
I am trying to use A3C with an LSTM for an environment where each state has 12 inputs ranging from -5000 to 5000. I am using an LSTM layer of size 12 followed by 2 fully connected hidden layers of size 256, then 1 fully connected layer for the 3 action dimensions and 1 fully connected layer for the single value-function output. The reward is in the range (-1, 1).
However, during initial training I am unable to get good results.
My question is: is this neural network good enough for this kind of environment?
Below is the code for Actor Critic
...ANSWER
Answered 2020-Nov-13 at 13:46
Since you have only 12 inputs, make sure you don't use too many parameters, and try changing the activation function. I don't use Torch, so I can't follow the model architecture. Why is your first layer an LSTM? Is your data a time series? Try using only Dense layers:
- 1 Dense layer with 12 neurons, plus the output layer
- 2 Dense layers with 12 neurons each, plus the output layer
As for the activation function, use leaky ReLU, since your data goes down to -5000; alternatively, you can make your data positive only by adding 5000 to all data samples.
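The shift suggested above can be sketched in a few lines. The first helper implements the answer's suggestion (add 5000 to every sample). The second, dividing by the bound so inputs land in [-1, 1], is a common alternative that the answer does not mention and is included here only as an assumption. Both function names are hypothetical.

```python
def shift_to_positive(state, offset=5000.0):
    """Shift inputs from [-offset, offset] to [0, 2 * offset], as the answer suggests."""
    return [x + offset for x in state]


def scale_to_unit(state, bound=5000.0):
    """Alternative (not from the answer): rescale inputs from [-bound, bound] to [-1, 1]."""
    return [x / bound for x in state]


sample = [-5000, 0, 5000]
print(shift_to_positive(sample))
print(scale_to_unit(sample))
```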
QUESTION
I am normalizing a 3D vector, and the clang-9 generated code throws a SIGFPE on the sqrtf() even though I do a test before calling it.
Note that I run with FP exceptions enabled.
...ANSWER
Answered 2020-Aug-14 at 00:57
"throws a domain error, even though I guard against negative numbers"
But the if (lensq > FLT_EPSILON) check comes too late: the earlier dx*dx + dy*dy + dz*dz caused int overflow ("and indeed overflow, causing lensq to be negative"), which is undefined behavior (UB).
The compiler can take advantage of this. It may treat sqrtf(lensq) as always valid, because it is allowed to assume dx*dx + dy*dy + dz*dz >= 0 and therefore lensq >= 0.0f.
Get rid of the UB.
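The arithmetic behind this answer can be illustrated in Python. This is only a simulation: ctypes.c_int32 wraps the exact result into 32 bits, which is what commonly happens on two's-complement hardware, whereas in C the overflow is undefined behavior and the compiler may do something else entirely. The sample values and both function names are assumptions, and Python floats are doubles rather than the float used in the question.

```python
import ctypes


def sum_squares_int32(dx, dy, dz):
    """Simulate 32-bit two's-complement wraparound of dx*dx + dy*dy + dz*dz."""
    return ctypes.c_int32(dx * dx + dy * dy + dz * dz).value


def sum_squares_float(dx, dy, dz):
    """Convert to floating point first, so the sum cannot overflow an int."""
    fx, fy, fz = float(dx), float(dy), float(dz)
    return fx * fx + fy * fy + fz * fz


wrapped = sum_squares_int32(30000, 30000, 30000)  # exact value 2.7e9 exceeds INT_MAX
exact = sum_squares_float(30000, 30000, 30000)
print(wrapped, exact)
```

In the C code, the equivalent fix is to convert dx, dy and dz to float before multiplying, so that the guard on lensq sees a value that was never produced by integer overflow.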
QUESTION
I am trying to run a function 5 times in parallel using Python's multiprocessing.Process.
To my surprise, the prints I added also appear 5 more times. My intuition is that main is also being called 5 times.
I added prints in the train function that I need to run 5 times, and from those prints I can see the multiprocessing happening. But I cannot figure out why main is also being called 5 times.
Here is my code. Can someone please help me check what's wrong with it?
...ANSWER
Answered 2020-Aug-05 at 23:13
Short answer
Since you are running this on Windows, all code that is not a function or class definition should be inside the if __name__ == "__main__": block.
Long answer
On POSIX operating systems, the multiprocessing module is implemented using the fork() system call, which creates a copy of a process. This is very handy because the second process is completely initialized out of the box.
Microsoft Windows does not have this system call, so Python tries to mimic it by starting a new Python interpreter and importing your program as a module.
For this to work well, importing your program should not have side effects. The best way to achieve that is to put anything that is not a class or function definition inside the if __name__ == "__main__": block.
Since part of your code is outside the main block, it is executed whenever your program is imported in the newly created Python processes. That is why you are seeing the multiple "epoch" prints.
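A minimal sketch of the layout the answer describes is shown below. The train body is a placeholder, not the asker's code, and the worker count of 5 is taken from the question. Everything with a side effect lives in main(), which only runs when the file is executed directly, so a child process that re-imports the module does not repeat it.

```python
import multiprocessing as mp


def train(worker_id):
    """Placeholder for the real training function; runs once per worker process."""
    message = f"worker {worker_id}: training"
    print(message)
    return message


def main(num_workers=5):
    """Start the workers and wait for them; this is the only code with side effects."""
    print("main: starting workers")  # printed once, not once per worker
    workers = [mp.Process(target=train, args=(i,)) for i in range(num_workers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


# Without this guard, a spawned child that imports the module would call main() again.
if __name__ == "__main__":
    main()
```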
QUESTION
I have the following code for the train function used to train an A3C agent, and I am stuck with the following error.
...ANSWER
Answered 2020-Aug-03 at 23:08
The PyTorch error you get means "you can only call backward on scalars, i.e. 0-dimensional tensors". Here, according to your prints, policy_loss is not a scalar; it's a 1x4 matrix. As a consequence, so is policy_loss + 0.5 * value_loss. Thus your call to backward yields an error.
You probably forgot to reduce your losses to a scalar (with functions like norm or MSELoss...).
The reason it does not work is the way gradient propagation works internally (it's basically a Jacobian multiplication engine). You can call backward on a non-scalar tensor, but then you have to provide a gradient yourself.
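Both options from this answer can be sketched with a toy 1x4 loss. This assumes PyTorch is installed. The tensor params and the expression params * 2.0 stand in for the asker's network and policy_loss, and sum() is used as the reduction here, which is an assumption (the answer mentions norm or MSELoss).

```python
import torch


def grad_via_reduction(params):
    """Option 1: reduce the 1x4 loss to a scalar, then call backward()."""
    policy_loss = params * 2.0  # stand-in for a non-scalar (1x4) loss
    policy_loss.sum().backward()
    return params.grad.clone()


def grad_via_explicit_gradient(params):
    """Option 2: keep the loss non-scalar and pass the upstream gradient explicitly."""
    policy_loss = params * 2.0
    policy_loss.backward(gradient=torch.ones_like(policy_loss))
    return params.grad.clone()


grad_a = grad_via_reduction(torch.ones(1, 4, requires_grad=True))
grad_b = grad_via_explicit_gradient(torch.ones(1, 4, requires_grad=True))
print(grad_a, grad_b)
```

Passing a gradient of ones is equivalent to summing the loss first, which is why the two results match; a different reduction such as mean() would scale the gradient accordingly.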
QUESTION
This is the JSON I have, and I'm trying to merge the objects by toolName.
...ANSWER
Answered 2020-Jul-13 at 15:31
Simple implementation using reduce
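The original snippet is not reproduced on this page, so the following is only a sketch of what a reduce-based merge could look like, written in Python with functools.reduce. Only the toolName key comes from the question. The other field names, the sample data, and the rule that later values overwrite earlier ones are assumptions.

```python
from functools import reduce


def merge_by_tool_name(items):
    """Merge dictionaries that share the same toolName; later values win (assumed rule)."""

    def step(merged, item):
        # Create the entry on first sight, then fold this item's fields into it.
        merged.setdefault(item["toolName"], {}).update(item)
        return merged

    return list(reduce(step, items, {}).values())


# Hypothetical input; the field names other than toolName are placeholders.
records = [
    {"toolName": "hammer", "owner": "alice"},
    {"toolName": "saw", "owner": "bob"},
    {"toolName": "hammer", "location": "shed"},
]
print(merge_by_tool_name(records))
```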
QUESTION
I am trying to add a new Orderer Organization to an existing Raft-based ordering service. I am using the first-network from fabric-samples as the base network. While generating crypto-material, I modified the configuration to generate crypto-material for 1 more orderer organization. The crypto-config.yaml looks like:
ANSWER
Answered 2020-Apr-11 at 07:52
I was able to extend the first-network by adding a new Orderer Organization as follows:
1. Start the first-network through the byfn.sh script in the fabric-samples repo in etcdraft mode.
2. Generate crypto-material as described in the crypto-config.yaml in the question above.
3. Use the configtxgen tool to print the new orderer organization's MSP in JSON format.
4. Mount or docker cp this JSON file to the running cli container.
5. Set the environment inside the cli container to that of an existing ordering node. Import the latest system-channel configuration. Decode it to JSON format.
6. Edit the system channel configuration block's Orderer section to add the new orderer organization's MSP as follows:
   jq -s '.[0] * {"channel_group":{"groups":{"Orderer":{"groups": {"Orderer1Org":.[1]}}}}}' config.json orderer1org.json > config1.json
7. Edit the system channel configuration block's Consortiums section to add the new orderer organization's MSP as follows:
   jq -s '.[0] * {"channel_group":{"groups":{"Consortiums":{"groups":{"SampleConsortium":{"groups": {"Orderer1MSP":.[1]}}}}}}}' config1.json orderer1org.json > config2.json
8. Edit the system channel configuration block's Consenters section to add the TLS credentials for the new orderer organization's orderer.example1.com node as follows:
   cert=`base64 ../crypto/ordererOrganizations/example1.com/orderers/orderer.example1.com/tls/server.crt | sed ':a;N;$!ba;s/\n//g'`
   cat config2.json | jq '.channel_group.groups.Orderer.values.ConsensusType.value.metadata.consenters += [{"client_tls_cert": "'$cert'", "host": "orderer.example1.com", "port": 7050, "server_tls_cert": "'$cert'"}] ' > modified_config.json
9. Encode the block, find the delta, create the channel update transaction, encode it as a protobuf envelope and submit the channel update transaction.
10. Fetch the latest system channel configuration block.
11. Start one of the orderers (the one that was added to the consenters list previously) using this latest fetched system channel configuration block as its genesis.block file.
12. Perform docker exec into the cli container. Using the environment of an existing orderer node, fetch the latest system channel configuration.
13. Edit the system channel configuration block to add the new orderer's endpoint in the OrdererAddresses section as follows:
   cat config.json | jq '.channel_group.values.OrdererAddresses.value.addresses += ["orderer.example1.com:7050"] ' > modified_config.json
14. Encode the block, find the delta, create the channel update transaction, encode it as a protobuf envelope and get it signed by the Orderer1Org admin to satisfy the mod_policy for the /Channel/OrdererAddresses resource, which is set to the Admins policy. This implicit meta policy expects the signatures of MAJORITY Admins at that level of update. Since the number of orderer organizations is now 2, both organizations' admins need to sign this system channel update transaction. Set the environment corresponding to the Orderer1Org admin and run the following command:
   peer channel signconfigtx -f ordorg_update_in_envelope.pb
15. Set the environment back to the OrdererOrg admin and submit the channel update transaction. The peer channel update command will automatically sign the transaction on behalf of the OrdererOrg admin.
   peer channel update -f ordorg_update_in_envelope.pb -c $CHANNEL_NAME -o orderer.example.com:7050 --tls true --cafile $ORDERER_CA
For updating any application channel, just replace step 7 by updating the application channel configuration block's Application section to add the new orderer organization's MSP there.
Hope this helps!
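For readers who prefer not to use jq, the OrdererAddresses edit from the answer can be sketched with Python's json module. The JSON path and the address orderer.example1.com:7050 are taken from the jq command in the answer. The minimal config dictionary, the function name, and the duplicate check are assumptions added for illustration (the jq += operator appends unconditionally).

```python
import json


def add_orderer_address(config, address):
    """Append address to channel_group.values.OrdererAddresses.value.addresses."""
    addresses = config["channel_group"]["values"]["OrdererAddresses"]["value"]["addresses"]
    if address not in addresses:  # assumption: skip duplicates, unlike jq's +=
        addresses.append(address)
    return config


# Minimal stand-in for a decoded config.json; a real one has many more fields.
config = {
    "channel_group": {
        "values": {
            "OrdererAddresses": {"value": {"addresses": ["orderer.example.com:7050"]}}
        }
    }
}

modified = add_orderer_address(config, "orderer.example1.com:7050")
print(json.dumps(modified, indent=2))
```

In practice the dictionary would be loaded from the decoded config.json with json.load and written back to modified_config.json with json.dump before the encode and delta steps.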
QUESTION
I am having an issue displaying images from my cloud storage. Glide keeps throwing very long errors about ExifInterface. I have done some research on the issue, but my problem doesn't seem to go away. I have attached my stack trace below.
...ANSWER
Answered 2019-Nov-06 at 09:33
Use this dependency for Glide:
QUESTION
More specifically, is calling fit/predict or other methods on the same model from different threads safe in Tensorflow 2.0 (using the Keras API)?
I couldn't find a clear answer from the documentation or from looking online.
I saw this question from 2017 which says that Keras (although the question mentioned a Theano backend) is thread-safe, but you have to call the private method model._make_predict_function() before you call predict() (which I believe has been deprecated). However, I read this blog post from 2019 which says that it's not thread-safe.
I also found this question from 2018 that says that Tensorflow (pre-Keras) is thread-safe but you have to make sure you use the default graph explicitly (which I believe is irrelevant for Tensorflow 2.* because of eager execution). When I looked up thread-safety in eager execution I saw this article in the documents that does mention thread-safety of eager execution, but it's in relation to Java.
And to make things more confusing, I saw an A3C implementation on GitHub with Keras from this year (2020) that was using locks before training the shared policy/value networks, hinting that Keras is not thread-safe and you have to acquire a lock before training a shared model. However, it looks to me like the implementation is flawed because each worker was creating and using its own unique lock, which defeats the purpose of having a lock. My conclusion is that either his code was running successfully regardless of the "lock" because Keras is thread-safe, or that he has a bug.
I did my own final test where I ran two threads fitting the same model to different outputs (for the same constant input) and tried calling predict during the training and it seems to be working, but I'm asking this question because I want to make sure. Are there any cases where Tensorflow 2.0/Keras are not thread safe?
...ANSWER
Answered 2020-Mar-28 at 13:18
According to a Keras contributor in this GitHub issue:
"Keras models can't be guaranteed to be thread-safe. Consider having independent copies of the model in each thread for CPU inference."
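If a model has to be shared across threads anyway, the point raised in the question applies: a lock only protects the model when every worker uses the same lock object. The sketch below shows that pattern with a plain Python stand-in, so it runs without TensorFlow. SharedModel, train_step and run are hypothetical names, and the counter update stands in for a real training step.

```python
import threading


class SharedModel:
    """Stand-in for a shared policy/value network; not a Keras model."""

    def __init__(self):
        self.steps = 0

    def train_step(self):
        # A read-modify-write that is unsafe without external synchronization.
        current = self.steps
        self.steps = current + 1


def worker(model, lock, num_steps):
    for _ in range(num_steps):
        with lock:  # every worker must acquire the SAME lock object
            model.train_step()


def run(num_threads=4, num_steps=1000):
    model = SharedModel()
    lock = threading.Lock()  # created once and passed to all workers
    threads = [
        threading.Thread(target=worker, args=(model, lock, num_steps))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return model.steps


print(run())
```

Creating the lock inside worker instead would give each thread its own lock, which is the flaw the question describes. The answer's alternative, an independent model copy per thread, avoids the need for a lock altogether.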
QUESTION
After I added an SSL certificate (Let's Encrypt) to my page, Facebook and Twitter can no longer fetch a preview when the link is shared. I have followed the Open Graph protocol and included the following Open Graph tags:
...ANSWER
Answered 2019-Aug-26 at 09:22
In my case, it seems that the crawler simply has a bug. I have posted my own canonical answer in this question, and I hope it helps someone: FB OpenGraph og:image not pulling images (possibly https?)
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install a3c
You can use a3c like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.