AdaBound | An optimizer that trains as fast as Adam and as good as SGD | Machine Learning library
kandi X-RAY | AdaBound Summary
An optimizer that trains as fast as Adam and as good as SGD, for developing state-of-the-art deep learning models on a wide variety of popular tasks in fields such as CV and NLP. Based on Luo et al. (2019), "Adaptive Gradient Methods with Dynamic Bound of Learning Rate," in Proc. of ICLR 2019.
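The core idea from the paper is to clip Adam's per-parameter step size between lower and upper bounds that both converge to a single final learning rate, so the optimizer behaves like Adam early in training and like SGD late in training. A minimal sketch of the bound schedule, following the formulas used in the published PyTorch implementation (final_lr and gamma are its hyperparameter names):

def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    # Both bounds converge to final_lr as step grows; at step 1 they are
    # approximately (1e-4, 100), so early training is effectively
    # unbounded (Adam-like), while late training is pinned near final_lr.
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper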
Top functions reviewed by kandi - BETA
- Test the loss function
- Define a DenseNet
- Build a shortcut connection for ResNet
- Build the dataset
- Train the network (see the sketch after this list)
- Create an optimizer
- Get the argument parser
- Build the model
- Return the checkpoint (CKPT) name
- Load model state from a checkpoint file
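For context, here is a minimal sketch of the "create an optimizer" and "train the network" steps using the pip-installable adabound package; model and train_loader are placeholders, not names from this repo:

import adabound
import torch.nn.functional as F

# Create the optimizer: lr is the initial Adam-like step size and
# final_lr is the SGD-like rate that the dynamic bounds converge to.
optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)

# One epoch of a standard PyTorch training loop.
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()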
AdaBound Key Features
AdaBound Examples and Code Snippets
model: resnet18
msc: False # whether to use temporal multi-scale input
class_weight: True # whether to weight classes in the cross-entropy loss
writer_flag: True # whether to log with tensorboardX
n_classes: 400
batch_size: 32
inp
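The class_weight flag above typically means that per-class weights are passed to the loss to counter class imbalance. A sketch of the usual PyTorch wiring (assumed, not this project's exact code), with hypothetical class counts:

import torch
import torch.nn as nn

counts = torch.tensor([500.0, 100.0, 50.0])      # hypothetical class frequencies
weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weighting
criterion = nn.CrossEntropyLoss(weight=weights)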
# learning_rate can be either a scalar or a tensor.
# Use the exclude_from_weight_decay feature if you want to
# selectively disable weight decay for specific variables.
optimizer = AdaBoundOptimizer(
    learning_rate=1e-3,
    final_lr=1e-1,
    beta_1=0.9,
    beta_2=0.999,  # assumed default, added so the truncated call is complete
)
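Assuming this TensorFlow implementation follows the standard TF1 tf.train.Optimizer interface, wiring it into a graph would look like the usual minimize call:

# Standard TF1-style usage; assumes `loss` is a scalar tensor in the graph.
train_op = optimizer.minimize(loss)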
import torch

# Build the model on GPU if available, otherwise on CPU. Net, MobileNetV2,
# and resnet50 are the project's model classes; only one is active at a time.
if torch.cuda.is_available():
    model = Net(num_classes=9).to('cuda')
    # model = MobileNetV2(n_class=9).to('cuda')
    # model = resnet50(num_class=9, pretrained=False).to('cuda')
    print("cuda:0")
else:
    model = Net(num_classes=9).to('cpu')
    # model = MobileNetV2(n_class=9).to('cpu')
    # model = resnet50(num_class=9, pretrained=False).to('cpu')
print(model)
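Once the model is on its device, the AdaBound optimizer from the pip-installable adabound package (Luolc/AdaBound) can be attached to its parameters; the amsbound flag selects the AMSBound variant from the same paper:

import adabound

optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)
# AMSBound variant:
# optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1, amsbound=True)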
Community Discussions
Trending Discussions on AdaBound
QUESTION
I tried SGD, Adadelta, AdaBound, and Adam; all of them give me fluctuations in validation accuracy. I also tried all of the activation functions in Keras, but I am still getting fluctuations in val_acc.
Training samples: 1352
Validation samples: 339
[Figure: validation accuracy]
ANSWER
Answered 2020-Jan-04 at 12:47

Your model may be too noise-sensitive; see this answer.

Based on the answer in the link and what I see of your model, your network may be too deep for the amount of data you have (a large model with too little data leads to overfitting, which in turn leads to noise sensitivity). I suggest using a simpler model as a sanity check.

The learning rate could also be a possible cause (as stated by Neb). You are using SGD's default learning rate (0.01), which may be too high. Try 1e-3 or below.
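Concretely, overriding the default with a smaller rate in tf.keras would look something like this (model is a placeholder for the network being compiled):

from tensorflow.keras.optimizers import SGD

# Compile with an explicit, smaller learning rate instead of the 0.01 default.
model.compile(optimizer=SGD(learning_rate=1e-3),
              loss='categorical_crossentropy',
              metrics=['accuracy'])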
Community discussions and code snippets include sources from the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install AdaBound
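Per the project README, the reference PyTorch implementation is published on PyPI and installs with pip:

pip install adabound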