Autotuner | Machine Learning library
kandi X-RAY | Autotuner Summary
This repo contains the code needed to run the R package AutoTuner. AutoTuner identifies dataset-specific parameters for processing untargeted metabolomics data. So far, AutoTuner has been tested on untargeted data generated on qTOF, Orbitrap, and Fourier transform ion cyclotron resonance mass analyzers. AutoTuner currently requires R version 3.6 or greater. For input, AutoTuner requires at least three samples of raw data converted from proprietary instrument formats (e.g., to .mzML, .mzXML, or .CDF). It also requires a spreadsheet containing at least two columns: one column must match the raw data samples by name, and the other must describe the experimental factor each sample belongs to.
Community Discussions
Trending Discussions on Autotuner
QUESTION
My code is as follows:
...ANSWER
Answered 2021-Jun-08 at 09:22
To be able to fiddle with the models after resampling, it's best to call resample() with store_models = TRUE.
Using your example
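The answer's worked example is not reproduced here; as a rough illustration of the advice, a minimal sketch with a stand-in task and learner (both assumptions, not the asker's data) might look like this:

    library(mlr3)
    library(mlr3learners)

    task    = tsk("sonar")                    # stand-in task; the original question used its own data
    learner = lrn("classif.ranger")

    # keep the fitted models so they can be inspected after resampling
    rr = resample(task, learner, rsmp("cv", folds = 3), store_models = TRUE)

    rr$learners[[1]]$model                    # the model fitted in the first resampling iteration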
QUESTION
For survival analysis, I am using the mlr3proba package of R. My dataset consists of 39 features (both continuous and factor, which I converted all to integer and numeric) and the target (time & status). I want to tune the hyperparameter num_nodes in the param_set. This is a ParamUty class parameter with default value 32,32, so I decided to transform it. I wrote the code as follows for hyperparameter optimization of the surv.deephit learner using nested cross-validation (with 10 inner and 3 outer folds).
ANSWER
Answered 2021-Apr-17 at 08:46
Hi, thanks for using mlr3proba. I have actually just finished writing a tutorial that answers exactly this question! It covers training, tuning, and evaluating neural networks in mlr3proba. For your specific question, the relevant part of the tutorial is this:
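The tutorial code itself is elided above. The sketch below is only a hedged reconstruction of the usual paradox pattern for a ParamUty parameter such as num_nodes: tune ordinary integer parameters and expand them into the required vector inside a trafo. The names nodes and k are illustrative.

    library(paradox)

    # tune width and depth as plain integers ...
    search_space = ps(
      nodes = p_int(lower = 1, upper = 32),
      k     = p_int(lower = 1, upper = 4)
    )

    # ... and expand them into the vector that num_nodes (a ParamUty) expects
    search_space$trafo = function(x, param_set) {
      x$num_nodes = rep(x$nodes, x$k)
      x$nodes = NULL
      x$k = NULL
      x
    }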
QUESTION
I have been trying to use mlr3 to do some hyperparameter tuning for xgboost. I want to compare three different models:
- xgboost tuned over just the alpha hyperparameter
- xgboost tuned over alpha and lambda hyperparameters
- xgboost tuned over alpha, lambda, and maxdepth hyperparameters.
After reading the mlr3 book, I thought that using AutoTuner for the nested resampling and benchmarking would be the best way to go about doing this. Here is what I have tried:
...ANSWER
Answered 2021-Mar-24 at 09:04
To see whether tuning has an effect, you can just add an untuned learner to the benchmark. Otherwise, the conclusion could be that tuning alpha is sufficient for your example.
I adapted the code so that it runs with an example task.
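The adapted code is not shown here. As a rough sketch of the general idea (the example task, trimmed-down search space, and budget are assumptions), a benchmark that compares an untuned xgboost learner against an AutoTuner might look like this:

    library(mlr3)
    library(mlr3learners)
    library(mlr3tuning)
    library(paradox)

    task  = tsk("sonar")                      # stand-in task
    inner = rsmp("cv", folds = 3)             # inner resampling for tuning
    outer = rsmp("cv", folds = 3)             # outer resampling for the benchmark

    # AutoTuner tuning only alpha; further AutoTuners (alpha + lambda, ...) follow the same pattern
    at_alpha = AutoTuner$new(
      learner      = lrn("classif.xgboost", nrounds = 50),
      resampling   = inner,
      measure      = msr("classif.ce"),
      search_space = ps(alpha = p_dbl(lower = 0, upper = 1)),
      terminator   = trm("evals", n_evals = 10),
      tuner        = tnr("random_search")
    )

    design = benchmark_grid(
      tasks       = task,
      learners    = list(lrn("classif.xgboost", nrounds = 50), at_alpha),  # untuned baseline + tuned learner
      resamplings = outer
    )
    bmr = benchmark(design, store_models = TRUE)
    bmr$aggregate(msr("classif.ce"))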
QUESTION
I would like to repeat the hyperparameter tuning (alpha and/or lambda) of glmnet in mlr3 to avoid variability in smaller data sets. In caret, I could do this with "repeatedcv". Since I really like the mlr3 family of packages, I would like to use them for my analysis. However, I am not sure about the correct way to do this step in mlr3.
Example data
...ANSWER
Answered 2021-Mar-21 at 22:36
Repeated hyperparameter tuning (alpha and lambda) of glmnet can be done using the SECOND mlr3 approach as stated above.
The coefficients can be extracted with stats::coef and the values stored in the AutoTuner.
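The answer's full code is elided above. A hedged sketch of the key ingredient — using repeated cross-validation as the AutoTuner's inner resampling, roughly analogous to caret's "repeatedcv" — could look like this (the task, parameter bounds, and budget are illustrative):

    library(mlr3)
    library(mlr3learners)
    library(mlr3tuning)
    library(paradox)

    task = tsk("sonar")                                           # stand-in task

    at = AutoTuner$new(
      learner      = lrn("classif.glmnet"),
      resampling   = rsmp("repeated_cv", folds = 5, repeats = 10),  # repeated CV inside the tuner
      measure      = msr("classif.ce"),
      search_space = ps(
        alpha = p_dbl(lower = 0, upper = 1),
        s     = p_dbl(lower = 0.001, upper = 1)                   # penalty used for prediction
      ),
      terminator   = trm("evals", n_evals = 20),
      tuner        = tnr("random_search")
    )

    at$train(task)
    at$tuning_result                                              # tuned alpha and s
    stats::coef(at$learner$model, s = at$tuning_result$s)         # coefficients at the tuned penalty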
QUESTION
Recently I have been learning about nested resampling in the mlr3 package. According to the mlr3 book, the goal of nested resampling is to get unbiased performance estimates for learners. I ran a test as follows:
...ANSWER
Answered 2021-Feb-26 at 12:50
The result shows that the 3 hyperparameter sets chosen from the 3 inner resamplings are not guaranteed to be the same.
It sounds like you want to fit a final model with the hyperparameters selected in the inner resamplings. Nested resampling is not used to select hyperparameter values for a final model. Only check the inner tuning results for stable hyperparameters. This means that the selected hyperparameters should not vary too much.
Yes, you are comparing the aggregated performance of all outer resampling test sets (rr$aggregate()) with the performances estimated on the inner resampling test sets (lapply(rr$learners, function(x) x$tuning_result)). The aggregated performance of all outer resampling iterations is the unbiased performance of a ranger model with optimal hyperparameters found by grid search. You can run at$train(task) to get a final model and report the performance estimated with nested resampling as the unbiased performance of this model.
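A compact sketch of that workflow (nested resampling for the unbiased estimate, then a final fit on the full task) might look as follows; the task and the small search space are placeholders, not the poster's setup:

    library(mlr3)
    library(mlr3learners)
    library(mlr3tuning)
    library(paradox)

    task = tsk("sonar")                                   # stand-in task

    at = AutoTuner$new(
      learner      = lrn("classif.ranger"),
      resampling   = rsmp("cv", folds = 3),               # inner resampling
      measure      = msr("classif.ce"),
      search_space = ps(mtry = p_int(lower = 1, upper = 10)),
      terminator   = trm("none"),                         # grid search stops when the grid is exhausted
      tuner        = tnr("grid_search", resolution = 5)
    )

    # outer resampling: gives the unbiased performance estimate
    rr = resample(task, at, rsmp("cv", folds = 3), store_models = TRUE)
    rr$aggregate()                                        # report this as the model's performance
    lapply(rr$learners, function(x) x$tuning_result)      # hyperparameters picked in each outer fold

    # final model on all observations, tuned once more on the full task
    at$train(task)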
QUESTION
I am facing a difficulty with filtering out the least important variables in my model. I received a set of data with more than 4,000 variables, and I have been asked to reduce the number of variables going into the model.
I have already tried two approaches, but I have failed twice.
The first thing I tried was to manually check variable importance after modelling and, based on that, remove non-significant variables.
...ANSWER
Answered 2021-Feb-19 at 23:21
The reason why you can't access $importance of the at variable is that it is an AutoTuner, which does not directly offer variable importance and only "wraps" around the actual Learner being tuned.
The trained GraphLearner is saved inside your AutoTuner under $learner:
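The answer's own code is elided. As a hedged illustration of drilling down from an AutoTuner to the importance values of a wrapped ranger learner (the pipeop ids, task, and search space below are assumptions, not the poster's graph), one could do something like:

    library(mlr3)
    library(mlr3learners)
    library(mlr3pipelines)
    library(mlr3tuning)
    library(paradox)

    task  = tsk("sonar")                                              # stand-in task
    graph = po("scale") %>>% lrn("classif.ranger", importance = "impurity")

    at = AutoTuner$new(
      learner      = GraphLearner$new(graph),
      resampling   = rsmp("cv", folds = 3),
      measure      = msr("classif.ce"),
      search_space = ps(classif.ranger.mtry = p_int(lower = 1, upper = 10)),
      terminator   = trm("evals", n_evals = 5),
      tuner        = tnr("random_search")
    )
    at$train(task)

    # the tuned GraphLearner sits under $learner; its pipeop states hold the ranger fit
    at$learner$model$classif.ranger$model$variable.importance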
QUESTION
This is a really basic question, but I haven't found the answer on other sites, so I am kind of forced to ask about it here.
I fitted my "classif.ranger" learner using the benchmark(design, store_models) function from the mlr3 library and I need to access the fitted parameters. I found nothing about this in the benchmark documentation, so I tried to do it the hard way: I set store_models to TRUE and tried to access the model using fitted(), but it returned NULL.
I know the question is basic and that I am probably doing something wrong (for example misreading the documentation), but I just have no idea how to actually access the parameters... please help.
If it is needed in such a (probably) trivial situation, here comes the code:
...ANSWER
Answered 2021-Jan-21 at 21:28
You can use getBMRModels() to get the models, which will tell you what hyperparameters were used to fit them. See the benchmark section of the documentation.
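Note that getBMRModels() comes from the older mlr package. For the mlr3 benchmark() result described in the question, a hedged sketch of pulling the stored models out of the BenchmarkResult (assuming store_models = TRUE and a stand-in design) could be:

    library(mlr3)
    library(mlr3learners)

    design = benchmark_grid(
      tasks       = tsk("sonar"),                      # stand-in task
      learners    = lrn("classif.ranger"),
      resamplings = rsmp("cv", folds = 3)
    )
    bmr = benchmark(design, store_models = TRUE)

    rr = bmr$resample_result(1)                        # first task/learner/resampling combination
    rr$learners[[1]]$model                             # fitted ranger model of the first fold
    rr$learners[[1]]$param_set$values                  # hyperparameters used to fit it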
QUESTION
I have a follow-up question to this one. As in the initial question, I am using the mlr3verse, have a new dataset, and would like to make predictions using parameters that performed well during autotuning. The answer to that question says to use at$train(task). This seems to initiate tuning again. Does it take advantage of the nested resampling at all by using those parameters?
Also, looking at at$tuning_result there are two sets of parameters, one called tune_x and one called params. What is the difference between these?
Thanks.
Edit: example workflow added below
...ANSWER
Answered 2020-May-07 at 10:30
As ?AutoTuner tells, this class fits a model with the best hyperparameters found during the tuning. This model is then used for prediction, in your case on new data when calling its method $predict_newdata().
Also in ?AutoTuner you see the documentation linked to ?TuningInstance. This then tells you what the $tune_x and $params slots represent. Try to look up the help pages next time - that's what they are there for ;)
This seems to initiate tuning again.
Why again? It does it in the first place, on all observations of task. I assume you might be confusing yourself with the common misconception between "train/predict" vs. "resample". Read more about the theoretical differences between the two to understand what each is doing. They have completely different aims and are not connected.
Maybe the following reprex makes it clearer.
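The original reprex is elided; the sketch below only illustrates the train-then-predict-on-new-data flow the answer describes (the task, learner, and search space are stand-ins):

    library(mlr3)
    library(mlr3tuning)
    library(paradox)

    task = tsk("sonar")                                 # stand-in task

    at = AutoTuner$new(
      learner      = lrn("classif.rpart"),
      resampling   = rsmp("cv", folds = 3),
      measure      = msr("classif.ce"),
      search_space = ps(cp = p_dbl(lower = 0.001, upper = 0.1)),
      terminator   = trm("evals", n_evals = 10),
      tuner        = tnr("random_search")
    )

    at$train(task)                     # tunes on all observations of task, then refits with the best config
    at$tuning_result                   # best configuration found during tuning

    newdata = task$data(rows = 1:5)    # stand-in for genuinely new observations
    at$predict_newdata(newdata)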
QUESTION
I am currently working on an ETL Dataflow job (using the Apache Beam Python SDK) which queries data from CloudSQL (with psycopg2 and a custom ParDo) and writes it to BigQuery. My goal is to create a Dataflow template which I can start from AppEngine using a Cron job.
I have a version which works locally using the DirectRunner. For that I use the CloudSQL (Postgres) proxy client so that I can connect to the database on 127.0.0.1.
When using the DataflowRunner with custom commands to start the proxy within a setup.py script, the job won't execute. It gets stuck, repeating this log message:
Setting node annotation to enable volume controller attach/detach
Part of my setup.py looks like the following:
...ANSWER
Answered 2018-Jun-13 at 08:10
I finally found a workaround. I took the idea to connect via the public IP of the CloudSQL instance. For that you need to allow connections to your CloudSQL instance from every IP:
- Go to the overview page of your CloudSQL instance in GCP
- Click on the Authorization tab
- Click on Add network and add 0.0.0.0/0 (!! this will allow every IP address to connect to your instance !!)
To add security to the process, I used SSL keys and only allowed SSL connections to the instance:
- Click on the SSL tab
- Click on Create a new certificate to create an SSL certificate for your server
- Click on Create a client certificate to create an SSL certificate for your client
- Click on Allow only SSL connections to reject all non-SSL connection attempts
After that I stored the certificates in a Google Cloud Storage bucket and loaded them before connecting within the Dataflow job, i.e.:
QUESTION
I'd like to use PipeOps to train a learner on three alternative transformations of a dataset:
- No transformation.
- Class balancing - down.
- Class balancing - up.
Then, I'd like to benchmark the three learned models.
My idea was to set up the pipeline as follows:
- Make the pipeline: Input -> Impute dataset (optional) -> Branch -> Split into the three branches described above -> Add the learner within each branch -> Unbranch.
- Train the pipeline and hope (that's where I'm getting it wrong) that there will be a result saved for each learner within each branch.
Unfortunately, following these steps results in a single learner that seems to have 'merged' everything from the different branches. I was hoping to get a list of length 3, but I get a list of length one instead.
R code:
...ANSWER
Answered 2020-Apr-16 at 11:04
I think that I've found the answer to what I'm looking for. In brief, what I'd like to do is:
Create a graph pipeline with multiple learners. I'd like some of the learners to be inserted with fixed hyperparameters, while for others I'd like to have their hyperparameters tuned. Then, I'd like to benchmark them and select the 'best' one. I'd also like the benchmarking of learners to happen under different class balancing strategies, namely, do nothing, up-sample and down-sample. The optimal parameter settings for the up/down-sampling (e.g. ratio) would also be determined during tuning.
Two examples below, one that almost does what I want, the other doing exactly what I want.
Example 1: Build a pipe that includes all learners, that is, learners with fixed hyperparameters, as well as learners whose hyperparameters require tuning
As will be shown, it seems like a bad idea to have both kinds of learners (i.e. with fixed and tunable hyperparameters), because tuning the pipe disregards the learners with tunable hyperparameters.
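The two worked examples are elided above. As a hedged sketch of the branching idea itself — no transformation vs. down-sampling vs. up-sampling, with the branch choice evaluated as a tunable parameter — something along these lines could serve as a starting point (the task, learner, ratios, and pipeop ids are illustrative):

    library(mlr3)
    library(mlr3pipelines)
    library(mlr3tuning)
    library(paradox)

    task = tsk("sonar")                                           # stand-in task

    graph = po("branch", options = c("nop", "down", "up")) %>>%
      gunion(list(
        po("nop"),
        po("classbalancing", id = "down", adjust = "major", reference = "major", ratio = 1/2),
        po("classbalancing", id = "up",   adjust = "minor", reference = "minor", ratio = 2)
      )) %>>%
      po("unbranch") %>>%
      lrn("classif.rpart")

    at = AutoTuner$new(
      learner      = GraphLearner$new(graph),
      resampling   = rsmp("cv", folds = 3),
      measure      = msr("classif.ce"),
      search_space = ps(branch.selection = p_fct(c("nop", "down", "up"))),
      terminator   = trm("none"),                                 # grid search evaluates each branch once
      tuner        = tnr("grid_search")
    )
    at$train(task)

    at$tuning_result                   # the balancing strategy that performed best
    at$tuning_instance$archive         # all evaluated branches, one result per strategy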
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported