mlr3: Machine Learning in R - next generation | Machine Learning library
kandi X-RAY | mlr3 Summary
Package website: release | dev. Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.
Community Discussions
Trending Discussions on mlr3
QUESTION
I am using the mlr3 package for autotuning ML models (an mlr3pipelines graph, to be more precise).
It is very hard to reproduce the problem because the error occurs occasionally. The same code sometimes returns an error and sometimes doesn't.
Here is the code snippet
...ANSWER
Answered 2022-Mar-14 at 22:09

There is a fixed buffer for loading RNG functions in uuid, which will fail if too many DLLs are already loaded. A simple work-around is to run:
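The answer's code was not captured above; a minimal sketch of the idea, assuming the work-around is simply to load uuid's DLL before the session accumulates too many others:

```r
# Sketch of the work-around: register uuid's DLL early, before the many
# packages an mlr3 pipeline attaches use up the available DLL slots.
library(uuid)
invisible(UUIDgenerate())  # forces the RNG functions to load now

library(mlr3)
library(mlr3pipelines)
# ... autotuning code as before
```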
QUESTION
I have the following problem. I want to build a model for landcover classification. My data are multitemporal remote sensing data with several bands. For training, I created stratified, randomly distributed points and extracted the spectral data at their positions. With these data a Random Forest (rpart) was trained using the mlr3 package. For accuracy measurement, a repeated spatial cross-validation was performed using mlr3spatiotempcv.

The resulting model of the training step is, after extraction, stored in an R object of class rpart. The variable names are stored in the terms field of this object: all the bands I used, but also the spatial x and y coordinates. This causes problems when predicting new data. Using the terra package, I got an error that the x and y layers are missing in my input data, which makes sense, because they are stored in the terms field of the model. But from my understanding, the coordinates should not be variables of the model; they are only used for spatial resampling, not for predicting. I "solved" this by removing the x and y coordinates during training and performing an ordinary, non-spatial cross-validation instead. After that, the prediction works perfectly.

So, my question is: how can I train a model with the mlr3 package on data containing coordinates, perform spatial cross-validation, and then use this model to predict a new raster?
...ANSWER
Answered 2022-Mar-02 at 11:11

You have found a bug. When the task is created from a data.frame instead of an sf object, coords_as_features is set to TRUE. The default should be FALSE. You can install a fixed version of the package with remotes::install_github("mlr-org/mlr3spatiotempcv"). This fix should be included in the next CRAN version soon. Thanks for reporting.
This brings problems when predicting new data.
Why do you use the models from resampling to predict new data? Usually, you estimate the performance of the final model with (spatial) cross-validation, but the final model used to predict new data is fitted on the complete data set.
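A sketch of that workflow, assuming mlr3spatiotempcv's `as_task_classif_st()` constructor and a coordinate-based resampling; the data object and target name are placeholders:

```r
library(mlr3)
library(mlr3spatiotempcv)

# Build the task from an sf object; with the fix, coordinates are kept for
# spatial resampling but not used as features (coords_as_features = FALSE).
task = as_task_classif_st(train_points_sf, target = "landcover",
                          coords_as_features = FALSE)

# Performance estimate via repeated spatial cross-validation
rr = resample(task, lrn("classif.rpart"),
              rsmp("repeated_spcv_coords", folds = 5, repeats = 10))
rr$aggregate(msr("classif.ce"))

# Final model for raster prediction: fitted on the complete data set
final = lrn("classif.rpart")$train(task)
```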
QUESTION
How can I control the number of configurations being evaluated during hyperband tuning in mlr3? I noticed that when I tune 6 parameters in xgboost(), the code evaluates about 9 configurations. When I tune the same number of parameters in catboost(), the code starts with evaluating 729 configurations. I am using eta = 3 in both cases.
...ANSWER
Answered 2022-Feb-08 at 20:42

The number of sampled configurations in hyperband is determined by the lower and upper bounds of the budget hyperparameter and by eta. You can get a preview of the schedule and the number of configurations:
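The preview call itself was not captured above; a sketch, assuming `mlr3hyperband::hyperband_schedule()` and illustrative budget bounds:

```r
library(mlr3hyperband)

# Brackets, stages, budget per configuration and number of configurations
# for a budget hyperparameter ranging from 1 to 81 with eta = 3.
hyperband_schedule(r_min = 1, r_max = 81, eta = 3)
```

Widening the budget range (r_min to r_max) adds brackets and therefore sampled configurations, which would explain the difference between the two learners' setups.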
QUESTION
I want to code some hyperband tuning in mlr3. I started by running the subsample-rpart hyperband example from chapter 4.4 of the mlr3 book, copied directly from there. I am getting an error: Error in benchmark(design, store_models = self$store_models, allow_hotstart = self$allow_hotstart, : unused argument (clone = character())

How do I fix it?
...ANSWER
Answered 2022-Feb-01 at 22:16

You have to install mlr3 0.13.1 from CRAN.
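In code, the fix amounts to updating the package and checking the installed version:

```r
# Update mlr3 from CRAN and confirm the fixed version is installed
install.packages("mlr3")
packageVersion("mlr3")  # should report 0.13.1 or later
```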
QUESTION
I am a beginner with mlr3 and am facing problems while running an AutoFSelector learner associated with glmnet on a classification task containing >2000 numeric variables. I can reproduce this error with the simpler predefined mlr3 task Sonar. For the record, I am using R version 4.1.2 (2021-11-01) on macOS Monterey 12.1. All required packages were installed from CRAN and loaded.
...ANSWER
Answered 2022-Jan-24 at 18:05

This is a problem specific to glmnet. glmnet requires at least two features to fit a model, but in at least one configuration (the first ones in a sequential forward search) you only have one feature.

There are two possibilities to solve this:

- Open an issue in mlr3fselect and request a new argument min_features (there already is max_features) to be able to start the search with 2 or more features.
- Augment the base learner with a fallback which gets fitted if the base learner fails. Here is a fallback to a simple logistic regression:
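The fallback code was not captured above; a sketch using the encapsulation fields current at the time of the answer (the API has since changed in newer mlr3 releases):

```r
library(mlr3)
library(mlr3learners)

learner = lrn("classif.glmnet")

# If glmnet errors (e.g. only one feature selected so far), the
# fallback logistic regression is fitted instead.
learner$fallback = lrn("classif.log_reg")
learner$encapsulate = c(train = "evaluate", predict = "evaluate")
```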
QUESTION
I have a task where the observations in rows have a date order. I generate a custom resampling scheme that respects this order in all train/test splits.

I also want to address the unbalanced-classes problem by upsampling the minority class. Within the training sets, the time order is not important (and the learner would not use it anyway).

Now, I want to resample this combination of an ordered task, a graph learner (including upsampling) and a time-sensitive custom resampling scheme. But this is problematic.

To show this, I generated the following code. I use a sample task to make this reproducible, and I augment it with a date column to generate an ordered task similar to my problem. This code runs only if I omit the problematic lines indicated in the code. But they generate exactly what I have in my real-world problem: an order. So how can I solve this?

(I omit some of the output in the following reprex for readability.)
...ANSWER
Answered 2021-Dec-09 at 18:08

You could upsample the data set first and then create the custom resampling splits.
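A sketch of that idea, assuming a class-balancing PipeOp for the upsampling; the task and the row indices for the custom splits are illustrative stand-ins:

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("german_credit")  # stand-in for the real, date-ordered task

# Upsample once, up front, instead of inside the graph learner
task_up = po("classbalancing", adjust = "minor",
             ratio = 2)$train(list(task))[[1]]

# Then build the order-respecting custom splits on the enlarged task
resampling = rsmp("custom")
resampling$instantiate(task_up,
  train_sets = list(1:600),      # illustrative indices
  test_sets  = list(601:800))
```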
QUESTION
The code below is the sample from the mlr3cluster GitHub repo. My question is whether the learner is overtrained because the task is not split into a train and a test set, or does it take care of that on its own internally?

My guess is that it does, but I'm not sure. I'm new to R and mlr3 and can't seem to find documentation on this topic.
...ANSWER
Answered 2021-Nov-14 at 18:45

As pointed out in the comments, learners don't split the data themselves. The resampling chapter of the mlr3 book has much more detail on this -- the short version is that mlr3 provides many ways of splitting your data automatically, which you can then use to evaluate the models learners induce in an unbiased fashion.

All of that said, for clustering this doesn't really apply in the same way, because clustering is an unsupervised method (i.e. there is no ground-truth data we want the model to learn). So what you're doing in your code is fine if all you're interested in is how observations are assigned to clusters for further analysis.

However, if you are treating this as a classification problem (i.e. you want the clustering to recover the classes contained in the original task), you do need to split into training and testing. In that case, I would recommend using a classification learner instead of a clustering method, though.
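As an aside, the explicit split described here can be sketched with mlr3's `partition()` helper, shown on a classification task (the data set and learner are illustrative):

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")

# Stratified train/test row ids (70/30 split)
split = partition(task, ratio = 0.7)

learner$train(task, row_ids = split$train)
learner$predict(task, row_ids = split$test)$score(msr("classif.ce"))
```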
QUESTION
I am using the benchmark() function in mlr3 to compare several ML algorithms. One of them is XGBoost with hyperparameter tuning. Thus, I have an outer resampling to evaluate the overall performance (hold-out sample) and an inner resampling for the hyperparameter tuning (5-fold cross-validation). Besides an estimate of the accuracy of all ML algorithms, I would like to see the feature importance of the tuned XGBoost model. For that, I would have to access the tuned model within the benchmark object. I do not know how to do that: the object returned by benchmark() is a deeply nested list, and I do not understand its structure.

This answer on Stack Overflow did not help me, because it uses a different setup (a learner in a pipeline rather than a benchmark object). This answer on GitHub did not help me either, because it shows how to extract all the information about the benchmarking at once, but not how to extract one (tuned) model of one of the learners in the benchmark.

Below is the code I am using to carry out the nested resampling. Following the benchmarking, I would like to estimate the feature importance as described here, which requires accessing the tuned XGB model.
...ANSWER
Answered 2021-Nov-03 at 16:54

library(mlr3tuning)
library(mlr3learners)
library(mlr3misc)

learner = lrn("classif.xgboost", nrounds = to_tune(100, 500), eval_metric = "logloss")

at = AutoTuner$new(
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 5),
  tuner = tnr("random_search"),
  store_models = TRUE
)

design = benchmark_grid(task = tsk("pima"), learner = at, resampling = rsmp("cv", folds = 5))
bmr = benchmark(design, store_models = TRUE)
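The extraction step itself was not captured above; a sketch of pulling the tuned learners out of `bmr`, assuming `store_models = TRUE` as in the answer's code:

```r
# Flatten the benchmark result; each row holds one outer fold's AutoTuner
tab = as.data.table(bmr)

# The tuned xgboost learner sits inside each AutoTuner
outer_learners = map(tab$learner, "learner")

# Feature importance of, e.g., the first outer fold's tuned model
outer_learners[[1]]$importance()
```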
QUESTION
I have the following code. I get an error when saving the trained model. The error only occurs when I use lightgbm.
...ANSWER
Answered 2021-Nov-01 at 08:17

QUESTION
I want to run a number of machine learning algorithms with different feature-selection methods on survival data using the mlr3 package. For that, I am using the benchmark() function of mlr3.

Unfortunately, the filter feature-selection methods of mlr3 do not support survival yet. However, the mlr package supports survival filters.

I can fuse mlr learners with an mlr filter method. After that, I need to convert them to an mlr3 learner in order to be able to use the benchmark_grid() function of mlr3.

Is there any way to use mlr survival filters in mlr3? Or is there any way to convert mlr filters to mlr3 filters?
...ANSWER
Answered 2021-Oct-17 at 17:43

Unfortunately not -- the basic designs of mlr and mlr3 are fundamentally different.
Community Discussions and Code Snippets contain sources from the Stack Exchange Network.