caretEnsemble | caret models all the way down turtle | Machine Learning library
kandi X-RAY | caretEnsemble Summary
Framework for fitting multiple caret models using the same re-sampling strategy as well as creating ensembles of such models. Use caretList to fit multiple models, and then use caretEnsemble to combine them greedily, or caretStack to combine them using a caret model. caretEnsemble was inspired by medley, which in turn was inspired by Caruana et al.'s (2004) paper Ensemble Selection from Libraries of Models.
Community Discussions
Trending Discussions on caretEnsemble
QUESTION
I keep running into an error while attempting to plot variable importance from an ensemble of models.
I have fitted an ensemble of models, and now I am trying to create a variable importance plot for each algorithm I've fitted. I am using the varImp() function from caret to extract variable importance, then plot() it. To fit the ensemble of models, I am using the caretEnsemble package.
Thank you for any help, please see the example of code below.
...ANSWER
Answered 2019-Jun-06 at 11:31
I've come up with a solution to the problem above and decided to post it as my own answer. I've written a small function to plot variable importance without relying on caret helper functions to create the plots. I used dotplot and levelplot because caret returns a data.frame whose layout differs depending on the algorithm. It may not work for other algorithms or for models that failed to fit.
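The answer's own function is not reproduced on this page. As a minimal sketch of the underlying idea only, the importance tables can be pulled out per model and inspected before choosing a plot type. The two models, the iris data, and the names ctrl, models, and importance_tables are illustrative assumptions, not the asker's setup.

```r
library(caret)

set.seed(123)
ctrl <- trainControl(method = "cv", number = 5)

# Two illustrative models (hypothetical stand-ins for the asker's ensemble).
models <- list(
  rpart = train(Species ~ ., data = iris, method = "rpart", trControl = ctrl),
  knn   = train(Species ~ ., data = iris, method = "knn",   trControl = ctrl)
)

# varImp() returns an object whose $importance element is a data.frame;
# its columns depend on the model type.
importance_tables <- lapply(models, function(m) varImp(m)$importance)

# Compare the column layouts before deciding how to plot each table.
lapply(importance_tables, names)
```

Because a caretList is a list of train objects, the same lapply() pattern should carry over to it, though that is not verified here against the asker's code.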
QUESTION
Is it possible to perform a stacked ensemble with H2O (under R) using previously run caret models? How could we load caret models onto the H2O server?
(I am aware of the existence of the 'caretEnsemble' package, but it does not handle multiclass data).
Thanks for your advice.
...ANSWER
Answered 2018-Dec-14 at 10:28
No, you can only build a stacked ensemble from H2O models, i.e. the models must have been trained on an H2O cluster.
QUESTION
I have several prediction models which are created using the same trainControl. These models have to be created beforehand (i.e. I can't use caretList to train multiple models simultaneously).
Below is my minimal example. When I manually combine multiple (already created) models and pass them to caretStack,
ANSWER
Answered 2018-Nov-13 at 12:36
You are almost there. One thing to remember when you want to use caretEnsemble is that you have to set the resampling index via the 'index' option in trainControl. If you run caretList, it tends to set this itself, but it is better to do this yourself. This is especially true when you train different models outside of caretList. You need to make sure the resampling is the same for every model. You can also see this in the example on GitHub that you refer to.
QUESTION
I have some r/caret code to fit several cross-validated models to some data, but I'm getting a warning message that I'm having trouble finding any information on. Is this something I should be concerned about?
...ANSWER
Answered 2018-Apr-26 at 01:47
trainControl does not generate the indices for you by default; it acts as a way of passing all the parameters to each model you are training.
Searching the GitHub issues for this error turns up the following explanation:
You need to make sure that every model is fit with the EXACT same resampling folds. caretEnsemble builds the ensemble by merging together the test sets for each cross-validation fold, and you will get incorrect results if each fold has different observations in it.
Before you fit your models, you need to construct a trainControl object, and manually set the indexes in that object.
E.g. myControl <- trainControl(index=createFolds(y, 10)).
We are working on an interface to caretEnsemble that handles constructing the resampling strategy for you and then fitting multiple models using those resamples, but it is not yet finished.
To reiterate, that check is there for a reason. You need to set the index argument in trainControl, and pass the EXACT SAME indexes to each model you wish to ensemble.
So when you specify number = 5 and repeats = 3, the models do not receive a predetermined index of which samples belong to each fold; each model generates its own folds independently.
Therefore, to ensure that the models agree on which samples belong to which folds, you must specify index = createFolds(iris$Species, 5) in your trainControl object.
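As a hedged sketch of that advice: the index argument of trainControl expects, for each resample, the rows used for training. createMultiFolds() returns training rows and matches a number = 5, repeats = 3 design, so it is used here in place of the createFolds() call quoted above (createFolds() returns the held-out rows unless returnTrain = TRUE is passed). The model choices and the names shared_index, my_control, fit_rpart, and fit_knn are assumptions for illustration.

```r
library(caret)

set.seed(42)

# Training-row indices for 5-fold CV repeated 3 times (15 resamples).
shared_index <- createMultiFolds(iris$Species, k = 5, times = 3)

# One control object, reused for every model that will be ensembled.
my_control <- trainControl(
  method = "repeatedcv",
  number = 5,
  repeats = 3,
  index = shared_index,        # identical resamples for every model
  savePredictions = "final"    # keep held-out predictions for stacking
)

# Models trained separately, but on exactly the same resamples.
fit_rpart <- train(Species ~ ., data = iris, method = "rpart",
                   trControl = my_control)
fit_knn   <- train(Species ~ ., data = iris, method = "knn",
                   trControl = my_control)

# Sanity check: both models carry the same resampling index.
identical(fit_rpart$control$index, fit_knn$control$index)
```

Fixing the index once and reusing the control object keeps the folds stable even when the models are trained in separate sessions, provided shared_index itself is saved and reloaded.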
QUESTION
I have some code which fits several (cross-validated) models to some data, as below.
...ANSWER
Answered 2017-Jul-18 at 02:25
OK, this is actually addressed well in the docs.
QUESTION
I've been trying to stack together predictions from 2 regression models (glmnet and bagEarth), but I have been getting the "Error in FUN(X[[i]], ...) : { .... is not TRUE" message. Based on what I've read, this issue usually stems from the resampling indexes, but since I am training the models together, I can't see how that could be the cause here. I've been able to replicate it using random numbers:
...ANSWER
Answered 2018-Jun-19 at 12:40
A bit of debugging shows that the error comes from a function called bestPreds. This is a non-exported function that looks in the model list for the saved predictions, which are only kept when savePredictions is set (all or final) in the control object. You have not set this in your control object. If you add it, everything will run fine. I do admit that a clear error message would be nice here instead of a failed assertion.
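A minimal sketch of the setting the answer refers to, using lm on mtcars purely as a stand-in for the asker's glmnet and bagEarth models (the data and the names fit_default and fit_saved are assumptions). It shows that held-out predictions are stored on the fitted object only when savePredictions is set.

```r
library(caret)

set.seed(7)

# Default control: held-out predictions are discarded after resampling.
ctrl_default <- trainControl(method = "cv", number = 5)

# Control suitable for stacking: keep predictions for the final tuning values.
ctrl_saved <- trainControl(method = "cv", number = 5,
                           savePredictions = "final")

fit_default <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                     trControl = ctrl_default)
fit_saved   <- train(mpg ~ wt + hp, data = mtcars, method = "lm",
                     trControl = ctrl_saved)

is.null(fit_default$pred)   # no stored predictions to stack on
head(fit_saved$pred)        # held-out predictions, one row per observation
```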
QUESTION
I'm creating a simple ensemble of two xgboost and mxnet models. The data frame is A3n.df with the classification variable at A3n.df[,1]. Both the models run fine on their own and get believable accuracy. All data is normalized 0-1, shuffled and the class variable converted to a factor (for caret). I have already run grid search for the best hyperparameters, but need to include a grid for caretEnsemble.
...ANSWER
Answered 2018-Feb-13 at 21:17
There's a note in the caret documentation that num.round needs to be set by the user outside the tune_grid: http://topepo.github.io/caret/train-models-by-tag.html
QUESTION
I am testing most of the models caret supports on a bunch of PCs. Unfortunately, caret's "suggested" packages do not include most of the model packages available to caret. Every time a new version of R comes out, I have to sit in front of each PC and wait for each prompt to press the 1 button and Enter. Is there an option I could set to tell R or RStudio to just install anything asked for, and to answer "a" to every a/s/n prompt too?
...ANSWER
Answered 2017-Jul-31 at 02:45
This code:
QUESTION
TL/DR ANSWER: specify training data in the newdata argument.
How do I consistently extract class probabilities from trained models with caret's predict? Currently I get an error when the model passed to predict was trained with the formula notation and a variable was marked to be ignored with -variable.
This can be reproduced with:
...ANSWER
Answered 2017-Jan-19 at 05:28
Just use the newdata parameter and it will work.
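A short sketch of that advice, assuming a knn model on iris with one predictor excluded through the formula (the data, the model, and the names fit and probs are illustrative, not the asker's code):

```r
library(caret)

set.seed(1)

# Formula interface with one variable excluded via "- variable".
fit <- train(Species ~ . - Petal.Width, data = iris, method = "knn",
             trControl = trainControl(method = "cv", number = 5))

# Pass the data explicitly through newdata; type = "prob" requests
# one probability column per class instead of a predicted label.
probs <- predict(fit, newdata = iris, type = "prob")
head(probs)
```

Passing newdata explicitly lets predict rebuild the predictors from the stored formula, so the excluded column can remain in the data frame.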
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported