
xgboost | Scalable, Portable and Distributed Gradient Boosting Library, for Python, R, Java, Scala, C++ and more

by dmlc | C++ | Version: v1.6.0rc1 | License: Apache-2.0


kandi X-RAY | xgboost Summary

xgboost is a C++ library typically used in Big Data, Spark, and Hadoop applications. xgboost has no bugs, it has no vulnerabilities, it has a Permissive License, and it has medium support. You can download it from GitHub.
eXtreme Gradient Boosting.

Support

  • xgboost has a medium active ecosystem.
  • It has 22,464 stars, 8,355 forks, and 939 watchers.
  • There was 1 major release in the last 6 months.
  • There are 244 open issues and 4,099 closed issues. On average, issues are closed in 163 days. There are 44 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of xgboost is v1.6.0rc1.

Quality

  • xgboost has 0 bugs and 0 code smells.

Security

  • xgboost has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
  • xgboost code analysis shows 0 unresolved vulnerabilities.
  • There are 0 security hotspots that need review.

License

  • xgboost is licensed under the Apache-2.0 License. This license is Permissive.
  • Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

  • xgboost releases are available to install and integrate.
  • Installation instructions are not available. Examples and code snippets are available.
  • It has 38264 lines of code, 2650 functions and 326 files.
  • It has medium code complexity. Code complexity directly impacts maintainability of the code.

xgboost Key Features

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
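Below is a minimal sketch of training a model with the xgboost Python package's scikit-learn-style API on synthetic data; the dataset and parameter values are illustrative and are not taken from the xgboost documentation.

import numpy as np
import xgboost as xgb

# synthetic binary-classification data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# scikit-learn-style estimator from the Python package
model = xgb.XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X[:400], y[:400])
print("holdout accuracy:", (model.predict(X[400:]) == y[400:]).mean())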

© Contributors, 2021. Licensed under an [Apache-2](https://github.com/dmlc/xgboost/blob/master/LICENSE) license.

Contribute to XGBoost
---------------------
XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.
Check out the [Community Page](https://xgboost.ai/community).

Reference
---------
- Tianqi Chen and Carlos Guestrin. [XGBoost: A Scalable Tree Boosting System](http://arxiv.org/abs/1603.02754). In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016
- XGBoost originates from a research project at the University of Washington.

Sponsors
--------
Become a sponsor and get a logo here. See details at [Sponsoring the XGBoost Project](https://xgboost.ai/sponsors). The funds are used to defray the cost of continuous integration and testing infrastructure (https://xgboost-ci.net).

## Open Source Collective sponsors
[![Backers on Open Collective](https://opencollective.com/xgboost/backers/badge.svg)](#backers) [![Sponsors on Open Collective](https://opencollective.com/xgboost/sponsors/badge.svg)](#sponsors)

### Sponsors
[[Become a sponsor](https://opencollective.com/xgboost#sponsor)]

[NVIDIA](https://www.nvidia.com/en-us/)
(Other sponsor logos are displayed from the project's Open Collective sponsor list.)

### Backers
[[Become a backer](https://opencollective.com/xgboost#backer)]

[Backers on Open Collective](https://opencollective.com/xgboost#backers)

## Other sponsors
The sponsors in this list are donating cloud hours in lieu of cash donation.

[Amazon Web Services](https://aws.amazon.com/)

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start

Unexpected CURL error: Failed to connect to 127.0.0.1 port 54321: Connection reset by peer
$ brew edit curl # add --disable-socketpair to args list
$ brew install --build-from-source curl # using reinstall might be needed instead of install

$ export RCURL_PATH="/usr/local/opt/curl@7.81.0" # can be found using `brew info curl`
$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl

$ sudo apt install devscripts
$ # make sure source repositories are enabled (uncommented in /etc/apt/s
$ apt-get source curl
$ sudo apt-get build-dep curl
$ cd curl
$ nano debian/rules # add the --disable-socketpair configure option
$ dch -i # bump the version
$ debuild -us -uc -b # build the package
$ sudo dpkg -i ../curl-some_version.deb

$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl

how to properly initialize a child class of XGBRegressor?

class XGBoostQuantileRegressor(XGBRegressor):
    def __init__(self, quant_alpha, max_depth=3, **kwargs):
        self.quant_alpha = quant_alpha
        super().__init__(max_depth=max_depth, **kwargs)

    # other methods unchanged and omitted for brevity.
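A hypothetical usage sketch of the subclass above (the parameter values are illustrative): because quant_alpha is assigned before super().__init__() runs, scikit-learn utilities that rely on get_params(), such as clone(), should see the extra parameter.

from sklearn.base import clone

reg = XGBoostQuantileRegressor(quant_alpha=0.9, max_depth=4, n_estimators=50)
print(reg.get_params()["quant_alpha"])   # expected: 0.9
reg_copy = clone(reg)                    # cloning should preserve quant_alpha as well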

Jupyter shell commands in a function

def foo(astr):
    !ls $astr

foo('*.py')
!ls *.py
-----------------------
from IPython import get_ipython
ipython = get_ipython()

code = ipython.transform_cell('!ls')
print(code)
exec(code)
exec(ipython.transform_cell('!ls'))
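The same transform_cell idea can be wrapped in a small reusable helper. This is a hedged sketch; the helper name run_shell is made up for illustration and it only works inside an IPython/Jupyter session.

from IPython import get_ipython

def run_shell(cmd):
    ipython = get_ipython()
    # transform_cell turns '!cmd' into the Python code IPython would run for it
    code = ipython.transform_cell(f"!{cmd}")
    exec(code, globals())

run_shell("ls *.py")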

dask_xgboost.predict works but cannot be shown -Data must be 1-dimensional

Dask-XGBoost has been deprecated and is no longer maintained.
The functionality of this project has been included directly
in XGBoost. To use Dask and XGBoost together, please use
xgboost.dask instead
https://xgboost.readthedocs.io/en/latest/tutorials/dask.html.
# note the .dask
model_xgb = xgb.dask.DaskXGBRegressor(seed=42, verbose=True)

grid_search = GridSearchCV(model_xgb, params, cv=3, scoring='neg_mean_squared_error')

grid_search.fit(X_train, y_train)

# retrain on the training data with the best params found
model_xgb.client = client
model_xgb.set_params(**grid_search.best_params_)
model_xgb.fit(X_train, y_train, eval_set=[(X_test, y_test)])
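For context, here is a hedged sketch of the xgboost.dask workflow the snippet above points to: a Dask client is created first and attached to the estimator so training is distributed across the cluster. The cluster size, data, and parameter values are illustrative assumptions, not part of the original answer.

import xgboost as xgb
import dask.array as da
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=2, threads_per_worker=1)
client = Client(cluster)

# synthetic dask arrays standing in for the user's data
X = da.random.random((10_000, 20), chunks=(1_000, 20))
y = da.random.random(10_000, chunks=1_000)

model = xgb.dask.DaskXGBRegressor(n_estimators=100, tree_method="hist")
model.client = client           # attach the Dask client before fitting
model.fit(X, y)
pred = model.predict(X)         # a dask array; call .compute() to materialize it
print(pred.compute()[:5])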

Tuning XGBoost Hyperparameters with RandomizedSearchCV

hyperparameter_grid = {
    'n_estimators': [100, 500, 900, 1100, 1500],
    'max_depth': [2, 3, 5, 10, 15],
    'learning_rate': [0.05, 0.1, 0.15, 0.20],
    'min_child_weight': [1, 2, 3, 4]
    }
hyperparameter_grid = {
    'n_estimators': [100, 400, 800],
    'max_depth': [3, 6, 9],
    'learning_rate': [0.05, 0.1, 0.20],
    'min_child_weight': [1, 10, 100]
    }
hyperparameter_grid = {
    'max_depth': [3, 6, 9],
    'min_child_weight': [1, 10, 100]
    }
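For completeness, a sketch of wiring one of the grids above into RandomizedSearchCV with an XGBRegressor; the estimator, scoring choice, and n_iter value are assumptions for illustration, and X_train / y_train stand for the user's own data.

import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV

hyperparameter_grid = {
    'n_estimators': [100, 400, 800],
    'max_depth': [3, 6, 9],
    'learning_rate': [0.05, 0.1, 0.20],
    'min_child_weight': [1, 10, 100],
}

search = RandomizedSearchCV(
    xgb.XGBRegressor(),
    param_distributions=hyperparameter_grid,
    n_iter=20,                         # number of random parameter combinations to try
    scoring='neg_mean_squared_error',
    cv=3,
    random_state=42,
)
# search.fit(X_train, y_train)         # X_train / y_train come from the user's own data
# print(search.best_params_)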

How to get hyperparameters of xgb.train in python

import json

config = json.loads(bst.save_config())
config['learner']['gradient_booster']['updater']['grow_colmaker']['train_param']

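For context, a hedged sketch of the surrounding workflow: bst is a Booster returned by xgb.train, and save_config() serialises its full internal configuration as JSON. The training data and parameter values here are illustrative.

import json
import numpy as np
import xgboost as xgb

# illustrative data and parameters
X = np.random.rand(100, 5)
y = np.random.rand(100)
dtrain = xgb.DMatrix(X, label=y)

bst = xgb.train({'max_depth': 3, 'eta': 0.1}, dtrain, num_boost_round=10)

config = json.loads(bst.save_config())
print(json.dumps(config, indent=2)[:500])   # inspect the nested structure, e.g. learner -> gradient_booster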

FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan

{'eta ':[0.01, 0.05, 0.1, 0.2]},...   # the trailing space makes 'eta ' an invalid parameter name, so every fit fails
{'eta':[0.01, 0.05, 0.1, 0.2]},...    # corrected key

grid_lr = {
    'cls__class_weight': [None, 'balanced'],
    'cls__C': [0, .001, .01, .1, 1]   # note: C=0 is not a valid value (C must be > 0) and can trigger the same warning
}

I fail to run caret's nnet regression

library(tidyverse)
library(caret)
feature = data.frame(x = rnorm(100, 0, 1) %>% as.double()) # changed line 
outcome = rnorm(100, 0, 1) %>% as.double()
CATE_model = caret::train(
  x = feature, y = outcome, method = "nnet",
  tuneGrid = expand.grid(size=c(1:3), decay=seq(0.1, 1, 0.1)),
  weights = NULL, linout = TRUE
)

Can you find the mistake in my custom evaluation metric? XGBOOST R

eval_metric <- c()

for (i in 1:100) {
  
  trained_models <- xgb.train(data = training_vectors, gamma = 0, nrounds = i, max_depth = 2,
                              objective = "binary:logistic", verbose = 0,
                              feval = my_metric, watchlist = watchlist)
  eval_metric[i] <- my_metric(predict(trained_models,testing_vectors), testing_vectors)$value
}
eval_metric

  [1] 1.0833762 1.0332087 1.0702217 1.1165583 0.9980249 1.0447095 0.9964721 0.9674231 0.8648293
 [10] 0.9044608 0.8724537 0.9304222 0.8491665 0.8829176 0.9304336 0.9221882 0.8533177 0.8376518
 [19] 0.7965470 0.8284276 0.8067912 0.7947449 0.7577542 0.7864774 0.7560513 0.7355429 0.7609600
 [28] 0.7640666 0.7101464 0.7291165 0.7655773 0.7347603 0.6886943 0.7110074 0.6942958 0.6838692
 [37] 0.2975801 0.3121724 0.6874055 0.3178953 0.3018035 0.3133702 0.6857661 0.6927544 0.3043382
 [46] 0.2982567 0.2908952 0.2772635 0.2722214 0.2677541 0.2610758 0.2715461 0.2818424 0.3041806
 [55] 0.3227641 0.3138340 0.3105319 0.3045225 0.3009517 0.3114915 0.3061301 0.3169128 0.3118879
 [64] 0.3083425 0.3185155 0.3115889 0.3202170 0.3141242 0.3115893 0.3265834 0.3178155 0.3211948
 [73] 0.3145838 0.3232811 0.3168709 0.3215020 0.3140709 0.3214312 0.3146561 0.3219147 0.3156422
 [82] 0.3099746 0.3176437 0.3261342 0.3212111 0.3146619 0.3215416 0.3296011 0.3362954 0.3328568
 [91] 0.3266897 0.3216920 0.3297096 0.3246411 0.3192709 0.3235182 0.6988236 0.3270507 0.7030137
[100] 0.3299713

GridSearchCV not choosing the best hyperparameters for xgboost

from sklearn.model_selection import KFold

seed_cv = 123 # any random value here

kf = KFold(n_splits=5, shuffle=True, random_state=seed_cv) # shuffle=True is required for random_state to take effect

grid_xgb_reg=GridSearchCV(xgb_reg,
                          param_grid=params,
                          scoring=scorer,
                          cv=kf,   # <- change here
                          n_jobs=-1)
seed_xgb = 456 # any random value here (can even be the same as seed_cv)
xgb_reg = xgb.XGBRegressor(random_state=seed_xgb)

Community Discussions

Trending Discussions on xgboost
  • Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start
  • Can we set minimum samples per leaf in XGBoost (like in other GBM algos)?
  • Dataproc Cluster creation is failing with PIP error "Could not build wheels"
  • h2o build fails with java 15
  • how to properly initialize a child class of XGBRegressor?
  • What is the use of DMatrix?
  • Jupyter shell commands in a function
  • dask_xgboost.predict works but cannot be shown -Data must be 1-dimensional
  • Tuning XGBoost Hyperparameters with RandomizedSearchCV
  • How to get hyperparameters of xgb.train in python

QUESTION

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start

Asked 2022-Jan-27 at 19:14

I am experiencing a persistent error while trying to use H2O's h2o.automl function. I am trying to repeatedly run this model. It seems to completely fail after 5 or 10 runs.

Error in .h2o.__checkConnectionHealth() : 
  H2O connection has been severed. Cannot connect to instance at http://localhost:54321/
getaddrinfo() thread failed to start

In addition: There were 13 warnings (use warnings() to see them)
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix,  : 
  Unexpected CURL error: getaddrinfo() thread failed to start

I have updated Java in response to https://h2o-release.s3.amazonaws.com/h2o/rel-wolpert/4/docs-website/h2o-docs/faq/r.html (even though I am using a Linux virtual machine). I have added h2o.removeAll() and gc() in response to "R h2o server CURL error, kind of repeatable". I have not attempted any changes regarding memory because my cluster has 16+ GB and the highest reading I have seen is 1.6 GiB in RStudio.

H2O is running in R/RStudio Server on an Ubuntu 20.04 virtual machine. Could the VirtualBox software be blocking something?

The details on my H2O cluster are below:

openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)

Starting H2O JVM and connecting: ... Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         1 seconds 896 milliseconds 
    H2O cluster timezone:       America/Chicago 
    H2O data parsing timezone:  UTC 
    H2O cluster version:        3.35.0.2 
    H2O cluster version age:    19 hours and 24 minutes  
    H2O cluster name:           H2O_started_from_R_jholderieath_glq667 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   19.84 GB 
    H2O cluster total cores:    12 
    H2O cluster allowed cores:  12 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
    R Version:                  R version 4.1.1 (2021-08-10) 

ANSWER

Answered 2022-Jan-27 at 19:14

I think I also experienced this issue, although on macOS 12.1. I tried to debug it and found out that sometimes I also get another error:

Unexpected CURL error: Failed to connect to 127.0.0.1 port 54321: Connection reset by peer

I found out that this issue appears only when I have RCurl compiled against curl 7.68.0 and above.

Downgrading to curl 7.67.0 resolved the issue for me, but then I got some issues with RStudio (segmentation fault), so I looked into the issue a little further.

And I found out that compiling a recent version of curl with --disable-socketpair solved it for me as well.

I was monitoring open files and sockets (lsof) and it seems to me that the R process runs out of sockets it can create, and RCurl then fails with one of those errors. Running gc() in R frequently helps (I called it after every single request), but the minimum number of open sockets after gc() still increases slowly but monotonically, which leads me to believe there might be a leak. I reported this as a possible bug to the RCurl maintainers.

For anybody using macOS and homebrew this can be accomplished by running the following:

$ brew edit curl # add --disable-socketpair to args list
$ brew install --build-from-source curl # using reinstall might be needed instead of install

$ export RCURL_PATH="/usr/local/opt/curl@7.81.0" # can be found using `brew info curl`
$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl

Looking at the curl version in Ubuntu 20.04, which is 7.68.0 (according to https://packages.ubuntu.com/focal/curl), I think you won't be able to use the following approach directly, as --disable-socketpair was only added in curl 7.73.0; since you are using a virtual machine, it might be easier to just use Ubuntu 18.04, which is still supported and uses an old enough curl version (7.58.0).

I haven't used ubuntu for a while but at least I can provide some pseudo-code that should do the same:

$ sudo apt install devscripts
$ # make sure source repositories are enabled (uncommented in /etc/apt/s
$ apt-get source curl
$ sudo apt-get build-dep curl
$ cd curl
$ nano debian/rules # add the --disable-socketpair configure option
$ dch -i # bump the version
$ debuild -us -uc -b # build the package
$ sudo dpkg -i ../curl-some_version.deb

$ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
$ export LDFLAGS="-L$RCURL_PATH/lib"
$ export CPPFLAGS="-I$RCURL_PATH/include"
$ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"

$ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
$ R -e 'RCurl::curlVersion()$version' # check if RCurl is using the proper version of curl

Source https://stackoverflow.com/questions/69485936

Community Discussions and Code Snippets contain sources from the Stack Exchange Network.

Vulnerabilities

No vulnerabilities reported

Install xgboost

You can download it from GitHub.

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the community page or Stack Overflow.
