glmtree | Logistic regression trees: a decision tree | Machine Learning library
kandi X-RAY | glmtree Summary
The goal of glmtree is to build decision trees with logistic regressions at their leaves, so that the resulting model mixes non-parametric (tree-based) and parametric (regression-based), as well as stepwise and linear, approaches to achieve the best predictive results while maintaining interpretability. This is the implementation of glmtree as described in Formalization and study of statistical problems in Credit Scoring, Ehrhardt A. (see manuscript or web article).
Trending Discussions on glmtree
QUESTION
Currently I am working with the glmtree() function in R. I have some factor variables with 20+ levels. The problem comes with the representation of the tree: there is some information at certain leaves that is impossible to visualise due to the large number of levels in certain variables (e.g. i_mode has 29 levels).
One possible solution would be to "dummify" those levels. However, I'd rather not do that if at all possible.
Do you know a method in which I can represent the same plot in a more readable form?
Any clue?
Thank you
ANSWER
Answered 2021-Feb-16 at 01:45
My feeling is that it will be challenging to understand such a plot, also beyond the labeling issue. Personally, I would try to break down such a factor into more intelligible groups with fewer levels (not necessarily binary, though).
Having said that, the panel function edge_simple() that draws the edge labels in the tree has some arguments that can help improve readability, e.g., you can alternate their position and change the font size. For a worked example see:
R partykit::ctree offset labels on edges
Additionally you could try abbreviating the factor levels prior to learning the tree. However, with 29 levels all of this will probably not help much, I'm afraid.
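As a rough sketch of what the answer describes (not code from the original answer), assuming a tree already fitted with partykit and stored in an object tr: the justmin argument of edge_simple() can be supplied through plot()'s ep_args to force alternating edge labels, and the overall font size can be reduced via grid's gpar().

library("partykit")
library("grid")

## tr is assumed to be a fitted tree, e.g. (hypothetical formula and data):
## tr <- glmtree(y ~ x | i_mode + z, data = d, family = binomial)

plot(tr,
     ep_args = list(justmin = 1),   # alternate/justify even short edge labels
     gp = gpar(fontsize = 8))       # smaller font size for all labels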
QUESTION
Currently I am working with the dataset predictions. In this data I have converted clearly character-type variables into factors because I think factors work better than characters for glmtree() (tell me if I am wrong about this):
...ANSWER
Answered 2021-Feb-14 at 09:49
You are right that glmtree() and the underlying mob() function expect the split variables to be factors in the case of nominal information. However, computationally this is only feasible for factors that have a limited number of levels, because the algorithm will try all possible partitions of the levels into two groups. Thus, for your i_mode factor with nl levels this necessitates going through 2^(nl - 1) - 1 possible splits into two groups.
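As a rough illustration of these combinatorics (a sketch based on the 2^(nl - 1) - 1 count above, not code from the original answer), the number of candidate binary splits for a 29-level factor such as i_mode is:

## number of ways to split a nominal factor with nl levels into two
## non-empty groups, i.e. the candidate splits mob()/glmtree() would evaluate
nl <- 29
n_splits <- 2^(nl - 1) - 1
n_splits
## [1] 268435455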
QUESTION
ANSWER
Answered 2021-Jan-24 at 16:38
In such a situation I recommend plotting the tree on a device that is big enough to show everything and where you can zoom easily, etc. For example, one can plot into a big PDF file and then browse and zoom with the PDF viewer. Something like this should work ok:
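A minimal sketch (assuming a fitted tree object tr, e.g. from glmtree(); the file name and dimensions are placeholders):

## open a large PDF device (width/height in inches), plot the tree, close it
pdf("big_tree.pdf", width = 60, height = 20)
plot(tr)
dev.off()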
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported