Comments (17)
Yes. Plan to get that underway this week. Hope beyond hope that will be a smooth process... Would love to see a sample of that use later on if that's possible.
from diagrammer.
Not yet finished but of course I will send it to you when written.
Btw, I suspect the answer is no, but I prefer to ask: is it possible to have several trees in the same image?
The ML package I am working on is from the ensemble tree family, meaning the model uses several decision trees to make a decision. Therefore I will need to plot several trees.
To give you an idea, this is a txt dump of a simple model:
booster[0]:
0:[f0<1.00001] yes=1,no=2,missing=2 gain=9.00675,cover=21
1:[f3<62.5] yes=3,no=4,missing=3 gain=0.588164,cover=10.75
3:leaf=-1.36842cover=8.5
4:[f3<65.5] yes=7,no=8,missing=7 gain=0.307692,cover=2.25
7:leaf=-0cover=1
8:leaf=-0.666667cover=1.25
2:[f3<39] yes=5,no=6,missing=5 gain=3.19787,cover=10.25
5:leaf=-0.909091cover=1.75
6:[f3<61.5] yes=9,no=10,missing=9 gain=5.62929,cover=8.5
9:[f3<51.5] yes=11,no=12,missing=11 gain=0.405797,cover=4.75
11:leaf=0.222222cover=1.25
12:leaf=1.11111cover=3.5
10:[f2<1.00001] yes=13,no=14,missing=14 gain=0.361134,cover=3.75
13:leaf=-0.8cover=1.5
14:[f3<67.5] yes=15,no=16,missing=15 gain=1.42308,cover=2.25
15:leaf=0.5cover=1
16:leaf=-0.666667cover=1.25
booster[1]:
0:[f3<53.5] yes=1,no=2,missing=1 gain=1.233,cover=16.2397
1:[f0<1.00001] yes=3,no=4,missing=4 gain=0.235218,cover=5.902
3:leaf=-0.722073cover=3.23434
4:[f2<1.00001] yes=7,no=8,missing=8 gain=0.324221,cover=2.66767
7:leaf=0.143253cover=1.06578
8:leaf=-0.416187cover=1.60189
2:[f3<57.5] yes=5,no=6,missing=5 gain=2.28004,cover=10.3377
5:leaf=0.734631cover=2.08826
6:[f3<64.5] yes=9,no=10,missing=9 gain=0.576289,cover=8.24939
9:[f3<61.5] yes=11,no=12,missing=11 gain=0.520244,cover=4.85999
11:leaf=-0.0681728cover=2.46091
12:leaf=-0.703358cover=2.39908
10:[f0<1.00001] yes=13,no=14,missing=14 gain=0.0666218,cover=3.3894
13:leaf=-0.082765cover=1.37079
14:leaf=0.145369cover=2.01861
booster[2]:
0:[f3<31.5] yes=1,no=2,missing=1 gain=0.994684,cover=14.4559
1:leaf=-0.725791cover=1.20749
2:[f0<1.00001] yes=3,no=4,missing=4 gain=0.43832,cover=13.2484
3:[f7<1.00001] yes=5,no=6,missing=6 gain=0.221549,cover=6.05906
5:[f3<61.5] yes=9,no=10,missing=9 gain=0.0545174,cover=2.72321
9:leaf=-0.0736994cover=1.61054
10:leaf=0.143768cover=1.11267
6:[f6<1.00001] yes=11,no=12,missing=12 gain=0.00975752,cover=3.33585
11:leaf=-0.402298cover=1.56072
12:leaf=-0.135091cover=1.77513
4:[f3<68.5] yes=7,no=8,missing=7 gain=0.715236,cover=7.18931
7:[f7<1.00001] yes=13,no=14,missing=14 gain=0.284033,cover=6.038
13:[f2<1.00001] yes=15,no=16,missing=16 gain=0.866298,cover=3.33827
15:leaf=-0.435249cover=1.17335
16:leaf=0.386262cover=2.16492
14:leaf=0.488874cover=2.69973
8:leaf=-0.372583cover=1.15131
booster[3]:
0:[f5<1.00001] yes=1,no=2,missing=2 gain=0.291621,cover=13.0014
1:leaf=-0.437401cover=1.14007
2:[f3<56.5] yes=3,no=4,missing=3 gain=0.296743,cover=11.8614
3:[f3<32.5] yes=5,no=6,missing=5 gain=0.767718,cover=3.7931
5:leaf=-0.312716cover=1.30799
6:[f3<53.5] yes=9,no=10,missing=9 gain=0.0421358,cover=2.48511
9:leaf=0.151712cover=1.42147
10:leaf=0.569316cover=1.06365
4:[f3<58.5] yes=7,no=8,missing=7 gain=0.288247,cover=8.06826
7:leaf=-0.449068cover=1.2372
8:[f3<60.5] yes=11,no=12,missing=11 gain=0.218973,cover=6.83106
11:leaf=0.222998cover=1.51689
12:[f2<1.00001] yes=13,no=14,missing=14 gain=0.216326,cover=5.31417
13:leaf=0.154285cover=1.28126
14:[f3<64.5] yes=15,no=16,missing=15 gain=0.0556476,cover=4.03291
15:leaf=-0.356278cover=1.26271
16:[f3<67.5] yes=17,no=18,missing=17 gain=0.0487459,cover=2.7702
17:leaf=0.0423429cover=1.23857
18:leaf=-0.173485cover=1.53163
booster[4]:
0:[f4<1.00001] yes=1,no=2,missing=2 gain=0.327443,cover=12.76
1:leaf=0.32368cover=1.06381
2:[f3<54.5] yes=3,no=4,missing=3 gain=0.391578,cover=11.6962
3:[f0<1.00001] yes=5,no=6,missing=6 gain=0.100243,cover=3.19493
5:leaf=-0.473333cover=1.77612
6:leaf=-0.0766397cover=1.41881
4:[f3<57.5] yes=7,no=8,missing=7 gain=0.233588,cover=8.50126
7:leaf=0.307528cover=1.14447
8:[f3<66.5] yes=9,no=10,missing=9 gain=0.193483,cover=7.3568
9:[f3<63.5] yes=11,no=12,missing=11 gain=0.0894609,cover=5.31245
11:[f2<1.00001] yes=13,no=14,missing=14 gain=0.0373335,cover=3.80138
13:leaf=-0.124951cover=1.74452
14:[f3<60.5] yes=15,no=16,missing=15 gain=0.0176378,cover=2.05686
15:leaf=0.0971158cover=1.05058
16:leaf=-0.0390881cover=1.00628
12:leaf=-0.305334cover=1.51107
10:leaf=0.15303cover=2.04435
booster[5]:
0:[f3<32.5] yes=1,no=2,missing=1 gain=0.202925,cover=12.3398
1:leaf=-0.29677cover=1.26005
2:[f3<47] yes=3,no=4,missing=3 gain=0.271826,cover=11.0798
3:leaf=0.345466cover=1.23939
4:[f7<1.00001] yes=5,no=6,missing=6 gain=0.110205,cover=9.84037
5:[f2<1.00001] yes=7,no=8,missing=8 gain=0.13005,cover=5.43926
7:leaf=-0.120634cover=1.8881
8:[f3<58.5] yes=11,no=12,missing=11 gain=0.529277,cover=3.55116
11:leaf=-0.235791cover=1.3903
12:[f3<61.5] yes=13,no=14,missing=13 gain=0.111476,cover=2.16086
13:leaf=0.543864cover=1.04246
14:leaf=0.0710653cover=1.1184
6:[f3<66.5] yes=9,no=10,missing=9 gain=0.150789,cover=4.40111
9:leaf=-0.276771cover=2.27269
10:leaf=0.0468063cover=2.12842
booster[6]:
0:[f4<1.00001] yes=1,no=2,missing=2 gain=0.170242,cover=12.0306
1:leaf=0.260167cover=1.13877
2:[f3<53.5] yes=3,no=4,missing=3 gain=0.283698,cover=10.8919
3:leaf=-0.306286cover=2.11739
4:[f3<57.5] yes=5,no=6,missing=5 gain=0.312135,cover=8.77446
5:leaf=0.335615cover=1.6591
6:[f3<64.5] yes=7,no=8,missing=7 gain=0.14184,cover=7.11536
7:[f2<1.00001] yes=9,no=10,missing=10 gain=0.0247509,cover=4.01684
9:leaf=-0.0443511cover=1.81671
10:leaf=-0.223651cover=2.20013
8:[f3<67.5] yes=11,no=12,missing=11 gain=0.100104,cover=3.09852
11:leaf=0.240929cover=1.10811
12:leaf=-0.0519394cover=1.99042
booster[7]:
0:[f3<32.5] yes=1,no=2,missing=1 gain=0.197457,cover=11.9813
1:leaf=-0.265111cover=1.19593
2:[f3<56.5] yes=3,no=4,missing=3 gain=0.306775,cover=10.7854
3:[f6<1.00001] yes=5,no=6,missing=6 gain=0.0724137,cover=3.03167
5:leaf=0.0705344cover=1.62101
6:leaf=0.402162cover=1.41066
4:[f3<58.5] yes=7,no=8,missing=7 gain=0.112676,cover=7.7537
7:leaf=-0.241201cover=1.26529
8:[f3<65.5] yes=9,no=10,missing=9 gain=0.0552068,cover=6.48841
9:[f2<1.00001] yes=11,no=12,missing=12 gain=0.407982,cover=3.64093
11:leaf=-0.227347cover=1.54115
12:[f3<61.5] yes=15,no=16,missing=15 gain=0.00967593,cover=2.09978
15:leaf=0.386804cover=1.01897
16:leaf=0.0974195cover=1.08081
10:[f3<68.5] yes=13,no=14,missing=13 gain=0.0909914,cover=2.84748
13:leaf=-0.190313cover=1.5918
14:leaf=0.0909942cover=1.25568
booster[8]:
0:[f2<1.00001] yes=1,no=2,missing=2 gain=0.243304,cover=11.8183
1:[f3<58.5] yes=3,no=4,missing=3 gain=0.0995766,cover=3.14931
3:leaf=0.347267cover=1.28762
4:leaf=0.022496cover=1.86169
2:[f3<58.5] yes=5,no=6,missing=5 gain=0.0824455,cover=8.66899
5:[f3<56.5] yes=7,no=8,missing=7 gain=0.194386,cover=4.17696
7:[f0<1.00001] yes=11,no=12,missing=12 gain=0.200633,cover=3.14686
11:leaf=0.183377cover=1.65508
12:leaf=-0.211774cover=1.49178
8:leaf=-0.409993cover=1.03009
6:[f3<62.5] yes=9,no=10,missing=9 gain=0.159963,cover=4.49203
9:leaf=0.210931cover=1.48626
10:[f3<67.5] yes=13,no=14,missing=13 gain=0.000921036,cover=3.00578
13:leaf=-0.0424455cover=1.47902
14:leaf=-0.136137cover=1.52676
booster[9]:
0:[f5<1.00001] yes=1,no=2,missing=2 gain=0.232486,cover=12.0491
1:leaf=-0.316969cover=1.05291
2:[f3<32.5] yes=3,no=4,missing=3 gain=0.203656,cover=10.9962
3:leaf=-0.239403cover=1.14459
4:[f3<49.5] yes=5,no=6,missing=5 gain=0.258366,cover=9.85156
5:leaf=0.42213cover=1.02418
6:[f0<1.00001] yes=7,no=8,missing=8 gain=0.122395,cover=8.82738
7:[f3<62.5] yes=9,no=10,missing=9 gain=0.420141,cover=4.28074
9:[f3<56] yes=13,no=14,missing=13 gain=0.131082,cover=2.6222
13:leaf=-0.0102133cover=1.1535
14:leaf=-0.427704cover=1.4687
10:leaf=0.229973cover=1.65853
8:[f3<68.5] yes=11,no=12,missing=11 gain=0.38422,cover=4.54665
11:[f2<1.00001] yes=15,no=16,missing=16 gain=0.104008,cover=3.49122
15:leaf=-0.00161279cover=1.0497
16:leaf=0.35645cover=2.44152
12:leaf=-0.258477cover=1.05543
Each booster is an independent decision tree, usually focusing on a part of the data not learned by the previous trees. The f[number] is an id which will be replaced by the name of the feature used to split at that node, the yes=, no=, missing= fields give the ids of the child nodes and are the key to understanding how the branches of the tree are connected, and the gain is a metric of the importance of the feature in the decision tree.
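As a rough illustration of the format described above, here is a minimal Python sketch that reads such a dump into one record per node. It is not the code discussed in this thread (that code is written in R); the names `parse_dump`, `SPLIT_RE`, `LEAF_RE` and `BOOSTER_RE` are made up for this example, and the leaf pattern assumes the `leaf=...cover=...` layout shown in the dump, with or without a comma before `cover`.

```python
import re

# Split node, e.g. "0:[f3<31.5] yes=1,no=2,missing=1 gain=0.994684,cover=14.4559"
SPLIT_RE = re.compile(
    r"^\s*(?P<id>\d+):\[(?P<feature>[^<\]]+)<(?P<threshold>[^\]]+)\]\s+"
    r"yes=(?P<yes>\d+),no=(?P<no>\d+),missing=(?P<missing>\d+)"
    r"(?:\s+gain=(?P<gain>[-+\d.eE]+),cover=(?P<cover>[-+\d.eE]+))?\s*$"
)
# Leaf node, e.g. "1:leaf=-0.725791cover=1.20749" (a comma before "cover" is accepted too)
LEAF_RE = re.compile(
    r"^\s*(?P<id>\d+):leaf=(?P<value>-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
    r"(?:,?cover=(?P<cover>[-+\d.eE]+))?\s*$"
)
# Tree header, e.g. "booster[2]:"
BOOSTER_RE = re.compile(r"^\s*booster\[(\d+)\]:?\s*$")


def parse_dump(text):
    """Return one dict per node, tagged with the index of its booster (tree)."""
    nodes = []
    tree = None
    for line in text.splitlines():
        if not line.strip():
            continue
        m = BOOSTER_RE.match(line)
        if m:
            tree = int(m.group(1))
            continue
        m = SPLIT_RE.match(line)
        if m:
            nodes.append({
                "tree": tree, "id": int(m["id"]), "kind": "split",
                "feature": m["feature"], "threshold": float(m["threshold"]),
                "yes": int(m["yes"]), "no": int(m["no"]),
                "missing": int(m["missing"]),
                "gain": float(m["gain"]) if m["gain"] else None,
                "cover": float(m["cover"]) if m["cover"] else None,
            })
            continue
        m = LEAF_RE.match(line)
        if m:
            nodes.append({
                "tree": tree, "id": int(m["id"]), "kind": "leaf",
                "value": float(m["value"]),
                "cover": float(m["cover"]) if m["cover"] else None,
            })
            continue
        raise ValueError(f"unrecognised line: {line!r}")
    return nodes


# Two lines taken from booster[2] in the dump above.
SAMPLE = """booster[2]:
0:[f3<31.5] yes=1,no=2,missing=1 gain=0.994684,cover=14.4559
1:leaf=-0.725791cover=1.20749
"""

for node in parse_dump(SAMPLE):
    print(node)
```

Each split record keeps the `yes`, `no` and `missing` child ids, so the edges of the tree can be rebuilt from the records alone.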
from diagrammer.
Would really like to see the functionality proposed and also like to see DiagrammeR extended to cover rpart or the more comprehensive partykit. See here as an experiment before DiagrammeR existed.
I do think, though, that Suggests will be better than Imports, since I would say this is an enhancement rather than a requirement. See Package Dependencies.
from diagrammer.
I have tried it and it was easy to have several graphs on the same image. That's a very good thing.
@timelyportfolio first, thanks for your blog post about the DiagrammeR package, that is how I discovered it (and thanks to @rich-iannone for having built it). I have posted an image of the first tree here https://github.com/tqchen/xgboost/issues/123. Basically I parse the text model with some regex and convert it to a data.table. Then I build the markdown with some paste commands using the data.table. I am waiting for this package to be pushed to CRAN before pushing my code to xgboost (and it gives me time to polish my code). I am very pleased with the result.
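The "paste the pieces together" step could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the actual code in this thread is R, the thread does not say which graph markup it emits, and Graphviz DOT is used here simply as one text-based graph format. The function `to_dot`, the record layout and the feature name `example_feature` are hypothetical.

```python
def to_dot(nodes, feature_names=None):
    """Render node records as Graphviz DOT text.

    `nodes` is a list of dicts (a hypothetical layout): split nodes carry
    id/feature/threshold/yes/no, leaf nodes carry id/value.
    `feature_names` optionally maps ids such as "f6" to readable names.
    """
    feature_names = feature_names or {}
    lines = ["digraph tree {", "  node [shape=box];"]
    for n in nodes:
        if n["kind"] == "split":
            name = feature_names.get(n["feature"], n["feature"])
            # One labelled node, plus one edge per outcome of the test.
            lines.append(f'  {n["id"]} [label="{name} < {n["threshold"]}"];')
            lines.append(f'  {n["id"]} -> {n["yes"]} [label="yes"];')
            lines.append(f'  {n["id"]} -> {n["no"]} [label="no"];')
        else:
            lines.append(
                f'  {n["id"]} [label="leaf = {n["value"]}", shape=ellipse];'
            )
    lines.append("}")
    return "\n".join(lines)


# The subtree rooted at node 6 of booster[2] in the dump posted earlier.
nodes = [
    {"id": 6, "kind": "split", "feature": "f6", "threshold": 1.00001,
     "yes": 11, "no": 12},
    {"id": 11, "kind": "leaf", "value": -0.402298},
    {"id": 12, "kind": "leaf", "value": -0.135091},
]
dot = to_dot(nodes, {"f6": "example_feature"})
print(dot)
```

A string like this is the kind of input that DiagrammeR's `grViz()` accepts on the R side. Several trees could be placed in one image by emitting one subgraph per booster, which this sketch does not attempt.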
from diagrammer.
@pommedeterresautee could you perhaps post just the code + example of "Basically I parse the text model with some regex and convert it to a data.table." somewhere? Sorry to thread-jack
from diagrammer.
@mostly-harmless my WIP code is here: https://github.com/pommedeterresautee/xgboost/blob/master/R-package/R/xgb.plot.tree.R
The file it reads is the dump I posted two posts ago. Just put the content in a text file, change the path and generate the viz.
All the trees are generated.
@rich-iannone @timelyportfolio
Does anyone know if, in Shiny, it's possible to collapse a branch of a generated tree? (For example, you click on a node and the branch below that node is collapsed.)
from diagrammer.
I like the direction this conversation is headed. To keep it separate from CRAN integration, I thought it might be good to start issue #8 for this question:
"Does anyone know if, in Shiny, it's possible to collapse a branch of a generated tree? (For example, you click on a node and the branch below that node is collapsed.)"
from diagrammer.
@mostly-harmless function is complete.
from diagrammer.
Thanks @pommedeterresautee !
from diagrammer.
Thanks again @pommedeterresautee, that's great!
from diagrammer.
@rich-iannone did you find time to submit your package to CRAN?
from diagrammer.
@pommedeterresautee there is still a problem building the vignette. I need to resolve that issue before submitting to CRAN.
from diagrammer.
Okay, @pommedeterresautee and @timelyportfolio, I figured out the build issue with the vignette: I had a slightly older version of knitr. Once I updated that, I could build the vignette, and building the source package was free of errors. I'll submit to CRAN.
from diagrammer.
Now submitted to CRAN. Just need to wait for a reply from BDR.
from diagrammer.
After a few rounds of fixes, it's now in CRAN.
from diagrammer.
Awesome, I am pushing my code to XGBoost! First reverse dependency for DiagrammeR :-)
from diagrammer.
That is great to hear! Thanks @pommedeterresautee for all the help and interest so far.
from diagrammer.