CRAN integration (diagrammer issue, closed)

rich-iannone commented on May 18, 2024
CRAN integration

from diagrammer.

Comments (17)

rich-iannone commented on May 18, 2024

Yes. Plan to get that underway this week. Hope beyond hope that will be a smooth process... Would love to see a sample of that use later on if that's possible.

pommedeterresautee commented on May 18, 2024

It's not finished yet, but of course I will send it to you once it's written.
By the way, I suspect it isn't possible, but I'd rather ask: can several trees be drawn in the same image?

The ML package I am working on belongs to the tree ensemble family, meaning the model combines several decision trees to make a decision. I will therefore need to plot several trees.

To give you an idea, here is a text dump of a simple model:

booster[0]:
0:[f0<1.00001] yes=1,no=2,missing=2 gain=9.00675,cover=21
    1:[f3<62.5] yes=3,no=4,missing=3 gain=0.588164,cover=10.75
        3:leaf=-1.36842cover=8.5
        4:[f3<65.5] yes=7,no=8,missing=7 gain=0.307692,cover=2.25
            7:leaf=-0cover=1
            8:leaf=-0.666667cover=1.25
    2:[f3<39] yes=5,no=6,missing=5 gain=3.19787,cover=10.25
        5:leaf=-0.909091cover=1.75
        6:[f3<61.5] yes=9,no=10,missing=9 gain=5.62929,cover=8.5
            9:[f3<51.5] yes=11,no=12,missing=11 gain=0.405797,cover=4.75
                11:leaf=0.222222cover=1.25
                12:leaf=1.11111cover=3.5
            10:[f2<1.00001] yes=13,no=14,missing=14 gain=0.361134,cover=3.75
                13:leaf=-0.8cover=1.5
                14:[f3<67.5] yes=15,no=16,missing=15 gain=1.42308,cover=2.25
                    15:leaf=0.5cover=1
                    16:leaf=-0.666667cover=1.25
booster[1]:
0:[f3<53.5] yes=1,no=2,missing=1 gain=1.233,cover=16.2397
    1:[f0<1.00001] yes=3,no=4,missing=4 gain=0.235218,cover=5.902
        3:leaf=-0.722073cover=3.23434
        4:[f2<1.00001] yes=7,no=8,missing=8 gain=0.324221,cover=2.66767
            7:leaf=0.143253cover=1.06578
            8:leaf=-0.416187cover=1.60189
    2:[f3<57.5] yes=5,no=6,missing=5 gain=2.28004,cover=10.3377
        5:leaf=0.734631cover=2.08826
        6:[f3<64.5] yes=9,no=10,missing=9 gain=0.576289,cover=8.24939
            9:[f3<61.5] yes=11,no=12,missing=11 gain=0.520244,cover=4.85999
                11:leaf=-0.0681728cover=2.46091
                12:leaf=-0.703358cover=2.39908
            10:[f0<1.00001] yes=13,no=14,missing=14 gain=0.0666218,cover=3.3894
                13:leaf=-0.082765cover=1.37079
                14:leaf=0.145369cover=2.01861
booster[2]:
0:[f3<31.5] yes=1,no=2,missing=1 gain=0.994684,cover=14.4559
    1:leaf=-0.725791cover=1.20749
    2:[f0<1.00001] yes=3,no=4,missing=4 gain=0.43832,cover=13.2484
        3:[f7<1.00001] yes=5,no=6,missing=6 gain=0.221549,cover=6.05906
            5:[f3<61.5] yes=9,no=10,missing=9 gain=0.0545174,cover=2.72321
                9:leaf=-0.0736994cover=1.61054
                10:leaf=0.143768cover=1.11267
            6:[f6<1.00001] yes=11,no=12,missing=12 gain=0.00975752,cover=3.33585
                11:leaf=-0.402298cover=1.56072
                12:leaf=-0.135091cover=1.77513
        4:[f3<68.5] yes=7,no=8,missing=7 gain=0.715236,cover=7.18931
            7:[f7<1.00001] yes=13,no=14,missing=14 gain=0.284033,cover=6.038
                13:[f2<1.00001] yes=15,no=16,missing=16 gain=0.866298,cover=3.33827
                    15:leaf=-0.435249cover=1.17335
                    16:leaf=0.386262cover=2.16492
                14:leaf=0.488874cover=2.69973
            8:leaf=-0.372583cover=1.15131
booster[3]:
0:[f5<1.00001] yes=1,no=2,missing=2 gain=0.291621,cover=13.0014
    1:leaf=-0.437401cover=1.14007
    2:[f3<56.5] yes=3,no=4,missing=3 gain=0.296743,cover=11.8614
        3:[f3<32.5] yes=5,no=6,missing=5 gain=0.767718,cover=3.7931
            5:leaf=-0.312716cover=1.30799
            6:[f3<53.5] yes=9,no=10,missing=9 gain=0.0421358,cover=2.48511
                9:leaf=0.151712cover=1.42147
                10:leaf=0.569316cover=1.06365
        4:[f3<58.5] yes=7,no=8,missing=7 gain=0.288247,cover=8.06826
            7:leaf=-0.449068cover=1.2372
            8:[f3<60.5] yes=11,no=12,missing=11 gain=0.218973,cover=6.83106
                11:leaf=0.222998cover=1.51689
                12:[f2<1.00001] yes=13,no=14,missing=14 gain=0.216326,cover=5.31417
                    13:leaf=0.154285cover=1.28126
                    14:[f3<64.5] yes=15,no=16,missing=15 gain=0.0556476,cover=4.03291
                        15:leaf=-0.356278cover=1.26271
                        16:[f3<67.5] yes=17,no=18,missing=17 gain=0.0487459,cover=2.7702
                            17:leaf=0.0423429cover=1.23857
                            18:leaf=-0.173485cover=1.53163
booster[4]:
0:[f4<1.00001] yes=1,no=2,missing=2 gain=0.327443,cover=12.76
    1:leaf=0.32368cover=1.06381
    2:[f3<54.5] yes=3,no=4,missing=3 gain=0.391578,cover=11.6962
        3:[f0<1.00001] yes=5,no=6,missing=6 gain=0.100243,cover=3.19493
            5:leaf=-0.473333cover=1.77612
            6:leaf=-0.0766397cover=1.41881
        4:[f3<57.5] yes=7,no=8,missing=7 gain=0.233588,cover=8.50126
            7:leaf=0.307528cover=1.14447
            8:[f3<66.5] yes=9,no=10,missing=9 gain=0.193483,cover=7.3568
                9:[f3<63.5] yes=11,no=12,missing=11 gain=0.0894609,cover=5.31245
                    11:[f2<1.00001] yes=13,no=14,missing=14 gain=0.0373335,cover=3.80138
                        13:leaf=-0.124951cover=1.74452
                        14:[f3<60.5] yes=15,no=16,missing=15 gain=0.0176378,cover=2.05686
                            15:leaf=0.0971158cover=1.05058
                            16:leaf=-0.0390881cover=1.00628
                    12:leaf=-0.305334cover=1.51107
                10:leaf=0.15303cover=2.04435
booster[5]:
0:[f3<32.5] yes=1,no=2,missing=1 gain=0.202925,cover=12.3398
    1:leaf=-0.29677cover=1.26005
    2:[f3<47] yes=3,no=4,missing=3 gain=0.271826,cover=11.0798
        3:leaf=0.345466cover=1.23939
        4:[f7<1.00001] yes=5,no=6,missing=6 gain=0.110205,cover=9.84037
            5:[f2<1.00001] yes=7,no=8,missing=8 gain=0.13005,cover=5.43926
                7:leaf=-0.120634cover=1.8881
                8:[f3<58.5] yes=11,no=12,missing=11 gain=0.529277,cover=3.55116
                    11:leaf=-0.235791cover=1.3903
                    12:[f3<61.5] yes=13,no=14,missing=13 gain=0.111476,cover=2.16086
                        13:leaf=0.543864cover=1.04246
                        14:leaf=0.0710653cover=1.1184
            6:[f3<66.5] yes=9,no=10,missing=9 gain=0.150789,cover=4.40111
                9:leaf=-0.276771cover=2.27269
                10:leaf=0.0468063cover=2.12842
booster[6]:
0:[f4<1.00001] yes=1,no=2,missing=2 gain=0.170242,cover=12.0306
    1:leaf=0.260167cover=1.13877
    2:[f3<53.5] yes=3,no=4,missing=3 gain=0.283698,cover=10.8919
        3:leaf=-0.306286cover=2.11739
        4:[f3<57.5] yes=5,no=6,missing=5 gain=0.312135,cover=8.77446
            5:leaf=0.335615cover=1.6591
            6:[f3<64.5] yes=7,no=8,missing=7 gain=0.14184,cover=7.11536
                7:[f2<1.00001] yes=9,no=10,missing=10 gain=0.0247509,cover=4.01684
                    9:leaf=-0.0443511cover=1.81671
                    10:leaf=-0.223651cover=2.20013
                8:[f3<67.5] yes=11,no=12,missing=11 gain=0.100104,cover=3.09852
                    11:leaf=0.240929cover=1.10811
                    12:leaf=-0.0519394cover=1.99042
booster[7]:
0:[f3<32.5] yes=1,no=2,missing=1 gain=0.197457,cover=11.9813
    1:leaf=-0.265111cover=1.19593
    2:[f3<56.5] yes=3,no=4,missing=3 gain=0.306775,cover=10.7854
        3:[f6<1.00001] yes=5,no=6,missing=6 gain=0.0724137,cover=3.03167
            5:leaf=0.0705344cover=1.62101
            6:leaf=0.402162cover=1.41066
        4:[f3<58.5] yes=7,no=8,missing=7 gain=0.112676,cover=7.7537
            7:leaf=-0.241201cover=1.26529
            8:[f3<65.5] yes=9,no=10,missing=9 gain=0.0552068,cover=6.48841
                9:[f2<1.00001] yes=11,no=12,missing=12 gain=0.407982,cover=3.64093
                    11:leaf=-0.227347cover=1.54115
                    12:[f3<61.5] yes=15,no=16,missing=15 gain=0.00967593,cover=2.09978
                        15:leaf=0.386804cover=1.01897
                        16:leaf=0.0974195cover=1.08081
                10:[f3<68.5] yes=13,no=14,missing=13 gain=0.0909914,cover=2.84748
                    13:leaf=-0.190313cover=1.5918
                    14:leaf=0.0909942cover=1.25568
booster[8]:
0:[f2<1.00001] yes=1,no=2,missing=2 gain=0.243304,cover=11.8183
    1:[f3<58.5] yes=3,no=4,missing=3 gain=0.0995766,cover=3.14931
        3:leaf=0.347267cover=1.28762
        4:leaf=0.022496cover=1.86169
    2:[f3<58.5] yes=5,no=6,missing=5 gain=0.0824455,cover=8.66899
        5:[f3<56.5] yes=7,no=8,missing=7 gain=0.194386,cover=4.17696
            7:[f0<1.00001] yes=11,no=12,missing=12 gain=0.200633,cover=3.14686
                11:leaf=0.183377cover=1.65508
                12:leaf=-0.211774cover=1.49178
            8:leaf=-0.409993cover=1.03009
        6:[f3<62.5] yes=9,no=10,missing=9 gain=0.159963,cover=4.49203
            9:leaf=0.210931cover=1.48626
            10:[f3<67.5] yes=13,no=14,missing=13 gain=0.000921036,cover=3.00578
                13:leaf=-0.0424455cover=1.47902
                14:leaf=-0.136137cover=1.52676
booster[9]:
0:[f5<1.00001] yes=1,no=2,missing=2 gain=0.232486,cover=12.0491
    1:leaf=-0.316969cover=1.05291
    2:[f3<32.5] yes=3,no=4,missing=3 gain=0.203656,cover=10.9962
        3:leaf=-0.239403cover=1.14459
        4:[f3<49.5] yes=5,no=6,missing=5 gain=0.258366,cover=9.85156
            5:leaf=0.42213cover=1.02418
            6:[f0<1.00001] yes=7,no=8,missing=8 gain=0.122395,cover=8.82738
                7:[f3<62.5] yes=9,no=10,missing=9 gain=0.420141,cover=4.28074
                    9:[f3<56] yes=13,no=14,missing=13 gain=0.131082,cover=2.6222
                        13:leaf=-0.0102133cover=1.1535
                        14:leaf=-0.427704cover=1.4687
                    10:leaf=0.229973cover=1.65853
                8:[f3<68.5] yes=11,no=12,missing=11 gain=0.38422,cover=4.54665
                    11:[f2<1.00001] yes=15,no=16,missing=16 gain=0.104008,cover=3.49122
                        15:leaf=-0.00161279cover=1.0497
                        16:leaf=0.35645cover=2.44152
                    12:leaf=-0.258477cover=1.05543

Each booster is an independent decision tree, usually focusing on the part of the data not learned by the previous trees. f[number] is an id that will be replaced by the name of the feature used for the split; the yes=, no=, ... fields are the key to understanding how the branches of the tree relate to each other; and gain is a measure of the importance of the feature in the decision tree.
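For readers who want to experiment with this format, here is a minimal Python sketch of reading such a dump into one record per node. It is only an illustration under stated assumptions, not the code from the xgboost R package: the function name `parse_dump`, the regular expressions, and the record keys are all hypothetical, and the sample is a short excerpt of booster[8] above (so node 2 is referenced but not included).

```python
import re

# Hypothetical patterns, written against the dump shown in this thread.
BOOSTER_RE = re.compile(r"^booster\[(\d+)\]:?$")
SPLIT_RE = re.compile(
    r"^(\d+):\[(\w+)<([^\]]+)\]\s+yes=(\d+),no=(\d+),missing=(\d+)"
    r"\s+gain=([^,\s]+),cover=(\S+)$"
)
# The dump above has no comma before "cover" on leaf lines, so it is optional.
LEAF_RE = re.compile(r"^(\d+):leaf=(.+?),?cover=(\S+)$")

# Excerpt of booster[8] from the dump above.
SAMPLE = """\
booster[8]:
0:[f2<1.00001] yes=1,no=2,missing=2 gain=0.243304,cover=11.8183
    1:[f3<58.5] yes=3,no=4,missing=3 gain=0.0995766,cover=3.14931
        3:leaf=0.347267cover=1.28762
        4:leaf=0.022496cover=1.86169
"""


def parse_dump(text):
    """Return one dict per node, tagged with the index of its booster."""
    rows = []
    tree = None
    for raw in text.splitlines():
        line = raw.strip()  # indentation only mirrors depth; ids carry the structure
        if not line:
            continue
        m = BOOSTER_RE.match(line)
        if m:
            tree = int(m.group(1))
            continue
        m = SPLIT_RE.match(line)
        if m:
            node, feature, threshold, yes, no, missing, gain, cover = m.groups()
            rows.append({
                "tree": tree, "node": int(node), "feature": feature,
                "threshold": float(threshold), "yes": int(yes), "no": int(no),
                "missing": int(missing), "gain": float(gain),
                "cover": float(cover),
            })
            continue
        m = LEAF_RE.match(line)
        if m:
            node, leaf, cover = m.groups()
            rows.append({"tree": tree, "node": int(node),
                         "leaf": float(leaf), "cover": float(cover)})
            continue
        raise ValueError(f"unrecognised line: {raw!r}")
    return rows


rows = parse_dump(SAMPLE)
for row in rows:
    print(row)
```

Keeping the booster index on every record is a deliberate choice in this sketch: node ids restart at 0 in each booster, so the pair (tree, node) is what identifies a node across the whole model.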

timelyportfolio commented on May 18, 2024

Would really like to see the functionality proposed and also like to see DiagrammeR extended to cover rpart or the more comprehensive partykit. See here as an experiment before DiagrammeR existed.

I do think, though, that Suggests will be better than Imports, since I would say this is an enhancement rather than a requirement. See Package Dependencies.

pommedeterresautee commented on May 18, 2024

I tried it, and it was easy to put several graphs in the same image. That's a very good thing.
@timelyportfolio first, thanks for your blog post about the DiagrammeR package; that is how I discovered it (and thanks to @rich-iannone for building it). I have posted an image of the first tree here: https://github.com/tqchen/xgboost/issues/123. Basically, I parse the text model with some regexes and convert it to a data.table. Then I build the markdown with a few paste commands over the data.table. I am waiting for this package to be pushed to CRAN before pushing my code to xgboost (which also gives me time to polish it). I am very pleased with the result.
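The "table of nodes, then pasted graph text" step can be sketched roughly as follows. This is a hedged Python illustration, not the R code referred to above: the exact markup that code emits is not shown in this thread, so the Mermaid-style flowchart syntax, the `to_mermaid` and `node_id` names, and the record layout are all assumptions. The rows are node 1 of booster[8] and its two leaves, taken from the dump posted earlier.

```python
# Hypothetical node records for a small subtree of booster[8].
ROWS = [
    {"tree": 8, "node": 1, "feature": "f3", "threshold": 58.5, "yes": 3, "no": 4},
    {"tree": 8, "node": 3, "leaf": 0.347267},
    {"tree": 8, "node": 4, "leaf": 0.022496},
]


def node_id(tree, node):
    """Prefix the node id with its tree index so several trees can share one graph."""
    return "t{}n{}".format(tree, node)


def to_mermaid(rows):
    """Build Mermaid-style flowchart text by pasting strings together row by row."""
    out = ["graph LR"]
    for r in rows:
        src = node_id(r["tree"], r["node"])
        if "leaf" in r:
            # Leaves get a rounded box carrying the leaf value.
            out.append('{}("leaf: {}")'.format(src, r["leaf"]))
        else:
            # Split nodes get a labelled box plus one edge per branch.
            out.append('{}["{} < {}"]'.format(src, r["feature"], r["threshold"]))
            out.append("{}-->|yes|{}".format(src, node_id(r["tree"], r["yes"])))
            out.append("{}-->|no|{}".format(src, node_id(r["tree"], r["no"])))
    return "\n".join(out)


print(to_mermaid(ROWS))
```

Because every id carries its tree index, rows from several boosters can be passed in one call and end up as separate trees in the same graph text. Labels containing characters such as `<` may need escaping depending on the renderer.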

vnijs commented on May 18, 2024

@pommedeterresautee could you perhaps post just the code + example of "Basically I parse the text model with some regex and convert it to a data.table." somewhere? Sorry to thread-jack

pommedeterresautee commented on May 18, 2024

@mostly-harmless my WIP code is here: https://github.com/pommedeterresautee/xgboost/blob/master/R-package/R/xgb.plot.tree.R

The file it reads is the dump I posted two comments ago. Just put the content in a text file, change the path, and generate the visualization.
All the trees are generated.

@rich-iannone @timelyportfolio
Does anyone know if in Shiny it's possible to collapse a branch of a generated tree? (like you click on a node and the branch after the node are collapsed)

timelyportfolio commented on May 18, 2024

I like the direction this conversation is headed. To keep it separate from CRAN integration, I thought it might be good to start issue #8 for:

> Does anyone know if in Shiny it's possible to collapse a branch of a generated tree? (like you click on a node and the branch after the node are collapsed)

pommedeterresautee commented on May 18, 2024

@mostly-harmless the function is complete.

vnijs commented on May 18, 2024

Thanks @pommedeterresautee !

rich-iannone commented on May 18, 2024

Thanks again @pommedeterresautee, that's great!

pommedeterresautee commented on May 18, 2024

@rich-iannone did you find time to submit your package to CRAN?

rich-iannone commented on May 18, 2024

@pommedeterresautee there is still a problem building the vignette. I need to resolve that issue before submitting to CRAN.

rich-iannone commented on May 18, 2024

Okay, @pommedeterresautee and @timelyportfolio, I figured out the build issue with the vignette: I had a slightly older version of knitr. Once I updated it, I could build the vignette, and the source package built without errors. I'll submit to CRAN.

rich-iannone commented on May 18, 2024

Now submitted to CRAN. Just need to wait for a reply from BDR.

rich-iannone commented on May 18, 2024

After a few rounds of fixes, it's now in CRAN.

pommedeterresautee commented on May 18, 2024

Awesome, I'm pushing my code to XGBoost! First reverse dependency for DiagrammeR :-)

rich-iannone commented on May 18, 2024

That is great to hear! Thanks @pommedeterresautee for all the help and interest so far.
