
Comments (10)

ExpandingMan commented on June 12, 2024

Btw, a really quick and minimal-effort way of getting this working, which I would be happy to merge, is if we just added a `type` keyword arg which, if not `nothing`, overrides all other options. We'd have to make any future keyword args compatible with it, but I don't think that would be hard.

from xgboost.jl.

bobaronoff commented on June 12, 2024

Perhaps adding the 'type' keyword is the best approach. It seems most flexible, particularly if more options are added in the future. I am willing to take a try at a PR (it would be my first ever), but it would need to be heavily edited; my programming skills are nowhere near yours. I am most concerned about how best to handle all the return-shape configurations this creates.

In R, there is a way to specify a list of parameter options. I see julia packages that do this (Plots.jl comes to mind) but I don't know how to code this.
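For concreteness, here is the kind of thing I have in mind (a sketch only; `PREDICT_TYPE_CODES` and `predict_type_code` are hypothetical names, not anything in XGBoost.jl — the integer codes follow the libxgboost 'type' parameter):

```julia
# Hypothetical sketch: map user-facing symbols to libxgboost `type` codes.
# Nothing here is part of XGBoost.jl; it only illustrates how a `type`
# keyword could override the other prediction options.
const PREDICT_TYPE_CODES = Dict(
    :value => 0,              # normal prediction
    :margin => 1,             # output margin
    :contrib => 2,            # feature contributions (SHAP)
    :approxcontrib => 3,      # approximated contributions
    :interaction => 4,        # feature interactions
    :approxinteraction => 5,  # approximated interactions
    :leaf => 6,               # leaf indices
)

function predict_type_code(; margin::Bool=false, type::Union{Symbol,Nothing}=nothing)
    # `type`, if given, overrides all other keyword options
    type === nothing || return PREDICT_TYPE_CODES[type]
    return margin ? 1 : 0
end
```

This way existing keywords like `margin` keep working, and `type` simply wins when both are supplied.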


ExpandingMan commented on June 12, 2024

I don't see any additional options that we can pass to XGBoosterPredict...

To be clear, the parameters we already have in that opts dict are the only ones I see documented.

I'm also not seeing any reference anywhere to TreeSHAP, can you show specifically how this would be called?


bobaronoff commented on June 12, 2024

I just found the following at XGBoost C Package

Make prediction from DMatrix, replacing [XGBoosterPredict](https://xgboost.readthedocs.io/en/stable/c.html#group__Prediction_1ga3e4d11089d266ae4f913ab43864c6b12).

“type”: [0, 6]

0: normal prediction
1: output margin
2: predict contribution
3: predict approximated contribution
4: predict feature interaction
5: predict approximated feature interaction
6: predict leaf

“training”: bool — whether the prediction function is used as part of a training loop. Not used for inplace prediction.
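If I read this right, the C function takes these options as a JSON-encoded config string, and building one in Julia is simple string work. A sketch using only the "type" and "training" fields quoted above (I am assuming any remaining fields can be left to their defaults):

```julia
# Sketch: build the JSON config string passed to the C prediction function.
# Only the "type" and "training" fields quoted from the docs are shown;
# any other fields (if required) would be appended the same way.
predict_config(type::Int; training::Bool=false) =
    """{"type": $type, "training": $training}"""

# e.g. predict_config(2) for SHAP-style contribution predictions
```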

Looking it over, I think the parameters I saw reflect how they are named in the Python package, but according to the page I referenced they are implemented through the 'type' parameter, which is not true/false but rather 0–6.

I apologize if I have this incorrect.


bobaronoff commented on June 12, 2024

Here is the proper link: XGBoosterPredict


ExpandingMan commented on June 12, 2024

Ah, I was looking at the wrong one; indeed we are using XGBoosterPredictFromDMatrix. I think the documentation is also out of sync; maybe I should have been looking at this page instead of the one I linked.

I assume you are interested in additional values for type? Should be easy enough, though we'll have to think about what the options should look like on the Julia side, since the type integer by itself is pretty opaque. It looks like currently the only type option I handle is margin.

I'll probably get to this eventually. Of course, a PR would be welcome.


bobaronoff commented on June 12, 2024

I am attempting a version of predict that allows for the differing type values 0–6. I am able to get the differing returns from libxgboost, but I am getting confused about how to process the results into a proper Julia array.

Here are 3 lines in the current routine that I think I understand, but am not certain:

```julia
dims = reverse(unsafe_wrap(Array, oshape[], odim[]))
o = unsafe_wrap(Array, o[], tuple(dims...))
length(dims) > 1 ? transpose(o) : o
```

It seems that the `reverse` function effects a reshape when `unsafe_wrap` converts the C array to a Julia `Array`. The last line applies a transpose if there is more than one dimension. I understand this for 2 dimensions (it completes the conversion from row-major to column-major), but I am not familiar with how `transpose` works and what would happen if it were applied to a 3-dimensional array, as might come from type=4 (i.e., interaction) or from type=2 (i.e., contribution) in a multi: model.
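To test my understanding of the 2-dimensional case, I put together this small standalone demo (no libxgboost involved; `reshape` of a plain vector stands in for the `unsafe_wrap` of the C buffer):

```julia
# A 2×3 matrix [1 2 3; 4 5 6] laid out row-major, as libxgboost returns it
buf = Float32[1, 2, 3, 4, 5, 6]
dims = reverse((2, 3))        # the C side reports shape (2, 3); we wrap as (3, 2)
o = reshape(buf, dims)        # column-major wrap of the raw buffer: [1 4; 2 5; 3 6]
m = transpose(o)              # indices now map correctly: m == [1 2 3; 4 5 6]
```

So for 2 dimensions the reverse-then-transpose trick does recover the intended matrix; my question is what replaces `transpose` for rank 3.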

Any thoughts would be greatly appreciated.


ExpandingMan commented on June 12, 2024

These lines merely adapt libxgboost's internal memory format (in which results are returned) to the memory format of Julia arrays (in particular, the former is row-major and the latter is column-major). If the other type returns are implemented correctly, they should return the array metadata in exactly the same way as for type=0. Therefore, I don't think any of these lines should be touched at all.


bobaronoff commented on June 12, 2024

I must not be conveying the issue correctly. Here is my understanding and working with my data bears out that understanding. unsafe_wrap takes the C pointer and uses it to specify a Julia object stored at that pointer with the array dimensions supplied. It does nothing to remap the data in memory from row major to column major. For a two dimensional array if one reshapes by reversing the dimensions and transposing, the indices will map to the proper locations in memory. Theoretically this works for 3, 4, or any dimensional array. However, transpose is only designed for a 2 dimensional array. It throws an error if you try to use it for a 3 dimensional array.

libxgboost returns 3-dimensional arrays for type 4 and 5 ALWAYS, and for type 2 and 3 when the objective is multi:softprob/multi:softmax. The current format (i.e., transpose) will fail every time for type 4 and 5, and sometimes (i.e., with multi: objectives) for type 2 and 3. I have confirmed this on my data sets!!

Rather than modify a function in a way that creates situations that would fail, I think it better to leave the current XGBoost.predict() as it is and create a new function (perhaps XGBoost.predictbytype()) that includes permutedims to handle all contingencies. The only reason to specify type is for the Shapley values, which is a one-time call, so the reallocation cost would be less impactful and known to the user upfront.

I will change the function name. Since I am proposing a new function, there is no need for backward compatibility, and keeping margin is redundant. It will take me a bit to figure out how to roll back my fork so the current XGBoost.predict() remains untouched.
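Here is a sketch of the rank-general conversion I have in mind (`from_row_major` is a hypothetical helper, not part of XGBoost.jl; `permutedims`, unlike `transpose`, works for any rank but copies the data):

```julia
# Hypothetical helper: convert a row-major buffer with the C-reported shape
# into a correctly indexed Julia array of any rank. Not part of XGBoost.jl.
function from_row_major(buf::Vector{T}, cdims::NTuple{N,Int}) where {T,N}
    a = reshape(buf, reverse(cdims))  # wrap with reversed dims (column-major view)
    permutedims(a, N:-1:1)            # reverse the axis order; copies the data
end

# 2×2×2 row-major data: element (i, j, k) stored at (i-1)*4 + (j-1)*2 + k
a3 = from_row_major(Float32.(1:8), (2, 2, 2))

# rank 2 behaves exactly like the reverse-then-transpose path, minus laziness
a2 = from_row_major(Float32.(1:6), (2, 3))
```

For rank 2 this allocates a copy where `transpose` would not, but since type 2–5 predictions are occasional calls, that cost seems acceptable.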


ExpandingMan commented on June 12, 2024

I'm a bit confused... why not just check if dims == 2 in the existing predict function? That way you can know whether transpose works or you have to do permutedims?

I'm not necessarily opposed to adding a new, lower-level function; that might have some advantages. However, the only thing I can think of stopping us from just returning whatever array is appropriate here is type stability, and, again, that's already pretty compromised, so I'm not sure it makes sense to try to keep it narrowed down.
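Concretely, the branch I am picturing is a one-line change to the return quoted earlier in the thread (a sketch with stand-in values for the wrapped buffer and shape):

```julia
# Sketch: generalize the final branch of the existing return so rank > 2 works.
# `dims` and `o` below are stand-ins for the values produced by `unsafe_wrap`.
dims = (3, 2)                         # reversed C shape for a 2×3 result
o = reshape(Float32.(1:6), dims)      # stand-in for the wrapped raw buffer
result = length(dims) > 2 ? permutedims(o, length(dims):-1:1) :
         length(dims) > 1 ? transpose(o) : o
```

That keeps the lazy `transpose` on the common 2-D path and only pays for the `permutedims` copy when libxgboost actually hands back a higher-rank array.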

