Comments (2)
Related to sampling: when doubleBootstrap is applied to honest random forests without groups and folds, it looks like we ignore observationWeights and sample the averaging set uniformly with replacement (see code). When groups and folds are used, it looks like we sample with replacement according to observationWeights (see code). As a corollary, we also always take a doubleBootstrap when using groups/folds.
- Is this understanding correct?
- If so, is this intended behavior?
from rforestry.
I believe this understanding is correct.
Right now this is intended behavior, but I could see a reason to change it so that the second bootstrap uses the weights as well. The only complication is that the meaning of the weights would then depend on the sample taken in the first bootstrap: the observations drawn there are excluded from the second draw, so their weights drop out of the total, and the remaining weights are effectively renormalized over whatever is left.
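To make that complication concrete, here is a minimal sketch in plain Python. It is not rforestry's actual code: the function name `double_bootstrap`, the `weighted_second_stage` flag, and the choice to draw as many averaging observations as there are out-of-bag observations are all assumptions made for illustration. It also assumes the first bootstrap is drawn according to the observation weights and that all weights are strictly positive.

```python
import random


def double_bootstrap(weights, rng, weighted_second_stage=False):
    """Hypothetical two-stage bootstrap over indices 0..n-1 (illustration only).

    Returns (splitting, averaging, probs), where probs maps each out-of-bag
    index to the probability used when drawing the averaging set.
    """
    if any(w <= 0 for w in weights):
        raise ValueError("this sketch assumes strictly positive weights")

    n = len(weights)
    indices = list(range(n))

    # Stage 1: splitting set, drawn with replacement according to the weights.
    splitting = rng.choices(indices, weights=weights, k=n)
    in_bag = set(splitting)

    # Only out-of-bag observations are candidates for the averaging set.
    oob = [i for i in indices if i not in in_bag]
    if not oob:
        return splitting, [], {}

    if weighted_second_stage:
        # The in-bag weights drop out of the total, so each remaining weight
        # is renormalized over the out-of-bag observations only.
        total = sum(weights[i] for i in oob)
        probs = [weights[i] / total for i in oob]
    else:
        # Behavior described in this thread: ignore the weights, draw uniformly.
        probs = [1.0 / len(oob)] * len(oob)

    # Stage 2: averaging set, drawn with replacement from the OOB observations.
    averaging = rng.choices(oob, weights=probs, k=len(oob))
    return splitting, averaging, dict(zip(oob, probs))


if __name__ == "__main__":
    w = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0, 8.0, 8.0]
    for flag in (False, True):
        split, avg, probs = double_bootstrap(
            w, random.Random(42), weighted_second_stage=flag
        )
        print("weighted second stage:", flag)
        print("  splitting:", sorted(split))
        print("  averaging:", sorted(avg))
        print("  second-stage probabilities:", probs)
```

With `weighted_second_stage=True`, the probability attached to a given observation in the second draw depends on which observations ended up in the splitting set, so the same weight can mean different things from tree to tree.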