learnedsystems / rmi

The recursive model index, a learned index structure

License: MIT License

Rust 92.18% Makefile 2.05% C++ 5.24% Python 0.53%

rmi's Issues

Off-by-one error in `train_two_layer` implementation

Hi,

I encountered an assert issue when training RMI on very small (100 items) datasets for testing purposes:

thread '<unnamed>' panicked at 'start index was 100 but end index was 100', [...]/RMI/rmi_lib/src/train/two_layer.rs:27:5

Upon closer investigation, I think I have found an off-by-one error in the train_two_layer function. I did not look at the context beyond the function, so take what I am about to say with a grain of salt:


Link to source file

  1. The value of split_idx is calculated here:

let split_idx = md_container.lower_bound_by(|x| {

  2. split_idx should be in the interval [0, md_container.len())

if split_idx > 0 && split_idx < md_container.len() {

Now let's look at the case where split_idx == md_container.len() - 1, which is valid per [2.]:

  3. The else branch is taken, since split_idx < md_container.len()

let mut leaf_models = if split_idx >= md_container.len() {

  4. split_idx + 1 (== md_container.len()) is passed to build_models_from as start_idx
|| build_models_from(&md_container, &top_model, layer2_model,
                                   split_idx + 1, md_container.len(),
                                   split_idx_target,
                                   second_half_models)
fn build_models_from<T: TrainingKey>(data: &RMITrainingData<T>,
                                    top_model: &Box<dyn Model>,
                                    model_type: &str,
                                    start_idx: usize, end_idx: usize,
                                    first_model_idx: usize,
                                    num_models: usize) -> Vec<Box<dyn Model>>

Link 4.1

Link 4.2

  5. Assert fails, since md_container.len() > md_container.len() is false
assert!(end_idx > start_idx,
        "start index was {} but end index was {}",
        start_idx, end_idx
);

Link 5


An obvious fix would be to change the condition in [3.] to split_idx >= md_container.len() - 1; however, I am not entirely certain whether that leads to issues in other contexts. I suspect a similar issue would occur if split_idx == 0, but only for the first call. I changed the condition in my local version and re-ran the tests, and everything seems to work just fine:

Running test cache_fix_osm
Test cache_fix_osm finished.
Running test cache_fix_wiki
Test cache_fix_wiki finished.
Running test max_size_wiki
Test max_size_wiki finished.
Running test radix_model_wiki
Test radix_model_wiki finished.
Running test simple_model_osm
Test simple_model_osm finished.
Running test simple_model_wiki
Test simple_model_wiki finished.
============== TEST RESULTS ===============
python3 report.py
PASS cache_fix_osm
PASS cache_fix_wiki
PASS max_size_wiki
PASS radix_model_wiki
PASS simple_model_osm
PASS simple_model_wiki

I can open a pull request with that fix if you would like.

Optimizer interpretation

Any thoughts on how to interpret the output of the optimizer? I'm seeing a table with entries

Models                          Branch        AvgLg2        MaxLg2       Size (b)

But I haven't found an explanation of what these mean.

Multiple layers

Trying to train with three layers fails:

cargo run --release -- books_200M_uint32 my_first_rmi linear,linear,linear 100
    Finished release [optimized + debuginfo] target(s) in 0.02s
     Running `target/release/rmi /z/downloads/books_200M_uint32 my_first_rmi linear,linear,linear 100`
thread 'main' panicked at 'explicit panic', <::std::macros::panic macros>:2:4
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Is this because of an intentional limitation on the number of layers or some other issue?

Citation

Do you have a preferred citation for this work?

Radix root model not working with small branching factor

Hi, I recently tried to build a model with a radix layer and noticed that radix models do not work with small branching factors (1, 2, 3), e.g.

cargo run --release -- uniform_dense_200M_uint64 sosd_rmi radix,linear 1

I always get an error similar to this:

thread '<unnamed>' panicked at 'Top model gave an index of 2954937499648 which is out of bounds of 2. Subset range: 1 to 200000000', /private/tmp/rust-20200803-28615-1977jkb/rustc-1.45.2-src/src/libstd/macros.rs:16:9

I suspect that the issue stems from models::utils::num_bits. I guess you are computing the number of bits needed to represent the largest index (the while loop) and then subtracting 1 to ensure that the radix model always predicts an index smaller than branching_factor. However, the implementation appears to be off by one, i.e. the additional nbits -= 1 is not needed.

NN model support

I noticed that the paper describes more complex NN models powered by TensorFlow, but only simple models seem to be available here. How can I build an RMI with a more complex NN model? @RyanMarcus @anikristo

cleanup() should only be called on malloc'ed model parameters

Hi, I just trained a model with two linear_spline layers and a branching_factor of 200,000, i.e.

cargo run --release -- somedata_100M_uint64 sosd_rmi linear_spline,linear_spline 200000

The resulting C++ program does not compile with the following error:

sosd_rmi.cpp:11:5: error: no matching function for call to 'free'
    free(L1_PARAMETERS);
    ^~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk/usr/include/malloc/_malloc.h:42:7: note: candidate function not viable: no known conversion from 'const double [400000]' to 'void *' for 1st argument
void     free(void *);
         ^
1 error generated.

It seems like in the cleanup() function, free is called on L1_PARAMETERS although it was not allocated with malloc.

Looking at codegen::generate_code(), free_code should possibly only be generated when storage has the value `StorageConf::Disk(path)`, not in the case of `StorageConf::Embed`.
