
Comments (8)

jacobxli-xx commented on May 2, 2024

Hey, Andrew.
The answer to both questions is yes.
For both explicit flush and automatic flush, we will limit the scope of verification to recovering all writes up to the maximum write sequence number recorded before OnFlushCompleted() (test cases are limited to a single column family).
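
To make that scope concrete, here is a minimal sketch of the check, not the actual db_stress code. `ExpectedWrite`, `MustRecover`, and `FindLostWrites` are hypothetical names, `recovered` stands in for the contents read back after reopening the DB, and the sketch assumes each key is written only once.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one write issued by the test (not a RocksDB type).
struct ExpectedWrite {
  uint64_t seqno;
  std::string key;
  std::string value;
};

// A write is in scope only if its seqno is at or below the maximum seqno
// that was recorded before the flush completion callback ran.
bool MustRecover(const ExpectedWrite& write, uint64_t flushed_seqno) {
  return write.seqno <= flushed_seqno;
}

// Returns the keys of in-scope writes that are missing or have a different
// value in the recovered state. Assumes each key is written only once.
std::vector<std::string> FindLostWrites(
    const std::vector<ExpectedWrite>& writes, uint64_t flushed_seqno,
    const std::map<std::string, std::string>& recovered) {
  std::vector<std::string> lost;
  for (const ExpectedWrite& write : writes) {
    if (!MustRecover(write, flushed_seqno)) {
      continue;  // Out of scope: newer than the recorded flushed seqno.
    }
    auto it = recovered.find(write.key);
    if (it == recovered.end() || it->second != write.value) {
      lost.push_back(write.key);
    }
  }
  return lost;
}
```

With writes at seqnos 1, 2, and 3 and a recorded flushed seqno of 2, only the first two writes are checked; the third may or may not survive and is not verified.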

from rocksdb.

jacobxli-xx commented on May 2, 2024

hi, ajkr@ and hx235@.
Glad to take this task : )


ajkr commented on May 2, 2024

> because when we do explicit flush on specific cf, it's not guaranteed that all cfs will be flushed, so the maximum of all seqno is not guaranteed to be equal to the maximum seqno of the cf we flushed.

It is guaranteed in the case where it is needed (atomic_flush=true):

    if (FLAGS_atomic_flush) {
      return db_->Flush(flush_opts, column_families_);
    }

It is not guaranteed when WAL is enabled and atomic_flush is disabled, but there it is not needed because WAL ensures consistent recovery across column families.

WAL-disabled, atomic_flush-disabled may be worth testing at some future point, but that is still an open topic to discuss. My thoughts are in #11841.

Another case that may be worth testing at some point is WAL-disabled, atomic_flush-enabled, with manual flush triggered on a subset of column families (what you suggested we are already doing). Still, it's not something we have decided to do yet, so adding extra complexity on the assumption that we will do it feels premature.
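
As a rough summary of the cases above, the sketch below maps each configuration to the mechanism that covers cross-column-family consistency. The enum and function names are hypothetical, not RocksDB or db_stress identifiers, and the atomic_flush case assumes the flush is issued on all column families, as in the snippet quoted earlier.

```cpp
#include <cassert>

// Hypothetical labels for the cases discussed in this comment.
enum class CrossCfGuarantee {
  kAtomicFlush,   // All cfs are flushed together, so one max seqno covers them.
  kWalRecovery,   // Cfs may flush separately, but the WAL keeps recovery consistent.
  kOpenQuestion,  // Neither applies; whether to test this is still undecided.
};

CrossCfGuarantee ClassifyConfig(bool wal_enabled, bool atomic_flush) {
  if (atomic_flush) {
    return CrossCfGuarantee::kAtomicFlush;
  }
  if (wal_enabled) {
    return CrossCfGuarantee::kWalRecovery;
  }
  return CrossCfGuarantee::kOpenQuestion;
}
```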


hx235 commented on May 2, 2024

In case it's not already obvious: for the WAL-disabled case, we will only consider atomic_flush=true. For a given flush, we do not need to verify the recoverability of writes concurrent with that flush, as RocksDB gives no guarantee on whether those concurrent writes are included in the flush.


ajkr commented on May 2, 2024

@jacobxli-xx Thank you, it is assigned now.


ajkr commented on May 2, 2024

@jacobxli-xx Here are some questions I had about what you are planning to do:

  1. For explicit sync mechanisms, will the scope be limited to verifying that writes completed before the explicit sync are recovered?
  2. For implicit sync mechanisms (i.e., automatic flush), will the scope be limited to verifying writes that are persisted according to OnFlushCompleted()?

If either answer is no, we'll want to know the reason, and eventually the plan, so we can judge whether we'd be on board with it.


hx235 commented on May 2, 2024

@jacobxli-xx If you are considering storing and mmapping map<std::string, uint64_t> cf_flushed_seqno, I wonder why we can't just store and mmap the maximum of all the seqnos in the map?


jacobxli-xx commented on May 2, 2024

@hx235 The mmapped map<std::string, uint64_t> cf_flushed_seqno stores each cf name and the maximum flushed seqno of that cf. We do not store only the maximum of all the seqnos because when we do explicit flush on specific cf, it's not guaranteed that all cfs will be flushed, so the maximum of all seqno is not guaranteed to be equal to the maximum seqno of the cf we flushed.

Although our test cases are limited to 1 cf, we use this map for forward compatibility, since it also works for cases with multiple cfs.
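
A small sketch of the difference between the per-cf map and a single global maximum. The helper names are hypothetical, the sketch keeps the map in memory only (it does not show the mmap persistence), and it treats 0 as "nothing flushed yet", which is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// cf name -> maximum flushed seqno of that cf.
using CfFlushedSeqno = std::map<std::string, uint64_t>;

// Records a completed flush for one cf, keeping the largest seqno seen.
void RecordFlush(CfFlushedSeqno& flushed, const std::string& cf_name,
                 uint64_t seqno) {
  uint64_t& current = flushed[cf_name];  // Starts at 0 for a new cf.
  current = std::max(current, seqno);
}

// Per-cf lookup; returns 0 if the cf has never been flushed.
uint64_t FlushedSeqnoFor(const CfFlushedSeqno& flushed,
                         const std::string& cf_name) {
  auto it = flushed.find(cf_name);
  return it == flushed.end() ? 0 : it->second;
}

// The single value that would be stored if only the overall maximum were kept.
uint64_t GlobalMaxFlushedSeqno(const CfFlushedSeqno& flushed) {
  uint64_t result = 0;
  for (const auto& entry : flushed) {
    result = std::max(result, entry.second);
  }
  return result;
}
```

If "default" was flushed up to seqno 100 and "cf1" only up to 40, the global maximum is 100, which would wrongly require cf1 writes with seqnos 41 to 100 to be recoverable. The per-cf lookup returns 40 for "cf1".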

