Comments (8)
Here is an idea for how to proceed here:
- Create a representative benchmark that scans 20MB of 1KB random JSON objects and see where we currently are, both cold (using the purge command above) and warm
- Put the same data in LevelDB and run the same test
- Put the same data in Noms on the CLI and run the same test
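A minimal sketch of what the data generation and timing part of such a benchmark could look like. Everything here is an assumption for illustration: makeObject, buildDataset and timeScan are hypothetical helpers, and an in-memory Map stands in for the real store (Replicache, LevelDB or Noms), so it says nothing about cold vs warm behavior.

```typescript
// Hypothetical benchmark helpers; the Map is only a stand-in for a real store.

// Returns a random base-36 string of exactly `length` characters.
function randomString(length: number): string {
  let s = '';
  while (s.length < length) {
    s += Math.random().toString(36).slice(2);
  }
  return s.slice(0, length);
}

// Builds an object whose JSON serialization is exactly `targetBytes` long
// (all characters are ASCII, so string length equals byte length).
function makeObject(
  id: number,
  targetBytes: number,
): {id: number; payload: string} {
  const overhead = JSON.stringify({id, payload: ''}).length;
  return {id, payload: randomString(Math.max(0, targetBytes - overhead))};
}

// Fills a Map with totalBytes / objectBytes serialized objects.
function buildDataset(
  totalBytes: number,
  objectBytes: number,
): Map<string, string> {
  const store = new Map<string, string>();
  const count = Math.floor(totalBytes / objectBytes);
  for (let i = 0; i < count; i++) {
    store.set(`key/${i}`, JSON.stringify(makeObject(i, objectBytes)));
  }
  return store;
}

// Visits and parses every value, returning the item count and elapsed time.
function timeScan(store: Map<string, string>): {count: number; ms: number} {
  const start = Date.now();
  let count = 0;
  store.forEach(value => {
    JSON.parse(value);
    count++;
  });
  return {count, ms: Date.now() - start};
}
```

For the size mentioned above this would be called as timeScan(buildDataset(20 * 1024 * 1024, 1024)), which covers 20480 objects if 1MB is taken as 1024KB.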
from replicache.
I've been working on making scan return an AsyncIterable.
class ScanResult implements AsyncIterable<JSONValue> { ... }
scan(options?: ScanOptions): ScanResult;
One of the outcomes of that is that the transaction will escape query if we return the ScanResult from query (instead of draining the async iterator in query):
const iter = await rep.query(tx => tx.scan());
for await (const v of iter) { // does _invoke with the tx ID from the tx in the query
console.log(v);
}
If we do the above, the transaction ID is used to read more values in the async iterator, but the way things are currently structured we close the transaction at the end of the async query function.
My proposed solution is to use ref counting for the transaction, and when the ref count goes down to 0 we call _invoke 'closeTransaction'. We could also use WeakRef, but it is not widely supported yet and there is no guarantee that the reference is garbage collected.
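A rough sketch of the ref counting idea, to make it concrete. The class and its method names are hypothetical, not existing Replicache code, and the closeTransaction callback is an assumed stand-in for the _invoke call: the query holds the first reference, each escaping iterator retains one more, and the transaction is closed when the last reference is released.

```typescript
// Hypothetical sketch; none of these names are existing Replicache APIs.
class RefCountedTransaction {
  readonly id: number;
  private readonly closeTransaction: (id: number) => void;
  private refs = 1; // the query callback holds the first reference
  private closed = false;

  constructor(id: number, closeTransaction: (id: number) => void) {
    this.id = id;
    this.closeTransaction = closeTransaction;
  }

  get isClosed(): boolean {
    return this.closed;
  }

  // Called when a scan iterator is created from this transaction.
  retain(): void {
    if (this.closed) {
      throw new Error('transaction already closed');
    }
    this.refs++;
  }

  // Called when the query function returns or an iterator finishes.
  release(): void {
    if (this.closed) {
      throw new Error('transaction already closed');
    }
    this.refs--;
    if (this.refs === 0) {
      this.closed = true;
      this.closeTransaction(this.id);
    }
  }
}
```

With this shape, a query that returns an undrained iterator would release its own reference when it finishes, and the transaction would stay open until the iterator releases the last one.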
@aboodman Feedback wanted if this is worth it?
from replicache.
I don't quite follow the refcounting or weakref ideas, but it is important that the tx has a well-defined lifetime. Once we have GC implemented inside the datastore, we need to know what is safe to GC. TX lifetimes are part of this (we can't collect anything there's an open tx for).
In other words, the scan iterator should stop working if it escapes the tx. Later on, we might decide to have a different tx api like:
tx = replicache.read()
tx.scan()
tx.close()
Even in that case scan must stop working after its associated tx has closed.
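To illustrate the behavior described here, a small sketch of a scan iterator that refuses to continue once its transaction has closed. ReadTx, readPage and the in-memory array are assumptions made up for the example, not the real API; the point is only that every step of the iterator checks the transaction state first.

```typescript
// Hypothetical sketch; ReadTx is not the real Replicache transaction type.
class ReadTx {
  private readonly values: string[];
  private closed = false;

  constructor(values: string[]) {
    this.values = values;
  }

  private assertOpen(): void {
    if (this.closed) {
      throw new Error('transaction is closed');
    }
  }

  // Reads one page of values; throws once the transaction has been closed.
  readPage(start: number, pageSize: number): string[] {
    this.assertOpen();
    return this.values.slice(start, start + pageSize);
  }

  // Yields values page by page, re-checking the transaction before each item
  // so an escaped iterator fails on its next step after close().
  async *scan(pageSize: number = 2): AsyncGenerator<string> {
    for (let start = 0; ; start += pageSize) {
      const page = this.readPage(start, pageSize);
      if (page.length === 0) {
        return;
      }
      for (const value of page) {
        this.assertOpen();
        yield value;
      }
    }
  }

  close(): void {
    this.closed = true;
  }
}
```

An iterator obtained before close() keeps the values it has already handed out, but its next step rejects with an error instead of reading more data.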
from replicache.
the scan iterator should stop working if it escapes the tx
to be more precise: the scan iterator should stop working after the associated tx closes.
from replicache.
There is really no guarantee that closeTransaction gets called.
The API we currently have allows scan to be used outside of query. That works fine since we close the implicit transaction when the iterator is closed. The problem is when we use scan inside query because query handles the transaction and query closes the transaction when query is done.
What I'm suggesting is to keep track of the iterators and when query is done and all the iterators are done we close the transaction.
This is most easily done with ref counting (not to be confused with the ref counting used for memory management).
from replicache.
Sorry, my comments should probably have been on #30
from replicache.
I did some performance testing.
I had a DB of 1000 key-value pairs, where each value is around 50 characters when serialized to JSON.
Setting the page size to > 2000 makes us fetch all scan items in one go:
Total time to scan: 45ms (according to JS)
02 Jul 20 16:19:17.8586 -0700 DBG rpc --> data={} db=perf gr=18 req=openTransaction rid=2049
02 Jul 20 16:19:17.8587 -0700 DBG rpc <-- cid=7y8QzWw7idnhCy5AhSHUBG db=perf dur=0.167184 gr=18 req=openTransaction rid=2049
02 Jul 20 16:19:17.8672 -0700 DBG rpc --> data="{\"transactionId\":22,\"prefix\":\"\",\"limit\":10000}" db=perf gr=18 req=scan rid=2050
02 Jul 20 16:19:17.8888 -0700 DBG rpc <-- cid=7y8QzWw7idnhCy5AhSHUBG db=perf dur=21.604977 gr=18 req=scan rid=2050
And from openTransaction to end of scan: 30ms
Actual time in Dispatch: 22ms
The two requests have:
Duration: 9.07 ms (7.55 ms network transfer + 1.51 ms resource loading)
Duration: 34.43 ms (27.11 ms network transfer + 7.32 ms resource loading)
The self js time for the parts are .19ms + .44ms + 1.56ms = 2.19ms
Summary
At this point Go is taking up 50% of the time and the other 50% is used by HTTP.
from replicache.
If I change the page size to 100 (i.e. 10 requests):
The total time is 176ms according to JS.
Go log from openTransaction to last scan gives me 167ms.
The total time spent in JS is now 91ms.
Actual time spent in repm.Dispatch: 30ms
There is a lot of queueing and waiting for responses. One example request:
from replicache.