pouchdb / mapreduce
PouchDB map/reduce plugin ( ⚠️ UPDATE: moved back to PouchDB core ⚠️ )
This is pretty bad and misleading
Why here? Why outdated?
This makes the code a lot DRYer: https://github.com/pouchdb/mapreduce/tree/1658-temp-using-persistent. Closures are still supported by passing in the original map/reduce functions.
I believe the main issue is going to be dealing with temp views. Currently toPromise just checks whether the last argument is a function and, if so, assumes it's a callback, but with temp views you will sometimes want to do db.query(function).then(...).
So we would probably want to switch it to check that the last argument is a function and that the arguments length is greater than 2, as I can't see a situation in this plugin of passing only a callback and nothing else.
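A minimal sketch of that check, to make the proposal concrete. `splitCallback` is a hypothetical helper, not the plugin's actual toPromise code, and the "more than 2 arguments" threshold is taken straight from the comment above, so treat it as an assumption to tune.

```javascript
// Hypothetical helper: separates a trailing node-style callback from the
// other arguments. Not the real toPromise implementation.
function splitCallback(args) {
  var list = Array.prototype.slice.call(args);
  var last = list[list.length - 1];
  // Only treat a trailing function as a callback when more than two
  // arguments were passed, as suggested above.
  if (typeof last === 'function' && list.length > 2) {
    return { args: list.slice(0, -1), callback: last };
  }
  return { args: list, callback: null };
}

function mapFn(doc) { /* a temp view map function */ }
function cb(err, res) { /* a node-style callback */ }

// A lone function is kept as the map function, not mistaken for a callback.
var a = splitCallback([mapFn]);
// With three arguments the trailing function is taken as the callback.
var b = splitCallback([mapFn, { reduce: false }, cb]);
```

With this threshold, a two-argument call that ends in a function is also left alone, so whether 2 is the right cut-off depends on which call shapes the plugin wants to support.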
Occasionally when I run npm test I see:
174 passing (11s)
1 failing
1) local with temp views: views Testing query with keys:
AssertionError: returns one doc: expected [] to have a length of 1 but got 0
This may just be a leveldb problem, since I haven't seen it in the browser yet.
from daleharvey/pouchdb#1264
I'm not a dependencies master, but currently if someone downloads the repo it's broken, because mapreduce depends on a newer pouchdb.
Shouldn't we do something like devDependency.pouchdb = daleharvey/pouchdb#master?
Before we release persisted mapreduce to the world in PouchDB 2.2.0, we should revisit the sort order and confirm that it's in line with CouchDB. I just checked, and the only test where we confirm the assumed [key, docid, value] ordering is this one.
It would be nice to test e.g. values with different object types (arrays, objects, booleans, the doc itself, the doc with a joined id, etc.).
See discussion in #12. Apparently we just need to sort on values in case the key and the docid are both the same (i.e. a doc emits the same key multiple times, but with different values).
I don't believe we need to do this
As realized in https://github.com/pouchdb/mapreduce/pull/88/files#r11305918. First, though, we need to update the nightly version of PouchDB from 2.1.1-alpha to 2.2.0-alpha.
Right now only group_level=exact is supported.
CouchDB supports numeric group_level values:
The group and group_level options control whether the reduce function reduces to a set of distinct keys or to a single result row. group_level lets you specify how many items of the key array are used in grouping; group=true is effectively the same as group_level=999 (for an arbitrarily high value of 999.) Don't specify both group and group_level; the second one given will override the first.
(see HTTP view API)
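To illustrate what a numeric group_level means for array keys, here is a rough sketch that counts rows per truncated key. `groupKey` and `countByGroupLevel` are hypothetical names, the rows are assumed to be sorted already, and the special case of group_level=0 (a single result row) is deliberately not handled.

```javascript
// Truncate array keys to their first `groupLevel` items; non-array keys
// are grouped by their whole value.
function groupKey(key, groupLevel) {
  return Array.isArray(key) ? key.slice(0, groupLevel) : key;
}

// Count rows per group, like a _count reduce. Assumes `rows` are sorted by key.
function countByGroupLevel(rows, groupLevel) {
  var groups = new Map();
  rows.forEach(function (row) {
    var key = groupKey(row.key, groupLevel);
    var id = JSON.stringify(key); // simple identity for the truncated key
    if (!groups.has(id)) {
      groups.set(id, { key: key, value: 0 });
    }
    groups.get(id).value += 1;
  });
  return Array.from(groups.values());
}

var rows = [
  { key: [2013, 5, 5] },
  { key: [2014, 1, 1] },
  { key: [2014, 1, 2] }
];
var byYear = countByGroupLevel(rows, 1);  // groups on [2013] and [2014]
var byMonth = countByGroupLevel(rows, 2); // groups on [2013, 5] and [2014, 1]
```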
Another weird edge case. CouchDB returns this error if you use keys with >1 key and group=false:
{
  "status": 400,
  "name": "query_parse_error",
  "message": "Multi-key fetchs for reduce views must use `group=true`"
}
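One way the plugin could mirror this is an up-front option check. The sketch below is only an assumption about how that might look: `checkMultiKeyReduce` is a made-up name, and the error fields are copied from the CouchDB response above (including its spelling).

```javascript
// Hypothetical validation helper. `hasReduce` says whether the view
// defines a reduce function at all.
function checkMultiKeyReduce(opts, hasReduce) {
  var reducing = hasReduce && opts.reduce !== false;
  var multiKey = Array.isArray(opts.keys) && opts.keys.length > 1;
  if (reducing && multiKey && !opts.group) {
    // Message, name and status copied verbatim from the CouchDB error.
    var err = new Error('Multi-key fetchs for reduce views must use `group=true`');
    err.name = 'query_parse_error';
    err.status = 400;
    throw err;
  }
}

var caught = null;
try {
  checkMultiKeyReduce({ keys: ['a', 'b'] }, true);
} catch (e) {
  caught = e;
}
```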
Also it would be nice if the index names didn't have to have a slash in them. For instance, if they had a slash we could do what we normally do, but if not we could just rename myIndex to myIndex/myIndex. It's a good Couch practice anyway to have one map/reduce function per design doc.
We should extend it first.
This will exist and it does not test what it's supposed to test:
Line 418 in d6add3c
By the way, it really bothers me that if I add not to that line (should.not.exist(...)), the test report is:
1) local views Views should include _conflicts:
Error: timeout of 2000ms exceeded
at null.<anonymous> (/home/neo/mapreduce/node_modules/mocha/lib/runnable.js:165:14)
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)
And it makes no sense. I suppose it has to do with the assertion library throwing errors, so done is never fired, but it should not work like that.
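If that guess is right, one way around it is to catch the assertion error and hand it to done explicitly, so the test fails with the real message instead of a timeout. A minimal sketch, where `finish` is a hypothetical helper and `fakeDone` stands in for mocha's done so the snippet runs on its own:

```javascript
// Wrap a node-style callback so that a throwing assertion is reported
// through `done` rather than being lost.
function finish(done, assertions) {
  return function (err, result) {
    if (err) {
      return done(err);
    }
    try {
      assertions(result);
    } catch (e) {
      return done(e);
    }
    done();
  };
}

// Stand-in for mocha's `done`, recording what it was called with.
var reported = [];
function fakeDone(err) {
  reported.push(err || null);
}

// The assertion fails, and the failure reaches `done` immediately.
finish(fakeDone, function (res) {
  if (res.rows.length !== 1) {
    throw new Error('expected one row');
  }
})(null, { rows: [] });
```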
I wasted some time finding this one.
This could be probably solved by #70.
https://github.com/pouchdb/mapreduce/blob/master/index.js#L7 is there twice. It could possibly even be removed completely, as it's already beginning to look dated.
When providing a callback to db.query(...), any exceptions thrown in that callback are silently swallowed. For example:
function myView (callback) {
  db.query('mydesign/myview', {startkey: 'a'}, function (err, result) {
    throw new Error('whoops!')
    callback(err, result)
  })
}
The error will never be seen, and instead you end up with what appears to be a hung query. This makes sense if you are planning to return the promise higher up the call stack:
function myView () {
  return db.query('mydesign/myview', {startkey: 'a'})
}
but that pretty much forces your clients to use promises, which is a bit obnoxious. In my app, I've been doing this where I have calls to query:
function myView (callback) {
  db.query('mydesign/myview', {startkey: 'a'}, function (err, result) {
    process.nextTick(callback.bind(this, err, result))
  })
}
That way any exceptions thrown in my callback bubble up as expected (crashing my server or whatever, which is what I expect them to do).
I propose that query should do this automatically when a callback is provided, as there is no meaningful way to handle callback exceptions in a promise without also forcing clients to consume promises.
Since @calvinmetcalf is a maintainer both here and for lie, I'd also humbly suggest that lie should implement something like Bluebird's nodeify so that pouchdb.query can behave as outlined above when given a node-style callback.
I think we can remove the fake design test as a duplicate of the more advanced one:
Lines 505 to 523 in a2442c4
Thoughts?
I've been working hard to improve performance of IDB/WebSQL's basic get/put/allDocs/bulkDocs operations, which are the main bottleneck in persisted mapreduce, since we use a regular PouchDB under the hood.
However, the main drag on performance is just that we do a single atomic operation for each change we get from db.changes(). There's a lot of unnecessary reading and writing from _local docs that we could avoid.
I'm going to wait until #100 is merged, but for now I'm thinking of aiming for a batch size of about 50/100 or so. Holding 50 docs in memory shouldn't cause OutOfMemory errors on most mobile devices for most doc sizes, and if it's a problem in the future, we can make it configurable.
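The batching idea itself is simple; a sketch of splitting a list of changes into fixed-size batches follows. `toBatches` is a hypothetical name, and how each batch would then be written (for example with one bulk write per batch) is left out.

```javascript
// Split an array of changes into consecutive batches of at most `batchSize`.
function toBatches(changes, batchSize) {
  var batches = [];
  for (var i = 0; i < changes.length; i += batchSize) {
    batches.push(changes.slice(i, i + batchSize));
  }
  return batches;
}

// 120 fake changes with a batch size of 50 give batches of 50, 50 and 20.
var changes = [];
for (var seq = 1; seq <= 120; seq++) {
  changes.push({ seq: seq, id: 'doc' + seq });
}
var batches = toBatches(changes, 50);
```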
In my big commit (dfe44b0) I removed emit('error'). This was undocumented and untested, so can anyone tell me what it was supposed to do?
As specified here: pouchdb/pouchdb#1820
Need docs on how to run tests, probably with links back to the pouchdb repo / contributors file etc.
Line 4 in bbb7da1
We should prevent this if possible: 111a23d
moved from daleharvey/pouchdb#1432
If I am querying CouchDB
AND I provide a startkey + endkey
AND I provide descending = true
THEN I am also expected to reverse the values of the startkey and endkey parameters for the query to work.
See http://docs.couchdb.org/en/latest/couchapp/views/intro.html?highlight=descending
If I am querying PouchDB
AND I provide a startkey + endkey
AND I provide descending = true
THEN I am NOT expected to reverse the values of the startkey and endkey parameters.
Is this a deliberate feature of PouchDB? PouchDB does indeed seem to reverse the order of the returned values but I don't need to also switch the startkey/endkey parameters.
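To spell out the CouchDB convention described above, here is a small sketch that builds range options from an ascending low/high pair. `rangeOptions` is a hypothetical helper for illustration, not part of either API.

```javascript
// Build CouchDB-style range options from an ascending [low, high] range.
// With descending=true the range is walked from the high end, so the
// start and end keys swap places.
function rangeOptions(low, high, descending) {
  if (descending) {
    return { startkey: high, endkey: low, descending: true };
  }
  return { startkey: low, endkey: high };
}

var ascending = rangeOptions('a', 'm', false);
var reversed = rangeOptions('a', 'm', true);
```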
Similar to https://github.com/daleharvey/pouchdb/pull/1239, we need to do this for mapreduce.
If you use "md5-jkmyers": "0.0.1" for the md5, then it will only get included once.
If available, mapreduce should use Global.Promise
Throw here some console.log:
Line 690 in a2442c4
I'd be delighted to see an exact description of total_rows somewhere in the documentation, because it's just my guess that it describes the number of documents before any filtering (startkey, endkey, keys, key).
What's more, I'm not sure what offset says. I feel like it's the offset (position in the full list) of the first returned row.
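Under that reading (which is the guess stated above, not something confirmed here against the documentation), the two fields would relate to a startkey query roughly as in this sketch. `queryRows` is a hypothetical in-memory stand-in that only handles string keys.

```javascript
// In-memory illustration: `sortedRows` is the whole view, sorted by key.
function queryRows(sortedRows, startkey) {
  var offset = 0;
  // Skip rows that sort before the start key.
  while (offset < sortedRows.length && sortedRows[offset].key < startkey) {
    offset++;
  }
  return {
    total_rows: sortedRows.length, // rows in the view before any filtering
    offset: offset,                // position of the first returned row
    rows: sortedRows.slice(offset)
  };
}

var view = [{ key: 'a' }, { key: 'b' }, { key: 'c' }, { key: 'd' }];
var result = queryRows(view, 'c');
```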
If we ever need to run migrations on the IDs themselves (e.g. because we find out we made a mistake in toIndexableString), it'd be useful to be able to know we're in a mapreduce db.
I'm not sure whether this is the correct reason for this promise being rejected:
{ sum: '0lala',
min: NaN,
max: NaN,
count: 1,
sumsqr: { [invalid_value: builtin _stats function requires map values to be numbers] name: 'invalid_value', status: 500 } }
Should it just be the thrown error? (i.e. this.sumsqr)
I have to dig a little more, because it looks like my couchdb kinda crashes (does not respond to the view query) if I emit a non-number value and use _stats.
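For comparison, a stats reducer that validates its input first and rejects with only the error might look like the sketch below. `stats` is a hypothetical stand-in for the builtin, and the error name, status and message are copied from the output above.

```javascript
// Hypothetical _stats-like reducer: validate every value up front, so a
// bad value produces just the error and never a half-computed result.
function stats(values) {
  values.forEach(function (value) {
    if (typeof value !== 'number') {
      var err = new Error('builtin _stats function requires map values to be numbers');
      err.name = 'invalid_value';
      err.status = 500;
      throw err;
    }
  });
  var sum = 0;
  var sumsqr = 0;
  values.forEach(function (value) {
    sum += value;
    sumsqr += value * value;
  });
  return {
    sum: sum,
    min: Math.min.apply(null, values),
    max: Math.max.apply(null, values),
    count: values.length,
    sumsqr: sumsqr
  };
}

var ok = stats([1, 2, 3]);
var failure = null;
try {
  stats(['lala']);
} catch (e) {
  failure = e;
}
```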
Due to this setup I can't run a test with mocha directly, for example to grep a specific test:
Line 14 in 82c6099
If you include any of these weird objects as a key or part of a key, Couch converts:
null, undefined, Infinity, -Infinity, NaN -> null
date -> JSON.stringify(date)
'' -> '' // no conversion
We're already doing undefined, null, and the empty string correctly, but not the others. Is there anything I missed?
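The conversions listed above look like what a JSON round trip produces, so one possible sketch is to normalize keys that way. This assumes the behaviour really does come from JSON serialization; `normalizeKey` is a hypothetical helper, and values that JSON cannot represent at all (such as functions as top-level keys) are not handled.

```javascript
// Normalize a key the way a JSON round trip would.
function normalizeKey(key) {
  // JSON.stringify(undefined) returns undefined rather than a string,
  // so the top-level case needs its own branch.
  if (typeof key === 'undefined') {
    return null;
  }
  return JSON.parse(JSON.stringify(key));
}

var converted = [
  normalizeKey(null),
  normalizeKey(undefined),
  normalizeKey(Infinity),
  normalizeKey(-Infinity),
  normalizeKey(NaN)
]; // every entry becomes null
var dateKey = normalizeKey(new Date(0)); // ISO string form of the date
var emptyKey = normalizeKey('');         // stays ''
var nested = normalizeKey([1, Infinity, undefined]); // parts are converted too
```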
I had bad feelings about a10c698 and it looks like I was right.
You can find broken tests in my commit.
Reason: you can only use strings for keys in js objects. Also: I don't like that this lookup is needed only for keys but part of its implementation is even inside emit. It should be enclosed inside mapUsingKeys.
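A short demonstration of the string-keys problem. The Map alternative at the end is only one possible workaround and an assumption here, not necessarily what mapUsingKeys should do; note that a Map compares array and object keys by identity, so those would still need to be serialized.

```javascript
// Plain objects coerce property names to strings, so the number 1 and
// the string '1' end up in the same slot.
var lookup = {};
lookup[1] = 'number one';
lookup['1'] = 'string one'; // overwrites the entry above

// A Map keeps primitive keys of different types apart.
var byKey = new Map();
byKey.set(1, 'number one');
byKey.set('1', 'string one');
```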
Currently all non-http queries are done from scratch each time. We could save the result of the map query, along with the sequence number, to a _local document, so that subsequent queries avoid having to iterate through the whole database. We could even listen to the changes feed and update the cache every time a document is created/updated.
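A rough in-memory sketch of that idea: keep the emitted rows per document together with the last processed sequence number, and only re-map documents from newer changes. Everything here is hypothetical (the cache shape, `updateCache`, the `(doc, emit)` map signature); persisting the cache to a _local document is left out, and sequence numbers are assumed to be increasing integers.

```javascript
// cache shape (assumed): { seq: <last processed seq>, rowsById: { docId: [rows] } }
function updateCache(cache, changes, mapFn) {
  changes.forEach(function (change) {
    if (change.seq <= cache.seq) {
      return; // already reflected in the cache
    }
    // Drop whatever this document emitted before.
    delete cache.rowsById[change.id];
    if (!change.deleted) {
      var rows = [];
      mapFn(change.doc, function emit(key, value) {
        rows.push({ id: change.id, key: key, value: value });
      });
      cache.rowsById[change.id] = rows;
    }
    cache.seq = change.seq;
  });
  return cache;
}

function byName(doc, emit) {
  emit(doc.name, null);
}

var cache = { seq: 0, rowsById: {} };
updateCache(cache, [
  { seq: 1, id: 'a', doc: { _id: 'a', name: 'foo' } },
  { seq: 2, id: 'b', doc: { _id: 'b', name: 'bar' } }
], byName);
// A later pass only processes the change with seq 3.
updateCache(cache, [
  { seq: 2, id: 'b', doc: { _id: 'b', name: 'bar' } },
  { seq: 3, id: 'a', deleted: true }
], byName);
```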
If a view name contains a slash, then create a design doc with the name on the left and a view with the name on the right.
If the view name doesn't contain a slash, then just use the same string for both.
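The two rules above can be sketched as a small parsing function. `parseViewName` is a hypothetical name, and splitting on the first slash only is an assumption.

```javascript
// Split 'designDoc/view' into its parts; a name without a slash is used
// for both the design doc and the view.
function parseViewName(name) {
  var slash = name.indexOf('/');
  if (slash === -1) {
    return { designDoc: name, view: name };
  }
  return {
    designDoc: name.slice(0, slash),
    view: name.slice(slash + 1)
  };
}

var withSlash = parseViewName('mydesign/myview');
var withoutSlash = parseViewName('myIndex');
```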
#68 used callbacks instead of promises, but @neojski has a branch where he fixes that: https://github.com/pouchdb/mapreduce/tree/1658-promised
Is https://github.com/pouchdb/mapreduce/blob/master/test/test.js#L3 needed, since it is already in pouch.js?