pouchdb-community / transform-pouch

PouchDB plugin for modifying documents before and after storage in the database.

License: Apache License 2.0

JavaScript 100.00%

transform-pouch's Introduction

Transform Pouch

Apply a transform function to documents before and after they are stored in the database. These functions are triggered invisibly for every get(), put(), post(), bulkDocs(), bulkGet(), allDocs(), and changes() call, and also for documents added via replication.

This allows you to:

  • Encrypt and decrypt sensitive document fields
  • Compress and uncompress large content (e.g. to avoid hitting browser storage limits)
  • Remove or modify documents before storage (e.g. to massage data from CouchDB)

Note: This plugin was formerly known as filter-pouch, but was renamed to be less confusing. The filter() API is still supported, but deprecated.

Usage

Just npm install it:

npm install transform-pouch

And then attach it to the PouchDB object:

var PouchDB = require('pouchdb');
PouchDB.plugin(require('transform-pouch'));

You can also use npm run build to compile browser-ready bundles.

API

When you create a new PouchDB, you need to configure the transform functions:

var pouch = new PouchDB('mydb');
pouch.transform({
  incoming: function (doc) {
    // do something to the document before storage
    return doc;
  },
  outgoing: function (doc) {
    // do something to the document after retrieval
    return doc;
  }
});

You can also use Promises:

var pouch = new PouchDB('mydb');
pouch.transform({
  incoming: function (doc) {
    return Promise.resolve(doc);
  },
  outgoing: function (doc) {
    return Promise.resolve(doc);
  }
});

Notes:

  • You can provide an incoming function, an outgoing function, or both.
  • Your transform function must return the document itself, or a new document (or a promise for such).
  • incoming functions apply to put(), post(), bulkDocs(), and incoming replications.
  • outgoing functions apply to get(), allDocs(), bulkGet(), changes(), query(), and outgoing replications.
  • The incoming/outgoing methods can be async or sync – just return a Promise for a doc, or the doc itself.
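As an illustration of the notes above, here is a minimal sketch of a compression transform with a synchronous incoming function and a promise-returning outgoing function. It assumes Node.js (for the built-in zlib module) and a hypothetical string field named body; neither comes from the plugin itself.

```javascript
const zlib = require('zlib');

// Assumption: only a hypothetical `body` field is compressed in this sketch.
function incoming(doc) {
  if (typeof doc.body === 'string') {
    // Store the gzipped text as base64 so it remains a JSON-safe string.
    doc.body = zlib.gzipSync(doc.body).toString('base64');
  }
  return doc; // synchronous: return the doc itself
}

function outgoing(doc) {
  if (typeof doc.body === 'string') {
    const compressed = Buffer.from(doc.body, 'base64');
    doc.body = zlib.gunzipSync(compressed).toString('utf8');
  }
  return Promise.resolve(doc); // asynchronous: return a promise for the doc
}

// With the plugin attached, these would be registered as:
// pouch.transform({ incoming: incoming, outgoing: outgoing });
```

Either style can be used for either direction; the mix here is only meant to show both forms side by side.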

Example: Encryption

Update! Check out crypto-pouch, which is based on this plugin, and runs in both the browser and Node. The instructions below will only work in Node.

Using the Node.js crypto library, let's first set up our encrypt/decrypt functions:

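var crypto = require('crypto');

// Note: crypto.createCipher() is deprecated and has been removed from
// recent Node.js releases; there, use crypto.createCipheriv() instead.

function encrypt(text) {
  // Create a fresh cipher per call: a cipher cannot be reused after final().
  var cipher = crypto.createCipher('aes-256-cbc', 'password');
  var crypted = cipher.update(text, 'utf8', 'base64');
  return crypted + cipher.final('base64');
}

function decrypt(text) {
  // Likewise, create a fresh decipher for every value.
  var decipher = crypto.createDecipher('aes-256-cbc', 'password');
  var dec = decipher.update(text, 'base64', 'utf8');
  return dec + decipher.final('utf8');
}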

Obviously you would want to change the 'password' to be something only the user knows!
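The functions above rely on crypto.createCipher(), which newer Node.js releases deprecate. As a hedged alternative, the sketch below uses crypto.createCipheriv() with a key derived by crypto.scryptSync(). The salt value and the "iv:ciphertext" storage format are assumptions made for this example, not part of the plugin or the original README.

```javascript
const crypto = require('crypto');

// Assumption: in a real app the password comes from the user and the salt is
// stored per database; both are hard-coded placeholders here.
const key = crypto.scryptSync('password', 'example-salt', 32);

function encrypt(text) {
  const iv = crypto.randomBytes(16); // fresh random IV for every value
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const crypted = cipher.update(text, 'utf8', 'base64') + cipher.final('base64');
  // Keep the IV next to the ciphertext so decrypt() can recover it.
  return iv.toString('base64') + ':' + crypted;
}

function decrypt(text) {
  const parts = text.split(':'); // base64 never contains ':'
  const iv = Buffer.from(parts[0], 'base64');
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return decipher.update(parts[1], 'base64', 'utf8') + decipher.final('utf8');
}
```

With this variant, stored values take the form of a base64 IV, a colon, and the base64 ciphertext, and encrypting the same text twice yields different output because of the random IV.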

Next, let's set up our transforms:

pouch.transform({
  incoming: function (doc) {
    Object.keys(doc).forEach(function (field) {
      if (field !== '_id' && field !== '_rev' && field !== '_revisions') {
        doc[field] = encrypt(doc[field]);
      }
    });
    return doc;
  },
  outgoing: function (doc) {
    Object.keys(doc).forEach(function (field) {
      if (field !== '_id' && field !== '_rev' && field !== '_revisions') {
        doc[field] = decrypt(doc[field]);
      }
    });
    return doc;
  }
});

(transform-pouch will automatically ignore deleted documents, so you don't need to handle that case.)
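The transforms above pass each field value straight to encrypt(), which expects a string. A minimal sketch of one way to handle numbers, arrays, and nested objects is to serialize each value as JSON first. The mapFields helper, the rule of skipping every field that starts with an underscore, and the base64 encode/decode placeholders (standing in for a real encrypt/decrypt pair) are all assumptions made for illustration.

```javascript
// Apply `fn` to every user field, leaving reserved `_`-prefixed fields
// (such as _id, _rev, _revisions, _attachments) untouched.
function mapFields(doc, fn) {
  Object.keys(doc).forEach(function (field) {
    if (field.charAt(0) !== '_') {
      doc[field] = fn(doc[field]);
    }
  });
  return doc;
}

// Placeholders standing in for a real encrypt()/decrypt() pair.
function encode(text) { return Buffer.from(text, 'utf8').toString('base64'); }
function decode(text) { return Buffer.from(text, 'base64').toString('utf8'); }

const transforms = {
  incoming: function (doc) {
    // Serialize first so non-string values survive the round trip.
    return mapFields(doc, function (value) {
      return encode(JSON.stringify(value));
    });
  },
  outgoing: function (doc) {
    return mapFields(doc, function (value) {
      return JSON.parse(decode(value));
    });
  }
};

// With the plugin attached: pouch.transform(transforms);
```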

Now, the documents are encrypted whenever they're stored in the database. If you want to verify, try opening them with a Pouch where you haven't set up any transforms. You'll see documents like:

{
  secret: 'YrAtAEbvp0bPLil8EpbNeA==',
  _id: 'doc',
  _rev: '1-bfc37cd00225f68671fe3187c054f9e3'
}

whereas privileged users will see:

{
  secret: 'my super secret text!',
  _id: 'doc',
  _rev: '1-bfc37cd00225f68671fe3187c054f9e3'
}

This works for remote CouchDB databases as well. In fact, only the encrypted data is sent over the wire, so it's ideal for protecting sensitive information.

Note on query()

Since the remote CouchDB doesn't have access to the original, untransformed document, map/reduce functions that are executed directly against CouchDB will be applied to the document as it is stored (for example, still encrypted). PouchDB doesn't have this limitation, because everything is local.

So for instance, if you try to emit() an encrypted field in your map function:

function (doc) {
  emit(doc.secret, 'shhhhh');
}

... the emitted key will be encrypted when you query() the remote database, but decrypted when you query() a local database. So be aware that the query() functionality is not exactly the same.
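The difference can be simulated without any database. In the sketch below, fakeEncrypt is a base64 stand-in for real encryption, and the map function takes emit as a second argument so that it can run outside CouchDB; both are assumptions made to keep the example self-contained.

```javascript
// Stand-in for encryption so the sketch runs on its own.
function fakeEncrypt(text) {
  return Buffer.from(text, 'utf8').toString('base64');
}

// Run a map function against one document and collect the emitted keys.
function emittedKeys(mapFn, doc) {
  const keys = [];
  mapFn(doc, function emit(key) { keys.push(key); });
  return keys;
}

// Same logic as the map function above, with emit passed in explicitly.
function map(doc, emit) {
  emit(doc.secret, 'shhhhh');
}

const plain = { _id: 'doc', secret: 'my super secret text!' };
const stored = { _id: 'doc', secret: fakeEncrypt(plain.secret) };

// A remote view only ever sees the stored document.
const remoteKeys = emittedKeys(map, stored);
// Per the note above, a local query() works with the decrypted document.
const localKeys = emittedKeys(map, plain);
```

Here remoteKeys holds the encoded value while localKeys holds the readable text, which is the mismatch described above.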

Building

You can build transform-pouch for the browser with npm run build:

npm install
npm run build

This will place browser bundles, minified and unminified, in the dist/ folder.

Testing

You can run the test suite with npm test.

To run tests in Node specifically, using LevelDB:

npm run test:node

You can also run tests in a headless browser with mochify:

npm run test:browser

You can also check for code coverage using:

npm run coverage

You can run a single test using Mocha options:

TEST_DB=local npm run test:node -- --reporter spec --grep search_phrase

The TEST_DB environment variable specifies the database that PouchDB should use. You may specify either local (which uses LevelDB) or http (which uses the $COUCH_URL environment variable to connect to a CouchDB installation).

License

Apache-2.0

transform-pouch's People

Contributors

andreashohnholt, calvinmetcalf, djungowski, garbados, gr2m, hulkoba, jcoglan, marten-de-vries, nolanlawson, vzickner

transform-pouch's Issues

`incoming` handler runs twice when using `.bulkDocs()`

Since PouchDB 6.0.3, the incoming handler runs twice when using the .bulkDocs method. You can demonstrate this issue with this snippet:

const PouchDB = require('pouchdb')
PouchDB.plugin(require('transform-pouch'))

const db = new PouchDB('.test')
let i = 0
db.transform({
  incoming: function (doc) {
    i++
    return doc
  }
})

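Promise.resolve().then(async () => {
  await db.bulkDocs([{ _id: 'a' }])
  console.log(i === 2) // true when the handler ran twice for one document
}).catch((err) => {
  console.error(err)
}).then(() => db.destroy()) // call destroy as a method so `this` stays bound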

`incoming` handler does not run against CouchDB

When using CouchDB, the incoming handler never runs. It does run when using local PouchDB adapters.

You can demonstrate this bug using this gist. For example:

$ curl https://gist.githubusercontent.com/garbados/50cf8156680c62dfad9db768a8f51a65/raw/76d5e3ca0da7b437540e005a4ee019575127231d/transform-pouch-bug.js > transform-pouch-bug.js
$ export COUCH_URL="http://[user]:[pass]@localhost:5984"
$ node transform-pouch-bug.js 
Both handlers ran.
$ USE_COUCH=true node transform-pouch-bug.js 
`incoming` handler never ran!

Uncaught (in promise) TypeError: Cannot read property 'map' of undefined(…)

Uncaught (in promise) TypeError: Cannot read property 'map' of undefined(…)
I get this error randomly during db reads, with PouchDB 5.1.0 and transform-pouch 1.0.2. It appears to break replication afterwards. I reproduced the error by turning off internet connectivity, making changes, and waiting a few seconds (during which this warning appears: PouchDB: the remote database may not have CORS enabled.If not please enable CORS: http://pouchdb.com/errors.html#no_access_control_allow_origin_header). On turning connectivity back on, I get the error and syncing stops until manually restarted.
I only have return doc in both incoming and outgoing. Syncing works as it should once I remove transform-pouch.

incoming not running on PUT with masked _id

I want to mask the _id with another fieldname.

When I add the following test, it does not work. The incoming function does not fire. isUntransformable is false, and I see no reason why this should not work.

    it('transforms on PUT, with masked _id', function () {
      db.transform({
        incoming: function (doc) {
          doc._id = doc.passportId;
          delete doc.passportId;
          return doc;
        }
      });
      return db.put({passportId: 'foo'}).then(function () {
        return db.get('foo');
      }).then(function (doc) {
        doc._id.should.equal('foo');
      });
    });

When I use bulkDocs instead of put, it works as expected.

    it('transforms on bulkDocs, with masked _id', function () {
      db.transform({
        incoming: function (doc) {
          doc._id = doc.passportId;
          delete doc.passportId;
          return doc;
        }
      });
      return db.bulkDocs([{passportId: 'foo'}]).then(function () {
        return db.get('foo');
      }).then(function (doc) {
        doc._id.should.equal('foo');
      });
    });

Is this behavior intended?

Problem with live replication for pouch change event

I have implemented transform-pouch for encryption of Pouch data. I am facing an issue with live replication.

Somehow the outgoing function is receiving already-decrypted data. Ideally, the outgoing function should be called when someone gets data from Pouch, and it should receive encrypted data.

I already encrypt the data in the incoming function of transform-pouch.

The problem is described in detail, with a code snippet, here:

http://stackoverflow.com/questions/31244222/issue-with-pouch-transformation-during-replication

Encryption README example needs an update

Hi,

I implemented my encryption/decryption (based on the sjcl lib) like the README example, but it did not work because of the _revisions field.

This works for me:

localDB.transform({
        incoming: function (doc) {
          Object.keys(doc).forEach(function (field) {
            if (field !== '_id' && field !== '_rev' && field !== '_revisions') {
                //appuser is the current user logged in to the app
                var fieldValue = sjcl.encrypt(appuser.GetDBPassword(),angular.toJson(doc[field]));
                doc[field] = fieldValue;
            }
          });
          return doc;
        },
        outgoing: function (doc) {
          Object.keys(doc).forEach(function (field) {
            if (field !== '_id' && field !== '_rev' && field !== '_revisions') {
              var fieldValue = angular.fromJson(sjcl.decrypt(appuser.GetDBPassword(),doc[field]));
              doc[field] = fieldValue;
            }
          });
          return doc;
        }
    });

transform-pouch breaks Design Documents in PouchDB > 6.1.1

As of PouchDB 6.1.1, transform-pouch breaks the indexing of design documents, and a direct query on a view returns no results where it should. The transform doesn't even need to do anything; it is enough that it's initialized.

How to reproduce (I used PouchDB 6.3.4 and transform-pouch 1.1.3):

const pouchConnection = new PouchDB('some-pouchdb');
const documentId = 'some-id';

pouchConnection.transform({});

pouchConnection.put({
  _id: '_design/some_view',
  views: {
    some_view: {
      map: function(doc) {
        emit(doc.id, doc.value);
      }.toString(),
      reduce: '_sum'
    }
  }
})
  .then(() => pouchConnection.put({_id: documentId, value: 5}))
  .then(() => pouchConnection.query('some_view'))
  .then(docs => console.log(docs.rows.length));

Expected output: 1
Actual output: 0

If you remove the line pouchConnection.transform({}); everything works fine. PouchDB 6.1.0 is the last version that transform-pouch works with without any errors.

Docs with rev greater than 1-* not transforming when replicating from remote

Hi everyone!
I've been working with PouchDB and transform-pouch for some time now and came across this problem (original issue: pouchdb/pouchdb#7034).

Issue

Documents with a revision greater than 1-* are not transformed when replicating from a remote CouchDB.

_changes?style=all_docs&seq_interval=100&since=0&limit=100 returns all changes, but the
_all_docs?conflicts=true&include_docs=true request payload contains only the keys of docs with rev 1-*.

Info

  • Environment: browser
  • Platform: Chrome
  • Adapter: IndexedDB
  • Server: CouchDB

Reproduce

  1. Put document in remote (rev 1).
  2. Modify doc and put it in remote (rev 2).
  3. Set a transformation on outgoing docs (e.g. doc.local = true; doc.remote = false;).
  4. Start replication to local PouchDB.
  5. List local docs (should not contain changes from transformation).

Can't get glitch to work with transform-pouch (any ideas?):
https://glitch.com/edit/#!/understood-ear

I've been struggling with this for almost a week now and don't know if I'm doing something awfully wrong or this is a bug / desired behavior. Help, please? :)

incoming transform :: document updates are not live replicated

I've noticed that the act of attaching transform-pouch to a DB breaks live replication, specifically in the case where an existing document is updated. Creates are fine.

I don't know how to fix this, but I have managed to recreate within the transform-pouch test harness. I've included the test case in my local branch (see linked PR).

Test case may be over-complicated; apologies.

Sync failed for updated docs from a local encrypted db

Hi,

If I update a doc and then try to sync it to a remote db, PouchDB throws this exception:
Error: There was a problem getting docs.
at finishBatch (pouchdb-6.1.1.js:13009)

If I create a new doc in an empty db (local and remote) the first sync works well.

Here is the code of my crypto transform (I am using sjcl because it works well on iOS Safari) :

localDB.transform({
            incoming: function (doc) {
              Object.keys(doc).forEach(function (field) {
                if (field !== '_id' && field !== '_rev' && field !== '_revisions' && field !== '_attachments') {
                    var password = appuser.GetDBPassword();
                    var fieldValue = sjcl.encrypt(password,angular.toJson(doc[field]));
                    doc[field] = fieldValue;
                }
              });
              return doc;
            },
            outgoing: function (doc) {
              Object.keys(doc).forEach(function (field) {
                if (field !== '_id' && field !== '_rev' && field !== '_revisions' && field !== '_attachments') {
                  var password = appuser.GetDBPassword();
                  var fieldValue = angular.fromJson(sjcl.decrypt(password,doc[field]));
                  doc[field] = fieldValue;
                }
              });
              return doc;
            }
        });  

incoming is not triggered when using pouchDB put function

Hello,

I used transform-pouch (v1.1.3) within Node.js (v6.7.0) to access a CouchDB (v1.6.1) backend server. When I use the PouchDB (v6.0.5) post function, incoming is triggered. But when I use the PouchDB put function, it is not.

Can you help ?
Thanks for your support.

Here is the code of my node app and the resulting console output :

Node app code

var express = require('express')
  , http = require('http')

var app = express();
app.set('port', process.env.PORT || 3000);

var PouchDB = require('pouchdb');
PouchDB.plugin(require('transform-pouch'));
var db = new PouchDB('http://localhost:5984/db');
db.transform({    
	outgoing: function(doc) {
		console.log("****Outgoing****" + ":" + JSON.stringify(doc));
		return doc;
	},
	incoming: function(doc) {
		console.log("****Incoming****" + ":" + JSON.stringify(doc));
		return doc;
	}
});

db.put({"_id": "test", "PUT":"PUT"}).then(function (result) {
	console.log("PUT result")
	console.log(result)
	db.get("test").then(function (result) {
		console.log("GET result");
		console.log(result);
		db.post({"test": "test", "POST":"POST"}).then(function (result) {
			console.log("POST result")
			console.log(result)
		})
	})
}).catch(function (err) {
	console.log(err);
});

var server = http.createServer(app).listen(app.get('port'), function(){
  console.log('Express server listening on port ' + app.get('port'));
});

Node output console

Express server listening on port 3000
PUT result
{ ok: true,
  id: 'test',
  rev: '7-92c1e5528c692fbeb6cf97f0687f53e7' }
****Outgoing****:{"_id":"test","_rev":"7-92c1e5528c692fbeb6cf97f0687f53e7","PUT":"PUT"}
GET result
{ _id: 'test',
  _rev: '7-92c1e5528c692fbeb6cf97f0687f53e7',
  PUT: 'PUT' }
****Incoming****:{"test":"test","POST":"POST"}
POST result
{ ok: true,
  id: 'bf6cf5681c3fb3207a5380f1ad004629',
  rev: '1-319fa1494afe5f62e191725989a66c34' }

Dependent dbs

When an index is created, the new dependent db won't have the transform. E.g. if you use crypto-pouch, your query index is not encrypted.

Rename to transform-pouch

filter-pouch makes me think of Array.prototype.filter, which is not really what this thing is doing.

map-pouch would be more appropriate, but the word map is overloaded. Groovy has a transform method that does the same thing. I kind of like that name.

Non-replicating (local) / replication-only transformation functions

It would make a lot of sense - especially for applications that maintain a special in-DB representation of data that should be replicated without transformation, like, say, crypto-pouch - if there were variants like incomingLocal, outgoingLocal, incomingReplication, and outgoingReplication, where the latter two only apply for replication, and the former only apply for non-replicating actions (get, put, etc).

I guess this can be emulated right now by doing something like this:

const dbOriginal = new PouchDB('mydb');
const dbOnlyForLocalUse = new PouchDB(dbOriginal)
  .transform({incoming: incomingLocal, outgoing: outgoingLocal});
const dbOnlyForReplication = new PouchDB(dbOriginal)
  .transform({incoming: incomingReplication, outgoing: outgoingReplication});

but that would lead to issues, with, say, a general function that takes one database as an argument, on which it performs both replication and non-replication operations.

Also, does new PouchDB(dbTransformHasAlreadyBeenCalledOn) copy the transformations that are on that DB? My gut says it should, but I'm not certain (and I'm not sure if the docs make any statement on the matter one way or another).

Does transformation affect `_rev` when replicating?

Was just thinking about this: if I put an outgoing transformation on a database and replicate to a fresh database, is that new database going to now think it has the same revision as the original database, with altered content? What about if I replicate a document that the database has - is it going to reject the replication, ignore it (thinking it already has the document), or record an update?

This seems like it'd be an important reason for addressing #37, and could be (part of) replication issues that have been reported like #8.

Problem with unused library functions

My name is Hernan. With a group of colleagues, I am conducting research on unused code in the dependencies of JavaScript projects. We call these functions UFFs (unused foreign functions). We found that in most projects a large number of UFFs are included in the final bundle.

In the case of transform-pouch (v1.1.3), our tools detected approximately 78 unused functions in dependencies. Removing those functions could reduce the size of the transform-pouch bundle by at least 26% (all tests passed). I replaced the bundle in several projects that use transform-pouch, such as pouch-pid, crypto-pouch, and pouch-box. I'm attaching the reduced version of your project.
pouchdb.transform-pouch(optimized).txt

I'd be very grateful if you could answer the following questions:
- Were you aware of the existence of these unused functions in your projects?
- Do you think that this is a problem?
- Do you think a tool for dealing with this kind of problem would be useful?

Thanks in advance.

Cheers,

outgoing transform gets called twice when get()-ing first document from db after closing and re-opening

Hey guys,

first of all thank you very much for PouchDB, I think this is a truly awesome project!

Now, I am experiencing a problem using transformers which I was able to reproduce with this jasmine test:

    it("shows duplicate outgoing transform calls", function (done) {
        var database = new PouchDB("bar", {});
        database.transform({
            incoming: function incoming(doc) {
                console.log("incoming called for doc", doc);
                return doc;
            }, outgoing: function outgoing(doc) {
                console.log("outgoing called for doc", doc);
                return doc;
            }
        });
        database.put({ _id: "1", val: "bar" }).then(function () {
            return database.close();
        }).then(function () {
            database = new PouchDB("bar", {});
            database.transform({
                incoming: function incoming(doc) {
                    console.log("incoming called for doc", doc);
                    return doc;
                }, outgoing: function outgoing(doc) {
                    console.log("outgoing called for doc", doc);
                    return doc;
                }
            });
            database.get("1").then(function (doc) {
                return database.destroy();
            }).then(function () {
                done();
            });
        });
    });

The console output I get is this (consistent with a behavior I saw in my production code):

LOG: 'incoming called for doc', Object{_id: '1', val: 'bar'}
LOG: 'outgoing called for doc', Object{val: 'bar', _id: '1', _rev: '1-506c4ea0b5a45566efbb9b6013ebecf7'}
LOG: 'outgoing called for doc', Object{val: 'bar', _id: '1', _rev: '1-506c4ea0b5a45566efbb9b6013ebecf7'}

I am using PouchDB 5.3.2 together with transform-pouch 1.1.1. This happens in my production code when an app is started with a DB containing a few documents and any one of these documents is fetched for the first time. Subsequent calls to get() work as expected, i.e. outgoing is called only once. I think this might be a bug. Is it?

Kind Regards

Sven

How to remove documents during incoming/outgoing handlers?

I'm looking to set up a PouchDB environment such that documents can contain an expiry property (e.g. an expiresAt timestamp).

As such, I am interested in whether I can use this plugin's incoming/outgoing functionality to check a document's expiry when it is read or written and verify that it should still exist. If so, return it; if not, remove it from the database and do not return it.

Looking through the code and issue tracker, it is not at all clear how to achieve this.

The README does, however, mention this basic concept in passing:

This allows you to:

  • ...
  • Remove or modify documents before storage (e.g. to massage data from CouchDB)

Can someone please help me understand how to hook such a thing up?
