
mongodb_beaker's Introduction

mongodb_beaker

MongoDB backend for Beaker's caching / session system.

Based upon Beaker's ext:memcache code.

This is implemented in a don't-assume-it's-there manner. It uses the Beaker namespace as the MongoDB document's _id, with everything in that namespace (e.g. a session or cache namespace) stored as a full document. Each key/value is part of that compound document, using upserts for performance.
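The layout described above can be sketched without a running server by building the filter and update documents that such an upsert would use. This is only an illustration, not the backend's actual code: the `build_upsert` helper and the `data.` field prefix are assumptions made for the example.

```python
def build_upsert(namespace, key, value):
    """Return the (filter, update) pair for storing one cache key.

    Hypothetical helper: the namespace is the document's _id and each
    cache key becomes a field inside that single document.
    """
    query = {"_id": namespace}
    # "$set" writes only this key; other keys in the namespace are left alone.
    # The "data." prefix is an assumption, not confirmed by the README.
    update = {"$set": {"data." + key: value}}
    return query, update


query, update = build_upsert("session_abc", "user_id", 42)
print(query)
print(update)
```

With PyMongo, a pair like this would typically be passed as `collection.update_one(query, update, upsert=True)`, which creates the namespace document if it does not exist yet and overwrites the key if it does.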

I will probably add a toggleable option for using subcollections, since in certain cases, such as caching Mako templates, this may be preferable performance-wise.

Right now, this is primarily optimized for use with Beaker sessions. I still need to look at tweaking the Beaker session itself, since storing individual keys rather than everything under a single 'session' key may be desirable for pruning, management, and querying.

I have not tackled expiration yet, so you may want to hold off using this if you need it. It will be in the next update, but for now its absence limits usefulness primarily to sessions. (I'll add a cleanup script later as well.)

Because upserts are used, no check-then-insert is required, but an upsert will overwrite previous values, which should be expected behavior when caching. Safe mode is NOT invoked, so failures will be silent. TODO: make safe an overridable config option?

Note that, unless you disable it, the mongodb_beaker container will use pickle (it tries loading cPickle first and falls back on pickle) to serialize/deserialize data to MongoDB.
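The import fallback and the serialize/deserialize round trip can be sketched as follows. The helper names `to_stored` and `from_stored` and the choice of pickle protocol 2 are assumptions for illustration, not taken from the backend.

```python
try:
    import cPickle as pickle  # Python 2: the faster C implementation
except ImportError:
    import pickle  # Python 3, or Python 2 without cPickle


def to_stored(value):
    """Serialize a Python value to bytes before it is written."""
    return pickle.dumps(value, protocol=2)  # protocol 2 is an assumption


def from_stored(blob):
    """Rebuild the Python value from the stored bytes."""
    return pickle.loads(blob)


original = {"user": "alice", "visits": 3}
blob = to_stored(original)
restored = from_stored(blob)
print(restored == original)
```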

Beaker should maintain thread safety on connections internally and so I am relying upon that rather than setting up threadlocal, etc. If this assumption is wrong or you run into issues, please let me know.

Configuration

To set this up in your own project so that Beaker can find it, a setuptools entry point must be defined in a setup.py file. If you install from the egg distribution, mongodb_beaker's setup.py SHOULD create a beaker.backends entry point. If you need to tweak it, want to see how it's done, or it just doesn't work and you need to define your own, mine looks like this:

>>> entry_points="""
... [beaker.backends]
... mongodb = mongodb_beaker:MongoDBNamespaceManager
... """,

With this defined, Beaker should automatically find the entry point at startup (Beaker 1.4 and higher support custom entry points) and load it as an optional backend called 'mongodb'. There are several ways to configure Beaker; I only cover ini file configuration (such as with Pylons) here. There are more configuration options and details in the Beaker configuration docs [1].

[1] Beaker's configuration documentation - http://beaker.groovie.org/configuration.htm

I have a few cache regions in one of my applications, some of which are memcache and some are on mongodb. The region config looks like this:

>>> # new style cache settings
... beaker.cache.regions = comic_archives, navigation
... beaker.cache.comic_archives.type = libmemcached
... beaker.cache.comic_archives.url = 127.0.0.1:11211
... beaker.cache.comic_archives.expire = 604800
... beaker.cache.navigation.type = mongodb
... beaker.cache.navigation.url = mongodb://localhost:27017/beaker.navigation
... beaker.cache.navigation.expire = 86400

The Beaker docs [1] contain detailed information on configuring regions. The items we're interested in here are the beaker.cache.navigation keys. Each Beaker cache definition needs a type field, which defines which backend to use. Specifying mongodb will (if the module is properly installed) tell Beaker to cache via MongoDB. Note that if Beaker cannot load the extension, it will tell you that mongodb is an invalid backend.
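To make the key structure concrete, here is a small sketch that groups the flat `beaker.cache.<region>.<option>` keys by region. It is not Beaker's own parser; `group_regions` is a hypothetical helper, and the settings dictionary simply mirrors the ini lines above.

```python
def group_regions(settings, prefix="beaker.cache."):
    """Group flat 'beaker.cache.<region>.<option>' keys by region name."""
    names = [name.strip() for name in settings[prefix + "regions"].split(",")]
    regions = {}
    for name in names:
        region_prefix = prefix + name + "."
        # Keep only this region's keys and strip the shared prefix.
        regions[name] = {
            key[len(region_prefix):]: value
            for key, value in settings.items()
            if key.startswith(region_prefix)
        }
    return regions


settings = {
    "beaker.cache.regions": "comic_archives, navigation",
    "beaker.cache.comic_archives.type": "libmemcached",
    "beaker.cache.comic_archives.url": "127.0.0.1:11211",
    "beaker.cache.comic_archives.expire": "604800",
    "beaker.cache.navigation.type": "mongodb",
    "beaker.cache.navigation.url": "mongodb://localhost:27017/beaker.navigation",
    "beaker.cache.navigation.expire": "86400",
}
regions = group_regions(settings)
print(regions["navigation"])
```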

Expiration uses the standard Beaker syntax, although it is not supported by this backend at the moment.

Finally, you need to define a URL to connect to MongoDB. This follows the standardized MongoDB URI Format [3]. Currently the only option supported is 'slaveOK'. For backwards compatibility with old versions of mongodb_beaker, separating database and collection with a '#' instead of '.' is supported, but deprecated. The syntax is mongodb://<hostname>[:port]/<database>.<collection>

You must define a collection for MongoDB to store data in, in addition to a database.
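The URL shape can be illustrated with the standard library alone. The sketch below is not how mongodb_beaker parses URLs; `split_target` is a hypothetical helper, and the default port of 27017 is an assumption based on MongoDB's usual default.

```python
from urllib.parse import urlsplit


def split_target(url, default_port=27017):
    """Split a mongodb:// URL into (host, port, database, collection)."""
    parts = urlsplit(url)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URL")
    path = parts.path.lstrip("/")
    if parts.fragment:
        # Deprecated form: <database>#<collection>
        database, collection = path, parts.fragment
    else:
        # Current form: <database>.<collection>; split on the first dot only.
        database, _, collection = path.partition(".")
    if not database or not collection:
        raise ValueError("URL must name both a database and a collection")
    return parts.hostname, parts.port or default_port, database, collection


print(split_target("mongodb://localhost:27017/beaker.navigation"))
print(split_target("mongodb://localhost/beaker#navigation"))
```

A URL that names only a database, such as `mongodb://localhost/beaker`, is rejected by this sketch, matching the requirement that a collection be given.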

mongodb_beaker also supports MongoDB's optional authentication. Simply define your URL like this:

>>> beaker.cache.navigation.url = mongodb://bwmcadams:passW0Rd?@localhost:27017/beaker.navigation

The mongodb_beaker backend will attempt to authenticate with the username and password. You must configure MongoDB's optional authentication support [2] for this to work (by default, MongoDB doesn't use authentication).

[2] MongoDB Authentication Documentation: http://www.mongodb.org/display/DOCS/Security+and+Authentication
[3] MongoDB URI Format: http://www.mongodb.org/display/DOCS/Connections

Reading from Secondaries (SlaveOK)

If you'd like to enable reading from secondaries (SlaveOK), you can add that to your URL:

>>> beaker.cache.navigation.url = mongodb://bwmcadams:passW0Rd?@localhost:27017/beaker.navigation?slaveok=true
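As a rough illustration of how such an option can be read from the query string, here is a standard-library sketch. `read_slaveok` is a hypothetical helper rather than the backend's code, and the example URLs omit credentials to keep the parsing simple.

```python
from urllib.parse import parse_qs, urlsplit


def read_slaveok(url):
    """Return True when the URL's query string sets slaveok=true."""
    query = urlsplit(url).query
    # Lower-case option names so 'slaveOK' and 'slaveok' are treated alike.
    options = {name.lower(): values[-1] for name, values in parse_qs(query).items()}
    return options.get("slaveok", "false").lower() == "true"


print(read_slaveok("mongodb://localhost:27017/beaker.navigation?slaveok=true"))
print(read_slaveok("mongodb://localhost:27017/beaker.navigation"))
```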

Using Beaker Sessions and disabling pickling

If you want to save some CPU cycles and can guarantee that what you're passing in is either "mongo-safe" and doesn't need pickling, or already pickled (such as when using Beaker sessions), you can set an extra Beaker config flag of skip_pickle=True.

To make that perfectly clear: Beaker sessions are ALREADY PASSED IN pickled, so you want to configure it to skip_pickle. It shouldn't hurt anything to double-pickle, but you will certainly waste precious CPU cycles. And wasting CPU cycles is kind of counterproductive in a caching system.

My Pylons application configuration for mongodb_beaker has the following session configuration:

>>> beaker.session.type = mongodb
... beaker.session.url = mongodb://localhost:27017/beaker.sessions
... beaker.session.skip_pickle = True
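The cost of double-pickling can be seen in a few lines. This is a sketch under stated assumptions: `prepare` is a hypothetical stand-in for the backend's serialization step, and the session payload is simulated by pickling a small dictionary first.

```python
import pickle


def prepare(value, skip_pickle=False):
    """Hypothetical stand-in for the serialization step before a write."""
    if skip_pickle:
        return value  # trust the caller: already pickled or mongo-safe
    return pickle.dumps(value, protocol=2)


# Simulate a session payload that arrives already pickled.
session_payload = pickle.dumps({"user": "alice"}, protocol=2)

once = prepare(session_payload, skip_pickle=True)
twice = prepare(session_payload, skip_pickle=False)

print(len(once), len(twice))
```

The double-pickled value still round-trips (it just needs two `pickle.loads` calls), but it is larger and costs an extra serialization pass on every write and read.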

Depending on your individual needs, you may also wish to create a capped collection for your caching (e.g. for memcache-like storage that retains only the most recent entries).

See the MongoDB capped collection docs for details.

Sparse Collection Support

The default behavior of mongodb_beaker is to create a single MongoDB Document for each namespace, and store each cache key/value on that document. In this case, the "_id" of the document will be the namespace, and each new cache entry will be attached to that document.

This approach works well in many cases and makes it very easy for Mongo to efficiently manage your cache. However, in other cases you may wish to change behavior. This may be for efficiency reasons, or because you're worried about documents getting too large.

In this case, you can enable a "sparse collection" mode, where mongodb_beaker will create a document for EACH key in the namespace. When sparse collections are enabled, the "_id" of a document is a compound document containing the namespace and the key:

{ "_id" : { "namespace" : "testcache", "key" : "value" } }

The cache data for that key will be stored in a document field 'data'. You can enable sparse collections in your config with the 'sparse_collections' variable:

>>> beaker.session.type = mongodb
... beaker.session.url = mongodb://localhost:27017/beaker.sessions
... beaker.session.sparse_collections = True
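The sparse layout can be sketched the same way as the default one, by building the filter and update documents for a single key. `build_sparse_upsert` is a hypothetical helper; only the `_id` shape and the `data` field come from the description above.

```python
def build_sparse_upsert(namespace, key, value):
    """Return the (filter, update) pair for one key in sparse mode."""
    # One document per key: the _id combines namespace and key.
    query = {"_id": {"namespace": namespace, "key": key}}
    # The cached value lives in the document's 'data' field.
    update = {"$set": {"data": value}}
    return query, update


query, update = build_sparse_upsert("testcache", "value", [1, 2, 3])
print(query)
print(update)
```

Because MongoDB compares embedded documents field by field in order, the `namespace` and `key` fields need to be built in the same order every time for lookups on `_id` to match.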

Note for Users of Previous Releases

For bug fix and feature reasons, MongoDB Beaker 0.5+ is not compatible with caches created by previous releases. Because this is cache data, it shouldn't be a big deal. We recommend dropping or flushing your entire cache collection(s) before upgrading to 0.5+; be aware that new caches will be generated afterwards.

mongodb_beaker's People

Contributors

bwmcadams

mongodb_beaker's Issues

exception passed to logging

In __init__.py line 509, the exception is passed to the log function. This results in an exception because logging tries to format the string with the argument, but the string has no placeholder for it. Just remove it and we are fine. Thanks
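As an alternative to dropping the argument, the message can be given a placeholder so that logging has somewhere to put the exception. The sketch below uses a made-up message and logger name, since the actual line 509 is not shown in the report.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log = logging.getLogger("mongodb_beaker_example")  # hypothetical logger name
log.setLevel(logging.WARNING)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)

try:
    raise ValueError("bad pickle")
except ValueError as exc:
    # The "%s" gives logging a slot for the extra argument.
    log.warning("Failed to unpickle value: %s", exc)

print(handler.messages)
```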

Session Url problem for Replica Sets

Hi,

I have a mongodb replica-set, and I have provided the following as url in my pyramid app:

session.url = mongodb://xyz.example.com,xyz2.example.com,xyz3.example.com/beaker.sessions?replicaSet=name

But the host_uri formed by mongodb_beaker is incorrect: it doesn't have ',' between the hosts.

host_uri = mongodb://xyz.example.com:27017xyz2.example.com:27017xyz3.example.com:27017

-Ravi
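For reference, a host list can be joined into the expected comma-separated form as in the sketch below. `build_host_uri` is a hypothetical helper written for this illustration; it is not the function in mongodb_beaker that produces host_uri.

```python
def build_host_uri(hosts, default_port=27017):
    """Join 'host[:port]' entries into a single mongodb:// host URI."""
    parts = []
    for host in hosts:
        name, _, port = host.partition(":")
        parts.append("%s:%s" % (name, port or default_port))
    # The comma separator is what the malformed URI above is missing.
    return "mongodb://" + ",".join(parts)


print(build_host_uri(["xyz.example.com", "xyz2.example.com", "xyz3.example.com"]))
```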

Patch suggestion for dotted keys

Short version: This patch addresses the issue where the Beaker keys may contain dots.

Long version:
I have used your library as a Beaker cache and learned that when a function is decorated with Beaker's cache_region and any of the parameters passed to it contains dots, the cache does not work.

Example:
from beaker.cache import cache_region

@cache_region('short_term')
def some_function(value1):
    ...  # body elided in the original report
    return some_computed_value

If value1 is passed as "some.dotted.string" to the above function, the cache does not work, because the "key" for the call will be:
"some_function some.dotted.string"

MongoDB interprets the dots in the above key and stores it as something like:
{some_function some: { dotted : { string : .....}

As such, the passed key and the stored key never match, and the cache lookup fails.

Suggested Patch/Fix:

  1. Add the following function:
    def format_key(self, key):
        return re.sub(r'[\s]+', ' ', key).replace(' ', ':').replace('.', ':')
  2. Call it in every function that deals with keys, e.g.:
    contains
    set_value
    get_item
    delitem
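The function from step 1 can be tried on its own, outside the namespace manager class. This standalone version drops `self` and adds the `re` import; it is a sketch of the suggested patch, not code from the library.

```python
import re


def format_key(key):
    """Collapse whitespace, then replace spaces and dots with colons."""
    return re.sub(r'\s+', ' ', key).replace(' ', ':').replace('.', ':')


print(format_key("some_function some.dotted.string"))
print(format_key("a.b") == format_key("a b"))
```

One trade-off of this approach is that distinct keys can collapse into the same stored key: "a.b", "a b" and "a:b" all become "a:b", so two different calls could end up sharing a cache entry.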

The above essentially converts the key as follows:
"some_function some.dotted.string" => "some_function:some:dotted:string"

invalidStringData when using as cache

Hi:
I am trying to use mongodb_beaker as a cache for Pylons applications. I am running Pylons 1.0 (not Pyramid). My methods return the usual Python objects, but when I configure Beaker to use the mongodb cache, I get the following error. Is there a limitation or specific requirement on the data type/format returned by the methods that are decorated (using cache_region) in order to use the mongodb cache?

InvalidStringData: strings in documents must be valid UTF-8:

don't encode when depickling

In __init__.py line 507, please remove the encode call. No idea what it's supposed to do there. When pickling, no decode is done, so why should we encode when depickling? Apart from that, it raises an exception anyway (it's a bytes object, which has no such method).

error: README.rst: No such file or directory in easy_install

Easy install is reporting this error...

C:\Program Files\Java\jdk1.6.0_26\jre\lib>easy_install -U mongodb_beaker
install_dir C:\Python27\Lib\site-packages
Searching for mongodb-beaker
Reading http://pypi.python.org/simple/mongodb_beaker/
Reading http://bitbucket.org/bwmcadams/mongodb_beaker/
Reading http://github.com/bwmcadams/mongodb_beaker/
Best match: mongodb-beaker 0.5
Downloading http://pypi.python.org/packages/source/m/mongodb_beaker/mongodb_beak
er-0.5.tar.gz#md5=a8681c2c9892c1c6097070425e3b6881
Processing mongodb_beaker-0.5.tar.gz
Running mongodb_beaker-0.5\setup.py -q bdist_egg --dist-dir c:\temp\easy_install
-ogdyy_\mongodb_beaker-0.5\egg-dist-tmp-8xv7zi
error: README.rst: No such file or directory

'MongoDBNamespaceManager' object has no attribute 'lock_dir'

Hi,

I am running into a few issues. First, installing mongodb_beaker seems to ignore the fact that I'm running in a virtual environment and installs to "build/bdist.win-amd64/egg/mongodb_beaker" when I manually try to install it.

Secondly, we plan on running a distributed MongoDB for sessions, so I would think that a lock file on the local web server would be out of the question (and that Mongo would handle the concurrent writes and whatnot), and so I was not specifying a lock_dir. When I specified a lock dir it seemed to work. But I don't want to use a lock file, and nowhere did I read in the documentation that it is required. Any thoughts?
