adewes / blitzdb

Blitz is a document-oriented database for Python that is backend-agnostic. It comes with a flat-file database for JSON documents and provides MongoDB-like querying capabilities.

Home Page: http://blitzdb.readthedocs.org

License: MIT License

Python 100.00%

blitzdb's People

Contributors

adewes, bryant1410, cbrauge, cmutel, dmytrokyrychuk, epatters, jcollado, jxieeducation, kylewm, leonardola, m4rtink, markperdue, matrixise, programmdesign, quantifiedcode-bot, sfermigier, tktech

blitzdb's Issues

Autogenerate global list of all Object-derived classes

Autogenerate a global list of all classes that are derived from the "Object" base class, so that a backend can load these on demand without the need to register them. This facilitates e.g. the loading of nested objects without having to register the corresponding class first.
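
One way this could work is a hook that records every subclass at class-creation time. The following is only a minimal sketch, assuming Python 3.6+ (`__init_subclass__`; on Python 2 a metaclass would be needed instead). `Object` is a stand-in base class, and `registry` and `class_for` are hypothetical names, not part of blitzdb.

```python
class Object:
    """Stand-in base class; not blitzdb's actual implementation."""

    # Hypothetical global registry: lowercased class name -> class
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass registers itself as soon as it is defined.
        Object.registry[cls.__name__.lower()] = cls


class Movie(Object):
    pass


class Actor(Object):
    pass


def class_for(collection):
    """Look up a class by collection name without explicit registration."""
    return Object.registry[collection]
```

With this in place, a backend could resolve the class of a nested object from its collection name alone, provided the module defining the class has been imported.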

pip install blitzdb failing (version 0.2.5)

Sorry to be hassling you at the moment @adewes

The latest build (0.2.5) on PyPI is currently broken when installing via pip (e.g. pip install blitzdb).

The error I am getting is as follows:

    (venv-flask-blitzdb)puredistortion:projects $ pip install blitzdb
    Downloading/unpacking blitzdb
      Downloading blitzdb-0.2.5.linux-x86_64.tar.gz (81Kb): 81Kb downloaded
      Running setup.py egg_info for package blitzdb
        Traceback (most recent call last):
          File "<string>", line 14, in <module>
        IOError: [Errno 2] No such file or directory: '/Users/dalestirling/Documents/projects/venv-flask-blitzdb/build/blitzdb/setup.py'
        Complete output from command python setup.py egg_info:
        Traceback (most recent call last):

      File "<string>", line 14, in <module>

    IOError: [Errno 2] No such file or directory: '/Users/dalestirling/Documents/projects/venv-flask-blitzdb/build/blitzdb/setup.py'

This only affects version 0.2.5. I was able to install version 0.2.4 (e.g. pip install blitzdb==0.2.4) and it worked fine. See below:

    (venv-flask-blitzdb)puredistortion:flask-blitzdb $ pip install blitzdb==0.2.4
    Downloading/unpacking blitzdb==0.2.4
      Downloading blitzdb-0.2.4.tar.gz
      Running setup.py egg_info for package blitzdb

    Installing collected packages: blitzdb
      Running setup.py install for blitzdb

    Successfully installed blitzdb
    Cleaning up...

Add unique indices

Grüezi Andreas-

Thanks for making Blitz, it is basically perfect for my use case (also scientific computing and data management).

More of a request for comment than an issue, actually.

I have started work on a unique indices branch. I guess this is not really different than the default behavior, as index hash values are stored in a dictionary in any case, but it would be nice to get an error if a duplicate value was inserted (although this would only happen when committed, not when the offending object was saved). But I am not sure how to handle multi-value indices, or what would make sense here. From looking at add_key, you add an index value for each field, and for the list/tuple of values, if supplied. Do you have any thoughts on what would make the most sense here? Maybe unique indices only make sense on an index with a single field?
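
To make the question concrete, here is a rough sketch of one possible answer for multi-field indices: treat the tuple of all indexed fields as the unique key. `UniqueIndex` and `UniqueIndexError` are hypothetical names, the `add_key` signature is simplified and does not match blitzdb's, and indexed values are assumed to be hashable.

```python
class UniqueIndexError(ValueError):
    """Hypothetical error raised when a duplicate value is inserted."""


class UniqueIndex:
    def __init__(self, fields):
        self.fields = tuple(fields)
        self._keys = {}  # tuple of field values -> primary key

    def add_key(self, attributes, pk):
        # The combination of all indexed fields must be unique.
        value = tuple(attributes.get(field) for field in self.fields)
        owner = self._keys.get(value)
        if owner is not None and owner != pk:
            raise UniqueIndexError(
                "duplicate value %r for index on %r" % (value, self.fields))
        # Re-saving the same document under the same pk is allowed.
        self._keys[value] = pk
```

Under this design a single-field unique index is just the special case of a one-element tuple.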

Filter expressions example from README fails for file backend

from blitzdb import Document
from blitzdb.backends.file import Backend

class Movie(Document):
    pass

backend = Backend('_test')
Movie({'name': 'The Godfather', 'year': 1972}).save(backend)
Movie({'name': 'Goodfellas', 'year': 1990}).save(backend)
Movie({'name': 'Star Wars', 'year': 1977}).save(backend)
backend.commit()

movies = backend.filter(Movie, {'year': lambda year: year >= 1970 and year <= 1979})

gives

Traceback (most recent call last):
  File "test.py", line 19, in <module>
    'year': lambda year: year >= 1970 and year <= 1979,
  File "/home/kmahan/projects/blitzdb/blitzdb/backends/file/backend.py", line 586, in filter
    query_set = compiled_query(query_function)
  File "/home/kmahan/projects/blitzdb/blitzdb/backends/file/queries.py", line 38, in _get
    return query_function(key, expression)
  File "/home/kmahan/projects/blitzdb/blitzdb/backends/file/backend.py", line 569, in query_function
    indexes[key].get_keys_for(expression)
  File "/home/kmahan/projects/blitzdb/blitzdb/backends/file/queryset.py", line 42, in __init__
    self.keys = list(keys)
TypeError: 'bool' object is not iterable

The problem seems to be that Index.get_keys_for has value=lambda year: year>=1970..., and calls value(self) on it, but changing the get_keys_for function seemed to break assumptions for a bunch of other query styles.
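
The traceback is consistent with a predicate returning a single bool where an iterable of keys is expected. A minimal sketch of what a fix could look like, using a toy `Index` class (not blitzdb's actual one) that applies a callable to each stored value instead of to the index itself:

```python
class Index:
    """Toy hash index; a simplified stand-in for the file backend's index."""

    def __init__(self):
        self._keys = {}  # indexed value -> set of primary keys

    def add_key(self, value, pk):
        self._keys.setdefault(value, set()).add(pk)

    def get_keys_for(self, value):
        if callable(value):
            # Apply the predicate to every stored value and collect
            # the primary keys of the values that match.
            matches = set()
            for stored, pks in self._keys.items():
                if value(stored):
                    matches |= pks
            return matches
        # Plain value: direct lookup.
        return set(self._keys.get(value, ()))
```

This always returns a set of keys, so wrapping the result in list() cannot fail. Whether it fits the other query styles mentioned above is untested.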

object file not saved

python 2.7.6 & 3.5.32
blitzdb 0.2.12

console commands

from blitzdb import Document, FileBackend
backend = FileBackend('./test-blitz-db')
Document({'name': 'myname'}).save(backend)

no errors.

But the object file isn't saved at all.



~ > ls -Ral test-blitz-db
test-blitz-db:
total 16K
drwxrwxr-x  3 shawn 4.0K May 11 15:49 ./
drwxr-xr-x 59 shawn 4.0K May 11 15:53 ../
-rw-rw-r--  1 shawn  254 May 11 15:49 config.json
drwxrwxr-x  4 shawn 4.0K May 11 15:49 document/

test-blitz-db/document:
total 16K
drwxrwxr-x 4 shawn 4.0K May 11 15:49 ./
drwxrwxr-x 3 shawn 4.0K May 11 15:49 ../
drwxrwxr-x 3 shawn 4.0K May 11 15:49 indexes/
drwxrwxr-x 2 shawn 4.0K May 11 15:49 objects/

test-blitz-db/document/indexes:
total 12K
drwxrwxr-x 3 shawn 4.0K May 11 15:49 ./
drwxrwxr-x 4 shawn 4.0K May 11 15:49 ../
drwxrwxr-x 2 shawn 4.0K May 11 15:49 deb83eae6d5942e5b38f11f4ed13dd2c/

test-blitz-db/document/indexes/deb83eae6d5942e5b38f11f4ed13dd2c:
total 8.0K
drwxrwxr-x 2 shawn 4.0K May 11 15:49 ./
drwxrwxr-x 3 shawn 4.0K May 11 15:49 ../

test-blitz-db/document/objects:
total 8.0K
drwxrwxr-x 2 shawn 4.0K May 11 15:49 ./
drwxrwxr-x 4 shawn 4.0K May 11 15:49 ../

Add BTree Indexing to File Backend

Currently the file-based backend uses a simple hash map to store indexes on disk. This has the drawback that loading and storing the index becomes very expensive when many documents are in a given collection.

Help us to improve the file backend by adding a BTree-based index:

  • Create a subclass derived from TransactionalIndex
  • Redefine the add_key and remove_key so that they use a BTree to instantly store keys in the index

This is a complex task, please feel free to ask questions and post comments and ideas in this thread!
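
As a starting point for discussion, the sketch below shows the interface such an index might offer. Note that it is not a B-tree: it keeps a sorted in-memory list via the standard `bisect` module, which gives ordered lookups and range queries but still has linear-time inserts. `SortedIndex` and `range` are hypothetical names, and the class does not derive from TransactionalIndex.

```python
import bisect


class SortedIndex:
    """Sorted-list stand-in for an ordered (B-tree-like) index."""

    def __init__(self):
        self._entries = []  # sorted list of (value, pk) pairs

    def add_key(self, value, pk):
        entry = (value, pk)
        pos = bisect.bisect_left(self._entries, entry)
        # Insert only if the exact pair is not already present.
        if pos == len(self._entries) or self._entries[pos] != entry:
            self._entries.insert(pos, entry)

    def remove_key(self, value, pk):
        entry = (value, pk)
        pos = bisect.bisect_left(self._entries, entry)
        if pos < len(self._entries) and self._entries[pos] == entry:
            del self._entries[pos]

    def range(self, low, high):
        """Return the pks of all entries with low <= value <= high."""
        # (low,) sorts before every (low, pk) pair.
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for value, pk in self._entries[start:]:
            if value > high:
                break
            result.append(pk)
        return result
```

A real B-tree would keep the same add_key/remove_key interface but store nodes on disk, so that only the touched pages need to be loaded and written.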

Add better tests for special ($) query operators

Currently there are only two tests in tests/test_querying.py that cover various special ($) query operators.

Improve Blitz by adding more specialized tests for the various operators:

$in
$exists
$all
$regex
$ne
$eq
$not
$lt
$gt
$lte
$gte
$and
$or
#...
  1. Create a new file tests/test_query_operators.py
  2. Add appropriate tests for all operators listed above, possibly using existing test data or generating new test data, where appropriate.
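
One possible way to get expected results for such tests is a small reference matcher in plain Python, whose output can be compared against what a backend returns. This is only a sketch covering a subset of the operators; `matches` and `expected` are hypothetical helpers, and the treatment of missing fields (only $ne and $exists: False match) follows MongoDB's behaviour and is an assumption for the file backend.

```python
import operator

COMPARATORS = {
    '$eq': operator.eq, '$ne': operator.ne,
    '$lt': operator.lt, '$lte': operator.le,
    '$gt': operator.gt, '$gte': operator.ge,
}


def matches(doc, key, op, operand):
    """Return True if doc[key] satisfies the given operator."""
    if op == '$exists':
        return (key in doc) == operand
    if key not in doc:
        # MongoDB matches documents lacking the field only for $ne.
        return op == '$ne'
    if op == '$in':
        return doc[key] in operand
    return COMPARATORS[op](doc[key], operand)


def expected(docs, key, op, operand):
    """Names of all documents that should be returned by the query."""
    return [d['name'] for d in docs if matches(d, key, op, operand)]


actors = [
    {'name': 'Marlon Brando'},
    {'name': 'Leonardo di Caprio', 'gross_income_m': 12.453},
    {'name': 'Charlie Chaplin', 'gross_income_m': 0.371},
]
```

A test could then assert that the set of names returned by backend.filter(...) equals set(expected(actors, ...)) for each operator.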

OrderedDict turns to normal dict when document is loaded from database

It looks like Document instance attributes that hold ordered dicts turn into normal dicts when the document is retrieved from the database. Short reproducer:

#!/usr/bin/python3

from collections import OrderedDict
import tempfile
import blitzdb

with tempfile.TemporaryDirectory() as temp_dir_name:
    db = blitzdb.FileBackend(temp_dir_name)
    original_document = blitzdb.Document()
    original_document.od = OrderedDict()
    original_document.od["foo"] = 1
    original_document.od["bar"] = 2
    original_document.od["baz"] = 3

    print("original document")
    print(original_document.od)
    print(isinstance(original_document.od, OrderedDict))

    original_document.save(db)
    db.commit()

    loaded_document = db.get(blitzdb.Document, {})

    print("loaded document")
    print(loaded_document.od)
    print(isinstance(loaded_document.od, OrderedDict))
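
If the documents are stored as JSON, this is the behaviour of a plain json.loads call, which builds ordinary dicts. The standard library can rebuild ordered dicts via the `object_pairs_hook` parameter, as sketched below. Whether blitzdb's serializers expose such a hook is not something I have checked, and note that this would turn every loaded mapping into an OrderedDict, not only the ones that were ordered originally.

```python
import json
from collections import OrderedDict

od = OrderedDict([("foo", 1), ("bar", 2), ("baz", 3)])
data = json.dumps(od)

# Default decoding: the type information is lost.
plain = json.loads(data)

# Decoding with a hook: every JSON object becomes an OrderedDict.
ordered = json.loads(data, object_pairs_hook=OrderedDict)

print(type(plain).__name__, type(ordered).__name__)
```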

Issue with config file being overwritten

Hi, first off, thanks for this project, it's very cool. I have noticed an issue with reads/writes to the config file in multi-threaded situations though.

If you take a look at line 299 here https://github.com/adewes/blitzdb/blob/master/blitzdb/backends/file/backend.py#L299

save_config appears to always get called regardless of the overwrite_config value; as far as I can tell, it gets called every time the config is loaded. Because the issue is sporadic, it's difficult to reproduce with an example, but the symptom is an error message like this:

  File "/home/vagrant/.pyvenv/rcvenv/src/blitzdb-master/blitzdb/backends/file/backend.py", line 111, in __init__
    self.load_config(config, overwrite_config)
  File "/home/vagrant/.pyvenv/rcvenv/src/blitzdb-master/blitzdb/backends/file/backend.py", line 285, in load_config
    self._config = JsonSerializer.deserialize(config_file.read())
  File "/home/vagrant/.pyvenv/rcvenv/src/blitzdb-master/blitzdb/backends/file/serializers.py", line 33, in deserialize
    return json.loads(data.decode('utf-8'))
  File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 361, in raw_decode
    raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)

I'm pretty certain that it is due to the file being written and read at the same time. It should be a simple fix, so I'll issue a PR in the next few days if I can find the time.
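
Independently of when save_config is called, one common way to keep readers from ever seeing a half-written file is to write to a temporary file and rename it into place. A minimal sketch, assuming Python 3.3+ for os.replace; `save_config_atomically` is a hypothetical helper, not blitzdb code.

```python
import json
import os
import tempfile


def save_config_atomically(path, config):
    """Write config as JSON so that readers see the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory, so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            json.dump(config, tmp_file)
        # Replace the target in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This does not address concurrent writers overwriting each other's changes; it only avoids the empty-file read that produces the "Expecting value" error.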

Not python3 friendly

It would be really handy if this was python2 and python3 compatible out of the box.

Set Up Travis Integration

Travis integration is currently non-existent. We should set it up properly so that we can automatically test new builds.

Install on your machine and report issues

Help us to make sure that Blitz installs smoothly on all platforms!

  1. Install blitz using pip: pip install blitzdb
  2. Go through the tutorial in the README or the documentation
  3. Report any issues and bugs you encounter

Merge two databases?

Suppose I've created two different filesystem-backed databases. Is there a preferred, sanctioned way of merging them into one?

Thanks for this, it fills a useful niche!

Python 3.4.1 and blitzdb issue

Testing code:

CODE ---------------------

#!/usr/bin/env

import requests
from blitzdb import Document, FileBackend

API_URL = 'http://api.themoviedb.org/3'
API_KEY = 'ddf3xxxxxxxx0289'

class Actor(Document):
    pass

def get_actor(_id):
    r = requests.get('{}/person/{}?api_key={}'.format(API_URL, str(_id), API_KEY))
    return r.json()

actor_1 = Actor(get_actor(1))
actor_2 = Actor(get_actor(2))

backend = FileBackend("db.blitz")
actor_1.save(backend)
actor_2.save(backend)

print(backend.get(Actor,{'imdb_id' : 'nm0000184'}))
print('\n')
print(backend.get(Actor,{'imdb_id' : 'nm0000434'}))

OUTPUT ---------------------

Warning: cjson could not be imported, CJsonSerializer will not be available.
Traceback (most recent call last):
  File ".\uff.py", line 27, in <module>
    print(backend.get(Actor,{'imdb_id' : 'nm0000184'}))
  File "C:\Python34\lib\site-packages\blitzdb\backends\file\backend.py", line 456, in get
    raise cls.DoesNotExist
blitzdb.document.DoesNotExist: DoesNotExist(Actor)

QUESTION ---------------------

Why does the output say that Actor doesn't exist when I already added it with 'actor_1.save(backend)' and 'actor_2.save(backend)'?

Oh yes, and here is what the call to the API returns:

{"adult":false,"also_known_as":["George Walton Lucas Jr. "],"biography":"Arguably the most important film innovator in the history of the medium, George Lucas continually "pushed the envelope" of filmmaking technology since his early days as a student at U.S.C. Considered a wunderkind by his contemporaries, he had a much harder time communicating his vision to studio executives, whose meddling managed to compromise each of his first three feature directing efforts in some way. The monumental success of "Star Wars" (1977) ushered in the era of the "summer blockbuster," which, despite the later popularity of low budget independent films, was still the prevailing mentality powering the Hollywood engine.\n\nThough he set the tone and established the expectations which influenced studios to devote the bulk of their resources to films designed to blast off into hyperspace for spectacular profits, it was doubtful that a film as revolutionary as "Star Wars" was in its day could get made in the later blockbuster assembly line climate of the new millennium.","birthday":"1944-05-14","deathday":"","homepage":"","id":1,"imdb_id":"nm0000184","name":"George Lucas","place_of_birth":"Modesto - California - USA","popularity":2.185575,"profile_path":"/rJ1zvSeZfge0mHtLnzJn4Mkw18S.jpg"}

$exists still works wrong

Here is a program I used to check $exists query operator behavior:

from blitzdb import FileBackend as Backend
from blitzdb import Document


class Actor(Document):
    pass


if __name__ == '__main__':
    backend = Backend('testdb')
    backend.filter(Actor, {}).delete()
    backend.commit()

    marlon_brando = Actor({'name': 'Marlon Brando'})
    leonardo_di_caprio = Actor({'name': 'Leonardo di Caprio', 'gross_income_m': 12.453})
    david_hasselhoff = Actor({'name': 'David Hasselhoff', 'gross_income_m': 12.453})
    charlie_chaplin = Actor({'name': 'Charlie Chaplin', 'gross_income_m': 0.371})

    backend.save(marlon_brando)
    backend.save(leonardo_di_caprio)
    backend.save(david_hasselhoff)
    backend.save(charlie_chaplin)
    backend.commit()
    print([a.name for a in backend.filter(Actor, {'gross_income_m': {'$exists': True}})])
    print([a.name for a in backend.filter(Actor, {'gross_income_m': {'$exists': False}})])

This should print something like this:

['Charlie Chaplin', 'Leonardo di Caprio', 'David Hasselhoff']
['Marlon Brando']

Actually this script prints this:

['Charlie Chaplin', 'Leonardo di Caprio', 'David Hasselhoff']
['Charlie Chaplin', 'Leonardo di Caprio', 'David Hasselhoff']

Please stop catching errors like this, this, this, this, this and many others. Tests must fail when something works wrong. That is the way tests notify developers that something bad happened.

Change test suite so that it works when MongoDB is not installed

Currently the test suite implicitly assumes that MongoDB is installed. If it's not, some of the tests will fail. We should change the behavior such that it's possible to run the test suite without MongoDB, either automatically (if pymongo is not detected) or by choice (using a flag).
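
The automatic variant could look roughly like the sketch below, which uses only the standard library to detect the driver and skip the affected tests. `HAS_PYMONGO` and `MongoBackendTests` are hypothetical names, and the test body is a placeholder. Note that this only detects the pymongo package, not whether a MongoDB server is actually reachable.

```python
import importlib.util
import unittest

# True if the pymongo package can be imported in this environment.
HAS_PYMONGO = importlib.util.find_spec("pymongo") is not None


@unittest.skipUnless(HAS_PYMONGO, "pymongo is not installed")
class MongoBackendTests(unittest.TestCase):
    def test_placeholder(self):
        # Placeholder body; real tests would exercise the Mongo backend.
        import pymongo
        self.assertTrue(hasattr(pymongo, "MongoClient"))
```

With py.test, pytest.importorskip("pymongo") at the top of a test module achieves a similar effect.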

Add SQL Backend

Currently, Blitz supports a file-based backend and a MongoDB backend. We want to add a SQL backend using SQLAlchemy so that people can use Blitz in conjunction with Postgres, MySQL, SQLite and many other SQL databases!

  • Create a new backend folder: sql
  • Write a SQL backend that implements all necessary functionality (filter, get, delete, ...)
  • Add the backend to the test suite and, if necessary, add additional tests

hidden attribute

An attribute and a method of a class share the same name. Give unique
names to the two objects or else Python will raise a
TypeError: object is not callable error at runtime. Python raises the
error because the name is associated to the attribute, so when you
insert parentheses after the name, Python actually tries to call the
attribute (although you were trying to call the function).

Solutions

Modify the names

Per Python style conventions, the author modified his Rectangle class
by deleting the area method. Whenever a module needs to access the
area attribute of a Rectangle instance, the module can just access
the area attribute directly. This also suppresses the
Attribute hides this method error.

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.area = width * height
    # deleted area method from here

r = Rectangle(3, 4)
print(r.area)  # access the attribute directly now
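
For contrast, the failing pattern described at the top of this issue can be reproduced with a variant of the same class that keeps both the attribute and the method. This is an illustrative sketch, not code from the affected files.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        # The instance attribute shadows the method of the same name.
        self.area = width * height

    def area(self):
        return self.width * self.height


r = Rectangle(3, 4)
try:
    r.area()  # calls the number stored in the attribute, not the method
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```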

SyntaxError when importing Document and/or FileBackend

Is blitzdb compatible with older versions of python? RHEL6 still uses version 2.6.6.

I tried to do the imports shown in https://github.com/adewes/blitzdb#user-content-examples

$ pip show blitzdb

---
Metadata-Version: 1.0
Name: blitzdb
Version: 0.2.12
Summary: A document-oriented database written purely in Python.
Home-page: https://github.com/adewes/blitzdb
Author: Andreas Dewes - 7scientists
Author-email: [email protected]
License: MIT
Location: /usr/lib/python2.6/site-packages
Requires:
$ python -V
Python 2.6.6
$ head -2 test_blitzdb.py
from blitzdb import Document
from blitzdb import FileBackend
$ python test_blitzdb.py
Traceback (most recent call last):
  File "test_blitzdb.py", line 2, in <module>
    from blitzdb import Document
  File "/usr/lib/python2.6/site-packages/blitzdb/__init__.py", line 2, in <module>
    from .backends.file import Backend as FileBackend
  File "/usr/lib/python2.6/site-packages/blitzdb/backends/file/__init__.py", line 1, in <module>
    from blitzdb.backends.file.backend import Backend
  File "/usr/lib/python2.6/site-packages/blitzdb/backends/file/backend.py", line 13, in <module>
    from blitzdb.backends.file.index import (
  File "/usr/lib/python2.6/site-packages/blitzdb/backends/file/index.py", line 105
    self._undefined_keys = {key : True for key in undefined_values}
                                         ^
SyntaxError: invalid syntax
$
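
The line flagged in the traceback is a dict comprehension, a syntax that was only added in Python 2.7, which is why Python 2.6.6 rejects it at import time. For reference, an equivalent form that also parses on 2.6 is sketched below; `undefined_values` is filled with made-up sample data here.

```python
undefined_values = ["a", "b", "c"]  # hypothetical sample data

# Dict comprehension: requires Python 2.7 or later.
modern = {key: True for key in undefined_values}

# Equivalent construction that also parses on Python 2.6.
compatible = dict((key, True) for key in undefined_values)
```

Other parts of the package may use further 2.7-only features, so this alone may not make it importable on 2.6.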

Attributes being deleted on .save()

I have a piece of code that does the following:

        lsheet, rsheet = db.get(Sheet, {'_id': left['sheetId']}), \
                         db.get(Sheet, {'_id': right['sheetId']})
        print('saving %s with primary key %s.' % (lsheet.title, lsheet._id))
        db.save(lsheet)
        db.save(rsheet)
        print('saved.')

At the moment the first line is printed, the lsheet object has an _id (the pk is called _id in this class, that's working correctly in other parts of the script). But then the call fails with

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    db = make_db('data/20160422.json')
  File "/home/fiatjaf/comp/hack-json/db.py", line 105, in make_db
    db.save(lsheet)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/file/backend.py", line 459, in save
    serialized_attributes = self.serialize(obj.attributes)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 159, in serialize
    output_obj[str(key) if convert_keys_to_str else key] = serialize_with_opts(value, embed_level=embed_level)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 150, in <lambda>
    serialize_with_opts = lambda value,*args,**kwargs : self.serialize(value,*args,convert_keys_to_str = convert_keys_to_str,autosave = autosave,for_query = for_query, **kwargs)
  File "/home/fiatjaf/compk/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 161, in serialize
    output_obj = list(map(lambda x: serialize_with_opts(x, embed_level=embed_level), obj))
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 161, in <lambda>
    output_obj = list(map(lambda x: serialize_with_opts(x, embed_level=embed_level), obj))
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 150, in <lambda>
    serialize_with_opts = lambda value,*args,**kwargs : self.serialize(value,*args,convert_keys_to_str = convert_keys_to_str,autosave = autosave,for_query = for_query, **kwargs)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/base.py", line 175, in serialize
    obj.save(self)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/document.py", line 390, in save
    return backend.save(self)
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/backends/file/backend.py", line 456, in save
    if hasattr(obj, 'pre_save') and callable(obj.pre_save):
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/document.py", line 175, in __getattribute__
    self.revert()
  File "/home/fiatjaf/comp/hack-json/venv/lib/python3.4/site-packages/blitzdb/document.py", line 427, in revert
    raise self.DoesNotExist("No primary key given!")
blitzdb.document.DoesNotExist: DoesNotExist(Record)

The error happens at https://github.com/adewes/blitzdb/blob/master/blitzdb/document.py#L490. I put some print() calls there and inside the def pk(self) method and discovered that the object was empty. A call to self.keys() returns an empty list, that's why self.pk returns None. I don't know at which time the object becomes empty and couldn't find it myself.

I must say that basic saving, fetching and filtering is working well in many other parts of the script.

Useless code in tests

I've found a lot of code in tests which should be reviewed.
There are a lot of places like this:

    # Test with null elements
    try:
        query = {'appearances': {'$lt': jackie_chan.appearances}}
        assert len(backend.filter(Actor, query)) == len([])
    except NameError:
        pass
    # Test with null elements

That assertion will never be executed, because jackie_chan is not defined and the except clause swallows the resulting NameError. This particular piece of code does nothing.
I think we should define jackie_chan somewhere (but not insert it into the database) and remove the try/except statement.

There are also a lot of places like this:

    # Test with illegal values
    try:
        query = {'gross_income_m': {'$lt': math.sqrt(-1)}}
        assert len(backend.filter(Actor, query)) == len([])
    except ValueError:
        pass
    # Test with illegal values

math.sqrt(-1) raises ValueError, so the assertion will never be executed, just like the one in the previous example.
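
The effect is easy to demonstrate in isolation. In the sketch below, the `reached` flag stands in for the assertion, and `sqrt_rejects_negative` is a hypothetical helper showing the error check as its own explicit step.

```python
import math

reached = False
try:
    value = math.sqrt(-1)  # raises ValueError immediately
    reached = True         # stands in for the assertion: never executed
except ValueError:
    pass


def sqrt_rejects_negative():
    """Check the error explicitly instead of hiding it next to an assert."""
    try:
        math.sqrt(-1)
    except ValueError:
        return True
    return False
```

In the real tests, the query itself would then be exercised separately with a value that can actually be constructed.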

I'm going to review the tests in order to find useless places like the ones I described above and fix them.

Support for ujson

Hi,
ujson is recognized as a really fast JSON encoder/decoder. Would it be possible to choose the JSON encoder/decoder library via an argument?

Syntax for Keyword Searching (LIKE in SQL)

What is the syntax for filtering by wildcard using a FileBackend?

Ex.
I have a Dog document class that has one object persisted with the properties of
{ 'name':'fido', 'color':'brown', 'owner':'Joe Smith' }
I want to be able to filter through all of the persisted dog objects with a wildcard like so,
dogs = backend.filter(Dog, { 'owner': { '$LIKE': 'Joe%' } })
and have the object specified above to be returned in an array.

Is this already supported in any fashion? If not, then is there any movement to support this?
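
A possible workaround: $regex appears in the list of query operators in another issue on this tracker, so a LIKE pattern could be translated into a regular expression first. The sketch below does the translation with the standard re module. `like_to_regex` is a hypothetical helper, and whether the file backend accepts a pattern string in exactly this query form is an assumption.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern ('%' and '_' wildcards) to a regex."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")   # any run of characters
        elif char == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(char))
    return "^" + "".join(parts) + "$"


# Query in the style of the example above, using $regex instead of $LIKE.
query = {"owner": {"$regex": like_to_regex("Joe%")}}
```

The query would then be passed as backend.filter(Dog, query).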

As promised, here is the failed test

Hi Andreas,

When I execute the tests with py.test, here is the output.

platform darwin -- Python 2.7.8 -- py-1.4.25 -- pytest-2.6.3
collected 165 items

blitzdb/tests/test_autoload.py ..
blitzdb/tests/test_dbref_includes.py ....
blitzdb/tests/test_documents.py .....
blitzdb/tests/test_exceptions.py ..
blitzdb/tests/test_file_stores.py .
blitzdb/tests/test_hooks.py ........
blitzdb/tests/test_query_operators.py ...................................F............
blitzdb/tests/test_querying.py ....................................................................
blitzdb/tests/test_sorting.py ....
blitzdb/tests/test_transactions.py ................
blitzdb/tests/test_update.py .......

=============================================================================== FAILURES ===============================================================================
____________________________________________________________________________ test_ne[mongo] ____________________________________________________________________________

backend = <blitzdb.backends.mongo.backend.Backend object at 0x10276cc90>

    def test_ne(backend):
        # DB setup
        backend.filter(Actor, {}).delete()

        marlon_brando = Actor({'name': 'Marlon Brando', 'gross_income_m': 1.453, 'appearances': 78, 'is_funny': False, 'birth_year': 1924})
        leonardo_di_caprio = Actor({'name': 'Leonardo di Caprio', 'gross_income_m': 12.453, 'appearances': 34, 'is_funny': 'it depends', 'birth_year': 1974})
        david_hasselhoff = Actor({'name': 'David Hasselhoff', 'gross_income_m': 12.453, 'appearances': 173, 'is_funny': True, 'birth_year': 1952})
        charlie_chaplin = Actor({'name': 'Charlie Chaplin', 'gross_income_m': 0.371, 'appearances': 473, 'is_funny': True, 'birth_year': 1889})

        backend.save(marlon_brando)
        backend.save(leonardo_di_caprio)
        backend.save(david_hasselhoff)
        backend.save(charlie_chaplin)

        backend.commit()
        assert len(backend.filter(Actor, {})) == 4
        # DB setup

        # Test with normal conditions
        query = {'name': {'$ne': charlie_chaplin.name}}
        assert len(backend.filter(Actor, query)) == len([marlon_brando, leonardo_di_caprio, david_hasselhoff])
        # Test with normal conditions

        # Test with empty list
        query = {'name': {'$ne': []}}
        assert len(backend.filter(Actor, query)) == len([marlon_brando, charlie_chaplin, leonardo_di_caprio, david_hasselhoff])
        # Test with empty list

        # Test with list
        query = {'name': {'$ne': [marlon_brando.name, charlie_chaplin.name]}}
>       assert len(backend.filter(Actor, query)) == len([leonardo_di_caprio, david_hasselhoff, charlie_chaplin, marlon_brando])
E       assert 3 == 4
E        +  where 3 = len(<blitzdb.backends.mongo.queryset.QuerySet object at 0x102767210>)
E        +    where <blitzdb.backends.mongo.queryset.QuerySet object at 0x102767210> = <bound method Backend.filter of <blitzdb.backends.mongo.backend.Backend object at 0x10276cc90>>(Actor, {'name': {'$ne': ['Marlon Brando', 'Charlie Chaplin']}})
E        +      where <bound method Backend.filter of <blitzdb.backends.mongo.backend.Backend object at 0x10276cc90>> = <blitzdb.backends.mongo.backend.Backend object at 0x10276cc90>.filter
E        +  and   4 = len([Actor({'is_funny': 'it depends', 'gross_income_m': 12.453, 'name': 'Leonardo d...ppearances': 34, 'pk': '1f686fd660f4..._m': 1.453, 'name': 'Marlon Brando', 'appearances': 78, 'pk': '2b17c74f124a457dbb737919e1cff55e', 'birth_year': 1924})])

blitzdb/tests/test_query_operators.py:513: AssertionError
================================================================ 1 failed, 164 passed in 21.30 seconds =================================================================

getting error when using MongoBackend

I was glad to find out that blitzdb offers a MongoDB wrapper so I tried it out. Unfortunately I couldn't get it to work

>>> backend = blitzdb.MongoBackend('mongo://127.0.0.1:27017/', True)
>>> doc = blitzdb.Document({'name': 'Shawn'})
>>> doc.save(backend)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Users\shawn\AppData\Local\Programs\Python\Python35-32\lib\site-packages\blitzdb-0.2.12-py3.5.egg\blitzdb\document.py", line 449, in save
    return backend.save(self)
  File "c:\Users\shawn\AppData\Local\Programs\Python\Python35-32\lib\site-packages\blitzdb-0.2.12-py3.5.egg\blitzdb\backends\mongo\backend.py", line 151, in save
    return self.save_multiple([obj])
  File "c:\Users\shawn\AppData\Local\Programs\Python\Python35-32\lib\site-packages\blitzdb-0.2.12-py3.5.egg\blitzdb\backends\mongo\backend.py", line 144, in save_multiple
    self.db[collection].save(attributes)
TypeError: string indices must be integers

I'm sure my MongoDB server is running fine. I suppose the usage with the MongoDB backend should be the same. There isn't much documentation on Read the Docs. Am I doing something wrong?
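
A note on the traceback: the failing expression is self.db[collection], and this TypeError is what Python raises when a str is indexed with another str. That suggests the backend kept the URL string as its db and expected a database object instead, though this is only an inference from the traceback. The error itself can be reproduced without MongoDB; the collection name below is made up.

```python
db = 'mongo://127.0.0.1:27017/'  # a plain string, as passed in the report
collection = 'document'          # hypothetical collection name

try:
    db[collection]               # indexing a str with a str
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```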

Update the documentation website with the latest changes

I spent ages trying to fix the meta class primary key variable (pk instead of primary_key) because the documentation on the website was incorrect. I began preparing a pull request only to find that the documentation in the git repo itself was already fixed on the 15th of March!

Please update the readthedocs.org website for Blitzdb for future users.

Add Support for Raw Documents & Collections

Currently Blitz uses an object-oriented document schema by default: In order to store and retrieve documents, you need to define a class derived from the Document class.

Make it possible to use Blitz in a non-object-oriented way:

  • Write a polymorphic version of the save function that accepts a collection name and a Python dictionary instead of a Document instance
  • Make the get and filter functions work with collection names instead of Document classes
  • Possibly add a Collection class that represents a given collection in a database (like MongoDB, useful for inserting, searching etc.)
  • Change the delete function so that it works with Python dictionaries

Please feel free to discuss, provide feedback and ask questions in this thread!

Update: I started working on this in the raw_interface branch. Feel free to check out the code there and contribute to it.

Some of the tests fail sometimes due to "bad" random data

Some of the tests in the test suite generate random data using the fakefactory library. Sometimes, the generated data is such that some of the tests fail, e.g. test_list_query in test_querying.py.

We should prepare the test data such that no test will randomly fail due to bad data.
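
One simple option is to seed the random generator so that every run sees the same data. The sketch below uses only the standard random module rather than the fake-data library; `make_actors` and its fields are hypothetical.

```python
import random


def make_actors(count, seed=42):
    """Generate reproducible test documents from a fixed seed."""
    rng = random.Random(seed)  # private generator; global state untouched
    return [
        {"name": "actor-%d" % i, "appearances": rng.randint(1, 500)}
        for i in range(count)
    ]
```

A failing data set then fails on every run and can be debugged, instead of showing up sporadically.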

Improve Tests

Blitz has an extensive test suite that provides unit tests for most of the functionality. Help us to improve the test suite by:

  • Documenting and reviewing currently existing tests
  • Improving the organization of the test suite
  • Filling in the holes by adding tests for functionality that is not covered
  • Improving the speed of the test suite

Missing dependency info on PyPI

The six module is missing when installing in a clean virtualenv (py34). I think the dependency info seems to be missing from the PyPI package.

(tmp) % pip install blitzdb
Collecting blitzdb
Installing collected packages: blitzdb
Successfully installed blitzdb-0.2.12
(tmp) % python
Python 3.4.3 (default, Jan  2 2016, 11:36:17) 
[GCC 5.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import blitzdb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jgn/.virtualenvs/tmp/lib/python3.4/site-packages/blitzdb/__init__.py", line 1, in <module>
    from .document import Document
  File "/home/jgn/.virtualenvs/tmp/lib/python3.4/site-packages/blitzdb/document.py", line 8, in <module>
    import six
ImportError: No module named 'six'
>>> 
(tmp) % pip install six
Collecting six
  Using cached six-1.10.0-py2.py3-none-any.whl
Installing collected packages: six
Successfully installed six-1.10.0
(tmp) % python         
Python 3.4.3 (default, Jan  2 2016, 11:36:17) 
[GCC 5.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import blitzdb
>>> 

FileBackend.filter() does not support the sort_by argument

I have been using this feature for a project and have noticed that when I build the query

db.filter(data, {}, sort_by='seq')

it returns a QuerySet object in the same order as if I had used the query

db.filter(data, {})

This is true even when I intentionally save the records of the Document class in a non-sequential order.

I tested using this code:

from blitzdb import FileBackend, Document

db = FileBackend('this.db')

class data(Document):
  pass


db.begin()
a = data({'seq': 3})
b = data({'seq': 5})
c = data({'seq': 1})
d = data({'seq': 4})
e = data({'seq': 2})

db.save(a)
db.save(b)
db.save(c)
db.save(d)
db.save(e)

db.commit()

# Filter 1
db.filter(data, {})[0]['seq']
# v0.2.4 return: 3
# fix return: 3

# Filter 2
db.filter(data, {}).sort('seq')[0]['seq']
# v0.2.4 return: 1
# fix return: 1

# Filter 3
from blitzdb.queryset import QuerySet as BaseQS
db.filter(data, {}).sort('seq', BaseQS.DESCENDING)[0]['seq']
# v0.2.4 return: 5
# fix return: 5

# Filter 4
db.filter(data, {}, sort_by='seq')[0]['seq']
# v0.2.4 return: 3 
# fix return: 1

I have built a fix for this and will open a pull request. I have also added tests to test_sorting.py.
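
The behaviour expected from sort_by can be sketched on plain dictionaries. This is not the fix from the pull request and not blitzdb code. The function filter_sorted is a hypothetical stand-in that only illustrates that passing sort_by should give the same order as calling sort() on the result.

```python
def filter_sorted(documents, sort_by=None, descending=False):
    """Return the documents, ordered by `sort_by` when it is given (hypothetical)."""
    results = list(documents)  # keep insertion order by default
    if sort_by is not None:
        results.sort(key=lambda doc: doc[sort_by], reverse=descending)
    return results


# Same insertion order as the records saved in the report above.
docs = [{'seq': n} for n in (3, 5, 1, 4, 2)]
unsorted_first = filter_sorted(docs)[0]['seq']
sorted_first = filter_sorted(docs, sort_by='seq')[0]['seq']
```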

SqlBackend doesn't support querying without indexes, but it also can't create indexes

I'm using the SqlBackend with sqlite. I was able to populate the database, but when I tried to run a query, I got the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "...\lib\site-packages\blitzdb\backends\sql\backend.py", line 1408, in filter
    compiled_query = compile_query(collection,query)
  File "...\lib\site-packages\blitzdb\backends\sql\backend.py", line 1405, in compile_query
    raise AttributeError("Query over non-indexed field %s in collection %s!" % (key,collection))
AttributeError: Query over non-indexed field analyses in collection word!

I then tried to create an index but got the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "...\lib\site-packages\blitzdb\backends\sql\backend.py", line 1071, in create_index
    self.db[collection].ensure_index(*args, **kwargs)
AttributeError: 'Backend' object has no attribute 'db'

Add SQL Backend support

Add support to store and retrieve documents from relational databases such as Postgres or MySQL.

How to do this:

  • The table layout will be such that the serialized JSON document gets stored in a BLOB field, and all indexes on values in that document will get stored in additional, indexable columns that have to be defined beforehand.
  • Querying is only possible using indexed fields.
  • When updating documents, indexed values will get updated automatically.
  • When specifying indexes, a type must be given (e.g. text, string, double, int, ...). When trying to store a document with an invalid index value, an exception will get raised.

This way of indexing and storing documents makes it possible to perform advanced queries on them, and even to use SQL JOIN operators to enrich documents of a given collection with data from other collections, based on some index value in the document.
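
The layout described above can be sketched with the standard library's sqlite3 module. This is an illustration of the proposal under stated assumptions, not the blitzdb SQL backend. The table name documents, the INDEXES mapping, and the functions create_table, save and filter_by are all hypothetical.

```python
import json
import sqlite3

# Hypothetical index definition: document field -> SQL column type.
INDEXES = {"name": "TEXT", "age": "INTEGER"}
PY_TYPES = {"TEXT": str, "INTEGER": int}


def create_table(conn):
    """One BLOB column for the JSON document plus one column per index."""
    columns = ", ".join('"%s" %s' % item for item in INDEXES.items())
    conn.execute(
        "CREATE TABLE documents (pk INTEGER PRIMARY KEY, data BLOB, %s)" % columns
    )
    for field in INDEXES:
        conn.execute('CREATE INDEX "idx_%s" ON documents ("%s")' % (field, field))


def save(conn, pk, document):
    """Store the document; index columns are rewritten on every save."""
    values = []
    for field, sql_type in INDEXES.items():
        value = document.get(field)
        if value is not None and not isinstance(value, PY_TYPES[sql_type]):
            raise TypeError("Invalid value for index %s: %r" % (field, value))
        values.append(value)
    placeholders = ", ".join("?" for _ in range(len(values) + 2))
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (%s)" % placeholders,
        [pk, json.dumps(document).encode("utf-8")] + values,
    )


def filter_by(conn, field, value):
    """Query on an indexed field only; other fields are rejected."""
    if field not in INDEXES:
        raise AttributeError("Query over non-indexed field %s" % field)
    rows = conn.execute('SELECT data FROM documents WHERE "%s" = ?' % field, (value,))
    return [json.loads(row[0].decode("utf-8")) for row in rows]
```

Because the index columns are ordinary SQL columns, they could also appear in a JOIN condition against another collection's table, which is the enrichment case mentioned above.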

QuerySet.next() raises AttributeError

I get the following AttributeError when calling the next method on a QuerySet:

docs = db.filter(Document, {})
d = docs.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/env/py2/lib/python2.7/site-packages/blitzdb/backends/file/queryset.py", line 24, in next
    if self._i >= len(self):
AttributeError: 'QuerySet' object has no attribute '_i'

I am running Python 2.7 from Miniconda by Continuum Analytics.
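
One plausible cause, which is an assumption and not confirmed by the traceback alone, is that the cursor attribute _i is only created when iteration starts, so a direct call to next() finds it missing. The sketch below shows the general pattern of initializing the cursor in the constructor. CursorList is a hypothetical class, not the blitzdb QuerySet.

```python
class CursorList(object):
    """Hypothetical list-backed result set with a safe next() method."""

    def __init__(self, items):
        self._items = list(items)
        self._i = 0  # created here, so next() works without a prior iter()

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        self._i = 0  # restart from the beginning on each new iteration
        return self

    def __next__(self):
        if self._i >= len(self):
            raise StopIteration
        item = self._items[self._i]
        self._i += 1
        return item

    next = __next__  # Python 2 spelling of the same method
```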

Improve Documentation

Help us to improve the documentation of BlitzDB!

  • Read the documentation and give us feedback on how understandable and helpful it is, and how we can improve it
  • If you're a native English speaker, correct typos and "broken English"
  • Write a tutorial yourself! If you use Blitz for a particular use case and think it could be relevant for other users as well, summarize what you've done in a small tutorial.
