
schematics / schematics

Python Data Structures for Humans™.

Home Page: http://schematics.readthedocs.org/

License: Other

Python 100.00%
python validation datastructures types schema serialization deserialization

schematics's Introduction

Schematics

Python Data Structures for Humans™.

About

Project documentation: https://schematics.readthedocs.io/en/latest/

Schematics is a Python library to combine types into structures, validate them, and transform the shapes of your data based on simple descriptions.

The internals are similar to ORM type systems, but there is no database layer in Schematics. Instead, we believe that building a database layer is easily made when Schematics handles everything except for writing the query.

Schematics can be used for tasks where having a database involved is unusual.

Some common use cases:

Example

This is a simple Model.

>>> from schematics.models import Model
>>> from schematics.types import StringType, URLType
>>> class Person(Model):
...     name = StringType(required=True)
...     website = URLType()
...
>>> person = Person({'name': u'Joe Strummer',
...                  'website': 'http://soundcloud.com/joestrummer'})
>>> person.name
u'Joe Strummer'

Serializing the data to JSON.

>>> import json
>>> json.dumps(person.to_primitive())
'{"name": "Joe Strummer", "website": "http://soundcloud.com/joestrummer"}'

Let's try validating without a name value, since it's required.

>>> person = Person()
>>> person.website = 'http://www.amontobin.com/'
>>> person.validate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "schematics/models.py", line 231, in validate
    raise DataError(e.messages)
schematics.exceptions.DataError: {'name': ['This field is required.']}

Add the field and validation passes.

>>> person = Person()
>>> person.name = 'Amon Tobin'
>>> person.website = 'http://www.amontobin.com/'
>>> person.validate()
>>>

Testing & Coverage support

Run coverage and check the missing statements:

$ coverage run --source schematics -m py.test && coverage report

schematics's People

Contributors

alexanderdean, bdickason, bintoro, chadrik, christopheryoung, cmonfort, gabisurita, gennady-andreyev, gone, hkage, jaysonsantos, jmsdnns, johannth, jokull, justinabrahms, kaiix, kracekumar, kstrauser, lkraider, martinhowarth, meantheory, rooterkyberian, ryanolson, seanoc, st0w, st4lk, talos, titusz, tommyzli, wraziens

schematics's Issues

required=True/False is too binary ;-)

A fairly common feature of RESTful APIs is that there are fields which are expected on Read (GET) which aren't settable on Create (POST or maybe PUT). Examples of these fields are auto-increment IDs and datestamps. To deal with this in Dictshield, I've had to create two documents for each RESTful resource:

class BaseRepresentation(Document):
    id = URLField(required=True)
    createdAt = DateTimeField(required=True)
    updatedAt = DateTimeField(required=True)

class SalesShipment(BaseRepresentation):
    <list of fields>

class NewSalesShipment(Document):
    <identical list of fields>

Note that NewSalesShipment does not inherit from BaseRepresentation - because the id, createdAt and updatedAt fields shouldn't exist in a new sales shipment.

The above works fine but it is obviously less than DRY for long lists of fields. Also having a set of resources called New* looks a bit lame!

Just riffing here but maybe an alternative would be something like:

id = URLField(required=["GET","DELETE"])
...
shipment.validate("POST") # No id is fine

Just a thought! Feel free to close as not a bug, but thought it was worth raising.
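The proposal above can be sketched without DictShield at all. This is a minimal illustration in plain Python; the `Field` class, the `validate` function, and the list-valued `required` argument are all hypothetical stand-ins, not DictShield's API.

```python
class Field(object):
    """Hypothetical field: `required` lists the HTTP methods that need a value."""
    def __init__(self, required=()):
        self.required = frozenset(required)


def validate(fields, data, method):
    """Return the names of fields required for `method` but missing from `data`."""
    return sorted(
        name for name, field in fields.items()
        if method in field.required and data.get(name) is None
    )


SHIPMENT_FIELDS = {
    "id": Field(required=["GET", "DELETE"]),
    "cost": Field(required=["GET", "POST"]),
}

new_shipment = {"cost": 0.92}
missing_on_post = validate(SHIPMENT_FIELDS, new_shipment, "POST")  # no id is fine
missing_on_get = validate(SHIPMENT_FIELDS, new_shipment, "GET")    # id is required
```

With one field list per resource, the New* duplicate classes would no longer be needed; whether the context should be an HTTP method or something more abstract is a separate design choice.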

EmbeddedDocumentField doesn't allow a set to None

Hi, from the source I think this is a feature and not a bug, but I'd like a workaround. Passing required=False to an EmbeddedDocumentField allows it to be omitted when creating the model, but trying to set it to None later fails.

From the setter in source we can see that this is the intended behaviour:

def __set__(self, instance, value):
        if value is None:
            return
        if not isinstance(value, self.document_type):
            value = self.document_type(**value)
        instance._data[self.field_name] = value

So how can we set an EmbeddedDocumentField to None once it has been assigned a value?

Thanks
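One possible shape for a setter that lets None clear the value, as a sketch only. The `Embedded` descriptor, `Address`, and `Customer` below are hypothetical plain-Python stand-ins that mimic the quoted setter; they are not DictShield classes.

```python
class Embedded(object):
    """Hypothetical descriptor: unlike the setter quoted above, None clears the value."""
    def __init__(self, document_type, field_name):
        self.document_type = document_type
        self.field_name = field_name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._data.get(self.field_name)

    def __set__(self, instance, value):
        if value is not None and not isinstance(value, self.document_type):
            value = self.document_type(**value)  # build the document from a mapping
        instance._data[self.field_name] = value  # None is stored, clearing the field


class Address(object):
    def __init__(self, city=None):
        self.city = city


class Customer(object):
    address = Embedded(Address, "address")

    def __init__(self):
        self._data = {}


c = Customer()
c.address = {"city": "London"}
city_before = c.address.city
c.address = None  # clears the previously assigned value
```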

ListField(required=True) doesn't invalidate

I have the following Document:

class Article(Document):
    url = URLField(required=True)
    pub_date = DateTimeField(required=True, default=datetime.datetime.utcnow)
    authors = ListField(ObjectIdField(required=True), required=True)

and the test:

article1 = Article(url=ARTICLE_URL, pub_date=datetime.strptime("21/11/12 21:00", "%d/%m/%y %H:%M"))
self.assertRaises(ShieldException, article1.validate)

The test fails because validate() does not raise an exception.

required=True doesn't allow to set fields to None

Hello, I get "ShieldException: Required field missing - cover:None", while my cover is defined as follows:
cover = StringField(default=None, required=True)

The error comes from the following snippet in document.py, around line 310:

# treat empty strings is nonexistent
if value is not None and value != '':
    try:
        field._validate(value)
    except (ValueError, AttributeError, AssertionError):
        raise ShieldException('Invalid value', field.field_name,
                              value)
elif field.required:
    raise ShieldException('Required field missing',
                          field.field_name,
                          value)

I understand the rationale for treating empty strings as nonexistent, but I just want to make sure I don't miss any field required by my schema. It is OK to have keys set to None or null in my MongoDB. I want some fields to be "required" mainly to avoid a triple state in JSON documents: 1. the key is set and its value is not null, 2. the key is set and its value is null, 3. the key is not set. I want a stricter schema that treats the third case as an error.

So I suggest introducing a new keyword argument, something like 'allow_empty', which would only make sense when required=True and, when set to True, would allow None and '' as valid values.
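A rough sketch of how the three states could be told apart. `check_field` is a hypothetical helper written for this illustration, not a DictShield function, and `allow_empty` is the name suggested above.

```python
def check_field(data, name, required=False, allow_empty=False):
    """Return None if the value is acceptable, otherwise an error message.

    A missing key (state 3) is always an error for a required field;
    None or '' (state 2) is an error only when allow_empty is False.
    """
    if name not in data:
        return "Required field missing" if required else None
    value = data[name]
    if (value is None or value == "") and required and not allow_empty:
        return "Required field missing"
    return None


doc = {"cover": None}
strict = check_field(doc, "cover", required=True)                      # None is rejected
lenient = check_field(doc, "cover", required=True, allow_empty=True)   # None is accepted
absent = check_field({}, "cover", required=True, allow_empty=True)     # key must exist
```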

DateTimeField does not serialize properly when used in a ListField EmbeddedDocument

The following code snippet is using the most recent commit of dictshield, 995ba29

I suspect you're already aware of this issue, based on the note in commit 0558c82 but I wanted to enter something for tracking purposes and others' awareness.

import datetime

from dictshield.document import Document, EmbeddedDocument
from dictshield.fields import DateTimeField, EmbeddedDocumentField, ListField

class Item(EmbeddedDocument):
    due = DateTimeField()

class Container(Document):
    items = ListField(EmbeddedDocumentField(Item))

i = Item(due=datetime.datetime.utcnow())
c = Container(items=[i,])

print '------ Item -------'
print i.to_python()
print i.to_json()

print '----- Container -----'
print c.to_python()
print c.to_json()

Output ::
------ Item -------
{'_types': ['Item'], 'due': datetime.datetime(2011, 7, 25, 21, 24, 46, 510683), '_cls': 'Item'}
{"_types": ["Item"], "due": "2011-07-25T21:24:46.510683", "_cls": "Item"}
----- Container -----
{'_types': ['Container'], 'items': [{'_types': ['Item'], 'due': datetime.datetime(2011, 7, 25, 21, 24, 46, 510683), '_cls': 'Item'}], '_cls': 'Container'}
Traceback (most recent call last):
  File "ex.py", line 21, in <module>
    print c.to_json()
  File "/Users/st0w/.virtualenvs/obts-tracking/lib/python2.6/site-packages/dictshield/base.py", line 443, in to_json
    return json.dumps(data)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 306, in _iterencode
    for chunk in self._iterencode_list(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 204, in _iterencode_list
    for chunk in self._iterencode(value, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 317, in _iterencode
    for chunk in self._iterencode_default(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 323, in _iterencode_default
    newobj = self.default(o)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 344, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: datetime.datetime(2011, 7, 25, 21, 24, 46, 510683) is not JSON serializable
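The traceback shows a datetime nested inside a list reaching json.dumps() unconverted. As a library-independent sketch of the missing step, the conversion has to recurse into lists and dicts before encoding. `to_primitive` here is a hypothetical helper using only the standard library, not DictShield's implementation.

```python
import datetime
import json


def to_primitive(value):
    """Recursively convert datetimes inside dicts and lists to ISO 8601 strings."""
    if isinstance(value, datetime.datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return dict((k, to_primitive(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return [to_primitive(v) for v in value]
    return value


container = {"items": [{"due": datetime.datetime(2011, 7, 25, 21, 24, 46, 510683)}]}
encoded = json.dumps(to_primitive(container))  # no TypeError: datetimes are strings now
```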

id fields don't subclass correctly

The following code does not honor the id field via subclassing. Comment or uncomment the id field in SalesShipment in the code snippet below to demonstrate the bug.

#!/usr/bin/env python

from dictshield.base import ShieldException
from dictshield.document import Document
from dictshield.fields import (URLField,
                               DateTimeField,
                               FloatField,
                               StringField)

class BaseRepresentation(Document):
    id = URLField(required=True)
    createdAt = DateTimeField(required=True)
    updatedAt = DateTimeField(required=True)

class SalesShipment(BaseRepresentation):
    id = URLField(required=True)
    cost = FloatField(required=True, min_value=0)
    currency = StringField(required=True)


data = {
  "id" : "http://localhost:8080/sales-shipments/21e5bde8-f7e9-11e0-be50-0800200c9a66",
  "currency" : "GBP",
  "updatedAt" : "2011-10-28T19:39:22.783271",
  "createdAt" : "2011-10-11T08:39:30",
  "cost" : 0.92
}


shipment = SalesShipment(**data)
try:
    shipment.validate()
except ShieldException, se:
    print 'ShieldException caught: %s' % se

print "This shipment cost %f %s" % (shipment.cost, shipment.currency)

Documentation has to be updated

Example Uses
There are a few ways to use DictShield. A simple case is to create a class structure that has typed fields. DictShield offers multiple types in fields.py, like an EmailField or DecimalField.

There is no fields.py file; all field implementations now live in the fields directory (fields/__init__.py, fields/base.py, fields/temporal.py, and fields/mongo.py). An earlier version may have had a single fields.py, but the docs need to be updated to match the current layout.

Postpone validation until validate() is called

Example:

from dictshield.document import Document
from dictshield.fields import StringField, DateTimeField

class BlogPost(Document):
    title = StringField(max_length=40)
    body = StringField(max_length=4096)
    dt = DateTimeField()

data = {
    'title': 'aaa',
    'body': 'bbb',
    'dt': 'ccc'
}

#bp = BlogPost(**data)
bp = BlogPost()
bp.title = data['title']
bp.body = data['body']
bp.dt = data['dt']
bp.validate()

Gives me:

Traceback (most recent call last):
  File "test1.py", line 19, in <module>
    bp.dt = data['dt']
  File "/Users/up/.virtualenvs/test/lib/python2.7/site-packages/dictshield/fields/base.py", line 561, in __set__
    value = DateTimeField.iso8601_to_date(value)
  File "/Users/up/.virtualenvs/test/lib/python2.7/site-packages/dictshield/fields/base.py", line 586, in iso8601_to_date
    date_info = elements[0]
IndexError: list index out of range

and validate() is never reached.
The problem with this approach is that I have to wrap this code in try blocks covering every possible exception plus ShieldException on validate(), or catch a generic Exception, which is a bad idea. I also have to catch ValueError because the internal implementation of DateTimeField uses datetime.datetime(). The same applies to ObjectId with its InvalidId exception, and possibly to other *Fields.

>>> date_digits = [1000, 1000, 1000, 1000, 1000, 1000, 1000,]
>>> datetime.datetime(*date_digits)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: month must be in 1..12

My guess is that __set__() should just store the value on the object, and validation should be postponed until validate() is actually called. Let the user call validate() manually, and have it throw only ShieldException. If a user wants to know how the date is converted from its string representation to a datetime object, introduce a separate _check() or _convert() that exposes the specific error (ValueError and IndexError in the example case).

In other words:
If a user doesn't care what exactly is wrong with data coming from, for example, a web form, they just wrap model.validate() in try ... except ShieldException, not in handlers for every possible exception or a wildcard Exception. If a user wants to validate each field separately, they call validate() on each field (which wraps all data-conversion exceptions) and catch only ShieldException. If a user wants to catch a specific low-level data-conversion error, they call _check() or _convert() on each field object (or _check_all() on the model?), which validate() also uses and wraps internally. Actually, I have no idea who needs these specific exceptions; a ShieldException saying that the model or field is wrong is enough, so I propose keeping _check() or _convert() private.
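The proposal can be sketched in plain Python. Everything here is hypothetical: `ValidationError` stands in for ShieldException, and this `DateTimeField` is a simplified descriptor written for the example that only accepts one fixed timestamp format.

```python
import datetime


class ValidationError(Exception):
    """Stand-in for ShieldException in this sketch."""


class DateTimeField(object):
    """Hypothetical field: assignment only stores; conversion happens in validate()."""
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value  # no conversion, so this never raises

    def validate(self, value):
        if isinstance(value, datetime.datetime):
            return value
        try:
            return datetime.datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
        except (TypeError, ValueError) as exc:
            # Low-level conversion errors surface as a single exception type.
            raise ValidationError("%s: %s" % (self.name, exc))


class BlogPost(object):
    dt = DateTimeField("dt")

    def validate(self):
        self.dt = BlogPost.dt.validate(self.dt)


bp = BlogPost()
bp.dt = "ccc"  # accepted silently; nothing is checked yet
try:
    bp.validate()
    caught = None
except ValidationError as exc:
    caught = exc
```

The caller only ever handles one exception type, and a bad value cannot abort the code path before validate() is reached.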

Coerce IntField

Hello,

This may be a feature rather than a bug, but I found that IntField will not coerce a string to an int, while UUIDField will coerce a string to a UUID:

class Book(Document):
    uuid = UUIDField() # assign a string, it will create a UUID
    title = StringField(max_length=60)
    year = IntField() # assign a string... won't cast to int

Maybe I'm missing something. If so, how do I enforce an int?

EmbeddedDocumentField does not work when not in a ListField

I believe this may be related to issue #5, however I don't have permission to re-open it and I felt it may be different enough to warrant creation of a new issue.

Currently, EmbeddedDocumentFields only work when used in the context of a ListField. When attempting to embed directly in an object, an error is generated upon instantiating an instance of said object. Here's the most simple example for comparison:

from dictshield.document import EmbeddedDocument,Document
from dictshield.fields import StringField, EmbeddedDocumentField, ListField

class Action(EmbeddedDocument):
    name = StringField(required=True, max_length=256)

class TaskL(Document):
    action = ListField(EmbeddedDocumentField(Action))

class Task(Document):
    action = EmbeddedDocumentField(Action)

a = Action(name='Phone call')

tl = TaskL()
t=Task()

Upon running this, an error is generated when Python reaches the last line:

Traceback (most recent call last):
  File "test-simple.py", line 16, in <module>
    t=Task()
  File "/Users/st0w/.virtualenvs/obts-tracking/lib/python2.6/site-packages/dictshield/base.py", line 312, in __init__
    setattr(self, attr_name, value)
  File "/Users/st0w/.virtualenvs/obts-tracking/lib/python2.6/site-packages/dictshield/fields.py", line 458, in __set__
    value = self.document_type(**value)
TypeError: DocumentMetaclass object argument after ** must be a mapping, not NoneType

Add support for MongoDB BinaryField

Here's what works for me:

class BinaryField(BaseField):
    def __init__(self, subtype=None, **kwargs):
        self.subtype = subtype
        super(BinaryField, self).__init__(**kwargs)

    def __set__(self, instance, value):
        if isinstance(value, (str, unicode)):
            kwargs = {}
            if not self.subtype is None:
                kwargs = {'subtype': self.subtype}
            value = pymongo.binary.Binary(value, **kwargs)

        instance._data[self.field_name] = value

    def _jsonschema_type(self):
        return 'string'

    def for_python(self, value):
        try:
            return pymongo.binary.Binary(value)
        except Exception, e:
            raise ShieldException('Invalid Binary', self.field_name, value)

    def for_json(self, value):
        return str(value)

    def validate(self, value):
        if not isinstance(value, pymongo.binary.Binary):
            try:
                kwargs = {}
                if not self.subtype is None:
                    kwargs = {'subtype': self.subtype}
                value = pymongo.binary.Binary(value, **kwargs)
            except Exception, e:
                raise ShieldException('Invalid Binary', self.field_name, value)
        return value

Another documentation clarification

The documentation doesn't make clear that make_json_publicsafe and make_json_ownersafe take the document as an argument, e.g:

data = customer.make_json_publicsafe(customer)

(Out of interest, why is it necessary to pass the customer back into the function as an argument?)

'dict' object has no attribute 'to_json' marshalling error for complex Documents

There's an issue with make_json_*safe for complex representations. Here is the error:

Traceback (most recent call last):
  File "./scratchpad.py", line 205, in <module>
    data = rep.make_json_ownersafe(rep)
  File "/dictshield/document.py", line 183, in make_json_ownersafe
    white_list=white_list)
  File "/dictshield/dictshield/document.py", line 151, in make_safe
    doc_dict[k] = doc_converter(v)
  File "/dictshield/dictshield/document.py", line 177, in <lambda>
    doc_converter = lambda d: d.make_json_ownersafe(doc_encoder(d), encode=False)
  File "/dictshield/dictshield/document.py", line 183, in make_json_ownersafe
    white_list=white_list)
  File "/dictshield/dictshield/document.py", line 156, in make_safe
    doc_dict[k] = field_converter(k, v)
  File "/dictshield/dictshield/document.py", line 175, in <lambda>
    field_converter = lambda f, v: cls._fields[f].for_json(v)
  File "/dictshield/dictshield/fields/base.py", line 604, in for_json
    return value.to_json(encode=False)
AttributeError: 'dict' object has no attribute 'to_json'

There's a gist with a complete test script for this problem here: https://gist.github.com/1427099

Design question: why does Document.validate() raise an exception for one particular field?

The Document.validate() method loops through all fields (in no predictable order, as _fields is just a regular dict) and raises a ShieldException for the first error it finds, which means when you validate() a model you have no idea how many of the fields are invalid. An obvious use case where this isn't ideal is form validation - the user filling out the form can only be alerted of one error when there might be many (unless you explicitly validate each field one by one, but that defeats the point of having a Document-level validate() method)

Is there a benefit to the current behavior that I'm missing?
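For comparison, the DataError in the current Schematics README example above carries a dict of messages keyed by field name. A minimal sketch of that collect-then-raise pattern follows, using hypothetical names (`ValidationErrors`, `validate_all`, and the two validators) rather than any library API.

```python
class ValidationErrors(Exception):
    """Hypothetical exception carrying one message per invalid field."""
    def __init__(self, errors):
        super(ValidationErrors, self).__init__(errors)
        self.errors = errors


def validate_all(validators, data):
    """Run every validator and raise once with all failures, keyed by field name."""
    errors = {}
    for name, validator in validators.items():
        try:
            validator(data.get(name))
        except ValueError as exc:
            errors[name] = str(exc)  # keep going instead of stopping at the first error
    if errors:
        raise ValidationErrors(errors)


def non_empty(value):
    if not value:
        raise ValueError("This field is required.")


def positive(value):
    if value is None or value <= 0:
        raise ValueError("Must be a positive number.")


form = {"title": "", "price": -1}
try:
    validate_all({"title": non_empty, "price": positive}, form)
    collected = {}
except ValidationErrors as exc:
    collected = exc.errors
```

Because the errors are keyed by field name, the iteration order of the field dict no longer affects what the caller sees.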

Ability to call make_*safe() routines directly on EmbeddedDocuments

It would seem a good idea to have the ability to call the make_*safe() routines directly on EmbeddedDocument objects. As a use case, consider where the possible options for EmbeddedDocument values are stored in a table in a database, and then used to generate the EmbeddedDocument objects at runtime, which are then used within the embedding objects.

One might desire to present a list of such possible values for an EmbeddedDocument utilizing the make_json_publicsafe() method to sanitize the results. However as it currently stands, this cannot be done because the make_*safe() routines are part of the Document class.

Alternatively, one may wish to present only the values of a particular EmbeddedDocument. However as it currently stands, the only method is to make_publicsafe() on the Python object, and then use json.dumps() since the result of make_publicsafe() is a dict rather than a Document object.

Could they be moved to the BaseDocument class so all derived classes have access, or would that cause problems with how they're currently called on embedded documents?

Feature request: add global for underscored internal fields only

To work with the new permissions system I'm having to override the _internal_private_fields in a few places:

class BaseRepresentation(Representation):
    """The fields shared by all existing representations
    """
    _internal_fields = ['_id', '_cls', '_types'] # Exclude id as this is an actual field

    id = UUIDField(required=True)
    created_at = DateTimeField(required=True)
    updated_at = DateTimeField(required=True)

class RepresentationLink(EmbeddedDocument):
    """A RepresentationLink holds the ID of and HATEOAS
       path to an individual Representation
    """
    _internal_fields = ['_id', '_cls', '_types'] # Exclude id as this is an actual field

    id = UUIDField(required=True)
    link = EmbeddedDocumentField(AtomLink)

    def __repr__(self):
        return "<RepresentationLink(%s, %s)>" % (self.id, self.link)

This works okay but it's a bit fragile - if DictShield introduces a new internal field (e.g. _timestamp), then my code will start exposing this field incorrectly. One solution would be a DictShield global like:

UNDERSCORE_ONLY = ['_id', '_cls', '_types']

And then I could override with:

_internal_fields = UNDERSCORE_ONLY

I imagine that this will be a very common use case - because id is a very common public field in JSONs.

Add an enum field

Is there something similar to an enum field in dictshield? I understand there is no enum in Python, but it would still be nice to be able to limit the range of possible values for a field. Something similar to this:

class mydocument(Document):
    title = StringField(max_length=60, required=True)
    collection = EnumField(values=["books", "magazines","papers"])

I'm willing to implement it if I can get some good advice from you.
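As a starting point, the core of such a field is a membership check. This `EnumField` is a hypothetical plain-Python sketch that does not subclass any DictShield base class, so integrating it would still require the library's own field hooks.

```python
class EnumField(object):
    """Hypothetical field that accepts only one of a fixed set of values."""
    def __init__(self, values):
        self.values = tuple(values)

    def validate(self, value):
        if value not in self.values:
            raise ValueError("%r is not one of %r" % (value, self.values))
        return value


collection = EnumField(values=["books", "magazines", "papers"])
ok = collection.validate("books")
try:
    collection.validate("films")
    rejected = False
except ValueError:
    rejected = True
```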

Rename EmbeddedDocument to EmbeddableDocument to reflect it works as a Document too

The examples on embedded documents imply that a typical representation will inherit either from Document or from EmbeddedDocument, depending on whether the item is a document itself or embedded within a larger document.

In fact in a RESTful API, every Document is typically also an EmbeddedDocument - because GET "/orders/1" yields an individual order Document, but GET "/orders" will return a list of all orders, which I can define like this in DictShield:

class OrderWrapper(Document):
    orders = ListField(EmbeddedDocumentField(Order))

It turns out that it's fine in DictShield to define a representation as both a document and an embedded document, like this:

class Order(Document, EmbeddedDocument):
    <blah>

I wasn't expecting this to be okay, because Document and EmbeddedDocument sound like alternatives (unlike, say, Document and EmbeddableMixin)...

Maybe something should be added to the documentation to make it clear that it's possible to make a Document "embeddable" by mixing in EmbeddedDocument?

Support for naming convention transforms

Naming convention transforms are a common feature in serialisation libs (e.g. Java Jackson) and ORMs (e.g. Python SQLAlchemy and Scala Squeryl). The basic idea is to support different naming conventions (e.g. camelCase, snake_case) expressed in the JSON/SQL table/whatever, without having to breach the style guide of the host language.

At the moment with Dictshield, if I have a JSON which contains updatedAt and createdAt, then I need to rename my Python fields to match - I can't use updated_at and created_at.

The Squeryl way of supporting this is really DRY:

// Auto-translate Scala camelCase field names into database lower_underscore field names
override def columnNameFromPropertyName(n:String) =
  SquerylNamingConventionTransforms.camelCase2LowerUnderscore(n)

The SQLAlchemy conversion functions are in model_generator.py

Jerkson (Scala Jackson) uses a @JsonSnakeCase annotation per-field, powered by a snakeCase method.
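In Python the two transforms are small enough to sketch with the standard library. The function names below are made up for illustration, and the regular expression only handles simple camelCase names (no runs of capitals such as "HTTPServer").

```python
import re


def camel_to_snake(name):
    """Convert e.g. 'updatedAt' to 'updated_at'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_to_camel(name):
    """Convert e.g. 'updated_at' to 'updatedAt'."""
    parts = name.split("_")
    return parts[0] + "".join(part.title() for part in parts[1:])


def rename_keys(data, transform):
    """Return a copy of `data` with `transform` applied to every key."""
    return dict((transform(key), value) for key, value in data.items())


payload = {"updatedAt": "2011-10-28T19:39:22.783271", "createdAt": "2011-10-11T08:39:30"}
python_side = rename_keys(payload, camel_to_snake)     # keys follow the Python style guide
round_trip = rename_keys(python_side, snake_to_camel)  # back to the JSON convention
```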

for_json() attempting to apply isoformat() to a string

This is with the latest version in pip. Test script:

#!/usr/bin/env python

from dictshield.document import Document, EmbeddedDocument
from dictshield.base import UUIDField
from dictshield.fields import DateTimeField, EmbeddedDocumentField
from datetime import datetime

class TestRepresentation(EmbeddedDocument):
    id = UUIDField(required=True)
    created_at = DateTimeField(required=True)

    def __init__(self, id, created_at):
        super(EmbeddedDocument, self).__init__() # Need to call Document constructor
        self.id = id
        self.created_at = created_at

class RootRepresentation(Document):
    test = EmbeddedDocumentField(TestRepresentation, required=True)

    def __init__(self, test):
        super(Document, self).__init__() # Need to call Document constructor
        self.test = test

test = TestRepresentation("c3491590-1ce9-11e1-8bc2-0800200c9a66", datetime.now())
print test.make_json_ownersafe(test)

root = RootRepresentation(test)
print root.make_json_ownersafe(root)

Output:

{"created_at": "2011-12-02T14:28:31.239244"}
Traceback (most recent call last):
  <snip>
  File "/usr/local/lib/python2.7/dist-packages/dictshield/document.py", line 176, in <lambda>
    field_converter = lambda f, v: cls._fields[f].for_json(v)
  File "/usr/local/lib/python2.7/dist-packages/dictshield/fields/base.py", line 362, in for_json
    v = DateTimeField.date_to_iso8601(value)
  File "/usr/local/lib/python2.7/dist-packages/dictshield/fields/base.py", line 351, in date_to_iso8601
    iso_dt = dt.isoformat()
AttributeError: 'str' object has no attribute 'isoformat'

make_json_ownersafe doesn't traverse list of EmbeddedDocumentFields

>>> class Test(EmbeddedDocument):
...     text = StringField()
...
>>> class Tester(Document):
...     items = ListField(EmbeddedDocumentField(Test))
...
>>> t = Tester(items=[Test(text='mytest')])
>>> Tester.make_json_ownersafe(t)
'{"items": [{"_types": ["Test"], "text": "mytest", "_cls": "Test"}]}'

Without the ListField wrapping the embedded documents, it works just fine.

Ability to manually override field names

At the moment there is no way to unmarshal the following JSON using DictShield:

{
  "code" : "GBP",
  "self" : "http://localhost:8080/currencies/GBP" # Replacing with "link": works fine
}

... because the self key can't be defined as a field in a Document subclass. And similarly it's hard to work with the following:

{
    "customer" : "Bob",
    "email" : "[email protected]",
    "link" : {
      "rel" : "self",
      "href" : "http://localhost:8080/customer/2",
      "type" : "text/xml"
    },
    "link" : {
      "rel" : "next",
      "href" : "http://localhost:8080/customer/3",
      "type" : "text/xml"
    },
    "link" : {
      "rel" : "prev",
      "href" : "http://localhost:8080/customer/1",
      "type" : "text/xml"
    },
}

... because there are three elements with the same name.

It would be nice to have some sort of Jackson @JsonProperty-style naming override, like this:

self_link = URLField(property_name="self", required=True)

self_atom = EmbeddedDocumentField(AtomLink, property_name="link")
next_atom = EmbeddedDocumentField(AtomLink, property_name="link")
prev_atom = EmbeddedDocumentField(AtomLink, property_name="link")
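Until something like that exists, the repeated "link" keys can at least be preserved at parse time. A plain json.loads() keeps only the last duplicate, but the standard library's object_pairs_hook sees every pair. In this sketch the `keep_links` hook and the "<rel>_atom" naming are assumptions modelled on the field names proposed above.

```python
import json

raw = '''
{
  "customer": "Bob",
  "link": {"rel": "self", "href": "http://localhost:8080/customer/2"},
  "link": {"rel": "next", "href": "http://localhost:8080/customer/3"},
  "link": {"rel": "prev", "href": "http://localhost:8080/customer/1"}
}
'''


def keep_links(pairs):
    """Build a dict, renaming each repeated "link" key to "<rel>_atom"."""
    result = {}
    for key, value in pairs:
        if key == "link" and isinstance(value, dict) and "rel" in value:
            key = "%s_atom" % value["rel"]
        result[key] = value
    return result


# The hook runs for every JSON object, nested ones first.
customer = json.loads(raw, object_pairs_hook=keep_links)
last_only = json.loads(raw)  # default behaviour: later duplicates overwrite earlier ones
```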

Reason why UUIDField is in base.py not fields.py?

It looks like the original commit was into fields.py but somewhere along the line the UUIDField got moved into base.py - is there a reason why? Maybe a comment could be added to the code if it's important it stays there.

Can't pass "id" to a document class

I get this error:

  File "/usr/local/lib/python2.6/dist-packages/dictshield/document.py", line 262, in __init__
    setattr(self, attr_name, attr_value)
  File "/usr/local/lib/python2.6/dist-packages/dictshield/fields/base.py", line 161, in __set__
    value = uuid.UUID(value)
  File "/usr/lib/python2.6/uuid.py", line 134, in __init__
    raise ValueError('badly formed hexadecimal UUID string')

Add method to return just fields and values

When using a model to build a SQL query (and possibly other use cases), it's common to want to iterate over just the fields and their values in the dictionary.

A method that excluded key/value pairs that are not a field in the model (like _cls, _types, etc.) would be helpful.
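A sketch of what such a method might do, written as a free function over a plain dict. `field_items` and `INTERNAL_KEYS` are hypothetical names; the key list is assumed from the `_id`, `_cls`, and `_types` entries that appear in other issues on this page.

```python
INTERNAL_KEYS = ("_id", "_cls", "_types")  # assumed set of internal bookkeeping keys


def field_items(data, internal=INTERNAL_KEYS):
    """Yield (name, value) pairs in name order, skipping internal keys."""
    for name, value in sorted(data.items()):
        if name not in internal:
            yield name, value


doc = {"_cls": "Item", "_types": ["Item"], "name": "widget", "cost": 0.92}
columns = list(field_items(doc))  # only real fields remain, ready for a query builder
```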

Can't load UUIDField

For some odd reason I can't seem to import the UUIDField. This is with the latest (master) version of DictShield:

>>> import dictshield
>>> from dictshield.fields import StringField
>>> from dictshield.base import UUIDField
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name UUIDField

Design question: is field deletion avoided on purpose?

I'm considering hooking dictshield in with an ongoing project, and noticed the lack of __del__ methods on dictshield.fields.*. Is this a deliberate design decision? Or is it more "haven't needed it yet, why complexify unnecessarily"?

FWIW, I'd like to log destructive accesses to fields (__set__ and __del__ if it appears) for some crude STM.
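Note that in Python's descriptor protocol the deletion hook is `__delete__`; `__del__` is the object finalizer and is not called by `del obj.attr`. A minimal sketch of a logging descriptor follows, where `LoggedField` and `Doc` are hypothetical classes and not part of dictshield.

```python
class LoggedField(object):
    """Hypothetical descriptor that records every write and delete."""
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        self.log.append(("set", self.name, value))
        instance._data[self.name] = value

    def __delete__(self, instance):
        self.log.append(("delete", self.name))
        instance._data.pop(self.name, None)


access_log = []


class Doc(object):
    title = LoggedField("title", access_log)

    def __init__(self):
        self._data = {}


d = Doc()
d.title = "draft"
del d.title  # routed to LoggedField.__delete__
```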

Extending Document with __metaclass__

I want to extend my Model with a metaclass. Is that possible?

    class ModelMetaClass(document.TopLevelDocumentMetaclass):
        def __new__(cls, name, bases, attrs):
            super_new = super(ModelMetaClass, cls).__new__
            klass = super_new(cls, name, bases, attrs)

            klass._collection_name = cls.__name__.lower()

            return klass

    class Model(Document, TimeStamped):
        __metaclass__ = ModelMetaClass

I get this error:

Traceback (most recent call last):
  File "/Users/up/t/tests/test_models.py", line 25, in test_timestamped
    class Model(Document, TimeStamped):
  File "/Users/up/t/models/tests/test_models.py", line 21, in __new__
    klass = super_new(cls, name, bases, attrs)
  File "/Users/up/.virtualenvs/dictshield/lib/python2.7/site-packages/dictshield/document.py", line 195, in __new__
    for field_name, field in klass._fields.items():
AttributeError: type object 'Model' has no attribute '_fields'

make_json_*safe() functions do not call for_json() to clean field values

Problem in the master branch - consider e.g:

@classmethod
def make_json_publicsafe(cls, doc_dict_or_dicts):
    """Trims the object using make_publicsafe and dumps to JSON
    """
    trimmed = cls.make_publicsafe(doc_dict_or_dicts)
    return json.dumps(trimmed)

There is no calling of for_json() for each field prior to dumping, unlike with e.g. base.py's to_json():

def to_json(self, encode=True):
    """Return data prepared for JSON. By default, it returns a JSON encoded
    string, but disabling the encoding to prevent double encoding with
    embedded documents.
    """
    fun = lambda f, v: f.for_json(v)
    data = self._to_fields(fun)
    if encode:
        return json.dumps(data)
    else:
        return data

Example of the problem:

#!/usr/bin/env python

from dictshield.base import ShieldException
from dictshield.document import Document
from dictshield.fields import DateTimeField
from datetime import datetime

class TestRepresentation(Document):
    created_at = DateTimeField(required=True)

test = TestRepresentation()
test.created_at = datetime.now()

try:
    test.validate()
except ShieldException, se:
    print "Narcolepsy validation error: %s" % se

data = test.make_json_publicsafe(test)

Throws error:

  File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 192, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: datetime.datetime(2011, 11, 30, 18, 31, 45, 842086) is not JSON serializable
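
Until the fields are cleaned with for_json() before dumping, one workaround is to give `json.dumps` a `default` callback. This is only a stdlib sketch of that idea, not DictShield's fix; `json_default` is a hypothetical helper name and the dictionary stands in for the trimmed document.

```python
import json
from datetime import datetime


def json_default(value):
    """Fallback for json.dumps: convert datetimes to ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    # Anything else is still unsupported, so fail the same way json does.
    raise TypeError(repr(value) + " is not JSON serializable")


# Stand-in for the dictionary returned by make_publicsafe().
trimmed = {"created_at": datetime(2011, 11, 30, 18, 31, 45, 842086)}
encoded = json.dumps(trimmed, default=json_default)
```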

add validation to minimized_field_name

Two fields should not be able to share the same minimized name; a clash should throw an exception at initialization time.

Currently, the exception is cryptic and does not point to the root cause.
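
A rough sketch of such a check, assuming the mapping from field name to minimized name has already been collected while the class is built. The function name `check_minimized_names` and the shape of its argument are hypothetical.

```python
from collections import defaultdict


def check_minimized_names(minimized_by_field):
    """Raise ValueError if two fields share one minimized name."""
    fields_by_short = defaultdict(list)
    for field_name, short_name in minimized_by_field.items():
        fields_by_short[short_name].append(field_name)

    # Keep only the minimized names claimed by more than one field.
    clashes = {
        short: sorted(names)
        for short, names in fields_by_short.items()
        if len(names) > 1
    }
    if clashes:
        raise ValueError("duplicate minimized field names: %r" % clashes)


check_minimized_names({"title": "t", "year": "y"})  # no clash, passes silently
```

Naming the clashing fields in the message is the point of the exercise: the error then identifies the root cause directly.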

Error when using make_json_publicsafe() on Document with EmbeddedDocumentField

Sample code:

from dictshield.document import Document, EmbeddedDocument
from dictshield.fields import EmbeddedDocumentField, IntField, StringField

class Status(EmbeddedDocument):
    _public_fields = ('status_id',
                      'name',)

    status_id = IntField(required=True, min_value=1)
    name = StringField(required=True, max_length=64)


class StudySubject(Document):
    _public_fields = ('subj_id',
                      'status',)

    subj_id = IntField(required=True, min_value=1)
    status = EmbeddedDocumentField(Status, required=True)

stat = Status(name='ON STUDY', status_id=2)

subj = StudySubject(
    status=stat,
    subj_id=123,
)

print subj.to_python()
print subj.make_publicsafe(subj)
print subj.make_json_publicsafe(subj)

Resulting output:

{'status': <Status: Status object>, '_types': ['StudySubject'], '_cls': 'StudySubject', 'subj_id': 123}
{'status': <Status: Status object>, 'subj_id': 123}
Traceback (most recent call last):
  File "./test-status.py", line 30, in <module>
    print subj.make_json_publicsafe(subj)
  File "/Users/st0w/.virtualenvs/obts-tracking/lib/python2.6/site-packages/dictshield/document.py", line 147, in make_json_publicsafe
    return json.dumps(trimmed)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 317, in _iterencode
    for chunk in self._iterencode_default(o, markers):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 323, in _iterencode_default
    newobj = self.default(o)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/json/encoder.py", line 344, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <Status: Status object> is not JSON serializable

JSON field called "self" throws DictShield error

Test code:

#!/usr/bin/env python

from dictshield.base import ShieldException
from dictshield.document import Document
from dictshield.fields import StringField

data = {
  "code" : "GBP",
  "self" : "http://localhost:8080/currencies/GBP" # Replacing with "link": works fine
}

class Currency(Document):
    code = StringField(required=True)

curr = Currency(**data)
try:
    curr.validate()
except ShieldException, se:
    print 'ShieldException caught: %s' % se

print "This currency's code is %s" % curr.code

Error:

Traceback (most recent call last):
  File "./scratchpad.py", line 14, in <module>
    curr = Currency(**data)
TypeError: __init__() got multiple values for keyword argument 'self'
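
The collision can be reproduced without DictShield: any `__init__` that takes field values as keyword arguments will reject a key called "self". The sketch below uses two made-up classes (`KwargsDocument` and `MappingDocument`) to show the failure and one possible way around it, which is to pass the mapping as a single positional argument.

```python
class KwargsDocument:
    # Mirrors the failing pattern: field values arrive as keyword arguments.
    def __init__(self, **values):
        self.values = values


class MappingDocument:
    # Alternative: take the raw mapping as one positional argument.
    def __init__(self, values=None):
        self.values = dict(values or {})


data = {"code": "GBP", "self": "http://localhost:8080/currencies/GBP"}

try:
    KwargsDocument(**data)
    kwargs_failed = False
except TypeError:
    kwargs_failed = True  # the "self" key collides with the bound instance

doc = MappingDocument(data)  # the "self" key is now ordinary data
```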

demos/diff_obj_id.py fails

The demos/diff_obj_id.py demo throws the following error:

SimpleDoc:
Traceback (most recent call last):
  File "demos/diff_obj_id.py", line 29, in <module>
    print 'SimpleDoc:', sd.to_python()
  File "/Users/talos/.virtualenvs/caustic/lib/python2.7/site-packages/dictshield/document.py", line 424, in to_python
    data = self._to_fields(fun)
  File "/Users/talos/.virtualenvs/caustic/lib/python2.7/site-packages/dictshield/document.py", line 406, in _to_fields
    data[field.uniq_field] = field_converter(field, value)
  File "/Users/talos/.virtualenvs/caustic/lib/python2.7/site-packages/dictshield/document.py", line 423, in <lambda>
    fun = lambda f, v: f.for_python(v)
  File "/Users/talos/.virtualenvs/caustic/lib/python2.7/site-packages/dictshield/fields/mongo.py", line 42, in for_python
    raise ShieldException('Invalid ObjectId', self.field_name, value)
dictshield.base.ShieldException: Invalid ObjectId - id:4edef13acc50ff07e4000000

ObjectIdField appears to never validate properly once it is assigned.

Unexpected behavior with field named `id`

It seems that DictShield's internal use of the field named id for tracking ObjectIds can cause some confusion if someone attempts to create their own field named id.

from dictshield.document import Document
from dictshield.fields import IntField

class StudySubject(Document):
   id = IntField(required=True, min_value=1)

s = StudySubject()
s.id = -1
print 'ID: %d' % s.id
s.validate()

Output shows ID: -1, but no exception is thrown for the negative value. This appears to be due to a conflict on the field name id in the following block of code in dictshield.base:

            if field.id_field:
                current_id = new_class._meta['id_field']
                if current_id and current_id != field_name:
                    raise ValueError('Cannot override id_field')

                new_class._meta['id_field'] = field_name
                # Make 'Document.id' an alias to the real primary key field
                new_class.id = field

        if not new_class._meta['id_field']:
            new_class._meta['id_field'] = 'id' # <-- Here be dragons
            new_class._fields['id'] = ObjectIdField(uniq_field='_id') # <-- Here too
            new_class.id = new_class._fields['id']

In this case, the resulting id field that is created is of type <dictshield.base.ObjectIdField>. So when the user tries to validate() the document, the validation code for ObjectIdField is called - and not, as might be expected in this example case, validation for IntField.

It would seem to me that DictShield should protect its own internal id tracking variable, or throw an exception if the user tries to create a field named id, or at least display some kind of warning. This could lead to significant user confusion, as id is a very commonly used field name.

Understandably, the ability to control an object's ID is very powerful and beneficial for the end-developer and as such it should remain in place. Perhaps a developer wishing to override OID handling should have to do so via a specific parameter passed to BaseDocument.__init__(), say for example id_field as it currently stands? If this were done and DictShield tracked internal OIDs via a protected parameter, then users could still use id as a field name without the conflict.

Ability to set/validate a root key in the JSON

At the moment, DictShield produces JSON without a root key, as in the following example:

class Media(Document):
    """Simple document that has one StringField member
    """
    title = StringField(max_length=40)

producing:

{
    '_types': ['Media'],
    '_cls': 'Media',
    'title': u'Misc Media'
}

However, many RESTful APIs (see e.g. the Shopify API) wrap the JSON in a root key, like so:

{
  "article": {
    "created_at": "2008-07-31T20:00:00-04:00",
    "body_html": "<p>Do <em>you</em> have an <strong>IPod</strong> yet?</p>",
    <yada>
  }
}

This behaviour isn't something JSON really needs, but it appears in a lot of APIs to achieve document-equivalence with XML (which always has a root key).

You can model a root key in DictShield by making your Document an EmbeddedDocument and wrapping it in a Document which just holds the root key. But it would be nice to have that functionality pre-rolled somehow, either declaratively on each Document class definition or imperatively via the make_json_*safe() functions and similar methods.

A few notes:

  • This would nicely complement #22 to customise the root key name (e.g. "MovieMedia" -> "movie-media")
  • Here's how the equivalent functionality works (badly!) in Java Jackson: Use class name as root key for JSON Jackson serialization
  • Any root key setting should be ignored when a class is being used as an EmbeddedDocument
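
To make the request concrete, here is a small sketch of the wrapping step as a standalone helper. It works on plain dictionaries and does not use any DictShield API; `root_key_for` and `wrap_in_root` are hypothetical names, and deriving the key from the class name ("MovieMedia" to "movie-media") is only one possible convention.

```python
import json
import re


def root_key_for(class_name):
    """Convert a CamelCase class name to a hyphenated key."""
    # Insert a hyphen before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", class_name).lower()


def wrap_in_root(class_name, data):
    """Nest an already-serializable dict under its root key."""
    return {root_key_for(class_name): data}


wrapped = wrap_in_root("MovieMedia", {"title": "Misc Media"})
encoded = json.dumps(wrapped)
```

Keeping the wrapping outside the document itself means an embedded document is never wrapped by accident, which matches the last note above.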

Documentation typos/clarifications

print 'ShieldException caught: %s' % (se))

^ Additional bracket on end

json_string = request.get_arg('data')
user_input = json.loads(json_string)
u.validate(**user_input)

^ u not defined above. (Also: "This method builds a User instance out of the input, which also throws away keys that aren't in the User definition." - how does the above code know to build a User instance rather than say a Media document? I can't see any reference to User.)

Add FileField

Add support for storing information and validating file uploads using DictShield models.

to_json() can't serialize a Document that has an ObjectIdField

I'm using DictShield with a MongoDB backend and trying things out with a basic user:

    class User(Document):
        username = StringField(max_length=50)

    class UserProfile(Document):
        owner = ObjectIdField()
        city = StringField(max_length=50)

I create the user and save it to Mongo, which returns the ObjectId of the user (i.e. ObjectId('4e1b2d46d1f5ce5af4000000')). Then I create the user profile with owner set to the ObjectId of the new user. If I try user_profile.to_json() I get the error "can't serialize objectId" (line 407 of base.py).

I worked around the issue by returning unicode(value) from ObjectIdField.for_python (line 141 in base.py), but I'm not sure this is the correct solution.

Could you confirm, or let me know what I'm doing wrong?

Thanks.

PS: this happens with the ObjectId from both base.py and bson.py.

i18n support

Any i18n support planned for DictShield? I'd be glad to help.
