bogdanp / anom-py
An ndb-like object mapper for Google Cloud Datastore.
Home Page: https://anom.defn.io
License: BSD 3-Clause "New" or "Revised" License
When calling get_multi() with a lot of keys (probably around 100), it fails with this error:
Traceback (most recent call last):
File "/env/lib/python3.7/site-packages/webapp2.py", line 604, in dispatch
return method(*args, **kwargs)
...
File "/srv/my_handler.py", line 231, in handle
entities = anom.get_multi(keys)
File "/env/lib/python3.7/site-packages/anom/model.py", line 716, in get_multi
entities_data, entities = adapter.get_multi(keys), []
File "/env/lib/python3.7/site-packages/anom/adapters/datastore_adapter.py", line 139, in get_multi
request_keys.remove(key)
TypeError: unhashable type: 'Entity'
It seems an Entity somehow got into the missing list.
I have verified that it works fine if I just loop through the keys and .get() them individually (though that is very slow). I have also verified that all the keys passed are of type anom.Key.
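A minimal sketch of what may be going on, assuming the underlying client reports missing results as key-only entity objects rather than bare keys. FakeEntity and drop_missing are hypothetical stand-ins for illustration, not anom code; the point is only that the key has to be read from each missing entity before it is removed from the set of requested keys.

```python
class FakeEntity(dict):
    """Hypothetical stand-in for a dict-based (and so unhashable) entity."""

    def __init__(self, key):
        super().__init__()
        self.key = key


def drop_missing(request_keys, missing):
    """Return the requested keys minus those reported as missing.

    Assumption: `missing` holds key-only entities, so the key is read
    from each entity instead of using the entity itself.
    """
    remaining = set(request_keys)
    for entity in missing:
        remaining.discard(entity.key)
    return remaining


requested = {("Entry", 1), ("Entry", 2), ("Entry", 3)}
missing = [FakeEntity(("Entry", 2))]
remaining = drop_missing(requested, missing)
```

Passing the entity itself to a set operation would raise the same "unhashable type" TypeError as in the traceback, which is why the sketch goes through entity.key.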
I no longer use Datastore and I'm in the process of de-Google-ifying my work/life so this project is now in life support mode until a new maintainer steps up. If you think you can take over the project, either post here or email me and we can figure something out.
Add an adapter that transparently adds strongly-consistent caching via Memcache on top of any other adapter.
When I have a model like:
class Entry(Model):
    published = props.DateTime(indexed=True)
My query (following the example @ https://anom.defn.io/quickstart.html?highlight=order_by)
return Entry.query().with_ancestor(Key(Feed, str(feed_id))) \
.order_by(Entry.published).run(limit=toreturn_count)
Will crash with:
...
File "/env/lib/python3.7/site-packages/anom/query.py", line 110, in __next__
return next(self._entities)
File "/env/lib/python3.7/site-packages/anom/query.py", line 149, in _get_entities
for batch in self._get_batches():
File "/env/lib/python3.7/site-packages/anom/query.py", line 118, in _get_batches
entities, self._options.cursor = adapter.query(self._query, self._options)
File "/env/lib/python3.7/site-packages/anom/adapters/datastore_adapter.py", line 179, in query
for entity in result_iterator:
File "/env/lib/python3.7/site-packages/google/api_core/page_iterator.py", line 212, in _items_iter
for page in self._page_iter(increment=False):
File "/env/lib/python3.7/site-packages/google/api_core/page_iterator.py", line 243, in _page_iter
page = self._next_page()
File "/env/lib/python3.7/site-packages/google/cloud/datastore/query.py", line 525, in _next_page
query_pb = self._build_protobuf()
File "/env/lib/python3.7/site-packages/google/cloud/datastore/query.py", line 464, in _build_protobuf
pb = _pb_from_query(self._query)
File "/env/lib/python3.7/site-packages/google/cloud/datastore/query.py", line 595, in _pb_from_query
if prop.startswith("-"):
AttributeError: 'DateTime' object has no attribute 'startswith'
I need to change my query to:
return Entry.query().with_ancestor(Key(Feed, str(feed_id))) \
.order_by('published').run(limit=toreturn_count)
or
return Entry.query().with_ancestor(Key(Feed, str(feed_id))) \
.order_by('-published').run(limit=toreturn_count)
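The traceback shows the underlying query code calling startswith("-") on each ordering, so it expects plain strings. A rough sketch of how property objects could be normalised into such strings before being handed over is given here; FakeProperty and to_order_string are hypothetical names used only for illustration and are not part of anom.

```python
class FakeProperty:
    """Hypothetical stand-in for a model property that knows its stored name."""

    def __init__(self, name_on_entity, descending=False):
        self.name_on_entity = name_on_entity
        self.descending = descending

    def __neg__(self):
        # Negating a property flips the sort direction.
        return FakeProperty(self.name_on_entity, not self.descending)


def to_order_string(order):
    """Convert a property object or a string into a 'name' / '-name' string."""
    if isinstance(order, str):
        return order
    prefix = "-" if order.descending else ""
    return prefix + order.name_on_entity


orders = [
    to_order_string(order)
    for order in (FakeProperty("published"), -FakeProperty("published"), "-title")
]
```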
So as the title says, currently String properties get stored as encoded Blobs in the Datastore, which makes them unreadable in the cloud console Datastore viewer. This also makes doing anything with your data outside of anom more difficult. Storing Strings as Strings is probably a more sane default.
Now I know this is because there is compression support for Text properties, but having Strings as the default, with encoding being optional, makes more sense in my opinion.
I made a quick refactor on my fork (which also just removes compression from Text). As that is a breaking change, you probably aren't interested in merging it (if you are, I can make a PR :) ).
I know the limitations of Datastore, but it would be handy to have a way to enforce uniqueness on properties. djangae has implemented a custom way to do this. Is it possible to do the same here?
When I try:
list(Subscriber.query().select().where(Subscriber.account == account).run())
It causes an error:
Traceback (most recent call last):
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 135, in handle
self.handle_request(listener, req, client, addr)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 176, in handle_request
respiter = self.wsgi(environ, resp.start_response)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/falcon/api.py", line 242, in __call__
responder(req, resp, **params)
File "/home/bogdan/Projects/inkit-public-api/controllers/subscribers.py", line 20, in on_get
subscribers = self.get_subscribers(account)
File "/home/bogdan/Projects/inkit-public-api/controllers/subscribers.py", line 64, in get_subscribers
subscribers = list(Subscriber.query().select().where(Subscriber.account == account).run())
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/query.py", line 105, in __next__
return next(self._entities)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/query.py", line 139, in _get_entities
for batch in self._get_batches():
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/query.py", line 113, in _get_batches
entities, self._options.cursor = adapter.query(self._query, self._options)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/adapters/datastore_adapter.py", line 189, in query
for entity in result_iterator:
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/iterator.py", line 218, in _items_iter
for page in self._page_iter(increment=False):
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/iterator.py", line 247, in _page_iter
page = self._next_page()
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/datastore/query.py", line 482, in _next_page
query_pb = self._build_protobuf()
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/datastore/query.py", line 424, in _build_protobuf
pb = _pb_from_query(self._query)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/datastore/query.py", line 547, in _pb_from_query
helpers._set_protobuf_value(property_filter.value, value)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/datastore/helpers.py", line 408, in _set_protobuf_value
attr, val = _pb_attr_value(val)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/google/cloud/datastore/helpers.py", line 325, in _pb_attr_value
raise ValueError("Unknown protobuf attr type %s" % type(val))
ValueError: Unknown protobuf attr type <class 'anom.model.Key'>
Account model:
class Account(Model):
    name = props.String()
    api_token = props.String(indexed=True)
Subscriber model:
class Subscriber(Model):
    source = props.String(indexed=True)
    account = props.Key(kind="Account", indexed=True)
    tags = props.Key(repeated=True, kind="Tag")
    first_name = props.String(indexed=True)
    last_name = props.String(indexed=True)
    company = props.String(optional=True)
gcloud_requests has a hard dependency on an old version of google-cloud-core (>=0.24,<0.25), which is annoying if you want to use google-cloud-* libraries released after ~June 2017. For example, when someone wants to use google-cloud-logging==1.8.0, this leads to a pipenv version resolution failure. It could also lead to other errors when people don't use pipenv, because of libraries expecting different versions of google-cloud-core.
As google-cloud-datastore uses requests under the hood now, I don't see the use for gcloud_requests anymore (I don't think the maintainers do either; it hasn't been updated in over a year). So it would be nice if anom-py could use the latest google-cloud dependency.
I attempted to remove the dependency and use the latest google-cloud-datastore on my fork. Even though the tests pass, it breaks when running on Google App Engine, and I have no clue why (the worker just crashes or times out, so I don't have a trace or anything). Something is probably still wrong there, which is why I decided to open an issue instead of a PR; you might be able to figure it out.
Got an error:
File "<input>", line 1, in <module>
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/model.py", line 432, in __init__
setattr(self, name, value)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/model.py", line 300, in __set__
ob._data[self.name_on_entity] = self.validate(value)
File "/home/bogdan/Projects/inkit-public-api/.env/lib/python3.6/site-packages/anom/properties.py", line 498, in validate
raise ValueError(f"Property {self.name_on_model} cannot be assigned keys of kind {value.kind}.")
ValueError: Property tags cannot be assigned keys of kind Key('Tag', 5073368378245120, parent=None, namespace=None).
When I try:
s = Subscriber(account=a, tags=(t.key, t1.key), first_name="Bob", last_name="Bobbson", address=addr, source="manual")
where t and t1 are Tag objects.
Model:
class Subscriber(Model):
    # immutable
    source = props.String(indexed=True)
    account = props.Key(kind="Account", indexed=True)
    tags = props.Key(repeated=True, kind="Tag")
    # mutable
    first_name = props.String(indexed=True)
    last_name = props.String(indexed=True)
    company = props.String(optional=True)
What am I doing wrong?
This happens only on 0.9.0. When querying an entity on a production/remote Datastore rather than a local Datastore emulator, anom gets stuck in an infinite loop. This example in a Python interactive shell is enough to get stuck (provided you have authenticated with the gcloud tool):
import models
[x for x in models.ExampleModel.query().run()]
ExampleModel in this case is defined like any other anom model and there are some entities in the datastore to query.
Meanwhile, using gcloud_requests it works fine:
from google.cloud import datastore
from gcloud_requests import DatastoreRequestsProxy
client = datastore.Client(_http=DatastoreRequestsProxy(), _use_grpc=False)
[x for x in client.query(kind='ExampleModel').fetch()]
I have figured out that it is stuck in the while True loop in query.py. With some print() debugging I found that remaining is None and self._complete was set, so execution never reaches the if not entities: break statement. I added an if self._complete: break as a quick fix, but I think there might be a larger problem with cursors/offset/limit.
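A self-contained sketch of the kind of guard described above, under the assumption that the backend can signal completion while still returning a non-empty batch. iter_batches, fetch_batch and fake_fetch are hypothetical names for illustration; this is not the code in query.py.

```python
def iter_batches(fetch_batch, limit=None):
    """Yield batches until the backend has nothing more to return.

    fetch_batch(cursor) is a hypothetical callable returning
    (entities, next_cursor, complete).
    """
    cursor, remaining = None, limit
    while True:
        entities, cursor, complete = fetch_batch(cursor)
        if remaining is not None:
            entities = entities[:remaining]
            remaining -= len(entities)
        if entities:
            yield entities
        # Stop on an empty batch, an exhausted limit, or when the backend
        # reports that the result set is complete (the quick-fix guard).
        if not entities or remaining == 0 or complete:
            break


def fake_fetch(cursor):
    """Hypothetical backend: two pages; the last page repeats if re-requested."""
    pages = {None: (["a", "b"], "c1", False), "c1": (["c"], "c1", True)}
    return pages[cursor]


batches = list(iter_batches(fake_fetch))
limited = list(iter_batches(fake_fetch, limit=1))
```

Without the completion check, the fake backend above would keep returning its last page and the loop would never terminate when no limit is set, which mirrors the symptom described in the report.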
First of all: thanks for creating and maintaining this package!
Sadly I haven't found the time to debug it properly due to crunch time.
key_from_protobuf in google.cloud.datastore.helpers creates a Key with a None-value namespace, but anom replaces a None namespace with the global default namespace (which is an empty string), resulting in a ValueError:
for entity in found:
> index = datastore_keys.index(entity.key)
E ValueError: <Key('SocialMediaAuth', 'test1'), project='xxx'> is not in list
../../../../library/anom/adapter.py:147: ValueError
My current workaround is setting the default namespace to None, which requires two line changes in anom.namespaces. I haven't fully validated that this doesn't cause other issues; I'll update this issue if we run into other problems. Maybe you have some feedback about this workaround :)
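An alternative to changing the default namespace might be to normalise namespaces before comparing keys. The following is only a sketch under that assumption, using a simplified, hypothetical SimpleKey type rather than the real key classes.

```python
from collections import namedtuple

# Hypothetical, simplified key type; real keys carry more fields.
SimpleKey = namedtuple("SimpleKey", ["kind", "id_or_name", "namespace"])


def normalize(key):
    """Map a None namespace to the empty string so both spellings compare equal."""
    return key._replace(namespace=key.namespace or "")


def index_of(keys, wanted):
    """Find `wanted` in `keys`, ignoring the None-vs-empty namespace difference."""
    return [normalize(key) for key in keys].index(normalize(wanted))


requested = [
    SimpleKey("SocialMediaAuth", "test0", ""),
    SimpleKey("SocialMediaAuth", "test1", ""),
]
returned = SimpleKey("SocialMediaAuth", "test1", None)
position = index_of(requested, returned)
```

A plain requested.index(returned) raises ValueError here, because the two keys differ only in how the default namespace is spelled.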
Packages:
anom==0.7
google-cloud==0.32.0
google-cloud-datastore==1.0.0 (also an issue with 1.6.0)
This is because keys are converted eagerly on put, but batches aren't stored until commit.
We have an entity with a property that is things = props.Key(kind=Thing, repeated=True), and when iterating over the things property, you get datastore.Keys instead of anom.Keys as you would expect.
I think the problem is in _prepare_to_load in datastore_adapter.py. There is a check for values that are isinstance(value, datastore.Key), but not for values that are a list of keys (which a repeated key prop would return). Therefore the datastore.Keys get returned directly to the end user of the library.
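A sketch of the kind of conversion that would cover repeated properties, assuming loaded values are either single keys, lists of keys, or other plain values. BackendKey, MapperKey and convert_value are hypothetical stand-ins for illustration, not the actual adapter code.

```python
class BackendKey:
    """Hypothetical stand-in for the backend's key type."""

    def __init__(self, kind, id_or_name):
        self.kind = kind
        self.id_or_name = id_or_name


class MapperKey(tuple):
    """Hypothetical stand-in for the mapper's own key type."""


def convert_value(value):
    """Convert backend keys to mapper keys, including keys nested in lists."""
    if isinstance(value, BackendKey):
        return MapperKey((value.kind, value.id_or_name))
    if isinstance(value, list):
        return [convert_value(item) for item in value]
    return value


loaded = {
    "things": [BackendKey("Thing", 1), BackendKey("Thing", 2)],
    "name": "example",
}
converted = {name: convert_value(value) for name, value in loaded.items()}
```

The list branch is the part the report says is missing: without it, the list would be passed through unchanged and the backend key objects would leak out to the caller.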