kgaughan / dbkit
Taking some of the pain out of Python's DB-API
License: MIT License
From my interaction with others, I'm not sure the documentation around default_factory is anywhere near as clear as it needs to be. It might be worthwhile making it more obvious and allowing it to be configured as an argument when creating the context/pool.
The mediator should, eh, mediate the creation of cursors, recycling or discarding connections until an open one is obtained.
It would be nice to have something like this:
dbkit.insert('table', {'id': '5', 'thing': 'bar'})
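A minimal sketch of the statement such a helper might build. The name build_insert is hypothetical and not part of dbkit, and the sketch assumes a driver using the qmark paramstyle unless another placeholder is passed in:

```python
def build_insert(table, values, placeholder="?"):
    """Return (sql, params) for a single-row INSERT.

    Hypothetical helper, not dbkit's API. Identifiers are not escaped, so
    `table` and the keys of `values` must come from trusted code, never
    from user input.
    """
    columns = sorted(values)  # a fixed order keeps the SQL deterministic
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table,
        ", ".join(columns),
        ", ".join([placeholder] * len(columns)),
    )
    return sql, [values[column] for column in columns]


sql, params = build_insert("things", {"id": "5", "thing": "bar"})
```

The resulting pair can be handed straight to a DB-API cursor's execute() method.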
This is a silly omission.
Right now, pools don't have a way to assign either of these to their contexts. This makes dealing with the likes of Jinja a touch awkward if you're using a connection pool.
What's needed is something akin to the default_factory and logger attributes on Context to be added to PoolBase. These would be assigned to any newly created contexts within the pool.
The pool doesn't currently have a method for recording the time a connection was last used. Without this, its method for dealing with dead and zombie connections is somewhat heavy-handed. It would be better to attach a timestamp to the connections managed by it, along with a timeout to allow them to be collected in an appropriate manner.
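One way this could look, as a rough sketch only: wrap each pooled connection with a last-used timestamp and sweep the idle list against a timeout. TimestampedConnection and collect_stale are hypothetical names, and the injectable clock is an assumption made to keep the idea testable:

```python
import time


class TimestampedConnection(object):
    """Pairs a connection with the time it was last used (hypothetical)."""

    def __init__(self, conn, clock=time.time):
        self.conn = conn
        self._clock = clock
        self.last_used = clock()

    def touch(self):
        # Call this whenever the connection is handed out or returned.
        self.last_used = self._clock()

    def is_stale(self, timeout):
        return self._clock() - self.last_used > timeout


def collect_stale(idle, timeout):
    """Split idle entries into (fresh, stale) lists and close the stale ones."""
    fresh, stale = [], []
    for entry in idle:
        (stale if entry.is_stale(timeout) else fresh).append(entry)
    for entry in stale:
        entry.conn.close()
    return fresh, stale
```

The pool would keep only the fresh list after each sweep, so dead connections are collected gradually rather than all at once.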
It would be nice to be able to take advantage of driver-specific features, and provide fallbacks where they're not supported. These include things like:
The first step would be to create some class that would provide a basic implementation of these things, with some way for other abstractions to be registered on a driver-by-driver basis, possibly using a setuptools-based hook.
The reason for this is SQLite support. I'm not sure how exactly I'll end up implementing this as the thread affinity involved is a touch more awkward to deal with than with type 2 and type 3 drivers, which the existing connection pool handles just fine.
The former depends only on being able to tell if the row is a tuple, with the assumption otherwise being that it's a dict-like object. This may break, however, and will likely be slower than I'd like.
Alternatively, some kind of row factory can be implemented. This would be akin to the way the sqlite3 driver for Python works. This has the advantage of keeping the likes of query_value and query_column fast, but it may slow down query() and query_row().
I'm strongly leaning towards the latter.
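For reference, this is roughly how the row factory mechanism in the standard sqlite3 module behaves. It illustrates the approach being borrowed, not dbkit's own implementation; dict_factory and the things table are made up for the example:

```python
import sqlite3


def dict_factory(cursor, row):
    """Build a dict per row; the column name is item 0 of each description entry."""
    return {col[0]: value for col, value in zip(cursor.description, row)}


conn = sqlite3.connect(":memory:")
conn.row_factory = dict_factory  # applies to cursors created from now on
conn.execute("CREATE TABLE things (id INTEGER, thing TEXT)")
conn.execute("INSERT INTO things VALUES (?, ?)", (5, "bar"))
row = conn.execute("SELECT id, thing FROM things").fetchone()
```

The factory runs once per fetched row, which is where the extra cost for multi-row queries would come from.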
Working on this. Wrote the fake DB-API driver (which itself will need tests) for testing these. No clear idea of how I can simulate all the things that can possibly go wrong with the connection pool yet though.
dbkit should work with Python 3. The PyPI version will not even install in a Python 3 environment; the current GitHub HEAD will.
Some quick and dirty patching of dbkit.py and the test cases got them all passing under Python 3.4. The main changes were how exceptions are caught, using the io module instead of StringIO, replacing StandardError with Exception, and changing the print statements.
tests.test_dbkit.test_good_connect ... ok
tests.test_dbkit.test_bad_connect ... ok
tests.test_dbkit.test_context ... ok
tests.test_dbkit.test_create_table ... ok
tests.test_dbkit.test_bad_drop_table ... ok
tests.test_dbkit.test_transaction ... ok
tests.test_dbkit.test_transaction_decorator ... ok
tests.test_dbkit.test_factory ... ok
tests.test_dbkit.test_unpooled_disconnect ... ok
tests.test_dbkit.test_make_file_object_logger ... ok
tests.test_dbkit.test_logging ... ok
tests.test_dbkit.test_procs ... ok
tests.test_dbkit.test_to_dict_nothing ... ok
tests.test_dbkit.test_to_dict_bad_key ... ok
tests.test_dbkit.test_to_dict_happy_path ... ok
tests.test_dbkit.test_to_dict_sequence ... ok
tests.test_dbkit.test_make_placeholders ... ok
tests.test_pool.test_check_pool ... ok
tests.test_pool.test_lazy_connect ... ok
tests.test_pool.test_real_connect ... ok
tests.test_pool.test_bad_query ... ok
tests.test_pool.test_pool_contention ... ok
tests.test_pool.test_setting_propagation ... ok
----------------------------------------------------------------------
Ran 23 tests in 0.041s
Currently 308 of the 360 most popular libraries support Python 3 (according to py3readiness), so there is no reason why dbkit shouldn't.
If Python 3 support is desired, I can clean up my patch so it works with both 2 and 3 and submit a PR.
Using those two functions requires an active context. It ought to be possible to set them on a context without that context being currently active.
Ideally, each AttrDict generated by a DictFactory would do key sharing. Currently, they don't owing to me wanting to keep the code simple. I'd need to run some tests to see if the benefits of doing so make sense.
One thing I haven't thought about properly yet, and this is particularly pertinent when it comes to connection pools, is what happens when a connection times out. Having it deal with that properly is critical to getting it battle-hardened.
The reason I haven't dealt with this is that DB-API's OperationalError exception is a touch vague. I've a nagging worry that discarding a connection based on catching that alone might not be an altogether excellent idea. It may be the safest one though.
All that aside, dealing with this particular issue has two aspects: the single-connection aspect and the connection pool aspect. For single connections, the connection mediator will need to be able to reconnect at will to the DB. For connection pools, the mediator will need to have some way to signal to the pool that the connection it currently holds is somehow bad. In both cases, the context will have to signal to the mediator that the current connection is bad.
The best place to handle the error is in the __exit__ method of the mediator. That way the context doesn't have to explicitly signal to the mediator that the connection's bad. SingleConnectionMediator will need to be tweaked so that it takes a function that returns a connection rather than be supplied with an actual connection. This would make its behaviour a touch more consistent with PooledConnectionMediator. It'll also have to start counting depth so its behaviour is consistent regardless of whether it's used within or without a transaction.
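A rough sketch of that shape, under stated assumptions: ReconnectingMediator is a hypothetical stand-in, not the real SingleConnectionMediator, and it treats any OperationalError as grounds for discarding the connection once the outermost block exits:

```python
class ReconnectingMediator(object):
    """Hypothetical mediator given a connect function, not a connection."""

    def __init__(self, connect, operational_error):
        self._connect = connect                      # zero-argument callable
        self._operational_error = operational_error  # driver's OperationalError
        self._conn = None
        self._depth = 0
        self._bad = False

    def __enter__(self):
        if self._conn is None:
            self._conn = self._connect()  # connect lazily, or reconnect
        self._depth += 1
        return self._conn

    def __exit__(self, exc_type, exc_value, traceback):
        self._depth -= 1
        if exc_type is not None and issubclass(exc_type, self._operational_error):
            self._bad = True  # remember the failure until we fully unwind
        if self._depth == 0 and self._bad:
            try:
                self._conn.close()
            except Exception:
                pass  # the connection may already be dead
            self._conn = None
            self._bad = False
        return False  # never swallow the caller's exception
```

Constructed with sqlite3.connect wrapped in a function and sqlite3.OperationalError, a query against a missing table is enough to trigger a reconnect, which shows just how broad that exception is.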
It would be terribly useful to have a decorator to complement the transaction() function. Maybe call it 'transactional()'?
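A minimal sketch of how such a decorator might be built. To avoid guessing at dbkit's internals, this version takes the transaction context manager factory as an argument, which is an assumption of the sketch rather than the proposed interface:

```python
import functools


def transactional(transaction):
    """Build a decorator that runs the wrapped function inside `transaction()`.

    `transaction` is any zero-argument callable returning a context manager;
    passing it in explicitly is an assumption made for this sketch.
    """
    def decorator(fn):
        @functools.wraps(fn)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            with transaction():
                return fn(*args, **kwargs)
        return wrapper
    return decorator
```

A function decorated this way commits or rolls back according to whatever the supplied context manager does on exit.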
Is it fair to expect people to use the unit tests and the module code to figure out how this yoke works? Not at all. Documentation and examples need to be written. The doc comments are fine as they stand, but are hardly enough.
Implementing this isn't down to a lack of will, but a lack of opportunity. I don't work with any databases that support it to the best of my knowledge, so I can't implement it. If somebody out there knows better than I do how to implement it, I'll be glad to review (and likely accept) a patch.
Nothing complex here: it'd just be nice if the factories had a mechanism to either be converted to some format, or had methods for exporting themselves as JSON, &c.
It would be nice to expose a mogrify function from dbkit if the database supports it. If there is support for this idea I can open a PR.
While dbkit covers a reasonable amount of this stuff, it could do better. For instance, some kind of table abstraction to allow for database introspection and some sophisticated functionality based off of that, à la dataset, would be nice.
This would be useful to allow something like this:
from django import db
# ...
with dbkit.context(db.connection):
    # ...
It would be nice to have a method to safely abort transactions.
I've written code to do connection management so connection pooling is possible. I've also written a base class for a connection pool, but haven't written a proper implementation and it's missing a number of methods for things like creating contexts based off of the pool.
It's incredibly inconvenient to have to explicitly tweak queries for every driver's preference when it comes to prepared statement placeholders.
Instead, it ought to be possible for dbkit to mask that insanity completely.
Expanding 'IN (?)' and its ilk out nicely would be good too.
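As a sketch of the expansion step only: the function below generates a comma-separated run of placeholders for three of the DB-API paramstyles. The name placeholders is hypothetical and the named and pyformat styles are deliberately left out:

```python
def placeholders(count, paramstyle="qmark"):
    """Return `count` comma-separated placeholders for a DB-API paramstyle.

    Hypothetical helper; only qmark, format and numeric are handled here.
    """
    if count < 1:
        raise ValueError("need at least one value to expand")
    if paramstyle == "qmark":
        marks = ["?"] * count
    elif paramstyle == "format":
        marks = ["%s"] * count
    elif paramstyle == "numeric":
        marks = [":{}".format(i) for i in range(1, count + 1)]
    else:
        raise ValueError("unsupported paramstyle: {}".format(paramstyle))
    return ", ".join(marks)


ids = [1, 2, 3]
sql = "SELECT * FROM things WHERE id IN ({})".format(placeholders(len(ids)))
```

The values themselves are still passed separately as parameters, so nothing user-supplied is ever interpolated into the SQL text.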
It's more convenient and shorter to write this:
with pool:
    pass
Than it is to write this:
with pool.connect():
    pass
The former form ought to be possible.
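One way to get there, sketched with a hypothetical ContextManagedPool class rather than dbkit's PoolBase: have the pool's own __enter__ and __exit__ delegate to whatever connect() returns, keeping a per-thread stack so nested blocks unwind in order:

```python
import threading


class ContextManagedPool(object):
    """Hypothetical pool whose `with pool:` delegates to `pool.connect()`."""

    def __init__(self, make_context):
        # `make_context` is an assumed zero-argument callable that returns
        # a context manager, standing in for real pool machinery.
        self._make_context = make_context
        self._local = threading.local()

    def connect(self):
        return self._make_context()

    def _stack(self):
        # One stack per thread, so threads never pop each other's contexts.
        if not hasattr(self._local, "stack"):
            self._local.stack = []
        return self._local.stack

    def __enter__(self):
        ctx = self.connect()
        result = ctx.__enter__()
        self._stack().append(ctx)  # only recorded once entry succeeded
        return result

    def __exit__(self, exc_type, exc_value, traceback):
        ctx = self._stack().pop()
        return ctx.__exit__(exc_type, exc_value, traceback)
```

With this in place both forms behave the same, since the short form simply opens and closes the context that connect() would have produced.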
I've never had need to use callproc before to call stored procedures, so it remains unimplemented. It ought to be implemented though. Unfortunately, sqlite3 doesn't support stored procedures, which isn't exactly going to make testing them easy.
I want to bundle up the stack handling in Context as it's messy, and I figure I might as well do the same with ThreadAffinePool.