pygresql / pygresql

The official PyGreSQL repository

Home Page: http://pygresql.org/

License: Other

Python 76.11% C 23.53% Shell 0.30% Dockerfile 0.06%

database database-adapter postgresql python c-extension libpq

pygresql's Introduction

PyGreSQL - Python interface for PostgreSQL

PyGreSQL is a Python module that interfaces to a PostgreSQL database. It wraps the lower level C API library libpq to allow easy use of the powerful PostgreSQL features from Python.

PyGreSQL should run on most platforms where PostgreSQL and Python are running. It is based on the PyGres95 code written by Pascal Andre. D'Arcy J. M. Cain renamed it to PyGreSQL starting with version 2.0 and serves as the "BDFL" of PyGreSQL. Christoph Zwerschke volunteered as another maintainer and has been the main contributor since version 3.7 of PyGreSQL.

The following Python versions are supported:

PyGreSQL 4.x and earlier: Python 2 only
PyGreSQL 5.x: Python 2 and Python 3
PyGreSQL 6.x and newer: Python 3 only

The current version of PyGreSQL supports Python versions 3.7 to 3.12 and PostgreSQL versions 10 to 16 on the server.

Installation

The simplest way to install PyGreSQL is to type:

$ pip install PyGreSQL

For other ways of installing PyGreSQL and requirements, see the documentation.

Note that PyGreSQL also requires the libpq shared library to be installed and accessible on the client machine.
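A quick way to check whether libpq is visible on the client machine is sketched below. It uses only the standard library and is not part of PyGreSQL; the helper name find_libpq is made up for this illustration, and the lookup may not mirror exactly how the extension module resolves the library at import time.

```python
import ctypes.util


def find_libpq():
    """Return the name or path of the libpq shared library, or None."""
    # "pq" is the usual name on Unix-like systems, "libpq" on Windows.
    for name in ("pq", "libpq"):
        location = ctypes.util.find_library(name)
        if location:
            return location
    return None


if __name__ == "__main__":
    location = find_libpq()
    if location is None:
        print("libpq was not found in the usual library locations")
    else:
        print("libpq found:", location)
```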

Documentation

The documentation is available at pygresql.github.io/ and at pygresql.readthedocs.io, where you can also find the documentation for older versions.

pygresql's Issues

Desupport Python 2.6 and 3.3

It is getting harder to test the whole range of Python versions. We should desupport the older Python versions 2.6 and 3.3 (maybe 3.4 as well), which always cause trouble.

Base classic API on DB-API

The idea here is to divorce the classic API from the C module and make it a wrapper around the DB-API module. If done carefully then the classic interface can work with any DB-API compliant module.
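A minimal sketch of this idea follows. The classes ClassicDB and ClassicQuery are hypothetical and only mimic a small part of the classic interface (query(), getresult(), dictresult()); sqlite3 is used as a stand-in DB-API module so the example runs without a PostgreSQL server.

```python
import sqlite3


class ClassicQuery:
    """Hypothetical result object with classic-style accessors."""

    def __init__(self, cursor):
        # cursor.description is None for statements that return no rows.
        description = cursor.description
        self._names = [column[0] for column in description or []]
        self._rows = cursor.fetchall() if description else []

    def getresult(self):
        return [tuple(row) for row in self._rows]

    def dictresult(self):
        return [dict(zip(self._names, row)) for row in self._rows]


class ClassicDB:
    """Hypothetical classic-style wrapper around any DB-API connection."""

    def __init__(self, dbapi_connection):
        self._cnx = dbapi_connection

    def query(self, sql, params=()):
        cursor = self._cnx.cursor()
        try:
            cursor.execute(sql, params)
            # All rows are fetched here, so the cursor can be closed.
            return ClassicQuery(cursor)
        finally:
            cursor.close()

    def close(self):
        self._cnx.close()
```

Only PEP 249 methods are called on the connection and cursor, which is what would allow the wrapper to work with other compliant modules. Parameter placeholders still depend on the paramstyle of the underlying module.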

(copied from Trac ticket 36, created 2010-09-12)

Consider adding type hints

Python supports type hints since Python 3.5.

To support tool developers with tools such as PyCharm, we might consider adding type hints for PyGreSQL 5.x, e.g. in the form of additional *.pyi stub files.
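As an illustration of what such annotations could look like, here is a sketch that types only the generic PEP 249 surface using typing.Protocol (available from Python 3.8). It is not an actual PyGreSQL stub file; the protocol classes and the fetch_all helper are invented for this example.

```python
from typing import Any, Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Cursor(Protocol):
    def execute(self, operation: str, parameters: Any = ...) -> Any: ...
    def fetchone(self) -> Optional[Sequence[Any]]: ...
    def fetchall(self) -> Sequence[Sequence[Any]]: ...
    def close(self) -> None: ...


@runtime_checkable
class Connection(Protocol):
    def cursor(self) -> Cursor: ...
    def commit(self) -> None: ...
    def rollback(self) -> None: ...
    def close(self) -> None: ...


def fetch_all(connection: Connection, sql: str) -> Sequence[Sequence[Any]]:
    """Run a query and return all rows; IDEs can now infer the types."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.close()
```

In a *.pyi stub file the same signatures would be written against the real classes of the module instead of protocols.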

(copied from Trac ticket 61, created 2016-01-02)

Test asynchronous command processing

In version 5.2 we integrated support for asynchronous command processing which has been contributed on the mailing list by Patrick TJ McPhee a while ago (see #19).

However, this feature is still experimental and has only been rudimentarily tested. We need much more testing, in particular of the following:

nowait parameter
method poll(), set_non_blocking(), is_non_blocking().
method send_query() and all ways of getting results afterwards
(iterators, scalars, one, single, direct access etc.)

Consider avoiding encoding/decoding bytea in queries

Currently, when executing queries that contain bytea values on input or output, these are encoded from bytes and decoded to bytes on the level of the pgdb module and the pg.DB wrapper class.

This is needed because on the lowest level we use the PQexec() method or the PQexecParams() method without setting paramLengths and paramFormats. In both cases, Postgres only uses text format for input and output, so we need to encode and decode.

If we would always use PQexecParams() and set paramLengths and paramFormats, we could avoid the encoding and decoding between bytes and bytea text format, by passing these values in binary format.

(Using binary could also speed up passing other parameters with types that have the same binary representation in Python and Postgres. But that could be brittle because it might depend on the Python and Postgres versions. However, bytes are always bytes, so it would be useful in the case of bytea. Also bytea values are usually big, so time and memory demand for encoding/decoding are more relevant for these.)
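To make the cost of the text format concrete, the sketch below converts between bytes and PostgreSQL's hex bytea text representation in pure Python. This is not PyGreSQL's actual code, and the helper names are made up; it only shows that the text form needs two characters per byte plus a prefix, and that every value has to be converted in both directions.

```python
def bytea_to_text(data: bytes) -> str:
    """Encode bytes in PostgreSQL's hex bytea text format."""
    return "\\x" + data.hex()


def text_to_bytea(text: str) -> bytes:
    """Decode the hex bytea text format back to bytes."""
    if not text.startswith("\\x"):
        raise ValueError("only the hex bytea format is handled here")
    return bytes.fromhex(text[2:])


if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    encoded = bytea_to_text(payload)
    # The text form is roughly twice as large as the binary payload.
    print(len(payload), len(encoded))
    assert text_to_bytea(encoded) == payload
```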

(copied from Trac ticket 53, created 2015-12-10)

Licence Type

Hi, I am planning to use this library for development purposes but am not sure about its license type. Can someone please confirm the license type or provide a link to official documentation explaining it?

Support optional cursor and connection attribute "messages"

The DBAPI2 PEP 249 proposes an optional cursor and connection attribute "messages". We should look into that.
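The behavior PEP 249 describes for this attribute can be sketched as follows: messages is a list of (exception class, exception value) tuples that is cleared by the standard cursor methods before they run, but not by the fetch methods. The classes below are stand-ins written for this illustration, not PyGreSQL code, and the trigger for a message is deliberately artificial.

```python
class DatabaseWarning(Exception):
    """Stand-in for the DB-API Warning exception class."""


class MessagesCursor:
    """Hypothetical cursor that only demonstrates the messages list."""

    def __init__(self):
        self.messages = []

    def _receive_notice(self, text):
        # A real driver would call this for each message from the server.
        self.messages.append((DatabaseWarning, DatabaseWarning(text)))

    def execute(self, operation):
        # Standard methods clear the list before doing their work ...
        del self.messages[:]
        if "deprecated" in operation:
            self._receive_notice("deprecated syntax used")

    def fetchall(self):
        # ... while the fetch methods leave it untouched.
        return []
```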

(copied from Trac ticket 19, created 2010-09-12)

Desupport Python 2 (legacy Python)

Remove support for Python 2 completely, probably in the next major version.

Particularly, get rid of py3c.h and use the standard definitions instead.

This will simplify the code and reduce maintenance cost considerably.

Unable to install PyGreSql on CentOS

It has been reported on StackOverflow that there is an issue installing the latest PyGreSQL on CentOS.

We need to investigate this.

Update tests/dbapi20.py

We currently use the Python DB API 2.0 driver compliance unit test suite in version 1.5 which is very old and should be updated to the latest available version (see for instance here).

Option for statically linking libpq

We should have an option to build PyGreSQL with a statically linked libpq.

This is particularly interesting for shipping Windows binaries, because there you often get problems when libpq is not in the PATH or it is a version that does not support all features activated in the DLL of the PyGreSQL C module.

Allow inserttable() to take columns as parameter

Suggested on the mailing list 2017-01:

The method Connection.inserttable(table, values) in the classic module might take more parameters, e.g. columns, similar to the Cursor.copy_from(...) method in the DB API 2 module.

(copied from Trac ticket 73, created 2018-04-23)

Fatal Python Error: deallocating None. (Version 5.2.1 and earlier)

There seems to be a reference counting bug with some specific queries. The query also has to return a lot of data or be repeated multiple times without restarting the script or server. The simplest case I could think of to reproduce it is querying columns of type citext array. For instance, first create this small Postgres table

CREATE EXTENSION citext;
CREATE TABLE teams (id INT, players CITEXT[]);
INSERT INTO teams VALUES (1, '{"Alice", "Bob"}');

and then run the following script

import pgdb
import sys


def execute_query(query):
    cursor = CONNECTION.cursor()
    cursor.execute(query)
    rows = cursor.fetchall()
    cursor.close()
    return rows


def reproduce_bug():
    print(sys.getrefcount(None))
    for i in range(10000):
        execute_query('SELECT players FROM teams')
        print(sys.getrefcount(None))


if __name__ == '__main__':
    info = {'user': 'paluchasz', 'host': 'localhost', 'database': 'paluchasz', 'port': 5432, 'password': 'abcd'}
    CONNECTION = pgdb.connect(user=info['user'], host=info['host'], database=info['database'], port=info['port'],
                              password=info['password'])
    reproduce_bug()

setup.py: wrong version check

https://github.com/Cito/pygresql/blob/master/setup.py#L83
It doesn't work for versions like 9.5beta1.
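A more tolerant check could look like the sketch below. It assumes version strings of the form printed by pg_config --version, such as "PostgreSQL 9.5beta1" or "PostgreSQL 12.4"; the function name is made up, and this is not the code currently in setup.py.

```python
import re


def parse_pg_version(version_string):
    """Extract (major, minor) from a PostgreSQL version string."""
    # The minor part is optional, so "10beta1" or "13devel" also match.
    match = re.search(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError("cannot determine version from %r" % version_string)
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor
```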

TypeError: "Connection is not valid" on exit from Python

This was reported 2019-09-13 on the mailing list:

With the following reproduction script, which uses the classic API and DB-API in tandem,

import pgdb
import pg

conn = pgdb.connect(dbname='postgres')
db = pg.DB(conn)
print(db.query('SELECT * FROM pg_class'))
db.close()
conn.close()

we get the following exception when the Python interpreter exits:

Exception ignored in: <bound method DB.__del__ of <pg.DB object at ...>>
Traceback (most recent call last):
  File "/private/tmp/venv/lib/python3.6/site-packages/pg.py", line 1575, in __del__
TypeError: Connection is not valid

The line in question is https://github.com/Cito/PyGreSQL/blob/8ca38358/pg.py#L1575.

It looks like the DB object is still holding onto the (now closed) pgmodule connection object, and is trying to call set_cast_hook(None) during its __del__ method, which fails. Other places in the DB code only perform the set_cast_hook for ._closeable connections; should __del__ do the same?
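The guard suggested above can be sketched with stand-in classes. FakeConnection and GuardedDB are invented for this illustration and do not reproduce the real pg.DB code; only the names set_cast_hook and _closeable are taken from the report. The extra try/except is an additional safety net that the report does not ask for.

```python
class FakeConnection:
    """Stand-in for the low-level connection object."""

    def __init__(self):
        self.closed = False

    def set_cast_hook(self, hook):
        if self.closed:
            raise TypeError("Connection is not valid")

    def close(self):
        self.closed = True


class GuardedDB:
    """Hypothetical wrapper whose __del__ tolerates closed connections."""

    def __init__(self, connection, closeable=True):
        self.db = connection
        self._closeable = closeable

    def __del__(self):
        db = getattr(self, "db", None)
        # Only reset the hook on connections that this wrapper owns.
        if db is not None and getattr(self, "_closeable", False):
            try:
                db.set_cast_hook(None)
            except TypeError:
                # The connection was closed elsewhere; nothing to clean up.
                pass
```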

Add notes about Django in the documentation

Add notes about how PyGreSQL can be used in Django in the documentation.

If still needed, mention the workaround for Python 2 (see the discussion on the mailing list, 2018-04-27).

DLL load failed: The operating system cannot run %1.

I just created a new environment using virtualenv and installed using pip install PyGreSQL

Support networking data types

See https://www.postgresql.org/docs/devel/datatype-net-types.html and https://www.psycopg.org/docs/extras.html#networking-data-types

Make PyGreSQL thread-safe on the connection level

Check whether the thread safety of PyGreSQL can/should be improved.

(copied from Trac ticket 24, created 2010-09-12)

Use PQconnectdbParams instead of PQsetdbLogin

Currently, connections in the classic and DB API 2 modules are made with PQsetdbLogin which only allows a limited number of parameters. Additional parameters must be sent as connection string or connection URI, but they cannot be passed as parameters (the DB API 2 module fakes that by creating a connection string from keyword arguments).

We may want to switch to the newer PQconnectdbParams to properly send any number of parameters.

When doing so, we should take care that everything is properly documented, backward compatible, and particularly the default values are set properly and the same as before.
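For reference, building a connection string from keyword arguments, as the DB API 2 module is described as doing, can be sketched like this. The helpers are written for this illustration and are not the code used in pgdb; they follow the libpq rule that empty values and values containing spaces must be wrapped in single quotes, with single quotes and backslashes escaped by a backslash.

```python
def quote_conninfo_value(value):
    """Quote a single value for a libpq keyword/value connection string."""
    text = str(value)
    if text == "" or any(ch in text for ch in " '\\"):
        text = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"
    return text


def make_conninfo(**params):
    """Build a connection string, skipping parameters that are None."""
    parts = []
    for key, value in params.items():
        if value is None:
            continue
        parts.append("%s=%s" % (key, quote_conninfo_value(value)))
    return " ".join(parts)


if __name__ == "__main__":
    print(make_conninfo(host="localhost", port=5432, sslmode="verify-full"))
```

With PQconnectdbParams the keywords and values would be passed as two arrays instead, so no quoting step of this kind would be needed.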

(copied from Trac ticket 79, created 2019-04-24)

Work through driver development tips from PostgreSQL

The PostgreSQL Wiki has a page "Driver Development" that might contain information interesting for us that we have not yet considered.

We should go through the page and check if PyGreSQL considers all the things mentioned there.

encounter a problem when using pygresql

Hi, I encountered a problem when using PyGreSQL to insert data (read from a JSON file) into PostgreSQL:
from pg import DB
DB.insert()

The problem is that memory is not released and memory usage grows very quickly.

Looking forward to your reply, thanks a lot.

DLL Load Failed and possible solution?

Hey guys, I'm posting this here in hopes that this gets fixed as soon as possible because this is a glaring problem.

I followed the tutorial here:
https://pygresql.org/contents/tutorial.html

from pg import DB

import os
from dotenv import load_dotenv

# Load the ENV
load_dotenv()

dbname = os.getenv('DB_NAME')
host = os.getenv('DB_HOST')
port = os.getenv('DB_PORT')
user = os.getenv('DB_USERNAME')
password = os.getenv('DB_PASSWORD')

# Initialize the database
db = DB(dbname, host, port, user, password)

And when I tried to run it, I ran into the dreaded: DLL load failed: The operating system cannot run %1

After troubleshooting it myself, I've noticed that the code only goes through the folders in the system PATH setting and checks whether a folder contains "libpq.dll". It has no regard whatsoever for whether it's the right folder to begin with, that is, whether it's inside a PostgreSQL folder.

Unfortunately, in my PATH setting, I had C:\xampp\php BEFORE C:\Program Files\PostgreSQL\12\bin which caused quite a problem, as PHP has its own version of "libpq.dll".

Long story short, the problem was easily fixed by moving the Postgres path before the xampp path and the application threw no error.

Is it possible to have an extra "if" in the source code to address this matter?
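To diagnose this situation, a small standard-library script can list every copy of the DLL on the PATH in search order. The helper name find_on_path is made up for this sketch and it is not part of PyGreSQL; when the first hit is not the copy that ships with PostgreSQL, the load order described above is the likely culprit.

```python
import os


def find_on_path(filename="libpq.dll", path=None):
    """Return all copies of filename on the search path, in order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    for number, hit in enumerate(find_on_path(), start=1):
        print(number, hit)
```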

Add property similar to "description" to the DB wrapper

In #39 we added a method fieldinfo() to the query in the classic module, which takes an optional field name or number. If nothing is specified, the field info for all fields of the query is returned.

This method is a bit low-level though, the provided information is pretty internal and difficult to interpret. On the DB wrapper class, we may want to add another higher-level method or property similar to the description attribute in the DB-API 2 module. This could be a little bit more informative and PostgreSQL-specific than the DB-API 2 description, but the idea would be the same. We could even reuse much of the code of the description property in the DB-API 2 since it is also based on a lower-level method similar to fieldinfo().

The pgquery type should have a method listtypes()

As suggested on the mailing list 2019-06-07 by Justin Pryzby:

The query type of the pg module should have a method for exposing col_types (the column types) of the query to Python. Currently you can only get the number of results with ntuples() and the list of column names with listfields(). Note that col_types only has the result of PQftype(); it might also be interesting to get the type modifiers with PQfmod().

pgdb already provides this information in the unofficial coltypes property and in combined form in the official description property. This is fetched via the listinfo() method of the underlying source object which uses the _source_buildinfo function.

Instead or in addition to adding a new method, we should probably also add a property that combines all of the information, like description in pgdb. Not sure if an ordered dict (simple dict in Py >= 3.6) with names as keys or a list of namedtuples like in pgdb would be better.

(copied from Trac ticket 82, created 2019-06-07)

json/hstore and types=list()

On 2020-05-06 at 06:01, Justin Pryzby wrote:

I'm using a jsonb column for the 2nd time ever, and a variation of my issue from
August came up again.

I thought it was weird to need to pass types=['something', 'else', 'int',
'float', 'json']. I suggest to also do:
if isinstance(types, str):
    types = types.split()

I suggested to allow passing not only types=[str('json')] but also types=[pg.Json].

I suggested guess_simple_types() should handle Json() and Hstore()

I think we need an example of this, probably for types=list() and for
types=dict().
On Fri, Aug 23, 2019 at 11:01:11AM -0500, Justin Pryzby wrote:
[original message elided]
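The string-splitting suggestion from the message above could be sketched as a small normalization step. The function normalize_types is hypothetical and only shows the idea; where such a step would be hooked into the pg module is left open.

```python
def normalize_types(types):
    """Accept types as a whitespace-separated string, a list or a dict."""
    if types is None:
        return None
    if isinstance(types, str):
        # "int float json" becomes ["int", "float", "json"].
        return types.split()
    if isinstance(types, dict):
        return dict(types)
    return list(types)
```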

Support asynchronous command processing

The attached patch provides the asynchronous operations described in section 31.4 of the PostgreSQL manual. I believe everything described in that section is available with these exceptions:

there's no prepared statement support
I didn't implement PQsetSingleRowMode(). This would require a possibly small change to the way that query results are retrieved that I thought would go better as a separate change set
I didn't implement PQconsumeInput() or PQisBusy(). I don't really understand the point of these functions, they seem to have marginal utility outside notification reception, and I wasn't sure exactly how to document them. It might make sense to have an isbusy() call which calls both, but I don't really know if that fits anybody's use case
I seem to have left out PQflush(). This is an oversight. In general, the non-blocking operations are not well tested.

In general, the changes allow the database to be used in an event-driven application, and for other applications, there are some parallelism benefits:

Connections can be completed in the background, which can speed up use cases where for instance the application needs to connect to several databases at once
when multiple semi-colon-delimited queries are run in a single call, the results to all the queries are returned
the application can do other work while waiting for queries to complete
copy and large object operations can use non-blocking IO

Query operations work essentially the same way as they do now, except that all the result codes are now returned by getresult(), dictresult() or namedresult(). In cases where query() returns None, getresult() et al return '', and you have to call getresult() et al until they return None. Also, exceptions raised by bad queries are raised by getresult() et al, not by the query function.

The result member of pgqueryobject is changed by each call to getresult() et al, so you can't get the same query result twice when using asynchronous calls, and functions which depend on the result member don't work until after a call to getresult() et al.

Because of this last point, I had to reorganize _namedresult(). That's the only Python change other than the unit test.
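The calling pattern described above, calling getresult() repeatedly until it returns None, can be wrapped in a generator. The sketch below uses a test double instead of a real asynchronous query object, so FakeAsyncQuery and iter_results are illustrative names only.

```python
def iter_results(query):
    """Yield each result of an asynchronous query until it is exhausted."""
    while True:
        # Errors from bad queries would surface here, not at send time.
        result = query.getresult()
        if result is None:
            break
        yield result


class FakeAsyncQuery:
    """Test double, not a PyGreSQL class: replays canned results."""

    def __init__(self, results):
        self._pending = list(results)

    def getresult(self):
        return self._pending.pop(0) if self._pending else None


if __name__ == "__main__":
    # Two SELECT results with a statement in between that returns no rows.
    query = FakeAsyncQuery([[(1,)], "", [(2,)]])
    for result in iter_results(query):
        print(repr(result))
```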

C code changes are:

some new functions
change to connect() to take a new argument and call PQconnectStartParams() when appropriate
added code to getresult() and dictresult() to call PQgetResult() when appropriate. Looking at it now, this block of code has got to be quite big and maybe should move to its own function
renamed pg_query to _pg_query and added a new argument. This is called from wrapper functions pg_query() and pg_sendquery()
moved scalar result processing from pg_query() to a new function, _check_result_status()
changed pg_query() to call PQsendQuery() or PQsendQueryParams() when appropriate

Attachments:

pygresql-async-1.txt (Docs and Python)
pygresql-async-2.txt (C Module)
asynctests.patch.txt (Newer Patch)
asynctests.patch.txt (Tests)

(contributed by Patrick TJ McPhee via mailing list, 2015-08-03, copied from Trac ticket 49)

Make the C files compile separately

The pgmodule.c file has been split into 7 parts in version 5.1, but is still a single compilation unit.

We may want to add proper header files for each part and make them compile separately.

(copied from Trac ticket 78, created 2019-04-24)

Importing ABC directly from collections module was removed in Python 3.9

https://bugzilla.redhat.com/show_bug.cgi?id=1791750

https://github.com/Cito/PyGreSQL/blob/5d51d1d6dcb01d48f4260266330cb22c7a447bf5/tests/test_classic_connection.py#L21

largeobject out of scope deallocated outside of txn

See bug report and doc improvements posted by Justin on the mailing list 2019-07-24.

(created from Trac ticket 83, created 2019-07-25)

Improve compatibility with other drivers

Check whether we can adopt some things implemented in other Postgres drivers, and make Pygres a bit more compatible with the other drivers (http://wiki.postgresql.org/wiki/Python). Some noticeable issues can also be found here.

(copied from Trac ticket 35, created 2010-09-12)

Support arrays with start index != 1

The current implementation of the cast_array function in the C module ignores the start index of PostgreSQL arrays. They are assumed to always be 1. As discussed on the mailing list in September 2016, this should be improved:

The idea is that the cast_array function should take an additional, optional cls parameter which will then be used as the base class for the array. If that parameter is None or list, then the method works as before, returning a list or a list of lists for multidimensional arrays. If it is any other class, then this will be considered as a subclass of list that will be used instead of the builtin list for building the return values. In addition, the instances of this list subclass will have a lower attribute set to the start index of the corresponding PostgreSQL (sub)arrays by the cast_array function.

It should then be possible to change the default value for cls passed into the cast_array function to a custom list subclass. That subclass could consider the start index when getting items, and return None if the index points to outside the array. This would emulate the behavior of PostgreSQL arrays more closely.

Note that when we support start indices when converting Pg to Py, we also need to support them when converting from Py to Pg. Currently this is done with the ARRAY constructor, which doesn't allow for start indices as far as I know, so this needs to be changed.
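A list subclass of the kind described could look like the sketch below. The class name BoundedList and the method pg_item are made up; only the lower attribute comes from the proposal. Lookups by PostgreSQL index go through an explicit method here rather than an overridden __getitem__, so that ordinary list indexing and slicing keep their Python meaning.

```python
class BoundedList(list):
    """List subclass that remembers a PostgreSQL-style start index."""

    # PostgreSQL arrays start at 1 unless other bounds are given.
    lower = 1

    def pg_item(self, index):
        """Return the element at a PostgreSQL index, or None if outside."""
        offset = index - self.lower
        if 0 <= offset < len(self):
            return self[offset]
        return None


if __name__ == "__main__":
    # Corresponds to the PostgreSQL array '[-1:1]={10,20,30}'.
    values = BoundedList([10, 20, 30])
    values.lower = -1
    print(values.pg_item(-1), values.pg_item(1), values.pg_item(2))
```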

(copied from Trac ticket 72, created 2016-09-21)

Error when installing Pygresql 5.0.3

I am encountering an error while trying to install Pygresql 5.0.3 from my requirements.txt file.

I am on macOS 10.12.4 and using Python 2.7.

The error I'm getting seems more like a warning than an error, but it is preventing me from installing the package.

My terminal output after running pip install -r requirements.txt is as follows:

Installing collected packages: PyGreSQL
  Running setup.py install for PyGreSQL ... error
    Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-build-sPPGpL/PyGreSQL/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-l9qEfz-record/install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build/lib.macosx-10.12-intel-2.7
    copying pg.py -> build/lib.macosx-10.12-intel-2.7
    copying pgdb.py -> build/lib.macosx-10.12-intel-2.7
    running build_ext
    building '_pg' extension
    creating build/temp.macosx-10.12-intel-2.7
    cc -fno-strict-aliasing -fno-common -dynamic -arch i386 -arch x86_64 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -DPYGRESQL_VERSION=5.0.3 -DDIRECT_ACCESS -DLARGE_OBJECTS -DDEFAULT_VARS -DESCAPING_FUNCS -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I/usr/local/Cellar/postgresql/9.4.5_2/include -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c pgmodule.c -o build/temp.macosx-10.12-intel-2.7/pgmodule.o -O2 -funsigned-char -Wall -Werror
    pgmodule.c:4143:9: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
                    num = PyInt_AsLong(param);
                        ~ ^~~~~~~~~~~~~~~~~~~
    pgmodule.c:4448:12: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
                    pgport = PyInt_AsLong(pg_default_port);
                           ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    2 errors generated.
    error: command 'cc' failed with exit status 1
    
    ----------------------------------------
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-build-sPPGpL/PyGreSQL/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-l9qEfz-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-sPPGpL/PyGreSQL/

Bug relating to arrays with nondefault bounds

See mailing list 2018-12-21:

A corrupted record at a customer exposed this bug.

Arrays with nondefault bounds fail if either bound begins with 0 or 9.

diff --git a/pgmodule.c b/pgmodule.c
index 08ed188..6fd4635 100644
--- a/pgmodule.c
+++ b/pgmodule.c
@@ -664,11 +664,11 @@ cast_array(char *s, Py_ssize_t size, int encoding,
                        if (s == end || *s++ != '[') break;
                        while (s != end && *s == ' ') ++s;
                        if (s != end && (*s == '+' || *s == '-')) ++s;
-                       if (s == end || *s <= '0' || *s >= '9') break;
+                       if (s == end || *s < '0' || *s > '9') break;
                        while (s != end && *s >= '0' && *s <= '9') ++s;
                        if (s == end || *s++ != ':') break;
                        if (s != end && (*s == '+' || *s == '-')) ++s;
-                       if (s == end || *s <= '0' || *s >= '9') break;
+                       if (s == end || *s < '0' || *s > '9') break;
                        while (s != end && *s >= '0' && *s <= '9') ++s;
                        if (s == end || *s++ != ']') break;
                        while (s != end && *s == ' ') ++s;
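The effect of the two comparisons in the patch can be reproduced in Python. The functions below mirror only the character test from the diff, not the surrounding parser, and their names are made up.

```python
def old_accepts(ch):
    """Mirror of the removed test: break if *s <= '0' || *s >= '9'."""
    return not (ch <= "0" or ch >= "9")


def new_accepts(ch):
    """Mirror of the fixed test: break if *s < '0' || *s > '9'."""
    return not (ch < "0" or ch > "9")


if __name__ == "__main__":
    digits = "0123456789"
    # The old test wrongly rejects exactly the digits 0 and 9.
    print([d for d in digits if not old_accepts(d)])
    print([d for d in digits if not new_accepts(d)])
```

This is why a value such as '[0:2]={1,2,3}' was affected: its lower bound begins with the digit 0, which the old test rejected.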

Consistent handling of disabled features

PyGreSQL has some preprocessor macros such as SSL_INFO that control which features are enabled. Some features may be disabled when they are not supported by the installed PostgreSQL version.

The behavior for features that are not enabled should be consistent: Either these methods/attributes should simply not be provided (using them will raise AttributeErrors) or they should be always available, but using them should raise NotSupportedError.

Currently the behavior is mostly the former, but not very consistent. The latter has the advantage that the behavior will be the same when an old version of libpq is installed and used at runtime, and applications need only one type of check to safeguard against old libpq versions.
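The second option, always providing the method but raising NotSupportedError, is sketched below with stand-in classes. The method name ssl_info, the version threshold and the return value are placeholders chosen for the illustration and do not describe the real module.

```python
class NotSupportedError(Exception):
    """Stand-in for the DB-API NotSupportedError."""


class SketchConnection:
    """Hypothetical connection with a feature that needs a newer libpq."""

    def __init__(self, libpq_version):
        self._libpq_version = libpq_version

    def ssl_info(self):
        # The method always exists; it fails only when it is called.
        if self._libpq_version < (9, 5):
            raise NotSupportedError("ssl_info requires a newer libpq")
        return "available"


def supports_ssl_info(connection):
    """The single kind of check an application would need."""
    try:
        connection.ssl_info()
    except NotSupportedError:
        return False
    return True
```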

Support for sslmode and sslrootcert

I've been looking around and can't seem to find how to pass these opts: ?sslmode=verify-full&sslrootcert=rds-ca-2019-root.pem.

Are SSL options supported?

Command "python setup.py egg_info" failed with error code 1

I'm not able to install pygresql. When I run pip install, I get this:

(postgresql) Matts-MacBook-Pro:~ mattspeck$ pip install pygresql
Collecting pygresql
  Using cached PyGreSQL-5.0.4.tar.gz
    Complete output from command python setup.py egg_info:
    sh: pg_config: command not found
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/0q/fzk2q0_94tn_z1vlf446_jdr0000gn/T/pip-build-qOeMG3/pygresql/setup.py", line 88, in <module>
        pg_version = pg_version()
      File "/private/var/folders/0q/fzk2q0_94tn_z1vlf446_jdr0000gn/T/pip-build-qOeMG3/pygresql/setup.py", line 82, in pg_version
        match = re.search(r'(\d+)\.(\d+)', pg_config('version'))
      File "/private/var/folders/0q/fzk2q0_94tn_z1vlf446_jdr0000gn/T/pip-build-qOeMG3/pygresql/setup.py", line 74, in pg_config
        raise Exception("pg_config tool is not available.")
    Exception: pg_config tool is not available.
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/0q/fzk2q0_94tn_z1vlf446_jdr0000gn/T/pip-build-qOeMG3/pygresql/

I'm running Anaconda Python 2.7.13.
I've tried installing both within and outside of the virtualenv. Anyone know why this is happening / if there is a fix?

I'm also running on MacOS.
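The traceback above shows that the build needs the pg_config tool on the PATH. A quick way to check this before running pip is sketched below with the standard library (Python 3); the function name is made up, and the script is a diagnostic aid only, not part of the PyGreSQL setup.

```python
import shutil
import subprocess


def pg_config_version():
    """Return the output of pg_config --version, or None if not found."""
    exe = shutil.which("pg_config")
    if exe is None:
        return None
    return subprocess.check_output([exe, "--version"], text=True).strip()


if __name__ == "__main__":
    version = pg_config_version()
    if version is None:
        print("pg_config is not on the PATH; add the PostgreSQL bin directory")
    else:
        print("found", version)
```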

PostgreSQL 13 support

Hi,

PostgreSQL 13.0 is due tomorrow. When will you release v13 compatible version?

Regards, Devrim

cursor.copy_from() method doesn't work with quoted named tables

The error below is thrown when trying to copy a CSV file into a Postgres table that has quotes in its name in order to use CamelCase style.

The Error message is:

pg.ProgrammingError: ERROR:  syntax error at or near "User"
LINE 1: copy "public."User"" from stdin (format csv,delimiter ','...

The code that generates the error:

cursor.copy_from(file, "public.\"User\"", null="")

where file is the csv file object and the second argument is the schema.table name

Trying other names for the schema.table, such as 'public."User"' or any variations, doesn't work either.

I assume the problem is not a Postgres internal problem because the code above works with psycopg2
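The error message suggests that the whole name is wrapped in one pair of quotes. One possible approach, sketched here with a hypothetical helper that is not part of pgdb, is to quote each part of a qualified name separately: parts that are already quoted are kept, and unquoted parts are folded to lower case, as PostgreSQL does, before being quoted.

```python
import re

# A part is either a quoted identifier (with "" as an escaped quote)
# or a run of characters without dots and quotes.
_PART = re.compile(r'"(?:[^"]|"")*"|[^."]+')


def quote_qualified_name(name):
    """Quote each part of a possibly schema-qualified table name."""
    parts = []
    for part in _PART.findall(name):
        if part.startswith('"'):
            parts.append(part)
        else:
            parts.append('"' + part.lower() + '"')
    return ".".join(parts)
```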

Missing adaptation of parameters in query() and query_prepared()

In the pgdb module and the query_formatted() and higher level methods of the pg module such as update(), input parameters are adapted automatically, e.g. converting Python lists to PostgreSQL arrays.

However, the lower level query() and query_prepared() methods of the pg module pass parameters in raw form. This is not well documented, and there is no help on what to do in this case. We should explain how and when to manually adapt the parameters. Maybe also provide alternative low level methods with the Postgres parameter syntax or improve the low level methods so that they automatically adapt parameters as well if that's possible without loss of performance or backward compatibility.
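As an illustration of what manual adaptation means, the sketch below turns a Python list into the text form of a PostgreSQL array literal, which could then be passed as a raw parameter. The helper adapt_array is hypothetical and far simpler than the adaptation done by the higher level methods; it covers only None, booleans, numbers, strings and nested lists.

```python
def adapt_array(values):
    """Format a (possibly nested) Python list as an array literal."""
    items = []
    for value in values:
        if value is None:
            items.append("NULL")
        elif isinstance(value, (list, tuple)):
            items.append(adapt_array(value))
        elif isinstance(value, bool):
            # Checked before int, because bool is a subclass of int.
            items.append("t" if value else "f")
        elif isinstance(value, (int, float)):
            items.append(str(value))
        else:
            # Quote strings and escape backslashes and double quotes.
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            items.append('"' + text + '"')
    return "{" + ",".join(items) + "}"
```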

(copied from Trac ticket 81, created 2019-06-07)

The pgdb cursor object should optionally be a real database cursor

It should be possible to create "real" (server side) cursors using the pgdb.Connection.cursor() method.

It would make sense to add an optional name parameter that would be used as the name of the server side cursor (see here).

(copied from Trac ticket 68, created 2016-01-15)

Consistent handling of optional functionality

PyGreSQL can currently be compiled with various options that would enable or disable parts of the functionality like large object support or default parameters. These options can be added as user options during setup and are passed to the compiler as preprocessor definitions.

This makes the documentation and code overly complicated. Instead, the complete functionality should be enabled by default. There should only be one "compatibility version" setting that would make it possible to not include functionality that is only supported in newer libpq versions, in order to be able to provide binaries compatible with older libpq versions.

Also, the behavior for options that are not enabled should be consistent: Either these methods/attributes should simply not be provided (using them will raise AttributeErrors) or they should be always available but using them should raise NotSupportedError when they require a newer libpq version. Currently the behavior is mostly the former, but not very consistent. The latter has the advantage that the behavior will be the same when an old version of libpq is installed and used at runtime, and applications need only one type of check to safeguard against old libpq versions.

Create separate repository for SQLAlchemy dialect

PyGreSQL existed as a dialect in SQLAlchemy, but should now go into a separate repository.

See sqlalchemy/sqlalchemy#5189

【Performance】fetchall performance

Hi, recently we encountered a performance problem when we upgraded PyGreSQL from 4.1 to 5.1 in a Python 2.7 environment.

The test case only exercises the fetchall interface. It simply queries 2 million rows. The test case is:

import datetime
import time
import pgdb
try:
    conn=pgdb.connect(host='127.0.0.1:5433',user='postgres',database='postgres')
    c=conn.cursor()
    c.execute("select * from XXX_int")
    row = c.fetchall()
    c.close()
    conn.close()
except (pgdb.InternalError,Exception) as e:
    print(str(e) + "\nFAILED")

The server version is PostgreSQL 12.4; its details are as follows:

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 12.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (EulerOS 4.8.5-28), 32-bit
(1 row)

postgres=# select count(1) from XXX_int;
  count
---------
 2000000
(1 row)

postgres=# \d xxx_int
              Table "public.xxx_int"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |

For private reasons, we need to compile a 32-bit version on the x86_64 platform with the -m32 flag.

We compared the execution time of the two versions.

py4.1

[/PyGreSQL-4.1/module/build/lib.linux-x86_64-2.7]$ time python test.py

real    0m7.868s
user    0m7.139s
sys     0m0.325s
[/PyGreSQL-4.1/module/build/lib.linux-x86_64-2.7]$ time python test.py

real    0m8.237s
user    0m7.473s
sys     0m0.333s
[/PyGreSQL-4.1/module/build/lib.linux-x86_64-2.7]$ time python test.py

real    0m7.856s
user    0m7.151s
sys     0m0.298s

py5.1

[/usr/local/lib/python2.7/site-packages/PyGreSQL-5.1-py2.7-linux-x86_64.egg]$ time python test.py

real    0m13.153s
user    0m12.408s
sys     0m0.318s
[/usr/local/lib/python2.7/site-packages/PyGreSQL-5.1-py2.7-linux-x86_64.egg]$ time python test.py

real    0m13.223s
user    0m12.505s
sys     0m0.291s
[/usr/local/lib/python2.7/site-packages/PyGreSQL-5.1-py2.7-linux-x86_64.egg]$ time python test.py

real    0m13.261s
user    0m12.535s
sys     0m0.313s

The average time is about 8 s with PyGreSQL 4.1 versus about 13.2 s with PyGreSQL 5.1.

Analysis

Our analysis shows that most of the time is spent in the following statement at the end of the fetchmany function, which constructs the returned result:

return [row_factory([typecast(value, typ)
            for typ, value in zip(coltypes, row)]) for row in result]

Comparing the fetchmany function in the two versions, we found a few small differences and rewrote it so that the lookups are bound to local names:

typecast = self.type_cache.typecast
row_factory = self.row_factory
coltypes = self.coltypes
return [row_factory([typecast(value, typ)
            for typ, value in zip(coltypes, row)]) for row in result]

The execution times are as follows:

[ /usr/local/lib/python2.7/site-packages/PyGreSQL-5.1-py2.7-linux-x86_64.egg]$ time python test.py

real    0m11.158s
user    0m10.451s
sys     0m0.288s
[ /usr/local/lib/python2.7/site-packages/PyGreSQL-5.1-py2.7-linux-x86_64.egg]$ time python test.py

real    0m11.226s
user    0m10.530s
sys     0m0.300s

With this change, the execution time drops from about 13 s to about 11 s.
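
The effect of binding attribute lookups to local names before a large list comprehension can be checked without a database. The following is only a minimal sketch: the Cursor class, its typecast method, and the sample rows are hypothetical stand-ins and not PyGreSQL's actual code. It shows that both variants build the same result and lets you time them on your own interpreter.

```python
import timeit


class Cursor:
    """Hypothetical stand-in for a cursor; not PyGreSQL's actual class."""

    def __init__(self, coltypes):
        self.coltypes = coltypes
        self.row_factory = tuple

    def typecast(self, value, typ):
        # Simplified cast: only integers are converted in this sketch.
        return int(value) if typ == "int" else value

    def build_with_attribute_lookups(self, result):
        # Every iteration looks up the attributes on self again.
        return [self.row_factory([self.typecast(value, typ)
                for typ, value in zip(self.coltypes, row)])
                for row in result]

    def build_with_local_names(self, result):
        # Attribute lookups happen once, before the loops start.
        typecast = self.typecast
        row_factory = self.row_factory
        coltypes = self.coltypes
        return [row_factory([typecast(value, typ)
                for typ, value in zip(coltypes, row)])
                for row in result]


def compare(num_rows=100000, repeat=3):
    """Print the time taken by both variants for num_rows sample rows."""
    cur = Cursor(["int", "int"])
    rows = [("1", "2")] * num_rows
    for name in ("build_with_attribute_lookups", "build_with_local_names"):
        func = getattr(cur, name)
        seconds = timeit.timeit(lambda: func(rows), number=repeat)
        print("%s: %.3f s" % (name, seconds))


if __name__ == "__main__":
    compare()
```

The absolute numbers depend on the Python version and platform, so this only indicates whether local binding helps in a given environment.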

Another difference is the typecast function, but I do not know how to optimize it.

Integer Precision error when installing PyGreSQL 5.1.1

Getting the following error when attempting to install PyGreSQL 5.1 and 5.1.1 on Mac OS 10.15.3 using pip.

Seems similar to #5, as I attempted to pin 5.1.1 but noticed the fix commit targeted a separate file.

Building wheels for collected packages: pygresql
  Building wheel for pygresql (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /Users/paged/.venvs/pg-devops/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/3b/yymg6h1546l32bjb9m6pv2nh0000gp/T/pip-install-sTZ5be/pygresql/setup.py'"'"'; __file__='"'"'/private/var/folders/3b/yymg6h1546l32bjb9m6pv2nh0000gp/T/pip-install-sTZ5be/pygresql/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /private/var/folders/3b/yymg6h1546l32bjb9m6pv2nh0000gp/T/pip-wheel-kjBSwN
       cwd: /private/var/folders/3b/yymg6h1546l32bjb9m6pv2nh0000gp/T/pip-install-sTZ5be/pygresql/
  Complete output (17 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.macosx-10.15-x86_64-2.7
  copying pg.py -> build/lib.macosx-10.15-x86_64-2.7
  copying pgdb.py -> build/lib.macosx-10.15-x86_64-2.7
  running build_ext
  building '_pg' extension
  creating build/temp.macosx-10.15-x86_64-2.7
  cc -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -iwithsysroot /usr/local/libressl/include -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -pipe -DPYGRESQL_VERSION=5.1.1 -DDIRECT_ACCESS -DLARGE_OBJECTS -DDEFAULT_VARS -DESCAPING_FUNCS -DSSL_INFO -I/Users/paged/.venvs/pg-devops/include/python2.7 -I/usr/local/include -I/Users/paged/.venvs/pg-devops/include/python2.7 -c pgmodule.c -o build/temp.macosx-10.15-x86_64-2.7/pgmodule.o -O2 -funsigned-char -Wall -Werror
  In file included from pgmodule.c:183:
  ./pgquery.c:122:22: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
      q->current_row = row;
                     ~ ^~~
  1 error generated.
  error: command 'cc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for pygresql
  Running setup.py clean for pygresql
Failed to build pygresql

Add automatic linting

Add automatic linting of Python files (e.g. using flake8) and C code to make the code more consistent and catch potential problems early.

Add autocommit option to pgdb

The CREATE DATABASE command fails with a transaction block error, and there is no option to enable autocommit. Is there a way to create a database with the PyGreSQL DB-API 2 module?
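
For illustration, here is a minimal sketch of how such an option could be used. It assumes the connection object exposes a writable autocommit attribute, which is not part of the DB-API 2 standard and may not exist in the pgdb version you have installed. The create_database helper and the name check are hypothetical and only serve the example.

```python
import re


def create_database(conn, name):
    """Run CREATE DATABASE outside a transaction block.

    Assumes conn is a DB-API style connection with a writable
    ``autocommit`` attribute (an assumption, not DB-API 2 standard).
    """
    # Database names cannot be passed as query parameters, so only
    # plain identifiers are accepted in this sketch.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("unsafe database name: %r" % (name,))
    previous = getattr(conn, "autocommit", False)
    conn.autocommit = True
    try:
        cur = conn.cursor()
        try:
            cur.execute('CREATE DATABASE "%s"' % name)
        finally:
            cur.close()
    finally:
        # Restore the previous mode so later statements behave as before.
        conn.autocommit = previous


# Hypothetical usage, given an open pgdb connection:
#   conn = pgdb.connect(host='127.0.0.1:5433', user='postgres',
#                       database='postgres')
#   create_database(conn, 'newdb')
```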

Support PQresultMemorySize() in PostgreSQL 12

Suggested on the mailing list 2019-05-15:

PostgreSQL 12 has a new libpq function PQresultMemorySize() to report the memory size of the query result (Lars Kanis, Tom Lane)

We should support that in PyGreSQL when PostgreSQL 12 is released.

(copied from Trac ticket 80, created 2019-05-15)

Request to support csv with header when using copy_from

Because PyGreSQL is supported in AWS Glue Python Shell jobs, I use it to import the CSV results of Athena queries into RDS PostgreSQL.
The CSV result of an Athena query has a header as its first line. The PostgreSQL COPY command has a parameter to handle this, but the copy_from method does not seem to have an equivalent.

Could you consider adding this feature?
Thank you.
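
Until such a parameter exists, one possible workaround is to drop the header line on the client side before handing the data to copy_from. The sketch below assumes that copy_from accepts an iterable of text lines as its input stream and a format argument. The skip_header helper, the file name, and the table name are hypothetical.

```python
import io
from itertools import islice


def skip_header(stream, header_lines=1):
    """Return an iterator over the lines of stream without the header.

    Assumes the header occupies exactly header_lines physical lines,
    i.e. it contains no quoted field with an embedded newline.
    """
    return islice(iter(stream), header_lines, None)


# Hypothetical usage with an open pgdb cursor:
#   with io.open('athena_result.csv', newline='') as f:
#       cursor.copy_from(skip_header(f), 'my_table', format='csv')
```

Skipping the header this way keeps the rest of the stream untouched, so quoting and separators are still interpreted by the server.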

Error installing on OS X under python 3

I am having the same issue this person had:
https://stackoverflow.com/questions/37627609/clang-error-when-installing-pygresql-under-mac-os

pip3 install PyGreSQL
Collecting PyGreSQL
  Using cached PyGreSQL-5.0.3.tar.gz
Building wheels for collected packages: PyGreSQL
  Running setup.py bdist_wheel for PyGreSQL ... error
  Complete output from command /usr/local/opt/python3/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/sx/vktstng55zj19cp2b0fs82r80000gn/T/pip-build-8je2_lb6/PyGreSQL/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /var/folders/sx/vktstng55zj19cp2b0fs82r80000gn/T/tmp5p2s33jrpip-wheel- --python-tag cp36:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.macosx-10.12-x86_64-3.6
  copying pg.py -> build/lib.macosx-10.12-x86_64-3.6
  copying pgdb.py -> build/lib.macosx-10.12-x86_64-3.6
  running build_ext
  building '_pg' extension
  creating build/temp.macosx-10.12-x86_64-3.6
  clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DPYGRESQL_VERSION=5.0.3 -DDIRECT_ACCESS -DLARGE_OBJECTS -DDEFAULT_VARS -DESCAPING_FUNCS -I/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/include/python3.6m -I/usr/local/Cellar/postgresql/9.6.2/include -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/include/python3.6m -c pgmodule.c -o build/temp.macosx-10.12-x86_64-3.6/pgmodule.o -O2 -funsigned-char -Wall -Werror
  pgmodule.c:3694:3: error: code will never be executed [-Werror,-Wunreachable-code]
                  long    num_rows;
                  ^~~~~~~~~~~~~~~~~
  1 error generated.
  error: command 'clang' failed with exit status 1

I don't see the same code in pgmodule.c:3694:3 any more so is this fixed and just not pushed to the cheese shop?