franzinc / agraph-python

AllegroGraph Python client

Home Page: https://franz.com/agraph

License: MIT License

Python 67.54% Shell 0.19% Makefile 0.65% Java 31.59% HTML 0.02%

agraph-python's Introduction

AllegroGraph Python API

The AllegroGraph Python API offers convenient and efficient access to an AllegroGraph server from a Python-based application. This API provides methods for creating, querying and maintaining RDF data, and for managing the stored triples. The AllegroGraph Python API deliberately emulates the Eclipse RDF4J (formerly Aduna Sesame) API to make it easier to migrate from RDF4J to AllegroGraph. The AllegroGraph Python API has also been extended in ways that make it easier and more intuitive than the RDF4J API.

Requirements

Python versions >=3.8,<=3.12 are supported. The installation method described here uses the pip package manager. On some systems this might require installing an additional package (e.g. python-pip on RHEL/CentOS systems). All third-party libraries used by the Python client will be downloaded automatically during installation.

Installation

Important

It is highly recommended to perform the install in a virtualenv environment.

The client can be installed from PyPI using the pip package manager:

pip install agraph-python

Alternatively, a distribution archive can be obtained from ftp://ftp.franz.com/pub/agraph/python-client/ and installed using pip:

pip install agraph-python-<VERSION>.tar.gz

Offline installation

If it is not possible to access PyPI from the target machine, the following steps should be taken:

  • In a compatible environment with unrestricted network access run:

    pip wheel agraph-python
    
  • This will generate a number of .whl files in the current directory. These files must be transferred to the target machine.

  • On the target machine use this command to install:

    pip install --no-index --find-links=<DIR> agraph-python
    

    where <DIR> is the directory containing the .whl files generated in the previous step.

Install by conda

Using conda to install agraph-python is also supported:

conda create -n myenv python=3.10
conda activate myenv
conda install -y -c conda-forge -c franzinc agraph-python

Testing

To validate the installation make sure that you have access to an AllegroGraph server and run the following Python script:

from franz.openrdf.connect import ag_connect
with ag_connect('repo', host='HOST', port='PORT',
                user='USER', password='PASS') as conn:
    print(conn.size())

Substitute appropriate values for the HOST/PORT/USER/PASS placeholders. If the script runs successfully, a new repository named repo will be created.

Proxy setup

It is possible to configure the AllegroGraph Python client to use a proxy for all its connections to the server. This can be achieved by setting the AGRAPH_PROXY environment variable, as in the following example:

# Create a SOCKS proxy for tunneling to an internal network
ssh -fN -D 1080 <USER>@<GATEWAY_HOST>
# Configure agraph-python to use this proxy
export AGRAPH_PROXY=socks://localhost:1080

The format of the AGRAPH_PROXY value is TYPE://HOST:PORT, where TYPE can be either http, socks4, socks5 or socks (a synonym for socks5). Note that if a SOCKS proxy is used, DNS lookups will be performed by the proxy server.
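
The TYPE://HOST:PORT format described above can be illustrated with a small parser. This is only a sketch: the function name parse_agraph_proxy and the PROXY_TYPES table are hypothetical and are not taken from the client's source, which may validate the value differently.

```python
from urllib.parse import urlsplit

# Accepted TYPE values; "socks" is treated as a synonym for "socks5".
PROXY_TYPES = {
    "http": "http",
    "socks4": "socks4",
    "socks5": "socks5",
    "socks": "socks5",
}


def parse_agraph_proxy(value):
    """Split a TYPE://HOST:PORT string into (type, host, port).

    Hypothetical helper for illustration only; it is not part of
    agraph-python.
    """
    parts = urlsplit(value)
    if parts.scheme not in PROXY_TYPES:
        raise ValueError(f"unsupported proxy type in {value!r}")
    if not parts.hostname or parts.port is None:
        raise ValueError(f"expected TYPE://HOST:PORT, got {value!r}")
    return PROXY_TYPES[parts.scheme], parts.hostname, parts.port
```

For the value used in the example above, parse_agraph_proxy("socks://localhost:1080") yields the type socks5, the host localhost and the port 1080.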

Unit tests

The Python client includes a suite of unit tests that can be run after installation. The tests are executed using the pytest framework and also use a few utilities from nose, so these two packages have to be installed. We also need the pytest-mock plugin:

pip install -e ".[test]"

The tests require a running AllegroGraph server instance. The configuration of this server is passed to the tests through environment variables:

# Host and port where the server can be reached. These values are the
# defaults; it is only necessary to define the variables below if your
# setup is different.
export AGRAPH_HOST=localhost
export AGRAPH_PORT=10035

# Tests will create repositories in this catalog.
# It must exist on the server. Use "/" for the root catalog.
export AGRAPH_CATALOG=tests

# Login credentials for the AG server.
# The user must have superuser privileges.
export AGRAPH_USER=test

# Use a prompt to read the password
read -s -r -p "Password for user ${AGRAPH_USER}: " AGRAPH_PASSWORD
export AGRAPH_PASSWORD
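
For reference, the variables above could be read from Python roughly as follows. This is a sketch, not the test suite's own code: the function name agraph_test_settings is hypothetical, and only the host and port defaults (localhost and 10035) come from the comments above. The remaining values are simply returned as None when unset.

```python
import os


def agraph_test_settings(environ=None):
    """Collect the AGRAPH_* settings into a dict (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return {
        # Defaults stated in the shell snippet above.
        "host": env.get("AGRAPH_HOST", "localhost"),
        "port": int(env.get("AGRAPH_PORT", "10035")),
        # No defaults are documented for these, so None means "not set".
        "catalog": env.get("AGRAPH_CATALOG"),
        "user": env.get("AGRAPH_USER"),
        "password": env.get("AGRAPH_PASSWORD"),
    }
```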

To run the tests, type:

pytest --pyargs franz.openrdf.tests.tests --pyargs franz.openrdf.tests.newtests

agraph-python's People

Contributors

brucedclayton, dancyatfranz, dancysoft, divaricatum, dklayer, hudsonatfranz, macdavid313, marijnh, maxdebayser, theihor, tsznuk, vseloved

agraph-python's Issues

Circular import

There is a circular import in query.py: it transitively imports franz.openrdf.repository.repository, which in turn imports query.

To test (with python2) just run this single line:

#> from franz.openrdf.query.query import QueryLanguage

and you will get:

Traceback (most recent call last):
  File "test.py", line 1, in <module>
    from franz.openrdf.query.query import QueryLanguage
  File "/home/mbayser/.pyenv/versions/agraph_test/lib/python2.7/site-packages/agraph_python-100.0.0.dev0-py2.7.egg/franz/openrdf/query/query.py", line 21, in <module>
    from .queryresult import GraphQueryResult, TupleQueryResult
  File "/home/mbayser/.pyenv/versions/agraph_test/lib/python2.7/site-packages/agraph_python-100.0.0.dev0-py2.7.egg/franz/openrdf/query/queryresult.py", line 15, in <module>
    from ..repository.repositoryresult import RepositoryResult
  File "/home/mbayser/.pyenv/versions/agraph_test/lib/python2.7/site-packages/agraph_python-100.0.0.dev0-py2.7.egg/franz/openrdf/repository/__init__.py", line 10, in <module>
    from .repository import Repository
  File "/home/mbayser/.pyenv/versions/agraph_test/lib/python2.7/site-packages/agraph_python-100.0.0.dev0-py2.7.egg/franz/openrdf/repository/repository.py", line 22, in <module>
    from .repositoryconnection import RepositoryConnection
  File "/home/mbayser/.pyenv/versions/agraph_test/lib/python2.7/site-packages/agraph_python-100.0.0.dev0-py2.7.egg/franz/openrdf/repository/repositoryconnection.py", line 24, in <module>
    from ..query.query import Query, TupleQuery, UpdateQuery, GraphQuery, BooleanQuery, QueryLanguage

To work around this, one only needs to import repository first:

import franz.openrdf.repository.repository
from franz.openrdf.query.query import QueryLanguage

off-by-1 error in method 'franz.openrdf.model.value.URI.split()'

This line in the method 'franz.openrdf.model.value.URI.split()' adds 1 to the position of the split resulting in incorrect results:

return self.uri[:pos + 1], self.uri[pos + 1:]

The correct line should be:

return self.uri[:pos], self.uri[pos:]

I also see that the methods in this file sometimes use 'self.getURI()' and other times 'self.uri'. It would be nice if they were consistent, one way or the other.

Regards,
Franco Venturi

urllib3 Retry API change

As of urllib3 2.0.0 (released 2023-04-26), the method_whitelist argument for Retry (which had been deprecated for a while) was removed in favor of allowed_methods.

However, the removed argument is still referenced here:

retries = Retry(backoff_factor=0.1,
                connect=10,  # 10 retries for connection-level errors
                status_forcelist=(),  # Retry only on connection errors
                method_whitelist=False)  # Retry on all methods, even POST and PUT

Therefore, pinning the version of urllib3 is required. It would be great if this long-deprecated, now removed argument was replaced with the proper one.
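
One possible way to avoid the pin is to pick the keyword at runtime based on what the installed Retry class accepts. The sketch below is not a patch from the maintainers: the helper names retry_method_kwargs and build_retry are hypothetical. It relies on the falsy value None for allowed_methods meaning "retry on any method", which mirrors method_whitelist=False in older releases.

```python
import inspect


def retry_method_kwargs(retry_cls):
    """Return the keyword that makes retry_cls retry on every HTTP method.

    Newer urllib3 releases accept allowed_methods; older ones only know
    method_whitelist. Hypothetical helper for illustration.
    """
    params = inspect.signature(retry_cls.__init__).parameters
    if "allowed_methods" in params:
        return {"allowed_methods": None}   # falsy value: retry on any method
    return {"method_whitelist": False}     # legacy spelling of the same thing


def build_retry():
    """Build a Retry equivalent to the one quoted above (needs urllib3)."""
    from urllib3.util.retry import Retry  # imported lazily on purpose

    return Retry(backoff_factor=0.1,
                 connect=10,           # 10 retries for connection-level errors
                 status_forcelist=(),  # retry only on connection errors
                 **retry_method_kwargs(Retry))
```

If only urllib3 1.26 or later needs to be supported, the simpler change is to pass the allowed_methods keyword directly in place of method_whitelist.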

Hostname not resolved when using docker service alias in spec string with server.openSession

While working with agraph-python, I experienced an issue that might be a bug in agraph-python itself.

I am running version v7.0.4 of the franzinc/agraph Docker image on a CI machine through GitLab Runner. The relevant part of the GitLab CI configuration file looks like this:

variables:
  AGRAPH_SUPER_USER: "XXX"
  AGRAPH_SUPER_PASSWORD: "YYY"
  AGRAPH_USER: "XXX"      # same as super-user
  AGRAPH_PASSWORD: "YYY"  # same as super-password
  AGRAPH_PORT: 10035
  AGRAPH_HOST: "db"

services:
  - name: franzinc/agraph:v7.0.4
    alias: db

The important thing to note is that the agraph image was given the alias db.

At a certain point, I am trying to reuse the spec string of two existing franz.openrdf.repository.repositoryconnection.RepositoryConnection objects, to perform a federated SPARQL query. self._server is an AllegroGraphServer object, created with the same username, password and host arguments as self._ontology_engine and self._engine, which are RepositoryConnection objects, created with ag_connect, which are working.

[...]
            context = self._ontology_engine.createURI(
                f'http://www.osp-core.com/agraph_session_id#'
                f'{self._session_id}')
            spec_ontology = spec.graphFilter(self._ontology_engine.getSpec(),
                                             [context])
            spec_engine = self._engine.getSpec()
            spec_federated = spec.federate(spec_ontology, spec_engine)
            query_engine = self._server.openSession(spec_federated)

        tuple_query = query_engine.prepareTupleQuery(
            QueryLanguage.SPARQL, query_string)
[...]

After that, I call tuple_query.evaluate() and get the following error:

[...]
[...] line 270, in _sparql (the code pasted above)
    query_engine = self._server.openSession(spec_federated)
  File "/usr/local/lib/python3.7/site-packages/agraph_python-101.0.7-py3.7.egg/franz/openrdf/sail/allegrographserver.py", line 243, in openSession
    minirep = self._client.openSession(spec, autocommit=autocommit, lifetime=lifetime, loadinitfile=loadinitfile)
  File "/usr/local/lib/python3.7/site-packages/agraph_python-101.0.7-py3.7.egg/franz/miniclient/repository.py", line 192, in openSession
    loadInitFile=loadinitfile, store=spec))
  File "/usr/local/lib/python3.7/site-packages/agraph_python-101.0.7-py3.7.egg/franz/miniclient/request.py", line 99, in jsonRequest
    else: raise RequestError(status, body)
franz.miniclient.request.RequestError: Server returned 400: Failed to start a session with specification <db:simphony-ontologies>{<http://www.osp-core.com/agraph_session_id#4a797b0c-35b0-4781-bb82-638ad417e297>}+<db:test_osp_core_transfer>:
Unknown hostname: "db"

Actually, the spec_federated string is not exactly as the RequestError reports, the value is: <http://XXX:YYY@db:10035/repositories/simphony-ontologies>{<http://www.osp-core.com/agraph_session_id#1e444591-b5d9-4e9a-80c3-b0566ee85074>} + <http://XXX:YYY@db:10035/repositories/test_osp_core_transfer>.

I thought this did not make much sense, so I tried resolving the host name with the socket module and substituting the resulting IP address for the host name in the spec string.

[...]
            context = self._ontology_engine.createURI(
                f'http://www.osp-core.com/agraph_session_id#'
                f'{self._session_id}')
            spec_ontology = spec.graphFilter(self._ontology_engine.getSpec(),
                                             [context])
            spec_engine = self._engine.getSpec()
            spec_federated = spec.federate(spec_ontology, spec_engine)
            ip = socket.gethostbyname("db")
            spec_federated = spec_federated.replace('db', ip)
            query_engine = self._server.openSession(spec_federated)

        tuple_query = query_engine.prepareTupleQuery(
            QueryLanguage.SPARQL, query_string)
[...]

Surprisingly, after this change tuple_query.evaluate() works as expected, and the results of the query are correct. In the end I just worked around the issue by writing a function that resolves all host names in a spec string and replaces them.
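
A function along those lines might look like the sketch below. It is not the reporter's actual code: the name resolve_spec_hosts and the resolver parameter are hypothetical. It assumes repository URLs in the spec contain /repositories/ in their path, so that graph URIs inside the braces are left untouched, and it does not handle IPv6 literals.

```python
import re
import socket
from urllib.parse import urlsplit, urlunsplit

# Matches every <http://...> or <https://...> element of a spec string.
_URL_IN_SPEC = re.compile(r"<(https?://[^<>]+)>")


def resolve_spec_hosts(spec_string, resolver=socket.gethostbyname):
    """Replace host names in repository URLs with resolved addresses.

    Hypothetical helper; resolver is injectable so it can be tested
    without real DNS lookups.
    """
    def _rewrite(match):
        parts = urlsplit(match.group(1))
        # Assumption: only repository URLs carry "/repositories/" in the
        # path; anything else (e.g. a graph URI) is returned unchanged.
        if "/repositories/" not in parts.path or parts.hostname is None:
            return match.group(0)
        userinfo, at, hostport = parts.netloc.rpartition("@")
        host, colon, port = hostport.partition(":")
        netloc = f"{userinfo}{at}{resolver(host)}{colon}{port}"
        return "<" + urlunsplit(parts._replace(netloc=netloc)) + ">"

    return _URL_IN_SPEC.sub(_rewrite, spec_string)
```

Unlike a plain spec_federated.replace('db', ip), this only touches the host part of each repository URL, so a repository or user name that happens to contain the letters "db" is not altered.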

Let me know if you need more information to locate the source of the problem.

ag_connect does not correctly handle the protocol parameter

The protocol parameter is not handled correctly because of a typo in the following line of code:

 server = AllegroGraphServer(host=host, port=port, protcol=protocol,

Note the misspelling "protcol=protocol".

The workaround is to pass the full URI as the host, rather than passing host, port and protocol separately.
