
DraCor API

This is the eXist-db application providing the API for https://dracor.org.

The API Documentation is available at https://dracor.org/doc/api/.

Getting Started

git clone https://github.com/dracor-org/dracor-api.git
cd dracor-api
docker compose up
# load data, see below

We provide a compose.yml that allows you to run an eXist database with dracor-api locally, together with the supporting dracor-metrics service and a triple store. With Docker installed, simply run:

docker compose up

This pulls the necessary images from Docker Hub and starts the respective containers. The eXist database will become available at http://localhost:8080/. To check that the DraCor API is up, run

curl http://localhost:8088/api/v1/info

By default, when you run docker compose up for the first time, a password for the admin user of the eXist database is generated and printed to the console. If you instead want to use a specific password, set the EXIST_PASSWORD environment variable like this:

EXIST_PASSWORD=mysecret docker compose up

To use the database with an empty password, e.g. on a local machine, run:

EXIST_PASSWORD= docker compose up

The Docker Compose setup also includes a DraCor frontend connected to the local eXist instance. It can be accessed by opening http://localhost:8088/ in a browser.

Load Data

To load corpus data into the database, use the DraCor API. First add a corpus:

curl https://raw.githubusercontent.com/dracor-org/testdracor/main/corpus.xml | \
curl -X POST \
  -u admin: \
  -d@- \
  -H 'Content-type: text/xml' \
  http://localhost:8088/api/v1/corpora

Then load the TEI files for the newly added corpus (in this case test):

curl -X POST \
  -u admin: \
  -H 'Content-type: application/json' \
  -d '{"load":true}' \
  http://localhost:8088/api/v1/corpora/test

This may take a while. Eventually the added plays can be listed with

curl http://localhost:8088/api/v1/corpora/test

With jq installed you can pretty print the JSON output like this:

curl http://localhost:8088/api/v1/corpora/test | jq

VS Code Integration

For the Visual Studio Code editor, an eXist-db extension is available that allows syncing a local working directory with an eXist database, thus enabling comfortable development of XQuery code.

We provide a configuration template to connect your dracor-api working copy to the dracor-v1 workspace in a local eXist database (e.g. the one started with docker compose up).

After installing the VS Code extension, copy the template to create an .existdb.json configuration file:

cp .existdb.json.tmpl .existdb.json

Adjust the settings if necessary and restart VS Code. You should now be able to start the synchronization from a button in the status bar at the bottom of the editor window.

XAR Package

To build a dracor-api XAR EXPath package that can be installed via the dashboard of any eXist-db instance, just run ant.

Webhook

The DraCor API provides a webhook (/webhook/github) that can trigger an update of the corpus data when the configured GitHub repository for the corpus changes.

Note: For the webhook to work, the shared secret between DraCor and GitHub needs to be configured at /db/data/dracor/secrets.xml in the database.
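GitHub signs each webhook delivery with an HMAC-SHA256 of the raw request body, sent in the X-Hub-Signature-256 header, keyed with that shared secret. A minimal standalone Python sketch of the check the endpoint has to perform (not the actual XQuery implementation; payload and secret are made up):

```python
import hashlib
import hmac

def verify_signature(payload: bytes, secret: str, signature_header: str) -> bool:
    """Recompute the HMAC of the body and compare it to X-Hub-Signature-256."""
    expected = "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

body = b'{"ref": "refs/heads/main"}'  # hypothetical delivery payload
header = "sha256=" + hmac.new(b"mysecret", body, hashlib.sha256).hexdigest()
print(verify_signature(body, "mysecret", header))  # True
```

hmac.compare_digest is used instead of == to keep the comparison constant-time.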

License

dracor-api is MIT licensed.

dracor-api's People

Contributors: afuetterer, cmil, dependabot[bot], ingoboerner, mathias-goebel

dracor-api's Issues

Add number of words per character to node data in GEXF

#24 (comment):

One other obvious thing to provide on a per-node basis would be the number of words per character (i.e., everything within <sp>/<p> and <sp>/<l>, including <emph> but excluding <note> and <stage> within <sp>…).

It would add to our visualisations to align the node sizes with the number of words per character, an aspect we can't visualise at the moment…
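A toy sketch of that counting rule (standalone Python over a made-up <sp> snippet; apart from the element names taken from the issue, everything here is an assumption):

```python
import re
import xml.etree.ElementTree as ET

SP = """<sp who="#vroni">
  <speaker>VRONI</speaker>
  <stage>sitzt.</stage>
  <p>Erzähl weiter <emph>von</emph> meiner Mutter!</p>
</sp>"""

def spoken_text(el):
    """Concatenate text, descending into children but skipping <note> and <stage>."""
    if el.tag in ("note", "stage"):
        return ""
    parts = [el.text or ""]
    for child in el:
        parts.append(spoken_text(child))
        parts.append(child.tail or "")
    return "".join(parts)

sp = ET.fromstring(SP)
words = []
for block in sp:
    if block.tag in ("p", "l"):  # only <sp>/<p> and <sp>/<l> count
        words += re.findall(r"\w+", spoken_text(block))

print(len(words))  # 5: Erzähl, weiter, von, meiner, Mutter
```

The <emph> content is counted (it is reached by the recursion) while <stage> and <note> subtrees are dropped, matching the rule proposed above.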

Add CORS headers

To allow the API to be used by other web apps, we need to add appropriate CORS headers.

Calculate and store corpus metrics when database is updated

To be able to show more detailed corpus metrics (e.g. token counts) on the dracor.org home page, those metrics should be calculated after an update of the corpus files in any sub collection of /db/data/dracor. The numbers should be stored in the database, possibly at /db/data/dracor/metrics.xml.

The re-calculations could be triggered after updates from both load.xq and github-webhook.xq.

Add table with character data to API

Proposed name:

/corpora/{corpusname}/play/{playname}/characters/csv

Proposed values:

ID and label:

  • Character ID
  • Character Label

Three quantitative measures:

  • Scene Appearances
    • = number of scenes a character appears in
  • Speech Acts
    • = number of <sp> per character
  • Number of Words

Five network-based measures (per character)

  • Degree
  • Weighted Degree
  • Betweenness Centrality
  • Closeness Centrality
  • Eigenvector Centrality

As far as I can see, we do not calculate network values per character for API purposes yet. Our Shiny app has already implemented this in the Vertices tab and may serve as a point of reference for these values.
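The two degree-based measures fall straight out of the co-presence edge list; a standalone Python sketch with made-up characters and weights (the three centrality measures would additionally require a graph library or the metrics service):

```python
from collections import Counter

# Hypothetical co-presence edges: (character, character, number of shared scenes)
edges = [("Odoardo", "Emilia", 3), ("Emilia", "Prinz", 2), ("Prinz", "Marinelli", 5)]

degree = Counter()           # Degree: number of distinct co-players
weighted_degree = Counter()  # Weighted Degree: sum of edge weights
for a, b, w in edges:
    degree[a] += 1
    degree[b] += 1
    weighted_degree[a] += w
    weighted_degree[b] += w

print(degree["Emilia"], weighted_degree["Emilia"])  # 2 5
```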

Load corpora asynchronously

To avoid long-running HTTP requests and possible timeouts, the loading of an entire corpus from its repository should be done asynchronously. Similar to how we now deal with webhook deliveries, a POST /corpora/{corpusname} should just schedule a job, which then runs independently and updates the database.

collection structure

@cmil, you came up with notable thoughts on the current structure of the collections in the database. As far as I remember, this should lead to something like the following:

db
└── data
    └── dracor
        ├── metrics
        │   ├── ger
        │   ├── rus
        │   ├── shake
        │   └── swe
        ├── rdf
        │   ├── ger
        │   ├── rus
        │   ├── shake
        │   └── swe
        └── tei
            ├── ger
            ├── rus
            ├── shake
            └── swe

Is this correct? If not, maybe you can provide a better sample.

Add segmentation data in CSV and/or JSON

For example, for .csv:

member, scene
Podkolesin, Действие первое | Явление I
Podkolesin, Действие первое | Явление II
Stepan, Действие первое | Явление II
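A sketch of serialising such a table (standalone Python; the segmentation data is hard-coded purely for illustration):

```python
import csv
import io

# Hypothetical segmentation data: (scene label, characters appearing in it)
segments = [
    ("Действие первое | Явление I", ["Podkolesin"]),
    ("Действие первое | Явление II", ["Podkolesin", "Stepan"]),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["member", "scene"])
for scene, members in segments:
    for member in members:
        writer.writerow([member, scene])

print(buf.getvalue())
```

The csv module also takes care of quoting, should scene labels ever contain commas; a JSON variant would just emit the same (member, scene) pairs as objects.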

Implement PUT /corpora

In order to decouple data and application code and to be more flexible in how we populate a dracor-api instance, I suggest following a RESTful approach and implementing the appropriate methods to add and update data via the API itself.

A first step would be to add corpora by PUTting their metadata to /corpora. The payload would be a simple JSON structure like this:

{
    "name": "ger",
    "title": "German Drama Corpus",
    "repo": "https://github.com/dracor-org/gerdracor"
}

This creates an index.xml file in the corpus's TEI collection storing the metadata that is currently held in corpora.xml.

Optionally the TEI files can be loaded from the GitHub archive.

This gives us more flexibility in setting up a test environment without having to build different xar packages and allows us to add corpora to an installation without having to modify the software.

Improve precision of word counts

The dracor stats module currently uses a simple \W+ regular expression for tokenising texts to count words. This pattern treats characters like dashes (-) and apostrophes (') as word boundaries, which results in an imprecise word count. We should find a better regular expression and use the count of tei:w elements in shakedracor as a comparison for testing.
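The effect can be demonstrated in a few lines (standalone Python; the improved pattern is just one candidate, not a settled choice):

```python
import re

text = "Don't go half-heartedly, it won't work."
# Current approach: split on non-word characters, so ' and - break words apart.
naive = [t for t in re.split(r"\W+", text) if t]
# Candidate: allow apostrophes and hyphens inside a word.
better = re.findall(r"\w+(?:[-']\w+)*", text)
print(len(naive), len(better))  # 9 6
```

The naive split counts "Don't" as two tokens and "half-heartedly" as two, inflating the total from 6 to 9.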

Add playId to metadata table

Since we now have stable IDs for each play throughout all corpora, we could add playId to the metadata table, next to playName.

Web hook caching problem

Frequent corpus updates of the same TEI document pushed separately to GitHub fail to properly update the DraCor database via the webhook. This is because we currently use raw.githubusercontent.com to obtain the data for individual documents, which apparently caches requests for about 5 minutes.

To alleviate this problem we could obtain the documents as blobs from the GitHub Data API (see https://developer.github.com/v3/git/blobs/), which seems to cache for only about 1 minute.

This should probably be done before #49.

Add possibility to download graphs in GEXF format

The CSV format we offer for download is limited in what it can comprise. We should slowly start building a GEXF export. The first version could just comprise what the CSV comprises, but on top of the IDs also feature the labels, i.e., character names from <persName> (or <name>, for person groups). Here is an easy example of how to build the GEXF format.

Add normalisedYear to /corpora/{corpusname} output

The title of the ticket should be self-explanatory. ;)

Also, while doing this, we could change the name of the column "year" in the metadata table (/corpora/{corpusname}/metadata.csv) to "yearNormalised".

Offer various text slices as txt downloads via API (for quantitative research)

It should be possible to obtain various text slices as needed:

  • all spoken text within <sp> (excluding <stage>, including <emph>) per play as simple txt stream
  • all stage texts per play (everything within <stage>) as simple txt stream

Additional option:

  • offer spoken text (as detailed above) separated by {male | female | unknown} property if available

GitHub webhook causes timeout when updating too many files

The GitHub webhook currently processes the modified files right away. This takes too long when there are many of them and causes GitHub to abort the request, resulting in some files not being updated properly.

The webhook should be changed to just record the modified files so that a separately scheduled process could do the actual processing and database update.

Provide global ID resolver

dracor.org/id/rus000001 (etc.) should, depending on the request header:

  • provide an RDF triple (if RDF is requested), or
  • forward to respective play (if called via browser)

Refine RDF generation

With #9 and #41 we now have RDF documents for each play. See, for instance:

There are still a few issues:

  • dc:title is missing in the RDFs for ger and shake corpora
  • dc:creator has firstname lastname in shake, but lastname, firstname in ger and rus
  • network measures are not yet included
  • owl:sameAs seems to appear twice
  • Plays in Graph <https://dracor.org/ger> lack rdfs:label

To quote @ingoboerner in #9 (comment):

RDF should look as follows:

<rdf:RDF xml:lang="en" 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dracon="http://dracor.org/ontology#">
    <rdf:Description rdf:about="https://dracor.org/rus/andreyev-mysl">
        
        <rdfs:label xml:lang="en">Andreyev, Leonid: A Thought</rdfs:label>
        
        <rdfs:label xml:lang="ru">Андреев, Леонид Николаевич: Мысль</rdfs:label>
        
        <dc:creator xml:lang="en">Leonid Andreev</dc:creator>
        <dc:creator xml:lang="ru">Леонид Николаевич Андреев</dc:creator>
        
        <dc:title xml:lang="en">A Thought</dc:title>
        <dc:title xml:lang="ru">Мысль</dc:title>
        
        <!-- http://dracor.org/ontology#normalisedYear -->
        <!-- http://dracor.org/ontology#premiereYear -->
        <!-- http://dracor.org/ontology#printYear -->
        <!-- http://dracor.org/ontology#writtenYear -->
        
        
        <!-- Author as blank node -->
        <dracon:has_author>
            <rdf:Description>
                <rdfs:label xml:lang="en">Leonid Andreev</rdfs:label>
                <rdfs:label xml:lang="ru">Леонид Николаевич Андреев</rdfs:label>
                <owl:sameAs rdf:resource="http://www.wikidata.org/entity/Q310866"/>
            </rdf:Description>
        </dracon:has_author>
        
       <!-- network-measures -->
        
        <!-- http://dracor.org/ontology#averageClustering -->
        <!-- http://dracor.org/ontology#averageDegree -->
        <!-- http://dracor.org/ontology#averagePathLength -->
        <!-- http://dracor.org/ontology#density -->
        <!-- http://dracor.org/ontology#diameter -->
        <!-- http://dracor.org/ontology#maxDegree -->
        <!-- http://dracor.org/ontology#maxDegreeIds -->
        <!-- http://dracor.org/ontology#numOfActs -->
        <!-- http://dracor.org/ontology#numOfSegments -->
        <!-- http://dracor.org/ontology#numOfSpeakers -->
        
        
        <dracon:in_corpus rdf:resource="https://dracor.org/rus"/>
        
        <owl:sameAs rdf:resource="http://www.wikidata.org/entity/Q59355429"/>
        
    </rdf:Description>
</rdf:RDF>

I generate a blank node for the author because within dracor it does not have its own dracor-id; I tested example-rdf in my local installation of Jena and it worked.

The drdf:play-to-rdf() function in rdf.xqm should be adjusted to get there.

Consolidate use of 'id' and 'name' properties

The id properties of some resources (/corpora/{corpusName}, /corpora/{corpusName}/play/{playName}) still provide the play name (e.g. "gogol-revizor"), while /corpora/{corpusName}/metadata already shows the now available DraCor IDs of the plays. This should be changed so that id properties referring to a play always give the DraCor ID whereas the play name is provided by a name property.

Add function stage-directions-incl-speakers

As proposed by @nilsreiter, let us add another function called stage-directions-incl-speakers. This would be the same as the stage-directions function, but it would add the speaker strings if they directly precede a stage direction. So, for example, this one …

<sp who="#vroni">
  <speaker>VRONI</speaker>
  <stage>hat den Eimer umgestülpt und sich auf denselben gesetzt.</stage>
  <p>Erzähl weiter von meiner Mutter!</p>
</sp>

… would transform into:

VRONI hat den Eimer umgestülpt und sich auf denselben gesetzt.

The expected advantage of this added function is that part-of-speech tagging might work better with the subject of the sentence present.
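A sketch of the intended transformation (standalone Python over the example above; the actual implementation would of course be XQuery):

```python
import xml.etree.ElementTree as ET

SP = """<sp who="#vroni">
  <speaker>VRONI</speaker>
  <stage>hat den Eimer umgestülpt und sich auf denselben gesetzt.</stage>
  <p>Erzähl weiter von meiner Mutter!</p>
</sp>"""

sp = ET.fromstring(SP)
lines = []
prev = None
for el in sp:
    if el.tag == "stage":
        text = (el.text or "").strip()
        if prev is not None and prev.tag == "speaker":
            # Prepend the speaker string when it directly precedes the stage direction.
            text = (prev.text or "").strip() + " " + text
        lines.append(text)
    prev = el

print(lines[0])  # VRONI hat den Eimer umgestülpt und sich auf denselben gesetzt.
```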

Handle payload encoding in api:sparql

In api:sparql the Content-Encoding of the POST data needs to be taken into account when processing the query. Depending on the request the encoding can differ.

While the following httpie request seems to get properly decoded (although resulting in a 400 response because of the nonsensical query)

echo "QUERY" | http -v https://dracor.org/api/sparql

the same query with curl results in an internal server error (500):

curl -v -X POST "https://dracor.org/api/sparql" -H "accept: application/sparql-results+xml" -H "Content-Type: application/sparql-query" -d "QUERY"

Include attribute definition into GEXF format

In order to recognise attributes, Gephi needs them defined upfront, so in our case (also proposing to lowercase "gender" in the id attribute):

<graph defaultedgetype="undirected" mode="static">
  <attributes class="node" mode="static">
    <attribute id="gender" title="Gender" type="string"></attribute>
  </attributes>
  <nodes>
  […]

While we are at it, maybe we could add a header:

<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns="http://www.gexf.net/1.3" version="1.3">

Thus, also jumping from GEXF 1.2draft to 1.3?
Also, "MALE" and "FEMALE" are correctly inserted, but "UNKNOWN" is not (no <attvalues> given in these cases).
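A standalone Python sketch of the proposed structure, built with xml.etree, including an explicit <attvalue> for a node so that no case is left without one (node data is made up):

```python
import xml.etree.ElementTree as ET

NS = "http://www.gexf.net/1.3"
ET.register_namespace("", NS)  # serialise with a default namespace, no prefix

gexf = ET.Element(f"{{{NS}}}gexf", version="1.3")
graph = ET.SubElement(gexf, f"{{{NS}}}graph",
                      defaultedgetype="undirected", mode="static")
# Declare node attributes upfront so Gephi recognises them.
attributes = ET.SubElement(graph, f"{{{NS}}}attributes",
                           {"class": "node", "mode": "static"})
ET.SubElement(attributes, f"{{{NS}}}attribute",
              id="gender", title="Gender", type="string")
nodes = ET.SubElement(graph, f"{{{NS}}}nodes")
node = ET.SubElement(nodes, f"{{{NS}}}node", id="vroni", label="VRONI")
attvalues = ET.SubElement(node, f"{{{NS}}}attvalues")
ET.SubElement(attvalues, f"{{{NS}}}attvalue",
              {"for": "gender", "value": "UNKNOWN"})

xml = '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(gexf, encoding="unicode")
print(xml)
```

The dict form is used for the class and for attributes because both are Python keywords.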

Implement PUT on /corpora/{corpusname}/play/{playname}

As a second step after #44, I suggest implementing PUT and POST on a single play. This would accept a TEI document and create a new resource or update an existing one in the database, allowing us to add or update a single play without depending on a GitHub push or reloading the entire corpus.

Retire github-webhook.xq

In #48 we added a new webhook endpoint to the API. Once this proves to work well enough we should remove the old webhook implementation in github-webhook.xq.

Eigenvector Centrality in cast list differs from values calculated by Gephi and igraph

We're not the first to notice this mismatch (cf. "Eigenvector Centrality Oddity with iGraph, Gephi, and NetworkX"). While that article finds diverging values for all three, igraph, Gephi and NetworkX, we find that igraph and Gephi throw the same results, while NetworkX begs to differ.

To add another example, here's what our R script throws (using igraph) for "Emilia Galotti":

r_screenshot

The documentation for igraph and NetworkX both insinuate that they're relying on the same algorithm. Could you maybe check if you throw the 'edge weights' into the formula (which we don't do)? This could explain the different values…

Originally posted by @lehkost in #31 (comment)
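One way to test the edge-weight hypothesis without any of the three libraries is a plain power iteration on a toy graph (standalone Python, not DraCor data; the + x[n] shift iterates on A + I, which keeps the iteration from oscillating on bipartite graphs while preserving the eigenvectors):

```python
def eigenvector_centrality(adj, iters=500):
    """Power iteration on (A + I) for a dict-of-dicts weighted adjacency."""
    x = {n: 1.0 for n in adj}
    for _ in range(iters):
        new = {n: x[n] + sum(w * x[m] for m, w in adj[n].items()) for n in adj}
        norm = max(new.values())
        x = {n: v / norm for n, v in new.items()}
    return x

# Path a - b - c; the a-b edge is much heavier than b-c.
weighted = {"a": {"b": 5}, "b": {"a": 5, "c": 1}, "c": {"b": 1}}
unweighted = {n: {m: 1 for m in nbrs} for n, nbrs in weighted.items()}

u = eigenvector_centrality(unweighted)
w = eigenvector_centrality(weighted)
```

Ignoring the weights makes a and c score identically (they are symmetric in the unweighted path), while taking the weights into account ranks a clearly above c, so the two conventions produce different values for the same network.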

Offer metadata table via API

We would need something like this table for research purposes:
https://github.com/lehkost/RusDraCor/blob/master/Ira_Scripts/NEW_calculations.csv

Our Python scripts would welcome such an up-to-date metadata table, and so would our Shiny app. It would be nice to know how to calculate the more complicated network metrics in XQuery and whether there are any libraries out there that might help in doing so. In any case, we can certainly start with a subset that is easy to calculate.

I propose the following columns:

  • title: filename of the play (incl. author)
  • genre: value of <term type="genreTitle" subtype="{tragedy|comedy|etc.}"/> (can be empty)
  • year (normalised): one year according to our algorithm specifying the normalised year out of written/print/premiere data
  • number of segments
  • number of acts: just counting occurrences of <div type="act">
  • network size
  • density
  • diameter
  • average path length
  • average clustering coefficient
  • average degree
  • maximum degree
  • characters with maximum degree: up to three, possibly divided by "|", if more than three, then "several characters"

(See #18)
In addition (no hurry) we should introduce some quantitative values, like:

  • number of words uttered by female|male characters
  • number of speech acts by female|male characters

For the quantitative values, it is maybe a good idea to have a cron job calculate the table every so often instead of generating it live with every request?
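For the normalised year column, a purely hypothetical sketch of such a fallback rule (an illustration of the idea, not necessarily the algorithm referred to above):

```python
def normalised_year(written=None, premiere=None, printed=None):
    """Hypothetical normalisation: earliest public year, else the written year."""
    public = [y for y in (premiere, printed) if y is not None]
    return min(public) if public else written

print(normalised_year(written=1770, premiere=1772, printed=1772))  # 1772
```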

Cast function of API throws server error for some plays

The play where it first turned up was Lermontov's "Maskarad":

Looking at the error message, it seems to have to do with the way numbers are stored, probably coming from the metrics service?

Most plays seem to work fine, but there are a few others that are affected, like:

Build issues

This ticket collects a few build issues that are still extant after the merge of PR #36. The below output of ant devel demonstrates some of them.

  • the exist package is extracted twice, in dependencies and prepare-exist, which is unnecessarily time-consuming
  • there is an ERROR: Could not bind to port because Address already in use at the end of the init target. While this does not seem to be of any consequence for running the development database, it looks fishy and should be solved one way or the other.
  • currently, when building a XAR package you cannot see from the outside whether it has been built with the -Dtestdracor=true option or not, which may lead to confusion
$ ant devel
Buildfile: /Users/cmil/Projects/dracor/dracor-api/build.xml

check-devel:

test-corpora:

corpora:
     [copy] Copying 1 file to /Users/cmil/Projects/dracor/dracor-api

create-dirs:

xar:
     [copy] Copying 1 file to /Users/cmil/Projects/dracor/dracor-api
      [zip] Building zip: /Users/cmil/Projects/dracor/dracor-api/build/dracor-0.33.0.xar

dependencies:
      [get] Destination already exists (skipping): /Users/cmil/Projects/dracor/dracor-api/build/dependencies/eXist-db-4.5.0.tar.bz2
    [untar] Expanding: /Users/cmil/Projects/dracor/dracor-api/build/dependencies/eXist-db-4.5.0.tar.bz2 into /Users/cmil/Projects/dracor/dracor-api/devel
      [get] Destination already exists (skipping): /Users/cmil/Projects/dracor/dracor-api/build/dependencies/crypto-0.3.5.xar
      [get] Destination already exists (skipping): /Users/cmil/Projects/dracor/dracor-api/build/dependencies/sparql-latest.xar

prepare-exist:
     [echo] install eXist to devel/eXist-db-4.5.0
      [get] Destination already exists (skipping): /Users/cmil/Projects/dracor/dracor-api/build/dependencies/eXist-db-4.5.0.tar.bz2
    [untar] Expanding: /Users/cmil/Projects/dracor/dracor-api/build/dependencies/eXist-db-4.5.0.tar.bz2 into /Users/cmil/Projects/dracor/dracor-api/devel
     [copy] Copying 2 files to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/autodeploy

set-ports:
     [xslt] Processing /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-http.xml to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-http-tmp.xml
     [xslt] Loading stylesheet /Users/cmil/Projects/dracor/dracor-api/resources/ant/jetty-port-update.xslt
     [move] Moving 1 file to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc
     [xslt] Processing /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-ssl.xml to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-ssl-tmp.xml
     [xslt] Loading stylesheet /Users/cmil/Projects/dracor/dracor-api/resources/ant/jetty-port-update.xslt
     [move] Moving 1 file to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc
     [xslt] Processing /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty.xml to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-tmp.xml
     [xslt] Loading stylesheet /Users/cmil/Projects/dracor/dracor-api/resources/ant/jetty-port-update.xslt
     [move] Moving 1 file to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc

init:
     [exec] 
     [exec] 20 Dec 2018 18:25:40,219 [main] INFO  (JettyStart.java [run]:149) - Running with Java 1.8.0_191 [Oracle Corporation (Java HotSpot(TM) 64-Bit Server VM) in /Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/jre] 
     [exec] 20 Dec 2018 18:25:40,219 [main] INFO  (JettyStart.java [run]:156) - Running as user 'cmil' 
     [exec] 20 Dec 2018 18:25:40,220 [main] INFO  (JettyStart.java [run]:157) - [eXist Home : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0] 
     [exec] 20 Dec 2018 18:25:40,238 [main] INFO  (JettyStart.java [run]:158) - [eXist Version : 4.5.0] 
     [exec] 20 Dec 2018 18:25:40,238 [main] INFO  (JettyStart.java [run]:159) - [eXist Build : 201811211903] 
     [exec] 20 Dec 2018 18:25:40,238 [main] INFO  (JettyStart.java [run]:160) - [Git commit : e29b4099c] 
     [exec] 20 Dec 2018 18:25:40,238 [main] INFO  (JettyStart.java [run]:162) - [Operating System : Mac OS X 10.14.1 x86_64] 
     [exec] 20 Dec 2018 18:25:40,239 [main] INFO  (JettyStart.java [run]:163) - [log4j.configurationFile : file:///Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/log4j2.xml] 
     [exec] 20 Dec 2018 18:25:40,245 [main] INFO  (JettyStart.java [run]:164) - [jetty Version: 9.4.10.v20180503] 
     [exec] 20 Dec 2018 18:25:40,246 [main] INFO  (JettyStart.java [run]:165) - [jetty.home : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty] 
     [exec] 20 Dec 2018 18:25:40,246 [main] INFO  (JettyStart.java [run]:166) - [jetty.base : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty] 
     [exec] 20 Dec 2018 18:25:40,246 [main] INFO  (JettyStart.java [run]:167) - [jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/standard.enabled-jetty-configs] 
     [exec] 20 Dec 2018 18:25:40,747 [main] INFO  (JettyStart.java [run]:176) - Configuring eXist from /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/conf.xml 
     [exec] 20 Dec 2018 18:25:47,656 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty.xml] 
     [exec] 20 Dec 2018 18:25:47,773 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-gzip.xml] 
     [exec] 20 Dec 2018 18:25:47,800 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-http.xml] 
     [exec] 20 Dec 2018 18:25:47,831 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-jaas.xml] 
     [exec] 20 Dec 2018 18:25:47,836 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-jmx.xml] 
     [exec] 20 Dec 2018 18:25:47,869 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-requestlog.xml] 
     [exec] 20 Dec 2018 18:25:47,878 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-ssl.xml] 
     [exec] 20 Dec 2018 18:25:47,887 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-ssl-context.xml] 
     [exec] 20 Dec 2018 18:25:47,913 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-https.xml] 
     [exec] 20 Dec 2018 18:25:48,013 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-deploy.xml] 
     [exec] 20 Dec 2018 18:25:48,034 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-plus.xml] 
     [exec] 20 Dec 2018 18:25:48,054 [main] INFO  (JettyStart.java [run]:200) - [loading jetty configuration : /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/tools/jetty/etc/jetty-annotations.xml] 
     [exec] 20 Dec 2018 18:25:48,058 [main] INFO  (JettyStart.java [startJetty]:516) - [Starting jetty component : org.eclipse.jetty.server.Server] 
     [exec] 20 Dec 2018 18:25:48,059 [main] INFO  (JettyStart.java [lifeCycleStarting]:662) - Jetty server starting... 
     [exec] 20 Dec 2018 18:25:49,877 [main] ERROR (JettyStart.java [run]:368) - ---------------------------------------------------------- 
     [exec] 20 Dec 2018 18:25:49,878 [main] ERROR (JettyStart.java [run]:369) - ERROR: Could not bind to port because Address already in use 
     [exec] 20 Dec 2018 18:25:49,878 [main] ERROR (JettyStart.java [run]:370) - java.net.BindException: Address already in use 
     [exec] 20 Dec 2018 18:25:49,878 [main] ERROR (JettyStart.java [run]:371) - ---------------------------------------------------------- 
     [exec] /Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/bin/java -Xms128m -Xmx2048m -Dfile.encoding=UTF-8 -Dexist.home=/Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0 -jar /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/start.jar shutdown -u admin -p 
     [exec] Shutting down database instance at 
     [exec] 	xmldb:exist://localhost:8080/exist/xmlrpc/db
     [exec] 20 Dec 2018 18:25:52,695 [qtp416878771-23] INFO  (JettyStart.java [shutdown]:602) - Database shutdown: stopping server in 1sec ... 
     [xslt] Processing /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/conf.xml to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/conf.xml.tmp
     [xslt] Loading stylesheet /Users/cmil/Projects/dracor/dracor-api/resources/ant/exist-conf.xslt
     [move] Moving 1 file to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0
     [copy] Copying 1 file to /Users/cmil/Projects/dracor/dracor-api/devel/eXist-db-4.5.0/autodeploy

devel:
   [delete] Deleting: /Users/cmil/Projects/dracor/dracor-api/anttmp-062451

check-metrics:
     [echo] start the import process with `bash devel/eXist-db-4.5.0/bin/startup.sh`

BUILD SUCCESSFUL
Total time: 1 minute 1 second

/sparql does not accept `application/x-www-form-urlencoded; charset=UTF-8`

While POST requests to /sparql work fine when the Content-type header is exactly application/x-www-form-urlencoded, they won't be accepted when a charset is added to the MIME type, e.g. application/x-www-form-urlencoded; charset=UTF-8. In this case a 400 "HTTP method POST is not supported by this URL" response is sent.

This appears to be a limitation in eXist's RESTXQ implementation. Since content types with a charset are sent by default by clients like YASGUI, we should find a way to accept those.

Replace /corpora/{corpusname}/load with POST /corpora

The current way of loading the data of a corpus into the database by GET /corpora/{corpusname}/load should be replaced by a POST request on /corpora with a JSON payload like this:

{
  "load": true,
  "corpora": ["ger", "rus"]
}

Not only would this allow us to update multiple corpora at once, it would also be more RESTful, since load is not really a resource but rather an action upon a resource.

This is related to #44 and #45.
