korapsru's Introduction

KorapSRU

KorapSRU is the CLARIN Federated Content Search (FCS) endpoint for KorAP. It implements FCS specifications and connects the CLARIN FCS client Aggregator and KorAP. Thus, public resources in KorAP are accessible from Aggregator through KorapSRU.

Supported FCS Specifications

CLARIN defines the FCS specifications to allow distributed search across multiple heterogeneous search engines in a uniform way. The FCS specifications build on the SRU/CQL protocol for communication between a client and an endpoint. The FCS 1.0 specification supports SRU (Search/Retrieve via URL) 1.2, and the FCS 2.0 specification supports SRU 2.0.

The KorapSRU 1.0.1 release implements the FCS 1.0 specification and supports basic search using simple CQL (Contextual Query Language) for term, phrase and boolean queries. The FCS 2.0 specification is implemented in the newest version of KorapSRU, which has not been released yet. It supports extended search (e.g. annotation search) formulated in the FCS Query Language (FCSQL), which was developed based on Corpus Query Processor (CQP). FCSQL is only available with SRU version 2.0, whereas CQL is available with SRU versions 1.1, 1.2 and 2.0.
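The version constraints above can be summarised as a small lookup table. This is only an illustrative sketch, not code from KorapSRU: the names `SUPPORTED_SRU_VERSIONS` and `is_supported` are hypothetical, and the keys `cql` and `fcs` are assumed to mirror the values of the SRU `queryType` parameter.

```python
# Hypothetical lookup table: query language -> SRU versions it is available with.
# The keys "cql" and "fcs" are assumed to match the SRU queryType values.
SUPPORTED_SRU_VERSIONS = {
    "cql": {"1.1", "1.2", "2.0"},  # CQL is available with SRU 1.1, 1.2 and 2.0
    "fcs": {"2.0"},                # FCSQL is only available with SRU 2.0
}


def is_supported(query_type, sru_version):
    """Return True if the query language may be used with the given SRU version."""
    return sru_version in SUPPORTED_SRU_VERSIONS.get(query_type, set())


if __name__ == "__main__":
    print(is_supported("cql", "1.2"))
    print(is_supported("fcs", "1.2"))
```

A client could run such a check before sending a request, for instance to avoid combining an FCSQL query with SRU version 1.2.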

Usually, an FCS endpoint translates CQL and FCSQL queries into the native query language of its search engine. Since KorAP supports multiple query languages and has its own query translator, Koral, the translation is implemented in Koral, not in KorapSRU. Therefore, KorAP users will also be able to use CQL and FCSQL.

Supported SRU requests

SRU explain request

gives general information about KorapSRU and some default search settings, for instance the number of records it retrieves per page. See:

https://clarin.ids-mannheim.de/korapsru?operation=explain

To obtain more information such as supported annotation layers needed for requesting an extended search,

x-fcs-endpoint-description=true 

must be added as an extra request parameter. See:

https://clarin.ids-mannheim.de/korapsru?operation=explain&x-fcs-endpoint-description=true
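As a minimal sketch, the two explain URLs above could be assembled programmatically as follows. The helper name `build_explain_url` is hypothetical; only the base URL and the two request parameters are taken from the examples above.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the examples above.
BASE_URL = "https://clarin.ids-mannheim.de/korapsru"


def build_explain_url(endpoint_description=False, base_url=BASE_URL):
    """Build an SRU explain request URL (hypothetical helper).

    With endpoint_description=True, the extra request parameter
    x-fcs-endpoint-description=true is appended.
    """
    params = {"operation": "explain"}
    if endpoint_description:
        params["x-fcs-endpoint-description"] = "true"
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_explain_url())
    print(build_explain_url(endpoint_description=True))
```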

SRU search retrieve request

contains a CQL or FCSQL query. KorapSRU forwards the CQL or FCSQL query in an SRU search retrieve request URL to Kustvakt, the API provider of KorAP managing the communications among all KorAP components. Moreover, KorapSRU transforms the query results from Kustvakt into an SRU response.

Examples:

  • Basic search using CQL

Searching for all occurrences of the term Buch (German for "book")

https://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=Buch&version=1.2

  • Annotation search using FCSQL

Searching for all lemmas from the TreeTagger annotations that end in heit, in FCS query: [tt:lemma=".*heit"]

https://clarin.ids-mannheim.de/korapsru?operation=searchRetrieve&query=%5Btt%3Alemma%3D%22.*heit%22%5D&queryType=fcs
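The query in a search retrieve URL must be percent-encoded, which is easy to get wrong by hand. The following sketch shows one way to build such URLs with the Python standard library. The helper name `build_search_url` is hypothetical. The operation name is `searchRetrieve`, and the parameter names `query`, `version` and `queryType` are the ones used in the examples above.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the examples above.
BASE_URL = "https://clarin.ids-mannheim.de/korapsru"


def build_search_url(query, query_type=None, version=None, base_url=BASE_URL):
    """Build an SRU searchRetrieve request URL (hypothetical helper)."""
    params = {"operation": "searchRetrieve", "query": query}
    if version is not None:
        params["version"] = version
    if query_type is not None:
        params["queryType"] = query_type
    # safe="*" leaves the asterisk unescaped, as in the example URL above;
    # brackets, colons, equals signs and quotes are percent-encoded.
    return base_url + "?" + urlencode(params, safe="*")


if __name__ == "__main__":
    # Basic search using CQL
    print(build_search_url("Buch", version="1.2"))
    # Annotation search using FCSQL
    print(build_search_url('[tt:lemma=".*heit"]', query_type="fcs"))
```

Passing the raw query string and letting `urlencode` do the escaping avoids double-encoding a query that is already percent-encoded.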

Software Requirements

  • Java 7 (JDK 1.7 with JCE or OpenJDK 7)

  • Tomcat 7

  • Kustvakt

Installation

Configure the service URI in /src/main/webapp/WEB-INF/web.xml to a Kustvakt server URI, for example:

<context-param>
  <param-name>korap.service.uri</param-name>
  <param-value>http://localhost:8089/api/</param-value>
</context-param>
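Before building, it can be useful to check which service URI is actually configured. The sketch below reads a `context-param` value from the text of a `web.xml` file. It is an illustration only: the helper names are hypothetical, the sample document is a reduced stand-in for the real `web.xml`, and XML namespaces are ignored by comparing local tag names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def read_context_param(web_xml_text, name):
    """Return the param-value of the context-param with the given name, or None."""
    root = ET.fromstring(web_xml_text)
    for elem in root.iter():
        if local_name(elem.tag) != "context-param":
            continue
        values = {local_name(child.tag): (child.text or "").strip() for child in elem}
        if values.get("param-name") == name:
            return values.get("param-value")
    return None


# Reduced stand-in for /src/main/webapp/WEB-INF/web.xml (assumption).
SAMPLE_WEB_XML = """<web-app>
  <context-param>
    <param-name>korap.service.uri</param-name>
    <param-value>http://localhost:8089/api/</param-value>
  </context-param>
</web-app>"""

service_uri = read_context_param(SAMPLE_WEB_XML, "korap.service.uri")
parsed_uri = urlparse(service_uri)

if __name__ == "__main__":
    print(service_uri, parsed_uri.scheme, parsed_uri.port)
```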

KorapSRU is built on the FCSSimpleEndpoint library provided by CLARIN. KorapSRU 1.0.2-SNAPSHOT uses FCSSimpleEndpoint version 1.3.0, which is available from the CLARIN Nexus repository. To allow Maven to download the library using JDK 1.7, the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files 7 must also be installed.

To install a war file of KorapSRU, go to the root directory of the project and run

$ mvn install -Dhttps.protocols=TLSv1.2

in a terminal.

korapsru's People

Contributors

akron, dependabot[bot], idsgerrit, kupietz, margaretha

korapsru's Issues

Add license

Currently this repository has no license information.

Handling unsupported foundries

Supported foundries in KorapSRU are limited to those of free resources. In Kalamar, FCS-QL may be used for other resources and foundries. Thus, the check for supported foundries and the exceptions for unsupported foundries should be handled in KorapSRU instead of KorAP (Koral).

Build fails with Maven >= 3.8.1

[ERROR] Failed to execute goal on project KorapSRU: Could not resolve dependencies for project de.mannheim.ids:KorapSRU:war:1.0.4-SNAPSHOT: Failed to collect dependencies at eu.clarin.sru.fcs:fcs-simple-endpoint:jar:1.4.0 -> eu.clarin.sru:sru-server:jar:1.8.0 -> org.z3950.zing:cql-java:jar:1.12: Failed to read artifact descriptor for org.z3950.zing:cql-java:jar:1.12: Could not transfer artifact org.z3950.zing:cql-java:pom:1.12 from/to maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: [indexdata (http://maven.indexdata.com/, default, releases+snapshots)] -> [Help 1]

That's also why the CI builds fail.
