This project forked from pkp/ots

PKP XML Parsing Service

Module Description

  • User
      • Authentication
      • Registration
      • New password
  • Admin
      • Confirm registrations
      • Set a user's document conversion rate
      • Delete user
      • Edit user
      • System log viewer
  • Manager
      • Receives conversion jobs
      • Job list
      • Handles job distribution to queues
  • DocxConversion
      • Converts documents to DOCX format
  • NlmxmlConversion
      • Converts documents to NLMXML format
  • ReferenceConversion
      • Parses references from the DOCX document into a separate XML file
  • BibtexConversion
      • Converts the references from the previous step into BibTeX
  • BibtexreferenceConversion
      • Converts the BibTeX references into NLMXML and merges the converted references into the NLMXML document
  • HtmlConversion
      • Converts the NLMXML document into HTML
  • CitationStyleConversion
      • Formats the citations in the HTML document according to the citation style requested by the user
  • PdfConversion
      • Converts the HTML document into PDF
  • XmpConversion
      • Adds an XMP sidecar with metadata from the NLMXML to the PDF document
  • ZipConversion
      • Zips all documents
  • API
      • Simple REST API to submit and retrieve jobs and to provide functionality for the frontend's AJAX callbacks
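
The conversion modules above read as a chain in which each step consumes the output of an earlier one. The following is a minimal Python sketch of that idea, not the project's actual PHP implementation. It assumes the stages run in the order listed; `run_pipeline`, the handler functions, and the `completed` key are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List

# Stage names are taken from the module list; running them in exactly
# this order is an assumption of this sketch.
STAGES: List[str] = [
    "DocxConversion",
    "NlmxmlConversion",
    "ReferenceConversion",
    "BibtexConversion",
    "BibtexreferenceConversion",
    "HtmlConversion",
    "CitationStyleConversion",
    "PdfConversion",
    "XmpConversion",
    "ZipConversion",
]

Job = Dict[str, Any]
Handler = Callable[[Job], Job]


def run_pipeline(job: Job, handlers: Dict[str, Handler]) -> Job:
    """Pass the job through every stage in order and record progress."""
    for stage in STAGES:
        job = handlers[stage](job)
        job.setdefault("completed", []).append(stage)
    return job


if __name__ == "__main__":
    # Placeholder handlers that return the job unchanged.
    noop_handlers = {stage: (lambda job: job) for stage in STAGES}
    result = run_pipeline({"fileName": "document.docx"}, noop_handlers)
    print(result["completed"])
```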

Requirements

  • Apache mod_headers needs to be installed and enabled.
  • A Java VM needs to be installed.
  • Citation parsing has a variety of requirements; please refer to the ParsCit documentation.
  • xml2bib needs to be installed.
  • Pandoc and libghc-citeproc-hs-data need to be installed.
  • The XMP conversion needs ExifTool to be installed.
  • The DOCX conversion needs LibreOffice with unoconv installed. The server is tested to work with LibreOffice 4.2.4:
wget http://download.documentfoundation.org/libreoffice/stable/4.2.4/deb/x86_64/LibreOffice_4.2.4_Linux_x86-64_deb.tar.gz
tar -xzf LibreOffice_4.2.4_Linux_x86-64_deb.tar.gz
rm -f LibreOffice_4.2.4_Linux_x86-64_deb.tar.gz
sudo dpkg -i LibreOffice_4.2.4.2_Linux_x86-64_deb/DEBS/*.deb
rm -rf LibreOffice_4.2.4.2_Linux_x86-64_deb
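
Before installing, it can help to check which of the required command-line tools are already on the PATH. The Python sketch below does that with `shutil.which`. The executable names in `REQUIRED_COMMANDS` are assumptions about how these tools are usually installed and may differ on your distribution. The check does not cover Apache mod_headers or the ParsCit requirements.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust them to match your system.
REQUIRED_COMMANDS = ["java", "xml2bib", "pandoc", "exiftool", "unoconv"]


def missing_commands(
    commands: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the commands that cannot be found on the PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands(REQUIRED_COMMANDS)
    if missing:
        print("Missing executables: " + ", ".join(missing))
    else:
        print("All required executables were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests.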

Installation

  • Copy the source
# git clone https://github.com/MichaelThessel/xmlps.git
# cd xmlps
  • Install the dependencies
# php composer.phar self-update
# php composer.phar install
  • Configure the environment
# cp config/autoload/local.php.dist config/autoload/local.php
  • Change local.php according to your environment
  • Initialize the database
# vendor/doctrine/doctrine-module/bin/doctrine-module orm:schema-tool:update --force
  • Run the shell script that starts the conversion queues
# ./start_queues.sh

Unit tests

  • After a successful installation, the unit tests should complete without errors:
# ./unittest.sh

Developer notes

  • Sass compilation and the compression and concatenation of CSS and JavaScript are done using Guard (http://guardgem.org).
  • After making changes to JavaScript (javascript/) or style files (style/scss/), recompile and recompress the style and JavaScript files by running:
# guard

API

A simple REST API is available to submit jobs, query their status, and retrieve the converted documents from the server.

Submit

Submit a job to the server. The citationStyleHash is an internal identifier for the requested citation style. A list of hashes can be retrieved through the citationStyleList API. The API returns the job id, which can be used to query the server for the job status or to retrieve the completed job later.

URL: api/job/submit
Request type: POST
Parameters:

  • email
  • password
  • fileName
  • fileContent
  • citationStyleHash

Example request:

http://example.com/api/job/submit
POST parameters:
    'email' => '<email>'
    'password' => 'password'
    'fileName' => 'document.docx'
    'citationStyleHash' => 'c6de5efe3294b26391ea343053c19a84'
    'fileContent' => '...'

Example response:

{"status":"success","id":123}
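
A client for the submit call could look roughly like the Python sketch below. It only builds the request and parses a response of the shape shown above. Two things are assumptions, since the README does not specify them: that the parameters are sent form-encoded, and how `fileContent` must be encoded (the caller passes it in as-is here). The function names and `BASE_URL` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://example.com"  # placeholder host from the example


def build_submit_request(email, password, file_name, file_content,
                         citation_style_hash):
    """Build (but do not send) the POST request for api/job/submit."""
    fields = {
        "email": email,
        "password": password,
        "fileName": file_name,
        # Assumption: the server accepts the content as a form field.
        "fileContent": file_content,
        "citationStyleHash": citation_style_hash,
    }
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        BASE_URL + "/api/job/submit", data=body, method="POST"
    )


def parse_submit_response(raw):
    """Return the job id from a reply like {"status":"success","id":123}."""
    reply = json.loads(raw)
    # Assumption: any status other than "success" signals a failure.
    if reply.get("status") != "success":
        raise RuntimeError("submission failed: %r" % (reply,))
    return reply["id"]
```

To actually send the request, pass the result of `build_submit_request` to `urllib.request.urlopen` and feed the response body to `parse_submit_response`.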

Status

Returns the current status for a job. Only completed jobs can be retrieved from the server. A full list of statuses can be found here.

URL: api/job/status
Request type: GET
Parameters:

  • email
  • password
  • id

Example request:

http://example.com/api/job/status?email=<email>&password=password&id=123

Example response:

{"status":"success","jobStatus":0,"jobStatusDescription":"Pending"}
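
The status call can be wrapped in two small helpers, one that builds the URL and one that reads the reply. This is a Python sketch under the assumption that the response always has the fields shown in the example. `build_status_url`, `parse_status`, and `BASE_URL` are hypothetical names.

```python
import json
import urllib.parse

BASE_URL = "http://example.com"  # placeholder host from the example


def build_status_url(email, password, job_id):
    """Return the GET URL for api/job/status with encoded parameters."""
    query = urllib.parse.urlencode(
        {"email": email, "password": password, "id": job_id}
    )
    return BASE_URL + "/api/job/status?" + query


def parse_status(raw):
    """Return (jobStatus, jobStatusDescription) from a status reply."""
    reply = json.loads(raw)
    # Assumption: any status other than "success" signals a failure.
    if reply.get("status") != "success":
        raise RuntimeError("status query failed: %r" % (reply,))
    return reply["jobStatus"], reply["jobStatusDescription"]
```

Note that `urlencode` percent-encodes the `@` in the email address, which keeps the query string valid.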

Citation Style List

Returns a list of available citation styles and their internal ids. All citation styles from citationstyles.org are supported.

URL: api/job/citationStyleList
Request type: GET

Example request:

http://example.com/api/job/citationStyleList

Example response:

{"status":"success","citationStyles":{"c6de5efe3294b26391ea343053c19a84":"ACM SIG Proceedings (\u0022et al.\u0022 for 15+ authors)"...
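
Since the submit call expects a hash rather than a style name, a client will typically look the hash up in this response. The Python sketch below assumes `citationStyles` maps each hash to a display name, as the example response suggests. `find_style_hashes` is a hypothetical helper.

```python
import json


def find_style_hashes(raw, needle):
    """Return the hashes of all styles whose name contains `needle`."""
    reply = json.loads(raw)
    styles = reply["citationStyles"]  # assumed shape: hash -> display name
    needle = needle.lower()
    return [
        style_hash
        for style_hash, name in styles.items()
        if needle in name.lower()
    ]


if __name__ == "__main__":
    # Sample holding only the single entry shown in the example response.
    sample = (
        r'{"status":"success","citationStyles":'
        r'{"c6de5efe3294b26391ea343053c19a84":'
        r'"ACM SIG Proceedings (\u0022et al.\u0022 for 15+ authors)"}}'
    )
    print(find_style_hashes(sample, "acm sig"))
```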

Retrieve

Retrieve a converted document. The conversionStage parameter specifies which type of conversion you want returned. A full list of conversion stages can be found here.

URL: api/job/retrieve
Request type: GET
Parameters:

  • email
  • password
  • id
  • conversionStage

Example request:

http://example.com/api/job/retrieve?email=<email>&password=password&id=123&conversionStage=10

Example response:

The requested document or a JSON string with an error message.
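
Because the retrieve call returns either the document itself or a JSON error, a client has to tell the two apart. The Python sketch below is one way to do that. It assumes an error arrives as a JSON object and that anything else is the document. The README does not describe the error fields, so none are relied on here. The function names and `BASE_URL` are hypothetical.

```python
import json
import urllib.parse

BASE_URL = "http://example.com"  # placeholder host from the example


def build_retrieve_url(email, password, job_id, conversion_stage):
    """Return the GET URL for api/job/retrieve with encoded parameters."""
    query = urllib.parse.urlencode({
        "email": email,
        "password": password,
        "id": job_id,
        "conversionStage": conversion_stage,
    })
    return BASE_URL + "/api/job/retrieve?" + query


def split_retrieve_response(body):
    """Return ("error", parsed_json) or ("document", body) for raw bytes."""
    try:
        parsed = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON: treat as a document.
        return "document", body
    if isinstance(parsed, dict):
        return "error", parsed
    return "document", body
```

A stricter client could also inspect the response's Content-Type header, if the server sets one.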

Contributors

  • axfelix
  • michaelthessel
