vitaliy-1 / jatsparser

JATSParser is aimed to be integrated with Open Journal Systems 3.0+ for transforming JATS XML to various formats

License: GNU General Public License v3.0

PHP 100.00%
jats-xml jats

jatsparser's Introduction

JATSParser

Usage

  • Install composer dependencies
  • See example.php
  • Doesn't handle JATS XML metadata; by design, metadata should be transferred from OJS
  • Transforms JATS to HTML and PDF, uses TCPDF for the latter conversion
  • Depends on citeproc-php to support different citation style formats

jatsparser's People

Contributors

marcellagreca, vitaliy-1

jatsparser's Issues

CiteProc throws a fatal error if year isn't valid

If an article isn't assigned to a journal issue, the year tag may be used to indicate that the article is in press (for example, it contains the text "in press" instead of a number), which causes a fatal error in CiteProc:

PHP Fatal error:  Uncaught Seboettg\CiteProc\Exception\InvalidDateTimeException: Could not create valid date with year=in press, month=1, day=1. in /home/doc/OJS/development/e-med/ojs/plugins/generic/jatsParser/JATSParser/vendor/seboettg/citeproc-php/src/Rendering/Date/DateTime.php:45
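
One possible workaround is to check the year before the citation data reaches citeproc-php. The sketch below assumes the citation is a CSL-JSON-style object with the year stored in issued->date-parts; the helper name sanitizeIssuedYear is hypothetical and is not part of JATSParser or citeproc-php. A non-numeric year is moved to the CSL "status" variable and the date is dropped, so no date has to be built from "in press".

```php
<?php
/**
 * Hypothetical helper (assumption): not part of JATSParser or citeproc-php.
 * Assumes $item is a CSL-JSON-style object whose year is stored in
 * $item->issued->{'date-parts'}[0][0].
 */
function sanitizeIssuedYear(stdClass $item): stdClass
{
    if (!isset($item->issued->{'date-parts'}[0][0])) {
        return $item; // no date to check
    }
    $year = trim((string) $item->issued->{'date-parts'}[0][0]);
    // Accept only plain years of one to four digits
    if (preg_match('/^\d{1,4}$/', $year) === 1) {
        return $item;
    }
    // Non-numeric value such as "in press": keep the text, drop the date
    $item->status = $year;
    unset($item->issued);
    return $item;
}

$item = json_decode('{"title":"Example","issued":{"date-parts":[["in press"]]}}');
$item = sanitizeIssuedYear($item);
echo json_encode($item), PHP_EOL;
```

Whether the citation style actually prints the "status" variable depends on the CSL style in use.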

Text nodes shouldn't be trimmed when converted to HTML

Sometimes writers/editors will accidentally make the space after an italicized word also italic.
When it's displayed on the page, those words will run together.

For example, with a JATS XML file like this:

<italic id="italic-9">Diplomacy </italic>reimagined ...

The space between the two words is inside the <italic> element, but the generated HTML looks like this:

<i>Diplomacy</i>
reimagined

This results in the two words running together on the rendered page (screenshot not preserved).

I think it's because of the trim on this line https://github.com/Vitaliy-1/JATSParser/blob/master/src/JATSParser/HTML/Text.php#L71
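
To illustrate the point, here is a minimal sketch of an inline conversion that copies text nodes without calling trim(). The function name inlineToHtml and its tiny tag map are assumptions made for this example; this is not the code in Text.php.

```php
<?php
// Sketch only: inlineToHtml() and its minimal tag map are illustrative
// assumptions, not JATSParser's actual code.
function inlineToHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            // Keep the text as-is: no trim(), so boundary spaces survive
            $html .= htmlspecialchars($child->nodeValue, ENT_NOQUOTES);
        } elseif ($child instanceof DOMElement && $child->nodeName === 'italic') {
            $html .= '<i>' . inlineToHtml($child) . '</i>';
        } elseif ($child instanceof DOMElement) {
            // Unknown inline tag: keep its text content only
            $html .= inlineToHtml($child);
        }
    }
    return $html;
}

$doc = new DOMDocument();
$doc->loadXML('<p><italic id="italic-9">Diplomacy </italic>reimagined</p>');
echo inlineToHtml($doc->documentElement), PHP_EOL;
```

With the trailing space kept inside the <i> element, the browser renders the two words separately.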

Figures inside a <p> aren't recognized correctly.

It seems that <fig> tags are not correctly recognized if present inside a <p>.

Actually, according to the documentation this is possible (https://jats.nlm.nih.gov/publishing/tag-library/1.1/element/fig.html).

I've been trying to replace ./fig with .//fig here, but it doesn't seem to be going right...

foreach (self::$xpath->evaluate(".//sec|./p|./list|./table-wrap|./fig|./media|./disp-quote|./verse-group", $body) as $content) {
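
The difference between the expressions can be checked in isolation with plain DOMXPath. The document and the helper matchedIds below are made up for this sketch. It shows that ./fig only sees figures that are direct children of the context node, while adding ./p/fig also picks up figures nested in a paragraph.

```php
<?php
$xml = <<<XML
<body>
  <p>Text before <fig id="f1"><label>Figure 1</label></fig> text after.</p>
  <fig id="f2"><label>Figure 2</label></fig>
</body>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$body = $doc->documentElement;

// Hypothetical helper: collect the id of every node matched by an expression
function matchedIds(DOMXPath $xpath, string $expr, DOMNode $context): array
{
    $ids = [];
    foreach ($xpath->evaluate($expr, $context) as $node) {
        $ids[] = $node->getAttribute('id');
    }
    return $ids;
}

$direct = matchedIds($xpath, './fig', $body);         // direct children only
$nested = matchedIds($xpath, './fig|./p/fig', $body); // also figures inside <p>

echo implode(',', $direct), PHP_EOL;
echo implode(',', $nested), PHP_EOL;
```

Matching the nested figure is probably only half of the job: the enclosing <p> is still matched by ./p and handled as a paragraph, so the paragraph handling would also need to skip or emit the figure. That part is not shown here.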

Anchor tags

Hi Vitaly,
I think you forgot the anchor tags to jump to the figures.
In "./JATSParser/src/JATSParser/HTML/Figure.php", this works for me:

...        
	public function setContent(JATSFigure $jatsFigure) {

+               // anchor tags
+               $aNode = $this->ownerDocument->createElement("a");
+               $aNode->setAttribute("name", $jatsFigure->getId());
+               $this->appendChild($aNode);

...

Regards
Olaf

Line breaks

Line breaks such as <br> or (in XML) <break> in the XML body text are stripped out.
Can you please add support for such tags?

Best regards

Edvin
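
A rough sketch of what such support could look like with PHP's DOM extension: each JATS <break/> child is turned into an HTML <br>. The helper copyWithBreaks is hypothetical and reduces all other child nodes to plain text. The example uses a table cell, one of the places where JATS allows <break>.

```php
<?php
// Illustrative sketch: copyWithBreaks() is a hypothetical helper,
// not a function that exists in JATSParser.
function copyWithBreaks(DOMElement $source, DOMDocument $htmlDoc, string $htmlTag): DOMElement
{
    $target = $htmlDoc->createElement($htmlTag);
    foreach ($source->childNodes as $child) {
        if ($child instanceof DOMElement && $child->nodeName === 'break') {
            // JATS <break/> becomes HTML <br>
            $target->appendChild($htmlDoc->createElement('br'));
        } elseif ($child instanceof DOMText || $child instanceof DOMElement) {
            // Other nodes are reduced to their text in this sketch
            $target->appendChild($htmlDoc->createTextNode($child->textContent));
        }
    }
    return $target;
}

$jats = new DOMDocument();
$jats->loadXML('<td>First line<break/>Second line</td>');
$html = new DOMDocument();
$cell = $html->appendChild(copyWithBreaks($jats->documentElement, $html, 'td'));
echo $html->saveHTML($cell), PHP_EOL;
```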

Add support for disp-quote

Hi, we're testing the plugin for one of our journals. We have quotations tagged with 'disp-quote' in our JATS XML, and they are not rendered.

Example:

...poverty affects adolescents and young people disproportionately: in 2008, an estimated 35 million adolescents in the region, aged between 13 and 19, were living below the poverty line. Almost 15 million adolescents, aged between 10 and 18, were living on less than 1 dollar a day.

Thank you in advance
Diego

Appendix not extracted from JATS XML

The parser does not extract appendices from the XML.
Appendices are tagged with tag within the
Here is an example of XML for an appendix:

.......... <title>APPENDIX</title>

With respect to our sample, the time spent to clear the 3 consistencies tested ranges from 4 up to 44 seconds. Stratifying linearly this time on 9 levels of 5 seconds, a factor of correction (FOC) is obtained to adjust the p-score in the following way: 0-5 secs =+ 1, 6-10 secs = +2, 11-15 secs = +3, 16-20 secs = +4, 21-25 secs = +5, 26-30 secs = +6, 31-35 secs = +7, 36-40 secs = +8, > 40 secs = +9 (Table V). The sum of the p-score total + FOC represents the timed p-score (tp-score). In this new role, the tp-score ranges from 5 up to 20, expressing itself as a continuum of severity 26. The possibility of a clinical subdivision of the tp-score in further levels is under consideration.

Issue with inline-graphic, display-formula and fig tags

Hi,

We have the following tags in our JATS XML files:
<inline-graphic xlink:href="ES12366-34-1-001-00" id="ID_8b8883f5-3d03-4d61-adf9-bbae1bb3d9ef"></inline-graphic>
<disp-formula><alternatives> <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="bpasch-38c-2-001-ueqn001"/> </alternatives></disp-formula>
Also, the following tag does not work inside a table or p tag:
<fig><label>Figure 1:</label><caption><p>Plots of <italic>&#x03C1;</italic> v/s c of copper surfactants (derived from fried and unfried oils) solution in benzene.</p></caption> <graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="bpasch-38c-2-001-f001.jpg"/> </fig>

Please help me with this issue.

After parsing with Vitaliy-1/JATSParser, not all of the data appears in the HTML and PDF output.

Add class to p-element in poetry

Is it possible to add a class to the poetry content (i.e. the p-element in HTML) when it is converted from XML into HTML? That way its appearance can be changed with CSS (for example with an indent and/or italics).
You have classes like this already for e.g. headings and figures.

Display inline-formula

Hi @Vitaliy-1, do you think it would be possible for JATSParser to render math formulas like this one?
We are using OldGregg Theme.

<td id="tc-314904a80e79" align="left">
			<inline-formula id="if-42f432a5f18c"> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>C</mml:mi><mml:mi>O</mml:mi><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:munderover><mml:mrow><mml:mo>∑</mml:mo><mml:msubsup><mml:mi>n</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo>-</mml:mo><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:mrow><mml:mi>N</mml:mi><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:math></inline-formula>
</td>

Best regards!

Hyperlinks to external uri inside JATS XML <p> tag

Hi @Vitaliy-1, I need to reference a supplementary file inside a JATS XML formatted article, which has also been published as a PDF galley. The only way I could think of is to add a hyperlink inside the corresponding <p> tag using an <ext-link> tag as shown below…

<p id="p-8fe1d0c07051">La estructura factorial de tres factores se mantuvo en la versión final del instrumento, con las siguientes características (ver <xref id="x-98099e9a5fef" rid="tw-6e408fb0f00d" ref-type="table">Table 2</xref> y <ext-link ext-link-type="uri" xlink:href="http://www.evidencia.org.ar/index.php/Evidencia/article/view/4253/1789">material complementario</ext-link>.</p>

…but this does not work: the link text (the "material complementario" URI) is not shown at all when the XML is rendered.

If you could add this feature, it would be great! Thanks
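
For reference, a minimal sketch of how an <ext-link> could be mapped to an HTML anchor with PHP's DOM extension. The helper extLinkToAnchor is hypothetical and not part of JATSParser. The sketch also assumes the xlink namespace is declared on an ancestor element, which the snippet above omits.

```php
<?php
const XLINK_NS = 'http://www.w3.org/1999/xlink';

// Hypothetical helper (assumption): not part of JATSParser's API
function extLinkToAnchor(DOMElement $extLink, DOMDocument $htmlDoc): DOMElement
{
    $a = $htmlDoc->createElement('a');
    // xlink:href is a namespaced attribute, so read it with getAttributeNS()
    $a->setAttribute('href', $extLink->getAttributeNS(XLINK_NS, 'href'));
    $a->appendChild($htmlDoc->createTextNode($extLink->textContent));
    return $a;
}

$jats = new DOMDocument();
$jats->loadXML(
    '<p xmlns:xlink="http://www.w3.org/1999/xlink">ver '
    . '<ext-link ext-link-type="uri" xlink:href="http://www.evidencia.org.ar/index.php/Evidencia/article/view/4253/1789">'
    . 'material complementario</ext-link>.</p>'
);
$html = new DOMDocument();
$extLink = $jats->getElementsByTagName('ext-link')->item(0);
$a = $html->appendChild(extLinkToAnchor($extLink, $html));
echo $html->saveHTML($a), PHP_EOL;
```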
