ceurws / ceur-make

A set of scripts to semi-automatically generate workshop proceedings for CEUR-WS.org

License: GNU General Public License v3.0

TeX 0.66% Makefile 12.79% Shell 8.70% XSLT 74.71% Perl 2.12% Dockerfile 1.03%

ceur-make's Introduction

ceur-make

A free set of scripts to semi-automatically generate open access workshop proceedings for [CEUR-WS.org] 1, with special support for proceedings exported from [EasyChair] 2.

Key Advantages

  • facilitates the generation of CEUR-WS.org workshop proceedings volumes
    • … particularly so for workshops using the EasyChair submission system
  • enriches your CEUR-WS.org volume, as an extra feature over the standard template, with RDFa annotations to improve the visibility of your workshop to semantic search engines

Features

  • From a special table of contents file, generate
    • [a CEUR-WS.org compliant index.html file] 3 with additional RDFa annotations
    • [a CEUR-WS.org compliant copyright form] 4
    • a LaTeX table of contents that helps you generate an all-in-one PDF version of the proceedings (which you can, e.g., print, but which should not be uploaded to CEUR-WS.org)
    • a BibTeX database to make your proceedings citable
  • Optionally generate this table of contents from [EasyChair] 2 proceedings

Disclaimer

CEUR-WS.org may revise their [submission rules] 6 and their requirements for [index.html] 3 files, [copyright forms] 4, etc., at any time. You are responsible for checking whether CEUR-WS.org have released new versions of these (concretely: if the CEURVERSION in the source of the [index.html] 3 template is the same as the one in the source of toc2ceurindex.xsl). The ceur-make developers welcome any [bug reports] 5 related to this. If you manually edit the output generated by ceur-make, you will be doing so entirely at your own risk. In particular, take care not to break any CEUR... annotations.
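
The version check described above can be sketched in a few lines of Python. This is only an illustration: it assumes that the stamp appears literally as CEURVERSION=YYYY-MM-DD in both the index.html template and toc2ceurindex.xsl, and the function names are hypothetical, not part of ceur-make.

```python
import re
from typing import Optional

# Assumption: the stamp is written literally as CEURVERSION=YYYY-MM-DD.
_CEURVERSION = re.compile(r"CEURVERSION=(\d{4}-\d{2}-\d{2})")


def ceurversion(text: str) -> Optional[str]:
    """Return the first CEURVERSION date stamp found in text, or None."""
    match = _CEURVERSION.search(text)
    return match.group(1) if match else None


def same_ceurversion(template_source: str, stylesheet_source: str) -> bool:
    """True only if both sources carry a stamp and the two stamps are equal."""
    template = ceurversion(template_source)
    stylesheet = ceurversion(stylesheet_source)
    return template is not None and template == stylesheet
```

To use it, read the downloaded template and your local toc2ceurindex.xsl into strings and pass both to same_ceurversion; a False result means the stylesheet should be checked against the current template.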

Use ceur-make at your own risk. At the time of this writing, the documentation (both in this README file and in the sources) is not yet complete, but we will be working on this.

Prerequisites

  • GNU make (any recent version should be sufficient)
  • GNU bash (any recent version should be sufficient)
  • Saxon-HE 9 (other XSLT 2 processors might work as well, but the Makefile currently assumes Saxon, and ceur-make has been tested with Saxon-HE 9.5)
    • Java Runtime Environment (JRE, for Saxon)
  • Optional:
    • Perl 5 (for processing EasyChair proceedings; any recent version should be sufficient)
    • TeX (for generating an all-in-one proceedings file; any recent version should be sufficient; tested with TeX Live 2012)

On a recent Linux system, all of these components (except perhaps Saxon) should be installable via the distribution's package manager, if they are not already installed. On Mac OS, most components should be installable via MacPorts or Fink; on Windows, via Cygwin. For Saxon, it suffices to download its distribution for Java, extract saxon9he.jar from the ZIP file, and put it into some user directory.
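
To check that the unpacked saxon9he.jar is usable, you can run a transformation by hand. The sketch below only builds the command line, using Saxon's -s:, -xsl: and -o: options. The helper name and all file paths are placeholders, and this is not necessarily how the Makefile itself invokes Saxon.

```python
from typing import List


def saxon_command(jar: str, source: str, stylesheet: str, output: str) -> List[str]:
    """Build a Saxon-HE 9 command line for one XSLT transformation."""
    return [
        "java", "-jar", jar,
        f"-s:{source}",        # source XML document
        f"-xsl:{stylesheet}",  # stylesheet to apply
        f"-o:{output}",        # where to write the result
    ]

# To actually run it (requires a JRE and the jar at the given path):
#   import subprocess
#   subprocess.run(saxon_command("/path/to/saxon9he.jar", "in.xml", "style.xsl", "out.html"), check=True)
```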

ceur-make has so far been tested on Linux and Windows (using Cygwin); [reports] 5 from users of other systems are welcome.

How to use

Getting started

To get started, you need to copy the ceur-make scripts into the directory in which you would like to keep your proceedings. You can do this by calling ./ceur-make-init path/to/your/directory from the directory where you installed ceur-make. Copy Makefile.vars.template to Makefile.vars and adapt the paths to point to the path of where you put Saxon, etc. (ceur-make-init doesn't do this automatically to prevent problems.)

Export from EasyChair (optional)

When you use [EasyChair] 2 and instruct it to create an LNCS proceedings volume, ceur-make can automatically generate the XML table of contents (toc.xml) from the EasyChair volume information. Note that, for the purpose of ceur-make, “LNCS” just means that EasyChair will provide the proceedings for download in a ZIP file with a certain structure. It doesn't mean that your proceedings will be published with Springer, nor that the papers have to be in the LNCS layout.

  1. When creating a proceedings menu in EasyChair, use “9999” for the volume number (as this is currently hard-coded in ceur-make).
  2. Download the final proceedings as a ZIP file and unzip it into a directory.
  3. Copy the ceur-make scripts into that directory, so that they become siblings of the 9999PPPP per-paper directories, the README file, etc.
  4. Generate toc.xml by running make toc.xml, and adapt it manually if needed; see below. (If make toc.xml does not regenerate the file, force it with make -B toc.xml.)
    • related issues: #1
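
Before running make toc.xml, it can help to confirm that the unzipped archive really contains the per-paper directories. The following sketch lists them. It assumes that 9999PPPP means the volume number 9999 followed by a four-digit paper number; the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Assumption: "9999PPPP" = volume number 9999 plus a four-digit paper number.
_PAPER_DIR = re.compile(r"9999\d{4}")


def is_paper_dir_name(name: str) -> bool:
    """True if name looks like an EasyChair per-paper directory."""
    return _PAPER_DIR.fullmatch(name) is not None


def paper_dirs(root: Path) -> List[str]:
    """Sorted names of the per-paper directories directly below root."""
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and is_paper_dir_name(entry.name)
    )
```

An empty result from paper_dirs(Path(".")) suggests that the scripts were copied to the wrong level of the unzipped archive, or that a different volume number was used in EasyChair.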

A note on LaTeX or Word sources vs. PDF versions of papers

Note that [publishing with CEUR-WS.org] 6 only requires PDF versions of papers but no sources. EasyChair, however, will force authors who contribute to an “LNCS” proceedings volume to upload the ZIPped LaTeX sources or the Word source of their papers. If the actual sources of the papers are not relevant to your workflow (ceur-make does not require them in any case!), you can ask your authors to upload a dummy ZIP or a dummy Word file.

Manually writing or adapting toc.xml

When you do not generate toc.xml from [EasyChair] 2, as explained above, you will have to write it manually, following the example.

If it has been generated from EasyChair, the following special situations may require manual adaptation of toc.xml.

Sessions

EasyChair does not support workshops having multiple sessions. If your workshop has multiple sessions, you need to insert them manually into toc.xml; see the example file.

Page numbers

Page numbers are an optional feature, but make toc.xml generates page number information from the EasyChair metadata.

If you do not wish to use page numbers, please remove all pages entries from toc.xml.

If you wish to use page numbers and would also like to generate an all-in-one PDF, e.g., for distribution during your workshop, please make sure you adapt the pages entries to match the all-in-one PDF. They will have to be shifted as soon as the all-in-one PDF includes material before the papers, such as a preface, and if this material uses the same counter as the papers' pages, e.g. if the preface doesn't use Roman numerals.
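
Shifting every page range by hand is error-prone, so a small script may help. The sketch below adds a fixed offset to all page ranges. It assumes that each range is a pages element with numeric from and to attributes; check the example toc.xml for the actual structure before relying on it. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def shift_pages(toc_xml: str, offset: int) -> str:
    """Add offset to the 'from' and 'to' attributes of every <pages> element."""
    root = ET.fromstring(toc_xml)
    for pages in root.iter("pages"):
        for attr in ("from", "to"):
            if attr in pages.attrib:
                pages.set(attr, str(int(pages.get(attr)) + offset))
    return ET.tostring(root, encoding="unicode")
```

For example, if a four-page preface shares the page counter with the papers, calling shift_pages with an offset of 4 turns a paper on pages 1 to 12 into one on pages 5 to 16. Note that ElementTree does not preserve comments or the original whitespace layout, so compare the result with the input before overwriting toc.xml.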

Generating CEUR-WS.org proceedings

To get started with this, you need a toc.xml file (see this example), which you can either write manually, or have generated from an EasyChair archive (see above). Additionally, you need to write workshop.xml (see this example) manually.

From these files, you can generate the following building blocks of a CEUR-WS.org proceedings volume.

  • [the index.html file] 3 (via make, or specifically make ceur-ws/index.html)
  • [the copyright form] 4 (via make, or specifically make copyright-form.txt)
  • a LaTeX table of contents to help with generating an all-in-one PDF version of the proceedings (via make, or specifically make toc.tex). The all-in-one PDF is assumed to have the name proc.pdf.
  • a BibTeX database (via make, or specifically make ceur-ws/temp.bib). This file will need manual post-processing; please read on below.
  • a ZIP archive for upload to CEUR-WS.org (via make zip)

Manually adapting index.html

Some features of the [CEUR-WS.org index.html template] 3 are not yet supported by ceur-make and will require manual adaptation. This includes the distinction between regular papers and “AUX” papers, which are not to be indexed in publication databases.

“Joint volume” proceedings comprising papers from multiple workshops also require manual work. See Vol-1010 as an example of such a volume; however, be aware that the [index.html template] 3 has changed in the meantime, i.e. you should not literally copy source code from old tables of contents.

If your editors have FOAF profiles, please consider manually adding resource="foaf-profile" in addition to href="homepage" for each editor.

If your authors have FOAF profiles, manually add resource="foaf-profile" to each outer <span rel="dcterms:creator">; otherwise, if they have homepages and you want to link to them, manually add rel="foaf:homepage" resource="homepage" to each inner <span property="foaf:name">.

Manually adapting the BibTeX database

The BibTeX database, generated as ceur-ws/temp.bib, may work out of the box with BibLaTeX and Biber but usually requires manual revision, as ceur-make does not handle persons' names as intelligently as BibTeX, and as bibtex does not support Unicode names and identifiers.

For these reasons we force you to manually inspect and revise ceur-ws/temp.bib and copy it to ceur-ws/yourworkshopYYYY.bib (according to the settings in workshop.xml); ceur-ws/temp.bib is not included in the ZIP archive for upload to CEUR-WS.org.

While you are working on ceur-ws/temp.bib, you can test it with make bibtest.pdf (which currently assumes plain old BibTeX, not Biber).
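
When revising identifiers for plain BibTeX, one possible approach is to fold accented letters to their ASCII base letters. The sketch below does this with Unicode decomposition. It is meant for identifiers only (names are usually better written with LaTeX accent commands), and the function name is hypothetical.

```python
import unicodedata


def ascii_fold(text: str) -> str:
    """Strip accents by decomposing characters and dropping non-ASCII parts."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this, "Müller" becomes "Muller". Characters that have no decomposition, such as "ß", are dropped entirely, so the result still needs a manual look.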

License

This code is licensed under GPL version 3 or any later version.

ceur-make's People

Contributors

arademaker, clange, csarven, wolfgangfahl

ceur-make's Issues

Custom page number offset in toc.xml

Depending on how one generates the proceedings volume from toc.xml (via toc.tex), the first paper doesn't start on page 1, as a title page, table of contents, etc. might come before it. As the page numbering in toc.xml (which is taken from EasyChair) propagates to ceur-ws/temp.bib as well as ceur-ws/index.html, it would make sense to specify an offset (e.g. “5 pages”), which is respected when generating toc.xml.

Standardising editor's affiliation country format

The editor's affiliation country is a free-form string, so it can hold any value, e.g. the full name of the country or an ISO 3166-1 alpha-2 code. It may be preferable to standardise the format. Values would then be entered either as an ISO 3166-1 alpha-2 code, e.g. CA, or as a Wikipedia URL, e.g. http://en.wikipedia.org/wiki/Canada (which we can map to DBpedia during transformation, as we do for the location of the event). Both can be mapped in any case.
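
A minimal sketch of such a mapping follows. The lookup table is a hypothetical two-entry stand-in for a full ISO 3166-1 alpha-2 table, and the function name is an assumption; the real transformation in ceur-make is written in XSLT and may work differently.

```python
from typing import Optional
from urllib.parse import urlparse

DBPEDIA = "http://dbpedia.org/resource/"

# Hypothetical excerpt; a real table would cover all ISO 3166-1 alpha-2 codes.
ISO_TO_DBPEDIA = {"CA": "Canada", "DE": "Germany"}


def country_to_dbpedia(value: str) -> Optional[str]:
    """Map an ISO alpha-2 code or an English Wikipedia URL to a DBpedia URI."""
    value = value.strip()
    if len(value) == 2 and value.upper() in ISO_TO_DBPEDIA:
        return DBPEDIA + ISO_TO_DBPEDIA[value.upper()]
    parsed = urlparse(value)
    if parsed.netloc.endswith("wikipedia.org") and parsed.path.startswith("/wiki/"):
        return DBPEDIA + parsed.path[len("/wiki/"):]
    return None  # free-form value: needs manual attention
```

Both "CA" and http://en.wikipedia.org/wiki/Canada map to http://dbpedia.org/resource/Canada, while a free-form value such as "Canada" returns None and can be reported to the editor.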

Parameterize make target names

Instead of ceur-ws/temp.bib the Makefile should create a BibTeX file named by the workshop ID. Find out how the name of a make target can be parameterized; maybe using a second pass.

Makefile to validate RDFa against "unit tests"

ceur-make intends to output sane RDFa, i.e. RDFa that uses reasonable URIs for things and that is valid w.r.t. the vocabularies used.

However,

  • ceur-make might be wrong,
  • Some features (e.g. AUX papers) are not yet supported by ceur-make, and some other features (e.g. linking to machine-comprehensible FOAF user profiles if editors/authors have them) do require manual copy-editing of the RDFa if editors want to have them (and, don't worry, those who want to have this, are usually technically experienced). Any such manual copy-editing might go wrong.
  • We won't be able to stop the bad practice of bypassing ceur-make and copy-editing existing volumes' index.html files.

An easy way of implementing this would be a combination of

  • some new rules in the Makefile
  • a shell script
  • pyRDFa (for obtaining RDF/XML)
  • SPARQL queries executed by the ARQ command-line tool

Consider having a copy of ceur-ws.css in the ceur-make directory

Jyrki Nummenmaa:

When I view the generated index.html file, it does not find the ceur-ws css file since the reference is relative. I do not think it would hurt to make the reference absolute in which case the file would automatically look ok with the right style.

Two possible solutions to make the relative link work:

  1. maintain a physical copy (or even the actual master version?) of ceur-ws.css in the ceur-make repository
  2. let Makefile download the online ceur-ws.css.
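
For comparison, the absolute reference suggested in the report can be computed from the relative one. The sketch below assumes that the generated index.html refers to the stylesheet as ../ceur-ws.css and that the volume is published under http://ceur-ws.org/Vol-XXX/; both are assumptions to verify against the current template.

```python
from urllib.parse import urljoin


def absolute_href(page_url: str, href: str) -> str:
    """Resolve a relative href against the URL of the page that contains it."""
    return urljoin(page_url, href)
```

Under these assumptions, absolute_href("http://ceur-ws.org/Vol-XXX/index.html", "../ceur-ws.css") yields http://ceur-ws.org/ceur-ws.css, which would also work when index.html is viewed locally.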

reconsider use of bibo:presentedAt

We currently say that a proceedings volume was "presented at" a workshop event. However, wasn't it rather the individual papers that were presented at the workshop?

Adapt LaTeX sources to a uniform style

A request from user Pascal Fontaine:

I think the process can be further automated, by automatically compiling
(with the right page numbers, uniform title emphasis style,
letter format, CEUR footnotes, etc...), creating the zip, maybe checking
correspondence of titles and authors between TeX and Easychair (I know
and fix various little things). You could restrict the input style to a
few of them.

$pdf left in index.html file

After running make, for some reason the link for each paper in index.html was "$pdf" instead of "paper-01.pdf" &c.
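
Until the cause is found, a quick check of the generated file can catch this kind of problem. The sketch below looks for link targets that still consist of an unexpanded variable such as $pdf; the function name is hypothetical.

```python
import re
from typing import List

# Matches href values that consist solely of a $name placeholder.
_PLACEHOLDER = re.compile(r'href="(\$[A-Za-z_]\w*)"')


def unexpanded_links(html: str) -> List[str]:
    """Return all href values that are still literal $variables."""
    return _PLACEHOLDER.findall(html)
```

An empty list means that no such placeholder was found; anything else names the variables that were not substituted.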

Display name vs givenName and familyName

Currently we use the display name for the author. Should we switch to, or additionally record, givenName and familyName? That would require corresponding fields in the toc. Another thing to investigate: does the EasyChair metadata make that distinction, or does it only provide a display name?

ceur-ws/paper-01.pdf should depend on creation of ceur-ws directory

The Makefile rule

ceur-ws/paper-01.pdf: ceur-ws ID

is re-executed whenever ceur-ws is newer than ceur-ws/paper-01.pdf. The timestamp of the directory ceur-ws gets updated whenever a directory entry is added/deleted/renamed.

What we actually mean is that the directory ceur-ws must exist before ceur-ws/paper-01.pdf is created.

This could be controlled by creating, in the same rule that creates the directory, a hidden file ceur-ws/.directory, and depending on that file. However this file would have to be excluded from the ZIP.
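
The timestamp behaviour behind this issue is easy to reproduce. The following sketch creates a directory, backdates its modification time, adds a file, and reports whether the directory's timestamp moved forward. It is only a demonstration of the file system behaviour, not a fix; the function name is hypothetical.

```python
import os
import tempfile


def directory_mtime_changes_on_new_entry() -> bool:
    """Return True if adding a file updates the containing directory's mtime."""
    with tempfile.TemporaryDirectory() as parent:
        target = os.path.join(parent, "ceur-ws")
        os.mkdir(target)
        os.utime(target, (1000, 1000))  # pretend the directory is very old
        before = os.stat(target).st_mtime
        with open(os.path.join(target, "paper-01.pdf"), "w"):
            pass  # creating an entry is enough; no content needed
        after = os.stat(target).st_mtime
    return after > before
```

Because the directory becomes newer each time a paper is copied into it, make considers every target that depends on it out of date. Besides the hidden marker file, GNU make's order-only prerequisites (listed after a | in the rule) are another way to say "must exist, but ignore its timestamp".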

Support multi-session workshops

See @csarven's draft implementation in https://github.com/ceurws/ceur-make/blob/linked-research/toc2ceurindex.xsl#L188. This will require the toc.xml ad hoc schema to be extended (so maybe a task for @csarven and @clange to work on together).

From EasyChair we probably won't get session information. But we could extend the documentation of ceur-make as follows:

  1. use make toc.xml to generate toc.xml.
  2. manually add your session structure to toc.xml
  3. then use make to generate your index.html.

Script for validating submissions

Basic functionality to implement (sync with latest version of [top mistakes](http://ceur-ws.org/HOWTOSUBMIT.html#TOPERRORS)):

  • index.html structure:
    • using outdated Vol-XXX template (check date stamp)
    • invalid HTML (feed through W3C service, or tidy)
    • encoding should be US-ASCII, with HTML entities for non-ASCII characters
    • authors not separated with comma (but, e.g., with “and”)
    • incomplete author names (comparing author/title with PDF would be too hard to automate)
    • inconsistent title capitalisation (e.g. compute ratio of capitalised letters for each title, warn about outliers)
    • only one submitting editor at the bottom
    • link to workshop must work
    • title must not be the Vol-XXX sample title
  • paper full-text:
    • copyright clauses (scan full-text PDF for the usual suspects: Springer, ACM, …)
  • file/directory structure of ZIP:
    • not in ZIP format
    • no metadata (.DS_Store, __MACOSX, .svn, .git)
    • PDF papers not in subdirectories but on top level

Should output a command line for error-report. Initially just with --error parameters for each error encountered, later with arguments (e.g. name of erroneous paper file).
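
A few of the checks listed above are simple enough to sketch directly. The functions below are a hypothetical starting point, not an existing ceur-make script: they cover the US-ASCII requirement, authors joined with “and”, the capital-letter ratio of titles, and metadata entries in the ZIP.

```python
from typing import Iterable, List

# Metadata names taken from the checklist above.
METADATA_NAMES = {".DS_Store", "__MACOSX", ".svn", ".git"}


def non_ascii_chars(html: str) -> List[str]:
    """Characters that would need an HTML entity in a US-ASCII file."""
    return sorted({ch for ch in html if ord(ch) > 127})


def uses_and_separator(authors: str) -> bool:
    """True if the author list contains the word 'and' as a separator."""
    return " and " in f" {authors} "


def capital_ratio(title: str) -> float:
    """Share of upper-case letters among all letters of the title."""
    letters = [ch for ch in title if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def metadata_entries(zip_names: Iterable[str]) -> List[str]:
    """ZIP member names with a path component that is known metadata."""
    return [
        name for name in zip_names
        if METADATA_NAMES & set(name.strip("/").split("/"))
    ]
```

The member names for metadata_entries can be obtained from zipfile.ZipFile(path).namelist(). Titles whose capital_ratio differs strongly from the others in the same volume are candidates for the capitalisation warning.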

Document how to use EasyChair's frontmatter in the further workflow

Jyrki Nummenmaa:

The instructions related to toc.xml generated from EasyChair project do not particularly mention what to do with frontmatter. Maybe it is self-evident to the editors?

We could consider auto-generating a LaTeX preface.pdf and adding it to the table of contents.

index2main failure on Vol-2849

Michael Cochez reported:

I am publishing a CEUR volume, and wanted to use index2main. Now. It
appears the volume has been created using ceur-make and the script is
not able to extract the information from the index.html file. Is that
a known issue?

ceurws@mars:~/www/Vol-2849$ index2main index.html
line 12 column 7 - Error: is not recognized!
line 12 column 7 - Warning: discarding unexpected
line 31 column 7 - Warning: discarding unexpected
line 32 column 7 - Error: is not recognized!
line 32 column 7 - Warning: discarding unexpected
line 33 column 10 - Error: is not recognized!
line 33 column 10 - Warning: discarding unexpected
line 43 column 37 - Error: is not recognized!
line 43 column 37 - Warning: discarding unexpected
line 43 column 130 - Warning: discarding unexpected
line 43 column 141 - Error: is not recognized!
line 43 column 141 - Warning: discarding unexpected
line 43 column 229 - Warning: discarding unexpected
line 57 column 16 - Error: is not recognized!
-:1: parser error : Document is empty

I did now manually create the block for the homepage.

Let easychair2xml.pl write “command to create document” into toc.xml

make retex currently runs Perl on its own to find out the “command to create document” (actually just the main LaTeX source, not the command, but that's a separate issue) from each paper's README_EASYCHAIR file. This is something that easychair2xml.pl could easily do.

TODO fix this ticket to link to the relevant sources
TODO create ticket for the separate issue

Remove RDFa instruction comments from index.html; put such documentation elsewhere

Requirements:

  • generated index.html should be free from XML comments
  • but there should be a clear documentation of how to write correct RDFa annotations.

Better alternatives than mere documentation:

  1. provide a script to strip such comments (see #12)
  2. add support for all RDFa annotations to toc.xml and workshop.xml.

Upon releasing this change, manually strip existing volumes from such comments.
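
A comment-stripping script along the lines of alternative 1 could look like the sketch below. It is regex-based and therefore only suitable for simple, well-formed index.html files. The keep parameter is a hypothetical addition for the case that some comments must survive.

```python
import re
from typing import Optional

_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(html: str, keep: Optional[str] = None) -> str:
    """Remove XML comments, except those matching the optional keep regex."""
    def replace(match: "re.Match[str]") -> str:
        text = match.group(0)
        return text if keep and re.search(keep, text) else ""
    return _COMMENT.sub(replace, html)
```

Calling strip_comments(html) removes every comment, including multi-line ones; passing keep="SOME-MARKER" preserves comments that contain that marker.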

Update XSL to current index file template (2020-07-09)

I believe the toc2ceurindex.xsl is at CEURVERSION=2015-12-02. The current version of Vol-XXX/index.html file is at CEURVERSION=2020-07-09.

Would it be possible to update the XSL file, please? Most new additions to CEUR-WS that I have seen follow the 2020 template, as requested by CEUR-WS ("Always use the latest template"). They were probably produced by manually editing the HTML template and therefore don't benefit from your RDFa annotations, which is quite a pity.

Editors affiliation URL

Editors can have their own homepage URL. Should it also be possible to add their affiliations' URLs (something like workplaceHomepage)?
