acl-org / aclpub
The official tool for creating proceedings for conferences of the Association for Computational Linguistics (ACL).
Home Page: https://acl-org.github.io/ACLPUB
Hello, I'm submitting a paper to ACL 2024 Findings. I have been asked to sign acl-copyright-transfer-2021.pdf, located at templates/copyright/acl-copyright-transfer-2021.pdf. I have the following two questions:
Which language should be used for signatures? English or the language the author speaks?
Among the three signing cases (All authors signing, Work for company, One author signing), which one should I tick? I'm an M.S. student and also an intern at a company (some of my co-authors are from the company), but the affiliation listed for me on the paper is my school. Work for company or One author signing?
SIG information is available (I think?) in the meta file. This should be added as a tag in the XML so that it can be added to the Anthology automatically at ingestion time (saving the manual effort and the errors associated with it).
Hello again!
The START guide doesn't mention the files just-program.tex and just-toc.tex; however, compiling proceedings on START is not possible without them. Also, the START interface (Templates tab) has no fields for those two files. So, as I understand it, Step 3 of the guide is obligatory, contrary to what is stated there.
I'd open a PR to fix that, but I'm not sure whether my understanding is correct, or whether changes to the templates and/or the START interface are already planned.
Thank you for all your work!
This could be enabled for ACL only events, since we only pay for DOIs for them.
In three places, the scripts check whether the author has only a last name but no first name, and handle this case correctly. However, if the author has only a first name but no last name, they output unknown instead.
I think first we have to decide how these should be written in the XML (acl-org/acl-anthology#237).
We have two top-level documentation directories. One should be removed (I believe doc/). Filing this to attend to later.
ACLPUB uses the order file to generate the program order. It checks that all papers are in the file, but if a paper is listed twice, it will generate two copies of it in the proceedings. This should be fixed.
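A duplicate check along these lines could be sketched as follows. This is only an illustration: it assumes that each paper line in the order file begins with a numeric submission ID and that all other lines (session headers, comments) do not, which the issue does not confirm. The function name `find_duplicate_ids` is hypothetical and not part of ACLPUB.

```python
import re
from collections import Counter

# Assumption: a paper line starts with a numeric submission ID.
PAPER_LINE = re.compile(r"^\s*(\d+)\b")


def find_duplicate_ids(lines):
    """Return submission IDs listed more than once, in first-seen order."""
    counts = Counter()
    for line in lines:
        match = PAPER_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return [paper_id for paper_id, n in counts.items() if n > 1]
```

A build script could call this on the order file's lines and stop with an error if the returned list is non-empty, rather than silently emitting the paper twice.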
It would be helpful to merge in the easy2acl repo and documentation (really just one script), so that we have a single access point for people wishing to contribute to the Anthology.
Ideas for the recently-proposed sanity check script:
Many of the templates, e.g., titlepage.tex have the following in their preamble:
\setlength\topmargin{0.2cm} \setlength\oddsidemargin{-0cm}
\setlength\textheight{24.7cm} \setlength\textwidth{16cm}
\setlength\columnsep{0.6cm} \newlength\titlebox \setlength\titlebox{2.00in}
\setlength\headheight{5pt} \setlength\headsep{0pt}
\setlength\footskip{1.0cm}
\setlength\leftmargin{0.0in}
Could we replace this with the geometry package? I think it would be simpler to understand that way:
\usepackage[a4paper,margin=2.5cm,columnsep=0.6cm,headheight=5pt,headsep=0pt,footskip=1cm]{geometry}
\newlength\titlebox \setlength\titlebox{5cm}
The Anthology uses a new nested format. ACLPUB should generate that instead of the old flat format.
We often see ñ converted to something like \\textasciitilde {n}, e.g., the following from EAMT 2020:
author = "Gema Ram\'{\i}rez-S\'{a}nchez and Jaume Zaragoza-Bernabeu and Marta Ba\\textasciitilde {n}\'{o}n and Sergio Ortiz Rojas",
It's possible this is caused by ACLPUB.
This repository is now the canonical place to download the LaTeX and Word templates for ARR, if one doesn't want to use Overleaf. GitHub makes it just awkward enough to download the multiple files in templates/latex that it's easier to clone the full repository. However, this repo isn't made for author end-users either:
Maybe ARR could offer tar bundles for the templates again? Or this repo could also contain tar bundles of the files in templates/latex, so it would be possible for authors to just download the single file?
Thank you for considering this.
U+200E is a left-to-right direction marker and was found in an author's last name, which caused ACLPUB to crash when building the EMNLP 2019 proceedings.
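One way to guard against this is to strip bidirectional control characters from names before processing them. The sketch below is an assumption about how such a guard might look, not what ACLPUB does; the helper name `strip_bidi_controls` is hypothetical. It removes only the directional marks, embeddings, overrides, and isolates, and deliberately leaves other invisible characters (such as zero-width joiners) alone, since those can be meaningful in some scripts.

```python
# Unicode bidirectional control characters:
# U+200E/U+200F (marks), U+202A..U+202E (embeddings/overrides),
# U+2066..U+2069 (isolates).
BIDI_CONTROLS = {
    "\u200e",
    "\u200f",
    *map(chr, range(0x202A, 0x202F)),
    *map(chr, range(0x2066, 0x206A)),
}


def strip_bidi_controls(text):
    """Return text with bidirectional control characters removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)
```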
I'm not sure if we are still using etc tags; I need to check when I'm at a terminal and ensure we only use attachment.
My paper draft is done over the ACL 2020 format (identical to the ACL 2022 format apparently) and I want to submit this paper to EMNLP 2022. Will this be an issue? Both templates look the same but I am afraid it will get a desk rejection because of wrong format. Please let me know if this would most likely be the case. Thanks!
The documentation here could be improved. Currently it is split between the top-level README and the (outwardly more important) anthologize README.
One thought here is to turn this into a GitHub Pages site, update the top-level index.html to contain consolidated, clearly delineated instructions, and then point people to acl-org.github.io/ACLPUB when they need to follow instructions.
I will give this some more attention in mid December.
Is there a reason acl.sty doesn't do \bibliographystyle{acl_natbib}?
With the shift to stamping the paper number in START, the top line of the template has been removed. This used to read something like "ACL Submission XYZ. DO NOT DISTRIBUTE."
Now that that's been removed, local copies don't have this warning. I think it's useful to include this, though. If you share your own paper with others, e.g. requesting feedback, you don't have to 'opt-in' to telling them not to distribute further. It's automatically done.
The right place to put it, I think, is in the title block, just below "Anonymous ACL Submission".
All the relevant changes I see in acl-pub are recent changes from @danielgildea and me. I can put them into a PR here.
A lot of the changes involve anthologize.pl, which is run on the pub chair's local machine to convert the final.tgz generated by START into what the Anthology wants. I think it was never really decided whether that should have its home here or in acl-pub.
It would be nice if we got pubchairs to decide on a short name for each volume, which could then be used in papers etc. If there is agreement on this, we could ask Softconf to add that field to the START interface, and we could update the code to process it. See also acl-org/acl-anthology#567.
I am confused about how ACLPUB is run and wonder if anyone can answer some questions here.
The proceedings.tgz files I've received from pubchairs (say for ACL 2019) have a layout like this:
papers/
proceedings/
cdrom/
pdf/
P19-1001.pdf
P19-1002.pdf
...
i.e., the actual paper IDs. However, I can't see where this is produced. It should be in bin/bib.pl, but that code does not produce the full Anthology ID, but rather something like
papers/
proceedings/
cdrom/
pdf/
naacl00.pdf
naacl01.pdf
naacl02.pdf
...
See for example lines 150–152 of bib.pl, where the bib file name is created:
my $fn_base = sprintf "%0${digits}d", $pn;
my $fn = "cdrom/bib/$abbrev$fn_base.bib";
open(FILE,"> $fn") || printf(STDERR "Can't open $fn: $!\n");
Indeed, this is the format I've received recently when people have built this manually. So it seems to be an issue just with START.
Can anyone clarify what START is doing here? How is cdrom/pdf getting populated with the actual Anthology files?
I am hoping we can get START to make some changes to the meta file. These apply to all current conferences (e.g., ACL 2020):
Remove:
type: no longer used
bib_url: no longer used
Add:
shortbooktitle, with the example "Proceedings of WMT"
volume, with the default example "TOBEFILLED: volume number or name within collection"
CC: @rrgerber
Dear @mjpost and all,
The ACL Anthology has recently updated the editor information for proceedings/papers. However, this update should also be reflected in the ACL style sheet and in the code/template used to compile the proceedings. For further information, please see this issue. Thank you,
ACLPUB seems sometimes to create bad bib keys when there are special characters in the lead author's name:
@InProceedings{b\"{u}ler-etal-2005-using,
author = "B\"{u}ler, Dirk and Minker, Wolfgang and Elciyanti, Artha",
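A possible fix is to reduce the surname to plain ASCII before building the key. The sketch below is illustrative only and is not ACLPUB's code: the helper name `ascii_key_part` is hypothetical, and the regular expression covers only simple LaTeX accent macros of the kind shown above (for example `\"{u}` or `\'{\i}`), not ligature macros such as `\ss` or `\o`.

```python
import re
import unicodedata

# Simple LaTeX accent macros such as \"{u}, \'{\i}, or \~n.
# The substitution keeps only the base letter.
LATEX_ACCENT = re.compile(r"""\\["'`^~=.]\s*\{?\\?([A-Za-z])\}?""")


def ascii_key_part(name):
    """Return a lowercase ASCII-only form of a name for use in a bib key."""
    name = LATEX_ACCENT.sub(r"\1", name)
    # Decompose accented Unicode letters, then drop the combining marks.
    name = unicodedata.normalize("NFKD", name)
    name = name.encode("ascii", "ignore").decode("ascii")
    # Keep only letters and digits.
    return re.sub(r"[^a-z0-9]+", "", name.lower())
```

With a helper like this, the surname part of the key above would contain no backslashes, quotes, or braces, whether the name arrives LaTeX-escaped or as Unicode.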
Currently, START suggests the name format
Matt Post (Johns Hopkins University)
in the chair lines for ACLPUB. This is not parseable. We should ask them to change the format to
Post, Matt
which we can parse quite nicely.
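To illustrate why the comma-separated form is easier to handle, here is a minimal sketch of a parser for it. The function name `parse_chair` is hypothetical and this is not the parsing code ACLPUB uses. It simply splits on the first comma and rejects input without one, such as the parenthesized-affiliation format.

```python
def parse_chair(name):
    """Split 'Last, First' into (first, last); return None if no comma is present."""
    if "," not in name:
        return None
    last, first = (part.strip() for part in name.split(",", 1))
    return first, last
```

The comma marks the boundary between the two name parts unambiguously, which is what makes multi-word last names or first names safe to handle in this format.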
make-anthology.sh currently creates this structure:
anthology/
P/
P19/
P19.xml
P19-1001.pdf
P19-1001.bib
P19-1001.Supplementary.tgz
The *.bib files are no longer needed since we only use the XML at ingestion time.
Quite a few pieces of ACLPUB are dedicated to generation of HTML, but as far as I know, this code is not actually used any more. Is it?
For example, the proceedings.tgz generated for EMNLP 2018 contains only one HTML file, advertisement.html, which is a list of accepted papers. I believe that @desilinguist did not rely on this HTML file when making the EMNLP 2018 web page (because I think I gave him a Markdown version) and wonder whether anyone else uses it.
Assuming that the HTML generated here is not used anymore, I suggest retiring all relevant code (advertisement.pl authors.pl db-to-html.pl index.pl program-html.pl unified-authors.pl) so that all HTML generation is done from the Anthology (which @mbollmann is working on a modern version of).
Hello,
Thank you for fixing broken links in docs/start.md!
There are some more broken links in the same document, though they are less critical:
https://github.com/acl-org/ACLPUB/blob/master/docs/files/sample-order.txt
https://github.com/acl-org/ACLPUB/blob/master/docs/files/verify_order.py
This comment suggested we should load the hyperref package last.
One of the build scripts in Step 3 here creates symlinks instead of copying PDFs. This results in a common error where a submitter forgets to add the -h flag to tar, and then when unpacking it, I am left with a bunch of unresolved links. Disk space is cheap these days, and PDFs are small; we should just have it copy the files and avoid this situation.
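The copying behaviour proposed here might look like the following sketch. It is written in Python for illustration and is not the actual build script; the function name `copy_pdfs` and the directory arguments are hypothetical. `shutil.copyfile` follows symlinks by default, so the destination always receives a regular file with the PDF's contents.

```python
import shutil
from pathlib import Path


def copy_pdfs(src_dir, dst_dir):
    """Copy every *.pdf in src_dir into dst_dir as a regular file."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        target = dst / pdf.name
        # copyfile follows symlinks by default, so target is a real file
        # even when pdf itself is a symlink.
        shutil.copyfile(pdf, target)
        copied.append(target)
    return copied
```

An archive built from the destination directory then unpacks correctly whether or not tar was given -h.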
This has been added to the meta file; it should also be added to the XML so that the Anthology can ingest it automatically.