torik42 / yalafi
Yet another LaTeX filter
License: GNU General Public License v3.0
We should add a submodule yalafi.documentclasses, together with options --dcls and --documentclass for yalafi and yalafi.shell, respectively.
There seems to be a Vim issue with the handling of multi-byte characters when Vim's errorformat is used to parse multi-line messages. See here for the bug description.
This causes problems with Vim plugins, e.g. the compiler vlty for vimtex, if the LaTeX text contains many multi-byte characters, as in Russian texts with Cyrillic letters. See here for an example.
Even if the Vim issue is fixed, it will take some time for the fix to be incorporated into Linux distributions. We therefore should provide a single-line format for option --output, for instance
--output sl-1: include file name, line, column, LT's error message text
--output sl-2: as sl-1, but append LT's replacement suggestions
This option should be used in editors/vlty.py and editors/ltyc.py.
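A single-line report format along these lines could be sketched as follows. This is only an illustration: the Match container, the function name and the exact field layout (file:line:column: message) are assumptions, not YaLafi's actual data structures or output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Match:
    # Hypothetical container for one proofreader match (not YaLafi's own type).
    file: str
    line: int
    column: int
    message: str
    replacements: List[str] = field(default_factory=list)


def format_single_line(m: Match, with_suggestions: bool = False) -> str:
    # Collapse all whitespace, including line breaks, so that the
    # report for one match never spans more than one line.
    text = ' '.join(m.message.split())
    out = f'{m.file}:{m.line}:{m.column}: {text}'
    if with_suggestions and m.replacements:
        # Corresponds to the proposed variant sl-2.
        out += ' Suggestions: ' + ', '.join(m.replacements)
    return out
```

With such a format, an editor only needs a one-line pattern to parse each message, which sidesteps the multi-line parsing problem described above.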
A way with -- hopefully -- only moderate intrusion could be as follows.
YaLafi core
Yalafi.shell
We should implement some heuristic that decides whether a language change by \foreignlanguage really breaks the text flow of the surrounding language, or whether it is rather short. In the latter case, it should be substituted by a placeholder in the text flow of the surrounding language (as is done with inline formulas), continuing its sentence / paragraph.
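One conceivable form of such a heuristic is sketched below. The word-count threshold and the punctuation test are assumptions chosen for illustration; the issue does not specify any concrete rule.

```python
def is_short_inclusion(text: str, max_words: int = 3) -> bool:
    # Hypothetical rule: an inclusion counts as "short" if it has at most
    # max_words words and none of them ends a sentence.
    words = text.split()
    if any(w.endswith(('.', '!', '?')) for w in words):
        return False
    return len(words) <= max_words
```

A short inclusion would then be replaced by a placeholder in the surrounding text flow, while a longer one would be checked separately in its own language.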
There will remain at least one bug. The scanner is currently initialised with a language code that, for instance with 'de', detects "' as a special token in German texts. Since the scanner first scans the whole LaTeX text, this won't be changed inside of \foreignlanguage{english}{"'}.
In the following snippet, the closing ] is missing.
\section[1}{Title}
This is a a text.
Currently, this correctly produces an error indication from the LaTeX filter. However, the text behind [ is skipped, as it is consumed when trying to find the closing ].
This unnecessarily hides mistakes, as here 'a a'.
In a one-file document, the LaTeX filter sees the preamble that often contains macros not properly handled "out-of-the-box". Currently, the LaTeX text can only be modified with special macros \LTadd, \LTskip, \LTalter.
Placing the critical parts of the preamble in an \LTskip{...} may fail, as this changes the environment seen by the TeX system.
These are commonly used and currently trigger unnecessary warnings.
For babel: macros \selectlanguage, \foreignlanguage
The current parsing of displayed equations assumes a certain style of writing formulas that may be inappropriate. On the other hand, at least the inclusion of a placeholder like 'V-V-V' together with trailing punctuation extracted from the formula does help the proofreading program. (Even if this omits text parts included, e.g., with \text.)
We could provide options for yalafi and yalafi.shell that switch to this simpler mode for all equation environments that have been registered with EquEnv() in the fully detailed mode. The replacement like 'V-V-V' should be taken from a rotating collection.
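A rotating collection of replacements can be sketched with itertools.cycle. The class name and the sample placeholders are assumptions for illustration and do not reflect how YaLafi stores its replacements.

```python
from itertools import cycle


class RotatingReplacements:
    # Hypothetical helper: hands out placeholders in a fixed, repeating order.
    def __init__(self, items):
        self._items = cycle(items)

    def next(self):
        return next(self._items)
```

Rotation makes it less likely that two adjacent formulas get the same placeholder, which could otherwise be reported as a repeated word.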
See the discussion in Issue #83.
If one uses Emacs via the script yalafi-emacs, then a newly started LT server is stopped immediately at the end of a language check. Usage of an LT server therefore decreases speed, as starting the server is more expensive than starting the command-line LT tool. (This does not happen with Vim and the script yalafi-grammarous: subsequent checks really are faster with a server.)
Probably, the new LT server process has to be "detached" somehow from the sub-process that is started by Emacs to perform a language check.
The two-line interface Bash script in README does not work for all options. For instance, enabled rules or enabled/disabled categories are not handled correctly.
Method Parser.expand_arguments() has to be revised. If a macro has set the parameter extract, the behaviour may be incorrect if repl is a function or if opts is not empty.
For instance, in an English document (--language en-GB), the input
\foreignlanguage{german}{"'}
produces
"'
Instead, the German right double quotes “ should result.
It is unlikely that we can fix that in the near future. For a real solution, scanner and parser would have to be reorganised to work in a pipelined way. Currently, the scanner first reads the whole input text and produces a token list, on which the parser then operates.
EDIT. A probably simpler solution:
"' etc. in the scanner.
" as 'active character' in German text parts, as in TeX.
Add macros from biblatex package, especially \cite*.
On some systems, LanguageTool can be installed via a package manager. This places an executable Bash script languagetool in the standard search path that dispatches to the different software components. Compare, for instance, Issue #19.
At least under Arch Linux, the script can also start an LT server (option --http).
It seems reasonable to somewhat redefine option --lt-command ... for this case. We would simply ignore an option value in --lt-directory ... (just set it to the current directory), and call languagetool --http ... in case a local server has to be used (--server my).
This could then also be integrated as an option into the Vim interface scripts.
Hi,
let me thank you at first for this great project and especially the good vim and languagetool integration.
I've noticed that yalafi leaves the positional argument of the figure environment in.
Input:
\begin{figure}[h]
\centering
\includegraphics[width=0.7\linewidth]{./image.png}
\end{figure}
Output: [h]
I'm using Yalafi version 1.1.1 (from pip) with the Languagetool Plugin in vim.
Best Regards
Max
Assume that yalafi.shell is used with '--server my', and an unknown language like '--language en-XX' is specified. Then after some trials, the tool concludes that it cannot contact or start the LT server, since the request with the given language is not successful.
Furthermore, on a 32-bit system with unsupported German for higher LT versions, yalafi.shell terminates in a way it shouldn't.
When sending the HTTP request, we have to examine the reason for a failure more closely.
In Issue #73, mapping of character positions between LaTeX and plain text fails (function tex2txt.translate_numbers()).
We should catch that possibility and print an error message.
This may cause a problem with punctuation checking, e.g., in
\begin{align}
a &= b \\
& \quad \times c
\end{align}
It might be practical to have a small server on top of yalafi.shell that pretends to be LT's server, but additionally performs LaTeX filtering and position mapping.
We use simple rotating replacements for maths material, as 'C-C-C'. Similarly, short foreign-language inclusions will be substituted this way in the surrounding text flow. This may cause false positives in English documents, if the replaced text part starts with a vowel, and is preceded by the article 'an'.
When the substituted text starts with a vowel, we should use a replacement starting with a vowel.
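A minimal sketch of this selection follows. The function name and the vowel placeholder 'E-E-E' are assumptions for illustration. Note also that the choice between 'a' and 'an' depends on pronunciation rather than spelling, so a test on the first letter is only an approximation.

```python
VOWELS = set('aeiouAEIOU')


def pick_replacement(original: str,
                     consonant_repl: str = 'C-C-C',
                     vowel_repl: str = 'E-E-E') -> str:
    # Look at the first non-blank character of the replaced text part.
    first = original.lstrip()[:1]
    return vowel_repl if first in VOWELS else consonant_repl
```

For example, a phrase such as 'an \foreignlanguage{latin}{ad hoc} rule' would then become 'an E-E-E rule' instead of 'an C-C-C rule'.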
Vim-grammarous and ALE internally use the Vim function matchaddpos() for error highlighting. This function expects byte, rather than character offsets for column numbers.
If character offsets are given, then highlighting may be shifted, see Issue #89@vim-grammarous.
xml-b to yalafi.shell (vim-LanguageTool expects character offsets)
EDIT: in ALE, one can simply set 'vcol': 1 in the linter component.
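The difference between character and byte offsets can be illustrated with a small conversion function. The function name is an assumption; the sketch only presumes 1-based columns and a buffer encoded in UTF-8.

```python
def char_col_to_byte_col(line: str, char_col: int,
                         encoding: str = 'utf-8') -> int:
    # Columns are 1-based. The byte column is the encoded length of
    # everything before the character, plus one.
    prefix = line[:char_col - 1]
    return len(prefix.encode(encoding)) + 1
```

In the line 'Это тест', the fifth character starts at byte column 8, since each of the three preceding Cyrillic letters occupies two bytes in UTF-8. For pure ASCII text, both column numbers coincide.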
Input
A
\newcommand{\x}{}
\newcommand{\y}{}
B
leaves a blank line between A and B in filter output.
This seems to depend on the test function name. For instance, we had to change test_macros_latex() to test_macros_latex_builtins() in file tests/test_packages/test_latex_builtins.py.
Need to check that.
This is now dev.languagetool.org.
As pointed out here by @petRUShka, this is missing.
It would be practical to have a macro like
\LTmacros{defs.tex}
It should expand to nothing, but read the LaTeX text in the given file and append its macro definitions to the current list.
At first sight, this might be possible with a "handler function" in yalafi/handlers.py.
A problem could be, however, character position tracking.
I run vlty via Vim plus vimtex on a tex file with quite a complex template. As a result, I get the error:
AttributeError: 'NoneType' object has no attribute 'lin'
Run command
python3 -m yalafi.shell --lt-command languagetool --language ru \
--disable "WHITESPACE_RULE" --enable "" \
--disablecategories "" --enablecategories "" \
--documentclass "" \
--packages "amsbsy,keyval,fontenc,enumerate,eufrak,calc,inputenc,url,amsopn,babel,bm,amssymb,epstopdf-base,trig,amsmath,revsymb,amsthm,array,longtab le,caption2,natbib,amstext,mathtext,graphics,amsgen,graphicx,ifthen,caption3,extsizes,amsfonts" \
--encoding cp866 \
my_article.tex
Output:
=== my_article.tex
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amsbsy'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.keyval'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.fontenc'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.enumerate'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.eufrak'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.calc'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.inputenc'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.url'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amsopn'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.babel'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.bm'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amssymb'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.epstopdf_base'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.trig'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.revsymb'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.array'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.longtab_le'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.caption2'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.natbib'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amstext'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.mathtext'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.graphics'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amsgen'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.ifthen'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.caption3'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.extsizes'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.amsfonts'
*** yalafi.shell: warning:
*** could not load module 'yalafi.documentclasses.revtex4'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.enumerate'
Expected text language: Russian
Working on STDIN...
=== my_article.tex ===
1.) Line 18, column 20, Rule ID: UPPERCASE_SENTENCE_START
Message: Это предложение не начинается с заглавной буквы
Suggestion: Maik
maik Гипотеза Лемма russian О конечности чи...
^^^^
=== my_article.tex ===
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.8/site-packages/yalafi/shell/__main__.py", line 3, in <module>
from . import shell
File "/usr/lib/python3.8/site-packages/yalafi/shell/shell.py", line 351, in <module>
gentext.generate_text_report(proofreader.run_proofreader, sys.stdout)
File "/usr/lib/python3.8/site-packages/yalafi/shell/gentext.py", line 93, in generate_text_report
output_text_report(tex, plain, charmap, matches, file, out)
File "/usr/lib/python3.8/site-packages/yalafi/shell/gentext.py", line 48, in output_text_report
s = (str(nr) + '.) Line ' + str(lc.lin) + ', column ' + str(lc.col)
AttributeError: 'NoneType' object has no attribute 'lin'
In yalafi/shell/gentext.py, we still use older code from yalafi/tex2txt.py for the mapping of line and column numbers.
Independently of the result / fix for Issue #73, this should be updated to the method from yalafi/shell/utils.py.
On --as-server, all entries in --lt-options are ignored.
For example, this is wrong, if languagetool-commandline is internally used (no --server given), and something like --languagemodel should be passed to LT.
This problem may occur with LanguageTool 5.0 and 5.1; it has been fixed with the daily snapshot from 2020/10/26 (see here). Versions 4.9.1 and below work well.
Apparently, only German texts are affected.
The problem is provoked with an input like
auf$\Omega$
The plain text sent to LanguageTool is aufC-C-C, and LanguageTool replies with an invalid JSON message.
This is due to the issue here. It has been fixed with the daily snapshot 2020-10-26.
setup.py should be in the repository, so that one can install yalafi using pip directly from GitHub with pip install --user git+https://github.com/matze-dd/YaLafi.git@master.
When using yalafi-grammarous, I had to change python to python3, as python is Python 2 on my system. With Python 2, I get the error message:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/huecking/.config/nvim/plugged/YaLafi/yalafi/shell/__main__.py", line 3, in <module>
from . import shell
File "yalafi/shell/shell.py", line 124, in <module>
from yalafi import tex2txt
File "yalafi/tex2txt.py", line 29, in <module>
from . import parameters, parser, utils
File "yalafi/parameters.py", line 23, in <module>
from .defs import Environ, EquEnv, Macro
File "yalafi/defs.py", line 19, in <module>
from . import utils
File "yalafi/utils.py", line 119
exec('import ' + mod)
SyntaxError: unqualified exec is not allowed in function 'get_module_handler' because it contains a nested function with free variables
Maybe this could be changed in the code? Otherwise, this issue may just help people to know what they have to change. Maybe it could be mentioned in the docs that it works only with python3.
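An explicit version check could give a clearer message than the SyntaxError above. The following is only a sketch: the function name and the message text are assumptions. Such a check would have to live in a file that Python 2 can still parse and run before any Python-3-only module is imported.

```python
import sys


def check_python_version(version_info=sys.version_info):
    # Returns an error message for Python 2 and older, or None if the
    # interpreter is recent enough. No f-strings, so Python 2 can parse this.
    if version_info[0] < 3:
        return 'YaLafi requires Python 3; try "python3 -m yalafi.shell ..."'
    return None
```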
Script shell.py --output html ... fails if the last input line is, for instance,
L\L
Probably, the missing final line break (consumed by \L) causes the problem.
For simpler configuration of editor interfaces, we should add to yalafi.shell options --enable, --disablecategories, --enablecategories.
For an input like
\newcommand{\xxx}{XXERR}
\xxx
the spelling error is related to '\xxx' in an HTML report, due to a small hack already implemented in Tex2txt/shell.py.
For the other formats, only the single leading backslash of '\xxx' is given as error location.
It would be nice to activate the HTML-report hack in other cases, except for the plain-text report.
In yalafi/shell/shell.py, the LanguageTool command is hardcoded, which works well if you download LanguageTool directly. I installed it via my package manager, which gives me a binary instead of a jar file in /usr/bin, and other files are located in /usr/share/languagetool/. In order to get it working, I had to make the following change in yalafi/shell/shell.py:
# ltcommand = 'java -jar languagetool-commandline.jar --json --encoding utf-8'
ltcommand = 'languagetool --json --encoding utf-8'
I was then able to generate a report by using the following command
python -m yalafi.shell --lt-directory /usr/share/languagetool/ --output html draft.tex > draft.html
It would be nice to provide a command line option to change ltcommand as well.
When yalafi.shell is used for an editor interface, a local LT server is used on option --server my. In this case, for each invocation of yalafi.shell by the editor, we first send a small test request in order to check for a running server.
Instead, we should send the real request immediately, and try to start the local server only afterwards, if nobody is responding.
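The proposed order of operations can be sketched as follows. The function and its parameters are assumptions for illustration; send_request and start_server stand for whatever yalafi.shell actually uses to contact and launch the LT server, and connection failures are assumed to surface as OSError (which covers ConnectionRefusedError and urllib.error.URLError).

```python
import time


def check_with_lazy_start(send_request, start_server, retries=3, delay=1.0):
    # First try the real request: if a server is already running,
    # no extra round trip is needed.
    try:
        return send_request()
    except OSError:
        # Nobody is responding: start the local server only now.
        start_server()
    for attempt in range(retries):
        time.sleep(delay)  # give the server some time to come up
        try:
            return send_request()
        except OSError:
            if attempt == retries - 1:
                raise
    raise OSError('server did not respond')
```

In the common case of an already running server, this saves one request per invocation by the editor.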
For Vim and Emacs usage, putting all the components together might be rather complex.
We should check whether integrating things with Docker makes life easier.
Add at least tikzpicture environment.
Do that.
Currently, only a subset of commonly used LaTeX macros and environments is recognised "out-of-the-box". In order to ease application, we could add a submodule, say yalafi.packages, that contains further Python files with definitions like in example [definitions.py]. For a LaTeX package, the corresponding file would provide initial versions for important macros and environments.
These "subsubmodules" could be activated via a command-line option for yalafi and yalafi.shell, for instance ... --packages amsmath,hyperref.
EDIT
Additionally, one could provide an option --root-document file for yalafi.shell. The script would read that file and extract package information from \documentclass and \usepackage.
In both cases, it is important to ensure a proper evaluation order: first, macros and environments from yalafi/parameters.py, then from package extensions, and finally from user declarations given by --define and --python-defs. This has to be independent of the declaration method, LaTeX or Python code.
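The intended precedence can be illustrated with plain dictionaries, where later layers override earlier ones. This is a sketch under the assumption that definitions are keyed by macro or environment name; the function name and the dictionary representation are hypothetical and not YaLafi's internal structure.

```python
def merge_definitions(builtin, packages, user):
    # builtin: definitions as in yalafi/parameters.py
    # packages: list of definition dicts, one per activated package
    # user: definitions given via --define and --python-defs
    merged = {}
    for layer in (builtin, *packages, user):
        merged.update(layer)  # later layers take precedence
    return merged
```

This way, a user declaration always wins over a package extension, which in turn wins over the built-in defaults.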
Hi, thanks for this great tool!
I noticed the recent PR #62 and issue #60, allowing users to specify an alternative languagetool command. I replaced the original vlty.vim in vimtex with the one that comes with YaLafi to try it. Based on the README, I commented out g:vimtex_grammar_vlty.lt_directory and only specified g:vimtex_grammar_vlty.lt_command. However, vimtex then complained lt_directory path not valid.
I'm not familiar with Vim script, but I think it has something to do with the checking code here in vlty.vim?
Suppose we have a bibliography in our tex file:
\begin{thebibliography}{99}
\bibitem{paper_name}
\end{thebibliography}
yalafi processes paper_name as sentence text rather than as a technical label, so one sees an error on that line.
Could the commands of the glossaries / glossaries-extra packages be supported, e.g. \acr, \gls, \glspl etc.?
I know you've written some documentation on how to extend and support different packages, but I haven't got the time currently to do it myself (though hopefully, I'll revisit this issue at some time). Thanks for YaLafi, it's really great!
LanguageTool won't detect a missing space, as in
\[
f(x) = 0 \text{ для} x \ge 0
\]
if the replacement for x \ge 0 is something like W-W-W. Detection works if we use Cyrillic letters for replacements, too.
The hard linebreak \\[length] with vertical skip is not removed. Tex2txt did that correctly :-(.
EDIT: The first fix version removes a linebreak after \\ if no [...] is following. We try to avoid that.
I have a LaTeX file with math formulas. For example, I have the equation:
\begin{equation}
\begin{aligned}
f_{13}(x, z) = & x^{3}+ \frac{1}{12} \left(-24 z^{4} + 72 z^{3} - 70 z^{2} + 112 z - 76\right) x^{2} + \\
& +\frac{1}{11} \left(2877 z^{4} - 9184 z^{3} + 13080 z^{2} - 23436 z + 24318\right) x + \\
& +\frac{1}{4} \left( 10224 z^{4} - 31451 z^{3} + 46509 z^{2} - 83811 z + 80129 \right).
\end{aligned}
\end{equation}
and yalafi yields WORD_REPEAT_RULE on this line:
=== my_file.tex ===
28.) Line 547, column 5, Rule ID: WORD_REPEAT_RULE
Message: Возможная опечатка: повтор слова
Suggestion: U-U-U
...ветствует B-B-B, и этим значениям отвечает U-U-U U-U-U plus V-V-V plus W-W-W. Разложение э...
^^^^^^^^^^^
For highlighting of errors in the source text buffer, vim-LanguageTool does not directly use the error location returned by the proofreader. Instead, it tries to identify the reported problematic text part at the line given by the proofreader (application of Vim's matchadd()). This fails for
\newcommand\books{books}
A \books{} are interesting.
The mistake 'A books' is correctly reported for line 2, but it cannot be highlighted in the source text buffer.
With a minor reorganisation of method Parser.init_package(), we could provide an interface for handler functions of LaTeX macros. For instance, it would allow them to add an environment type (possibly with its own handler function).
This way, we could easily implement something like \newtheorem.
\quad are declared twice in yalafi/parameters.py, both for text and math mode. As the maths parser expands "normal" macros before simplification of maths material, a single definition in text mode should suffice, e.g., \newcommand{\quad}{\;}.
\medspace from package amsmath should only be loaded by a package extension, compare Issue #28.
\text in Parameters.math_text_macros as well as for equation and theorem environments.
When using the script for an editor plug-in, we should not harshly terminate if a LaTeX error occurs (like missing end of an equation).
Instead, we could try to recover somehow and include a descriptive message (containing a bold spelling error) in the text sent to the proofreader. This would generate a "normal" proofreader message that is hopefully placed at the right position in the text.