bdtc / lwarp
The LaTeX lwarp package: Convert LaTeX to HTML.
Home Page: https://ctan.org/pkg/lwarp
When HTML files are named after their sections, the section names must currently be unique to avoid file name clashes.
Add a number to each file name so that the generated HTML files are unique and section names can be reused several times in the document. This can be the same sequential number which would be used for numbered files. This would result in files such as Name-35.html, Name-67.html, Name-134.html, etc.
If this sequential number is used, it will shift as new sections are inserted or deleted from the document, leaving a cluttered file space and orphaned files to be cleaned up. Incoming references from outside may break.
The manual method mentioned in the following link has the advantage that broken incoming links may be avoided if care is used not to change the names.
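The renumbering problem described above can be made concrete with a small sketch. The helper `numberedFileNames` is hypothetical (it is not part of lwarp) and assumes the file name is simply the section name, a hyphen, and the section's 1-based position in the document; lwarp's real file name sanitizing is not modeled.

```javascript
// Hypothetical helper, not part of lwarp: builds "Name-N.html" from each
// section name and its 1-based position in the document.
function numberedFileNames(sectionNames) {
  return sectionNames.map((name, index) => `${name}-${index + 1}.html`);
}

// Two sections share the name "Examples"; the number keeps the files distinct.
const before = numberedFileNames(["Introduction", "Examples", "Examples"]);

// Inserting a new section shifts the number of every later section.
const after = numberedFileNames(["Introduction", "Setup", "Examples", "Examples"]);

// Files written by the first build that the second build no longer produces.
const orphaned = before.filter((file) => !after.includes(file));

console.log(before);
console.log(after);
console.log(orphaned);
```

In this sketch Examples-2.html is left behind as an orphan, and Examples-3.html still exists but now holds a different section, so an incoming link to it would silently point at the wrong content.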
Generate a single side-TOC file, and refer to it from each HTML page.
Currently the side-TOC is generated and compiled from scratch for each HTML page. Generate it once, store it in a file, and refer to it from each page. This could significantly improve compile times and page sizes, but it reduces flexibility.
Generate a TOC at the start of the page instead of to the side.
Provide a way to add additional meta tags.
Since lwarp converts LaTeX to HTML, I expected to see an HTML version of the lwarp manual.
[Apologies if I just missed it.]
I suggest putting an HTML version of the manual on
https://lwarp.github.io/
See more here:
https://pages.github.com/
Note: Rendering new versions and putting them on lwarp.github.io can be automated with GitHub Actions.
Consider this example, which uses the comparatively obscure Unicode character ⚠ (U+26A0 : WARNING SIGN)
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Deja Vu Sans}
\usepackage[mathjax]{lwarp}
\begin{document}
$a ⚠ b$
\end{document}
The character is contained in Deja Vu Sans, and when compiled by lualatex, the math expression renders correctly. (I don't think it is contained in the default math font, so some kind of fallback magic must be happening here, but I suppose that's not the point.)
When compiled by lwarpmk, it says
Missing character: There is no ⚠ (U+26A0) in font [lmmono8-regular]:!
and the character is replaced in the HTML output by � (U+FFFD : REPLACEMENT CHARACTER).
Since (as I understand it) the fallback mechanism in lualatex is not very advanced, it makes sense that when lwarp is rendering math and producing images, this might fail, but when the mathjax option is used, it would make more sense to pass this through since mathjax is perfectly capable of rendering it.
Incidentally, the problem seems not to occur outside math mode, at least not with this character.
A workaround on the user side is to write, e.g., \unicode{x26A0} instead of the literal character.
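The workaround could also be applied mechanically before the source reaches lwarp. The following is a minimal sketch, assuming the math content has already been isolated as a string; `escapeNonAscii` is a hypothetical helper, not an lwarp or MathJax function.

```javascript
// Hypothetical preprocessing helper, not part of lwarp or MathJax.
// Replaces every non-ASCII character with \unicode{xHHHH}, using the
// character's hexadecimal code point (at least four digits).
function escapeNonAscii(mathSource) {
  return mathSource.replace(/[^\x00-\x7F]/gu, (ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    return `\\unicode{x${hex}}`;
  });
}

// "\u26A0" is the WARNING SIGN character from the example document.
console.log(escapeNonAscii("a \u26A0 b"));
```

Such a step is meant for math content only, since the problem does not seem to occur outside math mode.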
Here is a minimal tex file:
% main.tex
\documentclass[doc2]{ltxdoc}
\usepackage[mathjax]{lwarp}
\setcounter{FileDepth}{1}
\setcounter{SideTOCDepth}{1}
\begin{document}
\section{Introduction}
some text and some formula $a = b + c$
\end{document}
After pdflatex main and lwarpmk html, the resulting file Introduction.html has 4 <p> tags but 3 </p> tags. The unpaired <p> tag appears on line 372 after the class="hidden" div.
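A mismatch like this can be checked mechanically. The following is a rough sketch that counts `<p>` start and end tags with regular expressions; `countParagraphTags` is a hypothetical helper and the sample string is made up, not taken from the generated Introduction.html. Since HTML allows the `</p>` end tag to be omitted in some contexts, a difference in the counts is a hint to look closer rather than proof of invalid markup.

```javascript
// Hypothetical helper: counts <p ...> start tags and </p> end tags.
// The lookahead keeps tags such as <pre> or <param> from being counted.
function countParagraphTags(html) {
  const open = (html.match(/<p(?=[\s>])[^>]*>/gi) || []).length;
  const close = (html.match(/<\/p\s*>/gi) || []).length;
  return { open, close, unpaired: open - close };
}

// Made-up sample with one <p> that is never closed.
const sample =
  '<div class="hidden"></div><p><pre>x</pre><p>a</p><p>b</p><p class="c">c</p>';

console.log(countParagraphTags(sample));
```

To check a real page, the file contents could be read with Node's `fs.readFileSync("Introduction.html", "utf8")` and passed to the same function.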
Consider the following test case:
\documentclass{article}
\usepackage{lwarp}
\begin{document}
$a \
b$
\end{document}
(Note that after the "$a" there is a "\".)
When compiled with lualatex, this correctly produces a document with a single line containing the math expression "a b", with space between them.
When compiled with lwarpmk, this produces the following error:
Missing character: There is no ^M (U+000D) in font [lmmono8-regular]:!
(Incidentally, the ^M is a literal CR and mangles the console output.) An HTML file is output which contains ^M (again literal CR) in a math expression.
Under normal circumstances the document continues to compile and HTML output is produced, so this is easily spotted, but in a beamer slide it breaks the document entirely.
lwarp adds many mathjax command definitions in a hidden div at the start of the document:
<!--MathJax customizations:-->
<div class="hidden">
\(\newcommand{\footnotename}{footnote}\)
\(\def \LWRfootnote {1}\)
\(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\)
\(\let \LWRorighspace \hspace \)
\(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\)
\(\newcommand {\mathnormal }[1]{{#1}}\)
\(\newcommand \ensuremath [1]{#1}\)
\(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \)
...
I've had search engines (Bing, I believe) get confused by this and use it as text snippets in search results. Based on the MathJax documentation, I believe that most of these definitions could instead be put in the "Lwarp MathJax emulation code" inside the <script> tag that specifies the MathJax options, along the following lines (written with the help of ChatGPT-4):
MathJax = {
tex: {
macros: {
footnotename: "footnote",
LWRfootnote: "1",
footnote: ["{}^{\\mathrm{#1}}", 2, "\\LWRfootnote"], // Two arguments: optional #1 defaults to \LWRfootnote, #2 is discarded
footnotemark: ["{}^{\\mathrm{#1}}", 1, "\\LWRfootnote"], // Similar to footnote but without the second argument
LWRorighspace: "\\hspace",
hspace: "\\ifstar \\LWRorighspace \\LWRorighspace", // This might not work as expected because \ifstar isn't directly supported in MathJax
mathnormal: ["{#1}", 1],
ensuremath: ["#1", 1],
LWRframebox: ["\\fbox{#2}", 2, ""], // Default value for the first optional argument is handled as empty
framebox: ["\\LWRframebox", 1, ""], // Consumes and discards one optional argument, then hands over to LWRframebox
setlength: ["{}", 2],
addtolength: ["{}", 2],
setcounter: ["{}", 2],
addtocounter: ["{}", 2],
arabic: ["{}", 1],
number: ["{}", 1],
noalign: ["\\text{#1}\\notag\\\\", 1], // Ends with \\ (a TeX line break); each backslash is doubled in the JavaScript string
cline: ["{}", 1],
directlua: "\\text{(directlua)}",
luatexdirectlua: "\\text{(directlua)}",
protect: "{}",
// The commands related to \mathchar, \mathcode, \delcode, \delimiter might not be directly convertible
oe: "\\unicode{x0153}",
OE: "\\unicode{x0152}",
ae: "\\unicode{x00E6}",
AE: "\\unicode{x00C6}",
aa: "\\unicode{x00E5}",
AA: "\\unicode{x00C5}",
o: "\\unicode{x00F8}",
O: "\\unicode{x00D8}",
l: "\\unicode{x0142}",
L: "\\unicode{x0141}",
ss: "\\unicode{x00DF}",
SS: "\\unicode{x1E9E}",
dag: "\\unicode{x2020}",
ddag: "\\unicode{x2021}",
P: "\\unicode{x00B6}",
copyright: "\\unicode{x00A9}",
pounds: "\\unicode{x00A3}",
LWRref: "\\ref",
ref: "\\ifstar \\LWRref\\LWRref", // Again, \ifstar might not be supported as expected
multicolumn: ["#3", 3], // Assuming only the third argument is of interest
// \require{textcomp} doesn't have a direct equivalent, but MathJax might already support the required symbols
meta: ["\\langle \\textit{#1}\\rangle", 1],
intertext: ["\\text{#1}\\notag\\\\", 1], // Ends with \\ (a TeX line break); each backslash is doubled in the JavaScript string
Hat: "\\hat",
Check: "\\check",
Tilde: "\\tilde",
Acute: "\\acute",
Grave: "\\grave",
Dot: "\\dot",
Ddot: "\\ddot",
Breve: "\\breve",
Bar: "\\bar",
Vec: "\\vec"
}
}
};