davidmerfield / typeset
An HTML pre-processor for web typography
Home Page: https://typeset.lllllllllllllllll.com/
License: Creative Commons Zero v1.0 Universal
This lib looks great! But I believe in neither fairy tales nor magic.
I won't pass my content through a "black box" that magically promises enhancements.
Could you explain (in a blog post?) exactly what problem your library solves, and how?
Thanks
Would be nice to avoid messing with Mathjax or LaTeX by default.
When 'on github' is mentioned it is preceded with a redundant on. In other words 'on on github' is shown.
Void elements, or singletons, like img, hr, br and others take a closing forward slash in valid (X)HTML, e.g. <img src="foo.jpg" />.
When Typeset processes content with HTML, it removes those closing slashes, e.g. rendering the above as <img src="foo.jpg">.
Closing slashes are of course optional, but:
a) I don't think Typeset should be messing with tag syntax in the first place.
b) In HTML emails, using closing slashes is recommended for cross-email compatibility in all their crappy rendering engines.
c) In my case, I'm using MJML, which uses void/singleton elements for things like mj-image, and in that case a tag without a closing slash isn't valid.
My specific use might be an edge case, but I'm sure I'm not the only one formatting HTML emails.
Can Typeset avoid changing HTML tags?
Cheerio should be a dependency, not a dev dependency:
> Cannot find module 'cheerio'
Require stack:
- /Users/anandchowdhary/Projects/open-source/anandchowdhary.com/node_modules/typeset/src/eachTextNode.js
- /Users/anandchowdhary/Projects/open-source/anandchowdhary.com/node_modules/typeset/src/index.js
As part of my research on #52 I also studied whether the push/pull technique could be applied to blocks with text-align:right (that is, the punctuation would hang over the right margin instead of the left). I haven’t gotten it to work yet. It may be too complicated to be worthwhile. What I tried:
In the processor script, closing punctuation would need to be wrapped similarly to opening punctuation, though in the opposite order: the pull would precede the push. Also, the wrapping tag for the closing punctuation would need to be distinct from the opening one (e.g., push-open and push-closed rather than just push). This part seemed tractable.
In the CSS, however, I couldn’t come up with a way of styling push-closed and pull-closed to get behavior analogous to the usual push/pull pairs. The usual idea is that in the middle of a line, the two appear together, but at a line break, the push remains at the end of one line, the line break happens, and the pull appears at the beginning of the next. On the right edge, the pull would happen first, at the end of the line, and then presumably you’d want the push to wrap to the next.
More troublesome still, I couldn’t come up with a way to toggle this behavior purely with CSS. That is, in any given text block, the text is either aligned left or right (or neither), so only the opening push/pull pairs or the closing push/pull pairs should work (or neither). But there isn’t any way to write a CSS selector conditioned on the presence of another CSS property. If every right-aligned block were guaranteed to have, say, class="right", then you could have CSS selectors like .right push-closed and so on. But that requires the right-alignment to be encoded at “compile time”, rather than strictly in the CSS (where it should be).
I executed this:
$ npm install -g typeset
And I got
[email protected] /usr/local/lib/node_modules/typeset
├── [email protected]
├── [email protected]
└── [email protected] ([email protected], [email protected], [email protected], [email protected], [email protected])
But typeset-js was not available when I tried to run it:
zsh: command not found: typeset-js
OS X 10.10.5, iTerm 2, with Oh-My-Zsh.
Input:
<small><em>"A"</em></small>
Output:
<small><em><span class="pull-double">“</span>A”</em></small>
Expected:
<span class="push-double"></span><small><em><span class="pull-double">“</span>A”</em></small>
I would like to avoid the last 2 words in a paragraph from appearing on the last line by themselves.
For now I've been hacking around this by manually wrapping the last few words in an element with a class that typeset is configured to ignore (to avoid soft hyphens) and manually replacing the spaces between those words with non-breaking spaces (&nbsp;).
Was this functionality intentionally omitted? If not, I'm happy to take a shot at implementing it myself.
When double quotes are followed by a punctuation mark (e.g. comma or period), they are rendered as opening, instead of closing, double quotes.
I have HTML with entities like &lt; and &gt; in it, and when I pass this HTML through typeset, they are replaced with the actual < and > characters, so my HTML comes out incorrect. Check out the following example:
console.log(typeset(`
<!doctype html>
<html lang="en">
<p>Hello &lt;there&gt; you!</p>
`.trim()));
This produces the following output:
<!DOCTYPE html><html lang="en"><head></head><body><p>Hello <there> you!</there></p></body></html>
The &lt; and &gt; around the word there have now turned it into a <there> tag, with a closing tag as well just before </p> in the final output.
Here's a REPL demonstrating this test case: https://repl.it/@sharat87/BigheartedDefiantCones
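Whether the decoding happens in Typeset itself or in its HTML parser is not established in this report. As a rough sketch of one possible mitigation, text-node content could be re-escaped before it is written back into markup. The helper name escapeText is hypothetical and is not part of Typeset or cheerio:

```javascript
// Hypothetical helper: re-escape the characters that would otherwise be
// parsed as markup when a text node is serialized back to HTML.
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;") // ampersands first, so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeText("Hello <there> you!"));
```

This only covers the three characters that matter inside text content; attribute values would also need quotes escaped.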
Thank you very much for your work.
<p>Hello, <em>"Mr"</em> Fox.</p>
I need to insert a spacer when the adjacent sibling is a text node.
Add option to change the class name for the elements that this library creates.
Trying to run Typeset from the browser, with jQuery, I got this string:
<p>Yjarni Sigurðardóttir spoke to NATO from Iceland yesterday: "Light of my life, fire of my florins -- my sin, my soul. The tip of the tongue taking a trip to 118° 19' 43.5"."</p>
<p>"She's faster than a 120' 4" whale." <em>Piña co­ladas</em> were widely consumed in Götterdämmerung from 1880–1912. For the low price of $20 / year from Ex­hi­bits A–E... Then the <em>duplex</em> came forward. "Thrice the tower, he mounted the round gunrest, 'awaking' HTML. He can print a fixed num­ber of dots in a square inch (for in­stance, 600 × 600)."
</p>
turned into this:
<p><span class="pull-Y">Y</span>jarni Sigurðardót­tir spoke to <span<span class="push-c"></span> <span class="pull-c">c</span>lass="small-caps">NATO</span> from Ice­land yes­ter­day:<span class="push-double"></span> <span class="pull-double">“</span>Light<span class="push-o"></span> <span class="pull-o">o</span>f my life, fire<span class="push-o"></span> <span class="pull-o">o</span>f my florins&thinsp;&mdash;&thinsp;my sin, my soul.<span class="push-T"></span> <span class="pull-T">T</span>he tip<span class="push-o"></span> <span class="pull-o">o</span>f the tongue tak­ing a trip to 118° 19′ 43.5″.”</p>
<p><span class="pull-double">“</span>She’s faster than a 120′ 4″<span class="push-w"></span> <span class="pull-w">w</span>hale.” <em>Piña<span class="push-c"></span> <span class="pull-c">c</span>o­ladas</em> <span class="pull-w">w</span>ere<span class="push-w"></span> <span class="pull-w">w</span>idely<span class="push-c"></span> <span class="pull-c">c</span>on­sumed in Göt­ter­däm­merung from 1880–1912. For the low price<span class="push-o"></span> <span class="pull-o">o</span>f $20 / year from Ex­hi­bits<span class="push-A"></span> <span class="pull-A">A</span>–E…<span class="push-T"></span> <span class="pull-T">T</span>hen the <em>du­plex</em> <span class="pull-c">c</span>ame for­ward.<span class="push-double"></span> <span class="pull-double">“</span>Thrice the tower, he mounted the round gun­rest,<span class="push-single"></span> <span class="pull-single">‘</span>awak­ing’ <span<span class="push-c"></span> <span class="pull-c">c</span>lass="small-caps">HTML</span>. He<span class="push-c"></span> <span class="pull-c">c</span>an print a fixed num­ber<span class="push-o"></span> <span class="pull-o">o</span>f dots in a square inch (for in­stance, 600 × 600).”
</p>
I investigated the sequential calling of modules that continuously transform the initial text, and noticed that the HTML tags were not escaped by "quotes", "hyphenate" or "ligatures", but only after "smallCaps".
I couldn't understand why.
Lovely tool. Great work.
It might be nice to be able to enable or disable certain features such as hanging punctuation and hyphenation.
In my first attempted use of the tool, I found the hyphenation to be distracting for the particular type size and style I was using. There are probably cases where hanging punctuation is unnecessary.
I'm happy to work on a quick implementation of this to get the conversation rolling.
This is more of a pet peeve than anything, but in my opinion the ignore option should be represented by an array of selectors instead of one huge selector string, the reason being that it is more "intuitive": one would usually expect to ignore multiple selector strings instead of just one.
So I think the better approach would be to use an Array and join the array when doing the selection/ignorance(?). But then again if you're going into the direction of cutting down the code size I'm totally fine with that.
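A minimal sketch of the array-and-join idea, assuming the option may arrive either as a single selector string (the current form) or as an array. The function name toSelector is made up for illustration and does not exist in Typeset:

```javascript
// Hypothetical helper: accept either one selector string or an array of
// selectors and produce the single comma-separated selector used for matching.
function toSelector(ignore) {
  if (ignore === undefined || ignore === null) return "";
  // [].concat() wraps a lone string in an array and leaves arrays as they are.
  return [].concat(ignore).filter(Boolean).join(", ");
}

console.log(toSelector([".no-typeset", "pre", "code"]));
console.log(toSelector("pre, code"));
```

Accepting both forms would keep existing callers working while allowing the array style.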
Seems to fail for this
'AAA...
Currently each typographic feature in the library runs src/eachTextNode.js. This repetition is inefficient.
We should instead compose a function of all the modules that need to modify text nodes, then pass that function to eachTextNode once.
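Roughly what that composition could look like, as a sketch only. The two modules and the eachTextNode stand-in below are simplified assumptions (plain string-to-string functions over an array of strings) and do not mirror the real signatures in src/:

```javascript
// Hypothetical text modules: each takes a string and returns a new string.
const ellipsis = (text) => text.replace(/\.\.\./g, "\u2026");
const copyright = (text) => text.replace(/\(c\)/gi, "\u00A9");

// Combine the modules into one function that applies them left to right.
function composeModules(modules) {
  return (text) => modules.reduce((result, fn) => fn(result), text);
}

// Stand-in for src/eachTextNode.js: visits every text node exactly once
// and counts the visits so the single pass can be checked.
function eachTextNode(nodes, transform) {
  let visits = 0;
  const output = nodes.map((node) => {
    visits += 1;
    return transform(node);
  });
  return { output, visits };
}

const result = eachTextNode(
  ["Wait...", "(c) 2016"],
  composeModules([ellipsis, copyright])
);
console.log(result.output, result.visits);
```

With two modules and two nodes, the tree is walked once rather than once per module.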
Screen readers stumble on something like <span>a</span>t, reading it as 'A T' instead of 'at'.
This makes the optical margin alignment and hanging punctuation output unusable for people using screen readers.
We should be able to solve this using:
<p aria-label="at"><span>a</span>t</p>
For the life of me, I can't figure out how to import this as a dependency using webpack. What am I missing?
I'm testing the client version from Chrome, using jQuery:
Sometimes this line doesn't work because .data is undefined:
childNode.data = doThis(childNode.data, childNode);
When it happens everything fails.
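The reason .data is sometimes undefined isn't clear from this report. As a defensive sketch, the loop could skip any child that doesn't carry string data. The node objects below are plain mock objects, and transformTextNodes is a hypothetical name, not the library's function:

```javascript
// Hypothetical guard: only transform children that actually have string data.
function transformTextNodes(childNodes, doThis) {
  for (const childNode of childNodes) {
    if (typeof childNode.data !== "string") continue; // e.g. element nodes
    childNode.data = doThis(childNode.data, childNode);
  }
  return childNodes;
}

const mockNodes = [{ data: "hello" }, { tagName: "br" }, { data: "world" }];
transformTextNodes(mockNodes, (text) => text.toUpperCase());
console.log(mockNodes);
```

This avoids the failure but doesn't explain why such nodes reach that line in the browser build.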
I was planning to add a line in punctuation.js to replace a hyphen between two numbers without a space with an en-dash (14-31 days → "14–31 days").
It looks like for ranges with a space, we're expecting it to be replaced with an em-dash. I'm unsure if this is by design but the test HTML (and the Chicago style) calls for an en-dash. Should this be corrected or is there a different rule for ranges with spaces that I'm missing?
Thanks!
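For reference, a sketch of the unspaced case described above. The function name numericRangeDashes is hypothetical and this is not the code in punctuation.js:

```javascript
// Hypothetical helper: a hyphen directly between two digits becomes an
// en dash (U+2013). Lookarounds keep the digits out of the match, so
// chained ranges such as "1-2-3" are handled in a single pass.
function numericRangeDashes(text) {
  return text.replace(/(?<=\d)-(?=\d)/g, "\u2013");
}

console.log(numericRangeDashes("14-31 days"));
```

Note that a rule this simple would also rewrite ISO dates and phone numbers (2016-01-15, 555-1234), so a real implementation would probably need exclusions.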
Hey! Thanks for the shout-out in the README. I maintain Normalize-OpenType.css.
If you are thinking of incorporating any features, let me know. I’d be happy to help. I have also been using and contributing very small pieces to Typogr.js for a while, which is in a very similar space. Perhaps having a consensus on a class name or tag between all three for small caps would be a start.
I have been partial to using abbr tags for small caps, but I can understand the appeal of just using .caps or .small-caps.
I would also recommend dropping the ligature support; these can be handled entirely with CSS now. My understanding is that fi and fl as a single glyph are kind of a deprecated or frowned-upon part of Unicode, since OpenType and font-feature-settings take care of that now while preserving the f, l, and i as separate characters.
Let me know what you think! Nice job on the library so far, always nice to find other people that care this much about this stuff.
An example to illustrate the issue (note the quotation marks):
Input:
<p>How about "<a href="/foo">that</a>" said the old man.</p>
Output:
How about ”that“ said the old man.
Expected:
How about “that” said the old man.
Why does this happen?
This is because each text node is handled individually. In the example, the three text nodes are How about "
, that
and " said the old man.
.
Solutions
Compute the text content of block elements (p tags, blockquotes) and run the substitution on that, instead of on the nodes individually?
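A sketch of that approach, with the text nodes modelled as an array of strings. It assumes the substitution replaces characters one for one, so the combined result can be cut back to the original node lengths. Both function names are hypothetical, and the quote rule is a deliberately simple stand-in:

```javascript
// Stand-in quote rule: a straight double quote opens at the start of the
// text or after whitespace or an opening bracket, and closes otherwise.
function curlDoubleQuotes(text) {
  return text.replace(/"/g, (match, offset) => {
    const previous = text[offset - 1];
    const opens = previous === undefined || /[\s(\[]/.test(previous);
    return opens ? "\u201C" : "\u201D";
  });
}

// Run the substitution on the joined text of the block, then slice the
// result back into pieces matching the original text nodes.
function curlAcrossNodes(nodeTexts) {
  const combined = curlDoubleQuotes(nodeTexts.join(""));
  let position = 0;
  return nodeTexts.map((text) => {
    const piece = combined.slice(position, position + text.length);
    position += text.length;
    return piece;
  });
}

const nodes = ['How about "', "that", '" said the old man.'];
console.log(curlAcrossNodes(nodes));
```

The length-preserving assumption would not hold for substitutions such as "..." to a single ellipsis character, so those would need an offset map instead of plain slicing.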
Howdy, love the library.
Running into some issues regarding your use of the for...in statement for array iteration. for...in is for enumerable properties of collections, not iteration through iterables (e.g. Arrays). I'd recommend using array.forEach(), for...of (broadly supported by Node >= 0.12, dunno about browser compat.) or a standard for loop, which is obviously more verbose but more appropriate than for...in.
for...in will throw when some dingus (not me, but there are lots of dinguses out there) modifies the Array prototype, because for...in will kick back any additional enumerable properties, not just the elements of the array.
Copy-pasta'd some relevant code from the MDN demonstrating the issue.
Object.prototype.objCustom = function () {};
Array.prototype.arrCustom = function () {};
let iterable = [3, 5, 7];
iterable.foo = "hello";
for (let i in iterable) {
  console.log(i); // logs 0, 1, 2, "foo", "arrCustom", "objCustom"
}
for (let i of iterable) {
  console.log(i); // logs 3, 5, 7
}
Add Array.prototype.foo = () => 'bar' to index.js and run the test suite. It'll run through all the places where that needs to be changed.
Cheers!
Hi, can you add a small section in the README about how to use it in the browser? So far I found that I can do typeset(jq("p"));
Can I do this?
typeset(jq("p"), {
disable: ['hyphenate'], // array of features to disable
});
How integral is cheerio to the problem you're solving? What is its biggest benefit?
This is a great library! Thanks so much for maintaining it.
I'd love to see an option for controlling widows. For instance typogr.js uses the "widont" pattern of replacing the space between the last two words in a block with &nbsp;.
(Typographically a widow is actually a single line on a wrapped paragraph rather than a single word on a wrapped line, but on the web the usage seems to be mostly about not leaving a single word, since multicolumn layouts are less common.)
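The "widont" pattern is small enough to sketch. This version works on a plain string and inserts the no-break space character (U+00A0) rather than the &nbsp; entity. The function name widont is borrowed from the pattern's name and is not an existing Typeset option:

```javascript
// Replace the last run of whitespace between two words with a no-break
// space, so the final word cannot wrap onto a line of its own.
function widont(text) {
  // The lookahead requires that only one more word (plus optional trailing
  // whitespace) follows, which pins the match to the final gap.
  return text.replace(/\s+(?=\S+\s*$)/, "\u00A0");
}

console.log(JSON.stringify(widont("a quick brown fox")));
```

A single word, or text with only trailing whitespace, is returned unchanged.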
The build.js script does not seem to produce any .js file and hangs after outputting "done". It also depends on numerous non-listed Node modules.
js build.js
BUILDING!
DONE!
[Have to type Ctrl-C to exit]
I imagine that Typeset processing would work well with raw markdown. It seems so, but maybe I'm wrong.
Is there something I should be worried about if preprocessing markdown with Typeset? (Maybe something that would break markdown: the \n\n---\n\n that represents an <hr>, for example.)
It would be great to have a command line tool to use typeset as a post-processor (bearing in mind that typeset is sometimes a reserved shell command).
À, Á, Ä, and Â should be treated just the same as A
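One way this could work, sketched under the assumption that the pull/push class is derived from the character itself (as the pull-A style class names seen in Typeset's output suggest). The helper names are hypothetical:

```javascript
// Reduce an accented letter to its base letter: NFD splits "À" into "A"
// followed by a combining mark (U+0300), and the marks are then stripped.
function baseLetter(character) {
  return character.normalize("NFD").replace(/\p{M}/gu, "");
}

// Hypothetical class-name builder using the base letter.
function pullClass(character) {
  return "pull-" + baseLetter(character);
}

console.log(["À", "Á", "Ä", "Â"].map(pullClass));
```

Letters without a canonical decomposition (Ø, for example) pass through unchanged, so they would still need their own metrics.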
Found three small issues, all related to different behaviour when processing greek characters, instead of latin.
Problem 1: single quote before space
| | correct | wrong |
| input | english' english | ελληνικά' ελληνικά |
| output | english’ english | ελληνικά′ ελληνικά |
Problem 2: double quotes before full stop
| | correct | wrong |
| input | "english". | "ελληνικά". |
| output | “english”. | “ελληνικά“. |
Problem 3: double quotes before comma
| | correct | wrong |
| input | "english", | "ελληνικά", |
| output | “english”, | “ελληνικά“, |
A cool feature would be to automatically encapsulate instances of manually configured (and perhaps predefined) abbreviations with <abbr title="…"></abbr>.
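A rough sketch of how such a feature might behave on a plain string. The function name and the shape of the abbreviations map are made up for illustration, and the titles are assumed to be safe to place in an attribute as they are:

```javascript
// Hypothetical helper: wrap whole-word occurrences of configured
// abbreviations in <abbr title="...">. Titles are inserted unescaped.
function wrapAbbreviations(text, abbreviations) {
  const keys = Object.keys(abbreviations);
  if (keys.length === 0) return text;
  const escaped = keys
    .sort((a, b) => b.length - a.length) // prefer the longest match
    .map((key) => key.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp("\\b(" + escaped.join("|") + ")\\b", "g");
  return text.replace(
    pattern,
    (match) => `<abbr title="${abbreviations[match]}">${match}</abbr>`
  );
}

console.log(
  wrapAbbreviations("NATO met today", {
    NATO: "North Atlantic Treaty Organization",
  })
);
```

In the library this would need to run per text node and skip text that is already inside an abbr element.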
Shouldn't feet and inches remain as straight quotes rather than curly?
http://practicaltypography.com/foot-and-inch-marks.html
Is there a way to specify the language in the case of processing HTML fragments?
Add option to wrap a sequence of characters in a node which could be targeted? This would not be default behaviour and is naturally dependent on the typeface e.g.
typeset("Ave, Imperator, morituri te salutant", {kern: ["av", "at"]})
.kern-av {letter-spacing: 0.96em}
.kern-at {letter-spacing: 0.98em}
A lot of the metrics in typeset.css can be generated automatically based on the user's typeface. I should build a tool to do this, with an interface for tweaking the optical margin alignment of certain characters.
I don't know the cause just yet...
At the end of
http://chrbutler.com/back-to-the-real-future
"They lie.."
http://chrbutler.com/the-gift-in-our-work
Leading hanging punctuation failed
http://chrbutler.com/the-way-to-make-good-things-is-to-make-many-things
http://chrbutler.com/everyone-is-someone-elses-marketer
"productivity future"
http://chrbutler.com/neighborhoods-the-anti-algorithm
issue with escaped
http://chrbutler.com/well-get-there-fast-and-then-well-take-it-slow
http://chrbutler.com/the-information-is-not-entirely-online