jorendorff / es-spec-html
An HTML version of the ECMAScript draft specification autogenerated from the source
Home Page: http://ecma-international.org/ecma-262/5.1/
Ugh, Word macro turds in the HTML.
The Intl spec needs to examine the macros, and that's fine, but we should still strip them out at some point.
https://people.mozilla.com/~jorendorff/es6-draft.html#sec-4.2.1
There are a few places where the spec should use <h2>...</h2> but instead has <p><b>...</b></p>:
18.2.6.1 / Runtime Semantics
B.1.1 / Static Semantics
B.1.4 / Syntax
B.1.4 / Pattern Semantics
D / In Edition ... ?
E.g. see Figure 1 — Object/Prototype Relationships in section 4.2.1: http://ecma-international.org/ecma-262/5.1/#sec-4.2.1
It should look something like this: http://es5.github.com/x4.html#x4.2.1
In 15.1.3 URI Handling Function Properties, Decode Abstract Operation,
step 4.d.vi.2.a and 4.d.vi.3.a, the script prints warnings because it is
misinterpreting the Word document. transform.py seems to be producing
empty markers.
I think the output is fine, just bogus warnings.
I think this is a bug in the Word document. /cc: @allenwb.
http://ecma-international.org/ecma-262/5.1/ has the following in its source:
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://ecma-international.org/ecma-262/5.1/");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
var pageTracker = _gat._getTracker("UA-6146537-1");
pageTracker._trackPageview();
</script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-6146537-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
As you can see, there are two different Google Analytics snippets, the first of which is being used incorrectly. (The erroneously edited gaJsHost line makes it try to load ga.js from ecma-international.org rather than from Google.)
To prevent mistakes like this in the future, and to keep the manual post-processing to a minimum, it would probably be a good idea to have the HTML generator script automatically insert the correct snippet.
Here’s an optimized version of that script which could be used in this case:
<script>
window._gaq = [['_setAccount', 'UA-6146537-1'], ['_trackPageview']];
(function(d, t) {
var g = d.createElement(t),
s = d.getElementsByTagName(t)[0];
g.src = '//www.google-analytics.com/ga.js';
s.parentNode.insertBefore(g, s);
}(document, 'script'));
</script>
Update: The online version has now been fixed to include this correct, optimized snippet.
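A minimal sketch of how the generator could insert the snippet itself, assuming the finished page is available as a string and that placing the snippet just before the closing body tag is acceptable. The name insert_analytics is hypothetical and is not part of transform.py.

```python
import re

# The optimized snippet quoted above, kept in one place so it is never hand-edited.
ANALYTICS_SNIPPET = """<script>
window._gaq = [['_setAccount', 'UA-6146537-1'], ['_trackPageview']];
(function(d, t) {
var g = d.createElement(t),
s = d.getElementsByTagName(t)[0];
g.src = '//www.google-analytics.com/ga.js';
s.parentNode.insertBefore(g, s);
}(document, 'script'));
</script>
"""


def insert_analytics(html, snippet=ANALYTICS_SNIPPET):
    """Return html with snippet inserted before the last </body> tag.

    Hypothetical helper: if the page has no </body>, the snippet is appended.
    """
    if snippet in html:
        return html  # already present; do not insert it twice
    matches = list(re.finditer(r"</body\s*>", html, re.IGNORECASE))
    if not matches:
        return html + snippet
    index = matches[-1].start()
    return html[:index] + snippet + html[index:]
```

Calling it as the last step of generation would remove the need to paste the snippet in by hand after each run.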
It's unsightly.
Sans-serif text in the middle of a serif run should get a smaller font-size to compensate for differences in x-height among the fonts we will end up using; of course, without either font-size-adjust or switching to web fonts or fonts everyone has, it's impossible to get this perfect across platforms.
StringValue is defined in 7.6; the others are more obvious
Currently [[Call]] is not a link, because there is not a single obvious place for that to link to.
But [[Call]] should be clickable. On consideration I think clicking it should bring up a menu of links:
[[Call]] method
…described (Table 9)
…of an ordinary Function object (8.3.16.1)
…of an exotic bound Function object (8.4.1.1)
…of an exotic Proxy object (8.5.14)
Like an index.
Making a real index is a lot of work; faking it is what fixup_links does.
The paragraph "Property keys are used to access properties and their values." has a bullet. This isn't in the Word doc.
In 15.10.2.6, last algorithm, the table should be inside the list and the last step should be numbered 4.
We erroneously make a new list and number it "1." instead.
In B.2.3.2 some lists are messed up. The indentation and numbering are both wrong. There should be one list nested inside another, and instead we have three separate lists.
We can improve by recovering list structure (partly or entirely) from the appearance of the text (particularly indentation and numbering/bullets), not just from the Word list structures, which can be a bit bizarre.
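One way to picture recovering nesting from indentation alone is sketched below. It assumes each paragraph has already been reduced to an (indent, text) pair, which is a hypothetical simplification and not the data structure transform.py actually uses; numbering and bullets are ignored here.

```python
def nest_by_indentation(paragraphs):
    """Group (indent, text) pairs into nested lists.

    Each item is returned as [text, children], where children holds the
    items that followed it with a deeper indentation.
    """
    root = []
    # Stack of (indent, children) pairs; the sentinel owns the top-level list.
    stack = [(float("-inf"), root)]
    for indent, text in paragraphs:
        # Pop back out to the nearest enclosing item that is less indented.
        while stack[-1][0] >= indent:
            stack.pop()
        item = [text, []]
        stack[-1][1].append(item)
        stack.append((indent, item[1]))
    return root


steps = [(0, "step 1"), (18, "step a"), (18, "step b"), (0, "step 2")]
nested = nest_by_indentation(steps)
```

With this input, "step 2" lands in the same outer list as "step 1" instead of starting a new list, which is the behavior B.2.3.2 needs.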
I just realized some of them are described adequately somewhere; for example, see Table 37 in es6 draft rev 18.
[[Map]], [[MapNextKind]], and [[MapIterationKind]] should link to rows in that table.
NewFunctionEnvironment
Function environment record
Global environment record (note this will fight with "the global environment")
Function Declaration Instantiation
On https://github.com/jorendorff/es-spec-html, consider hitting the edit button on the left of “An HTML version of the ECMAScript draft specification autogenerated from the source”, and setting the URL field (currently empty) to http://ecma-international.org/ecma-262/5.1/.
Meaning it looks like a child of 26.2.3.2 and isn't linkable
At least some subheadings in the document are not marked with any paragraph style at all; each is just a Normal paragraph with bold text in it. These should be converted to h2 the same as any other paragraph with identical appearance.
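A rough sketch of the check, assuming a paragraph has been reduced to its style name plus a list of (text, is_bold) runs. That representation and the name is_unstyled_heading are hypothetical; the real script works on its own element tree.

```python
def is_unstyled_heading(style, runs):
    """Return True for a Normal paragraph whose visible text is all bold.

    runs is a list of (text, is_bold) pairs. Whitespace-only runs are
    ignored because their formatting is invisible.
    """
    if style not in (None, "Normal"):
        return False
    visible = [(text, bold) for text, bold in runs if text.strip()]
    return bool(visible) and all(bold for _, bold in visible)
```

A paragraph that passes this check could then be emitted as h2 instead of p. A real implementation would probably also want a length limit so that a fully bold body paragraph is not promoted by mistake.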
Commit dba375f removed changes that made the Word→HTML converter use UTF-8 in the Python source code and in the generated HTML. I think that commit was a change in the wrong direction:
Another bogus warning because we incorrectly interpret the Word document as having wrong numbering.
I think the bug is that numbering state is per-num, but the script is storing it per-abstractNum.
/Users/jorendorff/dev/es-spec-html/fixups.py:507: UserWarning: Word marker is '5.\t', HTML will show '1.\t'
warn("Word marker is {!r}, HTML will show {!r}".format(marker_str, html_marker_str))
<li style="-ooxml-indentation: 0.0pt">If Type(<span style="font-style: italic">number</span>) is not Number, return <span
style="font-weight: bold">false</span>.</li>
/Users/jorendorff/dev/es-spec-html/fixups.py:507: UserWarning: Word marker is '6.\t', HTML will show '2.\t'
warn("Word marker is {!r}, HTML will show {!r}".format(marker_str, html_marker_str))
<li style="-ooxml-indentation: 0.0pt">If <span style="font-style: italic">number</span> is <span style="font-weight:
bold">NaN</span>, <span style="font-weight: bold">+∞</span>, or <span style="font-weight: bold">−∞</span>,
return <span style="font-weight: bold">false</span>.</li>
/Users/jorendorff/dev/es-spec-html/fixups.py:507: UserWarning: Word marker is '7.\t', HTML will show '3.\t'
warn("Word marker is {!r}, HTML will show {!r}".format(marker_str, html_marker_str))
<li style="-ooxml-indentation: 0.0pt">Otherwise, return <span style="font-weight: bold">true</span>.</li>
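The hypothesis above can be sketched as counters keyed by num rather than by abstractNum. This only illustrates the idea and is not the code in fixups.py; the class and method names are hypothetical, and start values, overrides, and the other restart rules of real OOXML numbering are ignored.

```python
from collections import defaultdict


class NumberingState:
    """Track the current list number for each (num_id, level) pair."""

    def __init__(self):
        self._counters = defaultdict(int)

    def next_marker(self, num_id, level=0):
        """Advance the counter for this num and level and return its marker."""
        self._counters[(num_id, level)] += 1
        # A new item at this level restarts any deeper levels of the same num.
        for key in list(self._counters):
            if key[0] == num_id and key[1] > level:
                del self._counters[key]
        return "{}.\t".format(self._counters[(num_id, level)])
```

Because the key is the num id, two lists that share one abstract definition count independently, and a list that resumes after an interruption continues from where it left off instead of restarting at 1.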
In 8.2.4.2, “Let succeeded be the result of calling the [[Set]] internal method of base” — the document has all this in a single paragraph, but the OOXML markup clearly has two separate paragraphs. Not sure what's going on.
The table in section 7.2 lists \uFEFF and Other category “Zs” in the same table row. They should probably be separate rows.
In 7.6.1.1 and 7.6.1.2, for example, the keywords are shown in the default font, but the document shows them in a monospaced font.
In Annexes D and E, many paragraphs are missing a section number at the start. They're easy to spot because they start with a colon instead.
Table captions are supposed to have the Tabletitle paragraph style. This one doesn't.
I'll email @allenwb about it -- this is perhaps easier to fix in Word than in Python.
The resulting HTML file is ~1.6 MB, which is enough to freeze some modern browsers (especially if you already have some tabs open).
Would it be possible to generate both a single-page and a multi-page version, i.e. one page per chapter?
P.S. es5.github.io (repo) uses a script that takes the resulting single-page HTML file and splits it up.
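A minimal sketch of the splitting step, assuming every chapter in the generated body starts with an <h1> element. That assumption about the markup is hypothetical, as is the function name; cross-page link rewriting, which a real multi-page version would need, is left out.

```python
import re


def split_chapters(body_html):
    """Split body markup into chunks, each starting at an <h1> heading.

    Any markup before the first <h1> is returned as the first chunk.
    """
    starts = [m.start() for m in re.finditer(r"<h1\b", body_html)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    ends = starts[1:] + [len(body_html)]
    return [body_html[s:e] for s, e in zip(starts, ends)]
```

Each chunk could then be wrapped in the shared page header and footer and written to its own file, with the single-page version still produced from the unsplit string.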
The first paragraph of 15.13.6 is, bizarrely, a numbered heading. This throws off the numbering for all other subsections.
Axel Rauschmayer points out:
“KeyedDestructuringAssignmentEvaluation” is mentioned once and could be a link to #sec-runtime-semantics-keyeddestructuringassignmentevaluation
Opening this because there are two options: 1) just add it to specific_link_source_data_lang; or 2) do something more general (like what the script does for algorithm names; see title_as_algorithm_name).
Bullet 5 in the Word doc is:
5. If F’s has a [[HomeObjectNeedsSuper]] internal slot is true, then
a. Let home be the value of F’s [[HomeObject]] internal slot.
b. If home is undefined, then throw a ReferenceError exception.
a.c. Set envRec’s HomeObject to homethe value of F’s [[HomeObject]] internal slot.
b.d. Set envRec’s MethodName to the value of F’s [[MethodName]] internal slot.
The web version still uses the struck-through version.
The Scrap Heap is 25 screensful of text at the end of the document, containing assorted oddments that have been deleted from the main body. I've been confused by it more than once. ("Oh, wait, none of this is part of the proposed spec, I'm in the Scrap Heap.")
It probably should be a separate Word document. Failing that, how about we just strip it from the HTML version?
The document shows only 1 paragraph beginning
NOTE ECMAScript differs from the Java programming language...
The OOXML markup in document.xml clearly shows two paragraphs though (with a <del> element at the end of the first one -- maybe that is what causes them to be joined?) and this is currently being rendered as two HTML paragraphs.
In 15.7.3.6, there is a NOTE containing a list with just one item, requiring totally special list markup.
The table should be inside step 3. Since we don't figure that out, step 4 is treated as a new list, and gets the number 1.
In 8.3.16.7:
<li><span style="font-family: Times New Roman">Let <i>functionPrototype</i> be the intrinsic object</span>
%FunctionPrototype%.</li>
In 8.5.1:
<li><span style="font-family: Times New Roman">Let <i>trap</i> be the result of <a
href="#sec-9.3.7">GetMethod</a>(<i>handler</i>,</span> "<code>getPrototypeOf</code>").</li>
and so on.
This is not all that mysterious; we just need to do a better job figuring out what the default font for a paragraph is supposed to be. Right now we check the first and last span and see if they use the same font. These paragraphs have a few characters at the end that are the wrong font, but it's not noticeable because it's just punctuation.
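One alternative to comparing the first and last span is to pick the font that covers the most characters. The sketch below assumes a paragraph has been reduced to (font_name, text) pairs; that representation and the name default_font are hypothetical, and this is a different heuristic from the one the script currently uses.

```python
from collections import Counter


def default_font(spans):
    """Pick the font that covers the most characters in a paragraph.

    spans is a list of (font_name, text) pairs in document order.
    Returns None for a paragraph with no text.
    """
    weights = Counter()
    for font, text in spans:
        weights[font] += len(text)
    if not weights or max(weights.values()) == 0:
        return None
    return weights.most_common(1)[0][0]
```

With this rule a few stray characters of punctuation in another font at the end of a step no longer change the answer, so the wrapper span around most of the paragraph would not be emitted.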
When the grammar fixups were originally written, we never had lines like
ClassTail: ClassHeritageopt { ClassBody }
in the document. Grammar productions were always on two lines. Now that we have one-line productions, the grammar fixups need to be fixed up.
The second group of ObjectLiteral productions is indented; it shouldn't be.
Really there needs to be a way to establish an id= attribute on the particular paragraph that defines something, and link to that.
The links for "Assert:" all point to "5.2 Algorithm conventions" which is a massive wall of text.
All the section numbers changed in Revision 18, and it will break all incoming links to sections. Need more resilient section ids.
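A sketch of one way to get ids that survive renumbering: derive them from the section title instead of the section number. The function name and the rule for disambiguating repeated titles are assumptions, not what the script does today.

```python
import re


def section_id(title, used):
    """Return a title-based id such as 'sec-static-semantics'.

    used is a set of ids already handed out; a numeric suffix is added
    when the same title occurs more than once.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    base = "sec-" + (slug or "untitled")
    candidate = base
    n = 2
    while candidate in used:
        candidate = "{}-{}".format(base, n)
        n += 1
    used.add(candidate)
    return candidate
```

The suffix rule still depends on document order, so repeated titles such as "Static Semantics" would probably need the parent section's title folded in to be fully stable.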
@@ToPrimitive has its name linked.
In 15.10.2.5 there are a few NOTEs where div.note does not extend to cover the whole note.
Have to fix it heuristically.
See sections 7.6.1.2, 7.7, 7.8.3, etc.
Sometimes the document uses tables for layout; we should notice that there aren't any headings or borders in such a table and give it a special HTML class.
http://people.mozilla.org/~jorendorff/es6-draft.html#sec-grammar-notation
The example that goes VariableDeclaration : BindingIdentifier Initialiser _[In] _opt is rendered without subscripts.
Properties are accessed by name, using either the dot notation:
MemberExpression . IdentifierName
CallExpression . IdentifierName
or the bracket notation:
MemberExpression [ Expression ]
CallExpression [ Expression ]
Some of this is supposed to be indented.
allenwb writes:
BTW, Just in case you didn't notice. The numbering differences cause some of the
cross references to be broken. For example, see: http://people.mozilla.org/~jorendorff/es6-draft.html#sec-13.1.1
Yup, they're busted all right. It's due to the numbering snafu, where the doc now contains section numbers like "13.0.1", on purpose, and my numbering code is generating "13.1.1" for some reason. That's the thing to fix.
<p><span class="prod"><span class="nt">AssignmentElementList</span> <span class="geq">:</span> <span
class="nt">Elision</span></span> <span style="font-family: sans-serif"><sub>opt</sub></span> <i>AssignmentElement</i></p>
Hard to imagine where this is coming from.
In 5.1.6, “If the phrase “[empty]” appears”, [empty] should be shown in the same font as in actual grammar productions. Same for “[lookahead ∉ set]” in the next paragraph. Also “[no LineTerminator here]”.