fay59 / x86doc

HTML representation of the Intel x86 instructions documentation.

Home Page: http://www.felixcloutier.com/x86

License: The Unlicense

Python 99.49% CSS 0.51%

x86doc's Issues

Footnotes not separated out to the end of the Description section

For example, http://felixcloutier.com/x86/MOVDQU.html

This instruction can be used to load a YMM register from a 256-bit memory

If alignment checking is enabled (CR0.AM = 1, RFLAGS.AC = 1, and CPL = 3), an alignment-check exception (#AC) may or may not be generated (depending on processor implementation) when the operand is not aligned on an 8-byte boundary.

location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers.

It should read "This instruction can be used to load a YMM register from a 256-bit memory location, to store ...", with the "1. If alignment checking..." footnote placed at the end of the whole section.

Intel's PDF puts footnotes at the end of a page, with a horizontal line separating them from the rest of the text.

This issue is quite confusing for the MOV to/from segment registers entry (which isn't currently on the site at all) because the footnote text almost looks like it could be continuing on from the break in the main text.


I searched for footnotes in the PDF by searching for 1. (with a trailing space).
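A minimal sketch of the separation described above, assuming the extracted text is available as a list of paragraph strings and that footnotes start with a number, a period, and a space (the same pattern used for the PDF search). The function name `split_footnotes` and the sample data are hypothetical and not part of x86doc.

```python
import re

# Footnotes are assumed to start with "<number>. ", as in the PDF search above.
FOOTNOTE_RE = re.compile(r"^\d+\. ")

def split_footnotes(paragraphs):
    """Move footnote paragraphs out of the body and rejoin split sentences."""
    body, notes = [], []
    for p in paragraphs:
        if FOOTNOTE_RE.match(p):
            notes.append(p)
        elif (body
              and not body[-1].rstrip().endswith((".", ":", "?", "!"))
              and p[:1].islower()):
            # The previous body fragment was cut mid-sentence: glue this one on.
            body[-1] = body[-1].rstrip() + " " + p.lstrip()
        else:
            body.append(p)
    return body, notes

# Shortened, illustrative version of the MOVDQU text quoted above.
sample = [
    "This instruction can be used to load a YMM register from a 256-bit memory",
    "1. If alignment checking is enabled, an alignment-check exception (#AC) may or may not be generated.",
    "location, to store the contents of a YMM register into a 256-bit memory location.",
]
body, notes = split_footnotes(sample)
```

This heuristic would also treat a genuine numbered list in the body as footnotes, so it would likely need an extra signal (font size, position on the page) before being used on the whole manual.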

Print PDF issue

Hi,
In step 3 of "How To Run", you say that we need to print the PDF before running the extract.py script.
I used CutePDF to print (vol A & B) to PDF, but I got an error when running extract.py.
Could you please explain step 3 in more detail?

Regards.

Another extraction fails (and an offer)

Offer

If you fix my error, I will convert the output to a set of Python dictionaries.

  • A dictionary using opcode hex values as keys to access opcode name and info
  • A dictionary using opcode name to access opcode hex values and info.

Hopefully, you will want to pull my dictionaries into your offering.
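The two dictionaries offered above could look roughly like this. It is only a sketch: the record fields (`opcode`, `mnemonic`, `summary`) and the sample rows are assumptions for illustration, not output of extract.py. Each key maps to a list because one mnemonic can have several encodings.

```python
from collections import defaultdict

# Illustrative records; the field names are hypothetical.
records = [
    {"opcode": "37", "mnemonic": "AAA", "summary": "ASCII Adjust After Addition"},
    {"opcode": "D5 0A", "mnemonic": "AAD", "summary": "ASCII Adjust AX Before Division"},
    {"opcode": "D5 ib", "mnemonic": "AAD", "summary": "ASCII Adjust AX Before Division"},
]

def build_indexes(records):
    """Return (by_opcode, by_mnemonic); each value is a list of records."""
    by_opcode = defaultdict(list)
    by_mnemonic = defaultdict(list)
    for rec in records:
        by_opcode[rec["opcode"]].append(rec)
        by_mnemonic[rec["mnemonic"]].append(rec)
    return dict(by_opcode), dict(by_mnemonic)

by_opcode, by_mnemonic = build_indexes(records)
```

Looking up `by_mnemonic["AAD"]` then returns both encodings, and `by_opcode["37"]` returns the AAA record.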

Question

Is it unreasonable to use extract.py on the full converted PDFs as opposed to clipped at AAA?

Error seen during extraction

Writing to html/FCLEX:FNCLEX.html
[<OpenTag p >, <OpenTag em >, <OpenTag sup >]
[<OpenTag p >, u'W', u'h', u'e', u'n', u' ', u'o', u'p', u'e', u'r', u'a', u't', u'i', u'n', u'g', u' ', u'a', u' ', u'P', u'e', u'n', u't', u'i', u'u', u'm', u' ', u'o', u'r', u' ', u'I', u'n', u't', u'e', u'l', u'4', u'8', u'6', u' ', u'p', u'r', u'o', u'c', u'e', u's', u's', u'o', u'r', u' ', u'i', u'n', u' ', u'M', u'S', u'-', u'D', u'O', u'S', u'*', u' ', u'c', u'o', u'm', u'p', u'a', u't', u'i', u'b', u'i', u'l', u'i', u't', u'y', u' ', u'm', u'o', u'd', u'e', u',', u' ', u'i', u't', u' ', u'i', u's', u' ', u'p', u'o', u's', u's', u'i', u'b', u'l', u'e', u' ', u'(', u'u', u'n', u'd', u'e', u'r', u' ', u'u', u'n', u'u', u's', u'u', u'a', u'l', ' ', u'c', u'i', u'r', u'c', u'u', u'm', u's', u't', u'a', u'n', u'c', u'e', u's', u')', u' ', u'f', u'o', u'r', u' ', u'a', u'n', u' ', u'F', u'N', u'C', u'L', u'E', u'X', u' ', u'i', u'n', u's', u't', u'r', u'u', u'c', u't', u'i', u'o', u'n', u' ', u't', u'o', u' ', u'b', u'e', u' ', u'i', u'n', u't', u'e', u'r', u'r', u'u', u'p', u't', u'e', u'd', u' ', u'p', u'r', u'i', u'o', u'r', u' ', u't', u'o', u' ', u'b', u'e', u'i', u'n', u'g', u' ', u'e', u'x', u'e', u'c', u'u', u't', u'e', u'd', u' ', u't', u'o', u' ', u'h', u'a', u'n', u'd', u'l', u'e', u' ', u'a', u' ', u'p', u'e', u'n', u'd', u'i', u'n', u'g', u' ', u'F', u'P', u'U', u' ', u'e', u'x', u'c', u'e', u'p', u'-', u't', u'i', u'o', u'n', u'.', u' ', u'S', u'e', u'e', u' ', u't', u'h', u'e', u' ', u's', u'e', u'c', u't', u'i', u'o', u'n', u' ', u't', u'i', u't', u'l', u'e', u'd', u' ', u'\u201c', u'N', u'o', u'-', u'W', u'a', u'i', u't', u' ', u'F', u'P', u'U', u' ', u'I', u'n', u's', u't', u'r', u'u', u'c', u't', u'i', u'o', u'n', u's', u' ', u'C', u'a', u'n', u' ', u'G', u'e', u't', u' ', u'F', u'P', u'U', u' ', u'I', u'n', u't', u'e', u'r', u'r', u'u', u'p', u't', u' ', u'i', u'n', u' ', u'W', u'i', u'n', u'd', u'o', u'w', u'\u201d', u' ', u'i', u'n', u' ', u'A', u'p', u'p', u'e', u'n', u'd', u'i', u'x', u' ', u'D', u' ', u'o', u'f', u' ', u't', u'h', u'e', 
u' ', <OpenTag em >, u'I', u'n', u't', u'e', u'l', <OpenTag sup >, u'\xae', ' ', u'6', u'4', u' ', u'a', u'n', u'd', u' ', u'I', u'A', u'-', u'3', u'2', u' ', u'A', u'r', u'c', u'h', u'i', u't', u'e', u'c', u't', u'u', u'r', u'e', u's', u' ', u'S', u'o', u'f', u't', u'w', u'a', u'r', u'e', u' ', u'D', u'e', u'v', u'e', u'l', u'o', u'p', u'e', u'r', u'\u2019', u's', u' ', u'M', u'a', u'n', u'u', u'a', u'l', u',', u' ', u'V', u'o', u'l', u'u', u'm', u'e', u' ', u'1', <CloseTag em>, u',', u' ', u'f', u'o', u'r', u' ', u'a', u' ', u'd', u'e', u's', u'c', u'r', u'i', u'p', u't', u'i', u'o', u'n', u' ', u'o', u'f', u' ', u't', u'h', u'e', u's', u'e', u' ', u'c', u'i', u'r', u'c', u'u', u'm', u's', u't', u'a', u'n', u'c', u'e', u's', u'.', u' ', u'A', u'n', ' ', u'F', u'N', u'C', u'L', u'E', u'X', u' ', u'i', u'n', u's', u't', u'r', u'u', u'c', u't', u'i', u'o', u'n', u' ', u'c', u'a', u'n', u'n', u'o', u't', u' ', u'b', u'e', u' ', u'i', u'n', u't', u'e', u'r', u'r', u'u', u'p', u't', u'e', u'd', u' ', u'i', u'n', u' ', u't', u'h', u'i', u's', u' ', u'w', u'a', u'y', u' ', u'o', u'n', u' ', u'a', u' ', u'P', u'e', u'n', u't', u'i', u'u', u'm', u' ', u'4', u',', u' ', u'I', u'n', u't', u'e', u'l', u' ', u'X', u'e', u'o', u'n', u',', u' ', u'o', u'r', u' ', u'P', u'6', u' ', u'f', u'a', u'm', u'i', u'l', u'y', u' ', u'p', u'r', u'o', u'c', u'e', u's', u's', u'o', u'r', u'.']
Traceback (most recent call last):
  File "extract.py", line 41, in <module>
    result = main(sys.argv)
  File "extract.py", line 33, in main
    parser.process_page(page)
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 303, in process_page
    self.end_page(page)
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 255, in end_page
    self.flush()
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 239, in flush
    self.__output_file(displayable)
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 354, in __output_file
    file_data = self.__output_page(displayable).encode("UTF-8")
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 373, in __output_page
    text.append(self.__output_html(element))
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 385, in __output_html
    result = self.__output_text(element)
  File "/Users/jlettvin/Desktop/github/x86doc/x86manual.py", line 574, in __output_text
    text.autoclose()
  File "/Users/jlettvin/Desktop/github/x86doc/htmltext.py", line 56, in autoclose
    raise Exception("autoclose mismatch")
Exception: autoclose mismatch
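In the dump above, a `sup` tag is opened inside `em` and no close tag for `sup` appears before `em` is closed. The sketch below shows one way a tag stack could tolerate that by closing inner tags first. It is a standalone illustration under that assumption; the class `TagStack` is hypothetical and is not the code in htmltext.py.

```python
class TagStack(object):
    """Tiny HTML writer that keeps the output well-formed."""

    def __init__(self):
        self.open_tags = []
        self.out = []

    def open(self, name):
        self.open_tags.append(name)
        self.out.append("<%s>" % name)

    def text(self, s):
        self.out.append(s)

    def close(self, name):
        if name not in self.open_tags:
            raise ValueError("closing <%s> that was never opened" % name)
        # Close any inner tags that are still open, then the requested one.
        while True:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == name:
                break

    def autoclose(self):
        # Close everything that remains, innermost first.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())

    def render(self):
        return "".join(self.out)

ts = TagStack()
ts.open("p"); ts.text("the ")
ts.open("em"); ts.text("Intel")
ts.open("sup"); ts.text(u"\u00ae"); ts.text(" 64")
ts.close("em")            # also closes the dangling <sup>
ts.text(", for"); ts.autoclose()
demo = ts.render()
```

The trade-off is that the superscript ends up spanning more text than intended, so the output is well-formed but not typographically exact.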

Extraction fails

$ python2 extract.py vol2a.pdf vol2b.pdf
Processing page 1
[...]
Processing page 670
Processing page 671
Processing page 672
Writing to html/Intel® 64 and IA.html
Traceback (most recent call last):
  File "extract.py", line 40, in <module>
    result = main(sys.argv)
  File "extract.py", line 34, in main
    parser.flush()
  File "x86manual.py", line 239, in flush
    self.__output_file(displayable)
  File "x86manual.py", line 354, in __output_file
    file_data = self.__output_page(displayable).encode("UTF-8")
  File "x86manual.py", line 373, in __output_page
    text.append(self.__output_html(element))
  File "x86manual.py", line 385, in __output_html
    result = self.__output_text(element)
  File "x86manual.py", line 539, in __output_text
    elif element.font_name() == "NeoSansIntel" and self.__title_stack[-1] == "operation":
IndexError: list index out of range
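The traceback shows `self.__title_stack[-1]` being evaluated while the stack is empty. A guard like the following would avoid the crash; it is a sketch with hypothetical helper names (`current_title`, `is_operation_text`), and it does not address why no title had been pushed for that page.

```python
def current_title(title_stack, default=None):
    """Return the innermost section title, or `default` if the stack is empty."""
    return title_stack[-1] if title_stack else default

def is_operation_text(font_name, title_stack):
    # Same condition as in the traceback, but safe on an empty stack.
    return font_name == "NeoSansIntel" and current_title(title_stack) == "operation"
```

With an empty stack the check simply evaluates to false, so the text falls through to whatever branch handles ordinary text.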

Tables with sub-column headers (like in CMPPD) get messed up

http://felixcloutier.com/x86/CMPPD.html is pretty messed up; the column headers seem to repeat everything. The main table body seems to be OK.

It's a tricky table because one "main" column has 4 sub-columns, each with its own header.
https://github.com/HJLebbink/asm-dude/wiki/CMPPD formats the headers correctly, but the table is so wide that it needs a scroll bar. It's usable if you click in the table so you can left/right arrow to scroll sideways without having to leave your place to click on the scroll bar itself.

Only 1 of the 3 MOV entries is present (and it's the debug-register one, not regular integer)

http://felixcloutier.com/x86/MOV.html is the entry for MOV r32, DR0–DR7.

In a fork of this project, https://github.com/HJLebbink/asm-dude/wiki/MOV is regular GP-register mov, like MOV r/m32,r32.

But HJLebbink's fork seems to have lost the debug-register and control-register forms. IIRC, http://felixcloutier.com/x86 used to have all 3 separate entries from the Intel PDF:

  • MOV—Move
  • MOV—Move to/from Control Registers
  • MOV—Move to/from Debug Registers

(which appear in that order in the PDF).

So HJLebbink kept the first entry, and this revision kept the last one?

Note that the problem isn't present for MOVQ: the index has 2 entries for MOVQ. But one of them is actually MOVD/MOVQ, so the HTML pages have different URLs.
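One plausible cause (an assumption, not confirmed here) is that all three MOV sections map to the same output file name, so later ones overwrite earlier ones. A sketch of name disambiguation, with a hypothetical helper `page_filename` that is not part of x86doc:

```python
import re

def page_filename(title, used):
    """Derive a unique HTML file name from a section title such as 'MOV<dash>Move'."""
    # Section titles separate mnemonic and description with an em dash (U+2014).
    mnemonic = title.split(u"\u2014", 1)[0].strip()
    base = re.sub(r"[^A-Za-z0-9_]+", "_", mnemonic).strip("_")
    name, n = base, 1
    while name in used:
        n += 1
        name = "%s_%d" % (base, n)
    used.add(name)
    return name + ".html"

used = set()
titles = [
    u"MOV\u2014Move",
    u"MOV\u2014Move to/from Control Registers",
    u"MOV\u2014Move to/from Debug Registers",
]
names = [page_filename(t, used) for t in titles]
```

A numeric suffix keeps the first entry at its existing URL; a suffix derived from the description (for example a "control" or "debug" tag) would be more readable but needs a per-title rule.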

Much post-processing needed

I've ported the repo to python3 and pdfminer.six.
Apart from some minor issues it seems to parse OK, but there is still a lot of manual/scripted post-processing needed. Could you please show what post-processing steps you have taken to create the website from the files produced by the parser?

Regards

RG

Missing instruction "setne"

First, this website is really very good! It is much easier to read and to find instructions in than Intel's PDF. But I cannot find "setne".
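For context, Intel's manual documents SETNE together with the other condition-code variants under a single SETcc entry, which may be why a search for "setne" finds nothing. A lookup along these lines could map a concrete mnemonic to its family entry; the function `family_page` and the returned page names are hypothetical and only follow the manual's section naming.

```python
# Condition-code suffixes shared by the SETcc, CMOVcc, and Jcc families.
CONDITION_CODES = {
    "A", "AE", "B", "BE", "C", "E", "G", "GE", "L", "LE",
    "NA", "NAE", "NB", "NBE", "NC", "NE", "NG", "NGE", "NL", "NLE",
    "NO", "NP", "NS", "NZ", "O", "P", "PE", "PO", "S", "Z",
}
FAMILIES = [("SET", "SETcc"), ("CMOV", "CMOVcc"), ("J", "Jcc")]

def family_page(mnemonic):
    """Return the family entry for a conditional mnemonic, else the mnemonic itself."""
    m = mnemonic.upper()
    for prefix, page in FAMILIES:
        if m.startswith(prefix) and m[len(prefix):] in CONDITION_CODES:
            return page
    return m
```

For example, `family_page("setne")` gives `"SETcc"`, while `family_page("jmp")` is left as `"JMP"` because "MP" is not a condition code.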
