PyMathML

PyMathML is a Python package to create MathML expressions programmatically.

The present version of PyMathML is restricted to Presentation MathML (see MathML specifications, chapter 3).

MathML is extremely verbose; with PyMathML, concise, pythonic expressions are converted to valid, well-formed MathML code. For example, the following snippet:

from pymathml import *
from pymathml.utils import *
a, b = identifiers('a', 'b')
expr = a**2+2*a*b+b**2

defines the mathematical expression a²+2ab+b². PyMathML then produces the following MathML code:

print(expr)
<mrow><mrow><msup><mi>a</mi><mn>2</mn></msup><mo>+</mo><mrow><mrow><mn>2</mn><mo>⁢</mo><mi>a</mi></mrow><mo>⁢</mo><mi>b</mi></mrow></mrow><mo>+</mo><msup><mi>b</mi><mn>2</mn></msup></mrow>

This code is released under a BSD 3-clause "New" or "Revised" License. It is open to contributions.

To install PyMathML, clone this repository and issue the following command:

python setup.py install

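The remainder of this page is a tutorial. It covers, in order: converting PyMathML expressions to MathML, basic MathML elements, building complex expressions with special methods, mathematical operations, and convenience functions.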

This README.md is the Markdown export of a Jupyter Notebook which can be found in the docs/ directory of this repository.

import inspect # This will be used below

Converting PyMathML expressions to MathML

Let us define a basic PyMathML expression:

a, b, c, x = identifiers('a', 'b', 'c', 'x')
expr = a*x**2+b*x+c # ax²+bx+c

Then str(expr) returns its MathML representation:

str(expr)
'<mrow><mrow><mrow><mi>a</mi><mo>\u2062</mo><msup><mi>x</mi><mn>2</mn></msup></mrow><mo>+</mo><mrow><mi>b</mi><mo>\u2062</mo><mi>x</mi></mrow></mrow><mo>+</mo><mi>c</mi></mrow>'

PyMathML expressions can be displayed (using MathJax) in Jupyter notebooks. Note: owing to limitations of GitHub Flavored Markdown, the output of this cell only displays properly when the Jupyter Notebook in the docs/ directory is executed.

expr

ax²+bx+c

expr.tomathml() returns the MathML representation of expr as an Element from the xml.etree.ElementTree module in the standard library:

mml = expr.tomathml()
type(mml)
xml.etree.ElementTree.Element

which can then be converted to XML as follows:

import xml.etree.ElementTree as ET

print(ET.tostring(mml, encoding='unicode'))
<mrow><mrow><mrow><mi>a</mi><mo>⁢</mo><msup><mi>x</mi><mn>2</mn></msup></mrow><mo>+</mo><mrow><mi>b</mi><mo>⁢</mo><mi>x</mi></mrow></mrow><mo>+</mo><mi>c</mi></mrow>
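
Because the value returned by tomathml() is an ordinary ElementTree element, it can be inspected or post-processed with the standard library alone. The following minimal sketch does not use PyMathML at all: it builds the MathML for bx by hand (the variable names are illustrative) and shows how the encoding argument of ET.tostring affects the invisible-times character U+2062.

```python
import xml.etree.ElementTree as ET

# Build <mrow><mi>b</mi><mo>(invisible times)</mo><mi>x</mi></mrow> by hand.
row = ET.Element('mrow')
ET.SubElement(row, 'mi').text = 'b'
ET.SubElement(row, 'mo').text = '\N{INVISIBLE TIMES}'  # U+2062
ET.SubElement(row, 'mi').text = 'x'

# encoding='unicode' returns a str that keeps U+2062 as a raw character.
as_text = ET.tostring(row, encoding='unicode')
# The default encoding returns ASCII bytes, so U+2062 becomes a
# numeric character reference.
as_bytes = ET.tostring(row)

print(repr(as_text))
print(as_bytes)
```

The str form is convenient for embedding in HTML pages that are served as UTF-8; the bytes form is safe wherever only ASCII is accepted.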

The function tomathml promotes its argument to a PyMathML expression, and calls the tomathml() method:

mml = tomathml('a')
print(ET.tostring(mml, encoding=('unicode')))
<mi>a</mi>

If the optional argument display is specified, the expression is enclosed in a math element, with the specified display attribute:

mml = tomathml('a', display='block')
print(ET.tostring(mml, encoding=('unicode')))
<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>
mml = tomathml('a', display='inline')
print(ET.tostring(mml, encoding=('unicode')))
<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>
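
To make the effect of the display argument concrete, here is a minimal standard-library sketch of the wrapping step. The helper wrap_in_math is hypothetical and is not part of PyMathML; it only illustrates what the output above looks like when assembled by hand.

```python
import xml.etree.ElementTree as ET

MATHML_NS = 'http://www.w3.org/1998/Math/MathML'


def wrap_in_math(element, display):
    """Hypothetical helper: enclose element in a <math> root element."""
    math = ET.Element('math', {'display': display, 'xmlns': MATHML_NS})
    math.append(element)
    return math


mi = ET.Element('mi')
mi.text = 'a'
wrapped = ET.tostring(wrap_in_math(mi, 'block'), encoding='unicode')
print(wrapped)
```

The xmlns attribute is what lets the resulting fragment be pasted into an HTML or XML document as a standalone MathML island.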

The function tostring promotes its argument to a PyMathML expression, and returns its MathML representation as a string. It takes the same optional argument display as the tomathml function:

tostring('a')
'<mi>a</mi>'
tostring('a', display='block')
'<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>'
tostring('a', display='inline')
'<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>'

The functions inline and block are alternatives to the function tostring with a display argument.

block('a')
'<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>'
inline('a')
'<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math>'

Basic MathML elements

All MathML elements are defined as PyMathML objects.

Token elements

Token elements are MathML elements that have text, but no children (see MathML specifications, section 3.1.9.1). Token elements all derive from the Token class (see conversion table below).

MathML PyMathML
mi Identifier
mn Number
mo Operator
mtext Text
mspace not implemented
ms not implemented

Token elements are instantiated by passing to the initializer the text as a non keyword argument and the attributes as keyword arguments:

x = Identifier('x', mathvariant='bold')
print(x)
<mi mathvariant="bold">x</mi>

Note that any object can be passed as the "text" of the token element, provided that it can be converted to a string.
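
The rule "anything convertible to a string" can be sketched with the standard library as follows. The token function is a hypothetical stand-in for the Token initializer, not PyMathML code; it simply applies str() to its text argument.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction


def token(tag, text, **attributes):
    """Hypothetical stand-in for a token element: str() is applied to text."""
    element = ET.Element(tag, attributes)
    element.text = str(text)
    return element


# A Fraction is not a string, but str(Fraction(1, 2)) is '1/2'.
half = ET.tostring(token('mn', Fraction(1, 2)), encoding='unicode')
bold_x = ET.tostring(token('mi', 'x', mathvariant='bold'), encoding='unicode')
print(half)
print(bold_x)
```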

Non-token elements

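Non-token elements are general layout schemata, script and limit schemata, and tables and matrices (see conversion tables below).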

They all derive from the Expression class, and are instantiated by passing to the initializer the children expressions as non-keyword arguments, and the attributes as keyword arguments:

expr = Sup('a', 2, superscriptshift='0.5em')
print(expr)
<msup superscriptshift="0.5em"><mi>a</mi><mn>2</mn></msup>

(note that strings and numbers are automatically converted to mi and mn children elements, respectively). When relevant, the docstring of the derived Element lists non-keyword arguments:

print(inspect.getdoc(SubSup))
PyMathML representation of the msubsup element.

See MathML specifications, section 3.4.3.

Usage: SubSup(base, subscript, superscript, **attributes)

which produces the following MathML code:

    <msubsup>
        tomathml(base)
        tomathml(subscript)
        tomathml(superscript)
    </msubsup>
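
The automatic conversion of strings and numbers into mi and mn children can be pictured with the sketch below. Both promote and subsup are hypothetical helpers written against the standard library; they mimic the documented behaviour and are not taken from the PyMathML sources.

```python
import numbers
import xml.etree.ElementTree as ET


def promote(obj):
    """Hypothetical promotion rule: number -> <mn>, anything else -> <mi>."""
    if isinstance(obj, ET.Element):
        return obj  # already an element, nothing to do
    tag = 'mn' if isinstance(obj, numbers.Number) else 'mi'
    element = ET.Element(tag)
    element.text = str(obj)
    return element


def subsup(base, subscript, superscript, **attributes):
    """Sketch of SubSup(base, subscript, superscript, **attributes)."""
    element = ET.Element('msubsup', attributes)
    element.extend([promote(child) for child in (base, subscript, superscript)])
    return element


result = ET.tostring(subsup('x', 'i', 2), encoding='unicode')
print(result)
```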

Conversion table for general layout schemata

MathML PyMathML
mrow Row
mfrac Frac
msqrt Sqrt
mroot Root
mstyle Style
merror not implemented
mpadded not implemented
mphantom not implemented
mfenced Fenced
menclose not implemented

Conversion table for script and limit schemata

MathML PyMathML
msub Sub
msup Sup
msubsup SubSup
munder Under
mover Over
munderover UnderOver
mmultiscripts not implemented

Conversion table for tables and matrices

MathML PyMathML
mtable Table
mlabeledtr not implemented
mtr TableRow
mtd TableEntry
maligngroup not implemented
malignmark not implemented

Building complex expressions with special methods

PyMathML expressions implement Python's special (magic) methods for arithmetic operators, indexing and calls (see the conversion table below); therefore, complex expressions can be built very naturally:

f, x = identifiers('f', 'x')

expr = f(x[1], x[2], x[3])**2

results in the following MathML code:

print(expr)
<msup><mrow><mi>f</mi><mo>⁡</mo><mfenced><msub><mi>x</mi><mn>1</mn></msub><msub><mi>x</mi><mn>2</mn></msub><msub><mi>x</mi><mn>3</mn></msub></mfenced></mrow><mn>2</mn></msup>

which renders as f(x₁, x₂, x₃)² (you need to run the Jupyter notebook to see the output of the following cell correctly):

expr

f(x₁, x₂, x₃)²

Conversion table for magic methods

In the table below, e, e1 and e2 are PyMathML expressions; me, me1 and me2 are their translations to MathML.

PyMathML MathML
+e <mrow><mo>+</mo>me</mrow>
-e <mrow><mo>-</mo>me</mrow>
e1+e2 <mrow>me1<mo>+</mo>me2</mrow>
e1-e2 <mrow>me1<mo>-</mo>me2</mrow>
e1*e2 <mrow>me1<mo>&it;</mo>me2</mrow>
e1@e2 <mrow>me1<mo>⋅</mo>me2</mrow>
e1/e2 <mrow>me1<mo>/</mo>me2</mrow>
e1//e2 <mfrac>me1 me2</mfrac>
e1**e2 <msup>me1 me2</msup>
e1[e2] <msub>me1 me2</msub>
e1(e2) <mrow>me1<mo>&af;</mo><mfenced>me2</mfenced></mrow>
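
The mapping in the table relies on ordinary Python operator overloading. The sketch below shows the mechanism for three of the rows (+, ** and indexing) using only the standard library. The Expr class and the promote function are hypothetical and much simpler than the real PyMathML classes.

```python
import xml.etree.ElementTree as ET


class Expr:
    """Hypothetical stand-in for a PyMathML expression (not the library's class)."""

    def __init__(self, tag, *children, text=None):
        self.element = ET.Element(tag)
        self.element.text = text
        # Children may be Expr instances, strings or numbers.
        self.element.extend([promote(child).element for child in children])

    def __add__(self, other):
        # e1+e2 -> <mrow>me1<mo>+</mo>me2</mrow>
        return Expr('mrow', self, Expr('mo', text='+'), other)

    def __pow__(self, exponent):
        # e1**e2 -> <msup>me1 me2</msup>
        return Expr('msup', self, exponent)

    def __getitem__(self, index):
        # e1[e2] -> <msub>me1 me2</msub>
        return Expr('msub', self, index)

    def __str__(self):
        return ET.tostring(self.element, encoding='unicode')


def promote(obj):
    """Wrap numbers in <mn> and strings in <mi>; leave Expr untouched."""
    if isinstance(obj, Expr):
        return obj
    tag = 'mn' if isinstance(obj, (int, float)) else 'mi'
    return Expr(tag, text=str(obj))


x = Expr('mi', text='x')
result = str(x[1] + x**2)
print(result)
```

Python's own operator precedence decides how the tree nests: ** binds tighter than +, so the msup element ends up inside the mrow.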

Caveat

Expressions are not automatically parenthesized. For example, the following snippet

a, b = identifiers('a', 'b')
expr = (a+b)**2

results in the following MathML code

print(expr)
<msup><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow><mn>2</mn></msup>

which renders as a+b², not (a+b)². This is a limitation of this package, not a bug. Close inspection of the above MathML code reveals that the expression a+b is embedded in an mrow element, so the whole sum a+b is squared; only the parentheses are missing from the rendering.

Until automatic fencing is implemented (see issue 1), this is how the above expression should be defined

expr = Fenced(a+b)**2

which renders as (a+b)² as expected:

print(expr)
<msup><mfenced><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></mfenced><mn>2</mn></msup>
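
The MathML specification describes mfenced as shorthand for an mrow with explicit fence operators. For renderers that do not handle mfenced, the expanded form can be produced by hand, as in this standard-library sketch. The fence helper is hypothetical and is not provided by PyMathML.

```python
import xml.etree.ElementTree as ET


def fence(element, opening='(', closing=')'):
    """Hypothetical helper: expanded equivalent of <mfenced> with one child."""
    row = ET.Element('mrow')
    ET.SubElement(row, 'mo', {'fence': 'true'}).text = opening
    row.append(element)
    ET.SubElement(row, 'mo', {'fence': 'true'}).text = closing
    return row


# a+b as an <mrow>
sum_ab = ET.Element('mrow')
ET.SubElement(sum_ab, 'mi').text = 'a'
ET.SubElement(sum_ab, 'mo').text = '+'
ET.SubElement(sum_ab, 'mi').text = 'b'

# (a+b)**2 with explicit parentheses instead of <mfenced>
squared = ET.Element('msup')
squared.append(fence(sum_ab))
ET.SubElement(squared, 'mn').text = '2'
result = ET.tostring(squared, encoding='unicode')
print(result)
```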

Mathematical operations

PyMathML defines classes that implement unary, binary and n-ary operations as compound MathML elements.

Unary operations

Unary operations are derived from the UnaryOperation class. Examples are the Pos and Neg classes. The single operand is embedded in an mrow element:

a = Identifier('a')
print('{}\n{}'.format(Pos(a), Neg(a)))
<mrow><mo>+</mo><mi>a</mi></mrow>
<mrow><mo>-</mo><mi>a</mi></mrow>

The above constructs are equivalent to +a and -a, respectively:

print('{}\n{}'.format(+a, -a))
<mrow><mo>+</mo><mi>a</mi></mrow>
<mrow><mo>-</mo><mi>a</mi></mrow>

Note that the initializer also accepts attributes, which are passed to the mrow element:

print(Neg(a, dir='rtl'))
<mrow dir="rtl"><mo>-</mo><mi>a</mi></mrow>

New unary operations can be created with the unary_operation_type function, like so:

Not = unary_operation_type('Not', '\N{NOT SIGN}')
print(Not(a))
<mrow><mo>¬</mo><mi>a</mi></mrow>
print(inspect.getdoc(Not))
PyMathML representation of the ¬ unary operation.

Usage: Not(operand, **attributes)

which produces the following MathML code:

    <mrow>
        <mo>¬</mo>
        operand
    </mrow>
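
A class factory of this kind can be written with the built-in type() function. The sketch below is one possible approach using only the standard library; unary_type is a hypothetical name, and the actual implementation of unary_operation_type may differ.

```python
import xml.etree.ElementTree as ET


def unary_type(name, symbol):
    """Hypothetical factory: return a new class for the given unary operator."""

    def __init__(self, operand, **attributes):
        # Attributes go to the enclosing <mrow>, as described above.
        self.element = ET.Element('mrow', attributes)
        ET.SubElement(self.element, 'mo').text = symbol
        self.element.append(operand)

    def __str__(self):
        return ET.tostring(self.element, encoding='unicode')

    doc = 'Sketch of the {} unary operation.\n\nUsage: {}(operand, **attributes)'.format(
        symbol, name)
    namespace = {'__init__': __init__, '__str__': __str__, '__doc__': doc}
    return type(name, (), namespace)


Not = unary_type('Not', '\N{NOT SIGN}')
a = ET.Element('mi')
a.text = 'a'
result = str(Not(a, dir='rtl'))
print(result)
```

Generating the docstring inside the factory is what allows inspect.getdoc to report a usage line that names the new class.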

Binary operations

Binary operations are derived from the BinaryOperation class. Currently implemented binary operations are listed below.

PyMathML Operator
CircledTimes ⊗
Div /
Dot ⋅
Equals =
InvisibleTimes &it; (U+2062, not visible)
Minus -
Plus +
Times ×
The operands are passed to the initializer of the binary operation to be constructed. They are embedded in an mrow element.

a, b = identifiers('a', 'b')
print(Minus(a, b))
<mrow><mi>a</mi><mo>-</mo><mi>b</mi></mrow>

which is equivalent to a-b:

print(a-b)
<mrow><mi>a</mi><mo>-</mo><mi>b</mi></mrow>

Note that the initializer also accepts attributes, which are passed to the mrow element:

print(Minus(a, b, dir='rtl'))
<mrow dir="rtl"><mi>a</mi><mo>-</mo><mi>b</mi></mrow>

Also, assuming associativity, more than two operands can be passed to the initializer:

c = Identifier('c')
print(Minus(a, b, c))
<mrow><mi>a</mi><mo>-</mo><mi>b</mi><mo>-</mo><mi>c</mi></mrow>

which is not strictly equivalent to a-b-c (the former being embedded in one single mrow element):

print(a-b-c)
<mrow><mrow><mi>a</mi><mo>-</mo><mi>b</mi></mrow><mo>-</mo><mi>c</mi></mrow>
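
The difference between the two forms is the difference between interleaving the operator in a single mrow and folding the operands pairwise from the left. The following sketch reproduces both outputs with the standard library; the binary and mi helpers are hypothetical, not PyMathML functions.

```python
import functools
import xml.etree.ElementTree as ET


def mi(name):
    element = ET.Element('mi')
    element.text = name
    return element


def binary(symbol, *operands):
    """Hypothetical flat form: one <mrow>, operator between consecutive operands."""
    row = ET.Element('mrow')
    for index, operand in enumerate(operands):
        if index > 0:
            ET.SubElement(row, 'mo').text = symbol
        row.append(operand)
    return row


a, b, c = mi('a'), mi('b'), mi('c')

# Minus(a, b, c): all operands in a single <mrow>
flat = ET.tostring(binary('-', a, b, c), encoding='unicode')

# a-b-c: Python evaluates (a-b)-c, hence one <mrow> per subtraction
nested_element = functools.reduce(
    lambda left, right: binary('-', left, right), [a, b, c])
nested = ET.tostring(nested_element, encoding='unicode')

print(flat)
print(nested)
```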

New binary operations can be created with the binary_operation_type function, like so:

CircledTimes = binary_operation_type('CircledTimes', '\N{CIRCLED TIMES}')
print(CircledTimes(a, b, c))
<mrow><mi>a</mi><mo>⊗</mo><mi>b</mi><mo>⊗</mo><mi>c</mi></mrow>
print(inspect.getdoc(CircledTimes))
PyMathML representation of the ⊗ binary operation.

Usage: CircledTimes(*operands, **attributes)

which produces the following MathML code (associativity is assumed):

    <mrow>
        operands[0]
        <mo>⊗</mo>
        operands[1]
        <mo>⊗</mo>
        operands[2]
        ...
    </mrow>

Note that the CircledTimes binary operator is actually defined in the library.

N-ary operations

N-ary operations are derived from the NaryOperation class. Currently implemented N-ary operations are listed below.

PyMathML Operator
Product ∏
Sum ∑

Three expressions are passed to the initializer of the n-ary operation: the operand, the start expression and the end expression:

a, i, n = identifiers('a', 'i', 'n')
operand = a[i]
start = Equals(i, 0)
end = n
expr = Sum(operand, start, end)
print(expr)
<mrow><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>0</mn></mrow><mi>n</mi></munderover><msub><mi>a</mi><mi>i</mi></msub></mrow>

which renders as (only works in the Jupyter notebook)

expr

∑ᵢ₌₀ⁿ aᵢ

Note that, if empty, start and end must explicitly be set to None:

print(Sum(operand, start, None))
<mrow><munder><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>0</mn></mrow></munder><msub><mi>a</mi><mi>i</mi></msub></mrow>
print(Sum(operand, None, end))
<mrow><mover><mo>∑</mo><mi>n</mi></mover><msub><mi>a</mi><mi>i</mi></msub></mrow>

Also note that the initializer accepts attributes, which are passed to the mrow element:

print(Sum(operand, None, None, dir='rtl'))
<mrow dir="rtl"><mo>∑</mo><msub><mi>a</mi><mi>i</mi></msub></mrow>
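
The four cases (both limits, start only, end only, no limits) amount to choosing between munderover, munder, mover and a bare mo. The sketch below spells out that selection with the standard library. The nary function is a hypothetical illustration, not the NaryOperation implementation.

```python
import xml.etree.ElementTree as ET


def mi(name):
    element = ET.Element('mi')
    element.text = name
    return element


def nary(symbol, operand, start, end, **attributes):
    """Hypothetical sketch of an n-ary operation with optional limits."""
    row = ET.Element('mrow', attributes)
    operator = ET.Element('mo')
    operator.text = symbol
    if start is not None and end is not None:
        tag, scripts = 'munderover', [start, end]
    elif start is not None:
        tag, scripts = 'munder', [start]
    elif end is not None:
        tag, scripts = 'mover', [end]
    else:
        tag, scripts = None, []
    if tag is None:
        row.append(operator)  # no limits: bare operator
    else:
        limits = ET.SubElement(row, tag)
        limits.append(operator)
        limits.extend(scripts)
    row.append(operand)
    return row


SIGMA = '\N{N-ARY SUMMATION}'
upper_only = ET.tostring(nary(SIGMA, mi('a'), None, mi('n')), encoding='unicode')
no_limits = ET.tostring(nary(SIGMA, mi('a'), None, None, dir='rtl'), encoding='unicode')
print(upper_only)
print(no_limits)
```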

New n-ary operations can be created with the nary_operation_type function, like so:

Union = nary_operation_type('Union', '\N{N-ARY UNION}')
print(Union(operand, None, None))
<mrow><mo>⋃</mo><msub><mi>a</mi><mi>i</mi></msub></mrow>
print(inspect.getdoc(Union))
PyMathML representation of the ⋃ n-ary operation.

Usage: Union(operand, start, end, **attributes)

If both start and end are not None, the following MathML code is
produced:

    <mrow>
        <munderover>
            <mo>⋃</mo>
            start
            end
        </munderover>
        operand
    </mrow>

If only start is not None, the following MathML code is produced:

    <mrow>
        <munder>
            <mo>⋃</mo>
            start
        </munder>
        operand
    </mrow>

Finally, if only end is not None, the following MathML code is
produced:

    <mrow>
        <mover>
            <mo>⋃</mo>
            end
        </mover>
        operand
    </mrow>

Convenience functions

Convenience functions can be found in the pymathml.utils module. See the docstrings for more details.

import pymathml.utils
help(pymathml.utils)
Help on module pymathml.utils in pymathml:

NAME
    pymathml.utils - A collection of functions to facilitate creation of expressions.

FUNCTIONS
    identifiers(*names, **attributes)
        Return instances of Identifier with specified names.
        
        The **attributes are passed to the initializer of all returned
        instances of Identifier.
    
    table(cells, **attributes)
        Create a Table.
        
        The cells of the returned table are specified as an iterable of
        iterables. The **attributes are passed to the initializer of the
        Table object (attributes cannot be set for the nested TableRow and
        TableEntry objects)
    
    underbrace(expr, underscript)
        Create an underbraced expression.
        
        The LaTeX equivalent is:
        
            \underbrace{expr}_{underscript}

FILE
    /home/sbrisard/Documents/programmes/pymathml/pymathml/utils.py
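
To illustrate the "iterable of iterables" convention used by table, here is a standard-library sketch that turns nested lists of numbers into mtable, mtr and mtd elements. The build_table function is hypothetical; unlike the real table function, it only accepts numeric cells.

```python
import xml.etree.ElementTree as ET


def build_table(cells, **attributes):
    """Hypothetical sketch: nested iterables of numbers -> <mtable>."""
    mtable = ET.Element('mtable', attributes)  # attributes apply to the table only
    for row in cells:
        mtr = ET.SubElement(mtable, 'mtr')
        for cell in row:
            mtd = ET.SubElement(mtr, 'mtd')
            ET.SubElement(mtd, 'mn').text = str(cell)
    return mtable


identity = ET.tostring(build_table([[1, 0], [0, 1]]), encoding='unicode')
print(identity)
```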
