peterbe / premailer

Turns CSS blocks into style attributes

Home Page: https://premailer.io

License: BSD 3-Clause "New" or "Revised" License

Python 91.09% HTML 8.64% CSS 0.17% Shell 0.10%

premailer's Introduction

premailer

Looking for sponsors

This project is actively looking for corporate sponsorship. If you want to help make this an active project, consider pinging Peter and we can talk about putting up logos and links to your company.

Python versions

Our tox.ini makes sure premailer works in:

  • Python 3.4
  • Python 3.5
  • Python 3.6
  • Python 3.7
  • Python 3.8
  • PyPy

Turns CSS blocks into style attributes

When you send HTML emails you can't use style tags but instead you have to put inline style attributes on every element. So from this:

<html>
<style type="text/css">
h1 { border:1px solid black }
p { color:red;}
</style>
<h1 style="font-weight:bolder">Peter</h1>
<p>Hej</p>
</html>

You want this:

<html>
<h1 style="font-weight:bolder; border:1px solid black">Peter</h1>
<p style="color:red">Hej</p>
</html>

premailer does this. It parses an HTML page, looks up style blocks and parses the CSS. It then uses the lxml.html parser to modify the DOM tree of the page accordingly.

Warning! By default, premailer will attempt to download any external stylesheets by URL over the Internet. If you want to prevent this you can use the allow_network=False option.

Getting started

If you haven't already done so, install premailer first:

$ pip install premailer

Next, the most basic use is to use the shortcut function, like this:

>>> from premailer import transform
>>> print(transform("""
...         <html>
...         <style type="text/css">
...         h1 { border:1px solid black }
...         p { color:red;}
...         p::first-letter { float:left; }
...         </style>
...         <style type="text/css" data-premailer="ignore">
...         h1 { color:blue; }
...         </style>
...         <h1 style="font-weight:bolder">Peter</h1>
...         <p>Hej</p>
...         </html>
... """))
<html>
<head>
    <style type="text/css">p::first-letter {float:left}</style>
    <style type="text/css">
    h1 { color:blue; }
    </style>
</head>
<body>
    <h1 style="border:1px solid black; font-weight:bolder">Peter</h1>
    <p style="color:red">Hej</p>
</body>
</html>

The transform shortcut function transforms the given HTML using the defaults for all options:

base_url=None, # Optional URL prepended to all relative links (both stylesheets and internal)
disable_link_rewrites=False, # Allow link rewrites (e.g. using base_url)
preserve_internal_links=False, # Do not preserve links to named anchors when using base_url
preserve_inline_attachments=True, # Preserve links with cid: scheme when base_url is specified
preserve_handlebar_syntax=False, # Preserve handlebar syntax from being encoded
exclude_pseudoclasses=True, # Ignore pseudoclasses when processing styles
keep_style_tags=False, # Discard original style tag
include_star_selectors=False, # Ignore star selectors when processing styles
remove_classes=False, # Leave class attributes on HTML elements
capitalize_float_margin=False, # Do not capitalize float and margin properties
strip_important=True, # Remove !important from property values
external_styles=None, # Optional list of URLs to load and parse
css_text=None, # Optional CSS text to parse
method="html", # Parse input as HTML (as opposed to "xml")
base_path=None, # Optional base path to stylesheet in your file system
disable_basic_attributes=None, # Optional list of attribute names to preserve on HTML elements
disable_validation=False, # Validate CSS when parsing it with cssutils
cache_css_parsing=True, # Do cache parsed output for CSS
cssutils_logging_handler=None, # See "Capturing logging from cssutils" below
cssutils_logging_level=None,
disable_leftover_css=False, # Output CSS that was not inlined into the HEAD
align_floating_images=True, # Add align attribute for floated images
remove_unset_properties=True, # Remove CSS properties if their value is unset when merged
allow_network=True, # Allow network access to fetch linked CSS files
allow_insecure_ssl=False, # Don't allow unverified SSL certificates for external links
allow_loading_external_files=False, # Allow loading any non-HTTP external file URL
session=None # Session used for HTTP requests - supply your own for caching or to provide authentication

For more advanced options, check out the code of the Premailer class and all its options in its constructor.

You can also use premailer from the command line by using its main module.

$ python -m premailer -h
usage: python -m premailer [options]

optional arguments:
-h, --help            show this help message and exit
-f [INFILE], --file [INFILE]
                      Specifies the input file. The default is stdin.
-o [OUTFILE], --output [OUTFILE]
                      Specifies the output file. The default is stdout.
--base-url BASE_URL
--remove-internal-links PRESERVE_INTERNAL_LINKS
                      Remove links that start with a '#' like anchors.
--exclude-pseudoclasses
                      Pseudo classes like p:last-child', p:first-child, etc
--preserve-style-tags
                      Do not delete <style></style> tags from the html
                      document.
--remove-star-selectors
                      All wildcard selectors like '* {color: black}' will be
                      removed.
--remove-classes      Remove all class attributes from all elements
--strip-important     Remove '!important' for all css declarations.
--method METHOD       The type of html to output. 'html' for HTML, 'xml' for
                      XHTML.
--base-path BASE_PATH
                      The base path for all external stylsheets.
--external-style EXTERNAL_STYLES
                      The path to an external stylesheet to be loaded.
--disable-basic-attributes DISABLE_BASIC_ATTRIBUTES
                      Disable provided basic attributes (comma separated)
--disable-validation  Disable CSSParser validation of attributes and values
--pretty              Pretty-print the outputted HTML.
--allow-insecure-ssl  Skip SSL certificate verification for external URLs.
--allow-loading-external-files Allow opening any non-HTTP external file URL.

A basic example:

$ python -m premailer --base-url=http://google.com/ -f newsletter.html
<html>
<head><style>.heading { color:red; }</style></head>
<body><h1 class="heading" style="color:red"><a href="http://google.com/">Title</a></h1></body>
</html>

The command line interface supports standard input.

$ echo '<style>.heading { color:red; }</style><h1 class="heading"><a href="/">Title</a></h1>' | python -m premailer --base-url=http://google.com/
<html>
<head><style>.heading { color:red; }</style></head>
<body><h1 class="heading" style="color:red"><a href="http://google.com/">Title</a></h1></body>
</html>

Turning relative URLs into absolute URLs

Another thing premailer can do for you is to turn relative URLs into absolute URLs (e.g. "/some/page.html" into "http://www.peterbe.com/some/page.html"). It does this to all href and src attributes that don't have a :// part in them. For example, turning this:

<html>
<body>
<a href="/">Home</a>
<a href="page.html">Page</a>
<a href="http://crosstips.org">External</a>
<img src="/folder/">Folder
</body>
</html>

Into this:

<html>
<body>
<a href="http://www.peterbe.com/">Home</a>
<a href="http://www.peterbe.com/page.html">Page</a>
<a href="http://crosstips.org">External</a>
<img src="http://www.peterbe.com/folder/">Folder
</body>
</html>

by using transform('...', base_url='http://www.peterbe.com/').

Ignore certain <style> or <link> tags

Suppose you have a style tag that you don't want to have processed and transformed. You can simply set a data attribute on the tag like:

<head>
<style>/* this gets processed */</style>
<style data-premailer="ignore">/* this gets ignored */</style>
</head>

That tag gets completely ignored, with one exception: when the HTML is processed, the data-premailer attribute is removed from it.

It works equally for a <link> tag like:

<head>
<link rel="stylesheet" href="foo.css" data-premailer="ignore">
</head>

HTML attributes created additionally

Certain HTML attributes are also created on the HTML if the CSS contains any ones that are easily translated into HTML attributes. For example, if you have this CSS: td { background-color:#eee; } then this is transformed into style="background-color:#eee" and as an HTML attribute bgcolor="#eee".

These extra attributes basically serve as a "back up" for really shit email clients that can't even handle style attributes. A lot of professional HTML newsletters, such as Amazon's, use this. You can disable some attributes with disable_basic_attributes.

Capturing logging from cssutils

cssutils is the library that premailer uses to parse CSS. It uses the Python logging module to report any issues it has with parsing your CSS. If you want to capture this, you have to pass in cssutils_logging_handler and, optionally, cssutils_logging_level. For example:

>>> import logging
>>> import premailer
>>> from io import StringIO
>>> mylog = StringIO()
>>> myhandler = logging.StreamHandler(mylog)
>>> p = premailer.Premailer(
...     cssutils_logging_handler=myhandler,
...     cssutils_logging_level=logging.INFO
... )
>>> result = p.transform("""
...         <html>
...         <style type="text/css">
...         @keyframes foo { from { opacity: 0; } to { opacity: 1; } }
...         </style>
...         <p>Hej</p>
...         </html>
... """)
>>> mylog.getvalue()
'CSSStylesheet: Unknown @rule found. [2:1: @keyframes]\n'

If execution speed is on your mind

If execution speed is important, it's very plausible that you're not just converting 1 HTML document but a lot of HTML documents. Then, the first thing you should do is avoid using the premailer.transform function because it creates a Premailer class instance every time.

# WRONG WAY!
from premailer import transform

for html_string in get_html_documents():
    transformed = transform(html_string, base_url=MY_BASE_URL)
    # do something with 'transformed'

Instead...

# RIGHT WAY
from premailer import Premailer

instance = Premailer(base_url=MY_BASE_URL)
for html_string in get_html_documents():
    transformed = instance.transform(html_string)
    # do something with 'transformed'

Another thing to watch out for when you keep reusing the same imported Python code in a long-running process is that the internal memoize function caches might build up. The environment variable that controls this is PREMAILER_CACHE_MAXSIZE. It may need some fine-tuning and calibration if your workload is really big and memory becomes an issue.

Advanced options

Below are some advanced configuration options that probably don't matter for most people with regular load.

Choosing the cache implementation

By default, premailer uses LFUCache to cache selectors, styles and parsed CSS strings. If LFU doesn't serve your purpose, you can switch to an alternate implementation using the environment variables below.

  • PREMAILER_CACHE: Can be LRU, LFU or TTL. Default is LFU.
  • PREMAILER_CACHE_MAXSIZE: Maximum no. of items to be stored in cache. Defaults to 128.
  • PREMAILER_CACHE_TTL: Time to live for cache entries. Only applicable for TTL cache. Defaults to 1 hour.
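As a rough sketch, the variables above can also be set from Python. This assumes premailer reads them when it is first imported, which is an assumption here and not something this README states, so setting them in the process environment (shell, service config) is the safer route. The values LRU and 512 are arbitrary examples.

```python
import os

# Assumption: premailer picks these up at import time, so they are set
# before any `import premailer` statement runs in this process.
os.environ["PREMAILER_CACHE"] = "LRU"          # one of LRU, LFU or TTL
os.environ["PREMAILER_CACHE_MAXSIZE"] = "512"  # environment values are always strings

# import premailer  # import only after the environment is configured
```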

Getting coding

First clone the code and create whatever virtualenv you need, then run:

pip install -e ".[dev]"

Then to run the tests, run:

tox

This will run the whole test suite for every possible version of Python it can find on your system. To run the tests more incrementally, open up the tox.ini and see how it works.

Code style is all black

All code has to be formatted with Black and the best tool for checking this is therapist since it can help you run all, help you fix things, and help you make sure linting is passing before you git commit. This project also uses flake8 to check other things Black can't check.

To check linting with tox use:

tox -e lint

To install the therapist pre-commit hook simply run:

therapist install

When you run therapist run it will only check the files you've touched. To run it for all files use:

therapist run --use-tracked-files

And to fix all/any issues run:

therapist run --use-tracked-files --fix

premailer's People

Contributors

asandeep, beniwohli, bogdal, fangpenlin, graingert, hoserdude, lavr, litchfield, luhn, majortom731, mariocesar, michi88, moggers87, mpj17, onecrayon, p12tic, peterbe, pmosetc, pzrq, redtoad, remik, revolunet, rtpg, russelldavis, s16h, svisser, techniq, theospears, tzanke, vanng822

premailer's Issues

Too many values to unpack

Here is a little fix.
When premailer encounters something like "background: url(http://example.com/thing.png);"
it shouldn't split on every ":" but only on the first one.

--------------------------- premailer/premailer.py ----------------------------
index 58ed3f5..a9ea4bd 100644
@@ -25,11 +25,11 @@ def _merge_styles(old, new):
     In other words, the new style bits replace the old ones
     """
     news = {}
-    for k, v in [x.strip().split(':') for x in new.split(';') if x.strip()]:
+    for k, v in [x.strip().split(':', 1) for x in new.split(';') if x.strip()]:
         news[k.strip()] = v.strip()

     olds = {}
-    for k, v in [x.strip().split(':') for x in old.split(';') if x.strip()]:
+    for k, v in [x.strip().split(':', 1) for x in old.split(';') if x.strip()]:
         olds[k.strip()] = v.strip()

     merged = news
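
The effect of the fix can be seen in isolation with a small standalone sketch. The helper name style_to_pairs is made up for illustration and is not part of premailer; it only mirrors the splitting logic from the patch.

```python
def style_to_pairs(style):
    """Split 'k: v; k2: v2' into (key, value) pairs, splitting on the first colon only."""
    pairs = []
    for chunk in style.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # maxsplit=1 keeps any later colons (e.g. in "http://") inside the value
        key, value = chunk.split(":", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs


style = "background: url(http://example.com/thing.png); color: red"

# A plain split(':') yields three pieces for the first declaration,
# which is what triggers "too many values to unpack".
naive_parts = style.split(";")[0].split(":")
print(len(naive_parts))

print(style_to_pairs(style))
```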

Feature Request: Emit Warnings For Using Bad Styles

It would be fantastic if premailer could warn you about using styles that are not supported by a majority of email clients.

Basically, we could code in the data posted by campaignmonitor and emit a warning or raise an exception whenever premailer detects that one of these unsupported styles is being used. This feature would make it much easier to write compliant email HTML templates and to debug problems with these email clients in general.

Use the webkit rendering engine to turn CSS blocks into style attributes

Like wkhtmltopdf does for PDF files, use the webkit rendering engine to convert CSS blocks into inline style attributes.

If you've ever used HTML to PDF conversion software, you'll have noticed that it really struggles to create great-looking PDFs from HTML. wkhtmltopdf, however, uses an entirely different approach and creates near-perfect PDFs. This is because wkhtmltopdf uses the rendering engine straight from webkit.

We should do the same with premailer. Use the webkit rendering engine and create near-perfect inline styles.

Thanks

Transforming documents containing unicode

I get the following issue when parsing a file containing, for example, Japanese characters:

albin@dev:~/premailer-tests$ python -m premailer --file t.html
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/premailer-2.1.1-py2.7.egg/premailer/__main__.py", line 119, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/local/lib/python2.7/dist-packages/premailer-2.1.1-py2.7.egg/premailer/__main__.py", line 114, in main
    options.outfile.write(p.transform())
  File "/usr/local/lib/python2.7/dist-packages/premailer-2.1.1-py2.7.egg/premailer/premailer.py", line 213, in transform
    root = tree if stripped.startswith(tree.docinfo.doctype) else page
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 216: ordinal not in range(128)

test.html has the following content:

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Unicode Test</title>
    <style>
        h1 {
            font-family: 'Gulim', sans-serif;
        }
    </style>
</head>
<body>
    <h1>問題</h1>
</body>
</html>

I am running the latest files and installed using the python setup.py install.

mailto links are broken with link rewriting

If I have an <a> tag with a mailto: href, the link gets broken by your URL rewriting.

You should probably use the standard urlparse.urljoin to rewrite your links with a base URL to be safe.
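
For reference, here is how the standard library behaves on the kinds of links discussed above. In Python 3 the function lives in urllib.parse (the urlparse module mentioned above is its Python 2 name). The mailto address is a made-up placeholder.

```python
from urllib.parse import urljoin

base_url = "http://www.peterbe.com/"

# Relative links are resolved against the base; links that carry their own
# scheme (http:, mailto:) come back unchanged.
links = ["/", "page.html", "http://crosstips.org", "mailto:someone@example.com"]
rewritten = {href: urljoin(base_url, href) for href in links}

for href, result in rewritten.items():
    print(href, "->", result)
```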

Styles with not matching element should stay in <style>

In my email I have styles that fix the behaviour of some web clients. No elements in the document match their selectors. For now such rules are deleted forever. The expected behaviour is to keep them in a <style> tag, or to make it possible to mark one <style> element as not to be changed.

Margin & padding properties are special cases of the merging process

I might be wrong, but I think margin and padding properties should be treated separately from others during the merge process. Let's say, we use the following rule to reset spacing:

* {
  margin: 0;
  padding: 0;
}

and then a dedicated tag rule like:

p { margin-top:10px }

processed HTML will result in something like:

<p style="... margin:0 ; margin-top:10px">

and the margin-top property will be ignored.

My idea would be to systematically split the margin:x x x x; property into margin-top:x; margin-right:x; ... before the merging process. Hence, margin & padding properties would be properly merged.

Do you think it is valuable?
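
A minimal sketch of the proposed splitting step, in plain Python. The function name expand_box_shorthand is hypothetical and this is not how premailer merges styles; it only illustrates the idea. It assumes simple whitespace-separated values, so something like calc() with inner spaces would need a real CSS value parser.

```python
def expand_box_shorthand(prop, value):
    """Expand a 'margin' or 'padding' shorthand value into its four longhand properties."""
    parts = value.split()
    if len(parts) == 1:
        top = right = bottom = left = parts[0]
    elif len(parts) == 2:
        top = bottom = parts[0]
        right = left = parts[1]
    elif len(parts) == 3:
        top, right, bottom = parts
        left = right
    elif len(parts) == 4:
        top, right, bottom, left = parts
    else:
        raise ValueError("expected 1 to 4 values, got %r" % value)
    return {
        prop + "-top": top,
        prop + "-right": right,
        prop + "-bottom": bottom,
        prop + "-left": left,
    }


# "* { margin: 0 }" followed by the more specific "p { margin-top: 10px }"
merged = expand_box_shorthand("margin", "0")
merged.update({"margin-top": "10px"})
print("; ".join("%s:%s" % item for item in merged.items()))
```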

Failed to handle CSS rules with no selectorText

I hit the issue below when the CSS contains @font-face rules.

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Library/Python/2.7/site-packages/premailer/__main__.py", line 119, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/Library/Python/2.7/site-packages/premailer/__main__.py", line 114, in main
    options.outfile.write(p.transform())
  File "/Library/Python/2.7/site-packages/premailer/premailer.py", line 243, in transform
    these_rules, these_leftover = self._parse_style_rules(css_body, index)
  File "/Library/Python/2.7/site-packages/premailer/premailer.py", line 172, in _parse_style_rules
    for x in rule.selectorText.split(',')
AttributeError: 'CSSFontFaceRule' object has no attribute 'selectorText'

There should be other css rules with no selector text as well.
https://pythonhosted.org/cssutils/docs/css.html
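
One defensive pattern, sketched here with stand-in classes in place of real cssutils rule objects (StyleRule and FontFaceRule below are hypothetical, as is the helper name), is to set aside any rule that has no selectorText and leave it untouched:

```python
class StyleRule:
    """Stand-in for a parsed style rule such as 'h1, p { ... }'."""

    def __init__(self, selector_text):
        self.selectorText = selector_text


class FontFaceRule:
    """Stand-in for a rule type that carries no selectorText (e.g. @font-face)."""


def collect_selectors(rules):
    """Return (selectors, leftover): selectors from style rules, other rules set aside."""
    selectors = []
    leftover = []
    for rule in rules:
        selector_text = getattr(rule, "selectorText", None)
        if selector_text is None:
            # Not a style rule: keep it for the leftover CSS, do not crash
            leftover.append(rule)
            continue
        selectors.extend(part.strip() for part in selector_text.split(","))
    return selectors, leftover


font_face = FontFaceRule()
selectors, leftover = collect_selectors([StyleRule("h1, p"), font_face])
print(selectors, len(leftover))
```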

CSS Comments in @media

When I run the premailer on the following document it returns an error.

Document:

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Document</title>
    <style>
    @media screen {
        /* comment */
    }
    </style>
</head>
<body></body>
</html>

Error:

albin@dev:~/premailer-tests$ python -m premailer -f temp.html
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/premailer/__main__.py", line 119, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/local/lib/python2.7/dist-packages/premailer/__main__.py", line 114, in main
    options.outfile.write(p.transform())
  File "/usr/local/lib/python2.7/dist-packages/premailer/premailer.py", line 263, in transform
    for key in rule.style.keys():
AttributeError: 'CSSComment' object has no attribute 'style'

@font-face removed

I have an email template that uses several @font-face rules to include Google web fonts. It appears that when I pass the template through premailer, all of the @font-face rules get removed. I've tried using keep_style_tags=True but that doesn't affect the behavior. I've also looked at the other initialization options, but nothing seems to address this issue.

I know @font-face is only supported by iOS/Apple Mail, but that's a large percentage of our users, so we'd like to give them a better experience. Any ideas?

Preserve 6 character color codes for Lotus Notes

As you may be aware, the IBM/Lotus Notes email client has poor HTML support. One of its quirks is that it sometimes (only sometimes!) gets confused by "short" color codes like #FFF instead of #FFFFFF. Which is unfortunate because Premailer is automatically converting to the short versions when possible. Any idea how to get it to stop? Or even where that's coming from? I'm guessing somewhere deep in cssselect...

This HTML:

<table style="background-color:#F00; border-collapse:collapse; position:relative;" bgcolor="#F00" valign="top">
    <tr>
        <td>
            <p style="color:#FFFFFF">This is in a table with background of #F00 (red) </p>
        </td>
    </tr>
</table>


<table style="background-color:#FF0000; border-collapse:collapse; position:relative;" bgcolor="#FF0000" valign="top">
    <tr>
        <td>
            <p style="color:#FFFFFF">This is in a table with background of #FF0000 (red) </p>
        </td>
    </tr>
</table>

Looks like this in Notes 8.5:
[Screenshot: 2015-03-06 16:37:43]
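
Until the source of the shortening is found, one possible workaround is to post-process the values and expand 3-digit codes back to 6 digits. This is only a sketch with a made-up helper name. It is meant for declaration values such as the contents of a style or bgcolor attribute, not for whole stylesheets, where an id selector like #add would be mistaken for a colour.

```python
import re

# '#' plus exactly three hex digits, not followed by a further hex digit
SHORT_HEX = re.compile(r"#([0-9a-fA-F]{3})(?![0-9a-fA-F])")


def expand_short_hex(value):
    """Expand short colour codes such as #F00 to their long form #FF0000."""

    def _double(match):
        return "#" + "".join(ch * 2 for ch in match.group(1))

    return SHORT_HEX.sub(_double, value)


print(expand_short_hex("background-color:#F00; color:#FFFFFF"))
```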

Performance issues

Hi,

I really love Premailer, thank you for writing it.

I am using it on some rather large emails with tabled reports in them and it is taking quite a long time. Multiple minutes. I am curious to see if there are any performance gains to be had in the code.

Before I looked closely I wanted to check with you. Do you consider the code to be heavily optimised? Do you feel there could be room for improvement?

Cheers,
Michael

Premailer no longer transforms partial documents

I'm not sure whether this was previously explicitly supported, but we've seen significant performance boosts by transforming only a partial document and its limited stylesheet. The template in which this is placed is transformed once during build, and the partial document is inserted after it itself is transformed.

This used to work with premailer's XML mode (which does not forcibly add html/body tags around content), but with the introduction of PR #87 this no longer works. The document I'm transforming has a root element that is a wrapping div around a series of paragraphs and other markup.

The introduction and use of the get_or_create_head function means that these snippet documents fail because there is no body tag:

  File "/devel/mail.py", line 103, in _render_html
    snippet = premail.transform()
  File "/devel/env/local/lib/python2.7/site-packages/premailer/premailer.py", line 267, in transform
    head = get_or_create_head(tree)
  File "/devel/env/local/lib/python2.7/site-packages/premailer/premailer.py", line 62, in get_or_create_head
    body = CSSSelector('body')(root)[0]
IndexError: list index out of range

Is it possible (and sensible) to make the get_or_create_head return a dummy element in cases like this, with xml documents that do not contain a body tag?

[bug]: "url(data:..." breaks merge_styles

ValueError need more than 1 value to unpack
/usr/lib/python2.7/dist-packages/premailer/premailer.py in merge_styles
for k, v in [x.strip().split(':', 1) for x in new.split(';') if x.strip()]: 

local vars

new_keys        set([u'background-image'])
old          'display:inline-block; width:14px; height:14px; margin-top:1px; line-height:14px; vertical-align:text-top; background-repeat:no-repeat'
k           u'background-image'
v           u'url(".....
news        [(u'background-image', u'url("data:image/png')]
x           u'base64,iVBORw0KGgoAAAANSUhEUgAAABgAAAAQCAYAA...
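
The local variables show the cause: the style string is split on every ";", including the one inside the data: URL. A splitter that ignores semicolons inside quotes or parentheses avoids this. The sketch below is a standalone illustration, not premailer's code; the function name and the AAAA payload are placeholders, and backslash escapes inside strings are not handled.

```python
def split_declarations(style):
    """Split a style string on ';', ignoring semicolons inside quotes or parentheses."""
    chunks = []
    current = []
    depth = 0     # nesting level of parentheses
    quote = None  # the quote character we are currently inside, if any
    for ch in style:
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == ";" and depth == 0:
            # A real declaration separator: close the current chunk
            chunks.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    chunks.append("".join(current).strip())
    return [chunk for chunk in chunks if chunk]


style = 'background-image: url("data:image/png;base64,AAAA"); width: 14px'
declarations = split_declarations(style)
pairs = [tuple(part.strip() for part in d.split(":", 1)) for d in declarations]
print(pairs)
```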

keep class attributes

The transform() function replaces the "class" attribute with the appropriate styling, but it also removes the class from the HTML elements. I need the styling added inline, but the classes left on their elements so that my @media (!important) styling can be inserted afterwards and still have classes to point to within the DOM.

I'm having a hard time changing your code to meet this need. Do you have any thoughts on how I could keep the classes on the HTML elements but still inject the inline styling?

XSLT attributes mangled, namespaces dropped, any chance of support?

I know this is outside what premailer offers but here goes. We are using XSLT templates for our email source files and would like to use premailer to inline styles in the HTML content while leaving the XSLT untouched. For example:

<p class="foobar">
  <xsl:value-of select="SomethingOrOther"/>
</p>

With the --method xml switch, the syntax is good but the XML namespace of tags is being stripped, so we get:

<p style="color: yellow">
  <value-of select="SomethingOrOther"/>
</p>

Similar things happen with <xsl:element> tags:

<xsl:element name="a">
  <xsl:attribute name="href">
    <xsl:value-of select="LoginURL"/>foo/bar/<xsl:value-of select="SomeID"/>
  </xsl:attribute>
  <xsl:value-of select="LoginURL"/>foo/bar/<xsl:value-of select="SomeID"/>
</xsl:element>

Comes out as:

<element name="a">
  <attribute name="href">
    <value-of select="LoginURL"/>foo/bar/<value-of select="SomeID"/>
  </attribute>
  <value-of select="LoginURL"/>foo/bar/<value-of select="SomeID"/>
</element>

I can run a post-premailer replacement over the output but would rather not. I'm not familiar with Python or the HTML parser you are using.

Is there some simple switch we can use that would make it respect and retain the xsl: namespace portion of tag names?

While on the subject, I notice that test comparisons are being escaped as if they were normal HTML markup:

<xsl:if test="ShipmentValues>0">

becomes

<if test="ShipmentValues&gt;0">

FYI our input files to premailer are wrapped in this:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:msxsl="urn:schemas-microsoft-com:xslt"
            exclude-result-prefixes="msxsl">
<xsl:output method="html" indent="yes" encoding="utf-8"/>

<xsl:template match="RFQsMessagesInfo">
<!-- HTML content goes here -->
</xsl:template>
</xsl:stylesheet>

Is support for this scenario likely to be forthcoming?

Usage example in README

Hi @peterbe ! The README is egregiously lacking even a simple usage example. Seems like that would be nice, so as not to have to go digging into the code in order to even begin using premailer. Maybe I'll submit a pull request once I've done the digging in the code and figured out how to use it :-)

Replacement of urls beginning with //

When using a link whose href starts with "//", Premailer should just add http or https depending on the base url it uses. Currently, it replaces it with the full host, so //another.host/ becomes http://my.base.host/another.host/
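
For comparison, the standard library's urljoin already treats such scheme-relative links the way described: it keeps the other host and only borrows the scheme from the base URL.

```python
from urllib.parse import urljoin

# A link starting with "//" names another host and inherits only the scheme
http_result = urljoin("http://my.base.host/", "//another.host/")
https_result = urljoin("https://my.base.host/", "//another.host/")

print(http_result)
print(https_result)
```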

UnicodeDecodeError

This thing is great, but it is not working for about 98% of my HTML pages (HTML emails).
I am getting a lot of errors, and all of them look like this:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15052: ordinal not in range(128)

Any idea how to solve this?

Can't handle &nbsp;

If the source HTML contains &nbsp;, an exception is thrown:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/premailer/__main__.py", line 133, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/local/lib/python2.7/dist-packages/premailer/__main__.py", line 128, in main
    options.outfile.write(p.transform())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 45: ordinal not in range(128)

Test html:

<html>
  <body>&nbsp;</body>
</html>
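
The traceback points at the write step: the parsed document contains U+00A0 (the decoded &nbsp;), and the output stream falls back to the ASCII codec. A small standalone sketch of the difference an explicit encoding makes, using an in-memory stream in place of the real output file:

```python
import io

# The parsed document holds the decoded entity, i.e. U+00A0
html = "<html>\n  <body>\u00a0</body>\n</html>\n"

try:
    html.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # Same failure as in the traceback above
    ascii_ok = False

# Writing through a stream with an explicit UTF-8 encoding succeeds
buffer = io.BytesIO()
with io.TextIOWrapper(buffer, encoding="utf-8") as out:
    out.write(html)
    out.flush()
    data = buffer.getvalue()

print(ascii_ok, data)
```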

Add specific rules for unmapped css rules to attributes

There are some specific attributes that can't be clearly mapped from CSS to HTML attributes, for example cellpadding, cellspacing, etc. It would be great to have specific rules for these attributes, which would be applied as HTML attributes and removed from the styles (because they are unsupported by any browser or email client). For example, add a -premailer- prefix for such rules: -premailer-cellspacing would be converted to a cellspacing attribute, and so on.
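
A rough sketch of how such prefixed rules could be separated from ordinary declarations. The function name is hypothetical and this only illustrates the proposal; it is not an existing premailer feature.

```python
PREFIX = "-premailer-"


def split_prefixed_rules(declarations):
    """Separate '-premailer-*' declarations (to become attributes) from ordinary styles."""
    styles = {}
    attributes = {}
    for name, value in declarations.items():
        if name.startswith(PREFIX):
            # Strip the prefix: the remainder is the HTML attribute name
            attributes[name[len(PREFIX):]] = value
        else:
            styles[name] = value
    return styles, attributes


styles, attributes = split_prefixed_rules(
    {
        "border-collapse": "collapse",
        "-premailer-cellspacing": "0",
        "-premailer-cellpadding": "4",
    }
)
print(styles, attributes)
```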

thread concurrency bug causes exceptions when using premailer in multiple threads

Exception hits in multithreading code, which is quite common when sending e-mail notifications in background threads.
This bug is very important for me - is there any chance to fix it soon?
How could I help?

Steps to reproduce:
Run the following code e.g. in python shell, assuming you have all the libs and deps in pythonpath::

import threading
from premailer import premailer

old = u'-ms-text-size-adjust:100%; -webkit-text-size-adjust:100%; background-color:#ddd; border-collapse:collapse; mso-table-lspace:0; mso-table-rspace:0'
new = 'background-color:#dddddd;'

def repeat_merge_styles(old, new, class_): 
    for i in range(0,20):
        print premailer.merge_styles(old, new, class_)

threads = [threading.Thread(target=repeat_merge_styles, args=(old, new, '')) for i in range(0,30)]
for t in threads: t.start()

The result should be printing 20x30 merged styles, possibly a bit messed up because of I/O flushing.

Actual result is below - exceptions.

-ms-text-size-adjust:100%; -webkit-text-size-adjust:100%; background-color:#ddd; border-collapse:collapse; height:100%; margin:0; mso-table-lspace:0; mso-table-rspace:0; padding:0; width:100%-ms-text-size-adjust:100%; -webkit-text-size-adjust:100%; background-color:#ddd; border-collapse:collapse; height:100%; margin:0; mso-table-lspace:0; mso-table-rspace:0; padding:0; width:100%

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "<console>", line 3, in merge_styles
  File "/home/alex/Documents/src/offiserv/vendor/premailer/premailer.py", line 63, in merge_styles
    for k, v in csstext_to_pairs(old):
  File "/home/alex/Documents/src/offiserv/vendor/premailer/premailer.py", line 43, in csstext_to_pairs
    parsed = cssutils.css.CSSVariablesDeclaration(csstext)
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/css/cssvariablesdeclaration.py", line 47, in __init__
    self.cssText = cssText
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/css/cssvariablesdeclaration.py", line 154, in _setCssText
    emptyOk=True)
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/prodparser.py", line 562, in parse
    self._log.error(u'%s: %s %s: %r' % (name, [t for t in text], e, token))
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/errorhandler.py", line 98, in __handle
    raise error(msg)
SyntaxErr: CSSVariableDeclaration: ['-', 'm', 's', '-', 't', 'e', 'x', 't', '-', 's', 'i', 'z', 'e', '-', 'a', 'd', 'j', 'u', 's', 't', ':', '1', '0', '0', '%', ';', ' ', '-', 'w', 'e', 'b', 'k', 'i', 't', '-', 't', 'e', 'x', 't', '-', 's', 'i', 'z', 'e', '-', 'a', 'd', 'j', 'u', 's', 't', ':', '1', '0', '0', '%', ';', ' ', 'b', 'a', 'c', 'k', 'g', 'r', 'o', 'u', 'n', 'd', '-', 'c', 'o', 'l', 'o', 'r', ':', '#', 'd', 'd', 'd', ';', ' ', 'b', 'o', 'r', 'd', 'e', 'r', '-', 'c', 'o', 'l', 'l', 'a', 'p', 's', 'e', ':', 'c', 'o', 'l', 'l', 'a', 'p', 's', 'e', ';', ' ', 'h', 'e', 'i', 'g', 'h', 't', ':', '1', '0', '0', '%', ';', ' ', 'm', 'a', 'r', 'g', 'i', 'n', ':', '0', ';', ' ', 'm', 's', 'o', '-', 't', 'a', 'b', 'l', 'e', '-', 'l', 's', 'p', 'a', 'c', 'e', ':', '0', ';', ' ', 'm', 's', 'o', '-', 't', 'a', 'b', 'l', 'e', '-', 'r', 's', 'p', 'a', 'c', 'e', ':', '0', ';', ' ', 'p', 'a', 'd', 'd', 'i', 'n', 'g', ':', '0', ';', ' ', 'w', 'i', 'd', 't', 'h', ':', '1', '0', '0', '%'] Missing token for production Sequence(ident, :, term): ('CHAR', ';', 1, 25)

Exception in thread Thread-15:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "<console>", line 3, in merge_styles
  File "/home/alex/Documents/src/offiserv/vendor/premailer/premailer.py", line 63, in merge_styles
    for k, v in csstext_to_pairs(old):
  File "/home/alex/Documents/src/offiserv/vendor/premailer/premailer.py", line 43, in csstext_to_pairs
    parsed = cssutils.css.CSSVariablesDeclaration(csstext)
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/css/cssvariablesdeclaration.py", line 47, in __init__
    self.cssText = cssText
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/css/cssvariablesdeclaration.py", line 154, in _setCssText
    emptyOk=True)
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/prodparser.py", line 562, in parse
    self._log.error(u'%s: %s %s: %r' % (name, [t for t in text], e, token))
  File "/home/alex/Documents/src/offiserv/vendor/cssutils/errorhandler.py", line 98, in __handle
    raise error(msg)
SyntaxErr: CSSVariableDeclaration: ['-', 'm', 's', '-', 't', 'e', 'x', 't', '-', 's', 'i', 'z', 'e', '-', 'a', 'd', 'j', 'u', 's', 't', ':', '1', '0', '0', '%', ';', ' ', '-', 'w', 'e', 'b', 'k', 'i', 't', '-', 't', 'e', 'x', 't', '-', 's', 'i', 'z', 'e', '-', 'a', 'd', 'j', 'u', 's', 't', ':', '1', '0', '0', '%', ';', ' ', 'b', 'a', 'c', 'k', 'g', 'r', 'o', 'u', 'n', 'd', '-', 'c', 'o', 'l', 'o', 'r', ':', '#', 'd', 'd', 'd', ';', ' ', 'b', 'o', 'r', 'd', 'e', 'r', '-', 'c', 'o', 'l', 'l', 'a', 'p', 's', 'e', ':', 'c', 'o', 'l', 'l', 'a', 'p', 's', 'e', ';', ' ', 'h', 'e', 'i', 'g', 'h', 't', ':', '1', '0', '0', '%', ';', ' ', 'm', 'a', 'r', 'g', 'i', 'n', ':', '0', ';', ' ', 'm', 's', 'o', '-', 't', 'a', 'b', 'l', 'e', '-', 'l', 's', 'p', 'a', 'c', 'e', ':', '0', ';', ' ', 'm', 's', 'o', '-', 't', 'a', 'b', 'l', 'e', '-', 'r', 's', 'p', 'a', 'c', 'e', ':', '0', ';', ' ', 'p', 'a', 'd', 'd', 'i', 'n', 'g', ':', '0', ';', ' ', 'w', 'i', 'd', 't', 'h', ':', '1', '0', '0', '%'] Missing token for production Sequence(ident, :, term): ('CHAR', ';', 1, 119)
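A possible stop-gap, assuming the failures come from shared, non-thread-safe parser state (the tracebacks suggest this but do not prove it), is to serialize the calls with a lock. In the sketch below, `fragile_merge` is a hypothetical stand-in for `premailer.merge_styles` so that the example runs without premailer installed; `locked_merge` is likewise an illustrative name.

```python
import threading

_merge_lock = threading.Lock()


def fragile_merge(old, new):
    # Hypothetical stand-in for premailer.merge_styles (assumed not thread-safe).
    parts = (old + ";" + new).split(";")
    return "; ".join(part.strip() for part in parts if part.strip())


def locked_merge(old, new):
    # Only one thread at a time may enter the non-thread-safe code.
    with _merge_lock:
        return fragile_merge(old, new)


results = []


def worker():
    for _ in range(20):
        results.append(locked_merge("color:red", "margin:0"))


threads = [threading.Thread(target=worker) for _ in range(30)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock removes the concurrency inside the merge step, so it trades throughput for safety; it does not fix the underlying bug.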

Specificity not supported

Given

<style>
  p:last-child {
    color: red;
  }
  p {
    color: blue;
  }
</style>
<p>foo
<p>bar

This should make the first p blue and the other red, because the :last-child selector is more specific. Instead, both ps are blue.
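To illustrate the ordering the report expects, here is a simplified specificity calculation. It is a sketch under stated assumptions: the regexes only handle plain selectors (no `:not()`, no strings containing `#` or `.`), and the `specificity` helper is hypothetical rather than part of premailer.

```python
import re


def specificity(selector):
    """Return (ids, classes/attributes/pseudo-classes, elements) for a simple selector."""
    ids = len(re.findall(r"#[\w-]+", selector))
    # Classes, attribute selectors, and single-colon pseudo-classes.
    classes = len(re.findall(r"\.[\w-]+|\[[^\]]+\]|(?<!:):(?!:)[\w-]+", selector))
    # Element names at the start or after a combinator or whitespace.
    elements = len(re.findall(r"(?:^|[\s>+~])([a-zA-Z][\w-]*)", selector))
    # Pseudo-elements count at the element level.
    elements += len(re.findall(r"::[\w-]+", selector))
    return (ids, classes, elements)


# Rules in source order: (selector, property, value).
rules = [("p:last-child", "color", "red"), ("p", "color", "blue")]

# Sort by specificity first and source order second; the last entry wins.
ordered = sorted(enumerate(rules), key=lambda item: (specificity(item[1][0]), item[0]))
winner = ordered[-1][1]
```

For an element matched by both selectors, `winner` is the `p:last-child` rule even though `p` appears later in the stylesheet.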

Replace URLs in CSS with base_url

Something useful would be to replace URLs in CSS also, not just HTML.

I'm trying to get something working with the cssutils package.
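As a rough illustration of the request, independent of cssutils, a regex plus the standard library's `urljoin` can rewrite `url(...)` references. The function name `absolutize_css_urls` and the example URLs are hypothetical, and the regex is a simplification that does not handle escaped quotes or parentheses inside URLs.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target.
URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def absolutize_css_urls(css_text, base_url):
    def repl(match):
        quote, url = match.group(1), match.group(2)
        if url.startswith("data:"):
            return match.group(0)  # leave inline data URIs untouched
        return "url({0}{1}{0})".format(quote, urljoin(base_url, url))

    return URL_RE.sub(repl, css_text)


css = "h1 { background: url('/img/bg.png') } p { background: url(http://cdn.example.org/a.png) }"
result = absolutize_css_urls(css, "http://example.com/mail/")
```

Relative references are resolved against the base URL, while already absolute URLs pass through unchanged.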

Unknown property names

Property: Unknown Property name. [31:17: -ms-interpolation-mode]
Property: Unknown Property name. [4:17: mso-line-height-rule]
Property: Unknown Property name. [106:17: -webkit-text-size-adjust]
Property: Unknown Property name. [107:17: -ms-text-size-adjust]
Property: Unknown Property name. [188:21: text-rendering]
Property: Unknown Property name. [613:17: mso-text-raise]
Property: Unknown Property name. [624:17: transition]
Property: Unknown Property name.

I'm not sure what this is or what I can do about it.
Thanks for any help!

pip package appears to be missing __main__.py

This could be a Mac thing, but after running pip install there was no __main__.py in the /site-packages/premailer directory. I copied it manually from the GitHub repo and it seems OK now.

New lxml version doesn't include cssselect

The newly released lxml version (3.0.alpha2), which gets pulled in automatically by pip install premailer, no longer bundles cssselect; it is now a separate package. This causes import errors at runtime when importing anything from cssselect, since it is not installed.

>>> from premailer import Premailer, transform, __version__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/foo2/lib/python2.7/site-packages/premailer/__init__.py", line 1, in <module>
    from premailer import Premailer, transform, __version__
  File "/tmp/foo2/lib/python2.7/site-packages/premailer/premailer.py", line 3, in <module>
    from lxml.cssselect import CSSSelector
  File "/tmp/foo2/lib/python2.7/site-packages/lxml/cssselect.py", line 18, in <module>
    raise ImportError('cssselect seems not to be installed. '
ImportError: cssselect seems not to be installed. See http://packages.python.org/cssselect/

After doing a pip install cssselect

>>> from premailer import Premailer, transform, __version__
>>> 

Related lxml thread is here: lxml/lxml#46

Jinja variables in hrefs are HTML encoded. Outside of hrefs they're not encoded.

Premailer version 2.5.0.

I have an HTML template with Jinja variables in it. I'd like for these Jinja variables to remain after using premailer, so I can cache the premailed version and do variable substitution in a loop.

For variables in hrefs, like
<a href="{{ SURVEY_LINK }}" class="btn-primary">Start Survey</a>

Premailer percent-encodes (URL-encodes) the braces and spaces:
<a href="%7B%7B%20SURVEY_LINK%20%7D%7D" class="btn-primary"...

Likewise, brackets ([ and ]) inside href values are escaped, which is problematic for my use case as well.

I'd like to use premailer on my Jinja email templates but because of this HTML encoding issue I'm not able to.
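One workaround, sketched here as a post-processing step and not as a premailer feature, is to decode the percent-encoded placeholders after inlining. The helper name `restore_jinja` is hypothetical, and the pattern only covers `{{ ... }}` expressions, not `{% ... %}` blocks or brackets.

```python
import re
from urllib.parse import quote, unquote

# Percent-encoded "{{" ... "}}" as it appears in the href after processing.
ENCODED_JINJA = re.compile(r"%7B%7B(.*?)%7D%7D", re.IGNORECASE)


def restore_jinja(html):
    # Decode only the placeholder contents and put the braces back.
    return ENCODED_JINJA.sub(lambda m: "{{" + unquote(m.group(1)) + "}}", html)


encoded = '<a href="%7B%7B%20SURVEY_LINK%20%7D%7D" class="btn-primary">Start Survey</a>'
restored = restore_jinja(encoded)
```

The encoded form matches what `urllib.parse.quote("{{ SURVEY_LINK }}")` produces, which is why the braces end up as `%7B` and `%7D`.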

--preserve-style-tags not working

Running python -m premailer --preserve-style-tags -f test.html

on test.html which is

<head>
<style type="text/css">
p {font-size:12px;}
</style>
</head>
<body>
<p>html</p>
</body>
</html>

seems not to preserve the style block, which (if I read the README right) it should. I can't spot any obvious issues in the premailer code, but I'm not an expert.

Invalid style attr is generated when using pseudoclasses

When the CSS contains pseudo-classes, the resulting style attribute contains curly braces copied from the CSS.

As a result, the web browser cannot parse the style, and the element falls back to its default style instead of the one defined in the CSS.

Test HTML:

<html>
  <head>
    <style type="text/css">
    a.test {
      background-color: red;
    }
    a.test:hover {
      background-color: blue;
    }
    </style>
  </head>

  <body>
    <a class="test" href="#">TEST</a>
  </body>
</html>

Output:

<html>
  <head>
    </head>

  <body>
    <a class="test" href="#" style="{background-color:red} :hover{background-color:blue}" bgcolor="red">TEST</a>
  </body>
</html>

IMO, expected output should be similar to this:

<html>
  <head>
    <style type="text/css">
    a.test:hover {
      background-color: blue !important;
    }
    </style>
  </head>

  <body>
    <a class="test" href="#" style="background-color:red" bgcolor="red">TEST</a>
  </body>
</html>

pseudo class not supported

I'm not sure whether this is the right place to write this. Unfortunately, I have no fix for it.
Here's a link to what I'm talking about.

http://www.w3.org/TR/css-style-attr#cascading

Actually, one of the libraries used by premailer will raise an exception. I'm pretty sure handling this will make the parser a little more complicated. Anyway, you should parse the pseudo-classes and skip them if they occur. But it would be wise to structure the parser so that it will work once the library you use can parse pseudo-classes.

When you add the inline attribute, it should look like this:
for pseudo_class, style in pseudoclasses:
    # "class" is a reserved word in Python, hence the name pseudo_class
    inline += "%s {%s} " % (pseudo_class, style)

The default class would be an empty string, so it should be pretty well supported. When there is no pseudo-class you put the style in { }, but if you have, say, a :hover, it would be :hover {...}

I tried this with Firefox 3.5 and it doesn't seem to work.

In my project I just stripped the pseudo-classes, as they are not really important, but I guess it would be cool to support this feature of CSS 2 :)

Ignore certain stylesheets or styles?

Is there a way to ignore a specific style sheet? A similar library written in node called juice allows you to put data-ignore="ignore" on your style or link tag and the styles will only be embedded, not inlined. Does premailer have similar functionality?

Output replaces HTML Entities with unicode literals

Running transform seems to translate HTML entities in the source into unicode literals. For example:
<p>&copy; &nbsp;&nbsp; 2014</p>
becomes
<p>©    2014</p>

This is causing issues for me, and I'm guessing it's just a side effect of the lxml settings and not intentional. My understanding is that "&copy;" has better email client compatibility than "©". (If anything, I'd prefer an option to go the other way: escape any unicode literals in the source.)

!important overwritten by styles without it

My styles have two rules that style the same element in the DOM. The first one has !important, while the second does not. Currently, if the second rule is processed later, it overwrites the earlier style. The proper behaviour is to keep the last rule with !important.

Example style:

.element1 {
    background-color: red !important;
}

.element2 {
    background-color: blue;
}

Example HTML:

<div class="element1 element2">Foo</div>

Result:

<div style="background-color: blue;">Foo</div>

Expected result:

<div style="background-color: red !important;">Foo</div>
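The expected behaviour can be sketched as a merge step that refuses to let a normal declaration replace an !important one. This is an illustration only: `parse_declarations` and `merge` are hypothetical helpers, the rule bodies are assumed to arrive in cascade order, and the parsing is a naive split on `;` and `:`.

```python
def parse_declarations(css_text):
    """Yield (name, value, important) for each declaration in a rule body."""
    for declaration in css_text.split(";"):
        if ":" not in declaration:
            continue  # skip empty or malformed fragments
        name, value = declaration.split(":", 1)
        value = value.strip()
        important = value.lower().endswith("!important")
        if important:
            value = value[: -len("!important")].strip()
        yield name.strip(), value, important


def merge(rule_bodies):
    """Merge rule bodies in order; !important values survive later normal ones."""
    merged = {}
    for body in rule_bodies:
        for name, value, important in parse_declarations(body):
            current = merged.get(name)
            if current is not None and current[1] and not important:
                continue  # keep the earlier !important declaration
            merged[name] = (value, important)
    return merged


merged = merge(["background-color: red !important", "background-color: blue"])
```

A later declaration that is itself !important still replaces an earlier one, which matches the "last rule with !important" behaviour described above.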

inline media queries support

"inline media queries" are parsed as CSS selector and should be skipped IMHO.

eg:

@media print{
     tr.detail{
         display: none;
     }
}

multiple rules are stripped down to one (vendor prefixes)

I use vendor prefixes for background gradients, and the rules are stripped down to one.

input:

background: #ffffff; background: -moz-linear-gradient(top,  #ffffff 0%, #e5e5e5 100%); background: -webkit-gradient(linear, left top, left bottom, color-stop(0%,#ffffff), color-stop(100%,#e5e5e5)); background: -webkit-linear-gradient(top,  #ffffff 0%,#e5e5e5 100%); background: -o-linear-gradient(top,  #ffffff 0%,#e5e5e5 100%); background: -ms-linear-gradient(top,  #ffffff 0%,#e5e5e5 100%); background: linear-gradient(to bottom,  #ffffff 0%,#e5e5e5 100%); filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#ffffff', endColorstr='#e5e5e5',GradientType=0 );

output:

filter:progid:DXImageTransform.Microsoft.gradient( startColorstr='#ffffff', endColorstr='#e5e5e5',GradientType=0 );
background:linear-gradient(to bottom,  #ffffff 0%,#e5e5e5 100%);

The multiple rules for vendor prefixes need to be supported.
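One plausible cause, stated here as an assumption and not verified against premailer's code, is that declarations are stored in a mapping keyed by property name, so repeated properties overwrite each other. The sketch below contrasts that with an ordered list of pairs; `to_pairs` is a hypothetical helper and the input is shortened from the report.

```python
def to_pairs(css_text):
    """Parse declarations into an ordered list of (name, value) pairs."""
    pairs = []
    for declaration in css_text.split(";"):
        if ":" not in declaration:
            continue  # skip empty or malformed fragments
        name, value = declaration.split(":", 1)
        pairs.append((name.strip(), value.strip()))
    return pairs


css = (
    "background: #ffffff; "
    "background: -moz-linear-gradient(top, #ffffff 0%, #e5e5e5 100%); "
    "background: linear-gradient(to bottom, #ffffff 0%, #e5e5e5 100%)"
)

pairs = to_pairs(css)    # keeps every fallback, in order
collapsed = dict(pairs)  # keyed by property name: only the last value survives
```

Keeping the list form preserves the fallback chain that vendor-prefixed gradients rely on.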

padding-right not unset when an overruling padding property is present

Here is a minimal document to produce this issue:

<html>
  <head>
    <style type="text/css">
.wrapper.last {
  padding-right: 0;
}

#header .wrapper {
  padding: 15px;
  background-color: red;
}
    </style>
  </head>
  <body>
    <table id="header"><tr><td class="wrapper last"></td></tr></table>
  </body>
</html>

This is what premailer produces (relevant snippet):

<td style="background-color:red; padding:15px; padding-right:0" bgcolor="red">

As you can see, both the padding and the padding-right properties are present. This is happening with version 2.9.0. I'll spend some time debugging it and see how far I can get, but I'm not familiar with this code base, so if you know quickly why this is happening, or where I should look, please share.

@import and @media is passed to cssselect's parse_simple_selector, causing errors

Hey, it looks like you aren't stripping out things like @import and @media before passing CSS to the cssselect library.

The author of cssselect had a few suggestions when I filed the issue there:
scrapy/cssselect#25

Here's an example of a css rule that raises an error:
CSS
@import url('https://fonts.googleapis.com/css?family=Lato:400,700,400italic|Signika:400,700');

Error
Expected selector, got <NUMBER '700' at 0>

While imported css/fonts and media queries probably aren't general fare for email css, it would be good to at least filter them out of the call to cssselect's parse_simple_selector.
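The filtering suggested above could look roughly like this. It is a minimal sketch, not the fix that premailer adopted: `strip_at_rules` is a hypothetical helper, and it assumes that `@` only appears at the start of an at-rule and that URLs contain no `;` or `{`.

```python
def strip_at_rules(css_text):
    """Remove at-rules, both statement form (ends in ';') and block form (braces)."""
    out = []
    i, n = 0, len(css_text)
    while i < n:
        if css_text[i] == "@":
            # Scan to whichever comes first: ';' (statement) or '{' (block).
            j = i
            while j < n and css_text[j] not in ";{":
                j += 1
            if j < n and css_text[j] == "{":
                # Skip the whole block, tracking nested braces.
                depth = 0
                while j < n:
                    if css_text[j] == "{":
                        depth += 1
                    elif css_text[j] == "}":
                        depth -= 1
                        if depth == 0:
                            break
                    j += 1
            i = j + 1
        else:
            out.append(css_text[i])
            i += 1
    return "".join(out)


css = (
    "@import url('https://fonts.googleapis.com/css?family=Lato:400,700');\n"
    "@media print{\n  tr.detail{\n    display: none;\n  }\n}\n"
    "p { color: red }"
)
cleaned = strip_at_rules(css)
```

Only the ordinary rules remain in `cleaned`, so nothing starting with `@` reaches the selector parser. Dropping the at-rules entirely is a simplification; preserving them in a style block would need extra handling.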

Thanks for putting this library together!

Implementing inherit/initial/unset properties

Hello,

I'm using premailer to inline some templates that are getting rather large (I'm using the Zurb Ink framework).

It would be very helpful to be able to "unset" certain properties so that they aren't inlined. I think this should be easy to implement since, as far as I can tell, inherit, initial, and unset don't really mean anything when written to a property in a style= attribute. So I think those properties could simply be omitted, and it would have the effect of reverting them to their inherited or initial value.

Unnecessary defaultdict import breaks Python 2.4 compatibility

In premailer.py, line 3, there's an import of defaultdict, but it isn't used anywhere in that module. Since defaultdict is only part of Python since version 2.5, this import breaks Python 2.4 compatibility. When removing that import, premailer works like a charm on Python 2.4.
