Comments (15)
@skyler Looks like this is an lxml issue rather than a premailer one.
See here: http://stackoverflow.com/questions/4684614/is-there-a-way-to-disable-urlencoding-of-anchor-attributes-in-lxml
Until a better solution is available, I used a simple replace to fix the problem with href="{{ }}".
After the string has been transformed, fix it back:
# s is the HTML string returned by transform()
mapping = (('%7B%7B%20', '{{ '), ('%20%7D%7D', ' }}'))
for k, v in mapping:
    s = s.replace(k, v)
from premailer.
+1
from premailer.
I ran into this issue as well with SendGrid marketing campaign emails, specifically for the [Weblink] and [unsubscribe] tags.
I ended up chaining two .replace() calls onto .transform() to look for the encoded weblink and unsubscribe strings and restore them.
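A minimal sketch of that chained replace, for illustration only. The helper name restore_merge_tags is hypothetical, and the encoded forms %5BWeblink%5D and %5Bunsubscribe%5D are an assumption about how the square brackets come out of the serializer; check them against your own transformed output.

```python
def restore_merge_tags(html):
    """Undo percent-encoding of SendGrid-style [tag] placeholders."""
    # Assumption: "[" and "]" were serialized as %5B and %5D.
    return (
        html
        .replace("%5BWeblink%5D", "[Weblink]")
        .replace("%5Bunsubscribe%5D", "[unsubscribe]")
    )


# Stand-in for the string returned by Premailer(...).transform()
transformed = (
    '<a href="%5BWeblink%5D">View in browser</a> '
    '<a href="%5Bunsubscribe%5D">Unsubscribe</a>'
)
print(restore_merge_tags(transformed))
```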
from premailer.
+1. Same problem for me too
from premailer.
@skyler Looking at http://stackoverflow.com/a/4684874/205832, it makes me think that maybe you can avoid the escaping if you change the parser, i.e.
parser = Premailer("your HTML string", method="xml")
print(parser.transform())
from premailer.
I have not had much luck using the xml output method. While it avoids escaping the attributes, I just get various XML-related errors instead, primarily about undeclared entities. Instead, I decided to write my own version of etree's tostring which renders the HTML and keeps the values in the attributes unescaped.
The following gist has this code: https://gist.github.com/clj/b8ba315b9a138db73be0
This works well for my test case, but I have not tested it extensively, as I am having some other problems: the generated output, as rendered by browsers, is not equivalent to the input. I don't think this is due to the renderer linked above, but I am still investigating. Basically, YMMV.
Getting the output rendered as I want requires overriding and copying some of the functionality of transform in the Premailer class. It might be nice to refactor the code so that it is possible to override the rendering of the output without doing that.
from premailer.
Perhaps the best way is to allow you to insert some sort of override for how lxml puts the values into src and href attributes. Premailer fiddles with the value every time because it tries to correct URLs based on the base_url.
See premailer/premailer/premailer.py, line 421 in d2a2a4a.
In your case, you don't want an actual working browser URL in there, you want some other code.
Perhaps something like this:
def my_url_fixer(old, new):
    if '{{' in old and '}}' in old:
        # I know what I'm doing :)
        return old
    return new
p = Premailer(jinja_html, base_url='https://example.com', url_fixer=my_url_fixer)
And inside Premailer we have something like this:
if self.url_fixer is None:
    self.url_fixer = lambda _, n: n
parent.attrib[attr] = self.url_fixer(url, urljoin(self.base_url, url))
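To make the proposed hook concrete, here is a standalone sketch that runs without premailer. Both fix_url and keep_template_urls are hypothetical names, and url_fixer is only the parameter proposed above, not an existing premailer option. The sketch covers the base_url rewriting step only and does nothing about how the tree is serialized afterwards.

```python
from urllib.parse import urljoin


def keep_template_urls(old, new):
    """Hypothetical url_fixer: leave template placeholders untouched."""
    if "{{" in old and "}}" in old:
        return old
    return new


def fix_url(url, base_url, url_fixer=None):
    """Resolve url against base_url, then let url_fixer pick the final value."""
    if url_fixer is None:
        # Default behaviour: always use the resolved URL.
        url_fixer = lambda _old, new: new
    return url_fixer(url, urljoin(base_url, url))


base = "https://example.com/"
print(fix_url("about.html", base, keep_template_urls))
print(fix_url("{{ link }}", base, keep_template_urls))
print(fix_url("{{ link }}", base))
```

Passing both the original and the resolved value keeps the decision in the caller's hands, so the library would not need to know anything about a particular template syntax.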
from premailer.
But isn't it etree.tostring(..., method="html") that percent-encodes the href and src attributes, and not actually premailer? The fix you propose above happens before etree.tostring(...) is called, so I don't think it can achieve the desired effect of not percent-encoding the characters in href (and src) attributes (which happens in etree.tostring when the method is html).
I have not personally had any issues with the urljoin, since I don't specify a base_url:
from urlparse import urljoin  # Python 2; on Python 3: from urllib.parse import urljoin
urljoin(None, '{{ blah }}')
# outputs: u'{{ blah }}'
So, I think that the only way of making premailer work with template engines which expect the href and src attributes not to be mangled is to render the output using a custom function. As an example:
html = """
<html>
<body>
<p class="{{ something }}">
<a href="{{ something else }}">test</a>
<a href="http://example.com/%C3%A6%C3%B8%C3%A5">ing</a>
</p>
</body>
</html>
"""
from lxml import etree
tree = etree.fromstring(html)
etree.tostring(tree, method="html")
Produces the following output:
'''
<html>
<body>
<p class="{{ something }}">
<a href="%7B%7B%20something%20else%20%7D%7D">test</a>
<a href="http://example.com/%C3%A6%C3%B8%C3%A5">ing</a>
</p>
</body>
</html>
'''
Which will fail to render through my templating engine.
Trying to percent-decode the output would not work: I want the href in the first link to stay {{ something else }} (which could be achieved by percent-decoding the contents of the attribute), but I want the second href to stay as http://example.com/%C3%A6%C3%B8%C3%A5, so I can't just naively percent-decode everything in href or src attributes.
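One way to illustrate the distinction is to decode only the spans that look like an encoded {{ ... }} tag and leave every other escape alone. This is a sketch, not something proposed in this thread as-is: restore_template_tags and ENCODED_TAG are hypothetical names, and it assumes the template delimiters are exactly {{ and }}. A URL that legitimately contains %7B%7B ... %7D%7D would be decoded as well.

```python
import re
from urllib.parse import unquote

# A percent-encoded "{{ ... }}" span that stays inside a single attribute value.
ENCODED_TAG = re.compile(r"%7B%7B[^\"'<>]*?%7D%7D", re.IGNORECASE)


def restore_template_tags(html):
    """Percent-decode only the encoded {{ ... }} spans, nothing else."""
    return ENCODED_TAG.sub(lambda match: unquote(match.group(0)), html)


serialized = (
    '<a href="%7B%7B%20something%20else%20%7D%7D">test</a>'
    '<a href="http://example.com/%C3%A6%C3%B8%C3%A5">ing</a>'
)
print(restore_template_tags(serialized))
```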
If, on the other hand, I run this through the MyPremailer from the gist in my previous comment (MyPremailer(html, remove_classes=False).transform()), I get the following output:
'''
<html>
<head></head><body>
<p class="{{ something }}">
<a href="{{ something else }}">test</a>
<a href="http://example.com/%C3%A6%C3%B8%C3%A5">ing</a>
</p>
</body>
</html>
'''
Which is what my templating engine needs to work correctly.
from premailer.
@peterbe: Switching to method='xml' seems to be a no-go.
Here's the first error I get, even though my input validates as XHTML 1.0 Transitional.
  File "C:\utils\Python27\lib\site-packages\premailer\premailer.py", line 308, in transform
    head = get_or_create_head(tree)
  File "C:\utils\Python27\lib\site-packages\premailer\premailer.py", line 58, in get_or_create_head
    body = CSSSelector('body')(root)[0]
IndexError: list index out of range
In fact, it looks like the chosen method can get ignored anyway, namely when hasattr(self.html, "getroottree") is true.
I also tried setting method='text', but that appears to have overridden my setting of encoding='ascii', as you get the familiar UnicodeEncodeError.
from premailer.
@clj Annoyingly I didn't get a notification about your reply.
I fear that trying to override how tostring works is dangerous and fragile. It might just happen to work today in this use case, but working against a library's design usually leads to maintenance problems in the future.
My bet would be to instead rely on a string-replace solution as shown by @darklow.
from premailer.
This can be resolved either by using method='xml', if possible, or, if you can't have well-formed XML, in the following way:
from premailer import Premailer
from premailer.premailer import _importants
from lxml import etree
with open('path/to/template.jinja2') as f:
    template_str = f.read()
parser = etree.HTMLParser()
html = etree.fromstring(template_str, parser)
# I used these options; feel free to change them.
styled = Premailer(
    html,
    external_styles=['path/to/css/file.css'],
    disable_leftover_css=True,
    strip_important=True,
    disable_validation=True,
    keep_style_tags=False,
).transform()
# The HTML parser forcibly wrapped my template in <html><body>, which it did not have before.
# getroot()[0][0] skips to the first element contained in <body>.
# Just use etree.tostring(styled) if you want the full document.
out = etree.tostring(styled.getroot()[0][0]).decode('utf8')
out = _importants.sub('', out)  # if you used the strip_important flag
with open('path/to/output/template.jinja2', 'w') as f:
    f.write(out)
A refactor of the Premailer.transform method would make "jinja-var preservation" an easy option to add to the class.
from premailer.
This also bit me. Thanks for the solution, @jdq22
from premailer.
I faced the same issue and used urllib.unquote (urllib.parse.unquote on Python 3) to convert the percent-encoded strings back to their decoded form.
Working just fine for me. HTH.
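For reference, a minimal Python 3 sketch of this approach. The sample string stands in for the output of transform() and the variable names are illustrative. Note that unquote decodes every escape in the string, so it fits best when the document contains no URLs that must stay percent-encoded.

```python
from urllib.parse import unquote

# Stand-in for the HTML string returned by transform()
transformed = '<a href="%7B%7B%20unsubscribe_url%20%7D%7D">Unsubscribe</a>'
restored = unquote(transformed)
print(restored)

# Every escape is decoded, including ones in ordinary URLs.
decoded_url = unquote("http://example.com/%C3%A6")
print(decoded_url)
```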
from premailer.
Just offering another solution. It's certainly not efficient, but it does the job and I was already using bs4 for some other aspects:
from bs4 import BeautifulSoup
from premailer import transform
soup = BeautifulSoup(html, features="html.parser")
# Preserve links (template tags get URL encoded)
orig_links = []
for a in soup.find_all('a', href=True):
    orig_links.append(a['href'])
    a['href'] = '#link-%d' % (len(orig_links) - 1)
# Inline CSS
premailer_html = transform(
    str(soup),
    strip_important=False
)
# Restore links
soup = BeautifulSoup(premailer_html, features="html.parser")
for a in soup.find_all('a', href=True):
    lidx = int(a['href'].rsplit('-', 1)[1])
    a['href'] = orig_links[lidx]
# Output: str(soup) or soup.prettify() ... I then minify it as well with htmlmin
from premailer.
What works for me currently is to set preserve_handlebar_syntax to True.
from premailer.