holtwick / xhtml2pdf

This project forked from xhtml2pdf/xhtml2pdf

HTML/CSS to PDF converter based on Python. Most current fork/version by Chris Glass: https://github.com/chrisglass/xhtml2pdf

License: Apache License 2.0

Python 100.00%

xhtml2pdf's Introduction

HELP
====

> xhtml2pdf -h

REQUIREMENTS
============

- Reportlab Toolkit 2.2+
  <http://www.reportlab.org/>

- html5lib 0.11.1+
  <http://code.google.com/p/html5lib/>

- pyPdf 1.11+ (optional)
  <http://pybrary.net/pyPdf/>

EXAMPLES
========

> xhtml2pdf -s test\test-loremipsum.html
> xhtml2pdf -s http://www.python.org
> xhtml2pdf test\test-*.html

PYTHON INTEGRATION
==================

Some simple demos of how to integrate PISA into
a Python program may be found here: test\simple.py

CONTACT
=======

[email protected]

LICENSE
=======

Copyright 2010 Dirk Holtwick, holtwick.it

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

xhtml2pdf's Issues

Table sometimes expands out of page

Sometimes, when the output spans multiple pages and a page starts with a table, that table extends beyond the page boundary. This does not happen if the widths of all columns are defined.

I'll post a link to a Gist with some sample of nonworking HTML.

cssParser fails with escaped quotes within string

cssParser does not recognize escaped characters within a string because the backslash character is consumed by the first group, [\t !#$%&(-~]. For example, the following string in a CSS file makes xhtml2pdf fail with a traceback: "\"}\"".

It helps if I put i_escape and i_escape_nl before the plain-characters group in the i_string_content rule:

--- cssParser.py.orig   2009-03-18 14:02:36.000000000 +0200
+++ cssParser.py    2011-02-18 11:19:35.386109400 +0200
@@ -281,7 +281,7 @@
     re_rgbcolor = re.compile(i_rgbcolor, _reflags)
     i_nl = u'\n|\r\n|\r|\f'
     i_escape_nl = u'\\\\(?:%s)' % i_nl
-    i_string_content = _orRule(u'[\t !#$%&(-~]', i_escape_nl, i_nonascii, i_escape)
+    i_string_content = _orRule(i_escape, i_escape_nl, u'[\t !#$%&(-~]', i_nonascii)
     i_string1 = u'\"((?:%s|\')*)\"' % i_string_content
     i_string2 = u'\'((?:%s|\")*)\'' % i_string_content
     i_string = _orRule(i_string1, i_string2)
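
The effect of alternation order can be reproduced in isolation. The sketch below is not the cssParser code: `PLAIN` is copied from the rule above, but `ESCAPE` is a simplified stand-in for `i_escape` (an assumption), and `string_regex` is a hypothetical helper. It only illustrates why the plain-characters group, whose `(-~` range includes the backslash, must not be tried first.

```python
import re

# Character class copied from the issue; the range (-~ covers the backslash.
PLAIN = r'[\t !#$%&(-~]'
# Simplified stand-in for i_escape (assumption): a backslash plus one printable char.
ESCAPE = r'\\[ -~]'


def string_regex(*alternatives):
    """Build a double-quoted-string regex that tries the alternatives in the given order."""
    content = '|'.join('(?:%s)' % alt for alt in alternatives)
    return re.compile(r'''"((?:%s|')*)"''' % content)


# The CSS string from the issue: "\"}\""
sample = r'"\"}\""'

# Plain characters first: the backslash is consumed alone, so the next quote ends the string.
plain_first = string_regex(PLAIN, ESCAPE).match(sample).group(0)
# Escapes first: the backslash-quote pair is consumed together and the whole string matches.
escape_first = string_regex(ESCAPE, PLAIN).match(sample).group(0)

print(repr(plain_first))
print(repr(escape_first))
```

With the plain group first the match stops after the first escaped quote, which leaves the tokenizer in the middle of the string; with the escape rule first the full string is consumed.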

Segmentation fault

I get a segmentation fault from pisa, both when running the xhtml2pdf CLI tool and when importing pisa in Python:

$ xhtml2pdf -h
Segmentation fault

$ python
Python 2.6.4 (r264:75706, Jan 26 2010, 14:52:19)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import ho.pisa as pisa
Segmentation fault

I believe there is a missing dependency which is not checked, because I have the same version installed on another machine with similar configuration and it's working perfectly.

$ uname -a
Linux cyrus-dev 2.6.18-238.12.1.el5 #1 SMP Tue May 31 13:23:01 EDT 2011 i686 i686 i386 GNU/Linux

$ cat /etc/issue
CentOS release 5.6 (Final)
Kernel \r on an \m

Thanks!

Total number of pages not available as proprietary tag

I would like to have the total number of pages available as a tag, like pdf:totalpages or something similar. Is that missing for performance reasons (I suppose you'd have to backtrack after rendering the entire document)? I'm a PHP programmer with no Python experience, but I would like to help if needed.

While we're at it: a tag for the current date (with a format parameter) wouldn't be such a bad idea either (I think I can even do that myself :-))
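
The backtracking concern raised for the total page count can be illustrated with a two-pass sketch: render every page with a placeholder first, then substitute the total once the page count is known. This is plain Python over strings, not how xhtml2pdf or ReportLab handle it; `TOTAL_PLACEHOLDER` and `number_pages` are hypothetical names.

```python
# Hypothetical marker that a first rendering pass would leave on each page.
TOTAL_PLACEHOLDER = '{{total_pages}}'


def number_pages(pages):
    """Second pass: fill in the total now that all pages exist."""
    total = len(pages)
    return [page.replace(TOTAL_PLACEHOLDER, str(total)) for page in pages]


# First pass output: the total is unknown while pages are still being produced.
pages = ['Page 1 of {{total_pages}}', 'Page 2 of {{total_pages}}']
numbered = number_pages(pages)
print(numbered)
```

The cost is that no page can be finalized until the last one has been laid out, which is the performance trade-off the question hints at.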

startViewer function does not work on gnu/linux

I suggest the following change at line 470 of pisa.py. Replace this:
os.system('open "%s"' % filename)

with this:
os.system('xdg-open "%s"' % filename)
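
Since `open` is the macOS launcher and `xdg-open` the one used on most Linux desktops, a platform check might serve both. The following is a minimal sketch, not the actual startViewer code; `viewer_command` and `start_viewer` are hypothetical names, and it assumes `xdg-open` is installed on non-macOS Unix systems.

```python
import os
import subprocess
import sys


def viewer_command(filename, platform=sys.platform):
    """Return the launcher command for a platform, or None where os.startfile applies."""
    if platform.startswith('win'):
        return None                      # Windows: handled by os.startfile
    if platform == 'darwin':
        return ['open', filename]        # macOS
    return ['xdg-open', filename]        # Linux and other Unix desktops (assumed)


def start_viewer(filename):
    """Open the file in the default viewer for the current platform."""
    command = viewer_command(filename)
    if command is None:
        os.startfile(filename)
    else:
        # An argument list avoids the shell quoting issues of os.system.
        subprocess.call(command)
```

Passing the platform as a parameter keeps the selection logic testable without launching anything.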

form elements not recognized

Although pisa_reportlab is able to process input elements of types "radio" and "select", pisa_parser allows only "text", "hidden", and "checkbox". Perhaps "radio" and "select" should be added to TAGS["input"]. It would be good to add "password" and "submit" types also.

Demo no longer works

http://www.xhtml2pdf.com/demo

AttributeError: Element instance has no attribute 'matchesNode'

I get the following error when xhtml2pdf is run with debug option:

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\sx\pisa3\pisa_parser.py", line 231, in CSSCollect
    cssAttrMap[cssAttrName] = node.getCSSAttr(c.cssCascade, cssAttrName)
  File "C:\Python27\lib\site-packages\sx\pisa3\pisa_parser.py", line 195, in getCSSAttr
    result = cssCascade.findStyleFor(self.cssElement, attrName, default)
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 133, in findStyleFor
    rule = self.findCSSRulesFor(element, attrName)
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 153, in findCSSRulesFor
    rules += ruleset.findCSSRuleFor(element, attrName)
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 504, in findCSSRuleFor
    return self.findCSSRulesFor(*args, **kw)[-1:]
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 496, in findCSSRulesFor
    if (attrName in declarations) and (nodeFilter.matches(element)):
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 272, in matches
    if not qualifier.matches(element):
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 463, in matches
    return selector.matches(element.getPreviousSibling())
  File "C:\Python27\lib\site-packages\sx\w3c\css.py", line 268, in matches
    if not element.matchesNode(self.fullName):
AttributeError: Element instance has no attribute 'matchesNode'

Should CSSDOMElementInterface.getPreviousSibling() return a CSSElementInterfaceAbstract instance, or should CSSSelectorBase.matches() accept unwrapped DOM elements?

XHTMLParser is no longer available in latest html5lib

pisa_parser uses an XHTMLParser when the -xhtml switch is set to true. That class is no longer available in the 0.90 release of html5lib. This version is installed by default in the Lucid release of Ubuntu, which breaks PISA installations there. The requirement for PISA should be changed to html5lib 0.11 only, not html5lib 0.11+.
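
An alternative to pinning the version would be feature detection: use XHTMLParser when the installed html5lib still provides it and fall back to HTMLParser otherwise. The sketch below is not pisa_parser code; `pick_parser_class` is a hypothetical helper, and the two namespaces are stand-ins for an old and a new html5lib release so the example runs without the library installed.

```python
import types


def pick_parser_class(html5lib_module, want_xhtml):
    """Prefer XHTMLParser when requested and available, else fall back to HTMLParser."""
    if want_xhtml and hasattr(html5lib_module, 'XHTMLParser'):
        return html5lib_module.XHTMLParser
    return html5lib_module.HTMLParser


# Dummy classes standing in for the real parser classes (assumption for illustration).
class FakeHTMLParser(object):
    pass


class FakeXHTMLParser(object):
    pass


old_release = types.SimpleNamespace(HTMLParser=FakeHTMLParser, XHTMLParser=FakeXHTMLParser)
new_release = types.SimpleNamespace(HTMLParser=FakeHTMLParser)

print(pick_parser_class(old_release, True).__name__)
print(pick_parser_class(new_release, True).__name__)
```

Whether HTMLParser is an acceptable substitute for XHTML input is a separate question that this sketch does not answer.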

render SELECT inputs

Hello,
Is there a way to render select inputs yet?
The changelog mentions it was "prepared", perhaps in another branch?
Thank you.

sx/w3c/css.py:40: DeprecationWarning: the sets module is deprecated

Maybe this simple workaround will be useful?

try:
    set
except NameError:
    from sets import Set as set   # Python 2.3 fallback

Then use set instead of sets.Set on line 529.

Header height = top + height

When I generate a PDF with a fixed header, its actual height equals the declared height plus the declared top.

For example, with this configuration I get a header starting at 1cm with a height of 2cm:

@page {
    @frame header {
        -pdf-frame-content: headerContent;
        -pdf-frame-border: 1;
        top: 1cm;
        height: 1cm;
        left: 1cm;
        right: 1cm;
    }
}

pisaDocument does not close or flush file

pisa.CreatePDF does not close or flush the pisaContext document. As a result, you cannot manipulate or move the PDF until the calling method has finished, because the last chunk has not yet been written to the final PDF file. In my tests, the tmp file is a valid PDF document; however, the generated PDF does not contain the final chunk of data until the calling method ends.

Adding c.dest.flush() or c.dest.close() to pisaDocument before it returns the pisaContext instance resolves this issue.
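
Until such a fix is in place, the caller can work around the problem described above by owning the output file and closing it before touching the result. This is a minimal sketch under assumptions: `write_then_move` is a hypothetical helper, and `render` stands in for any function that writes the PDF to an open file object (for example one that passes `dest` on to pisa.CreatePDF).

```python
import os
import shutil
import tempfile

# Stand-in bytes for a rendered document (not a real PDF).
PAYLOAD = b'%PDF-1.4 stand-in payload %%EOF\n'


def write_then_move(render, final_path):
    """Render into a temp file, close it, and only then move it into place."""
    fd, tmp_path = tempfile.mkstemp(suffix='.pdf')
    with os.fdopen(fd, 'wb') as dest:
        render(dest)
    # Leaving the with-block closes the file, which flushes any buffered data.
    shutil.move(tmp_path, final_path)
    return final_path


def fake_render(dest):
    """Hypothetical renderer used only to make the sketch runnable."""
    dest.write(PAYLOAD)


out_dir = tempfile.mkdtemp()
out_path = write_then_move(fake_render, os.path.join(out_dir, 'out.pdf'))
print(os.path.getsize(out_path))
```

Because the file is closed inside the helper, the move happens only after every byte has reached disk, independent of when the calling method returns.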
