davideuler / chm2pdf
Automatically exported from code.google.com/p/chm2pdf
License: GNU General Public License v2.0
CHM2PDF
(c) 2007 Massimo Sandal
(c) 2007-2008 Chris Karakas <http://www.karakas-online.de>

A Python script that converts a CHM file into a single PDF file.

Usage:
chm2pdf [options] input_filename [output_filename]
See chm2pdf --help for all options.

RECOMMENDED READING:
- http://www.karakas-online.de/forum/viewtopic.php?t=10275
- http://www.karakas-online.de/forum/viewtopic.php?t=10969

Installation:
- download the .tar.gz
- unzip it: "tar -xzvf chm2pdf-a.b.c.tar.gz"
- enter the newly created directory
- acquire root privileges
- type "python setup.py install"

Requires:
- python
- chmlib
  NOTE: chmlib *must* be configured with ./configure --enable-examples
- pychm
- htmldoc

Optional:
- BeautifulSoup

All of these should be in your Linux/Unix distribution repository :)

To contact Massimo: [email protected]
To contact Chris: [email protected]
What steps will reproduce the problem?
1. Get some CHM with images, i.e. a book
2. run chm2pdf --book file.chm
Images are not included. The problem is in the image names. I created a
patch earlier and sent it to the group.
What version of the product are you using? On what operating system?
0.9 running on Ubuntu Hardy
Original issue reported on code.google.com by [email protected]
on 19 May 2008 at 3:10
What steps will reproduce the problem?
1. Install the chm2pdf package(v 0.9) and all dependencies from Synaptic,
on Ubuntu
2. chm2pdf --book mybook.chm
The error occurs because the CHM file path includes spaces.
Example: /home/myuser/Docs/Some books to read/mychm.chm
Please try to fix that.
What is the expected output? What do you see instead?
The expected output is a PDF file. Instead I just see a fatal error.
What version of the product are you using? On what operating system?
Version 0.9, Ubuntu package.
Please provide any additional information below.
Here is the error:
CHM2PDF_WORK_DIR = /tmp/chm2pdf/work/pbp
CHM2PDF_ORIG_DIR = /tmp/chm2pdf/orig/pbp
Removing any previous temporary files
rm: no se puede borrar «/tmp/chm2pdf/orig/pbp/*»: No existe el fichero ó
directorio
rm: no se puede borrar «/tmp/chm2pdf/work/pbp/*»: No existe el fichero ó
directorio
failed to open /home/hexbase/Escritorio/Almost
sh: cannot create /tmp/chm2pdf/work/pbp/urlslist.txt: Directory nonexistent
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 887, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 883, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 180, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 98, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','r')
IOError: [Errno 2] No such file or directory:
'/tmp/chm2pdf/work/pbp/urlslist.txt'
Original issue reported on code.google.com by [email protected]
on 13 Dec 2008 at 2:37
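For the report above, the line "failed to open /home/hexbase/Escritorio/Almost" suggests that the path is being split at its first space, most likely by the shell. A minimal sketch of one way to avoid that, assuming the external tool is launched from Python: pass the arguments as a list so that no shell parsing takes place. The helper name run_and_capture is hypothetical and is not part of chm2pdf.

```python
import subprocess


def run_and_capture(argv, output_path):
    """Run argv without a shell and write its stdout to output_path.

    Hypothetical helper, not chm2pdf code. Because argv is a list, each
    element reaches the child process as one argument, spaces included.
    """
    with open(output_path, "wb") as out:
        return subprocess.call(argv, stdout=out)
```

A call such as run_and_capture(["enum_chmLib", "/home/myuser/Docs/Some books to read/mychm.chm"], "urlslist.txt") would then need no quoting. This assumes that chmlib's enum_chmLib example program is what produces urlslist.txt.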
What steps will reproduce the problem?
1. Look at source
2. Find lines like "CHM2PDF_WORK_DIR = CHM2PDF_TEMP_WORK_DIR + os.sep +
basename"
3. Replace with "CHM2PDF_WORK_DIR = os.path.join(CHM2PDF_TEMP_WORK_DIR,
basename)"
Original issue reported on code.google.com by [email protected]
on 19 Jul 2009 at 6:29
What steps will reproduce the problem?
1. Run chm2pdf --verbose --extract-only <somefile.chm>
2. View output directories created
3. Run ls on the /tmp directory to see whether the working directory and its images are there
and can be opened in a viewer
What is the expected output? tmp directories are there for viewing
What do you see instead? No tmp directories exist with names given in the
output for CHM2PDF_WORK_DIR variable.
What version of the product are you using? 0.9.1 (from .deb file)
On what operating system? Ubuntu Linux (10.04 LTS- the Lucid Lynx)
Please provide any additional information below.
I am having trouble getting images to display in a PDF converted from a CHM file.
Per the article at <http://www.karakas-online.de/forum/viewtopic.php?t=11078>,
I have tried the --extract-only and --verbose options to get at the HTML files
and see what the problem is. While chm2pdf is running, I can see the directories being created.
But once chm2pdf ends, the directories disappear.
The permissions for /tmp (via ls -l) are drwxrwxrwt (not sure what the 't' flag is).
Output of chm2pdf and the directory listings is below. 'ls|wc -w' was called during
and then after the chm2pdf run. (ls output truncated, and filename changed)
steve@steve-laptop:~/reading_material/tmp$ chm2pdf --verbose --extract-only
somefile.chm
CHM2PDF_WORK_DIR = /tmp/tmpJ6BPBT/somefile
CHM2PDF_ORIG_DIR = /tmp/tmpsYO78R/somefile
Correcting links in the HTML files...
steve@steve-laptop:~/reading_material/tmp$
<from another terminal>
steve@steve-laptop:/tmp$ ls (results truncated for better viewing)
tmp4rlrQd tmpmDcthS tmpuyKHhB tmpFczhlo tmpnr1psy tmpZrUI2e
tmp1gXghc tmpFI_6VZ tmpPUEFa tmp41QHFI tmpgnT0IT tmpt3XD9o
steve@steve-laptop:/tmp$
steve@steve-laptop:/tmp$ ls|wc -w (chm2pdf is running)
32
steve@steve-laptop:/tmp$ ls|wc -w (chm2pdf terminated)
30
Original issue reported on code.google.com by [email protected]
on 12 Sep 2010 at 8:20
Command:
chm2pdf --book "Filename with spaces.chm"
Log:
failed to open Filename
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1098, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 1092, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 318, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 116, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','rU')
IOError: [Errno 2] No such file or directory: '/tmp/tmpowiRNl/Filename with
spaces/urlslist.txt'
Using Ubuntu Jaunty 9.04.
/usr/bin/chm2pdf version 0.9.1
Thanks
Original issue reported on code.google.com by [email protected]
on 20 Aug 2009 at 5:59
Let's say I have a file "Cool Scripts.chm". Trying to convert it
to PDF will fail.
>
> $chm2pdf "Cool Scripts.chm"
> failed to open Cool
> failed to open Cool
> Converting individual HTML pages in PDF...
> Traceback (most recent call last):
> File "/usr/bin/chm2pdf", line 176, in <module>
> main(sys.argv)
> File "/usr/bin/chm2pdf", line 172, in main
> convert_to_pdf(cfile, filename, outputfilename)
> File "/usr/bin/chm2pdf", line 106, in convert_to_pdf
> pf=open(page_filename,'r')
> IOError: [Errno 2] No such file or directory:
'../tempout//8015final/toc.html'
>
To work around this bug, do the following (for your file):
$cp "Cool Scripts.chm" CS.chm
$chm2pdf CS.chm
Original issue reported on code.google.com by [email protected]
on 2 Nov 2007 at 5:37
What steps will reproduce the problem?
1. Take a .chm file (don't know if it's reproducible with any .chm)
2. Run the 'chm2pdf' command using '--book' option
What is the expected output? What do you see instead?
I was expecting the PDF book :-). However, I see this error message:
sergio@miki ~/media/livros/understanding_llinux_kernel $ chm2pdf --book
ULK.chm ULK.pdf
ERR002: Error: no pages generated! (did you remember to use webpage mode?
Something wrong happened when launching htmldoc.
exit value: 256
Check if output exists or if it is good.
Done.
What version of the product are you using? On what operating system?
sergio@miki ~/media/livros/understanding_llinux_kernel $ chm2pdf --version
/usr/bin/chm2pdf version 0.9.1
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 22 Nov 2008 at 9:40
Program "chm2pdf" ignores errors produced during the execution of
"htmldoc", thus
returning the message "file.pdf written. Done" even when no pdf is created.
After examining the program with "strace", it seems that "htmldoc" segfaults
at some point, and this error is not captured by "chm2pdf". So they are
really two bugs.
I'll send you the concrete .CHM if it helps.
mremap(0xb76be000, 282624, 286720, MREMAP_MAYMOVE) = 0xb76be000
brk(0xa6a8000) = 0xa6a8000
brk(0xa6c9000) = 0xa6c9000
brk(0xa6ea000) = 0xa6ea000
mremap(0xb76be000, 286720, 290816, MREMAP_MAYMOVE) = 0xb76be000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
[...]
mmap2(NULL, 120802, PROT_READ, MAP_PRIVATE, 6, 0) = 0xb7c58000
close(6) = 0
write(2, "sh: line 1: 22511 Violaci\363n de segmento htmldoc --duplex
--format \'pdf14\' --jpeg=\'100\' --linkcolor \'blue\' --header \'c C\'
--size \'a4\' --linkstyle \'plain\' --embedfonts --book --footer \'c C\' [...]
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.24 (PREEMPT)
Locale: LANG=es_ES.UTF-8, LC_CTYPE=es_ES.UTF-8 (charmap=ISO-8859-1)
(ignored: LC_ALL set to es_ES)
Shell: /bin/sh linked to /bin/bash
Versions of packages chm2pdf depends on:
ii htmldoc 1.8.27-3 HTML processor that generates inde
ii libchm-bin 2:0.39-7 library for dealing with Microsoft
ii python 2.5.2-1 An interactive high-level object-o
ii python-chm 0.8.4-0.1+b1 Python binding for CHMLIB
ii python-support 0.7.7 automated rebuilding support for P
chm2pdf recommends no packages.
-- no debconf information
Original issue reported on code.google.com by [email protected]
on 23 Apr 2008 at 4:51
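The second bug described in the report above (the htmldoc failure going unnoticed) comes down to checking the child's exit status. A minimal sketch, assuming the converter is started from Python with subprocess; the helper name run_checked is hypothetical and is not part of chm2pdf.

```python
import subprocess


def run_checked(argv):
    """Run argv and raise RuntimeError unless it exits with status 0.

    Hypothetical helper, not chm2pdf code. On POSIX, subprocess reports a
    child killed by signal N as return code -N, so a crash such as a
    segmentation fault shows up as a negative value.
    """
    returncode = subprocess.call(argv)
    if returncode < 0:
        raise RuntimeError("%s was killed by signal %d" % (argv[0], -returncode))
    if returncode != 0:
        raise RuntimeError("%s exited with status %d" % (argv[0], returncode))
```

With a check like this, a "written. Done" message would only be printed after the converter has actually succeeded.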
Spaces in .chm filenames are not properly escaped, e.g.:
chm2pdf --book "file with spaces.chm"
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.25 (PREEMPT)
Locale: LANG=es_ES.UTF-8, LC_CTYPE=es_ES.UTF-8 (charmap=ISO-8859-1)
(ignored: LC_ALL set to es_ES)
Shell: /bin/sh linked to /bin/bash
Versions of packages chm2pdf depends on:
ii htmldoc 1.8.27-3 HTML processor that generates inde
ii libchm-bin 2:0.39-9 library for dealing with Microsoft
ii python 2.5.2-1 An interactive high-level object-o
ii python-chm 0.8.4-0.1+b1 Python binding for CHMLIB
ii python-support 0.8.1 automated rebuilding support for P
chm2pdf recommends no packages.
-- no debconf information
Original issue reported on code.google.com by [email protected]
on 7 Jul 2008 at 2:19
Hi,
chm2pdf was crashing on some files I had. The problem was in the file itself: chm2pdf
was aborting midway through generating the URLs in urlslist.txt, and an error
message was being appended to urlslist.txt as well.
That appended error message caused chm2pdf to crash later on, because at
line 119 (of trunk)
spline[5] would not work.
Adding a simple check
if len(spline) > 5: urls_list.append(spline[5])
takes care of it. Now it exits more gracefully, having generated as much of
the PDF as it could.
Thanks and regards
-- sreangsu
Original issue reported on code.google.com by [email protected]
on 21 Feb 2010 at 8:12
Please refer to this for details. A patch can be found in the Debian source
package, please merge this into your repository.
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=501959
Original issue reported on code.google.com by [email protected]
on 22 Nov 2008 at 5:26
What steps will reproduce the problem?
1. converting to PDF
2.
3.
What is the expected output? What do you see instead?
PDF version of CHM file
What version of the product are you using? On what operating system?
latest on Ubuntu Linux 7.10
Please provide any additional information below.
Upon running
sudo /usr/bin/chm2pdf --book myChm.chm newPdf.pdf
I get the errors
ERR011: Unable to parse HTML element on line 512!
ERR002: Error: no pages generated! (did you remember to use webpage mode?
This is odd because the CHM in question is a book, not a webpage...
Original issue reported on code.google.com by [email protected]
on 13 Feb 2008 at 7:33
What steps will reproduce the problem?
1. Normal --book conversion
2.
3.
What is the expected output? What do you see instead?
Converted PDF
What version of the product are you using? On what operating system?
0.9.1-1.1ubuntu1, ubuntu linux
Please provide any additional information below.
chm2pdf --book haha.chm
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1098, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 1092, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 386, in convert_to_pdf
correct_file(page_filename, htmlout_filename, html_list,
objective_urls, options)
File "/usr/bin/chm2pdf", line 131, in correct_file
pf=open(input_file,'rU')
IOError: [Errno 21] Is a directory: '/tmp/tmpr7s2RQ/haha/'
Original issue reported on code.google.com by [email protected]
on 21 Oct 2009 at 10:06
I have a CHM file with images, and some of them are missing from the generated PDF. The
reason is (again) that on Windows paths and names are not case sensitive, but
on Linux they are. So a single upper/lower case mismatch somewhere in the CHM is
enough: the CHM displays correctly on Windows, but chm2pdf cannot convert it completely.
The curious part is that in my case the references to the missing images were spelled
correctly, but the images were in the same subdirectory as images from other
pages, and on one of those other pages the subdirectory name was written in lower case.
So the page whose images are missing in the PDF is not necessarily the page that
contains the misspelled path; it can be any other page. Probably what
counts is how the path is spelled the first time it is encountered while generating
the CHM source file...
Does anyone have ideas on how this could be solved automagically in chm2pdf?
Original issue reported on code.google.com by [email protected]
on 18 Nov 2011 at 6:05
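One possible approach to the case-mismatch question in the report above, offered only as a sketch: after extraction, look up each referenced path component by component and accept an entry whose name matches when case is ignored. The helper name resolve_case_insensitive is hypothetical and is not part of chm2pdf.

```python
import os


def resolve_case_insensitive(root, relative_path):
    """Return the on-disk path under root matching relative_path ignoring case.

    Hypothetical helper, not chm2pdf code. Returns None when a component has
    no match; '..' components are not supported. If several entries differ
    only by case, the first one listed by the OS wins.
    """
    current = root
    for part in relative_path.replace("\\", "/").split("/"):
        if not part or part == ".":
            continue
        try:
            entries = os.listdir(current)
        except OSError:
            return None
        if part in entries:
            match = part
        else:
            candidates = [e for e in entries if e.lower() == part.lower()]
            if not candidates:
                return None
            match = candidates[0]
        current = os.path.join(current, match)
    return current
```

An image reference could then be rewritten to the resolved path before the HTML is handed to the converter. Whether that fits the way chm2pdf corrects links is an open question.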
I followed the instructions at
"http://www.karakas-online.de/forum/viewtopic.php?t=10275", but I
am getting the errors below.
1>chm2pdf --book my-file.chm
root@AmSi:/home/amaresh/Desktop# chm2pdf --book Glass\,Ables\ -\ Linux\
for\ Programmers\ and\ Users\ \(Prentice\,\ 2006\).chm
CHM2PDF_WORK_DIR = /tmp/chm2pdf/work/Glass,Ables - Linux for Programmers
and Users (Prentice, 2006)
CHM2PDF_ORIG_DIR = /tmp/chm2pdf/orig/Glass,Ables - Linux for Programmers
and Users (Prentice, 2006)
Removing any previous temporary files
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 887, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 883, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 180, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 98, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','r')
IOError: [Errno 2] No such file or directory:
'/tmp/chm2pdf/work/Glass,Ables - Linux for Programmers and Users (Prentice,
2006)/urlslist.txt'
Waiting for solution,
2>chm2pdf --book --title my-file.chm
------------------
root@AmSi:/home/amaresh/Desktop# chm2pdf --book --title Glass\,Ables\ -\
Linux\ for\ Programmers\ and\ Users\ \(Prentice\,\ 2006\).chm
CHM2PDF_WORK_DIR = /tmp/chm2pdf/work/Glass,Ables - Linux for Programmers
and Users (Prentice, 2006)
CHM2PDF_ORIG_DIR = /tmp/chm2pdf/orig/Glass,Ables - Linux for Programmers
and Users (Prentice, 2006)
Removing any previous temporary files
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
sh: Syntax error: "(" unexpected
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 887, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 883, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 180, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 98, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','r')
IOError: [Errno 2] No such file or directory:
'/tmp/chm2pdf/work/Glass,Ables - Linux for Programmers and Users (Prentice,
2006)/urlslist.txt'
--------------------------
Original issue reported on code.google.com by amareshchandradas2005
on 7 May 2009 at 9:24
I was able to eliminate one of my "ERR011: Unable to parse HTML element
on line xx!" errors.
My CHM file contained some JavaScript, but chm2pdf makes no attempt
to delete JavaScript (some other unwanted content is deleted
before everything is passed to htmldoc).
I am no regex expert, so the following may not be a good solution,
but at least in my case one ERR011 is gone!
# Delete javascript (<script type='text/javascript'>...</script>)
page=re.sub('(?i)<script type=("|\')text/javascript("|\')(.*?)>(.*?)</script>','', page, flags=re.DOTALL|re.MULTILINE)
Original issue reported on code.google.com by [email protected]
on 14 Nov 2011 at 9:49
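The pattern in the report above only matches script tags whose first attribute is type="text/javascript". A slightly more general sketch, assuming the goal is to drop every script element regardless of its attributes; the names SCRIPT_RE and strip_scripts are hypothetical and not part of chm2pdf.

```python
import re

# Matches any <script ...>...</script> element, whatever its attributes.
# DOTALL lets '.' span newlines; IGNORECASE also covers <SCRIPT>.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(page):
    """Return page with all <script> elements removed (hypothetical helper)."""
    return SCRIPT_RE.sub("", page)
```

A regular expression is not a full HTML parser: a literal "</script>" inside a JavaScript string would end the match early, so this is a pragmatic filter rather than a complete solution.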
What steps will reproduce the problem?
1. Get a .chm with a TOC
2. Do a chm2pdf --webpage on it
3. The TOC has links to the right places, but all whitespace has been replaced with
newlines.
What is the expected output? What do you see instead?
The TOC should transfer as it is. Instead it transfers with each word of a
heading on a separate line.
What version of the product are you using? On what operating system?
chm2pdf 0.9, OS is MacOS X 10.5.2 on Intel x86.
Please provide any additional information below.
It is uncertain at this point which .chm files produce this broken TOC output. All the files I have
been able to get my hands on break when converted.
Original issue reported on code.google.com by [email protected]
on 4 Apr 2008 at 8:56
What steps will reproduce the problem?
1. Install on Centos 5.1
2. chm2pdf --book RHCEStudy.chm
3.
What is the expected output? What do you see instead?
rm: cannot remove `/tmp/chm2pdf/orig/RHCEStudy/*': No such file or directory
rm: cannot remove `/tmp/chm2pdf/work/RHCEStudy/*': No such file or directory
sh: /tmp/chm2pdf/work/RHCEStudy/urlslist.txt: No such file or directory
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1111, in ?
main(sys.argv)
File "/usr/bin/chm2pdf", line 1107, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 326, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 114, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','rU')
IOError: [Errno 2] No such file or directory:
'/tmp/chm2pdf/work/RHCEStudy/urlslist.txt'
What version of the product are you using? On what operating system?
0.9.1 Centos 5.1
Please provide any additional information below.
Trying to convert a .chm file to .pdf
Original issue reported on code.google.com by [email protected]
on 9 Aug 2008 at 5:28
What steps will reproduce the problem?
1. run command on chm book
2.
3.
What is the expected output? What do you see instead?
I expect to get a structured pdf file.
Instead I get
ERR002: Error: no pages generated! (did you remember to use webpage mode?
Something wrong happened when launching htmldoc.
exit value: 256
Check if output exists or if it is good.
Done.
What version of the product are you using? On what operating system?
I am running v. 9.1 on Ubuntu Hardy Heron
Please provide any additional information below.
I saw that a similar bug was posted, but tagged invalid after instructions
to read the man page were given. I read the man page, and this seems to be
a bug, or an issue with certain chm format books. The chm files in
question have navigation capabilities. Unless I am mistaken, that means
they are structured.
Original issue reported on code.google.com by [email protected]
on 23 Feb 2009 at 9:18
Hi,
I'm trying to convert a CHM book to PDF, but only the first 3 pages get
converted. After some debugging, I think the problem is in the PageLister class.
In the start_param() method, changing line 62 from:
if key=='name' and value=='Local'
to:
if key=='name' and value.lower()=='local'
solved my problem. Apparently some of the fields are named 'local' rather than 'Local'.
Original issue reported on code.google.com by [email protected]
on 10 Jan 2009 at 10:10
What steps will reproduce the problem?
1. Install dependencies on Debian
2. Run chm2pdf --book MyFile.chm MyFile.pdf
What is the expected output? What do you see instead?
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 887, in ?
main(sys.argv)
File "/usr/bin/chm2pdf", line 883, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 242, in convert_to_pdf
correct_file(page_filename, htmlout_filename, html_list, objective_urls)
File "/usr/bin/chm2pdf", line 118, in correct_file
image_catcher.feed(page)
File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
self.goahead(0)
File "/usr/lib/python2.4/sgmllib.py", line 165, in goahead
k = self.parse_declaration(i)
File "/usr/lib/python2.4/markupbase.py", line 95, in parse_declaration
decltype, j = self._scan_name(j, i)
File "/usr/lib/python2.4/markupbase.py", line 384, in _scan_name
self.error("expected name token at %r"
File "/usr/lib/python2.4/sgmllib.py", line 102, in error
raise SGMLParseError(message)
sgmllib.SGMLParseError: expected name token at
'<!\xaf\xb6\x8f\x83|(F?\xe1\x1c\xd2\xbf\xf0\x15?\xc2\x9a\xde'
What version of the product are you using? On what operating system?
chm2pdf-0.9, GNU/Linux Debian Etch
Please provide any additional information below.
None
Regards
Original issue reported on code.google.com by [email protected]
on 27 Apr 2008 at 8:28
It'd be nice if internal links worked in the PDF. I believe they break
because of the page-by-page approach (using 'pdftk cat') to pdf production.
Perhaps it is better to extract all pages & to use a file list with htmldoc.
See:
http://www.mobileread.com/forums/attachment.php?attachmentid=1794&d=1160611136
discussed at:
http://www.mobileread.com/forums/showthread.php?t=7999
for a system that does preserve links.
Original issue reported on code.google.com by [email protected]
on 7 Sep 2007 at 1:13
I have a CHM with 4 pages of fewer than 10 lines of text and cross-links (see the
example in the attachment, generated expressly to reproduce the problem). The PDF
generated by chm2pdf has 15 pages.
The CHM was made with Microsoft HTML Help Workshop 4.74.8702.0 (the latest one).
Script: chm2pdf 0.9.1.1ubuntu5 on the latest Ubuntu, 11.10.
See the --verbose output:
Example.chm:
--> /#IDXHDR
--> /#ITBITS
--> /#IVB
--> /#STRINGS
--> /#SYSTEM
--> /#TOPICS
--> /#URLSTR
--> /#URLTBL
--> /$FIftiMain
--> /$OBJINST
--> /$WWAssociativeLinks/Property
--> /$WWKeywordLinks/Property
--> /doc/Images/Param_COMP_htm_5b06edf4.bmp
--> /doc/Images/Param_COMP_htm_5b06edf4.GIF
--> /doc/Images/Param_COMP_htm_5b06edf4.PNG
--> /doc/Images/param_MS.png
--> /doc/Index.hhk
--> /doc/P1.htm
--> /doc/P2.htm
--> /doc/P3.htm
--> /doc/P4.htm
--> /toc.hhc
Correcting /tmp/tmpr5QJmW/Example/doc/P1.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P2.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P2.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P2.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P3.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P3.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P3.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P3.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P4.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P4.htm
Correcting /tmp/tmpr5QJmW/Example/doc/P4.htm
############### 1st pass ###############
match P1\.htm and replace it with temp0001_html
match P2\.htm and replace it with temp0002_html
match P2\.htm and replace it with temp0002_html
match P2\.htm and replace it with temp0002_html
match P3\.htm and replace it with temp0005_html
match P3\.htm and replace it with temp0005_html
match P3\.htm and replace it with temp0005_html
match P3\.htm and replace it with temp0005_html
match P4\.htm and replace it with temp0009_html
match P4\.htm and replace it with temp0009_html
match P4\.htm and replace it with temp0009_html
############### 2nd pass ###############
match temp0001_html and replace it with temp0001.html
match temp0002_html and replace it with temp0002.html
match temp0003_html and replace it with temp0003.html
match temp0004_html and replace it with temp0004.html
match temp0005_html and replace it with temp0005.html
match temp0006_html and replace it with temp0006.html
match temp0007_html and replace it with temp0007.html
match temp0008_html and replace it with temp0008.html
match temp0009_html and replace it with temp0009.html
match temp0010_html and replace it with temp0010.html
match temp0011_html and replace it with temp0011.html
htmldoc --webpage --duplex --format 'pdf14' --jpeg='100' --linkcolor 'blue'
--header 'c C' --size 'a4' --no-duplex --linkstyle 'plain' --embedfonts
--bodyfont times --footer 'c C' "/tmp/tmpz5hkxw/Example/temp0001.html"
"/tmp/tmpz5hkxw/Example/temp0002.html" "/tmp/tmpz5hkxw/Example/temp0003.html"
"/tmp/tmpz5hkxw/Example/temp0004.html" "/tmp/tmpz5hkxw/Example/temp0005.html"
"/tmp/tmpz5hkxw/Example/temp0006.html" "/tmp/tmpz5hkxw/Example/temp0007.html"
"/tmp/tmpz5hkxw/Example/temp0008.html" "/tmp/tmpz5hkxw/Example/temp0009.html"
"/tmp/tmpz5hkxw/Example/temp0010.html" "/tmp/tmpz5hkxw/Example/temp0011.html"
-f example.pdf > /dev/null
PAGES: 15
BYTES: 211921
Written file example.pdf
Done.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2011 at 8:21
I have some trouble with the last pages of my documents.
In one case, I get "ERR011: Unable to parse HTML element on line 49!" from
htmldoc on the last pages.
The strange thing is that if I re-run the very same htmldoc command displayed
at the --verbose --verbosity high level (obviously without deleting the temporary
files), the PDF is complete and no ERR011 error is raised.
It's as if the last file written by the conversion before invoking HTMLDOC is
still open or not completely written!
In another case, the last page is simply missing completely, without any error
message.
But the code looks just fine to me:
pf=open(filename,'w')
pf.write(page)
pf.close
What could be wrong??
Writing a dummy file afterwards, before invoking HTMLDOC, resolves the problem,
but this seems quite an ugly hack to me (I am not a programmer).
Then I found the command
pf.flush()
This also solves the problem. But why is this necessary??
Original issue reported on code.google.com by [email protected]
on 26 Nov 2011 at 9:27
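A likely explanation for the report above, judging only from the quoted snippet: pf.close without parentheses merely refers to the method and never calls it. The file therefore stays open, and buffered data may still be unwritten when htmldoc starts. That would also explain why pf.flush() helps, since it pushes the buffer out explicitly. A minimal sketch of the usual idiom follows; the helper name write_page is hypothetical and not part of chm2pdf.

```python
def write_page(filename, page):
    """Write page to filename and close the file before returning.

    Hypothetical helper, not chm2pdf code. Leaving the with block calls
    close(), which flushes Python's buffer to the operating system.
    """
    with open(filename, "w") as pf:
        pf.write(page)
```

Calling pf.close() with parentheses in the original three lines would have the same effect.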
What steps will reproduce the problem?
1. run chm2pdf --extract-only <filename.chm>
2.
3.
What is the expected output? What do you see instead?
This should produce a data directory containing html files. Instead, data
directory is deleted.
What version of the product are you using? On what operating system?
chm2pdf v 0.9.1
Please provide any additional information below.
Program source code (with line numbers):
1069 CHM2PDF_WORK_DIR = CHM2PDF_TEMP_WORK_DIR + os.sep + basename
1070 CHM2PDF_ORIG_DIR = CHM2PDF_TEMP_ORIG_DIR + os.sep + basename
...
1102 convert_to_pdf(cfile, filename, outputfilename, options)
1103 shutil.rmtree(CHM2PDF_TEMP_WORK_DIR)
1104 shutil.rmtree(CHM2PDF_TEMP_ORIG_DIR)
This shows that WORK_DIR and ORIG_DIR are *below* TEMP_WORK_DIR and
TEMP_ORIG_DIR, and so are deleted at lines 1103, 1104.
Program needs to test for options['extract-only']=='' before calling
shutil.rmtree(CHM2PDF_TEMP_WORK_DIR).
Original issue reported on code.google.com by [email protected]
on 6 Aug 2011 at 11:26
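The fix proposed in the report above can be sketched as a cleanup step that is skipped when the user asked only for extraction. This is a sketch under assumptions: the helper name cleanup_temp_dirs and the boolean extract_only parameter are hypothetical, standing in for however --extract-only is stored in chm2pdf's options.

```python
import shutil


def cleanup_temp_dirs(temp_dirs, extract_only):
    """Delete temp_dirs unless the user asked to keep the extracted files.

    Hypothetical helper, not chm2pdf code. Returns the list of directories
    that were removed (empty when extract_only is true).
    """
    if extract_only:
        return []
    removed = []
    for path in temp_dirs:
        shutil.rmtree(path, ignore_errors=True)
        removed.append(path)
    return removed
```

Printing the kept directory names at the end of an extract-only run would also tell the user where to look.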
chm2pdf 0.9
Bug submission following discussion in Google group chm2pdf.
When converting the NSIS User's Manual NSIS.chm, the resulting
PDF file contains only a table of contents. The main text is missing.
The CHM file is attached to this report and can also be downloaded
from http://nsis.sourceforge.net/
In the Google group discussion, Chris Karakas mentioned that he
reproduced the problem with his development 0.9.1 version and asked
for a bug submission. Here it is...
> I tried both 0.9 and my "development" 0.9.1 version. The problem exists
> in both. It comes from the fact that the CHM file "says" that it contains
> files with names like "SectionF.21.html#F.21.1.2", but actually it
> contains files like "SectionF.21.html". That is, we have to take away the
> "anchor information" (the "#F.21.1.2" part), before dealing with the
> files in chm2pdf.
>
> This seems to be a bug, so please be so kind and open one. :-)
>
> Chris
Original issue reported on code.google.com by [email protected]
on 13 Mar 2008 at 4:17
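The step Chris describes in the report above, taking away the anchor part before treating a TOC entry as a file name, can be sketched as follows. The helper names strip_anchor and unique_pages are hypothetical and not part of chm2pdf. Removing duplicates afterwards is an extra assumption that goes beyond what the quoted message asks for.

```python
def strip_anchor(url):
    """Return url without its '#fragment' part, if any (hypothetical helper)."""
    return url.split("#", 1)[0]


def unique_pages(urls):
    """Strip anchors and drop repeated pages, keeping first-seen order."""
    seen = set()
    pages = []
    for url in urls:
        page = strip_anchor(url)
        if page not in seen:
            seen.add(page)
            pages.append(page)
    return pages
```

This is meant for HTML topic paths only: internal CHM entries whose names begin with '#', such as /#SYSTEM, would be cut down to "/".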
Greetings, thank you for this great program. I've created a python 2.6 portfile
for macports.
I just wanted to note that it is no longer necessary to compile your own chmlib
as the macports
chmlib is configured with --enable-examples.
-james
Original issue reported on code.google.com by [email protected]
on 9 Sep 2009 at 3:54
What steps will reproduce the problem?
1. Running chm2pdf --book <chm file>
2.
3.
What is the expected output? What do you see instead?
Traceback (most recent call last):
File "/usr/local/bin/chm2pdf", line 887, in <module>
main(sys.argv)
File "/usr/local/bin/chm2pdf", line 883, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/local/bin/chm2pdf", line 179, in convert_to_pdf
html_list=get_html_list(cfile)
File "/usr/local/bin/chm2pdf", line 88, in get_html_list
lister.feed(topicstree)
File "/usr/lib/python2.7/sgmllib.py", line 103, in feed
self.rawdata = self.rawdata + data
TypeError: cannot concatenate 'str' and 'NoneType' objects
What version of the product are you using? On what operating system?
0.9.1-1.1ubuntu4
Please provide any additional information below.
I can provide the CHM file by email. Mail me at [email protected]
Original issue reported on code.google.com by [email protected]
on 5 Aug 2011 at 5:24
In my application some links do not work in the PDF because I have some
upper/lower case errors in links. As CHM is "Windows stuff" this doesn't
matter there, but "here" it does!
So how about making the first-pass matching case insensitive by adding the (?i)
modifier to the regular expression?
Original issue reported on code.google.com by [email protected]
on 18 Nov 2011 at 6:05
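The suggestion in the report above can be sketched like this, assuming the first pass replaces each original page name with its temporary name by regular-expression substitution. The helper name replace_link_ignoring_case is hypothetical and not part of chm2pdf; re.IGNORECASE has the same effect as the inline (?i) modifier.

```python
import re


def replace_link_ignoring_case(page, original_name, temp_name):
    """Replace every occurrence of original_name in page, ignoring case.

    Hypothetical helper, not chm2pdf code. re.escape keeps the dot in a
    name such as 'P1.htm' literal; the lambda stops backslashes in
    temp_name from being treated as group references.
    """
    pattern = re.compile(re.escape(original_name), re.IGNORECASE)
    return pattern.sub(lambda match: temp_name, page)
```

For example, both href="p1.HTM" and href="P1.htm" would be rewritten to the same temporary name.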
What steps will reproduce the problem?
1. chm2pdf --book filename
Output:
PAGES: 142
BYTES: 824607
Written file tdd.pdf
Done.
What is the expected output? What do you see instead?
The CHM contains tables, but in the PDF there are no tables at all; the text
that should be in them has disappeared.
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 11 Nov 2009 at 1:30
What steps will reproduce the problem?
1. /usr/bin/python /usr/bin/chm2pdf --book book.chm book.pdf
What is the expected output? What do you see instead?
chm is successfully converted to pdf. However this error occurs
sgmllib.py:111:error:SGMLParseError: unexpected '\xbd' char in
declaration and chm2pdf crashes.
What version of the product are you using? On what operating system?
chm2pdf-0.9.1 on Fedora 12.
Please provide any additional information below.
backtrace
-----
sgmllib.py:111:error:SGMLParseError: unexpected '\xbd' char in declaration
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1111, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 1107, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 394, in convert_to_pdf
correct_file(page_filename, htmlout_filename, html_list, objective_urls,
options)
File "/usr/bin/chm2pdf", line 140, in correct_file
image_catcher.feed(page)
File "/usr/lib64/python2.6/sgmllib.py", line 104, in feed
self.goahead(0)
File "/usr/lib64/python2.6/sgmllib.py", line 174, in goahead
k = self.parse_declaration(i)
File "/usr/lib64/python2.6/markupbase.py", line 136, in parse_declaration
"unexpected %r char in declaration" % rawdata[j])
File "/usr/lib64/python2.6/sgmllib.py", line 111, in error
raise SGMLParseError(message)
SGMLParseError: unexpected '\xbd' char in declaration
Local variables in innermost frame:
message: "unexpected '\\xbd' char in declaration"
self: <__main__.ImageCatcher instance at 0x7fa85d2bdef0>
Bugzilla bug at https://bugzilla.redhat.com/show_bug.cgi?id=629659
Original issue reported on code.google.com by lakshminaras2002
on 14 May 2011 at 11:13
What steps will reproduce the problem?
> chm2pdf
What is the expected output? What do you see instead?
Traceback (most recent call last):
File "/usr/local/bin/chm2pdf", line 24, in <module>
import chm.chm as chm
ImportError: No module named chm.chm
What version of the product are you using? On what operating system?
I use version 9.1 of chm2pdf on Linux 2.6.38-11-generic
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 12 Sep 2011 at 12:58
Something wrong happened when launching htmldoc.
exit value: 16640
ERR011: Unable to read image file "/tmp/tmpVEDMGt/UNIX\ System\ Programming/"!
Original issue reported on code.google.com by [email protected]
on 3 Dec 2010 at 3:33
What steps will reproduce the problem?
1. rename a legitimate chm file into abc\'xyz
2. chm2pdf --book abc\'xyz
What is the expected output? What do you see instead?
Got an error saying a tmp folder doesn't exist.
sh: Syntax error: Unterminated quoted string
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1108, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 1102, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 318, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 116, in get_objective_urls_list
flist=open(CHM2PDF_WORK_DIR+'/urlslist.txt','rU')
IOError: [Errno 2] No such file or directory:
"/tmp/tmp9iuwl8/abc'xyz/urlslist.txt"
What version of the product are you using? On what operating system?
Version 0.9.1
Ubuntu 10.04
Please provide any additional information below.
The problem can be fixed by adding the following lines.
1069d1068
< basename = '_' + re.sub(r'[^\w]', '', basename)
1091d1089
< filename = filename.replace("\'", "\\\'")
1093d1090
< outputfilename = outputfilename.replace("\'", "\\\'")
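The patch above escapes quotes by hand before the filename reaches the shell. As a hedged alternative sketch (not the project's actual fix), Python's standard shlex.quote can produce a shell-safe string, and an argument list passed to subprocess without a shell avoids quoting altogether. The helper name build_extract_command is hypothetical, and the extract_chmLib command line shown is an assumption for illustration.

```python
import shlex

def build_extract_command(chm_path, work_dir):
    """Return (argv list, shell-safe string) for a hypothetical extraction call."""
    # Assumed command layout; adjust to whatever tool is actually invoked.
    argv = ["extract_chmLib", chm_path, work_dir]
    # shlex.quote wraps each argument so a POSIX shell treats it as one word.
    shell_string = " ".join(shlex.quote(arg) for arg in argv)
    return argv, shell_string

argv, shell_string = build_extract_command("abc'xyz", "/tmp/work")
# shlex.split parses the string the way a POSIX shell would split it,
# so a round trip should give back the original arguments.
assert shlex.split(shell_string) == argv
```

The argv list can be handed to subprocess.call(argv) directly, in which case no shell is involved and the quote character in the filename needs no escaping.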
Original issue reported on code.google.com by [email protected]
on 21 Sep 2012 at 5:36
What steps will reproduce the problem?
1. executing chm2pdf with a "huge" chm (1054 pages)
What is the expected output? What do you see instead?
It gives an error at the re.sub call on line 159.
What version of the product are you using? On what operating system?
version 0.9.1
Please provide any additional information below.
Bug solved using: page=re.sub(re.escape(iurl),img_filename,page)
The iurl probably contains some special regex characters.
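A minimal sketch of why re.escape helps here, assuming an image URL that contains a regex metacharacter (the sample URL and the helper name replace_image_url are made up for illustration):

```python
import re

def replace_image_url(page, iurl, img_filename):
    # re.escape makes every regex metacharacter in iurl match literally.
    return re.sub(re.escape(iurl), img_filename, page)

page = '<img src="images/fig(1.png">'
iurl = "images/fig(1.png"  # the unbalanced "(" is not a valid regex on its own

try:
    re.sub(iurl, "fig1.png", page)
    unescaped_failed = False
except re.error:
    unescaped_failed = True

fixed = replace_image_url(page, iurl, "fig1.png")
```

Since the URL is a literal string and not a pattern, page.replace(iurl, img_filename) would be a simpler option that avoids regex handling of both the pattern and the replacement.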
Original issue reported on code.google.com by [email protected]
on 11 Mar 2009 at 8:33
What steps will reproduce the problem?
1. Execute chm2pdf on a CHM file that contains spaces in its internal file
structure.
What is the expected output?
A shiny, new PDF.
What do you see instead?
user@computer ~ $ chm2pdf --book temp.chm
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 1098, in <module>
main(sys.argv)
File "/usr/bin/chm2pdf", line 1092, in main
convert_to_pdf(cfile, filename, outputfilename, options)
File "/usr/bin/chm2pdf", line 318, in convert_to_pdf
objective_urls=get_objective_urls_list(filename)
File "/usr/bin/chm2pdf", line 121, in get_objective_urls_list
urls_list.append(spline[5])
IndexError: list index out of range
What version of the product are you using? On what operating system?
0.9.1 on Linux Mint Debian Edition.
Please provide any additional information below.
A kind fellow named Reto has posted a solution here:
https://groups.google.com/forum/#!topic/chm2pdf/859fW7pSMWA
In get_objective_urls_list of the main script, change the contents of the for
loop to the following:
for line in flist.readlines()[3:]:
    spline = re.sub(r".*?normal file\s*(.*?)\n$", "\\1", line)
    if spline[0]=="/":
        urls_list.append(spline)
flist.close()
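As a rough sketch of what that regex does: it strips everything up to and including "normal file" plus the following whitespace, leaving the full path, spaces included. The sample listing lines below are hypothetical stand-ins for the contents of urlslist.txt, and extract_urls is an illustrative name, not a function in chm2pdf.

```python
import re

NORMAL_FILE = re.compile(r".*?normal file\s*(.*?)\n$")

def extract_urls(lines):
    urls = []
    for line in lines:
        # A non-matching line is returned unchanged by sub().
        name = NORMAL_FILE.sub(r"\1", line)
        # startswith() is also safe on empty strings, unlike name[0].
        if name.startswith("/"):
            urls.append(name)
    return urls

# Hypothetical listing lines; the real column layout may differ.
sample = [
    "   0        0        0   normal dir\t\t/\n",
    "   1     1024     2048   normal file\t\t/My Book/chapter 1.html\n",
]
urls = extract_urls(sample)
```

Splitting the line on whitespace, as the failing spline[5] lookup suggests the original did, cannot recover a path that itself contains spaces, which is why the regex approach works better for such files.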
I know little Python and even less CHM, but the fix worked like a charm for me.
Now let's see if I can't do something about removing those annoying footers in
the original CHM...
Thanks for all the time and work that's gone into chm2pdf. It has already
helped me out of a tight spot.
Original issue reported on code.google.com by [email protected]
on 26 May 2013 at 1:29
sander@athlon64:~/chm2pdf-0.0.2$ chm2pdf ~/Azureus\ Downloads/Learning\
Something/Learning\ Something.chm
Traceback (most recent call last):
File "/usr/bin/chm2pdf", line 11, in <module>
import chm.chm as chm
ImportError: No module named chm.chm
sander@athlon64:~/chm2pdf-0.0.2$
Original issue reported on code.google.com by [email protected]
on 19 Aug 2007 at 12:21
On openSUSE 10.3, I installed chm2pdf and chmlib (from source with
--enable-examples), pychm (I extracted the tarball in /) and htmldoc (from
the SUSE repository), but this appears:
# chm2pdf
Traceback (most recent call last):
File "/usr/local/bin/chm2pdf", line 24, in <module>
import chm.chm as chm
ImportError: No module named chm.chm
Thanks
Original issue reported on code.google.com by [email protected]
on 7 Dec 2007 at 8:36
What steps will reproduce the problem?
1. download source
2. unpack to disk
3. look at source script, line 146
What is the expected output? What do you see instead?
144 f=open(output_file,'w')
145 f.write(page)
146 f.close # BUG! <=========== MISSED "()"
147 #hack to guarantee that the file has been wholly written
148 f=open(output_file,'r')
149 while len(f.read()) < len(page):
150     pass
151 f.close()
INSTEAD:
146 f.close()
A method is never called without parentheses :)
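A small sketch of the difference, plus a with block as one idiomatic way to make sure the file is closed and flushed. The function name write_page and the temporary path are illustrative, not taken from chm2pdf.

```python
import os
import tempfile

def write_page(output_file, page):
    # The with block closes (and flushes) the file on exit, even on error.
    with open(output_file, "w") as f:
        f.write(page)

path = os.path.join(tempfile.mkdtemp(), "page.html")
write_page(path, "<html></html>")

f = open(path)
f.close                    # evaluates the bound method but does not call it
still_open = not f.closed  # the file object is still open at this point
f.close()                  # the parentheses perform the actual call

with open(path) as check:
    content = check.read()
```

With the file closed properly after writing, a re-read loop to confirm the data reached disk should not be needed.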
What version of the product are you using? On what operating system?
No matter.
Please provide any additional information below.
Use pylint or another tool to verify the source code.
Don't do such stupid hacks in a language as clear as Python :D
Original issue reported on code.google.com by [email protected]
on 4 Feb 2008 at 12:42
What steps will reproduce the problem?
1. chm2pdf somefile.chm
What is the expected output? What do you see instead?
Segmentation fault.
What version of the product are you using? On what operating system?
OS is Debian 6; chm2pdf installed with apt-get install chm2pdf.
Please provide any additional information below.
I think this is htmldoc's error, so why not use wkhtmltopdf?
Original issue reported on code.google.com by huangmingyou
on 25 Feb 2011 at 8:21
What steps will reproduce the problem?
1. While installing through apt-get install <package>, I am getting this error
2. root@AmSi:/home/amaresh# apt-get install chmlib
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package chmlib is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
E: Package chmlib has no installation candidate
3. root@AmSi:/home/amaresh# apt-get install pychm
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Couldn't find package pychm
4. apt-get install htmldoc
Reading package lists... Done
Building dependency tree
Reading state information... Done
htmldoc is already the newest version.
The following packages were automatically installed and are no longer required:
rwhod libdb4.5
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 10 not upgraded.
What is the expected output? What do you see instead?
E: Couldn't find package <name>
What version of the product are you using? On what operating system?
Ubuntu 8.10, Linux
Please provide any additional information below.
Original issue reported on code.google.com by amareshchandradas2005
on 7 May 2009 at 7:49
The first color attribute (bgcolor) after a link has its value removed.
E.g., this in the original:
<table>
<tr>
<td bgcolor="#00ff00">row 1, col1</td>
<td bgcolor="#00ff00">row 1, col2 <a href="P1.htm"> here a link</a></td>
<td bgcolor="#00ff00">row 1, col3</td>
</tr>
</table>
Becomes this in work:
<table>
<tr>
<td bgcolor="#00ff00">row 1, col1</td>
<td bgcolor="#00ff00">row 1, col2 <a href="temp0001.html"> here a link</a></td>
<td bgcolor="">row 1, col3</td>
</tr>
</table>
Original issue reported on code.google.com by [email protected]
on 13 Nov 2011 at 4:40
The page numbers of the created PDF restart from 1 for every section of the
CHM file.
Original issue reported on code.google.com by [email protected]
on 2 Nov 2007 at 5:54