wcember / pypub
Python library to programmatically create epub files
License: MIT License
Hi, I got this message when opening the rendered epub file:
"This page contains the following errors:
error on line 35 at column 15: Entity 'nbsp' not defined
Below is a rendering of the page up to the first error."
Any suggestions for stripping out the tag?
Thank you, great work with pypub, really.
Is there a way to set the cover of the ebook? I didn't see anything in the documentation
Hello, I like your library very much. I successfully got it working with the latest Python and Django. But I have a problem with Polish characters ąęśżźćółń. I had to replace "basestring" with "str".
e_pub_ch = create_chapter_from_string(html_string="ąęśżźćóńł", url=None, title=i.title_chapter)
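A minimal sketch of a compatibility shim for the basestring change described above, assuming the library only uses basestring for isinstance checks. The names string_types and is_text are hypothetical and are not part of pypub.

```python
# basestring exists only on Python 2; fall back to str on Python 3.
try:
    string_types = basestring  # noqa: F821 (defined on Python 2 only)
except NameError:
    string_types = str


def is_text(value):
    """Return True when value is a text string on either Python version."""
    return isinstance(value, string_types)
```

With a shim like this, a Python 3 str containing Polish characters passes the same check that basestring served on Python 2.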
pypub/epub.py always generates filename.
Related to #29
Greetings
I'm attempting to utilize pypub with my Slixfeed XMPP news bot.
def generate_epub(text, pathname):
    pathname = pathname.split("/")
    filename = pathname.pop()
    directory = "/".join(pathname)
    book = xml2epub.Epub(filename)
    chapter0 = xml2epub.create_chapter_from_string(text, strict=False)
    book.add_chapter(chapter0)
    book.create_epub(directory, epub_name=filename)
The result would yield a filename that was not originally intended.
If my intention is to have filename_001.epub, it would turn into filename001.
This results in my program failing to locate the generated file.
To solve this, I import the os module and rename the file.
def generate_epub(text, pathname):
    pathname_list = pathname.split("/")
    filename = pathname_list.pop()
    directory = "/".join(pathname_list)
    book = xml2epub.Epub(filename)
    chapter0 = xml2epub.create_chapter_from_string(text, strict=False)
    book.add_chapter(chapter0)
    filename_tmp = "slixfeedepub"
    book.create_epub(directory, epub_name=filename_tmp)
    pathname_tmp = os.path.join(directory, filename_tmp) + ".epub"
    os.rename(pathname_tmp, pathname)
Please allow passing a complete path, as it would make workflow more consistent.
Here are other functions I use for other filetypes.
def generate_html(text, filename):
    with open(filename, 'w') as file:
        file.write(text)

def generate_pdf(text, filename):
    pdfkit.from_string(text, filename)

def generate_markdown(text, filename):
    h2m = html2text.HTML2Text()
    markdown = h2m.handle(text)
    with open(filename, 'w') as file:
        file.write(markdown)
Notice that all filenames passed are utilized with no change.
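The rename workaround above could be wrapped in one helper so that callers pass a complete path, as they do for the other file types. This is only a sketch: create_epub_at and tmp_name are hypothetical names, and the only assumption about the book object is that it has the create_epub(directory, epub_name=...) method used in the snippets above and that it writes epub_name plus ".epub" into that directory.

```python
import os


def create_epub_at(book, pathname, tmp_name="slixfeedepub"):
    """Write the epub under a temporary name, then move it to pathname."""
    directory = os.path.dirname(pathname) or "."
    # Use a name with no special characters so the library does not alter it.
    book.create_epub(directory, epub_name=tmp_name)
    tmp_path = os.path.join(directory, tmp_name + ".epub")
    # os.replace overwrites an existing target on all platforms.
    os.replace(tmp_path, pathname)
    return pathname
```

os.path.dirname is used instead of splitting on "/" so the helper also works with platform-specific separators.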
Current demo works with some ereaders. Fails with KoReader. Also fails EPUBCheck v5.1.0
ERROR(PKG-006): My First Epub.epub//...../My%20First%20Epub.epub(-1,-1): Mimetype file entry is missing or is not the first file in the archive.
Validating using EPUB version 2.0.1 rules.
ERROR(RSC-005): My First Epub.epub/OEBPS/content.opf(2,61): Error while parsing file: element "package" missing required attribute "unique-identifier"
ERROR(OPF-048): My First Epub.epub/OEBPS/content.opf(2,61): Package tag is missing its required unique-identifier attribute and value.
ERROR(OPF-054): My First Epub.epub/OEBPS/content.opf(10,34): Date value "07-27-2023" is not valid as per http://www.w3.org/TR/NOTE-datetime:07-27-2023 class java.lang.IllegalArgumentException MONTH.
ERROR(RSC-005): My First Epub.epub/OEBPS/content.opf(16,69): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): My First Epub.epub/OEBPS/content.opf(22,25): Error while parsing file: value of attribute "idref" is invalid; must be an XML name without colons
ERROR(OPF-030): My First Epub.epub/OEBPS/content.opf(-1,-1): The unique-identifier "null" was not found.
ERROR(RSC-005): My First Epub.epub/OEBPS/toc.ncx(15,36): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(HTM-004): My First Epub.epub/OEBPS/toc.html(-1,-1): Irregular DOCTYPE: found "", expected "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">".
ERROR(RSC-005): My First Epub.epub/OEBPS/toc.html(2,7): Error while parsing file: elements from namespace "" are not allowed
FATAL(RSC-016): My First Epub.epub/OEBPS/toc.html(15,5): Fatal Error while parsing file: The element type "hr" must be terminated by the matching end-tag "</hr>".
ERROR(HTM-004): My First Epub.epub/OEBPS/0.xhtml(-1,-1): Irregular DOCTYPE: found "", expected "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">".
ERROR(RSC-005): My First Epub.epub/OEBPS/0.xhtml(4,9): Error while parsing file: element "head" incomplete; missing required element "title"
ERROR(RSC-005): My First Epub.epub/OEBPS/0.xhtml(6,26): Error while parsing file: element "a" not allowed here; expected element "address", "blockquote", "del", "div", "dl", "h1", "h2", "h3", "h4", "h5", "h6", "hr", "ins", "noscript", "ns:svg", "ol", "p", "pre", "script", "table" or "ul" (with xmlns:ns="http://www.w3.org/2000/svg")
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(11,31): URL "/wiki/Main_Page" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-005): My First Epub.epub/OEBPS/0.xhtml(12,90): Error while parsing file: element "img" missing required attribute "alt"
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(37,39): URL "/wiki/Main_Page" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(44,48): URL "/wiki/Wikipedia:Contents" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(51,51): URL "/wiki/Portal:Current_events" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(58,44): URL "/wiki/Special:Random" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-026): My First Epub.epub/OEBPS/0.xhtml(65,45): URL "/wiki/Wikipedia:About" leaks outside the container (it is not a valid-relative-ocf-URL-with-fragment string)
ERROR(RSC-005): My First Epub.epub/OEBPS/0.xhtml(155,43): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): My First Epub.epub/OEBPS/0.xhtml(300,17): Error while parsing file: element "ul" incomplete; missing required element "li"
FATAL(RSC-016): My First Epub.epub/OEBPS/0.xhtml(2486,43): Fatal Error while parsing file: The entity "mdash" was referenced, but not declared.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(11,31): Referenced resource "wiki/Main_Page" could not be found in the EPUB.
ERROR(RSC-008): My First Epub.epub/OEBPS/0.xhtml(12,90): Referenced resource "OEBPS/images/09d8c6ac-4a4a-47ef-ae01-0bdf1ac0acd2..png" is not declared in the OPF manifest.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(37,39): Referenced resource "wiki/Main_Page" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(44,48): Referenced resource "wiki/Wikipedia:Contents" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(51,51): Referenced resource "wiki/Portal:Current_events" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(58,44): Referenced resource "wiki/Special:Random" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(65,45): Referenced resource "wiki/Wikipedia:About" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(95,43): Referenced resource "wiki/Help:Contents" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(102,47): Referenced resource "wiki/Help:Introduction" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(109,56): Referenced resource "wiki/Wikipedia:Community_portal" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(116,51): Referenced resource "wiki/Special:RecentChanges" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(123,58): Referenced resource "wiki/Wikipedia:File_upload_wizard" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(155,43): Referenced resource "wiki/Special:Search" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(177,78): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(184,74): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(204,80): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(213,76): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(227,43): Referenced resource "wiki/Help:Introduction" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(236,51): Referenced resource "wiki/Special:MyContributions" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(243,42): Referenced resource "wiki/Special:MyTalk" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(277,34): Referenced resource "wiki/EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(284,39): Referenced resource "wiki/Talk:EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(325,38): Referenced resource "wiki/EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(332,67): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(339,70): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(355,60): Referenced resource "wiki/Special:WhatLinksHere/EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(362,66): Referenced resource "wiki/Special:RecentChangesLinked/EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(369,62): Referenced resource "wiki/Wikipedia:File_Upload_Wizard" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(376,54): Referenced resource "wiki/Special:SpecialPages" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(383,72): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(390,67): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(397,130): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(420,114): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(427,69): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(460,34): Referenced resource "wiki/EPUB" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(467,63): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(474,66): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(509,45): Referenced resource "wiki/Electronic_article" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(519,33): Referenced resource "wiki/E-book" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(522,38): Referenced resource "wiki/File_format" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(526,41): Referenced resource "wiki/File_extension" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(538,35): Referenced resource "wiki/E-reader" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(542,45): Referenced resource "wiki/Technical_standard" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(546,65): Referenced resource "wiki/International_Digital_Publishing_Forum" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(550,37): Referenced resource "wiki/Open_eBook" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(555,47): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(562,52): Referenced resource "wiki/Book_Industry_Study_Group" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(567,34): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(572,32): Referenced resource "wiki/XHTML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(576,30): Referenced resource "wiki/XML" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(581,34): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(594,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(604,37): Referenced resource "wiki/Open_eBook" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(609,34): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(615,50): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(623,34): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(629,53): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(634,33): Referenced resource "wiki/MathML" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(639,34): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(645,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(650,31): Referenced resource "wiki/WOFF" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(654,31): Referenced resource "wiki/SFNT" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(659,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(664,31): Referenced resource "wiki/HTML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(668,30): Referenced resource "wiki/CSS" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(673,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(680,65): Referenced resource "wiki/International_Digital_Publishing_Forum" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(684,52): Referenced resource "wiki/World_Wide_Web_Consortium" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(689,35): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(695,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(708,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(719,50): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(732,50): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(744,50): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(754,45): Referenced resource "wiki/Zip_(file_format)" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(759,50): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(767,32): Referenced resource "wiki/XHTML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(771,33): Referenced resource "wiki/DTBook" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(775,53): Referenced resource "wiki/DAISY_Digital_Talking_Book" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(779,30): Referenced resource "wiki/CSS" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(783,30): Referenced resource "wiki/XML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(787,44): Referenced resource "wiki/Table_of_contents" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(791,35): Referenced resource "wiki/Metadata" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(795,44): Referenced resource "wiki/Zip_(file_format)" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(808,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(818,46): Referenced resource "wiki/Internet_media_type" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(824,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(829,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(843,46): Referenced resource "wiki/Internet_media_type" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(849,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(854,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(861,52): Referenced resource "wiki/Portable_Network_Graphics" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(865,31): Referenced resource "wiki/JPEG" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(869,30): Referenced resource "wiki/GIF" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(873,51): Referenced resource "wiki/Scalable_Vector_Graphics" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(877,46): Referenced resource "wiki/Internet_media_type" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(882,49): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(893,34): Referenced resource "wiki/Unicode" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(897,32): Referenced resource "wiki/UTF-8" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(901,33): Referenced resource "wiki/UTF-16" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(906,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(912,49): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1091,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1102,49): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1113,35): Referenced resource "wiki/Metadata" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1126,49): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1139,44): Referenced resource "wiki/IETF_language_tag" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1147,44): Referenced resource "wiki/IETF_language_tag" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1152,31): Referenced resource "wiki/ISBN" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1156,30): Referenced resource "wiki/URL" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1165,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1170,35): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1186,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1201,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1216,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1221,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1692,44): Referenced resource "wiki/Table_of_contents" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1696,53): Referenced resource "wiki/DAISY_Digital_Talking_Book" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(1700,86): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1719,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1724,44): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(1744,44): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2053,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2064,44): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2076,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2087,49): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2178,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2189,35): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2197,50): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2235,54): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2242,34): Referenced resource "wiki/MathML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2246,34): Referenced resource "wiki/Bitmap" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2250,52): Referenced resource "wiki/Scalable_Vector_Graphics" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2258,49): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2264,61): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2273,42): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2280,63): Referenced resource "wiki/International_Standards_Organization" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2284,68): Referenced resource "wiki/International_Electrotechnical_Commission" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2289,48): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2296,63): Referenced resource "wiki/International_Standards_Organization" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2300,68): Referenced resource "wiki/International_Electrotechnical_Commission" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2305,53): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2318,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2329,35): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2335,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2340,30): Referenced resource "wiki/CSS" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2345,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2358,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2369,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2374,31): Referenced resource "wiki/WebP" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2378,46): Referenced resource "wiki/Opus_(audio_format)" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2383,35): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2396,75): Referenced resource "w/index.php" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2409,47): Referenced resource "wiki/Reflowable_document" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2417,36): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2423,56): Fragment identifier is not defined.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2430,32): Referenced resource "wiki/HTML" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2434,43): Referenced resource "wiki/Raster_graphics" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2438,43): Referenced resource "wiki/Vector_graphics" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2442,36): Referenced resource "wiki/Metadata" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2446,31): Referenced resource "wiki/CSS" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2461,37): Referenced resource "wiki/Font_size" could not be found in the EPUB.
ERROR(RSC-007): My First Epub.epub/OEBPS/0.xhtml(2468,34): Referenced resource "wiki/MathML" could not be found in the EPUB.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2472,36): Fragment identifier is not defined.
ERROR(RSC-012): My First Epub.epub/OEBPS/0.xhtml(2480,36): Fragment identifier is not defined.
Check finished with errors
Messages: 2 fatals / 187 errors / 0 warnings / 0 infos
EPUBCheck completed
NOTE this is with fix applied from #16
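Two of the errors in the report above (PKG-006 and OPF-054) have mechanical fixes. EPUB requires the mimetype entry to be the first file in the archive and stored uncompressed, and the date must follow the W3C date format (YYYY-MM-DD). The sketch below shows both using only the standard library. build_epub_archive and opf_date are hypothetical helpers, not pypub functions, and the remaining errors in the log are not addressed here.

```python
import datetime
import io
import zipfile


def build_epub_archive(files):
    """Return EPUB bytes with the mimetype entry first and uncompressed.

    files maps archive names (e.g. "OEBPS/content.opf") to their contents.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            archive.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def opf_date(day):
    """Format a date as YYYY-MM-DD instead of MM-DD-YYYY."""
    return day.isoformat()
```

For the date in the log, opf_date(datetime.date(2023, 7, 27)) gives "2023-07-27" rather than "07-27-2023".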
During chapter loads, xmlprettify is called to format the output nicely, and in doing so it strips the text and tail attributes from elements. Unfortunately this can have the unintended consequence of producing mangled epubs from reasonable HTML.
For example, this HTML:
<div>1234 <i>5</i> 6789</div>
should produce output exactly like the input,
<div>1234 <i>5</i> 6789</div>
but actually looks like:
<div>1234
<i>5</i>6789
</div>
Which will be rendered differently since there's no space after the 5.
Removing the xmlprettify call from Chapter._render makes the output correct again.
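The effect can be reproduced with the standard library alone. The sketch below is not pypub's xmlprettify; strip_text_and_tail is a hypothetical stand-in that strips text and tail the way the report describes, to show how the significant spaces around the inline element disappear.

```python
import xml.etree.ElementTree as ET


def strip_text_and_tail(root):
    """Strip whitespace from every element's text and tail, in place."""
    for node in root.iter():
        if node.text:
            node.text = node.text.strip()
        if node.tail:
            node.tail = node.tail.strip()


source = "<div>1234 <i>5</i> 6789</div>"
root = ET.fromstring(source)
strip_text_and_tail(root)
# The spaces before <i> and after </i> lived in text and tail, so they are gone.
mangled = ET.tostring(root, encoding="unicode")
```

In mixed content the whitespace next to an inline element is part of the text, so a prettifier has to leave text and tail untouched whenever an element contains both text and child elements.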
chapter.py:
def get_image_type(url):
    for ending in ['jpg', 'jpeg', '.gif' '.png']:
        if url.endswith(ending):
            return ending
    else:
        try:
            f, temp_file_name = tempfile.mkstemp()
            urllib.urlretrieve(url, temp_file_name)
            image_type = imghdr.what(temp_file_name)
            return image_type
        except IOError:
            return None
This single method has 3 bugs:
1. It should call url = url.lower() first. The extension is sometimes uppercase, which causes a redundant HTTP request to detect the image type.
2. '.gif' '.png'] is missing a comma, so the two literals are concatenated into '.gif.png' and neither .gif nor .png is ever matched. '.bmp' is also missing, which imghdr will not recognize.
3. The check should be if url.endswith(ending) or ((ending + '?') in url):, or else it misses images with ?parameters. One such URL is itself an HTML page containing an inner img src; imghdr will not recognize it, but ePUB editors and web browsers are able to render it.
The second place is constants.py: it seems that neither the code nor the pre tag is included. This causes the sample code in https://security.googleblog.com/2009/03/reducing-xss-by-way-of-automatic.html to be dropped, but sample code is important. <style> also needs to be supported, or else the caller can't control the padding between images, e.g. 'style': ['display', 'padding', 'max-height', 'max-width'],
The third place is chapter.py, which should support setting a timeout. Otherwise it waits forever, when it should get a chance to skip to the next chapter:
$ grep -n requests\.g pypub/chapter.py
70: requests_object = requests.get(image_url, headers=request_headers)
241: request_object = requests.get(url, headers=self.request_headers, allow_redirects=False)
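The three extension-matching bugs listed above could be addressed along these lines. This is a sketch rather than a patch for pypub: guess_image_type_from_url and IMAGE_ENDINGS are hypothetical names, the function covers only the URL check (not the download fallback), and it uses Python 3's urllib.parse to drop ?parameters instead of the substring test suggested in the report.

```python
from urllib.parse import urlparse

# Every ending carries its dot and the tuple has no missing commas.
IMAGE_ENDINGS = (".jpg", ".jpeg", ".gif", ".png", ".bmp")


def guess_image_type_from_url(url):
    """Return the image type implied by the URL path, or None if unknown."""
    # Lowercase the path and ignore the query string and fragment.
    path = urlparse(url).path.lower()
    for ending in IMAGE_ENDINGS:
        if path.endswith(ending):
            return ending.lstrip(".")
    return None
```

For the timeout issue, requests.get accepts a timeout argument (in seconds), so the two calls found by grep could pass one and let the caller catch the resulting exception to skip the chapter.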
Any support for python3 on the plans? =]
Looks like there are a number of forks (or extremely similar code base) based on pypub:
<pre> tags?)
The regular linked forks:
Interesting forks in most recent order:
The current setup.py / requirements.txt fail to install dependencies.
As of 2023-07-27 this project is Python 2.x only. Current dependencies will attempt to install latest versions; all of which will fail to install with Python 2.
(py27venv) C:\code\py\pypub\pypub>python pypub/unit_tests_image.py
..E.
======================================================================
ERROR: test_save_image (__main__.ChapterTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pypub/unit_tests_image.py", line 24, in test_save_image
    'test image ' + str(index)),
  File "C:\code\py\pypub\pypub\pypub\chapter.py", line 82, in save_image
    raise ImageErrorException(image_url)
ImageErrorException: Error downloading image from http://bothsides.wpengine.netdna-cdn.com/wp-content/uploads/2015/11/bothsides1.jpg
----------------------------------------------------------------------
Ran 4 tests in 1.347s
FAILED (errors=1)
http://bothsides.wpengine.netdna-cdn.com/wp-content/uploads/2015/11/bothsides1.jpg is (as of 2023-07-29) no longer a working URL.
Archive.org appears to have a copy at https://web.archive.org/web/20190110010653/http://bothsides.wpengine.netdna-cdn.com/wp-content/uploads/2015/11/bothsides1.jpg (but this URL cannot be used; it looks like https://web.archive.org/web/20190110010653if_/http://bothsides.wpengine.netdna-cdn.com/wp-content/uploads/2015/11/bothsides1.jpg could be used?)
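In some XML files the title is "封面" (cover). It would be nice if such a page could be added automatically as the cover of the epub.
For example, an XML file like this: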
<?xml version="1.0" encoding="utf-8" standalone="no"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>封面</title><link href="http://xxx.css" rel="stylesheet" type="text/css" /></head><body><div class="center"><img alt="" class="fullscreen" src="https://xxx/xxx.jpg" href="./image/Images/cover.jpg" /></div></body></html>
python 3.9
html2epub 1.2
Get fatal errors. Should map &mdash to — (u'\u2014').
Found in issue #24
This is enough to cause truncation and unexpected behaviors in Adobe Digital Editions Version 4.5.11.187303
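One way to apply the mapping requested above is to replace HTML named entities with the characters they stand for before the chapter is written out as XHTML. The sketch below uses the standard library's entity table. replace_named_entities is a hypothetical helper, not part of pypub or html2epub, and it deliberately leaves the five entities that XML itself predefines untouched so the markup stays well-formed.

```python
import re
from html.entities import name2codepoint

# These are predefined by XML and must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_named_entities(markup):
    """Replace HTML-only named entities (e.g. &mdash;) with literal characters."""
    def _substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML entities and unknown names alone
        return chr(name2codepoint[name])
    return _ENTITY_RE.sub(_substitute, markup)
```

The same replacement also covers the 'nbsp' error, since &nbsp; becomes the literal U+00A0 character.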
Hi, I upgraded to the new version via pip, but just like in the previous one I get a similar error in the rendered epub.
This page contains the following errors:
error on line 277 at column 15: Entity 'mdash' not defined
Below is a rendering of the page up to the first error.
Could it be a problem with Beautiful Soup, unicode and entities?
thank you
Hello
Thanks for this nice project. I wonder if we could make the package python3 compatible? I know that the fork at grandemk/pypub has a working python3 version. Since this repo is the source for the pip packages, I think it would be appreciated if we can update this repo or merge it with grandemk/pypub.
The link to the documentation in the readme is broken.