jesselau76 / ebook-gpt-translator

Enjoy reading with your favorite style.

Home Page: https://jesselau.com

License: MIT License

Python 100.00%

epub pdf python translation translator docx mobi

ebook-gpt-translator's Issues

ImportError: cannot import name 'HOCRConverter' from 'pdfminer.converter' (D:\Python310\lib\site-packages\pdfminer\converter.py)

pdfminer.converter.HOCRConverter has been removed in Python 3.10.

openai.error.APIError

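When openai.error.APIError: HTTP code 502 from API occurs and the book has already been translated up to chapter six, how should I continue so that the translation does not have to start again from the beginning?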
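One way to avoid restarting after a failure like this is to save each translated chunk to disk as soon as it is finished and to skip the saved chunks on the next run. The sketch below is an illustration only, not the project's own code: translate_with_checkpoint, the translate callback and the checkpoint file are hypothetical names, and it assumes the book is split into the same chunks on every run.

```python
import json
from pathlib import Path


def translate_with_checkpoint(chunks, translate, checkpoint_path):
    """Translate chunks in order, saving progress after every chunk.

    `translate` is a hypothetical callback (chunk text in, translated text out).
    `checkpoint_path` is a hypothetical JSON file holding finished translations.
    """
    path = Path(checkpoint_path)
    # Load translations finished by an earlier run, if there are any.
    done = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    # Resume at the first chunk that has no saved translation yet.
    for chunk in chunks[len(done):]:
        done.append(translate(chunk))
        # Write after every chunk so that a crash loses at most one chunk.
        path.write_text(json.dumps(done, ensure_ascii=False), encoding="utf-8")
    return done
```

If the API call raises partway through, running the same function again with the same checkpoint file continues from the first unsaved chunk.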

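Solved: cfg error message: "is not recognized as an internal or external command, operable program or batch file."

As shown in the screenshot, how can I fix this?

Runs normally but nothing appears

I installed everything normally and configured setting.cfg, then ran text_translation.py in the corresponding directory. After that nothing happened: no file was generated and no error was reported. Can anyone tell me what is going on?

Running python3 text_translation.py --test xxxxx.pdf gives the following error. The proxy is working, so what is the problem?

Discussion of prompts for translation style

To the author: the translation screenshot in your introduction shows English translated into Classical Chinese. What prompt did you use? The prompt in the source code only asks the model to act as GPT-4 and translate, so there must be something more to it?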

[Feature Request] Add "Start Page" and "End Page" for the translation

This would give better control over the translation range and save tokens.
For example, the first 1-3 pages of a book can usually be skipped.

Error when importing ChatCompletion.

File "text_translation.py", line 229, in translate_text
completion = openai.ChatCompletion.create(
AttributeError: module 'openai' has no attribute 'ChatCompletion'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "text_translation.py", line 363, in <module>
translated_short_text = translate_and_store(short_text)
File "text_translation.py", line 281, in translate_and_store
translated_text = translate_text(text)
File "text_translation.py", line 253, in translate_text
completion = openai.ChatCompletion.create(
AttributeError: module 'openai' has no attribute 'ChatCompletion'

How can I fix this?

mobi issue

How should I handle this?

Could you upload a requirements file?

openai.error.APIConnectionError: Error communicating with OpenAI: HTTPSConnectionPool(host='api.openai.com', port=443): Max retries exceeded with url: //v1/chat/completions (Caused by ProxyError('Unable to connect to proxy', SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1129)'))))

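Is there something wrong with my own API setup?

Progress bar logic in pdf-2-text is wrong

The progress bar logic here is faulty: the file is parsed 10 times over.

Option for a larger split length

GPT-3.5 now has a 16k model. Since individual users' API access is limited to 3 requests per minute, a larger split length would speed up translation.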

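Does this code need to be changed?

Line 300 of text_translation.py:
def return_text(text): text = text.replace(".", ".\n")

Should text.replace(".", ".\n") be changed to text.replace(". ", ".\n")?
A . is not necessarily an English full stop; it can also appear in numbers (such as 3.14) or in code (such as text.replace).
Only a . followed by a space corresponds reliably to an English full stop.

I am not sure whether my reasoning is correct and would appreciate a reply.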
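The difference between the two replacements can be checked directly. The snippet below is a small illustration with made-up function names and a made-up sample sentence; it only compares the two str.replace calls quoted in this issue and is not the project's actual return_text.

```python
def return_text_naive(text: str) -> str:
    # Current behaviour quoted in the issue: break after every period.
    return text.replace(".", ".\n")


def return_text_spaced(text: str) -> str:
    # Proposed behaviour: break only when the period is followed by a space.
    return text.replace(". ", ".\n")


sample = "Pi is about 3.14. Call text.replace here. Done."
print(repr(return_text_naive(sample)))
print(repr(return_text_spaced(sample)))
```

With the proposed version, 3.14 and text.replace stay intact. It would still break after abbreviations such as "e.g. ", so it is a heuristic rather than a full sentence splitter.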

Keeps throwing errors

Traceback (most recent call last):
File "/Users/cellier/ebook-GPT-translator/text_translation.py", line 121, in <module>
config_text = f.read()
UnicodeDecodeError: 'gb2312' codec can't decode byte 0x81 in position 167: illegal multibyte sequence
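A UnicodeDecodeError like the one above means the file's bytes do not match the codec used to read it. One possible workaround, sketched here under assumptions, is to read the raw bytes and try a short list of encodings in order. The helper name and the encoding list are hypothetical and not taken from the project; gb18030 is tried as a superset of gb2312.

```python
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "gb18030")):
    """Decode a text file by trying several encodings in order (hypothetical helper)."""
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This codec does not fit the bytes; try the next candidate.
            continue
    # Last resort: keep going, replacing undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

Saving the configuration file as UTF-8 and opening it with an explicit encoding="utf-8" would avoid the guessing altogether.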

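Why does pdfminer throw so many errors?

I have definitely installed the pdfminer library, but I still get No module named 'pdfminer.high_level', or else the import fails.

No content after translating a txt file to epub

Hello, I used this program to translate a txt file whose content is entirely in English. The final result looks like this:

There is no content here. I will try to debug it myself to see what exactly is wrong. The text is indeed long, more than 30,000 characters.

Running this project in Google Colab

Notebook link: https://drive.google.com/file/d/1-5dt9l8Cswx8P6ZvKK7wjF2iM4tgAG_3/view?usp=sharing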

Missing deps

ModuleNotFoundError: No module named 'pdfminer' so I run pip install pdfminer
Then ModuleNotFoundError: No module named 'pdfminer.high_level'
Have you tested it on a new machine which doesn't have any python modules?

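I want to translate novels, but the characters' personalities and names are not translated well. Could I enter character profiles in advance, before translating?

Could you build an installer package for the program?

Thanks a lot.

Why can't the command path be found?

I am a beginner. Why is the command not found?

How to batch-produce bilingual PDFs

Each run currently needs its own command. If I want to convert files in a batch, in order, how should the command be changed?

Error: too many words translated at once

Error message:
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 29968 tokens. Please reduce the length of the messages.

I counted: the failing request covered lines 1952 to 7285 of the text, 5333 lines in total.
I looked at the code, which says the length is limited to 1024 each time, but the handling of short text still has a problem: the length is not actually limited at all. Why are short texts grouped together for translation instead of iterating over each line?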
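The grouping described above can be kept while still enforcing a hard cap per request. The sketch below is a hypothetical illustration, not the project's code: it packs lines into chunks that never exceed max_chars, using character count as a rough stand-in for tokens (the API limit itself is in tokens, so a real cap would need a tokenizer).

```python
def split_into_chunks(lines, max_chars=1024):
    """Group lines into newline-joined chunks of at most max_chars characters."""
    chunks, current, size = [], [], 0
    for line in lines:
        # A single line longer than the cap is cut into cap-sized pieces.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)] or [""]
        for piece in pieces:
            # Count one extra character for the newline that joins lines.
            added = len(piece) + (1 if current else 0)
            if current and size + added > max_chars:
                # The piece does not fit: close the current chunk and start a new one.
                chunks.append("\n".join(current))
                current, size = [], 0
                added = len(piece)
            current.append(piece)
            size += added
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Short lines are still sent together, which keeps the number of requests low, but no chunk can grow past the cap.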

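Could support for the Azure OpenAI API be added?

Topping up an OpenAI API account is too difficult. Could there be an option to use the Azure OpenAI API instead?

Translating an epub file breaks its built-in styling.

I tried translating a book of about 400,000 characters and found that all of the layout styling was lost, such as the table of contents and the page breaks between chapters.

How do I use the GPT-4 model?

My API account has passed the GPT-4 whitelist. Where can I change the default gpt-3.5 model to gpt-4?

Invalid cross-reference (XRef) table encountered when processing a PDF file

Parsing this optimized PDF raises an error, although DeepL can process it without problems.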
https://assets.ctfassets.net/95kuvdv8zn1v/44FqPJmYPZRwiZN2socdOK/14f5eb025d87a452100d80f513567f2a/Cruise_Impact_Report_-_2022-optimized.pdf

Converting PDF to text:   0% 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 722, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 1000, in read_xref_from
    xref.load(parser)
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 282, in load
    raise PDFNoValidXRef("Invalid PDF stream spec.")
pdfminer.pdfdocument.PDFNoValidXRef: Invalid PDF stream spec.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 347, in <module>
    text = convert_pdf_to_text(filename,startpage,endpage)
  File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 221, in convert_pdf_to_text
    end_page = get_total_pages(pdf_filename)
  File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 217, in get_total_pages
    document = PDFDocument(parser)
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 727, in __init__
    newxref.load(parser)
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 241, in load
    (_, obj) = parser.nextobject()
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 609, in nextobject
    (pos, token) = self.nexttoken()
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 526, in nexttoken
    self.fillbuf()
  File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 239, in fillbuf
    raise PSEOF("Unexpected EOF")
pdfminer.psparser.PSEOF: Unexpected EOF

Question about this error: ImportError: cannot import name 'HOCRConverter' from 'pdfminer.converter' (/usr/local/lib/python3.9/dist-packages/pdfminer/converter.py)

Oracle Cloud AMD machine, Debian 11.
After running, I get:
ImportError: cannot import name 'HOCRConverter' from 'pdfminer.converter' (/usr/local/lib/python3.9/dist-packages/pdfminer/converter.py)
What is the problem?
Also, to run it locally on Windows, I have to use a proxy node that can reach OpenAI, right?

The author is brilliant and even practices Taoism. Impressive, much respect.

No module named 'chardet': should chardet be added to requirements.txt?

Python version 3.10.1
python -m pip install -r requirements.txt
The first run failed with this error:

Traceback (most recent call last):
  File "C:\Users\ebook-GPT-translator\text_translation.py", line 114, in <module>
    import chardet
ModuleNotFoundError: No module named 'chardet'

So I then ran python -m pip install chardet and it worked. It looks like requirements.txt needs to be updated.

There are still some warnings at run time, though:

C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\__init__.py:102: RequestsDependencyWarning: urllib3 (1.26.8) or chardet (5.1.0)/charset_normalizer (2.0.10) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported "

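It does not seem to affect usage.

Error when using PyCharm: Unsupported file type

This is what the console currently shows.
All of the packages are installed as well.
The API key is from the official OpenAI site, and the proxy address is one I found at random. I do not know where it went wrong.

Token length issue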

Traceback (most recent call last):
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 515, in <module>
translated_short_text = translate_and_store(short_text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 355, in translate_and_store
translated_text = translate_text(text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 335, in translate_text
completion = create_chat_completion(prompt, text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 154, in create_chat_completion
return openai.ChatCompletion.create(
File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 298, in request
resp, got_stream = self._interpret_response(result, stream)
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 700, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 763, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4291 tokens. Please reduce the length of the messages.

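Can the parameters be made configurable?

If I want it to translate several txt files in one go, how should I set that up? For example, continuing with 2.txt after 1.txt is finished.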
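Until batch support exists in the tool itself, a small wrapper script could run the translator once per file, in order. This is only a sketch under assumptions: it assumes text_translation.py accepts the input file as a single positional argument, which this thread does not confirm, and build_commands and run_all are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(folder, pattern="*.txt", script="text_translation.py"):
    """Return one command per matching file, ordered 1.txt, 2.txt, ..., 10.txt."""
    # Sorting by (length, name) puts 2.txt before 10.txt for purely numeric names.
    files = sorted(Path(folder).glob(pattern), key=lambda p: (len(p.stem), p.stem))
    # Assumption: the script takes the input file as its only positional argument.
    return [[sys.executable, script, str(f)] for f in files]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_all(build_commands("books")) would then translate every txt file in a folder named books, one after another.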

Optimization for Connection aborted

For long documents, such as a 600-page PDF, every run hits:

openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

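For Connection aborted, could a parameter be added that retries automatically after a disconnect and sets the retry's startpage to the page where the previous run broke off?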
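The automatic retry requested here could look roughly like the following. It is a generic sketch, not the project's implementation: call_with_retry and its parameters are hypothetical, and the built-in ConnectionError is only a placeholder default. With the openai version shown in the traceback, openai.error.APIConnectionError would be passed in exceptions instead.

```python
import time


def call_with_retry(func, *args, retries=3, base_delay=1.0,
                    exceptions=(ConnectionError,), sleep=time.sleep):
    """Call func(*args), retrying on the given exceptions with a growing delay."""
    for attempt in range(1, retries + 1):
        try:
            return func(*args)
        except exceptions:
            if attempt == retries:
                # Out of attempts: let the caller see the original error.
                raise
            # Wait a little longer after each failed attempt.
            sleep(base_delay * attempt)
```

Because the retry wraps a single request, the run continues from the chunk that failed instead of restarting the document. Resuming after a full crash would additionally need the last finished page to be stored on disk.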

Fails with invalid URL

The run fails with the following error. Is this a configuration issue?

Invalid URL (POST /v1/chat/completions) will sleep  60 seconds
  0%|          | 0/3 [01:01<?, ?it/s]
Traceback (most recent call last):
  File "/Users/xxxx/workspace/ebook-GPT-translator/text_translation.py", line 229, in translate_text
    completion = openai.ChatCompletion.create(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: Invalid URL (POST /v1/chat/completions)

Thanks a lot
