jesselau76 / ebook-gpt-translator
Enjoy reading with your favorite style.
Home Page: https://jesselau.com
License: MIT License
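pdfminer.converter.HOCRConverter has been removed in Python 3.10
When openai.error.APIError: HTTP code 502 from API occurs and the book has already been translated up to chapter six, how should I proceed so that the translation does not start over from the beginning?
I installed everything normally and configured setting.cfg, then ran text_translation.py in the corresponding directory, and nothing happened: no file was generated and no error was reported. Does anyone know what is going on?
I want to use my own API endpoint.
To the author: the translation screenshot in your introduction shows English translated into Classical Chinese. What prompt did you use? The prompt in the source code only asks the model to act as GPT-4 and translate, so there must be something more?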
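Allow setting a page range, to better control what gets translated and save tokens.
For example, the first 1-3 pages of a book are usually fine to skip.
Add an API_URL proxy setting so that the OpenAI API can be used directly from within mainland China.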
File "text_translation.py", line 229, in translate_text
completion = openai.ChatCompletion.create(
AttributeError: module 'openai' has no attribute 'ChatCompletion'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "text_translation.py", line 363, in <module>
translated_short_text = translate_and_store(short_text)
File "text_translation.py", line 281, in translate_and_store
translated_text = translate_text(text)
File "text_translation.py", line 253, in translate_text
completion = openai.ChatCompletion.create(
AttributeError: module 'openai' has no attribute 'ChatCompletion'
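How can I fix this?
Is something wrong with my own API key?
GPT-3.5 now has a 16k model. Since individual users' API access is limited to 3 requests per minute, a larger split length would speed up translation.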
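Line 300 of text_translation.py:
def return_text(text): text = text.replace(".", ".\n")
Should text.replace(".", ".\n") be changed to text.replace(". ", ".\n")?
A . does not necessarily mark the end of an English sentence; it can also appear in numbers (such as 3.14) or in code (such as text.replace).
Only a . followed by a space corresponds reliably to an English sentence-ending period.
I am not sure whether my reasoning is correct and would appreciate an answer.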
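The suggestion above can be sketched with a regular expression that treats punctuation as a sentence end only when whitespace follows it. This is an illustration, not the project's code: split_sentences is a hypothetical name, and the rule still splits after abbreviations such as "e.g. ".

```python
import re

# Hypothetical helper, not part of text_translation.py.
# A ".", "!" or "?" counts as a sentence end only when whitespace follows it,
# so "3.14" and "text.replace" are left untouched.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> str:
    """Return the text with each sentence on its own line."""
    return _SENTENCE_END.sub("\n", text.strip())


if __name__ == "__main__":
    print(split_sentences("Pi is about 3.14. Call text.replace to edit. Done."))
```

Unlike text.replace(". ", ".\n"), the pattern also handles a period followed by a tab or several spaces.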
Traceback (most recent call last):
File "/Users/cellier/ebook-GPT-translator/text_translation.py", line 121, in <module>
config_text = f.read()
UnicodeDecodeError: 'gb2312' codec can't decode byte 0x81 in position 167: illegal multibyte sequence
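The UnicodeDecodeError above means the config file was decoded with an encoding that does not match its bytes. One possible workaround, sketched here under assumptions, is to try a short list of encodings in order instead of trusting a single guess. The function name and the encoding list are hypothetical; gb18030 is used because it is a superset of gb2312.

```python
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8-sig", "gb18030")):
    """Decode a file by trying each encoding in order (hypothetical helper)."""
    raw = Path(path).read_bytes()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going and mark undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")
```

Saving the config file as UTF-8 and opening it with an explicit encoding="utf-8" would avoid the guesswork entirely.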
I have clearly installed the pdfminer library, but I still get No module named 'pdfminer.high_level', or the import simply fails.
ModuleNotFoundError: No module named 'pdfminer'
So I ran pip install pdfminer
Then ModuleNotFoundError: No module named 'pdfminer.high_level'
Have you tested it on a new machine which doesn't have any python modules?
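Many thanks.
Right now I have to change the command for every run. If I want to convert several files in sequence as a batch, how should the command be changed?
Topping up an OpenAI API account is too difficult. Could you add an option to use the Azure OpenAI API?
I tried translating a book of about 400,000 characters and found that its layout was lost, including the table of contents and the page breaks between chapters.
My API key has been approved on the GPT-4 whitelist. Where can I change the default gpt-3.5 model to gpt-4?
Parsing this optimized PDF fails with an error, although DeepL processes it without problems.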
https://assets.ctfassets.net/95kuvdv8zn1v/44FqPJmYPZRwiZN2socdOK/14f5eb025d87a452100d80f513567f2a/Cruise_Impact_Report_-_2022-optimized.pdf
Converting PDF to text: 0% 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 722, in __init__
self.read_xref_from(parser, pos, self.xrefs)
File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 1000, in read_xref_from
xref.load(parser)
File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 282, in load
raise PDFNoValidXRef("Invalid PDF stream spec.")
pdfminer.pdfdocument.PDFNoValidXRef: Invalid PDF stream spec.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 347, in <module>
text = convert_pdf_to_text(filename,startpage,endpage)
File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 221, in convert_pdf_to_text
end_page = get_total_pages(pdf_filename)
File "/content/drive/MyDrive/ebook-GPT-translator/text_translation.py", line 217, in get_total_pages
document = PDFDocument(parser)
File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 727, in __init__
newxref.load(parser)
File "/usr/local/lib/python3.9/dist-packages/pdfminer/pdfdocument.py", line 241, in load
(_, obj) = parser.nextobject()
File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 609, in nextobject
(pos, token) = self.nexttoken()
File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 526, in nexttoken
self.fillbuf()
File "/usr/local/lib/python3.9/dist-packages/pdfminer/psparser.py", line 239, in fillbuf
raise PSEOF("Unexpected EOF")
pdfminer.psparser.PSEOF: Unexpected EOF
Oracle AMD machine, Debian 11.
After running, I get:
ImportError: cannot import name 'HOCRConverter' from 'pdfminer.converter' (/usr/local/lib/python3.9/dist-packages/pdfminer/converter.py)
What is the problem?
Also, when running locally on Windows, I need a proxy node that can reach OpenAI, right?
Python version 3.10.1
python pip install -r requirements.txt
The first run failed with this error:
Traceback (most recent call last):
File "C:\Users\ebook-GPT-translator\text_translation.py", line 114, in <module>
import chardet
ModuleNotFoundError: No module named 'chardet'
So I installed it with python -m pip install chardet, and that fixed it. It looks like requirements.txt needs to be updated.
There are still some warnings at runtime, though:
C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\__init__.py:102: RequestsDependencyWarning: urllib3 (1.26.8) or chardet (5.1.0)/charset_normalizer (2.0.10) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported "
They do not seem to affect usage.
Traceback (most recent call last):
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 515, in <module>
translated_short_text = translate_and_store(short_text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 355, in translate_and_store
translated_text = translate_text(text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 335, in translate_text
completion = create_chat_completion(prompt, text)
File "/content/ebook-GPT-translator/ebook-GPT-translator/pdf-epub-GPT-translator/text_translation.py", line 154, in create_chat_completion
return openai.ChatCompletion.create(
File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 298, in request
resp, got_stream = self._interpret_response(result, stream)
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 700, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.10/dist-packages/openai/api_requestor.py", line 763, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4291 tokens. Please reduce the length of the messages.
Can this parameter be made configurable?
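The InvalidRequestError above is raised when a single request exceeds the model's context window. A configurable split size could look roughly like the sketch below, which packs whole lines into chunks under a character budget. The function name and max_chars are hypothetical, and characters are only a rough proxy for tokens; an exact limit would require counting tokens with a tokenizer.

```python
def split_into_chunks(text: str, max_chars: int) -> list[str]:
    """Greedily pack lines into chunks of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the budget is cut into hard slices.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)]
        for piece in pieces:
            if current and len(current) + len(piece) > max_chars:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the input exactly, so no text is lost by splitting. The budget has to leave room for the prompt and for the translated reply, since both count toward the same context limit.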
With long documents, such as a 600-page PDF, every run hits
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
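For Connection aborted, could you add an option that retries automatically after a disconnect and sets the retry's startpage to the page where the previous run broke off?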
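The request above combines two ideas: retry a failed call with a growing delay, and record progress so a later run resumes at the page that failed. The sketch below is one way to do that and is not taken from the project. All names and the JSON checkpoint format are hypothetical, and the built-in ConnectionError stands in for openai.error.APIConnectionError, which a caller would pass through retry_on.

```python
import json
import time
from pathlib import Path


def call_with_retry(func, *args, retries=5, base_delay=1.0,
                    retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func, retrying on the given exceptions with exponential backoff."""
    for attempt in range(retries):
        try:
            return func(*args)
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)


def load_checkpoint(path):
    """Return the next page to translate (1-indexed); 1 if nothing is saved."""
    p = Path(path)
    if not p.exists():
        return 1
    return json.loads(p.read_text(encoding="utf-8"))["next_page"]


def save_checkpoint(path, next_page):
    Path(path).write_text(json.dumps({"next_page": next_page}), encoding="utf-8")


def translate_pages(pages, translate, checkpoint_path, **retry_kwargs):
    """Translate pages, resuming from the page stored in the checkpoint."""
    start = load_checkpoint(checkpoint_path)
    results = []
    for number in range(start, len(pages) + 1):
        results.append(call_with_retry(translate, pages[number - 1], **retry_kwargs))
        save_checkpoint(checkpoint_path, number + 1)
    return results
```

The returned list holds only the pages translated in the current run, so a real script would also need to append each result to the output file as it goes.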
It fails with the following error. Is this a configuration issue?
Invalid URL (POST /v1/chat/completions) will sleep 60 seconds
0%| | 0/3 [01:01<?, ?it/s]
Traceback (most recent call last):
File "/Users/xxxx/workspace/ebook-GPT-translator/text_translation.py", line 229, in translate_text
completion = openai.ChatCompletion.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 619, in _interpret_response
self._interpret_response_line(
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: Invalid URL (POST /v1/chat/completions)
The error message is:
openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.
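I would like a glossary feature. When translating text that contains specialized terminology, GPT translates proper nouns according to its own understanding, and they then have to be corrected by hand.
Here are two possible ways to implement a glossary.
The first is to add a parameter that specifies a glossary: after splitting and before calling the API, replace the terms in the text to be translated according to the glossary, and add an instruction to the prompt such as "you must not replace proper nouns in [language]". The second is to leave everything else unchanged and append supplementary lines such as "xxx should be translated as xxx" to the prompt.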
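The second approach described above could be sketched as follows: for each chunk, append one instruction line per glossary term that actually occurs in it. The function names, the wording of the instruction, and the dict-based glossary are all assumptions made for illustration.

```python
def glossary_instructions(chunk: str, glossary: dict[str, str]) -> str:
    """Build prompt lines for the glossary terms that occur in this chunk."""
    lowered = chunk.lower()
    lines = [
        f'"{source}" should be translated as "{target}".'
        for source, target in glossary.items()
        if source.lower() in lowered
    ]
    return "\n".join(lines)


def build_prompt(base_prompt: str, chunk: str, glossary: dict[str, str]) -> str:
    """Append glossary instructions to the base prompt when any term matches."""
    extra = glossary_instructions(chunk, glossary)
    return f"{base_prompt}\n{extra}" if extra else base_prompt
```

Filtering by chunk keeps the prompt short, which matters because glossary lines count toward the same token limit as the text being translated.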
[Errno 13] Permission denied
I think many people need this file format.
Could you add a way to access the API through a VPN?
Could Markdown support be added?
Hello there,
Much appreciated for this amazing tool.
Thanks a lot