lanbinshijie / bili2text

Convert Bilibili videos to text in one step: just enter a link.

License: MIT License

Python 100.00%

bili2text's Introduction

Bili2text 📺

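Note ⚠️ This repository has not been updated for a long time, and there is no guarantee that Whisper still works. A failure on my computer made my local development environment unrecoverable, so many issues may not get fixed.

The Whisper repository changes frequently. This repository uses a version from 8 months ago, which is not guaranteed to keep working. An alternative is to use iFLYTEK Spark's recognition or another speech-to-text service: upload the audio over the network and read back the result. If you have the skills, feel free to build out these interfaces and open a PR!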

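Introduction 🌟

bili2text is a tool for converting Bilibili videos to text 🛠️. The project follows a simple pipeline: download the video, extract the audio, split the audio, and transcribe the speech to text with the whisper model. The whole process is automatic: you only need to enter the av number of the Bilibili video. Everything runs smoothly from start to finish 😂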

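Features 🚀

🎥 Download video: download the specified video from Bilibili.
🎵 Extract audio: extract the audio track from the downloaded video.
💬 Split audio: split the audio into short segments for efficient speech-to-text processing.
🤖 Speech to text: use OpenAI's whisper model to convert the audio to text.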
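
The audio-splitting step can be illustrated with a small sketch that only computes segment boundaries. It does not reproduce this repository's splitting code: the function name `segment_bounds` and the 30-second default are assumptions chosen for illustration (Whisper itself processes audio in 30-second windows, but the segment length used by this project is not stated here).

```python
def segment_bounds(total_ms: int, chunk_ms: int = 30_000) -> list[tuple[int, int]]:
    """Return (start, end) millisecond pairs that together cover [0, total_ms).

    Hypothetical helper: chunk_ms is an assumed default, not a value
    taken from the bili2text source.
    """
    if total_ms < 0 or chunk_ms <= 0:
        raise ValueError("total_ms must be >= 0 and chunk_ms must be > 0")
    # The last segment is clipped so it never runs past the end of the audio.
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


if __name__ == "__main__":
    # A 65-second clip is cut into two full segments and one short tail.
    for start, end in segment_bounds(65_000):
        print(start, end)
```

Each pair could then be handed to whatever audio library does the actual slicing; keeping the boundary arithmetic separate makes it easy to check on its own.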

Usage 📘

Clone the repository:

git clone https://github.com/lanbinshijie/bili2text.git
cd bili2text

Install dependencies: install the required Python libraries.
```
pip install -r requirements.txt
```
Run the script: run main.py with Python.
```
python main.py
```
When prompted, enter the av number of the Bilibili video.
Use the UI:
```
python window.py
```
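Enter the video link in the window that opens. The link is converted to an av number automatically. Click the download-video button to complete the conversion.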
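
How the UI turns a link into an av number is not shown in this README. As a rough illustration of the first half of that step, the sketch below pulls the video identifier out of a Bilibili URL. The function name `extract_video_id` and the regular expression are assumptions, and the sketch deliberately stops short of converting a BV id into an av number.

```python
import re
from typing import Optional, Tuple

# Matches ".../video/av170001" or ".../video/BV" followed by 10 alphanumerics.
_VIDEO_ID = re.compile(r"/video/(?:av(\d+)|(BV[0-9A-Za-z]{10}))")


def extract_video_id(url: str) -> Optional[Tuple[str, str]]:
    """Return ("av", digits) or ("bv", bv_id), or None if no id is found.

    Hypothetical helper, not part of the bili2text source.
    """
    match = _VIDEO_ID.search(url)
    if match is None:
        return None
    av_digits, bv_id = match.groups()
    if av_digits is not None:
        return ("av", av_digits)
    return ("bv", bv_id)


if __name__ == "__main__":
    print(extract_video_id("https://www.bilibili.com/video/av170001"))
    print(extract_video_id("https://www.bilibili.com/video/BV17x411w7KC?p=1"))
```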

Example 📋

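from downBili import download_video
# run_split and run_analysis are pulled in by these wildcard imports
from exAudio import *
from speech2text import *

# Ask for the av number ("请输入av号：" = "Enter the av number:") and download the video
av = input("请输入av号：")
filename = download_video(av)
# Extract and split the audio; the returned name is reused for the output file
foldername = run_split(filename)
# Transcribe; the Mandarin prompt tells Whisper the language and the video topic
run_analysis(foldername, prompt="以下是普通话的句子。这是一个关于{}的视频。".format(filename))
output_path = f"outputs/{foldername}.txt"
# "转换完成！" = "Conversion complete!"
print("转换完成！", output_path)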

Tech stack 🧰

Python: the main programming language; implements the program logic
Whisper: speech-to-text model
Tkinter: toolkit for the UI
ttkbootstrap: theming library for the UI

Roadmap 📅

Generate requirements.txt
UI design

Screenshots 📷

Star History ⭐

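License 📄

This project is released under the MIT License.

Contributing 💡

If you would like to contribute to this project, feel free to submit a Pull Request or open an Issue.

Acknowledgements 🙏

Thanks to Open Teens for their contributions to the youth open-source community! @OpenTeens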

bili2text's Issues

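Thank you all for your support!

Thank you all for your support. A small project I threw together a year ago has somehow collected this many stars! The project has not been updated for a long time. If you run into any problems, raise them in an Issue. If you are able to, please open a PR and I will merge it promptly.

Also, for naming reasons my main account has moved from @lanbinshijie to @lanbinleo. A follow there would be appreciated~

Finally, thanks again for supporting this project. Any problem can be raised in an Issue, and the community and I will do our best to help.

A revised requirements file with version numbers and local paths removed; help yourself

requirements.txt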

Error loading whisper

[LOG][INFO] Whisper未加载！
[LOG][INFO] Exception in Tkinter callback
[LOG][INFO] Traceback (most recent call last):
[LOG][INFO]   File "C:\Python312\Lib\tkinter\__init__.py", line 1967, in __call__
    return self.func(*args)
    ^^^^^^^^^^^^^^^^
[LOG][INFO]   File "D:\Portable\git\bili2text\window.py", line 130, in load_whisper
    import speech2text
[LOG][INFO]   File "D:\Portable\git\bili2text\speech2text.py", line 1, in <module>
    import whisper
[LOG][INFO]   File "C:\Python312\Lib\site-packages\whisper.py", line 69, in <module>
    libc = ctypes.CDLL(libc_name)
    ^^^^^^^^^^^^^^^^^^^^^^
[LOG][INFO]   File "C:\Python312\Lib\ctypes\__init__.py", line 369, in __init__
    if '/' in name or '\\' in name:
    ^^^^^^^^^^^
[LOG][INFO] TypeError: argument of type 'NoneType' is not iterable
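
One detail in this traceback may be worth checking: `whisper` is imported from `site-packages\whisper.py`, a single file. OpenAI's Whisper is published on PyPI as `openai-whisper` and installs a `whisper` package directory, so a single-file origin suggests that a different distribution sharing the module name is installed. The sketch below reports where `whisper` would be imported from without importing it. The helper names are hypothetical, and the classification is only a heuristic.

```python
import importlib.util
from typing import Optional


def whisper_origin() -> Optional[str]:
    """Locate the importable `whisper` module without executing it."""
    spec = importlib.util.find_spec("whisper")
    return None if spec is None else spec.origin


def classify_origin(origin: Optional[str]) -> str:
    """Heuristic: a package directory is loaded from its __init__.py."""
    if origin is None:
        # Also reached for namespace packages, which have no file origin.
        return "not installed"
    # Normalise Windows separators so the check works on any platform.
    filename = origin.replace("\\", "/").rsplit("/", 1)[-1]
    if filename == "__init__.py":
        return "package directory"
    return "single-file module"


if __name__ == "__main__":
    origin = whisper_origin()
    print(origin, "->", classify_origin(origin))
```

If the result is "single-file module", comparing the output of `pip show whisper` and `pip show openai-whisper` would be a reasonable next step.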

Running python window.py opens no window, and running python main.py does not prompt for the Bilibili av number

Error while downloading the video file

Hello, when I run python main.py I get the error below. How can I fix it?

请输入av号：1054203219
******************************B站视频下载小助手******************************
[下载视频的cid]:1530757862
[下载视频的标题]:关于中美第三轮库存周期下主动补库阶段的大宗价格演化路线分析
Traceback (most recent call last):
  File "/Users/zhengkai/anaconda3/envs/py37/lib/python3.7/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/zhengkai/anaconda3/envs/py37/lib/python3.7/site-packages/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/Users/zhengkai/anaconda3/envs/py37/lib/python3.7/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/Users/zhengkai/anaconda3/envs/py37/lib/python3.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 6, in <module>
    filename = download_video(av)
  File "/Users/zhengkai/Project/bili2text/downBili.py", line 200, in download_video
    video_list = get_play_list(start_url, cid, quality)
  File "/Users/zhengkai/Project/bili2text/downBili.py", line 21, in get_play_list
    html = requests.get(url_api, headers=headers).json()
  File "/Users/zhengkai/anaconda3/envs/py37/lib/python3.7/site-packages/requests/models.py", line 917, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: [Errno Expecting value] <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="zh-cn">

<head>
    <meta http-equiv="Page-Enter" content="blendTrans(Duration=0.5)">
    <meta http-equiv="Page-Exit" content="blendTrans(Duration=0.5)">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
    <meta name="spm_prefix" content="333.937">
    <title>出错啦! - bilibili.com</title>
    <style type="text/css">

        html,body {
            vertical-align: middle;
            padding: 0;
            margin: 0;
        }

        div.center {
            position: absolute;
            top: 50%;
            left: 50%;
            margin: -25% 0 0 -320px;
            width: 640px;
            min-height: 427px;
            padding: 0px;
        }

        div.errmsg {
            text-align: left;
            width: 640px;
            line-height: 150%;
        }

        a {
            text-decoration: none;
            color: red
        }

        .center {
            display: none
        }

        .h5-container {
            display: none;
        }
        @media screen and (max-width: 500px) {
            #biliMainHeader, #internationalHeader, .error-container {
                display: none;
            }

            .h5-container {
                display: block;
                position:absolute;
                top: 50%;
                left: 50%;
                transform: translate(-50%, -50%);
            }

            .h5__img {
                display: block;
                width: 300px;
            }

            .h5__desc {
                color: #a0a0a0;
                text-align: center;
            }
        }

    </style>
    <link rel="shortcut icon" href="//static.hdslb.com/images/favicon.ico">
    <link href="//static.hdslb.com/error/dist/error.css" rel="stylesheet">
    <script type="text/javascript">
        var options = {
            type: 'defaultError'
        }
        window.spmReportData = {};
        window.reportConfig = {
            sample: 1,
            msgObjects: "spmReportData",
        };
    </script>
    <script src="//s1.hdslb.com/bfs/seed/log/report/log-reporter.js"></script>
    <script type="text/javascript" src="//s1.hdslb.com/bfs/static/jinkela/long/js/jquery/jquery1.7.2.min.js"></script>
</head>

<body style="direction: ltr;">
    <div id="biliMainHeader" style="height: 56px; background-color: #fff;"></div>
    <div class="error-container">
        <div class="error-panel server-error">
            <img src="//i0.hdslb.com/bfs/feedback/f7b667011a46615732c701f4bb1d07f793f8d1df.png">
            <div style="text-align: center; padding: 0 0 40px 0;">
              <a class="rollback-btn" style="padding: 0 20px; float: none;">返回上一页</a>
            </div>
        </div>
        <div class="error-split">
        </div>
        <div class="error-manga">
        </div>
    </div>
    <div class="h5-container">
        <img class="h5__img" src="//s1.hdslb.com/bfs/static/jinkela/long/bitmap/error_01.png" alt="parse failed">
        <div class="h5__desc">
            <span>Σ(oﾟдﾟoﾉ) 服务器正在休息 zZ</span>
        </div>
    </div>
    <script type="text/javascript" src="//s1.hdslb.com/bfs/seed/jinkela/header-v2/header.js"></script>
    <script type="text/javascript" charset="utf-8" src="//static.hdslb.com/error/dist/error.js"></script>
</body>

</html>
: 0
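
The traceback above shows `requests.get(url_api, headers=headers).json()` failing because the server answered with an HTML error page instead of JSON. A minimal sketch of a more defensive parse step follows. The function name `parse_api_response` is hypothetical, and the sketch only makes the failure easier to read; it does not address why Bilibili rejected the request.

```python
import json
from typing import Any


def parse_api_response(status_code: int, content_type: str, body: str) -> Any:
    """Parse a JSON API response, failing with a short, readable message.

    Hypothetical helper, not part of the bili2text source.
    """
    if status_code != 200:
        raise RuntimeError(f"API returned HTTP {status_code}")
    if "json" not in content_type.lower():
        # An HTML error page ends up here instead of dumping the whole page.
        raise RuntimeError(f"expected JSON but got Content-Type {content_type!r}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"body is not valid JSON: {body[:80]!r}") from exc


if __name__ == "__main__":
    print(parse_api_response(200, "application/json", '{"code": 0}'))
```

With requests, the three arguments would come from `response.status_code`, `response.headers.get("Content-Type", "")` and `response.text`.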

Traceback (most recent call last):
  File "C:\Users\mrvx\PycharmProjects\bili2text\main.py", line 1, in <module>
    from downBili import download_video
  File "C:\Users\mrvx\PycharmProjects\bili2text\downBili.py", line 5, in <module>
    from moviepy.editor import *
ModuleNotFoundError: No module named 'moviepy'
This file is not in the project. Could the author please add it?
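
A `ModuleNotFoundError` like this one usually means that a third-party library is not installed in the active environment. The sketch below lists which top-level modules cannot be found before the scripts are run. The helper name `missing_modules` is hypothetical, and the module list is taken only from the imports visible in the tracebacks on this page, so it may be incomplete.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found.

    Only top-level names are supported: for a dotted name, find_spec
    imports the parent package and may raise instead of returning None.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules imported in the tracebacks above; not a complete dependency list.
    print(missing_modules(["moviepy", "whisper", "requests"]))
```

Note that the name used with pip is not always the same as the import name, so each reported module still has to be matched to its distribution by hand.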

The system cannot find the package to install

ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'F:\home\ktietz\src\ci\alabaster_1611921544520\work'

This path points to a code repository used internally, which means you may not be able to install this package directly with pip.

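requirements.txt is unusable

Hello, the requirements.txt you exported points to paths on your local file system. To fix this, please re-export it with the command pip list --format=freeze > requirements.txt. Thanks!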
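
For an already exported file, the local paths and version pins can also be stripped after the fact, which is what the revised requirements file mentioned earlier describes. The sketch below keeps only the distribution name of each line. The helper names are hypothetical, and the `name @ file:///...` line shape is an assumption based on the path in the error message above. Extras such as `pkg[extra]` are dropped by this simplification.

```python
import re
from typing import List, Optional

# A distribution name at the start of the line; stops at "=", " @", "[", etc.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def clean_requirement(line: str) -> Optional[str]:
    """Return the bare package name, or None for blanks, comments and options."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = _NAME.match(stripped)
    return match.group(1) if match else None


def clean_requirements(text: str) -> List[str]:
    """Clean every line of a requirements file and drop the skipped ones."""
    cleaned = (clean_requirement(line) for line in text.splitlines())
    return [name for name in cleaned if name]


if __name__ == "__main__":
    sample = "absl-py==2.1.0\nalabaster @ file:///home/ktietz/src/ci/alabaster_1611921544520/work\n"
    print("\n".join(clean_requirements(sample)))
```

Removing pins makes the file installable elsewhere but gives up reproducibility, and packages that exist only in conda will still fail to install from PyPI.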

Error running main.py

Error installing libraries on macOS

pip install -r requirements.txt

Collecting absl-py (from -r requirements.txt (line 1))
Using cached absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Collecting alabaster (from -r requirements.txt (line 2))
Downloading alabaster-1.0.0-py3-none-any.whl.metadata (2.8 kB)
Collecting altgraph (from -r requirements.txt (line 3))
Using cached altgraph-0.17.4-py2.py3-none-any.whl.metadata (7.3 kB)
Collecting anaconda-client (from -r requirements.txt (line 4))
Downloading anaconda-client-1.2.2.tar.gz (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.8/64.8 kB 597.1 kB/s eta 0:00:00
Preparing metadata (setup.py) ... done
ERROR: Could not find a version that satisfies the requirement anaconda-navigator (from versions: none)
ERROR: No matching distribution found for anaconda-navigator

pip 3.11

Error installing Python libraries

Hello, I got an error while installing the libraries