
open-chat-video-editor's Introduction

Open Chat Video Editor

Introduction

Open Chat Video Editor is an open-source short-video generation and editing tool. The overall technical framework is as follows:

(architecture diagram)

TODO

  • Easier installation guides for Windows and Linux
  • A Docker image so everyone can use the tool with one click
  • A URL for a quick online demo
  • Fine-tune the text model on short-video copywriting data to support more copy styles
  • Fine-tune the SD model to improve image and video generation quality

Current features:

  • 1) One-click generation of ready-to-use short videos, including voice-over, background music, and subtitles.

  • 2) Algorithms and data are all based on open-source projects, which makes technical exchange and learning easy.

  • 3) Multiple input types are supported, so many kinds of data can be turned into a short video with one click. Currently supported:

    • Text to short video (Text2Video): generate a short-video script from a brief input text, then synthesize the video
    • Web page to short video (Url2Video): automatically extract the content of a web page, generate a script, and produce the video
    • Long video to short video (Long Video to Short Video): analyze and summarize an input long video, then generate a short video
  • 4) Covers several mainstream algorithms and models, including generative models and multimodal retrieval models, e.g. ChatGPT, Stable Diffusion, and CLIP.

For text generation, the following models are supported:

  • ChatGPT
  • BELLE
  • Alpaca
  • Dolly, among others

For visual content, both image and video modalities are supported, each produced either by retrieval or by generation, for six modes in total:

  • Image retrieval
  • Image generation (Stable Diffusion)
  • Image retrieval first, then Stable Diffusion image generation on top
  • Video retrieval
  • Video generation (Stable Diffusion)
  • Video retrieval first, then Stable Diffusion video generation on top

Results

1. Text to short video (Text2Video)

The interface is shown below: text2video. Taking the input 【小孩子养宠物】 ("kids raising pets") as an example, a text model (e.g. ChatGPT) automatically generates a longer short-video script:

['小孩子养宠物', '可以更好地提升小孩子的责任感和独立感', '但也要慎重的选择合适的宠物', '因为只有经过一定的训练养成', '它们才能够成长起来', '一起玩耍和度过一段欢快的时光', '宠物不仅能够陪伴小孩子渡过寂寞时光', '还能培养小孩子处事冷静、自信以及情感交流和沟通能力', '在养宠物的过程中', '小孩子们可以唤醒和发掘他们被磨练出来的坚毅和耐力', '能够亲身体验到勤勉 和坚持的重要性'] 

Different video-generation modes produce different videos; the modes compare as follows:

1) Image retrieval

default.mp4

2) Image generation (Stable Diffusion)

default.mp4

3) Image retrieval first, then Stable Diffusion image generation

+.mp4

4) Video retrieval

default.mp4

2. Web page to short video (Url2Video)

The interface is shown below:

url2video

1) Enter a URL, e.g. https://zh.wikipedia.org/wiki/%E7%BE%8E%E5%9B%BD%E7%9F%AD%E6%AF%9B%E7%8C%AB (the Wikipedia page for the American Shorthair cat).

wiki

2) The page is parsed and automatically summarized into a short-video script; the result:

['\n\n美国短毛猫', '是一种神奇又魔幻的宠物猫品种', '它们优雅可爱', '活力无比', '能拥有多达80多种头毛色彩', '最出名的是银虎斑', '其银色毛发中透着浓厚的黑色斑纹', '除此之外', '它们还非常温柔', '是非常适合家庭和人类相处的宠物', '并且平均寿命达15-20年', '这种可爱的猫品种', '正在受到越来越多人的喜爱', '不妨试试你也来养一只吧']

3) The short video is synthesized automatically. For example, the result in image-generation mode is shown below; the other modes are not compared one by one.

url.mp4

3. Long video to short video (Long Video to Short Video)

Coming soon, stay tuned.

Installation and usage

Environment setup

First, download the source code:

git clone https://github.com/SCUTlihaoyu/open-chat-video-editor.git

Then pick whichever of the three installation methods (1, 2, or 3) below fits your needs.

1. Docker

Because everyone's CUDA version may differ, the Docker image is not guaranteed to work with every GPU. It currently supports the image-retrieval mode and also runs on CPU-only machines. Note that the image is large and takes about 24 GB of storage. YourPath below stands for the directory where you cloned the code.

docker pull iamjunhonghuang/open-chat-video-editor:retrival
docker run -it --network=host -v /YourPath/open-chat-video-editor:/YourPath/open-chat-video-editor/ iamjunhonghuang/open-chat-video-editor:retrival bash
conda activate open_editor

Or pull from the Alibaba Cloud mirror:

docker login --username=xxx registry.cn-hangzhou.aliyuncs.com
docker pull registry.cn-hangzhou.aliyuncs.com/iamjunhonghuang/open-chat-video-editor:retrival
docker run -it --network=host -v /YourPath/open-chat-video-editor:/YourPath/open-chat-video-editor/ registry.cn-hangzhou.aliyuncs.com/iamjunhonghuang/open-chat-video-editor:retrival bash
conda activate open_editor

Note: Chinese subtitles are not supported out of the box, so you need to change the font setting in the yaml config file, e.g. image_by_retrieval_text_by_chatgpt_zh.yaml:

  subtitle:
    font: DejaVu-Sans-Bold-Oblique
    # font: Cantarell-Regular
    # font: 华文细黑

2. Linux (so far tested only on CentOS)

1) First create the conda-based Python environment. GCC 8.5.0 was used during testing, so upgrade to GCC 8 or later if possible:

conda env create -f env.yaml
conda env update -f env.yaml # run this if the first command fails and the env needs updating

2) Then install the system dependencies, mainly so that ImageMagick installs correctly; adapt the package names for other Linux distributions:

# yum groupinstall 'Development Tools'
# yum install ghostscript
# yum -y install bzip2-devel freetype-devel libjpeg-devel libpng-devel libtiff-devel giflib-devel zlib-devel ghostscript-devel djvulibre-devel libwmf-devel jasper-devel libtool-ltdl-devel libX11-devel libXext-devel libXt-devel libxml2-devel librsvg2-devel OpenEXR-devel php-devel
# wget https://www.imagemagick.org/download/ImageMagick.tar.gz
# tar xvzf ImageMagick.tar.gz
# cd ImageMagick*
# ./configure
# make
# make install

3) Point moviepy at the installed ImageMagick binary, i.e. edit this file:

$HOME/anaconda3/envs/open_editor/lib/python3.8/site-packages/moviepy/config_defaults.py

changing it to:

#IMAGEMAGICK_BINARY = os.getenv('IMAGEMAGICK_BINARY', 'auto-detect')
IMAGEMAGICK_BINARY='/usr/local/bin/magick'

4) Chinese subtitles are not supported out of the box, so change the font setting in the yaml config file, e.g. image_by_retrieval_text_by_chatgpt_zh.yaml:

  subtitle:
    font: DejaVu-Sans-Bold-Oblique
    # font: Cantarell-Regular
    # font: 华文细黑

3. Windows

1) Python 3.8.16 is recommended:

conda create -n open_editor python=3.8.16

2) Install PyTorch:

# GPU build
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

# CPU build
pip3 install torch torchvision torchaudio

3) Install the remaining dependencies:

pip install -r requirements.txt

4) Install CLIP:

pip install git+https://github.com/openai/CLIP.git

5) Install faiss:

conda install -c pytorch faiss-cpu

Running the code

1) Choose a configuration file according to what you need:

Config file | Description
configs/text2video/image_by_retrieval_text_by_chatgpt_zh.yaml | Text to video; script by ChatGPT; visuals via image retrieval
configs/text2video/image_by_diffusion_text_by_chatgpt_zh.yaml | Text to video; script by ChatGPT; visuals via Stable Diffusion image generation
configs/text2video/image_by_retrieval_then_diffusion_chatgpt_zh.yaml | Text to video; script by ChatGPT; visuals via image retrieval followed by image-based Stable Diffusion
configs/text2video/video_by_retrieval_text_by_chatgpt_zh.yaml | Text to video; script by ChatGPT; visuals via video retrieval
configs/url2video/image_by_retrieval_text_by_chatgpt.yaml | URL to video; script by ChatGPT; visuals via image retrieval
configs/url2video/image_by_diffusion_text_by_chatgpt.yaml | URL to video; script by ChatGPT; visuals via Stable Diffusion image generation
configs/url2video/image_by_retrieval_then_diffusion_chatgpt.yaml | URL to video; script by ChatGPT; visuals via image retrieval followed by image-based Stable Diffusion
configs/url2video/video_by_retrieval_text_by_chatgpt.yaml | URL to video; script by ChatGPT; visuals via video retrieval

Note: to use ChatGPT for script generation, you must add organization_id (look it up under Organization settings on the OpenAI site rather than just entering "personal") and api_key to the config file.
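The exact yaml keys depend on this repo's config schema, which is not shown here; purely as a hypothetical illustration (field names unverified, values are placeholders), the relevant section might look like:

```yaml
# Hypothetical field names; check the actual schema in configs/ before use.
text_generator:
  chatgpt:
    organization_id: org-xxxxxxxxxxxx   # from the OpenAI "Organization settings" page
    api_key: sk-xxxxxxxxxxxx
```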

2) Download the data index and metadata archive data.tar and extract it into the data/index directory.

3) Run the script. Note: ${cfg_file} below refers to the path of one of the config files listed above; different config files run different modes. For example, replace ${cfg_file} with configs/text2video/image_by_retrieval_text_by_chatgpt_zh.yaml:

# Text to video 
python  app/app.py --func Text2VideoEditor  --cfg ${cfg_file}


# URL to video 
python  app/app.py --func URL2VideoEditor  --cfg ${cfg_file}

Disclaimer

1. Data sources: the image-retrieval data comes from LAION-5B.

The video-retrieval data comes from WebVid-10M.

Please note that we do not own the copyright to the data.

2. This project is for learning and exchange only. It must not be used commercially or for any purpose that would harm society.

Community

You are welcome to reach us for discussion and learning via Discord or WeChat.

WeChat groups 1 through 5 are full (200 members each); please join group 6.

open-chat-video-editor's People

Contributors

junhongh, scutlihaoyu

open-chat-video-editor's Issues

How to generate customized .faiss .db

Hi, I wonder if the owner or anyone else could share the method for creating a new webvid.faiss and webvid.db (the video-retrieval index and metadata database). It would be very much appreciated!
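The repo does not document this, but the usual recipe for a CLIP-based retrieval index is: embed each clip's keyframe (or caption) with CLIP, L2-normalize, add the vectors to a faiss inner-product index, and keep per-clip metadata in a database. A dependency-light sketch of that logic follows; random numpy vectors stand in for the CLIP embeddings, and the commented lines show the faiss equivalents (an assumption, since the repo's actual index type is unverified):

```python
import numpy as np

# Stand-in for CLIP embeddings: in a real pipeline these would come from
# clip_model.encode_image(...) / encode_text(...). Only the shapes matter here.
rng = np.random.default_rng(0)
corpus = rng.standard_normal((1000, 512)).astype("float32")  # one vector per clip
query = rng.standard_normal((1, 512)).astype("float32")

# L2-normalize so inner product equals cosine similarity.
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query /= np.linalg.norm(query, axis=1, keepdims=True)

# faiss equivalent (assumed index type):
#   index = faiss.IndexFlatIP(512)
#   index.add(corpus)
#   faiss.write_index(index, "webvid.faiss")
#   scores, ids = index.search(query, k=5)
scores = corpus @ query.T                   # brute-force inner products, shape (1000, 1)
ids = np.argsort(-scores[:, 0])[:5]         # top-5 clip ids

# webvid.db would then map each id back to a clip, e.g. a sqlite table like
# (id INTEGER PRIMARY KEY, url TEXT, caption TEXT).
print(ids)
```

This is a sketch of the general technique, not the project's actual build script; the real metadata schema in webvid.db may differ.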

ModuleNotFoundError: No module named 'faiss.swigfaiss_avx2'

A DLL appears to be missing on Windows 11:
2023-06-08 00:23:18,377 - faiss.loader - INFO - Loading faiss with AVX2 support.
2023-06-08 00:23:18,377 - faiss.loader - INFO - Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
2023-06-08 00:23:18,378 - faiss.loader - INFO - Loading faiss.

macOS is not supported

It would be best to state in the README that macOS is not supported, along with the required Python version. I tried several Python versions, and each time some package complained that it did not support the current one. After finally switching to Python 3.8, installation got as far as pywin32, which does not support macOS at all. Very frustrating.

WeChat group issue

The WeChat group has exceeded 200 members, so new people can no longer join via the QR code. Could invitations be provided through the group owner instead?

Seg fault when building the TTS generator under Docker

Current thread 0x0000004001763fc0 (most recent call first):
File "", line 219 in _call_with_frames_removed
File "", line 1166 in create_module
File "", line 556 in module_from_spec
File "", line 657 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 1042 in _handle_fromlist
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/fluid/core.py", line 274 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 1042 in _handle_fromlist
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/fluid/framework.py", line 37 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 1042 in _handle_fromlist
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/fluid/init.py", line 36 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/framework/random.py", line 16 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 1042 in _handle_fromlist
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/framework/init.py", line 17 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/init.py", line 25 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/utils.py", line 26 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/resource/resource.py", line 20 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/resource/init.py", line 14 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/base_commands.py", line 20 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/init.py", line 16 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 961 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "", line 219 in _call_with_frames_removed
File "", line 961 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/Users/tenghuiliu/Project/decoda_ai/github/open-chat-video-editor/generator/tts/paddlespeech_model.py", line 1 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/Users/tenghuiliu/Project/decoda_ai/github/open-chat-video-editor/generator/tts/build.py", line 2 in
File "", line 219 in _call_with_frames_removed
File "", line 843 in exec_module
File "", line 671 in _load_unlocked
File "", line 975 in _find_and_load_unlocked
File "", line 991 in _find_and_load
File "/Users/tenghuiliu/Project/decoda_ai/github/open-chat-video-editor/editor/build.py", line 3 in
File "", line 219 in _call_with_frames_removed

ValueError: max() arg is an empty sequence

2023-05-23 06:58:36,192 - comm.mylog - INFO - sentences: ['cats are lovely', ' fluffy creatures that people love to spend time with', " they're always happy and playful", ' and they love to cuddle up with their owners', " whether you're a cat person or not", " you can't 否认 cats are some of the best animals to have around", ' so why not add one to your life and see how much joy it can bring']
2023-05-23 06:58:36,192 - comm.mylog - INFO - en_out_text: ['cats are lovely', ' fluffy creatures that people love to spend time with', " they're always happy and playful", ' and they love to cuddle up with their owners', " whether you're a cat person or not", " you can't 否认 cats are some of the best animals to have around", ' so why not add one to your life and see how much joy it can bring']
[2023-05-23 06:58:47,705] [    INFO] - Already cached /root/.paddlenlp/models/bert-base-chinese/bert-base-chinese-vocab.txt
[2023-05-23 06:58:47,727] [    INFO] - tokenizer config file saved in /root/.paddlenlp/models/bert-base-chinese/tokenizer_config.json
[2023-05-23 06:58:47,728] [    INFO] - Special tokens file saved in /root/.paddlenlp/models/bert-base-chinese/special_tokens_map.json
Building prefix dict from the default dictionary ...
2023-05-23 06:58:58,554 - jieba - DEBUG - Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
2023-05-23 06:58:58,554 - jieba - DEBUG - Loading model from cache /tmp/jieba.cache
Loading model cost 0.778 seconds.
2023-05-23 06:58:59,332 - jieba - DEBUG - Loading model cost 0.778 seconds.
Prefix dict has been built successfully.
2023-05-23 06:58:59,332 - jieba - DEBUG - Prefix dict has been built successfully.
2023-05-23 06:59:04,170 - comm.mylog - INFO - final_clips: 0
Traceback (most recent call last):
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "app/app.py", line 30, in run_Text2VideoEditor_logit
    out_text,video_out = editor.run(input_text,style_text,out_video)
  File "/opt/gf/open-chat-video-editor/editor/chat_editor.py", line 96, in run
    video = concatenate_videoclips(final_clips)
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/moviepy/video/compositing/concatenate.py", line 75, in concatenate_videoclips
    w = max(r[0] for r in sizes)
ValueError: max() arg is an empty sequence
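The log line "final_clips: 0" shows that concatenate_videoclips received an empty list, which usually means no frames were retrieved or generated (for example, data/index is missing or the query returned no hits). A minimal guard, sketched here rather than taken from the repo's actual code, would fail earlier with a clearer message:

```python
def require_clips(final_clips):
    """Fail fast with a readable error instead of moviepy's
    'ValueError: max() arg is an empty sequence' deep inside concatenate_videoclips."""
    if not final_clips:
        raise RuntimeError(
            "No video clips were produced. Check that data.tar is extracted to "
            "data/index and that retrieval/generation returned at least one frame."
        )
    return final_clips

# Hypothetical use in editor/chat_editor.py:
#   video = concatenate_videoclips(require_clips(final_clips))
```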

Two small questions about installing on Windows

I'm a Python beginner and hit two small issues while installing on Windows:

1. Is "install the remaining dependencies" mandatory? Running pip install -r requirements.txt reports that the file does not exist.

2. Where is the data/index directory into which data.tar should be extracted?

Any help would be appreciated, thanks!

KeyError: 'phone_ids'

2023-05-23 07:01:21,517 - comm.mylog - INFO - chatgpt response: Here's a 50-word short video copy using cat content:

"Watch our cute cat playing with a string, trying to catch it and make it fly. Look how excited it is, just like a child playing with a toy. Don't you feel like hugging it and petting its head? Look at its little claws, they're so sharp and fierce. Sure, it's just a cat, but it's our pet and we love it just the same. Watch and enjoy our cat video."
2023-05-23 07:01:21,517 - comm.mylog - INFO - sentences: ['Here\'s a 50-word short video copy using cat content:"Watch our cute cat playing with a string', ' trying to catch it and make it fly', ' Look how excited it is', ' just like a child playing with a toy', " Don't you feel like hugging it and petting its head", ' Look at its little claws', " they're so sharp and fierce", ' Sure', " it's just a cat", " but it's our pet and we love it just the same", ' Watch and enjoy our cat video', '"']
2023-05-23 07:01:21,517 - comm.mylog - INFO - en_out_text: ['Here\'s a 50-word short video copy using cat content:"Watch our cute cat playing with a string', ' trying to catch it and make it fly', ' Look how excited it is', ' just like a child playing with a toy', " Don't you feel like hugging it and petting its head", ' Look at its little claws', " they're so sharp and fierce", ' Sure', " it's just a cat", " but it's our pet and we love it just the same", ' Watch and enjoy our cat video', '"']
Traceback (most recent call last):
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "app/app.py", line 30, in run_Text2VideoEditor_logit
    out_text,video_out = editor.run(input_text,style_text,out_video)
  File "/opt/gf/open-chat-video-editor/editor/chat_editor.py", line 50, in run
    tts_resp = self.audio_generator.batch_run(tts_in_text)
  File "/opt/gf/open-chat-video-editor/generator/tts/tts_generator.py", line 27, in batch_run
    resp.append(self.run_tts(text))
  File "/opt/gf/open-chat-video-editor/generator/tts/tts_generator.py", line 15, in run_tts
    self.tts_model.run_tts(text,out_path)
  File "/opt/gf/open-chat-video-editor/generator/tts/paddlespeech_model.py", line 16, in run_tts
    self.tts(text=text,lang=self.lang,am=self.am,output=out_path)
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/utils.py", line 328, in _warpper
    return executor_func(self, *args, **kwargs)
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/tts/infer.py", line 710, in __call__
    self.infer(text=text, lang=lang, am=am, spk_id=spk_id)
  File "<decorator-gen-603>", line 2, in infer
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddle/fluid/dygraph/base.py", line 375, in _decorate_function
    return func(*args, **kwargs)
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/cli/tts/infer.py", line 471, in infer
    frontend_dict = run_frontend(
  File "/data/anaconda3/envs/open_editor/lib/python3.8/site-packages/paddlespeech/t2s/exps/syn_utils.py", line 305, in run_frontend
    phone_ids = input_ids["phone_ids"]
KeyError: 'phone_ids'

AssertionError: Torch not compiled with CUDA enabled

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.46G/3.46G [07:25<00:00, 11.1MB/s]
│ E:\voice2face\open-chat-video-editor\app\app.py:27 in │
│ │
│ 24 │ │ # cfg_path = "configs/video_by_retrieval_text_by_chatgpt_zh.yaml" │
│ 25 │ │ cfg.merge_from_file(cfg_path) │
│ 26 │ │ print(cfg) │
│ ❱ 27 │ │ editor = build_editor(cfg) │
│ 28 │ │ def run_Text2VideoEditor_logit(input_text, style_text): │
│ 29 │ │ │ out_video = "test.mp4" │
│ 30 │ │ │ out_text,video_out = editor.run(input_text,style_text,out_video) │
│ │
│ E:\voice2face\open-chat-video-editor\editor\build.py:14 in build_editor │
│ │
│ 11 │ logger.info('visual_gen_type: {}'.format(visual_gen_type)) │
│ 12 │ # image_by_diffusion video_by_retrieval image_by_retrieval_then_diffusion video_by_ │
│ 13 │ if visual_gen_type in ["image_by_retrieval","image_by_diffusion","image_by_retrieval │
│ ❱ 14 │ │ vision_generator = build_image_generator(cfg) │
│ 15 │ else: │
│ 16 │ │ vision_generator = build_video_generator(cfg) │
│ 17 │
│ │
│ E:\voice2face\open-chat-video-editor\generator\image\build.py:33 in build_image_generator │
│ │
│ 30 │ │ image_generator = ImageGenbyRetrieval(cfg,query_model,index_server,meta_server) │
│ 31 │ elif visual_gen_type == "image_by_diffusion": │
│ 32 │ │ logger.info("start build_img_gen_model") │
│ ❱ 33 │ │ img_gen_model = build_img_gen_model(cfg) │
│ 34 │ │ image_generator = ImageGenByDiffusion(cfg,img_gen_model) │
│ 35 │ elif visual_gen_type == "image_by_retrieval_then_diffusion": │
│ 36 │ │ # build img retrieval generator │
│ │
│ E:\voice2face\open-chat-video-editor\generator\image\generation\build.py:6 in │
│ build_img_gen_model │
│ │
│ 3 def build_img_gen_model(cfg): │
│ 4 │ │
│ 5 │ model_id = cfg.video_editor.visual_gen.image_by_diffusion.model_id │
│ ❱ 6 │ model = StableDiffusionImgModel(model_id) │
│ 7 │ return model │
│ 8 │
│ 9 def build_img2img_gen_model(cfg): │
│ │
│ E:\voice2face\open-chat-video-editor\generator\image\generation\stable_diffusion.py:10 in │
init
│ │
│ 7 │ │ self.model_id = model_id │
│ 8 │ │ self.pipe = StableDiffusionPipeline.from_pretrained(self.model_id, torch_dtype=t │
│ 9 │ │ self.pipe.scheduler = DPMSolverMultistepScheduler.from_config(self.pipe.schedule │
│ ❱ 10 │ │ self.pipe = self.pipe.to("cuda") │
│ 11 │ │
│ 12 │ def run(self,prompt): │
│ 13 │ │ image = self.pipe(prompt).images[0] │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\diffusers\pipelines\pipeline_utils.p │
│ y:643 in to │
│ │
│ 640 │ │ │
│ 641 │ │ is_offloaded = pipeline_is_offloaded or pipeline_is_sequentially_offloaded │
│ 642 │ │ for module in modules: │
│ ❱ 643 │ │ │ module.to(torch_device, torch_dtype) │
│ 644 │ │ │ if ( │
│ 645 │ │ │ │ module.dtype == torch.float16 │
│ 646 │ │ │ │ and str(torch_device) in ["cpu"] │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\torch\nn\modules\module.py:1145 in │
│ to │
│ │
│ 1142 │ │ │ │ │ │ │ non_blocking, memory_format=convert_to_format) │
│ 1143 │ │ │ return t.to(device, dtype if t.is_floating_point() or t.is_complex() else No │
│ 1144 │ │ │
│ ❱ 1145 │ │ return self._apply(convert) │
│ 1146 │ │
│ 1147 │ def register_full_backward_pre_hook( │
│ 1148 │ │ self, │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\torch\nn\modules\module.py:797 in │
│ _apply │
│ │
│ 794 │ │
│ 795 │ def _apply(self, fn): │
│ 796 │ │ for module in self.children(): │
│ ❱ 797 │ │ │ module._apply(fn) │
│ 798 │ │ │
│ 799 │ │ def compute_should_use_set_data(tensor, tensor_applied): │
│ 800 │ │ │ if torch._has_compatible_shallow_copy_type(tensor, tensor_applied): │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\torch\nn\modules\module.py:820 in │
│ _apply │
│ │
│ 817 │ │ │ # track autograd history of param_applied, so we have to use │
│ 818 │ │ │ # with torch.no_grad():
│ 819 │ │ │ with torch.no_grad(): │
│ ❱ 820 │ │ │ │ param_applied = fn(param) │
│ 821 │ │ │ should_use_set_data = compute_should_use_set_data(param, param_applied) │
│ 822 │ │ │ if should_use_set_data: │
│ 823 │ │ │ │ param.data = param_applied │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\torch\nn\modules\module.py:1143 in │
│ convert │
│ │
│ 1140 │ │ │ if convert_to_format is not None and t.dim() in (4, 5): │
│ 1141 │ │ │ │ return t.to(device, dtype if t.is_floating_point() or t.is_complex() els │
│ 1142 │ │ │ │ │ │ │ non_blocking, memory_format=convert_to_format) │
│ ❱ 1143 │ │ │ return t.to(device, dtype if t.is_floating_point() or t.is_complex() else No │
│ 1144 │ │ │
│ 1145 │ │ return self.apply(convert) │
│ 1146 │
│ │
│ E:\voice2face\open-chat-video-editor\enve\lib\site-packages\torch\cuda_init
.py:239 in │
│ _lazy_init │
│ │
│ 236 │ │ │ │ "Cannot re-initialize CUDA in forked subprocess. To use CUDA with " │
│ 237 │ │ │ │ "multiprocessing, you must use the 'spawn' start method") │
│ 238 │ │ if not hasattr(torch._C, '_cuda_getDeviceCount'): │
│ ❱ 239 │ │ │ raise AssertionError("Torch not compiled with CUDA enabled") │
│ 240 │ │ if _cudart is None: │
│ 241 │ │ │ raise AssertionError( │
│ 242 │ │ │ │ "libcudart functions unavailable. It looks like you have a broken build? │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: Torch not compiled with CUDA enabled

python version 3.9.2
no GPU

CPU build

pip3 install torch torchvision torchaudio

toolz 0.12.0
torch 2.0.1
torchaudio 2.0.2
torchvision 0.15.2

Command run: python app/app.py --func Text2VideoEditor --cfg configs\text2video\image_by_diffusion_text_by_chatgpt_zh.yaml
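The crash above comes from the hard-coded .to("cuda") in generator/image/generation/stable_diffusion.py when only a CPU build of torch is installed. One workaround sketch (not an official fix from the repo) is to select the device at runtime:

```python
import torch

def pick_device() -> str:
    # "cuda" only when this torch build actually sees a GPU; otherwise fall back to CPU.
    return "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical patch: replace `self.pipe = self.pipe.to("cuda")` with
#   self.pipe = self.pipe.to(pick_device())
print(pick_device())
```

Note that Stable Diffusion on CPU will be very slow, but it avoids the AssertionError.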

ERROR: No matching distribution found for torch==2.0.0+cu117

Installation error on Windows 10 with GPU when running:
pip install -r requirements.txt

Collecting toolz==0.12.0
Using cached toolz-0.12.0-py3-none-any.whl (55 kB)
ERROR: Could not find a version that satisfies the requirement torch==2.0.0+cu117 (from versions: 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==2.0.0+cu117

no space left on device

Followed the instruction in the docker section and tried twice to pull docker image. Each time I got the same error, but I have enough space on my Mac(more than 700GB left). Here is the command I ran on my Mac:

docker pull iamjunhonghuang/open-chat-video-editor:retrival

Here is the whole log printed in my terminal:

retrival: Pulling from iamjunhonghuang/open-chat-video-editor 2d473b07cdd5: Pull complete 2144867bd1ad: Pull complete 6661021dc03d: Pull complete 142f314be218: Pull complete 66744ab40d65: Pull complete d36c5d2af16f: Pull complete 7e87ed5688d3: Pull complete bcd0e7c63b53: Pull complete 686656cf88ae: Pull complete 5616e23b2d3c: Extracting [==================================================>] 12.31GB/12.31GB failed to register layer: Error processing tar file(exit status 1): write /data/anaconda3/envs/open_editor/lib/python3.8/site-packages/torch/lib/libtorch_cuda_linalg.so: no space left on device

Here is the error I got:

failed to register layer: Error processing tar file(exit status 1): write /data/anaconda3/envs/open_editor/lib/python3.8/site-packages/torch/lib/libtorch_cuda_linalg.so: no space left on device

Tip from experience: set server_port & server_name

Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere!

we can check their website building-demos:

  • Can be set by environment variable GRADIO_SERVER_PORT. If None, will search for an available port starting at 7860.
  • Can be set by environment variable GRADIO_SERVER_NAME. If None, will use "127.0.0.1".
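Applied to this repo, that means the host and port can be chosen without editing app.py (assuming the app does not already hard-code them in its launch() call):

```shell
# Serve on all interfaces with a fixed port via Gradio's documented env vars.
export GRADIO_SERVER_NAME=0.0.0.0
export GRADIO_SERVER_PORT=7860
# then run as usual:
# python app/app.py --func Text2VideoEditor --cfg ${cfg_file}
```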

A very promising system that is no longer updated, which is a pity

There are still many points that could be combined and improved here, but the code has not been updated for three months; it looks like the project is winding down.

For example, finer-grained management: templates for generated scripts, keyword management for generated SD scenes, effect management for keyframes, effects-to-video compositing management, subtitles, narration management, and so on.

Built out, this system would be a video-editing platform for content creators with good commercial prospects. A real pity.
