jbilcke-hf / ai-comic-factory

Generate comic panels using a LLM + SDXL. Powered by Hugging Face 🤗

Home Page: https://aicomicfactory.app

License: Apache License 2.0

Dockerfile 0.70% JavaScript 0.81% TypeScript 96.23% CSS 0.26% HTML 2.00%

ai comics

ai-comic-factory's Introduction

title: AI Comic Factory
emoji: 👩‍🎨
colorFrom: red
colorTo: yellow
sdk: docker
pinned: true
app_port: 3000
disable_embedding: true
short_description: Create your own AI comic with a single prompt
hf_oauth: true
hf_oauth_expiration_minutes: 43200
hf_oauth_scopes: inference-api

AI Comic Factory

Last release: AI Comic Factory 1.2

The AI Comic Factory will soon have an official website: aicomicfactory.app

For more information about my other projects please check linktr.ee/FLNGR.

Running the project at home

First, I would like to highlight that everything is open-source (see here, here, here, here).

However, the project isn't a monolithic Space that can be duplicated and run immediately: it requires various components for the frontend, backend, LLM, SDXL etc.

If you duplicate the project and open the .env file, you will see that it requires some variables.

Provider config:

LLM_ENGINE: can be one of INFERENCE_API, INFERENCE_ENDPOINT, OPENAI, GROQ, ANTHROPIC
RENDERING_ENGINE: can be one of INFERENCE_API, INFERENCE_ENDPOINT, REPLICATE, VIDEOCHAIN, OPENAI for now, unless you code your custom solution

Auth config:

AUTH_HF_API_TOKEN: if you decide to use Hugging Face for the LLM engine (inference api model or a custom inference endpoint)
AUTH_OPENAI_API_KEY: to use OpenAI for the LLM engine
AUTH_GROQ_API_KEY: to use Groq for the LLM engine
AUTH_ANTHROPIC_API_KEY: to use Anthropic (Claude) for the LLM engine
AUTH_VIDEOCHAIN_API_TOKEN: secret token to access the VideoChain API server
AUTH_REPLICATE_API_TOKEN: in case you want to use Replicate.com

Rendering config:

RENDERING_HF_INFERENCE_ENDPOINT_URL: optional, defaults to nothing; necessary only if you decide to use a custom inference endpoint
RENDERING_VIDEOCHAIN_API_URL: url to the VideoChain API server
RENDERING_HF_INFERENCE_API_BASE_MODEL: optional, defaults to "stabilityai/stable-diffusion-xl-base-1.0"
RENDERING_HF_INFERENCE_API_REFINER_MODEL: optional, defaults to "stabilityai/stable-diffusion-xl-refiner-1.0"
RENDERING_REPLICATE_API_MODEL: optional, defaults to "stabilityai/sdxl"
RENDERING_REPLICATE_API_MODEL_VERSION: optional, in case you want to change the version

Language model config (depending on the LLM engine you decide to use):

LLM_HF_INFERENCE_ENDPOINT_URL: ""
LLM_HF_INFERENCE_API_MODEL: "HuggingFaceH4/zephyr-7b-beta"
LLM_OPENAI_API_BASE_URL: "https://api.openai.com/v1"
LLM_OPENAI_API_MODEL: "gpt-4-turbo"
LLM_GROQ_API_MODEL: "mixtral-8x7b-32768"
LLM_ANTHROPIC_API_MODEL: "claude-3-opus-20240229"

In addition, there are some community sharing variables that you can just ignore. Those variables are not required to run the AI Comic Factory on your own website or computer (they are meant to create a connection with the Hugging Face community, and thus only make sense for official Hugging Face apps):

NEXT_PUBLIC_ENABLE_COMMUNITY_SHARING: you don't need this
COMMUNITY_API_URL: you don't need this
COMMUNITY_API_TOKEN: you don't need this
COMMUNITY_API_ID: you don't need this

Please read the .env default config file for more information. To customise a variable locally, you should create a .env.local file (do not commit this file as it will contain your secrets).
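As a rough illustration of how the engine and auth variables above relate, here is a minimal sketch of a check that reports which variables are still missing for the selected engines. The helper `missingVariables` and the lookup tables are hypothetical and not part of the project. The LLM mapping follows the descriptions above; the assumption that the Hugging Face and OpenAI rendering engines reuse `AUTH_HF_API_TOKEN` and `AUTH_OPENAI_API_KEY` is a guess.

```typescript
// Hypothetical helper, not part of the project: given a .env-style key/value
// map, list the variables that still need a value for the selected engines.
type Env = Record<string, string | undefined>;

// Mapping taken from the "Auth config" descriptions above.
const LLM_AUTH: Record<string, string> = {
  INFERENCE_API: "AUTH_HF_API_TOKEN",
  INFERENCE_ENDPOINT: "AUTH_HF_API_TOKEN",
  OPENAI: "AUTH_OPENAI_API_KEY",
  GROQ: "AUTH_GROQ_API_KEY",
  ANTHROPIC: "AUTH_ANTHROPIC_API_KEY",
};

// Assumption: the Hugging Face and OpenAI rendering engines reuse the same
// tokens as the corresponding LLM engines.
const RENDERING_AUTH: Record<string, string> = {
  INFERENCE_API: "AUTH_HF_API_TOKEN",
  INFERENCE_ENDPOINT: "AUTH_HF_API_TOKEN",
  REPLICATE: "AUTH_REPLICATE_API_TOKEN",
  VIDEOCHAIN: "AUTH_VIDEOCHAIN_API_TOKEN",
  OPENAI: "AUTH_OPENAI_API_KEY",
};

function missingVariables(env: Env): string[] {
  const problems: string[] = [];
  const check = (engineKey: string, table: Record<string, string>): void => {
    const engine = env[engineKey];
    const authKey =
      engine !== undefined && Object.prototype.hasOwnProperty.call(table, engine)
        ? table[engine]
        : undefined;
    if (authKey === undefined) {
      problems.push(engineKey); // engine unset or not one of the documented values
      return;
    }
    // Report each missing auth variable only once.
    if (!env[authKey] && problems.indexOf(authKey) === -1) problems.push(authKey);
  };
  check("LLM_ENGINE", LLM_AUTH);
  check("RENDERING_ENGINE", RENDERING_AUTH);
  return problems;
}
```

Calling `missingVariables` with the parsed contents of your .env.local before starting the app would give an early hint about an incomplete configuration.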

-> If you intend to run it with local, cloud-hosted and/or proprietary models you are going to need to code 👨‍💻.

The LLM API (Large Language Model)

Currently the AI Comic Factory uses zephyr-7b-beta through an Inference Endpoint.

You have multiple options:

Option 1: Use an Inference API model

This is a recently added option that lets you use one of the models from the Hugging Face Hub. By default we suggest using zephyr-7b-beta.

To activate it, create a .env.local configuration file:

LLM_ENGINE="INFERENCE_API"

AUTH_HF_API_TOKEN="Your Hugging Face token"

# "HuggingFaceH4/zephyr-7b-beta" is used by default, but you can change this
# note: You should use a model able to generate JSON responses,
# so a capable instruction-tuned model is strongly suggested
LLM_HF_INFERENCE_API_MODEL="HuggingFaceH4/zephyr-7b-beta"
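
The note about JSON responses matters because a model may wrap its JSON in extra prose. The following is a minimal sketch of one way to recover a JSON array from raw model text. `extractJsonArray` is a hypothetical helper, the `panel` and `caption` fields in the usage line are made up for illustration, and this is not necessarily how the project parses its output.

```typescript
// Hypothetical helper: pull a JSON array out of raw LLM text by taking
// everything between the first "[" and the last "]" and parsing it.
function extractJsonArray(text: string): unknown[] {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1 || end <= start) {
    throw new Error("no JSON array found in model output");
  }
  // JSON.parse throws a SyntaxError if the slice is not valid JSON.
  const parsed: unknown = JSON.parse(text.slice(start, end + 1));
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array");
  }
  return parsed;
}

// Usage: surrounding chatter is ignored, only the array is parsed.
const panels = extractJsonArray(
  'Sure! Here is the story:\n[{"panel": 1, "caption": "Hello"}]\nEnjoy!'
);
```

A model that cannot produce well-formed JSON will still fail at the `JSON.parse` step, which is why the choice of model matters.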

Option 2: Use an Inference Endpoint URL

If you would like to run the AI Comic Factory on a private LLM running on the Hugging Face Inference Endpoint service, create a .env.local configuration file:

LLM_ENGINE="INFERENCE_ENDPOINT"

AUTH_HF_API_TOKEN="Your Hugging Face token"

LLM_HF_INFERENCE_ENDPOINT_URL="path to your inference endpoint url"

To run this kind of LLM locally, you can use TGI (Please read this post for more information about the licensing).

Option 3: Use an OpenAI API Key

This is a new option added recently, where you can use OpenAI API with an OpenAI API Key.

To activate it, create a .env.local configuration file:

LLM_ENGINE="OPENAI"

# default openai api base url is: https://api.openai.com/v1
LLM_OPENAI_API_BASE_URL="A custom OpenAI API Base URL if you have some special privileges"

LLM_OPENAI_API_MODEL="gpt-4-turbo"

AUTH_OPENAI_API_KEY="Your own OpenAI API Key"

Option 4: (new, experimental) use Groq

LLM_ENGINE="GROQ"

LLM_GROQ_API_MODEL="mixtral-8x7b-32768"

AUTH_GROQ_API_KEY="Your own GROQ API Key"

Option 5: (new, experimental) use Anthropic (Claude)

LLM_ENGINE="ANTHROPIC"

LLM_ANTHROPIC_API_MODEL="claude-3-opus-20240229"

AUTH_ANTHROPIC_API_KEY="Your own ANTHROPIC API Key"

Option 6: Fork and modify the code to use a different LLM system

Another option could be to disable the LLM completely and replace it with another LLM protocol and/or provider (e.g. Replicate), or with a human-generated story instead (by returning mock or static data).
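
To make the "mock or static data" idea concrete, here is a minimal sketch of a function that stands in for the LLM call and returns a hand-written story. The `PanelScript` shape and the `staticStory` name are assumptions made for illustration; the types used in the project's code may differ, so a fork would need to adapt the fields.

```typescript
// Hypothetical panel shape; the project's real types may differ.
interface PanelScript {
  panel: number;
  instructions: string; // what the image model should draw
  caption: string; // text displayed with the panel
}

// Hypothetical stand-in for the LLM call: ignores the prompt and returns
// the first `panelCount` panels of a hand-written story.
function staticStory(_prompt: string, panelCount: number): PanelScript[] {
  const story: PanelScript[] = [
    { panel: 1, instructions: "wide shot of a quiet city at dawn", caption: "It was a morning like any other." },
    { panel: 2, instructions: "close-up of a cat staring at a window", caption: "Except for the cat." },
    { panel: 3, instructions: "the cat leaping onto a rooftop", caption: "The cat had plans." },
    { panel: 4, instructions: "the cat asleep in the sun", caption: "The plans were modest." },
  ];
  return story.slice(0, panelCount);
}
```

Because the function has the same inputs a story generator would have (a prompt and a panel count), it can be swapped in while testing the rendering side without any LLM credentials.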

Notes

I may modify the AI Comic Factory to make this easier in the future (e.g. by adding support for more providers such as Replicate).

The Rendering API

This API is used to generate the panel images. This is an API I created for my various projects at Hugging Face.

I haven't written documentation for it yet, but basically it is "just a wrapper ™" around other existing APIs:

The hysts/SD-XL Space by @hysts
And other APIs for making videos, adding audio etc., but you won't need them for the AI Comic Factory

Option 1: Deploy VideoChain yourself

You will have to clone the source code.

Unfortunately, I haven't had the time to write the documentation for VideoChain yet. (When I do I will update this document to point to the VideoChain's README)

Option 2: Use Replicate

To use Replicate, create a .env.local configuration file:

RENDERING_ENGINE="REPLICATE"

RENDERING_REPLICATE_API_MODEL="stabilityai/sdxl"

RENDERING_REPLICATE_API_MODEL_VERSION="da77bc59ee60423279fd632efb4795ab731d9e3ca9705ef3341091fb989b7eaf"

AUTH_REPLICATE_API_TOKEN="Your Replicate token"

Option 3: Use another SDXL API

If you fork the project you will be able to modify the code to use the Stable Diffusion technology of your choice (local, open-source, proprietary, your custom HF Space etc).

It could even be something else, such as Dall-E.
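
If you go down the fork route, one possible structure is a small registry that maps a `RENDERING_ENGINE` value to a rendering function. The sketch below is only an illustration: the `Renderer` signature, the `MY_CUSTOM_ENGINE` value, and the `example.invalid` URL are all placeholders, and the project's actual code may be organised differently.

```typescript
// Hypothetical renderer signature: takes a prompt and resolves to an image
// URL or data URI.
type Renderer = (prompt: string) => Promise<string>;

// Placeholder custom renderer: replace the body with a call to the image
// API of your choice. The URL below is deliberately not a real service.
const myCustomRenderer: Renderer = async (prompt) =>
  `https://example.invalid/render?prompt=${encodeURIComponent(prompt)}`;

// Registry keyed by the value you would put in RENDERING_ENGINE.
const renderers = new Map<string, Renderer>([
  ["MY_CUSTOM_ENGINE", myCustomRenderer],
]);

function pickRenderer(engine: string): Renderer {
  const renderer = renderers.get(engine);
  if (!renderer) {
    throw new Error(`unsupported RENDERING_ENGINE: ${engine}`);
  }
  return renderer;
}
```

Keeping the lookup separate from the renderers means a new backend only needs one extra entry in the map.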

ai-comic-factory's Issues

how to use this as an api?

can i host it in my vps and run as api?

How to run this on windows ?

Is there a way ?

Creating more than just 4koma

Nice to see the problem of guided comic generation being tackled. One nit-pick: there is only 1 page with 4 panels, rather than a streamlined path from prompt to full comic chapter. StoryBird can generate whole stories, allow for narrative tuning and editing, and then generate images accordingly. What would be the major issue preventing this type of feature? https://storybird.ai/

Side note: for comic generation there are already tools for dynamic panels that are worth checking, since going from storyline to deciding on the aspect ratio of panel outputs looks strong: https://shortbread.ai/

Returns incorrect content type for Inference Endpoints

I played around with my own Inference Endpoint using stabilityai/stable-diffusion-xl-base-1.0 and with the Inference API. The second one worked, my own Endpoint did not. In fact, the wrong content type (text/plain) seems to be delivered by Inference Endpoints if the content type is not requested in the header. Just add an "Accept": "image/png" entry to the request header here, rebuild, and everything works fine again.
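
The fix described in this issue can be sketched as follows. `buildRenderRequest` is a hypothetical helper that only assembles the request without sending it, the `RenderRequest` type is defined here so the sketch needs no DOM or Node typings, and the `{ inputs: prompt }` body is the payload shape commonly used for Hugging Face text-to-image requests. The project's real request code may look different.

```typescript
// Minimal request description, defined locally for this sketch.
interface RenderRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) an image request for an
// Inference Endpoint. The endpoint URL and token are supplied by the caller.
function buildRenderRequest(endpointUrl: string, token: string, prompt: string): RenderRequest {
  return {
    url: endpointUrl,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      // The fix from the issue: explicitly ask for a PNG response.
      Accept: "image/png",
    },
    body: JSON.stringify({ inputs: prompt }),
  };
}
```

The resulting object can be passed to `fetch(request.url, request)`, and the response body read as binary image data.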

[Help] I am unable to run it successfully.

TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11457:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (ai-comic-factory/node_modules/.pnpm/[email protected]_biqbaboplfbrettd7655fr4n2y/node_modules/next/dist/compiled/undici/index.js:1:82152)
at ai-comic-factory/node_modules/.pnpm/[email protected]_biqbaboplfbrettd7655fr4n2y/node_modules/next/dist/compiled/undici/index.js:1:81644
at Immediate._onImmediate (ai-comic-factory/node_modules/.pnpm/[email protected]_biqbaboplfbrettd7655fr4n2y/node_modules/next/dist/compiled/undici/index.js:1:82034)
at process.processImmediate (node:internal/timers:476:21)
at process.callbackTrampoline (node:internal/async_hooks:130:17) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}

I am using the INFERENCE_API method, and I have configured everything in the .env file, but every time I enter the prompt and click "Go," it gives an error. Am I using the method incorrectly? To be straightforward, I've installed all the packages and performed the standard operations. Do I still need to start Docker?

How to run this locally on Windows?

Hi, I am more of an artist and less of a coder with only some basic Python experience and I was looking to deploy this on my local computer to experiment with AI comic art. What I did was I git cloned this project to my local folder, opened .env file and changed reference APIs and added API keys to all use OpenAI, but I don't know what to do from here. How do I actually run this program? Sorry if this is an irrelevant issue, I would really appreciate some help.

[Error] Unable to clone repo.

I get the following error:

C:\AI\git>git clone https://github.com/jbilcke-hf/ai-comic-factory
Cloning into 'ai-comic-factory'...
remote: Enumerating objects: 1027, done.
remote: Counting objects: 100% (198/198), done.
remote: Compressing objects: 100% (131/131), done.
remote: Total 1027 (delta 127), reused 121 (delta 64), pack-reused 829
Receiving objects: 100% (1027/1027), 3.13 MiB | 5.95 MiB/s, done.

Resolving deltas: 100% (543/543), done.
error: invalid path 'public/favicon/Icon?'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

The resulting directory is empty.

Why does VirusTotal tag this site as “malicious”?

Too many requests with Dalle3 as rendering engine

RENDERING_ENGINE="OPENAI"
NEXT_PUBLIC_ENABLE_RATE_LIMITER="true"
AUTH_OPENAI_API_KEY="sk-cxxxxxx"
RENDERING_OPENAI_API_BASE_URL="https://api.openai.com/v1"
RENDERING_OPENAI_API_MODEL="dall-e-3"

I am trying to start this project on my Macbook using openai both as LLM engine and as rendering engine.

This is my current .env setting, I start project with npm run dev and I can see UI has been rendered successfully. But when I type some prompts to frontend website and hit "GO" button, the backend command line shows me I have too many requests for OpenAI rendering engine.

I have tried changing let delay = enableRateLimiter ? (1000 + (500 * panelIndex)) : 1000 to let delay = enableRateLimiter ? (70000 + (500 * panelIndex)) : 1000 in the index.tsx file, but it still didn't work for me. @jbilcke I believe this is not expected, could you please take a look at this issue? I would appreciate your help.
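
To see what the two delay expressions quoted above evaluate to, here is a small worked sketch. `panelDelayMs` is a hypothetical wrapper around the same arithmetic, with the base and step pulled out as parameters. How the project actually schedules requests with this value is not shown in the issue, so the interpretation below is an assumption.

```typescript
// Same arithmetic as the expression quoted in the issue, with the base
// delay and per-panel step exposed as parameters for experimentation.
function panelDelayMs(
  panelIndex: number,
  enableRateLimiter: boolean,
  baseMs: number = 1000,
  stepMs: number = 500
): number {
  return enableRateLimiter ? baseMs + stepMs * panelIndex : 1000;
}

const panelIndexes = [0, 1, 2, 3];

// Original expression: 1000, 1500, 2000, 2500 ms
const originalDelays = panelIndexes.map((i) => panelDelayMs(i, true));

// Modified expression from the issue: 70000, 70500, 71000, 71500 ms
const modifiedDelays = panelIndexes.map((i) => panelDelayMs(i, true, 70000));
```

If each panel's delay is counted from the same starting moment (an assumption), raising the base from 1000 to 70000 only postpones all four requests; they are still spaced 500 ms apart. Increasing `stepMs` instead would spread them out.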
