Windows 11 - Docker Desktop
WSL2 backend - Ubuntu 22.04 LTS
One minor issue resolved; multiple other failures.
Minor:
\spellbook-docker> docker compose up
[+] Running 1/0
✘ Network spellbook-docker_spellbook-net  Error    0.0s
failed to create network spellbook-docker_spellbook-net: Error response from daemon: invalid network config:
invalid subnet 10.23.82.0/16: it should be 10.23.0.0/16
Resolved by editing docker-compose.yml and correcting the subnet address.
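For reference, the daemon rejects 10.23.82.0/16 because the .82 octet sits in the host portion of a /16 mask; zeroing the host bits (or keeping the octet and narrowing the mask to /24) satisfies it. A minimal sketch of the edit, demonstrated on a stand-in compose fragment — the real change goes in docker-compose.yml:

```shell
# demo on a stand-in fragment; the real edit goes in docker-compose.yml
cat > /tmp/compose-net.yml <<'EOF'
networks:
  spellbook-net:
    ipam:
      config:
        - subnet: 10.23.82.0/16
EOF
# zero the host bits: with a /16 mask only the first two octets are network
# bits, so 10.23.82.0/16 becomes 10.23.0.0/16 (or keep .82 and use /24)
sed -i 's|10\.23\.82\.0/16|10.23.0.0/16|' /tmp/compose-net.yml
grep 'subnet' /tmp/compose-net.yml
```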
Failures:
spellbook_rabbitmq | /usr/local/bin/docker-entrypoint.sh: /amqp-init.sh: /bin/bash^M: bad interpreter: No such file or directory
spellbook_rabbitmq | /usr/local/bin/docker-entrypoint.sh: line 50: /amqp-init.sh: Success
spellbook_rabbitmq exited with code 127
spellbook_mariadb | /usr/local/bin/docker-entrypoint.sh: /docker-entrypoint-initdb.d/1-db-init.sh: /bin/bash^M: bad interpreter: No such file or directory
spellbook_mariadb exited with code 126
spellbook_arcane_bridge | Starting child process with 'node dist/arcane.bridge.js -vh http://10.23.82.2:8200 -vt /vault_share/write-token -cs'
spellbook_arcane_bridge | failed to start vault
spellbook_arcane_bridge | service class VaultService did not start correctly, exiting program
spellbook_arcane_bridge | Program node dist/arcane.bridge.js -vh http://10.23.82.2:8200 -vt /vault_share/write-token -cs exited with code 0
The only log entry in spellbook-docker-vault is:
exec /usr/local/bin/unseal.sh: no such file or directory
The build log for Vault completed normally.
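These failures look like one root cause: the `^M` in `bad interpreter: /bin/bash^M` means the scripts were checked out with Windows (CRLF) line endings, so the kernel treats the carriage return as part of the interpreter path. The Vault container's `exec /usr/local/bin/unseal.sh: no such file or directory` is a classic symptom of the same problem in a shebang (though it can also mean the script wasn't copied/mounted at all). A minimal sketch of the conversion, demonstrated on a stand-in script — the same `sed` (or `dos2unix`) would be run on amqp-init.sh, 1-db-init.sh, and unseal.sh in the checkout:

```shell
# stand-in script with Windows line endings, like a file checked out
# with git's core.autocrlf=true on Windows
printf '#!/bin/bash\r\necho ok\r\n' > /tmp/crlf-demo.sh
# strip the trailing carriage returns (dos2unix does the same thing)
sed -i 's/\r$//' /tmp/crlf-demo.sh
# to keep git from re-converting on the next checkout:
#   git config core.autocrlf input
head -1 /tmp/crlf-demo.sh
```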
I may try running this directly in WSL2 Ubuntu 22.04 to see if that makes a difference; I usually run all my Docker stuff on other Ubuntu VMs, but my Windows machine is where my GPUs are.
I'm working on something similar (structurally at least), a manga-to-anime pipeline.
It involves a lot of different steps/models, similar to this project:
Pre-processing (alignment, upscaling, coloring).
Separating pages into panels.
Ordering the panels in the correct reading order (took so much more effort than I thought...).
Segmentation (using segment-anything).
Extracting bubbles, bubble tails and their direction vectors, faces, bodies, and backgrounds. Most of that required training custom models.
Assigning a character identity to each face/body.
Making a naive association between faces and bubbles.
Reading the text of the bubbles.
I feed all that data to GPT-4V and ask it to "read" each panel, telling it what happened in previous panels, which bubble is associated with which face, etc., asking it to "understand" what is happening in the panel and to "deduce" some associations between the items, the tone of voice, and so on. I tried "just" asking GPT-4V to read manga pages without all the steps above, and it was terrible at it. But with all the provided info (which easily produces 10k-token prompts, just for the text), it gets much better at it. It's sort of "pre-chewing" the work for it.
That's where I am now; the next step is generating voice (what I'm working on now: bark/whisper/other models) and sound effects, then generating animation and special effects, and finally assembling all that into video.
I'll be looking more closely into your project, in particular how it's organized. Thanks a lot for sharing.
I'd be curious if you have any insights on how you'd do manga reading if you had to.
Hey, very exciting project. I've been wanting to build something like this for a while now.
I've recently started playing with local models using ollama (hosted by a small computer on my network).
I'm wondering if it's possible to connect this to ollama? (Maybe what I'm really asking is whether it's possible to use this without the router that requires NVIDIA hardware.)
Also, beyond the installation instructions and high-level design, is there a wiki that explains how the individual parts work?