Comments (8)

djbielejeski commented on May 9, 2024

I've been seeing this too, looking into it right now.

from dreambooth-stable-diffusion.

Kallamamran commented on May 9, 2024

Do you have "max_training_steps = 1000"?
I don't get mine to run at all, so :/

GuusDeKroon commented on May 9, 2024

Nope, the max training steps is at 3000

djbielejeski commented on May 9, 2024

I tried removing the --no-test param, but I still get this error:

Here comes the checkpoint...
Another one bites the dust...

Traceback (most recent call last):
  File "main.py", line 847, in <module>
    trainer.fit(model, data)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
    results = self._run_stage()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
    return self._run_train()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1353, in _run_train
    self.fit_loop.run()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/fit_loop.py", line 266, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 205, in run
    self.on_advance_end()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 255, in on_advance_end
    self._run_validation()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 311, in _run_validation
    self.val_loop.run()
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
    dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 134, in advance
    self._on_evaluation_batch_end(output, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 267, in _on_evaluation_batch_end
    self.trainer._call_callback_hooks(hook_name, output, *kwargs.values())
  File "/opt/conda/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1636, in _call_callback_hooks
    fn(self, self.lightning_module, *args, **kwargs)
  File "/workspace/Dreambooth-Stable-Diffusion/main.py", line 470, in on_validation_batch_end
    self.log_img(pl_module, batch, batch_idx, split="val")
  File "/workspace/Dreambooth-Stable-Diffusion/main.py", line 434, in log_img
    images = pl_module.log_images(batch, split=split, **self.log_images_kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/Dreambooth-Stable-Diffusion/ldm/models/diffusion/ddpm.py", line 1328, in log_images
    batch = batch[0]
KeyError: 0

1blackbar commented on May 9, 2024

It's running fine here, past 1000 steps.

djbielejeski commented on May 9, 2024

Pretty sure I found the issue. It happens at the end of an epoch, and the epoch size depends on your training_samples count and your regularization_images count, so you can trigger it by training past 1 epoch. I think I found the spot in the code; testing it now.
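
For anyone reading the traceback, here is a minimal sketch of why `batch = batch[0]` can end in `KeyError: 0`. It assumes the validation batch reaches `log_images` as a plain dict rather than a list of dicts, which is a guess from the error message and not something confirmed in this thread. The `unwrap_batch` helper and the `image`/`caption` keys are hypothetical and only illustrate the idea; this is not the actual fix.

```python
def unwrap_batch(batch):
    """Hypothetical helper: take the first element only if the batch is a sequence.

    A dict batch is returned unchanged, so indexing it with 0 is never attempted.
    """
    if isinstance(batch, (list, tuple)):
        return batch[0]
    return batch


# Assumed shape of a single batch: a dict keyed by field name (illustrative keys).
dict_batch = {"image": [0.1, 0.2], "caption": ["a photo"]}

# Indexing a dict with 0 looks up the *key* 0, which does not exist here.
try:
    dict_batch[0]
except KeyError as err:
    print(f"KeyError: {err}")  # prints "KeyError: 0"

# The guarded version handles both shapes.
print(unwrap_batch(dict_batch) is dict_batch)                  # dict passes through
print(unwrap_batch([dict_batch, {"image": []}]) is dict_batch)  # list is unwrapped
```

If the batch shape really does differ between training and validation, a guard like this would avoid the crash, but checking what the validation dataloader yields is the more reliable way to confirm it.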

djbielejeski commented on May 9, 2024

Fixed here

djbielejeski commented on May 9, 2024

At least my issue...
