
error about docextractor (closed)

monniert commented on May 25, 2024
error

from docextractor.

Comments (6)

marouamehri commented on May 25, 2024

Hi @monniert

Thank you very much for sharing your work!

I get the following error when I run tester.py.

  File "/content/drive/MyDrive/docExtractor/src/utils/metrics.py", line 56, in _fast_hist
    minlength=self.n_classes ** 2).reshape(self.n_classes, self.n_classes)
ValueError: cannot reshape array of size 3236 into shape (4,4)

What can I do?

Thank you in advance


monniert commented on May 25, 2024

Hi @nada0698 @marouamehri

The script src/tester.py quantitatively evaluates the segmentation results against given labels, so it requires appropriate ground truth masks. The traceback suggests that your ground truth masks contain too many labels. Depending on what you would like to evaluate, they should contain only 1 or 2 labels (illustration and text).

If you only want qualitative segmentation masks from inference, take a look at the notebook at demo/demo.ipynb
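
To make the failure concrete, here is a minimal sketch of how a single out-of-range value in a ground truth mask produces this reshape error. It assumes the histogram index is computed as n_classes * gt + pred, which is a common pattern and is consistent with the traceback, but it is not copied from src/utils/metrics.py. The names fast_hist and unexpected_labels are hypothetical helpers for illustration only.

```python
import numpy as np


def fast_hist(gt, pred, n_classes):
    # Assumed form of the confusion-matrix histogram; the real
    # src/utils/metrics.py may differ in detail.
    idx = n_classes * gt.astype(int) + pred.astype(int)
    counts = np.bincount(idx, minlength=n_classes ** 2)
    # bincount returns max(idx) + 1 bins when that exceeds minlength,
    # so any label >= n_classes makes this reshape fail.
    return counts.reshape(n_classes, n_classes)


def unexpected_labels(gt, n_classes):
    """Return the label values in gt that fall outside [0, n_classes)."""
    values = np.unique(gt)
    return values[(values < 0) | (values >= n_classes)].tolist()


n_classes = 4
pred = np.array([0, 1, 2, 3])
good_gt = np.array([0, 1, 2, 3])
bad_gt = np.array([0, 1, 2, 255])  # 255 is not a valid class index here

print(fast_hist(good_gt, pred, n_classes))   # 4x4 confusion matrix
print(unexpected_labels(bad_gt, n_classes))  # values to remap or remove
try:
    fast_hist(bad_gt, pred, n_classes)
except ValueError as err:
    print(err)  # same kind of reshape error as in the traceback
```

Running a check like unexpected_labels on each ground truth mask before evaluation shows which values need to be remapped or removed.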


nada0698 commented on May 25, 2024

@monniert thank you very much for your answer.
I get the same error when I run tester.py.
I need the results of the test.

!python /content/drive/MyDrive/docExtractor/src/trainer.py -wt --tag test1 --config syndoc.yml

Traceback (most recent call last):
  File "/content/drive/MyDrive/docExtractor/src/trainer.py", line 336, in
    tester.run()
  File "/content/drive/MyDrive/docExtractor/src/tester.py", line 60, in run
    self.single_run(image, label)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/content/drive/MyDrive/docExtractor/src/tester.py", line 81, in single_run
    self.metrics.update(gt, pred)
  File "/content/drive/MyDrive/docExtractor/src/utils/metrics.py", line 52, in update
    self.confusion_matrix += self._fast_hist(lt_flat, lp_flat)
  File "/content/drive/MyDrive/docExtractor/src/utils/metrics.py", line 56, in _fast_hist
    minlength=self.n_classes ** 2).reshape(self.n_classes, self.n_classes)
ValueError: cannot reshape array of size 1284 into shape (4,4)


What can I do?


marouamehri commented on May 25, 2024

Hi @monniert

I want to generate a dataset that has only text (4) and illustration (1) labels when I run syndoc_generator.py -n 10 --dataset_name syndoc -m

Is it enough to set restricted_labels: [1, 4] in docExtractor/configs/syndoc.yml?

Thanks for taking the time to answer us.


monniert commented on May 25, 2024

Hi

  • @nada0698: similarly, the traceback suggests there are too many labels in your ground truth. You should check that your prediction / GT pairs contain the same number of labels.
  • @marouamehri: so you mean without the text border labels? Using -m generates documents with illustration, text and text_border labels. I have just pushed the feature you need: use python src/syndoc_generator.py -n 10 --dataset_name toto -m -ntb to generate documents without borders. Otherwise, you can generate as you did and exclude the border labels during training by specifying the appropriate restricted_labels argument.
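
For masks that already exist on disk, the same idea can be sketched as a standalone preprocessing step. This is only an illustration under stated assumptions: restrict_labels is a hypothetical helper, not docExtractor's implementation of restricted_labels, and the choice to send dropped labels to 0 and renumber the kept ones consecutively is an assumption that may not match what the training code expects.

```python
import numpy as np


def restrict_labels(mask, keep):
    """Map labels not in keep to 0 and renumber kept labels 1..len(keep).

    Hypothetical helper for illustration; the renumbering scheme is an
    assumption, not docExtractor's documented behaviour.
    """
    out = np.zeros_like(mask)
    for new_value, old_value in enumerate(keep, start=1):
        out[mask == old_value] = new_value
    return out


# Toy mask with labels 0..4; keep only illustration (1) and text (4).
mask = np.array([[0, 1, 4],
                 [2, 3, 4]])
restricted = restrict_labels(mask, keep=[1, 4])
print(restricted)
print(np.unique(restricted))  # number of distinct labels left in the mask
```

After such a step the mask contains only background plus the kept classes, which is the condition the evaluation needs in order to build its confusion matrix.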


monniert commented on May 25, 2024

I assume the issue is fixed, so I am closing it for now. Please reopen if needed.

