
lpbeaulieu / typewriter-ocr-tintypetext


This typewriter OCR code can convert JPEG typewritten text images into RTF documents, while removing typos for you!

Home Page: https://github.com/LPBeaulieu

License: GNU Affero General Public License v3.0

Python 100.00%
typewriter optical-character-recognition ocr text-formatting deep-learning typewriter-ocr

typewriter-ocr-tintypetext's Issues

train_model.py fails with "KeyError: "Label 'S' was not included in the training dataset""

First, this is an awesome project, and I am fascinated by the idea of an OCR model trained on my own typewriter. However, I cannot run the training successfully. Here is what I get:

(ocr) fred@banane:~/Documents/ML/Typewriter-OCR-TintypeText$ python3 train_model.py
/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/callback/core.py:69: UserWarning: You are shadowing an attribute (modules) that exists in the learner. Use self.learn.modules to avoid this
warn(f"You are shadowing an attribute ({name}) that exists in the learner. Use self.learn.{name} to avoid this")
epoch train_loss valid_loss accuracy time
-------------------------------------------------------------------------------| 0.00% [0/1 00:00<?]
Traceback (most recent call last):
File "/home/fred/Documents/ML/Typewriter-OCR-TintypeText/train_model.py", line 60, in <module>
learn = fit()
^^^^^
File "/home/fred/Documents/ML/Typewriter-OCR-TintypeText/train_model.py", line 49, in fit
learn.fit_one_cycle(epochs, learning_rate)
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/callback/schedule.py", line 119, in fit_one_cycle
self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd, start_epoch=start_epoch)
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 264, in fit
self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 199, in _with_events
try: self(f'before_{event_type}'); f()
^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 253, in _do_fit
self._with_events(self._do_epoch, 'epoch', CancelEpochException)
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 199, in _with_events
try: self(f'before_{event_type}'); f()
^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 248, in _do_epoch
self._do_epoch_validate()
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 244, in _do_epoch_validate
with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 199, in _with_events
try: self(f'before_{event_type}'); f()
^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/learner.py", line 205, in all_batches
for o in enumerate(self.dl): self.one_batch(*o)
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/load.py", line 127, in __iter__
for b in _loaders[self.fake_l.num_workers==0](self.fake_l):
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
return self._process_data(data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
data.reraise()
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/_utils.py", line 722, in reraise
raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/transforms.py", line 261, in encodes
return TensorCategory(self.vocab.o2i[o])
~~~~~~~~~~~~~~^^^
KeyError: 'S'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 41, in fetch
data = next(self.dataset_iter)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/load.py", line 138, in create_batches
yield from map(self.do_batch, self.chunkify(res))
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/basics.py", line 230, in chunked
res = list(itertools.islice(it, chunk_sz))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/load.py", line 168, in do_item
try: return self.after_item(self.create_item(s))
^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/load.py", line 175, in create_item
if self.indexed: return self.dataset[s or 0]
~~~~~~~~~~~~^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/core.py", line 447, in __getitem__
res = tuple([tl[it] for tl in self.tls])
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/core.py", line 447, in <listcomp>
res = tuple([tl[it] for tl in self.tls])
~~^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/core.py", line 406, in __getitem__
return self._after_item(res) if is_indexer(idx) else res.map(self._after_item)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/core.py", line 366, in _after_item
def _after_item(self, o): return self.tfms(o)
^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/transform.py", line 208, in __call__
def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/transform.py", line 158, in compose_tfms
x = f(x, **kwargs)
^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/transform.py", line 81, in __call__
def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/transform.py", line 91, in _call
return self._do_call(getattr(self, fn), x, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/transform.py", line 97, in _do_call
return retain_type(f(x, **kwargs), x, ret)
^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastcore/dispatch.py", line 120, in __call__
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/home/fred/anaconda3/envs/ocr/lib/python3.11/site-packages/fastai/data/transforms.py", line 263, in encodes
raise KeyError(f"Label '{o}' was not included in the training dataset") from e
KeyError: "Label 'S' was not included in the training dataset"

I do have a folder "Dataset" with a subfolder "S" that contains a file called "S-38.jpg", sized 106 x 164 pixels. What can I do? Thank you.
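The traceback shows the lookup `self.vocab.o2i[o]` failing during the validation pass, so the label 'S' was requested but is missing from the vocabulary built from the training items. One possible cause (an assumption, since the report does not show how train_model.py splits the data) is that the "S" folder holds so few images that a random train/validation split placed all of them in the validation set. A minimal sketch for checking this, using only the standard library, follows. The helper names `count_images_per_class` and `sparse_classes` are hypothetical and not part of the project; the folder name "Dataset" is taken from the report.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(dataset_dir):
    """Return {subfolder name: number of image files directly inside it}."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def sparse_classes(counts, min_images=2):
    """Return the sorted class names that have fewer than min_images images."""
    return sorted(name for name, n in counts.items() if n < min_images)


# Hypothetical usage: report classes that a random split could leave
# entirely in the validation set. Runs only if the folder exists.
dataset = Path("Dataset")
if dataset.is_dir():
    counts = count_images_per_class(dataset)
    for name in sparse_classes(counts):
        print(f"class {name!r} has only {counts[name]} image(s)")
```

If "S" shows up in that list, adding more character images to that folder before training is one thing to try. With a single image per class, a random split can leave the class out of the training set entirely.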
