
layoutlmv3_fine_tuning's Introduction

Hi there, I'm Manikanth Palamakula


I am a Data Science Specialist at LTIMindtree, India. I am a self-taught programmer, passionate about machine learning, deep learning, and AI, and I enjoy learning new technologies and building cool stuff. I love solving problems and am always open to learning new things.


  • Hands-on experience in Machine Learning and Artificial Intelligence
  • I am a fast learner and can quickly adapt to new technologies and environments
  • I am a team player and am always willing to learn and help others

Languages and Tools:

python Keras django linux html css c flask docker mysql android Java iOS Git GitHub Terminal XCode Visual Studio Code AWS



layoutlmv3_fine_tuning's People

Contributors

manikanthp


layoutlmv3_fine_tuning's Issues

"image" key is expected in task data [assume: item["data"]

I tried to import the file "TC_label-studio_input_file.json", which was generated by the script "Create_LMV3_dataset_with_Paddle.py".

However, uploading this JSON file to Label Studio gives the following error. It expects an "image" key in "data". When I change "ocr" to "image", the file loads properly.

Error at item 0: "image" key is expected in task data [assume: item["data"] = task root with values] :: {'data': {'ocr':

A screenshot is attached.

Can you let me know what is going wrong?
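
A possible explanation, offered as an assumption rather than a confirmed diagnosis: Label Studio validates each task's "data" keys against the variables referenced in the project's labeling config. If the config uses value="$image", a task whose data only has an "ocr" key is rejected. Either the config can reference $ocr, or the key in the generated file can be renamed, which is what the workaround above does by hand. A minimal sketch of the renaming approach follows; the helper name rename_data_key is hypothetical and not part of the repository.

```python
import copy


def rename_data_key(tasks, old_key="ocr", new_key="image"):
    """Return a copy of Label Studio tasks with data[old_key] renamed to data[new_key].

    Hypothetical helper: assumes each task is a dict with a "data" dict,
    as in the error message quoted above.
    """
    renamed = copy.deepcopy(tasks)  # leave the caller's list untouched
    for task in renamed:
        data = task.get("data", {})
        # Only rename when the old key exists and the new key is free.
        if old_key in data and new_key not in data:
            data[new_key] = data.pop(old_key)
    return renamed


# Example task shaped like the one in the error message (URL is made up).
example_tasks = [{"data": {"ocr": "http://localhost:8080/page_1.png"}, "predictions": []}]
fixed_tasks = rename_data_key(example_tasks)
```

To apply this to the generated file, load it with json.load, pass the resulting list through the helper, and write it back with json.dump before importing it into Label Studio.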

Inference Code

Hi Manikanth,

It has been very good to see your code working out of the box. Many viewers have asked for the inference code from this piece.

Can you please expedite the inference code for this repository?

Thanks
Deepak

bbox code

Why are you multiplying by 100 in this code?

bbox = {
    'x': 100 * four_co_ord[0] / image_width,
    'y': 100 * four_co_ord[1] / image_height,
    'width': 100 * four_co_ord[2] / image_width,
    'height': 100 * four_co_ord[3] / image_height,
    'rotation': 0
}

Create_LMv3_dataset_with_paddleOCR.py, line 95
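
For context, Label Studio stores rectangle coordinates as percentages of the image size (0 to 100), not as pixels, which would explain the factor of 100 in the snippet above. The sketch below restates that conversion as a standalone function. It assumes four_co_ord holds [x, y, width, height] in pixels, which is how the snippet appears to use it; the function name to_percent_bbox is hypothetical.

```python
def to_percent_bbox(four_co_ord, image_width, image_height):
    """Convert a pixel box [x, y, width, height] to percent-of-image units.

    Assumption: four_co_ord is [x, y, width, height] in pixels.
    """
    x, y, width, height = four_co_ord
    return {
        "x": 100 * x / image_width,            # left edge, % of image width
        "y": 100 * y / image_height,           # top edge, % of image height
        "width": 100 * width / image_width,    # box width, % of image width
        "height": 100 * height / image_height, # box height, % of image height
        "rotation": 0,
    }


# A 200x40 pixel box at (50, 100) on a 1000x800 image.
example_bbox = to_percent_bbox([50, 100, 200, 40], 1000, 800)
```

Going the other way (percent back to pixels) divides by 100 and multiplies by the image dimension, which is needed when converting the Label Studio export into training data.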

ValueError: Expected input batch_size (1280) to match target batch_size (1024).

Hi @manikanthp

Thanks for sharing the repo.

I tried to run this code on my custom dataset. I have five classes, and I am attaching the Label Studio output file and the converted file.

When I try to run the code, I get the following error.

Traceback (most recent call last):
  File "F:\PyCharmProjects\LayoutLMTrial\main.py", line 35, in <module>
    train_loss = train_fn(dataload, model, optimizer)
  File "F:\PyCharmProjects\LayoutLMTrial\engine.py", line 11, in train_fn
    _, loss = model(**data)
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\trainer.py", line 34, in forward
    loss = loss_fn(output,lables)
  File "F:\PyCharmProjects\LayoutLMTrial\trainer.py", line 13, in loss_fn
    return nn.CrossEntropyLoss()(pred.view(-1,4),target.view(-1))
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\modules\loss.py", line 1179, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "F:\PyCharmProjects\LayoutLMTrial\venv\lib\site-packages\torch\nn\functional.py", line 3053, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (1280) to match target batch_size (1024).

The only change I made to the code was changing the number of classes from 4 to 5 in main.py, as shown below:

model = ModelModule(5)

Can you please help me fix this issue?

Thanks
Rishabh Gupta
Training_json_1.json
Training_layoutLMV3_1.json
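
The numbers in the error are consistent with the hard-coded 4 in pred.view(-1,4) in trainer.py (visible in the traceback) no longer matching the five-class model output. The sketch below only reproduces the shape arithmetic in plain Python. It assumes the model emits one logit per class for each of the 1024 target tokens; the helper rows_after_view is hypothetical and simply mirrors what tensor.view(-1, width) does to the row count.

```python
def rows_after_view(total_elements, width):
    """Row count that tensor.view(-1, width) would produce for a tensor
    with total_elements entries (hypothetical helper for illustration)."""
    if total_elements % width != 0:
        raise ValueError(f"{total_elements} elements cannot be viewed as (-1, {width})")
    return total_elements // width


target_tokens = 1024  # target batch_size from the error message
num_classes = 5       # from ModelModule(5)

# Assumption: one logit per class per token, so 1024 * 5 = 5120 values.
logit_elements = target_tokens * num_classes

rows_with_hardcoded_4 = rows_after_view(logit_elements, 4)          # mismatched
rows_with_num_classes = rows_after_view(logit_elements, num_classes)  # matches target
```

With width 4 the logits are reshaped into 1280 rows, which is the "input batch_size" in the error; with width 5 they give 1024 rows and line up with the targets. Under that assumption, replacing the literal 4 in loss_fn with the number of classes (for example pred.view(-1, 5), or a value passed in from ModelModule) should resolve the mismatch.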

Inference.py || utils.py

Traceback (most recent call last):
  File "C:\Users\Admin\PycharmProjects\LayoutLMV3_Fine_Tuning-main\src\Inference.py", line 22, in <module>
    test_dict, width_scale, height_scale = dataSetFormat(image)
                                           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\PycharmProjects\LayoutLMV3_Fine_Tuning-main\src\utils.py", line 71, in dataSetFormat
    test_dict['bboxes'].append(scale_bounding_box(process_bbox(item[0]), width, height))
                                                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\PycharmProjects\LayoutLMV3_Fine_Tuning-main\src\utils.py", line 58, in process_bbox
    return [box[0][0], box[1][1], box[2][0] - box[0][0], box[2][1] - box[1][1]]
            ~~~~~~^^^
TypeError: 'float' object is not subscriptable
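
The TypeError means box[0] is a float, so the value reaching process_bbox is not a list of four [x, y] corner points. One plausible cause, stated as an assumption, is that the OCR result has a different nesting depth than utils.py expects, so item[0] is a single point or a flat list rather than a quadrilateral. The sketch below is a checked variant of the same conversion that fails with a clearer message; the name process_bbox_checked is hypothetical, and it assumes boxes arrive as plain lists or tuples.

```python
def process_bbox_checked(box):
    """Convert four [x, y] corner points to [x, y, width, height].

    Same arithmetic as process_bbox in the traceback above, with an
    explicit shape check added (hypothetical variant for debugging).
    """
    is_quad = (
        isinstance(box, (list, tuple))
        and len(box) == 4
        and all(isinstance(point, (list, tuple)) and len(point) == 2 for point in box)
    )
    if not is_quad:
        raise ValueError(f"expected four [x, y] corner points, got: {box!r}")
    return [box[0][0], box[1][1], box[2][0] - box[0][0], box[2][1] - box[1][1]]


# Corners in the order top-left, top-right, bottom-right, bottom-left.
example_quad = [[10, 20], [110, 20], [110, 60], [10, 60]]
example_box = process_bbox_checked(example_quad)
```

Printing the offending item from the ValueError shows how the OCR output is actually nested, which indicates whether the loop in dataSetFormat needs to index one level deeper or shallower.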
