
ankanbhunia / handwriting-transformers

Handwriting-Transformers (ICCV21)

License: MIT License

Python 88.20% Jupyter Notebook 11.80%
handwriting generative-ai handwriting-generation handwriting-synthesis mimic style-transfer

handwriting-transformers's Issues

params.py file

Hi,

Thanks for sharing the great work!

I am training this model from scratch, but some of the training parameters mentioned in the paper, such as IMG_HEIGHT and the learning rate, differ from the ones given in the params.py file.

Can you please share the parameters that were used to train the network?
Thanks.

Color images

Hello!

I need to generate color images for my project. I want to try changing your architecture accordingly. I guess this should be easy to do. It is enough to change the number of channels at input and output, right? I'm wondering what you think of this possibility. Any thoughts welcome!
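For context on why this change is plausibly small: only the first and last convolution layers touch the image channels, so only their weight tensors change shape. A minimal numpy sketch of the shapes involved (illustrative only, not the repo's actual modules):

```python
import numpy as np

def conv_weight(out_ch, in_ch, k=3):
    """Weight tensor for a 2D convolution, laid out (out_ch, in_ch, k, k)."""
    return np.zeros((out_ch, in_ch, k, k))

# Grayscale generator: first conv reads 1 channel, last conv writes 1.
first_gray = conv_weight(64, 1)
last_gray = conv_weight(1, 64)

# RGB variant: only the channel dimensions of these two layers change;
# every intermediate layer keeps its original shape.
first_rgb = conv_weight(64, 3)
last_rgb = conv_weight(3, 64)

print(first_gray.shape, first_rgb.shape)  # (64, 1, 3, 3) (64, 3, 3, 3)
```

Note that pretrained grayscale weights for those two layers would no longer load directly; they would need to be reinitialized or tiled across the new channel dimension.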

RIMES weights?

Great project. I saw the paper had experiments with RIMES, any chance you still have those weights? Thanks!

About pre-trained model.

Sir, the method proposed in your paper is an impressive improvement.
I am very curious how the "fid: four scenes" results in your paper were obtained.
In addition, would you be willing to share the pre-trained model?
I am sorry for my recklessness.
Good luck.

Regarding the issue of loading the dataset

Hello, I have been troubled by some questions for a long time, and I hope to get answers. Regarding the parameters in prepare_data.py: how were TRAIN_IDX = 'gan.iam.tr_va.gt.filter27' and TEST_IDX = 'gan.iam.test.gt.filter27' obtained? And are the IAM_WORD_DATASET_PATH and XMLS_PATH directories, used to store the word images and the related XML files respectively, downloaded from the official IAM dataset website?
I hope to receive your answer; thank you very much.

When will this code be complete?

Hello, thanks for the impressive work.
It seems some files are missing (the lexicon file and some text files have not been uploaded) and install.md is unfinished.
When will this code be updated?
@ankanbhunia

For other languages' handwriting generation: how to properly fine-tune the model?

@ankanbhunia

While you mention:
"You can train the model in any custom dataset other than IAM and CVL. The process involves creating a dataset_name.pickle file and placing it inside files folder. The structure of dataset_name.pickle is a simple python dictionary."

I went through the code, and the approach you suggest seems to retrain the model on another dataset from scratch rather than fine-tune your model. Since the paper does not discuss this much, I'd like your opinion: if I want to apply the model to generate handwriting in another language, e.g. Japanese, is there a way to quickly fine-tune your model on new Japanese handwriting data, or do I need to retrain it from scratch?
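The README quoted above only says the pickle is "a simple python dictionary", so as an illustration, here is a hypothetical builder for dataset_name.pickle. The key names ('img', 'label', writer-id keys) are assumptions on my part; the real schema expected by the repo should be checked against data/dataset.py:

```python
import pickle
import numpy as np

# Hypothetical schema: writer id -> list of {'img': HxW uint8 array, 'label': str}.
# The actual keys expected by the repo may differ; inspect data/dataset.py.
dataset = {
    'writer_001': [
        {'img': np.zeros((32, 96), dtype=np.uint8), 'label': 'hello'},
        {'img': np.zeros((32, 64), dtype=np.uint8), 'label': 'world'},
    ],
}

with open('dataset_name.pickle', 'wb') as f:
    pickle.dump(dataset, f)

# Round-trip check that the file loads back intact.
with open('dataset_name.pickle', 'rb') as f:
    restored = pickle.load(f)
print(sorted(restored))  # ['writer_001']
```

The resulting file would then be placed inside the files folder as the README describes.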

Missing the positional encodings in the encoder

Hi, sir, thanks for the impressive work. The paper mentions: "To retain information regarding the order of input sequences being supplied, we add the positional encodings [23] to the input of each attention layer". However, the released code adds positional encodings only to the Multi-Head Attention of the decoder, not to that of the encoder. Is it better not to apply positional encodings in the encoder?
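For reference, the encodings the paper cites ([23], "Attention Is All You Need") are the standard sinusoidal ones; a minimal numpy sketch of what adding them to an encoder input would look like (illustrative, not the repo's code):

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Standard sinusoidal positional encodings from 'Attention Is All You Need'."""
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]               # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)  # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dims get sine
    pe[:, 1::2] = np.cos(angles)  # odd dims get cosine
    return pe

x = np.random.randn(16, 512)            # a sequence of 16 feature vectors
x_with_pe = x + sinusoidal_pe(16, 512)  # added before the attention layer
```

Since the encodings are fixed (not learned), adding them to the encoder path would not change the parameter count.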

The Fake and Real folders are empty after training completes

Hello author, thank you for your excellent paper. I have recently been looking at your code and ran into a problem: after training completes, no Fake or Real images are generated in the saved_images folder; both folders are empty. I would appreciate your guidance and help. Looking forward to your reply, thank you!

plan for code release?

Hi, many thanks for your innovative work. Just wondering, is there a plan for a code release? Thanks!

No such file or directory: '../CVL_32.pickle'

Thank you for publishing IAM_32.pickle in Issue #4
I cloned this project into my personal Colab notebook and executed train.py. Now the file CVL_32.pickle is required.

Traceback (most recent call last):
  File "./Handwriting-Transformers/train.py", line 42, in <module>
    TextDatasetObjval = TextDatasetval()
  File "/content/Handwriting-Transformers/data/dataset.py", line 109, in __init__
    file_to_store = open(base_path, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '../CVL_32.pickle'

Would you please publish this file also?

Thank you very much. 🙇
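Until the file is published, a small guard around the load gives a clearer error than the raw traceback above. A sketch only: the path matches the traceback, but the helper name and message are mine, not the repo's:

```python
import os
import pickle

def load_pickle(base_path):
    """Load a dataset pickle, failing with a readable message if it is absent."""
    if not os.path.exists(base_path):
        raise FileNotFoundError(
            f"Missing dataset file: {base_path!r}. Download or build it "
            "and place it at this path before running train.py.")
    with open(base_path, 'rb') as f:
        return pickle.load(f)
```

Dropping this in place of the bare open(base_path, "rb") in data/dataset.py would tell users exactly which file to supply.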

base_path = '../IAM_32.pickle'

Hello, what is in the file referenced by base_path = '../IAM_32.pickle' in your code? Can this file be published? If so, could you send me a private message? Thanks for reading!

Missing license

Love this work! Any chance you could add an MIT license or something? I've started my own fork, but wanted to make sure it remains open so people can keep building off of it. Thanks!

CVL_32.pickle

Hello, what is in the file referenced by base_path = './CVL_32.pickle' in your code? Can this file be published? If so, could you send me a private message? Thanks for reading!

Model results

Hi!

I've been playing around with this model locally following the instructions in the README, and my results don't seem to be nearly as good as yours. I'm following your instructions in #11 (comment) and then running prepare.py in my fork.

For instance even with different style prompts the model seems to generate very similar results for me
Real on left, Generated on right

IAM style 1: [image: image-9-IAM]

IAM style 2: [image: image-6_1]

Secondly, the CVL and IAM models give very different results from each other, but quite consistent results within each model across different styles:

CVL style 1: [image: image-9-CVL]

CVL style 2: [image: image-6]

Is there something obvious I'm missing, or do I need to train it with these writers in the dataset to get better results? Does the Google Drive contain the fully trained models that were used to generate the results in the paper?

Very cool project though - congrats!!

Cycle loss

I went through the codebase, and it seems the cycle loss has been commented out. Is there any particular reason why?
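For readers unfamiliar with the term: a cycle-consistency loss usually penalizes the reconstruction error after a round trip through two mappings, as in CycleGAN-style training. A generic numpy sketch of such a term (the repo's actual formulation may differ):

```python
import numpy as np

def cycle_loss(x, reconstructed, weight=1.0):
    """L1 cycle-consistency loss: weighted mean |x - F(G(x))|."""
    return weight * np.mean(np.abs(x - reconstructed))

x = np.ones((4, 32, 32))                # a batch of "input" images
assert cycle_loss(x, x) == 0.0          # perfect round trip costs nothing
print(cycle_loss(x, np.zeros_like(x)))  # 1.0
```

Such terms are sometimes disabled when their gradient conflicts with the adversarial objective, but only the authors can say why it was commented out here.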

Tool for text cropping?

I'm trying to generate something based on my own handwriting, what text cropping tool did you use? Wondering if I can skip the work of cropping every single word by hand.

Thanks!!
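In the absence of whatever tool the authors used, a simple projection-profile pass can split a scanned line into word crops automatically. A rough sketch assuming dark ink on a light background; the threshold and gap values are arbitrary and would need tuning for real scans:

```python
import numpy as np

def crop_words(line_img, ink_thresh=128, min_gap=10):
    """Split a grayscale line image into word crops using the column ink profile.

    Columns whose darkest pixel is below ink_thresh count as ink; runs of
    ink columns separated by at least min_gap blank columns become words.
    """
    has_ink = line_img.min(axis=0) < ink_thresh
    words, start, gap = [], None, 0
    for col, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = col        # a new word begins here
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # gap wide enough: close off the word
                words.append(line_img[:, start:col - gap + 1])
                start = None
    if start is not None:          # flush a word running to the right edge
        words.append(line_img[:, start:])
    return words
```

This won't match hand-cropping quality on noisy pages, but it removes most of the manual work before a final visual check.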

Generate custom handwriting?

Could this be used to reproduce one particular handwriting style? (That is, one not in the IAM database -- such as my own handwriting, or the handwriting of a famous historical figure.)

If so, could you please walk me through how to do that? I would be very appreciative.

Problem loading the pretrained weights in torchvision\models\_utils.py

C:\Users\Asus\anaconda3\envs\pytorch\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=ResNet18_Weights.IMAGENET1K_V1. You can also use weights=ResNet18_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
