rinnakk / japanese-clip
Japanese CLIP by rinna Co., Ltd.
Home Page: https://huggingface.co/rinna
License: Apache License 2.0
I tried to compute the CLIP loss using this code:
```python
from PIL import Image
import torch
import japanese_clip as ja_clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# ja_clip.available_models()
# ['rinna/japanese-clip-vit-b-16', 'rinna/japanese-cloob-vit-b-16']
model, preprocess = ja_clip.load("rinna/japanese-clip-vit-b-16", cache_dir="/tmp/japanese_clip", device=device)
tokenizer = ja_clip.load_tokenizer()

image = preprocess(Image.open("./img/dog.jpeg")).unsqueeze(0).to(device)
encodings = ja_clip.tokenize(
    texts="象",  # "elephant"
    max_seq_len=77,
    device=device,
    tokenizer=tokenizer,  # optional; if not passed, the tokenizer is loaded each time
)

with torch.no_grad():
    res = model(input_ids=encodings['input_ids'], pixel_values=image, return_loss=True)
print("clip loss:", res.loss.item())
```
But no matter how many times I change the image, the CLIP loss is always 0.
If my usage is wrong, could you tell me how to use it correctly?
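For what it's worth, I suspect this happens because I pass a single text: assuming the model uses the standard batched contrastive (InfoNCE) loss, a batch of one image-text pair produces a 1×1 similarity matrix, and cross-entropy over a single logit is always log 1 = 0, regardless of the image. A minimal check with plain PyTorch:

```python
import torch
import torch.nn.functional as F

# With one image-text pair the similarity "matrix" is 1x1: softmax over a
# single logit gives probability 1, so the cross-entropy is exactly 0.
logits = torch.tensor([[25.3]])                # any single similarity value
labels = torch.arange(logits.size(0))          # [0]
print(F.cross_entropy(logits, labels).item())  # 0.0

# With two pairs the off-diagonal similarities matter, so the loss is
# generally non-zero.
logits2 = torch.tensor([[3.0, 1.0], [0.5, 2.0]])
print(F.cross_entropy(logits2, torch.arange(2)).item())
```

If that is the cause, passing a list of several texts (and/or several images) should make the loss non-trivial.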
requirements.txt pins sentencepiece >=0.1.91 and <=0.1.94, but there is no wheel of those sentencepiece versions for Apple-silicon macOS, so whenever I run
$ pip install git+https://github.com/rinnakk/japanese-clip.git
I get an error, as shown in the screenshot below.
However, when I downloaded the zip file, changed the sentencepiece version in requirements.txt to the latest, and installed from that folder, all the required dependencies were installed successfully.
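In case it helps others on Apple silicon, this is roughly what I did. It is only a sketch: the branch name in the zip URL and the exact form of the pin in requirements.txt are assumptions.

```shell
# Download the repo as a zip instead of installing straight from git
curl -L -o japanese-clip.zip https://github.com/rinnakk/japanese-clip/archive/refs/heads/master.zip
unzip japanese-clip.zip && cd japanese-clip-master

# Relax the sentencepiece pin so pip can pick a version that ships an
# Apple-silicon wheel (-i.bak works with both BSD and GNU sed)
sed -i.bak 's/^sentencepiece.*/sentencepiece/' requirements.txt

# Install from the local folder
pip install .
```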
Thank you very much for great work.
I would like to know how Japanese CC12M was generated. Did you translate it with your own machine translation model, or with some other public service?
Hi,
Awesome repo
Could you please share the ImageNet class names / prompts you used for zero-shot evaluation?
I'm trying to build a good evaluation framework for CLIP (https://github.com/LAION-AI/CLIP_benchmark), and having these class names in other languages would be valuable.
Thanks for your help!
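For concreteness, this is the kind of zero-shot pipeline I am trying to reproduce. The class names, the prompt template, and the random features below are placeholders, not what this repo actually uses; in practice the features would come from the model's text and image encoders.

```python
import torch
import torch.nn.functional as F

# Placeholder Japanese class names and prompt template (NOT the repo's own).
class_names = ["犬", "猫", "象"]                     # dog, cat, elephant
prompts = [f"{name}の写真" for name in class_names]  # "a photo of a {name}"

# Stand-ins for encoder outputs; in practice, encode `prompts` and the
# input image with the model, then L2-normalize.
torch.manual_seed(0)
text_features = F.normalize(torch.randn(len(prompts), 512), dim=-1)
image_features = F.normalize(torch.randn(1, 512), dim=-1)

# Zero-shot classification: pick the class whose prompt embedding has the
# highest cosine similarity with the image embedding.
probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print(class_names[probs.argmax(dim=-1).item()], probs.tolist())
```

Having the actual class-name and prompt lists would let this match your reported numbers.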
First of all, great work!!
I strongly believe this model has made a big contribution to the Vision-and-Language community in Japan.
I find there is no description of how the vision encoder in CLIP/CLOOB was initialized.
Did you use some pre-trained weights available in HuggingFace, or just randomly initialize and train it from scratch?
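In case it is useful, one way I tried to probe this myself is to compare the released checkpoint's vision-tower state dict against candidate pretrained checkpoints: if the weights were copied, the shared tensors match exactly. A generic sketch with a toy model (the helper function is mine, not from this repo):

```python
import torch
import torch.nn as nn

def fraction_of_matching_tensors(sd_a, sd_b, atol=1e-6):
    """Share of identically-shaped tensors in both state dicts that are equal."""
    shared = [k for k in sd_a if k in sd_b and sd_a[k].shape == sd_b[k].shape]
    if not shared:
        return 0.0
    matches = sum(bool(torch.allclose(sd_a[k], sd_b[k], atol=atol)) for k in shared)
    return matches / len(shared)

# Toy demo: a weight-copied model matches fully; an independently
# re-initialized one matches (with overwhelming probability) not at all.
torch.manual_seed(0)
a = nn.Linear(8, 8)
b = nn.Linear(8, 8)
b.load_state_dict(a.state_dict())
c = nn.Linear(8, 8)
print(fraction_of_matching_tensors(a.state_dict(), b.state_dict()))  # 1.0
print(fraction_of_matching_tensors(a.state_dict(), c.state_dict()))
```

Still, a pointer to the actual initialization (pretrained HF weights vs. from scratch) in the README would be great.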