uyzhang / jclip

jclip's Introduction

JCLIP

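JCLIP is the Jittor version of CLIP. CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. Given an image, it can be instructed in natural language to predict the most relevant text snippet without being directly optimized for that task, similar to the zero-shot capabilities of GPT-2 and GPT-3.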

Network architecture

Usage

Install dependencies

pip install jittor
pip install ftfy regex tqdm
python setup.py develop

Model weights

Download VIT-B-32, or use the conversion script below to convert the PyTorch weights to Jittor weights.

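import torch
import jittor as jt

# Load the PyTorch checkpoint and take its state dict.
clip = torch.load('ViT-B-32.pt').state_dict()

# Convert every tensor to float32 on the CPU before saving.
for k in clip.keys():
    clip[k] = clip[k].float().cpu()

# Save the converted weights in a form that clip.load can read.
jt.save(clip, 'ViT-B-32.pkl')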

demo

import jittor as jt
import jclip as clip
from PIL import Image

jt.flags.use_cuda = 1

model, preprocess = clip.load("ViT-B-32.pkl")

image = preprocess(Image.open("CLIP.png")).unsqueeze(0)

text = clip.tokenize(["a diagram", "a dog", "a cat"])

with jt.no_grad():
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).numpy()

print("Label probs:", probs)  # prints: [[0.9927937  0.00421068 0.00299572]]
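The last step of the demo turns the per-image logits into label probabilities with a softmax. A minimal pure-Python sketch of that step follows. The logit values are hypothetical and are not taken from the model; only the three prompts come from the demo.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Prompts from the demo; the logits are made-up illustration values.
labels = ["a diagram", "a dog", "a cat"]
example_logits = [25.0, 19.5, 19.2]

probs = softmax(example_logits)
best = labels[probs.index(max(probs))]
print("Most likely label:", best)
print("Label probs:", [round(p, 4) for p in probs])
```

The probabilities sum to 1, and the prompt with the largest logit gets the largest probability, which is how the demo output should be read.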

Baseline for the 4th Jittor Competition

Training-free version

python baseline.py

Training version

python baseline_ft.py

Either script produces result.txt. Package it as a zip and submit it.
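The packaging step can be scripted with Python's standard zipfile module. This is a sketch under assumptions: the archive name result.zip and the flat layout (result.txt at the archive root) are guesses, and pack_result is a hypothetical helper, so check the competition's submission rules before relying on it.

```python
import zipfile
from pathlib import Path


def pack_result(result_path="result.txt", zip_path="result.zip"):
    """Write result_path into a new zip archive and return the archive path.

    Both default names are assumptions, not requirements stated by the README.
    """
    result_path = Path(result_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps only the file name, so the archive has no directories.
        zf.write(result_path, arcname=result_path.name)
    return Path(zip_path)
```

Calling pack_result() in the directory that contains result.txt writes result.zip next to it.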


jclip's Issues

Baseline.py produces an incorrect submission result

After running Baseline.py, the submitted result is wrong; baseline-ft.py works normally.

use

If Jittor was installed through an Ubuntu Docker image, how should this baseline be used?

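baseline_ft version gives identical classification results for every image

After running baseline, the results are normal and the classifications match visual intuition. After running baseline_ft, however, the top-5 class IDs are identical for every test image, as shown in the figure below.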

module 'jittor' has no attribute 'triu_'

Running demo.py raises an error:
Traceback (most recent call last):
  File "demo.py", line 7, in <module>
    model, preprocess = clip.load("ViT-B-32.pkl")
  File "/data1/mcy/code/DailyProject/jittor-race-few-shot/baseline/JCLIP-main/jclip/clip.py", line 159, in load
    model = build_model(state_dict)
  File "/data1/mcy/code/DailyProject/jittor-race-few-shot/baseline/JCLIP-main/jclip/model.py", line 279, in build_model
    transformer_width, transformer_heads, transformer_layers)
  File "/data1/mcy/code/DailyProject/jittor-race-few-shot/baseline/JCLIP-main/jclip/model.py", line 160, in __init__
    attn_mask=self.build_attention_mask())
  File "/data1/mcy/code/DailyProject/jittor-race-few-shot/baseline/JCLIP-main/jclip/model.py", line 193, in build_attention_mask
    mask = jt.triu_(mask, 1) # zero out the lower diagonal
AttributeError: module 'jittor' has no attribute 'triu_'
Everything in the earlier Jittor framework setup ran normally.
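The failing line builds the causal attention mask for the text transformer: the mask holds -inf strictly above the diagonal and 0 on and below it. As a rough illustration of what that mask contains, and not a confirmed fix for this issue, the same values can be produced in plain Python. build_causal_mask is a hypothetical helper; converting its result with jt.array(...) is an assumption that has not been tested against the Jittor version in the report.

```python
def build_causal_mask(context_length):
    """Return a context_length x context_length nested list.

    Entries strictly above the diagonal are -inf (future tokens are masked);
    entries on and below the diagonal are 0.0.
    """
    neg_inf = float("-inf")
    return [
        [neg_inf if col > row else 0.0 for col in range(context_length)]
        for row in range(context_length)
    ]


# Small example so the pattern is easy to see.
mask = build_causal_mask(4)
for row in mask:
    print(row)
```

Row i of the mask lets token i attend only to tokens 0 through i.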

clip.load("ViT-B-32.pkl") raises an error

Traceback (most recent call last):
  File "baseline.py", line 15, in <module>
    model, preprocess = clip.load("ViT-B-32.pkl")
  File "/root/contest/JCLIP-main/jclip/clip.py", line 159, in load
    model = build_model(state_dict)
  File "/root/contest/JCLIP-main/jclip/model.py", line 279, in build_model
    transformer_width, transformer_heads, transformer_layers)
  File "/root/contest/JCLIP-main/jclip/model.py", line 155, in __init__
    output_dim=embed_dim)
  File "/root/contest/JCLIP-main/jclip/model.py", line 99, in __init__
    self.transformer = Transformer(width, layers, heads)
  File "/root/contest/JCLIP-main/jclip/model.py", line 73, in __init__
    for _ in range(layers)
  File "/root/contest/JCLIP-main/jclip/model.py", line 73, in
    for _ in range(layers)
  File "/root/contest/JCLIP-main/jclip/model.py", line 47, in __init__
    self.attn = MultiheadAttention(d_model, n_head)
  File "/root/contest/JCLIP-main/jclip/mha.py", line 519, in __init__
    self.in_proj_bias = jt.empty(3 * embed_dim, **factory_kwargs)
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.ops.empty)).

Types of your inputs are:
self = module,
args = (int, ),
kwargs = {dtype=builtin_function_or_method, },

The function declarations are:
VarHolder* empty(NanoVector shape, NanoString dtype=ns_float32)

Failed reason:[f 0714 20:31:21.807070 08 pyjt_jit_op_maker.cc:18171] Not a valid call.
