
taited / sgdiff


Official implementation of SGDiff (ACM MM '23)

Home Page: https://taited.github.io/sgdiff-project

License: Apache License 2.0

Languages: Python 41.18%, Jupyter Notebook 58.80%, Dockerfile 0.01%, Shell 0.02%

Topics: diffusion, fashion, glide, sgdiff, multimedia, style, style-transfer

sgdiff's People

Contributors

ckkelvinchan, congee524, endlesssora, ferryhuang, hejm37, hellock, innerlee, leoxing1996, liuwenran, magicdream2222, nbei, okotaku, plyfager, quincylin1, rangeking, ruoningyu, ryanxingql, sheffieldcao, sunnyxiaohu, taited, vongolawu, wangruohui, wwhio, xiaomile, xinntao, yanxingliu, yaochaorui, yshuo-li, z-fran, zengyh1900

sgdiff's Issues

Dataset

Thanks for the great work! When will the dataset be available?

Implementation of perceptual loss

Thank you for an outstanding job!

When do you plan to release the training code? In particular, does the perceptual loss in the paper use StableDiffusionPipeline to obtain the generated image after each noise estimation?

Looking forward to your reply.

training code?

Hello, I am very interested in your work. When can you release the training code?

Results are not good

I used this picture as the input:

[image: 9]

With "Vincent van Gogh's Starry Night" as the text prompt, the result is not as good as expected:

[image: results]

How to do multi-head attention when the shapes of Q, K, and V are different?

The SCA module takes semantic features from CLIP and text features from CLIP. Q comes only from the text, while K and V are formed by adding text and image features. The figure in the paper shows Q, K, and V with different shapes, so how is Q@K computed? Is this your SCA code?

    def forward(self, img, text_emb):
        if self.clip_norm:
            img = (img + 1) / 2
            img = F.batch_norm(img, self.mean.to(self.device),
                               self.std.to(self.device))
        img = F.interpolate(img, (224, 224))
        image_features = self.model(img)
        if self.last_layer_proj:
            image_features = torch.einsum('bld,ds->bls', image_features,
                                          self.model.proj)
        if self.cross_attn is not None:
            emb_features = self.cross_attn(
                text_emb.permute(0, 2, 1),
                image_features.permute(0, 2, 1)).permute(0, 2, 1)
        if self.skip_module is not None:
            if self.learned_length:
                residual = self.skip_module(emb_features.permute(0, 2, 1))
                residual = residual.permute(0, 2, 1)
            else:
                residual = self.skip_module(emb_features)
            text_emb += residual
        return text_emb
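On the general shape question: in standard cross-attention, Q may have a different number of tokens than K and V, and the raw text and image features may have different widths. Learned projections map them to a shared width d, so Q@K^T has shape (Lq, Lk) and the output keeps the query length. The following is a minimal single-head sketch in plain Python, not the SGDiff SCA module; all sizes, the random weights, and the names `text_tokens`, `image_tokens`, `w_q`, `w_k`, and `w_v` are assumptions made up for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x p) matrix, both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, key, value):
    """Single-head scaled dot-product attention.

    query: (Lq, d), key: (Lk, d), value: (Lk, dv)
    returns: output (Lq, dv) and attention weights (Lq, Lk)
    """
    d = len(query[0])
    key_t = [list(col) for col in zip(*key)]      # (d, Lk)
    scores = matmul(query, key_t)                 # (Lq, Lk)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, value), weights


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights or encoder features (illustrative only)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
text_tokens = random_matrix(4, 8, rng)     # hypothetical: 4 text tokens, width 8
image_tokens = random_matrix(10, 12, rng)  # hypothetical: 10 image tokens, width 12

# Projections bring both modalities to a shared width d = 6.
w_q = random_matrix(8, 6, rng)
w_k = random_matrix(12, 6, rng)
w_v = random_matrix(12, 6, rng)

q = matmul(text_tokens, w_q)    # (4, 6)
k = matmul(image_tokens, w_k)   # (10, 6)
v = matmul(image_tokens, w_v)   # (10, 6)

out, weights = cross_attention(q, k, v)
print(len(out), len(out[0]))          # prints: 4 6
print(len(weights), len(weights[0]))  # prints: 4 10
```

Only the projected widths of Q and K have to match; the sequence lengths are free. In PyTorch, `torch.nn.MultiheadAttention` covers the same case through its `kdim` and `vdim` arguments, which set up separate key and value projections when their input widths differ from `embed_dim`.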
