
archersama / inttower

Source code of the CIKM 2022 and DLP-KDD Workshop 2022 Best Paper, IntTower: "IntTower: the Next Generation of Two-Tower Model for Pre-ranking System"

License: Apache License 2.0

Python 100.00%

inttower's Issues

Alibaba dataset

Can you please post the download link for the Alibaba dataset? I cannot find it at the link.

CUDA out of memory

RuntimeError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 23.99 GiB total capacity; 37.27 GiB already allocated; 0 bytes free; 37.33 GiB reserved in total by PyTorch)

I ran this code on a 24 GB GPU. This error always occurs after epoch 2, whatever batch_size I set. Is there anything wrong with my environment?

Question about serving the model

Hello!

First of all thanks a lot for your great article and for opening the code base.
I have a question regarding the model serving:
I understand that you create Faiss indices based on the multi-head latent representation of the items, but how do you query them? Do you use the multi-head latent representation of the last layer of the user tower? And after retrieving the top K items, do you compute the FE score to rerank the candidates?

Is the CIR contrastive loss removed?

I have a small question I would like to ask.

  1. Is the CIR contrastive loss not used in the end here?
    # total_loss = loss + reg_loss + self.aux_loss + contras
    total_loss = loss + reg_loss + self.aux_loss
    # print(total_loss, contras, loss)

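2. Why does the contrastive loss use y as the index to select the cos_sim score? With batch_size 256, for example, wouldn't that only ever select the first two scores, so the other 254 are never selected? Also, this kind of loss usually involves only positive examples, but here both positive and negative labels seem to be present?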

# Compute the loss
loss = torch.log(exp_scores.sum(dim=1)) - scores[range(scores.shape[0]), y]
loss = loss.mean()
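
The indexing concern in question 2 can be illustrated with a small sketch. It uses plain Python lists instead of tensors and assumes that `exp_scores` is the element-wise exponential of `scores`. The 4x4 matrix, the label vectors, and the helper names `select_scores` and `indexed_loss` are hypothetical and exist only for illustration; they are not taken from the repository.

```python
import math

def select_scores(scores, y):
    """Pure-Python equivalent of scores[range(n), y]: row i picks column y[i]."""
    return [row[label] for row, label in zip(scores, y)]

def indexed_loss(scores, y):
    """Mean over rows of log(sum(exp(row))) - row[y[i]], as in the snippet above."""
    picked = select_scores(scores, y)
    per_row = [
        math.log(sum(math.exp(s) for s in row)) - p
        for row, p in zip(scores, picked)
    ]
    return sum(per_row) / len(per_row)

# Hypothetical in-batch similarity matrix: row i is user i, column j is item j.
scores = [
    [0.9, 0.1, 0.2, 0.3],
    [0.2, 0.8, 0.1, 0.4],
    [0.3, 0.2, 0.7, 0.1],
    [0.1, 0.3, 0.2, 0.6],
]

y_click = [1, 0, 1, 0]  # binary click labels reused as column indices
y_diag = [0, 1, 2, 3]   # in-batch positives: row i is matched with item i

picked_by_click = select_scores(scores, y_click)  # only columns 0 and 1 are read
picked_by_diag = select_scores(scores, y_diag)    # reads the diagonal
```

With binary labels, every row reads either column 0 or column 1, so the remaining columns never enter the second term of the loss, regardless of batch size. Whether the intended target is the diagonal (as in the second label vector) is a question for the authors.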

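Computing similarity by matrix multiplication

Hello, I would like to ask: in the figure below, the matrix multiplication of item_temp and user_tmep inside the function fe_score() is used to compute vector similarity, right? But I don't seem to see anywhere that the vectors inside item_temp and user_tmep are normalized to the 0-1 range.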
image
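
To make the distinction in this question concrete, here is a minimal sketch of the difference between a raw dot product and a cosine similarity. The vectors and helper names are hypothetical, and nothing here claims to reflect what fe_score() actually does. Note that L2 normalization bounds the similarity to the range [-1, 1], not 0-1.

```python
import math

def dot(u, v):
    """Raw inner product: grows with the length of either vector."""
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)

def cosine(u, v):
    """Inner product of the L2-normalized vectors, bounded to [-1, 1]."""
    return dot(l2_normalize(u), l2_normalize(v))

user_vec = [3.0, 4.0]
item_vec = [6.0, 8.0]      # same direction as user_vec, twice the length
opposite_vec = [-3.0, -4.0]

raw = dot(user_vec, item_vec)             # depends on vector magnitudes
cos_same = cosine(user_vec, item_vec)     # depends only on direction
cos_opposite = cosine(user_vec, opposite_vec)
```

A plain matrix product of two embedding matrices computes the raw dot product for every pair, so without an explicit normalization step the scores also reflect vector magnitudes.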

How to deploy in real recommender systems

I have several questions:

  1. As far as I know, Faiss does not support a 'max' operation.
  2. For the i-th layer user representation, we compute the similarity score for each pair of heads, so do we need to retrieve H^2 times? If there are L layers, do we eventually need to retrieve L * H^2 times?

Thanks!
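
The counting in this question can be sketched under one stated assumption: that the score sums, over user heads, the maximum dot product against any item head (a MaxSim-style late interaction). That form is an assumption made for illustration, not confirmed by this thread. The sketch scores a candidate set by brute force rather than through Faiss, and all names and vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def max_sim_score(user_heads, item_heads):
    """Assumed score form: for each user head, take the best-matching item head, then sum."""
    return sum(max(dot(u, v) for v in item_heads) for u in user_heads)

def rank_candidates(user_heads, candidates, k):
    """Brute-force top-k over a dict of item_id -> list of item head vectors."""
    ranked = sorted(
        candidates.items(),
        key=lambda kv: max_sim_score(user_heads, kv[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:k]]

def pairwise_dot_count(num_layers, num_heads):
    """Dot products per (user, item) pair if every layer compares H user heads with H item heads."""
    return num_layers * num_heads * num_heads

user_heads = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "item_a": [[0.5, 0.2], [0.1, 0.9]],
    "item_b": [[0.3, 0.3], [0.2, 0.4]],
}

top = rank_candidates(user_heads, candidates, k=1)
cost = pairwise_dot_count(num_layers=3, num_heads=4)
```

Under this assumption, the L * H^2 figure is the number of dot products per user-item pair when candidates are scored directly. Whether it also translates into that many index lookups depends on how the index is organized, which the thread does not settle.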

Could you share the serving code?

Thanks for your great work! I wonder if you could provide example code showing how to deploy IntTower in a real scenario, such as how to run the multi-head Faiss search and MaxSim in parallel.

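Fusion for multi-objective serving

Hello, I would like to ask: if my pre-ranking stage has multiple objectives, such as CTR and CVR, how should they be fused at prediction time? So far I have thought of two options:
1. Use multi-head extraction on the top of the CTR tower and of the CVR tower separately. Pass the extracted CTR embedding and the extracted CVR embedding, each together with the multi-head user embedding, through the fe_score function, then fuse the CTR fe_score and the CVR fe_score with some weights to get the final score.
2. Use multi-head extraction on the top of the CTR tower and of the CVR tower separately, fuse the extracted CTR embedding and CVR embedding with some weights, then pass the fused embedding and the multi-head user embedding through the fe_score function to get the final score.
Thanks!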
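
The two options are not equivalent in general, which a small sketch can show. It assumes that fe_score takes a maximum over item heads for each user head and sums the results; that form, the weight `w`, the helper names, and all vectors are hypothetical and chosen only to make the difference visible.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def max_sim_score(user_heads, item_heads):
    """Assumed fe_score form: sum over user heads of the max dot product over item heads."""
    return sum(max(dot(u, v) for v in item_heads) for u in user_heads)

def fuse_scores(user_heads, ctr_heads, cvr_heads, w):
    """Option 1: score each objective separately, then mix the two scores."""
    return (w * max_sim_score(user_heads, ctr_heads)
            + (1.0 - w) * max_sim_score(user_heads, cvr_heads))

def fuse_embeddings(user_heads, ctr_heads, cvr_heads, w):
    """Option 2: mix the head embeddings element-wise, then score once."""
    mixed = [
        [w * a + (1.0 - w) * b for a, b in zip(ctr_h, cvr_h)]
        for ctr_h, cvr_h in zip(ctr_heads, cvr_heads)
    ]
    return max_sim_score(user_heads, mixed)

user_heads = [[1.0, 0.0]]
ctr_heads = [[1.0, 0.0], [0.0, 0.0]]  # CTR tower matches the user on head 0
cvr_heads = [[0.0, 0.0], [1.0, 0.0]]  # CVR tower matches the user on head 1

score_level = fuse_scores(user_heads, ctr_heads, cvr_heads, w=0.5)
embedding_level = fuse_embeddings(user_heads, ctr_heads, cvr_heads, w=0.5)
```

Because the maximum of a weighted mix never exceeds the weighted mix of the maxima for weights in [0, 1], option 2 can only match or undershoot option 1 under this assumed score form. Option 2 needs one scoring pass per item instead of two, so the choice trades serving cost against keeping each objective's best-matching head.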
