bbruceyuan / deepmatch-torch

「PyTorch」A deep matching model library for recommendations & advertising. It's easy to train models and to export representation vectors which can be used for ANN search.

License: MIT License

Python 99.78% Shell 0.22%

deepmatch dssm-pytorch mind-pytorch youtubednn-pytorch

deepmatch-torch's Introduction

Hi there 👋

Get to know me~

deepmatch-torch's People

wp-zhang shawndong98 tonylibing kiwii139 ajing artemr238 whutbd aqihuyuyu

deepmatch-torch's Issues

About the pooling operation on user behaviors in the MIND code

Discussed in #4

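Originally posted by xbingsun May 25, 2022
Hello. In the MIND implementation, I see that input_from_feature_columns is called to get the representation of the user's historical behaviors, and that this function uses get_varlen_pooling_list to pool those behaviors. What is the reason for doing this? The original deepmatch seems to use the same operation, but according to the original paper the input to the capsule layer should be the embeddings of the individual items (as shown in the figure below).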
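The difference the question points at can be made concrete with a small plain-Python sketch. It assumes mean pooling over the valid (non-padded) positions; `masked_mean_pool` and the toy embeddings are hypothetical and are not taken from the repository. Pooling collapses the history into a single vector, whereas a capsule (multi-interest) layer as described in the paper would consume one vector per item.

```python
def masked_mean_pool(seq_embs, seq_len):
    """Average the first `seq_len` embeddings; padded positions are ignored."""
    valid = seq_embs[:seq_len]
    dim = len(seq_embs[0])
    # max(seq_len, 1) avoids division by zero for an empty history
    return [sum(e[d] for e in valid) / max(seq_len, 1) for d in range(dim)]


# Toy history padded to length 3; the last row is padding.
hist = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
hist_len = 2

pooled = masked_mean_pool(hist, hist_len)  # a single vector of size dim
capsule_input = hist[:hist_len]            # one vector per behavior item
```

After pooling, the per-item structure is gone, so dynamic routing would have nothing to route over. That is why the paper feeds the unpooled sequence to the capsule layer.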

I'd like to ask: this function doesn't return anything. Is that a bug?

DeepMatch-Torch/deepmatch_torch/utils/utils.py

Line 5 in 322589c

def concat_fun(inputs, axis=-1):

Has DSSM been tested online?

As the title says: have these retrieval models, DSSM for example, actually been validated in production?

Why isn't pytorch_lightning in setup.py?

After installing on Linux, the code can't be run directly.

error: ModuleNotFoundError: No module named 'tensorflow.python.keras.preprocessing'

My machine is a MacBook Pro
python=3.7
deepctr-torch=0.2.2
tensorflow-macos 2.8.0

I ran run_fm_dssm.py from the DeepMatch-Torch examples, but it failed with the following error.
Traceback (most recent call last):
  File "/Users/henry/PycharmProjects/github/DeepMatch-Torch/examples/run_fm_dssm.py", line 8, in <module>
    from preprocess import gen_data_set, gen_model_input
  File "/Users/henry/PycharmProjects/github/DeepMatch-Torch/examples/preprocess.py", line 4, in <module>
    from tensorflow.python.keras.preprocessing.sequence import pad_sequences
ModuleNotFoundError: No module named 'tensorflow.python.keras.preprocessing'
WARNING:root:
DeepCTR-PyTorch version 0.2.7 detected. Your version is 0.2.2.
Use pip install -U deepctr-torch to upgrade. Changelog: https://github.com/shenweichen/DeepCTR-Torch/releases/tag/v0.2.7

The warning tells me to install deepctr-torch 0.2.7, but that version seems to depend on TensorFlow and cannot be installed:
(py38) henry@hzMacBookPro DeepMatch-Torch % pip3 install deepctr-torch==0.2.7
Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting deepctr-torch==0.2.7
  Using cached http://mirrors.aliyun.com/pypi/packages/d2/17/f392dfbaefdd6371335995c4f84cf3b5166cf907fdfa0aa4edc380fdfc5b/deepctr_torch-0.2.7-py3-none-any.whl (70 kB)
Requirement already satisfied: torch>=1.1.0 in /Users/henry/miniforge3/envs/py38/lib/python3.8/site-packages (from deepctr-torch==0.2.7) (1.11.0)
ERROR: Could not find a version that satisfies the requirement tensorflow (from deepctr-torch) (from versions: none)
ERROR: No matching distribution found for tensorflow
How can I solve this?
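One possible workaround, sketched under assumptions: the traceback shows that the example's preprocess.py only needs pad_sequences from TensorFlow, so a small local replacement would remove that import. The function below is a hypothetical stand-in written in plain Python, not the Keras implementation. It mirrors the Keras defaults as I understand them (padding="pre", truncating="pre", value=0), returns a list of lists rather than an array, and assumes maxlen is a positive integer.

```python
def pad_sequences(sequences, maxlen, padding="pre", truncating="pre", value=0):
    """Pad or truncate every sequence to exactly `maxlen` items.

    Hypothetical drop-in for the Keras helper; returns a list of lists.
    """
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # "pre" drops items from the front, "post" drops them from the end
            seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded


post = pad_sequences([[1, 2, 3], [4]], maxlen=2, padding="post", truncating="post")
pre = pad_sequences([[1, 2, 3], [4]], maxlen=2)
```

If the downstream code expects a NumPy array, the result can be wrapped in an array constructor. This does not resolve the separate pip error about the tensorflow requirement of deepctr-torch 0.2.7.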

Two Bugs regarding YoutubeDNN

There are two bugs in the code for the YoutubeDNN model.

First, a typo: gen_data_set_youteube should be spelled gen_data_set_youtube. (Not necessarily a bug lol)
Here's the first bug: gen_data_set_youteube produces negative samples ONLY, without any positive samples. Consequently, all training labels are 0.
The second one: [neg_list[item_idx] for item_idx in np.random.choice(neg_list, negsample)] is not correct. np.random.choice(neg_list, negsample) already returns values drawn from neg_list, so using them as indexes into neg_list picks the wrong items. The code should either use the sampled values directly or sample indexes instead.
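A minimal sketch of what a corrected sample generator could look like, assuming one positive row per history step plus negsample negatives for that step. `build_samples` and its row layout are hypothetical and are not the repository's function; the standard-library `random.Random.choices` stands in for `np.random.choice` (both sample with replacement by default).

```python
import random


def build_samples(user_id, hist, all_items, negsample, rng):
    """Return (user_id, history_prefix, item_id, label) rows for one user."""
    seen = set(hist)
    neg_list = [item for item in all_items if item not in seen]
    rows = []
    for i in range(1, len(hist)):
        prefix = hist[:i]
        # Positive sample: the item the user actually interacted with next.
        rows.append((user_id, prefix, hist[i], 1))
        # The sampled values are already item ids, so they are used directly
        # instead of being treated as indexes into neg_list.
        for neg_item in rng.choices(neg_list, k=negsample):
            rows.append((user_id, prefix, neg_item, 0))
    return rows


rng = random.Random(0)
rows = build_samples(user_id=1, hist=[10, 20, 30],
                     all_items=range(1, 101), negsample=2, rng=rng)
labels = [row[3] for row in rows]
```

With this layout the label column contains both 1s and 0s, which addresses the first bug, and every negative item comes straight from neg_list, which addresses the second.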

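When creating the user features, is the declared length of the watch history inconsistent?

In run_youtubednn.py, SEQ_LEN = 50, but when the user_feature_columns variable is created, the history length is declared as 10. Does this affect whether the model runs correctly?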
VarLenSparseFeat(SparseFeat('hist_movie_id', vocabulary_size=feature_max_idx['movie_id'], embedding_dim=embedding_dim, embedding_name="movie_id"), maxlen=10, combiner='mean')

When the item tower is created, X has shape (BatchSize, 61) (1+1+1+1+1+50+6=61), but when the item ID is looked up, self.feature_index['movie_id'] is [15:21]. Doesn't this mean the values are read from the wrong columns?
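The mismatch can be reproduced with a small sketch, assuming column offsets are computed by accumulating the declared width of each feature in order (my understanding of how DeepCTR-style libraries build their feature index, which I have not verified against this repository). The feature names below are placeholders; the widths follow the 1+1+1+1+1+50+6 breakdown from the issue.

```python
from collections import OrderedDict


def build_feature_index(widths):
    """Map each feature name to a (start, end) column range, in order."""
    index = OrderedDict()
    start = 0
    for name, width in widths:
        index[name] = (start, start + width)
        start += width
    return index


# Five single-column features, then the history sequence, then a 6-column block.
scalar = [(f"user_feat_{i}", 1) for i in range(5)]

# Offsets as declared (maxlen=10) versus offsets of the actual data (SEQ_LEN=50).
declared = build_feature_index(scalar + [("hist_movie_id", 10), ("tail_block", 6)])
actual = build_feature_index(scalar + [("hist_movie_id", 50), ("tail_block", 6)])
```

Under these assumptions the trailing block is declared at columns 15 to 21 but actually sits at 55 to 61, so a slice taken at [15:21] would land inside the padded history instead. Declaring maxlen=SEQ_LEN would make the two indexes agree.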

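About sampled softmax loss

Hello. The MIND model computes its loss with sampled softmax. In the deepmatch code I can see a call to the TensorFlow implementation, but the PyTorch version seems to only pass in the number of samples without actually using it. Are there plans to implement this part?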
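For reference, the idea behind sampled softmax can be sketched in plain Python. This is an illustration under simplifying assumptions, not the TensorFlow or DeepMatch implementation: negatives are drawn uniformly without replacement, no correction for the sampling distribution is applied, and the function name and arguments are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sampled_softmax_loss(user_vec, item_embs, pos_id, num_sampled, rng):
    """Cross-entropy over the positive item plus `num_sampled` sampled negatives."""
    candidates = [i for i in range(len(item_embs)) if i != pos_id]
    neg_ids = rng.sample(candidates, num_sampled)
    # The positive logit goes first, followed by the negative logits.
    logits = [dot(user_vec, item_embs[pos_id])]
    logits += [dot(user_vec, item_embs[i]) for i in neg_ids]
    # Numerically stable log-sum-exp over the sampled candidate set.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]


rng = random.Random(0)
item_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
user_vec = [0.5, 0.5]
loss = sampled_softmax_loss(user_vec, item_embs, pos_id=0, num_sampled=2, rng=rng)
```

When num_sampled covers every other item, this reduces to the full softmax cross-entropy. A PyTorch version would follow the same structure with batched tensors: gather the positive and sampled negative item embeddings, compute the logits, and apply cross-entropy with the positive at index 0.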