guoyang9 / ncf
A PyTorch implementation of He et al., "Neural Collaborative Filtering," WWW '17.
nn.init.kaiming_uniform_(self.predict_layer.weight, a=1, nonlinearity='sigmoid')
Can you explain why this initialization is used for the prediction layer? Thank you!
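For context, a sketch of what that call computes, based on PyTorch's documented formula for kaiming_uniform_: calculate_gain('sigmoid') returns 1 (so the a=1 argument has no effect for this nonlinearity), and weights are drawn from U(-bound, bound) with bound = gain * sqrt(3 / fan_in). The fan_in value below is made up:

```python
import numpy as np

def kaiming_uniform_bound(fan_in, gain=1.0):
    # PyTorch's kaiming_uniform_ samples from U(-bound, bound) where
    # bound = gain * sqrt(3 / fan_in); calculate_gain('sigmoid') returns 1,
    # so with nonlinearity='sigmoid' the a=1 argument is effectively ignored
    return gain * np.sqrt(3.0 / fan_in)

fan_in = 96  # made-up width of the layer feeding predict_layer
bound = kaiming_uniform_bound(fan_in)
weights = np.random.uniform(-bound, bound, size=(1, fan_in))
print(bound, weights.shape)
```

In other words, with nonlinearity='sigmoid' this reduces to a plain uniform init scaled by 1/sqrt(fan_in).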
It seems that you didn't use the test_rating dataset. Can you explain why?
Hello,
If I want to generate the user factors and item factors from user-item interaction data, where is that reflected in this code?
In order to reproduce the results, could you also provide the data files you used for training? Following the original paper, I observed much worse performance than what you report.
The sigmoid is not applied to the output of the last layer. Is that intentional, or was it forgotten?
Can you tell me the exact training parameters used with pre-training (learning rate, SGD or Adam, weight decay, etc.)? Thanks a lot.
In the original paper, the author says he randomly sampled one interaction per user as the validation data, but your code uses the test data as the validation data, which confuses me. Does that mean the test data was used to select the best parameters?
Hi! I am a newbie in recommender systems and machine learning. After reading the paper and your implementation I have a question: does NeuMF limit the dimensionality of users and items? For example, I'm not able to run the model with these parameters:
user_num = 3
item_num = 2
RuntimeError Traceback (most recent call last)
<ipython-input-129-98d01e5c09d2> in <module>
2 i_count = 2
3 ncf = NCF_stolen(u_count, i_count, 12, num_layers=2, dropout=0.0, model='NeuMF-end')
----> 4 ncf(torch.zeros(u_count, dtype=torch.long), torch.zeros(i_count, dtype=torch.long))
~/env3/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
<ipython-input-118-10286f7471cc> in forward(self, user, item)
93 embed_user_GMF = self.embed_user_GMF(user)
94 embed_item_GMF = self.embed_item_GMF(item)
---> 95 output_GMF = embed_user_GMF * embed_item_GMF
96 if not self.model == 'GMF':
97 embed_user_MLP = self.embed_user_MLP(user)
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0
Thanks for any feedback!
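For what it's worth, the error above is a batch-size mismatch rather than a dimensionality limit: the model scores (user, item) pairs, so both index tensors must have the same length. A minimal NumPy sketch of the GMF elementwise product (the table sizes and indices here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
user_num, item_num, factor_num = 3, 2, 12

# hypothetical embedding tables, standing in for embed_user_GMF / embed_item_GMF
embed_user = rng.normal(size=(user_num, factor_num))
embed_item = rng.normal(size=(item_num, factor_num))

# the model scores (user, item) PAIRS, so both index arrays must share
# one batch length; passing user_num users and item_num items fails
users = np.array([0, 0, 1, 2])   # a batch of 4 pairs
items = np.array([0, 1, 1, 0])
output_gmf = embed_user[users] * embed_item[items]
print(output_gmf.shape)          # (4, 12): one row per pair
```

Calling the model with `torch.zeros(u_count)` users and `torch.zeros(i_count)` items fails for exactly this reason whenever u_count != i_count.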
Hello, I ran your code and got this error. Looking at config.py, there is a './models' path followed by GMF.pth, MLP.pth, and NeuMF.pth. Did you forget to upload those files? Thanks!
Running python main.py --batch_size=256 --lr=0.001 --factor_num=32 raises this error:
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
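The fix that message suggests is to guard the script's entry point, so that worker processes (for example a DataLoader with num_workers > 0) can safely re-import the main module on platforms that spawn rather than fork. A minimal sketch with a stand-in worker function:

```python
import multiprocessing as mp

def square(x):
    # stand-in for per-worker work (e.g. a DataLoader worker)
    return x * x

def main():
    # model construction and the training loop would go here in main.py
    with mp.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])

if __name__ == '__main__':
    # required on spawn-based platforms (e.g. Windows) whenever child
    # processes are started, such as DataLoader with num_workers > 0
    print(main())
```

Moving the training code in main.py under such a guard (or into a function called from it) makes the error go away.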
Hello, why is it that when I use embed_user_GMF in model.py to get the user vector representations, the number of vectors I get does not equal the total number of users?
Hi, sorry to bother you. I'm trying to extract useful predictions from this model. For example, one prediction I'm trying to get is: for movie ID 123 and user ID 456, how likely is the user to like this movie? Another option would be returning the top 10 items for a particular user ID. I would really appreciate your help!
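One common way to get such predictions (a sketch, not this repo's API; the random factors below stand in for a trained model's embeddings): score every candidate item for the user, then take the top-k.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def top_k_items(user_vec, item_matrix, k=10, exclude=()):
    # score every item for one user; with GMF-style factors this is a
    # dot product followed by a sigmoid ("how likely to like" score)
    scores = sigmoid(item_matrix @ user_vec)
    scores[list(exclude)] = -np.inf      # hide items the user already saw
    ranked = np.argsort(-scores)
    return ranked[:k], scores

rng = np.random.default_rng(42)
user_vec = rng.normal(size=8)            # hypothetical factors for user 456
item_matrix = rng.normal(size=(200, 8))  # hypothetical factors for 200 items
top10, scores = top_k_items(user_vec, item_matrix, k=10)
print(top10)                             # item IDs, best first
print(float(scores[123]))                # score for item 123 for this user
```

With the actual model, the scoring step would be a forward pass over (user, item) pairs instead of the dot product, but the ranking logic is the same.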
Hi,
Compared to the results of your BPR model, GMF achieves better performance.
Is that reasonable?
The NCF paper describes the "common strategy that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items."
Which parameter should I set to 99?
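For context, whatever parameter controls the number of sampled test negatives should be 99, since 99 negatives plus the 1 held-out positive gives the 100 ranked items from the paper. A sketch of that evaluation protocol (all item counts and scores here are made up):

```python
import numpy as np

def evaluate_one_user(scores, test_item, negatives, k=10):
    # rank the held-out positive among itself plus 99 sampled negatives
    candidates = [test_item] + list(negatives)        # 100 items total
    order = np.argsort(-scores[candidates])
    rank = int(np.where(order == 0)[0][0])            # position of test_item
    return int(rank < k)                              # HR@k for this user

rng = np.random.default_rng(0)
num_items = 1000
scores = rng.random(num_items)        # hypothetical model scores for one user
interacted = {5, 17, 123}             # items this user interacted with
pool = [i for i in range(num_items) if i not in interacted]
negatives = rng.choice(pool, size=99, replace=False)  # the "99" in question
hr = evaluate_one_user(scores, test_item=123, negatives=negatives)
print(hr)
```

Averaging this per-user hit indicator over all test users gives the HR@10 reported in the paper.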
Does your code include a pre-training step? I didn't find it, so when I use NeuMF-pre I get an error saying the GMF model is missing. Can you show me the pre-training steps? Thank you.
Another issue: when I train the MLP and GMF models, the models aren't saved. Do you know the reason?
Because of this, I can't use NeuMF-pre, since GMF can't be found.
Also, why, when I use the same code and parameters, are my results not as good as yours?
I see that evaluate.py defines an evaluate_model function with a num_thread parameter, but when I increase it I get: TypeError: 'NoneType' object is not subscriptable. Has anyone managed to run the evaluation multithreaded?
I found that the negative sampling method used in your code is without replacement:
np.random.choice(user_negative[user], size=numbers, replace=False).tolist()
Although this method takes more time (around 300s for generating negative samples for the training set), it can greatly improve the performance. While in the original implementation of NCF, the author uses sampling with replacement:
def get_train_instances(train, num_negatives):
    trainData = []
    num_items = train.shape[1]  # number of items; needed for random sampling below
    for (u, i) in train.keys():
        # positive instance
        trainData.append([u, i, 1])
        # negative instances, sampled with replacement
        for t in range(num_negatives):
            j = np.random.randint(num_items)
            while (u, j) in train.keys():
                j = np.random.randint(num_items)
            trainData.append([u, j, 0])
    return np.array(trainData)
From my experiments, using that strategy in your code seriously decreases the performance. Also, if you use the provided test data to test the model, you will find that the performance is quite poor. I'm not sure how the provided test data was generated, but the performance should not be that bad.
I previously tried to reproduce the results in PyTorch on my own, but it performed much worse than Keras with TensorFlow as the backend. Then I found your implementation and ran some experiments, but the performance was unsatisfactory too. It would be great if you could test on the provided test data and run some further experiments. Thanks.
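The two sampling strategies can be compared in isolation: np.random.choice with replace=False guarantees distinct negatives per draw, while repeated independent draws (as in the Keras snippet above) may pick the same item twice. A small sketch with a made-up negative pool:

```python
import numpy as np

rng = np.random.default_rng(1)
user_negative = np.arange(50, 100)  # hypothetical non-interacted items for one user
num_ng = 4

# without replacement (as in this repo): negatives within one draw are distinct
without_rep = rng.choice(user_negative, size=num_ng, replace=False)

# with replacement (as in the original Keras NCF): duplicates are possible
with_rep = [int(rng.choice(user_negative)) for _ in range(num_ng)]

print(sorted(without_rep.tolist()), with_rep)
```

With only a few negatives drawn from thousands of items the two usually behave similarly, which is why the large performance gap reported above is surprising.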