hexiangnan / neural_factorization_machine
TensorFlow Implementation of Neural Factorization Machine
Hello,
Question 1:
I have a binary classification problem (click/no click).
After training, testing, and evaluating the NeuralFM (or FM) model, I would like to retrieve the predicted values for the test set (0 or 1) in order to calculate metrics like precision and recall.
Is that possible?
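Assuming you can obtain the model's raw scores for the test set (e.g. by running the output tensor yourself, since the built-in evaluation reports error metrics), precision and recall are straightforward to compute. A self-contained sketch with made-up scores:

```python
# Sketch (not the repo's API): threshold model scores and compute
# precision/recall by counting true/false positives and false negatives.
def precision_recall(y_true, y_score, threshold=0.5):
    """Binarize scores at `threshold`, then return (precision, recall)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

p, r = precision_recall([1, 0, 1, 1, 0], [0.9, 0.2, 0.6, 0.4, 0.7])
print(p, r)  # -> 0.666... 0.666...
```

With real data you would feed the test batch through the trained model to get `y_score`; `sklearn.metrics.precision_score` / `recall_score` do the same computation.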
Question 2:
What is the "keep_prob" argument in NeuralFM.py? I don't know how to use it.
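For context, keep_prob is the standard TensorFlow 1.x dropout argument: the probability that each hidden unit is kept during training (so keep_prob=0.8 drops roughly 20% of units), and you pass 1.0 at evaluation time to disable dropout. A minimal pure-Python sketch of that semantics (not the repo's code):

```python
import random

# Inverted dropout: keep each unit with probability keep_prob and scale
# survivors by 1/keep_prob, so the expected activation is unchanged.
def dropout(values, keep_prob, rng=None):
    rng = rng or random.Random(0)
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]

print(dropout([1.0, 2.0, 3.0], keep_prob=1.0))  # -> [1.0, 2.0, 3.0] (eval mode)
```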
Anyway, thanks for this Python library; it's a great tool to play with.
Thanks in advance for the answers.
I ran the code, but the NFM results are no better than FM's. I don't know why.
Hi there,
Could you please provide the command for evaluating the NeuralFM model? I was expecting something similar to what is presented at https://github.com/hexiangnan/attentional_factorization_machine.
Best,
all_weights['feature_bias'] = tf.Variable(tf.random_uniform([self.features_M, 1], 0.0, 0.0), name='feature_bias') # features_M * 1
elements in all_weights['feature_bias'] are all zeros.
so in

# _________out _________
Bilinear = tf.reduce_sum(self.FM, 1, keep_dims=True)  # None * 1
self.Feature_bias = tf.reduce_sum(tf.nn.embedding_lookup(self.weights['feature_bias'], self.train_features), 1)  # None * 1
Bias = self.weights['bias'] * tf.ones_like(self.train_labels)  # None * 1
self.out = tf.add_n([Bilinear, self.Feature_bias, Bias])  # None * 1
all of the elements in self.Feature_bias are zeros.
So all_weights['feature_bias'] should be initialized uniformly between 0 and 1, right?
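For reference, a uniform initializer whose lower and upper bounds are both 0.0 can only ever produce zeros, which is easy to confirm outside TensorFlow:

```python
import random

# tf.random_uniform([m, 1], 0.0, 0.0) draws from a degenerate range, so every
# feature-bias entry starts at exactly 0.0.
bias = [random.uniform(0.0, 0.0) for _ in range(5)]
print(bias)  # -> [0.0, 0.0, 0.0, 0.0, 0.0]
```

That said, zero-initializing bias terms is a common deliberate choice: they still receive gradients and move away from zero during training, so learning is not blocked by this initialization.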
Q1:
It seems there is a deprecated API, tf.sub, in your implementation, which throws an exception like
AttributeError: 'module' object has no attribute 'sub'
in TensorFlow 1.3.0+. Changing it to tf.subtract will fix it.
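For anyone hitting this, the rename can be applied mechanically. A sketch assuming GNU sed (the demo file below is made up; keep a backup or use git before editing the repo in place):

```shell
# Demo of the fix: rewrite tf.sub( to tf.subtract( in a file (GNU sed -i).
printf 'self.y = tf.sub(a, b)\n' > demo_sub.py
sed -i 's/tf\.sub(/tf.subtract(/g' demo_sub.py
cat demo_sub.py  # -> self.y = tf.subtract(a, b)
```

Running the same `sed` invocation over the repo's .py files applies the rename everywhere.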
Q2:
Furthermore, have you ever considered using tf.estimator.Estimator to replace sklearn.base.BaseEstimator? A tf.estimator.Estimator model with model.export_model will let you deploy a trained model with TensorFlow Serving in a production environment, and it can also parallelize training via the high-level API tf.contrib.learn.Experiment.
Do you know of a code example that runs the criteo-1tb-benchmark fully locally, without Spark? Some kind of online learning? For
https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/
It seems the code does not load the deep-layer parameters when using a pretrained model.
https://github.com/hexiangnan/neural_factorization_machine/blob/master/NeuralFM.py#L202-L217
Is that on purpose, meaning the daily model gets brand-new deep-layer parameters?
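The behaviour described here (restore the FM-side parameters, re-initialize the deep layers) can be pictured with a small dictionary sketch. The variable names and values below are illustrative, not the repo's actual ones:

```python
# Illustrative partial restore: copy only FM-side parameters from a
# checkpoint; keep fresh initial values for the deep-layer weights.
pretrained = {'feature_embeddings': [0.1, 0.2], 'feature_bias': [0.3],
              'layer_0': [9.9], 'bias_0': [9.9]}
fresh = {'feature_embeddings': [0.0, 0.0], 'feature_bias': [0.0],
         'layer_0': [0.01], 'bias_0': [0.0]}

FM_PARAMS = {'feature_embeddings', 'feature_bias'}  # restored from pretrain
weights = {name: (pretrained[name] if name in FM_PARAMS else init)
           for name, init in fresh.items()}
print(weights['feature_embeddings'])  # -> [0.1, 0.2]  (restored)
print(weights['layer_0'])             # -> [0.01]      (freshly initialized)
```

In TF 1.x the equivalent selective restore is done by passing only the chosen variables to `tf.train.Saver`'s `var_list`.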
In FM.py:

elif self.loss_type == 'log_loss':
    self.out = tf.sigmoid(self.out)
    if self.lambda_bilinear > 0:
        self.loss = tf.contrib.losses.log_loss(self.out, self.train_labels, weight=1.0, epsilon=1e-07, scope=None) + tf.contrib.layers.l2_regularizer(self.lamda_bilinear)(self.weights['feature_embeddings'])  # regularizer
    else:
        self.loss = tf.contrib.losses.log_loss(self.out, self.train_labels, weight=1.0, epsilon=1e-07, scope=None)

The variable self.lambda_bilinear is wrong; it should be self.lamda_bilinear (the spelling used everywhere else, including in the regularizer on the same branch).
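For reference, the per-example quantity computed by the log_loss call above is standard binary cross-entropy with the predictions clamped by epsilon. A plain-Python sketch of that formula (mirroring the math, not the repo's exact code):

```python
import math

def log_loss(y_true, y_pred, eps=1e-07):
    """Mean binary cross-entropy with predictions clamped to [eps, 1 - eps]."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

print(log_loss([1, 0], [0.5, 0.5]))  # -> 0.6931... (ln 2)
```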
I changed the parameter layers from [64] to [64,128,256,128,64], and I get the error "InvalidArgumentError (see above for traceback): slice index 4 of dimension 0 out of bounds."
How can I fix it?
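One likely cause (an assumption based on the error message, not a confirmed diagnosis): each hidden layer indexes its own entry in the dropout keep list, so the --keep argument must grow along with --layers. A sketch of keeping the two lengths consistent:

```python
# Hypothesis: hidden layer i reads keep[i], so len(keep) must cover every
# layer; the default keep list is sized for a single hidden layer.
layers = [64, 128, 256, 128, 64]
keep = [0.8, 0.5]
if len(keep) < len(layers):
    # pad with the last value, or pass a longer --keep on the command line
    keep = keep + [keep[-1]] * (len(layers) - len(keep))
print(keep)  # -> [0.8, 0.5, 0.5, 0.5, 0.5]
```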
https://github.com/hexiangnan/neural_factorization_machine/blob/master/LoadData.py#L47
In the read_features() function, you just initialize a dict and record each feature name only for the first user that has it. Nonsense!! Data is lost.
This is a preview of the ml-tag.test.libfm file:
-1.0 51798:1 2473:1 37583:1
-1.0 66335:1 61344:1 29842:1
-1.0 89085:1 60033:1 47050:1
1.0 61293:1 8073:1 3903:1
-1.0 81335:1 56575:1 50067:1
-1.0 65166:1 48181:1 12510:1
-1.0 75300:1 26027:1 38510:1
1.0 10219:1 2122:1 383:1
1.0 80855:1 80856:1 24728:1
1.0 67033:1 721:1 19495:1
I rewrote the code and tested it on the file above, and you clearly lose a lot of data! Badly wrong.
def read_features(file):  # read a feature file
    features = {}
    i = len(features)
    with open(file) as f:
        for line in f:
            items = line.strip().split(' ')
            for item in items[1:]:  # e.g. ['51798:1', '2473:1', '37583:1']
                if item not in features:
                    features[item] = i
                    i = i + 1
                else:
                    print('nfm load code error', i, item)
    return features
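The reader above can be sanity-checked against the preview lines; inlining a few of them keeps the snippet self-contained:

```python
# Count distinct "id:value" tokens in a few preview lines (all ids distinct).
preview = """-1.0 51798:1 2473:1 37583:1
-1.0 66335:1 61344:1 29842:1
1.0 61293:1 8073:1 3903:1"""

features = {}
for line in preview.splitlines():
    for item in line.strip().split(' ')[1:]:
        if item not in features:
            features[item] = len(features)
print(len(features))  # -> 9 distinct features
```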
In DeepFM, continuous features are multiplied by their corresponding embedding vectors, but in your code no such multiplication appears. Can NFM not handle continuous features?
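The DeepFM-style handling of continuous features referred to here is just an element-wise scaling of the feature's embedding by its value before pooling; this sketch illustrates the technique, which is not present in this repo:

```python
# Scale a feature's embedding by its continuous value before pooling
# (DeepFM-style handling of continuous inputs; illustrative numbers).
embedding = [0.1, 0.2, 0.3]  # embedding vector for the feature id
value = 2.5                  # continuous feature value
scaled = [value * e for e in embedding]
print(scaled)  # -> [0.25, 0.5, 0.75]
```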
nonzero_embeddings = tf.nn.embedding_lookup(self.weights['feature_embeddings'], self.train_features)
self.summed_features_emb = tf.reduce_sum(nonzero_embeddings, 1) # None * K
self.summed_features_emb_square = tf.square(self.summed_features_emb) # None * K
The above code is in NeuralFM.py.
When you compute self.summed_features_emb, the axis you use is 1; I think it should be 0. Did I misunderstand?
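A shape argument for axis 1: after the embedding lookup the tensor is (batch_size, num_fields, K), so reducing over axis 1 sums each sample's field embeddings, while axis 0 would (incorrectly) sum across the batch. A list-based sketch of the same reduction:

```python
# nonzero_embeddings analogue: 2 samples, 2 fields each, K = 3.
batch = [
    [[1, 2, 3], [4, 5, 6]],
    [[10, 20, 30], [40, 50, 60]],
]
# reduce_sum over axis 1: collapse the fields dimension per sample.
summed = [[sum(field[k] for field in sample) for k in range(3)]
          for sample in batch]
print(summed)  # -> [[5, 7, 9], [50, 70, 90]], shape (batch, K)
```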