shenweichen / GraphEmbedding
Implementation and experiments of graph embedding algorithms.
License: MIT License
WARNING:tensorflow:From /home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Traceback (most recent call last):
File "sdne_wiki.py", line 54, in
model = SDNE(G, hidden_size=[256, 128],)
File "/home/ant/researchInstitute/luoxianhao/ge/shenweichen/graphEmbedding/ge/models/sdne.py", line 93, in init
self.reset_model()
File "/home/ant/researchInstitute/luoxianhao/ge/shenweichen/graphEmbedding/ge/models/sdne.py", line 101, in reset_model
self.model.compile(opt, [l_2nd(self.beta), l_1st(self.alpha)])
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 373, in compile
self._compile_weights_loss_and_weighted_metrics()
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1652, in _compile_weights_loss_and_weighted_metrics
self.total_loss = self._prepare_total_loss(masks)
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1712, in _prepare_total_loss
per_sample_losses = loss_fn.call(y_true, y_pred)
File "/home/ant/.conda/envs/lxh_tf/lib/python3.7/site-packages/tensorflow_core/python/keras/losses.py", line 216, in call
return self.fn(y_true, y_pred, **self.fn_kwargs)
File "/home/ant/researchInstitute/luoxianhao/ge/shenweichen/graphEmbedding/ge/models/sdne.py", line 37, in loss_2nd
b[y_true != 0] = beta
TypeError: 'Tensor' object does not support item assignment
Environment: TensorFlow 1.15.
How should the following line of code be modified?
b_[y_true != 0] = beta
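One possible direction, offered as a sketch rather than as the repository's fix: symbolic tensors do not support in-place item assignment, so the weight matrix has to come from an elementwise select instead of `b_[y_true != 0] = beta`. In TensorFlow such a select can be written as `tf.where(tf.not_equal(y_true, 0), beta * tf.ones_like(y_true), tf.ones_like(y_true))`. The dependency-free snippet below mirrors the same computation on plain Python lists so the intended values are easy to check. The names `penalty_weights` and `loss_2nd_reference` are hypothetical, and the loss form (squared weighted error summed per row, then averaged over rows) is an assumption about what `loss_2nd` computes.

```python
def penalty_weights(y_true, beta):
    """Return a matrix holding beta where y_true is non-zero and 1.0 elsewhere."""
    return [[beta if value != 0 else 1.0 for value in row] for row in y_true]


def loss_2nd_reference(y_true, y_pred, beta):
    """Mean over rows of the summed squared, beta-weighted reconstruction error."""
    weights = penalty_weights(y_true, beta)
    row_losses = []
    for t_row, p_row, w_row in zip(y_true, y_pred, weights):
        row_losses.append(
            sum(((t - p) * w) ** 2 for t, p, w in zip(t_row, p_row, w_row))
        )
    return sum(row_losses) / len(row_losses)


# Made-up 2x2 example: non-zero targets are penalised five times harder.
y_true = [[0.0, 1.0], [1.0, 0.0]]
y_pred = [[0.5, 0.5], [0.0, 0.0]]
weights = penalty_weights(y_true, beta=5.0)
loss = loss_2nd_reference(y_true, y_pred, beta=5.0)
```

The select builds a new value instead of mutating an existing one, which is why it works on symbolic tensors where item assignment does not.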
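With 480,000 nodes and 32 GB of RAM the code does not run; _create_A_L reports that it is out of memory.
Can the way _create_A_L constructs the matrices be optimized? Storing the sparse matrices this way is too wasteful.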
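A back-of-the-envelope estimate shows why any dense materialization of a matrix this size cannot fit in 32 GB. The sketch below compares the memory of a dense `n x n` array with a CSR layout. The edge count is a made-up placeholder (the issue only gives the node count), and the byte sizes assume float32 values with int32 indices.

```python
def dense_bytes(num_nodes, bytes_per_value=4):
    """Memory of a dense num_nodes x num_nodes array."""
    return num_nodes * num_nodes * bytes_per_value


def csr_bytes(num_nodes, num_nonzero, bytes_per_value=4, bytes_per_index=4):
    """Memory of a CSR matrix: one value and one column index per stored
    entry, plus a row-pointer array of length num_nodes + 1."""
    per_entry = bytes_per_value + bytes_per_index
    return num_nonzero * per_entry + (num_nodes + 1) * bytes_per_index


num_nodes = 480_000
hypothetical_edges = 5_000_000  # placeholder, not taken from the issue

dense_gb = dense_bytes(num_nodes) / 1e9
# An undirected edge is stored twice in a symmetric adjacency matrix.
sparse_gb = csr_bytes(num_nodes, 2 * hypothetical_edges) / 1e9
```

Under these assumptions the dense form needs roughly 921.6 GB while the CSR form stays below 0.1 GB, so keeping the full matrices sparse and densifying only the rows of one mini-batch at a time is the direction worth exploring.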
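Do none of these examples set a weight?
A question:
classify.py defines top_k_list = [len(l) for l in Y]
Is each element of top_k_list simply the length of the corresponding element of the test set Y?
What is top_k_list used for?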
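For context, a common evaluation protocol in multi-label node classification tells the classifier how many true labels each test node has and keeps that many highest-scoring classes; `top_k_list` would then hold those per-node counts. The plain-Python sketch below illustrates the protocol. It assumes that this is what the repository's `TopKRanker` does, which the thread does not confirm; `predict_top_k` is a hypothetical helper and the scores are made up.

```python
def predict_top_k(scores, classes, top_k_list):
    """For each sample, keep the k classes with the highest scores."""
    predictions = []
    for sample_scores, k in zip(scores, top_k_list):
        ranked = sorted(
            range(len(classes)), key=lambda i: sample_scores[i], reverse=True
        )
        predictions.append(sorted(classes[i] for i in ranked[:k]))
    return predictions


classes = ["a", "b", "c"]
scores = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.5]]  # one row of class scores per node
Y_test = [["b"], ["a", "c"]]                 # true label sets of the test nodes

top_k_list = [len(labels) for labels in Y_test]
predicted = predict_top_k(scores, classes, top_k_list)
```

Because the number of predicted labels is taken from the ground truth, this protocol measures ranking quality only; it does not test whether the model can decide how many labels a node has.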
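How was this solved?
Hello, node2vec runs very slowly when there are many nodes. How can I use GPU acceleration? Thanks.
It currently seems to run on the CPU and is extremely slow. How can I speed it up with a GPU?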
GraphEmbedding/ge/models/line.py
Lines 134 to 135 in c186681
    def deepwalk_walk(self, walk_length, start_node):
        walk = [start_node]
        while len(walk) < walk_length:
            cur = walk[-1]
            cur_nbrs = list(self.G.neighbors(cur))
            if len(cur_nbrs) > 0:
                walk.append(random.choice(cur_nbrs))
            else:
                break
        return walk
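I feel the function above is not quite appropriate: it makes no use of the transition probabilities between nodes. Those probabilities can be computed from the data, so wouldn't it be better to use them?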
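If the graph carries edge weights, one way to take them into account is to draw the next node in proportion to the weight of the connecting edge. The sketch below is an illustration under stated assumptions, not the repository's walker: `adjacency` is a hypothetical dict-of-dicts mapping each node to `{neighbor: weight}` (the repository works on a networkx graph), and all weights are assumed to be positive.

```python
import random


def weighted_walk(adjacency, walk_length, start_node, rng=random):
    """Random walk that picks each next node in proportion to its edge weight."""
    walk = [start_node]
    while len(walk) < walk_length:
        neighbors = adjacency.get(walk[-1], {})
        if not neighbors:
            break  # dead end: stop early, as the original function does
        nodes = list(neighbors)
        weights = [neighbors[node] for node in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


adjacency = {
    "a": {"b": 1.0, "c": 3.0},  # from "a", "c" is three times as likely as "b"
    "b": {"a": 1.0},
    "c": {"a": 3.0},
}
walk = weighted_walk(adjacency, walk_length=5, start_node="a", rng=random.Random(0))
```

On an unweighted graph every weight is equal, so this reduces to the uniform `random.choice` of the function quoted above.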
File "/Users/yangfengyu/Desktop/GraphEmbedding-master/examples/line_wiki.py", line 4, in
from ge.classify import read_node_label, Classifier
File "/Users/yangfengyu/Desktop/GraphEmbedding-master/ge/init.py", line 1, in
from .models import *
File "/Users/yangfengyu/Desktop/GraphEmbedding-master/ge/models/init.py", line 5, in
from .struc2vec import Struc2Vec
File "/Users/yangfengyu/Desktop/GraphEmbedding-master/ge/models/struc2vec.py", line 28, in
from fastdtw import fastdtw
ModuleNotFoundError: No module named 'fastdtw'
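I ran the experiment locally but keep getting the error above.
Is the fastdtw module part of this project or a third-party library? I searched the whole project and could not find it.
Looking forward to your reply!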
I've tried TensorFlow versions 1.15 and >2 and get this error. Were there breaking changes to this repo? If you or anyone else doesn't get these errors, could you share your environment configuration?
(py3SDNE3) mac0632:examples patrick.mullen$ python sdne_wiki.py
WARNING:tensorflow:From /Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Traceback (most recent call last):
File "sdne_wiki.py", line 49, in <module>
model = SDNE(G, hidden_size=[256, 128],)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/ge-0.0.0-py3.7.egg/ge/models/sdne.py", line 93, in __init__
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/ge-0.0.0-py3.7.egg/ge/models/sdne.py", line 101, in reset_model
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 373, in compile
self._compile_weights_loss_and_weighted_metrics()
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1652, in _compile_weights_loss_and_weighted_metrics
self.total_loss = self._prepare_total_loss(masks)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1712, in _prepare_total_loss
per_sample_losses = loss_fn.call(y_true, y_pred)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/keras/losses.py", line 216, in call
return self.fn(y_true, y_pred, **self._fn_kwargs)
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/ge-0.0.0-py3.7.egg/ge/models/sdne.py", line 36, in loss_2nd
File "<__array_function__ internals>", line 6, in ones_like
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/numpy-1.18.2-py3.7-macosx-10.14-x86_64.egg/numpy/core/numeric.py", line 278, in ones_like
res = empty_like(a, dtype=dtype, order=order, subok=subok, shape=shape)
File "<__array_function__ internals>", line 6, in empty_like
File "/Users/patrick.mullen/aws/py3SDNE3/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 736, in __array__
" array.".format(self.name))
NotImplementedError: Cannot convert a symbolic Tensor (2nd_target:0) to a numpy array.
node2vec keyerror(' ',' ')???
(venv) ➜ GraphEmbedding git:(master) python setup.py install
File "setup.py", line 16
`tensorflow`
^
SyntaxError: invalid syntax
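In line.py, lines 111 to 137: when the vertex alias table is built, norm_prob sums to 1, and create_alias_table then rescales it so that its mean is 1. Why, then, does norm_prob already have a mean of 1 when the edge alias table is built?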
    def _gen_sampling_table(self):
        # create sampling table for vertex
        power = 0.75
        numNodes = self.node_size
        node_degree = np.zeros(numNodes)  # out degree
        node2idx = self.node2idx
        for edge in self.graph.edges():
            node_degree[node2idx[edge[0]]
                        ] += self.graph[edge[0]][edge[1]].get('weight', 1.0)
        total_sum = sum([math.pow(node_degree[i], power)
                         for i in range(numNodes)])
        norm_prob = [float(math.pow(node_degree[j], power)) /
                     total_sum for j in range(numNodes)]
        self.node_accept, self.node_alias = create_alias_table(norm_prob)
        # create sampling table for edge
        numEdges = self.graph.number_of_edges()
        total_sum = sum([self.graph[edge[0]][edge[1]].get('weight', 1.0)
                         for edge in self.graph.edges()])
        norm_prob = [self.graph[edge[0]][edge[1]].get('weight', 1.0) *
                     numEdges / total_sum for edge in self.graph.edges()]
        self.edge_accept, self.edge_alias = create_alias_table(norm_prob)
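The two normalizations in this snippet differ only by a constant factor: a vector that sums to 1, multiplied by its length n, has mean 1. The short check below demonstrates this with made-up weights. Which of the two forms `create_alias_table` expects is not stated in the thread, so the sketch makes no claim about which call is correct; it only shows that applying the factor n twice would no longer give a mean of 1.

```python
weights = [2.0, 1.0, 1.0]  # made-up edge weights
n = len(weights)
total = sum(weights)

# Form used for the vertex table: entries sum to 1.
sum_to_one = [w / total for w in weights]

# Form used for the edge table: entries have mean 1 (they sum to n).
mean_one = [w * n / total for w in weights]

# Rescaling the mean-1 form by n a second time gives mean n instead.
rescaled_again = [m * n for m in mean_one]
```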
GBK encoding is not supported.
KarateClub contains many algorithms that might be worth a look, including all the common algorithms for community graph embedding. Conversely, the algorithms in this repo are written in a way that would allow them to be included in that repo.
Results of node2vec, deepwalk, line, sdne and struc2vec on all datasets. Hope this will help anyone who is interested in this project.
wiki
Alg | micro | macro | samples | weighted | acc | NMI |
---|---|---|---|---|---|---|
node2vec | 0.7447 | 0.6771 | 0.7193 | 0.7450 | 0.6279 | 0.3536 |
deepwalk | 0.7307 | 0.6579 | 0.7058 | 0.7296 | 0.6091 | 0.3416 |
line | 0.5059 | 0.2461 | 0.4536 | 0.4523 | 0.3160 | 0.0798 |
sdne | 0.6916 | 0.5119 | 0.6528 | 0.6718 | 0.5530 | 0.1801 |
struc2vec | 0.4512 | 0.1249 | 0.3933 | 0.3383 | 0.2308 | 0.0516 |
brazil
Alg | micro | macro | samples | weighted | acc | NMI |
---|---|---|---|---|---|---|
node2vec | 0.1481 | 0.1579 | 0.1481 | 0.1648 | 0.1481 | 0.0442 |
deepwalk | 0.1852 | 0.1694 | 0.1852 | 0.2004 | 0.1852 | 0.0471 |
line | 0.4444 | 0.4167 | 0.4444 | 0.4753 | 0.4444 | 0.2822 |
sdne | 0.5926 | 0.5814 | 0.5926 | 0.5928 | 0.5926 | 0.4041 |
struc2vec | 0.7778 | 0.7739 | 0.7778 | 0.7762 | 0.7778 | 0.3906 |
europe
Alg | micro | macro | samples | weighted | acc | NMI |
---|---|---|---|---|---|---|
node2vec | 0.4125 | 0.4156 | 0.4125 | 0.4209 | 0.4125 | 0.0155 |
deepwalk | 0.4375 | 0.4358 | 0.4375 | 0.4347 | 0.4375 | 0.0180 |
line | 0.5000 | 0.4983 | 0.5000 | 0.5016 | 0.5000 | 0.1186 |
sdne | 0.5000 | 0.4818 | 0.5000 | 0.4916 | 0.5000 | 0.1714 |
struc2vec | 0.5375 | 0.5247 | 0.5375 | 0.5294 | 0.5375 | 0.0783 |
usa
Alg | micro | macro | samples | weighted | acc | NMI |
---|---|---|---|---|---|---|
node2vec | 0.5420 | 0.5278 | 0.5420 | 0.5351 | 0.5420 | 0.0822 |
deepwalk | 0.5504 | 0.5394 | 0.5504 | 0.5472 | 0.5504 | 0.0910 |
line | 0.4160 | 0.4032 | 0.4160 | 0.4175 | 0.4160 | 0.1660 |
sdne | 0.6092 | 0.5819 | 0.6092 | 0.5971 | 0.6092 | 0.2028 |
struc2vec | 0.5210 | 0.5040 | 0.5210 | 0.5211 | 0.5210 | 0.0702 |
Preprocess transition probs...
[Parallel(n_jobs=30)]: Using backend MultiprocessingBackend with 30 concurrent workers.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in call
return self.func(*args, **kwargs)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in call
for func, args, kwargs in self.items]
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in
for func, args, kwargs in self.items]
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/ge/walker.py", line 88, in _simulate_walks
walk_length=walk_length, start_node=v))
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/ge/walker.py", line 56, in node2vec_walk
next_node = cur_nbrs[alias_sample(alias_edges[edge][0],
KeyError: (9836, 7324)
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "blog_node.py", line 69, in
model = Node2Vec(G, 10, 80, workers=30,p=0.25,q=2 )
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/ge/models/node2vec.py", line 39, in init
num_walks=num_walks, walk_length=walk_length, workers=workers, verbose=1)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/ge/walker.py", line 72, in simulate_walks
partition_num(num_walks, workers))
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 934, in call
self.retrieve()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 833, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "/home/ubuntu/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
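After downloading the code I changed the joblib backend to multiprocessing, but whether I use the default backend or any other one, a KeyError is raised for what looks like a random pair of nodes. I am using the http://socialcomputing.asu.edu/datasets/BlogCatalog3 dataset, and both struc2vec and node2vec run into this problem. Any advice would be appreciated.
Hello, what is the TopKRanker class for?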
in deepwalk.py
    def train(self, embed_size=128, window_size=5, workers=3, iter=5, **kwargs):
        sentences = pd.read_pickle('random_walks.pkl')
        kwargs["sentences"] = sentences
        kwargs["min_count"] = kwargs.get("min_count", 0)
        kwargs["size"] = embed_size
        kwargs["sg"] = 1  # skip gram
        kwargs["hs"] = 1  # deepwalk use Hierarchical Softmax
        kwargs["workers"] = workers
        kwargs["window"] = window_size
        kwargs["iter"] = iter
I cannot find random_walks.pkl.
Could you provide it?
Thanks!
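It seems that nodetype is str when the edge list is read, but changing it to int raises an error. How can I make the order of the embedding vectors match the nodes?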
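One way to keep vectors and nodes aligned, whether node ids are read as `str` or `int`, is to fix an explicit node order and look each node up with the key type the embeddings actually use. The sketch assumes that `get_embeddings()` returns a dict mapping node id to vector, which the thread does not confirm; `align_embeddings` is a hypothetical helper and the sample values are made up.

```python
def align_embeddings(embeddings, node_order):
    """Return the vectors in node_order, tolerating str/int key mismatches."""
    rows = []
    for node in node_order:
        if node in embeddings:
            rows.append(embeddings[node])
        elif str(node) in embeddings:
            rows.append(embeddings[str(node)])
        else:
            raise KeyError("no embedding for node {!r}".format(node))
    return rows


# Keys are strings, as they would be after reading the edge list with nodetype=str.
embeddings = {"2": [0.2, 0.2], "10": [1.0, 1.0], "1": [0.1, 0.1]}

# Sort numerically; sorting the string keys would put "10" before "2".
node_order = sorted(int(key) for key in embeddings)
matrix = align_embeddings(embeddings, node_order)
```

Row `i` of `matrix` then belongs to `node_order[i]`, independent of the order in which the model stored its vocabulary.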
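Is it possible to run this without CUDA 9.0? If I use CUDA 10, which parts of the code need to be changed?
When applying the node2vec code to a new dataset in which some edge weights are 0, a ZeroDivisionError occurs. I changed lines 185 and 186 of walker.py, adding a try/except statement and replacing the original list comprehension with appending items to an empty list, but the error persists and I do not know why. Thanks.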
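A division by zero in a weight normalization typically means the normalizing constant itself is zero, which happens when every outgoing weight of a node is 0; rewriting the list comprehension as a loop does not change that. A hedged alternative is to check the sum first and choose a fallback explicitly. The uniform fallback below is a modelling assumption, not the repository's behavior, and `normalize_weights` is a hypothetical helper; dropping zero-weight edges before building the graph is another option.

```python
def normalize_weights(weights):
    """Normalize weights to sum to 1, falling back to uniform when the sum is 0."""
    if not weights:
        return []
    total = sum(weights)
    if total <= 0:
        # Assumption: treat a node whose edges all have weight 0 as unweighted.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


all_zero = normalize_weights([0.0, 0.0])
mixed = normalize_weights([1.0, 3.0])
```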
I got this issue using struc2vec and node2vec methods
`---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
1 model_struc2vec = ge.Struc2Vec(G, 10, 80, workers=4, verbose=40, ) #init model
----> 2 model_struc2vec.train(window_size = 5, iter = 3)# train model
3 embeddings_struc2vec = model_struc3vec.get_embeddings()# get embedding vectors
/anaconda3/envs/python36/lib/python3.6/site-packages/ge-0.0.0-py3.6.egg/ge/models/struc2vec.py in train(self, embed_size, window_size, workers, iter)
114 print("Learning representation...")
115 model = Word2Vec(sentences, size=embed_size, window=window_size, min_count=0, hs=1, sg=1, workers=workers,
--> 116 iter=iter)
117 print("Learning representation done!")
118 self.w2v_model = model
/anaconda3/envs/python36/lib/python3.6/site-packages/gensim-3.6.0-py3.6-macosx-10.7-x86_64.egg/gensim/models/word2vec.py in init(self, sentences, corpus_file, size, alpha, window, min_count, max_vocab_size, sample, seed, workers, min_alpha, sg, hs, negative, ns_exponent, cbow_mean, hashfxn, iter, null_word, trim_rule, sorted_vocab, batch_words, compute_loss, callbacks, max_final_vocab)
765 callbacks=callbacks, batch_words=batch_words, trim_rule=trim_rule, sg=sg, alpha=alpha, window=window,
766 seed=seed, hs=hs, negative=negative, cbow_mean=cbow_mean, min_alpha=min_alpha, compute_loss=compute_loss,
--> 767 fast_version=FAST_VERSION)
768
769 def _do_train_epoch(self, corpus_file, thread_id, offset, cython_vocab, thread_private_mem, cur_epoch,
/anaconda3/envs/python36/lib/python3.6/site-packages/gensim-3.6.0-py3.6-macosx-10.7-x86_64.egg/gensim/models/base_any2vec.py in init(self, sentences, corpus_file, workers, vector_size, epochs, callbacks, batch_words, trim_rule, sg, alpha, window, seed, hs, negative, ns_exponent, cbow_mean, min_alpha, compute_loss, fast_version, **kwargs)
757 raise TypeError("You can't pass a generator as the sentences argument. Try an iterator.")
758
--> 759 self.build_vocab(sentences=sentences, corpus_file=corpus_file, trim_rule=trim_rule)
760 self.train(
761 sentences=sentences, corpus_file=corpus_file, total_examples=self.corpus_count,
/anaconda3/envs/python36/lib/python3.6/site-packages/gensim-3.6.0-py3.6-macosx-10.7-x86_64.egg/gensim/models/base_any2vec.py in build_vocab(self, sentences, corpus_file, update, progress_per, keep_raw_vocab, trim_rule, **kwargs)
941 trim_rule=trim_rule, **kwargs)
942 report_values['memory'] = self.estimate_memory(vocab_size=report_values['num_retained_words'])
--> 943 self.trainables.prepare_weights(self.hs, self.negative, self.wv, update=update, vocabulary=self.vocabulary)
944
945 def build_vocab_from_freq(self, word_freq, keep_raw_vocab=False, corpus_count=None, trim_rule=None, update=False):
/anaconda3/envs/python36/lib/python3.6/site-packages/gensim-3.6.0-py3.6-macosx-10.7-x86_64.egg/gensim/models/word2vec.py in prepare_weights(self, hs, negative, wv, update, vocabulary)
1820 # set initial input/projection and hidden weights
1821 if not update:
-> 1822 self.reset_weights(hs, negative, wv)
1823 else:
1824 self.update_weights(hs, negative, wv)
/anaconda3/envs/python36/lib/python3.6/site-packages/gensim-3.6.0-py3.6-macosx-10.7-x86_64.egg/gensim/models/word2vec.py in reset_weights(self, hs, negative, wv)
1837 for i in xrange(len(wv.vocab)):
1838 # construct deterministic seed from word AND seed argument
-> 1839 wv.vectors[i] = self.seeded_vector(wv.index2word[i] + str(self.seed), wv.vector_size)
1840 if hs:
1841 self.syn1 = zeros((len(wv.vocab), self.layer1_size), dtype=REAL)
TypeError: unsupported operand type(s) for +: 'int' and 'str'
`
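The last frame concatenates a vocabulary entry with `str(self.seed)`, which fails when the walk tokens are integers. A plausible workaround (an assumption, not confirmed in the thread) is to convert every node id to a string before the walks are passed to `Word2Vec`, and to look embeddings up by the string id afterwards. `stringify_walks` is a hypothetical helper and the walks are made up.

```python
def stringify_walks(walks):
    """Convert every node id in every walk to str."""
    return [[str(node) for node in walk] for walk in walks]


walks = [[1, 2, 3], [3, 2, 1]]  # integer node ids, as produced from an int-typed graph
sentences = stringify_walks(walks)

# The concatenation from the traceback now works because both operands are str.
seed = 1
seeded_key = sentences[0][0] + str(seed)
```

Reading the edge list with string node ids in the first place would have the same effect.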
I have roughly 800,000 edges and over a hundred thousand nodes. The repeated calls to the _get_order_degreelist_node method are very slow. How can this be sped up?
There are a few popular node embedding repos that might be good to talk about. https://github.com/VHRanger/nodevectors
Hi, I am trying to re-implement the SDNE code but got stuck while reading it. Could you explain the lines below?
- GraphEmbedding/ge/models/sdne.py, Line 162 in 7de7a09: `get('weight',1)`. Why does it have the value `1` here?
- GraphEmbedding/ge/models/sdne.py, Line 169 in 7de7a09: `A_` and its purpose?
- GraphEmbedding/ge/models/sdne.py, Line 172 in 7de7a09
- GraphEmbedding/ge/models/sdne.py, Line 125 in 7de7a09: `L[i:j, :]` for a mini batch to train. But what is the meaning of `L[index][:, index].todense()` with `index = [i:j]`?
- GraphEmbedding/ge/models/sdne.py, Line 46 in 7de7a09: `L = y_true` and `Y = y_pred`?
Thanks for your help!
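Two of these questions can be illustrated numerically. Selecting rows and then columns with the same index list, as in `L[index][:, index]`, yields the square block of `L` restricted to the nodes of the mini-batch, whereas `L[i:j, :]` would keep every column. And for `L = D - A` with a symmetric `A`, the identity `2 * tr(Y^T L Y) = sum_ij a_ij * ||y_i - y_j||^2` shows that a first-order proximity loss needs only `L` and the embeddings `Y`, which would explain passing the Laplacian block where Keras expects `y_true`. That this is the intent of `L = y_true` and `Y = y_pred` is an assumption based on the SDNE first-order loss; the plain-Python sketch uses a made-up 3-node graph and hypothetical helper names.

```python
def laplacian(adjacency):
    """L = D - A, where D is the diagonal matrix of row sums of A."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def submatrix(matrix, index):
    """Rows in index, then columns in index: the analogue of M[index][:, index]."""
    return [[matrix[r][c] for c in index] for r in index]


def trace_form(L, Y):
    """tr(Y^T L Y), computed as sum_ij L_ij * <y_i, y_j>."""
    n = len(L)
    return sum(
        L[i][j] * sum(a * b for a, b in zip(Y[i], Y[j]))
        for i in range(n) for j in range(n)
    )


def pairwise_form(A, Y):
    """sum_ij a_ij * ||y_i - y_j||^2 over all ordered pairs."""
    n = len(A)
    return sum(
        A[i][j] * sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
        for i in range(n) for j in range(n)
    )


A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]                      # symmetric weighted adjacency
Y = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # one embedding row per node
L = laplacian(A)
batch_block = submatrix(L, [0, 1])         # Laplacian block for nodes 0 and 1
```

Note that the block for a mini-batch still carries the full degrees on its diagonal, because `D` is computed from the whole graph before slicing.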
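    if not self.use_rejection_sampling:
        alias_edges = {}
        for edge in G.edges():
            alias_edges[edge] = self.get_alias_edge(edge[0], edge[1])
        self.alias_edges = alias_edges
This is the code in walker.py that computes the transition probabilities on the fly from the previous node t and the current node v. Why is it placed under the condition if not self.use_rejection_sampling:? Even without negative sampling this computation should still happen and alias_edges should still be updated, shouldn't it?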
In node2vec, when sampling neighbors, you normalize the weights (probabilities):
    alias_nodes = {}
    for node in G.nodes():
        unnormalized_probs = [G[node][nbr].get('weight', 1.0)
                              for nbr in G.neighbors(node)]
        norm_const = sum(unnormalized_probs)
        normalized_probs = [
            float(u_prob)/norm_const for u_prob in unnormalized_probs]
        alias_nodes[node] = create_alias_table(normalized_probs)
Then every probability is less than 1, so during alias sampling all of them end up in small, which makes it no different from not using the alias method at all. Doesn't that effectively make the sampling uniform?
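Whether everything ends up in `small` depends on whether the table builder rescales its input. In the standard alias method, probabilities that sum to 1 are multiplied by the number of outcomes n before they are split, so entries above the average 1/n land in `large`. The sketch below is a generic implementation of that method, not a copy of the repository's `create_alias_table`; it also reconstructs the distribution implied by the finished table to show that it matches the input rather than being uniform. All function names are hypothetical.

```python
import random


def build_alias_table(probs):
    """Alias method for probabilities that sum to 1 (rescaled by n internally)."""
    n = len(probs)
    scaled = [p * n for p in probs]
    accept, alias = [0.0] * n, [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    first_split = (list(small), list(large))  # kept only for inspection
    while small and large:
        lo, hi = small.pop(), large.pop()
        accept[lo] = scaled[lo]
        alias[lo] = hi
        scaled[hi] -= 1.0 - scaled[lo]  # hi donates the mass that lo is missing
        (small if scaled[hi] < 1.0 else large).append(hi)
    for i in small + large:  # leftovers are full columns
        accept[i] = 1.0
        alias[i] = i
    return accept, alias, first_split


def alias_sample(accept, alias, rng=random):
    """Draw one index in O(1): pick a column, then keep it or take its alias."""
    i = int(rng.random() * len(accept))
    return i if rng.random() < accept[i] else alias[i]


def implied_distribution(accept, alias):
    """Exact probability of each outcome under the table."""
    n = len(accept)
    dist = [0.0] * n
    for i in range(n):
        dist[i] += accept[i] / n
        dist[alias[i]] += (1.0 - accept[i]) / n
    return dist


probs = [0.5, 0.3, 0.2]
accept, alias, first_split = build_alias_table(probs)
dist = implied_distribution(accept, alias)
```

With these inputs the initial split puts outcome 0 in `large` and outcomes 1 and 2 in `small`, even though every input probability is below 1.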
Reference: https://github.com/dkaslovsky/GraphRole
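Hello, I found some problems while using the struc2vec code that builds the structural similarity.
Specifically, when opt2_reduce_sim_calc is enabled, the get_vertices function returns, for each node, the neighbors that are similar to that node, and this similarity is one-directional. That is, if a is similar to b, then b appears among a's neighbors, and if b is also similar to a, then a appears among b's neighbors (much like a directed graph). However, the later _get_layer_rep method treats this similarity as undirected, which means it only accounts for the case where opt2_reduce_sim_calc is False.
Consequently, when opt2 is True, a and b each have the other among their similar neighbors, and because both an "in" and an "out" edge are stored for every node when the edges are built, duplicate edges result. In other words, I believe _get_layer_rep behaves incorrectly when the opt2_reduce_sim_calc option is True.
Looking forward to your reply.
Finally, thank you for open-sourcing this code; it has made my work much easier and saved me a lot of time.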
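Hello, when I tried to run your sdne_wiki.py code, I got the following error:
AttributeError: module 'tensorflow' has no attribute 'to_float'
My Python version is 3.6.7 and tf.version = "2.0.0-alpha0". I suspect that my version is too new. How can I get your code to run? Thanks!
Hello, why does your project require TensorFlow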
I spent nearly a month running experiments with the network embedding code provided by this author without getting any results, and I was close to breaking down. After checking my own algorithm countless times, I finally found that the problem was in the deepwalk code provided by this author. If you want to use deepwalk, please use https://github.com/phanein/deepwalk.
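I ran evaluate_embeddings with the parameters from the DeepWalk paper,
--number-walks 80 --representation-size 128 --walk-length 40 --window-size 10
but the results are far from those reported in the original paper. What could be the problem?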
Hello!
Thank you for providing this wonderful tool for study. I changed the `second` option into `all` for `LINE` (line 48 of `GraphEmbedding/examples/line_wiki.py`), and encountered the following error:
....
97/97 - 1s - loss: nan - first_order_loss: nan - second_order_loss: 0.0503
Epoch 48/50
97/97 - 1s - loss: nan - first_order_loss: nan - second_order_loss: 0.0480
Epoch 49/50
97/97 - 1s - loss: nan - first_order_loss: nan - second_order_loss: 0.0485
Epoch 50/50
97/97 - 1s - loss: nan - first_order_loss: nan - second_order_loss: 0.0472
Training classifier using 80.00% nodes...
Traceback (most recent call last):
File "line_wiki.py", line 52, in <module>
evaluate_embeddings(embeddings)
File "line_wiki.py", line 19, in evaluate_embeddings
clf.split_train_evaluate(X, Y, tr_frac)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/ge-0.0.0-py3.6.egg/ge/classify.py", line 66, in split_train_evaluate
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/ge-0.0.0-py3.6.egg/ge/classify.py", line 34, in train
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/multiclass.py", line 239, in fit
for i, column in enumerate(columns))
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/parallel.py", line 917, in __call__
if self.dispatch_one_batch(iterator):
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/_parallel_backends.py", line 182, in apply_async
result = ImmediateResult(func)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/_parallel_backends.py", line 549, in __init__
self.results = batch()
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/parallel.py", line 225, in __call__
for func, args, kwargs in self.items]
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/joblib-0.13.0-py3.6.egg/joblib/parallel.py", line 225, in <listcomp>
for func, args, kwargs in self.items]
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/multiclass.py", line 79, in _fit_binary
estimator.fit(X, y)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/linear_model/_logistic.py", line 1527, in fit
accept_large_sparse=solver != 'liblinear')
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/utils/validation.py", line 755, in check_X_y
estimator=estimator)
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/utils/validation.py", line 578, in check_array
allow_nan=force_all_finite == 'allow-nan')
File "/export/d1/shuaiw/GraphEmbedding/env/lib/python3.6/site-packages/scikit_learn-0.22.2.post1-py3.6-linux-x86_64.egg/sklearn/utils/validation.py", line 60, in _assert_all_finite
msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Could anyone take a look and see if that can be fixed? Thank you very much!
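If I have understood the code correctly, this part performs negative sampling.
https://github.com/shenweichen/GraphEmbedding/blob/master/ge/models/line.py#L170-L173
However, the negative sampling here does not exclude neighboring nodes or the node itself.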
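For illustration, a negative sampler that does exclude the source node and its neighbors could look like the sketch below. It is not the repository's sampler: it draws uniformly instead of using a degree-based alias table, `adjacency` is a hypothetical dict mapping each node to a list of neighbors, and `sample_negatives` is a hypothetical name.

```python
import random


def sample_negatives(node, adjacency, all_nodes, num_negatives, rng=random):
    """Draw negatives uniformly from nodes that are neither `node` nor adjacent to it."""
    excluded = set(adjacency.get(node, ())) | {node}
    candidates = [n for n in all_nodes if n not in excluded]
    if not candidates:
        return []  # every node is the source itself or one of its neighbors
    return [rng.choice(candidates) for _ in range(num_negatives)]


adjacency = {0: [1, 2], 1: [0], 2: [0], 3: [], 4: []}
negatives = sample_negatives(0, adjacency, range(5), 10, rng=random.Random(1))
```

Building the candidate list costs time linear in the number of nodes per call; on large graphs, drawing from the existing sampling table and redrawing whenever the result is excluded would likely be cheaper.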
Hello, I have a problem with Python 3.7:
error: python-dateutil 2.8.1 is installed but python-dateutil<2.8.1,>=2.1 is required by {'botocore'}
Full log
eurvanov@eurvanov-HP-ProBook-430-G5:~/python/adeo/market-radar/synonyms-service/research/GraphEmbedding$ python setup.py install
running install
running bdist_egg
running egg_info
writing ge.egg-info/PKG-INFO
writing dependency_links to ge.egg-info/dependency_links.txt
writing requirements to ge.egg-info/requires.txt
writing top-level names to ge.egg-info/top_level.txt
reading manifest file 'ge.egg-info/SOURCES.txt'
writing manifest file 'ge.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/ge
copying build/lib/ge/utils.py -> build/bdist.linux-x86_64/egg/ge
copying build/lib/ge/classify.py -> build/bdist.linux-x86_64/egg/ge
copying build/lib/ge/alias.py -> build/bdist.linux-x86_64/egg/ge
copying build/lib/ge/init.py -> build/bdist.linux-x86_64/egg/ge
creating build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/node2vec.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/deepwalk.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/struc2vec.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/init.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/line.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/models/sdne.py -> build/bdist.linux-x86_64/egg/ge/models
copying build/lib/ge/walker.py -> build/bdist.linux-x86_64/egg/ge
byte-compiling build/bdist.linux-x86_64/egg/ge/utils.py to utils.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/classify.py to classify.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/alias.py to alias.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/init.py to init.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/node2vec.py to node2vec.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/deepwalk.py to deepwalk.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/struc2vec.py to struc2vec.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/init.py to init.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/line.py to line.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/models/sdne.py to sdne.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/ge/walker.py to walker.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying ge.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying ge.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying ge.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying ge.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying ge.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating 'dist/ge-0.0.0-py3.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing ge-0.0.0-py3.7.egg
Removing /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/ge-0.0.0-py3.7.egg
Copying ge-0.0.0-py3.7.egg to /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages
ge 0.0.0 is already the active version in easy-install.pth
Installed /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/ge-0.0.0-py3.7.egg
Processing dependencies for ge==0.0.0
Searching for python-dateutil>=2.1
Reading https://pypi.org/simple/python-dateutil/
Downloading https://files.pythonhosted.org/packages/d4/70/d60450c3dd48ef87586924207ae8907090de0b306af2bce5d134d78615cb/python_dateutil-2.8.1-py2.py3-none-any.whl#sha256=75bb3f31ea686f1197762692a9ee6a7550b59fc6ca3a1f4b5d7e32fb98e2da2a
Best match: python-dateutil 2.8.1
Processing python_dateutil-2.8.1-py2.py3-none-any.whl
Installing python_dateutil-2.8.1-py2.py3-none-any.whl to /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages
writing requirements to /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/python_dateutil-2.8.1-py3.7.egg/EGG-INFO/requires.txt
Adding python-dateutil 2.8.1 to easy-install.pth file
Installed /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/python_dateutil-2.8.1-py3.7.egg
Searching for botocore<1.14.0,>=1.13.26
Reading https://pypi.org/simple/botocore/
Downloading https://files.pythonhosted.org/packages/8a/93/ea2ec042794dfda186348df02c6057223a8bbc21c055124fbe3e16925441/botocore-1.13.26-py2.py3-none-any.whl#sha256=9fefb42c6d4fa0079a52b49e5491fa0738cca63649f68be180b3ed6c253d2622
Best match: botocore 1.13.26
Processing botocore-1.13.26-py2.py3-none-any.whl
Installing botocore-1.13.26-py2.py3-none-any.whl to /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages
writing requirements to /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/botocore-1.13.26-py3.7.egg/EGG-INFO/requires.txt
Adding botocore 1.13.26 to easy-install.pth file
Installed /home/eurvanov/anaconda3/envs/mr-research/lib/python3.7/site-packages/botocore-1.13.26-py3.7.egg
error: python-dateutil 2.8.1 is installed but python-dateutil<2.8.1,>=2.1 is required by {'botocore'}
I was able to fix it. The problem was caused by the 'gensim' version.
ImportError: No module named 'tensorflow.python.keras'
What is the cause of this?
Traceback (most recent call last):
File "deepwalk_wiki.py", line 4, in
from ge.classify import read_node_label, Classifier
File "//anaconda/lib/python3.5/site-packages/ge-0.0.0-py3.5.egg/ge/init.py", line 1, in
File "//anaconda/lib/python3.5/site-packages/ge-0.0.0-py3.5.egg/ge/models/init.py", line 1, in
File "//anaconda/lib/python3.5/site-packages/ge-0.0.0-py3.5.egg/ge/models/deepwalk.py", line 20, in
File "//anaconda/lib/python3.5/site-packages/ge-0.0.0-py3.5.egg/ge/walker.py", line 7, in
ImportError: No module named 'joblib'
Do none of these examples have weights?
The _create_A_L function:
    A_ = sp.csr_matrix((A_data + A_data, (A_row_index + A_col_index, A_col_index + A_row_index)),
                       shape=(node_size, node_size))
    D = sp.diags(A_.sum(axis=1).flatten().tolist()[0])
    L = D - A_
    return A, L