Comments (11)
For the yaml format, please refer to the document page.
The pickled object is a dict that contains all embeddings, and their index mappings to the original node names. The following lines may help you understand what is in the pickled object.
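import pickle

with open("line_blogcatalog.pkl", "rb") as fin:
    blogcatalog = pickle.load(fin)
# List the top-level entries of the pickled object.
print(blogcatalog.keys())
# Index-to-name mapping and the embedding matrix.
names = blogcatalog.id2name
embeddings = blogcatalog.vertex_embeddings
# Node name and embedding vector at index 1024.
print(names[1024], embeddings[1024])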
from graphvite.
Thank you for your prompt reply. The document is helpful but a little confusing to me. So if I want to run on my own edge_link file, do I need to create a separate yaml file and change the global yaml file to point to the dataset path? Also, I am not quite sure which data formats your project accepts.
This information may be stored somewhere in the document. Still, an easy starting point in the readme file would be very helpful for people who only want to compute embeddings in a quick run.
The dataset path in the global yaml file is for downloading/caching standard datasets. You don't necessarily need to modify it in most cases.
To run your own dataset,
- Fork an existing yaml.
- Change dataset paths in graph, evaluation. It's better to use absolute paths.
- Change hyperparameters if necessary.
For all applications, the dataset format is strings separated by delimiters. Each line may end with a comment. By default, the delimiters are any blank characters, and the comment prefix is "#". For node embeddings, the following examples are valid.
- # this is a comment line
- xxx yyy 1.5 # some comment
- xxx yyy # some comment
The edge weight is optional.
If you're using Python, you can also pass a list of (string, string, float) tuples to the interface.
Sorry for not clarifying the data formats. We will add documentation of dataset format.
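To make the format described above concrete, here is a minimal sketch of a reader for such edge lists. It is not GraphVite's own parser: the function name parse_edge_list is hypothetical, and the default weight of 1.0 for lines without a weight is an assumption, since the thread only says the weight is optional.

```python
def parse_edge_list(lines, delimiters=None, comment="#"):
    """Parse edge-list lines into (head, tail, weight) tuples.

    delimiters=None splits on any run of blank characters,
    matching the default behaviour described in the thread.
    """
    edges = []
    for line in lines:
        # Drop everything from the comment prefix to the end of the line.
        content = line.split(comment, 1)[0].strip()
        if not content:
            continue  # blank line or comment-only line
        tokens = content.split(delimiters)
        if len(tokens) == 2:
            head, tail = tokens
            weight = 1.0  # assumption: default weight when none is given
        elif len(tokens) == 3:
            head, tail = tokens[0], tokens[1]
            weight = float(tokens[2])
        else:
            raise ValueError("expected 2 or 3 fields, got: %r" % line)
        edges.append((head, tail, weight))
    return edges


sample = [
    "# this is a comment line",
    "xxx yyy 1.5 # some comment",
    "xxx yyy # some comment",
]
edges = parse_edge_list(sample)
print(edges)
```

The resulting list of (string, string, float) tuples has the shape mentioned above for the Python interface, so a sketch like this can also serve as a quick sanity check of a file before training.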
Thanks for your explanation!
Looks like I just need to provide a common edge list to it.
I have opened another issue about the evaluation file format. I hope it will be easy for you to add some simple documentation for that one as well.
I also found it a bit confusing what the desired dataset format should be. Could you add some simple examples to the readme file, or maybe tutorials on how to run the code on our own data? Thank you.
Sorry to bother you again, but I don't know what the input format for LargeVis is. I know it should be vectors for nodes, but I do not know exactly how the vectors should be formatted.
That's good. I will add it to the document.
If you call LargeVis from yaml, there are two formats, depending on the task. For graph visualization, it's an edge list. For vector visualization, it's an n*d text matrix, i.e. n lines of d-dimensional samples.
The command line graphvite visualize is only designed for vector visualization, but you can also pass a numpy dump of an n*d matrix as input, with the .npy suffix.
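As an illustration of the n*d text matrix, the sketch below writes n vectors as n lines of d numbers each and reads them back. The helper names and the file name vectors.txt are hypothetical, and separating values with a single space is an assumption based on the default "blank characters" delimiter mentioned earlier in this thread.

```python
import os
import tempfile


def write_text_matrix(path, vectors):
    """Write n vectors of equal dimension d as n whitespace-separated lines."""
    dim = len(vectors[0])
    with open(path, "w") as fout:
        for vector in vectors:
            if len(vector) != dim:
                raise ValueError("all rows must have the same dimension")
            fout.write(" ".join(repr(float(x)) for x in vector) + "\n")


def read_text_matrix(path):
    """Read the matrix back as a list of n rows of d floats."""
    with open(path) as fin:
        return [[float(t) for t in line.split()] for line in fin if line.strip()]


# Four samples (n = 4) with three dimensions each (d = 3).
vectors = [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0], [-1.5, 0.0, 4.25], [7.0, 8.0, 9.0]]
path = os.path.join(tempfile.mkdtemp(), "vectors.txt")
write_text_matrix(path, vectors)

matrix = read_text_matrix(path)
print(len(matrix), "x", len(matrix[0]))
```

Reading the file back before handing it to a visualization run is a cheap way to catch rows with a missing or extra value.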
Thank you for the quick reply!
I want to double check: the label file for LargeVis is just an array (length = n) of strings or integers, right?
Yes. It can be either n lines of strings (*.txt) or a 1d numpy array (*.npy).
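For the text variant of the label file, a small sketch could look like the following. The helper names, the file name labels.txt and the value of num_samples are hypothetical. The only facts taken from the thread are that the file holds n lines, one label per sample, and that labels may be strings or integers.

```python
import os
import tempfile


def write_labels(path, labels):
    """Write one label per line; integers are stored as their string form."""
    with open(path, "w") as fout:
        for label in labels:
            fout.write(str(label) + "\n")


def read_labels(path, num_samples):
    """Read labels and check that there is exactly one per sample."""
    with open(path) as fin:
        labels = [line.strip() for line in fin if line.strip()]
    if len(labels) != num_samples:
        raise ValueError(
            "expected %d labels, found %d" % (num_samples, len(labels))
        )
    return labels


num_samples = 4  # hypothetical n, the number of vectors being visualized
path = os.path.join(tempfile.mkdtemp(), "labels.txt")
write_labels(path, ["cat", "dog", 3, "cat"])

loaded = read_labels(path, num_samples)
print(loaded)
```

The length check mirrors the "length = n" condition from the question above: a label file with a different number of lines than the vector file is the most likely formatting mistake.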
Added in v0.2.0