
This project forked from idocx/atom2vec


License: MIT License


PyAtom2Vec

A Python implementation of Atom2Vec: a simple way to describe atoms for machine learning

Background

Atom2Vec was first proposed in Zhou Q, Tang P, Liu S, et al. Learning atoms for materials discovery[J]. Proceedings of the National Academy of Sciences, 2018, 115(28): E6411-E6417.

It is a simple but powerful method for transforming atoms into vectors, much like Word2Vec in NLP.

Requirements

To run this program, you need the SciPy and NumPy packages. To generate your own dataset, you also need the Requests package for web requests.

  • An Anaconda environment is highly recommended.

If pip is installed, you can install these packages with the following commands.

# on Linux
pip3 install scipy numpy requests
# on Windows
pip install scipy numpy requests

How To Use

from Atom2Vec import Atom2Vec

# data_file: path to the dataset file
# vec_length: length of atom vector you want
atoms_vec = Atom2Vec(data_file, vec_length)
atoms_vec.saveAll()

Output:

Generating index 77402/77402 -- Complete!
Building matrix  -- Complete!
SVD -- Complete!
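
The log above names three stages: indexing, matrix building, and SVD. As a rough illustration of the last stage only, here is a minimal sketch that reduces an atom-by-environment count matrix to short atom vectors with NumPy. The function name atom_vectors_from_counts, the toy matrix, the row normalization, and the scaling by singular values are all assumptions made for this example; they are not taken from Atom2Vec.py, which may build and factor its matrix differently.

```python
import numpy as np


def atom_vectors_from_counts(counts: np.ndarray, vec_length: int) -> np.ndarray:
    """Reduce an atom-by-environment count matrix to vec_length-dimensional vectors.

    Hypothetical helper for illustration; not part of the Atom2Vec package.
    """
    # Normalize each row so frequently occurring atoms do not dominate (assumption).
    norms = np.linalg.norm(counts, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    normalized = counts / norms
    # Thin SVD: normalized = U @ diag(S) @ Vt, singular values in descending order.
    u, s, _ = np.linalg.svd(normalized, full_matrices=False)
    # Keep the leading vec_length components, scaled by their singular values.
    return u[:, :vec_length] * s[:vec_length]


# Toy data: 4 made-up atoms (rows) x 5 made-up environments (columns).
toy_counts = np.array(
    [
        [3, 0, 1, 0, 2],
        [2, 0, 1, 0, 3],
        [0, 4, 0, 1, 0],
        [0, 3, 0, 2, 0],
    ],
    dtype=float,
)

toy_vectors = atom_vectors_from_counts(toy_counts, vec_length=2)
print(toy_vectors.shape)  # (4, 2)
```

In this toy matrix, the first two rows share environments and so do the last two, so atoms within each pair end up with nearby vectors while the two pairs are nearly orthogonal.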

This package also includes a dataset obtained from the Materials Project using GetMP.py. The raw responses are stored in string_2.json and string_3.json. Preprocess.py then processes them and saves the result to string.json for later use.

Output File

The output is kept in atoms_vec.txt and atoms_index.txt.

  • atoms_vec.txt contains an M * N matrix, where M is the number of atoms and N is the length of each atom vector. Each row is the vector describing one atom.

  • atoms_index.txt contains an M * 1 matrix. Each row holds an integer, the atomic number of one atom, which identifies the atom represented by the same row of atoms_vec.txt.
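
Because row i of both files refers to the same atom, the two files can be joined into a lookup table keyed by atomic number. The sketch below assumes both files are plain whitespace-delimited text that numpy.loadtxt can read; the README does not state the exact file format, so treat that as an assumption. The helper name load_atom_vectors is hypothetical, and the demonstration writes small stand-in files with made-up numbers rather than real Atom2Vec output.

```python
import os
import tempfile

import numpy as np


def load_atom_vectors(vec_path, index_path):
    """Return {atomic_number: vector}, pairing row i of both files.

    Hypothetical helper; assumes whitespace-delimited text files.
    """
    vectors = np.loadtxt(vec_path, ndmin=2)  # shape (M, N)
    atomic_numbers = np.loadtxt(index_path, ndmin=1).astype(int)  # shape (M,)
    if len(vectors) != len(atomic_numbers):
        raise ValueError("row counts differ between the two files")
    return {int(z): vec for z, vec in zip(atomic_numbers, vectors)}


# Stand-in files with made-up values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vec_path = os.path.join(tmp, "atoms_vec.txt")
    index_path = os.path.join(tmp, "atoms_index.txt")
    np.savetxt(vec_path, [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
    np.savetxt(index_path, [1, 8], fmt="%d")
    lookup = load_atom_vectors(vec_path, index_path)

print(sorted(lookup))  # [1, 8]
```

With real output, pass the paths of atoms_vec.txt and atoms_index.txt instead of the temporary files.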

Test Program

Atom2Vec.py also contains a simple test, which can be run with

# Linux
python3 Atom2Vec.py
# Windows
python Atom2Vec.py

If the program runs correctly, it exits without raising any errors.

Interactive Similarity Map

We can use cosine distance to quantify the similarity between every pair of atoms.

You can find an interactive similarity map at https://www.yuxingfei.com/src/similarity.html
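
For readers who want to compute these values themselves, here is a minimal NumPy sketch of pairwise cosine similarity over the rows of a vector matrix. The function name cosine_similarity_matrix and the example vectors are made up for illustration; it uses the common convention that cosine distance is 1 minus cosine similarity, and it is not necessarily how the linked map was produced.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Return the M x M matrix of cosine similarities between rows.

    Hypothetical helper for illustration.
    """
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    unit = vectors / norms
    # The dot product of unit vectors is the cosine of the angle between them.
    return unit @ unit.T


# Made-up vectors: rows 0 and 1 are parallel, row 2 is orthogonal to both.
example_vectors = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
similarity = cosine_similarity_matrix(example_vectors)
distance = 1.0 - similarity  # cosine distance
print(np.round(similarity, 3))
```

A similarity near 1 (distance near 0) means two atoms have vectors pointing in nearly the same direction, regardless of vector length.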
