
arxiv_crawler's Introduction

ArXiv.org crawler for cyber network

This tool makes it much easier to create cyberlinks. It moves arxiv.org articles to the Great Web, linked by relevant keywords, with the help of a keyword extractor.

The tool:

  • parses article metadata for the keyword you are interested in
  • extracts keywords from the article summary
  • downloads the article as a PDF
  • calculates IPFS hashes of the keywords, title and downloaded PDF
  • pins all hashes to your machine
  • generates unsigned transactions of the form
[keywords_ipfs_hashes] -> article_ipfs_hash
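The last step above can be pictured with a minimal sketch. It assumes each link is a simple pair from one keyword hash to the article hash. The function name `build_links`, the `from`/`to` field names, and the placeholder hash strings are hypothetical and are not taken from this repo or from the cyber transaction format.

```python
from typing import Dict, List


def build_links(keyword_hashes: List[str], article_hash: str) -> List[Dict[str, str]]:
    """Pair every keyword hash with the article hash (hypothetical link shape)."""
    links = []
    seen = set()
    for kw_hash in keyword_hashes:
        # Skip repeated keywords and a keyword that would link the article to itself.
        if kw_hash in seen or kw_hash == article_hash:
            continue
        seen.add(kw_hash)
        links.append({"from": kw_hash, "to": article_hash})
    return links


# Placeholder strings stand in for real IPFS hashes.
example = build_links(["QmKeywordA", "QmKeywordB", "QmKeywordA"], "QmArticle")
print(len(example), "links prepared")
```

Dropping duplicates before building the pairs keeps the same keyword from being linked to the article twice.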

Disclaimer

This is a semi-automatic script. It is a work in progress and still far from ideal.

Requirements

Usage

Parsing arxiv and generating transactions:

  1. Clone this repo and go into it

    git clone https://github.com/SaveTheAles/arxiv_crawler.git
    cd arxiv_crawler
  2. Install python packages

    pip install pandas
    pip install arxiv
    pip install multi-rake
    pip install ipfshttpclient
    pip install progressbar
    # json ships with the Python standard library, so it needs no pip install
  3. Fill config.py

    Set the ADDRESS variable to your cyber address and the QUERY variable to the keyword you want to explore.

  4. Run main.py

    python3 main.py

    This command produces ./data/txs/link_txN.json files containing transactions prepared for signing and broadcasting to the cyber network.
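Step 3 above could look like the following sketch of config.py. Only the variable names ADDRESS and QUERY come from this README. The values are placeholders, and the `check_config` helper is a hypothetical addition for illustration, not part of the repo.

```python
# config.py (sketch; replace the placeholder values with your own)

# Your cyber address; the string below is a placeholder, not a real address.
ADDRESS = "<your cyber address>"

# Keyword to search arxiv.org for; any search phrase should work here.
QUERY = "machine learning"


def check_config(address: str, query: str) -> None:
    """Hypothetical helper: fail early if either setting is left empty."""
    if not address.strip():
        raise ValueError("ADDRESS must not be empty")
    if not query.strip():
        raise ValueError("QUERY must not be empty")


check_config(ADDRESS, QUERY)
```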

Sign and broadcast transactions. This step uses a simple bash script. It works only if your setup allows signing transactions without a password. Otherwise, update the script or sign the transactions manually.

  1. Make the script executable:

    chmod u+x sign-brod.sh
  2. Define the loop range (from 0 to the number of transactions) in sign-brod.sh

  3. Set the proxy node in sign-brod.sh

  4. Run sign-brod.sh:

    ./sign-brod.sh

    You should see the broadcast transactions as output in the console.
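The loop range in step 2 depends on how many link_txN.json files main.py produced. The sketch below is one way to find out. It assumes the files live in ./data/txs and follow the link_txN.json naming mentioned above. The helper name `list_tx_indices` is hypothetical and not part of this repo.

```python
import re
from pathlib import Path

# Matches names such as link_tx0.json or link_tx12.json and captures N.
TX_NAME = re.compile(r"^link_tx(\d+)\.json$")


def list_tx_indices(tx_dir):
    """Return the sorted N values of link_txN.json files found in tx_dir."""
    indices = []
    for path in Path(tx_dir).glob("link_tx*.json"):
        match = TX_NAME.match(path.name)
        if match:
            indices.append(int(match.group(1)))
    return sorted(indices)


found = list_tx_indices("./data/txs")
print("transaction files found:", len(found))
```

If the indices are contiguous from 0, the number of files found is the upper bound for the loop in sign-brod.sh.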

ToDo

  • Built-in signer and broadcaster
  • Storage for saving state
  • Checks for valid txs and bandwidth

Contribution

Contributions are welcome. If you know how to implement something or make the tool better, do it for the Great Justice! Check the ToDo section and Wishlist for inspiration.

Wishlist

Here you can add, via PR, any features you want.
