
Elasticslurp

This program is a very basic way to identify and investigate open ElasticSearch servers. It is not intended to archive or scrape complete ElasticSearch databases, only to sample a small amount of data from a wide range of indexes, which you can later data-mine for interesting fields.

It's implemented as a multi-stage scraping process, and all information is stored in a SQLite database. The final document samples can be exported to JSON and investigated further with tools like grep, jq, or even just a text editor.

This app requires Shodan Query API credits, which it uses to find the IP addresses of open ElasticSearch boxes. You can check whether you have any credits by logging in to your Shodan account and visiting this page.
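
You can also check your remaining credits programmatically with the official shodan Python package (a sketch, assuming the shodan package is installed; the key string is a placeholder):

import shodan  # pip3 install shodan

api = shodan.Shodan("YOUR_SHODAN_API_KEY")  # placeholder; use your real key
info = api.info()  # account/plan details for this API key
print("Query credits remaining:", info.get("query_credits"))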

Install Instructions

git clone https://github.com/nemec/elasticslurp.git
cd elasticslurp/
python3 -m venv env  # create virtual environment
source env/bin/activate  # activate virtual environment
pip3 install -r requirements.txt
cp config.py.default config.py

Now edit config.py to add your Shodan API key (found on this page).
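
The variable name below is hypothetical; copy whatever name config.py.default actually defines:

# config.py
SHODAN_API_KEY = "YOUR_SHODAN_API_KEY"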

Usage

Follow each step in order, and make sure your virtual environment is activated; otherwise the installed packages will not be found.

Create Database

This database holds data related to one group of search queries. Since SQLite produces database files with little overhead, you should create a new database each time you want to sample data.

python3 main.py create customer.db
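
Under the hood, the create step just initializes an empty SQLite file with the project tables. The real schema lives in the source; as a rough, hypothetical sketch using the table names mentioned later in this README (the columns here are guesses, not the actual schema):

import sqlite3

con = sqlite3.connect("customer.db")
con.executescript("""
CREATE TABLE IF NOT EXISTS IP_SEARCH_RESULT (ip TEXT, port INTEGER, org TEXT, location TEXT);
CREATE TABLE IF NOT EXISTS ES_INDEXES (ip TEXT, name TEXT, doc_count INTEGER, size TEXT);
CREATE TABLE IF NOT EXISTS ES_SAMPLES (ip TEXT, index_name TEXT, data TEXT);
""")
con.commit()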

Search Shodan

The search command will search for ElasticSearch databases on any port matching the provided keyword. A summary of results will tell you how many databases were found and how many new IP addresses were added to the project database. This command can be repeated with multiple keywords or variations on a keyword (e.g. customer, customers) to append all results to the same database.

python3 main.py search --database customer.db customer
# Total results for keyword "customer": 176
# New IP addresses added: 177
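
For context, the Shodan side of this step boils down to a keyword search plus de-duplication of the returned IPs. A sketch of the idea using the shodan package (the exact query string elasticslurp builds is an assumption here):

import shodan

api = shodan.Shodan("YOUR_SHODAN_API_KEY")

seen_ips = set()
# Hypothetical query: an ElasticSearch fingerprint combined with the keyword.
for banner in api.search_cursor('product:"Elastic" customer'):
    ip, port = banner["ip_str"], banner["port"]
    if ip not in seen_ips:  # only count each IP address once
        seen_ips.add(ip)
        print(ip, port, banner.get("org"))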

Inside the IP_SEARCH_RESULT table in the database you'll find additional info about the results that were added, including Organization (Tencent, Amazon, etc.) and Location.
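
For example, to eyeball those columns with the sqlite3 command-line tool (column names assumed from the description above):

sqlite3 customer.db "SELECT ip, org, location FROM IP_SEARCH_RESULT LIMIT 10"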

Scraping Index Information

The third step is to scrape the complete list of Elastic indexes from each host that was added to the database in the previous step. If a host is not online (common with Shodan, since they cache results for some time), a warning will be added to the console output and the host will be ignored when retrieving samples. This process is parallelized, but may take some time if there are many offline hosts.

python3 main.py scrape --database customer.db
# Scraping 1.209.255.255:9200
# Exception connecting to IP 1.209.255.255:9200: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7f4093379160>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f4093379160>: Failed to establish a new connection: [Errno 111] Connection refused)
# Scraping 101.132.255.255:9200
# Scraped 5 indexes from IP 101.132.255.255:9200
# Scraping 101.201.255.255:9200
# Scraped 7 indexes from IP 101.201.255.255:9200
# 100%|███████████████████████████████████████████| 5/5 [00:02<00:00,  2.44it/s]

Index data is stored in the ES_INDEXES table and includes info such as document count for each index and the size of the entire index.
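
This is the same information ElasticSearch exposes through its standard _cat/indices API, which is likely what the scraper queries; you can reproduce a single host's scrape by hand:

import requests  # pip3 install requests

# Ask one host for its index list; _cat/indices is a standard ES endpoint.
resp = requests.get("http://101.132.255.255:9200/_cat/indices?format=json", timeout=10)
for idx in resp.json():
    print(idx["index"], idx["docs.count"], idx["store.size"])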

The configuration variable INDEX_EXCLUSION_LIST_REGEXES is a list of regular expressions which, if matched anywhere in the string, will cause the index to be ignored. Use this to hide common indexes containing useless data (like metrics).
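
A minimal sketch of that filtering logic, assuming it works as described (re.search matches anywhere in the index name; the patterns below are examples, not the defaults):

import re

INDEX_EXCLUSION_LIST_REGEXES = [r"^\.", r"metric", r"monitoring"]

def is_excluded(index_name):
    # An index is ignored if any pattern matches anywhere in its name.
    return any(re.search(p, index_name) for p in INDEX_EXCLUSION_LIST_REGEXES)

print(is_excluded("metricbeat-7.9.1"))  # True
print(is_excluded("customers-2020"))    # False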

Sampling Data

The sample step downloads a few documents from each index which you can browse later to find interesting data. By default, at least 10 documents per index are sampled, but this can be controlled with the --count argument.

python3 main.py sample --database customer.db
# 100%|█████████████████████████████████████████| 15/15 [00:03<00:00,  4.23it/s]
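
For reference, sampling a handful of documents from one index is a single _search request against the standard ElasticSearch API. Roughly what each sample amounts to (the app's exact request and the index name here are assumptions):

import requests

# Fetch up to 10 documents from one index; size mirrors the default sample count.
resp = requests.get("http://101.132.255.255:9200/customers/_search?size=10", timeout=10)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_index"], hit["_source"])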

Displaying Sampled Data

Once you've collected enough data, you can dump all samples from the database into a JSON file for further analysis (the samples are also available in the ES_SAMPLES table). Samples are dumped to stdout, so you can pipe the output to another program or redirect it to a file.

python3 main.py dump --database customer.db
python3 main.py dump --database customer.db > samples.json
python3 main.py dump --database customer.db | grep 'gmail'
python3 main.py dump --database customer.db |
    jq '.[].data._source.name' -cr |  # find the 'name' field inside the document
    grep -v '^null$'  # jq will output null if it can't find the key in the document
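
If you'd rather mine the dump in Python than with jq, the same structure (a list of samples whose data key holds the raw document, matching the jq path .[].data._source above) is easy to walk. A small sketch that tallies which _source fields appear most often:

import json
from collections import Counter

with open("samples.json") as f:
    samples = json.load(f)  # list of sample objects

fields = Counter()
for sample in samples:
    source = sample.get("data", {}).get("_source") or {}
    fields.update(source.keys())  # count each field name once per document

for name, count in fields.most_common(20):
    print(count, name)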
