
Scalable and powerful subdomain-monitoring data pipeline.

License: MIT License


watchdog's Introduction


WatchDog

A powerful ETL subdomain-tracking pipeline

About The Project

Watchdog is an ETL pipeline that tracks the subdomains of specified domains in real time. Its goal is to identify new subdomains as soon as they appear and alert the user immediately. It achieves this through fast subdomain generation using multiprocessing, reliable data streaming with Kafka, flexible and scalable subdomain storage in MongoDB, advanced subdomain processing with PySpark, and workflow management and task coordination with Airflow. With the Telegram notification feature, Watchdog delivers real-time alerts, enabling a quick response to potential security threats. The project is aimed at security professionals, system administrators, and anyone who needs to monitor the subdomains of specified domains in real time.

Features

  • Efficient Subdomain Generation: Watchdog leverages multiprocessing to generate subdomains quickly and accurately, optimizing performance.
  • Real-time Streaming: The pipeline integrates Kafka to provide seamless and reliable data streaming, ensuring up-to-date information.
  • Scalable Storage: Watchdog utilizes MongoDB as its storage solution, enabling flexible and scalable management of subdomains.
  • Advanced Subdomain Processing and Security Scanning: With the power of PySpark, Watchdog efficiently processes and analyzes subdomains, allowing for sophisticated data manipulation. It also scans subdomains and resolves their associated IP addresses, giving a more complete picture of the attack surface and helping to identify potential security threats.
  • Robust Orchestration: Watchdog employs Airflow for effective workflow management and task coordination, ensuring smooth execution.
  • Telegram Notification: Watchdog can send a notification to a Telegram channel or group whenever a new subdomain is found, enabling real-time alerts and a quick response to potential security threats (a sketch of such an alert follows this list).
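
A minimal sketch of such an alert via the Telegram Bot API, using the requests library; the token, chat id, and helper name here are placeholders, not the project's actual notification code:

    import requests

    BOT_TOKEN = "<your-bot-token>"          # placeholder: token from @BotFather
    CHAT_ID = "<your-chat-or-channel-id>"   # placeholder: target channel/group

    def notify_new_subdomain(subdomain):
        # Telegram Bot API sendMessage endpoint
        url = f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage"
        requests.post(
            url,
            data={"chat_id": CHAT_ID, "text": f"New subdomain found: {subdomain}"},
            timeout=10,
        )

    notify_new_subdomain("staging.example.com")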

Built With

  • Python
  • Apache Kafka
  • Apache Airflow
  • PySpark
  • MongoDB
  • Docker & Docker Compose

(back to top)

Screenshots

  • Kafka producer: sends generated subdomains to the configured Kafka topic (a producer sketch follows this list).
    (screenshot)
  • Kafka consumer: a Spark Streaming consumer reads subdomains from the topic and stores them in MongoDB.
    (screenshot)
  • MongoDB: a snapshot of the MongoDB collection showing the subdomains tracked so far.
    (screenshot)
  • Airflow: DAG logs showing the status and progress of the ETL pipeline.
    (screenshot)
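
For reference, a minimal sketch of what the producer side might look like, using the kafka-python client with an assumed broker address and topic name (the project's actual producer may differ):

    from kafka import KafkaProducer

    # assumed broker and topic; adjust to match your docker-compose setup
    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    for sub in ["api.example.com", "mail.example.com"]:
        # Kafka messages are raw bytes, so encode each subdomain string
        producer.send("subdomains", sub.encode("utf-8"))

    # block until all buffered messages are actually delivered
    producer.flush()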

(back to top)

Getting Started

Follow these simple steps to get a local copy of WatchDog up and running.

Prerequisites

Before you can use this project, you'll need to have the following installed on your machine:

  • Python 3.10 or higher
  • Docker
  • Docker Compose
  • Airflow

If you don't have these installed, follow the official installation instructions for each tool.

Once you have these tools installed, you'll be ready to use this project.

Installation & Usage

  1. Clone the repo
    git clone https://github.com/AmirAflak/WatchDog.git
  2. Navigate to the project directory:
    cd WatchDog/
  3. Set targets in configs.py:
    TARGETS=['caterpillar.com', 'url.com']
  4. Install the required packages:
    make install
  5. Initialize Docker Compose:
    make docker
  6. Initialize the Spark streaming consumer:
    make consumer
  7. Initialize the Airflow scheduler:
    make scheduler
  8. Initialize the Airflow webserver GUI:
    make webserver
  9. To stop the Docker Compose containers, run:
    make stop

That's it! You should now be able to use the project.

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

watchdog's People

Contributors

  • amiraflak
  • nimafaghih

watchdog's Issues

Add timing for each task

  • Passive subdomain enumeration ⇒ every 4h
  • Name resolution ⇒ every 6h
  • DNS bruteforce ⇒ every 24h
  • Service discovery ⇒ every 8h
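
One way to schedule these intervals, sketched here under the assumption of Airflow 2.x with one DAG per task so each runs on its own cadence; the task names and callables are hypothetical, not the project's actual DAG code:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # intervals taken from the issue description
    INTERVALS = {
        "passive_subdomain_enumeration": timedelta(hours=4),
        "name_resolution": timedelta(hours=6),
        "dns_bruteforce": timedelta(hours=24),
        "service_discovery": timedelta(hours=8),
    }

    def make_dag(task_name, interval):
        # factory function so each DAG captures its own task_name
        with DAG(
            dag_id=f"watchdog_{task_name}",
            start_date=datetime(2023, 1, 1),
            schedule_interval=interval,
            catchup=False,
        ) as dag:
            PythonOperator(
                task_id=task_name,
                python_callable=lambda: print(f"running {task_name}"),  # placeholder
            )
        return dag

    # register one DAG per task at module level so Airflow discovers them
    for name, every in INTERVALS.items():
        globals()[f"dag_{name}"] = make_dag(name, every)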

Utilizing Async Instead of Multiprocessing for Subdomain Generation from Multiple Sources

The subdomain generation in the "core.py" file of the "subfinder" module currently uses multiprocessing to speed up the process. Multiprocessing is resource-intensive, however, and is not an obvious fit for what is largely I/O-bound work. This issue proposes exploring asynchronous programming instead, with the goal of reducing resource usage and improving the efficiency of subdomain generation from multiple sources. The work involves refactoring the existing code to use async and benchmarking it against the current implementation; a sketch of the proposed approach follows.
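
A minimal sketch of the async approach using asyncio and aiohttp; the source URL and function names are illustrative, not the actual subfinder internals:

    import asyncio

    import aiohttp

    # example passive source; the real module queries multiple sources
    SOURCES = [
        "https://crt.sh/?q=%25.{domain}&output=json",
    ]

    async def fetch(session, url):
        # one non-blocking HTTP request; many can be in flight at once
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
            return await resp.text()

    async def enumerate_domain(domain):
        async with aiohttp.ClientSession() as session:
            tasks = [fetch(session, u.format(domain=domain)) for u in SOURCES]
            # gather runs all source queries concurrently on a single thread,
            # avoiding the per-process overhead of multiprocessing
            return await asyncio.gather(*tasks, return_exceptions=True)

    if __name__ == "__main__":
        results = asyncio.run(enumerate_domain("example.com"))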

Optimize Kafka Streaming Code Using More Efficient PySpark Methods

The current implementation of the Kafka streaming code in this file could be optimized to make use of more efficient PySpark methods. This issue proposes to optimize the code to improve its performance and efficiency. Specifically, the following changes could be made:

  • Use PySpark's built-in filter method instead of an if statement to remove unnecessary records from the DataFrame.
  • Use PySpark's select method to select only the necessary columns from the DataFrame, instead of converting the DataFrame to an RDD and then iterating over each row.
  • Use PySpark's withColumn method to add new columns to the DataFrame, instead of creating a new dictionary and appending it to a list.
  • Use PySpark's foreach sink instead of foreachBatch to write the processed data directly to MongoDB, rather than going through a separate batch-writing method.

These changes will make the code more efficient and easier to read and maintain.
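
A rough sketch of the suggested refactor, assuming a plain-string Kafka payload and placeholder broker, topic, and column names (the project's actual schema may differ):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # requires the spark-sql-kafka connector package on the Spark classpath
    spark = SparkSession.builder.appName("watchdog-consumer").getOrCreate()

    stream = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
        .option("subscribe", "subdomains")                    # assumed topic
        .load()
    )

    processed = (
        # select only the needed column instead of iterating over an RDD
        stream.select(F.col("value").cast("string").alias("subdomain"))
        # filter instead of per-row if statements
        .filter(F.col("subdomain").isNotNull())
        # withColumn instead of building dictionaries and appending to a list
        .withColumn("discovered_at", F.current_timestamp())
    )

    def write_row(row):
        # placeholder sink: in the real pipeline this would insert into MongoDB
        print(row.asDict())

    # foreach writes each processed row directly, replacing foreachBatch
    query = processed.writeStream.foreach(write_row).start()
    query.awaitTermination()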

Unable to Run Airflow with Docker Compose Instead of Local Environment

I am facing difficulties while attempting to run Airflow using Docker Compose instead of my local environment. I have followed the official documentation and various online resources, but I am encountering issues that prevent me from successfully setting up Airflow with Docker Compose.

Expected Behavior:
I expect Airflow to be successfully deployed and running within the Docker containers, allowing me to manage and execute my workflows.

Actual Behavior:
Instead, I am encountering the following issues:
🐛 Airflow containers fail to start or crash unexpectedly.
🐛 Airflow web server or scheduler fails to connect to the database.
🐛 Airflow UI is inaccessible or shows errors.
🐛 Airflow tasks fail to execute or get stuck in a pending state.
