CrawlerX - An extensible, distributed, scalable crawler system: a web platform that can be used to crawl URLs over different kinds of protocols in a distributed way.
Is your feature request related to a problem? Please describe.
This issue introduces Kubernetes artifacts for the CrawlerX project. Currently, we can only deploy the platform with Docker orchestration frameworks.
Describe the bug
Even though Firebase is used only for authentication in CrawlerX, it is not good practice to keep its configuration values in a public repo. Instead, we can add an env file to the project and put all the environment variables there.
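A minimal sketch of how the backend could read these values from the environment instead of from committed source. The variable names (`FIREBASE_API_KEY` and so on) and the `load_firebase_config` helper are hypothetical, chosen only for illustration; the real keys depend on the project's Firebase setup.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; the real keys depend on the project's Firebase setup.
REQUIRED_VARS = ("FIREBASE_API_KEY", "FIREBASE_AUTH_DOMAIN", "FIREBASE_PROJECT_ID")


def load_firebase_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect Firebase settings from environment variables, failing fast if any are missing."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

With this approach the env file itself stays out of version control (for example via .gitignore), and only a template listing the variable names would need to be committed.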
Is your feature request related to a problem? Please describe.
Currently it is only possible to view data via the embedded JSON viewer. It would be great if we could export this data as a JSON or CSV file in each project.
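A rough sketch of what the export could look like, assuming the crawled data for a project is available as a list of flat dictionaries. The function names `export_json` and `export_csv` are hypothetical and not part of CrawlerX.

```python
import csv
import io
import json
from typing import Any, Dict, List


def export_json(items: List[Dict[str, Any]]) -> str:
    """Serialize crawled items as a pretty-printed JSON array."""
    return json.dumps(items, indent=2, ensure_ascii=False)


def export_csv(items: List[Dict[str, Any]]) -> str:
    """Serialize crawled items as CSV; nested values are written via str()."""
    # Use the union of keys in first-seen order so items with
    # different fields still share a single header row.
    fieldnames: List[str] = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()
```

Fields missing from an item are left empty in the CSV output, so items scraped from pages with different structures can still be exported together.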
Is your feature request related to a problem? Please describe.
As per the current implementation, CrawlerX supports only HTTP and HTTPS URLs. This needs to be extended to support Tor browser URLs.
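One possible starting point, sketched under assumptions: "Tor browser URLs" is taken to mean hosts ending in .onion, and a Tor client is assumed to be running locally on its default SOCKS port (9050). The helpers `is_onion_url` and `proxy_for` are hypothetical names, and how the proxy is wired into the crawler is left open here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: a local Tor client listening on its default SOCKS port (9050).
TOR_PROXY = "socks5h://127.0.0.1:9050"


def is_onion_url(url: str) -> bool:
    """Return True when the URL's host is a Tor hidden-service (.onion) address."""
    host = urlsplit(url).hostname or ""
    return host.lower().endswith(".onion")


def proxy_for(url: str, tor_proxy: str = TOR_PROXY) -> Optional[str]:
    """Route .onion URLs through the Tor proxy; fetch everything else directly."""
    return tor_proxy if is_onion_url(url) else None
```

Checking the hostname rather than the whole string avoids false positives such as a regular site with "onion" in its path.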
Hi.
I ran docker-compose up --build and got an error:
I tried installing Twisted's dependencies in the Dockerfile and changing the Twisted version in requirements.txt, but it didn't solve the problem. This is the line I added to the Dockerfile:
RUN apt-get update && apt-get install -y gcc libc6-dev
Is your feature request related to a problem? Please describe.
Currently it supports only a basic data-saving mechanism for each user. Since this is a data scraping server, it should be able to manage data in a more user-friendly manner.