kevthehermit / pastehunter
Scanning pastebin with yara rules
License: GNU General Public License v3.0
/usr/bin/PasteHunter-master# python pastehunter.py
Traceback (most recent call last):
File "pastehunter.py", line 8, in
import requests
ImportError: No module named requests
Pastebin has updated its API links. If you don't change them by April 27, you will be unable to use the scraper. See Pastebin's scraping documentation: https://pastebin.com/doc_scraping_api
Use these:
api_scrape : https://scrape.pastebin.com/api_scraping.php
api_raw : https://scrape.pastebin.com/api_scrape_item.php?i=
Instead of these:
api_scrape : https://pastebin.com/api_scraping.php
api_raw : https://pastebin.com/api_scrape_item.php?i=
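For anyone updating by hand, the change amounts to swapping the host on both settings. A minimal sketch, assuming the pastebin settings have been loaded into a dict with the `api_scrape` and `api_raw` keys shown above; the `migrate_pastebin_urls` helper is hypothetical and not part of PasteHunter:

```python
OLD_PREFIX = "https://pastebin.com/"
NEW_PREFIX = "https://scrape.pastebin.com/"


def migrate_pastebin_urls(pastebin_conf):
    """Return a copy of the pastebin settings with both scraping URLs moved to the new host."""
    updated = dict(pastebin_conf)
    for key in ("api_scrape", "api_raw"):
        value = updated.get(key, "")
        if value.startswith(OLD_PREFIX):
            # Keep the path and query string, replace only the host part.
            updated[key] = NEW_PREFIX + value[len(OLD_PREFIX):]
    return updated
```

Running it a second time leaves the settings unchanged, since the new URLs no longer start with the old prefix.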
`ERROR:pastehunter.py:Unable to store Vp9hd6Pa to ConnectionError((<urllib3.connection.HTTPConnection object at 0x7f16044b0160>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')) caused by: ConnectTimeoutError((<urllib3.connection.HTTPConnection object at 0x7f16044b0160>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)'))
PUT http://192.168.1.22:9200/paste-test-2018-42/paste/Tv33mSXj [status:N/A request:10.010s]
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py", line 115, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 333, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
(self.host, self.timeout))
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')
ERROR:pastehunter.py:Unable to store Tv33mSXj to ConnectionError((<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')) caused by: ConnectTimeoutError((<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)'))
INFO:pastehunter.py:Sleeping for 300 Seconds
`
After leaving the script running under nohup for approximately 48 hours, the script stops finding any hits. Here is an excerpt from the logs:
2019-01-27 23:13:37,939 [MainThread ] INFO:Blacklisted pastebin.com paste Td8GRTQf
2019-01-27 23:18:08,265 [MainThread ] INFO:Populating Queue
2019-01-27 23:18:08,351 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:18:09,487 [MainThread ] INFO:Added 80 Items to the queue
2019-01-27 23:18:18,207 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:18:57,755 [MainThread ] INFO:Blacklisted pastebin.com paste 1QUb8s9U
2019-01-27 23:23:18,700 [MainThread ] INFO:Populating Queue
2019-01-27 23:23:18,719 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:23:19,070 [MainThread ] INFO:Added 112 Items to the queue
2019-01-27 23:23:19,071 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:28:19,162 [MainThread ] INFO:Populating Queue
2019-01-27 23:28:19,168 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:28:19,452 [MainThread ] INFO:Added 91 Items to the queue
2019-01-27 23:28:19,453 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:33:19,544 [MainThread ] INFO:Populating Queue
2019-01-27 23:33:19,549 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:33:19,742 [MainThread ] INFO:Added 85 Items to the queue
2019-01-27 23:33:19,742 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:38:19,804 [MainThread ] INFO:Populating Queue
2019-01-27 23:38:19,810 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:38:20,088 [MainThread ] INFO:Added 92 Items to the queue
2019-01-27 23:38:20,089 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:43:20,188 [MainThread ] INFO:Populating Queue
2019-01-27 23:43:20,193 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:43:20,442 [MainThread ] INFO:Added 77 Items to the queue
2019-01-27 23:43:20,443 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:48:20,544 [MainThread ] INFO:Populating Queue
2019-01-27 23:48:20,549 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:48:20,835 [MainThread ] INFO:Added 92 Items to the queue
2019-01-27 23:48:20,836 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:53:20,917 [MainThread ] INFO:Populating Queue
2019-01-27 23:53:20,919 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:53:21,231 [MainThread ] INFO:Added 99 Items to the queue
2019-01-27 23:53:21,231 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-27 23:58:21,324 [MainThread ] INFO:Populating Queue
2019-01-27 23:58:21,327 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:58:21,494 [MainThread ] INFO:Added 96 Items to the queue
2019-01-27 23:58:21,495 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-28 00:03:21,581 [MainThread ] INFO:Populating Queue
2019-01-28 00:03:21,587 [MainThread ] INFO:Fetching paste list from inputs.pastebin
2019-01-28 00:03:21,808 [MainThread ] INFO:Added 82 Items to the queue
2019-01-28 00:03:21,808 [MainThread ] INFO:Sleeping for 300 Seconds
2019-01-28 00:08:21,908 [MainThread ] INFO:Populating Queue
2019-01-28 00:08:21,913 [MainThread ] INFO:Fetching paste list from inputs.pastebin
As you can see, the script stops finding anything. Happy to provide any more information where I can.
Restarting the script solves this issue.
Good Afternoon,
First off, thank you for this project. Very excited to run this.
I have a request to add in the Tags field which would be populated by which Yara rule triggered to cause the collection.
Thank you!
On my system I was finding that pastehunter was pegging my CPU at 100% because the worker processes spent most of their time busy-looping round q.empty()
I've put a simple sleep statement in (PR below) and it now performs much better (and my fans are quieter!)
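The PR itself is not reproduced here, but the idea can be sketched roughly as follows. This assumes a worker that polls a queue; `handle`, `idle_sleep` and `max_idle_polls` are hypothetical names, and `max_idle_polls` exists only so the sketch can terminate:

```python
import queue
import time


def paste_scanner(q, handle, idle_sleep=0.5, max_idle_polls=None):
    """Drain the queue, sleeping briefly when it is empty instead of spinning.

    A long-running worker would leave max_idle_polls as None and loop forever.
    """
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        try:
            item = q.get_nowait()
        except queue.Empty:
            idle_polls += 1
            time.sleep(idle_sleep)  # yield the CPU rather than re-checking immediately
            continue
        idle_polls = 0
        handle(item)
```

A blocking `q.get()` would avoid polling altogether; the sleep is simply the smallest change to an existing `q.empty()` loop.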
Unfortunately dumpz.org no longer has the recent end point for its api.
https://dumpz.org/help/api
After installing and running PasteHunter, I found that it requires a premium Pastebin account to whitelist our IP for scraping. Is there any other option available?
Hi,
It seems if you create a new Yara rule from scratch, and then restart pastehunter, you get the message below. If you remove the new Yara rule then the restart works.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Service hold-off time over, scheduling restart.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Scheduled restart job, restart counter is at 5.
Feb 22 13:05:19 vps639933 systemd[1]: Stopped PasteHunter.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Start request repeated too quickly.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Failed with result 'start-limit-hit'.
Feb 22 13:05:19 vps639933 systemd[1]: Failed to start PasteHunter.
If it's possible with all of the different site formats, the size field sent from pastes or gists should be submitted as an integer so Elastic and other services can use it for calculations.
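One way this could look, as a sketch only: it assumes each input hands over a dict with a `size` key that may arrive as a string, and `normalise_size` is a hypothetical helper rather than existing PasteHunter code.

```python
def normalise_size(paste_data):
    """Return a copy of paste_data with 'size' coerced to int when possible."""
    cleaned = dict(paste_data)
    try:
        cleaned["size"] = int(cleaned["size"])
    except (KeyError, TypeError, ValueError):
        # Missing or non-numeric size: leave the record as it was.
        pass
    return cleaned
```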
Hi,
The pastehunter.py is giving me random connection issues. Details as follows:
When running fine, there are 7 tasks when I do "systemctl status pastehunter.service"
Main PID: 89222 (python3)
Tasks: 7 (limit: 4633)
CGroup: /system.slice/pastehunter.service
├─89222 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89259 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89260 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89261 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89262 /usr/bin/python3 /opt/pastehunter/pastehunter.py
└─89263 /usr/bin/python3 /opt/pastehunter/pastehunter.py
When the error occurs, the task count drops to 2.
Tasks: 2 (limit: 4633)
CGroup: /system.slice/pastehunter.service
└─2390 /usr/bin/python3 /opt/pastehunter/pastehunter.py
The following errors are from "systemctl status pastehunter.service":
1)
Apr 21 21:43:32 pbsvr python3[80377]: raise ConnectionError(e, request=request)
Apr 21 21:43:32 pbsvr python3[80377]: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='scrape.pastebin.com', port=443): Max retries exceeded with url: /api_scrape_item.php?i=uRnqq9Gj (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f1092269710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
2)
Apr 21 01:55:23 pbsvr python3[80305]: ERROR:pastebin.py:Unable to parse paste results: HTTPSConnectionPool(host='scrape.pastebin.com', port=443): Max retries exceeded with url: /api_scraping.php?limit=200 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fdbddf56a58>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
I have to do a systemctl restart pastehunter.service to restart the service.
Please suggest how to prevent the connection errors.
Thank you.
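"Temporary failure in name resolution" is a DNS lookup failure on the host, so the underlying cause sits outside PasteHunter. The workers could still survive it by retrying instead of dying. A rough sketch, assuming the fetch is wrapped in a callable; `call_with_retries` and its parameters are hypothetical:

```python
import time


def call_with_retries(func, attempts=5, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * (2 ** attempt))
```

In practice `retry_on` would be narrowed to `(requests.exceptions.ConnectionError,)`, the exception shown in the log above.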
After setting up the program and running it, I get this error with elasticsearch:
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 0
PUT /paste-test-2019-10/paste/npNFcthk [status:406 request:0.429s]
ERROR:pastehunter.py:Unable to store npNFcthk to <outputs.elastic_output.ElasticOutput object at 0x7f0f95c7be80> with error TransportError(406, 'Content-Type header [] is not supported')
I found this article and it looks similar:
etsy/411#177
Does anyone have a fix for this?
It's mentioned in the README that you plan to support paste.ee as well. Are you sure their API supports this? I tried it myself and it doesn't seem to work because as I understand it you can only list your own pastes using a user application key (https://pastee.github.io/docs/#pastes). I'll be happy to help implement the feature if you've found another way.
I tried to run the script after setting up all the requirements, but there is an error before the results are sent to Elasticsearch:
Traceback (most recent call last):
File "pastehunter.py", line 100, in
if match.rule == 'core_keywords' or match.rule == 'custom_keywords':
AttributeError: 'str' object has no attribute 'rule'
I used python3.
Hello,
The python script works well for a couple hours, then starts to give this error:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 743, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 850, in _validate_conn
conn.connect()
File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 284, in connect
conn = self._new_conn()
File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x107d30cc0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known
Is this something on my side?
Hi Kev,
any idea how to solve this issue:
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: HTTPSConnectionPool(host='pastebin.com', port=443): Max retries exceeded with url: /api_scraping.php?limit=200 (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)'),))
Thanks a lot for your help.
Marcus
Recently I started receiving these errors. I have not made any configuration changes and my IP is currently whitelisted in Pastebin. Any ideas how to solve it?
It would be cool to have a way to be able to add found links of supported sites to a query list.
For example, one pastebin paste links to a gist which links to a b64 encoded executable back on pastebin that is older.
The only issue is a cycle throwing it into a loop, unless a history is checked against. The feature could easily be turned off by just turning off the post-processor.
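The history check could be as simple as a set of URLs already visited. A minimal sketch, assuming a hypothetical `extract_links(url)` callable that returns the supported-site links found in a paste; `follow_links` and `max_pastes` are illustrative names only:

```python
from collections import deque


def follow_links(start_url, extract_links, max_pastes=50):
    """Breadth-first walk over linked pastes, visiting each URL at most once."""
    seen = {start_url}
    pending = deque([start_url])
    order = []
    while pending and len(order) < max_pastes:
        url = pending.popleft()
        order.append(url)
        for link in extract_links(url):
            if link not in seen:  # the history check that breaks cycles
                seen.add(link)
                pending.append(link)
    return order
```

The `max_pastes` cap is a second safety net in case a chain of links is long rather than cyclic.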
Hello,
Sorry for the stupid question, but it is not so clear to me what to change in the settings.json. I followed some tutorials on TechAnarchy but I'm really not understanding.
I have a Pastebin Pro account, too.
See my output while executing it:
user@server:/opt/pastehunter$ sudo python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:pastehunter.py:Log Level not in config file. Update your base config file!
INFO:pastehunter.py:Setting Log Level to 20
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Input: gists
INFO:pastehunter.py:Enabled Input: dumpz
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: elastic_output
INFO:pastehunter.py:Compile Yara Rules
INFO:pastehunter.py:Enable Blacklist Rules
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
INFO:pastehunter.py:Fetching paste list from inputs.gists
INFO:gists.py:Remaining Limit: 54. Resets at 2018-05-04T18:41:44
ERROR:gists.py:Auth Failed
INFO:pastehunter.py:Fetching paste list from inputs.dumpz
INFO:pastehunter.py:Sleeping for 300 Seconds
I have manually and successfully installed Elastic Search, Kibana, Logstash and Yara.
Can you advise me, please?
Thank you!
I found this while working with Amazon's Elasticsearch environment - SSL is required as our deployment does not support 80/tcp or 9200/tcp, only 443/tcp.
I had to edit https://github.com/kevthehermit/PasteHunter/blob/master/outputs/elastic_output.py#L16 to include use_ssl=True
self.es = Elasticsearch(es_host, port=es_port, http_auth=(es_user,es_pass), use_ssl=True)
Would it be possible to configure this in the settings.conf file? While I understand this might be an edge case, there are probably others out there with the same kind of deployment.
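A sketch of how the option might be read from the config. The key names `elastic_port`, `elastic_user`, `elastic_pass` and `elastic_ssl` are assumptions made for illustration, as is the `build_es_kwargs` helper; only the `port`, `http_auth` and `use_ssl` arguments come from the call quoted above.

```python
def build_es_kwargs(elastic_conf):
    """Translate settings into keyword arguments for the Elasticsearch client."""
    kwargs = {"port": elastic_conf["elastic_port"]}
    if elastic_conf.get("elastic_user"):
        kwargs["http_auth"] = (elastic_conf["elastic_user"], elastic_conf["elastic_pass"])
    # Hypothetical setting: defaults to False so existing configs keep working.
    kwargs["use_ssl"] = bool(elastic_conf.get("elastic_ssl", False))
    return kwargs
```

The output module would then call `Elasticsearch(es_host, **build_es_kwargs(conf))` in place of the hardcoded arguments.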
Hi @kevthehermit,
How do you think this should be run as a service? I mean, with supervise or systemd?
Sorry if this is a noob error, but I am receiving the following error when executing the newly updated PasteHunter:
python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:common.py:Unable to parse config file: Expecting ',' delimiter: line 93 column 5 (char 2481)
Traceback (most recent call last):
File "pastehunter.py", line 38, in
if "logging_level" in conf["general"]:
TypeError: 'NoneType' object is not subscriptable
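The log suggests the config loader reports the parse error and then carries on with an empty result, which is what produces the second, less helpful TypeError. A sketch of failing fast instead, assuming the settings file is plain JSON; `load_config` here is a hypothetical stand-in, not the function in common.py:

```python
import json


def load_config(text):
    """Parse the settings text, stopping with a clear message instead of returning None."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise SystemExit(
            "Unable to parse config file: {0} (line {1}, column {2})".format(
                err.msg, err.lineno, err.colno
            )
        )
```

The reported line and column point at where the parser gave up, so the actual mistake (often a missing comma) is usually on the line just before.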
Hello,
Instead of the alert's SMTP attachment always being named "Alert.json", would it be possible to add the Pastebin resource locator? Maybe in the format "Alert-[URL].json"
For example, the attachment name could be Alert-8sh8js9Q.json
Many Thanks,
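A possible shape for this, sketched under the assumption that the paste ID is available when the attachment is built; `alert_filename` is a hypothetical helper:

```python
import re


def alert_filename(paste_id):
    """Build an attachment name such as Alert-8sh8js9Q.json from a paste ID."""
    # Replace anything outside a conservative character set so the name stays
    # safe for filesystems and mail headers.
    safe_id = re.sub(r"[^A-Za-z0-9_-]", "_", str(paste_id))
    return "Alert-{0}.json".format(safe_id)
```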
Hello,
I just found an issue in the settings file: a comma is missing at the end of the "run_frequency" line.
"general": {
"run_frequency": 300
"logging_level": 20
Should be:
"general": {
"run_frequency": 300,
"logging_level": 20
Cheers
Marcus
Hi, I have installed Elasticsearch, Kibana and the Python libraries, and cloned the package to the box. However, when I run python3 pastehunter.py the tool comes back with the following error:
user@ubuntu:~/pastehunter$ python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:common.py:Unable to parse config file: Expecting ',' delimiter: line 93 column 5 (char 2487)
Traceback (most recent call last):
File "pastehunter.py", line 38, in
if "logging_level" in conf["general"]:
TypeError: 'NoneType' object is not subscriptable
I have tried a couple things such as running pastehunter in different versions of python.
Any ideas?
In settings.json.sample, there is a typo for output_path :
"csv_output": {
"enabled": false,
"module": "outputs.csv_output",
"classname": "CSVOutput",
"output_path": "/logs/csv/"
},
Should be:
"csv_output": {
"enabled": false,
"module": "outputs.csv_output",
"classname": "CSVOutput",
"output_path": "logs/csv/"
},
Hello kev,
first of all let me thank you for creating this great tool.
Could you help me with below error message I'm getting?
That'd be great. Thanks a lot.
Marcus
ERROR:pastehunter.py:Unable to scan raw paste : SVmF9Z9r - could not map file "" into memory
Monitor pastebin user or github/gist users for new pastes/modified pastes or new commits.
I'm trying to make it work, but I'm missing something in the instructions. I've completed all the steps up to the cron script.
http://storage1.static.itmages.com/i/17/0914/h_1505422329_9821783_a362db27be.png
The message is that the file or directory was not found.
I'm running pastehunter with python 3.5.2 on ubuntu and yara 3.4.
As soon as I get a hit and it tries to parse it I get
Traceback (most recent call last):
File "pastehunter.py", line 125, in
if match.rule == 'core_keywords' or match.rule == 'custom_keywords':
AttributeError: 'str' object has no attribute 'rule'
I checked the dict rules.match returns and it seems that it only has one subelement called main which then includes all the elements the filter looks for in a list.
Any idea what I could change?
Many thanks,
Mat
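Going by the description above, the match result in this setup is a dict keyed by namespace rather than a list of match objects, so iterating it yields the string 'main'. A sketch that accepts both shapes; the dict layout is inferred from this report and not verified against any particular yara binding, and `matched_rule_names` is a hypothetical helper:

```python
def matched_rule_names(matches):
    """Return rule names from either style of yara match result."""
    if isinstance(matches, dict):
        # Assumed older shape: {"main": [{"rule": "name", ...}, ...]}
        return [entry["rule"] for entries in matches.values() for entry in entries]
    # Newer shape: an iterable of match objects exposing a .rule attribute
    return [match.rule for match in matches]
```

Installing a current yara-python, where `rules.match` returns match objects, may be the simpler fix if that is an option.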
It doesn't matter what you configure in the settings.conf file; the Elasticsearch index is hardcoded in the elastic_output.py file as paste_test
https://github.com/kevthehermit/PasteHunter/blob/master/outputs/elastic_output.py#L23
When running pastehunter.py I get the following error from pastebin.py:
pastebin.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
Any ideas how to solve this?
Hey,
I think it would be better to have
condition:
$b64_exe at 0
instead of :
condition:
$b64_exe
Readme says copy settings.conf.sample to settings.conf but in common.py the conf_file is looking for 'settings.json' which causes an error
I changed 'settings.json' to 'settings.conf' in common.py , not sure which way was intended.
I followed the whole procedure, but I don't know how to view the information. Does it have a database? Where do I see the information? In the browser? Can someone help me?
I followed all the steps at this link: https://pastehunter.readthedocs.io/en/latest/installation.html
Everything works fine (Ubuntu 16.0.4), but this error keeps appearing right after "Fetching paste list from inputs.gists"
ERROR:pastehunter.py:Unable to store 55599622 to <outputs.csv_output.CSVOutput object at 0x7f279029d208> with error 'scrape_url
What's the issue with it?
as per the website:
" /api/recent
GET
List of recently uploaded dumps.
(!) This API method is no longer available."
and in the terminal it shows : ERROR:dumpz.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
I'm having trouble getting this to work behind an authenticated http proxy. I was wondering if this was even possible without the use of environment variables.
Any suggestions or direction would be amazing!
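requests accepts a `proxies` mapping on each call, so environment variables are not strictly required. A sketch of building that mapping with credentials embedded in the proxy URL; `build_proxies` is a hypothetical helper, and PasteHunter would need its `requests.get` calls changed to pass the result:

```python
from urllib.parse import quote


def build_proxies(host, port, username, password):
    """Build a proxies mapping with credentials embedded in the proxy URL."""
    # Percent-encode the credentials so characters such as '@' or ':' do not break the URL.
    credentials = "{0}:{1}".format(quote(username, safe=""), quote(password, safe=""))
    proxy_url = "http://{0}@{1}:{2}".format(credentials, host, port)
    # The same HTTP proxy is used for both plain and TLS traffic in this sketch.
    return {"http": proxy_url, "https": proxy_url}
```

Usage would look like `requests.get(url, proxies=build_proxies("proxy.example", 3128, "user", "secret"))`, where the host and credentials are placeholders.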
Hi, I already had a project on my server with an instance of Elasticsearch and Kibana, so I changed port 9200 to 9201 and 5601 to 5602 and then ran the docker build. Does that look right?
But when I tried to check whether it works with this command:
curl 127.0.0.1:9201
I got this error:
curl: (56) Recv failure: Connection reset by peer
Has anyone else had this problem? I only changed the port values, so I don't understand why it didn't work.
Thank you in advance
Hi:
The rules always match lots of URLs like the following.
#b64_url
https://gist.githubusercontent.com/lansetiankongFXQ/f7dfc3e0b827c812ebf06a10bae0b961/raw/101ac98fa42d9eb4b27836e122cc81d495e33c21/MyPython_venv_Lib_site-packages_pip-10.0.1-py3.7.egg_pip__vendor_certifi_cacert.pem
#db_connection
https://gist.githubusercontent.com/mzegar/723a6c16e065684f6751bca7dc8fd782/raw/6b6f9968d7a106931ef315066d0ec156a1823112/SaveTheEnvironmentGameJam2018_venv_Lib_site-packages_pip-10.0.1-py3.7.egg_pip__vendor_urllib3_util_url.py
They are Python built-in modules. Is there any way to add a negative rule, so that when a file matches the negative rule it goes onto the blacklist and is not sent to the output module?
Thanks!
Sites like slexy have heavy rate limits. - #52
Add a configurable rate limit per input source
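One possible shape for this, as a sketch: a small limiter object per input that enforces a minimum gap between requests. The `RateLimiter` class and its `min_interval` setting are hypothetical, and the clock and sleep functions are injectable only to make the behaviour easy to check.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between calls for one input source."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least min_interval seconds have passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Each input would create its own limiter from its config section and call `wait()` before every request.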
Tried rebuilding a few times, but, every time, this error comes up.
Up to this point, everything seems to work.
Would appreciate any help.
ERROR:pastehunter.py:Unable to store dxy4fUrc
If of any help, here's what my Docker has running: https://i.imgur.com/6vlnFbP.png
I'm having trouble getting any output from the application. It adds items to the queue but throws a warning about processes.
F:\Scripts[omitted]\PasteHunter>py pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Compile Yara Rules
INFO:pastehunter.py:Enable Blacklist Rules
INFO:pastehunter.py:Enable Test Rules
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Enabled Output: smtp_output
Process Process-2:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanner
if q.empty():
NameError: name 'q' is not defined
Process Process-4:
Process Process-1:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
Process Process-3:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
Process Process-5:
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
Traceback (most recent call last):
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
NameError: name 'q' is not defined
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
NameError: name 'q' is not defined
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
NameError: name 'q' is not defined
NameError: name 'q' is not defined
DEBUG:pastehunter.py:Writing History
INFO:pastehunter.py:Added 196 Items to the queue
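The repeated start-up messages and the NameError are consistent with how multiprocessing behaves on Windows: child processes are spawned rather than forked, so each one re-imports the main module and does not inherit a queue created in the parent. A sketch of the usual pattern, passing the queue in explicitly; the function names are illustrative, not the actual pastehunter.py code:

```python
import multiprocessing


def paste_scanner(q):
    """Process items until a None sentinel arrives; the queue is passed in explicitly."""
    handled = 0
    while True:
        item = q.get()
        if item is None:
            return handled
        handled += 1  # a real worker would scan the paste here


def start_workers(q, count=5):
    """Start worker processes, handing each one the shared queue as an argument."""
    workers = [multiprocessing.Process(target=paste_scanner, args=(q,)) for _ in range(count)]
    for worker in workers:
        worker.daemon = True
        worker.start()
    return workers
```

The start-up code that creates the `multiprocessing.Queue()` and calls `start_workers` would also need to sit under an `if __name__ == "__main__":` guard, so that it does not run again when each child imports the module.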
I have been trying to get the script to pull in pastes for a couple of days. The script hasn't imported any records yet; I keep getting the following error:
Traceback (most recent call last):
File "pastehunter.py", line 95, in
matches = rules.match(data=raw_paste_data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 62: ordinal not in range(128)
Please let me know if there is any other information you need as this is the first time I am putting in an issue.
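The u'\xfc' in the message points to Python 2, where a unicode string gets implicitly encoded as ASCII. Encoding the paste to bytes explicitly before matching is one way around it. A sketch, where `to_scan_bytes` is a hypothetical helper:

```python
def to_scan_bytes(raw_paste_data, encoding="utf-8"):
    """Return the paste as bytes so no implicit ASCII encoding happens during matching."""
    if isinstance(raw_paste_data, bytes):
        return raw_paste_data
    # Characters the encoding cannot represent are replaced instead of raising.
    return raw_paste_data.encode(encoding, errors="replace")
```

The call in the traceback would then become `rules.match(data=to_scan_bytes(raw_paste_data))`. Running under Python 3 may also avoid the problem, though that has not been confirmed here.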
It seems that sometimes the two requests.get(raw_paste_uri).text calls in pastehunter.py fire SSLErrors and freeze the executing thread. This is fixed by adding some try-catches around them and letting them cleanly fail.
Turning this:
raw_paste_data = requests.get(raw_paste_uri).text
Into this:
try:
raw_paste_data = requests.get(raw_paste_uri).text
except requests.exceptions.SSLError as e:
logger.error("Unable to scan raw paste : {0} - {1}".format(paste_data['pasteid'], e))
continue
There may be a place here for a separate request function in the future.
If you turn on the entropy calculator, it fills the default logs with:
INFO:pastehunter.py:Running Post Module postprocess.post_entropy on
It also runs on blacklisted pastes, wasting CPU time.
Affected code:
# If any of the blacklist rules appear then empty the result set
if conf['yara']['blacklist'] and 'blacklist' in results:
results = []
logger.info("Blacklisted {0} paste {1}".format(paste_data['pastesite'], paste_data['pasteid']))
# Post Process
# If post module is enabled and the paste has a matching rule.
post_results = paste_data
for post_process, post_values in conf["post_process"].items():
if post_values["enabled"]:
if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:
logger.debug("Running Post Module {0} on {1}".format(post_values["module"], paste_data["pasteid"]))
post_module = importlib.import_module(post_values["module"])
post_results = post_module.run(results,
raw_paste_data,
paste_data
)
To cut the logic off at the important point:
if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:
This says either any(i in results for i in post_values["rule_list"])
or "ALL" in post_values["rule_list"]
will cause a paste to be parsed. This means a post_processor like "entropy calculator" will be run on EVERY paste, blacklisted or not.
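A sketch of a stricter check, assuming the intent stated in the code comment (post modules run only when the paste has a matching rule) and that the blacklist outcome is kept in a flag; `should_run_post` and `blacklisted` are hypothetical names:

```python
def should_run_post(results, rule_list, blacklisted):
    """Decide whether a post-process module should run for this paste."""
    if blacklisted:
        return False  # never spend time on pastes the blacklist already rejected
    if not results:
        return False  # nothing matched, so there is nothing to post-process
    return "ALL" in rule_list or any(rule in results for rule in rule_list)
```

With this, "ALL" means "all pastes that matched something" rather than "every paste".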
slexy.org banned our server within 5 minutes of running PasteHunter.
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='slexy.org', port=443): Max retries exceeded with url: /raw/s2ZRogGN7i?token=4dbda20607eee17c502beca6b457699fb64c4d67e5e94f66ec4a2411711ff794&ts=1540807309 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc60c810550>: Failed to establish a new connection: [Errno 111] Connection refused'))
The settings for email triggering under the settings.json file are enabled, however the program will not initialize the email triggering.
I am getting the following error. Have I missed something?
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
INFO:pastehunter.py:Fetching paste list from inputs.dumpz
INFO:pastehunter.py:Fetching paste list from inputs.gists
INFO:gists.py:Remaining Limit: 56. Resets at 2018-01-17T15:47:47