
pastehunter's People

Contributors

acabey, alesandroortiz, daverstephens, edznux, fnk0c, h4ckd4ddy, jdsnape, kevthehermit, knightsc, kovacsbalu, mattk-vmw, ntddk, peterdavehello, plazmaz, recrudesce, riccigrj, secbug, sfinlon, slthomason, toanalien

pastehunter's Issues

Problem working

/usr/bin/PasteHunter-master# python pastehunter.py
Traceback (most recent call last):
File "pastehunter.py", line 8, in
import requests
ImportError: No module named requests

Should update the Pastebin API URL

Pastebin updated the API links. If you don't switch to the new ones by April 27, you will be unable to use the scraper. See Pastebin's scraping documentation: https://pastebin.com/doc_scraping_api

Use these:
api_scrape : https://scrape.pastebin.com/api_scraping.php
api_raw : https://scrape.pastebin.com/api_scrape_item.php?i=

Instead of these:
api_scrape : https://pastebin.com/api_scraping.php
api_raw : https://pastebin.com/api_scrape_item.php?i=
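
A minimal sketch of the change, assuming the two keys live in a dict loaded from the settings file. The dict layout and the helper name `migrate_pastebin_urls` are hypothetical; only the key names and URLs come from the issue above.

```python
# Old and new hosts, taken from the issue above.
OLD_HOST = "https://pastebin.com/"
NEW_HOST = "https://scrape.pastebin.com/"


def migrate_pastebin_urls(pastebin_conf):
    """Return a copy of the config with the scraping URLs on the new host.

    `pastebin_conf` is a hypothetical dict holding `api_scrape` and `api_raw`.
    """
    updated = dict(pastebin_conf)
    for key in ("api_scrape", "api_raw"):
        url = updated.get(key, "")
        if url.startswith(OLD_HOST):
            updated[key] = NEW_HOST + url[len(OLD_HOST):]
    return updated


old = {
    "api_scrape": "https://pastebin.com/api_scraping.php",
    "api_raw": "https://pastebin.com/api_scrape_item.php?i=",
}
new = migrate_pastebin_urls(old)
print(new["api_scrape"])
print(new["api_raw"])
```

URLs that already point at the new host are left untouched, so the helper can be run more than once.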

Timing out to 192.168.1.22

`ERROR:pastehunter.py:Unable to store Vp9hd6Pa to ConnectionError((<urllib3.connection.HTTPConnection object at 0x7f16044b0160>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')) caused by: ConnectTimeoutError((<urllib3.connection.HTTPConnection object at 0x7f16044b0160>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)'))
PUT http://192.168.1.22:9200/paste-test-2018-42/paste/Tv33mSXj [status:N/A request:10.010s]
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/elasticsearch/connection/http_urllib3.py", line 115, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 333, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
(self.host, self.timeout))
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')
ERROR:pastehunter.py:Unable to store Tv33mSXj to ConnectionError((<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)')) caused by: ConnectTimeoutError((<urllib3.connection.HTTPConnection object at 0x7f1605576438>, 'Connection to 192.168.1.22 timed out. (connect timeout=10)'))
INFO:pastehunter.py:Sleeping for 300 Seconds
`

Script stops scraping after approx 48 hours

After leaving the script running under nohup for approx 48 hours, the script stops finding any hits. Here is an excerpt from the logs:

2019-01-27 23:13:37,939 [MainThread  ] INFO:Blacklisted pastebin.com paste Td8GRTQf
2019-01-27 23:18:08,265 [MainThread  ] INFO:Populating Queue
2019-01-27 23:18:08,351 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:18:09,487 [MainThread  ] INFO:Added 80 Items to the queue
2019-01-27 23:18:18,207 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:18:57,755 [MainThread  ] INFO:Blacklisted pastebin.com paste 1QUb8s9U
2019-01-27 23:23:18,700 [MainThread  ] INFO:Populating Queue
2019-01-27 23:23:18,719 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:23:19,070 [MainThread  ] INFO:Added 112 Items to the queue
2019-01-27 23:23:19,071 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:28:19,162 [MainThread  ] INFO:Populating Queue
2019-01-27 23:28:19,168 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:28:19,452 [MainThread  ] INFO:Added 91 Items to the queue
2019-01-27 23:28:19,453 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:33:19,544 [MainThread  ] INFO:Populating Queue
2019-01-27 23:33:19,549 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:33:19,742 [MainThread  ] INFO:Added 85 Items to the queue
2019-01-27 23:33:19,742 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:38:19,804 [MainThread  ] INFO:Populating Queue
2019-01-27 23:38:19,810 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:38:20,088 [MainThread  ] INFO:Added 92 Items to the queue
2019-01-27 23:38:20,089 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:43:20,188 [MainThread  ] INFO:Populating Queue
2019-01-27 23:43:20,193 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:43:20,442 [MainThread  ] INFO:Added 77 Items to the queue
2019-01-27 23:43:20,443 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:48:20,544 [MainThread  ] INFO:Populating Queue
2019-01-27 23:48:20,549 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:48:20,835 [MainThread  ] INFO:Added 92 Items to the queue
2019-01-27 23:48:20,836 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:53:20,917 [MainThread  ] INFO:Populating Queue
2019-01-27 23:53:20,919 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:53:21,231 [MainThread  ] INFO:Added 99 Items to the queue
2019-01-27 23:53:21,231 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-27 23:58:21,324 [MainThread  ] INFO:Populating Queue
2019-01-27 23:58:21,327 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-27 23:58:21,494 [MainThread  ] INFO:Added 96 Items to the queue
2019-01-27 23:58:21,495 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-28 00:03:21,581 [MainThread  ] INFO:Populating Queue
2019-01-28 00:03:21,587 [MainThread  ] INFO:Fetching paste list from inputs.pastebin
2019-01-28 00:03:21,808 [MainThread  ] INFO:Added 82 Items to the queue
2019-01-28 00:03:21,808 [MainThread  ] INFO:Sleeping for 300 Seconds
2019-01-28 00:08:21,908 [MainThread  ] INFO:Populating Queue
2019-01-28 00:08:21,913 [MainThread  ] INFO:Fetching paste list from inputs.pastebin

As you can see, the script stops finding anything. Happy to provide any more information where I can.

Restarting the script solves this issue.

PasteHunter pegs CPU at 100%

On my system I was finding that pastehunter was pegging my CPU at 100% because the worker processes spent most of their time busy-looping around q.empty().

I've put a simple sleep statement in (PR below) and it now performs much better (and my fans are quieter!)
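
The idea can be sketched as follows. This is not the code from the PR: the names `handle` and `idle_sleep` are hypothetical, a thread stands in for the worker process to keep the example short, and a `None` item is used as a stop signal so the example terminates.

```python
import queue
import threading
import time


def paste_scanner(q, handle, idle_sleep=0.5):
    """Drain `q`, sleeping briefly when it is empty instead of spinning."""
    while True:
        if q.empty():
            time.sleep(idle_sleep)  # yield the CPU rather than busy-looping
            continue
        item = q.get()
        if item is None:  # stop signal used only by this example
            break
        handle(item)


seen = []
q = queue.Queue()
worker = threading.Thread(target=paste_scanner, args=(q, seen.append, 0.01))
worker.start()
for paste_id in ("a", "b", "c"):
    q.put(paste_id)
q.put(None)
worker.join()
```

With several consumers on one queue, a blocking `q.get(timeout=...)` would avoid the polling altogether; the sleep is simply the smallest change to an existing `q.empty()` loop.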

Premium paste bin required

After installing PasteHunter and running it, I found that it requires a premium Pastebin account to whitelist our IP for scraping. Is there any other option available?

Yara index creation fails when local rule has syntax errors.

Hi,

It seems if you create a new Yara rule from scratch, and then restart pastehunter, you get the message below. If you remove the new Yara rule then the restart works.

Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Service hold-off time over, scheduling restart.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Scheduled restart job, restart counter is at 5.
Feb 22 13:05:19 vps639933 systemd[1]: Stopped PasteHunter.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Start request repeated too quickly.
Feb 22 13:05:19 vps639933 systemd[1]: pastehunter.service: Failed with result 'start-limit-hit'.
Feb 22 13:05:19 vps639933 systemd[1]: Failed to start PasteHunter.

Paste "size" field should be a JSON integer

If it's possible with all of the different site formats, the size field sent from pastes or gists should be submitted as an integer so Elastic and other services can use it for calculations.
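
A minimal sketch of such a normalization step, assuming each paste is a dict with a `size` key that may arrive as a string. The helper name `normalize_size` is hypothetical.

```python
import json


def normalize_size(paste):
    """Return a copy of `paste` whose `size` is an int when it can be parsed."""
    fixed = dict(paste)
    try:
        fixed["size"] = int(fixed["size"])
    except (KeyError, TypeError, ValueError):
        # Missing or non-numeric size: leave the record as it was.
        pass
    return fixed


paste = normalize_size({"pasteid": "abc123", "size": "1024"})
print(json.dumps(paste))
```

Records whose size cannot be parsed are passed through unchanged rather than dropped, so one odd site format does not lose pastes.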

Pastehunter service random connection error

Hi,

The pastehunter.py is giving me random connection issues. Details as follows:
When running fine, there are 7 tasks when I do "systemctl status pastehunter.service"

Main PID: 89222 (python3)
Tasks: 7 (limit: 4633)
CGroup: /system.slice/pastehunter.service
├─89222 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89259 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89260 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89261 /usr/bin/python3 /opt/pastehunter/pastehunter.py
├─89262 /usr/bin/python3 /opt/pastehunter/pastehunter.py
└─89263 /usr/bin/python3 /opt/pastehunter/pastehunter.py

When the error occurs, the task count drops to 2.

Tasks: 2 (limit: 4633)

CGroup: /system.slice/pastehunter.service
└─2390 /usr/bin/python3 /opt/pastehunter/pastehunter.py

Following are errors from "systemctl status pastehunter.service"

Apr 21 21:43:32 pbsvr python3[80377]: raise ConnectionError(e, request=request)
Apr 21 21:43:32 pbsvr python3[80377]: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='scrape.pastebin.com', port=443): Max retries exceeded with url: /api_scrape_item.php?i=uRnqq9Gj (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f1092269710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
2)
Apr 21 01:55:23 pbsvr python3[80305]: ERROR:pastebin.py:Unable to parse paste results: HTTPSConnectionPool(host='scrape.pastebin.com', port=443): Max retries exceeded with url: /api_scraping.php?limit=200 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fdbddf56a58>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

I have to run systemctl restart pastehunter.service to restart the service.
Please suggest how to prevent the connection error.

Thank you.
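
One common way to ride out a temporary DNS failure like the one in these logs is to retry with a growing delay. The sketch below is an assumption about how that could look, not PasteHunter's code; `with_retries` and its parameters are hypothetical names, and `fetch` stands for whatever call performs the HTTP request.

```python
import time


def with_retries(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `fetch()` and retry on OSError with exponential backoff.

    DNS failures (socket.gaierror) and requests' ConnectionError are both
    OSError subclasses, so one except clause covers them.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller decide
            sleep(base_delay * 2 ** attempt)
```

With requests installed, a call might look like `with_retries(lambda: requests.get(url, timeout=10))`. This only softens short outages; it does not explain why the worker processes exit.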

possible issue with elasticsearch

After setting up the program and running it, I get this error with elasticsearch:

INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 0
PUT /paste-test-2019-10/paste/npNFcthk [status:406 request:0.429s]
ERROR:pastehunter.py:Unable to store npNFcthk to <outputs.elastic_output.ElasticOutput object at 0x7f0f95c7be80> with error TransportError(406, 'Content-Type header [] is not supported')

I found this article and it looks similar:
etsy/411#177

Does anyone have a fix for this?

paste.ee

It's mentioned in the README that you plan to support paste.ee as well. Are you sure their API supports this? I tried it myself and it doesn't seem to work because as I understand it you can only list your own pastes using a user application key (https://pastee.github.io/docs/#pastes). I'll be happy to help implement the feature if you've found another way.

AttributeError: 'str' object has no attribute 'rule'

I tried to run the script after setting up all the requirements, but there is an error before the results are sent to Elasticsearch:

Traceback (most recent call last):
File "pastehunter.py", line 100, in
if match.rule == 'core_keywords' or match.rule == 'custom_keywords':
AttributeError: 'str' object has no attribute 'rule'

I used python3.

Socket and urllib3 Error

Hello,

The python script works well for a couple hours, then starts to give this error:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 743, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 284, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x107d30cc0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

Is this something on my side?

import requests requests.packages.urllib3.disable_warnings() # disable SSL warnings

Hi Kev,

any idea how to solve this issue:

INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: HTTPSConnectionPool(host='pastebin.com', port=443): Max retries exceeded with url: /api_scraping.php?limit=200 (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)'),))

Thanks a lot for your help.
Marcus

Blacklisted pastebin.com paste

Recently I started receiving these errors. I have not made any configuration changes and my IP is currently whitelisted in Pastebin. Any ideas how to solve it?

Feature Request: Cyclical Links

It would be cool to have a way to be able to add found links of supported sites to a query list.
For example, one pastebin paste links to a gist which links to a b64 encoded executable back on pastebin that is older.

The only issue is a cycle throwing it into a loop, unless a history is checked against. The feature could easily be turned off by just turning off the post-processor.
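
The history check described above can be sketched as a breadth-first walk with a visited set. Everything here is illustrative: `follow_links`, `get_links` and `max_depth` are hypothetical names, and `get_links(url)` stands for whatever extracts supported-site URLs from a paste.

```python
from collections import deque


def follow_links(start, get_links, max_depth=3):
    """Breadth-first walk over linked pastes that visits each URL once."""
    history = {start}
    order = []
    pending = deque([(start, 0)])
    while pending:
        url, depth = pending.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            if link not in history:  # the history check breaks cycles
                history.add(link)
                pending.append((link, depth + 1))
    return order


# A three-paste cycle: pastebin -> gist -> pastebin -> back to the start.
graph = {"pb/1": ["gist/1"], "gist/1": ["pb/2"], "pb/2": ["pb/1"]}
print(follow_links("pb/1", lambda url: graph.get(url, [])))
```

The depth limit is a second safeguard, so a long chain of distinct links cannot grow the query list without bound.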

What to config?

Hello,
Sorry for the stupid question, but it is not clear to me what to change in settings.json. I followed some tutorials on TechAnarchy but I still don't understand.
I have a Pastebin Pro account, too.
See my output while executing it:
user@server:/opt/pastehunter$ sudo python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:pastehunter.py:Log Level not in config file. Update your base config file!
INFO:pastehunter.py:Setting Log Level to 20
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Input: gists
INFO:pastehunter.py:Enabled Input: dumpz
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: elastic_output
INFO:pastehunter.py:Compile Yara Rules
INFO:pastehunter.py:Enable Blacklist Rules
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
INFO:pastehunter.py:Fetching paste list from inputs.gists
INFO:gists.py:Remaining Limit: 54. Resets at 2018-05-04T18:41:44
ERROR:gists.py:Auth Failed
INFO:pastehunter.py:Fetching paste list from inputs.dumpz
INFO:pastehunter.py:Sleeping for 300 Seconds

I have manually and successfully installed Elastic Search, Kibana, Logstash and Yara.
Can you advise me, please?

Thank you!

SSL Elasticsearch

I found this while working with Amazon's Elasticsearch environment - SSL is required as our deployment does not support 80/tcp or 9200/tcp, only 443/tcp.

I had to edit https://github.com/kevthehermit/PasteHunter/blob/master/outputs/elastic_output.py#L16 to include use_ssl=True

self.es = Elasticsearch(es_host, port=es_port, http_auth=(es_user,es_pass), use_ssl=True)

Would it be possible to configure this in the settings.conf file? While I understand this might be an edge case, there are probably others out there with the same kind of deployment.
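
A sketch of how the flag could be read from configuration instead of being hard-coded. The config keys (`host`, `port`, `user`, `pass`, `ssl`) and the helper `build_es_kwargs` are hypothetical; `port`, `http_auth` and `use_ssl` are the keyword arguments already used in the line quoted above.

```python
def build_es_kwargs(es_conf):
    """Translate a settings dict into keyword arguments for Elasticsearch()."""
    kwargs = {
        "port": es_conf.get("port", 9200),
        # SSL stays off unless the config asks for it.
        "use_ssl": bool(es_conf.get("ssl", False)),
    }
    if es_conf.get("user"):
        kwargs["http_auth"] = (es_conf["user"], es_conf.get("pass", ""))
    return kwargs


conf = {"host": "search.example.com", "port": 443, "ssl": True}
kwargs = build_es_kwargs(conf)
print(kwargs)
# With the elasticsearch client installed, the connection would then be:
#   Elasticsearch(conf["host"], **kwargs)
```

Defaulting `ssl` to false keeps existing plain-HTTP deployments on port 9200 working without any config change.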

Unable to parse config file

Sorry if this is a noob error, but I am receiving the following error when executing the newly updated PasteHunter:

python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:common.py:Unable to parse config file: Expecting ',' delimiter: line 93 column 5 (char 2481)
Traceback (most recent call last):
File "pastehunter.py", line 38, in
if "logging_level" in conf["general"]:
TypeError: 'NoneType' object is not subscriptable
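
The second error in this traceback is a side effect of the first: the config failed to parse, and the script then tried to index the missing result. A minimal sketch of a loader that stops at the real problem, assuming the settings file is plain JSON (`load_config` is a hypothetical name, not the function in common.py):

```python
import json


def load_config(path):
    """Parse a JSON settings file, exiting with a readable error on failure."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as err:
        # Stop here instead of returning None and failing later.
        raise SystemExit(
            "Unable to parse {0}: {1} (line {2}, column {3})".format(
                path, err.msg, err.lineno, err.colno
            )
        )
```

The line and column in the message point at the first token the parser could not accept, which for a missing comma is the start of the following entry.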

[Feature] Pastebin link on Email Attachment

Hello,

Instead of the alert's SMTP attachment always being named "Alert.json", would it be possible to add the Pastebin resource locator to the name? Maybe in the format "Alert-[URL].json".

For example, the attachment name could be Alert-8sh8js9Q.json

Many Thanks,
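
A small sketch of the naming scheme requested above. `attachment_name` is a hypothetical helper; the character filter is an added assumption so that an unusual ID cannot produce an invalid filename.

```python
import re


def attachment_name(paste_id):
    """Build an attachment filename such as Alert-8sh8js9Q.json."""
    # Keep only letters, digits, '-' and '_' from the ID.
    safe_id = re.sub(r"[^A-Za-z0-9_-]", "", paste_id)
    return "Alert-{0}.json".format(safe_id) if safe_id else "Alert.json"


print(attachment_name("8sh8js9Q"))
```

If nothing usable remains of the ID, the helper falls back to the current fixed name.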

comma missing in setting.json

Hello,
I just found an issue in the settings file: a comma is missing at the end of the "run_frequency" line.

"general": {
"run_frequency": 300
"logging_level": 20

Should be:
"general": {
"run_frequency": 300,
"logging_level": 20

Cheers
Marcus

Installation troubleshooting

Hi, I have installed Elasticsearch, Kibana, and the Python libraries, and cloned the package to the box. However, when I run python3 pastehunter.py the tool comes back with the following error:

user@ubuntu:~/pastehunter$ python3 pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 0.1
INFO:pastehunter.py:Reading Configs
ERROR:common.py:Unable to parse config file: Expecting ',' delimiter: line 93 column 5 (char 2487)
Traceback (most recent call last):
File "pastehunter.py", line 38, in
if "logging_level" in conf["general"]:
TypeError: 'NoneType' object is not subscriptable

I have tried a couple things such as running pastehunter in different versions of python.

Any ideas?

typo in settings.json.sample

In settings.json.sample, there is a typo for output_path :

"csv_output": {
"enabled": false,
"module": "outputs.csv_output",
"classname": "CSVOutput",
"output_path": "/logs/csv/"
},

Should be:
"csv_output": {
"enabled": false,
"module": "outputs.csv_output",
"classname": "CSVOutput",
"output_path": "logs/csv/"
},

ERROR:pastehunter.py:Unable to scan raw paste

Hello kev,

first of all let me thank you for creating this great tool.

Could you help me with below error message I'm getting?
That'd be great. Thanks a lot.

Marcus

ERROR:pastehunter.py:Unable to scan raw paste : SVmF9Z9r - could not map file "" into memory

Monitor Specific Users

Monitor pastebin user or github/gist users for new pastes/modified pastes or new commits.

Connect to the docker?

Something isn't working when I try to run it. I want to troubleshoot this, but I can't figure out how to connect to the Docker container and see what is going wrong. Is there some way to see output logs? My Kibana just shows this:

image

rules.match returns strings instead of dict

I'm running pastehunter with python 3.5.2 on ubuntu and yara 3.4.
As soon as I get a hit and it tries to parse it I get

Traceback (most recent call last):
File "pastehunter.py", line 125, in
if match.rule == 'core_keywords' or match.rule == 'custom_keywords':
AttributeError: 'str' object has no attribute 'rule'

I checked the dict rules.match returns and it seems that it only has one subelement called main which then includes all the elements the filter looks for in a list.
Any idea what I could change?

Many thanks,
Mat
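
One way to cope with both result shapes is to normalize them before the rule check. This is a sketch under an assumption: that the older yara bindings return a dict of the form `{namespace: [{'rule': name, ...}]}` (consistent with the single `main` element described above), while newer ones return a list of objects with a `.rule` attribute. `matched_rule_names` is a hypothetical helper.

```python
from types import SimpleNamespace


def matched_rule_names(matches):
    """Return rule names from either shape of `rules.match()` output."""
    names = []
    if isinstance(matches, dict):
        # Assumed older shape: {namespace: [{'rule': name, ...}, ...]}
        for namespace_matches in matches.values():
            names.extend(m["rule"] for m in namespace_matches)
    else:
        # Newer shape: a list of match objects with a `.rule` attribute.
        names.extend(m.rule for m in matches)
    return names


old_style = {"main": [{"rule": "core_keywords"}, {"rule": "b64_exe"}]}
new_style = [SimpleNamespace(rule="core_keywords"), SimpleNamespace(rule="b64_exe")]
print(matched_rule_names(old_style))
print(matched_rule_names(new_style))
```

Iterating the old-style dict directly yields its keys, which are strings, and that would explain the `'str' object has no attribute 'rule'` error.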

improving base64.yar rule

Hey,

I think it would be better to have

    condition:
$b64_exe at 0

instead of :

    condition:
$b64_exe

settings.conf called settings.json mismatch

The README says to copy settings.conf.sample to settings.conf, but in common.py the conf_file is looking for 'settings.json', which causes an error.
I changed 'settings.json' to 'settings.conf' in common.py; I'm not sure which way was intended.

Error 'scrape_url'

Everything works fine (Ubuntu 16.0.4), but this error keeps appearing right after "Fetching paste list from inputs.gists":

ERROR:pastehunter.py:Unable to store 55599622 to <outputs.csv_output.CSVOutput object at 0x7f279029d208> with error 'scrape_url

What's the issue with it?

dumpz.org no longer has scraping api

As per the website:
" /api/recent
GET

List of recently uploaded dumps.

(!) This API method is no longer available."

In the terminal it shows: ERROR:dumpz.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
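
"Expecting value: line 1 column 1 (char 0)" is what Python's JSON parser reports when the response body is empty or is not JSON at all, for example an HTML error page. A minimal sketch of a parser that treats this as "no pastes" (the name `parse_paste_list` and the logger name are hypothetical):

```python
import json
import logging

logger = logging.getLogger("dumpz")


def parse_paste_list(body):
    """Return the decoded paste list, or [] when the body is not valid JSON."""
    try:
        return json.loads(body)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        logger.error("Unable to parse paste results: %s", err)
        return []


print(parse_paste_list('[{"id": "abc"}]'))
print(parse_paste_list("<html>gone</html>"))
```

Since the API method itself has been retired, disabling the input in the settings is the more direct way to silence the error.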

[Feature] support for authenticated http proxy

I'm having trouble getting this to work behind an authenticated http proxy. I was wondering if this was even possible without the use of environment variables.

Any suggestions or direction would be amazing!
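
For reference, the requests library accepts a `proxies` mapping per call, with credentials embedded in the proxy URL. Whether PasteHunter exposes this in its settings is not stated here, so the sketch below only shows how such a mapping could be built; `build_proxies` and the host, port and credentials are placeholders.

```python
from urllib.parse import quote


def build_proxies(user, password, host, port):
    """Build a `proxies` mapping with percent-encoded credentials."""
    # Encode characters such as '@' and ':' so they cannot break the URL.
    auth = "{0}:{1}".format(quote(user, safe=""), quote(password, safe=""))
    proxy_url = "http://{0}@{1}:{2}".format(auth, host, port)
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("alice", "p@ss:word", "proxy.example.com", 3128)
print(proxies["https"])
# With requests installed: requests.get(url, proxies=proxies)
```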

Already had an elasticsearch instance on port 9200 with kibana on port 5601

Hi, I already had a project on my server with an instance of Elasticsearch and Kibana, so I changed port 9200 to 9201 and 5601 to 5602 and ran the docker build. Does it look good?

image

image

But when I tried to check whether it works with this command:
curl 127.0.0.1:9201
I got this error:
curl: (56) Recv failure: Connection reset by peer
Has anyone already run into this problem? I only changed the port values, so I don't understand why it didn't work.

Thank you in advance

Can I add blacklist rule?

Hi:
The rules always match lots of URLs like the following:

#b64_url
https://gist.githubusercontent.com/lansetiankongFXQ/f7dfc3e0b827c812ebf06a10bae0b961/raw/101ac98fa42d9eb4b27836e122cc81d495e33c21/MyPython_venv_Lib_site-packages_pip-10.0.1-py3.7.egg_pip__vendor_certifi_cacert.pem
 #db_connection
https://gist.githubusercontent.com/mzegar/723a6c16e065684f6751bca7dc8fd782/raw/6b6f9968d7a106931ef315066d0ec156a1823112/SaveTheEnvironmentGameJam2018_venv_Lib_site-packages_pip-10.0.1-py3.7.egg_pip__vendor_urllib3_util_url.py

They are Python modules bundled with pip. Is there any way to add a negative rule, so that a file matching it goes onto the blacklist and is not sent to the output modules?
Thanks!

name 'q' is not defined

I'm having trouble getting any output from the application. It adds items to the queue but throws a warning about processes.

F:\Scripts[omitted]\PasteHunter>py pastehunter.py
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Compile Yara Rules
INFO:pastehunter.py:Enable Blacklist Rules
INFO:pastehunter.py:Enable Test Rules
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
WARNING:pastehunter.py:Creating New Process
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Starting PasteHunter Version: 1.0
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Log File: logs/pastehunter.log
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Setting Log Level to 10
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: smtp_output
INFO:pastehunter.py:Enabled Output: smtp_output
Process Process-2:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanner
if q.empty():
NameError: name 'q' is not defined
Process Process-4:
Process Process-1:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
Process Process-3:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
Process Process-5:
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
Traceback (most recent call last):
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
Traceback (most recent call last):
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
NameError: name 'q' is not defined
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
258, in _bootstrap
self.run()
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
NameError: name 'q' is not defined
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line
93, in run
self._target(*self._args, **self._kwargs)
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
File "F:\Scripts[omitted]\PasteHunter\pastehunter.py", line 153, in paste_scanne
r
if q.empty():
NameError: name 'q' is not defined
NameError: name 'q' is not defined
DEBUG:pastehunter.py:Writing History
INFO:pastehunter.py:Added 196 Items to the queue
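
On Windows, multiprocessing starts each child in a fresh interpreter that re-imports the script, which would explain both the repeated startup lines and the missing `q`: a global created only in the parent's main section does not exist in the children. A common remedy is to pass the queue to the worker explicitly. The sketch below illustrates that pattern and is not PasteHunter's code; `run_workers` and the `results` queue are hypothetical.

```python
import multiprocessing


def paste_scanner(q, results):
    """Worker that receives its queues as arguments instead of globals."""
    while True:
        paste_id = q.get()
        if paste_id is None:  # sentinel: no more work
            break
        results.put("scanned " + paste_id)


def run_workers(paste_ids, worker_count=2):
    """Start workers, feed them paste IDs, and collect the results."""
    q = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=paste_scanner, args=(q, results))
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for paste_id in paste_ids:
        q.put(paste_id)
    for _ in workers:
        q.put(None)  # one sentinel per worker
    scanned = sorted(results.get() for _ in paste_ids)
    for worker in workers:
        worker.join()
    return scanned
```

On Windows, `run_workers(...)` has to be called from under an `if __name__ == "__main__":` guard so that the re-import in each child does not start workers of its own.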

Codec can't encode character

I have been trying to get the script to pull in pastes for a couple of days. The script hasn't imported any records yet; I keep getting the following error:
Traceback (most recent call last):
File "pastehunter.py", line 95, in
matches = rules.match(data=raw_paste_data)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 62: ordinal not in range(128)

Please let me know if there is any other information you need as this is the first time I am putting in an issue.
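
The `u'\xfc'` in the message suggests the script ran under Python 2, where passing unicode text to a function that expects bytes triggers an implicit ASCII encode. One possible workaround, assuming `rules.match` accepts bytes for its `data` argument, is to encode explicitly as UTF-8 first. `to_scan_bytes` is a hypothetical helper.

```python
def to_scan_bytes(raw_paste_data):
    """Return UTF-8 bytes for text input; pass bytes through unchanged."""
    if isinstance(raw_paste_data, bytes):
        return raw_paste_data
    return raw_paste_data.encode("utf-8")


print(to_scan_bytes(u"f\xfcr"))
```

The call in the traceback would then become `rules.match(data=to_scan_bytes(raw_paste_data))`. Running the script with python3, as other reports here do, may avoid the problem entirely.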

When SSLErrors occur, threads are frozen

It seems that sometimes the two requests.get(raw_paste_uri).text calls in pastehunter.py raise SSLErrors and freeze the executing thread. This is fixed by wrapping them in try/except blocks and letting them fail cleanly.

Turning this:
raw_paste_data = requests.get(raw_paste_uri).text

Into this:

try:
    raw_paste_data = requests.get(raw_paste_uri).text
except requests.exceptions.SSLError as e:
    logger.error("Unable to scan raw paste : {0} - {1}".format(paste_data['pasteid'], e))
    continue

There may be a place here for a separate request function in the future.
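
Such a separate request function might look like the sketch below. It is an illustration only: `fetch_raw_paste` is a hypothetical name, and `get` is any callable shaped like `requests.get`, passed in so the helper can be exercised without a network.

```python
import logging

logger = logging.getLogger("pastehunter")


def fetch_raw_paste(get, raw_paste_uri, paste_id):
    """Return the paste text, or None if the request fails.

    requests' SSLError and its other request exceptions derive from OSError,
    as does ssl.SSLError, so catching OSError covers them.
    """
    try:
        return get(raw_paste_uri).text
    except OSError as e:
        logger.error("Unable to scan raw paste : {0} - {1}".format(paste_id, e))
        return None
```

A caller inside the scanning loop would `continue` when the result is `None`, which matches the behaviour of the try/except shown above.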

Postprocessing is Noisy and Wasteful

If you turn on the entropy calculator, it fills the default logs with:
INFO:pastehunter.py:Running Post Module postprocess.post_entropy on
It also runs on blacklisted pastes, wasting CPU time.

Affected code:

# If any of the blacklist rules appear then empty the result set
if conf['yara']['blacklist'] and 'blacklist' in results:
    results = []
    logger.info("Blacklisted {0} paste {1}".format(paste_data['pastesite'], paste_data['pasteid']))

# Post Process

# If post module is enabled and the paste has a matching rule.
post_results = paste_data
for post_process, post_values in conf["post_process"].items():
    if post_values["enabled"]:
        if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:
            logger.debug("Running Post Module {0} on {1}".format(post_values["module"], paste_data["pasteid"]))
            post_module = importlib.import_module(post_values["module"])
            post_results = post_module.run(results,
                                            raw_paste_data,
                                            paste_data
                                            )

To cut the logic off at the important point:
if any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"]:

This says either any(i in results for i in post_values["rule_list"]) or "ALL" in post_values["rule_list"] will cause a paste to be parsed. This means a post_processor like "entropy calculator" will be run on EVERY paste, blacklisted or not.
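
A sketch of a stricter check that skips blacklisted pastes. `should_post_process` and its `blacklisted` argument are hypothetical, and requiring at least one match even for "ALL" is an assumption based on the code comment quoted above.

```python
def should_post_process(results, rule_list, blacklisted):
    """Decide whether a post-processing module should run on a paste."""
    if blacklisted or not results:
        return False  # skip blacklisted pastes and pastes with no matches
    return "ALL" in rule_list or any(rule in results for rule in rule_list)


print(should_post_process(["b64_exe"], ["ALL"], blacklisted=False))
print(should_post_process(["blacklist"], ["ALL"], blacklisted=True))
```

In the affected code, the `blacklisted` flag would be set in the branch that empties `results`, before the post-process loop runs.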

Slexy.org connection refused

slexy.org banned our server within 5 minutes of running PasteHunter.

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='slexy.org', port=443): Max retries exceeded with url: /raw/s2ZRogGN7i?token=4dbda20607eee17c502beca6b457699fb64c4d67e5e94f66ec4a2411711ff794&ts=1540807309 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc60c810550>: Failed to establish a new connection: [Errno 111] Connection refused'))

Unable to parse paste results

I am getting the following error. Have I missed something?
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
ERROR:pastebin.py:Unable to parse paste results: Expecting value: line 1 column 1 (char 0)
INFO:pastehunter.py:Fetching paste list from inputs.dumpz
INFO:pastehunter.py:Fetching paste list from inputs.gists
INFO:gists.py:Remaining Limit: 56. Resets at 2018-01-17T15:47:47
