opsdisk / pagodo

pagodo (Passive Google Dork) - Automate Google Hacking Database scraping and searching

License: GNU General Public License v3.0

Python 100.00%
google-dorks google-hacking-database python google dork osint osint-python ghdb google-dork yagooglesearch

pagodo's Issues

Unicode error (ghdb-scraper)

When fetching the GHDB, the scraper raises an error because the file written to disk needs to be encoded as UTF-8.

[*] Initiation timestamp: 20200613_050537
Traceback (most recent call last):
  File ".\ghdb_scraper.py", line 95, in <module>
    retrieve_google_dorks(**vars(args))
  File ".\ghdb_scraper.py", line 55, in retrieve_google_dorks
    fh.write(f"{extracted_dork}\n")
  File "C:\Users\murray\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0131' in position 30: character maps to <undefined>

To fix this, add encoding='utf-8' to the open() call on line 49:

with open(google_dork_file, "w", encoding='utf-8') as fh:
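
The failure and the fix can be reproduced outside the scraper. The following is a minimal sketch, not the project's code: `write_dorks`, the sample dorks, and the temporary file path are hypothetical, and `cp1252` stands in for the Windows default encoding shown in the traceback.

```python
import os
import tempfile


def write_dorks(path, dorks, encoding):
    """Write one dork per line using an explicit text encoding."""
    with open(path, "w", encoding=encoding) as fh:
        for dork in dorks:
            fh.write(f"{dork}\n")


# "\u0131" (dotless i) is the character named in the traceback above.
sample_dorks = ["inurl:admin", "intitle:\u0131ndex"]
path = os.path.join(tempfile.mkdtemp(), "google_dorks.txt")

# cp1252 has no mapping for "\u0131", so this write is expected to fail.
try:
    write_dorks(path, sample_dorks, encoding="cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# With UTF-8 the same data is written and read back unchanged.
write_dorks(path, sample_dorks, encoding="utf-8")
with open(path, encoding="utf-8") as fh:
    round_trip = fh.read().splitlines()
```

Passing the encoding explicitly makes the output independent of the platform's default, which is why the same script can work on Linux and fail on Windows.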

I'm getting the following error:

Great script. However, I'm getting the following error even after installing the requirements:

emily@kali:/Tools/pagodo$ sudo pip3 install -r requirements.txt
[sudo] password for emily:
Requirement already satisfied: beautifulsoup4>=4.6.0 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 1))
Collecting google>=2.0.1 (from -r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/c8/b1/887e715b39ea7d413a06565713c5ea0e3132156bd6fc2d8b165cee3e559c/google-2.0.1.tar.gz
Requirement already satisfied: numpy>=1.13.3 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 3))
Requirement already satisfied: requests>=2.18.4 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 4))
Building wheels for collected packages: google
Running setup.py bdist_wheel for google ... done
Stored in directory: /root/.cache/pip/wheels/b3/6d/94/ad59f018e26ad1987116a8eda758a4dd4285fcb0b4daf7c50d
Successfully built google
Installing collected packages: google
Successfully installed google-2.0.1
emily@kali:/Tools/pagodo$ python ghdb_scraper.py -n 5 -x 3875 -s -t 3
Traceback (most recent call last):
  File "ghdb_scraper.py", line 15, in <module>
    from bs4 import BeautifulSoup
  File "/usr/local/lib/python2.7/dist-packages/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/usr/local/lib/python2.7/dist-packages/bs4/builder/__init__.py", line 314, in <module>
    from . import _html5lib
  File "/usr/local/lib/python2.7/dist-packages/bs4/builder/_html5lib.py", line 70, in <module>
    class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder):
AttributeError: 'module' object has no attribute '_base'

Suggestions? Thanks.

Typo

Should this

python ghdb_scraper.py -n 5 -x 3785 -f -t 3

not read as this?

python ghdb_scraper.py -n 5 -x 3785 -s -t 3

Python3

python3: can't open file '/home/crypticcipher/pagodo/pagoda.py': [Errno 2] No such file or directory

Lol, so I figured out the problem: I spelled pagodo wrong. I spelled it progoda.

SyntaxError: invalid syntax

root@ubuntu:/soft/dork-tools/pagodo# python3 ghdb_scraper.py
File "ghdb_scraper.py", line 33
print(f"[-] Error retrieving google dorks from: {url}")
^
SyntaxError: invalid syntax
root@ubuntu:/soft/dork-tools/pagodo#
root@ubuntu:/soft/dork-tools/pagodo# ls
ghdb_scraper.py google_dorks.json pagodo.py requirements.txt
google_dorks_20181229_113249.txt LICENSE README.md user_agents.txt
root@ubuntu:/soft/dork-tools/pagodo# python3 pagodo.py
File "pagodo.py", line 32
self.log_file = f"pagodo_results_{get_timestamp()}.txt"
^
SyntaxError: invalid syntax
root@ubuntu:/soft/dork-tools/pagodo# python pagodo.py
File "pagodo.py", line 32
self.log_file = f"pagodo_results_{get_timestamp()}.txt"
^
SyntaxError: invalid syntax
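
Both errors point at f-string literals, which Python only accepts from version 3.6 onward, so a likely explanation is that `python3` on this machine is older than 3.6. The following is a small sketch of a version guard, written without f-strings so that it also parses on older interpreters. `supports_fstrings` and `MIN_VERSION` are hypothetical names and are not part of pagodo.

```python
import sys

# f-strings (PEP 498) were added in Python 3.6.
MIN_VERSION = (3, 6)


def supports_fstrings(version_info=None):
    """Return True if the given interpreter version can parse f-strings."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


# Stop early with a readable message instead of a SyntaxError.
if not supports_fstrings():
    sys.exit("Python {}.{} or newer is required".format(*MIN_VERSION))
```

Running `python3 -V` shows which interpreter version the command actually starts.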

[-] ERROR with dork:

I'm getting this error with each dork after every query. Is that normal?

[-] ERROR with dork: about.php?cartID=

0 results for each dork

Hi,
Thank you for your tool. It is very interesting, but unfortunately I can't get it to work: the searches always return 0 results.

root@debian: uname -a
>Linux debian 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u5 (2017-09-19) x86_64 GNU/Linux

root@debian: python3 -V
> Python 3.7.2

root@debian: cat dorks.txt
> filetype:pdf

root@debian: python3 pagodo.py -d yahoo.com -g dorks.txt
>[*] Initiation timestamp: 20190208_085808
>[*] Search ( 1 / 1 ) for Google dork [ site:yahoo.com filetype:pdf ] and waiting 80.32581305323839 seconds between searches using User-Agent 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2 (.NET CLR 3.5.30729)'
>[*] Results: 0 sites found for Google dork: filetype:pdf
>[*] Total dorks found: 0
>[*] Completion timestamp: 20190208_085929
>[+] Done!

No matter which dork I use, I get 0 results. :/

Thanks for the help.

[-] EXCEPTION: HTTP Error 429:

[-] Error with dork: index.of.private
[-] EXCEPTION: HTTP Error 429: Too Many Requests
I used proxychains4 but still had this problem.

[!] Specify a valid file containing Google dorks with -g

I got an error when executing the command from the Hak5 tutorial.

(.venv) josh@josh-pc:~/pagodo$ python3 pagodo.py -d amazon.com -g "dorks/files_containig_juicy_info.dorks" -l 50 -s -e 35.0 -j 1.1

[!] Specify a valid file containing Google dorks with -g #54

EXCEPTION: HTTP Error 429: Too Many Requests

Hello,

I have configured 4 Tor proxies and my proxychains4 configuration looks like this:

round_robin_chain
chain_len = 1
proxy_dns
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
[ProxyList]
socks4  127.0.0.1 9050
socks4  127.0.0.1 9060
socks4  127.0.0.1 9062
socks4  127.0.0.1 9064

and I run pagodo using this command:

proxychains4 python3 pagodo.py -d domain.com -g dorks/files_containing_juicy_info.dorks -l 50 -s -e 60.0 -j 1.1

But I am getting an HTTP error from the very first try:

[proxychains] config file found: /etc/proxychains4.conf
[proxychains] preloading /usr/lib/x86_64-linux-gnu/libproxychains.so.4
[proxychains] DLL init: proxychains-ng 4.14
[*] Initiation timestamp: 20210803_101300
[*] Search ( 1 / 938 ) for Google dork [ site:******.com intitle:"Ganglia" "Cluster Report for" ] and waiting 120.85752787276157 seconds between searches using User-Agent 'Mozilla/5.0 (iPad; U; CPU OS 3_2_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B500 Safari/53'
[proxychains] Round Robin chain  ...  127.0.0.1:9050  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  127.0.0.1:9060  ...  www.google.com:443  ...  OK
[proxychains] Round Robin chain  ...  127.0.0.1:9062  ...  www.google.com:443  ...  OK
[-] Error with dork: intitle:"Ganglia" "Cluster Report for"
[-] EXCEPTION: HTTP Error 429: Too Many Requests
[*] Google is blocking you, looks like you need to spread out the Google searches.  Don't know how to utilize SSH and dynamic socks proxies?  Do yourself a favor and pick up a copy of The Cyber Plumber's Handbook and interactive lab (https://gumroad.com/l/cph_book_and_lab) to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.
[*] Search ( 2 / 938 ) for Google dork [ site:*****.com allinurl:/examples/jsp/snp/snoop.jsp ] and waiting 122.82545531944587 seconds between searches using User-Agent 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.8) Gecko/20100723 SUSE/3.6.8-0.1.1 Firefox/3.6.8'
[proxychains] Round Robin chain  ...  127.0.0.1:9064  ...  www.google.com:443  ...  OK

Could you please tell me how I can bypass these errors?
Best regards

Default Delay causes Service Unavailable

Actually, this is probably not an issue with the code itself. However, the default delay value (-e 35.0) causes an exception, HTTP Error 503: Service Unavailable, as early as the 4th dork. I'm still trying to figure out a value that works for me and does not take too long. I'm now trying -e 120.0. With this value, checking all currently available dorks (about 4500) for one domain will take about 7 days. I'll give an update next week on whether this delay value worked :-). It would be nice to know what values other users use. @opsdisk already mentioned in issue #10 that the default values worked for him in the past.
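
The runtime estimate can be checked with a quick calculation. This is a rough sketch that assumes a fixed delay per dork and ignores both the time each search takes and any jitter added on top of the delay, so it gives a lower bound. `estimated_days` is a hypothetical helper, not part of pagodo.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_days(num_dorks, delay_seconds):
    """Lower bound on total runtime in days for a fixed delay per dork."""
    return num_dorks * delay_seconds / SECONDS_PER_DAY


days_at_120 = estimated_days(4500, 120.0)  # 540000 s, i.e. 6.25 days
days_at_35 = estimated_days(4500, 35.0)    # 157500 s, a little under 2 days
```

With jitter and request time added on top, the 6.25-day lower bound is consistent with the "about 7 days" figure above.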

error code 429

Why is it happening?
HTTP Error code 429?

This happens often and no file is created.

option with --proxy

I have a proxy server that can rotate the IP address on every request. Can I use the --proxy 192.168.1.1:8080 option?

Some parameters don't work

Hi,

While the script is running, it does not use some parameter values (-i, -x, -m) that I gave at the beginning. Instead, it uses some default values in the code. I tried to manually change the code myself but it kept working with default settings every time.

Regards..

GHDB scraper produces inaccurate output

I found that the output from the GHDB scraper is not accurate. For example, the title is given as "Google Dork", but when the entry is viewed, the actual dork is site:".edu" intitle:"index of"|".db". The tool saves "Google Dork" as the output instead of site:".edu" intitle:"index of"|".db".


Wrong number of contents in ghdb_scraper.py

Hi, there is an error in ghdb_scraper.py, in lines 34 and 35. I solved it by replacing the current lines with these:

column = table.find_all('td')[2]
dork = column.find_all('a')[0].contents[0]

With this change, it correctly retrieves the Google dork from the table that is currently shown at https://exploit-db.com/ghdb//

Regards!

Received HTTP Error 429: Too Many Requests

Hi,

I received this HTTP error message after 10 searches: HTTP Error 429: Too Many Requests. I used a high delay (-e 30.0). I tried changing my IP, but no joy!
Any ideas how I can solve this?
Thank you,
Jay

traceback problem

Hi, when I try to execute the command, this problem pops up:

Traceback (most recent call last):
  File "pagodo.py", line 15, in <module>
    import yagooglesearch
ModuleNotFoundError: No module named 'yagooglesearch'

please help,
thanks :>

Add color to terminal results

Implement colorized output for the results.

At least for this line, to be exact:
[*] Results: 0 sites found for Google dork: ...

pip install google : ImportError: cannot import name IncompleteRead

Any idea on this?

root@DS:~# pip install google
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    load_entry_point('pip==1.5.6', 'console_scripts', 'pip')()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 356, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2476, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2190, in load
    ['__name__'])
  File "/usr/lib/python2.7/dist-packages/pip/__init__.py", line 74, in <module>
    from pip.vcs import git, mercurial, subversion, bazaar # noqa
  File "/usr/lib/python2.7/dist-packages/pip/vcs/mercurial.py", line 9, in <module>
    from pip.download import path_to_url
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 25, in <module>
    from requests.compat import IncompleteRead
ImportError: cannot import name IncompleteRead

ModuleNotFoundError: No module named 'yagooglesearch'

When I try to run it with some dork commands, I get this:
Traceback (most recent call last):
  File "/home/ethical****/Documents/pagodo/pagodo.py", line 15, in <module>
    import yagooglesearch
ModuleNotFoundError: No module named 'yagooglesearch'

Failed to resolve 'myproxy'

Hi all,
on mac m1,
Upon running this command:

python pagodo.py -d example.com -g dorks/all_google_dorks.txt -p http://myproxy:8080,socks5h://127.0.0.1:9050,socks5h://127.0.0.1:9051 -s pagodo-results.txt

this is the error:
2023-10-13 07:41:12,032 [MainThread ] [INFO] Initiation timestamp: 2023-10-13T07:41:12.032601
2023-10-13 07:41:12,032 [MainThread ] [INFO] Search ( 1 / 7752 ) for Google dork [ site:ovo.id intitle:"Ganglia" "Cluster Report for" ] using User-Agent 'Mozilla/5.0 (X11; U; Linux x86_64; fr; rv:1.9.1.6) Gecko/20091215 Ubuntu/9.10 (karmic) Firefox/3.5.6' through proxy 'http://myproxy:8080'
2023-10-13 07:41:12,032 [MainThread ] [INFO] Requesting URL: https://www.google.com/
2023-10-13 07:41:12,041 [MainThread ] [ERROR] Error with dork: intitle:"Ganglia" "Cluster Report for". Exception HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: / (Caused by ProxyError('Unable to connect to proxy', NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x1032380d0>: Failed to resolve 'myproxy' ([Errno 8] nodename nor servname provided, or not known)")))
2023-10-13 07:41:12,041 [MainThread ] [INFO] Sleeping 49.8 seconds before executing the next dork search...

I've made sure that requests[socks] is installed:
pip install 'requests[socks]'
Requirement already satisfied: requests[socks] in ./.venv/lib/python3.11/site-packages (2.31.0)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.venv/lib/python3.11/site-packages (from requests[socks]) (3.3.0)
Requirement already satisfied: idna<4,>=2.5 in ./.venv/lib/python3.11/site-packages (from requests[socks]) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.11/site-packages (from requests[socks]) (2.0.6)
Requirement already satisfied: certifi>=2017.4.17 in ./.venv/lib/python3.11/site-packages (from requests[socks]) (2023.7.22)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in ./.venv/lib/python3.11/site-packages (from requests[socks]) (1.7.1)

https://stackoverflow.com/questions/69152016/cant-send-requests-through-socks5-proxy-with-python

please help.
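
The log shows that the first proxy host, `myproxy`, could not be resolved. If that name is a placeholder rather than a real hostname on the network, the request will fail before it ever reaches the SOCKS proxies. Below is a minimal sketch for checking which proxy hosts in a comma-separated list resolve. `proxy_hosts`, `resolves`, and `unresolved_hosts` are hypothetical helpers, not part of pagodo.

```python
import socket
from urllib.parse import urlsplit


def proxy_hosts(proxy_arg):
    """Return the hostname of each proxy URL in a comma-separated list."""
    return [
        urlsplit(proxy.strip()).hostname
        for proxy in proxy_arg.split(",")
        if proxy.strip()
    ]


def resolves(hostname):
    """Return True if the local resolver can look up the hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def unresolved_hosts(proxy_arg):
    """Return the proxy hostnames that cannot be resolved from this machine."""
    return [host for host in proxy_hosts(proxy_arg) if not resolves(host)]


# The proxy list from the command above.
proxies = "http://myproxy:8080,socks5h://127.0.0.1:9050,socks5h://127.0.0.1:9051"
hosts = proxy_hosts(proxies)
```

Calling `unresolved_hosts(proxies)` on the affected machine lists the entries to remove or replace before passing the list to -p.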

Random error with dork number...

Total Google dorks retrieved: 3228

Others failed with "Random error with dork number 3604", etc.

What is the best thing to do to get the remaining dorks?

Rerun ghdb_scraper.py?

Getting HTTP code 503 (google detects) after 3rd dork

Hi!

I was wondering if you could give me some tips on how to avoid being detected by Google with pagodo. After only the 3rd dork in a list of 373 dorks, I was detected when using the following command:

python3 pagodo.py -g ALL_dorks.txt -s -e 35.0 -l 700 -j 1.1

[*] Search ( 3 / 373 ) for Google dork [ inurl:/index.jsf filetype:jsf ] and waiting 70.1710817159225 seconds between searches using User-Agent 'Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)'
[-] Error with dork: inurl:/index.jsf filetype:jsf
[-] EXCEPTION: HTTP Error 503: Service Unavailable

Thanks!

details

Where are the details that were found saved?

Tool is not working

[-] Error with dork: intitle:"Ganglia" "Cluster Report for"
[-] EXCEPTION: module 'googlesearch' has no attribute 'search'
Traceback (most recent call last):
File "/root/bugbounty/pagodo/pagodo.py", line 103, in go
for url in googlesearch.search(
AttributeError: module 'googlesearch' has no attribute 'search'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/bugbounty/pagodo/pagodo.py", line 227, in <module>
pgd.go()
File "/root/bugbounty/pagodo/pagodo.py", line 145, in go
if e.code == 429:
AttributeError: 'AttributeError' object has no attribute 'code'
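
The second traceback shows the error handler reading `e.code` on an exception that is not an HTTP error, which hides the original problem behind a new AttributeError. The following sketch shows one way to handle both kinds of exception. `http_status` and `describe` are hypothetical names, and this is not the handler pagodo actually uses.

```python
import urllib.error


def http_status(exc):
    """Return the HTTP status code carried by exc, or None if it has none."""
    return getattr(exc, "code", None)


def describe(exc):
    """Build a log message without assuming the exception is an HTTP error."""
    status = http_status(exc)
    if status == 429:
        return "rate limited (HTTP 429)"
    if status is not None:
        return "HTTP error {}".format(status)
    return "non-HTTP error: {!r}".format(exc)


# An HTTP error carries a status code; other exceptions do not.
rate_limited = urllib.error.HTTPError(
    "https://www.google.com/", 429, "Too Many Requests", None, None
)
missing_attr = AttributeError("module 'googlesearch' has no attribute 'search'")

message_429 = describe(rate_limited)
message_other = describe(missing_attr)
```

Using `getattr` with a default keeps the original exception visible in the log, so the missing `search` attribute is reported directly.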

ghdb_scraper.py no longer retrieves dork

The GHDB scraper no longer works - presumably this is because the exploit-db website has been updated.

Here's the output I am getting:

[*] Initiation timestamp: 20181205_104042
[*] Spawing thread #0
[*] Spawing thread #1
[*] Spawing thread #2
[+] Retrieving dork 6: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 7: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 9: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 10: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 5: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 8: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 12: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 13: Penetration Testing with Kali Linux (PWK)
[+] Retrieving dork 15: Penetration Testing with Kali Linux (PWK)
