Comments (6)
Hi @JoshuaMart - thanks for taking the time to submit an issue. I got the same result with the yahoo.com domain and the same dork file when using pagodo. The same query does, however, return results in the Google web interface. It looks like the problem may be in the googlesearch logic (https://github.com/opsdisk/pagodo/blob/master/pagodo.py#L99)
Using ipython to debug, this returns URLs:
In [31]: for url in googlesearch.search(
    ...:     "hello world",
    ...:     start=0,
    ...:     stop=search_max,
    ...:     num=100,
    ...:     pause=5.0,
    ...:     extra_params={"filter": "0"},
    ...:     user_agent=user_agent,
    ...:     tbs="li:1",  # Verbatim mode. Doesn't return suggested results with other domains.
    ...: ):
    ...:     print(url)
but this doesn't:
In [31]: for url in googlesearch.search(
    ...:     "site:yahoo.com filetype:pdf",
    ...:     start=0,
    ...:     stop=search_max,
    ...:     num=100,
    ...:     pause=5.0,
    ...:     extra_params={"filter": "0"},
    ...:     user_agent=user_agent,
    ...:     tbs="li:1",  # Verbatim mode. Doesn't return suggested results with other domains.
    ...: ):
    ...:     print(url)
Google might have added some defenses against pagodo... I'll have to dig into it deeper.
from pagodo.
I hope you can find a solution. :)
Metagoofil may help with what you're trying to do: https://github.com/opsdisk/metagoofil
Just tried it for yahoo.com and PDFs and it returned results.
I only used a "filetype:pdf" dork by chance; it's not specifically what I'm looking to use. But thanks, I tested MetaGooFil and it's really nice!
tl;dr - install the google library from GitHub until 2.0.2 is pushed to PyPI. Here's the ticket I submitted: MarioVilas/googlesearch#68
git clone https://github.com/MarioVilas/googlesearch.git
cd googlesearch
python setup.py install
pip install google is not installing the latest version.
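To see which version pip actually installed, something like the following may help. It is only a sketch: it assumes the distribution is registered under the name "google" (the name used in the pip command above) and that its version string is purely numeric, like 2.0.1. The helper names installed_version and is_older_than are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def is_older_than(installed, minimum):
    """Compare dotted numeric versions such as '2.0.1' and '2.0.2'.

    Assumes purely numeric components; pre-release tags are not handled.
    """
    def as_tuple(v):
        return tuple(int(part) for part in v.split("."))

    return as_tuple(installed) < as_tuple(minimum)


# "google" is assumed to be the distribution name installed by `pip install google`.
current = installed_version("google")
if current is None:
    print("google is not installed")
elif is_older_than(current, "2.0.2"):
    print(f"google {current} is installed; the fix is in 2.0.2")
else:
    print(f"google {current} already includes the fix")
```

If this reports a version below 2.0.2, the GitHub install above is still needed.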
Looks like there's a bug in version 2.0.1 of the google library that is fixed in 2.0.2. It appends an encoded "+" (%2B) to the search query because domain_query is an empty string when no domains are given:
if domains:
    domain_query = '+OR+'.join('site:' + domain for domain in domains)
else:
    domain_query = ''
# Prepare the search string.
query = quote_plus(query + '+' + domain_query)
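The effect of that last line can be reproduced with the standard library alone. The sketch below wraps the snippet in a function named build_query_buggy and adds build_query_fixed as one possible way to avoid the trailing "+"; both names are made up for illustration, and the fixed variant is an assumption, not necessarily what 2.0.2 does.

```python
from urllib.parse import quote_plus


def build_query_buggy(query, domains=None):
    """Mirror the 2.0.1 snippet: a literal '+' is always appended before encoding."""
    if domains:
        domain_query = '+OR+'.join('site:' + domain for domain in domains)
    else:
        domain_query = ''
    # With no domains, this encodes query + '+', and quote_plus turns '+' into %2B.
    return quote_plus(query + '+' + domain_query)


def build_query_fixed(query, domains=None):
    """Hypothetical fix: only add the domain clause when domains are given."""
    if domains:
        domain_query = ' OR '.join('site:' + domain for domain in domains)
        query = query + ' ' + domain_query
    return quote_plus(query)


print(build_query_buggy('site:yahoo.com filetype:pdf'))
print(build_query_fixed('site:yahoo.com filetype:pdf'))
```

The first line printed ends in %2B, which is the stray encoded "+" described above; the second does not.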
After modifying the library to print out the query and strip the extra encoded "+" (%2B), running this script returns the expected results:
import googlesearch

for url in googlesearch.search(
    'filetype:pdf',
    start=0,
    stop=10,
    num=10,
    pause=0,
    extra_params={"filter": "0"},
    user_agent='Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.7) Gecko/20100723 Fedora/3.6.7-1.fc13 Firefox/3.6.7',
    tbs="li:1",  # Verbatim mode. Doesn't return suggested results with other domains.
):
    print(url)
It works!
Thanks :)