laramies / metagoofil
Metadata harvester
License: GNU General Public License v2.0
What steps will reproduce the problem?
1. Execute metagoofil with any target
2. Open the generated HTML file
3. Inspect the code (Firefox even highlights the errors)
What is the expected output? What do you see instead?
The HTML output can be opened with a browser without any visible errors or
warnings, but if the HTML code is parsed, you realise that it is not
well-formed. For example, it starts directly with a <title> tag without an
enclosing root element such as <html>, there are two </head> closing tags one
after the other, and several extra </pre> tags.
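A strict XML parser is one quick way to confirm the malformation described above: well-formed markup parses, while a fragment with doubled closing tags fails. The sample strings below are modeled on the report, not copied from the actual generated file.

```python
# Check well-formedness by attempting a strict parse. HTML is more lenient
# than XML, so this is a demonstration of the principle, not a full validator.
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Modeled on the report: no enclosing root, doubled </head>, stray </pre>.
broken = '<title>results</title></head></head><pre>x</pre></pre>'
fixed = '<html><head><title>results</title></head><body><pre>x</pre></body></html>'
```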
What version of the product are you using? On what operating system?
I've checked out a read-only copy with svn, working on Ubuntu, but I found
the same problem with the version shipped with Kali Linux.
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 24 Apr 2014 at 11:32
What steps will reproduce the problem?
1. Download metagoofil, untar it, chmod +x, and run the script on Xubuntu 12
What is the expected output? What do you see instead?
from: can't read /var/mail/discovery
from: can't read /var/mail/extractors
./metagoofil.py: line 3: import: command not found
./metagoofil.py: line 4: import: command not found
./metagoofil.py: line 5: import: command not found
./metagoofil.py: line 6: import: command not found
./metagoofil.py: line 7: import: command not found
./metagoofil.py: line 8: import: command not found
./metagoofil.py: line 9: import: command not found
./metagoofil.py: line 10: import: command not found
./metagoofil.py: line 12: syntax error near unexpected token `"ignore"'
./metagoofil.py: line 12: `warnings.filterwarnings("ignore") # To prevent
errors from hachoir deprecated functions, need to fix.'
What version of the product are you using? On what operating system?
Xubuntu 12.04, 3.2.0-36-generic kernel
metagoofil-2.1 from the BH2011_Arsenal.tar
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 25 Jan 2013 at 8:48
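The "from: can't read /var/mail/..." and "import: command not found" lines mean the script was executed by sh rather than Python: sh resolves `from` to the mail(1) utility and has no `import` builtin. A throwaway demonstration (python3 is used here purely for the demo; metagoofil itself targets Python 2):

```shell
# A Python file with no shebang, executed by sh, produces exactly these errors.
cat > /tmp/demo.py <<'EOF'
from os import path
import sys
print("ok")
EOF
sh /tmp/demo.py 2>&1 || true        # reproduces the 'from:' / 'import:' errors
python3 /tmp/demo.py                # fix 1: invoke the interpreter explicitly
# Fix 2: add a shebang so the kernel picks the interpreter after chmod +x.
{ echo '#!/usr/bin/env python3'; cat /tmp/demo.py; } > /tmp/demo2.py
chmod +x /tmp/demo2.py
/tmp/demo2.py
```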
What steps will reproduce the problem?
1. Ran metagoofil.py
2. Got a module error
What is the expected output? What do you see instead?
Error on the module
What version of the product are you using? On what operating system?
2.2, Ubuntu 12.04
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 4 Apr 2013 at 11:17
What steps will reproduce the problem?
1. Default run of metagoofil, taken from the help screen
What is the expected output? What do you see instead?
expected: lots of results from apple.com
instead: 0 results, due to HTTP 302 from google
What version of the product are you using? On what operating system?
2.2
Please provide any additional information below.
I solved it by inserting the following code at line 28 of discovery/googlesearch.py:
if returncode == 302:
    h = httplib.HTTP(self.server)
    h.putrequest('GET', headers.getheader('Location')[20:])
    h.putheader('Host', self.hostname)
    h.putheader('User-agent', self.userAgent)
    h.endheaders()
    returncode, returnmsg, headers = h.getreply()
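A Python 3 restatement of the redirect handling above, reduced to a pure helper so it can be exercised without the network. The function name and the use of urllib.parse are my additions, not part of the reporter's patch; parsing the Location header avoids the fixed `[20:]` slice, which assumes the URL prefix is exactly 20 characters.

```python
from urllib.parse import urlparse

def redirect_target(status, headers):
    """On a 302, return the path (plus query) to re-request; otherwise None."""
    if status != 302:
        return None
    parsed = urlparse(headers.get('Location', ''))
    path = parsed.path or '/'
    if parsed.query:
        path += '?' + parsed.query
    return path
```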
Best regards, and many thanks to the author for this great piece of software!
Original issue reported on code.google.com by [email protected]
on 15 Apr 2015 at 10:18
Attachments:
What steps will reproduce the problem?
1. Run metagoofil.py as expected
2. Observe that the number of documents found is always 5 more than what was specified
What is the expected output? What do you see instead?
The number of matches reported is always 5 more than requested. This is due
to cruft at the bottom of the Google results page being matched by the
compiled regular expression for "<a href=".
What version of the product are you using? On what operating system?
metagoofil-read-only from SVN dated May 16, 2011 (revision 2)
Please provide any additional information below.
The erroneous matches show up in metagoofil.py's output, produced when
googlesearch.py invokes parser.py's fileurls call. Note the output below,
where "Searching 100 results..." is followed by "Results: 105 files found".
-------
[-] Searching for doc files, with a limit of 10
Searching 100 results...
Results: 105 files found
Starting to download 10 of them..
From inspection, the 5 extra matches are:
'/'
'/intl/en/ads/'
'/services/'
'/intl/en/privacy.html'
'/intl/en/about.html'
and are being matched from URLs in the footer of the google search page.
Patch attached to address the additional (spurious) pattern matches.
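The over-matching can be sketched as follows. The sample HTML is fabricated for illustration; one simple mitigation (not necessarily the approach of the attached patch) is to keep only absolute URLs, which drops Google's relative footer links.

```python
import re

# The naive pattern from parser.py matches every anchor, including Google's
# own footer links ('/', '/intl/en/ads/', '/services/', ...).
html = ('<a href="http://www.example.edu/a.doc">hit</a>'
        '<a href="/intl/en/ads/">Ads</a><a href="/services/">More</a>')
naive = re.findall(r'<a href="(.*?)"', html)
# Keeping only absolute URLs filters out the relative footer cruft.
filtered = [u for u in naive if u.startswith(('http://', 'https://'))]
```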
Original issue reported on code.google.com by [email protected]
on 16 Jun 2011 at 3:00
Attachments:
Hello
I just downloaded metagoofil and cloned this project's URL. Everything went fine, but when I tried to run metagoofil it gave me this error:
File "/usr/share/metagoofil/metagoofil.py", line 14
print "\n******************************************************"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("\n******************************************************")?
I don't know why I'm getting an error and can't find anyone else who has had this problem... What should I do about this?
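The traceback means a Python 2 script is being run by a Python 3 interpreter: `print` became a function in Python 3, so the old statement form is now a SyntaxError. A quick way to tell the two syntaxes apart is to try compiling the source:

```python
def compiles_under_this_python(src):
    # compile() raises SyntaxError for syntax the running interpreter rejects.
    try:
        compile(src, '<snippet>', 'exec')
        return True
    except SyntaxError:
        return False

py2_style = 'print "banner"'    # Python 2 print statement
py3_style = 'print("banner")'   # Python 3 print function
```

The practical fix is to run the tool with a Python 2 interpreter (e.g. `python2 metagoofil.py`) or use a fork that has been ported to Python 3.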
What steps will reproduce the problem?
1. Downloaded metagoofil
2. Untarred it, chmod +x, and ran the script on an updated/upgraded BackTrack 5 R3
What is the expected output? What do you see instead?
from: can't read /var/mail/extractors
Please provide any additional information below.
Updated/upgraded BackTrack 5 R3.
Original issue reported on code.google.com by [email protected]
on 12 Feb 2013 at 6:01
What steps will reproduce the problem?
1. Run the program as-is; there is no support for proxies
What is the expected output? What do you see instead?
n/a
What version of the product are you using? On what operating system?
svn read-only revision 2
Please provide any additional information below.
The patch below adds download proxy support. It's neither vetted nor robust,
but it illustrates the approach we're taking to add proxy support, with one
proxy for downloading files and a separate proxy for queries.
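The attached patch is not reproduced here; this is only a sketch of the idea using Python 3's urllib.request, with separate openers for downloads and search queries. The proxy URLs below are placeholders, not values from the patch.

```python
import urllib.request

def make_opener(proxy_url=None):
    # An empty mapping means "no proxy"; otherwise route http/https through it.
    handler = urllib.request.ProxyHandler(
        {'http': proxy_url, 'https': proxy_url} if proxy_url else {})
    return urllib.request.build_opener(handler)

# Separate openers allow a download proxy distinct from the query proxy.
download_opener = make_opener('http://127.0.0.1:8080')
query_opener = make_opener('http://127.0.0.1:3128')
```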
Original issue reported on code.google.com by [email protected]
on 21 Jun 2011 at 8:38
Attachments:
What steps will reproduce the problem?
1. Download metagoofil-read-only from svn
2. Run the code without defining values for "-l" or "-n"
What is the expected output? What do you see instead?
The expected output is for the default values to be used as defined in
metagoofil.py:
limit=100
filelimit=50
What version of the product are you using? On what operating system?
svn release metagoofil-read-only dated May 16, 2011 (revision 2)
Please provide any additional information below.
The shell output below shows what happens when these values are not provided
on the command line (the code exits at line 126 of metagoofil.py upon
attempting to convert filelimit into a string).
[** This shows a fresh svn checkout, followed by an attempt to run metagoofil
without filelimit and without limit specified on the command line. Result:
Failure to download **]
Checked out revision 2.
gray@fireball:~/metagoofil$ cd metagoofil-read-only/
gray@fireball:~/metagoofil/metagoofil-read-only$ python ./metagoofil.py -d
iastate.edu -f iastate-download.html -o iastate -t doc
*************************************
* Metagoofil Ver 2.0 - Reborn *
* Christian Martorella *
* Edge-Security.com *
* cmartorella_at_edge-security.com *
* BACKTRACK 5 Edition!! *
*************************************
['doc']
[-] Starting online search...
[** Program dies at this point -PG **]
[** This shows a fresh svn checkout, followed by an attempt to run metagoofil
without filelimit specified on the command line. Result: Failure to download
**]
gray@fireball:~/metagoofil/metagoofil-read-only$ python ./metagoofil.py -d
iastate.edu -f iastate-download.html -o iastate -t doc -l 10
[** This shows a fresh svn checkout, followed by an attempt to run metagoofil
with filelimit and with limit specified on the command line. Result: Success
**]
gray@fireball:~/metagoofil/metagoofil-read-only$ python ./metagoofil.py -d
iastate.edu -f iastate-download.html -o iastate -t doc -l 10 -n 10
*************************************
* Metagoofil Ver 2.0 - Reborn *
* Christian Martorella *
* Edge-Security.com *
* cmartorella_at_edge-security.com *
* BACKTRACK 5 Edition!! *
*************************************
['doc']
[-] Starting online search...
[-] Searching for doc files, with a limit of 10
Searching 100 results...
Results: 105 files found
Starting to download 10 of them..
----------------------------------------------------
[0/10] http://www.registrar.iastate.edu/calendar/Deptcal1011.doc
[1/10] http://www.mcdb.iastate.edu/MCDBhandbook20102011.doc
[2/10] http://www.registrar.iastate.edu/courses/changesupdates2011.doc
[3/10] http://www.registrar.iastate.edu/courses/changesupdatef2011.doc
[4/10] http://www.philrs.iastate.edu/Rottlerapp.doc
[5/10] http://www.immunobiology.iastate.edu/IMBIOHandbook20102011.doc
[6/10] http://www.stuorg.iastate.edu/isss/SEDSGuidelines.doc
[7/10] http://www.registrar.iastate.edu/veterans/vabenefits.doc
[8/10] http://lasonline.iastate.edu/pdf_files/Online_CDG.doc
[9/10] http://www.bus.iastate.edu/jmcelroy/McElroy.doc
[10/10] http://www.hrs.iastate.edu/ISUOnLine/VeteransPref.doc
[+] List of users found:
--------------------
mkmcdow
cchulse
Kathryn B. Andre
Katie
adptemp
bjhotch
Janet Krengel
Aragorn
AerE Department
Iowa State University
Molly Helmers
College of Business
McElroy, James C
marleneb
cball
[+] List of software found:
-----------------------
Microsoft Office Word
Microsoft Word 10.0
Microsoft Macintosh Word
[+] List of paths and servers found:
--------------------------------
Normal.dotm
Normal
Normal.dot
Attached patch addresses the issue.
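The gist of the fix can be sketched with argparse defaults, so that omitting -l and -n falls back to the documented values. This is an illustration only; the actual attached patch (and the original getopt-style parsing) may differ.

```python
import argparse

parser = argparse.ArgumentParser()
# Defaults mirror the values documented in metagoofil.py.
parser.add_argument('-l', dest='limit', type=int, default=100)
parser.add_argument('-n', dest='filelimit', type=int, default=50)

# Simulate a command line without -l / -n: the defaults apply.
args = parser.parse_args([])
```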
Original issue reported on code.google.com by [email protected]
on 16 Jun 2011 at 2:46
Attachments:
1. I first had to grant metagoofil.py execute permissions (might not be an
issue).
2. Tried to run the metagoofil.py script, and it reported two errors.
Therefore, I had to edit metagoofil.py and add the #!/usr/bin/python line.
The two errors I received before adding that first line are shown below:
a. from: can't read /var/mail/discovery
b. from: can't read /var/mail/extractors
3. After doing the above steps, I tried to run the script again, and it gave
me the error "ImportError: No module named myparser". I realized that, while
theHarvester has myparser.py in its directory, metagoofil does not; it has a
file called parser.py instead.
Original issue reported on code.google.com by [email protected]
on 20 Feb 2013 at 7:37
What steps will reproduce the problem?
python metagoofil.py -d liquid11.co.uk -t pdf,doc,docx,xls,xlsx,ppt,pptx -l 200
-n 50 -o liquid11files -f metagoofil-liquid11-results.html
What is the expected output? What do you see instead?
[1/50] /webhp?hl=en
[x] Error downloading /webhp?hl=en
[2/50] /intl/en/ads
[x] Error downloading /intl/en/ads
[3/50] /services
[x] Error downloading /services
[4/50] /intl/en/policies/
processing
tuple index out of range
Error creating the file
What version of the product are you using? On what operating system?
2.2 - Ubuntu 12.04.1 LTS
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 19 Mar 2013 at 4:33
What steps will reproduce the problem?
1. python metagoofil.py
What is the expected output? What do you see instead?
~/Downloads/metagoofil-2.2 ⮀ python metagoofil.py
Traceback (most recent call last):
File "metagoofil.py", line 1, in <module>
from discovery import googlesearch
File "/home/lyy/Downloads/metagoofil-2.2/discovery/googlesearch.py", line 3, in <module>
import myparser
ImportError: No module named myparser
What version of the product are you using? On what operating system?
2.2, downloaded from here. Linux amd64.
Please provide any additional information below.
The Black Hat version provided may work, but many modifications are needed to
make it run, especially when Chinese characters are involved.
Original issue reported on code.google.com by [email protected]
on 8 Jul 2013 at 1:03
What steps will reproduce the problem?
1. metagoofil -d microsoft.com -t doc,pdf -l 200 -n 50 -o msfiles -f
results.html
2. metagoofil -d docs.kali.org -t doc,pdf -l 200 -n 50 -o kalifiles -f
results.html
3. metagoofil -d apple.com -t doc,pdf -l 200 -n 50 -o applefiles -f results.html
What is the expected output? What do you see instead?
All searches return the same thing.
[-] Starting online search...
[-] Searching for doc files, with a limit of 200
Searching 100 results...
Searching 200 results...
Results: 0 files found
Starting to download 50 of them:
----------------------------------------
[-] Searching for pdf files, with a limit of 200
Searching 100 results...
Searching 200 results...
Results: 0 files found
Starting to download 50 of them:
----------------------------------------
processing
user
email
[+] List of users found:
--------------------------
[+] List of software found:
-----------------------------
[+] List of paths and servers found:
---------------------------------------
[+] List of e-mails found:
----------------------------
What version of the product are you using? On what operating system?
metagoofil 2.2 on Kali Linux. Same result with a fresh svn checkout.
Please provide any additional information below.
I suspect Google has changed things on you again.
Original issue reported on code.google.com by [email protected]
on 24 Mar 2015 at 3:02
What steps will reproduce the problem?
1. CLI execution
root@bt:/pentest/enumeration/google/metagoofil# ./metagoofil.py -d DOMAINNAM -t
doc -l 200 -n 50 -o /root/Desktop/ -f results.html
2. CLI error
[1/50] /webhp?hl=en
Error downloading /webhp?hl=en
[2/50] /intl/en/ads
Error downloading /intl/en/ads
[3/50] /services
Error downloading /services
[4/50] /intl/en/policies/
tuple index out of range
Error creating the file
What is the expected output? What do you see instead?
Expected data to be gathered; instead I get errors.
What version of the product are you using? On what operating system?
2.1 BackTrack5R3 (ubuntu)
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 31 Jul 2013 at 9:59
What steps will reproduce the problem?
1. I ran this script in China:
> metagoofil.py -d swu.edu.cn -t doc -l 20 -n 20 -o test -f test.html
output:
******************************************************
* /\/\ ___| |_ __ _ __ _ ___ ___ / _(_) | *
* / \ / _ \ __/ _` |/ _` |/ _ \ / _ \| |_| | | *
* / /\/\ \ __/ || (_| | (_| | (_) | (_) | _| | | *
* \/ \/\___|\__\__,_|\__, |\___/ \___/|_| |_|_| *
* |___/ *
* Metagoofil Ver 2.2 *
* Christian Martorella *
* Edge-Security.com *
* cmartorella_at_edge-security.com *
******************************************************
['doc']
[-] Starting online search...
[-] Searching for doc files, with a limit of 20
Searching 100 results...
Results: 0 files found
Starting to download 20 of them:
----------------------------------------
processing
user
email
[+] List of users found:
--------------------------
[+] List of software found:
-----------------------------
[+] List of paths and servers found:
---------------------------------------
[+] List of e-mails found:
----------------------------
2. I tried to modify the file discovery/googlesearch.py, changing:
self.server="www.google.com"
self.hostname="www.google.com"
to:
self.server="www.google.com.hk"
self.hostname="www.google.com.hk"
Re-ran step 1; output:
....
['doc']
[-] Starting online search...
[-] Searching for doc files, with a limit of 20
_
This time the output stops at this point and does not continue (sorry for my
poor English!). I debugged the code and found that execution blocks here:
discovery/googlesearch.py:27 self.results = h.getfile().read()
It looks like Google returns too many results.
3. So I reduced the page-size parameter:
discovery/googlesearch.py:16
    self.quantity="100"  ===>  self.quantity="10"
discovery/googlesearch.py:46
    self.counter+=100  ===>  self.counter+=10
and I also modified this point:
discovery/googlesearch.py:27
    self.results = h.getfile().read()
    h.close()  # Adding this line seems to help
Re-ran step 1; sometimes it works, sometimes it behaves the same as before.
What is the expected output? What do you see instead?
it does not work very well
What version of the product are you using? On what operating system?
metagoofil 2.2 windows7 python2.6
Please provide any additional information below.
If the script runs successfully, the resulting file path list contains some
errors, e.g.:
[1/20] /webhp?hl=en-HK
[x] Error downloading /webhp?hl=en-HK
[12/20] /support/websearch/bin/answer.py?answer=134479
[x] Error downloading /support/websearch/bin/answer.py?answer=134479
....
My solution, at myparser.py:43:
#reg_urls = re.compile('<a href="(.*?)"')
reg_urls = re.compile('<a href="[^">]*?/url\?q=([^">]*?)&sa=U.*?"')
The result looks fine; I don't know another way, and I don't want to change
it again.
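The replacement pattern can be checked against a fabricated Google result anchor. The HTML below is illustrative, not captured from a live results page:

```python
import re

# The /url?q= pattern from myparser.py:43, which captures only the real
# destination URL and ignores Google's own navigation links.
reg_urls = re.compile(r'<a href="[^">]*?/url\?q=([^">]*?)&sa=U.*?"')
html = '<a href="/url?q=http://swu.edu.cn/file.doc&sa=U&ei=abc">result</a>'
urls = reg_urls.findall(html)
```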
Original issue reported on code.google.com by [email protected]
on 23 Sep 2013 at 2:45