sherlock-project / sherlock
Hunt down social media accounts by username across social networks
Home Page: https://sherlockproject.xyz
License: MIT License
The readme only shows the example python3 sherlock.py, but should show what the other arguments are, especially the default username argument, such as: python3 sherlock.py user123.
While the text output file is good to have, it only shows the sites that do have an account with the username. It would be useful to also have a list of the sites that still had an opening.
I think that it would be useful to have a csv output as well. It could be optional, so the user would have to add a --csv switch before the file would be output. Then the output would be available in LibreOffice or Excel for users to sort and organize as they wish.
I am thinking of the following columns:
Running py sherlock.py obviousfakeusernamethatdoesntexist
gives me an obviousfakeusernamethatdoesntexist.txt with https://www.dailymotion.com/obviousfakeusernamethatdoesntexist and https://www.ebay.com/usr/obviousfakeusernamethatdoesntexist. Opening the pages shows no user. I'm from Brazil, and the messages on the sites are displayed in Portuguese. Maybe that's why it fails to recognize the error.
Running the program with a username that doesn't exist on any of the supported sites, you get 16 false positives:
In other words, the logic that determines whether a profile exists or not doesn't work correctly for those sites. Having a look at the current code structure, you might need to split the sites out into their own modules to get it to work satisfactorily – you'll need some more complicated logic for some of these sites (e.g. Tumblr, to pass the Oath GDPR gate).
In pull request #39, some questions that I had about @boardens content brought up something that has been bugging me: What should the scope of this tool be?
I looked on Wikipedia, and they have a list of social media sites. Many of the sites that are already included in Sherlock are not included in that list. I am not saying that that makes Sherlock wrong...it just makes me wonder when it will ever end.
There are approximately 5 bazillion Internet forums out there, and if all of them were added to Sherlock it would add a great deal of churn. It would be like trying to release a Python module that had a snapshot of a search engine index embedded inside of it. It just would not scale.
Beyond the number of sites, I also wonder about their focus. There are plenty of websites that have a community of people making comments or doing reviews. This ranges from an eCommerce site like Amazon to other sites that are less savory. Should they be included?
@LjMario007 was suggesting categories, which I think would be a good way for users to focus their queries. Yet, if the scope is going to be so wide, it seems like Sherlock would better be implemented as a web application. Something like https://www.namecheckr.com/, but with a wider scope.
What is the vision?
I installed required packages:
pip3 install -r requirements.txt --user
Requirement already satisfied: requests in /home/dentrax/.local/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (2.21.0)
Requirement already satisfied: requests_futures in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (0.9.9)
Requirement already satisfied: torrequest in /home/dentrax/.local/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (0.1.0)
Requirement already satisfied: colorama in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (0.4.1)
Requirement already satisfied: idna in /usr/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (2.8)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/lib/python3.7/site-packages (from requests->-r requirements.txt (line 1)) (1.24.1)
Requirement already satisfied: certifi>=2017.4.17 in /home/dentrax/.local/lib/python3.7/site-packages (from requests->-r requirements.txt (line 1)) (2018.11.29)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/lib/python3.7/site-packages (from requests->-r requirements.txt (line 1)) (3.0.4)
Requirement already satisfied: stem>=1.4.0 in /usr/lib/python3.7/site-packages (from torrequest->-r requirements.txt (line 3)) (1.7.1)
Requirement already satisfied: PySocks>=1.5.7 in /usr/lib/python3.7/site-packages (from torrequest->-r requirements.txt (line 3)) (1.6.8)
And also:
pip3 install idna --user
Requirement already satisfied: idna in /usr/lib/python3.7/site-packages (2.8)
Testing:
python3 sherlock.py test
Error:
Traceback (most recent call last):
File "sherlock.py", line 18, in <module>
import requests
File "/home/dentrax/.local/lib/python3.7/site-packages/requests/__init__.py", line 113, in <module>
from . import packages
File "/home/dentrax/.local/lib/python3.7/site-packages/requests/packages.py", line 7, in <module>
locals()[package] = __import__(package)
ModuleNotFoundError: No module named 'idna'
�[37;1m[�[92;1m+�[37;1m]�[92;1m Unsplash:�[0m https://unsplash.com/@user
�[1;92m[�[0m�[1;77m*�[0m�[1;92m] Saved: �[37;1mUser.txt�[0m
That's what all the lines look like in Windows PowerShell. I guess the output is designed for a Linux platform?
Does anyone want to create and maintain a PyPI package for this project? You will be credited in the README.md.
This will allow people to download Sherlock easily.
I really love this tool.
But imagine a world where we can search by email instead of the nick name.
It breaks when I search for a username that has a dot/period
I can't really manage this repo anymore. So if there is anyone who wants to get the ownership of this repo please let me know.
For username 'user123', here is output:
16:54 $ python sherlock.py user123
."""-.
/ \
____ _ _ _ | _..--'-.
/ ___|| |__ ___ _ __| | ___ ___| |__ >.`__.-""\;"`
\___ \| '_ \ / _ \ '__| |/ _ \ / __| |/ / / /( ^\
___) | | | | __/ | | | (_) | (__| < '-`) =|-.
|____/|_| |_|\___|_| |_|\___/ \___|_|\_\ /`--.'--' \ .-.
.'`-._ `.\ | J /
/ `--.| \__/
[*] Removing previous file: user123.txt
[*] Checking username user123 on:
[+] Instagram: https://www.instagram.com/user123
[+] Twitter: https://www.twitter.com/user123
[+] Facebook: https://www.facebook.com/user123
[+] YouTube: https://www.youtube.com/user123
[+] Blogger: https://user123.blogspot.com
[+] Google Plus: https://plus.google.com/+user123
[+] Reddit: https://www.reddit.com/user/user123
[+] Pinterest: https://www.pinterest.com/user123
[+] GitHub: https://www.github.com/user123
[+] Steam: https://steamcommunity.com/id/user123
[+] Vimeo: https://vimeo.com/user123
[+] SoundCloud: https://soundcloud.com/user123
[+] Disqus: https://disqus.com/user123
[+] Medium: https://medium.com/@user123
[+] DeviantART: https://user123.deviantart.com
[+] VK: https://vk.com/user123
[+] About.me: https://about.me/user123
[+] Imgur: https://imgur.com/user/user123
[+] Flipboard: https://flipboard.com/@user123
[+] SlideShare: https://slideshare.net/user123
[+] Fotolog: https://fotolog.com/user123
[+] Spotify: https://open.spotify.com/user/user123
[+] MixCloud: https://www.mixcloud.com/user123
[+] Scribd: https://www.scribd.com/user123
[+] Patreon: https://www.patreon.com/user123
[+] BitBucket: https://bitbucket.org/user123
[+] Roblox: https://www.roblox.com/user.aspx?username=user123
[+] Gravatar: http://en.gravatar.com/user123
[+] iMGSRC.RU: https://imgsrc.ru/main/user.php?user=user123
[+] DailyMotion: https://www.dailymotion.com/user123
[+] Etsy: https://www.etsy.com/shop/user123
[+] CashMe: https://cash.me/user123
[+] Behance: https://www.behance.net/user123
[+] GoodReads: https://www.goodreads.com/user123
[+] Instructables: https://www.instructables.com/member/user123
[+] Keybase: https://keybase.io/user123
[+] Kongregate: https://www.kongregate.com/accounts/user123
[+] LiveJournal: https://user123.livejournal.com
[+] VSCO: https://vsco.co/user123
[+] AngelList: https://angel.co/user123
[+] last.fm: https://last.fm/user/user123
[+] Dribbble: https://dribbble.com/user123
[+] Codecademy: https://www.codecademy.com/user123
[+] Pastebin: https://pastebin.com/u/user123
[+] Foursquare: https://foursquare.com/user123
[+] Gumroad: https://www.gumroad.com/user123
[+] Newgrounds: https://user123.newgrounds.com
[+] Wattpad: https://www.wattpad.com/user/user123
[+] Canva: https://www.canva.com/user123
[+] Trakt: https://www.trakt.tv/users/user123
[+] 500px: https://500px.com/user123
[+] BuzzFeed: https://buzzfeed.com/user123
[+] TripAdvisor: https://tripadvisor.com/members/user123
[+] Contently: https://user123.contently.com/
[+] Houzz: https://houzz.com/user/user123
[+] BLIP.fm: https://blip.fm/user123
[+] HackerNews: https://news.ycombinator.com/user?id=user123
[+] Codementor: https://www.codementor.io/user123
[+] ReverbNation: https://www.reverbnation.com/user123
[+] Designspiration: https://www.designspiration.net/user123
[+] Bandcamp: https://www.bandcamp.com/user123
[+] ColourLovers: https://www.colourlovers.com/love/user123
[+] IFTTT: https://www.ifttt.com/p/user123
[+] Ebay: https://www.ebay.com/usr/user123
[+] Slack: https://user123.slack.com
[+] Trip: https://www.trip.skyscanner.com/user/user123
[+] Ello: https://ello.co/user123
[+] HackerOne: https://hackerone.com/user123
[+] Tinder: https://www.gotinder.com/@user123
[+] We Heart It: https://weheartit.com/user123
[+] Flickr: https://www.flickr.com/people/user123
[+] WordPress: https://user123.wordpress.com
[+] Unsplash: https://unsplash.com/@user123
[+] Pexels: https://www.pexels.com/@user123
[+] devRant: https://devrant.com/users/user123
[+] MyAnimeList: https://myanimelist.net/profile/user123
[+] ImageShack: https://imageshack.us/user/user123
[+] Badoo: https://badoo.com/profile/user123
[+] MeetMe: https://www.meetme.com/user123
[+] Quora: https://www.quora.com/profile/user123
[+] Pixabay: https://pixabay.com/en/users/user123
[+] Giphy: https://giphy.com/user123
[+] Taringa: https://www.taringa.net/user123
[+] SourceForge: https://sourceforge.net/u/user123
[+] Codepen: https://codepen.io/user123
[+] Launchpad: https://launchpad.net/~user123
[+] Photobucket: https://photobucket.com/user/user123/library
[+] Wix: https://user123.wix.com
[+] Crevado: https://user123.crevado.com
[+] Carbonmade: https://user123.carbonmade.com
[+] Coroflot: https://www.coroflot.com/user123
[+] Jimdo: https://user123.jimdosite.com
[+] Repl.it: https://repl.it/@user123
[+] Issuu: https://issuu.com/user123
[*] Saved: user123.txt
It shows that user123 is on every site, which is not the case.
For my username, appi147, it shows that it doesn't exist on any of the sites.
Also, it takes a long time to exit after printing that my username does not exist on any of the sites.
Using the verbose option does not do anything.
I don't know how to add a user lookup on https://hearthis.at/.
The user placeholder is hearthis.at/[user], but when a nonexistent user is entered, the site just runs a search for the string.
Now, I've tried "errorType": "message" with the message "Search results", but I was getting false positives. I also tried "errorType": "response_url" with the URL https://hearthis.at/search/?q= and "regexCheck": "^[a-zA-Z][a-zA-Z0-9_-]*$", but still false positives. Does anyone have an idea how to solve this?
It would be nice to display the response time for each request.
We would only display this information in verbose mode (-v flag). The information can be displayed in each output line as:
[+] [78 ms] Quora: https://www.quora.com/profile/nareddyt
Furthermore, these response times can be exported in the CSV.
We would just need to store the time each request was created in the net_info or results_site dictionaries. Then, when the request is done, we would note the end time.
Note that there is some math involved in calculating the times. It's not a simple end-start, because the result for each response is extracted in the order of the requests (not as soon as a request finishes). This is something I still need to think about...
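One way around the ordering problem described above might be to measure each request inside the worker that runs it, so the elapsed time no longer depends on when the result is read. The following is only a sketch under that assumption: it uses the standard library's ThreadPoolExecutor rather than Sherlock's actual request code, and timed_call, fake_request and check_all are hypothetical names, with fake_request sleeping instead of touching the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(func, *args):
    """Run func(*args) and return (result, elapsed_ms), timed inside the worker."""
    start = time.monotonic()
    result = func(*args)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return result, elapsed_ms


def fake_request(site, delay):
    """Hypothetical stand-in for an HTTP request: sleeps, then returns the site name."""
    time.sleep(delay)
    return site


def check_all(sites):
    """Submit every check at once, then read the results in submission order."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        futures = [
            pool.submit(timed_call, fake_request, name, delay)
            for name, delay in sites
        ]
        # Reading in order does not distort the timings, because each
        # elapsed time was already fixed when its own worker finished.
        return [future.result() for future in futures]
```

Calling check_all([("slow", 0.3), ("fast", 0.05)]) returns the results in the order submitted, and the time attached to "fast" stays close to its own delay even though it is read after "slow".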
(xenial)sunjester@localhost:~/Downloads/sherlock$ python sherlock.py sunjester
File "sherlock.py", line 44
Fore.RED + f" {errstr}" +
^
SyntaxError: invalid syntax
There are many websites in China. Maybe you could support one or two, such as Weibo or Gitee.
For users in some countries, Facebook is blocked by a government firewall, so using a proxy is an alternative path.
I came across your project today when I was looking for something to contribute to.
I noticed Tumblr isn't working and started working on a way to get round the GDPR consent page by sending a hardcoded pfg cookie when we make the get request to Tumblr (see this branch and this commit) but unfortunately this didn't work.
Does anyone have an idea how we can get round this? If someone can point me in the right direction I'm keen to give this another go.
Nick
P.S. This is a great project by the way, nice work!
Neat project and fast too! Love it! Just in case you want more site signatures, check out my similar tool over at https://github.com/webbreacher/whatsmyname. JSON file has the site info which may or may not be helpful...like I said, we have similar projects.
I would like to say it's all in the title. GitHub accounts can't be found.
print(f"\033[37;1m[\033[91;1m-\033[37;1m]\033[91;1m {errstr}\033[93;1m {err if debug else var}")
^
SyntaxError: invalid syntax
Process finished with exit code 1
f
Because the user should not have to guess which characters are uppercase and which are lowercase, which is kind of annoying.
For example, let's say you want to see if this user has a DeviantArt account:
python sherlock.py username --site deviantart
❌
The above results in an error, so the user changes the first character to uppercase:
python sherlock.py username --site DeviantArt
❌
The user still gets an error, because the correct name is "DeviantART":
python sherlock.py username --site DeviantART
✔️
If the site name parameter were not case sensitive, issues like that could be avoided.
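A case-insensitive lookup of this kind could be sketched roughly as follows. This is an illustration only, not Sherlock's actual argument handling: find_site is a hypothetical helper, and site_data stands for whatever mapping of site names the tool loads.

```python
def find_site(site_data, requested):
    """Return the canonical site name matching `requested`, ignoring case, or None."""
    # Map a case-folded version of every known name back to its canonical spelling.
    lookup = {name.casefold(): name for name in site_data}
    return lookup.get(requested.casefold())
```

With a mapping that contains the key "DeviantART", both find_site(sites, "deviantart") and find_site(sites, "DeviantArt") would resolve to "DeviantART", while an unknown name yields None so the caller can still report an error.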
Hi, I really like this awesome tool. I followed the installation steps on the README page and ran the exact same command, python sherlock.py user123. It gives me the correct result, but the text is not rendered in the correct colors. Is Windows 10 fully supported, or did I do something wrong? Thank you!
OS: Windows 10 (64-bit)
Python Version: 3.7.1
Software: Windows PowerShell, Command Prompt
I ran python3 sherlock.py sdushantha, and it told me "Not Found!" for all of the sites.
Here is the strange thing: if I delete all of the sites from the data.json file except GitHub, and run python3 sherlock.py sdushantha, it finds your GitHub account. But if I have both GitHub and Issuu in data.json, it does not find your GitHub account (nor does it find anything on Issuu).
I am using Python 3.7.0 on macOS 10.12.6.
I get
File "sherlock.py", line 17
print (f"\033[37;1m[\033[91;1m-\033[37;1m]\033[91;1m {errstr}\033[93;1m {err}")
^
I am using Python 3.5.2.
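The line in the traceback is an f-string, a syntax that Python only accepts from version 3.6 onward, so a SyntaxError on 3.5.2 is expected. If older interpreters were to be supported, the same message could be built with str.format. The sketch below assumes that goal; format_error is a hypothetical helper, not a function in sherlock.py.

```python
def format_error(errstr, err):
    """Build the colored error line without f-strings (parses on Python < 3.6)."""
    template = "\033[37;1m[\033[91;1m-\033[37;1m]\033[91;1m {}\033[93;1m {}"
    return template.format(errstr, err)
```

print(format_error(errstr, err)) would then produce the same escape sequences as the original print call.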
If a domain is blocked, it crashes.
Feature request:
Has anyone built a REST API using Sherlock?
I would love a simple API using Flask or Chalice, with instructions for deploying it on AWS Lambda.
Should we force users to pip install modules that they might not need?
try:
    from requests_futures.sessions import FuturesSession
except ImportError:
    FuturesSession = None
try:
    from torrequest import TorRequest
except ImportError:
    TorRequest = None
# [...]
if FuturesSession:
    xyz()
if TorRequest:
    abc()
Noticed while running the tests that devRant is not working right. @TheYahya suggested during the review of #105 that we use the usernames of site founders in the tests (as they would be expected to never delete their accounts). But when I tried to find the "dfox" username, it said that it did not exist. Yet if I go to https://devrant.com/users/dfox, I can see his profile.
Here is the example command that demonstrates the problem:
python -u sherlock.py dfox --site devRant --verbose
Looks like @sdushantha added this site originally.
Nice work on the project.
How do you get past the CORS issue?
I'm building the JavaScript fork of this project at https://github.com/abhijithvijayan/sherlock.
I have to now use a CORS proxy to get it right.
Currently --quiet and --verbose are opposites, where --quiet is the default option.
Given the number of websites covered by Sherlock, should --quiet be altered so that it is not the opposite of verbose but an option that displays only websites where the username has been found?
This way, running with no options stays as it is, -v stays as it is, and -q behaves as described above.
Steps to reproduce the error
Docker Debian test using Python 2 and 3. Both return an error.
docker pull debian
docker run -it debian /bin/bash
apt update
apt install -y git python-pip python3-pip
git clone https://github.com/sdushantha/sherlock.git
cd sherlock
pip install -r requirements.txt
pip3 install -r requirements.txt
root@cbf5d10f33ac:/sherlock# python -V
Python 2.7.13
root@cbf5d10f33ac:/sherlock# python sherlock.py username
File "sherlock.py", line 42
Fore.RED + f" {errstr}" +
^
SyntaxError: invalid syntax
root@cbf5d10f33ac:/sherlock# python3 -V
Python 3.5.3
root@cbf5d10f33ac:/sherlock# python3 sherlock.py username
File "sherlock.py", line 42
Fore.RED + f" {errstr}" +
^
SyntaxError: invalid syntax
Docker Ubuntu test using Python 2 and 3. Only Python 2 returns an error, the same one as on Debian.
docker pull ubuntu
docker run -it ubuntu /bin/bash
apt update
apt install -y git python-pip python3-pip
git clone https://github.com/sdushantha/sherlock.git
cd sherlock
pip install -r requirements.txt
pip3 install -r requirements.txt
root@ba7ffcdd7f80:/sherlock# python3 -V
Python 3.6.7
root@ba7ffcdd7f80:/sherlock# python -V
Python 2.7.15rc1
root@ba7ffcdd7f80:/sherlock# python sherlock.py username
File "sherlock.py", line 42
Fore.RED + f" {errstr}" +
^
SyntaxError: invalid syntax
Almost all social networks have some rules for valid usernames. For example, Facebook says it must be between 5 and 50 characters and can contain dot. Twitter says it must be between 1 and 15 characters and can contain dot and underscore. Pinterest says it must be between 3 and 30 characters, no dots, no underscores (only alphanumerics).
Unfortunately, Sherlock doesn't take any of these rules into account, which can be misleading for the user (for example, they might think that a username is taken when it is, in fact, not allowed on that site).
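A per-site validity check along these lines could be sketched with regular expressions. The patterns below are only a transcription of the Facebook and Pinterest rules quoted above and have not been checked against the sites themselves; USERNAME_RULES and is_valid_username are hypothetical names, not part of Sherlock.

```python
import re

# Illustrative patterns transcribed from the rules quoted above; assumptions,
# not verified against the sites' current policies.
USERNAME_RULES = {
    "Facebook": r"[A-Za-z0-9.]{5,50}",
    "Pinterest": r"[A-Za-z0-9]{3,30}",
}


def is_valid_username(site, username, rules=USERNAME_RULES):
    """Return False only when the site has a known rule that the username breaks."""
    pattern = rules.get(site)
    if pattern is None:
        # No rule recorded for this site, so do not reject anything.
        return True
    return re.fullmatch(pattern, username) is not None
```

A caller could skip the network request and report the username as not allowed, rather than as taken or available, whenever this check returns False.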
I think this tool is great for someone looking to create a new internet persona or company, so maybe it would be a logical addition to add a search for .com and .net domains, for example.
$ python sherlock/sherlock.py user123
."""-.
/ \
____ _ _ _ | _..--'-.
/ ___|| |__ ___ _ __| | ___ ___| |__ >.`__.-""\;"`
\___ \| '_ \ / _ \ '__| |/ _ \ / __| |/ / / /( ^\
___) | | | | __/ | | | (_) | (__| < '-`) =|-.
|____/|_| |_|\___|_| |_|\___/ \___|_|\_\ /`--.'--' \ .-.
.'`-._ `.\ | J /
/ `--.| \__/
[*] Removing previous file: user123.txt
[*] Checking username user123 on:
Traceback (most recent call last):
File "sherlock/sherlock.py", line 313, in <module>
main()
File "sherlock/sherlock.py", line 289, in main
results = sherlock(username, verbose=args.verbose, tor=args.tor, unique_tor=args.unique_tor)
File "sherlock/sherlock.py", line 93, in sherlock
raw = open("data.json", "r", encoding="utf-8")
FileNotFoundError: [Errno 2] No such file or directory: 'data.json'
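The traceback shows open("data.json", ...) failing when the script is started from the parent directory, which suggests the relative path is resolved against the current working directory rather than against the script's location. A possible fix, sketched here under that assumption, is to build the path from the script's own directory; data_file_path is a hypothetical helper name.

```python
from pathlib import Path


def data_file_path(script_path, filename="data.json"):
    """Return the path of `filename` located next to the given script file."""
    return Path(script_path).resolve().parent / filename
```

Inside sherlock.py this could be used as open(data_file_path(__file__), "r", encoding="utf-8"), so that the file is found regardless of the directory the command is run from.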
While it is easy to add a new site to scan, it is really hard to know if it is actually supported properly. Especially for the sites that do not have an explicit error code. For these sites, Sherlock relies on knowing the exact text of the informational page that the social media site presents. Which, even if it works today, could very well break tomorrow.
I am thinking that there would be a tests directory that would contain Python code that would run various tests. The main input to these tests would be a curated list of user names that are supported at a given social media site, and another list of user names that are not.
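Such a test could look roughly like the sketch below, written with the standard unittest module. Everything here is illustrative: the two dictionaries are hypothetical curated lists (seeded with usernames mentioned elsewhere in these issues), and username_exists is a stand-in for the real lookup, which would query each site instead of reading a dictionary.

```python
import unittest

# Hypothetical curated lists: usernames expected to exist, or not, per site.
KNOWN_USERS = {"GitHub": ["sdushantha"], "devRant": ["dfox"]}
UNKNOWN_USERS = {
    "GitHub": ["obviousfakeusernamethatdoesntexist"],
    "devRant": ["obviousfakeusernamethatdoesntexist"],
}


def username_exists(site, username):
    """Stand-in for the real check; a real test would call Sherlock here."""
    return username in KNOWN_USERS.get(site, [])


class SiteCoverageTest(unittest.TestCase):
    def test_known_users_are_found(self):
        for site, names in KNOWN_USERS.items():
            for name in names:
                with self.subTest(site=site, name=name):
                    self.assertTrue(username_exists(site, name))

    def test_unknown_users_are_not_found(self):
        for site, names in UNKNOWN_USERS.items():
            for name in names:
                with self.subTest(site=site, name=name):
                    self.assertFalse(username_exists(site, name))


def run_site_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SiteCoverageTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Using subTest means a single broken site is reported on its own instead of hiding the results for the rest.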
There is a bash script which installs all of the packages automatically, and that is great! But there is one drawback: it won't work on Arch-based distros (Manjaro, ArchLabs, etc.). That is because those distros use pacman instead of apt to download packages.
In my opinion, it is not a good idea to have that script unless we add support for other distros as well.
File "D:/sherlock-master/sherlock.py", line 39
print(f"\033[37;1m[\033[91;1m-\033[37;1m]\033[91;1m {errstr}\033[93;1m {err if debug else var}")
^
SyntaxError: invalid syntax
Process finished with exit code 1
I think an async version would be pretty nifty and not too difficult to do for this project (*knocks on wood*). I can get started on this if you want.
I know trio more than I do asyncio/aiohttp but if you prefer the latter I could learn those.
There is a syntax error in the file sherlock.py on line 17.
Please review.
Thank you!
Regardless of what username I try, for Spotify and BLIP.fm I obtain "Error Connecting".
I do not think that the version number system should use the date. There are multiple systems out there, but they are all flavors of major.minor.maintenance. This allows the version number to have some meaning to other people.
https://github.com/sdushantha/sherlock/blob/e2c4dbf1ef69db80a9c6ebf591be874686e04301/sherlock.py#L20
requests == 2.13.0
requests has no attribute __description__; instead it only has:
'__author__',
'__build__',
'__builtins__',
'__cached__',
'__copyright__',
'__doc__',
'__file__',
'__license__',
'__loader__',
'__name__',
'__package__',
'__path__',
'__spec__',
'__title__',
'__version__',
'_internal_utils',
So I use __doc__ instead. Am I right?
Thank you
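Since the report above shows that requests 2.13.0 has __doc__ but no __description__, one defensive option is to fall back with getattr instead of assuming either attribute exists. This is a sketch only: describe is a hypothetical helper, and the choice to use the first line of the docstring is an assumption about what should be displayed.

```python
def describe(module):
    """Return module.__description__ if present, else the first docstring line."""
    description = getattr(module, "__description__", None)
    if description:
        return description
    doc = (getattr(module, "__doc__", None) or "").strip()
    # Fall back to the module name when there is no docstring either.
    return doc.splitlines()[0] if doc else module.__name__
```

describe(requests) would then work on versions with or without the __description__ attribute.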
Blackplanet is giving false positives (requests made from Germany).
@Czechball you added this in #81 ; maybe you are able to fix it?
Lines 87-89, which are comments, state
User agent is needed because some sites does not
return the correct information because it thinks that
we are bot
which I think could be tidied up to
A user agent is needed because some sites don't
return the correct information since they think that
we are bots
Other than that tiny issue, I find your code very readable. Great work, and thanks for making a usable script!
It may be a good idea to sort the sites in sites.md and data.json alphabetically. When I'm looking for sites to add, I always have to Ctrl+F in this repo or just scroll through the file... Also when seeing the results, it's just chaos.
Hey !
Could you add a webhook to the docker hub to automatically build the image on the cloud and make it available without cloning the project ?
I can't do it for you in a PR, I'm sorry !
Thank you for your work 👍