0xprateek / stardox
Github stargazers information gathering tool
License: GNU General Public License v3.0
Scrapes the email IDs of stargazers and displays them in a tree list view.
Stardox currently supports Linux only and has not been tested on Windows. Making it compatible for Windows users as well is a future goal.
Looks like a great tool! After following the installation instructions I ran:
$ python3 stardox.py
The Stardox logo appears, then the following error appears:
[-] Error importing requests module.
Suggestions appreciated!
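The "[-] Error importing requests module." message suggests stardox swallows the real ImportError. A minimal sketch of a friendlier guarded import (the `safe_import` helper is my own illustration, not stardox's actual code):

```python
import sys

def safe_import(module_name):
    """Import a module by name; print a stardox-style hint and exit if it is missing."""
    try:
        return __import__(module_name)
    except ImportError:
        print(f"[-] Error importing {module_name} module.")
        print(f"[!] Try: python3 -m pip install {module_name}")
        sys.exit(1)

# Demonstrated with a stdlib module so the sketch runs anywhere:
json_module = safe_import("json")
```

Printing the suggested pip command alongside the error would have made reports like the one above self-diagnosing.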
As a new feature, using the username to fetch someone's GitHub profile details could prove helpful. Their repositories would also be listed, so one could then look at each repo's information.
Add support for colors on Windows.
First of all, thank you...
I've tried this on a few repos and am getting this response:
Traceback (most recent call last):
  File "stardox.py", line 384, in <module>
    stardox(repository_link,verbose,issave)
  File "stardox.py", line 327, in stardox
    structer.plotdata(len(data.username_list), pos, count)
  File "/Users/xxxxxxxxx/Desktop/Stardox/src/structer.py", line 20, in plotdata
    data.star_list[pos].strip(),
IndexError: list index out of range
Any ideas?
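The traceback ends at `data.star_list[pos].strip()`, so the scrape likely returned fewer stargazer entries than `pos` expects. A hedged sketch of a bounds-checked lookup (the names mirror the traceback; the fix itself is an assumption about the intended behavior):

```python
def star_at(star_list, pos):
    """Return the stripped entry at pos, or a placeholder when the list came back short."""
    if 0 <= pos < len(star_list):
        return star_list[pos].strip()
    return "N/A"  # avoids "IndexError: list index out of range"

in_range = star_at(["  42 stars "], 0)   # normal access
fallback = star_at(["  42 stars "], 5)   # out-of-range falls back to "N/A"
```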
Traceback (most recent call last):
  File "stardox.py", line 346, in
    stardox(repository_link, verbose, max_threads=16)
  File "stardox.py", line 232, in stardox
    soup1 = BeautifulSoup(html, "lxml")
NameError: name 'BeautifulSoup' is not defined
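The NameError means stardox.py uses BeautifulSoup without importing it. The class lives in the bs4 package (installed from PyPI as beautifulsoup4), so the fix is one import line:

```python
from bs4 import BeautifulSoup  # package name on PyPI: beautifulsoup4

html = "<html><body><h1>stardox</h1></body></html>"
soup1 = BeautifulSoup(html, "html.parser")  # "lxml" also works when lxml is installed
heading = soup1.h1.get_text()
```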
In the README there are a few typos, e.g. "It scraps Github" and "information of yours/someone's".
Use something like pipenv?
It's not good practice to install an application's requirements globally. We can use a virtual environment, where the dependencies are installed only for the application rather than system-wide. This also reduces environment-related errors.
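A minimal sketch of the virtual-environment workflow using the stdlib venv module (pipenv works too; the scratch directory is only for illustration):

```shell
cd "$(mktemp -d)"            # illustrative scratch directory
python3 -m venv .venv        # create an isolated environment
. .venv/bin/activate         # enter it (Linux/macOS)
python -m pip --version      # pip now points inside .venv
deactivate
```

Inside the environment, `pip install -r requirements.txt` installs stardox's dependencies only for this project.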
If the repo has only one stargazer, no info can be fetched.
When the user enters wrong info or a wrong repository, the CLI exits the script. Instead of exiting, we should give at least three attempts (or at least one retry) before exiting, which makes entering the info easier for the user.
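A sketch of the requested retry behavior: give the user a few attempts before exiting (the validator and prompt are hypothetical, and the input function is injected so the loop is testable):

```python
import sys

MAX_ATTEMPTS = 3

def prompt_repository(is_valid, read_line=input, attempts=MAX_ATTEMPTS):
    """Ask for a repository link up to `attempts` times before giving up."""
    for remaining in range(attempts, 0, -1):
        link = read_line("Enter the repository address :: ").strip()
        if is_valid(link):
            return link
        print(f"[-] Invalid repository, {remaining - 1} attempt(s) left.")
    sys.exit("[-] Too many invalid attempts, exiting.")
```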
Stardox should run even if, instead of the complete link to the repository, only the owner/repo-name format is entered. The user will have the option to enter either the complete link or this short format.
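A sketch of accepting both input forms by normalizing the short owner/repo-name form into a full URL (a hypothetical helper, not current stardox code):

```python
def normalize_repo_link(arg):
    """Accept a full GitHub URL or the short owner/repo-name form."""
    arg = arg.strip().rstrip("/")
    if arg.startswith(("http://", "https://")):
        return arg
    return f"https://github.com/{arg}"

short_form = normalize_repo_link("0xprateek/stardox")
full_form = normalize_repo_link("https://github.com/0xprateek/stardox/")
```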
The repo has 2.3k stargazers, but it can only show me 1,192 stargazers' info.
Make Stardox usable via command-line arguments.
Use the PEP 8 standard for formatting code.
You can read about it here.
A new feature to get the details of all the contributors of a repository.
With this feature, users will be able to save the doxed stargazer's information into a text file.
The current approach to command-line arguments in our code does not scale to adding a new argument.
New, easy-to-extend command-line argument code is required to keep everything simple.
Make sure to update your changes and the usage instructions in README.md.
Stardox takes a long time to come up with results. This issue is for resolving the speed problem.
I have listed things we can do to speed it up (or for a fast mode):
Using this for the first time, and it seems to get the numbers wrong on big repositories. For example, this repo's actual stargazer count is roughly 1000x what is reported:
Enter the repository address :: https://github.com/freeCodeCamp/freeCodeCamp
[+] Got the repository data
[+] Repository Title : freeCodeCamp
[+] Total watchers : 83
[+] Total stargazers : 306
[+] Total Forks : 231
[+] Fetching stargazers list
[+] Doxing started ...
When exporting the results to csv, I am not receiving more than 1201 rows returned.
I might be wrong on this one, but if this app doesn't use GitHub's security tokens, fetching member info from various repositories is severely limited to just a few entries no matter what.
Top right avatar / Settings / Developer settings / Personal access tokens / Generate new token
I think this raises the number from 60 entries to something like 1500 per hour. I've actually bumped into some similar app that was also working around the limitations by waiting, which is also handy!
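For context, the GitHub REST API documents 60 unauthenticated requests per hour versus 5,000 with a personal access token. A hedged sketch of passing the token (this uses the REST API rather than stardox's HTML scraping; the function and env-var names are my own):

```python
import os

def github_headers(token=None):
    """Build GitHub API request headers; adds token auth when one is provided."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    return headers

def fetch_stargazers_page(owner, repo, page=1):
    """Fetch one page of stargazers (requires the third-party requests package)."""
    import requests  # imported lazily so the header helper stays dependency-free
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/stargazers",
        headers=github_headers(os.environ.get("GITHUB_TOKEN")),
        params={"per_page": 100, "page": page},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```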
An --email only flag is required as it's requested by many users of stardox.
It will give us only emails of the stargazers.
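A sketch of the flag with argparse (the option and attribute names are suggestions, not existing stardox code):

```python
import argparse

def build_parser():
    """CLI sketch: --email limits output to the stargazers' email addresses only."""
    parser = argparse.ArgumentParser(prog="stardox")
    parser.add_argument("repository", help="repository link or owner/repo-name")
    parser.add_argument("--email", action="store_true",
                        help="print only the stargazers' email addresses")
    return parser

args = build_parser().parse_args(["0xprateek/stardox", "--email"])
```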
Logging of all the;
The current code in the if __name__ == '__main__': part can be made into a function, since if we add new features where this part is not required, it will not be called. (For example, in fetching details using a username, this part is not required.)
Also, to add new arguments, I suggest making a separate arguments.py file, which will store the action classes of new arguments.
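A sketch of both ideas: the old entry-point body wrapped in a reusable main(), and a custom argparse Action of the kind that could live in a separate arguments.py (all names are illustrative):

```python
import argparse

class VerboseAction(argparse.Action):
    """Example action class for arguments.py: counts repeated -v flags."""
    def __call__(self, parser, namespace, values, option_string=None):
        current = getattr(namespace, self.dest, 0) or 0
        setattr(namespace, self.dest, current + 1)

def main(argv=None):
    """Former `if __name__ == '__main__':` body; other features can skip or reuse it."""
    parser = argparse.ArgumentParser(prog="stardox")
    parser.add_argument("-v", dest="verbosity", action=VerboseAction,
                        nargs=0, default=0, help="increase verbosity")
    return parser.parse_args(argv)
```

The entry point then shrinks to `if __name__ == '__main__': main()`, and username-based fetching can call its own function instead.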
Increase the speed of the scraper by using multithreading.
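A minimal sketch of concurrent fetching with the stdlib ThreadPoolExecutor; `fetch` stands in for stardox's per-page scraping function:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(urls, fetch, max_threads=8):
    """Run `fetch` over all URLs concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(fetch, urls))

# Toy stand-in for a network fetch:
pages = fetch_all([1, 2, 3], lambda page: f"stargazers-page-{page}")
```

Threads suit this workload because the scraper is I/O-bound: most of its time is spent waiting on HTTP responses, not computing.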
This fails to work on my Linux machine for some reason...
dread@FreezingMoon:~$ git clone https://github.com/0xprateek/stardox
Cloning into 'stardox'...
remote: Enumerating objects: 28, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 211 (delta 15), reused 4 (delta 0), pack-reused 183
Receiving objects: 100% (211/211), 79.87 KiB | 614.00 KiB/s, done.
Resolving deltas: 100% (110/110), done.
dread@FreezingMoon:~$ cd stardox/
dread@FreezingMoon:~/stardox$ pip install -r requirements.txt
Collecting requests (from -r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl
Collecting beautifulsoup4 (from -r requirements.txt (line 2))
Using cached https://files.pythonhosted.org/packages/f9/d9/183705a87492249b212d88eef740995f55076195bcf45ed59306c146e42d/beautifulsoup4-4.8.1-py2-none-any.whl
Collecting lxml (from -r requirements.txt (line 3))
Using cached https://files.pythonhosted.org/packages/e4/f4/65d145cd6917131826050b0479be35aaccba2847b7f80fc4afc6bec6616b/lxml-4.4.1-cp27-cp27mu-manylinux1_x86_64.whl
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/b4/40/a9837291310ee1ccc242ceb6ebfd9eb21539649f193a7c8c86ba15b98539/urllib3-1.25.7-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/18/b0/8146a4f8dd402f60744fa380bc73ca47303cccf8b9190fd16a827281eac2/certifi-2019.9.11-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.9,>=2.5 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Collecting soupsieve>=1.2 (from beautifulsoup4->-r requirements.txt (line 2))
Using cached https://files.pythonhosted.org/packages/81/94/03c0f04471fc245d08d0a99f7946ac228ca98da4fa75796c507f61e688c2/soupsieve-1.9.5-py2.py3-none-any.whl
Collecting backports.functools-lru-cache; python_version < "3" (from soupsieve>=1.2->beautifulsoup4->-r requirements.txt (line 2))
Using cached https://files.pythonhosted.org/packages/da/d1/080d2bb13773803648281a49e3918f65b31b7beebf009887a529357fd44a/backports.functools_lru_cache-1.6.1-py2.py3-none-any.whl
Installing collected packages: urllib3, certifi, chardet, idna, requests, backports.functools-lru-cache, soupsieve, beautifulsoup4, lxml
Successfully installed backports.functools-lru-cache-1.6.1 beautifulsoup4-4.8.1 certifi-2019.9.11 chardet-3.0.4 idna-2.8 lxml-4.4.1 requests-2.22.0 soupsieve-1.9.5 urllib3-1.25.7
dread@FreezingMoon:~/stardox$ cd src/
dread@FreezingMoon:~/stardox/src$ python3 stardox.py
[Stardox ASCII-art logo banner]                         Made By : Pr0t0n
[-] Error importing requests module.
dread@FreezingMoon:~/stardox/src$
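Note the wheels in the install log above are Python 2 builds (`py2-none-any`, `cp27`): plain `pip` installed the dependencies into Python 2's site-packages, while the script was launched with `python3`. Installing with the same interpreter avoids the mismatch (a sketch; exact commands depend on the system):

```shell
python3 -m pip --version                    # confirm this pip installs for Python 3
python3 -m pip install -r requirements.txt  # lands in python3's site-packages
```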
Along with the other details of each stargazer's GitHub profile, if the bio and location are also shown, it will help to know them better.
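A hedged sketch of pulling bio and location out of profile HTML with BeautifulSoup; the CSS selectors are assumptions about GitHub's markup, and the demo parses an inline snippet rather than a live page:

```python
from bs4 import BeautifulSoup  # PyPI: beautifulsoup4

def profile_details(html):
    """Extract bio and location from a GitHub-profile-like HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    bio = soup.select_one(".user-profile-bio")           # assumed bio container class
    location = soup.select_one('[itemprop="homeLocation"]')  # assumed location markup
    return {
        "bio": bio.get_text(strip=True) if bio else None,
        "location": location.get_text(strip=True) if location else None,
    }

sample = ('<div class="user-profile-bio">Security tinkerer</div>'
          '<span itemprop="homeLocation">India</span>')
details = profile_details(sample)
```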
Increase code readability by adding comments and improving the code style.