gerbenjavado / linkfinder

A python script that finds endpoints in JavaScript files

Home Page: https://gerbenjavado.com/discovering-hidden-content-using-linkfinder

License: MIT License

Languages: Python 88.01%, HTML 10.27%, Dockerfile 1.72%
Topics: endpoints, infosec

linkfinder's Issues

Remove duplicate endpoints (cli / html)

Duplicate data is sometimes useful, but I don't think it is in this case. It would be better to remove all duplicate entries from the endpoints list [ ] before sending it to the output (cli/html).
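
One possible way to do this is an order-preserving pass over the list. This is a minimal sketch, not LinkFinder's actual code: it assumes each endpoint is a dict with a "link" key (as suggested by the `endpoint["link"]` lines in tracebacks quoted on this page), and the function name `dedupe_endpoints` is hypothetical.

```python
def dedupe_endpoints(endpoints):
    """Return endpoints with repeated "link" values removed, keeping first occurrences."""
    seen = set()
    unique = []
    for endpoint in endpoints:
        link = endpoint["link"]
        if link not in seen:
            seen.add(link)
            unique.append(endpoint)
    return unique


# Illustrative data only
found = [{"link": "/api/v1"}, {"link": "/login"}, {"link": "/api/v1"}]
print([e["link"] for e in dedupe_endpoints(found)])  # ['/api/v1', '/login']
```

Keeping the first occurrence means the context shown for a link is the first place it appeared in the file.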

File could not be found error

When you don't prefix the input URL with http/https, the script fails with a non-obvious error:

Usage: python linkfinder.py [Options] use -h for help
Error: file could not         be found.

This might trip new users up. I suggest either

  1. checking the url and suggesting they add http or https
  2. adding a comment to the 'file could not be found' error message such as
Error: file could not         be found.
(did you remember to specify either http or https?)
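
Both suggestions could share one small check. The sketch below is only an illustration: `scheme_hint` is a hypothetical helper, and the "dot before the first slash" test is a rough heuristic assumed here, so a local file such as app.js would also trigger the hint.

```python
def scheme_hint(value):
    """Return a hint when value looks like a URL without a scheme, else None."""
    if value.startswith(("http://", "https://", "file://")):
        return None
    # Heuristic (an assumption): a dot before the first slash suggests a hostname.
    host = value.split("/", 1)[0]
    if "." in host and not value.startswith((".", "/", "~")):
        return "(did you remember to specify either http or https?)"
    return None


print(scheme_hint("example.com/app.js"))
print(scheme_hint("https://example.com/app.js"))
```

The hint could then be appended to the existing "file could not be found" message rather than replacing it.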

Script stops abruptly if SSL error is encountered

Lines 280-281 have:

except Exception as e:
     parser_error("invalid input defined or SSL error: %s" % e)

And parser_error() calls sys.exit().

This causes the following loop to terminate the execution mid-way, leaving the remaining endpoints unprocessed:

if args.domain:
    for endpoint in endpoints:

        <snipped>

        except Exception as e:
            parser_error("invalid2 input defined or SSL error: %s" % e)
        print("")

I'm not sure if this is intended behaviour - thought I'd just report the issue. Feel free to close this if you think it's not something that needs to be fixed.

Cheers,
Amal
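
If the early exit is not intended, one option is to record the failure and move on to the next endpoint. This is a sketch under assumptions: `process_all` and `fake_fetch` are hypothetical names, and `fake_fetch` only stands in for the real download step.

```python
def process_all(endpoints, fetch):
    """Call fetch on every endpoint, recording failures instead of exiting."""
    results, failures = [], []
    for endpoint in endpoints:
        try:
            results.append(fetch(endpoint))
        except Exception as e:  # broad on purpose, mirroring the quoted code
            failures.append((endpoint, str(e)))
            continue
    return results, failures


def fake_fetch(url):
    # Hypothetical stand-in for the real download step.
    if "bad" in url:
        raise IOError("SSL error")
    return url.upper()


ok, failed = process_all(["a.js", "bad.js", "c.js"], fake_fetch)
print(ok)      # ['A.JS', 'C.JS']
print(failed)  # [('bad.js', 'SSL error')]
```

The collected failures can be printed once the loop has finished, so a single SSL error no longer hides the remaining endpoints.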

issue

Error: invalid input defined or SSL error: <urlopen error [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:852)>

ImportError: No module named html

Traceback (most recent call last):
  File "linkfinder.py", line 11, in <module>
    import re, sys, glob, html, argparse, jsbeautifier, webbrowser, subprocess, base64, ssl, xml.etree.ElementTree
ImportError: No module named html

Can't load .js files from the folder.

I downloaded .js files with gau and want to process them with this script.
But it just dumps the names of all the files on the CLI and doesn't work.

Commands I tried:

python3 SecretFinder.py -i ../location/to/jsfiles/*

python3 SecretFinder.py -i ../location/to/jsfiles/

List of URLS

Hello,
Can you add a feature to read the JS files from a text file? I have a file full of JS file URLs, but I cannot pass it to LinkFinder. It would be really helpful if you could add this.
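
A text file of URLs could be read with a few lines like the following. This is a hedged sketch of the requested feature, not something LinkFinder is known to provide: `read_url_list` is a hypothetical helper, and skipping lines that start with "#" is an assumption.

```python
def read_url_list(path):
    """Return the non-empty, non-comment lines of a text file as a list of URLs."""
    urls = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                urls.append(line)
    return urls
```

Each returned URL could then be processed the same way as a single -i value.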

Import Error: No module named html

python version: 3.8.3rc1
LinkFinder version: master

Error:

Traceback (most recent call last):
  File "./linkfinder.py", line 11, in <module>
    import re, sys, glob, html, argparse, jsbeautifier, webbrowser, subprocess, base64, ssl, xml.etree.ElementTree
ImportError: No module named html

No Output if not using -d flag

There's no output if I provide a file with URLs or a single URL (tried with multiple URLs/JS files). It only works if I provide the -d flag.

-i (--input) does not work with wildcards when using in path to folder: /path/to/folder/*.js

Hi Gerben, I hope you're well :)

I found a small issue with the -i option. It occurs when I pass a path to a folder of JavaScript files using a wildcard:

$ ./linkfinder.py -i ~/hacking/bugbounty/Valve/store.steampowered.com/*.js

I get the following error:

linkfinder.py: error: unrecognized arguments: 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/dynamicstore.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/home.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/jquery-1.8.3.min.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/main.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/shared_global.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/shared_responsive_adapter.js 
/Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/tooltip.js

When I pass a single file from this folder, LinkFinder works like a charm:

$ ./linkfinder.py --input ~/hacking/bugbounty/Valve/store.steampowered.com/cluster.js

Running against: /Users/bl4de/hacking/bugbounty/Valve/store.steampowered.com/cluster.js

Let me know if you need any additional information about this issue

Regards,

bl4de
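
The error above is what argparse reports when the shell expands *.js into several paths before the script starts, so -i receives one value and the rest are "unrecognized". A minimal sketch of one way to accept them, assuming an -i/--input option like the one shown above; `build_parser` and `expand_inputs` are hypothetical names:

```python
import argparse
import glob


def build_parser():
    parser = argparse.ArgumentParser()
    # nargs="+" lets -i accept the several paths a shell produces from *.js.
    parser.add_argument("-i", "--input", nargs="+", required=True)
    return parser


def expand_inputs(values):
    """Expand any remaining glob patterns; keep values that match nothing as-is."""
    paths = []
    for value in values:
        matches = sorted(glob.glob(value))
        paths.extend(matches if matches else [value])
    return paths


args = build_parser().parse_args(["-i", "home.js", "main.js", "tooltip.js"])
print(expand_inputs(args.input))
```

Quoting the pattern on the command line ('/path/to/folder/*.js') is the other route: the shell then passes it through unexpanded and the script can expand it itself.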

html.escape not found in python2

Error message:

Traceback (most recent call last):
  File "linkfinder.py", line 375, in <module>
    url = html.escape(endpoint["link"])
AttributeError: 'module' object has no attribute 'escape'

Last time I fixed this issue, it turned out that LinkFinder only worked on my machine because I happened to have the "future" package installed. Any machine running python2 without that package will fail.

Thank you Vishal for reporting this to me.

Here is the point of this issue:
Should I still fix it to keep python2 support? Fixing it will take just a few minutes, but it will make the code a bit messier, and python2 will be deprecated in 4 months anyway.

@EdOverflow What is your opinion?
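
For reference, a compatibility import is one small way to support both interpreters. This is a sketch, not the fix that was eventually applied; note that html.escape also escapes quotes by default, while cgi.escape does not.

```python
try:
    from html import escape  # Python 3.2+
except ImportError:
    from cgi import escape  # Python 2 fallback


print(escape("<b>"))  # &lt;b&gt;
```

The same try/except also covers the case where a module named html exists but has no escape attribute, since `from html import escape` raises ImportError there as well.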

Feature Request

@EdOverflow @GerbenJavado

  • First of all, thanks for creating this tool.

How about adding this feature:

All *.js URLs are listed in an input.txt file.

Some JS files may require a session cookie:

./linkfinder.py -cookie="SESSION" -i input.txt -o res.html

For each *.js URL, the tool would:

  1. make a request to the URL,
  2. take the response,
  3. find the links,
  4. and output them.

Let me know if this is a good idea.

Thanks!
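
The cookie part of this request amounts to sending a Cookie header with each download. A minimal Python 3 sketch using the standard library follows; `build_request` is a hypothetical helper, the URL and cookie value are placeholders, and the request is only constructed here, not sent.

```python
import urllib.request


def build_request(url, cookie):
    """Build a request for url that carries the given Cookie header."""
    return urllib.request.Request(url, headers={"Cookie": cookie})


req = build_request("https://example.com/app.js", "SESSION=abc123")
print(req.get_header("Cookie"))  # SESSION=abc123
```

Passing the request object to urllib.request.urlopen would then fetch the file with the session attached.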

Reduce false positives by setting minlength

In some testing I noticed we might be able to cut a lot of false positives by introducing a minimum length for a URL. Right now, strings like ./zh-tw or even /./ are considered valid. Setting a minimum length of 5-7 characters would limit a lot of them.

What do you think, @EdOverflow and @Bankde? What would be the ideal tradeoff here? Maybe even more than 7 characters minimum?
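
To make the tradeoff concrete, here is a small sketch of such a filter. It assumes endpoints are dicts with a "link" key; `drop_short_links` is a hypothetical name and the sample data is illustrative.

```python
def drop_short_links(endpoints, min_length=5):
    """Keep only endpoints whose link has at least min_length characters."""
    return [e for e in endpoints if len(e["link"]) >= min_length]


found = [{"link": "/./"}, {"link": "./zh-tw"}, {"link": "/api/v1/users"}]
print([e["link"] for e in drop_short_links(found, 5)])  # ['./zh-tw', '/api/v1/users']
print([e["link"] for e in drop_short_links(found, 8)])  # ['/api/v1/users']
```

Note that ./zh-tw is exactly 7 characters long, so it only disappears once the minimum goes above 7.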

SSL Error

Hi,
When I run python3 linkfinder.py -i https://target.com/12345674545.js, I get the following error.
Please help me.
Thanks.

Error: invalid input defined or SSL error: <urlopen error [SSL] internal error (_ssl.c:1123)>

Regex does not recognize filenames with an extra dot in them

File example:

<html>
<body>
<script src="some-name.anotherone.js"></script>
</body>
</html>

Command:
python linkfinder.py -i s.html -o cli
This outputs nothing.
Solution:
Change the regular expression by adding a dot to the filename character class.
from:
([a-zA-Z0-9_\-]{1,} # filename
to:
([a-zA-Z0-9_\-.]{1,} # filename
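
The effect of the change can be checked in isolation. The two patterns below use only the filename character classes quoted above, followed by a literal .js; they are not the full LinkFinder regex.

```python
import re

# Filename fragment before and after the proposed change, plus a ".js" suffix.
OLD_NAME = re.compile(r"[a-zA-Z0-9_\-]{1,}\.js")
NEW_NAME = re.compile(r"[a-zA-Z0-9_\-.]{1,}\.js")

name = "some-name.anotherone.js"
print(OLD_NAME.fullmatch(name) is not None)  # False
print(NEW_NAME.fullmatch(name) is not None)  # True
```

Allowing the dot also lets names such as a..js through, which may or may not matter for false positives.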

Help me! OSError: [WinError 1] 函数不正确。 (Incorrect function.)

How can I fix this error?

python3 linkfinder.py -i http://www.baidu.com -d -o gcy.html

Running against: https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/jquery/jquery-1.10.2.min_65682a2.js

Running against: https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/plugins/every_cookie_4644b13.js

Running against: https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/plugins/every_cookie_mac_82990d4.js

Running against: https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/global/js/all_async_search_f3b9a2f.js

URL to access output: file://D:\LinkFinder-master\gcy.html
Output can't be saved in gcy.html             due to exception: [WinError 1] 函数不正确。
Traceback (most recent call last):
  File "linkfinder.py", line 251, in html_save
    print("URL to access output: file://%s" % os.path.abspath(args.output))
OSError: [WinError 1] 函数不正确。

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "linkfinder.py", line 392, in <module>
    html_save(output)
  File "linkfinder.py", line 259, in html_save
    due to exception: %s" % (args.output, e))
OSError: [WinError 1] 函数不正确。

Add support for list of urls

Hi @Willianvdv

I used Waybackurls and got a list of JS URLs. Now I want to feed that output into LinkFinder, but I could not find any option for passing in a urls.txt file.

Please add support for a text file of URLs.

Thanks.

Unable to discover .js for a specific URL

For some reason, when I search a particular URL I get the following error. Note that other URLs work fine.

The URL contains a hash, which may be what is tripping it up: www.example.com/#Login

Error:
[1846:1846:0716/002700.036506:ERROR:http_bridge.cc(110)] Not implemented reached in virtual void syncer::HttpBridgeFactory::OnSignalReceived()
[1846:1869:0716/002700.073889:ERROR:browser_process_sub_thread.cc(221)] Waited 5 ms for network service

Getting error while using burp file

$ python3 linkfinder.py -b -i /home/kindred.burp -o cli
Traceback (most recent call last):
  File "linkfinder.py", line 325, in <module>
    urls = parser_input(args.input)
  File "linkfinder.py", line 98, in parser_input
    items = xml.etree.ElementTree.fromstring(open(args.input, "r").read())
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x85 in position 1: invalid start byte

$ python3 --version
Python 3.8.3
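
The traceback shows the file being opened in text mode, so Python decodes it as UTF-8 before the XML parser ever sees it. A possible workaround, sketched here with the hypothetical helper name `parse_xml_bytes`, is to hand the parser raw bytes and let it apply the encoding declared in the XML itself:

```python
import xml.etree.ElementTree as ET


def parse_xml_bytes(path):
    """Parse an XML file from raw bytes so the parser applies the declared encoding."""
    with open(path, "rb") as handle:
        return ET.fromstring(handle.read())
```

This only helps when the input really is XML. If the file is some other binary format, ET.fromstring will raise a ParseError instead, which at least points at the actual problem.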

error with ssl

Hi, I'm getting this error when I run the tool: invalid input defined or SSL error: <urlopen error [SSL] internal error
It was running fine right after installation, so maybe it has something to do with the urlopen call.
Thanks in advance.

[Feature Request] Add an argument to not open the browser if the output is set to html

Thank you for this tool, I use it all the time.

I'm creating a small script to monitor one of my targets and I need the HTML output.
Every time the tool finishes scanning, it opens a browser on the output file. My bash script loops through 400+ JS URLs, which opens 400+ browser tabs.

It would be great to have an option to not open the browser after the scan when the output is set to HTML.
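
Such an option could look like the sketch below. The --no-browser flag and the `maybe_open` helper are hypothetical names used only for illustration; -o/--output mirrors the option used in the commands on this page.

```python
import argparse
import webbrowser

parser = argparse.ArgumentParser()
parser.add_argument("-o", "--output", default="output.html")
# Hypothetical flag: save the HTML report without opening it.
parser.add_argument("--no-browser", action="store_true")


def maybe_open(args):
    """Open the saved report unless --no-browser was given; return True if opened."""
    if args.no_browser:
        return False
    webbrowser.open(args.output)
    return True


args = parser.parse_args(["-o", "res.html", "--no-browser"])
print(maybe_open(args))  # False
```

A loop over 400+ URLs would then pass the flag on every run and open the reports by hand when needed.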

invalid input defined or SSL error: 'module' object has no attribute 'PROTOCOL_TLSv1_2'

Hey,

I got this error: invalid input defined or SSL error: 'module' object has no attribute 'PROTOCOL_TLSv1_2'.

root@localhost:~/Documents/LinkFinder# python linkfinder.py -i https://example.com -d
Usage: python linkfinder.py [Options] use -h for help
Error: invalid input defined or SSL error: 'module' object has no attribute 'PROTOCOL_TLSv1_2'

Regards

[Tool run error] "DeprecationWarning: cgi.escape is deprecated, use html.escape instead"

Hi there,

When we try to run the tool with python3.7, we get the following warnings and the script doesn't run:

linkfinder.py:372: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  ''' % (cgi.escape(url), cgi.escape(url))
linkfinder.py:375: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  url = cgi.escape(endpoint["link"])
linkfinder.py:377: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  cgi.escape(url),
linkfinder.py:378: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  cgi.escape(url)
linkfinder.py:381: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  endpoint["context"]
linkfinder.py:384: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  cgi.escape(endpoint["link"]),
linkfinder.py:386: DeprecationWarning: cgi.escape is deprecated, use html.escape instead
  cgi.escape(endpoint["link"])

Cheers!

error in processing burp file

LinkFinder gives me an error when processing the Burp file. Can you please take a look and help me out? Thanks.

root@user:~/tools/LinkFinder# python3 linkfinder.py -i burpfile -b
Traceback (most recent call last):
  File "linkfinder.py", line 315, in <module>
    urls = parser_input(args.input)
  File "linkfinder.py", line 92, in parser_input
    items = xml.etree.ElementTree.fromstring(open(args.input, "r").read())
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd2 in position 10: invalid continuation byte

UnicodeDecodeError byte 0x9c

Given a Burp input file, the b64decode and decode('utf-8') calls seem to conflict if a 0x9c byte is present. This should be fixed.

MBP-van-Gerben:~ gerben$ linkfinder -i desktop/htmlbundle -b -o cli

Traceback (most recent call last):
  File "/Applications/Pentesting/Linkfinder/linkfinder.py", line 208, in <module>
    urls = parser_input(args.input)  
  File "/Applications/Pentesting/Linkfinder/linkfinder.py", line 108, in parser_input
    jsfiles.append({"js":base64.b64decode(item.find('response').text).decode('utf-8'), "url":item.find('url').text})
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9c in position 1741: invalid start byte
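
One tolerant approach is to decode with the "replace" error handler, so a stray byte becomes U+FFFD instead of aborting the run. This is a sketch of that idea, not necessarily the fix that was applied; `decode_response` is a hypothetical helper.

```python
import base64


def decode_response(b64_text):
    """Base64-decode a response body and decode it as UTF-8, replacing bad bytes."""
    raw = base64.b64decode(b64_text)
    return raw.decode("utf-8", "replace")


sample = base64.b64encode(b"a\x9cb").decode("ascii")
print(repr(decode_response(sample)))
```

Endpoints are mostly ASCII, so replacing the odd undecodable byte should rarely affect what the regex finds.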

Output HTML can't be saved due to UnicodeEncodeError exception: 'ascii' codec can't encode character u'\u2018'

Hi Gerben,

First things first: thank you for such a great tool :) I've just started to use it ;)

Unfortunately, I'm stuck on a UnicodeEncodeError thrown while the HTML output is being saved to the output file. Sorry for the redacted URL, but it comes from a private BB program and I can't disclose it; basically it's just a big, minified JavaScript file, full of juicy endpoint URLs.

Here's how I call your script:

bl4de:~/hacking/tools/LinkFinder $ ./linkfinder.py -i https://[REDACTED]/script.js

And the result is:

Traceback (most recent call last):
  File "./linkfinder.py", line 132, in <module>
    text_file.write(html)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 71461: ordinal not in range(128)
bl4de:~/hacking/tools/LinkFinder $

\u2018 is Unicode Character 'LEFT SINGLE QUOTATION MARK' (http://www.fileformat.info/info/unicode/char/2018/index.htm). When I looked for it in the JS file, I found it's just a regular \u0027 (apostrophe).

I found how to (probably) resolve such issues in this StackOverflow thread:

https://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20

For now, I've just made a quick workaround for this using try...except:

try:
    # "with" closes the file even if write() raises
    with open(args.output, "w") as text_file:
        text_file.write(html)
    print("URL to access output: file:///%s" % os.path.abspath(args.output))
except UnicodeEncodeError as e:  # "as" works on Python 2.6+ and Python 3
    print("Output can't be saved in {} due to exception: {}".format(args.output, e))

If you'd like, I can try to provide a PR with a fix for this issue :)

Regards,

bl4de
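
Beyond catching the exception, the write itself can be made to succeed by opening the file with an explicit encoding. A minimal sketch, with `save_html` as a hypothetical helper; io.open is used because it accepts an encoding argument on both Python 2 and Python 3 (on Python 2 the text must be a unicode object).

```python
import io


def save_html(path, html_text):
    """Write html_text to path as UTF-8 regardless of the platform default encoding."""
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(html_text)
```

With this, a character such as \u2018 is stored as its UTF-8 byte sequence instead of failing in the ascii codec.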

Regular expression problem

The regular expression may not be thorough enough; some URLs are not matched.
For example:
var url=get_path_url("?mod=admin&act=login&method=ajax);

If I put index.php in front of the "?", it matches.
For example:
var url=get_path_url("index.php?mod=admin&act=login&method=ajax);

Spider integration?

Great idea.

Coupling this with a spider or inside of Burp would be best for full scope testing =)

Linkfinder too fast to scan everything?

Hello there,

I just ran LinkFinder on 393.162 JS URLs; it took less than a second to run and found 2 endpoints. I don't think it checked all of them. What am I missing?

My command:

python3 linkfinder.py -i valid-js.txt -o cli

Issue finding the .js files in `-d` mode

Hi,

I found an issue in -d mode. I think the root cause is that LinkFinder's regex can't find the .js files referenced by the first URL.

The command is:

python3 linkfinder.py -i https://mywebsites/ -d -o cli

It doesn't return anything.

The HTML content of my website is:

<!DOCTYPE html><html><body><script src=/js/first-file.js></script><script src=/js/app.js></script></body></html>
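
Note that the src values in this page are unquoted, which is valid HTML but would be missed by any pattern that expects a quote delimiter. Whether that is the actual root cause is an assumption here. The sketch below shows a pattern that accepts double-quoted, single-quoted, and unquoted src values; `SCRIPT_SRC` and `find_script_sources` are hypothetical names, not part of LinkFinder.

```python
import re

# Matches src="...", src='...' and unquoted src=... inside a <script> tag.
SCRIPT_SRC = re.compile(
    r"""<script[^>]*?\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)


def find_script_sources(html_text):
    """Return the src value of every <script> tag, whatever its quoting style."""
    sources = []
    for match in SCRIPT_SRC.finditer(html_text):
        # Exactly one of the three groups is set for each match.
        sources.append(next(g for g in match.groups() if g is not None))
    return sources


page = ("<!DOCTYPE html><html><body><script src=/js/first-file.js></script>"
        "<script src=/js/app.js></script></body></html>")
print(find_script_sources(page))  # ['/js/first-file.js', '/js/app.js']
```

A real HTML parser would be more robust than a regex, but this keeps the example close to the tool's regex-based approach.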

TypeError: string indices must be integers, not str

python linkfinder.py -i https://www.fly.com/  -b -o cli
Traceback (most recent call last):
  File "linkfinder.py", line 326, in <module>
    file = url['js']
TypeError: string indices must be integers, not str

Possible Output.html Bug for Burp Files

Hey @GerbenJavado

  • I guess there is a bug: output.html displays "Burp file" as the a href value instead of the URL.

Python linkFinder.py -i burpjsfiles -b

  • Here is the output for different JS URLs from the Burp file.
<!DOCTYPE html>
<html>
<head>

  <style>
       h1 {
          font-family: sans-serif;
       }
       a {
          color: #000;
       }
       .text {
          font-size: 16px;
          font-family: Helvetica, sans-serif;
          color: #323232;
          background-color: white;
       }
       .container {
          background-color: #e9e9e9;
          padding: 10px;
          margin: 10px 0;
          font-family: helvetica;
          font-size: 13px;
          border-width: 1px;
          border-style: solid;
          border-color: #8a8a8a;
          color: #323232;
          margin-bottom: 15px;
       }
       .button {
          padding: 17px 60px;
          margin: 10px 10px 10px 0;
          display: inline-block;
          background-color: #f4f4f4;
          border-radius: .25rem;
          text-decoration: none;
          -webkit-transition: .15s ease-in-out;
          transition: .15s ease-in-out;
          color: #333;
          position: relative;
       }
       .button:hover {
          background-color: #eee;
          text-decoration: none;
       }
       .github-icon {
          line-height: 0;
          position: absolute;
          top: 14px;
          left: 24px;
          opacity: 0.7;
       }
  </style>
  <title>LinkFinder Output</title>
</head>
<body contenteditable="true">
  

           
            
            <h1>File: <a href="Burp file" target="_blank" rel="nofollow noopener noreferrer">Burp file</a></h1>
            <div><a href='/v1/report' class='text'>/v1/report</a><div class='container'>            $.post('<span style='background-color:yellow'>/v1/report</span>', {
</div></div>
            <h1>File: <a href="Burp file" target="_blank" rel="nofollow noopener noreferrer">Burp file</a></h1>
            <div><a href='/stream' class='text'>/stream</a><div class='container'>         * var source = new EventSource('<span style='background-color:yellow'>/stream</span>');
</div></div><div><a href='/basic/greeting.jst' class='text'>/basic/greeting.jst</a><div class='container'>         * var compiled = _.template('hello &lt;%= name %&gt;', null, { 'sourceURL': '<span style='background-color:yellow'>/basic/greeting.jst</span>' });
</div></div><div><a href='/lodash/template/source[' class='text'>/lodash/template/source[</a><div class='container'>            var sourceURL = '\n/*\n//# sourceURL=' + (options.sourceURL || '<span style='background-color:yellow'>/lodash/template/source[</span>' + (templateCounter++) + ']') + '\n*/';
</div></div><div><a href='/connect.safariextz' class='text'>/connect.safariextz</a><div class='container'>        return window.location = '<span style='background-color:yellow'>/connect.safariextz</span>';
</div></div>
  
  <a class='button' contenteditable='false' href='https://github.com/GerbenJavado/LinkFinder/issues/new' rel='nofollow noopener noreferrer' target='_blank'><span class='github-icon'><svg height="24" viewbox="0 0 24 24" width="24" xmlns="http://www.w3.org/2000/svg">
  <path d="M9 19c-5 1.5-5-2.5-7-3m14 6v-3.87a3.37 3.37 0 0 0-.94-2.61c3.14-.35 6.44-1.54 6.44-7A5.44 5.44 0 0 0 20 4.77 5.07 5.07 0 0 0 19.91 1S18.73.65 16 2.48a13.38 13.38 0 0 0-7 0C6.27.65 5.09 1 5.09 1A5.07 5.07 0 0 0 5 4.77a5.44 5.44 0 0 0-1.5 3.78c0 5.42 3.3 6.61 6.44 7A3.37 3.37 0 0 0 9 18.13V22" fill="none" stroke="#000" stroke-linecap="round" stroke-linejoin="round" stroke-width="2"></path></svg></span> Report an issue.</a>
</body>
</html>

Bug : File: Burp file

xdg-open: no method available for opening on WSL

Hi,

I have a problem when using this tool. By the way, I run it under WSL 2.

Couldn't get a file descriptor referring to the console
Warning: program returned non-zero exit code #1
Couldn't get a file descriptor referring to the console
Couldn't get a file descriptor referring to the console
xdg-open: no method available for opening 'file:////home/wayc0de/tools/LinkFinder/output.html'

Verifying xdg-settings:

wayc0de@DESKTOP-9C0TVKV:~/tools/LinkFinder$ xdg-settings get default-web-browser
chromium-browser.desktop

When I ran xdg-open output.html directly, output.html opened successfully in my Chrome.
