edwardbetts / find_link
Search for pages on Wikipedia and add links
Forgive my laziness, but I feel like there's a missing paragraph or first step explaining which context this would be initiated from. Probably a one-sentence explanation for a noob who can read the code and see the idea but, as a Windows noob, isn't sure where the literal first step is. Any insight, or a URL with prerequisite reading, would be appreciated and could possibly result in donations to you lmao.
Cheers
Find link output pages are included in internet search engine results, as can be seen in DuckDuckGo search results.
This is undesirable for several reasons:
The problem can be resolved by adding the following lines to your robots.txt (robots exclusion standard) file at http://edwardbetts.com/robots.txt:
Disallow: /find_link/
Allow: /find_link/$
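As a rough illustration of how these two rules would interact, here is a minimal sketch. It assumes a crawler that follows the longest-match precedence and the `$` end-of-path anchor described in RFC 9309; not every crawler does. The function names and the RULES list are hypothetical and are not part of find_link.

```python
import re

# The two rules proposed above, as (directive, path pattern) pairs.
RULES = [("disallow", "/find_link/"), ("allow", "/find_link/$")]


def rule_to_regex(pattern):
    """Turn a robots.txt path pattern ('*' wildcard, trailing '$' anchor) into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Longest matching pattern wins; on a tie, 'allow' wins. No match means allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for path in ("/find_link/", "/find_link/Brentford", "/about"):
        print(path, is_allowed(path, RULES))
```

Under these assumptions the tool's front page stays crawlable while individual result pages do not.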
The tool often finds the search term inside longer text that is already a link.
Searching: Foo
The article contains [[Something Foo bar]]
The tool offers to replace it with Something [[Foo]] bar
There should be a checkbox for ignoring these cases.
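One way such a filter might work, as a minimal sketch: collect the spans of existing [[...]] links and skip any match that falls inside one. The function name and the simplified link regex are assumptions for illustration (nested links and templates are not handled), and this is not how find_link itself is implemented.

```python
import re

# Simplified: matches [[...]] with no nested brackets.
WIKILINK = re.compile(r"\[\[[^\[\]]*\]\]")


def unlinked_matches(wikitext, phrase):
    """Return start offsets of `phrase` occurrences that are not inside an existing link."""
    linked_spans = [m.span() for m in WIKILINK.finditer(wikitext)]
    offsets = []
    for m in re.finditer(re.escape(phrase), wikitext):
        inside = any(start <= m.start() and m.end() <= end
                     for start, end in linked_spans)
        if not inside:
            offsets.append(m.start())
    return offsets


if __name__ == "__main__":
    text = "See [[Something Foo bar]] and Foo."
    print(unlinked_matches(text, "Foo"))
```

Here only the second, unlinked "Foo" would be offered as a candidate.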
Example:
First of all thanks for the great tool!
In my opinion, disambiguation pages should not contain ''additional links'', and should therefore not appear in the search results. If you do not want to exclude them completely, maybe they should get a label?
Other labels could be for results that are within certain tags, like -tags.
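For the disambiguation label, one possible approach is the MediaWiki Action API's pageprops query, where disambiguation pages carry a "disambiguation" page property. The sketch below only builds the request parameters and parses a response, assuming format=json with formatversion=2. The helper names are hypothetical and the sample response is made-up illustrative data, not real API output.

```python
def disambiguation_query_params(titles):
    """Parameters for asking the API which of `titles` are disambiguation pages."""
    return {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "disambiguation",
        "titles": "|".join(titles),
        "format": "json",
        "formatversion": "2",
    }


def disambiguation_titles(api_response):
    """Return the set of titles whose pageprops include 'disambiguation'."""
    pages = api_response.get("query", {}).get("pages", [])
    return {p["title"] for p in pages
            if "disambiguation" in p.get("pageprops", {})}


if __name__ == "__main__":
    # Illustrative response shape only.
    sample = {"query": {"pages": [
        {"title": "Page A", "pageprops": {"disambiguation": ""}},
        {"title": "Page B"},
    ]}}
    print(disambiguation_titles(sample))
```

Results whose title is in the returned set could then be labelled or hidden.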
https://edwardbetts.com/find_link/Daniel_Delgadillo
This seems like something that could be resolved, unlike other false positives?
This URL fails: https://edwardbetts.com/find_link/Brentford
Need to test with gunicorn. It doesn't fail when run from the Flask dev server.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/flask/app.py", line 1982, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python3/dist-packages/flask/app.py", line 1614, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python3/dist-packages/flask/app.py", line 1517, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3/dist-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/flask/app.py", line 1612, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3/dist-packages/flask/app.py", line 1598, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/edward/find_link/find_link/view.py", line 110, in findlink
    ret = p.runcall(do_search, q, redirect_to)
  File "/usr/lib/python3.5/cProfile.py", line 109, in runcall
    return func(*args, **kw)
  File "/home/edward/find_link/find_link/core.py", line 112, in do_search
    longer = find_longer(q, search, articles) if len(q) > 6 else None
  File "/home/edward/find_link/find_link/core.py", line 43, in find_longer
    more_articles, more_redirects = wiki_backlink(doc['title'])
  File "/home/edward/find_link/find_link/api.py", line 211, in wiki_backlink
    ret = api_get(params)
  File "/home/edward/find_link/find_link/api.py", line 54, in api_get
    r = s.get(get_query_url(), params=params)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 350, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 837, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 330, in connect
    cert = self.sock.getpeercert()
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 324, in getpeercert
    'subjectAltName': get_subj_alt_name(x509)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 197, in get_subj_alt_name
    for name in ext.get_values_for_type(x509.DNSName)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 197, in <listcomp>
    for name in ext.get_values_for_type(x509.DNSName)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 153, in _dnsname_to_stdlib
    name = idna_encode(name)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 150, in idna_encode
    return prefix.encode('ascii') + idna.encode(name)
  File "/usr/lib/python3/dist-packages/idna/core.py", line 355, in encode
    result.append(alabel(label))
  File "/usr/lib/python3/dist-packages/idna/core.py", line 276, in alabel
    check_label(label)
  File "/usr/lib/python3/dist-packages/idna/core.py", line 253, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+0027 at position 2 of "b'org'" not allowed
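The label "b'org'" in the final line looks like the repr of a bytes object that was turned into text with str() somewhere in the stack. That is an inference from the message, not a confirmed diagnosis. The sketch below reproduces that pattern with the standard library only; label_to_text is a hypothetical helper, not a function from urllib3 or idna.

```python
def label_to_text(label):
    """Decode a bytes label to text instead of stringifying its repr."""
    if isinstance(label, bytes):
        return label.decode("ascii")
    return label


# On Python 3, str() of bytes yields the repr, quotes included.
broken = str(b"org")
fixed = label_to_text(b"org")

if __name__ == "__main__":
    print(repr(broken), hex(ord(broken[1])))
    print(repr(fixed))
```

The second character of the stringified value is an apostrophe (U+0027), which matches the codepoint and position reported in the error.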
Thanks for this very useful tool.
It would be more user-friendly to have a simple "Add link" button next to the results list. It is possible to edit pages on behalf of the user using OAuth (retrieve the user's token, then use this token to send the edit to the wiki server).
https://www.mediawiki.org/wiki/OAuth/For_Developers
There is already a reference to this in the codebase, but I guess it was just testing until now.
Line 266 in c1c4662
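As a minimal sketch of the edit step only: once the request is authorised via OAuth and a CSRF token has been fetched (action=query with meta=tokens), the edit itself is a POST to the Action API with parameters like the ones below. The helper name and default summary are hypothetical, and OAuth signing is deliberately left out.

```python
def build_edit_params(title, new_text, csrf_token, summary="Add link"):
    """POST parameters for a MediaWiki action=edit request (OAuth signing not shown)."""
    return {
        "action": "edit",
        "title": title,
        "text": new_text,      # full replacement wikitext for the page
        "summary": summary,
        "token": csrf_token,   # CSRF token obtained via action=query&meta=tokens
        "format": "json",
    }


if __name__ == "__main__":
    params = build_edit_params("Example", "Something [[Foo]] bar", "dummy-token")
    print(params["action"], params["title"])
```

An "Add link" button could submit these parameters on the user's behalf after they have authorised the tool.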
Please make this useful tool work for non-Wikipedia Wikimedia projects, such as Wikispecies, Wikisource, Commons, Wikiquote, etc.
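One small piece of this would be choosing the API endpoint per project. The sketch below is a hypothetical helper, not the tool's existing get_query_url(); it assumes the standard Wikimedia host layout, where some projects have per-language sites and others (Commons, Wikispecies) are single sites.

```python
# Projects with one site per language, e.g. en.wikisource.org.
LANGUAGE_PROJECTS = {"wikipedia", "wikisource", "wikiquote"}

# Projects hosted as a single multilingual site.
SINGLE_SITE_HOSTS = {
    "commons": "commons.wikimedia.org",
    "wikispecies": "species.wikimedia.org",
}


def query_url_for(project, lang="en"):
    """Return the Action API URL for a Wikimedia project."""
    if project in SINGLE_SITE_HOSTS:
        host = SINGLE_SITE_HOSTS[project]
    elif project in LANGUAGE_PROJECTS:
        host = "{}.{}.org".format(lang, project)
    else:
        raise ValueError("unknown project: {!r}".format(project))
    return "https://{}/w/api.php".format(host)


if __name__ == "__main__":
    for name in ("wikipedia", "wikiquote", "wikispecies", "commons"):
        print(name, query_url_for(name))
```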