asciimoo / filtron
Filtering reverse HTTP proxy
License: GNU Affero General Public License v3.0
I have my searx instance set up with uwsgi. How can I target the uwsgi socket with filtron?
Sorry, my question is more related to nginx, but I haven't found a solution, and I know there are nginx users here who have already set up filtron behind an nginx reverse proxy ...
In my filtron rules I filter
"filters": ["Header:Connection=close"],
And in my nginx I have configured a reverse proxy
location /searx {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Scheme $scheme;
    proxy_set_header X-Script-Name /searx;
    proxy_pass http://127.0.0.1:4004/;
}
My HTTP client sends a Connection: keep-alive header to nginx, but the backend at http://127.0.0.1:4004 does not see this keep-alive header (the filter matches, and I have also debugged this).
I have exactly the same setup configured with Apache, where I haven't observed such problems. It seems that Apache passes all headers through to the proxied service. Why does nginx change the Connection header, and how can I persuade nginx to pass the header through and keep the connection alive?
I guess nginx sets 'Connection: close' since it does not hold the connection to the upstream, but I don't have a clue how to configure nginx to be transparent: if the client sets keep-alive, hold the line; if the client does not set a Connection header, close the line upstream.
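For context: Connection is a hop-by-hop header in HTTP/1.1, so a proxy is not expected to forward it unchanged, and nginx by default sets "Connection: close" on the requests it sends upstream. The Python sketch below only illustrates that behaviour; it is not nginx's code, and `forward_headers` with its `passthrough_connection` flag are hypothetical names chosen for this example.

```python
# Headers commonly treated as hop-by-hop: they describe one connection
# and are normally consumed by a proxy instead of being forwarded.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def forward_headers(client_headers, passthrough_connection=False):
    """Return the headers a proxy would send upstream (hypothetical helper)."""
    # Drop every hop-by-hop header the client sent.
    upstream = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in HOP_BY_HOP
    }
    # Look up the client's Connection header case-insensitively.
    client_connection = next(
        (v for k, v in client_headers.items() if k.lower() == "connection"),
        None,
    )
    if passthrough_connection and client_connection is not None:
        # "Transparent" mode: copy the client's value to the upstream request.
        upstream["Connection"] = client_connection
    else:
        # Default mode, mirroring what the question observes with nginx.
        upstream["Connection"] = "close"
    return upstream
```

In nginx terms, the pass-through case roughly corresponds to adding `proxy_http_version 1.1;` and `proxy_set_header Connection $http_connection;` to the location block. Treat that as a starting point to test, not a verified fix for this setup.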
It would be interesting to have:
To avoid infinite loops, "goto" can only jump to rules further down in rules.json, not above.
For now, it is possible to deny access when there are too many queries per second.
But if there is a continuous flood, it is not possible to do anything more.
With a "goto" it would be possible to:
Or:
It is up to the person writing rules.json to keep the selectors coherent.
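The "only go below" constraint could be checked when rules.json is loaded. The sketch below assumes a hypothetical "goto" field holding the name of the target rule and assumes rule names are unique; neither is part of filtron's current rule format.

```python
def validate_gotos(rules):
    """Check that every "goto" names a rule defined later in the list.

    "goto" is the hypothetical field proposed above. Returns a list of
    error strings, which is empty when all jumps point downwards.
    """
    # Map each rule name to its position in the file.
    position = {rule["name"]: index for index, rule in enumerate(rules)}
    errors = []
    for index, rule in enumerate(rules):
        target = rule.get("goto")
        if target is None:
            continue
        if target not in position:
            errors.append(f'{rule["name"]}: unknown goto target "{target}"')
        elif position[target] <= index:
            # A jump to the same rule or an earlier one could loop forever.
            errors.append(f'{rule["name"]}: goto "{target}" does not point below')
    return errors
```

Because every jump strictly increases the rule index, evaluation visits each rule at most once and always terminates.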
I am running Filtron in front of a Searx instance, with the recommended rules here. The problem is that if I do a search from any computer on my network I am presented with "Rate limit exceeded". I believe this is because all of the requests from within my network show as coming from my router, but I can't seem to figure out how to whitelist. Apologies if I'm wrong or this has been asked before, I tried checking issue requests first but couldn't find anything relevant.
$GOPATH is set up correctly at /usr/local/filtron/go-apps,
but when running go get github.com/asciimoo/filtron
I get the following output:
# github.com/klauspost/compress/flate
go-apps/src/github.com/klauspost/compress/flate/deflate.go:135:23: cannot convert d.window (type []byte) to type *[32768]byte
go-apps/src/github.com/klauspost/compress/flate/deflate.go:135:56: cannot convert d.window[windowSize:] (type []byte) to type *[32768]byte
go-apps/src/github.com/klauspost/compress/flate/fast_encoder.go:93:28: cannot convert e.hist (type []byte) to type *[32768]byte
go-apps/src/github.com/klauspost/compress/flate/fast_encoder.go:93:63: cannot convert e.hist[offset:] (type []byte) to type *[32768]byte
go-apps/src/github.com/klauspost/compress/flate/huffman_bit_writer.go:794:17: cannot convert w.literalFreq[:] (type []uint16) to type *[256]uint16
go-apps/src/github.com/klauspost/compress/flate/huffman_bit_writer.go:796:16: cannot convert w.literalFreq[256:] (type []uint16) to type *[32]uint16
I checked /usr/local/filtron/go-apps and there is no bin/filtron present, only src/filtron.
GO version: go1.13.5 linux/amd64
Hello,
I still have Google issues with searx despite using Filtron.
As @return42 said in searx/searx#729, I want to see the logs to be able to see if there is some more requests to block.
watch the log for a while and see if questionable requests get through. If so, improve your /etc/filtron/rules.json to block such requests.
This is what I do when my instance gets a CAPTCHA from Google (I had 2 issues in the last 6 months).
However, with the Docker image (set up with https://github.com/searx/searx-docker), I can't find a way to show or enable the logs.
I can see logs printed through the log.Println() function, but I'm not familiar enough with Go to understand how to set up, enable, or show the logs of all requests (if there is a way to see them).
Otherwise, I could enable the logs of my proxy (traefik, not caddy) to see all requests (that's easy to do, and maybe this is the right way).
Thanks for your help (and correct me if I should just enable the logs of my proxy) :)
I have now installed filtron via utils/filtron.sh install in searx, but now search URLs like
http://127.0.0.1:4004/search?q=
are shown in the searx Search URL box even though base_url is set.
The current selectors about the source IP and Host are
Behind a reverse proxy they are useless.
The HTTP header X-Forwarded-For makes it possible to filter on the real user IP:
"filters": ["Header:X-Forwarded-For=<ip>"]
In the searx-docker project, there is this rule:
{
"name": "searx.space",
"filters": ["Header:X-Forwarded-For=(2001:41d0:8:de3::1|176.31.252.227)"],
"stop": true,
"actions": [{ "name": "log"}]
},
It would be useful to have something like this:
{
"name": "searx.space",
"filters": ["Header:X-Forwarded-For=nslookup(check.searx.space)"],
"stop": true,
"actions": [{ "name": "log"}]
},
So even if check.searx.space resolves to different IPs at some point, the rule would still work as expected.
The nslookup function can resolve the IP addresses when filtron starts. The result of that function would be a regex similar to the first snippet.
Extended version: filtron re-resolves the IP addresses every day without a restart.
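To make the idea concrete, here is a minimal Python sketch of what such a function could produce. It is not filtron code (filtron is written in Go); `ips_to_regex` and this `nslookup` are illustrative names, and the dots are escaped so that they match literally, which the hand-written rule above does not do.

```python
import re
import socket


def ips_to_regex(addresses):
    """Build an alternation regex that matches exactly the given IPs."""
    unique = sorted(set(addresses))
    return "(" + "|".join(re.escape(address) for address in unique) + ")"


def nslookup(hostname):
    """Resolve hostname to its IPv4/IPv6 addresses and return the regex.

    Needs working DNS; raises socket.gaierror if the lookup fails.
    """
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return ips_to_regex(info[4][0] for info in infos)
```

The filter string would then be built as `"Header:X-Forwarded-For=" + nslookup("check.searx.space")`. The daily refresh from the extended version would amount to calling the function again on a timer and swapping in the new pattern.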
When you specify a selector using nslookup(...) Filtron performs the DNS lookup while parsing the rules.json.
If the DNS lookup fails the following error message is shown:
Cannot parse rules: Cannot parse selector '..........': invalid expression
This implies an error with the rules.json itself and is very misleading.
In my opinion there should be a separate error message for this case.
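One way to separate the two failure modes is to raise a different error type for each. The Python sketch below illustrates the idea under assumed names (`SelectorSyntaxError`, `SelectorLookupError`, `expand_selector`); it is not how filtron's Go parser is structured, and the resolver is passed in so the example runs without network access.

```python
import re


class SelectorSyntaxError(Exception):
    """The selector text itself is malformed."""


class SelectorLookupError(Exception):
    """The selector is well formed, but the DNS lookup failed."""


NSLOOKUP = re.compile(r"^(?P<prefix>[^=]+=)nslookup\((?P<host>[^()]+)\)$")


def expand_selector(selector, resolve):
    """Expand nslookup(host) using resolve(host) -> list of IP strings."""
    if "nslookup(" not in selector:
        return selector
    match = NSLOOKUP.match(selector)
    if match is None:
        # Genuine syntax problem in rules.json.
        raise SelectorSyntaxError(
            f"Cannot parse selector '{selector}': invalid expression"
        )
    host = match.group("host")
    try:
        addresses = resolve(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        # The rule is fine; only the lookup failed, so say so.
        raise SelectorLookupError(
            f"Cannot resolve '{host}' in selector '{selector}': {exc}"
        ) from exc
    alternatives = "|".join(re.escape(address) for address in addresses)
    return match.group("prefix") + "(" + alternatives + ")"
```

With this split, a caller can report "DNS lookup failed" separately from "rules.json is invalid", and could even decide to retry the lookup later instead of refusing to start.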
Hi,
if I set
...
"name": "search request",
"filters": ["Param:q", "Path=^(/|/search)$"],
"interval": 60,
"limit": 1,
...
in my rules.json
I would expect that I can only make 1 search request per minute, but actually I can make as many as I want (as if I were not using filtron at all).
Did I understand something wrong?
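The expectation described here can be written down as a fixed-window counter. The sketch below is an assumption about the intended semantics of "interval" and "limit", not filtron's actual implementation, and `WindowCounter` is a name made up for the example.

```python
class WindowCounter:
    """Counts matching requests per fixed window of `interval` seconds."""

    def __init__(self, interval, limit):
        self.interval = interval
        self.limit = limit
        self.window_start = None
        self.count = 0

    def exceeded(self, now):
        """Register one matching request at time `now` (in seconds).

        Returns True when this request pushes the count above `limit`,
        which is the point where a rule's actions would be expected to run.
        """
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a new window and forget the old count.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count > self.limit
```

With `WindowCounter(60, 1)`, the first request in a minute passes and the second one exceeds the limit. Exceeding the limit only has a visible effect if something reacts to it, so it may be worth checking whether the elided part of the rule contains an "actions" entry that blocks the request; that part is not shown above, so this is only a guess.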
Hi there, I am trying to use this in my own docker-compose file, and I noticed that I cannot use the DNS name of the target, like hostname:8080, in the docker-compose command, even if I have links and depends_on pointing to the container whose hostname I would like to use. It would be neat if DNS resolution worked in the command that starts filtron.
Here is the relevant portion of the compose file:
filtron:
  container_name: filtron
  image: dalf/filtron
  hostname: filtron
  restart: always
  network_mode: bridge
  links:
    - searx
  depends_on:
    - searx
  command: -listen 0.0.0.0:4040 -api 0.0.0.0:4041 -target "searx:8888"
  volumes:
    - filtron-data:/etc/filtron:rw
  read_only: true
A dummy action could be useful when the user wants to whitelist something. For example, in https://github.com/paulgoio/filtron/blob/main/src/rules.json#L9 I am allowing the /image_proxy path and stopping after that (so that the internal image proxy does not get rate limited by the rule that follows). I would like to just stop there without logging and have the rules underneath apply to everything except the /image_proxy path. A dummy action that does nothing could be the solution.
Maybe this can be implemented by adding a new case at: https://github.com/asciimoo/filtron/blob/master/action/action.go#L45 ?
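As a rough illustration of the proposal, the sketch below dispatches actions by name and includes a "noop" entry. It is written in Python for brevity and does not mirror filtron's action.go; the "noop" name, the handler functions and the "continue"/"block" return values are all assumptions made for this example.

```python
def log_action(request):
    """Print the matched request and let it pass."""
    print(f"matched: {request}")
    return "continue"


def block_action(request):
    """Reject the request."""
    return "block"


def noop_action(request):
    """Do nothing: lets a rule match and stop without side effects."""
    return "continue"


# Hypothetical registry; "noop" is the proposed addition.
ACTIONS = {"log": log_action, "block": block_action, "noop": noop_action}


def run_actions(action_specs, request):
    """Run each configured action; the request is blocked if any action blocks."""
    outcome = "continue"
    for spec in action_specs:
        handler = ACTIONS.get(spec["name"])
        if handler is None:
            raise ValueError(f"unknown action: {spec['name']}")
        if handler(request) == "block":
            outcome = "block"
    return outcome
```

A whitelist rule for /image_proxy could then list only the no-op action together with "stop": true, so the request passes silently and the rate-limiting rules below are never reached.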
Recently filtron started spamming my logs while I was using it with searx. The error message is always the same (see the following example). I already updated searx and filtron to the latest master, to no avail. Trying filtron 0.2.0 also didn't fix it.
Dec 20 07:36:45 v220191283267104968 filtron[19754]: 2020/12/20 07:36:45 Response error: error when reading response headers: cannot parse response status code: unexpected first char found. Expecting 0-9. Response "outdoor sconce y&categori
es=general HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 18452\r\nServer-Timing: total;dur=868.159, total_0_wp;dur=29.639, total_1_sp;dur=71.768, total_2_bi;dur=223.456, total_3_wd;dur=828.235, load_0_wp;dur=
28.949, load_1_sp;dur=64.846, load_2_bi;dur=216.94, load_3_wd;dur=822.593\r\n\r\n<!DOCTYPE html>\n<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"en\" xml:lang=\"en\">\n<head>\n <meta charset=\"UTF-8\" />\n <meta name=\"descripti
on\" content=\"searx - a privacy-respecting, hackable metasearch engine\" />\n <meta name=\"keywords\" content=\"searx, search, search engine, metasearch, meta search\" />\n <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\">\n
<meta name=\"generator\" content=\"searx/0.18.0\">\n <meta name=\"referrer\" content=\"no-referrer\">\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1 , maximum-scale=2.0, user-scalable=1\" />\n <link re
l=\"alternate\" type=\"application/rss+xml\" title=\"Searx search: white\" href=\"https://search.mdosch.de/search?q=white&categories=general&language=en&format=rss\"> <script src=\"https://search.mdosch.de/translations.js\">
</script>\n <title>white - search.mdosch.de</title>\n <link rel=\"stylesheet\" href=\"https://search.mdosch.de/static/css/bootstrap.min.css\" type=\"text/css\" />\n <link rel=\"stylesheet\" href=\"https://search.mdosch.de/static/t
hemes/oscar/css/logicodev.min.css\" type=\"text/css\" />\n <link rel=\"stylesheet\" href=\"https://search.mdosch.de/static/themes/oscar/css/leaflet.min.css\" type=\"text/css\" />\n <!-- HTML5 Shim and Respond.js IE8 support of HTML5
elements and media queries -->\n <!--[if lt IE 9]>\n <script src=\"https://search.mdosch.de/static/js/html5shiv.min.js\"></script>\n <script src=\"https://search.mdosch.de/static/js/respond.min.js\"></script>\n <![endif]-->
\n\n <link rel=\"shortcut icon\" href=\"https://search.mdosch.de/static/themes/oscar/img/favicon.png\" />\n\n\n <link title=\"search.mdosch.de\" type=\"application/opensearchdescription+xml\" rel=\"search\" href=\"/opensearch.xml?met
hod=POST&autocomplete=\"/>\n <noscript>\n <style type=\"text/css\">\n .tab-content > .active_if_nojs, .active_if_nojs {display: block !important; visibility: visible !important;}\n .margin_top_if_nojs {m
argin-top: 20px;}\n .hide_if_nojs {display: none !important;overflow: hidden !important;}\n .disabled_if_nojs {pointer-events: none; cursor: default; text-decoration: line-through;}\n </style>\n </noscript>\
n</head>\n<body class=\"results_endpoint\" >\n<div class=\"searx-navbar\"><span class=\"instance pull-left\"><a href=\"https://search.mdosch.de\">search.mdosch.de</a></span><span class=\"pull-right\"><a href=\"https://search.mdosch.de/abou
t\">about</a><a href=\"https://search.mdosch.de/preferences\">preferences</a></span></div>\n <div class=\"container\">\n\n\n<form method=\"POST\" action=\"https://search.mdosch.de/search\" id=\"search_form\" role=\"search\">\n <div cla
ss=\"row\">\n <div class=\"col-xs-12 col-md-8\">\n <div class=\"input-group search-margin\">\n <input type=\"search\" autofocus name=\"q\" class=\"form-control\" id=\"q\" placeholder=\"Search for...\" aria-label=\"Search for
...\" autocomplete=\"off\" value=\"white\" accesskey=\"s\">\n <span class=\"input-group-btn\">\n <button type=\"submit\" class=\"btn btn-default\" aria-label=\"Start search\"><span class=\"hide_if_nojs\"><span title=\"\"
class=\"glyphicon glyphicon-search\"></span></span><span class=\"hidden active_if_nojs\">Start search</span></button>\n\t <button type=\"button\" id=\"clear_search\" class=\"btn btn-default hide_if_nojs\" aria-label=\"Clear search\"><sp
an title=\"\" class=\"glyphicon glyphicon-remove\"></span></button>\n </span>\n </div>\n </div>\n <div class=\"col-xs-6 col-md-2 search-margin\"><label class=\"visually-hidden\" for=\"time-range\">Time range</label>\n<sel
ect name=\"time_range\" id=\"time-range\" class=\"custom-select form-control\" accesskey=\"t\"><option id=\"time-range-anytime\" value=\"\" selected>Anytime</option><option id=\"time-range-day\" value=\"day\" >Last day</option><option id=\
"time-range-week\" value=\"week\" >Last week</option><option id=\"time-range-month\" value=\"month\" >Last month</option><option id=\"time-range-year\" value=\"year\" >Last year</option></select></div>\n <div class=\"col-xs-6 col-md-2 s
earch-margin\"><label class=\"visually-hidden\" for=\"language\">Language</label>\n<select class=\"language form-control custom-select\" id=\"language\" name=\"language\" accesskey=\"l\">\n <option value=\"all\" >Default language</option>
<option value=\"af-ZA\" >Afrikaans- af-ZA</option><option value=\"ca-ES\" >Català- ca-ES</option><option value=\"da-DK\" >Dansk- da-DK</option><option value=\"de\" >Deutsch- de</option><option value=\"de-AT\" >Deutsch(Österreich) - de-AT</
option><option value=\"de-CH\" >Deutsch(Schweiz) - de-CH</option><option value=\"de-DE\" >Deutsch(Deutschland) - de-DE</option><option value=\"et-EE\" >Eesti- et-EE</option><option value=\"en\" selected=\"selected\">English- en</option><op
tion value=\"en-AU\" >English(Australia) - en-AU</option><option value=\"en-CA\" >English(Canada) - en-CA</option><option value=\"en-GB\" >English(United Kingdom) - en-GB</option><option value=\"en-IE\" >English(Ireland) - en-IE</option><o
ption value=\"en-IN\" >English(India) - en-IN</option><option value=\"en-NZ\" >English(New Zealand) - en-NZ</option><option value=\"en-PH\" >English(Philippines) - en-PH</option><option value=\"en-SG\" >English(Singapore) - en-SG</option><
option value=\"en-US\" >English(United States) - en-US</option><option value=\"es\" >Español- es</option><option value=\"es-AR\" >Español(Argentina) - es-AR</option><option value=\"es-CL\" >Español(Chile) - es-CL</option><option value=\"es
-ES\" >Español(España) - es-ES</option><option value=\"es-MX\" >Español(México) - es-MX</option><option value=\"fr\" >Français- fr</option><option value=\"fr-BE\" >Français(Belgique) - fr-BE</option><option value=\"fr-CA\" >Français(Canada
) - fr-CA</option><option value=\"fr-CH\" >Français(Suisse) - fr-CH</option><option value=\"fr-FR\" >Français(France) - fr-FR</option><option value=\"hr-HR\" >Hrvatski- hr-HR</option><option value=\"id-ID\" >Indonesia- id-ID</option><optio
n value=\"it-IT\" >Italiano- it-IT</option><option value=\"sw-TZ\" >Kiswahili- sw-TZ</option><option value=\"lv-LV\" >Latviešu- lv-LV</option><option value=\"lt-LT\" >Lietuvių- lt-LT</option><option value=\"hu-HU\" >Magyar- hu-HU</option><
option value=\"ms-MY\" >Melayu- ms-MY</option><option value=\"nl\" >Nederlands- nl</option><option value=\"nl-BE\" >Nederlands(België) - nl-BE</option><option value=\"nl-NL\" >Nederlands(Nederland) - nl-NL</option><option value=\"nb-NO\" >
Norsk Bokmål- nb-NO</option><option value=\"pl-PL\" >Polski- pl-PL</option><option value=\"pt\" >Português- pt</option><option value=\"pt-BR\" >Português(Brasil) - pt-BR</option><option value=\"pt-PT\" >Português(Portugal) - pt-PT</option>
<option value=\"ro-RO\" >Română- ro-RO</option><option value=\"sk-SK\" >Slovenčina- sk-SK</option><option value=\"sl-SI\" >Slovenščina- sl-SI</option><option value=\"sr-RS\" >Srpski- sr-RS</option><option value=\"fi-FI\" >Suomi- fi-FI</opt
ion><option value=\"sv-SE\" >Svenska- sv-SE</option><option value=\"vi-VN\" >Tiếng Việt- vi-VN</option><option value=\"tr-TR\" >Türkçe- tr-TR</option><option value=\"is-IS\" >Íslenska- is-IS</option><option value=\"cs-CZ\" >Čeština- cs-CZ<
/option><option value=\"el-GR\" >Ελληνικά- el-GR</option><option value=\"be-BY\" >Беларуская- be-BY</option><option value=\"bg-BG\" >Български- bg-BG</option><option value=\"ru-RU\" >Русский- ru-RU</option><option value=\"uk-UA\" >Українсь
ка- uk-UA</option><option value=\"hy-AM\" >Հայերեն- hy-AM</option><option value=\"he-IL\" >עברית- he-IL</option><option value=\"ar-EG\" >العربية- ar-EG</option><option value=\"fa-IR\" >فارسی- fa-IR</option><option value=\"th-TH\" >ไทย- th-
TH</option><option value=\"zh\" >中文- zh</option><option value=\"zh-CN\" >中文(**) - zh-CN</option><option value=\"zh-TW\" >中文(台灣) - zh-TW</option><option value=\"ja-JP\" >日本語- ja-JP</option><option value=\"ko-KR\" >한국어- ko-KR
</option></select></div>\n </div>\n <div class=\"row\">\n <div class=\"col-sm-12\"><div id=\"categories\"><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_general\" name=\"category_general\" checked=\"checked\" /><label for=\"c
heckbox_general\">general</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_files\" name=\"category_files\" /><label for=\"checkbox_files\">files</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_images\" name=\"
category_images\" /><label for=\"checkbox_images\">images</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_it\" name=\"category_it\" /><label for=\"checkbox_it\">it</label><input class=\"hidden\" type=\"checkbox\" id=\"chec
kbox_map\" name=\"category_map\" /><label for=\"checkbox_map\">map</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_music\" name=\"category_music\" /><label for=\"checkbox_music\">music</label><input class=\"hidden\" type=\
"checkbox\" id=\"checkbox_news\" name=\"category_news\" /><label for=\"checkbox_news\">news</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_science\" name=\"category_science\" /><label for=\"checkbox_science\">science</lab
el><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_social_media\" name=\"category_social media\" /><label for=\"checkbox_social_media\">social media</label><input class=\"hidden\" type=\"checkbox\" id=\"checkbox_videos\" name=\"ca
tegory_videos\" /><label for=\"checkbox_videos\">videos</label></div></div>\n </div>\n</form><!-- / #search_form_full -->\n <div class=\"row\">\n <div class=\"col-sm-4 col-sm-push-8\" id=\"sidebar_results\">\n\n\n<div class=\"p
anel panel-default infobox\">\n <div class=\"panel-heading\"><div class=\"infobox_part\">\n <div class=\"pull-right\">\n <span class=\"label label-default\">wikidata</span>\n <span class=\"label
label-default\">wikipedia</span>\n </div>\n <h4 class=\"panel-title\"><bdi>White</bdi></h4> </div>\n </div>\n <input type=\"checkbox\" class=\"infobox_checkbox\" id=\"expand_infobox_wikidata\" hidden>\n
<div class=\"panel-body infobox_body\">\n<img class=\"img-responsive center-block infobox_part\" src=\"https://proxy.mdosch.de?mortyurl=https%3A%2F%2Fcommons.wikimedia.org%2Fwiki%2FSpecial%3AFilePath%2FColor%2520icon%2520white.svg%3Fwidt
h%3D500%26height%3D400&mortyhash=df36908a11108020603b7a32f7fbcc1f5c73fe9bd8bce1d3b656caaee7f8ad17\" />\n<bdi><p class=\"infobox_part\">White is the lightest color and is achromatic. It is the color of fresh snow, chalk and milk, and is
the opposite of black. White objects fully reflect and scatter all the visible wavelengths of light. White on television and computer screens is created by a mixture of red, blue and green light. In everyday life, whiteness is often confe
rred with white pigments, especially titanium dioxide, of which is produced more than 3,000,000 tons per year.</p></bdi>\n\n<div class=\"infobox_part\">\n<bdi><p class=\"btn btn-default btn-xs\"><a href=\"https://en.wikipedia.org/wiki/Whit
e\" rel=\"noreferrer\">Wikipedia</a></p>\n<p class=\"btn btn-default btn-xs\"><a href=\"http://www.wikidata.org/entity/Q23444\" rel=\"noreferrer\">Wikidata</a></p>\n</bdi></div>\n </div>\n <label for=\"expand_infobox_wikidata\" class
=\"infobox_toggle panel-footer\">\n <span class=\"infobox_label_down glyphicon glyphicon-chevron-down\"></span>\n <span class=\"infobox_label_up glyphicon glyphicon-chevron-up\"></span>\n </label>\n</div>\n\n\n\n
<div class=\"panel panel-default\">\n <div class=\"panel-heading\"><h4 class=\"panel-title\">Links</h4></div>\n <div class=\"panel-body\">\n <form role=\"form\"><div class=\"form-group\"><
label for=\"search_url\">Search URL</label><input id=\"search_url\" type=\"url\" class=\"form-control select-all-on-click cursor-text\" name=\"search_url\" value=\"https://search.mdosch.de/search?q=white&categories=general&language
=en\" readonly></div></form>\n <label>Download results</label>\n <div class=\"clearfix\"></div>\n <form method=\"POST\" action=\"https://search.mdosch.de/search\" class=\"form-inline
pull-left result_download\"><input type=\"hidden\" name=\"category_general\" value=\"1\"/><input type=\"hidden\" name=\"q\" val". Buffer size=12288, contents: "outdoor sconce y&categories=general HTTP/1.1 200 OK\r\nContent-Type: text/html
; charset=utf-8\r\nContent-Length: 18452\r\nServer-Timing: total;dur=868.159, total_0_wp;dur=29.639, total_1_sp;dur=71.768, tota"..." <form method=\"POST\" action=\"https://search.mdosch.de/search\" class=\"form-inline pull-left result_
download\"><input type=\"hidden\" name=\"category_general\" value=\"1\"/><input type=\"hidden\" name=\"q\" val" HTTP/1.1 200 OK
Dec 20 07:36:45 v220191283267104968 filtron[19754]: Date: Sun, 20 Dec 2020 06:36:44 GMT
Dec 20 07:36:45 v220191283267104968 filtron[19754]: Content-Length: 0
Dec 20 07:36:45 v220191283267104968 filtron[19754]: [1B blob data]
Hi,
I usually receive update notifications via RSS through blogtrottr.com, but for https://github.com/asciimoo/filtron/releases/tag/v0.1.0 the description just says "No content." So I'm a little confused: were there any changes in v0.1.0, or is it just the current development state tagged as a "release"?
Is there a reason why the release binary is not statically compiled?
Filtron is unable to bind to or communicate with IPv6 addresses.
The issue lies with the bundled fasthttp. Its maintainers are aware that the lack of IPv6 support affects performance (because most operating systems prefer IPv6 over IPv4 these days) but have no plans to implement it.