The RSS Funnel is a modular RSS processing pipeline. It is designed to modify existing RSS sources in various interesting ways, such as:
- Fetch full content
- Generate an RSS feed from an HTML page
- Remove unwanted elements from the article (using a CSS selector)
- Keep or remove articles matching keywords or patterns
- Highlight keywords in articles
- Perform rich DOM manipulation on the feed
- Redact or replace text in the article (using a regular expression)
- Split a single RSS article into multiple articles
- Run arbitrary JS code to transform the article
You can use the Docker image (latest version) in your docker-compose.yaml:
version: "3.8"
services:
  rss-funnel:
    image: ghcr.io/shouya/rss-funnel:latest
    ports:
      - 4080:4080
    volumes:
      - ./funnel.yaml:/funnel.yaml
    command: /rss-funnel -c /funnel.yaml server
Alternatively, you can build it directly from source:
git clone https://github.com/shouya/rss-funnel.git
cd rss-funnel
cargo build --release
To use rss-funnel, you need to supply a configuration file in YAML. Here is an example configuration:
endpoints:
  - path: /tokio-blog.xml
    note: Full text of Tokio blog
    source: https://tokio.rs/_next/static/feed.xml
    filters:
      - full_text: {}
      - simplify_html: {}

  - path: /solidot.xml
    note: Solidot news with links
    source: https://www.solidot.org/index.rss
    filters:
      - full_text: {}
      - keep_element: .p_mainnew
      - simplify_html: {}
      - sanitize:
          - replace_regex:
              from: "(?<link>http(s)?://[^< \n]*)"
              to: '<a href="$link">$link</a>'

  - path: /hackernews.xml
    note: Full text of Hacker News
    source: https://news.ycombinator.com/rss
    filters:
      - full_text:
          simplify: true
          append_mode: true
Save the above file to /path/to/funnel.yaml and run the following command:
rss-funnel -c /path/to/funnel.yaml server
You can optionally specify the bind address and port (default 127.0.0.1:4080). Detailed usage can be found in the --help output.
Endpoints such as http://127.0.0.1:4080/tokio-blog.xml should now be serving the filtered feeds.
Each configuration contains a number of endpoints. Each endpoint corresponds to an RSS feed.
Properties:
- path (required): The path of the endpoint. The path should start with /.
- note (optional): A note for the endpoint. Only used for display purposes.
- source (optional): The source URL of the RSS feed.
  - If not specified, you must specify the ?source=<url> query in the request. This allows for usages like applying the same filters to different feeds.
  - If the source points to an HTML page, rss-funnel will try to generate an RSS feed from the page with a single article. You can then use the split filter to split the single article into multiple articles. See Cookbook: Hacker News Top Links for an example.
- filters (required): A list of filters to apply to the feed.
  - The feed from the source goes through the filters in the order specified. You can think of each filter as corresponding to a transformation on the Feed.
  - Each filter is specified as a YAML object with the singleton key being the name of the filter and the value being the configuration of the filter.
    - For example, in the filter definition:
          - keep_element: .p_mainnew
      - the filter's name is keep_element
      - the configuration is the string value .p_mainnew. Depending on the filter, the configuration can have different types.
  - The Feed object from the last filter is returned as the response.
- client (optional): The configuration for the HTTP client used to fetch the source, such as the user_agent. See Client config for details.
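The properties above can be combined into an endpoint that has no fixed source and takes it from the request instead. The following is only a sketch: the path /full-text.xml, the note text, and the user agent string are hypothetical, and the exact shape of the client block is assumed from the user_agent option mentioned above (see Client config for the authoritative list of options).

```yaml
endpoints:
  # No `source` is set, so every request must supply ?source=<url>.
  - path: /full-text.xml                      # hypothetical path; must start with /
    note: Full text for any feed given via ?source=
    client:
      user_agent: "my-feed-reader/1.0"        # hypothetical value; assumed option shape
    filters:
      # Filters run top to bottom; the Feed from the last one is the response.
      - full_text: {}
      - simplify_html: {}
```

With the default bind address, a request such as http://127.0.0.1:4080/full-text.xml?source=https://news.ycombinator.com/rss would then apply the same two filters to whichever feed is named in the query. Depending on your HTTP client, the source URL may need to be URL-encoded.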
See Filters for the documentation of all available filters.
See Cookbook for some examples of using rss-funnel.