slurdge / goeland

An alternative to rss2email written in golang with many filters

License: MIT License

Makefile 1.38% Batchfile 0.93% Go 84.75% Python 2.58% HTML 5.24% Dockerfile 0.77% CSS 4.35%
rss2email rss rss-feed-scraper email-sender go golang rss-feed pipes hacktoberfest hacktoberfest2022

goeland's Introduction

goeland

An RSS-to-email tool, à la rss2email, written in Go.

Support this project by giving it a ⭐️ and sharing it.

About

Goeland excels at creating beautiful emails from RSS feeds, tailored for daily or weekly digests.

It includes a number of filters (see below) that can transform the RSS content along the way. It can also consume other sources, such as Imgur tags.

Goeland transforms this...

<rss version="2.0">
<channel>
<title>Phoronix</title>
<link>https://www.phoronix.com/</link>
<description>
Linux Hardware Reviews, Benchmarks &amp; Open-Source News
</description>
<language>en-us</language>
<item>
<title>
Google Announces KataOS As Security-Focused OS, Leveraging Rust &amp; seL4 Microkernel
</title>
<link>https://www.phoronix.com/news/Google-KataOS</link>
<guid>https://www.phoronix.com/news/Google-KataOS</guid>
<description>
Google this week has announced the release of KataOS as their newest operating system effort focused on embedded devices running ambient machine learning workloads. KataOS is security-minded, exclusively uses the Rust programming language, and is built atop the seL4 microkernel as its foundation...
</description>
<pubDate>Sun, 16 Oct 2022 06:10:25 -0400</pubDate>
</item>
</channel>
</rss>

into this

(screenshot of the resulting email)

Goeland has a one-size-fits-all default template that works well with mobile, tablet, desktop and webmail clients.

Goeland can extract the full text from most article sources, producing a ready-to-consume email.

Status

Goeland is used in production with many email clients, and has sent thousands of emails. It is considered stable.

Installation

Grab the latest binary release from the release page. Binaries are available for the following platforms:

  • linux/386
  • linux/amd64
  • linux/arm
  • linux/arm64
  • darwin/amd64
  • windows/amd64
  • windows/386

Put it in a folder where you have write permissions and run it for the first time with:

goeland run

If you would like another platform to be supported, please open a PR or submit a feature request.

Usage

On first run, if it doesn't exist yet, goeland will create a config.toml with the default values. You need to adjust the [email] section with your SMTP server details. The config values can also be set with environment variables (e.g. GOELAND_EMAIL_PASSWORD_FILE=/path/to/pass).

Sources

Afterwards, fill in the [sources] and [pipes] sections. Sources are identified by the name that follows the [sources.] prefix:

[sources.hackernews]
type = "feed"
url = "https://hnrss.org/newest"
filters = ["all", "today"]

You can then refer to this source as 'hackernews' in your pipes.

The different source types are:

  • "feed": RSS, Atom or JSON feed (all supported formats can be found here). Fill in the url field.
  • "imgur": Return the most recent results for a tag. Fill in the tag field.
  • "merge": Will merge two or more sources together. Fill in the sources field with a list of sources: sources = ["source1", "source2"]. Especially useful to merge different sources on the same topic. Don't forget to digest or combine it later.

Filtering

One powerful aspect of goeland is filtering. Instead of sending the content of the RSS directly to the email system, it can transform it in a number of ways in order to make it easier to read, process, etc.

Any number of filters can be defined, and their order matters. For example, the following:

filters = ["unseen", "retrieve", "digest"]

will first keep only previously unseen entries, then make them nicer with the retrieve filter, and finally put them all together with digest. This creates a single email, with SourceTitle set to the title of the RSS feed.
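To illustrate why the order matters, here is a minimal Go sketch of a filter chain. It is not goeland's actual implementation: the Filter type and the first, reverse and apply helpers are hypothetical names chosen for this example, and entries are reduced to plain strings.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in for a goeland filter:
// it receives a list of entries and returns a new list.
type Filter func(entries []string) []string

// first keeps only the first n entries (all of them if there are fewer).
func first(n int) Filter {
	return func(entries []string) []string {
		k := n
		if k < 0 {
			k = 0
		}
		if k > len(entries) {
			k = len(entries)
		}
		return entries[:k]
	}
}

// reverse returns the entries in the opposite order.
func reverse(entries []string) []string {
	out := make([]string, len(entries))
	for i, e := range entries {
		out[len(entries)-1-i] = e
	}
	return out
}

// apply runs the filters one after the other, in the order given.
func apply(entries []string, filters ...Filter) []string {
	for _, f := range filters {
		entries = f(entries)
	}
	return entries
}

func main() {
	feed := []string{"newest", "middle", "oldest"}
	fmt.Println(apply(feed, first(1), reverse)) // [newest]
	fmt.Println(apply(feed, reverse, first(1))) // [oldest]
}
```

The same two filters give different results depending on their position in the list, which is the behaviour to keep in mind when writing a filters line.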

Filters can have options. For example, to get the 3 newest posts, you would do:

filters = ["first(3)"]
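As a rough illustration of the name(args) notation, the following Go sketch splits a filter string into its name and its comma-separated options. The parseFilter function is a hypothetical helper written for this example; goeland's own parsing may differ, for instance in how it treats commas inside a retrieve selector.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFilter splits a spec such as "first(3)" or "language(en,de)"
// into the filter name and its list of options.
// A spec without parentheses, such as "unseen", has no options.
func parseFilter(spec string) (name string, args []string) {
	spec = strings.TrimSpace(spec)
	open := strings.Index(spec, "(")
	if open < 0 || !strings.HasSuffix(spec, ")") {
		return spec, nil
	}
	name = spec[:open]
	inner := spec[open+1 : len(spec)-1]
	if strings.TrimSpace(inner) == "" {
		return name, nil
	}
	for _, a := range strings.Split(inner, ",") {
		args = append(args, strings.TrimSpace(a))
	}
	return name, args
}

func main() {
	for _, spec := range []string{"unseen", "first(3)", "language(en,de)"} {
		name, args := parseFilter(spec)
		fmt.Printf("%s -> name=%q args=%q\n", spec, name, args)
	}
}
```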

The available filters are as follows:

  • none: Removes all entries
  • all: Default, include all entries
  • first: Keep only the first (usually newest) entries (default 1)
  • last: Keep only the last (usually oldest) entries (default 1)
  • reverse: Reverse the order of the entries
  • random: Keep 1 or more random entries (default 5)
  • unseen: Keep only unseen entries. Entries that have been seen are recorded in a goeland.db file. Use the purge command to remove seen entries
  • today: Keep only the entries of the day
  • lasthours: Keep only the entries that are from the X last hours (default 24)
  • digest: Make a digest of all entries (optional heading level, default 2)
  • combine: Combine all the entries into one source and use the first entry title as source title. Useful for merged sources
  • links: Rewrite relative links src="// and href="// to have an https:// prefix
  • embedimage: Embed a picture if the entry has an attachment with a type of picture (optional position: top|bottom|left|right, default top)
  • replace: Replace a string with another. Use with an argument like this: replace(myreplace) and define
[replace.myreplace]
        from="A string"
        to="Another string"

in your config file.

  • includelink: Include the link of entries in the digest form
  • includesourcetitle: Include source titles of entries in the digest form
  • retrieve: Retrieves the full content using a goquery selector. E.g. you can use retrieve(div.content) to get the full excerpts of Next INpact's LeBrief
  • language: Keep only the specified languages (best effort detection), use like this: language(en,de)
  • untrack: Removes feedburner pixel tracking
  • reddit: Better formatting for Reddit RSS
  • sanitize: Sanitize the content of entries (to be used if --unsafe-no-sanitize-filter was passed)
  • toc: Create a special table of contents entry containing the titles of all entries. Use toc(title) to use the Title as a link
  • limitwords: Limit the number of words in the entry, use like this: limitwords(32)
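
As an example of what a content filter does, here is a minimal Go sketch of the rewrite described for the links filter, turning src="// and href="// into their https:// equivalents. It only illustrates the transformation; the fixLinks name is hypothetical and goeland's real filter may be implemented differently.

```go
package main

import (
	"fmt"
	"strings"
)

// protocolRelative rewrites protocol-relative src and href attributes
// so that they start with https://.
var protocolRelative = strings.NewReplacer(
	`src="//`, `src="https://`,
	`href="//`, `href="https://`,
)

// fixLinks applies the rewrite to a fragment of HTML.
// Attributes that already carry a scheme are left untouched.
func fixLinks(html string) string {
	return protocolRelative.Replace(html)
}

func main() {
	in := `<a href="//example.com/post"><img src="//example.com/a.png"></a>`
	fmt.Println(fixLinks(in))
}
```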

Pipes

After defining some sources, you can send them to a pipe. One source can be sent to multiple pipes, but a pipe can only have one source. If you need to combine sources together, use the special merge source type described above.

A pipe has the following structure:

[pipes.hackernews]
source = "hackernews"
destination = "email"
email_to = "[email protected]"
email_from = "HackerNews <[email protected]>"
email_title = "{{.EntryTitle}}"
template = "/path/to/template.html" # optional

You can use EntryTitle, SourceTitle and SourceName in the email template. SourceTitle is the title of the RSS feed.
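
The {{.EntryTitle}} notation is Go template syntax. The sketch below shows how such a pattern expands, using the standard library's text/template package. This is an assumption made for illustration: which template package goeland uses internally is not stated here, and the titleData struct and renderTitle function are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// titleData holds the three variables named in the text above.
type titleData struct {
	EntryTitle  string
	SourceTitle string
	SourceName  string
}

// renderTitle expands a pattern such as "{{.EntryTitle}}" with the given data.
func renderTitle(pattern string, data titleData) (string, error) {
	tmpl, err := template.New("title").Parse(pattern)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	title, err := renderTitle("[{{.SourceName}}] {{.EntryTitle}}", titleData{
		EntryTitle:  "Show HN: goeland",
		SourceTitle: "Hacker News: Newest",
		SourceName:  "hackernews",
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(title) // [hackernews] Show HN: goeland
}
```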

For debugging purposes, or in order to pipe to other systems, you can set the destination to terminal.

Email

In the email section you need to specify your outgoing mail server. You can specify both encryption and allow-insecure to connect to self-hosted servers. You can also specify authentication to select the appropriate option for your server (the options available are "none", "plain", "login" and "crammd5"; if unspecified it defaults to "plain"; see go-simple-mail's documentation for details).

[email]
host = "smtp.example.com"
port = 25
username = "default"
password = "p4ssw0rd"
# password_file = /run/password/goeland_smtp_pass
encryption = "tls"
allow-insecure = false
authentication = "plain"
#Email customization
include-header = true
include-footer = true
#footer = Your custom footer
#logo = internal:goeland.png
#template = /path/to/template.html

You can create your own template; see the relevant documentation. The pipe template takes precedence over the main template defined in the [email] section.

Examples

This will bring 6 puppies to your inbox.

loglevel = "info"
dry-run = false

[email]
host = "smtp.sendgrid.net"
port = 587
username = "apikey"
password = "<sendgridapikey>"

[sources]

[sources.insta]
url = "https://rssbridge.example.com/?action=display&bridge=Instagram&context=Hashtag&h=puppy&media_type=picture&direct_links=on&format=MRss"
type = "feed"
filters = ["random(3)"]

[sources.imgur]
type = "imgur"
tag = "puppy"
filters = ["random(3)"]

[sources.puppies]
type = "merge"
sources = ["insta", "imgur"]
filters = ["combine"]

[pipes]

[pipes.puppies]
source = "puppies"
destination = "email"
email_to = ["[email protected]"]
email_from = "DailyPuppy <[email protected]>"

This will give you the latest articles from a specific subreddit:

loglevel = "none"
dry-run = false
database = "goeland.db"

[email]
host = "example.com"
port = 25
username = "username"
password = "password"

[sources]

[sources.reddit]
url = "https://www.reddit.com/r/selfhosted/top.rss"
type = "feed"
filters = ["unseen", "includelink", "digest"]

[pipes.reddit]
source = "reddit"
destination = "email"
email_to = ["[email protected]"]
email_from = "Reddit <[email protected]>"

To send an email to multiple addresses, put them in a list:

[pipes.reddit]
source = "reddit"
destination = "email"
email_to = ["[email protected]", "[email protected]", "[email protected]"]
email_from = "Reddit <[email protected]>"

See also the examples/ folder.

Contributing

Feel free to open issues or PRs for bugs and suggestions for more filters and source types.

If you encounter a problematic feed, please open an issue with the content of the feed attached.

Future

Here is a list of things that could be nice

goeland's People

Contributors

dependabot[bot], dfosas, fabianofranz, github-actions[bot], kylrth, panigrc, slurdge, sweenu

goeland's Issues

includelink filter doesn't work

Hi folks,

Thank you for Goeland. It's FANTASTIC software!!!

I'm trying to get the rss2email experience: a single e-mail for each new post in the RSS feed.

I configured Goeland like this:

...

[sources.xyz]
url = "https://.../"
type = "feed"
filters = ["unseen", "retrieve", "includesourcetitle", "includelink", "untrack"]
allow-insecure = false

...

[pipes.xyz]
disabled = false
source = "xyz"
destination = "email"
email_title = "xyz: {{.EntryTitle}}"
email_to = ["[email protected]"]
email_cc = []
email_bcc = []
email_from = "xyz <from@example.>"
# email_title =

...

But the e-mails I get contain only plain text, with no link to the post.

Is it a bug?

Thank you.

Desktop

  • OS: Linux
  • Browser: firefox
  • Version: goeland 0.18.3

Option to accept self-signed SSL certificate or expired SSL certificate (in HTTPS)

Hello,

would it be possible (or is it already possible?) to add an option to the configuration file to accept self-signed SSL certificates (certificate signed by unknown authority) and expired SSL certificates in HTTPS connections?

I'm aware of the security risk, but sometimes it is difficult to get the SSL certificate corrected or updated.

Thank you

Aggregated weekly digest with only titles

First, I'd like to say I really like goeland. It's simple, yet powerful and I like the fact that it's stateless and easily configurable with a config file.

Now to my problem, what I hoped I could get by using goeland is a weekly digest of all the new articles of the blogs I follow sent to me in one email. It would look something like:

### Weekly digest
All the new posts from last week:
#### [Blog one](https://linktotheweb.site)
* [How to make a cake?](https://linktotheblog.post)
* [Are cakes still relevant in 2022?](https://linktotheblog.post)
#### [Blog three](https://linktotheweb.site)
* [Make a pie in three steps](https://linktotheblog.post)

There is no Blog two because it won't have posted anything the past week.

However, right now this is not possible for several reasons, from what I see in the code:

  • the source url is not saved (easily fixed by adding a URL field to the Source struct).
  • using the merge source type makes you lose the information about the subsources, like source.Title, which I would need to do what I want.
  • the source information (name, title and url in the future) is not accessible via the html template.

Is this a use case you'd like to support? If yes, is there a certain way you would like it implemented? I am down to work on this.
I believe it would be cleaner to loop over all the sources and their entries in the HTML template than to do it in Go and embed a single long string in the HTML.

Docker bind volume ownership/permissions and goeland.db

Since you now offer a docker image, I put together a docker-compose.yml file containing:

services:
  app:
    image: slurdge/goeland
    restart: unless-stopped
    volumes:
      - ~/docker/volumes/goeland:/data

Issues I've encountered:

  • Unless I first create ~/docker/volumes/goeland (as user 1000), it gets created by docker as root and config.toml cannot be written to it. Shouldn't it be created automatically by docker as the container user (1000?) and just work?
  • Even when config.toml is created/edited, successfully, a "permission denied" error occurs when creating goeland.db.

I admit that I'm pretty new to docker (and to goeland!), so perhaps this is all expected behaviour and I'm "doing it wrong". Either way, I'm hoping for guidance.

P.S. is there a way to default the container to start with the --run-at-startup flag?

Digest of two RSS feeds not getting articles

Describe the bug
I have two RSS feeds for which the RSS digest is not working:
http://www.dsl.sk/export/rss_articles.php
http://www.zive.sk/rss/najnovsie/
Instead of a digest, I receive only the last article or no article at all. The listed feeds are polled every 24h.

To Reproduce
Steps to reproduce the behavior:
The goeland configuration below does not work for the listed RSS feeds, although the same setup works for 200+ other RSS feeds:

[sources]
[sources.src1]
url = "http://www.dsl.sk/export/rss_articles.php"
type = "feed"
filters = ["unseen", "includelink", "embedimage(left)", "digest(4)", "retrieve"]
[sources.src2]
url = "http://www.zive.sk/rss/najnovsie/"
type = "feed"
filters = ["unseen", "includelink", "embedimage(left)", "digest(4)", "retrieve"]

[pipes]
[pipes.src1]
source = "src1"
destination = "email"
email_to = "[email protected]"
email_from = "[email protected]"
email_title = "{{.EntryTitle}}"
[pipes.src2]
source = "src2"
destination = "email"
email_to = "[email protected]"
email_from = "[email protected]"
email_title = "{{.EntryTitle}}"

Expected behavior
Get a digest of all articles published since the last run.

Screenshots
N/A

Youtube RSS feed doesn't work

Hi,

I'm trying to get emails for the following feed : https://www.youtube.com/feeds/videos.xml?channel_id=UCb0MyY46T9ZYOzDHkYnIoXg

Unfortunately, the emails I receive only contain the goeland logo and footer.

Here's the content of one of these emails:

Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html><html lang=3D"en"><head><title>Portable Nuclear Weapons | Pl=
ainly Difficult Short</title><meta charset=3D"utf-8"/><meta name=3D"viewpor=
t" content=3D"width=3Ddevice-width,initial-scale=3D1"/><meta http-equiv=3D"=
x-ua-compatible" content=3D"IE=3Dedge"/><style type=3D"text/css">"a:hover" =
{
text-decoration: none !important
}@media screen and (min-width:600px){
h1 {
font-size: 48px !important;
line-height: 48px !important
}
.intro {
font-size: 24px !important;
line-height: 36px !important
}
}
</style></head><body style=3D"-webkit-text-size-adjust:100%;-ms-text-size-a=
djust:100%;overflow-wrap:break-word;height:100%;width:100%;margin:0;padding=
:0"><!--[if (gte mso 9)|(IE)]&gt;&lt;table cellspacing=3D0 cellpadding=3D0 =
border=3D0 width=3D720 align=3Dcenter role=3Dpresentation&gt;&lt;tr&gt;&lt;=
td&gt;&lt;![endif]--><div role=3D"article" aria-label=3D"Portable" nuclear=
=3D"" weapons=3D"" |=3D"" plainly=3D"" difficult=3D"" short=3D"" lang=3D"en=
" style=3D"font-family: &#39;Avenir Next&#39;, -apple-system, BlinkMacSyste=
mFont, &#39;Segoe UI&#39;, Roboto, Helvetica, Arial, sans-serif, &#39;Apple=
 Color Emoji&#39;, &#39;Segoe UI Emoji&#39;, &#39;Segoe UI Symbol&#39;; fon=
t-size: 18px; font-weight: 400; line-height: 28px; margin: 0 auto; max-widt=
h: 720px; padding: 40px 20px 40px 20px;"><header><a href=3D"https://www.git=
hub.com/slurdge/goeland" style=3D"-webkit-text-size-adjust:100%;-ms-text-si=
ze-adjust:100%;color:rgb(16, 120, 189);font-weight:600;text-decoration:unde=
rline"><center><img style=3D"-ms-interpolation-mode:bicubic;border:0;height=
:auto;line-height:100%;outline:none;text-decoration:none;max-width:100%;bac=
kground-color:white;padding:16px;border-radius:16px" src=3D"cid:20230312.20=
[email protected]"/></center></a>
</header><main></main><footer style=3D"margin-top: 24px; padding: 16px; bor=
der-radius: 16px;"><center><p style=3D"font-size: 16px; font-weight: 400; l=
ine-height: 16px;">Enjoy your =F0=9F=93=A7 by <a href=3D"https://www.github=
.com/slurdge/goeland" style=3D"-webkit-text-size-adjust:100%;-ms-text-size-=
adjust:100%;color:rgb(16, 120, 189);font-weight:600;text-decoration:underli=
ne">goeland</a></p></center></footer></div><!--[if (gte mso 9)|(IE)]&gt;&lt=
;/table&gt;&lt;![endif]--></body></html>
--2eb3659b3f9ed2b1fd5958bb7479e3a2bdb9cb44b07ae7141f18ae081ba2--

--26b4b4ee37adb6c6ba198b9fc94b43fc695f2f35bbc27842d0e44c47a5dc
Content-Disposition: inline;
 	filename="logo.png"
Content-Id: <[email protected]>
Content-Transfer-Encoding: base64
Content-Type: image/png;
 	name="logo.png"

iVBORw0KGgoAAAANSUhEUgAAAPoAAABoCAMAAADvnB1HAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJ
bWFnZVJlYWR5ccllPAAAAF1QTFRF////TU1Npqam9PT0WFhYenp6bm5u09PT3t7e95MevLy8kJCQ
6enpY2NjhYWFsbGxm5ubx8fH+a5W/eTHonA2+rxy//jxzYIq+Jos/uvV+8mPwn0tpZmK/NCd////
u6G8bwAAAB90Uk5T////////////////////////////////////////AM0ZdhAAAAL4SURBVHja
7J3ZduMgDIZZjYONl7bT2fv+jznTbAfb9cLmGCn/VXuaQ/UZkEREFPKBVsT6EYem6ASRBugElyx0
gk43dEIwshOk5Bd0glOI0T/hn+jI0WvmLK3qh9jNVe9urJpF76mfZNfzPbEVk8LP0mpuSM/xzira
fWaf6ybETDXzMGmYpE4OXlYizEY2My4NlWFJF76SwRbOoBMTPDIV7ZHBKZ3blaWIMLhRScDrJoJt
VC/4ENdw0cliMn6XYNW3k1kRsnI1Nr4nVu1oRkwZ262P/kGxczRdVD80Lu6OL80jwqjDZmRiS+7g
IT0YWJEDitvwMtqK1DZ4TQ4qXlnrkkcnLxQ5sJSJzK7X05HDTHwTlV2nCxsJxCKy6wS+I6V0NHad
JGIk3fAiDnt+5PZhIIQ9R/I47HmSx2DPlTycPV/yUPacycPY8yYPYc+d3J89f3JfdgjkfuwwyH3Y
oZC7s8Mhn2cv1VSwyGfYe+NbE82dvaYoyL9i71bLghwqe/PIuuVj2dstlxVKkOx8Sz1baJjslZxq
8jw6qL7uC3HFJEBHvzWvG9QtwbGveG+7bsmAscu1V9Zy7bpXruxy/aX3pEdwUOxb0pX7QaaBlNts
810tsCVPeMvU0l+l7G43Jyq6eXsAeC7nm2iitH8DNO1LYsPdrYDt9iXdQhoZLXmOD71evd8JRtdZ
NuNngWDF15ew34+Du0Cw2cv/02x6y+Wv3OWGLIMnvI3VgDq/+UT6P6+f+okR/e101je86KdfeNFf
EKFXeNGv6dzvE77Nfk1pvl/If7zjIb99wPjlrL/v+LZ6gS+h4QJUAcojtNES7aQbtGcX2qIj18Dq
Lw5vWwiK9MR6JzdoydG9Q9NCu1SyVVZ5HVciVyf4PHQW3Nq+TgiDXG9q+jG8PwaCnBc+7Y1AzHkV
sXNQZnLvJCShHNdc0Q2curLbgm/6PWz6OJqbE43eybvt1ltSH65d47OtJkr0Z/dglOiI22WjZMfb
Gv/5hQhYvwbjnwADABz/SJCHx6o2AAAAAElFTkSuQmCC
--26b4b4ee37adb6c6ba198b9fc94b43fc695f2f35bbc27842d0e44c47a5dc--

And here's my config.toml :

## Log level
## Either "none", "error", "debug", "info"
loglevel = "none"

## Dry run
## Do not output anything or send email after fetching the sources
#dry-run = false

## Do not sanitize input
## This will not sanitize any input (sanitizing is the default).
## Use at your own risk as you will include everything from your sources, including scripts, etc.
## You can always sanitize afterwards with the 'sanitize' filter.
unsafe-no-sanitize-filter = false

## Run all the pipes once at startup in daemon mode
run-at-startup = false

## Purge days
## Number of days to keep the entries when the purge command is used
## Can be overridden by command line switch
purge-days = 90

## Auto purge
## Automatically run the purge command after the run command
auto-purge = true

[email]
host = ""
port = 587
username = ""
password = ""
## Include header in email
## Put a nice goeland logo in emails
#include-header = true

## Include footer in email
## Put "Sent with ❤️ by goeland in the bottom of HTML emails"
#include-footer = true

## Include title in header
include-title = true

## Email timeout in milliseconds
#timeout-ms = 5000

## Logo file
#logo = internal:goeland.png

## Template file
#template = "/path/to/template.html"

[sources]

[sources.hackernews]
url = "https://hnrss.org/newest"
type = "feed"
# See doc for available filters
filters = ["all", "today"]
# Allow invalid certificates
allow-insecure = false

[sources.youtube]
url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCb0MyY46T9ZYOzDHkYnIoXg"
type = "feed"
# See doc for available filters
filters = ["all"]
# Allow invalid certificates
allow-insecure = false

[pipes]

[pipes.youtube]
#Either put disabled = true or prefix pipes with disabled like this: disabled.pipes.hackernews
disabled = false
source = "youtube"
destination = "email"
email_to = [""]

results from Nitter feeds don't include the tweet author

With the following source definition:

[sources.lexfridman]
url = "https://nitter.nl/lexfridman/search/rss?f=tweets&e-replies=on"
type = "feed"
filters = ["unseen", "links", "includelink", "digest"]

I get an email that looks like this:

As you can see, the tweet body appears as both the title and the content. I'd ideally like the title to be the tweet author name (which may be different from @lexfridman in the case of retweets or if I'm making a digest of multiple Twitter users), and then let the body be the same.

I know you've got that replace filter for simple text manipulation in the body, but have you got any ideas about more complex field manipulation that might make it possible to insert the <dc:creator> tag into the title or something like that?

Have a way to get the default email template.

Right now the email default template is embedded within the application, but people wanting to customize it have to get to the source code.

It would be useful to get it from within the application as an action, such as output-email-template.

Improve templates documentation

While trying to create a custom template, I found no documentation about the placeholders available in templates, and no way to simply get the link of an entry.

run as daemon?

I know it's easy to just run this as a cron job, but being able to run goeland as a daemon or a Docker container would mean I can manage goeland along with all the other services I may have on the same machine.

I've created an example in this repo that demonstrates how I'm currently building goeland to run as a simple docker-compose service. This works just fine for me, but I wanted to bring it up here to see if you were interested in supporting this kind of usage.

Supporting this could be as simple as adding a goeland daemon subcommand, if you aren't interested in supporting Docker.

Auto grab of contents in RSS feed

Feeds often include only a URL or a summary. With the existing filters, it's possible to cleanly grab the content of the article; however, it would also be good to have an 'auto content' option for people who are not used to CSS.

smtp connection timed out

level=fatal msg="cannot create email pool: Mail Error: SMTP Connection timed out"
\Downloads\goeland> smtp connection timed out

When I run the goeland command ./goeland.exe run, which I intend to use to email news from the RSS source I created, it gives the above error. Can anyone with information help me?
(I am very new and my knowledge is really limited)

No link included despite the includelink filter

Describe the bug
No link included despite the includelink filter

To Reproduce
Steps to reproduce the behavior:
With the config below, emails are sent correctly, but no link is included.

Expected behavior
The link being included :)

Screenshots
N/A

Desktop (please complete the following information):
Any mail client

Additional context
Config.toml excerpt:

[sources.myblog]
url = "<redacted>"
type = "feed"
filters = ["all", "unseen", "includelink"]
allow-insecure = false

warn when a merge source refers to itself

I kept getting massive amounts of logs (gigabytes every few days), finally narrowed it down to goeland, and turned off debug logging, thinking the problem was solved. But I wasn't getting any emails from a particular pipe and I was confused by that. Turns out I had accidentally included the merge source in its own set of sources, causing an infinite loop. All the other pipes ran as expected, which was why I wasn't tipped off sooner.

It could be good to include a check for self-reference like that. Obviously you could still come up with contrived examples to produce infinite looping, like by having two merge sources reference each other, but this would warn users quickly if they've made a simple mistake. Thoughts?
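
For illustration, such a check could be sketched in Go as follows. This is a hypothetical helper, not code from goeland: it assumes the merge sources have already been read into a map from each merge source's name to the names in its sources field, and it also catches the case of two merge sources referencing each other.

```go
package main

import (
	"fmt"
	"sort"
)

// findMergeCycle returns the first reference cycle found among merge sources,
// written as a path that starts and ends with the same name, or nil if none.
// merges maps a merge source name to the names listed in its sources field;
// names that are not keys of the map are treated as ordinary sources.
func findMergeCycle(merges map[string][]string) []string {
	const (
		inProgress = 1
		done       = 2
	)
	state := map[string]int{}
	var path []string

	var visit func(name string) []string
	visit = func(name string) []string {
		switch state[name] {
		case done:
			return nil
		case inProgress:
			// name is already on the current path: report the loop.
			for i, p := range path {
				if p == name {
					cycle := append([]string{}, path[i:]...)
					return append(cycle, name)
				}
			}
			return nil
		}
		state[name] = inProgress
		path = append(path, name)
		for _, child := range merges[name] {
			if cycle := visit(child); cycle != nil {
				return cycle
			}
		}
		path = path[:len(path)-1]
		state[name] = done
		return nil
	}

	// Visit names in sorted order so the reported cycle is deterministic.
	names := make([]string, 0, len(merges))
	for name := range merges {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if cycle := visit(name); cycle != nil {
			return cycle
		}
	}
	return nil
}

func main() {
	merges := map[string][]string{"puppies": {"insta", "puppies"}}
	if cycle := findMergeCycle(merges); cycle != nil {
		fmt.Println("warning: merge sources form a loop:", cycle)
	}
}
```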

Not all embedded images are displayed, depending on the source

Using this source:

[sources.calcio]
url = "https://www.corrieredellosport.it/rss/calcio"
type = "feed"
filters = ["unseen","includelink","embedimage","digest"]

The email works fine, displaying the embedded image.

If I use this source:

[sources.focus_scienza]
url = "https://www.focus.it/rss/scienza.rss"
type = "feed"
filters = ["unseen","includelink","embedimage","digest"]

The email doesn't display the image, showing a blank line instead. The tag in the RSS is the same, called Enclosure. My RSS client (Vienna) displays both of them. It would be great if I didn't have to specify "embedimage" as a filter, and the image were displayed by default, as an RSS client normally does.

URL isn't showing up in "unseen" key

With the following source:

[sources.apnews]
url = "https://rss.kylrth.com/:proxy:items=||*[class=content]||p/https://apnews.com/apf-topnews"
type = "feed"
filters = ["unseen", "lasthours", "links", "includelink", "combine(3)"]

I get the following logs:

Executing pipe named: apnews
Fetching source: apnews of type feed
Retrieved 10 feeds for source apnews
Executing unseen filter with args: []
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
already seen entry with key: apnews/
After unseen: 0 feeds
After unseen: []
Executing lasthours filter with args: []
After lasthours: 0 feeds
After lasthours: []
Executing links filter with args: []
After links: 0 feeds
After links: []
Executing includelink filter with args: []
After includelink: 0 feeds
After includelink: []
Executing combine filter with args: [3]
After combine(3): 0 feeds
After combine(3): []
&{apnews Top News: US & International Top News Stories Today | AP News []}

Running with v0.10.2 in Docker. Somehow the URLs for the items are not showing up in the keys for the "unseen" filter, so all items have the same key.
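
To make the symptom concrete, here is a small Go sketch of a seen-set keyed by source name and entry URL. The key shape "<source>/<url>" is only inferred from the log lines above, and the entry type and function names are hypothetical; this is not goeland's code.

```go
package main

import "fmt"

// entry is a simplified stand-in for a feed item.
type entry struct {
	Title string
	URL   string
}

// unseenKey builds a key shaped like the ones in the logs ("apnews/...").
// The exact format goeland uses is an assumption here.
func unseenKey(source string, e entry) string {
	return source + "/" + e.URL
}

// filterUnseen keeps the entries whose key is not in seen yet and records them.
func filterUnseen(source string, entries []entry, seen map[string]bool) []entry {
	var kept []entry
	for _, e := range entries {
		key := unseenKey(source, e)
		if seen[key] {
			continue
		}
		seen[key] = true
		kept = append(kept, e)
	}
	return kept
}

func main() {
	withURLs := []entry{{"A", "https://example.com/a"}, {"B", "https://example.com/b"}}
	withoutURLs := []entry{{"A", ""}, {"B", ""}}

	fmt.Println(len(filterUnseen("apnews", withURLs, map[string]bool{})))    // 2
	fmt.Println(len(filterUnseen("apnews", withoutURLs, map[string]bool{}))) // 1
}
```

With empty URLs every entry collapses onto the key "apnews/", so at most one entry ever passes the filter, and none do once that key has been stored by an earlier run.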

Add a Dockerfile

Now that daemon mode is supported we need a proper Dockerfile

Getting a daily digest from a Ghost blog?

Hello,

I'm probably a bit stupid, but I can't get goeland to send emails on a regular basis when new entries appear in the rss feed.

Here is the config part that I used :
Docker compose :

  goeland-2024:
    image: slurdge/goeland
    container_name: rss-to-email-2024
    volumes:
      - ./goeland-data/config.toml:/data/config.toml
    networks:
      - ghost

Config.toml relevant parts :

loglevel = "debug"
dry-run = false
run-at-startup = true
[email]
host = "<mailhost>"
port = 587
encryption = "tls"
authentication = "plain"
username = "no-reply@<mydomain>"
password = "<mypassword>"
include-title = true
[sources]
[sources.myblog]
url = "<rssfeed>"
type = "feed"
filter = [ "unseen" ]
[pipes]
[pipes.myblog]
source = "myblog"
destination = "email"
email_to = [ "<myemail>" ]
email_from = "Blog 2024 <no-reply@<mydomain>>"
email_title = "Blog 2024: {{.EntryTitle}}"

I created a first blog entry, and then restarted the container, and I got the email. Great news ! But now, I added 2 more articles and no email was sent (after 48 hours). I restarted the container, got 3 emails. Restarted it again, got 3 emails again. I don't get why no goeland.db is created or why if I keep the container running, no email is sent.

Can you help me understand what's going on?

Thanks,

Plain text email

Hi,

Could you add a plain text option? I'd rather not use HTML in emails if given the choice.

Reddit RSS cannot be downloaded

This issue documents that downloading from Reddit, whether RSS or plain pages, is broken with the latest versions of Go. Either a 429 or a 403 is returned, depending on the variation.

The fix is as follows:

// This one is needed because of an incompatibility between the latest Go and Reddit.
defaultClient = http.Client{
	Transport: &http.Transport{TLSClientConfig: &tls.Config{}},
}

Expose configuration for authentication

I have been playing with this fantastic project but it took me a while to get it working with a particular smtp server. It seemed to boil down to authentication options. It seems like AuthPlain is the default but the underlying package allows for other options, needing AuthLogin in my case.

I don't know Go or SMTP, but by looking at the surrounding code I got to a minimal working example with the following additions to run.go:

authentications := map[string]email.AuthType{"plain": email.AuthPlain, "login": email.AuthLogin, "crammd5": email.AuthCRAMMD5}
authentication, found := authentications[config.GetString("email.auth")]
if !found {
    authentication = email.AuthPlain
}
server.Authentication = authentication

which allows configuration files to have an email.auth field.

Happy to do a pull request as per the contributing guidelines with that change and updating the documentation if that sounds sensible, but I am afraid I do not know how to go about tests here.

Many thanks!

Custom User agent

Is your feature request related to a problem? Please describe.
Some websites return HTTP 403 to goeland, but when I use a standard web browser, the RSS feed is available. I assume that these websites are checking the User-Agent.

Describe the solution you'd like
Add a per-source configuration option for the User-Agent (so that goeland can act as Firefox, or send a custom configurable string).

Describe alternatives you've considered
I'm not aware of a more effective option at the moment.

Additional context
Not required.
Thank you

Write more documentation

This issue can be worked on for Hacktoberfest.

Difficulty: ⭐
Time needed: A few hours

Right now, the project needs more documentation, especially with the filters.
Steps:

  • Experiment with goeland and with filters
  • Write documentation
  • Do a PR

Explicit SSL is not supported

Hello! Any plans for SSL/TLS support for SMTP servers? I've tried a couple of servers (Google, Yandex, Mail.ru) and none of them work, supposedly because of the lack of encryption (ERR conn timed out). It would be nice to be able to toggle SSL usage via the config file.

Max number of characters from article

Would it be possible to add a filter that limits the maximum number of characters of an article in the RSS feed that are put into the email message?

In other words: say an article from the RSS feed has 5000 characters. In the current version of goeland, the whole article text is placed in the email message. I would like to be able to limit the number of characters that are placed in the email message.

Example email message:
Article title #1
Text of article #1 limited to 100 characters.

Article title #2
Text of article #2 limited to 100 characters.

Please let me know if more details or a different description are required.

Thank you
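
A rough sketch of the truncation step such a filter would need, written in Go. The function name truncateRunes is hypothetical and this is not an existing goeland filter. It counts characters as runes rather than bytes so multi-byte text is not cut in the middle of a character, and it assumes plain text: cutting HTML content this way could leave unclosed tags, which a real filter would have to handle.

```go
package main

import "fmt"

// truncateRunes (hypothetical helper) returns s cut to at most max
// characters, counted as runes rather than bytes, and appends an ellipsis
// when something was cut. A negative max is treated as zero.
func truncateRunes(s string, max int) string {
	if max < 0 {
		max = 0
	}
	runes := []rune(s)
	if len(runes) <= max {
		return s // short enough, return unchanged
	}
	return string(runes[:max]) + "…"
}

func main() {
	article := "Text of article #1, which goes on for thousands of characters"
	fmt.Println(truncateRunes(article, 18))
}
```

With a limit of 18, the sample string is cut right after "Text of article #1" and an ellipsis is appended.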

Release 0.13.0 reports it's v. 0.11.0

Describe the bug
I downloaded the goeland_linux_amd64 binary from release 0.13.0, but when I run the version command it reports 0.11.0.

Expected behavior
I would expect to get 0.13.0

Screenshots
image

Server (please complete the following information):

  • OS: Ubuntu server 22.04

Email is not sent without cron in `daemon` mode

This is my config.toml

loglevel = "info"
dry-run = false
run-at-startup = true
purge-days = 5

[email]
host = "smtp.xxx.com"
port = 2525
authentication = "none"

[sources]

[sources.hackernews]
url = "https://hnrss.org/newest"
type = "feed"
filters = ["all", "today"]
allow-insecure = false

[pipes]

[pipes.hackernews]
disabled = false
source = "hackernews"
destination = "email"
email_to = ["[email protected]", "[email protected]"]
email_from = ["[email protected]"]

When running 'docker-compose up -d', goeland sends an email, but when the feeds are updated later, no email is sent.

Patch to add URL variable to template

The attached huge and complicated patch adds a URL variable for use in the email template, which allows adding a clickable link to the original article like so:

<a href="{{.URL}}">{{.EntryTitle}}</a>

Patch:

diff -urN goeland-0.16.0.dist/cmd/run.go goeland-0.16.0.cet/cmd/run.go
--- goeland-0.16.0.dist/cmd/run.go	2023-10-27 07:34:30.000000000 -0700
+++ goeland-0.16.0.cet/cmd/run.go	2023-12-16 01:56:03.808909272 -0800
@@ -160,6 +160,7 @@
 		EntryFooter   string
 		ContentID     string
 		CSS           string
+		URL           string
 	}{
 		EntryTitle:    html.EscapeString(entry.Title),
 		EntryContent:  entry.Content,
@@ -169,6 +170,7 @@
 		EntryFooter:   footer,
 		ContentID:     "cid:" + logoAttachmentName,
 		CSS:           defaultCSS,
+		URL:           entry.URL,
 	}
 	if destination == "htmlfile" {
 		data.ContentID = "data:image/png;base64," + base64.StdEncoding.EncodeToString(logoBytes)

goeland-add_url_var.patch.txt
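
As a standalone illustration of how the new variable is consumed, the sketch below renders the one-line template from above with Go's html/template package. It is not taken from goeland: the entryData struct and renderLink function are hypothetical and carry only the two fields used here, and whether goeland itself uses html/template or text/template is an assumption not confirmed by the patch. With html/template the values are escaped during rendering, so they are passed in unescaped.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// entryData (hypothetical) holds only the two template fields used here;
// the struct in run.go has more.
type entryData struct {
	EntryTitle string
	URL        string
}

// renderLink (hypothetical helper) renders the link template for one entry.
func renderLink(title, url string) (string, error) {
	tpl, err := template.New("entry").Parse(`<a href="{{.URL}}">{{.EntryTitle}}</a>`)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tpl.Execute(&buf, entryData{EntryTitle: title, URL: url}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderLink("Google Announces KataOS", "https://www.phoronix.com/news/Google-KataOS")
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```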
