strideynet / bsky-furry-feed

A Bluesky custom feed generator for furry content!

Home Page: https://bsky-furry-feed-info.vercel.app

License: MIT License

Go 56.32% Dockerfile 0.10% HCL 1.37% HTML 0.29% JavaScript 6.84% Shell 0.01% Makefile 0.14% Svelte 8.13% TypeScript 9.80% SCSS 0.09% Vue 16.88% CSS 0.04%

atproto atprotocol bluesky

bsky-furry-feed's Introduction

bsky-furry-feed

The source code and infrastructure for https://feed.furryli.st.

It produces a custom feed for the Bluesky social media site, selecting posts based on people's membership of the furry community!

For furries

Check out https://furryli.st for information about the feeds and how to join.

For developers

This is also a somewhat complex example of a Bluesky feed generator and firehose consumer written in Go. Learn from it, and feel free to ask questions.

Interested in contributing? Read our getting started guide or check out the open issues.

bsky-furry-feed's People

kevslashnull fuzzylittledevil

bsky-furry-feed's Issues

CLI support for marking actor or post hidden

After Dark feed

Remove any After Dark content from the primary feed and into its own feed.

Do this based on:

Labels of posts at ingest
Post manually marked AD by a moderator
Account marked AD by a moderator

Introduce #commissions-open feed

This feed should be chronological and pick up any post with the hash tag #comms-open . Perhaps only show the latest post from each actor to stop it being spammed.
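The selection rule described above (chronological, at most one post per actor) could be sketched roughly as follows. This is a minimal in-memory illustration, not the project's implementation: the `Post` type, `hasTag`, and `commsOpenFeed` are hypothetical names, and it matches the hashtag by scanning the post text, whereas a real feed would more likely query stored hashtags.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Post is a hypothetical, simplified view of an ingested post.
type Post struct {
	URI       string
	ActorDID  string
	Text      string
	CreatedAt int64 // Unix milliseconds
}

// hasTag reports whether text contains tag as a whitespace-separated word,
// ignoring case. Tags followed by punctuation are not matched in this sketch.
func hasTag(text, tag string) bool {
	for _, word := range strings.Fields(text) {
		if strings.EqualFold(word, tag) {
			return true
		}
	}
	return false
}

// commsOpenFeed keeps only the newest #comms-open post per actor and returns
// the result newest first.
func commsOpenFeed(posts []Post) []Post {
	latest := map[string]Post{}
	for _, p := range posts {
		if !hasTag(p.Text, "#comms-open") {
			continue
		}
		if cur, ok := latest[p.ActorDID]; !ok || p.CreatedAt > cur.CreatedAt {
			latest[p.ActorDID] = p
		}
	}
	feed := make([]Post, 0, len(latest))
	for _, p := range latest {
		feed = append(feed, p)
	}
	sort.Slice(feed, func(i, j int) bool {
		if feed[i].CreatedAt != feed[j].CreatedAt {
			return feed[i].CreatedAt > feed[j].CreatedAt
		}
		return feed[i].URI > feed[j].URI // deterministic tie-break
	})
	return feed
}

func main() {
	posts := []Post{
		{URI: "at://a/1", ActorDID: "did:a", Text: "Open! #comms-open", CreatedAt: 100},
		{URI: "at://a/2", ActorDID: "did:a", Text: "Still open #Comms-Open", CreatedAt: 300},
		{URI: "at://b/1", ActorDID: "did:b", Text: "#comms-open two slots", CreatedAt: 200},
	}
	for _, p := range commsOpenFeed(posts) {
		fmt.Println(p.URI)
	}
}
```

Keeping only the latest post per actor means a single account cannot occupy more than one slot in the feed, however often it reposts the tag.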

Improve admin navigation

Following #68, the admin site will have two kinds of pages, namely the queue (on /) and profiles (on /users/[did]). This makes navigation UX more complicated.

Once (and if) we have the backend ability to list users and their profiles, this distinction will become clearer. For now, we can replace the Admin text with a Home link.

Improve metrics on ingester

Worker utilisation
Types of events
Status of handling (Error, Ignored, Actioned)

Implement dark mode + light mode for the website

I would love to try and do this

Split API and Ingester into separate Kubernetes deployments

This will make this service a bit more reliable (as one failing won't cause the other to fail) & let us adjust resource allocation more granularly.

Make baseUrl an env variable

The baseUrl in web/admin/composables/useAPI.ts should be configurable based on the environment.

More resilient handling of websocket stream dying

Right now, it causes the ingester to exit, and then k8s takes care of restarting it. We should handle this in-app and indicate an unhealthy state whilst we are unable to bring the websocket stream back online.
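One possible shape for in-app handling is a small supervisor loop that restarts the stream and exposes a health flag. The sketch below is an assumption-laden illustration: `Supervisor`, `ErrStop`, `ErrStreamDied`, and the `stream`/`backoff` callbacks are hypothetical names, and the actual websocket dialing is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ErrStop is returned by a stream function to request a clean shutdown.
var ErrStop = errors.New("stop requested")

// ErrStreamDied is a placeholder for any error that ends the stream.
var ErrStreamDied = errors.New("stream died")

// Supervisor restarts a stream whenever it dies and tracks its health.
type Supervisor struct {
	healthy atomic.Bool
}

// Healthy reports whether the stream is currently connected. A health
// endpoint could expose this value to Kubernetes.
func (s *Supervisor) Healthy() bool { return s.healthy.Load() }

// Run calls stream until it returns ErrStop. stream must call connected once
// the connection is established. After any other error, Run marks the
// supervisor unhealthy, calls backoff with the number of consecutive
// failures, and tries again.
func (s *Supervisor) Run(stream func(connected func()) error, backoff func(failures int)) {
	failures := 0
	for {
		err := stream(func() {
			failures = 0
			s.healthy.Store(true)
		})
		s.healthy.Store(false)
		if errors.Is(err, ErrStop) {
			return
		}
		failures++
		backoff(failures)
	}
}

func main() {
	s := &Supervisor{}
	calls := 0
	s.Run(func(connected func()) error {
		calls++
		if calls == 1 {
			return ErrStreamDied // simulated dial failure
		}
		connected()
		fmt.Println("connected, healthy =", s.Healthy())
		if calls == 2 {
			return ErrStreamDied // simulated mid-stream failure
		}
		return ErrStop
	}, func(failures int) {
		// A real backoff would sleep here, for example with time.Sleep.
		fmt.Println("backing off, consecutive failures =", failures)
	})
}
```

Passing the backoff in as a callback keeps the loop testable without real sleeps; in production it would typically wait longer as `failures` grows.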

firehose ingester: deletions can potentially be reordered before creations

Currently, the firehose ingester does something like this:

Receive commit event from firehose.
Send commit event to evtChan.
1 of n workers reads from evtChan and processes it in parallel.

Consider a scenario where the firehose emits the following two operations in separate commits: follow -> unfollow

Assume the GraphFollow is fanned out to worker 1 and GraphUnfollow is fanned out to worker 2.
Assume that worker 2 is scheduled before worker 1.
Worker 2 processes unfollow: no-op/error, because the follow didn't exist in the repo in the first place.
Worker 1 processes follow: follow is recorded.

However, the correct state should be no follow exists in the repo.

I think we can fix this by either:

processing completely sequentially based on the collection (follows, posts, etc), with 1 worker per collection type (painful to scale out), or,
writing tombstones on delete instead of deleting the record, then cleaning up tombstones later (optional).
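The tombstone option can be illustrated with a small in-memory store. This is only a sketch: `followStore` is a hypothetical stand-in for the database table, and it assumes each follow record has a unique URI, so a later re-follow arrives under a different URI and is not blocked by an old tombstone.

```go
package main

import (
	"fmt"
	"sync"
)

// followStore is a hypothetical in-memory stand-in for the follows table.
// Keys are record URIs.
type followStore struct {
	mu         sync.Mutex
	live       map[string]bool // follows that currently exist
	tombstones map[string]bool // URIs whose deletion has been processed
}

func newFollowStore() *followStore {
	return &followStore{live: map[string]bool{}, tombstones: map[string]bool{}}
}

// Create records a follow unless a tombstone shows it was already deleted.
func (s *followStore) Create(uri string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.tombstones[uri] {
		return // the delete was processed first; ignore the late create
	}
	s.live[uri] = true
}

// Delete removes the follow and writes a tombstone so that a create
// arriving later for the same URI is ignored.
func (s *followStore) Delete(uri string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.live, uri)
	s.tombstones[uri] = true
}

// Exists reports whether the follow is currently recorded.
func (s *followStore) Exists(uri string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.live[uri]
}

func main() {
	const uri = "at://did:plc:example/app.bsky.graph.follow/1"

	// Worker 2 runs first and handles the unfollow, then worker 1 handles
	// the follow. The tombstone keeps the final state correct.
	s := newFollowStore()
	s.Delete(uri)
	s.Create(uri)
	fmt.Println("follow exists after reordered events:", s.Exists(uri))
}
```

With this approach the final state is the same whichever worker runs first, at the cost of storing tombstones until they are cleaned up.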

Add line breaks to admin comments

Admin comments will often be informally structured in some way. Allowing line breaks is most important for clear structuring. Later, we can add other formatting as well.

add support for "pinned" users on certain feeds

for e.g. con feeds, it may be useful to just always put con account skeets onto the feed even if they aren't tagged with the con.

one consideration is that if bff does not know about the con account, it won't ingest any of its posts. should the con account be added directly into the database without requiring them to follow furryli.st? should con skeets be shown on other feeds too?

Community guidelines

Define the rules by which we will decide when to:

Mark a post as AD
Remove a post from all feeds
Mark an account as AD
Ban an account from all feeds

Background Post Hotness generation

Right now, we calculate post hotness on the fly which makes cursoring and performance problematic.

We should calculate these every X minutes and persist them to a table we can join across. We should support multiple "algos" and store the last X post history.

Possibly only generate Post Hotness for posts within the last X hours (possibly 48/72) to keep the size of this a bit more bounded.

Remove premature post tagging from ingester

Right now, the ingester is responsible for scanning hash tags and the media status of a post and then assigning it a "tag" from a pre-selected list. This means when we add new hash tags for new feeds, we end up having to wait for new data to come in rather than being able to use old data. It feels like the classification is essentially too premature.

What I'd suggest instead is we start extracting more "general" data about a post and deprecate the tags field. I'd like to see the following new columns:

hashtags - I think there's still a big performance benefit in us storing hashtags as an array column rather than doing a text search during feed generation. But rather than only supporting a fixed list, we should extract all hash tags within the post and normalize them (e.g. lowercase them).
media_attached - A simple boolean value indicating if the post has any media.
raw - The JSON representation of the raw resource we received from the firehose. We probably won't query on this as part of standard feed generation, but we can use it for migrations in future which add new extracted fields.

I think with these new columns, we'll still have good performance but we'll be more able to design new feeds on the fly without needing to modify the ingester and wait for posts matching that description to be made.
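As a rough illustration of the proposed hashtags column, extraction and normalization might look like the sketch below. The regular expression and the `extractHashtags` name are assumptions made for this example; a real ingester might prefer the tag facets carried on the post record over scanning the text.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hashtagRe is a simplified pattern: '#' followed by letters, digits, '_'
// or '-'. It is an assumption for this sketch, not Bluesky's tag grammar.
var hashtagRe = regexp.MustCompile(`#([\p{L}\p{N}_-]+)`)

// extractHashtags returns every hashtag in text, lowercased, without the
// leading '#', de-duplicated, in order of first appearance.
func extractHashtags(text string) []string {
	seen := map[string]bool{}
	tags := []string{}
	for _, m := range hashtagRe.FindAllStringSubmatch(text, -1) {
		tag := strings.ToLower(m[1])
		if !seen[tag] {
			seen[tag] = true
			tags = append(tags, tag)
		}
	}
	return tags
}

func main() {
	fmt.Println(extractHashtags("New art! #Furry #furryart and #FURRY again, #comms-open"))
}
```

Because every tag is stored rather than a fixed list, a feed added later can match posts that were ingested before the feed existed.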

Caching of feed first loads

Cache the "first" view of the feed in memory and update it every X minutes. This should alleviate database load and improve user experience.
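A minimal sketch of such a cache is shown below. It refreshes lazily on read once the entry is older than a TTL, which is a simplification of the periodic update described above; `firstPageCache` and its `load` callback are hypothetical, and times are passed in as Unix seconds to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// firstPageCache holds the first page of a feed for up to ttlSeconds.
type firstPageCache struct {
	mu         sync.Mutex
	ttlSeconds int64
	load       func() ([]string, error) // hypothetical database query
	page       []string
	loadedAt   int64
	valid      bool
}

func newFirstPageCache(ttlSeconds int64, load func() ([]string, error)) *firstPageCache {
	return &firstPageCache{ttlSeconds: ttlSeconds, load: load}
}

// Get returns the cached page, reloading it when it is missing or older
// than the TTL. nowUnix is the current time in Unix seconds.
func (c *firstPageCache) Get(nowUnix int64) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.valid && nowUnix-c.loadedAt < c.ttlSeconds {
		return c.page, nil
	}
	page, err := c.load()
	if err != nil {
		return nil, err // keep the previous entry state untouched
	}
	c.page, c.loadedAt, c.valid = page, nowUnix, true
	return page, nil
}

func main() {
	loads := 0
	cache := newFirstPageCache(60, func() ([]string, error) {
		loads++
		return []string{"at://example/post/1", "at://example/post/2"}, nil
	})
	for _, now := range []int64{0, 30, 60} {
		page, _ := cache.Get(now)
		fmt.Println("t =", now, "posts =", len(page), "loads so far =", loads)
	}
}
```

Only requests without a cursor would go through the cache; paginated requests would still hit the database.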

how to use instructions into readme

https://skyfeed.app

Rotating credential support for Bsky Client

Currently, we need to make a new client before we do anything - we should have a single client that will manage its own credential lifetime.

GetPostsWithLikes and GetFurryNewFeed cursors should take URI into account

Minor issue: right now the cursors only take the timestamp into account, so if two posts have identical timestamps and straddle a pagination boundary, the second post can be lost.
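The fix amounts to ordering and paginating on the pair (timestamp, URI) instead of the timestamp alone. The following in-memory sketch shows the idea; `Post`, `Cursor`, `after`, and `page` are hypothetical names, and in PostgreSQL the same comparison could likely be expressed as a row-value predicate such as `(indexed_at, uri) < ($1, $2)` with a matching `ORDER BY`.

```go
package main

import (
	"fmt"
	"sort"
)

// Post is a hypothetical, simplified feed item.
type Post struct {
	URI       string
	IndexedAt int64 // Unix milliseconds
}

// Cursor identifies the last post of the previous page.
type Cursor struct {
	IndexedAt int64
	URI       string
}

// after reports whether p comes strictly after cursor c in
// (IndexedAt DESC, URI DESC) order.
func after(p Post, c Cursor) bool {
	if p.IndexedAt != c.IndexedAt {
		return p.IndexedAt < c.IndexedAt
	}
	return p.URI < c.URI
}

// page returns up to limit posts following cursor (nil means the start)
// and the cursor for the next page, or nil when nothing is left.
func page(posts []Post, cursor *Cursor, limit int) ([]Post, *Cursor) {
	sorted := append([]Post(nil), posts...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].IndexedAt != sorted[j].IndexedAt {
			return sorted[i].IndexedAt > sorted[j].IndexedAt
		}
		return sorted[i].URI > sorted[j].URI
	})
	var out []Post
	for _, p := range sorted {
		if cursor != nil && !after(p, *cursor) {
			continue
		}
		out = append(out, p)
		if len(out) == limit {
			break
		}
	}
	if len(out) == 0 {
		return nil, nil
	}
	last := out[len(out)-1]
	return out, &Cursor{IndexedAt: last.IndexedAt, URI: last.URI}
}

func main() {
	posts := []Post{
		{URI: "at://x/a", IndexedAt: 100},
		{URI: "at://x/b", IndexedAt: 100},
		{URI: "at://x/c", IndexedAt: 100},
		{URI: "at://x/d", IndexedAt: 90},
	}
	var cursor *Cursor
	for {
		var batch []Post
		batch, cursor = page(posts, cursor, 2)
		if cursor == nil {
			break
		}
		fmt.Println(batch)
	}
}
```

With a timestamp-only cursor, the second page in this example would skip `at://x/a`, since it shares its timestamp with the last post of the first page.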

Include hash tags in alt text

Improve error handling

Let’s look into TanStack Query for fetching data. This should help us with explicitly having to handle errors and easily being able to refresh data.

Moderation: UI MVP

furry-new feed without the #nsfw tag

RBAC for Moderation API

We need the ability to have less privileged moderators during their training period. An RBAC model based on casbin or OPA will give us the flexibility to tune this to our needs.

Reenable admin SSR

After #51 and reducing the number of furs in the queue, we want to reenable SSR for the admin page.

We might be blocked by bluesky-social/atproto#910.

Handle deletion events from firehose

Persist websocket cursor and resume upon restart

Right now we lose events during restarts or outages. Let's persist the cursor and resume from the last known position.

We should probably persist the cursor only every X seconds (5?) to avoid hammering the database. The cursor can be a bit behind as long as we ensure we gracefully handle duplicate inserts.
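The throttling idea can be sketched as a small helper that remembers the newest sequence number and writes it out at most once per interval. `cursorSaver` and its `save` callback are hypothetical, times are passed as Unix seconds for determinism, and the sketch assumes `Observe` is only called once every event up to `seq` has been handled.

```go
package main

import "fmt"

// cursorSaver remembers the latest firehose sequence number and persists it
// at most once per intervalSeconds.
type cursorSaver struct {
	intervalSeconds int64
	save            func(seq int64) error // hypothetical database write
	latest          int64
	lastSavedAt     int64
	savedOnce       bool
}

func newCursorSaver(intervalSeconds int64, save func(seq int64) error) *cursorSaver {
	return &cursorSaver{intervalSeconds: intervalSeconds, save: save}
}

// Observe records seq as handled and persists the newest sequence number if
// the interval has elapsed since the last write. nowUnix is the current time
// in Unix seconds.
func (c *cursorSaver) Observe(seq, nowUnix int64) error {
	if seq > c.latest {
		c.latest = seq
	}
	if c.savedOnce && nowUnix-c.lastSavedAt < c.intervalSeconds {
		return nil // too soon; the stored cursor stays slightly behind
	}
	if err := c.save(c.latest); err != nil {
		return err
	}
	c.lastSavedAt, c.savedOnce = nowUnix, true
	return nil
}

func main() {
	saver := newCursorSaver(5, func(seq int64) error {
		fmt.Println("persisting cursor", seq)
		return nil
	})
	// Four events arrive over five seconds; only two writes are issued.
	saver.Observe(10, 0)
	saver.Observe(11, 1)
	saver.Observe(12, 4)
	saver.Observe(13, 5)
}
```

Because the stored cursor can lag by up to one interval, a restart replays a few already-seen events, which is why inserts need to tolerate duplicates (in PostgreSQL, `INSERT ... ON CONFLICT DO NOTHING` is one way to do that).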

Track actor profile changes

It'd be nice to track the content of users' profiles over time. We should capture it on their addition to the database and then we should capture changes to this profile over the firehose.

This'll be helpful in a few ways:

Moderation: If someone removes hateful content from their bio, our moderation team will still be able to view it.
Feed generation: we can detect certain content within bios/handles (e.g. "denfur") and automatically push their posts into a feed.
Content Rating: we can detect certain content within bios/handles and use this to exclude all of their posts from the clean feeds (e.g. "🔞" or "minors dni").

Upgrade bluesky-social/indigo dependency

In order to start a test PDS and BGS, we need to upgrade https://github.com/bluesky-social/indigo as they have recently started exporting relevant helpers (such as testing.TestPDS).

We should also check out the diff to see if there’s other updated or newly added functionality we can leverage: bluesky-social/indigo@da6d879...fdb5f97

Audit trail for moderation actions

Record when, what, who and why

Moderation: Approval Queue MVP

We need a web dashboard so I can onboard moderators who can approve new accounts to be "furries"

Thoughts:

Login with bsky
Use bsky tokens for authentication against the gRPC BFF API and also to make calls directly to bsky.
Semi-responsive UI - we don't need it to work on mobile, but something iPad-shaped should look sane.
Deploy with Vercel as we do for the info site (probably admin.furryli.st)
#20
#21

Cannot approve actor with no profile configured

{
  "insertId": "m6yn1gtpdlxigfgx",
  "jsonPayload": {
    "error": "updating profile and following actor: getting profile: resolving rpath within mst: mst: not found",
    "stacktrace": "github.com/strideynet/bsky-furry-feed/api.New.unaryLoggingInterceptor.func1.1\n\t/app/api/server.go:132\nconnectrpc.com/connect.NewUnaryHandler[...].func2\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:81\nconnectrpc.com/connect.(*Handler).ServeHTTP\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:239\ngithub.com/strideynet/bsky-furry-feed/proto/bff/v1/bffv1pbconnect.NewModerationServiceHandler.func1\n\t/app/proto/bff/v1/bffv1pbconnect/moderation_service.connect.go:307\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2514\ngithub.com/strideynet/bsky-furry-feed/api.New.(*Cors).Handler.func3\n\t/go/pkg/mod/github.com/rs/[email protected]/cors.go:236\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2938\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:2009",
    "level": "error",
    "caller": "api/server.go:132",
    "msg": "gRPC request failed",
    "logger": "api",
    "ts": 1692556311.928075,
    "procedure": "/bff.v1.ModerationService/ProcessApprovalQueue"
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "pod_name": "bff-api-5bcf76cd9c-27prf",
      "container_name": "api",
      "cluster_name": "us-east",
      "namespace_name": "default",
      "location": "us-east1",
      "project_id": "bsky-furry-feed"
    }
  },
  "timestamp": "2023-08-20T18:31:51.928298438Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/app_kubernetes_io/name": "api",
    "compute.googleapis.com/resource_name": "gk3-us-east-default-pool-4c438cbe-216m",
    "k8s-pod/pod-template-hash": "5bcf76cd9c",
    "k8s-pod/app_kubernetes_io/part-of": "bff"
  },
  "logName": "projects/bsky-furry-feed/logs/stderr",
  "receiveTimestamp": "2023-08-20T18:31:55.082269667Z"
}

Set up integration tests with PDS/BGS

We can use Bluesky’s Indigo to set up a PDS/BGS and then run integration tests on the ingest server.

The goal is to test that an event results in a database entry (or update in the case of deletions).

Add mod action icon and UI text for creating actor

Improve dev environment setup

Thoughts:

Perhaps we ought to have a tools.go for migrate, buf and golangci-lint
Improve documentation in developing.md
Add more env vars where necessary to make configuring the behaviour of bffsrv easier for local envs.

Refactor Transactions

Take a look at how we handle approving an actor: a number of the queries fall outside of the transaction. Find a way to put them all inside the same transaction so that when things go wrong, we don't leave actions half completed.

Light mode improvements

The UI is at times designed too much for dark mode, so some elements might look weird in light mode.

For example, the audit action items:

Backend support for hiding actor or post

Feed pagination

Self labels test keeps failing

The integration test for self-labelling keeps failing in CI.

Logs

    logger.go:130: 2023-08-20T09:31:13.476Z	ERROR	failed to handle repo commit	{"error": "create (at://did:plc:c1e1ee347119cd4f/app.bsky.feed.post/3k5exq62yqer): handling record create: handling app.bsky.feed.post create: creating post: executing CreateCandidatePost query: failed to encode args[6]: unable to encode &bsky.FeedPost{LexiconTypeID:\"app.bsky.feed.post\", CreatedAt:\"2023-08-20T09:31:13.386Z\", Embed:(*bsky.FeedPost_Embed)(nil), Entities:[]*bsky.FeedPost_Entity(nil), Facets:[]*bsky.RichtextFacet(nil), Labels:(*bsky.FeedPost_Labels)(0xc000584c90), Langs:[]string(nil), Reply:(*bsky.FeedPost_ReplyRef)(nil), Text:\"paws paws paws\"} into text format for jsonb (OID 3802): json: error calling MarshalJSON for type *bsky.FeedPost_Labels: cannot marshal empty enum"}

...

    ingester_test.go:280: 
        	Error Trace:	/home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:282
        	            				/opt/hostedtoolcache/go/1.21.0/x64/src/runtime/asm_amd64.s:1650
        	Error:      	Received unexpected error:
        	            	no rows in result set
    ingester_test.go:280: 
        	Error Trace:	/home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:280
        	Error:      	Condition never satisfied
        	Test:       	TestFirehoseIngester/waiting_for_posts/self_labels