strideynet / bsky-furry-feed

A Bluesky custom feed generator for furry content!

Home Page: https://bsky-furry-feed-info.vercel.app

License: MIT License

Go 55.24% Dockerfile 0.10% HCL 1.43% HTML 0.30% JavaScript 7.11% Shell 0.01% Makefile 0.15% Svelte 8.45% TypeScript 10.06% SCSS 0.09% Vue 17.03% CSS 0.04%
atproto atprotocol bluesky

bsky-furry-feed's Introduction

Howdy

Hey! I’m Noah, a software engineer based in London. Outside of work, you can find me DJing, climbing or playing video games. I’m a huge music fiend and can almost always be found listening to music. In the winter, I’ll often be seen hitting the slopes.

I have a blog (that I sometimes update) at https://noahstride.co.uk.

👨‍💻 Work

I’m a product-focussed engineer on a mission to meet users’ needs - whether they be an internal team or a customer. You’ll find me happiest in a role which lets me engage with users and participate actively in the product process, and solve plenty of challenging technical problems!

I currently lead a team that builds products in the workload identity space (an emerging field that’s captured my heart), but over the past few years I’ve worked as a backend engineer and SRE at a variety of startups.

For all the details, check out my CV.

GitHub

FOSS is awesome, and I've made a few contributions here and there. Here are projects I'm a maintainer of:

bsky-furry-feed's People

Contributors

ab-gh, itstolf, kevslashnull, kiosion, strideynet, vilkoviak, xhayper

bsky-furry-feed's Issues

Add line breaks to admin comments

Admin comments will often be informally structured in some way. Allowing line breaks is most important for clear structuring. Later, we can add other formatting as well.

Cannot approve actor with no profile configured

{
  "insertId": "m6yn1gtpdlxigfgx",
  "jsonPayload": {
    "error": "updating profile and following actor: getting profile: resolving rpath within mst: mst: not found",
    "stacktrace": "github.com/strideynet/bsky-furry-feed/api.New.unaryLoggingInterceptor.func1.1\n\t/app/api/server.go:132\nconnectrpc.com/connect.NewUnaryHandler[...].func2\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:81\nconnectrpc.com/connect.(*Handler).ServeHTTP\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:239\ngithub.com/strideynet/bsky-furry-feed/proto/bff/v1/bffv1pbconnect.NewModerationServiceHandler.func1\n\t/app/proto/bff/v1/bffv1pbconnect/moderation_service.connect.go:307\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2514\ngithub.com/strideynet/bsky-furry-feed/api.New.(*Cors).Handler.func3\n\t/go/pkg/mod/github.com/rs/[email protected]/cors.go:236\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2938\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:2009",
    "level": "error",
    "caller": "api/server.go:132",
    "msg": "gRPC request failed",
    "logger": "api",
    "ts": 1692556311.928075,
    "procedure": "/bff.v1.ModerationService/ProcessApprovalQueue"
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "pod_name": "bff-api-5bcf76cd9c-27prf",
      "container_name": "api",
      "cluster_name": "us-east",
      "namespace_name": "default",
      "location": "us-east1",
      "project_id": "bsky-furry-feed"
    }
  },
  "timestamp": "2023-08-20T18:31:51.928298438Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/app_kubernetes_io/name": "api",
    "compute.googleapis.com/resource_name": "gk3-us-east-default-pool-4c438cbe-216m",
    "k8s-pod/pod-template-hash": "5bcf76cd9c",
    "k8s-pod/app_kubernetes_io/part-of": "bff"
  },
  "logName": "projects/bsky-furry-feed/logs/stderr",
  "receiveTimestamp": "2023-08-20T18:31:55.082269667Z"
}

Introduce #commissions-open feed

This feed should be chronological and pick up any post with the hashtag #comms-open. Perhaps only show the latest post from each actor to stop it being spammed.

More resilient handling of websocket stream dying

Right now, a dying websocket stream causes the ingester to exit, and k8s then takes care of restarting it. We should handle this in-app and report an unhealthy state while we are unable to bring the websocket stream back online.

After Dark feed

Move any After Dark content out of the primary feed and into its own feed.

Do this based on:

  • Labels of posts at ingest
  • Post manually marked AD by a moderator
  • Account marked AD by a moderator
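The three criteria above could be combined into a single check along these lines. The `PostInfo` fields are hypothetical, and the choice of which self-label values count as After Dark is an assumption to be settled by the community guidelines.

```go
package main

// PostInfo is a hypothetical summary of what is known about a post.
type PostInfo struct {
	Labels     []string // labels present on the post at ingest
	MarkedAD   bool     // post manually marked AD by a moderator
	AuthorIsAD bool     // author's account marked AD by a moderator
}

// adLabels lists label values treated as After Dark.
// Assumption: the exact set is a policy decision, not taken from the source.
var adLabels = map[string]bool{
	"porn":   true,
	"sexual": true,
	"nudity": true,
}

// isAfterDark reports whether a post belongs in the After Dark feed
// rather than the primary feed.
func isAfterDark(p PostInfo) bool {
	if p.MarkedAD || p.AuthorIsAD {
		return true
	}
	for _, l := range p.Labels {
		if adLabels[l] {
			return true
		}
	}
	return false
}
```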

Persist websocket cursor and resume upon restart

Right now we lose events during restarts or outages. Let's persist the cursor and resume from the last known position.

We potentially need to persist the cursor only every X seconds (5?) to avoid hammering the database. The cursor can be a bit behind as long as we ensure we gracefully handle duplicate inserts.

Set up integration tests with PDS/BGS

We can use Bluesky’s Indigo to set up a PDS/BGS and then run integration tests on the ingest server.

The goal is to test that an event results in a database entry (or update in the case of deletions).

Improve admin navigation

Following #68, the admin site will have two kinds of pages, namely the queue (on /) and profiles (on /users/[did]). This makes navigation UX more complicated.

Once (and if) we have the backend ability to list users and their profiles, this distinction will become clearer. For now, we can replace the Admin text with a Home link.

Remove premature post tagging from ingester

Right now, the ingester is responsible for scanning hash tags and the media status of a post and then assigning it a "tag" from a pre-selected list. This means when we add new hash tags for new feeds, we end up having to wait for new data to come in rather than being able to use old data. It feels like the classification is essentially too premature.

What I'd suggest instead is we start extracting more "general" data about a post and deprecate the tags field. I'd like to see the following new columns:

  • hashtags - I think there's still a big performance benefit in us storing hashtags as an array column rather than doing a text search during feed generation. But rather than only supporting a fixed list, we should extract all hashtags within the post and normalize them (e.g. lowercase them).
  • media_attached - A simple boolean value indicating if the post has any media.
  • raw - The JSON representation of the raw resource we received from the firehose. We probably won't query on this as part of standard feed generation, but we can use it for migrations in future which add new extracted fields.

I think with these new columns, we'll still have good performance but we'll be more able to design new feeds on the fly without needing to modify the ingester and wait for posts matching that description to be made.
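Extracting and normalizing every hashtag in a post could look roughly like this. It is a text-based sketch: the regular expression is an assumption about what counts as a hashtag, and `extractHashtags` is a hypothetical helper, not the ingester's actual code.

```go
package main

import (
	"regexp"
	"strings"
)

// hashtagRe matches '#' followed by letters, digits, underscores or hyphens.
// Assumption: this character set is illustrative, not a specification.
var hashtagRe = regexp.MustCompile(`#([\p{L}\p{N}_-]+)`)

// extractHashtags returns the distinct hashtags in text, lowercased and
// without the leading '#', in order of first appearance.
func extractHashtags(text string) []string {
	seen := make(map[string]bool)
	tags := []string{}
	for _, m := range hashtagRe.FindAllStringSubmatch(text, -1) {
		tag := strings.ToLower(m[1])
		if !seen[tag] {
			seen[tag] = true
			tags = append(tags, tag)
		}
	}
	return tags
}
```

The resulting slice maps directly onto an array column, so a new feed only needs a new query over existing rows.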

Moderation: Approval Queue MVP

We need a web dashboard so I can onboard moderators who can approve new accounts to be "furries"

Thoughts:

  • Login with bsky

  • Use bsky tokens for authentication against the gRPC BFF API and also to make calls directly to bsky.

  • Semi-responsive UI - we don't need it to work on mobile, but something iPad-shaped should look sane.

  • Deploy with Vercel as we do for the info site (probably admin.furryli.st)

  • #20

  • #21

add support for "pinned" users on certain feeds

for e.g. con feeds, it may be useful to just always put con account skeets onto the feed even if they aren't tagged with the con.

one consideration is that if bff does not know about the con account, it won't ingest any of its posts. should the con account be added directly into the database without requiring them to follow furryli.st? should con skeets be shown on other feeds too?

Track actor profile changes

It'd be nice to track the content of users' profiles over time. We should capture it on their addition to the database and then capture changes to the profile from the firehose.

This'll be helpful in a few ways:

  • Moderation: If someone removes hateful content from their bio, our moderation team will still be able to view it.
  • Feed generation: we can detect certain content within bios/handles (e.g. "denfur") and automatically push their posts into a feed.
  • Content Rating: we can detect certain content within bios/handles and use this to exclude all of their posts from the clean feeds (e.g. "🔞" or "minors dni")

Add moderation page for individual users

In order to mark users as artists, add mod comments, or—in the worst case—remove them from the feed, we need a place to process these actions in the admin app.

Light mode improvements

The UI is at times designed too much around dark mode, so some elements might look weird in light mode.

For example, the audit action items look off in light mode.

Community guidelines

Define the rules by which we will decide when to:

  • Mark a post as AD
  • Remove a post from all feeds
  • Mark an account as AD
  • Ban an account from all feeds

RBAC for Moderation API

We need the ability to have less privileged moderators during their training period. An RBAC model based on casbin or OPA will give us the flexibility to tune this to our needs.

Caching of feed first loads

Cache the "first" view of the feed in memory and update it every X minutes. This should alleviate database load and improve user experience.

Refactor Transactions

Take a look at how we handle approving an actor: a number of the queries fall outside of the transaction. Find a way to put them all inside the same transaction so that when things go wrong, we don't leave actions half-completed.

Background Post Hotness generation

Right now, we calculate post hotness on the fly which makes cursoring and performance problematic.

We should calculate these every X minutes and persist them to a table we can join across. We should support multiple "algos" and store the last X post history.

Possibly only generate Post Hotness for posts within the last X hours (possibly 48/72) to keep the size of this table bounded.

Self labels test keeps failing

The integration test for self-labelling keeps failing in CI.

Logs

See also https://github.com/strideynet/bsky-furry-feed/actions/runs/5916587573/job/16044028143?pr=148

    logger.go:130: 2023-08-20T09:31:13.476Z	ERROR	failed to handle repo commit	{"error": "create (at://did:plc:c1e1ee347119cd4f/app.bsky.feed.post/3k5exq62yqer): handling record create: handling app.bsky.feed.post create: creating post: executing CreateCandidatePost query: failed to encode args[6]: unable to encode &bsky.FeedPost{LexiconTypeID:\"app.bsky.feed.post\", CreatedAt:\"2023-08-20T09:31:13.386Z\", Embed:(*bsky.FeedPost_Embed)(nil), Entities:[]*bsky.FeedPost_Entity(nil), Facets:[]*bsky.RichtextFacet(nil), Labels:(*bsky.FeedPost_Labels)(0xc000584c90), Langs:[]string(nil), Reply:(*bsky.FeedPost_ReplyRef)(nil), Text:\"paws paws paws\"} into text format for jsonb (OID 3802): json: error calling MarshalJSON for type *bsky.FeedPost_Labels: cannot marshal empty enum"}

...

    ingester_test.go:280: 
        	Error Trace:	/home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:282
        	            				/opt/hostedtoolcache/go/1.21.0/x64/src/runtime/asm_amd64.s:1650
        	Error:      	Received unexpected error:
        	            	no rows in result set
    ingester_test.go:280: 
        	Error Trace:	/home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:280
        	Error:      	Condition never satisfied
        	Test:       	TestFirehoseIngester/waiting_for_posts/self_labels

Improve dev environment setup

Thoughts:

  • Perhaps we ought to have a tools.go for migrate, buf and golangci-lint
  • Improve documentation in developing.md
  • Add more env vars where necessary to make configuring the behaviour of bffsrv easier for local envs.

firehose ingester: deletions can potentially be reordered before creations

Currently, the firehose ingester does something like this:

  • Receive commit event from firehose.
  • Send commit event to evtChan.
  • 1 of n workers reads from evtChan and processes it in parallel.

Consider a scenario where the firehose emits the following two operations in separate commits: follow -> unfollow

  • Assume the GraphFollow is fanned out to worker 1 and GraphUnfollow is fanned out to worker 2.
  • Assume that worker 2 is scheduled before worker 1.
  • Worker 2 processes unfollow: no-op/error, because the follow didn't exist in the repo in the first place.
  • Worker 1 processes follow: follow is recorded.

However, the correct state should be no follow exists in the repo.

I think we can fix this by either:

  • processing completely sequentially based on the collection (follows, posts, etc), with 1 worker per collection type (painful to scale out), or,
  • writing tombstones on delete instead of deleting the record, then cleaning up tombstones later (optional).
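The tombstone option can be illustrated with an in-memory stand-in for the follows table. `FollowStore` is hypothetical, and the sketch assumes each follow record has a unique URI that is never reused after deletion; in the database this would be a create that checks for an existing tombstone row.

```go
package main

import "sync"

// FollowStore is an in-memory stand-in for the follows table, keyed by the
// record URI of each follow.
type FollowStore struct {
	mu         sync.Mutex
	live       map[string]bool
	tombstones map[string]bool
}

// NewFollowStore returns an empty store.
func NewFollowStore() *FollowStore {
	return &FollowStore{live: map[string]bool{}, tombstones: map[string]bool{}}
}

// Create records the follow unless a delete for the same URI has already
// been processed, in which case the late create is ignored.
func (s *FollowStore) Create(uri string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.tombstones[uri] {
		return
	}
	s.live[uri] = true
}

// Delete removes the follow and leaves a tombstone behind, so the result is
// the same whichever worker runs first.
func (s *FollowStore) Delete(uri string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.live, uri)
	s.tombstones[uri] = true
}

// Exists reports whether the follow is currently recorded.
func (s *FollowStore) Exists(uri string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.live[uri]
}
```

With this rule, processing the unfollow before the follow ends in the same state as processing them in order: no follow exists.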
