strideynet / bsky-furry-feed
A BlueSky custom feed generator for furry content!
Home Page: https://bsky-furry-feed-info.vercel.app
License: MIT License
Following #44, we want to add a CI job to check that buf generate was executed by the developer.
Determine the rules by which we will determine when to:
This'll let us dynamically add or remove mods without having to make changes to the hard-coded list...
We've been changing a lot with the different feeds recently, so it would help to have one place where users can view information about the feeds and how to pin them to their main timeline.
See also this reply to qdot.
As a first iteration, we can just expand the "How do I get started?" point on https://furryli.st.
Right now, a dropped websocket stream causes the ingester to exit, and k8s takes care of restarting it. We should handle this in-app and report an unhealthy state while we are unable to bring the websocket stream back online.
Admin comments will often be informally structured in some way. Allowing line breaks is most important for clear structuring. Later, we can add other formatting as well.
This will make this service a bit more reliable (as one failing won't cause the other to fail) & let us adjust resource allocation more granularly.
It'd be nice to track the content of users' profiles over time. We should capture a profile when its user is added to the database, and then capture changes to it from the firehose.
This'll be helpful in a few ways:
{
"insertId": "m6yn1gtpdlxigfgx",
"jsonPayload": {
"error": "updating profile and following actor: getting profile: resolving rpath within mst: mst: not found",
"stacktrace": "github.com/strideynet/bsky-furry-feed/api.New.unaryLoggingInterceptor.func1.1\n\t/app/api/server.go:132\nconnectrpc.com/connect.NewUnaryHandler[...].func2\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:81\nconnectrpc.com/connect.(*Handler).ServeHTTP\n\t/go/pkg/mod/connectrpc.com/[email protected]/handler.go:239\ngithub.com/strideynet/bsky-furry-feed/proto/bff/v1/bffv1pbconnect.NewModerationServiceHandler.func1\n\t/app/proto/bff/v1/bffv1pbconnect/moderation_service.connect.go:307\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2514\ngithub.com/strideynet/bsky-furry-feed/api.New.(*Cors).Handler.func3\n\t/go/pkg/mod/github.com/rs/[email protected]/cors.go:236\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2136\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2938\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:2009",
"level": "error",
"caller": "api/server.go:132",
"msg": "gRPC request failed",
"logger": "api",
"ts": 1692556311.928075,
"procedure": "/bff.v1.ModerationService/ProcessApprovalQueue"
},
"resource": {
"type": "k8s_container",
"labels": {
"pod_name": "bff-api-5bcf76cd9c-27prf",
"container_name": "api",
"cluster_name": "us-east",
"namespace_name": "default",
"location": "us-east1",
"project_id": "bsky-furry-feed"
}
},
"timestamp": "2023-08-20T18:31:51.928298438Z",
"severity": "ERROR",
"labels": {
"k8s-pod/app_kubernetes_io/name": "api",
"compute.googleapis.com/resource_name": "gk3-us-east-default-pool-4c438cbe-216m",
"k8s-pod/pod-template-hash": "5bcf76cd9c",
"k8s-pod/app_kubernetes_io/part-of": "bff"
},
"logName": "projects/bsky-furry-feed/logs/stderr",
"receiveTimestamp": "2023-08-20T18:31:55.082269667Z"
}
Move any After Dark content out of the primary feed and into its own feed.
Do this based on:
The integration test for self-labelling keeps failing in CI.
See also https://github.com/strideynet/bsky-furry-feed/actions/runs/5916587573/job/16044028143?pr=148
logger.go:130: 2023-08-20T09:31:13.476Z ERROR failed to handle repo commit {"error": "create (at://did:plc:c1e1ee347119cd4f/app.bsky.feed.post/3k5exq62yqer): handling record create: handling app.bsky.feed.post create: creating post: executing CreateCandidatePost query: failed to encode args[6]: unable to encode &bsky.FeedPost{LexiconTypeID:\"app.bsky.feed.post\", CreatedAt:\"2023-08-20T09:31:13.386Z\", Embed:(*bsky.FeedPost_Embed)(nil), Entities:[]*bsky.FeedPost_Entity(nil), Facets:[]*bsky.RichtextFacet(nil), Labels:(*bsky.FeedPost_Labels)(0xc000584c90), Langs:[]string(nil), Reply:(*bsky.FeedPost_ReplyRef)(nil), Text:\"paws paws paws\"} into text format for jsonb (OID 3802): json: error calling MarshalJSON for type *bsky.FeedPost_Labels: cannot marshal empty enum"}
...
ingester_test.go:280:
Error Trace: /home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:282
/opt/hostedtoolcache/go/1.21.0/x64/src/runtime/asm_amd64.s:1650
Error: Received unexpected error:
no rows in result set
ingester_test.go:280:
Error Trace: /home/runner/work/bsky-furry-feed/bsky-furry-feed/ingester/ingester_test.go:280
Error: Condition never satisfied
Test: TestFirehoseIngester/waiting_for_posts/self_labels
Currently, the firehose ingester does something like this:
evtChan
.evtChan
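and processes it in parallel.
Consider a scenario where the firehose emits the following two operations in separate commits: follow -> unfollow. Because the commits are processed in parallel, the unfollow may be handled before the follow, leaving the follow recorded.
However, the correct state is that no follow exists in the repo.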
I think we can fix this by either:
We need the ability to have less privileged moderators during their training period. An RBAC model based on casbin or OPA will give us the flexibility to tune this to our needs.
for e.g. con feeds, it may be useful to just always put con account skeets onto the feed even if they aren't tagged with the con.
one consideration is that if bff does not know about the con account, it won't ingest any of its posts. should the con account be added directly into the database without requiring them to follow furryli.st? should con skeets be shown on other feeds too?
The baseUrl in web/admin/composables/useAPI.ts should be configurable based on the environment.
We need a web dashboard so I can onboard moderators who can approve new accounts to be "furries"
Thoughts:
Right now we lose events during restarts or outages. Let's persist the cursor and resume from the last known position.
We potentially need to persist the cursor only every X seconds (5?) to avoid hammering the database. The cursor can be a bit behind as long as we gracefully handle duplicate inserts.
Let’s look into TanStack Query for fetching data. This should save us from having to handle errors explicitly and make it easy to refresh data.
Minor issue: right now feed cursors only take the timestamp into account, so if two posts with identical timestamps straddle a pagination boundary, the second post can be lost.
Thoughts:
tools.go for migrate, buf and golangci-lint
developing.md
bffsrv easier for local envs.
More of a cosmetic thing, but you can’t distinguish the disabled action buttons from the active ones.
Currently, we need to make a new client before we do anything - we should have a single client that will manage its own credential lifetime.
Cache the "first" view of the feed in memory and update every X minutes, this should alleviate database load and improve user experience.
I would love to try and do this
Following #68, the admin site will have two kinds of pages, namely the queue (on /) and profiles (on /users/[did]). This makes navigation UX more complicated.
Once (and if) we have the backend ability to list users and their profiles, this distinction will become clearer. For now, we can replace the Admin text with a Home link.
We can use Bluesky’s Indigo to set up a PDS/BGS and then run integration tests on the ingest server.
The goal is to test that an event results in a database entry (or update in the case of deletions).
Right now, we calculate post hotness on the fly, which makes cursoring and performance problematic.
We should calculate these every X minutes and persist them to a table we can join across. We should support multiple "algos" and store the last X post history.
Possibly only generate post hotness for posts within the last X hours (possibly 48/72) to keep the size of this table bounded.
We have protobuf generated files and generated SQL which could get forgotten in PRs. Run buf generate and sqlc generate and check the diff; if there are any changes, fail the CI run.
Record when, what, who and why
After #51 and reducing the number of furs in the queue, we want to reenable SSR for the admin page.
We might be blocked by bluesky-social/atproto#910.
In order to mark users as artists, add mod comments, or—in the worst case—remove them from the feed, we need a place to process these actions in the admin app.
Take a look at how we handle approving an actor: a number of the queries fall outside of the transaction. Find a way to put them all inside the same transaction so that when things go wrong, we don't leave actions half completed.
Right now, the ingester is responsible for scanning hash tags and the media status of a post and then assigning it a "tag" from a pre-selected list. This means when we add new hash tags for new feeds, we end up having to wait for new data to come in rather than being able to use old data. It feels like the classification is essentially too premature.
What I'd suggest instead is we start extracting more "general" data about a post and deprecate the tags field. I'd like to see the following new columns:
hashtags - I think there's still a big performance benefit in us storing hashtags as an array column rather than doing a text search during feed generation. But rather than only supporting a fixed list, we should extract all hash tags within the post and normalize them (e.g. lowercase them).
media_attached - A simple boolean value indicating if the post has any media.
raw - The JSON representation of the raw resource we received from the firehose. We probably won't query on this as part of standard feed generation, but we can use it for migrations in future which add new extracted fields.
I think with these new columns, we'll still have good performance but we'll be more able to design new feeds on the fly without needing to modify the ingester and wait for posts matching that description to be made.
In order to start a test PDS and BGS, we need to upgrade https://github.com/bluesky-social/indigo as they have recently started exporting relevant helpers (such as testing.TestPDS).
We should also check out the diff to see if there’s other updated or newly added functionality we can leverage: bluesky-social/indigo@da6d879...fdb5f97
This feed should be chronological and pick up any post with the hash tag #comms-open. Perhaps only show the latest post from each actor to stop it being spammed.