cfc-servers / gm_express

An unlimited, out-of-band, bi-directional networking library for Garry's Mod
Home Page: https://gmod.express
License: GNU General Public License v3.0
The web is the web. Stuff happens.
Sometimes numbskulls like me accidentally push breaking changes, sometimes Cloudflare has outages, sometimes players have firewalls... etc.
So what is Express supposed to do when it can't reach the API? Should it fall back to regular net messages, effectively acting like (or maybe directly using) Netstream? This would complicate the code a bit, and would probably create some annoying regression issues, but it would make messages sent with Express much more reliable.
And what if only one party has issues connecting to the API? Would the Netstream solution work in this case too?
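A rough sketch of what the fallback path could look like, chunking over plain net messages when the API is down. Everything here is illustrative, not the real API: the `express_fallback` message name is made up, and the chunk size just respects GMod's ~64KB net message cap.

```lua
-- Hypothetical serverside fallback: chunk the payload over plain net messages
if SERVER then util.AddNetworkString( "express_fallback" ) end

local CHUNK_SIZE = 60000 -- stay under GMod's ~64KB net message cap

local function netFallbackSend( name, data, ply )
    local payload = util.Compress( util.TableToJSON( data ) )
    local total = math.ceil( #payload / CHUNK_SIZE )

    for i = 1, total do
        local chunk = string.sub( payload, ( i - 1 ) * CHUNK_SIZE + 1, i * CHUNK_SIZE )

        net.Start( "express_fallback" )
        net.WriteString( name )
        net.WriteUInt( i, 16 )
        net.WriteUInt( total, 16 )
        net.WriteUInt( #chunk, 32 )
        net.WriteData( chunk, #chunk )
        net.Send( ply )
    end
end
```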
As reported by another developer who implemented a similar system on their server, clients with very slow upload speeds (~1 Mbps) could experience timeouts when uploading data to Express.
They suggested that splitting the data into 500KB chunks worked for them. I want to experiment to see if we can find another solution that won't require too much data manipulation.
First, we could try setting the default timeout value on all of our HTTP requests to 2 or 3 minutes (it's 60s by default). We'd have to see what the ramifications of that would be. For example, how many pending HTTP requests can be active at the same time?
Default GMod uses HTTP/1.1, so something fancy like HTTP/2 streaming is out of the question.
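For reference, both ideas together might look roughly like this. The endpoint shape and query parameters are made up; `timeout` is the field GMod's HTTPRequest structure exposes (60s by default).

```lua
-- Illustrative only: raise the request timeout and upload in ~500KB
-- chunks, one at a time, so slow uplinks don't trip the limit
local CHUNK_SIZE = 500 * 1024

local function uploadInChunks( url, payload, onDone, onFail )
    local total = math.ceil( #payload / CHUNK_SIZE )
    local current = 0

    local function uploadNext()
        current = current + 1
        local chunk = string.sub( payload, ( current - 1 ) * CHUNK_SIZE + 1, current * CHUNK_SIZE )

        HTTP( {
            url = string.format( "%s?chunk=%d&of=%d", url, current, total ),
            method = "POST",
            body = chunk,
            type = "application/octet-stream",
            timeout = 180, -- up from the 60s default
            success = function()
                if current < total then uploadNext() else onDone() end
            end,
            failed = onFail
        } )
    end

    uploadNext()
end
```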
Right now, if the recipient sends back an incorrect proof, it would be a no-op because the callback wouldn't be found in the `awaitingProof` table (ref: https://github.com/CFC-Servers/gm_express/blob/main/lua/gm_express/sh_init.lua#L131-L141).
We should have some way of erroring or alerting of this case at minimum, but letting the sender define an error handling function would also be nice.
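Something along these lines, maybe. The entry layout and `onError` field are hypothetical; today the table just maps hashes to callbacks.

```lua
-- Hypothetical shape: entries carry an optional onError handler so bad
-- proofs are surfaced instead of silently dropped
local function onProofReceived( hash, proof )
    local entry = express._awaitingProof[hash]

    if not entry then
        -- Previously a silent no-op
        ErrorNoHaltWithStack( "Express: proof received for unknown hash: " .. tostring( hash ) )
        return
    end

    if proof ~= entry.expectedProof then
        if entry.onError then entry.onError( "proof mismatch", hash ) end
        return
    end

    express._awaitingProof[hash] = nil
    entry.callback()
end
```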
Reproduce:
1. Set the `express_domain` convar to something that resolves but doesn't respond, like `loopback.cfcservers.org:12345`
2. Set the `express_domain` convar back to `gmod.express`
After changing the convar, Express should basically re-initialize, attempting to check the version and re-register again
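GMod's `cvars.AddChangeCallback` should be enough to hook that up. `CheckRevision` and `Register` below are stand-ins for whatever the real version-check and registration functions are called.

```lua
-- Re-initialize Express whenever express_domain changes
cvars.AddChangeCallback( "express_domain", function( _, old, new )
    if old == new then return end

    express.CheckRevision() -- stand-in name
    express.Register()      -- stand-in name
end, "express_domain_reinit" )
```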
Right now, the server retrieves two tokens on startup: one for itself, and one for all clients.
This client token can be used to write or retrieve data for as long as it's in the KV store.
This creates an avenue for attack wherein a malicious client could use their key to upload any amount of data at any rate.
One low-hanging improvement: Have the server issue clients short-term (or even one-time) tokens. The client code would passively manage the tokens, asking for a new one when it needs one.
Another improvement: force the clients to inform the server of any would-be uploads before they happen. Clients could send the server the hash and size of their data, the server would respond with a specially crafted token (maybe a jwt?) to use for that specific upload. The API could then validate that the upload was entirely correct (size and hash match), responding with an error and maybe short-term API timeout.
Doing transactions in this way would require more overhead, but the server could, for any reason, refuse to grant a token for the proposed upload. This gives a lot of flexibility to server owners for, realistically, a minor performance hit.
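A sketch of what that handshake could look like from the Lua side. The message name, `maxUploadSize`, and `GrantUploadToken` are all hypothetical.

```lua
-- Client announces the hash + size of its would-be upload; the server
-- may grant a single-use token, or refuse for any reason
if SERVER then
    util.AddNetworkString( "express_request_token" )

    net.Receive( "express_request_token", function( _, ply )
        local size = net.ReadUInt( 32 )
        local hash = net.ReadString()

        if size > express.maxUploadSize then return end -- hypothetical cap

        -- e.g. ask the API for a JWT scoped to this exact hash + size
        express.GrantUploadToken( ply, hash, size, function( token )
            net.Start( "express_request_token" )
            net.WriteString( token )
            net.Send( ply )
        end )
    end )
end
```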
When running our GLuaTest suite in GitHub Actions, Express still reaches out to the default domain and does a revision check and registration.
This is unnecessary for our use case, so we should tell Express not to bother when it's in such an environment.
There may be a convar we can check (or create) that would indicate it's in a test environment. A little bit of discovery is necessary for this.
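Worst case, we could just add our own escape hatch (convar name made up; `CheckRevision`/`Register` are stand-in names again):

```lua
-- Skip the version check and registration when running under CI
local testMode = CreateConVar( "express_test_mode", "0", FCVAR_ARCHIVE,
    "Don't contact the Express API on startup (for test environments)" )

if not testMode:GetBool() then
    express.CheckRevision()
    express.Register()
end
```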
Perhaps the server wants to communicate via `http` but make the clients use `https`, or whatever else.
We should just make a couple of new convars and use them as-needed.
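Something like this, maybe (names are just suggestions):

```lua
-- Separate protocol convars for each realm
local protocol = CreateConVar(
    SERVER and "express_protocol" or "express_protocol_cl",
    "https", FCVAR_ARCHIVE, "Protocol used to reach the Express API"
)

-- Illustrative URL builder using the existing express_domain convar
local function makeBaseURL()
    return protocol:GetString() .. "://" .. GetConVar( "express_domain" ):GetString()
end
```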
Cloudflare's K/V has a major issue for Express: it doesn't guarantee that the value you store will be accessible in every region immediately.
For Express, this means the server could upload a payload that some clients simply can't see. Of course, the issue happens in reverse too: a client could upload a payload that the server can't see if they're in different regions.
K/V is super fast, faster than the alternatives, so I'd like to keep using it. I think the best plan is to also use Cloudflare's D1 as a backup once it reaches general availability.
So for every write, we'd store the data in both KV and D1, and for every retrieval, we'd check KV first, then fall back to D1 if the ID wasn't found.
Or, if D1 latency gets low enough, we could just use D1 entirely 🤷
Lots of measurements and testing to do.
A few open questions:
Here's the bit of code in question:
https://github.com/CFC-Servers/gm_express/blob/main/lua/gm_express/sh_init.lua#L131-L141
When a message is sent, it creates a new entry in the `express._awaitingProof` table, using the hash of the data (prefixed with the recipient's Steam ID, if called serverside), and then removes the entry from the table when proof is received.
But what should Express do if the same message with the same data is sent multiple times in a short timespan?
I suppose the expected behavior would be to get a callback for each message sent, but right now it'd only run the callback once (the first run would remove it from the callbacks table).
Perhaps we could make an incrementing `transactionID` that would get automatically sent and incremented with each message, and then use that number in the key for `express._awaitingProof`. Then, the recipient would reply with the same `transactionID` we sent them, and we'd use that to run the correct callback.
We could implement this transparently so the user doesn't have to worry about it, but I worry this could create a maybe-exploit where a malicious actor could reply with a different `transactionID`, potentially running the wrong callback. Granted, it would still be prefixed with their SteamID, so they'd only be running a callback we already expected them to run... I dunno.
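In code, the sender side might look roughly like this (key layout is illustrative):

```lua
-- Track each send under a transaction-scoped key so repeated sends of
-- identical data each keep their own callback
express._transactionID = 0

function express.trackProof( recipient, hash, callback )
    express._transactionID = express._transactionID + 1

    -- Same SteamID prefix as today, plus the transaction ID
    local prefix = SERVER and ( recipient:SteamID64() .. "-" ) or ""
    local key = prefix .. hash .. "-" .. express._transactionID

    express._awaitingProof[key] = callback

    -- Send this ID with the message; the recipient echoes it back and
    -- we rebuild the key to find the right callback
    return express._transactionID
end
```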
Just a braindump for now, will revisit when some of the more pressing tasks have been completed.
Hello. Do you have plans to support the JS backend somewhere other than Cloudflare? For me personally, and I think for many users of your library, it would be much more convenient to host these files on the same VDS.
With especially large data structures, just the process of preparing the data to send or read can be massively taxing on the server.
Because we already operate on callbacks, we could run these processes asynchronously to spread the work out over multiple ticks, easing any massive spikes that might occur.
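As a sketch, a coroutine resumed from `Think` with a small per-tick time budget would spread the work out. `processItem` here is a stand-in for whatever the actual per-item serialization step is.

```lua
-- Process a big table across multiple ticks instead of all at once
local function processAsync( items, processItem, onDone, budget )
    budget = budget or 0.002 -- seconds of work allowed per tick

    local co = coroutine.create( function()
        local deadline = SysTime() + budget

        for _, item in ipairs( items ) do
            processItem( item )

            if SysTime() >= deadline then
                coroutine.yield()
                deadline = SysTime() + budget
            end
        end
    end )

    local hookName = "ExpressAsync" .. tostring( co )

    hook.Add( "Think", hookName, function()
        coroutine.resume( co )

        if coroutine.status( co ) == "dead" then
            hook.Remove( "Think", hookName )
            onDone()
        end
    end )
end
```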
This probably belongs in the https://github.com/CFC-Servers/gm_express_service repo, but seeing as this is the most visible repo I'll put things here.
I'll need to make a new Cloudflare account and go through the process of setting this up so I can optimize and document the process.
A simple test suite to verify that functions run properly would be a good place to start.
To do this, we have to make proof sending mandatory, with optional callbacks for proof receiving.
Then, we can send a `success` bool on the proof message to indicate errors or timeouts.
This will let the sender handle cases where the recipient never received their message.
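The proof message itself only needs one extra bit. Message and parameter names here are illustrative.

```lua
-- Mandatory proof reply carrying a success flag
if SERVER then util.AddNetworkString( "express_proof" ) end

local function sendProof( hash, success, recipient )
    net.Start( "express_proof" )
    net.WriteString( hash )
    net.WriteBool( success ) -- false on decode failures, timeouts, etc.

    if SERVER then
        net.Send( recipient )
    else
        net.SendToServer()
    end
end
```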
This is a really annoying issue that, according to Cloudflare's docs, shouldn't be happening.
Cloudflare KV docs state:
> When you write to KV, your data is written to central data stores. It is not sent automatically to every location's cache.
>
> Initial reads from a location do not have a cached value. The data must be read from the nearest central data store, resulting in a slower response.
So what they're saying here is that KV achieves low latency by caching KV lookups. On the first read from a given location, it'll get a cache MISS and have to traverse the Cloudflare nodes to find the actual value, and then it'll cache that value to the location where the lookup occurred.
This is fine. Good, even - we can deal with that added latency for a single client.
A quick refresher on how Express works. Our current flow is something like this:
1. The Sender uploads its data to the Express Service and gets back a UUID
2. The Sender sends that UUID to the Recipient in a tiny net message
3. The Recipient asks the Express Service for the data belonging to that UUID
4. The Recipient downloads the data
However, what we're actually seeing is the Recipient asking the Express Service for the UUID and getting a 404!
How does that work? It's not possible for it to actually be a 404 at this point because the data definitely exists in KV. We wouldn't have the UUID unless it were already stored!
The worst part about this bug is that this 404 is actually cached for that location! Meaning anyone else trying to read the Data for that UUID just gets a cached 404 🤦‍♂️
Assuming this isn't some obscure bug with the Express Service (I suppose it's possible), there are a few different ways to tackle this.
1. Have clients retry for the data
(Proposed in #34)
We could just make the client ask for the data over and over until the cache busts? 😅
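Roughly, with `express.Get` standing in for the real fetch function:

```lua
-- Retry the fetch a few times with a growing delay, in case the first
-- read cached a 404 at our location
local function fetchWithRetry( id, callback, attempts )
    attempts = attempts or 5

    express.Get( id, function( success, data )
        if success then return callback( data ) end
        if attempts <= 1 then return callback( nil ) end

        timer.Simple( 0.5 * ( 6 - attempts ), function()
            fetchWithRetry( id, callback, attempts - 1 )
        end )
    end )
end
```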
2. Have the sender wait a brief moment before sending the net message with the data's UUID
(Proposed in #34)
If it really was a timing issue, perhaps delaying the net message by even a few fractions of a second could increase the chance of a successful first GET. This would be a very minor delay, but it would increase the minimum time of every message by that much.
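That part is trivial on the sender's side (the delay value and message name below are made up):

```lua
-- Give KV a moment to propagate before telling the recipient about the ID
local SEND_DELAY = 0.25 -- would need real-world tuning

local function announceID( id, recipient )
    timer.Simple( SEND_DELAY, function()
        net.Start( "express_announce" ) -- hypothetical message name
        net.WriteString( id )
        net.Send( recipient )
    end )
end
```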
3. Create a comprehensive cross-region, bare-bones example demonstrating this as a Cloudflare bug and ask Cloudflare to fix it
This should probably be done, if at least just to confirm that the bug is Cloudflare's like I'm describing it... But man, that's a lot of work.