Comments (5)
@JLO64 firstly, thanks for all the great work on this - yay for open source!
A few comments from me, and I'll let @garritfra opine too.
Broken sites
We definitely need to do something with these. We currently have the site checker which spits out a list of any status that isn't 200, then we manually go through the list. We haven't run it for a while, which is probably why you're seeing lots of broken sites.
Having something automated to clean up would be better for obvious reasons, but I think we need some kind of safety net so that valid sites aren't accidentally deleted. Maybe a cron job that runs regularly, and if a site keeps failing for more than, say, 7 days, it gets deleted. I'm just spit-balling ideas here, as this is all way out of my skillset.
If that can't be done easily, we will continue with the manual script and cleanup.
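The safety-net idea above could be sketched roughly like this. It is only an illustration, not the existing site checker: `should_remove`, `first_failed`, and the 7-day `GRACE_PERIOD` are hypothetical names and values taken from the "say, 7 days" suggestion.

```python
from datetime import date, timedelta

# Assumption: the grace period suggested in the comment ("say, 7 days").
GRACE_PERIOD = timedelta(days=7)


def should_remove(first_failed, today, grace=GRACE_PERIOD):
    """Return True only if a site has been failing for longer than the grace period.

    first_failed is the date of the first check in the current unbroken run
    of failures, or None if the most recent check passed.
    """
    if first_failed is None:
        return False
    return today - first_failed > grace
```

A scheduled job would reset `first_failed` to `None` whenever a check passes, so a single bad day never leads to a deletion.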
Cloudflare auth and `.env` file
I can set up an account so it's tied to this project. I wonder if, instead of an `.env` file, we could use GitHub secrets? That way we can work it all into GH Actions, as you mentioned.
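For illustration, a script can read the token from an environment variable, which works both locally and in GitHub Actions (where a workflow step can map a repository secret into the step's environment). This is a minimal sketch; the variable name `CLOUDFLARE_API_TOKEN` and the helper `load_api_token` are assumptions, not names from the actual script.

```python
import os


def load_api_token(env=os.environ, name="CLOUDFLARE_API_TOKEN"):
    """Read the API token from the environment and fail loudly if it is missing.

    The variable name is hypothetical; use whatever name the workflow exposes.
    """
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set")
    return token
```

Failing early with a clear message is easier to debug in a CI log than an authentication error from the API later on.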
Markdown file output
I think that's fine - whatever makes it easiest to parse is good with me.
Thanks again,
Kev
from 512kb.club.
No objections!
An idea regarding the automatic size check: Renovate, the Dependabot competitor, opens an issue in each repository it's active in and lists all dependencies that are out of date. Maybe this approach works here as well?
We could have an issue that regularly gets updated by an action, listing all domains that could be removed, alongside how long each has been down or above 512 kB. To avoid having to maintain a backend, the body of the issue should be in machine-readable form.
If we want, we can even add checkmarks next to the domains to schedule removal, or remove any domains that were unhealthy for n days (or n iterations of the check).
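One possible shape for such a machine-readable issue body is a Markdown task list with one domain per line. The sketch below is an assumption about the format, not an agreed design; `render_issue_body`, `parse_checked_domains`, and the `domain | reason | N days` layout are all hypothetical.

```python
import re


def render_issue_body(unhealthy):
    """Build the issue body from (domain, reason, days) tuples."""
    lines = ["Domains that could be removed:", ""]
    for domain, reason, days in unhealthy:
        lines.append(f"- [ ] {domain} | {reason} | {days} days")
    return "\n".join(lines)


# Matches one task-list line in the hypothetical format produced above.
TASK_LINE = re.compile(
    r"^- \[(?P<mark>[ xX])\] (?P<domain>\S+) \| (?P<reason>.+) \| (?P<days>\d+) days$"
)


def parse_checked_domains(body):
    """Return the domains whose checkbox has been ticked in the issue body."""
    checked = []
    for line in body.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("mark").lower() == "x":
            checked.append(match.group("domain"))
    return checked
```

An action could call `parse_checked_domains` on the current issue body to find domains a maintainer has scheduled for removal, then rewrite the body with `render_issue_body` after each check.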
These are just my unfiltered thoughts; I'm totally open to remarks or alternative solutions.
Sorry for the lack of updates, I got really sick on Monday and am still recovering.
Quick update on the script. I tried running it overnight, but it errored out after ~250 entries. I'm not quite certain why, since it ran properly when I restarted it from the entry it had failed at. Additionally, I think there's some form of rate limiting going on.
There is a pattern throughout the table: roughly every 60 entries, the script fails to scan about 10 entries in a row. At most, the script queries the API 2 times per 20 seconds, which is well within the limits of both the overall API and that particular endpoint.
Additionally, a bunch of sites are failing seemingly at random despite being accessible via a browser and Cloudflare. The frequency of this is low, at roughly 1 per 40 entries.
Thankfully, I can just have the script retest these entries by having it sort `sites.yml` by `last_passed` instead of `last_checked`. I'll be doing that once I'm done with the entire list of sites.
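As a rough illustration of that retest ordering (not the actual script), entries could be sorted so that the ones with the oldest or missing `last_passed` come first. The dict layout and the helper name `retest_order` are assumptions.

```python
from datetime import date


def retest_order(sites):
    """Sort entries so the ones that passed longest ago (or never) come first.

    Each entry is assumed to be a dict with an optional "last_passed" date.
    """
    return sorted(sites, key=lambda site: site.get("last_passed") or date.min)
```

Entries that failed in the latest run keep an older `last_passed`, so they naturally move to the front of the queue.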
Long term I don't think this will be an issue if the script is run periodically via GitHub Actions for just a couple of sites at a time, but this is going to be a problem if we ever have to rerun the entire list again. For now I'll PR the script as it is. I haven't changed the API token stuff, documentation, or comments. I'll do that in later PRs.
Sorry for not responding sooner! I've had a hectic weekend and haven't been able to touch VS Code or GitHub, but I'll definitely be able to submit a PR based on the above comments soon.
I wish I could contribute more on the GitHub Actions side of things, but I've never used it before. I'm going to give myself a crash course on how it all works this week, but for now I'll abstain from commenting on that. That said, switching to GitHub Secrets seems like a good idea!
Regarding broken links and websites larger than 512KB, maybe new variables should be added to the YAML entries: `last_checked` and `last_passed`? (EDIT: I went to sleep late last night, so I didn't realize that we already have a variable for the date last checked lol) These could additionally be used to filter links for the website by adding a Liquid if statement to `index.md` that checks that the two dates are the same. This could really help the QoL of someone just browsing the list.
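The filtering rule itself is simple; here it is expressed in Python purely for illustration (on the site it would be a Liquid condition in the template). The dict layout and the helper name `passing_sites` are assumptions.

```python
def passing_sites(sites):
    """Keep only entries whose most recent check was also a passing check.

    An entry qualifies when "last_passed" is present and equals "last_checked".
    """
    return [
        site
        for site in sites
        if site.get("last_passed") is not None
        and site.get("last_passed") == site.get("last_checked")
    ]
```

Entries that have never passed are excluded as well, since they have no `last_passed` value to compare.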
This looks fantastic!