4chan / 4chan-api
Documentation for 4chan's read-only JSON API.
Home Page: http://www.4chan.org/
I have a small personal project in Node.js and found that the 'If-Modified-Since' header does not seem to work (anymore): the API never returns the expected 304 when the JSON has not changed. This renders API rule 3 useless.
Here is a snippet for Node.js using request:
var request = require('request')

var options = {
  uri: 'url of a random json that has not been updated',
  json: true,
  // Only ask for the body if it changed after the current time
  headers: {'If-Modified-Since': new Date().toUTCString()}
}

request(options, function(err, res, body) {
  // res is undefined when the request itself fails
  if (err) return console.error(err)
  console.log(options.headers['If-Modified-Since'] + ' - If-Modified-Since')
  console.log(res.headers['last-modified'] + ' - last-modified')
  console.log('StatusCode: ' + res.statusCode)
})
Sample output, which should yield a 304 because the file was last modified an hour earlier:
Mon, 31 Aug 2015 02:24:24 GMT - If-Modified-Since
Mon, 31 Aug 2015 01:26:01 GMT - last-modified
StatusCode: 200
Or did I miss something here?
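For reference, the comparison I would expect for this header can be sketched as below. This is only an illustration of the usual HTTP conditional-request rule, not 4chan's actual server logic, and `expectedStatus` is a hypothetical helper name.

```javascript
// Hypothetical helper: the status a server applying the usual
// If-Modified-Since rule would be expected to return.
function expectedStatus(ifModifiedSince, lastModified) {
  const since = Date.parse(ifModifiedSince);
  const modified = Date.parse(lastModified);
  // An unparseable date means the condition is ignored: full response.
  if (Number.isNaN(since) || Number.isNaN(modified)) return 200;
  // Not modified after the supplied date: 304, otherwise 200.
  return modified <= since ? 304 : 200;
}

// The two dates from the sample output above.
console.log(expectedStatus(
  'Mon, 31 Aug 2015 02:24:24 GMT',
  'Mon, 31 Aug 2015 01:26:01 GMT'
)); // 304
```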
archive.json would be much more useful if it contained full thread info like boards.json, rather than just thread IDs.
Hello,
I'm a developer of a small archive tool that can monitor & download threads/images.
Sometime this week my tool started to log a bunch of HTTP 503 errors for catalog and thread updates, and also for images. In Firefox (without ad blockers or any add-ons running) thread updating seems to stop working too.
I noticed that when I get a new IP address from my ISP (I have a dynamic IP address), the tool works for some time without problems before it starts logging 503 errors again.
My tool waits at least 30 seconds before requesting thread or catalog info again, uses the same protocol, and makes requests with the "If-Modified-Since" header.
I think that some important data is missing from the api as it currently exists.
Currently, apps need to hardcode these configurations and must be updated each time a new board is created or a board's config changes.
The boards.json file already exists, and I think it would be a good idea to put these settings in there.
The following board configurations are missing from the api:
And optionally, knowing these would be nice (see also #17 and #18), but they are really specific:
Regarding image spoilers: the flag currently exists on thread OPs, but there is no way to know whether image spoilering is supported when posting.
I know that the api is read-only, but this would help me out a lot.
I suggest the following in the boards.json:
{
  "boards": [
    {
      "board": "hr",
      "title": "High Resolution",
      "ws_board": 0,
      "per_page": 15,
      "pages": 10,
      "cooldowns": {
        "thread": 600,
        "reply": 30,
        "image": 60
      },
      "file_types": ["gif", "webm", "png", "jpg", "pdf", ...],
      "max_file_size": 8388608,
      "max_webm_size": 3145728,
      "max_comment_length": 2000,
      "preupload_captcha_enabled": 1,
      "uid_enabled": 0,
      "image_spoiler_enabled": 4,
      "text_spoiler_enabled": 0
    },
    ...
  ]
}
Optionally, I suggest encoding the file_types like this, if the max sizes differ per type. (It already differs for webms):
"files": {
"pdf": {
"max_size": 4194304
},
"webm": {
"max_size": 3145728
},
"png": {
"max_size": 3145728
},
...
}
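To show how a client might consume the per-type encoding, here is a minimal sketch. It assumes the proposed "files" map exists on each board object, which it currently does not; `maxUploadSize` and `exampleBoard` are made up for illustration.

```javascript
// Hypothetical lookup against the *proposed* per-type "files" map.
// None of these fields exist in the current boards.json.
function maxUploadSize(board, ext) {
  const files = board.files || {};
  // Own-property check so names like "constructor" are not matched.
  if (!Object.prototype.hasOwnProperty.call(files, ext)) {
    return null; // type not listed, so assume it is not allowed
  }
  return files[ext].max_size;
}

// Example data copied from the suggestion above.
const exampleBoard = {
  board: 'hr',
  files: {
    pdf: { max_size: 4194304 },
    webm: { max_size: 3145728 },
    png: { max_size: 3145728 },
  },
};

console.log(maxUploadSize(exampleBoard, 'webm')); // 3145728
console.log(maxUploadSize(exampleBoard, 'exe'));  // null
```

With this shape an app could validate a file before upload without hardcoding any per-board limits.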
Hello,
I've created a 4chan App concept, similar to Clover (available on the Play Store), but compatible with all main mobile systems and including a better ui/design.
My question is about monetization. It's clear to me that I can't use the "official" tag or anything else that may lead the user to think the app is official (like using the name '4chan' explicitly). That's fine, but how do the rules apply to monetization? Can I add modules to my app (non-intrusive ads or donations, for example) to make it sustainable/worthwhile? (The iOS store, for instance, is expensive to maintain.)
Thanks.
John.
From README.md
You may not use "4chan" in the title of your application, product, or service.
So, let's say, purely hypothetically of course, one develops C bindings for the 4chan REST API: may it not be called something with 4chan in its name? E.g. #include <4chan.h>? Shall the include be called something along the lines of #include <digit_japanese_name_suffix_imageboard_reader.h> then?
(found in 2a980eb, the D is silent, kthxbye)
See title. I have a script processing the threads.json endpoint and I recently noticed that it's getting blocked by CloudFlare's browser integrity check. Can the check be disabled for that endpoint, since it's meant to be consumed by scripts?
Some additional details that could be useful to have in boards.json:
Causes issues like
http://boards.4chan.org/qa/thread/1297688
How can I use /bant/ from clover or other mobile apps? This happened during the mixed boards like /fitlit/ too and links don't show up either
A list of all the banners would be nice to have. Right now I'm keeping a list for the use of 4chan X's banner-changing feature, but the only way to update it is to make HEAD requests to all banner filenames from 0 to 299 and see which ones 404. And even this can be unreliable due to variances in what Cloudflare has cached in different places.
For the endpoints
http(s)://api.4chan.org/board/res/threadnumber.json
http(s)://api.4chan.org/board/pagenumber.json
http(s)://api.4chan.org/board/catalog.json
http(s)://api.4chan.org/board/threads.json
we should be able to provide a query parameter last_modified so that only objects that are newer than the last_modified directive are returned. Could save a lot of time when accessing large threads.
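Until something like this exists server-side, the same filtering can only be done after downloading the whole file, which saves no bandwidth. A rough sketch of that client-side equivalent, assuming the thread JSON has a `posts` array whose entries carry a UNIX `time` field (`postsNewerThan` is a hypothetical name):

```javascript
// Keep only posts whose UNIX `time` (seconds) is newer than `since`.
function postsNewerThan(thread, since) {
  return (thread.posts || []).filter((post) => post.time > since);
}

// Made-up post numbers; the timestamps are just sample values.
const sampleThread = {
  posts: [
    { no: 1, time: 1386659693 },
    { no: 2, time: 1386664222 },
    { no: 3, time: 1386666969 },
  ],
};

console.log(postsNewerThan(sampleThread, 1386664222).map((p) => p.no)); // [ 3 ]
```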
This is somewhat a duplicate of one of the points in #17.
Currently, there is no way for a program to deal with a change in endpoints, for example when the image URL changed from http(s)://i.4cdn.org/'board'/src/'tim'.'ext' to http(s)://i.4cdn.org/'board'/'tim'.'ext'. Active developers could change these URLs, but programs that weren't updated were left in the dust.
I propose a new file called options.json (or similar), sitting at the root of http(s)://a.4cdn.org/.
This file would define all the routes to allow for programs to dynamically resolve any and all URLs needed for operation.
Its format could perhaps be as follows:
{
  "version": 1,
  "json": {
    "thread": "a.4cdn.org/`board`/thread/`no`.json",
    "page": "a.4cdn.org/`board`/`page`.json",
    "catalog": "a.4cdn.org/`board`/catalog.json",
    "threads": "a.4cdn.org/`board`/threads.json",
    "boards": "a.4cdn.org/boards.json"
  },
  "images": "i.4cdn.org/`board`/`tim`.`ext`",
  "thumbnails": "t.4cdn.org/`board`/`tim`s.jpg",
  "spoiler": "s.4cdn.org/image/spoiler.png",
  "custom-spoiler": "s.4cdn.org/image/spoiler-`board``custom_spoiler`.png",
  "icons": {
    ...list of all the other icon resources go here
    ...again using the backtick if there is to be replacing (only country for now)
  }
}
The version would be incremented for any change to the API, allowing programs to output a warning if the API is a newer version than what has been programmed.
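A client could expand such templates with very little code. The sketch below assumes the backtick placeholder syntax from the proposal; `resolveRoute` and its `scheme` parameter are hypothetical and not part of any existing API.

```javascript
// Hypothetical resolver for the proposed backtick placeholders.
function resolveRoute(template, params, scheme = 'https') {
  const path = template.replace(/`([^`]+)`/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error('Missing route parameter: ' + name);
    }
    // Encode so a value cannot inject extra path segments.
    return encodeURIComponent(String(params[name]));
  });
  return scheme + '://' + path;
}

console.log(resolveRoute('i.4cdn.org/`board`/`tim`.`ext`', {
  board: 'a',
  tim: 1386659693477,
  ext: 'jpg',
})); // https://i.4cdn.org/a/1386659693477.jpg
```

Adjacent placeholders such as the ones in "custom-spoiler" resolve independently, so no separator is needed between them.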
The maximum possible value for fsize is listed as 8388608, but the API covers /f/, which allows uploads up to 10485760 bytes in size.
Also, the tim attribute is described as "UNIX timestamp + microseconds," but the last three digits are milliseconds (thousandths of seconds), not microseconds (millionths of seconds).
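A quick check of the millisecond reading, using a tim/time pair taken from a real catalog entry (`timToDate` is just an illustrative name):

```javascript
// Treat `tim` as UNIX time in milliseconds, which is what Date expects.
function timToDate(tim) {
  return new Date(tim);
}

const tim = 1386659693477; // "tim" of a post
const time = 1386659693;   // "time" of the same post, in seconds

// Dropping the last three digits of tim gives back time.
console.log(Math.floor(tim / 1000) === time); // true
console.log(timToDate(tim).toISOString());
```

If the trailing digits were microseconds, tim would have six extra digits relative to time rather than three.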
For whatever reason, this seldom-used feature causes the post and image limits to be reported as 1, instead of as 0 like they are for stickies. This interferes with both native and user extensions, making it difficult to post images.
Since this is a feature rarely seen in action, I have attached the API output from the thread that brought this to MVB's, and therefore my, attention:
https _a.4cdn.org_a_thread_152333795.json.txt
Side note: I'm more used to APIs returning -1 to mean infinite/unlimited, as opposed to 0. Since we do have locked stickies, which might confuse a client not equipped to deal with them, I would suggest we consider at some point making the reported post/image limits for stickies and immortals as -1.
This is something /pol/ users ask for a lot because of the large number of threads where OP drops some bait and never participates in the thread again. If this information was added to catalog.json, it would be possible to filter such threads without having to download the JSON for every thread individually.
When images are deleted from a thread, they still count toward the image limit. However, in such cases, the API doesn't report the thread as having reached the image limit when it has, presumably because the number of actual images in the thread falls short of the technical image limit.
https://boards.4chan.org/u/res/1391685
3 deleted images, 148 current images (151 total image limit for /u/) - no more images can be posted
https://api.4chan.org/u/res/1391685.json
"imagelimit":0
can we have an attribute to indicate these?
http://api.4chan.org/q/res/649174.json
{
  "...": "...",
  "capcode_replies": {
    "admin": [
      650368,
      652587,
      652901,
      653087,
      654685,
      655512,
      655517,
      655541,
      658801,
      658964
    ]
  }
}
The images and omitted_images fields in catalog.json are not always accurate:
See http://boards.4chan.org/a/res/98127702
First reply has an image.
Relevant data in catalog.json is, as of writing:
{"no":98127702,"now":"12\/10\/13(Tue)02:14","name":"Anonymous","sub":"Arpeggio of Blue Steel","com":"This fucking episode.","filename":"arpeggio","ext":".jpg","w":1280,"h":720,"tn_w":250,"tn_h":140,"tim":1386659693477,"time":1386659693,"md5":"UA5KmLmZGeNHREoEin3Iyg==","fsize":155846,"resto":0,"bumplimit":0,"imagelimit":0,"custom_spoiler":1,"spoiler":1,"replies":12,"images":1,"omitted_posts":7,"omitted_images":0,"last_replies":[{"no":98130432,"now":"12\/10\/13(Tue)03:30","name":"Anonymous","com":"<a href=\"98127702#p98130397\" class=\"quotelink\">>>98130397<\/a><br><span class=\"quote\">>isn't sacrificed<\/span><br>corrected","time":1386664222,"resto":98127702},{"no":98130759,"now":"12\/10\/13(Tue)03:42","name":"Anonymous","com":"<a href=\"98127702#p98130059\" class=\"quotelink\">>>98130059<\/a><br>Not really. I'm pretty sure their cores are still separate. So the ship would appear to be controlled by both of them, although given that Takao gave Iona a MM, she might have also subourned herself as a sub-processer, leacving Iona with primary control of the ship.","time":1386664920,"resto":98127702},{"no":98131516,"now":"12\/10\/13(Tue)04:10","name":"Anonymous","com":"<a href=\"98127702#p98127702\" class=\"quotelink\">>>98127702<\/a><br>I DONT GIVE A SHIT, WHERE IS TAKAO CORE!!!","time":1386666627,"resto":98127702},{"no":98131594,"now":"12\/10\/13(Tue)04:13","name":"Anonymous","com":"<a href=\"98127702#p98131516\" class=\"quotelink\">>>98131516<\/a><br>In her engine room, with all the lewdness.","time":1386666783,"resto":98127702},{"no":98131665,"now":"12\/10\/13(Tue)04:16","name":"Anonymous","com":"<a href=\"98127702#p98131594\" class=\"quotelink\">>>98131594<\/a><br>this was not supposed to end like this, Im still recovering from Jewbro, and now, this?<br><br>FUCK IT, FUCK EVERYTHING!!<br><br>http:\/\/mp3.zing.vn\/album\/Mother-lan<wbr>d-Yuuka-Nanri\/ZWZAEI67.html<br><br><s>;________________________________;<\/s>","time":1386666969,"resto":98127702}]}
especially:
{
"images": 1,
"omitted_images": 0
}
That one reply's file seems to be ignored.
replies and omitted_posts seem to be accurate though.
The file http://a.4cdn.org/boards.json does not include the new /biz/ board.
Albeit at a much higher rate for v1 than for v2. But even with a low solve rate for v2, bots get through because validation is between Google and the user, meaning 4chan has no way to ban IPs that get the captcha wrong too many times.
The captcha needs to be replaced with a non-Google solution ASAP.
Sorry for not being API related.
Posted this in feedback also, but I'm not sure how closely that's monitored anymore.
Code looks something like
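import requests

headers = {'If-Modified-Since': pastResponseTime}
query = 'https://a.4cdn.org/pol/thread/' + str(id) + '.json'
# headers must be passed by keyword: the second positional argument of requests.get is params
response = requests.get(query, headers=headers)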
Using either current GMT or header 'last-modified' always returns <200> rather than the <304> desired.
The thumbnail endpoint (t.4cdn.org) is currently unavailable.
The site currently uses the i domain for thumbnails, and that does work. Could the t domain be redirected to the i domain?
It would be great to be able to sort threads by thread ID, reply count, last modification time, and so on. Am I missing something, or do these features not exist?
Thread in HTML: http://boards.4chan.org/v/res/197439890
Thread in JSON: http://api.4chan.org/v/res/197439890.json
The response for the API call doesn't contain the JSON.
The JSON isn't regenerated even after posting. It appears to be happening on a couple threads, only on /v/, since about 12 hours ago.
I'm seeing an m_img parameter in threads. What is this?
Example thread: https://a.4cdn.org/k/thread/40218314.json
{"no":40218469,"now":"01\/07\/19(Mon)19:44:54","name":"Anonymous","com":"Look at my tactical shotgun<br><br><a href=\"#p40218314\" class=\"quotelink\">>>40218314<\/a><br>It depends","filename":"0586D749-EB4B-405F-B525-F7CF7B184C80","ext":".jpg","w":1936,"h":2592,"tn_w":93,"tn_h":125,"tim":1546908294712,"time":1546908294,"md5":"sWA3f9BnHTGl1vuQnveVpQ==","fsize":1323626,"resto":40218314,"m_img":1}
It would be useful to be able to get the board categories that are displayed on the 4chan main page ("Japanese Culture", "Interests", etc), for organizational purposes.
The tag attribute as used on /f/ is not documented.
boards.json should be updated to reflect this.
Also the JSON for posts in the threads should contain the original number of the post. This would make it easier to preserve hidden threads and replies as well as the list of which posts in a thread are yours.
New threads on /news/ seem to require a subject, yet its require_subject is not set to 1 in http://a.4cdn.org/boards.json.
Thumbnail resource URLs have a seemingly random subdomain prefix which is an integer.
The document says:
Thumbnails: http(s)://t.4cdn.org/{board}/{tim}s.jpg
Whereas in reality this URL can be like this:
http://0.t.4cdn.org/v/99999999s.jpg
I couldn't find anything related in the documentation.
Hello, I am an imageboard engine developer (http://tinyboard.org and https://int.vichan.net/devel/). Both engines already implement a 4chan-compatible API and I, for one, would like to standardize our efforts with you and other developers.
Many imageboards have custom URLs for images etc. IMO, a config.json file would be a nice fit, so that an application can work with every imageboard that conforms to the new API without any changes. The file would have to, e.g., describe the URLs (especially since you recently changed yours).
We are currently just responding to the client as if the thread had only one file. We could add an extension that sends an array of additional files.
The tim field is specified to be an integer. Many imageboards, though, have other characters in the filename.
It might be worthwhile to require or suggest to API users that they include info about who wrote the code doing the requests in the user-agent value. An email address or some other way for you guys to contact the author in case of issues (ex: unusual requests coming from a user-agent, causing other issues, etc).
What else is there to say? If I figured out dragonfly I would provide some debug information.
It would be useful to have a field in the thread objects in catalog.json that is guaranteed to change when the thread is bumped or otherwise modified. We can't reliably use replies nor last_replies for this because a reply could get added and then deleted between fetches of catalog.json.
I did notice the field last_modified in the README, but haven't seen it used yet. Is this only for posts that are edited (e.g. USER WAS BANNED)?
Just over the past day or two, I've run into the no attribute being missing from some returned posts while using the http(s)://a.4cdn.org/<board>/thread/<threadnumber>.json API, specifically on the qa board.
I haven't got a clear example yet (keeping an eye out for it now), but is this a known issue with the API?
/qa/ is missing from the boards.json list. I don't know if you are going to keep the board around, but users from Clover can't add the board if it is not in boards.json.
As said in title.
Would be nice to have an API for getting the current list so it doesn't have to be hard coded and we can keep up to date automatically as flags are added/removed.
http://a.4cdn.org/g/catalog.json
200 ok response, empty body
The flags image URLs in the API docs are incorrect, these days there's only one flag image and 4chan uses CSS sprite rendering techniques to render the flag properly.
If there was some sort of API to grab flag data (flag image which you only load once, per board) and if each post had a 'flag_x' and 'flag_y' so we could select the proper flag from the sprite, I think this would work.
If you're using the catalog to display information, you usually have something like:
getThumbnailUrl: function(data) {
  if(typeof(data.tim) == 'undefined' || data.filedeleted == 1 || (data.tn_w < 10 && data.tn_h < 10)) {
    return 'https://s.4cdn.org/image/filedeleted.gif';
  }
  if(data.spoiler == 1) {
    if(typeof(data.custom_spoiler) != 'undefined' && data.custom_spoiler > 0) {
      return "https://s.4cdn.org/image/spoiler-" + Board.current + data.custom_spoiler + ".png";
    } else {
      return "https://s.4cdn.org/image/spoiler.png";
    }
  }
  return "https://t.4cdn.org/" + Board.current + "/" + data.tim + "s.jpg";
},
The problem is, the "t" subdomain isn't the only valid subdomain for thumbnails anymore.
Now I'm seeing:
http://0.t.4cdn.org/(board)/(time)s.jpg
"0.t.4cdn.org" - I'm not sure where this "0" comes from.
There don't seem to be any recent commits to the API documentation that explain this behavior.
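In the meantime, a client that validates thumbnail URLs could accept both forms. The sketch below assumes the prefix is always a plain integer label, which is inferred only from the URLs seen so far and is not documented; `isThumbnailHost` is a hypothetical helper.

```javascript
// Accept "t.4cdn.org" and numbered variants such as "0.t.4cdn.org".
// The numeric-prefix pattern is an assumption based on observed URLs.
function isThumbnailHost(hostname) {
  return /^(?:\d+\.)?t\.4cdn\.org$/.test(hostname);
}

const observed = new URL('http://0.t.4cdn.org/v/99999999s.jpg');
console.log(isThumbnailHost(observed.hostname)); // true
console.log(isThumbnailHost('i.4cdn.org'));      // false
```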
My script keeps receiving a Failed HTTP (503 Error) when attempting to retrieve the contents from http://a.4cdn.org/pol/archive.json. It was working yesterday but hasn't worked since.
It would be useful to document which characters are escaped so that they can be easily converted back to the original text. I thought it was just the stuff necessary for insertion into HTML
'&amp;': '&', '&#039;': "'", '&quot;': '"', '&lt;': '<', '&gt;': '>'
but I'm now also seeing &ccedil; at https://a.4cdn.org/pol/thread/79102063.json in the country name of Curaçao.
Whether or not this is a bug, the fact that most of the fields have their characters escaped (not just the comment) should be documented.
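For what it's worth, this is roughly how I would decode the fields, assuming only the basic HTML escapes (&amp;amp;, &amp;quot;, &amp;lt;, &amp;gt;) plus numeric references such as &amp;#039; are in use. That set is my assumption, not documented behavior, and `decodeEntities` is a hypothetical helper.

```javascript
// Assumed set of named escapes; extend if the API turns out to use more.
const NAMED_ENTITIES = { amp: '&', quot: '"', lt: '<', gt: '>' };

function decodeEntities(text) {
  // Single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return text.replace(/&(#\d+|#x[0-9a-fA-F]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const code = body[1] === 'x'
        ? parseInt(body.slice(2), 16)
        : parseInt(body.slice(1), 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match; // unknown named entity: keep as-is
  });
}

console.log(decodeEntities('&lt;b&gt; &#039;x&#039; &amp;amp; &quot;y&quot;'));
```

Named entities outside the small table are left untouched, so the table would need extending to cover cases like the one in the Curaçao country name.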
A new capcode for managers has been spotted in the wild, but has yet to be documented! Would you mind including something for this?
Example Post [dead]: http://boards.4chan.org/qst/thread/5#p639
HTML: http://pastebin.com/cLvjrcvf
Screenshot: https://i.imgur.com/iJPGwRG.png