
4chan-api's People

Contributors

andyklimczak, battleprogrammershirase, catamphetamine, cyggy, dcdholder, desuwa, ereizas, fission-aad, ltrel, mootykins, r3c0d3x, ryanml, scrazzz, wokdav


4chan-api's Issues

If-Modified-Since without function

I have a small personal project in Node.js, and I found that the 'If-Modified-Since' header does not seem to work (anymore): the server never returns the expected 304 when the JSON has not changed. API rule 3 is rendered useless because of this.

Here is a snippet for Node.js using request:

var request = require('request')
var options = {
  uri: 'url of a random json that has not been updated',
  json: true,
  headers: {'If-Modified-Since': new Date().toUTCString()}
}
request(options, function(err, res, body) {
  if (err) throw err  // res is undefined when the request itself fails
  console.log(options.headers['If-Modified-Since'] + ' - If-Modified-Since')
  console.log(res.headers["last-modified"] + ' - last-modified')
  console.log('StatusCode: ' + res.statusCode)
})

Sample output, which should yield a 304 because the file was last modified an hour ago:

Mon, 31 Aug 2015 02:24:24 GMT - If-Modified-Since
Mon, 31 Aug 2015 01:26:01 GMT - last-modified
StatusCode: 200

Or did I miss something here?
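
For reference, the comparison a server is expected to make for a conditional GET can be sketched as follows. This is a minimal sketch in Python rather than Node.js; the function name `expect_not_modified` is made up for illustration, and it only models the date comparison, not what 4chan's servers or CDN actually do.

```python
from email.utils import parsedate_to_datetime


def expect_not_modified(if_modified_since: str, last_modified: str) -> bool:
    """Return True when a conditional GET would be expected to yield 304.

    Both arguments are HTTP dates such as 'Mon, 31 Aug 2015 02:24:24 GMT'.
    """
    since = parsedate_to_datetime(if_modified_since)
    modified = parsedate_to_datetime(last_modified)
    # A 304 is expected when the resource has not changed after the given date.
    return modified <= since


# The two dates from the sample output above.
print(expect_not_modified("Mon, 31 Aug 2015 02:24:24 GMT",
                          "Mon, 31 Aug 2015 01:26:01 GMT"))
```

With the dates from the sample output the function returns True, which is why a 304 was expected.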

More info in archive.json

archive.json would be much more useful if it contained full thread info like boards.json, rather than just thread IDs.

API & CDN Endpoints returning http 503 errors

Hello,

I'm a developer of a small archive tool that can monitor & download threads/images.

Sometime this week my tool started to log a bunch of HTTP 503 errors for catalog and thread updates, and also for images. In Firefox (without ad blockers or any add-ons running), thread updating seems to stop working too.
When I get a new IP address from my ISP (I have a dynamic IP address), the tool works for some time without problems before it starts logging 503 errors again.

My tool waits at least 30 seconds before requesting thread or catalog info again, uses the same protocol, and makes requests with the "If-Modified-Since" header.

Important data missing from the API

I think that some important data is missing from the api as it currently exists.
Now apps need to hardcode the configurations into their code, and need to update each time a new board is created, or when a board changes config.
The boards.json file already exists, and I think it would be a good idea to put these settings in there.

The following board configurations are missing from the api:

  • are image spoilers enabled (and how many spoiler images are there)
  • are text spoilers enabled
  • supported filetypes
  • max file size
  • max webm size
  • thread cooldown time
  • reply cooldown time
  • image cooldown time
  • comment character limit
  • is preupload_captcha supported
  • are IDs enabled

And optionally, knowing these would be nice (see also #17 and #18), but they are really specific:

  • Country flags enabled, and if they're /troll/ flags or not
  • [code] tag support
  • [math] tag support
  • The current announcement
  • min/max image dimensions

About the image spoiler thing: it currently exists on thread OPs, but there is no way to know whether image spoilering is supported when posting.
I know that the api is read-only, but this would help me out a lot.

I suggest the following in the boards.json:

{
    "boards": [
        {
            "board": "hr",
            "title": "High Resolution",
            "ws_board": 0,
            "per_page": 15,
            "pages": 10,

            "cooldowns": {
                "thread": 600,
                "reply": 30,
                "image": 60
            },

            "file_types": ["gif", "webm", "png", "jpg", "pdf", ...],
            "max_file_size": 8388608,
            "max_webm_size": 3145728,
            "max_comment_length": 2000,

            "preupload_captcha_enabled": 1,
            "uid_enabled": 0,
            "image_spoiler_enabled": 4,
            "text_spoiler_enabled": 0
        },
        ...
    ]
}

Optionally, I suggest encoding the file_types like this, if the max sizes differ per type. (It already differs for webms):

"files": {
    "pdf": {
        "max_size": 4194304
    },
    "webm": {
        "max_size": 3145728
    },
    "png": {
        "max_size": 3145728
    },
    ...
}
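
To illustrate how a client might consume these proposed fields, here is a minimal Python sketch. The `files` and `max_file_size` keys are the ones suggested above and do not exist in the current boards.json; the helper name `max_upload_size` and the fallback constant are assumptions for illustration only.

```python
DEFAULT_MAX_SIZE = 8388608  # assumed fallback, same value as "max_file_size" above


def max_upload_size(board: dict, ext: str) -> int:
    """Look up the size limit for a file extension in a proposed board entry."""
    per_type = board.get("files", {})
    if ext in per_type:
        # A per-type limit overrides the board-wide limit.
        return per_type[ext]["max_size"]
    return board.get("max_file_size", DEFAULT_MAX_SIZE)


# Hypothetical board entry combining both proposed layouts.
board = {
    "board": "hr",
    "max_file_size": 8388608,
    "files": {"pdf": {"max_size": 4194304}, "webm": {"max_size": 3145728}},
}
print(max_upload_size(board, "webm"))  # 3145728
print(max_upload_size(board, "jpg"))   # 8388608
```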

API Rules

Hello,

I've created a 4chan App concept, similar to Clover (available on the Play Store), but compatible with all main mobile systems and including a better ui/design.

My question is about monetization. It's clear to me that I can't use the "official" tag or any kind of suggestion that may lead the user to think the app is official (like using the name '4chan' explicitly). That's fine, but how do the rules apply to monetization? Can I add modules to my app (non-intrusive ads or donations, for example) to make it sustainable/worthwhile? (The iOS store, for instance, is expensive to maintain.)

Thanks.
John.

API Rule, 4chan name in product

From README.md

You may not use "4chan" in the title of your application, product, or service.

So, let's say, purely hypothetical of course, one develops C bindings for the 4chan REST API, it may not be called something with 4chan in its name? Eg. #include <4chan.h>? Shall the include be called something along #include <digit_japanese_name_suffix_imageboard_reader.h> then?

More stuff that would be useful in boards.json

Some additional details that could be useful to have in boards.json:

  • maximum and minimum image dimensions
  • maximum webm duration
  • whether audio is allowed
  • supported filetypes (PDF on /tg/, /po/, and /gd/; SWF on /f/)
  • number of replies shown in the index (3 for /b/ and /vg/, 1 for /t/, 5 for others)
  • whether the board is forced anon (like /b/ and /soc/)
  • whether threads require a subject (like /vg/)
  • whether the robot is enabled (like /r9k/)
  • whether images are disallowed in replies (like /news/)
  • other special features (dice on /tg/, oekaki on /i/, fortune on /s4s/)
  • special remarks from beneath the post form

List of all banners

A list of all the banners would be nice to have. Right now I'm keeping a list for the use of 4chan X's banner-changing feature, but the only way to update it is to make HEAD requests to all banner filenames from 0 to 299 and see which ones 404. And even this can be unreliable due to variances in what Cloudflare has cached in different places.

Last Modified Parameter

For the endpoints

  • http(s)://api.4chan.org/board/res/threadnumber.json
  • http(s)://api.4chan.org/board/pagenumber.json
  • http(s)://api.4chan.org/board/catalog.json
  • http(s)://api.4chan.org/board/threads.json

we should be able to provide a query parameter last_modified so that only objects that are newer than the last_modified directive are returned. Could save a lot of time when accessing large threads.
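
The intended semantics can be sketched client-side in Python. This is only an illustration of the filter; it assumes the thread JSON holds its posts in a `posts` list and that each post carries the UNIX `time` field, and the function name `posts_since` is hypothetical. Done on the client it saves no bandwidth, which is why a server-side parameter is being requested.

```python
def posts_since(thread: dict, last_modified: int) -> list:
    """Return the posts whose UNIX `time` is newer than `last_modified`."""
    return [p for p in thread.get("posts", []) if p["time"] > last_modified]


# Small made-up thread; only the fields needed for the filter are included.
thread = {
    "posts": [
        {"no": 1, "time": 1386659693},
        {"no": 2, "time": 1386664222},
        {"no": 3, "time": 1386666969},
    ]
}
print([p["no"] for p in posts_since(thread, 1386664222)])  # [3]
```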

A way to find all the endpoints of the API

This is somewhat a duplicate of one of the points in #17.

Currently, there is no way for a program to deal with a change in endpoints, for example when the image URL changed from http(s)://i.4cdn.org/'board'/src/'tim'.'ext' to http(s)://i.4cdn.org/'board'/'tim'.'ext'. Active developers could change these URLs, but programs that weren't updated were left in the dust.

I propose a new file called options.json (or similar), sitting at the root of http(s)://a.4cdn.org/.
This file would define all the routes to allow for programs to dynamically resolve any and all URLs needed for operation.

Its format could perhaps be as follows:

{
    "version": 1,
    "json": {
        "thread": "a.4cdn.org/`board`/thread/`no`.json",
        "page": "a.4cdn.org/`board`/`page`.json",
        "catalog": "a.4cdn.org/`board`/catalog.json",
        "threads": "a.4cdn.org/`board`/threads.json",
        "boards": "a.4cdn.org/boards.json"
    },
    "images": "i.4cdn.org/`board`/`tim`.`ext`",
    "thumbnails": "t.4cdn.org/`board`/`tim`s.jpg",
    "spoiler": "s.4cdn.org/image/spoiler.png",
    "custom-spoiler": "s.4cdn.org/image/spoiler-`board``custom_spoiler`.png",
    "icons": {
         ...list of all the other icon resources go here
         ...again using the backtick if there is to be replacing (only country for now)
    }
}

The version would be incremented for any change to the API, allowing programs to output a warning if the API is a newer version than what has been programmed.
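
A client could resolve such backtick templates with very little code. The following Python sketch assumes the template syntax proposed above; the `resolve` helper and the sample values are hypothetical.

```python
import re

# Matches a placeholder such as `board` or `tim` written between backticks.
PLACEHOLDER = re.compile(r"`([^`]+)`")


def resolve(template: str, **values) -> str:
    """Replace each `name` placeholder with the matching keyword value."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


print(resolve("a.4cdn.org/`board`/thread/`no`.json", board="g", no=123))
print(resolve("s.4cdn.org/image/spoiler-`board``custom_spoiler`.png",
              board="a", custom_spoiler=1))
```

A missing value raises a KeyError, which seems preferable to silently producing a broken URL.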

fsize + tim

The maximum possible value for fsize is listed as 8388608, but the API covers /f/, which allows uploads up to 10485760 bytes in size.

Also, the tim attribute is described as "UNIX timestamp + microseconds," but the last three digits are milliseconds (thousandths of seconds), not microseconds (millionths of seconds).
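
The point about tim is easy to check with a value taken from a real post (tim 1386659693477 alongside time 1386659693). A short Python sketch; the helper name `split_tim` is made up for illustration.

```python
from datetime import datetime, timezone


def split_tim(tim: int) -> tuple:
    """Split a tim value into whole UNIX seconds and trailing milliseconds."""
    seconds, millis = divmod(tim, 1000)
    return seconds, millis


seconds, millis = split_tim(1386659693477)
print(seconds, millis)  # 1386659693 477
# The seconds part is an ordinary UNIX timestamp.
print(datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
```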

"Immortal threads" create incorrect post/image limits in the API

For whatever reason, this seldom-used feature causes the post and image limits to be reported as 1, instead of as 0 like they are for stickies. This interferes with both native and user extensions, making it difficult to post images.

Since this is a feature rarely seen in action, I have attached the API output from the thread that brought this to MVB's, and therefore my, attention:
https _a.4cdn.org_a_thread_152333795.json.txt

Side note: I'm more used to APIs returning -1 to mean infinite/unlimited, as opposed to 0. Since we do have locked stickies, which might confuse a client not equipped to deal with them, I would suggest we consider at some point making the reported post/image limits for stickies and immortals as -1.

Posts by OP ID in catalog.json

This is something /pol/ users ask for a lot because of the large number of threads where OP drops some bait and never participates in the thread again. If this information was added to catalog.json, it would be possible to filter such threads without having to download the JSON for every thread individually.
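
For comparison, this is roughly what a client has to do today after downloading a full thread. It is a Python sketch under the assumption that each post on a board with poster IDs carries an `id` field; the function name `op_post_count` is hypothetical.

```python
def op_post_count(posts: list) -> int:
    """Count the posts that share the poster ID of the first post (the OP)."""
    if not posts:
        return 0
    op_id = posts[0].get("id")
    if op_id is None:
        return 0  # board without poster IDs
    # The count includes the opening post itself.
    return sum(1 for p in posts if p.get("id") == op_id)


posts = [
    {"no": 1, "id": "aBcD1234"},
    {"no": 2, "id": "ZzZz0000"},
    {"no": 3, "id": "aBcD1234"},
]
print(op_post_count(posts))  # 2
```

A value of 1 would mean the OP never replied in their own thread.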

Image limit & deleted images

When images are deleted from a thread, they still count toward the image limit. However, in such cases, the API doesn't report the thread as having reached the image limit when it has, presumably because the number of actual images in the thread falls short of the technical image limit.

https://boards.4chan.org/u/res/1391685
3 deleted images, 148 current images (151 total image limit for /u/) - no more images can be posted

https://api.4chan.org/u/res/1391685.json
"imagelimit":0

Inaccurate data

The images and omitted_images fields in catalog.json are not always accurate:

See http://boards.4chan.org/a/res/98127702

First reply has an image.

Relevant data in catalog.json is, as of writing:

{"no":98127702,"now":"12\/10\/13(Tue)02:14","name":"Anonymous","sub":"Arpeggio of Blue Steel","com":"This fucking episode.","filename":"arpeggio","ext":".jpg","w":1280,"h":720,"tn_w":250,"tn_h":140,"tim":1386659693477,"time":1386659693,"md5":"UA5KmLmZGeNHREoEin3Iyg==","fsize":155846,"resto":0,"bumplimit":0,"imagelimit":0,"custom_spoiler":1,"spoiler":1,"replies":12,"images":1,"omitted_posts":7,"omitted_images":0,"last_replies":[{"no":98130432,"now":"12\/10\/13(Tue)03:30","name":"Anonymous","com":"<a href=\"98127702#p98130397\" class=\"quotelink\">&gt;&gt;98130397<\/a><br><span class=\"quote\">&gt;isn&#039;t sacrificed<\/span><br>corrected","time":1386664222,"resto":98127702},{"no":98130759,"now":"12\/10\/13(Tue)03:42","name":"Anonymous","com":"<a href=\"98127702#p98130059\" class=\"quotelink\">&gt;&gt;98130059<\/a><br>Not really. I&#039;m pretty sure their cores are still separate. So the ship would appear to be controlled by both of them, although given that Takao gave Iona a MM, she might have also subourned herself as a sub-processer, leacving Iona with primary control of the ship.","time":1386664920,"resto":98127702},{"no":98131516,"now":"12\/10\/13(Tue)04:10","name":"Anonymous","com":"<a href=\"98127702#p98127702\" class=\"quotelink\">&gt;&gt;98127702<\/a><br>I DONT GIVE A SHIT, WHERE IS TAKAO CORE!!!","time":1386666627,"resto":98127702},{"no":98131594,"now":"12\/10\/13(Tue)04:13","name":"Anonymous","com":"<a href=\"98127702#p98131516\" class=\"quotelink\">&gt;&gt;98131516<\/a><br>In her engine room, with all the lewdness.","time":1386666783,"resto":98127702},{"no":98131665,"now":"12\/10\/13(Tue)04:16","name":"Anonymous","com":"<a href=\"98127702#p98131594\" class=\"quotelink\">&gt;&gt;98131594<\/a><br>this was not supposed to end like this, Im still recovering from Jewbro, and now, this?<br><br>FUCK IT, FUCK 
EVERYTHING!!<br><br>http:\/\/mp3.zing.vn\/album\/Mother-lan<wbr>d-Yuuka-Nanri\/ZWZAEI67.html<br><br><s>;________________________________;<\/s>","time":1386666969,"resto":98127702}]}

especially:

{
 "images": 1,
 "omitted_images": 0
}

That one reply's file seems to be ignored.

replies and omitted_posts seem to be accurate though.

All versions of reCAPTCHA are letting bots through

Albeit at a much higher rate for v1 than for v2. But even with a low solve rate for v2, bots get through because validation is between Google and the user, meaning 4chan has no way to ban IPs that get the captcha wrong too many times.

The captcha needs to be replaced with a non-Google solution ASAP.

Sorry for not being API related.
Posted this in feedback also, but I'm not sure how closely that's monitored anymore.

Rule 3

  • Use If-Modified-Since when doing your requests.

Code looks something like

    headers={'If-Modified-Since': pastResponseTime}
    query = 'https://a.4cdn.org/pol/thread/' + str(id) + '.json'
    response = requests.get(query, headers)

Using either current GMT or header 'last-modified' always returns <200> rather than the <304> desired.
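
One possible cause worth ruling out: the second positional parameter of `requests.get` is `params`, so `requests.get(query, headers)` sends the dictionary as query-string parameters rather than as headers; `headers=headers` would be needed. Whether that explains the 200 responses here is not confirmed. Below is a minimal sketch of a conditional request using only the standard library; the function names are made up for illustration.

```python
import urllib.request
from urllib.error import HTTPError


def build_conditional_request(url: str, last_modified: str) -> urllib.request.Request:
    """Build a GET request that carries an If-Modified-Since header."""
    return urllib.request.Request(url, headers={"If-Modified-Since": last_modified})


def fetch_if_changed(url: str, last_modified: str):
    """Return the response body, or None when the server answers 304."""
    req = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()
    except HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError.
        if err.code == 304:
            return None
        raise


req = build_conditional_request("https://a.4cdn.org/pol/thread/1.json",
                                "Mon, 31 Aug 2015 01:26:01 GMT")
print(req.header_items())
```

Passing the value of the previous response's last-modified header, unchanged, avoids any date-formatting differences.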

Thumbnail endpoint unavailable

The thumbnail endpoint (t.4cdn.org) is currently unavailable.
The site currently uses the i domain for thumbnails, and that does work. Could the t domain be redirected to the i domain?

Undocumented m_img parameter

I'm seeing an m_img parameter in threads. What is this?

Example thread: https://a.4cdn.org/k/thread/40218314.json

{"no":40218469,"now":"01\/07\/19(Mon)19:44:54","name":"Anonymous","com":"Look at my tactical shotgun<br><br><a href=\"#p40218314\" class=\"quotelink\">&gt;&gt;40218314<\/a><br>It depends","filename":"0586D749-EB4B-405F-B525-F7CF7B184C80","ext":".jpg","w":1936,"h":2592,"tn_w":93,"tn_h":125,"tim":1546908294712,"time":1546908294,"md5":"sWA3f9BnHTGl1vuQnveVpQ==","fsize":1323626,"resto":40218314,"m_img":1}

Board categories

It would be useful to be able to get the board categories that are displayed on the 4chan main page ("Japanese Culture", "Interests", etc), for organizational purposes.

thumbnail resource path

Thumbnail resource URLs have a seemingly random subdomain prefix which is an integer.

The document says:

Thumbnails: http(s)://t.4cdn.org/{board}/{tim}s.jpg

Whereas in reality this URL can be like this:

http://0.t.4cdn.org/v/99999999s.jpg

I couldn't find anything related in the documentation.
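
Until this is documented, a client that parses thumbnail URLs may want to tolerate the optional numeric prefix. A Python sketch, assuming the prefix is always a plain integer as in the example above; the pattern and the helper name `parse_thumbnail_url` are not taken from any documentation.

```python
import re

# Accepts both t.4cdn.org and <integer>.t.4cdn.org as the thumbnail host.
THUMB_URL = re.compile(
    r"^https?://(?:\d+\.)?t\.4cdn\.org/(?P<board>\w+)/(?P<tim>\d+)s\.jpg$"
)


def parse_thumbnail_url(url: str):
    """Return (board, tim) for a thumbnail URL, or None if it does not match."""
    m = THUMB_URL.match(url)
    if m is None:
        return None
    return m.group("board"), int(m.group("tim"))


print(parse_thumbnail_url("http://0.t.4cdn.org/v/99999999s.jpg"))
print(parse_thumbnail_url("https://t.4cdn.org/v/99999999s.jpg"))
```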

4chan API standardization

Hello, I am a developer of imageboard engines (http://tinyboard.org and https://int.vichan.net/devel/). Both engines already implement a 4chan-compatible API, and I, for one, would like to standardize our efforts with you and other developers.

config.json

Many imageboards have custom URLs for images etc. IMO, a config.json file would be a nice fit, so that an application can work with every imageboard without any changes, provided the imageboard conforms to the new API. The file would have to, e.g., describe the URLs (especially since you recently changed yours).

Multiple file upload

We currently just respond to the client as if the thread had only one file. We could add an extension that sends an array of additional files.

tim field

The tim field is specified to be an integer. Many imageboards, though, have other characters in the filename.

Requirements For User-Agent

It might be worthwhile to require or suggest to API users that they include info about who wrote the code doing the requests in the user-agent value. An email address or some other way for you guys to contact the author in case of issues (ex: unusual requests coming from a user-agent, causing other issues, etc).
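
A sketch of what such a User-Agent could look like from the client side, in Python. The "name/version (+contact)" layout is just one common convention and not something the API requires; the helper name, application name, and contact address below are placeholders.

```python
import urllib.request


def api_request(url: str, app: str, version: str, contact: str) -> urllib.request.Request:
    """Build a request whose User-Agent names the client and a contact point."""
    user_agent = f"{app}/{version} (+{contact})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder client name and contact address.
req = api_request("https://a.4cdn.org/boards.json",
                  "example-archiver", "0.1", "mailto:dev@example.org")
print(req.header_items())
```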

Last modification/bump date field for threads in catalog.json

It would be useful to have a field in the thread objects in catalog.json that is guaranteed to change when the thread is bumped or otherwise modified. We can't reliably use replies nor last_replies for this because a reply could get added and then deleted between fetches of catalog.json.

I did notice the field last_modified in the README, but haven't seen it used yet. Is this only for posts that are edited (e.g. USER WAS BANNED)?

'no' post attribute missing sometimes

Just over the past day or two, I've run into the no attribute being missing from some returned posts while using the http(s)://a.4cdn.org/<board>/thread/<threadnumber>.json API. Specifically in the qa board.

I haven't got a clear example yet (keeping an eye out for it now), but is this a known issue with the API?

/qa/ missing from boards.json

/qa/ is missing from the boards.json list. I don't know if you are going to keep the board around, but users from Clover can't add the board if it is not in boards.json.

Troll flags on /pol/

Would be nice to have an API for getting the current list so it doesn't have to be hard coded and we can keep up to date automatically as flags are added/removed.

Flag data not up to date

The flag image URLs in the API docs are incorrect; these days there's only one flag image, and 4chan uses CSS sprite techniques to render the proper flag.

If there was some sort of API to grab flag data (flag image which you only load once, per board) and if each post had a 'flag_x' and 'flag_y' so we could select the proper flag from the sprite, I think this would work.

No CORS headers

CORS is supported with an origin of http(s)://boards.4chan.org

Perusing the response headers, I see no Access-Control-Allow-Origin header from either boards.4chan.org or a.4cdn.org.


Thumbnail urls broken after change

If you're using the catalog to display information, you usually have something like:

    getThumbnailUrl: function(data) {
        if(typeof(data.tim) == 'undefined' || data.filedeleted == 1 || (data.tn_w < 10 && data.tn_h < 10)) {
            return 'https://s.4cdn.org/image/filedeleted.gif';
        }

        if(data.spoiler == 1) {
            if(typeof(data.custom_spoiler) != 'undefined' && data.custom_spoiler > 0) {
                return "https://s.4cdn.org/image/spoiler-" + Board.current + data.custom_spoiler + ".png";
            } else {
                return "https://s.4cdn.org/image/spoiler.png";
            }
        }

        return "https://t.4cdn.org/" + Board.current + "/" + data.tim + "s.jpg";
    },

The problem is, the "t" subdomain isn't the only valid subdomain for thumbnails anymore.

Now I'm seeing:
http://0.t.4cdn.org/(board)/(time)s.jpg
"0.t.4cdn.org" - I'm not sure where this "0" comes from.

There don't seem to be any recent commits to the API documentation that explain this behavior.

Document which characters are escaped

It would be useful to document which characters are escaped so that they can be easily converted back to the original text. I thought it was just the stuff necessary for insertion into HTML:
'&amp;': '&', '&#039;': "'", '&quot;': '"', '&lt;': '<', '&gt;': '>'
but I'm now also seeing &ccedil; at https://a.4cdn.org/pol/thread/79102063.json in the country name of Curaçao.

Whether or not this is a bug, the fact that most of the fields have their characters escaped (not just the comment) should be documented.
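
In the meantime, a client that decodes the full set of HTML named and numeric character references handles both cases. A Python sketch using the standard library's `html.unescape`; the wrapper name `unescape_field` is made up, and it only decodes entities, so tags such as <br> in the comment field are left untouched.

```python
import html


def unescape_field(value: str) -> str:
    """Decode HTML character references such as &amp;, &#039; and &ccedil;."""
    return html.unescape(value)


print(unescape_field("Cura&ccedil;ao"))
print(unescape_field("isn&#039;t &lt;b&gt; &amp; &quot;x&quot;"))
```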
