transfer-size's Introduction

Archive of "Transfer Size Policy"

This incubation has been archived and is no longer being pursued (see #19). It may restart if there is renewed interest.

Goal: enable developers to set and enforce limits on network usage by nested contexts (i.e. iframes).

Today the top-level frame is unable to audit or enforce size limits on resources used by a nested context. For example, the developer may take great care to optimize their assets to fit within a specific quota, but the moment they add a third party resource that uses (or injects) a nested context, they have no way to track or enforce limits on the resulting size.

In practice, this can result in misbehaving or misconfigured parties fetching large amounts of data, which violates site owners' expectations and harms the user experience. To address this, frames could trigger an event once they've consumed at least a specified number of bytes over the network. The page could then handle the event in any way it chooses, such as logging it, displaying a notice to the user, pausing the frame, or unloading it. There might even be a future API to cancel all current and future network requests for a frame in response.

Here is an example which limits an iframe to 300KB total:

  <script>
    function onBigFrame(event) {
      console.log("Frame id: " + event.target.id + " exceeded its threshold bytes in category: " + event.category);
    }
  </script>

  <iframe id="foo" ontransferexceeded="onBigFrame(event)" transfersize="300" src="...">

Here is an example that limits the total size of the frame to 500KB, of which CSS can use 50KB max:

  <iframe id="foo" ontransferexceeded="onBigFrame(event)" transfersize="style:50, total:500" src="...">

And here is an example that only limits the size of videos in the frame to 10MB:

  <iframe id="foo" ontransferexceeded="onBigFrame(event)" transfersize="video:10240" src="...">

To see a list of resource types that can be limited, see the Fetch spec's request-destination section.
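The attribute values in the examples above follow a simple `category:limit` pattern. As a rough sketch of how such a value might be read, the hypothetical helper below (`parseTransferSize` is not part of the proposal) turns it into a map of limits in KB. It assumes that a bare number means `total`, which matches the 300KB example, and that entries are comma-separated.

```javascript
// Hypothetical helper (not part of the proposal): parses a transfersize
// attribute value into a map of category -> limit in KB.
// Assumption: a bare number such as "300" is shorthand for "total:300".
function parseTransferSize(value) {
  const limits = {};
  for (const part of value.split(",")) {
    const entry = part.trim();
    if (entry === "") continue; // tolerate trailing commas
    const colon = entry.indexOf(":");
    const category = colon === -1 ? "total" : entry.slice(0, colon).trim();
    const rawAmount = colon === -1 ? entry : entry.slice(colon + 1).trim();
    const amount = Number(rawAmount);
    if (rawAmount === "" || !Number.isFinite(amount) || amount < 0) {
      throw new Error("Invalid transfersize entry: " + entry);
    }
    limits[category] = amount;
  }
  return limits;
}
```

For instance, `parseTransferSize("style:50, total:500")` yields an object with `style` set to 50 and `total` set to 500.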

Transfer Size Policy can also be configured via response headers. This is especially useful if a third-party script creates the frame, so that you can't add the attribute to the frame yourself. In this example, the default threshold for a frame is 100KB, the site's origin is unrestricted, and *.example.com is limited to 500KB total and 50KB of style:

  transfersizes: {"default":"100", "self":"*", "*.example.com":"style:50, total:500"}

and in your script:

  document.ontransferexceeded = onBigFrame;
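To illustrate how the header value above could map a frame to a policy, here is a minimal sketch. The function `policyForHost` and its matching rules are assumptions made for illustration: `self` applies when the frame's host equals the page's host, a `*.` pattern matches subdomains only, and anything else falls back to `default`. The explainer does not spell out these rules.

```javascript
// Hypothetical lookup (matching rules are assumptions, not from the explainer).
// headerValue: the JSON text of the transfersizes header.
// frameHost:   host of the nested frame, e.g. "ads.example.com".
// selfHost:    host of the embedding page.
function policyForHost(headerValue, frameHost, selfHost) {
  const policies = JSON.parse(headerValue);
  if (frameHost === selfHost && "self" in policies) {
    return policies.self;
  }
  for (const [pattern, policy] of Object.entries(policies)) {
    if (pattern === frameHost) return policy; // exact host entry
    // "*.example.com" matches "ads.example.com" but not "example.com"
    if (pattern.startsWith("*.") && frameHost.endsWith(pattern.slice(1))) {
      return policy;
    }
  }
  // "*" stands for "unrestricted", as in the header example
  return "default" in policies ? policies.default : "*";
}
```

With the header from the example, a frame on `ads.example.com` would get `"style:50, total:500"`, the site's own frames would get `"*"`, and any other frame would get `"100"`.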

Use Cases

  • Enforcing size policies on content, such as AMP's 50,000 byte limit on CSS
  • Logging how often your own site grows larger than desired
  • Monitoring (or even enforcing policy upon) how often ads get too large

Details

Each frame with a transfersize will count the number of over-the-wire bytes that it and its child frames (transitively) receive from requests that utilize the network (cached responses are not counted). Once a threshold is crossed, the bubbling transferexceeded event (a new event type which includes the exceeded threshold and resource category) is fired on the frame element in the parent frame.
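The accounting described above can be sketched roughly as follows. `TransferCounter` is a hypothetical name, not a proposed interface. The sketch assumes 1KB = 1024 bytes, that each threshold fires at most once, and that bytes from child frames are recorded on the same counter as the frame's own bytes.

```javascript
// Illustrative model of per-frame accounting (not a real browser API).
// limitsKB:   e.g. { style: 50, total: 500 }
// onExceeded: called with { category, threshold } when a limit is crossed.
class TransferCounter {
  constructor(limitsKB, onExceeded) {
    this.limits = limitsKB;
    this.onExceeded = onExceeded;
    this.bytes = { total: 0 };
    this.fired = new Set(); // assumption: each threshold fires once
  }

  // category is a resource type such as "style" or "video" (never "total").
  record(category, byteCount, fromCache = false) {
    if (fromCache) return; // cached responses are not counted
    this.bytes.total += byteCount;
    this.bytes[category] = (this.bytes[category] || 0) + byteCount;
    for (const key of ["total", category]) {
      const limit = this.limits[key];
      if (limit === undefined || this.fired.has(key)) continue;
      if (this.bytes[key] >= limit * 1024) { // assumption: 1KB = 1024 bytes
        this.fired.add(key);
        this.onExceeded({ category: key, threshold: limit });
      }
    }
  }
}
```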

If any of the network bytes are from origins other than the embedding frame's origin, a random pad is added to transfersize so as not to reveal the exact size of the resources in the frame.

There will be a cap on the number of transferexceeded events that will be fired per top-level pageload in order to help preserve privacy.

Headers and IFrame Attribute Conflict Resolution

If both headers and a transfersize attribute apply to a frame, the policies are merged such that the most restrictive policy wins.
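One plausible reading of "most restrictive wins" is a per-category minimum. The sketch below assumes that interpretation and assumes that a category missing from one policy is unrestricted there. `mergePolicies` is a hypothetical helper operating on maps of category to KB limit.

```javascript
// Hypothetical merge: for each category keep the smaller limit.
// Assumption: a category absent from a policy is unrestricted in that policy.
function mergePolicies(headerPolicy, attributePolicy) {
  const merged = { ...headerPolicy };
  for (const [category, limit] of Object.entries(attributePolicy)) {
    merged[category] =
      category in merged ? Math.min(merged[category], limit) : limit;
  }
  return merged;
}
```

Under this reading, a header policy of `{ total: 100, style: 50 }` merged with an attribute policy of `{ total: 300 }` gives `{ total: 100, style: 50 }`.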

Response

It is up to the event handler to determine what to do about the frame exceeding its set size. Examples include:

  • Logging that the frame exceeded its threshold
  • Pausing the frame
  • Unloading the frame
  • Informing the user

Privacy & Security

According to the same-origin policy, it should not be possible to determine the response size of a cross-origin resource. It's impossible for this API to exist and not leak some amount of size information. In particular, it leaks information about the full size of a page. The best we can do is to limit the leak to as small an amount as is practically possible. That is what is done for other APIs that leak size information.

There are two mitigations to help minimize size leakage:

  1. Frames with network responses that are cross-origin with respect to the embedding page have a random number of bytes added to that component of their transfersize. The developer can be certain that the transferexceeded event will only fire if at least the threshold bytes were observed, but it may have been even more in a cross-origin situation.
  2. There is a cap (e.g., 10-100) on the number of transferexceeded events that can be fired per top-level page load, to stop adversaries from taking lots of samples in order to statistically defeat the random padding.

Specifics about the random pad distribution and the size of the event cap will be provided in the specification.
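Pending those specifics, the two mitigations can be modeled roughly as below. Everything concrete here is an assumption: the pad is drawn uniformly from 0 to `maxPadBytes`, the cap is a plain counter, and `makeMitigations` is a made-up name. The random source is injectable so the behavior can be checked deterministically.

```javascript
// Illustrative model of the two mitigations (distribution and cap are assumed).
function makeMitigations({ maxPadBytes, eventCap, random = Math.random }) {
  let eventsFired = 0;
  return {
    // Mitigation 1: pad the threshold when cross-origin bytes are involved,
    // so the event only fires after at least thresholdBytes were observed.
    effectiveThreshold(thresholdBytes, isCrossOrigin) {
      if (!isCrossOrigin) return thresholdBytes;
      return thresholdBytes + Math.floor(random() * (maxPadBytes + 1));
    },
    // Mitigation 2: stop delivering events once the per-page cap is reached.
    tryFire(handler, detail) {
      if (eventsFired >= eventCap) return false;
      eventsFired += 1;
      handler(detail);
      return true;
    },
  };
}
```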


transfer-size's People

Contributors

addyosmani, igrigorik, jkarlin, marcoscaceres

transfer-size's Issues

Mitigating the cross-origin size leak if we don't use TAO opt-in

Imagine that all cross-origin requests are counted against the frame's size. Then the size of the cross-origin resource is trivially leaked by the policy. For example, set a max size of 2KB and then load a cross-origin resource in the frame. Did it violate? If yes, it's bigger than 2KB. Rinse and repeat to find the exact size.
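To make the "rinse and repeat" step concrete, here is a small sketch of the probing as a binary search. `violates(limit)` is a stand-in for "load the resource in a frame with this limit and see whether the policy was violated"; it and `probeExactSize` are hypothetical. The sketch assumes no padding and that a violation means the size is strictly bigger than the limit.

```javascript
// Binary search for the exact size using only yes/no violation answers.
// violates(limit) -> true if the resource is bigger than limit.
// Assumes the true size lies in [0, maxSize].
function probeExactSize(violates, maxSize) {
  let lo = 0;
  let hi = maxSize;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (violates(mid)) {
      lo = mid + 1; // size > mid
    } else {
      hi = mid; // size <= mid
    }
  }
  return lo;
}
```

Each probe halves the remaining range, so a resource of up to about a million units is pinned down in roughly 20 loads, which is why unpadded enforcement leaks the size so quickly.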

We'll never be able to leak no information about the size of the cross-origin resource, but ideally, the vast majority of the time, we can leak no more information than can already be obtained via other side channels, such as network timing.

One way to get there is to cloak the response size by fuzzing the enforced max size. For example, the developer says 100KB but the UA actually enforces 100KB + some secret random value. Information is still leaked this way: the distribution of violations to non-violations tells you how close your resource is to the max size. If most (but not all) samples violate the size policy, then the resource is just under max_size + max_random_pad. If very few samples violate the size policy, then the resource is just over max_size. And if it's 50/50, then the resource is somewhere around max_size + average_random_pad_value.

The fuzzy approach is safer the fewer the samples that the attacker can gather. So I propose limiting the total number of size policy violations to some constant value per top-level-page navigation, and after that point all frames with a size-policy on the page are frozen.

I'll write up a less hand-wavy doc with specific values and attach it to this issue.

Request to rename Content Size Policy to Network Policy

Content Size Policy conflates responses from the network and responses from cache/cachestorage/indexeddb/etc. These are really separate resources (network and storage) and can be treated individually with their own policies. Sites are likely more interested in network policy (as network bytes cost users money and bandwidth is often severely constrained) than storage policy (in fact, we want to encourage cache usage), so I propose that we rename this repo to network policy and consider a storage policy down the road.

What do you think? If it doesn't make sense to rename this repo and we want to leave the original content size policy here, I can start a new repo.

Restrictions modified by browser config or platform flag?

This would be most helpful on low-memory devices (e.g. phones, old computers, SoCs), but less helpful on my Intel i5 20GB laptop. Could the restrictions be targeted to mobile? It might be helpful if the browser auto-configured according to the available memory buffer, but that could also be a 'fingerprint' that can ID a user.

Specifying limits in iframe request headers

It would be useful if limits on an iframe were sent along with the iframe request. For example, a website owner restricts all iframes to 100KB to make sure that ads are small (I assume iframe === ad here). But ad servers have no idea about the limits and will return bigger ads. If the ad server knew about the restriction, it could either serve smaller ads (if possible) or just return an empty response, saving bandwidth and CPU.

CSS property for TSP?

Let's say you want to enforce a TSP policy only on iframes in a div on your page, even those frames that a third-party script injects. How would you do that? You could use the response header and list iframes by origin, but that's rather coarse, and those origins might be used outside of the div. It would be really nice if you could use CSS instead, like so:

#divId iframe { max-transfer-size: 100KB }

I'm not a CSS expert, is this crazy talk?

Setting transfer size in the response header?

<iframe max-transfer-size="300kb" src="..."> is useful for iframes inserted by the first-party document, but doesn't help the first-party to control what frames inserted by third-party script can do. Which makes me wonder if we should have a response header, something like:
Transfer-Size-Policy: default 100; self *; *.example.com 200

Where the default unit is KB. WDYT? Paging @igrigorik @ojanvafai
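For discussion, the header syntax proposed in this issue could be read with something like the sketch below. `parseTransferSizePolicy` is a hypothetical helper. It assumes directives are separated by `;`, each directive is a source followed by a limit in KB, and `*` means unrestricted (represented here as `Infinity`).

```javascript
// Hypothetical parser for: "default 100; self *; *.example.com 200"
// Returns a Map of source -> limit in KB (Infinity for "*").
function parseTransferSizePolicy(headerValue) {
  const policy = new Map();
  for (const directive of headerValue.split(";")) {
    const tokens = directive.trim().split(/\s+/);
    if (tokens.length !== 2) continue; // skip empty or malformed directives
    const [source, limit] = tokens;
    policy.set(source, limit === "*" ? Infinity : Number(limit));
  }
  return policy;
}
```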

Report-only mode

As written in the explainer, once a frame violates it is no longer allowed to request resources, but is otherwise allowed to run. It seems useful to have some sort of default enforcement (to make it more like other sandboxy apis). But it may also be useful to only provide a report and not stop future requests, allowing the embedding page to deal with the violating frame (if at all).

Use case 1: Measure how often frames violate without disabling them.
Use case 2: Alternative violation enforcement, such as displaying a warning.

Scenario: video playback

One of the frequent uses of an iframe is for video (anyone have time for an HTTPArchive query?).
So what happens if someone wants higher-quality video playback? Or if they travel to another 'suggested' video afterwards?

Document resource-types supported in transfersize

transfersize appears to support specifying a resource type (style, video etc) for providing budgets beyond just the frame level. It would be great to document what this list of resource types is.

Is it a list? Are we just matching these transfer-sizes to tag-types? (e.g <style>, <video>, <script>?) Some clarification would be helpful here.

  <iframe id="foo" ontransferexceeded="onBigFrame(event)" transfersize="style:50, total:500" src="...">
  <iframe id="foo" ontransferexceeded="onBigFrame(event)" transfersize="video:10240" src="...">

Is this still active?

Just checking if this incubation is still active... last activity seems to have been a year ago? Should we archive this?

Header vs attribute configuration

How do the two configurations interact? For example, in the following use case: ad network JS creates an iframe and sets transfersize to 300. At the same time, the website owner returns configuration in a header which sets transfersize to 100. Which one wins? The one with stricter limits? Assuming that both ad JS and website JS register listeners, will they be able to determine which limits were exceeded?

Accounting with encodedBodySize doesn't work with SDCH

A malicious iframe can request tiny resources that advertise huge dictionaries. Unless we account for those dictionaries, the frame can use the huge dictionaries as an effective way to bypass data accounting.

Counting SDCH downloads to a particular frame has a complex implementation cost in Chromium. Do we think we can move encodedBodySize to decodedBodySize to fix this bug?

Questions about design

It seems like one of the primary motivators for transfer-size is (per the README) enforcing size policies on content. In its current shape, transfer-size doesn't appear to actually perform the enforce step, deferring this behavior to the end developer in ontransferexceeded. Do we know why this was? (Pardon if this is a redundant question.)

As a result of this, transfer-size seems minimally useful unless you also have the Pause Document API. It's unclear to me as someone that would like the control to enforce size policies why the ability to pause the document loading is not something baked in. e.g ontransferexceeded=pause without having to jump back into JS to get this behavior.

Logging, while useful, provides questionable utility when we're talking about ads, as each ad-based iframe payload can vary wildly. I guess it may be useful to beacon back to analytics how often your ads are breaking their budgets?

I could see granularity/control being something we can only do in a very limited way declaratively, so a JavaScript API that supports doing this at runtime would make more sense, especially since Pause Document supports doing this for rendering, loading and script. `ontransferexceeded=pause-script, pause-rendering` is likely going to be too clunky.

Mostly trying to understand some of the design decisions here :)

TAO opt-in: pros, cons, and implementation

As a thought experiment, let's say we defined Content Size to require mandatory TAO opt-in:

  • The (iframe) document must provide TAO opt-in: this exposes the size of the document to the embedder.
  • All resources within the embedded context must provide TAO opt-in: this exposes the size of each resource to the nested context. The iframe'd document, by providing the TAO opt-in to the embedder, then also exposes its subresource total.
    • Resources that don't provide the TAO opt-in are blocked by the user agent.

The above model means we can expose exact byte counts. The embedder wouldn't see the specific resources fetched by the nested context, but it would know their total size.

The downside to the above is that it requires explicit opt-in by the embedded content, which may or may not be practical for some of the use cases we'd like this to be used in.
