
Beaker Browser Specs

Every specification has its own file. To discuss the content of a spec, open issues and PRs on this repo.

Getting started

Looking to learn about how Beaker works? Start here:

Directory

Phase 0 - Proposals Under Consideration

| Spec | Description | Last Updated |
| --- | --- | --- |
| unwalled.garden | Not yet written. A collection of schemas used by Beaker. | |
| DatArchive API | Not yet written. Refer to the documentation for now. | |
| PeerSockets API | Not yet written. Refer to the PR for now. | |
| DatPubkeyFile API | Not yet written. Refer to this gist for now. | |

Phase 1 - Accepted Drafts

| Spec | Description | Last Updated |
| --- | --- | --- |
| Beaker user filesystem | The filesystem for users' personal information. | Nov 6, 2018 |
| Beaker user identities | APIs and UI flows for user identities. | Nov 7, 2018 |
| Dat types | Standard dat "type" values and their effects in Beaker. | Nov 6, 2018 |
| Object-store folders | Managed folders which help applications share data. | Nov 6, 2018 |

Phase 2 - Stable

| Spec | Description | Last Updated |
| --- | --- | --- |
| dat.json | The dat.json standard manifest file. Used to describe a dat. | April 20, 2018 |

Status badges

Spec status badges: Not written, Draft, Stable, Deprecated.

Implementation badges: Not implemented, Implemented, Deprecated.


Issues

User Dats and Privacy

What's the long-term plan for dat privacy (e.g. read access)? AFAIK, in its current form, the only way to keep a dat private is by preventing the exposure of its URL. For me, user dats are the first point where I might desire some semblance of privacy.

If limiting URL exposure is going to be the primary means of privacy going forward, we should probably start providing APIs that aren't built around the archive URL. For example, the navigator.session.id API mentioned in 0004 is problematic: if the entire API hinges on navigator.session.id, hiding the URL later would be very difficult.

If URL privacy were going to become a thing, we would have to start...

  • supplying FS access (possibly to specific folders later) without exposing a full Archive object with archive.url (a rough sketch of such an API follows below)
  • worrying about applications reading the dat.json to expose the URL that way.
  • worrying about outbound https traffic that might try to send user data to other apps (that stricter CSP is starting to sound like a good idea)
  • worrying about an app writing private user data to another archive the user owns that might be publicly available.
  • figuring out how a dev would be able to easily display things like images and use similar browser APIs that rely heavily on the URL.

If URL privacy is unreasonable, do we have an alternative? If not, we probably need to start being defensive about this from the outset.
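To make the first bullet above concrete, here is a minimal sketch of what a URL-less storage API could look like. Everything in it (navigator.privateStorage, the handle's methods) is hypothetical; nothing like it exists in Beaker today.

// Hypothetical sketch, not an existing Beaker API.
// The app receives a capability-scoped handle instead of a DatArchive,
// so there is no .url property and the private dat's address is never exposed.
const fs = await navigator.privateStorage.request({ folder: '/notes' })

await fs.writeFile('/notes/draft.md', '# My private draft')
const names = await fs.readdir('/notes')

// The handle can read and write inside the granted folder, but the app
// cannot learn (or leak) which archive actually backs it.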

Site routing in DAT?

Not sure if this is the right place to hatch this idea (or whether it’s already been suggested elsewhere), but I want to put forward a use case and a possible implementation and get your thoughts on it (including on where I might have the wrong preconceptions and/or incorrect assumptions):

Use case

I don’t want a single archive per DAT site as this will quickly get rather large. I also would like to delete posts, etc., and that’s not currently possible.

For my blog, the sweet spot might be one archive (e.g.) per post.

Currently, as far as I can see in the DAT DNS DEP, I can set up a DNS mapping in /.well-known/dat for the root of the site, and any paths off of that root are assumed to be in the same archive that the root maps to.

So, for dat://live.ar.al, for example, I have:

dat://bfb2eeb077826ecee6c1105d419755d5d8e0893d653d3ce39e50aee2c00b7701/
TTL=60

And the site is comprised of a single DAT archive (bfb2eeb…).

One way to implement a more granular mapping would be to extend (bastardise) this to include routing from multiple internal URLs to DAT public keys. However, this is not ideal for a number of reasons, including that the routing information itself would not be exposed over DAT so the site itself would not be a DAT site.

Suggestion

  1. Beaker follows a DNS mapping as per the DAT DNS DEP
  2. When/if it finds the root/index DAT archive, it looks for a routing file within the root of that archive (e.g., index-dat.json)
  3. Beaker uses the routing information from that file, if it exists, to map paths relative to the root to multiple DAT archives.

e.g.,

{
  "routes": [
    {
      "path": "assets",
      "key": "6ef7628650d24ea0dbb8aa20788e9b6750c32a78e2e251a58a88c75c6bbd3876"
    },
    {
      "path": "videos",
      "key": "2968a49e9d87317c51d086bf497c648331efd8650fbc517ee0d9d25a00d104df"
    },
    {
      "path": "post1",
      "key": "87ed2e3b160f261a032af03921a3bd09227d0a4cde73466c17114816cae43336"
    }
  ]
}

So, for example, the above index would lead to the following route mappings:

  • dat://live.ar.al/assets/images/some-image.jpg → dat://6ef7628…/images/some-image.jpg
  • dat://live.ar.al/videos/large-video.mp4 → dat://2968a49…/large-video.mp4
  • dat://live.ar.al/post1/ → dat://87ed2e3b…/
  • dat://live.ar.al/some-other-path/amazing.html → dat://bfb2eeb…/some-other-path/amazing.html (the default archive as presented by the DAT DNS lookup)
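For illustration only, here is a rough sketch of how Beaker's lookup could consume such an index-dat.json. The file name, the shape of the routes array, and the resolveRoute helper are taken from the example above and are not an existing Beaker feature.

// Sketch of the proposed lookup, assuming the index-dat.json shape shown above.
async function resolveRoute (rootArchive, urlPath) {
  let routes = []
  try {
    routes = JSON.parse(await rootArchive.readFile('/index-dat.json')).routes
  } catch (e) {
    // No routing file: every path stays in the root archive.
  }

  // Longest-prefix match, so "assets/fonts" would win over "assets" if both existed.
  const match = routes
    .filter(r => urlPath === '/' + r.path || urlPath.startsWith('/' + r.path + '/'))
    .sort((a, b) => b.path.length - a.path.length)[0]

  if (!match) return { archive: rootArchive.url, path: urlPath }
  return { archive: 'dat://' + match.key, path: urlPath.slice(match.path.length + 1) || '/' }
}

Under those assumptions, resolveRoute(root, '/videos/large-video.mp4') would yield { archive: 'dat://2968a49…', path: '/large-video.mp4' }, matching the second mapping above.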

Advantages

  • Index DAT file contains routing (and the HTML index of the site) and is under DAT itself
  • Multiple archives allow for deletion of content
  • Multiple archives allow for separation of concerns. Linking of other archives into a site (embedding archives)

Questions

  • Apart from the deletion use case, how much of the file size/replication use case is already covered by sparse archives?

Thoughts? :)

Consider folder extensions over Index.json types

One of the things I loved when I first started using OSX was the way they handled app packages (e.g. Safari.app). In my mind I see a lot of parallels with objectstore and keystore. They are types of folders that the average user shouldn't ever need to view the contents of. I also imagine they will be presented to the user as a custom UI. By giving these folders extensions like .objstore & .keystore and "pretending" that they are just normal files, I believe that it will be much more intuitive for a user to double click these packages and open them in an "objectstore editor/browser" or similar.

protected is different to me since it's purely permissions and as I mentioned in #13 it still feels weird to have file-based permissions.

"Blob storage" in object stores

Apps that access the object-stores need a way to store/publish non-objects such as pictures. This can be as part of the object-stores or by some other mechanism, but it needs to fit into the object-store flows.

Projects and Workspaces Feedback

First off I'll say that I like the idea as a whole. I like that we're moving publishing to the background. I especially like that we're removing this from the DatArchive api.

To me the terms Projects and Workspaces don't immediately speak to what they do. Both terms are very similar (in the IDE world at least) and a bit generic. I think this is OK, but it will add to the overall learning curve.

One thing that's not entirely clear to me is what you're doing with (1) Forked Websites and (2) Websites that are created by the DatArchive apis. Both of these things are similar to Projects in that you "own" them, but for self-mutating sites, the user may not need to edit their files at all (and therefore wouldn't need a workspace).

I'm not sure how y'all were going to be laying out the ui, but I'd like to propose a workflow based on what I know:

  • Drop "Projects" and instead add a "My Websites" section.
  • This "My Websites" section would contain all forked websites, DatArchive created websites, and manually created websites.
  • You can manually create a site under this section by clicking a create or + button. This would immediately create a Workspace and send you to the "Workspace" screen.
  • For any website in the "My Websites" section, you would be able to click an edit button that will send you to the "Workspace" screen. You could delay Workspace generation until this button is clicked for the first time.
  • On this screen there should be a preview button that sends you to the workspace:// url.
  • an "unpublished changes" badge should be displayed on the "My Websites" section for any website with a workspace that still has changes in the staging area.

The main benefit of the "My Websites" concept is that it makes the "Projects" page more useful to our less technical users. Even without technical knowledge, the "My Websites" section could be used to browse, navigate to, and manage personal websites. As a bonus, if they ever decide to dip their toes into the dev world, a workspace is just an "edit" away. I think Beaker is unique in that it blurs the lines of user and developer. This would be one way to embrace that quality.

Concerns w/ Index.json

As I've mentioned, I like most of where the FS proposal is going, but the Index.json proposal seems problematic to me. I'm specifically worried about the user experience of non-devs if the file was exposed to them in the UI.

I feel that it could easily get in the way of actions like "select all Documents and drag them into a new folder". I also think that users would have a knee-jerk reaction of deleting the file since they didn't make it (see prior examples of users deleting system32 because they needed more space for their mp3s). In my mind, these types of files make sense for devs building a website (e.g. dat.json), but as soon as we enter the space of pure users, they're going to be tripping hazards.

You could make index.json a hidden file, but even with this you're one step away from recreating the oh so beloved .DS_Store. Deleting something like a DS_Store is usually harmless, but in the case of Index.json you would be accidentally deleting permissions, which to me seems dangerous.

At the very least I think these files should be hidden somehow, but possibly even completely abstracted from the UI. It would be nice if there was an alternative way of sharing folder metadata between devices that wasn't as problematic, but I honestly don't have any suggestions here.

(side note: the name is also a bit confusing, but that's a smaller issue imo)

App designated archive for internal storage

Overview

I have been writing multiple exploratory apps and have noticed a recurring pattern:
The app creates content which needs to be stored somewhere, and it has the following options, which unfortunately each have different downsides (described inline below).

Ask user to select an archive to write into.

  • Often implies that data will be shared across multiple apps, which for some kinds of data is undesirable.
  • Implies that there might be multiple archives, which makes the UX suffer and complicates the application architecture.
  • Good UX suggests not asking questions prematurely, which forces you to either risk losing data the user created before you ask for permission to save, or ask for permission sooner than would make sense.
  • Requires the app to preserve the selected archives somewhere like IndexedDB or localStorage. It also means the app should be able to gracefully handle that storage getting cleared, with an even worse UX when that happens.
  • Raises the question of how frequently content should be saved into the chosen archive. Each write has overhead (creating a version, communicating the update, etc.), so when the user's content is a text document it becomes a non-trivial choice.

Create a designated archive to write into.

Has all the same issues as above, except you can have a more subtle permission request that does not require the user to take immediate action.

Use IndexedDB instead

This addresses most UX concerns but carries an engineering burden:

  • IndexedDB is far slower (at least on my machine), sometimes ridiculously so.
  • The IndexedDB API is very different, so when it's time to actually make user-created content sharable, moving it from IndexedDB into a DatArchive is fairly inconvenient.
  • There is virtually no way to pass data from IndexedDB to another app. In theory a combination of some RPC mechanism and ArrayBuffers might allow it, but as things stand now it's not possible.

Disclaimer: the last claim isn't exactly true; I ended up hacking around it using a combination of a hidden iframe, a new tab, BroadcastChannel, and Blob URLs. But I still think it's fair to say it's impractical.

Proposal

I propose to assign each app a designated DatArchive instance with a limited amount of space. Maybe something like DatArchive.loadStorage(location):Promise<DatArchive>. In a way it is similar to the self-mutating site pattern, except:

  1. The loaded archive will be the archive designated for writing data into, rather than the archive of the site making the call.
  2. There will be a 1:1 mapping between the calling archive and its designated storage archive (so that the app won't have to remember it).
  3. The loaded archive will have some designated storage capacity (I think it would be reasonable to let the app request a quota increase).

This would address the use case described above nicely, as apps will be able to:

  1. Store draft content in their own archive.
  2. Store as frequently as it makes sense.
  3. Avoid prompting the user ahead of time.
  4. Pass data to another app just by passing a URL (for example, navigate to the app that does publishing and pass a query argument for the data it wishes to be published). See the sketch after this list.
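A usage sketch of the proposed API, to make the flow concrete. DatArchive.loadStorage() does not exist in Beaker; its name and behaviour come from this proposal, and the file paths and publisher URL below are made up.

// Sketch only: loadStorage() is the API proposed above, not a real Beaker API.
const storage = await DatArchive.loadStorage(window.location)

// Store draft content in the app's own designated archive, as often as it
// makes sense, without prompting the user first.
await storage.writeFile('/drafts/post-1.json', JSON.stringify({ title: 'WIP' }))

// Hand the data to another app simply by passing a URL.
const publisherApp = 'dat://publisher.example'   // hypothetical publishing app
window.location = publisherApp + '/?source=' +
  encodeURIComponent(storage.url + '/drafts/post-1.json')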

Considerations

  1. If an app is able to write data that is in turn an app that gets its own writing space, it becomes possible to overcome the space limit. I propose to treat an archive returned by DatArchive.loadStorage(location):Promise<DatArchive> differently, in that the same call from within that archive would just return the same archive. That would prevent the hack to avoid the space limit, but at the same time would not limit the storage archive in any other way.

  2. Should storage archives be replicated or stay local? There is a benefit to allowing replication, as it would allow simple backup, and with multi-writer might allow sync across devices as well. That being said, it might make sense to prompt the user the first time another peer attempts to fetch data from it, as that would mitigate accidental data leaks while not being too much of a burden for backup / sync.

  3. Should the storage archive be versioned, and should the quota take versioning into account? On one side you'd want to write there often and not care about keeping the history, but on the flip side some apps might benefit from having the revisions this could provide; versioning without taking that into quota considerations could be abused too. I'm inclined to suggest that dropping versions and having a way to manually pin / unpin specific versions might be best. If that is technically not viable, I'd suggest no versions, as at the end of the day an app could create its own versioning mechanism if desired.

  4. Is storage mapped to a domain name or an archive key?
    I'm not sure what would be a good choice: on one hand I can see wanting the domain name in case I map a different dat to it in the future; on the other hand I also see not wanting to make users lose data in case I end up changing the domain name.

    Ideally the store could survive both domain name changes and mapped archive changes, but I do not know how that can be pulled off. Another option would be to detect those migrations somehow and prompt the user with a choice when the store is requested.

  5. Should site forks also fork the corresponding store?

    I think one forks a site to make changes to it, so it would make sense to also fork the store along with it, or maybe share the same store? Furthermore, I'm not sure what the user of a fork should get. My instinct is that it might be best to just create separate stores and have some mechanism for the user to migrate data from one to the other.

Securing the user's personal dats

We need to make sure the private/public dats can't have html/js written which then does something ugly to the user's data.

Some ideas:

  • The private dat maybe shouldn't execute HTML/JS at all, because it creates a huge data-exfiltration risk. It also shouldn't allow CORS.
  • Perhaps the public dat should have reduced privileges. For instance, it should not be allowed to self-modify.
  • We could remove the ability of apps to write html/js, though that does stop "website making apps" from working on the user's website.

What would the Shared Data Layer look like without WebDB?

Why I'm Uncomfortable with WebDB

In the WebDB proposal, you are laying out some recommended "social primitives" built on top of the shared data layer proposal. As mentioned on twitter, I'm not completely comfortable with WebDB yet. Based on our conversation, I think you eventually want to provide developers with a way to share their own WebDB-like user data-types. If WebDB is just a stopgap for the more generic solution and its API will eventually be "explained", I can kind of understand moving forward with some version of WebDB; however, I think it produces some knee-jerk reactions for developers like me. Some of the reasons I think I'm having these knee-jerk reactions are:

  • I'm not used to a browser having strong opinions about the contents of my app

    "likes" vs. "reactions"

  • Even with enforced data-types, I feel like there will still be compatibility issues

    I want bold and italic in posts, did we all agree on saving our posts as html, markdown, rotondeml, or something else?

    or...

    I don't want bold and italic in my app, am I going to have to filter out everybody's html now?

  • I don't want new Beaker developers to give up on Beaker as a whole because they felt pressured to build their app a specific way or else suffer a loss to power/usability.

I think the last point is the main reason I'm knee-jerking about this. If WebDB is just a way for you to "buy time" until you can put together a better solution, I'd argue that even without these proposed APIs Beaker is already super cool and super powerful. Developers are already building interesting things. There's still plenty to explore in the current landscape.

The Shared Data Layer without WebDB

But I'm also having a hard time understanding what the Shared Data Layer would offer without WebDB. In the proposal, most of the examples also mention or include WebDB. So I had a few questions about some things that I felt the proposal didn't completely clarify:

  1. In the following example, what is the service?

    var session = await navigator.sessions.request(service, {permissions?:})

    Later on you use the string 'webdb'. Is that a browser defined token just to get a "WebDB" session (whatever that might mean)? Is it just a user defined string? If so how does that work? Do we have to worry about naming conflicts?

  2. Other than passing a ServiceSession into a WebDB constructor, what else can I do with it?

  3. On twitter you mentioned:

    apps can still ask for full write access to a user dat and then create their own indexes

    What would that look like? Would this be done through the session api or would a developer need to use something like DatArchive.selectArchive()? Are user dats exposed to DatArchive.selectArchive() like a normal archive (that'd be a little weird but... idk)?

In general I'm just trying to figure out if there's any use for the session API without WebDB or if it relies on its existence to be useful.
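For what it's worth, my current reading of "full write access to a user dat, then create their own indexes" is something like the sketch below, using the existing DatArchive.selectArchive() picker rather than the session API. Whether user dats would actually appear in that picker, and the /posts and /index paths, are assumptions on my part.

// Sketch: assumes user dats can be chosen through the normal archive picker.
const userDat = await DatArchive.selectArchive({
  title: 'Select your user dat',
  filters: { isOwner: true }
})

// With plain write access, the app maintains its own index instead of WebDB's.
const files = await userDat.readdir('/posts')
const index = []
for (const name of files) {
  index.push(JSON.parse(await userDat.readFile('/posts/' + name)))
}
await userDat.mkdir('/index').catch(() => {})   // ignore "already exists"
await userDat.writeFile('/index/my-app.json', JSON.stringify(index))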

The fifth element of a minimal social network: blocking

From the WebDB proposal:

Here are the data types for WebDB which we're proposing for 0.8:

  • User profiles
  • Timeline posts
  • Comments
  • Votes

This is the basic toolset needed to make a simple Twitter clone. We'll expand it with new data models and types with subsequent 0.8.x releases.

I hate to be a broken record about this, but the above is not the whole basic toolset. The capacity to block a user from reading your content is something that either needs to be built in as early as possible, or needs a very clear story. To do neither of these things, or to do the second unconvincingly, is to tell a large proportion of the population that one purported future direction of the web is not being made with them in mind at all.

While I expect that 0.8 will draw a lot of attention, public relations is not the only reason to be thinking about blocking right now: for all I know, the best solution implies breaking changes to WebDB as proposed here. I don't know of a way to hide some records from some readers without putting them in a different archive entirely, for example.

Questions about WebDB and Injest

  1. I'm confused by the discussion of the distinction between the DatFS, Injest and WebDB "layers" on the WebDB page. In particular, all links to Ingest redirect from https://github.com/beakerbrowser/ingestdb to https://github.com/beakerbrowser/webdb. Are these separate packages/things? Is this documentation out of date and everything has now been rolled into WebDB? 🤔

  2. At a glance WebDB / IngestDB looks to be a reasonably well thought-through JS database implementation. But I wonder why you are taking this on (defining/implementing/maintaining yet another DB/Query Language/API on top of IndexedDB / LevelDB) when many other possible solutions exist, are much more mature, and have a ton of mindshare and tooling around them... I understand well that evaluating and debating the merits of one database over another can be a black hole, but the answer can't always be "we'll just avoid all that by creating yet another database project!" I have zero direct involvement with any other Web database projects, and so have no personal investment in your choice; but I have to ask: did you consider just adopting/extending something like PouchDB? Perhaps you did and found other available solutions lacking in some way (that writing a plug-in/extension couldn't solve), but I haven't found any discussion of that anywhere here.

It's clear that having some kind of robust database solution on top of Dat and easily accessible to Beaker Apps will be a huge win, so kudos for what you've done so far.

I'm also interested in the relationship you see between WebDB / IngestDB and the Dat team's own "roll your own" Dat backed JS database project HyperDB. There seems to be some significant overlap in vision with that as well.

Anyway, I'm not asking this intending to criticize, I'm just genuinely interested in why you decided to take on a sub-project in the context of Beaker that has the potential (over time) to be such a heavy lift, while not seeming to be the core value-add of what you are creating here. Beaker and the P2P Web discussion it is igniting are too exciting to have you guys bogged down maintaining yet another browser database!

Thanks for reading!

Export identity keys with passphrase

In order to back up identities or transfer them to another machine, there needs to be a mechanism for the user to export their private keys and import them into another instance of Beaker. Maybe a UI allowing the user to review all their identities and export/import one or more of them to an ASCII-armored file would work. Prompt the user to create a passphrase and explain that the file contains their private keys, so they need to be careful where they store it and how they use it.

Is there a sample implementation of object-store-folder?

I know it's probably not hard to implement and one can take a lot of inspiration from libfritter. I'm asking just to avoid duplicating work in case a sample implementation exists.

Great job btw!

FYI, I'm currently working on a project that has a db similar to fritter's (without the webdb/indexing part), was going to refactor it anyway and was thinking about using the object-store-folder design since it looks so nice and to try it out.

(Note: I can see the Not Implemented tag. I'm asking in case there's a standalone prototype implementation.)
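In case it helps frame the question, here is the kind of minimal helper I had in mind, assuming (this is my reading of the draft, not the spec's API) that an object-store folder is just a directory with one JSON file per record:

// Not a reference implementation, just a sketch of one-JSON-file-per-record.
async function putObject (archive, folder, id, record) {
  await archive.mkdir(folder).catch(() => {})   // ignore "already exists"
  await archive.writeFile(folder + '/' + id + '.json', JSON.stringify(record, null, 2))
}

async function listObjects (archive, folder) {
  const names = await archive.readdir(folder)
  return Promise.all(
    names
      .filter(name => name.endsWith('.json'))
      .map(name => archive.readFile(folder + '/' + name).then(JSON.parse))
  )
}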

An "applications" folder

As a possible replacement for the app:// scheme that you had talked about previously (and a natural follow-up to #14), you could have an "applications" folder w/ a bunch of .wapp packages that are literally just renamed mounts to websites. Double-clicking one would open that website. You could then allow users to click files and "Open with.." whatever .wapp packages they have installed on their dat fs. The app packages could have something akin to a .webmanifest and indicate their icons, what types of files they can open, and more.

update: after thinking through this a bit more, I realized that a mount wouldn't be appropriate, since web apps will need to route based on the dat root. So I guess if you did something like that, these packages would be more so "anchor links" than "symlinks". I'm also thinking that I may be misremembering the purpose of the app:// scheme; I'm going to try to read up on that again and post some more thoughts at a later time.

Namespace (and version) archive types from the outset

Hi, in reading through the new Archive Types document, one thing immediately pops into mind. The need for namespaces and versions...

I believe that the "type" spec should be "namespaced" from the outset, and there should be no "non-namespaced" types allowed. That is, it should be rotonde:user or beaker:application, not just user and application (I don't really care what the separator is...) Without this there will be confusion and type squatting and incompatibility. There are going to be a million uses of all of this when it takes off, and something as fundamental as user shouldn't be strapped down to an initial definition at the outset. This could all be done by convention of course, but at this moment you have the opportunity to anticipate the need and build it in.

I also think versioning could be useful, so that you could explicitly declare a Dat to contain a rotonde:user:v2 or some such. Again, I don't care much what the syntax is, but now is the time to think longer term about such issues. It will never be easier to make such changes than right now.
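For illustration, a dat.json using namespaced and versioned types might look like the following; the separator and the type names are placeholders, not a proposal for a specific syntax:

{
  "title": "My profile site",
  "type": ["rotonde:user:v2", "beaker:application"]
}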

I think it's fine for you to define some common/anticipated types now just to get the ball rolling, but please, please don't forever pollute the root global type namespace with an unversioned "user" type conceived in the infancy of this project. We will all likely live to regret that.

Thanks for listening.

cross origin versioned module cache

I had a longer discussion on twitter some time ago about web apis and user land.
In regular browsers it might be harder to do, but Beaker natively supports dat, so I imagine the following is feasible.
If this is an already well-discussed idea that doesn't make sense, then I'm just not familiar with the reasons and would be happy to learn about them.


I'm wondering if it's possible, instead of populating the global window namespace with more and more new web APIs, to add something like require(...) as "the only new web API" and use it to require the other new web APIs, similar to how node can require('http').
But it should probably be possible to explicitly specify the version someone wants to use, maybe as an additional argument or as a global config file window['package.json'], in order to make the cross-origin behavior safe.

In beaker, that would for now probably be:

  • const WebDB = require('webdb')
  • const DatArchive = require('dat-archive')
  • ...

Those might be built into Beaker the way node has built-in modules.
BUT: in the long run this would allow all those modules to become user-land modules (maybe on npm) that use lower-level Beaker APIs once they exist. require('webdb') or require('dat-archive') could then be removed from core; instead, installing Beaker Browser would ship a cross-origin versioned module cache pre-populated with these former web APIs as user-land modules (e.g. require('webdb'), ...).

If Beaker forced all those require('...') calls to be async, then, whenever a module is missing, it could even fetch the missing module and populate the module cache on the fly.

That way, adding more and more built-in web APIs like WebDB is completely OK, because they will or might turn into user-land modules in the future without spamming the global namespace. Over time, Beaker itself would probably converge on a minimal core of "beaker system calls" (also exposed via require(...), since in the future there could be an even more minimal core), while deprecated web APIs that become user-land only could move out of core and into the module cache instead.
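A rough sketch of what this could look like in app code, to make it concrete. None of this exists in Beaker today; require(), its async behaviour, and window['package.json'] are all taken from the idea above.

// Hypothetical sketch: Beaker has no require() today.
// A global config could pin the versions an app expects.
window['package.json'] = {
  dependencies: { 'webdb': '^4', 'dat-archive': '*' }
}

// Forced to be async so the browser can fetch a missing module into the
// cross-origin versioned module cache on the fly.
const WebDB = await require('webdb')
const DatArchive = await require('dat-archive')

const db = new WebDB('my-app-index')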

[Suggestion/Feedback] Reading order/organize the specs into layers

The problem
Reading the specs (which are awesome in design and awesome in writing, btw!), one doesn't know where to start and where to end. Also, sometimes it's not clear why dats are involved in some things; it's because they're the most primitive building block in beaker.

One suggested solution
After some time, I imagined a stack like this:

 -----------------------------------
| Beaker user identities            |
|-----------------------------------|
| Beaker user filesystem            |
|-----------------------------------|
| Object-store folders              |
|-----------------------------------|
| dat.json | Dat types | index.json |
|-----------------------------------|
| datprotocol                       |
 -----------------------------------

So, perhaps providing such stack to the readers may help, or a reading order.
