johari / minicell
(wip) A rich visicalc dialect with new datatypes inside cells. Recalc or die. 🏴‍☠️
=LOAD("ouroboros")
Populate and extract V (vertices) and E (edges) from semi-structured data.
The most basic way to extract a graph from a table is to extract it from the incidence matrix. We implemented this particular feature in b5d538c. These are the lines that are responsible for it:
minicell/src/Spreadsheet/Evaluator/Parser.hs
Lines 230 to 256 in 6b9b6d6
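The incidence-matrix extraction described above can be sketched roughly as follows. This is not the code from the referenced commit: the `IncidenceMatrix` type and the `edgesFromIncidence` name are assumptions for illustration, and the sketch only handles undirected edges, where each column touches exactly two vertices.

```haskell
import Data.List (transpose)

-- Hypothetical representation: one row per vertex, one column per edge;
-- a non-zero entry means the vertex touches that edge.
type IncidenceMatrix = [[Int]]

-- Recover undirected edges as pairs of 0-based vertex (row) indices.
-- Columns that do not touch exactly two vertices are skipped.
edgesFromIncidence :: IncidenceMatrix -> [(Int, Int)]
edgesFromIncidence rows =
  [ (u, v)
  | column <- transpose rows
  , [u, v] <- [[i | (i, x) <- zip [0 ..] column, x /= 0]]
  ]
```

For a triangle, `edgesFromIncidence [[1,1,0],[1,0,1],[0,1,1]]` yields the three edges between vertices 0, 1 and 2.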
- EYouTube value could be rendered as an embedded iframe
- <img> tag
- <video> tag
- erb
- mustache
- markdown
I'm not saying that Excel is *the best* spreadsheet program out there. But it is relatively simple, quick, and powerful, and the combination of those three qualities, as well as its relative prevalence, makes it the go-to choice for most people and industries. It's also what will probably continue to drive its popularity, despite all its quirks[1].
[1] Wikipedia Entry: Microsoft Excel Quirks (http://en.wikipedia.org/wiki/Microsoft_excel#Quirks)
This is a meta-issue tracking everything we need from the backend to support Full COMET.
/minicell/all.json
/minicell/all.json
One extreme example is =UNIXTIME(), which changes value every second.
Implementing this issue means we get much closer to a prototype of a collaborative graphsheet environment! :)
It is more important for us to [...] .xlsx files at this stage in the process.
However, it'd be nice if we could [...] .xlsx files as well.
Often, I prefer to see the graphical rendering of a graph that is stored in one cell while, in the meantime, I edit another cell.
For example, suppose A1 holds a graph of cities, A2 and A3 hold string literals that specify a source and a sink, and A4=SP(A1, A2, A3).
It is preferable that I can pin the side-view of A1 while I'm editing A2 and A3.
Later on, we can extend this issue to pin side-view of multiple cells, not just one!
I think this would be a super-useful usability improvement!
I'm doing little experiments so I can bring Haskell's diagrams package (1, 2 and 3) into Minicell. It only took a few lines of code, and I got an SVG in the side view after a few minutes (via diagrams-svg)!
This past week I've been writing a lot in my notebooks about how Minicell can benefit from the magic of Diagrams, even more so when we implement =UNIXTIME() and let cell values depend on time (see #33).
I've already been thinking about simple animations that I can make by superimposing multiple cells containing basic shapes (We are not too far from an actual interactive demo for this issue, I think!)
The following image is taken from 2:
This issue brings us:
We don't have a formula bar.
A formula bar is an essential part of the spreadsheet interface.
One issue is combining different namespaces with each other. These are examples of namespaces:
- SHA1
- snapshot
Now, if you think about it, there are multiple ways these namespaces are linked internally and also among each other:
Lines 104 to 116 in 48a2cc3
??)
In a software project, issues in the bug tracker certainly depend on each other; however, GitHub doesn't provide any means to explicitly declare dependencies between issues.
Ticket dependency: Bugzilla has a feature that lets you describe relationships between tickets. Each ticket can dependOn another ticket, and one ticket could have many other tickets that depend on it.
This information structure (where one issue depends on the completion of another issue) is an example of a graph [1], and I think Minicell can do a great job providing interoperability with GitHub issues.
=IMPORT_JSON function will load data from JSON
[1]: it's an example of a lattice
After we read GitHub data, we can wrangle graphs. To begin with, we can grep the issue description and mine links (e.g. #14 in the markdown) and use Minicell graph primitives to express interconnections among issues in our bug tracker.
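As a rough sketch of the link-mining step, the snippet below scans issue text for `#<digits>` references and turns them into edges. The names `issueRefs` and `referenceEdges` are hypothetical, and the scan is deliberately naive: it would also pick up `#123` inside URLs or code, which a real implementation would probably want to filter out.

```haskell
import Data.Char (isDigit)
import Data.List (nub)

-- Collect issue numbers written as "#<digits>", in order of first appearance.
issueRefs :: String -> [Int]
issueRefs = nub . go
  where
    go [] = []
    go ('#' : rest) =
      case span isDigit rest of
        ([], _)         -> go rest
        (digits, rest') -> read digits : go rest'
    go (_ : rest) = go rest

-- Turn (issue number, issue body) pairs into edges (issue, referenced issue),
-- dropping self-references.
referenceEdges :: [(Int, String)] -> [(Int, Int)]
referenceEdges issues =
  [ (from, to) | (from, body) <- issues, to <- issueRefs body, to /= from ]
```

The resulting edge list is the kind of input a graph-valued cell could be built from.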
Would it be possible to someday use Minicell itself to track all the bugs and tickets related to implementation of Minicell?
I've been intending to bring a basic support for video and audio through ffmpeg for a while. It's wise to offload the idea into a GitHub issue.
The following ffmpeg one-liner performs some of these:
$ ffmpeg -i z.mkv -ss 00:02:54 -t 49.1 -vn -acodec copy sijal.mp3
Apart from this, ffmpeg can apply various filters on a video stream.
Furthermore, it can combine two or more video streams into one.
EExpr)
eval function will use ffmpeg in the background
There's a wealth of content on YouTube, but there's no granular access to YouTube videos. Often, I'm only interested in a portion of a longer video. Although YouTube supports links to a specific point of a video (and also an ending, if you use the embedded player), it still isn't convenient to access, retrieve, and mix portions of YouTube videos with each other.
Many values change in the backend, but the frontend is not notified about these updates.
/minicell/all.json after every write operation
/minicell/all.json
/minicell/all.json
Right now we re-fetch everything after each individual write to a cell. (Pull)
See ( ... , cometUpdateAll) in line 396:
Lines 389 to 396 in 6b9b6d6
Should I go and learn more about CRDTs?
We plan to provide means to manipulate the graph inside the client. I estimate that task to be as big as this issue.
However, it calculates the right value once you re-write the same formula to the cell
I love webgraphviz.com. It's probably my favorite tool for thought. I use it every day, and I've recommended it to my friends. It's not your typical writing tool, of course. But if you're open-minded, you can get a lot out of it.
Consider this example:
minicell/examples/graphviz/make-a-website-for-a-friend.dot
Lines 1 to 28 in 4b3d3a0
Right now, we provide basic support for parsing this file, although we are not converting it to an EGraphFGL value. I don't know why the implementation of dotToGraph in the graphviz package does not pass on node and edge labels. Nevertheless, we can implement it ourselves using graphNodes and graphEdges:
minicell/src/Spreadsheet/Evaluator/Parser.hs
Lines 215 to 222 in 4b3d3a0
fgl graphs via dot and display as png inside the side-view
It would be nice to explore how graphsheets can provide graphql endpoints.
I think our datatypes and querying capabilities are rich enough to handle some interesting basic examples.
I don't have time to implement this myself as of late 2018. Getting the core UI working is more urgent at this point.
APART FROM HANDLING ERRONEOUS FORMULAS
THIS ISSUE IS DONE
The SimpleServer.hs currently stores the model inside a TVar.
Lines 71 to 99 in 33df1f9
This means we can use a command like this to write a new value to a cell:
$ curl -X POST http://localhost:3000/minicell/A2/write.json -d 'formula==SP(A1)'
(taken from 33df1f9#diff-7d442b7eb49f5fc377f51e74b291cfc1R40)
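To illustrate the idea of keeping the model in a `TVar` and updating it on each write, here is a minimal sketch. It is not the code from SimpleServer.hs: the `Model` type (a map from cell address to raw formula text) and the `writeCell`/`readCell` helpers are assumptions made for this example, and the HTTP layer is left out.

```haskell
import Control.Concurrent.STM (TVar, atomically, modifyTVar', newTVarIO, readTVarIO)
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified model: cell address -> raw formula text.
type Model = Map.Map String String

-- Record the formula for one cell; the update is a single atomic transaction.
writeCell :: TVar Model -> String -> String -> IO ()
writeCell modelVar address formula =
  atomically (modifyTVar' modelVar (Map.insert address formula))

-- Read the current formula of one cell, if any.
readCell :: TVar Model -> String -> IO (Maybe String)
readCell modelVar address = Map.lookup address <$> readTVarIO modelVar

-- Mirrors the curl example: write "=SP(A1)" to A2, then read it back.
demo :: IO (Maybe String)
demo = do
  modelVar <- newTVarIO Map.empty
  writeCell modelVar "A2" "=SP(A1)"
  readCell modelVar "A2"
```

A request handler for a write endpoint would call something like `writeCell` with the address taken from the URL and the formula taken from the request body.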
Edit mode to Idle mode. The logic of this is implemented in the Save.
/show.json endpoint returns
/write.json and update the cell
write.json via Elm
Lines 351 to 356 in 33df1f9
Lines 382 to 406 in 2990f94
Gunrock is a high-performance graph processing library that runs on GPU.
Setting up Gunrock and working with it directly requires a lot of work, and is intimidating for non-programmers as well as casual programmers (say a social network expert).
I propose we implement a Gunrock backend for Minicell. This way, non-programmers and casual programmers can run high-performance computations without investing time in setting up Gunrock and implementing code in C++ or Gunrock's Python API.
It seems like libgunrock.so is pretty solid. Here's an example usage with Python's FFI:
I think it will be nice to try using libgunrock.so in Haskell.
The evaluator (#13) could have new graph types for Gunrock, and the table of operations can use Gunrock to load large graphs into memory. Then we can use Gunrock to perform high-performance computations on graphs inside cells.
We implemented a mock evaluator inside Elm, but it's time to step up and plug a real parser and evaluator into the system.
This issue intends to
From high-priority to low priority:
I experimented with a toy evaluator inside Haskell. The main heavy lifting is done inside the evalIO function. I will post a link once they are added to the repo.
I plan to implement these operations as part of the operations provided by Minicell
I made a demo of a simple drag and drop mechanism last week (before December 9th) and I really like it. This week (December 16th) I implemented a basic feature in backend to accept file uploads for each cell.
I can upload files via curl now. Here's an example:
Lines 26 to 28 in 9bba94b
And as demonstrated in the video, the frontend supports
However, I need to
- POST request (similar to what curl does)
- (.mp4, .mov, .avi) and YouTube videos (with optional start and end)
- =GIF(A1:A5) or =ANIMATE(A1:A5)
- =GIFSPEED(G1, 0.5): Change playback speed
- =GIFREV(G1): Reverse the frames of a gif
- base64 encoding of the uploaded file inside the memory
- .mat files (related to #16)
- png corresponding to the fgl graph that is stored in the cell
- img tag to display it in the side view
- CometValues in Haskell
- CometSLit, but also CometFormula and so on
- CometValues into Elm values
Lines 363 to 382 in 5f37d9d
Lines 59 to 65 in 33df1f9
Also, adding data to these lines:
minicell/src/Spreadsheet/Types.hs
Lines 207 to 224 in 33df1f9
A_: endpoint, and B_: EExpr values)
erb, mustache, and other template engines (see also #36)
By that I mean:
What if you could store HTTP servers inside a cell?
I described this in my "Zine" notebook.
We are not trying to be a solution for persistency. (Ingres, MySQL, etc. are better at this) However, we aim to pull data from various sources, and we aim to expose contents of a sheet in as many ways as we can, including but not limited to
This issue is mostly about HTTP serverlets.
=RENDER("layout.html", "content", "Hello World!")
=RENDER("main.html")
connect="B2" attribute, used for a <td> and an <input>
https://dspace.mit.edu/handle/1721.1/100512#files-area
In Minicell, these are our most computationally intense domains:
sox and ffmpeg, but imagining "Digital Signal Processing" primitives inside Minicell is realistic) (#50)
With respect to Incremental Computation, the main questions are:
Spreadsheets are the intersection point of
As we support more streaming types (like audio, video, dynamic graphs, time and other temporal values), the codebase of Minicell becomes more complicated. On top of that, some of the IO that we do on large image, audio or video files can be costly. Of course we want to avoid unnecessary computation as much as we can.
As I was working on basic formulas for processing audio (#50, 19e4e83), I attempted to write a few lines of code to avoid unnecessary IO.
Instead of passing things down to ffmpeg, we first calculate an md5 sum of each file.
We name the output of audio computations based on the hash of their content.
Lines 359–363 check whether we have previously calculated the result of =ACONCAT(A1,A2).
minicell/src/Spreadsheet/Evaluator/Parser.hs
Lines 354 to 365 in 6b9b6d6
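The caching idea described above (name results after the hashes of their inputs, and skip the work when that name already exists) can be sketched in a pure setting as follows. This is an illustration only: `toyHash` is a stand-in and is not md5, the in-memory `Cache` map stands in for checking whether an output file exists on disk, and `cachedBinary` is a hypothetical helper rather than anything in Parser.hs.

```haskell
import Data.Char (ord)
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Stand-in for an md5 digest (NOT md5): a small polynomial hash rendered as text.
toyHash :: String -> String
toyHash = show . foldl' step 7
  where
    step :: Int -> Char -> Int
    step acc c = (acc * 31 + ord c) `mod` 1000000007

-- Cache key for a binary operation: operation name plus the hashes of both inputs.
cacheKey :: String -> String -> String -> String
cacheKey op a b = op ++ "-" ++ toyHash a ++ "-" ++ toyHash b

-- Stand-in for "does the output file already exist?".
type Cache = Map.Map String String

-- Run the (possibly expensive) operation only when its key is not cached.
-- Returns the result, the updated cache, and whether this was a cache hit.
cachedBinary
  :: String -> (String -> String -> String) -> String -> String -> Cache
  -> (String, Cache, Bool)
cachedBinary op f a b cache =
  case Map.lookup key cache of
    Just result -> (result, cache, True)
    Nothing     -> let result = f a b in (result, Map.insert key result cache, False)
  where
    key = cacheKey op a b
```

Because the key depends only on the operation and the content of its inputs, renaming an input file or moving the formula to another cell would still hit the cache.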
I'm not quite sure at this point. But I sense that we need to approach things in a more principled way. We need to be more organized about:
(This issue addresses only the third one. We need separate issues for the first two.)
This issue keeps track of development of the formula set.
G3=UNION(G1, G2)
I don't know what title to pick for this issue:
G2 = [ <VV, EE> | <V, E> <- G1, VV <- V, in_degree(VV) > 5, EE@(V1, V2) <- E, {V1, V2} โ VV ])
It seems like the main part of the work must be done in cometStorage in Elm. (https://github.com/johari/minicell/search?l=Elm&q=cometStorage)
We treated COMET values very specially when we implemented the frontend in Elm. But as we move along, the way we treat COMET, computed values and literals are changing.
We have addressed saving values via COMET (see #21) but something is fundamentally lacking from the data that we send from Haskell to Elm.
computed value AND original expression are sent from Haskell to Elm.
The parser and interpreter (#13) are related, but separate.
The parser is implemented with parsec. The code is mainly here:
minicell/src/Spreadsheet/Evaluator/Parser.hs
Lines 25 to 67 in 33df1f9
Right now, we support parsing these expressions:
42
=42
=A1
=SP(A1)
Hello world!
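As an illustration of how these five kinds of input can be told apart, here is a small sketch. It uses `Text.ParserCombinators.ReadP` from base rather than parsec so that it stays self-contained, and the `Expr` type is a hypothetical stand-in for Minicell's actual `EExpr`; anything that fails to parse as a formula or number falls back to a string literal.

```haskell
import Data.Char (isDigit, isUpper)
import Text.ParserCombinators.ReadP

-- Hypothetical expression type; the real EExpr in Minicell may differ.
data Expr
  = IntLit Int
  | StrLit String
  | CellRef String
  | App String [Expr]
  deriving (Eq, Show)

number :: ReadP Expr
number = IntLit . read <$> munch1 isDigit

cellRef :: ReadP Expr
cellRef = do
  col <- munch1 isUpper
  row <- munch1 isDigit
  pure (CellRef (col ++ row))

application :: ReadP Expr
application = do
  name <- munch1 (\c -> isUpper c || c == '_')
  args <- between (char '(') (char ')')
            (sepBy formula (skipSpaces *> char ',' <* skipSpaces))
  pure (App name args)

-- The part after '=': a function call, a cell reference, or a number.
formula :: ReadP Expr
formula = application <++ cellRef <++ number

-- A cell is "=<formula>", a bare number, or otherwise a string literal.
parseCell :: String -> Expr
parseCell input =
  case [e | (e, "") <- readP_to_S cell input] of
    (e : _) -> e
    []      -> StrLit input
  where
    cell = (char '=' *> formula) <++ number
```

With this sketch, `parseCell "=SP(A1)"` gives `App "SP" [CellRef "A1"]`, while `parseCell "Hello world!"` gives `StrLit "Hello world!"`.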
In commonplace spreadsheets, each cell represents one value, like a string literal or a number.
It's unconventional for spreadsheets to store a list of numbers inside one cell (one post-2010 system from MIT explores having lists as cell types).
I'm impartial about supporting lists as cell types for now. But I think, because of our emphasis on graphs, we need to support tuples. This way, we may have cells that capture a (from, to) relation (or a (from, to, label)).
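One way to picture tuple-valued cells is sketched below. The `EdgeCell` type and the `adjacency` helper are hypothetical and not part of Minicell's types; the sketch only shows how a column of (from, to) and (from, to, label) cells could be folded into an adjacency map.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical cell values for relations; not Minicell's actual types.
data EdgeCell
  = Pair String String            -- (from, to)
  | Labeled String String String  -- (from, to, label)
  deriving (Eq, Show)

-- Build an adjacency map: vertex -> [(neighbour, optional label)],
-- keeping neighbours in the order the cells were given.
adjacency :: [EdgeCell] -> Map.Map String [(String, Maybe String)]
adjacency cells = Map.fromListWith (flip (++)) (map entry cells)
  where
    entry (Pair from to)          = (from, [(to, Nothing)])
    entry (Labeled from to label) = (from, [(to, Just label)])
```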
Right now, this is handled in a very ad-hoc way:
Lines 109 to 122 in 2990f94
Although the implementation of the interface in Elm is in good shape, there are missing bits here and there that make the interface unsuitable for a serious demo. For example, rendering is not implemented for some critical data types.
For example, certain graph types are implemented in this file:
minicell/src/Spreadsheet/Types.elm
Lines 51 to 57 in 237f17a
but these functionalities are not implemented.
This is a medium-size task. The scale of this task is bigger than "polishing", but smaller than implementing things from scratch.
This is a meta-issue that keeps track of our bootstrapping efforts.
Please see these issues:
nix-shell -p "haskellPackages.ghcWithPackages (pkgs: with pkgs; [fgl generic-random QuickCheck brick fgl-arbitrary hspec diagrams palette mysql-simple hslogger wai warp aeson wai-websockets wai-extra wai-cors fgl-visualize graphviz wreq stache temporary pureMD5 time hxt tagsoup hoauth2 pandoc probability cborg serialise haxl fb http-conduit http-client-tls async hashable resourcet cabal2nix extra dhall heterocephalus csound-expression hint gitlib libgit2 hlibgit2 gitlib-libgit2 hlint])"
I use Nix for development, mainly because a one-liner in Nix properly drops me in a shell that just works. I add packages to the one-liner once in a while, but as of now, it looks like this:
nix-shell -p "haskellPackages.ghcWithPackages (pkgs: with pkgs; [fgl generic-random QuickCheck brick fgl-arbitrary hspec diagrams palette z3 mysql-simple logict HFuse hslogger aeson scotty])"
Since scotty doesn't work with Nix on Mac, I wrote SimpleServer.hs, which uses wai and warp. My one-liner looks like this on Mac:
nix-shell -p "haskellPackages.ghcWithPackages (pkgs: with pkgs; [fgl generic-random QuickCheck brick fgl-arbitrary hspec diagrams palette z3 mysql-simple logict hslogger wai warp aeson wai-websockets wai-extra])"
I tried to set up stack mainly because Haskero [1] and Intero [2] depend on it. I had success on Linux, but Mac failed me with esoteric link errors.
[1]: https://marketplace.visualstudio.com/items?itemName=Vans.haskero
[2]: https://github.com/commercialhaskell/intero
Mark the cell with appropriate error messages in case of following
I've mentioned this to a labmate once. Minicell would be a nice playground to experiment with ideas like this:
[...]
Maybe you could work on a deep learning toolkit inside the spreadsheet environment..
One that is natural and effective for a non-programmer to work with,
and leverages everything that the spreadsheet interface provides
I don't know what the related work on this is already covering..
I'm sure others have done a couple of works in this spirit.
Perhaps nothing that leverages spreadsheets though.
On the other hand, I don't necessarily think making models like that commonplace is a good idea. For example, disappointingly, there are more than 3-4 repositories on GitHub that are aiming to provide models for "ethnicity detection", "race detection", "gender detection" and other things of this nature.. The last thing you'd want is for some loan or insurance company to plug-in an "ethnicity detection" model to their budgeting spreadsheet.
My criticism is that models like this provide little explanation about the answer that they come up with. It's still an open challenge to make these systems describe their answers in a way that humans can understand. Any piece of work that tried to address this open challenge has been out of my realm of comprehension so far.
I think it would be "cool" to have TF models in a cell. It would probably keep a lot more GPUs busy, but cool and computationally intensive doesn't imply applicability in real-world phenomena. And I'm worried that having them in Minicell will reverberate something unholy. Models like this tend to encode, calcify and amplify hard-to-trace biases in the original training data. Let alone their wide applicability in extremely brutal surveillance.
There are some other applications though. For example, I wish you could implement a clone of Dynamicland's object detection inside spreadsheets. (think of what https://paperprograms.org/ does, but entirely within Minicell instead of Node)
Piping a stream of images and extracting a stream of structured information and geometrical attributes from them is a great fit for the capabilities that Minicell (or any spreadsheet environment) is planning to implement. But I need more concrete use-cases that sound healthy and interesting enough before starting to bridge between Minicell and TF.
The key difference between SourceForge and GitHub was that GitHub adopted <username>/<repo> addressing (the bazaar), mixing social networks and software projects, whereas SF.net encouraged a project-centric view of the open-source world (the cathedral). (This was facilitated, of course, by advances in distributed version control systems, DVCS, but GitHub had a particular emphasis on people from the get-go.)
As #23 matures, Minicell spreadsheets become more than just spreadsheets. After #23, Minicell documents will resemble glitch.com applications (serving hypertext and dynamic webpages).
If we look at each minicell document as an "app", it will make sense for these "apps" to be forked and remixed as well. This is where this issue comes in.
One level of bringing collaboration inside spreadsheet environments is doing it the way Google does it: realtime, collaborative editors, both for word processing and for basic data processing and computation (spreadsheets and forms). Another is the glitch model, which puts trust in the node ecosystem and containers.
I think there's a mid-point between glitch model and google spreadsheet model. And that's Minicell. Collaborative, Approachable and Programmatic.
In the interface, we don't have a nice way to edit mustache strings.
Or a few lines of a function definition. (in JS, or Python, or Lisp)
I added a basic support for Mustache templates here:
minicell/src/Spreadsheet/Evaluator/Parser.hs
Lines 187 to 201 in 48a2cc3
Even if we implement the formula bar (see #38), I think we need a <textarea>-like editor so that we can edit mustache templates with ease.
The interesting thing is that as of now (DEC 19th), we provide a basic support for HTML rendering. The main issue that addresses this is #23, but to summarize, we are able to browse to http://localhost:3000/minicell/B2/HTTP/ and view an HTML document. Basic support for routing and serving remote images is included as well. Please refer to these snippets:
Lines 163 to 207 in 48a2cc3
Lines 74 to 92 in 9e673aa
With mustache and HTTP support (#36 and #23), and given how we provide a basic support for including images inside cells (#22), Minicell is at a stage that it can render simple HTML pages.
The pages, in increasing order of difficulty, could be
EExpr expressions
We definitely want to be done with the following issues. They will make the current one much easier:
Show website in Glitch) #23
To some extent
this issue reminds me of
glitch.com
where they chose to bias towards javascript
(for practical reasons)
we biased towards spreadsheets
(for a different set of practical reasons)
Let #36 populate cells,
Let #23 materialize!
Create dynamic hypertext.
Share dynamic hypertext.
Fork dynamic hypertext.
We have (used to have?) type-level support for lists, but so far (as of v0.0.2) we haven't been utilizing lists.
To be more precise, I mean using lists as value for one single cell.
I think I have a use-case for lists.
As I was thinking through GitHub interoperability (#42), I thought it would be nice if we could establish a dependsOn relationship among individual issues.
[TO BE CONTINUED]
This type extends EGraphFGL (Gr String Int) to EGraphFGLE (Gr EExpr EExpr).
This type came up in context of #39.
=LOAD("cities")
=LOAD("org-tree")
=MF
=SP
=SHORTEST_PATH(G1, "davis", "berkeley")
.mat file via Gunrock API
FIXME
jpg images
hostname
Either tag1 tag2, Both tag1 tag2, Just tag1)
Right now, there are a lot of examples in Elm and Haskell.
For the purposes of this issue, the examples mainly need to be in Haskell.
minicell/src/Spreadsheet/Types.hs
Lines 160 to 166 in 33df1f9
These are examples in other languages:
minicell/src/Spreadsheet/Example.elm
Lines 15 to 854 in 33df1f9
minicell/src/Spreadsheet/Example.elm
Lines 854 to 949 in 33df1f9
Lines 691 to 706 in 33df1f9
I'm trying to see how I can map reflex-frp abstractions to spreadsheet abstractions.
My goal is to have a basic example consisting of 3 cells:
From the interface side, this is what I need to implement
How long does it take to eval a minicell query?
For example
Additionally, if we store the start timestamp of our computation, we can plot this information on a timeline!
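A minimal sketch of such a measurement, assuming evaluation is exposed as an `IO` action: it records the start timestamp and the elapsed wall-clock time around the action. The `Timing` record and `timeEval` are hypothetical names, and `evaluate` only forces the result to weak head normal form, so deeply lazy results may be under-measured.

```haskell
import Control.Exception (evaluate)
import Data.Time.Clock (NominalDiffTime, UTCTime, diffUTCTime, getCurrentTime)

-- When the computation started and how long it took (wall-clock).
data Timing = Timing
  { startedAt :: UTCTime
  , elapsed   :: NominalDiffTime
  } deriving Show

-- Run an action, force its result to WHNF, and report the timing alongside it.
timeEval :: IO a -> IO (a, Timing)
timeEval action = do
  start  <- getCurrentTime
  result <- action >>= evaluate
  end    <- getCurrentTime
  pure (result, Timing start (diffUTCTime end start))
```

Collecting the `Timing` values for every evaluated cell would give the (start, duration) pairs needed for a timeline plot.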