roc-lang / book-of-examples

Software Design by Example in Roc

License: Other

Roc 91.72% HTML 6.45% CSS 1.71% Makefile 0.12%

roc software-design tutorial

book-of-examples's People

hajagosnorbert fabhof stuarth faldor20 jwoudenberg shritesh nathanielknight isaacvando agu-z ashleydavis neilccbrown hristog kdziamura noelrivasc monmcguigan

book-of-examples's Issues

Topic proposal: A transpiler to JavaScript

I propose a new topic for the book: a transpiler to JavaScript. The idea is to write a transpiler (a source-to-source compiler) from a simple imperative programming language to JavaScript (JS). This new language, which I call KahwahScript¹ (KS for short), is similar to JS in syntax but has fewer constructs and simpler or different semantics.

Some examples of different semantics:

Dynamically typed like JS, but strongly typed like Python.
Variable declaration requires initialization.
Only false and null are falsey.
All variables are lexically scoped.
Functions have fixed arity.
No undefined.

Some examples of missing constructs:

No anonymous functions.
No for loops, only while loops.
Only number, boolean, string and function types.
No classes or objects.

In terms of code, we’ll need to write a Tokenizer, a Parser, a Translator and a CodeGenerator. This can be structured as two chapters, the first covering the first two components and the second covering the last two.

Alternatively, we can make writing the Tokenizer and Parser an exercise for the readers while providing hints for their implementations, because there may be other chapters about parsing. In that case, we can have a single chapter that focuses on translating KS to JS and on code generation.

We limit the scope to these components only, and leave the rest of the work as exercises for the readers. This may include:

Writing an interpreter for KS.
Using the interpreter to test the transpiler implementation.
Writing a REPL for KS.
Enriching the language/transpiler and the runtime to enable writing a self-hosted KS-to-JS transpiler.

The main takeaways from the chapter(s) will be:

Writing an AST-to-AST translation based compiler.
Learning the nuances of JS, and working around them to preserve the semantics of the source language.
Optionally, if we choose to have a chapter on tokenization and parsing, how to use parser combinators to write a tokenizer and a recursive descent parser.
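
As a rough illustration of the AST-to-AST idea, here is a minimal TypeScript sketch of how one KS rule ("only false and null are falsey") might be preserved when targeting JS. The AST shapes, the `$truthy` runtime helper, and the function names are all hypothetical and are not taken from the Haskell proof of concept.

```typescript
// Hypothetical KS AST: just enough to show one semantic difference.
type KsExpr =
    | { kind: "num"; value: number }
    | { kind: "null" }
    | { kind: "var"; name: string };
type KsStmt =
    | { kind: "assign"; name: string; value: KsExpr }
    | { kind: "while"; cond: KsExpr; body: KsStmt[] };

// Hypothetical JS AST: the KS nodes plus calls to runtime helpers.
type JsExpr = KsExpr | { kind: "call"; callee: string; args: JsExpr[] };
type JsStmt =
    | { kind: "assign"; name: string; value: JsExpr }
    | { kind: "while"; cond: JsExpr; body: JsStmt[] };

// Runtime helper that would be emitted once at the top of the generated program.
const prelude = "function $truthy(v) { return v !== false && v !== null; }";

// Translator: KS AST -> JS AST. Loop conditions are wrapped so that
// JS-falsy values such as 0 and "" still count as true, as they do in KS.
function translate(stmt: KsStmt): JsStmt {
    switch (stmt.kind) {
        case "assign":
            return stmt;
        case "while":
            return {
                kind: "while",
                cond: { kind: "call", callee: "$truthy", args: [stmt.cond] },
                body: stmt.body.map(translate),
            };
    }
}

// Code generator: JS AST -> JS source text.
function genExpr(expr: JsExpr): string {
    switch (expr.kind) {
        case "num":
            return String(expr.value);
        case "null":
            return "null";
        case "var":
            return expr.name;
        case "call":
            return expr.callee + "(" + expr.args.map(genExpr).join(", ") + ")";
    }
}

function genStmt(stmt: JsStmt): string {
    switch (stmt.kind) {
        case "assign":
            return stmt.name + " = " + genExpr(stmt.value) + ";";
        case "while":
            return "while (" + genExpr(stmt.cond) + ") { " + stmt.body.map(genStmt).join(" ") + " }";
    }
}

// KS source: while (n) { n = null; }
const loop: KsStmt = {
    kind: "while",
    cond: { kind: "var", name: "n" },
    body: [{ kind: "assign", name: "n", value: { kind: "null" } }],
};
const generated = genStmt(translate(loop));
// generated is: while ($truthy(n)) { n = null; }
```

With `n` starting at 0, a naive translation to `while (n)` would never enter the loop in JS, whereas the wrapped condition runs the body once, which is what the KS rule requires. Calling a helper rather than inlining `n !== false && n !== null` also avoids evaluating the condition twice.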

The main potential complication I see is that readers may have to learn too much about the idiosyncrasies of JS.

For a proof of this concept, I’ve already written an implementation of the transpiler in Haskell.

¹ Kahwah is a traditional preparation of green tea consumed in the northern Indian subcontinent and, in this case, a play on the word Java (a coffee).

topic: HTML template engine

I have an idea for a design; will update the issue when I get a chance.

topic proposal: machine learning from first principles

This is a bit more general than #4.

The proposal involves introducing fundamental machine learning concepts such as data preprocessing, feature engineering, model training, validation and evaluation (essentially, steps which make up a machine learning pipeline).

I've been working on a library called RocLearn, whose API is close to that of Python's sklearn library. The idea is to implement fundamental machine learning algorithms (k-nearest neighbours, Principal Component Analysis, Support Vector Machines, Linear Regression, etc.) from first principles while abstracting away the details of each one, so that they can be treated as modules that are plugged in and out of the more general steps of a machine learning pipeline.

topic: diff tool

A git-like, minimalistic version control system.
If the term isn't deemed too hyped up (and thus a distraction from the actual substance), the chapter could also touch upon the connection to and parallels with blockchain technologies.

add link to rendered site

I appreciate this might still be WIP, but why not add a link to https://roc-lang.github.io/book-of-examples/ from the GitHub page and/or README file?

topic: file backup

The problem

Create a system that can back up a directory and restore the directory's state from the backup.

Technical challenges

Multiple backups with little overhead

Takeaways

Hash functions: how to represent something big with something small
Testing inverse functions, in this case dirState == restore(backup(dirState))
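
The round-trip property and the "little overhead" goal can be sketched together in TypeScript. Everything below is an assumption made for illustration: a directory is modelled as a plain object from path to content, `backup` and `restore` are hypothetical functions, and a toy djb2-style hash stands in for a real cryptographic hash such as SHA-256.

```typescript
// Hypothetical model: a directory is a map from relative path to file content.
type DirState = Record<string, string>;

// A backup stores each distinct content once, keyed by its hash,
// plus a manifest that maps every path to the hash of its content.
interface Backup {
    blobs: Record<string, string>;    // hash -> content
    manifest: Record<string, string>; // path -> hash
}

// Toy 32-bit djb2-style hash (illustration only, collisions are possible).
function toyHash(text: string): string {
    let hash = 5381;
    for (let i = 0; i < text.length; i++) {
        hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
    }
    return hash.toString(16);
}

// Back up a directory, reusing blobs from earlier backups so that
// unchanged content costs nothing extra to store.
function backup(dir: DirState, existingBlobs: Record<string, string> = {}): Backup {
    const blobs: Record<string, string> = {};
    for (const key of Object.keys(existingBlobs)) blobs[key] = existingBlobs[key];
    const manifest: Record<string, string> = {};
    for (const path of Object.keys(dir)) {
        const key = toyHash(dir[path]);
        blobs[key] = dir[path];
        manifest[path] = key;
    }
    return { blobs, manifest };
}

function restore(saved: Backup): DirState {
    const dir: DirState = {};
    for (const path of Object.keys(saved.manifest)) {
        dir[path] = saved.blobs[saved.manifest[path]];
    }
    return dir;
}

function sameDir(a: DirState, b: DirState): boolean {
    const paths = Object.keys(a);
    if (paths.length !== Object.keys(b).length) return false;
    return paths.every((path) => a[path] === b[path]);
}

const original: DirState = { "a.txt": "hello", "b.txt": "hello", "notes/c.txt": "world" };
const first = backup(original);
const second = backup({ "a.txt": "hello" }, first.blobs);
const roundTripOk = sameDir(restore(first), original);
```

Here `first` holds two blobs for three files because two files share the same content, and `second` adds no new blobs at all. A property-based test would check `sameDir(restore(backup(d)), d)` for many generated directories rather than one hand-written case.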

Storytelling ideas

The Library of Babel

proposal: use Jekyll/GitHub Pages _temporarily_ to build a website for this material

I propose using standard GitHub Pages tooling (i.e., Jekyll) to build a website for this material while we are working on rough drafts, and to move to something Roc-based once we have a better idea of what we need.

Pros:

Immediately available: see the website branch of this repo, which uses a template from previous book-ish projects I've worked on.
Having something up right away will give us a better idea of what we actually want from whatever static site generator we eventually choose.
"Use GHP's default" can save a lot of arguing over what's the best SSG and why.

Cons:

Local preview requires Jekyll.
- Contributors shouldn't be worried about formatting at this point (@gvwilson can tidy that up as PRs come in if necessary).
- Jekyll is straightforward to set up for people who do want to preview.
We ought to use a Roc-based SSG.
- Translation will be low effort, since the templating is very simple.
- Building this with an off-the-shelf tool will give us a better idea of what we actually need in that SSG.

topic: HTML parser

topic: continuous integration

The focus here would not be "how do I run a process" but "how do I handle partial failure and restart".

topic: file transfer (FTP)

I'd like to go ahead with this chapter, which involves sending files between a client and a server over a network.

topic proposal: logging framework

Original proposal:

discovering tests

running them

mocking external dependencies

Revised proposal (see thread below): implement a logging framework on top of two different platforms to compare/contrast the ways those platforms affect design.

topic proposal: Parser combinator library

Something that could be quite fun and useful to do is showing how to implement a simple parser combinator library.

Could build on some of the BSON encoding/decoding, where we can take that shape of data and show how you would extend it to parse it into the ADT representing it.

I'd include showing how to combine simple operations for parsing like A | B, A ~ B, as well as for simple data types like strings, bools, digits etc.
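
To make the `A | B` and `A ~ B` operations concrete, here is a minimal TypeScript sketch of what such a library could look like. The names `alt`, `seq`, `mapParser`, `literal`, `digits` and `bool` are hypothetical, and a Roc version would likely differ in both naming and types.

```typescript
// A parser consumes a prefix of the input and returns a value plus
// the remaining input, or null on failure.
type ParseResult<T> = { value: T; rest: string } | null;
type Parser<T> = (input: string) => ParseResult<T>;

// Match an exact string.
function literal(text: string): Parser<string> {
    return (input) =>
        input.slice(0, text.length) === text
            ? { value: text, rest: input.slice(text.length) }
            : null;
}

// A | B: try a, and fall back to b on the same input if a fails.
function alt<A, B>(a: Parser<A>, b: Parser<B>): Parser<A | B> {
    return (input) => {
        const first = a(input);
        return first !== null ? first : b(input);
    };
}

// A ~ B: run a, then run b on what a left over; succeed with both values.
function seq<A, B>(a: Parser<A>, b: Parser<B>): Parser<[A, B]> {
    return (input) => {
        const first = a(input);
        if (first === null) return null;
        const second = b(first.rest);
        if (second === null) return null;
        return { value: [first.value, second.value], rest: second.rest };
    };
}

// Transform the value produced by a parser.
function mapParser<A, B>(parser: Parser<A>, f: (value: A) => B): Parser<B> {
    return (input) => {
        const result = parser(input);
        return result === null ? null : { value: f(result.value), rest: result.rest };
    };
}

// Parsers for simple data types, built from the pieces above.
const digits: Parser<number> = (input) => {
    const match = /^[0-9]+/.exec(input);
    return match === null
        ? null
        : { value: Number(match[0]), rest: input.slice(match[0].length) };
};

const bool: Parser<boolean> = alt(
    mapParser(literal("true"), () => true),
    mapParser(literal("false"), () => false)
);

// digits ~ "," ~ bool
const pair = seq(digits, seq(literal(","), bool));
const parsedPair = pair("42,true!");
```

Parsing `"42,true!"` with `pair` succeeds with the value `[42, [",", true]]` and the remaining input `"!"`, while `pair("x")` returns `null`. Returning `null` keeps the sketch short; reporting where and why a parse failed would need a richer result type.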

topic proposal: property-based testing framework

A simple property-based testing framework, with generators provided for primitive types and collections and the ability to define generators for your own custom types.

Advanced features like shrinking may have to be skipped to keep the project reasonably small though.
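
A minimal TypeScript sketch of such a framework, without shrinking, might look like the following. The names `Gen`, `forAll`, `intBetween`, `listOf` and `mapGen` are hypothetical, and the seeded linear congruential generator is only there so that runs are repeatable.

```typescript
// A source of pseudorandom numbers in [0, 1); seeded so that runs are repeatable.
type Rng = () => number;

function makeRng(seed: number): Rng {
    let state = seed >>> 0;
    return () => {
        // Small linear congruential generator; the product stays below 2^53,
        // so the arithmetic is exact in double precision.
        state = (state * 1664525 + 1013904223) % 4294967296;
        return state / 4294967296;
    };
}

// A generator produces one random value of type T from the shared Rng.
type Gen<T> = (rng: Rng) => T;

function intBetween(lo: number, hi: number): Gen<number> {
    return (rng) => lo + Math.floor(rng() * (hi - lo + 1));
}

function listOf<T>(item: Gen<T>, maxLength: number): Gen<T[]> {
    return (rng) => {
        const length = Math.floor(rng() * (maxLength + 1));
        const out: T[] = [];
        for (let i = 0; i < length; i++) out.push(item(rng));
        return out;
    };
}

// Build a generator for a custom type from an existing generator.
function mapGen<A, B>(gen: Gen<A>, f: (value: A) => B): Gen<B> {
    return (rng) => f(gen(rng));
}

// Run the property on many generated values; return the first
// counterexample found, or null if every run passed.
function forAll<T>(
    gen: Gen<T>,
    property: (value: T) => boolean,
    runs: number = 100,
    seed: number = 1
): { counterexample: T } | null {
    const rng = makeRng(seed);
    for (let i = 0; i < runs; i++) {
        const value = gen(rng);
        if (!property(value)) return { counterexample: value };
    }
    return null;
}

// A true property: reversing a list twice gives back the original list.
const reverseTwice = forAll(listOf(intBetween(-100, 100), 10), (xs) =>
    xs.slice().reverse().reverse().join(",") === xs.join(",")
);

// A false property: doubling does not make zero or negative numbers larger.
const doubling = forAll(intBetween(-5, 5), (n) => n * 2 > n);

// A generator for a custom type, derived with mapGen.
interface Temperature { celsius: number }
const temperature: Gen<Temperature> = mapGen(intBetween(-40, 50), (celsius) => ({ celsius }));
const plausible = forAll(temperature, (t) => t.celsius >= -40 && t.celsius <= 50);
```

Here `reverseTwice` and `plausible` come back as `null`, while `doubling` reports a counterexample that is zero or negative. Without shrinking, the counterexample is whatever value happened to be generated first, not the smallest failing one.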

topic proposal: HTML linter

depends on #2

topic proposal: a parser with useful error messages

#2 proposes an HTML parser; this chapter would be a follow-on that would show how to generate useful error messages from such a parser. (This suggestion motivated in part by the strengths and weaknesses of Roc's own error messages and the fact that most undergrad courses on compilers give this important topic very little attention.)

topic: text editor

I would like to work on the text editor chapter. I have a package for handling ANSI escape sequences, which are supported in most modern terminals; it is most useful when working with the basic-cli platform.

Here is an illustration of an app using this package, and the code for the tui-menu example.

I think I can expand this example to include opening, editing, and saving files. The main limitation I can think of is that it would be restricted to ASCII and would not support full Unicode.

Upload slides

Can you upload the slides from yesterday to this repo @gvwilson?

meeting 2024-04-10

Agenda

Welcome new participants
Vote: temporary GHP site #50
Vote: prohibiting external packages #46
Discuss: lesson patterns
- PETE (problem, example, theory, elaboration)
- PRIMM (predict, run, investigate, modify, make)
- C&R (challenge and response)
Next steps for April

topic: package manager

@rtfeldman discussed several ways to manage packages with Louis Pinfold in a recent podcast; implementing some of those options in Roc in order to show why it's such a hard problem might not reveal anything about functional programming per se, but would help people understand Roc's design choices.

topic: regex pattern matching

A Roc adaptation of https://third-bit.com/sdxjs/pattern-matching/ will be a nice chance to show off a functional approach. I'd love to volunteer for it!

topic: dynamic thumbnail gallery layout

I have a photo gallery layout algorithm that I'd like to port from TypeScript to Roc.

It is capable of doing incremental layout (as new assets are downloaded in the background) which allows it to do the layout for 100k images without the user even noticing.

Live demo with 100k assets: https://photosphere-100k.codecapers.com.au

The TypeScript type signature looks like this:

export interface IGalleryItem {
    _id: string;
    width: number;
    height: number;
    thumbWidth?: number;
    thumbHeight?: number;
    aspectRatio?: number;
    // --snip--
}

export interface IGalleryRow {
    items: IGalleryItem[];
    offsetY: number;
    width: number;
    height: number;
    // --snip--
}

export interface IGalleryLayout {
    rows: IGalleryRow[];
    galleryHeight: number;
}

declare function computePartialLayout(layout: IGalleryLayout | undefined, items: IGalleryItem[],
    galleryWidth: number, targetRowHeight: number): IGalleryLayout;

It will be interesting to benchmark the TypeScript code against the Roc code.

It would be nice to do the entire gallery frontend in Roc, but for the book it is possible to demonstrate the layout algorithm in code that is runnable from the CLI.
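
To give a feel for what a CLI-runnable version could demonstrate, here is a simplified TypeScript sketch of an incremental row layout with the same signature. It is an assumed greedy algorithm written for illustration, not the algorithm used in the live demo, and the interfaces are trimmed to the fields shown above.

```typescript
interface IGalleryItem {
    _id: string;
    width: number;
    height: number;
    thumbWidth?: number;
    thumbHeight?: number;
}

interface IGalleryRow {
    items: IGalleryItem[];
    offsetY: number;
    width: number;
    height: number;
}

interface IGalleryLayout {
    rows: IGalleryRow[];
    galleryHeight: number;
}

// Size every item in a row to the given height, preserving aspect ratio.
function layoutRow(items: IGalleryItem[], offsetY: number, height: number): IGalleryRow {
    const sized = items.map((item) => ({
        ...item,
        thumbWidth: (item.width / item.height) * height,
        thumbHeight: height,
    }));
    const width = sized.reduce((sum, item) => sum + item.thumbWidth, 0);
    return { items: sized, offsetY, width, height };
}

function computePartialLayout(layout: IGalleryLayout | undefined, items: IGalleryItem[],
    galleryWidth: number, targetRowHeight: number): IGalleryLayout {
    // Keep earlier rows untouched; re-flow only the last row, which may be unfinished.
    const rows: IGalleryRow[] = layout ? layout.rows.slice(0, -1) : [];
    const pending: IGalleryItem[] = layout && layout.rows.length > 0
        ? layout.rows[layout.rows.length - 1].items.concat(items)
        : items.slice();

    const last = rows[rows.length - 1];
    let offsetY = last ? last.offsetY + last.height : 0;
    let current: IGalleryItem[] = [];
    let aspectSum = 0; // sum of width / height for the items in the current row

    for (const item of pending) {
        current.push(item);
        aspectSum += item.width / item.height;
        if (aspectSum * targetRowHeight >= galleryWidth) {
            // Row is full: shrink its height so it spans the gallery width exactly.
            const height = galleryWidth / aspectSum;
            rows.push(layoutRow(current, offsetY, height));
            offsetY += height;
            current = [];
            aspectSum = 0;
        }
    }
    if (current.length > 0) {
        // Unfinished last row: keep the target height and leave it ragged.
        rows.push(layoutRow(current, offsetY, targetRowHeight));
        offsetY += targetRowHeight;
    }
    return { rows, galleryHeight: offsetY };
}
```

Because only the final row is recomputed, laying out items in batches should give the same result as laying them all out at once, which is itself a useful property to test. A real implementation would probably also limit how far a row may shrink below the target height.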

Here are some other ideas related to this that might also be worth considering:

Virtual view port
- Only showing the rows of the gallery that are visible (making it possible to render a gallery with 100k assets). Performance is key here, because we have to rerender as the user is scrolling through the gallery.
Prioritized download queue
- Thumbnails are queued for download to be displayed in the gallery.
- There are two priorities:
  - High priority for thumbnails that should currently be on screen (we need to download and show these in the shortest possible time).
  - Low priority for thumbnails in the pages before and after the current page being rendered (these are the thumbnails that the user will scroll to next, so they don't need to be displayed immediately).
Page-based loading of thumbnails
- Loading individual thumbnails from AWS S3 is very slow.
- To make a big set of thumbnails load quickly they are aggregated into a "page" containing 100 thumbnails. This means that one request replaces 100 when loading thumbnails. The "page" is packed on the backend (and cached in S3) and unpacked on the frontend.

topic: packing and unpacking binary data

Like this Python lesson, show how to pack and unpack binary data according to format specifications.
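
As a rough sketch of the idea in TypeScript, a format can be written as a list of field types that drives both packing and unpacking. The field names (`u8`, `u16`, `i32`, `f64`) and the choice of little-endian byte order are assumptions for this example; only the standard `DataView` methods are used.

```typescript
// Hypothetical format language: each field names a fixed-size numeric type.
type Field = "u8" | "u16" | "i32" | "f64";

const fieldSizes: Record<Field, number> = { u8: 1, u16: 2, i32: 4, f64: 8 };

// Write one value at the given byte offset (little-endian).
function writeField(view: DataView, field: Field, offset: number, value: number): void {
    switch (field) {
        case "u8": view.setUint8(offset, value); break;
        case "u16": view.setUint16(offset, value, true); break;
        case "i32": view.setInt32(offset, value, true); break;
        case "f64": view.setFloat64(offset, value, true); break;
    }
}

// Read one value from the given byte offset (little-endian).
function readField(view: DataView, field: Field, offset: number): number {
    switch (field) {
        case "u8": return view.getUint8(offset);
        case "u16": return view.getUint16(offset, true);
        case "i32": return view.getInt32(offset, true);
        case "f64": return view.getFloat64(offset, true);
    }
}

function pack(format: Field[], values: number[]): Uint8Array {
    if (format.length !== values.length) {
        throw new Error("format and values differ in length");
    }
    const total = format.reduce((sum, field) => sum + fieldSizes[field], 0);
    const bytes = new Uint8Array(total);
    const view = new DataView(bytes.buffer);
    let offset = 0;
    format.forEach((field, i) => {
        writeField(view, field, offset, values[i]);
        offset += fieldSizes[field];
    });
    return bytes;
}

function unpack(format: Field[], bytes: Uint8Array): number[] {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    let offset = 0;
    return format.map((field) => {
        const value = readField(view, field, offset);
        offset += fieldSizes[field];
        return value;
    });
}
```

Packing `[7, 513, -2, 1.5]` with the format `["u8", "u16", "i32", "f64"]` produces 15 bytes, and unpacking those bytes with the same format returns the original values. This is another pair of inverse functions, so it lends itself to the same round-trip testing style as backup and restore.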

topic proposal: auto-completion

Original proposal:

I've been working on a Roc project recreating this toy neural network engine, https://github.com/karpathy/micrograd, and think it might make an interesting chapter.

Revised proposal (see thread below): show how autocompletion works.

topic proposal: Redis-like key value store with write-ahead log

@gvwilson had suggested this, and I'd be very interested in writing it. That said, I don't believe it'd fit in an existing platform and would likely require something like a basic-tcp. Thoughts @rtfeldman or others?

topic: SVG rendering

IMO the book should showcase a platform implementation and turtle graphics is a fun and simple one to do.

topic proposal: pseudorandom number generators

We already have an implementation of PCG pseudorandom number generation in roc-random, but implementing random generators in a purely functional language seems like a good fit for this book because:

Most languages have something like a random() function which isn't pure (since it potentially returns a different answer each time it's called). Because of this, it's a common beginner surprise to discover that a random number generator in a pure FP language has to present a different API.
There are a bunch of different PRNG algorithms out there, including ones that are simple enough that you can write the "randomness" part in a few lines of code, and then focus on the concept of how computers do "randomness" and pure FP API differences (including generating things other than numbers) for the rest of the chapter - which are the more interesting parts.
There's a good opportunity for some "further reading" notes on why algorithms like PCG and xoroshiro provide better distributions of outputs, performance tradeoffs, entropy sources for secure randomness, etc.
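
The API difference mentioned in the first point can be sketched in TypeScript by writing the generator as a pure function that takes a seed and returns both a value and the next seed. The names below are hypothetical, and the algorithm is a simple linear congruential generator (not PCG), chosen because the "randomness" part fits in one line.

```typescript
// The generator state is an ordinary value that the caller passes along.
// It is assumed to be a non-negative integer below 2^32.
type Seed = number;

// A pure generator: the same seed always yields the same value and next seed.
type Gen<T> = (seed: Seed) => [T, Seed];

// One step of a linear congruential generator. The product stays below 2^53,
// so the arithmetic is exact in double precision.
const nextU32: Gen<number> = (seed) => {
    const next = (seed * 1664525 + 1013904223) % 4294967296;
    return [next, next];
};

// An integer in the inclusive range [lo, hi].
function intBetween(lo: number, hi: number): Gen<number> {
    return (seed) => {
        const [raw, next] = nextU32(seed);
        return [lo + Math.floor((raw / 4294967296) * (hi - lo + 1)), next];
    };
}

// Generate something other than a number by transforming a generator.
function mapGen<A, B>(gen: Gen<A>, f: (value: A) => B): Gen<B> {
    return (seed) => {
        const [value, next] = gen(seed);
        return [f(value), next];
    };
}

// Generate a fixed-length list, threading the seed through each step.
// The loop mutates local variables only, so the function is still pure.
function listOf<T>(gen: Gen<T>, count: number): Gen<T[]> {
    return (seed) => {
        const out: T[] = [];
        let current = seed;
        for (let i = 0; i < count; i++) {
            const [value, next] = gen(current);
            out.push(value);
            current = next;
        }
        return [out, current];
    };
}

const die = intBetween(1, 6);
const coin = mapGen(intBetween(0, 1), (n) => (n === 0 ? "heads" : "tails"));

const [rolls, afterRolls] = listOf(die, 3)(42);
const [rollsAgain] = listOf(die, 3)(42);
const [flip] = coin(afterRolls);
```

Calling `listOf(die, 3)` twice with the seed 42 gives the same three rolls both times, which is the surprise for people used to an impure `random()`. To get fresh values the caller has to pass along the returned seed, as the `coin` call does with `afterRolls`.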

proposal: Prohibit external packages

While writing the simulation chapter, I'm currently using an external pseudo-random package. Obviously I don't want to implement it while teaching simulations, so a package is needed in this case. That raises the question: how should we handle packages?
I have two related proposals:

We should restrict the use of packages to those that are created in one of the chapters. (I know there will be a chapter on PRNG, so simulation would be covered.) I suspect readers would find great satisfaction in using their own implementation of a package. That way, every line of Roc code used would be explained in the book. If an external package is needed to implement a chapter's code, that package needs to be explained in one of the chapters as well. That said, we can't expect readers to write the package code first and then use it. Some may, others won't, so the packages need to be hosted somewhere.
This GitHub repo should host the package releases used by other chapters. External packages imported from a URL are fragile, especially if those URLs are outside our control. The URLs will be physically printed, so they should be as stable as the page they are printed on. Hosting them under roc-lang's GitHub would give us more control and keep everything in one place.

A question related to the second point of the proposal:

I know little about Roc packages, but I think they need to have a predefined structure. Adhering to that structure while teaching a concept inside a chapter could distract the learner (though I don't know by how much). Should the hosted packages be identical to the content of the chapters, or should the chapters focus on teaching the concept at the expense of not being a proper package that the reader could import locally, so that a later chapter must import it from a URL even if the reader has already written it?

Note: By packages, I mean Roc packages, not platform packages.

topic: JSON ADT and codecs

I'm imagining writing an ADT for JSON structure and then showing how you can write encoders/decoders for this representation.

It would cover modelling data with algebraic data types, pattern matching, encoding/decoding etc.
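
As a rough idea of the shape this could take, here is a TypeScript sketch that models JSON as a tagged union, encodes it to text, and decodes it into a domain type. The `Json` type, the `Point` record and the function names are hypothetical; in Roc the union would be a tag union and the `switch` would be a `when` expression.

```typescript
// An algebraic data type for JSON values, written as a discriminated union.
type Json =
    | { kind: "null" }
    | { kind: "bool"; value: boolean }
    | { kind: "number"; value: number }
    | { kind: "string"; value: string }
    | { kind: "array"; items: Json[] }
    | { kind: "object"; fields: [string, Json][] };

// Quote a string, escaping only backslash, double quote and newline
// (a full encoder would handle every control character).
function quote(text: string): string {
    return '"' + text.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n") + '"';
}

// Encoder: pattern match on the tag and recurse into children.
// Assumes numbers are finite; NaN and Infinity have no JSON representation.
function encode(json: Json): string {
    switch (json.kind) {
        case "null":
            return "null";
        case "bool":
            return json.value ? "true" : "false";
        case "number":
            return String(json.value);
        case "string":
            return quote(json.value);
        case "array":
            return "[" + json.items.map(encode).join(",") + "]";
        case "object":
            return "{" + json.fields.map(([key, value]) => quote(key) + ":" + encode(value)).join(",") + "}";
    }
}

// Decoder from the generic representation into a specific domain type.
interface Point { x: number; y: number }

function numberField(json: Json, name: string): number | null {
    if (json.kind !== "object") return null;
    for (const [key, value] of json.fields) {
        if (key === name && value.kind === "number") return value.value;
    }
    return null;
}

function decodePoint(json: Json): Point | null {
    const x = numberField(json, "x");
    const y = numberField(json, "y");
    return x === null || y === null ? null : { x, y };
}

const sample: Json = {
    kind: "object",
    fields: [
        ["x", { kind: "number", value: 1 }],
        ["y", { kind: "number", value: 2.5 }],
        ["tags", { kind: "array", items: [{ kind: "string", value: 'a"b' }, { kind: "null" }, { kind: "bool", value: true }] }],
    ],
};
const encoded = encode(sample);
const point = decodePoint(sample);
```

Encoding `sample` yields `{"x":1,"y":2.5,"tags":["a\"b",null,true]}`, and `decodePoint` extracts `{ x: 1, y: 2.5 }`. Returning `null` on failure is the simplest choice; a chapter could go further by making decoders report which field was missing or had the wrong type.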

I like this topic because I feel like there's lots of different details and levels you can dive into.

topic proposal: SSH

I don't know how doable it would be, but I think a chapter on a small version of SSH would be great.

governance proposal: adopt Martha's Rules for decision making

See this blog post for a description of Martha's Rules (a lightweight mostly-consensus decision-making process for collaborative projects).

infrastructure proposal: create a blog for this book

RSS is an easy way for interested parties to keep up with occasional announcements and progress reports. We can easily set up a GitHub Pages blog (https://roc-lang.github.io/book-of-examples/) using Jekyll for now, and switch to a Roc-based tool if there's interest while recycling the posts. I envision posts consisting of things like "person X has taken on topic Y", "person Q has a first draft of topic P ready for review", and, "we've made the following decisions about managing the glossary".

Please add thoughts below: in particular, is this Yet Another Distraction and everything should stay in Zulip or some other channel?

topic: compression

Data compression is used all the time by software engineers, but I suspect many do not have a clear understanding of how it is implemented. I believe this makes it a good topic for the book. @bhansconnect recommended the LZ77 algorithm. I have not investigated it heavily, but it seems doable in a chapter and it is a component of gzip which is also valuable. However, there could be another more suitable algorithm to use instead. I am interested in working on this if we decide to do it.

topic: pretty-printing library

The idea is to implement a pretty printing library based on Philip Wadler's "A prettier printer" paper.

The paper already contains code for an implementation, though it is a bit rudimentary. The wl-print Haskell package contains an extension of it, with some extra bells and whistles for convenience.

We can stick with the paper's implementation or something in-between that and wl-print's. We would also demonstrate the library usage by pretty printing one of these:

An ADT, such as the JSON value representation
An AST representation of s-exps (simple AST)
An AST representing some simple imperative language (complex AST)

This has been done before in chapter 5 of the Real World Haskell book.

The Roc implementation may not be exactly the same as the Haskell implementations, because Roc is strict while Haskell is lazy.

CC0-1.0 license for code?

I really like the CC0-1.0 license we currently use for the Roc examples repo. It's a license with minimal restrictions. It allows users to just copy-paste an example into their codebase (even commercial) without needing to learn about attributions.

The current license for this project (CC-BY-NC-ND-4.0) requires users to include an attribution, prohibits commercial use, and does not allow modified code to be included in, for example, a Roc package.

The CC-BY-NC-ND-4.0 license feels like a net negative to me. I think the CC0-1.0 license would make the book more beneficial to society and lead to a better user experience without making any real sacrifices.

It seems so strange to show high quality examples but not allow people to just copy-paste them into their own projects.

Opinions welcome :)

discuss: how to manage inter-chapter dependencies?

Following on from #46: how should dependencies between chapters be managed? I.e., if Chapter B uses code developed in Chapter A, how should Chapter A be structured and how should Chapter B get what it needs? Please add comments to this issue with thoughts and proposals.

topic proposal: discrete event simulator

Simulation was one of the reasons OOP was invented, and interacting objects with their own state and control flow is a natural way to represent things like the flow of messages through a complex tangle of microservices. Showing how to do such simulations in a pure functional language would put a spotlight on the differences between the two paradigms.