matz / streem
prototype of stream based programming language
License: MIT License
http://doc.cat-v.org/plan_9/4th_edition/papers/rc
Having variables be lists would be great, as well as ^ for concat. Other nice things would be & for async procs, and cat <{progread} <{progread2} | tee >{progwrite} (where <{} or >{} evaluates to a pipe file connected to the subproc for read or write).
Any thoughts on incorporating these ideas?
I do a lot of flow-oriented programming in shell, so I'm interested in streem as a replacement for the traditional unix shell. (I use zsh, but the differences are minimal enough in this sense that we can talk about shell generally.)
Unix shell scripting was designed from the very beginning with flow-oriented programming in mind. However, while it is trivial to make pipelines in shell, it is not easy to construct logically-driven stream processing functions, where data flows are managed directly using code written in shell. There are a number of workarounds. Often it's easiest to write such functions in other languages and glue them together in shell. Take, for instance, the universe of perl, awk, and sed one-liners that litter an old web of ~user pages and persist in the various stackexchanges and github. I think this is very much a path-dependent situation that relates to the difficulty of writing an efficient, interactive, and interpreted language.
Despite its age and cruft, shell has a huge following in data-oriented science, where I imagine streem would have the greatest impact. These are exactly the users we'd like to attract, and the most complete form of capture would be if they switched from using bash, zsh, and tcsh to using streem itself. I don't think this would be too difficult to achieve, and would have little bearing on the functionality delivered by the language. It would have an effect on its usability.
I propose that streem adopt a command syntax that would enable a streem REPL to be used in place of a traditional unix shell. In my mind, this would imply a few basic considerations:
- [command] [arg]* syntax. In other words, it shouldn't require the use of parentheses for every function call.
- [command] [arg]* >[file-or-function].
- > rather than | to allow for forks and clarify the type of the sink that is being used (is it conceptually a named pipe, a file, or a function?). For example, >@ could indicate forking/splitting a pipeline into a named streem function or command.
- & for spawning subprocesses and job control.
The idea isn't to remain compliant with *nix standards, but to make an environment which is attractive to the largest group of people who are currently using flow-oriented patterns.
It seems that certain features and changes to Streem are popular or unpopular, but wouldn't it be better to actually manage polls/votes on certain things to decide, instead of doing it by hand?
I am volunteering to write a package manager similar to npm for Streem. I thought it might be named "River." Here are my ideas:
- river install <git-repo>: installs a project from a git repo.
- river install: installs the current directory if it is a river project.
- river remove <name>: removes the project with this name.
- river setup: an interactive prompt wizard for setting up your project.
- river version.
- river run: runs main.strm in the src/ directory.
- river bin: creates a shebang file in the bin/ directory with your program.
This comment #18 (comment) made me think that it would be awesome to have some way of pulling in other streem files inline.
Maybe something like method_missing would look on the file system. Or an explicit syntax for pulling streems in so they could be preloaded.
#main.stm
STDIN | %x{split_into_lines} | map %x{match_today} | STDOUT
#split_into_lines.stm
STDIN.split("\n") | STDOUT
#match_today.stm
STDIN.match Date.today | STDOUT
or with method missing style
STDIN | split_into_lines | map match_today | STDOUT
As far as I can see, one of the key points of Streem is streaming data flows in concurrent situations. The FizzBuzz example shows one simple stream ([1..100] -> FizzBuzz -> STDOUT). But if we write complicated concurrent programs, we have to tame complicated relations between processes. For example:
Streem has a fascinating syntax like the UNIX pipe. But it already has functions and if statements, and we can implement process control structures using these. Besides this, we can also implement them by extending the pipe syntax. So it will be important to decide which control structures to assign to the pipe. It will affect usability and expressiveness. What do you think?
puts(0)
if 1 > 0 {
  puts(1)
  return
  puts(2)
}
this prints
0
1
2
false
When return is not given arguments, ctx->exc may be set?
@matz I watched the Full Stack Fest 2015 talk on YouTube just minutes ago and hurried to see what exciting things you are up to :)
Perusing the examples got me thinking if the *nix stream descriptors could come in handy (and be (mis)used) ?
Like
[ 1,2,3 ] | > ( { |x| [ x, x*x, x*x*x ] } |1> { |x| puts x } |2> { |x| puts x } |3> { |x| puts x } ) | stdout
#
# output would be
1
1
1
2
4
8
3
9
27
In that sense you would build your own concurrency and leave it to the interpreter to hand each stream to whatever core resources are available. In the above example 1, 2 or 3 cores might be used depending upon availability, and the parentheses would indicate scopes of concurrency. Heck, you could even (given some pre-knowledge of the computational challenges) throw a seeding value into the mix, like
| stream_description, resource_allocation_weight >
|1,1> { |x| puts x } |2,2> { |x| puts x } |3,3> { |x| puts x }
I cannot wait for you to 'build' this streem (and apply it to Ruby) in effect providing us with some relief when we do:
- resources.each do | resource |
= render partial 'item', locals: { item: resource }
(and resources is a 10,000 row SQL select ActiveRecord::Relation object)
The stream descriptors would obviously have to work the other way around too. Think about a ticket system in the subway
ticket_verifier 1< reader1 2< reader2 3< reader3 .... N< readerN |1> big_screen 2> logfile
with no one using the reader24 - reader32 they are (today) just sitting there eating up cycles - with your streem, they stop becoming an embarrassment :)
Full 'steem' ahead Matz :D
oh - and thank you so much for Ruby!!!!
cheers
Walther
ps: if this entire post is just about the dumbest you have ever read - please excuse me for wasting your bandwidth!
Which do you think is the best? .st .str .stm .strm or .streem?
I was thinking it would be useful to either have the ability to create operators or have macros. I know macros are controversial and hard to understand for some people, but I think at least being able to create operators as functions (a poor man's macro) would be useful. I don't really have any ideas about syntax though...
The similarity between the flow operator (|) and the block parameter syntax (|x|) will probably be a source of some confusion since |x| can be misinterpreted as "a pipe through x".
The problem is not the flow operator (|). On the contrary, using the pipe as the flow operator conveys the meaning very intuitively for everyone who has ever used a shell, so let's keep that very good choice.
Instead I suggest switching from |x| to, say, (x) ->, (x) or x -> to denote block parameters.
The (…) -> alternative is implemented here: https://github.com/practicalswift/streem/commit/193fa8413d14dce03d609ec597006a40ca678ec1
Reasoning:
- (parameter1, parameter2, parameter3, …) should be familiar to most programmers.
- -> is used for assignment in the Streem syntax (see op_rasgn in lex.l).
Summary of discussed alternatives:
Alternative 1 (current syntax).
seq(100) | { |x|
  if x % 15 == 0 {
    "FizzBuzz"
  }
  ...
} | STDOUT
Alternative 2 (implemented here: https://github.com/practicalswift/streem/commit/193fa8413d14dce03d609ec597006a40ca678ec1).
seq(100) | { (x) ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
  ...
} | STDOUT
Alternative 3.
seq(100) | { (x)
  if x % 15 == 0 {
    "FizzBuzz"
  }
  ...
} | STDOUT
Alternative 4.
seq(100) | { x ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
  ...
} | STDOUT
The task_tid function in core.c chooses a strm_task's tid based on the max queue size when s->tid < 0 and the tid parameter < 0, but I feel it should use the min queue size instead.
I want to write a test case to cover this function's execution path, but either I don't know how to write a parallel streem program yet, or streem does not support parallel programs yet.
The code snippet is lines 30-48 of core.c:
else {
  int n = 0;
  int max = 0;
  for (i=0; i<thread_max; i++) {
    int size = strm_queue_size(threads[i].queue);
    if (size == 0) break;
    if (size > max) {
      max = size;
      n = i;
    }
  }
  if (i == thread_max) {
    s->tid = n;
  }
  else {
    s->tid = i;
  }
}
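For comparison, here is a minimal sketch of what a "min queue size" selection could look like. It is not the code from core.c: the helper name pick_least_loaded is hypothetical, it works on a plain array of queue lengths instead of threads[i].queue, and it assumes n is at least 1.

```c
#include <stddef.h>

/* Hypothetical helper, not part of core.c.
 * sizes[i] is the current queue length of worker i; n must be >= 1.
 * Returns the index of an idle worker if there is one, otherwise the
 * index of the worker with the shortest queue. */
int
pick_least_loaded(const int *sizes, int n)
{
  int best = 0;
  int i;

  for (i = 0; i < n; i++) {
    if (sizes[i] == 0) return i;           /* an idle worker wins immediately */
    if (sizes[i] < sizes[best]) best = i;  /* otherwise remember the shortest queue */
  }
  return best;
}
```

The early return for an empty queue mirrors the break in the quoted loop; only the comparison direction differs.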
I am writing my cram tests on an up to date version (I pulled this repo onto my fork and replayed my changes back on). But when I am writing a test for the var '=' expr part of the grammar, I get this error:
lexical error ('=').
I am not sure whether I am misunderstanding something.
There is no hurry, but currently mingw32 doesn't have strndup.
parse.o: In function `yylex':
C:\dev\streem\src/lex.l:60: undefined reference to `strndup'
collect2.exe: error: ld returned 1 exit status
mingw32-make: *** [../bin/streem] Error 1
I'll send a PR later, but FYI.
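One possible shape for a fallback, as a sketch only: the name strm_strndup is an assumption (it is not taken from the repository), and how it would be wired into lex.l is left open.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback for platforms without strndup().
 * Copies at most n bytes of s into freshly allocated memory and
 * NUL-terminates the copy. Returns NULL if allocation fails. */
char *
strm_strndup(const char *s, size_t n)
{
  size_t len = 0;
  char *p;

  /* find the copy length by hand so strnlen() is not required either */
  while (len < n && s[len] != '\0') len++;
  p = malloc(len + 1);
  if (p == NULL) return NULL;
  memcpy(p, s, len);
  p[len] = '\0';
  return p;
}
```

The caller owns the returned buffer and releases it with free(), the same as with strndup.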
I think the idea of a well written, open source, stream-based/flow-based/data-flow language is a fantastic one. My only concern about this idea is the initial example
seq(100) | {|x|
  if x % 15 == 0 {
    "FizzBuzz"
  }
  else if x % 3 == 0 {
    "Fizz"
  }
  else if x % 5 == 0 {
    "Buzz"
  }
  else {
    x
  }
} | STDOUT
Do we need a procedural language? Please ignore the syntax, but what about something like this:
from fizzbuzz import fizzbuzz
seq(100) | fizzbuzz | STDOUT
where fizzbuzz is a module written in a conventional imperative language, such as Python, Ruby or C++?
While Streem is young enough that we probably won't need a full unit test suite any time soon, in my opinion it still would be helpful to have some basic smoke tests (or something like that) running in CI (especially since there are so many pull requests lately, and CI can be pretty helpful in finding bugs that people may not notice otherwise). Travis is especially useful for testing open source projects without being too complex to set up and I particularly like it, though any CI system would work.
I think it would be nice if Streem used prototypal inheritance and object literals, (as in JavaScript). This makes for a very light-weight, expressive, powerful, and most of all, quick OOP-type programming experience.
There is already some functional syntax in the design of streem.
Will "class", "method", "inheritance" or any other OOP elements be in this language?
I think "everything is an object" is good in Ruby. Can I use the "Stream" or "Pipeline" as an object at runtime?
So far, it seems that the only thing anybody knows how to write with Streem is the FizzBuzz and cat programs. Obviously, a language can't be that useful if it can only do FizzBuzz and cat.
Please comment on this issue with your examples of more useful things you can do with Streem. I still need to be convinced that there are any.
Flow-based programming resources:
http://www.jpaulmorrison.com/fbp/
Contact me [email protected] for copies of 2 papers about my form of reactive (embedded) FBP.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.93.9511
Also, contact me for techniques on how to compile diagrams to code. FBP (and probably Streem) lends itself to a concrete syntax that uses diagrams. And contact me for techniques for doing this without processes/threads.
My github might be of interest (esp. vsh)
https://github.com/guitarvydas
[I'd be glad to discuss here, on the fbp group or privately]
Ravi Sethi (dragon book) published a paper for a compiler using denotational semantics which employed pipe syntax. You might find inspiration there:
Thank you @matz
Usually several instances of the same process are running to take advantage of whatever parallelism is available.
Note that alternatives 2, 3, 5, 6, 8 will cause a shift-reduce conflict.
No. | Votes | Notation | comment |
---|---|---|---|
1 | 2 | `{ \|x\| … }` | |
2 | 3 | { (x) -> … } | |
3 | 4 | { (x) … } | |
4 | 2 | { x -> … } | |
5 | 5 | (x) -> { … } | |
6 | 1 | (x) => { … } | |
7 | 2 | { } | parameters are $0, $1, $2, etc. |
8 | 1 | (x) { … } | |
9 | 1 | \x -> { … } | |
10 | 1 | -> x { ... } | my vote |
... in Streem foo(a) {|x| ...} is a syntactic sugar to foo(a,{|x|...}).
In my opinion, this syntax sugar is too confusing. The only syntactic difference between defining a function foo and calling foo while passing it another function is the def keyword. I think this will lead to people reading this syntax and assuming that foo is being defined, or to people creating functions and forgetting the def keyword. Also, bash and C have very similar syntax for defining functions (without def).
Meanwhile, I think it's pretty obvious that foo(a,{|x|...}) calls foo, and isn't defining it. JavaScript callbacks work like this, and I prefer it because it's clear that the function is just a normal argument. While I like Ruby's style of passing blocks, I think it would be better for Streem to just have a simple anonymous function syntax (which it already has) and stick to that explicitly.
I'm interested in the streem language development!
However, where should we discuss it? Are there already any communication platforms? Mailing list, IRC, or..?
This might be completely strange, but I feel like the fizzbuzz example could benefit from some kind of conditional write operator for function operators. The syntax would be something like:
cond_expr => expr
When cond_expr is true, the function object will stream the value of expr to the next receiver.
For example:
seq(100) | {|x|
  x % 15 == 0 => "FizzBuzz"
  x % 3 == 0 => "Fizz"
  x % 5 == 0 => "Buzz"
  true => x
} | STDOUT
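The proposal does not say whether several guards may fire for the same input. Assuming the first matching guard wins (which the trailing true => x suggests), the semantics can be sketched in C as an ordered rule table. Everything here (struct guard, rules, emit_first_match) is a hypothetical illustration, not Streem code.

```c
#include <stdio.h>
#include <stddef.h>

/* One "cond_expr => expr" rule. The guard is "x % divisor == 0",
 * so divisor 1 plays the role of the always-true guard. */
struct guard {
  int divisor;
  const char *value;  /* text to emit, or NULL to emit x itself */
};

static const struct guard rules[] = {
  {15, "FizzBuzz"},
  {3,  "Fizz"},
  {5,  "Buzz"},
  {1,  NULL},
};

/* Writes into buf what the first matching rule would emit for x. */
void
emit_first_match(int x, char *buf, size_t buflen)
{
  size_t i;

  for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
    if (x % rules[i].divisor == 0) {
      if (rules[i].value != NULL)
        snprintf(buf, buflen, "%s", rules[i].value);
      else
        snprintf(buf, buflen, "%d", x);
      return;  /* first match wins; later rules are skipped */
    }
  }
  if (buflen > 0) buf[0] = '\0';  /* no rule matched: emit nothing */
}
```

If instead every true guard should emit, 15 would produce three values, which is why the order and the stop-at-first-match rule would need to be part of the operator's definition.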
I am currently working on a package manager. The next part is making it so that when you install a package, it goes into the include/import/require scope for streem. But we haven't even specified how that should work yet. I was wondering if anyone has any ideas?
My idea looks something like this:
import("package")
# and then, package.blah
It wouldn't be that hard to implement that once we have objects of some kind. Just use the interpreter to eval that file before the rest of the code is executed, and then assign the anonymous class that executing the file returns to the name of the file. I implemented a toy language called Bike in Ruby that did it that way.
I find that using expr -> var is more idiomatic for a stream-oriented language, because it conveys a flow of the value into the variable. This seems more stream-like to me. Also, since we might be using the thin stabby for lambdas, we might use var <- expr instead. Although this alternative isn't as nice, it still conveys the streaminess of variables. But I do think that it might be a good idea to only support one type of assignment.
plz
I think the test file should cover all keywords just like emit.
The current implementation of I/O uses the epoll system call, which only exists on Linux. I think it should use libevent, which is a library for I/O polling that can use epoll, poll, select, etc. depending on the operating system. It seems pretty complicated, but I think it would be worth it. What do you think?
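To show what a more portable readiness check looks like, here is a sketch using POSIX poll(), one of the backends mentioned above. It is not libevent and not the current Streem I/O code; wait_readable is a hypothetical name, and plain poll() is not available on Windows.

```c
#include <poll.h>

/* Hypothetical helper: waits until fd is readable.
 * Returns 1 if data (or EOF) can be read, 0 on timeout, -1 on error.
 * timeout_ms follows poll(): a negative value means wait forever. */
int
wait_readable(int fd, int timeout_ms)
{
  struct pollfd pfd;
  int rc;

  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  rc = poll(&pfd, 1, timeout_ms);
  if (rc < 0) return -1;   /* poll() itself failed */
  if (rc == 0) return 0;   /* nothing happened before the timeout */
  /* POLLHUP means the peer closed; a read will then return EOF */
  return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

A library such as libevent hides this kind of call behind one interface and picks the backend per platform, which is the portability gain the suggestion is after.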
$ perl -e 'print "foo\n" while 1' | ./a.out
Memory usage keeps growing.
I found a really great C library for higher level programming. It makes C much more modern and nice: http://libcello.org. I was thinking that since we are going to do this in C, we could at least use Cello to make C development nicer.
For use as a shell script language, is there already a function to execute a shell command (e.g. "ls", "pwd", etc.)?
#ifndef _WIN32
# include <sys/fcntl.h>
# include <sys/types.h>
# include <sys/socket.h>
# include <netinet/in.h>
# include <netdb.h>
# include <unistd.h>
/* POSIX has no closesocket(); map it to close() */
# define closesocket(fd) close(fd)
#else
# include <ws2tcpip.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "strm.h"
#include "node.h"

struct socket_data {
  int fd;
  node_ctx *ctx;
};

/* called when the listening socket is readable: accept one client */
static void
accept_cb(strm_task *strm, strm_value data)
{
  struct socket_data *sd = strm->data;
  struct sockaddr_in writer_addr;
  socklen_t writer_len;
  int sock;

  writer_len = sizeof(writer_addr);
  sock = accept(sd->fd, (struct sockaddr *)&writer_addr, &writer_len);
  if (sock < 0) {
    /* accept() failed, so there is no new socket to close */
    if (sd->ctx->strm)
      strm_close(sd->ctx->strm);
    node_raise(sd->ctx, "socket error: accept");
    return;
  }
  strm_emit(strm, strm_task_value(strm_readio(sock)), NULL);
}

static void
server_accept(strm_task *strm, strm_value data)
{
  struct socket_data *sd = strm->data;

  strm_io_start_read(strm, sd->fd, accept_cb);
}

static void
server_close(strm_task *strm, strm_value d)
{
  struct socket_data *sd = strm->data;

  closesocket(sd->fd);
}

static int
exec_tcp_server(node_ctx* ctx, int argc, strm_value* args, strm_value *ret)
{
  struct sockaddr_in reader_addr;
  int sock;
  int port;
  struct socket_data *sd;
  strm_task *task;

#ifdef _WIN32
  WSADATA wsa;
  WSAStartup(MAKEWORD(2, 0), &wsa);
#endif

  if (argc != 1) {
    return 1;
  }
  if ((sock = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
    node_raise(ctx, "socket error: socket");
    return 1;
  }
  port = strm_value_int(args[0]);
  memset(&reader_addr, 0, sizeof(reader_addr));
  reader_addr.sin_family = AF_INET;
  reader_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  reader_addr.sin_port = htons(port);
  if (bind(sock, (struct sockaddr *)&reader_addr, sizeof(reader_addr)) < 0) {
    closesocket(sock);  /* do not leak the socket on failure */
    node_raise(ctx, "socket error: bind");
    return 1;
  }
  if (listen(sock, 5) < 0) {
    closesocket(sock);
    node_raise(ctx, "socket error: listen");
    return 1;
  }
  sd = malloc(sizeof(struct socket_data));
  if (sd == NULL) {
    closesocket(sock);
    node_raise(ctx, "socket error: malloc");
    return 1;
  }
  sd->fd = sock;
  sd->ctx = ctx;
  task = strm_alloc_stream(strm_task_prod, server_accept, server_close, (void*)sd);
  *ret = strm_task_value(task);
  return 0;
}

void
strm_socket_init(node_ctx* ctx)
{
  strm_var_def("tcp_server", strm_cfunc_value(exec_tcp_server));
}
Currently, I'm writing tcp_server for examples/06eho.strm.
# simple echo server on port 8007
tcp_server(8007) | {s ->
s | s
}
but I'm confused about how to make the stream for s. Is s prod or cond? or filt?
seq(100) | |x|
  if x % 15 == 0
    "FizzBuzz"
  else if x % 3 == 0
    "Fizz"
  else if x % 5 == 0
    "Buzz"
  else
    x
| STDOUT
Indentation could be used instead.
Would you like to add more error handling for return values from functions like the following?
I think that currying would be highly convenient in a stream-based programming language! Check this out:
seq(100) | (% 5) | STDOUT
or
seq(100) | (_ % 5) | STDOUT
Does streem have a schedule?
docker pull debian
Pulling repository debian
2014/12/16 22:19:08 Get https://index.docker.io/v1/repositories/debian/images: x509: certificate has expired or is not yet valid
I am writing some tests for the parser using the cram testing library
I want tee. (not tea)
stdin | tee | sock
or
stdin | tee stdout | sock
In Ruby, there is a strange mixture of blocks, which cannot be stored, procs, which can be stored but are not closures, and lambdas, which are the real thing but aren't really well supported by Ruby. Is this going to happen again??
Not that I don't like Ruby or anything, but I am a functional programmer.
Gitter has a useful activity sidebar that you can add things like GitHub issues, GitHub commits, CI status, etc. to. While Streem is in the fairly early stages of development and probably won't need full CI any time soon, I generally find the GitHub integration to be pretty useful when people want to discuss the issues further in a Gitter chatroom. However, these integrations can only be set up by a collaborator on GitHub.
By the way, sorry if this is a bit off topic (as it's not directly related to Streem).
I was thinking that someone (me) should have a fork that is an actual working toy implementation of Streem, so that new features and examples could actually be run and tested, and so that we could actually get an idea of what it would be like to work with Streem. Then, every time a new feature or syntax is added/changed, it could be implemented in the fork for testing and experimentation.
I know, I know, but in my opinion, to achieve very fast performance I would opt for a typed and compiled language.
Something in the Swift vein can be used for scripting as well as for full-featured programs.
What do you think?
DD
I would like to point out that identifiers like "_NODE_H_" and "_STRM_H_" do not fit the expected naming convention of the C language standard.
Would you like to adjust your selection for unique names?
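For context: the C standard reserves identifiers that begin with an underscore followed by an uppercase letter (or a second underscore) for the implementation. A guard without the leading underscore avoids that. The sketch below uses STRM_NODE_H as an assumed name, and the struct inside is only a placeholder so the fragment compiles on its own.

```c
/* node.h, sketched with a guard name that is not reserved.
 * STRM_NODE_H is an assumed name, not one taken from the repository. */
#ifndef STRM_NODE_H
#define STRM_NODE_H

/* placeholder declaration (hypothetical), standing in for the real contents */
struct node_example {
  int type;
};

#endif /* STRM_NODE_H */
```

Prefixing the guard with the project name also makes a clash with another library's node.h less likely.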
$ cat | bin/streem
seq(100) | {x ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
  else if x % 3 == 0 {
    "Fizz"
  }
  else if x % 5 == 0 {
    "Buzz"
  }
  else {
    x
  }
} | STDOUT
^D
Syntax OK
---------
VALUE(ARRAY):
OP: (NOTE: op1)
OP: (NOTE: op2)
CALL:
NIL
IDENT: 6305192
VALUE(ARRAY):
VALUE(NUMBER): 100.000000
NIL
|
CALL:
NIL
NIL
NIL
BLOCK:
VALUE(ARRAY):
IDENT: 6305204
VALUE(ARRAY):
|
IDENT: 6305358
op_bar is parsed as several separate operators. In the future, to handle op_bar as a pipe, how about parsing op_bar into an optional array so that the nodes are put on the same layer, like below?
---------
VALUE(ARRAY):
PIPE:
CALL: (NOTE: seq(100))
NIL
IDENT: 6305192
VALUE(ARRAY):
VALUE(NUMBER): 100.000000
NIL
CALL: (NOTE: {...})
NIL
NIL
NIL
BLOCK:
VALUE(ARRAY):
IDENT: 6305204
VALUE(ARRAY):
IDENT: 6305358 (NOTE: STDOUT)