matz / streem Goto Github PK


4.6K 4.6K 237.0 841 KB

prototype of stream based programming language

License: MIT License

Makefile 0.40% C 91.08% Shell 0.11% Lex 1.32% Yacc 6.91% Dockerfile 0.18%


streem's Issues

thoughts on RC-like syntax?

http://doc.cat-v.org/plan_9/4th_edition/papers/rc

having variables be lists would be great, as well as ^ for concat. other nice things would be & for async proc, '''cat <{progread} <{progread2} | tee >{progwrite}''' (where <{} or >{} evaluates to a pipe file connected to the subproc for read or write).

any thoughts on incorporating these ideas?

encourage dataflow programmers to switch to streem from shell

I do a lot of flow-oriented programming in shell, so I'm interested in streem as a replacement for the traditional unix shell. (I use zsh, but the differences are minimal enough in this sense that we can talk about shell generally.)

Unix shell scripting was designed from the very beginning with flow-oriented programming in mind. However, while it is trivial to make pipelines in shell, it is not easy to construct logically-driven stream processing functions, where data flows are managed directly using code written in shell. There are a number of workarounds. Often it's easiest to write such functions in other languages and glue them together in shell. Take, for instance, the universe of perl, awk, and sed one-liners that litter an old web of ~user pages and persist in the various stackexchanges and github. I think this is very much a path-dependent situation that relates to the difficulty of writing an efficient, interactive, and interpreted language.

Despite its age and cruft, shell has a huge following in data-oriented science, where I imagine streem would have the greatest impact. These are exactly the users we'd like to attract, and the most complete form of capture would be if they switched from using bash, zsh, and tcsh to using streem itself. I don't think this would be too difficult to achieve, and would have little bearing on the functionality delivered by the language. It would have an effect on its usability.

I propose that streem adopts command syntax that would enable a streem REPL to be used in place of a traditional unix shell. In my mind, this would imply a few basic considerations:

It should use, or at least allow for [command] [arg]* syntax. In other words, it shouldn't require the use of parentheses for every function call.
It should employ shell-like semantics for stream redirection. This already seems to be the case for pipes. Writing to a file, or into a named streem function, should be as easy as [command] [arg]* >[file-or-function].
STDIN and STDOUT should be synonymous with /dev/stdin and /dev/stdout.
streem should have a PATH that describes where non-streem system commands can be found. Some efforts should be made to not clobber functions that are considered standard (cd, ls, pwd, ...), but it should also be OK for streem functions to redefine these commands (seq being an example that has already been shown in the examples).
streem should adopt syntax that eases the kinds of patterns that are met by functions in moreutils, such as tee (write a stream to multiple sinks/files), and pee (push a data stream into multiple other pipelines or functions). One approach would be to use > rather than | to allow for forks and clarify the type of the sink that is being used (is it conceptually a named pipe, a file, or a function). For example, >@ could indicate forking/splitting a pipeline into a named streem function or command.
Perhaps streem could employ the shell syntax & for spawning subprocesses and job control.

The idea isn't to remain compliant with *nix standards, but to make an environment which is attractive to the largest group of people who are currently using flow-oriented patterns.

Should we use GitPoll to vote on syntax and features for Streem?

It seems that certain features and changes to Streem are popular or unpopular, but wouldn't it be better to actually manage polls/votes on certain things to decide, instead of doing it by hand?

http://poll.gitrun.com/

Package manager?

I am volunteering to write a package manager similar to npm for Streem. I thought it might be named "River." Here are my ideas:

river install <git-repo>: installs a project from a git repo.
river install: installs the current directory if it is a river project.
river remove <name>: removes the project with this name.
river setup: an interactive prompt wizard for setting up your project.
river version.
river run: runs main.strm in the src/ directory.
river bin: creates a shebang file in the bin/ directory with your program.

I plan to write it in either C or Rust. Let me know of any suggestions.

require/load as part of the syntax

This comment #18 (comment) made me think that it would be awesome to have some way of pulling in other streem files inline.
Maybe something like method_missing would look on the file system. Or an explicit syntax for pulling streems in so they could be preloaded.

#main.stm
STDIN | %x{split_into_lines} | map %x{match_today} | STDOUT

#split_into_lines.stm
STDIN.split("\n") | STDOUT

#match_today.stm
STDIN.match Date.today | STDOUT

or with method missing style

STDIN | split_into_lines | map match_today | STDOUT

Stream control structure

As far as I can see, one of the key points of Streem is streaming data flows in concurrent situations. The FizzBuzz example shows one simple stream ([1..100]->FizzBuzz->STDOUT). But if we write complicated concurrent programs, we have to tame complicated relations between processes. For example:

switching the next process
generating multiple processes
receiving from multiple processes and merge to one data flow

Streem has a fascinating syntax, like the UNIX pipe. But it already has functions and if statements, and we can implement process control structures using these. Besides this, we can also implement them by extending the pipe syntax. So it will be important to decide which control structures to assign to the pipe. It will affect usability and expressiveness. What do you think?

BUG: return breakthrough statements

puts(0)
if 1 > 0 {
  puts(1)
  return
  puts(2)
}

this prints

0
1
2
false

When return is not given arguments, might ctx->exc be set?

could there be more streams?

@matz I watched the Full Stack Fest 2015 talk on YouTube just minutes ago and hurried to see what exciting things you are up to :)

Perusing the examples got me thinking if the *nix stream descriptors could come in handy (and be (mis)used) ?

[ 1,2,3 ] | > ( { |x| [ x, x*x, x*x*x ] } |1> { |x| puts x } |2> { |x| puts x } |3> { |x| puts x } ) | stdout
# 
# output would be
1 
1 
1
2
4
8
3
9
27

In that sense you would build your own concurrency and leave it to the interpreter to hand each stream to whatever core resources are available. In the above example 1, 2 or 3 cores might be used depending upon availability, and the parentheses would indicate scopes of concurrency. Heck, you could even (given some pre-knowledge of the computational challenges) throw a seeding value into the mix like

| stream_description, resource_allocation_weight >

 |1,1> { |x| puts x } |2,2> { |x| puts x } |3,3> { |x| puts x }

I cannot wait for you to 'build' this streem (and apply it to Ruby) in effect providing us with some relief when we do:

- resources.each do | resource |
  = render partial 'item', locals: { item: resource }

(and resources is a 10,000 row SQL select ActiveRecord::Relation object)

The stream descriptors would obviously have to work the other way around too. Think about a ticket system in the subway

ticket_verifier 1< reader1 2< reader2 3< reader3 .... N< readerN |1> big_screen 2> logfile

with no one using the reader24 - reader32 they are (today) just sitting there eating up cycles - with your streem, they stop becoming an embarrassment :)

Full 'steem' ahead Matz :D

oh - and thank you so much for Ruby!!!!

cheers
Walther

ps: if this entire post is just about the dumbest you have ever read - please excuse me for wasting your bandwidth!

Filename extension for Streem programs

Which do you think is the best? .st .str .stm .strm or .streem?

Operator Creation or Macros

I was thinking it would be useful to either have the ability to create operators or have macros. I know macros are controversial and hard to understand for some people, but I think at least being able to create operators as functions (a poor man's macro) would be useful. I don't really have any ideas about syntax though...

The block parameter syntax ("|x|") might be misinterpreted as "a pipe through x"?

The problem is not the flow operator (|). On the contrary – using the pipe as the flow operator conveys the meaning very intuitively for everyone who has ever used a shell, so let's keep that very good choice.

Instead I suggest switching from |x| to say (x) ->, (x) or x -> to denote block parameters.

The (…) -> alternative is implemented here: https://github.com/practicalswift/streem/commit/193fa8413d14dce03d609ec597006a40ca678ec1

Reasoning:

(parameter1, parameter2, parameter3, …) should be familiar to most programmers.
-> is used for assignment in the Streem syntax (see op_rasgn in lex.l).

Summary of discussed alternatives:

Alternative 1 (current syntax).

seq(100) | { |x|
  if x % 15 == 0 {
    "FizzBuzz"
  }
...
} | STDOUT

Alternative 2 (implemented here: https://github.com/practicalswift/streem/commit/193fa8413d14dce03d609ec597006a40ca678ec1).

seq(100) | { (x) ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
...
} | STDOUT

Alternative 3.

seq(100) | { (x)
  if x % 15 == 0 {
    "FizzBuzz"
  }
...
} | STDOUT

Alternative 4.

seq(100) | { x ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
...
} | STDOUT

task_tid function

The task_tid function in core.c chooses a strm_task's tid based on the maximum queue size when s->tid < 0 and the tid parameter is < 0, but I feel it should use the minimum queue size instead.
I want to write a test case to cover this function's execution path, but I don't know how to write a parallel streem program yet, or whether streem supports parallel programs at all right now.

The code snippet is at lines 30-48 of core.c:

else {
  int n = 0;
  int max = 0;
  for (i=0; i<thread_max; i++) {
    int size = strm_queue_size(threads[i].queue);
    if (size == 0) break;
    if (size > max) {
      max = size;
      n = i;
    }
  }
  if (i == thread_max) {
    s->tid = n;
  }
  else {
    s->tid = i;
  }
}
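
To make the suggestion concrete, here is a minimal sketch of what picking the least-loaded thread could look like. It assumes the queue sizes have already been collected into a plain array; pick_least_loaded is a hypothetical helper written for this illustration, not a function from core.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the shortest queue.
 * An empty queue wins immediately, mirroring the early break in the
 * loop quoted above. Ties go to the lowest index. */
int
pick_least_loaded(const int *sizes, int count)
{
  int i;
  int best = 0;

  assert(sizes != NULL && count > 0);
  for (i = 0; i < count; i++) {
    if (sizes[i] == 0) return i;            /* idle thread: use it */
    if (sizes[i] < sizes[best]) best = i;   /* remember the shortest queue */
  }
  return best;
}
```

With queue sizes {3, 1, 2} this picks index 1, whereas the max-based loop quoted above would pick index 0, the busiest thread.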

foo = "bar" lexical error

I am writing my cram tests on an up to date version (I pulled this repo onto my fork and replayed my changes back on). But when I am writing a test for the var '=' expr part of the grammar, I get this error:

lexical error ('=').

I am not sure whether I am misunderstanding something.

For your memo, windows doesn't have strndup.

I'm not in a hurry, but currently mingw32 doesn't have strndup.

parse.o: In function `yylex':
C:\dev\streem\src/lex.l:60: undefined reference to `strndup'
collect2.exe: error: ld returned 1 exit status
mingw32-make: *** [../bin/streem] Error 1

mattn@67eb127

I'll send a PR later, but FYI.
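
For reference, a portable fallback is short. The sketch below is only an illustration and is not taken from the commit linked above; the name strm_strndup is hypothetical, chosen so it cannot clash with a libc that does provide strndup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback for platforms without strndup.
 * Copies at most n bytes of s and always NUL-terminates the copy.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *
strm_strndup(const char *s, size_t n)
{
  size_t len = 0;
  char *copy;

  assert(s != NULL);
  while (len < n && s[len] != '\0') len++;   /* portable strnlen */
  copy = malloc(len + 1);
  if (copy == NULL) return NULL;
  memcpy(copy, s, len);
  copy[len] = '\0';
  return copy;
}
```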

Logo?

I was thinking a png icon along the lines of:

BTW, I'm not really that great at drawing, so any other icons would be great!

Another procedural language

I think the idea of a well written, open source, stream-based/flow-based/data-flow language is a fantastic one. My only concern about this idea is the initial example

seq(100) | {|x|
  if x % 15 == 0 {
    "FizzBuzz"
  }
  else if x % 3 == 0 {
    "Fizz"
  }
  else if x % 5 == 0 {
    "Buzz"
  }
  else {
    x
  }
} | STDOUT

Do we need a procedural language? Please ignore the syntax, but what about something like this:

from fizzbuzz import fizzbuzz
seq(100) | fizzbuzz | STDOUT

where fizzbuzz is a module written in a conventional imperative language, such as Python, Ruby or C++?

Consider using CI (continuous integration)

While Streem is young enough that we probably won't need a full unit test suite any time soon, in my opinion it still would be helpful to have some basic smoke tests (or something like that) running in CI (especially since there are so many pull requests lately, and CI can be pretty helpful in finding bugs that people may not notice otherwise). Travis is especially useful for testing open source projects without being too complex to set up and I particularly like it, though any CI system would work.

Prototypal

I think it would be nice if Streem used prototypal inheritance and object literals, (as in JavaScript). This makes for a very light-weight, expressive, powerful, and most of all, quick OOP-type programming experience.

Will any OOP grammar be supported?

There is already some functional syntax in the design of streem.
Will "class", "method", "inheritance" or any other OOP elements be in this language?
I think "everything is an object" is good in Ruby. Can I use a "Stream" or "Pipeline" as an object at runtime?

Streem examples

So far, it seems that the only thing anybody knows how to write with Streem is the FizzBuzz and cat programs. Obviously, a language can't be that useful if it can only do FizzBuzz and cat.

Please comment on this issue with your examples of more useful things you can do with Streem. I still need to be convinced that there are any.

Flow-based programming

Flow-based programming resources:

http://www.jpaulmorrison.com/fbp/

[email protected]

Contact me [email protected] for copies of 2 papers about my form of reactive (embedded) FBP.

http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4054691&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F4054516%2F4054517%2F04054691.pdf%3Farnumber%3D4054691

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.93.9511

Also, contact me for techniques on how to compile diagrams to code. FBP (and probably Streem) lends itself to a concrete syntax that uses diagrams. And contact me for techniques for doing this without processes/threads.

My github might be of interest (esp. vsh)

https://github.com/guitarvydas

[I'd be glad to discuss here, on the fbp group or privately]

Ravi Sethi (dragon book) published a paper for a compiler using denotational semantics which employed pipe syntax. You might find inspiration there:

http://dl.acm.org/citation.cfm?id=357227

This is beautiful

Thank you @matz

Will there be a way to specify more than one consumer of a flow?

Usually several instances of the same processing running to take advantage of whatever parallelism is available.

New function object (and block) notation

note that alternatives 2,3,5,6,8 will cause shift-reduce conflict.

No.	Votes	Notation	comment
1	2	`{ |x| … }`
2	3	`{ (x) -> … }`
3	4	`{ (x) … }`
4	2	`{ x -> … }`
5	5	`(x) -> { … }`
6	1	`(x) => { … }`
7	2	`{ }`	parameters are `$0`, `$1`, `$2`, etc.
8	1	`(x) { … }`
9	1	`\x -> { … }`
10	1	`-> x { ... }`	my vote

Request: Remove syntax sugar for calling a function and passing it another function

As @matz mentioned in #15:

... in Streem foo(a) {|x| ...} is a syntactic sugar to foo(a,{|x|...}).

In my opinion, this syntax sugar is too confusing. The only syntactic difference between defining a function foo and calling foo while passing it another function is the def keyword. I think this will lead to people reading this syntax and assuming that foo is being defined, or to people creating functions and forgetting the def keyword. Also, bash and C have very similar syntax for defining functions (without def).

Meanwhile, I think it's pretty obvious that foo(a,{|x|...}) calls foo, and isn't defining it. JavaScript callbacks work like this, and I prefer it because it's clear that the function is just a normal argument. While I like Ruby's style of passing blocks, I think it would be better for Streem to just have a simple anonymous function syntax (which it already has) and stick to that explicitly.

How to contribute

I'm interested in the streem language development!
However, where should we discuss it? Are there already any communication platforms? Mailing list, IRC, or..?

Conditional write for function objects

This might be completely strange, but I feel like the fizzbuzz example could benefit from some kind of conditional write operator for function operators. The syntax would be something like:

cond_expr => expr

When cond_expr is true the function object will stream that value to the next receiver

For example:

seq(100) | {|x|            
  x % 15 == 0 => "FizzBuzz"
  x % 3 == 0 => "Fizz"
  x % 5 == 0 => "Buzz"
  true => x
} | STDOUT

Imports

I am currently working on a package manager, and the next part is making it so that when you install a package, it goes into the include/import/require scope for streem. But we haven't even specified how that should work yet. I was wondering if anyone has any ideas?
My idea looks something like this:

import("package")
# and then, package.blah

It wouldn't be that hard to implement that once we have objects of some kind. Just use the interpreter to eval that file before the rest of the code is executed, and then assign the anonymous class that executing the file returns to the name of the file. I implemented a toy language called Bike in Ruby that did it that way.

Only -> for assignment

I find that using expr -> var is more stream-oriented and more idiomatic for a stream-oriented language, because it conveys a flow of the value into the variable. Also, since we might be using the thin stabby arrow for lambdas, we might use var <- expr instead. Although this alternative isn't as nice, it still conveys the streaminess of variables. But I do think that it might be a good idea to only support one type of assignment.

hurry up

plz

The test file should cover all keywords

I think the test file should cover all keywords just like emit.

libevent

The current implementation of I/O uses the epoll system call, which only exists on Linux. I think it should use libevent, which is a library for I/O polling that can use epoll, poll, select, etc. depending on the operating system. It seems pretty complicated, but I think it would be worth it. What do you think?
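
To illustrate the portability point without depending on libevent itself, here is a small sketch that uses POSIX poll(2) instead, which is available on Linux, macOS and the BSDs. This is a swapped-in technique, not libevent's API and not how streem's I/O loop is written; read_when_ready is a hypothetical helper.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the number of bytes read, 0 on timeout or end of file,
 * and -1 on error. */
ssize_t
read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
  struct pollfd pfd;
  int ready;

  assert(buf != NULL);
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  ready = poll(&pfd, 1, timeout_ms);
  if (ready < 0) return -1;    /* poll itself failed */
  if (ready == 0) return 0;    /* nothing arrived before the timeout */
  return read(fd, buf, len);   /* readable, at end of file, or in error */
}
```

A library such as libevent hides this kind of call behind one interface and picks epoll, kqueue, poll or select per platform, which is the benefit the issue is asking for.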

Make clean doesn't remove bin/

memory leak on lib/a.out

$ perl -e 'print "foo\n" while 1'  | ./a.out

Memory usage keeps growing.

Cello?

I found a really great C library for higher level programming. It makes C much more modern and nice. http://libcello.org. I was thinking that since we are going to do this in C, we could at least use Cello to make C development nicer.

Is there a function to execute a shell command?

As a shell script, is there a function to execute a shell command (e.g. "ls", "pwd", etc.), already?

reader but writer

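#ifndef _WIN32
# include <sys/fcntl.h>
# include <sys/types.h>
# include <sys/socket.h>
# include <netinet/in.h>
# include <netdb.h>
# include <unistd.h>
/* POSIX has no closesocket(); map it to close() */
# define closesocket(fd) close(fd)
#else
# include <ws2tcpip.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>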

#include "strm.h"
#include "node.h"

struct socket_data {
  int fd;
  node_ctx *ctx;
};

static void
accept_cb(strm_task *strm, strm_value data)
{
  struct socket_data *sd = strm->data;
  struct sockaddr_in writer_addr;
  socklen_t writer_len;
  int sock;

  writer_len = sizeof(writer_addr);
  sock = accept(sd->fd, (struct sockaddr *)&writer_addr, &writer_len);
  if (sock < 0) {
    /* accept failed, so there is no new descriptor to close */
    if (sd->ctx->strm)
      strm_close(sd->ctx->strm);
    node_raise(sd->ctx, "socket error: accept");
    return;
  }

  strm_emit(strm, strm_task_value(strm_readio(sock)), NULL);
}

static void
server_accept(strm_task *strm, strm_value data)
{
  struct socket_data *sd = strm->data;

  strm_io_start_read(strm, sd->fd, accept_cb);
}

static void
server_close(strm_task *strm, strm_value d)
{
  struct socket_data *sd = strm->data;

  closesocket(sd->fd);
}

static int
exec_tcp_server(node_ctx* ctx, int argc, strm_value* args, strm_value *ret)
{
  struct sockaddr_in reader_addr; 
  int sock;
  int port;
  struct socket_data *sd;
  strm_task *task;

#ifdef _WIN32
  WSADATA wsa;
  WSAStartup(MAKEWORD(2, 0), &wsa);
#endif

  if (argc != 1) {
    return 1;
  }

  if ((sock = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
    node_raise(ctx, "socket error: socket");
    return 1;
  }

  port = strm_value_int(args[0]);

  memset((char *) &reader_addr, 0, sizeof(reader_addr));
  reader_addr.sin_family = PF_INET;
  reader_addr.sin_addr.s_addr = htonl(INADDR_ANY);
  reader_addr.sin_port = htons(port);

  if (bind(sock, (struct sockaddr *)&reader_addr, sizeof(reader_addr)) < 0) {
    closesocket(sock);
    node_raise(ctx, "socket error: bind");
    return 1;
  }

  if (listen(sock, 5) < 0) {
    closesocket(sock);
    node_raise(ctx, "socket error: listen");
    return 1;
  }

  sd = malloc(sizeof(struct socket_data));
  sd->fd = sock;
  sd->ctx = ctx;
  task = strm_alloc_stream(strm_task_prod, server_accept, server_close, (void*)sd);
  *ret = strm_task_value(task);
  return 0;
}

void
strm_socket_init(node_ctx* ctx)
{
  strm_var_def("tcp_server", strm_cfunc_value(exec_tcp_server));
}

currently, I'm writing tcp_server for examples/06eho.strm.

# simple echo server on port 8007
tcp_server(8007) | {s ->
  s | s
}

but I'm confused about how to make a stream for s. Is s a prod or a cond? Or a filt?

What is your opinion about removing the curly brackets?

seq(100) | |x|
  if x % 15 == 0
    "FizzBuzz"
  else if x % 3 == 0
    "Fizz"
  else if x % 5 == 0
    "Buzz"
  else
    x
| STDOUT

Indentation could be used instead.

Completion of error handling

Would you like to add more error handling for return values from functions like the following?

malloc ⇒ node_pair_new
fputs ⇒ print_id
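
As a rough sketch of the kind of checking being asked for: the struct and both functions below are hypothetical stand-ins written for this illustration, not the actual node_pair_new or print_id from the streem sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parser node; not the layout used in node.h. */
typedef struct pair {
  int key;
  int value;
} pair;

/* Allocate a pair. On allocation failure, report on stderr and return
 * NULL instead of writing through a null pointer. */
pair *
pair_new(int key, int value)
{
  pair *p = malloc(sizeof *p);

  if (p == NULL) {
    fputs("pair_new: out of memory\n", stderr);
    return NULL;
  }
  p->key = key;
  p->value = value;
  return p;
}

/* Write a string and tell the caller whether fputs succeeded:
 * 0 on success, -1 if fputs reported EOF. */
int
print_checked(const char *s, FILE *out)
{
  assert(s != NULL && out != NULL);
  return fputs(s, out) == EOF ? -1 : 0;
}
```

Callers then have to test the result, for example bailing out of the parse when pair_new returns NULL.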

Partial Application

I think that currying would be highly convenient in a stream-based programming language! Check this out:

seq(100) | (% 5) | STDOUT

seq(100) | (_ % 5) | STDOUT

Does streem have a schedule?

how can i run docker for mac

docker pull debian
Pulling repository debian
2014/12/16 22:19:08 Get https://index.docker.io/v1/repositories/debian/images: x509: certificate has expired or is not yet valid

Cram tests

I am writing some tests for the parser using the cram testing library

tee

I want tee. (not tea)

stdin | tee | sock

stdin | tee stdout | sock
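
At the file-descriptor level, the core of tee is just "write each chunk to every sink". The sketch below shows that step in C, assuming blocking descriptors; tee_chunk and write_all are hypothetical names, and nothing here reflects how streem would actually implement a tee function.

```c
#include <assert.h>
#include <unistd.h>

/* Write all of buf to fd, retrying after short writes.
 * Returns 0 on success, -1 on a write error. */
int
write_all(int fd, const char *buf, size_t len)
{
  while (len > 0) {
    ssize_t n = write(fd, buf, len);
    if (n < 0) return -1;
    buf += n;
    len -= (size_t)n;
  }
  return 0;
}

/* Hypothetical tee step: copy one chunk to every sink in turn.
 * Stops at the first sink that fails and returns -1. */
int
tee_chunk(const int *sinks, size_t nsinks, const char *buf, size_t len)
{
  size_t i;

  assert(sinks != NULL && buf != NULL);
  for (i = 0; i < nsinks; i++) {
    if (write_all(sinks[i], buf, len) < 0) return -1;
  }
  return 0;
}
```

In the second pipeline above, the sinks would be stdout and the socket.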

Procs, Blocks, and Lambdas

In Ruby, there is a strange mixture of blocks, which cannot be stored; procs, which can be stored but do not act as closures; and lambdas, which are the real thing but aren't really well supported by Ruby. Is this going to happen again?

Not that I dislike Ruby or anything, but I am a functional programmer.

Add integrations to Gitter chatroom

Gitter has a useful activity sidebar that you can add things like GitHub issues, GitHub commits, CI status, etc. to. While Streem is in the fairly early stages of development and probably won't need full CI any time soon, I generally find the GitHub integration to be pretty useful when people want to discuss the issues further in a Gitter chatroom. However, these integrations can only be set up by a collaborator on GitHub.

By the way, sorry if this is a bit off topic (as it's not directly related to Streem).

Toy Implementation using current lexer and parser?

I was thinking that someone (me) should have a fork that is an actual working toy implementation of Streem, so that new features and examples could actually be run and tested, and so that we could actually get an idea of what it would be like to work with Streem. Then, every time a new feature or syntax is added/changed, it could be implemented in the fork for testing and experimentation.

Typed

I know, I know, but in my opinion, to achieve very fast performance I would opt for a typed and compiled language.

Something in the Swift vein can be used for scripting as well as for full-featured programs.

What do you think?

reserved identifier violation

I would like to point out that identifiers like "_NODE_H_" and "_STRM_H_" do not fit to the expected naming convention of the C language standard.
Would you like to adjust your selection for unique names?
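
For context: the C standard reserves identifiers that begin with an underscore followed by an uppercase letter (or a second underscore) for the implementation. A guard without the leading underscore avoids the problem. The names below are hypothetical examples, not necessarily what the project adopted.

```c
/* Include guard whose name does not start with an underscore.
 * STRM_NODE_H is a hypothetical choice for illustration. */
#ifndef STRM_NODE_H
#define STRM_NODE_H

/* Declarations would go here; this forward declaration is a placeholder. */
typedef struct strm_example_node strm_example_node;

#endif /* STRM_NODE_H */
```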

op_bar should be an optional array?

$ cat | bin/streem
seq(100) | {x ->
  if x % 15 == 0 {
    "FizzBuzz"
  }
  else if x % 3 == 0 {
    "Fizz"
  }
  else if x % 5 == 0 {
    "Buzz"
  }
  else {
    x
  }
} | STDOUT
^D
Syntax OK
---------
VALUE(ARRAY):
 OP: (NOTE: op1)
  OP: (NOTE: op2)
   CALL:
     NIL
     IDENT: 6305192
     VALUE(ARRAY):
      VALUE(NUMBER): 100.000000
     NIL
   |
   CALL:
     NIL
     NIL
     NIL
     BLOCK:
      VALUE(ARRAY):
       IDENT: 6305204
      VALUE(ARRAY):
  |
  IDENT: 6305358

op_bar is parsed as several separate, nested operators. In the future, to handle op_bar as a pipe, how about parsing op_bar into an optional array that puts the nodes on the same layer, like below?

---------
VALUE(ARRAY):
 PIPE:
  CALL: (NOTE: seq(100))
   NIL
   IDENT: 6305192
   VALUE(ARRAY):
    VALUE(NUMBER): 100.000000
   NIL
  CALL:  (NOTE: {...})
   NIL
   NIL
   NIL
   BLOCK:
     VALUE(ARRAY):
     IDENT: 6305204
    VALUE(ARRAY):
  IDENT: 6305358   (NOTE: STDOUT)
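
One way to get from the nested form to the flat form is a left-to-right walk that collects pipe stages into an array. The sketch below uses a hypothetical miniature AST (ast_node, NODE_PIPE, NODE_LEAF) rather than the real node types from node.h, so treat it as an illustration of the idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature AST; the real node types live in node.h. */
typedef enum { NODE_LEAF, NODE_PIPE } node_kind;

typedef struct ast_node {
  node_kind kind;
  const char *name;              /* label for leaves */
  const struct ast_node *lhs;    /* used when kind == NODE_PIPE */
  const struct ast_node *rhs;
} ast_node;

/* Collect the stages of a nested pipe tree, left to right, into out.
 * used is the number of slots already filled (pass 0 at the top).
 * Returns the new number of stages, or -1 if out is too small. */
int
flatten_pipe(const ast_node *n, const ast_node **out, int cap, int used)
{
  assert(n != NULL && out != NULL);
  if (n->kind == NODE_PIPE) {
    used = flatten_pipe(n->lhs, out, cap, used);
    if (used < 0) return -1;
    return flatten_pipe(n->rhs, out, cap, used);
  }
  if (used >= cap) return -1;    /* no room for another stage */
  out[used] = n;
  return used + 1;
}
```

For a tree shaped like (seq(100) | {...}) | STDOUT, this yields the three stages in order, which matches the single PIPE layer proposed above. Doing the same thing directly in the grammar action would avoid the extra pass.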
