
bud's People

Contributors

aarondav, freenerd, j-wang, jhellerstein, joshrosen, michaelficarra, neilconway, palvaro, sciolizer, sriram-srinivasan, vjoel

bud's Issues

Make use of unified_ruby in rewriter

Before passing an sexp to Ruby2Ruby, you should first pass it through unified_ruby. This should hopefully avoid the need for kludges like SaneR2R.

Better implementation of stdin reader

Current send-data-via-TCP method is a massive kludge; would be much better to integrate the terminal's file descriptor into the EM event loop.

Similar comments probably apply to other IO-oriented collections.

Add concept of ordered channels?

The current channel semantics don't capture the ordering guarantee provided by several common types of channels (e.g., TCP sockets, typical Unix FDs, even typical asynchronous queues): if we send tuple t_1 at time 1 and t_2 at time 2, t_1 will be delivered before t_2. Choosing the delivery timestamp in a completely non-deterministic manner doesn't capture this. In practical programs, this means that the "correct" operation of the program depends on semantics not explicitly provided by the language spec (i.e., that two writes to stdio in adjacent timesteps preserve the timestep order).
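
One way to picture the missing guarantee is a receiver that restores send order on top of a channel that reorders messages. The following is only a sketch in plain Ruby, not Bud code: the OrderedReceiver class and the sequence-number tagging are assumptions made for illustration.

```ruby
# Hypothetical sketch: restores send order on top of a channel that may
# deliver messages in any order. Not part of Bud.
class OrderedReceiver
  attr_reader :delivered

  def initialize
    @next_seq = 0      # sequence number we expect to release next
    @pending = {}      # out-of-order messages, keyed by sequence number
    @delivered = []    # payloads released in send order
  end

  # Accept a message in whatever order the channel delivers it.
  def receive(seq, payload)
    @pending[seq] = payload
    # Release every consecutive message starting at the expected number.
    while @pending.key?(@next_seq)
      @delivered << @pending.delete(@next_seq)
      @next_seq += 1
    end
  end
end

# The sender tags tuples with increasing sequence numbers.
messages = [[0, "t_1"], [1, "t_2"], [2, "t_3"]]

receiver = OrderedReceiver.new
# Simulate a channel that reorders messages arbitrarily.
[messages[2], messages[0], messages[1]].each do |seq, payload|
  receiver.receive(seq, payload)
end

p receiver.delivered
```

Even though t_3 arrives first, it is held back until t_1 and t_2 have been released, so the delivered list follows send order.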

String interpolation doesn't work

stdio <~ pipe.map {|p| ["got message: #{p.msg}"]}

Produces the output: "got message:", but doesn't appear to interpolate the given string correctly.

Fix pingpong example

Uses deprecated syntax (explicit strata). Explicit specification of hostname/port is ugly, but harder to resolve.

Async request/response handling

via a library or sugar. Some related things:

  • atomic deferred/async
  • batch scheduling
  • storage: UNIX I/O, SQL
  • RESTful web service interaction
  • HTML protocol bindings
  • interrupt-driven/push collections vs. read/pull

"terminal" collection type + multiple Buds per process

Right now, terminal automatically tries to read from stdin. This is annoying, because stdin might be used for other purposes. It also causes problems since a given Bud instance creates sub-instances to do stratification and similar tasks. Each instance spawns a thread to read from stdin, leading to contention.

Fix joins for TC-backed tables

Current exception:

NoMethodError: undefined method `each_value' for nil:NilClass
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:103:in `each_from'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:102:in `each'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:102:in `each_from'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:112:in `each_storage'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:656:in `send'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:656:in `hash_join'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:590:in `each'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:585:in `each'
/Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:584:in `each'

Should <+ and <- be evaluated "eagerly"?

Right now, <+ and <- derivations don't cause a new tick to occur. That is, suppose I have a program:

  • When X occurs, add a new fact to Y via <+
  • When Y contains 10 facts, fire the missiles

With current Bud, if there are exactly 10 X events (and no more), the missiles will never be fired: the 10th Y fact will be "pending", but won't be added to Y until the next tick, which might never occur.
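
The scenario can be reproduced with a toy model of deferred insertion. This is a sketch in plain Ruby, not Bud's implementation: ToyRuntime and its tick method are hypothetical names, and the rule evaluation is hard-coded.

```ruby
# Toy model of deferred (<+) insertion; the class and method names are
# hypothetical and do not reflect Bud's implementation.
class ToyRuntime
  attr_reader :y, :fired

  def initialize
    @y = []          # facts visible in the current tick
    @pending = []    # facts derived via <+, visible only from the next tick
    @fired = false
  end

  # One timestep: apply pending facts, evaluate the rules, record new input.
  def tick(x_event = nil)
    @y.concat(@pending)
    @pending = []
    @fired = true if @y.size >= 10     # "when Y contains 10 facts, fire"
    @pending << x_event if x_event     # "when X occurs, y <+ x"
  end
end

rt = ToyRuntime.new
10.times { |i| rt.tick("x#{i}") }
puts "after 10 events: y has #{rt.y.size} facts, fired=#{rt.fired}"

rt.tick   # an extra tick with no input event applies the pending fact
puts "after extra tick: y has #{rt.y.size} facts, fired=#{rt.fired}"
```

After the ten events, Y holds nine facts and one is still pending; only the extra, input-free tick makes the tenth fact visible and triggers the second rule.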

Rewrite failure: empty "declare" block

Given a Bud program with a "declare def foo; end" block, Bud produces the output:

Running original (ZkTableTest) code: couldn't rewrite stratified ruby (Invalid top-level clause length 1: '[:nil]')

ordering features

  • queue/serializer
  • nonce/sequence generation
    • nested time?
    • Solution: assign_ordinals. Given a set and a total order for that set, return the input set annotated with the ordinal of each element of the set according to the ordering.
    • How does this generalize to ordered tables?
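
A minimal sketch of the assign_ordinals idea in plain Ruby follows. The signature is an assumption (the issue does not specify one): the total order is passed as a comparison block and ordinals start at 0.

```ruby
require 'set'

# Hypothetical helper: annotate each element with its 0-based ordinal
# under the total order given by the comparison block.
def assign_ordinals(set, &order)
  set.sort(&order).each_with_index.map { |elem, i| [elem, i] }
end

items = Set.new(["pear", "apple", "fig"])
p assign_ordinals(items) { |a, b| a <=> b }
```

Because the result depends only on the set's contents and the ordering, any two nodes holding the same set would compute the same annotation.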

Test case failure: test_visualization

.......................E...Running original (VarBudDup) code: couldn't rewrite stratified ruby (Invalid op (x) in top-level block [:attrasgn, [:self], :x=, [:array, [:lit, 4]]]
)
.Running original (VarBud) code: couldn't rewrite stratified ruby (Invalid op (x) in top-level block [:attrasgn, [:self], :x=, [:array, [:lit, 4]]]
)
..
Finished in 33.719155 seconds.

  1. Error:
    test_visualization(TestMeta):
    NoMethodError: undefined method `cycle' for #<DepAnalysis:0x101791948>
    ./tc_meta.rb:92:in `test_visualization'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:106:in `each_from'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:103:in `each_value'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:103:in `each_from'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:102:in `each'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:102:in `each_from'
    /Library/Ruby/Gems/1.8/gems/bud-0.0.1/lib/bud/collections.rb:98:in `each'
    ./tc_meta.rb:92:in `test_visualization'

30 tests, 138 assertions, 0 failures, 1 errors

scope and aliasing

currently if I want to use a module Foo, I say:

include Foo

and this brings into (global) scope the rules and relations defined by Foo. what if I want to use two Foos? what if I want to extend Foo by redeclaring one of its input interfaces and interposing additional logic? we really want something like:

import Foo as f1

this would kill both birds...

Persistent storage: on-disk state

Where should we store the on-disk state associated with persistent tables? Requirements:

  • If two concurrent Bud instances are running, they shouldn't try to write to the same on-disk state
    • Solution: include port number in path to on-disk state
  • If the same Bud program is run twice, the persistent state from the previous instance should be accessible to the new instance
    • What defines "a new instance of the same Bud program"? One approach would be to include the full path name of the Bud program in the path to the on-disk state; running /x/y/foo.rb on port 5555 twice in a row means that the second instance reuses the state written by the first instance

Other behavior questions:

  • What happens if the format of an on-disk table changes?
    • Possible solution: have a Bud flag/option that means "Blow away any previously-stored persistent state". When the schema of an on-disk table changes, require that this flag be specified, deleting any data stored with the previous schema
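
The requirements above could be met along the following lines. This is a sketch under assumptions, not Bud's behavior: state_dir, open_state, and the reset option are hypothetical names, and hashing the program path (rather than embedding it verbatim) is a choice made here to get a filesystem-safe directory name. It uses keyword arguments, so it needs Ruby 2.0 or later.

```ruby
require 'digest/sha1'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: derive a per-program, per-port state directory.
def state_dir(root, program_path, port)
  # Hash the absolute program path so it is safe to use as a directory name.
  program_id = Digest::SHA1.hexdigest(File.expand_path(program_path))
  File.join(root, program_id, port.to_s)
end

# Hypothetical "blow away any previously-stored state" option.
def open_state(root, program_path, port, reset: false)
  dir = state_dir(root, program_path, port)
  FileUtils.rm_rf(dir) if reset
  FileUtils.mkdir_p(dir)
  dir
end

Dir.mktmpdir do |root|
  first = open_state(root, "/x/y/foo.rb", 5555)
  File.write(File.join(first, "table.db"), "old schema")

  second = open_state(root, "/x/y/foo.rb", 5555)   # same program, same port
  other  = open_state(root, "/x/y/foo.rb", 5556)   # concurrent instance
  puts second == first                              # state is reused
  puts other == first                               # state is kept separate
  puts File.exist?(File.join(second, "table.db"))

  open_state(root, "/x/y/foo.rb", 5555, reset: true)
  puts File.exist?(File.join(first, "table.db"))    # old data is gone
end
```

The run uses a temporary root directory so that it leaves nothing behind; a real deployment would need a stable root, which the issue leaves open.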

refactor collection interface

Building a collection type based on TokyoCabinet is somewhat awkward, because the base BudCollection class provides/assumes that @storage is an in-memory hash containing the "normal" tuples in the table. It is inconvenient to maintain this invariant for TC; it would be better if BudCollection didn't make this assumption.

Other areas for improvement:

  • interface is large; it isn't clear what stuff an implementation needs to provide
  • perhaps move some functionality into a module/mixin?

safe_rewrite should actually do some validation

Currently, safe_rewrite doesn't validate things. Some validation that might be nice is "are there any variables in this code that are undefined?" and "are all of the tables referenced actually defined?"

Ideally, we'd like to catch stuff like this example (which illustrates the previous two points):

class Foo < Bud
  declare
  def rules
    j # undefined variable
    xyz <= [[1]] # undefined collection xyz
  end
end
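
The kind of check described here can be sketched independently of the rewriter. In the plain-Ruby example below, the Rule struct and the validate function are hypothetical: they assume each rule has already been reduced to the collection it writes and the names it reads, which is not Bud's internal representation.

```ruby
# Hypothetical rule representation: each rule names the collection it
# writes (lhs) and the identifiers it reads (rhs). Not Bud's internal form.
Rule = Struct.new(:lhs, :rhs)

# Return one message per reference to an undeclared name.
def validate(declared, rules)
  errors = []
  rules.each do |rule|
    errors << "undefined collection #{rule.lhs}" unless declared.include?(rule.lhs)
    rule.rhs.each do |name|
      errors << "undefined variable #{name}" unless declared.include?(name)
    end
  end
  errors
end

declared = [:link, :path]
rules = [
  Rule.new(:path, [:link]),   # valid
  Rule.new(:xyz, []),         # xyz is not declared
  Rule.new(:path, [:j])       # j is not declared
]
puts validate(declared, rules)
```

The hard part in practice is extracting the names from the parsed rule bodies; the sketch only covers the comparison against the declared collections.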

UDA for disorderly cart

We need to take the disorderly schema (session, item, cnt) and transform it to the array representation (session, array_of_items), as we describe in the CIDR paper. array_of_items should contain each item cnt times, so we need either a binary UDA or some flattening trick. The current unary UDA accum() does not account for cnt.
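
The intended transformation, written as an ordinary Ruby function rather than a UDA, might look like the sketch below. flatten_cart is a hypothetical name, and the code assumes cnt is a non-negative integer; sorting each array is a choice made here so that the result does not depend on input order.

```ruby
# Hypothetical sketch: expand (session, item, cnt) tuples into
# (session, array_of_items), repeating each item cnt times.
# Assumes cnt is a non-negative integer.
def flatten_cart(tuples)
  carts = Hash.new { |h, k| h[k] = [] }
  tuples.each do |session, item, cnt|
    carts[session].concat([item] * cnt)
  end
  # Sort each array so the result does not depend on input order.
  carts.map { |session, items| [session, items.sort] }
end

cart = [[1, "beer", 2], [1, "diapers", 1], [2, "beer", 1]]
p flatten_cart(cart)
```

Session 1 ends up with two copies of "beer" and one of "diapers", which is the information a unary accumulator over item alone would lose.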

Fix deletion semantics

Currently we check for <- LHS matches only looking at the key columns; should look at the entire tuple.
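
The difference between the two matching rules can be shown on a toy table. This is a plain-Ruby sketch, not Bud's storage code: the table is a hash keyed on the first column, and both function names are hypothetical.

```ruby
# Toy table keyed on the first column; hypothetical, not Bud's storage code.
table = { 1 => [1, "alice"], 2 => [2, "bob"] }

# Key-only matching: any stored tuple with the same key is deleted.
def delete_by_key(table, tuple)
  table.reject { |key, _| key == tuple[0] }
end

# Whole-tuple matching: the stored tuple must equal the deleted tuple.
def delete_by_tuple(table, tuple)
  table.reject { |_, stored| stored == tuple }
end

stale = [1, "carol"]   # same key as a stored tuple, different value
p delete_by_key(table, stale).keys
p delete_by_tuple(table, stale).keys
```

With key-only matching, deleting [1, "carol"] removes [1, "alice"] as well; with whole-tuple matching the table is left unchanged.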

Bud API: support interaction with async bud

When Bud is running as a background thread, we need a safe way to interact with it (e.g., insert tuples, probe table contents) without race conditions. The easiest way to do that is to use a thread-safe queue.
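
A minimal sketch of the queue-based approach, using Ruby's built-in Queue, is shown below. The request format ([kind, payload, reply_queue]) and the worker loop are assumptions for illustration; the worker stands in for the Bud thread and is the only thread that touches the table.

```ruby
require 'thread'   # Queue is built in on modern Ruby; harmless otherwise

# Hypothetical sketch: all interaction goes through a thread-safe queue,
# so only the background thread ever touches the table.
requests = Queue.new
table = []

worker = Thread.new do
  loop do
    kind, payload, reply = requests.pop   # blocks until a request arrives
    case kind
    when :insert then table << payload
    when :probe  then reply << table.dup  # hand back a snapshot
    when :stop   then break
    end
  end
end

requests << [:insert, [1, "a"], nil]
requests << [:insert, [2, "b"], nil]

reply = Queue.new
requests << [:probe, nil, reply]
snapshot = reply.pop    # wait for the background thread to answer

requests << [:stop, nil, nil]
worker.join
p snapshot
```

Because the queue is FIFO, the probe is processed after both inserts, and the caller receives a copy of the table rather than a reference to state the background thread is still mutating.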

Bud API: easier insert into scratch

There isn't an easy way to insert tuples into a scratch collection between calls to tick(): right now, scratches are emptied before the next tick() begins.
