hcchen / algebird

This project forked from twitter/algebird

Abstract Algebra for Scala

Home Page: https://twitter.com/scalding

License: Apache License 2.0

Scala 95.06% Java 1.89% Shell 1.66% Ruby 1.39%

algebird's Introduction

Algebird

Abstract algebra for Scala. This code is targeted at building aggregation systems (via Scalding or Storm). It was originally developed as part of Scalding's Matrix API, where matrices held values that are elements of monoids, groups, or rings. It later became clear that the code had broader application within Scalding and in other projects at Twitter.

See the current API documentation for more information.

What can you do with this code?

> ./sbt algebird-core/console

Welcome to Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_40).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.twitter.algebird._
import com.twitter.algebird._

scala> import com.twitter.algebird.Operators._
import com.twitter.algebird.Operators._

scala> Map(1 -> Max(2)) + Map(1 -> Max(3)) + Map(2 -> Max(4))
res0: scala.collection.immutable.Map[Int,com.twitter.algebird.Max[Int]] = Map(2 -> Max(4), 1 -> Max(3))

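In the example above, wrapping a value in Max[T] signals that the + operator should compute the maximum rather than an arithmetic sum. This works because an implicit typeclass instance for Max defines how + behaves. Maps are added key by key, which is why key 1 ends up with Max(3).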
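The mechanism can be sketched without the library. The following is a minimal, self-contained illustration, not Algebird's actual source: the `Semigroup` trait, the `Max` wrapper, and the `|+|` operator are hypothetical stand-ins defined here (the operator is deliberately not named `+`, to avoid clashing with Map's built-in method).

```scala
object MaxSketch {
  // Hypothetical stand-in for a semigroup typeclass: one associative `plus`.
  trait Semigroup[T] {
    def plus(l: T, r: T): T
  }

  // Wrapper whose only job is to select the "max" instance for its contents.
  final case class Max[T](get: T)

  object Semigroup {
    // For Max[T], "addition" keeps the larger wrapped value.
    implicit def maxSemigroup[T](implicit ord: Ordering[T]): Semigroup[Max[T]] =
      new Semigroup[Max[T]] {
        def plus(l: Max[T], r: Max[T]): Max[T] =
          if (ord.gteq(l.get, r.get)) l else r
      }

    // Maps add key by key, using the value type's instance on shared keys.
    implicit def mapSemigroup[K, V](implicit sg: Semigroup[V]): Semigroup[Map[K, V]] =
      new Semigroup[Map[K, V]] {
        def plus(l: Map[K, V], r: Map[K, V]): Map[K, V] =
          r.foldLeft(l) { case (acc, (k, v)) =>
            acc.updated(k, acc.get(k).fold(v)(sg.plus(_, v)))
          }
      }
  }

  // Operator syntax that looks up the implicit instance for the operand type.
  implicit class SemigroupOps[T](self: T) {
    def |+|(other: T)(implicit sg: Semigroup[T]): T = sg.plus(self, other)
  }

  // Same expression as in the REPL session, written with the sketch's operator.
  val combined: Map[Int, Max[Int]] =
    Map(1 -> Max(2)) |+| Map(1 -> Max(3)) |+| Map(2 -> Max(4))
}
```

Here `MaxSketch.combined` holds Max(3) under key 1 and Max(4) under key 2, matching the REPL output. The compiler picks the map instance, which in turn picks the Max instance, so no explicit wiring is needed at the call site.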

  • Model a wide class of "reductions" as a sum over an iterator of a particular value type: for example, average, moving average, max/min, set union, approximate set size (in much less memory, with HyperLogLog), and approximate item counting (using CountMinSketch).
  • All of these combine naturally in tuples, vectors, maps, options, and other standard Scala classes.
  • Implementations of Monoids for interesting approximation algorithms, such as Bloom filter, HyperLogLog, and CountMinSketch. These let you treat sophisticated operations much as you would numbers, and add them up in Hadoop or online to produce powerful statistics and analytics.
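As a rough illustration of the first two points, the sketch below expresses an average and a set union as monoid sums and combines them in a tuple, so one pass over the data computes both. It is self-contained and does not use Algebird: the `Monoid` trait, the `Averaged` class, and the `sum` helper are hypothetical names chosen for this example.

```scala
object ReductionSketch {
  // Hypothetical minimal monoid: an identity element plus an associative `plus`.
  trait Monoid[T] {
    def zero: T
    def plus(l: T, r: T): T
  }

  // A partial average kept as (count, mean) so partial results can be merged.
  final case class Averaged(count: Long, mean: Double)

  object Monoid {
    implicit val averaged: Monoid[Averaged] = new Monoid[Averaged] {
      val zero: Averaged = Averaged(0L, 0.0)
      def plus(l: Averaged, r: Averaged): Averaged = {
        val n = l.count + r.count
        if (n == 0L) zero // avoid 0/0 when both sides are empty
        else Averaged(n, (l.mean * l.count + r.mean * r.count) / n)
      }
    }

    // Set union: the empty set is the identity.
    implicit def setUnion[A]: Monoid[Set[A]] = new Monoid[Set[A]] {
      val zero: Set[A] = Set.empty[A]
      def plus(l: Set[A], r: Set[A]): Set[A] = l ++ r
    }

    // A pair of monoids is a monoid: combine component by component.
    implicit def tuple2[A, B](implicit a: Monoid[A], b: Monoid[B]): Monoid[(A, B)] =
      new Monoid[(A, B)] {
        val zero: (A, B) = (a.zero, b.zero)
        def plus(l: (A, B), r: (A, B)): (A, B) =
          (a.plus(l._1, r._1), b.plus(l._2, r._2))
      }
  }

  // Every reduction is the same fold; only the monoid instance changes.
  def sum[T](items: Iterable[T])(implicit m: Monoid[T]): T =
    items.foldLeft(m.zero)(m.plus)

  val readings = List(2.0, 4.0, 9.0)

  // One pass computes the average and the set of distinct rounded values.
  val (avg, seen) = sum(readings.map(x => (Averaged(1L, x), Set(x.toInt))))
}
```

Because `plus` is associative, partial sums computed on different machines can be merged in any grouping, which is what makes this style a good fit for distributed aggregation.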

Community and Documentation

This project, like all github.com/twitter projects, is under the Twitter Open Source Code of Conduct. Additionally, see the Typelevel Code of Conduct for specific examples of harassing behavior that are not tolerated.

To learn more and find links to tutorials and information around the web, check out the Algebird Wiki.

The latest ScalaDocs are hosted on Algebird's GitHub Project Page.

Discussion occurs primarily on the Algebird mailing list. Issues should be reported on the GitHub issue tracker.

Maven

Algebird modules are available on Maven Central. The current groupId and version for all modules are "com.twitter" and 0.11.0, respectively.

Current published artifacts are:

  • algebird-core_2.11
  • algebird-core_2.10
  • algebird-test_2.11
  • algebird-test_2.10
  • algebird-util_2.11
  • algebird-util_2.10
  • algebird-bijection_2.11
  • algebird-bijection_2.10

The suffix denotes the Scala version.
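For example, assuming an sbt build (the README does not prescribe a build tool), the core module could be declared as follows. The `%%` operator appends the Scala binary version of your build, such as `_2.10` or `_2.11`, to the artifact name.

```scala
// build.sbt (assumes sbt; other build tools would use the suffixed artifact name directly)
libraryDependencies += "com.twitter" %% "algebird-core" % "0.11.0"
```

The other modules listed above follow the same pattern with their own artifact names.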

Questions

Why not use Spire?

We didn't know about it when we started this code, and it seems that we are more focused on large-scale analytics.

Why not use Scalaz's Monoid trait?

The answer is a mix of the following:

  • The trait itself is tiny: we just need zero and plus. It is the implementations for all the types that are important. We wrote a code generator to derive instances for all the tuples, and wrote monoids by hand for List, Set, Option, Map, and several other objects used for counting (DecayedValue for exponential decay, AveragedValue for averaging, HyperLogLog for approximate cardinality counting). It's the instances that are useful in Scalding and elsewhere.
  • We needed this to work in Scala 2.8, and it appeared that Scalaz 7 didn't support 2.8. We've since moved to 2.9, though.
  • We also needed Ring and Field, and those are not (as of the writing of the code) in Scalaz.
  • If you want to interop, it is trivial to define implicit conversions to and from Scalaz Monoid.
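The last point can be sketched as follows. To stay self-contained, the example uses two stand-in traits rather than the real libraries: `AlgebirdLikeMonoid` mirrors the zero/plus shape described above, while `ScalazLikeMonoid` and its member names are assumptions about what a Scalaz-style trait looks like. Check the actual signatures before adapting this.

```scala
import scala.language.implicitConversions

object InteropSketch {
  // Stand-in with the zero/plus shape described in the list above.
  trait AlgebirdLikeMonoid[T] {
    def zero: T
    def plus(l: T, r: T): T
  }

  // Stand-in for a Scalaz-style monoid; member names are assumptions.
  trait ScalazLikeMonoid[T] {
    def zero: T
    def append(a: T, b: => T): T
  }

  // Each conversion just forwards the two operations.
  implicit def toScalazLike[T](m: AlgebirdLikeMonoid[T]): ScalazLikeMonoid[T] =
    new ScalazLikeMonoid[T] {
      def zero: T = m.zero
      def append(a: T, b: => T): T = m.plus(a, b)
    }

  implicit def toAlgebirdLike[T](m: ScalazLikeMonoid[T]): AlgebirdLikeMonoid[T] =
    new AlgebirdLikeMonoid[T] {
      def zero: T = m.zero
      def plus(l: T, r: T): T = m.append(l, r)
    }

  // A function written against the Scalaz-style interface.
  def scalazStyleSum[T](xs: List[T])(m: ScalazLikeMonoid[T]): T =
    xs.foldLeft(m.zero)((a, b) => m.append(a, b))

  val intAdd: AlgebirdLikeMonoid[Int] = new AlgebirdLikeMonoid[Int] {
    def zero: Int = 0
    def plus(l: Int, r: Int): Int = l + r
  }

  // The implicit conversion adapts intAdd at the call site.
  val total: Int = scalazStyleSum(List(1, 2, 3))(intAdd)

  // Converting back works the same way.
  val back: AlgebirdLikeMonoid[Int] = toScalazLike(intAdd)
}
```

Since both interfaces carry only an identity and a binary operation, the adapters are thin wrappers with no extra state.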

Authors

License

Copyright 2015 Twitter, Inc.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

algebird's People

Contributors

avi-stripe, avibryant, azymnis, berngp, bkirwi, caniszczyk, cheecheeo, daniellesucher, df-stripe, echen, hansmire, ianoc, isnotinvain, jackcwang1, jcoveney, johnynek, julienledem, koertkuipers, mansurashraf, mikegagnon, mosesn, reconditesea, rgcase, sid-kap, singhala, smarden1, snoble, sritchie, stephanh, wlue
