This project forked from snowplow/snowplow

Enterprise-strength web, mobile and event analytics, powered by Hadoop, Kinesis, Redshift and Elasticsearch

Home Page: http://snowplowanalytics.com

Snowplow

Snowplow is an enterprise-strength marketing and product analytics platform. It does three things:

  1. Identifies your users, and tracks the way they engage with your website or application
  2. Stores your users' behavioural data in a scalable "event data warehouse" you control: in Amazon S3 and (optionally) Amazon Redshift or Postgres
  3. Lets you analyze that behavioural data with a wide range of tools, including big data tools (e.g. Hive, Pig, Mahout) via EMR and more traditional tools (e.g. Tableau, R, Looker, Chartio)

To find out more, please check out the Snowplow website and the Snowplow wiki.

Snowplow technology 101

The repository structure follows the conceptual architecture of Snowplow, which consists of six loosely-coupled sub-systems connected by five standardized data protocols/formats:

[Figure: Snowplow architecture diagram]

To briefly explain these six sub-systems:

  • Trackers fire Snowplow events. Currently we have 12 trackers, covering web, mobile, desktop, server and IoT
  • Collectors receive Snowplow events from trackers. Currently we have three different event collectors, sinking events either to Amazon S3 or Amazon Kinesis
  • Enrich cleans up the raw Snowplow events, enriches them and puts them into storage. Currently we have a Hadoop-based enrichment process, and a Kinesis-based process
  • Storage is where the Snowplow events live. Currently we store the Snowplow events in a flatfile structure on S3, and in the Redshift and Postgres databases
  • Data modeling is where event-level data is joined with other data sets and aggregated into smaller data sets, and business logic is applied. This produces a clean set of tables which make it easier to perform analysis on the data. We have data models for Redshift and Looker
  • Analytics are performed on the Snowplow events or on the aggregate tables. We currently have an online cookbook of ad hoc analyses that work with Redshift, Postgres and Hive. We also have data models for Looker, written in LookML

For more information on the current Snowplow architecture, please see the Technical architecture documentation.

Quickstart

Assuming git, Vagrant and VirtualBox are installed:

 host$ git clone https://github.com/snowplow/snowplow.git
 host$ cd snowplow
 host$ vagrant up && vagrant ssh
guest$ cd /vagrant/3-enrich/scala-common-enrich
guest$ sbt test
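
Before running the steps above, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch and is not part of the Snowplow repository: the `missing_tools` helper is a hypothetical name, and the check assumes VirtualBox is detected through its `VBoxManage` command-line tool.

```shell
#!/bin/sh
# Print the name of each given command that is not found on PATH.
# (missing_tools is a hypothetical helper, not shipped with Snowplow.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: VirtualBox is detected via its VBoxManage CLI.
missing=$(missing_tools git vagrant VBoxManage)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Missing prerequisites:" $missing >&2
else
    echo "All prerequisites found"
fi
```

If anything is reported as missing, install it on the host before running `vagrant up`.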

Find out more

Technical Docs | Setup Guide | Roadmap | Contributing

Contributing

We're committed to a loosely-coupled architecture for Snowplow and would love to get your contributions within each of the six sub-systems.

If you would like help implementing a new tracker, adding an additional enrichment or loading Snowplow events into an alternative database, check out our Contributing page on the wiki!

Questions or need help?

Check out the Talk to us page on our wiki.

Copyright and license

Snowplow is copyright 2012-2013 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
