jtkostman / reelbid

This project forked from emptyr1/reelbid

Real time ad bidding framework

Home Page: http://muppal.com/reelbid

Python 57.03% Shell 2.68% Scala 8.72% CSS 1.90% XSLT 7.22% HTML 0.36% JavaScript 0.89% Makefile 0.06% Protocol Buffer 6.91% Ruby 1.30% Java 12.93%

ReelBid


ReelBid is a real-time ad bidding (programmatic) framework that can be used to bid on website advertisements in real time (within about 100 ms). Currently, it works with OpenRTB-compliant ad exchanges such as Google AdX, AppNexus, and Smaato. It was built as part of the Insight Data Engineering fellowship.

Find related slides for more details: https://goo.gl/3Bl15V

Getting started


(Because of a few security concerns around credentials, the code is still being uploaded and is somewhat incomplete.)

What is real-time bidding? Simply put, imagine an eBay auction that lasts for 100 milliseconds. There is a selling side (SSP) and a buying side (DSP). In this case, the ad exchange plays the role of eBay, hosting the auction, and ReelBid (this project) is the buying side, used by any advertiser. Depending on their business logic, registered or interested bidders bid for a spot on a webpage for every user impression before the page fully loads, which allows more targeted advertising. (Remember, you are bidding on every user impression on that website.) To learn more, I recommend reading this, this, or watching this 60-second video, or check out my slides here.

To get started, you need:

Methods & Technologies used to achieve high throughput


All of the technologies used were meant to be programmed in a highly asynchronous style, which is the key to making the system more effective and achieving sub-second latency. The clients can (and ideally should) use non-blocking IO to implement request pipelining and achieve higher throughput, i.e., clients can send requests even while awaiting responses to preceding requests, since the outstanding requests are buffered in the underlying OS socket buffer. (Java did not turn out to be the best language for this, because garbage collection in Java can pause the whole process, acting like a global lock. Go/golang would have been perfect for this and would be my next step.)
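
As a rough illustration of request pipelining over non-blocking IO, here is a minimal sketch using Python's asyncio. It is not ReelBid's code: the toy line-based protocol, the `bid:` prefix and the names `handle_client`, `pipelined_requests` and `run_demo` are assumptions made for the example.

```python
import asyncio


async def handle_client(reader, writer):
    # Toy "bidder" (hypothetical): answer each newline-terminated request
    # with one response line, in the order the requests arrived.
    while True:
        line = await reader.readline()
        if not line:  # client closed the connection
            break
        writer.write(b"bid:" + line)
        await writer.drain()
    writer.close()


async def pipelined_requests(host, port, requests):
    reader, writer = await asyncio.open_connection(host, port)
    # Pipelining: write every request before reading any response.
    # Outstanding requests wait in the socket buffers meanwhile.
    for request in requests:
        writer.write(request.encode() + b"\n")
    await writer.drain()
    responses = []
    for _ in requests:
        line = await reader.readline()
        responses.append(line.decode().rstrip("\n"))
    writer.close()
    await writer.wait_closed()
    return responses


async def demo():
    # Port 0 lets the OS pick a free port on the loopback interface.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await pipelined_requests("127.0.0.1", port, ["r1", "r2", "r3"])


def run_demo():
    return asyncio.run(demo())


if __name__ == "__main__":
    print(run_demo())
```

The client never blocks on a single round trip: all three requests are in flight before the first response is read, which is the property the paragraph above describes.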

Architecture 2.0


[Image: architecture diagram]

A total of 14 nodes were used, and I tried using Ambari from Hortonworks for cluster management.

Other tools include Vagrant and Docker (inside the processing folder), so you need to install Vagrant (>= 1.6) and VirtualBox, then run:

Analysis and Networking challenges


Q1: Why use Redshift and what analysis is being done?

We are bidding on online ads based on probabilistic models. We would like to know as much as possible about what we are paying for ads at this moment: not just how much I have spent in the past hour, but, for example, on what percentage of ads I have spent more than a penny, or what the 90th or 5th percentile is. That involves a lot of aggregations and groupings, and a relational database seemed like a good choice for this. Redshift, being an OLAP database, is pretty fast and supports extremely large data sizes.
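
To make the kind of question concrete, here is a small Python sketch of the two aggregates mentioned above, computed in memory rather than in Redshift. The price list and the helper names `share_above` and `percentile` are made up for illustration; the percentile uses the nearest-rank definition, which is one of several common conventions.

```python
def share_above(prices, threshold):
    """Fraction of prices strictly greater than the threshold."""
    return sum(1 for price in prices if price > threshold) / len(prices)


def percentile(prices, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not prices:
        raise ValueError("prices must not be empty")
    ordered = sorted(prices)
    # ceil(p * n / 100) in integer arithmetic, at least rank 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical prices paid per ad, in dollars.
    paid = [0.002, 0.004, 0.008, 0.011, 0.015, 0.020, 0.003, 0.009, 0.030, 0.001]
    print("share above a penny:", share_above(paid, 0.01))
    print("90th percentile:", percentile(paid, 90))
    print("5th percentile:", percentile(paid, 5))
```

On the full data set the same questions become GROUP BY and percentile queries, which is where an OLAP store earns its keep.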

Q2: How do you handle 2 million hits per second?

Sampling! I use the Ziggurat algorithm to draw random values that follow a Gaussian or gamma distribution. Check around slide 10 here. Reservoir sampling and VIRBs (variable incoming rate biased samplers) are other good techniques that I tried and that worked well when I needed a biased sample. Check out my recent post on Medium.
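
For reference, a minimal sketch of plain reservoir sampling (Algorithm R) in Python follows. This is the uniform variant, not the biased VIRB sampler and not the Ziggurat algorithm, and the function name `reservoir_sample` and its parameters are chosen for this example only.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1);
            # randint is inclusive on both ends.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 10 "hits" from a stream of one million without storing them all.
    print(reservoir_sample(range(1_000_000), 10, random.Random(42)))
```

Memory use stays at k items no matter how many requests flow past, which is what makes this family of samplers attractive at high request rates.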

Q3: What strategies were used to accept so many requests per second?

There is an interesting IEEE data mining paper on scaling RTB, which was a good and the sole inspiration for this project. Find it here.

Testing


How did I test the system?

I tested the system by creating a mock exchange, which sent random valid (or invalid) bid requests on port 12336 and win notifications (WINS) on port 12339. Validation was checked on the receiving side. Unit testing was done against the Smaato ad exchange. I also tried using parallec.io to essentially DDoS my own system.
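
The mock exchange itself is not shown here, so the following is only a sketch of the idea in Python: generate a request that is either valid or deliberately broken, and validate it on the receiving side. The names `make_bid_request` and `is_valid_bid_request` are hypothetical, the networking on ports 12336 and 12339 is left out, and the validator checks only the `id` and `imp` fields that OpenRTB requires on a bid request.

```python
import json
import random
import uuid


def make_bid_request(rng, valid=True):
    """Build a tiny bid request as JSON; optionally break it on purpose."""
    request = {
        "id": uuid.UUID(int=rng.getrandbits(128)).hex,
        "imp": [{"id": "1", "banner": {"w": 300, "h": 250}}],
    }
    if not valid:
        # Drop one required top-level field at random.
        del request[rng.choice(["id", "imp"])]
    return json.dumps(request)


def is_valid_bid_request(raw):
    """Receiving-side check: parseable JSON with an id and at least one imp."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(request, dict):
        return False
    if not isinstance(request.get("id"), str) or not request["id"]:
        return False
    imps = request.get("imp")
    if not isinstance(imps, list) or not imps:
        return False
    return all(isinstance(imp, dict) and imp.get("id") for imp in imps)


if __name__ == "__main__":
    rng = random.Random(7)
    for _ in range(5):
        raw = make_bid_request(rng, valid=rng.random() < 0.5)
        print(is_valid_bid_request(raw), raw)
```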

QA


What does a bid request look like? How did I deserialize the object?


A typical bid request looks like:

{
	"id": "32a69c6ba388f110487f9d1e63f77b22d86e916b",
	"imp": [{
		"id": "1",
		"banner": {
			"h": 250,
			"w": 300,
			"battr": [2, 3],
			"btype": [1, 3]
		}
	}],
	"site": {
		"id": "102855",
		"name": "mashable.com",
		"domain": "http://www.example.com",
		"cat": ["IAB15", "IAB15-10"],
		"page": "http://easy.example.com/easy?cu=13824;cre=mu;target=_blank",
		"ref": "http://refer+url",
		"publisher": {
			"id": "qqwer1234xgfd",
			"name": "site_name",
			"domain": "my.site.com"
		}
	},
	"device": {
		"ua": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.13  (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2",
		"ip": "192.168.5.5",
		"geo": {
			"lat": 37.789,
			"lon": -122.394,
			"country": "USA",
			"city": "San Francisco",
			"region": "CA",
			"zip": "94105",
			"type": 2
		}
	},
	"user": {
		"buyeruid": "89776897686798fwe87rtryt8976fsd7869678",
		"id": "55816b39711f9b5acf3b90e313ed29e51665623f",
		"gender": "M",
		"yob": 1975,
		"customdata": "Data-asdfdwerewr",
		"data": [{
			"id": "pub-demographics",
			"name": "data_name",
			"segment": [{
				"id": "345qw245wfrtgwertrt56765wert",
				"name": "segment_name",
				"value": "segment_value"
			}]
		}]
	}
}

The bid response that ReelBid sends back to the ad exchange includes the id and the price.

OpenRTB is the standard protocol used in real-time bidding. I'm using version 2.2 with Node.js, via a package you can find on npm.

List of major ad exchanges: AppNexus, Google AdX, Facebook
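
The project does its deserialization in Node.js, so the following Python sketch only illustrates the round trip, using the standard `json` module. The `Impression` class, the function names and the flat price are assumptions for the example; the response shape (`id`, `seatbid`, `bid` with `id`, `impid` and `price`) follows the OpenRTB bid response layout.

```python
import json
from dataclasses import dataclass


@dataclass
class Impression:
    id: str
    width: int
    height: int


def parse_impressions(raw):
    """Deserialize a bid request and pull out the request id and banner sizes."""
    request = json.loads(raw)
    impressions = []
    for imp in request["imp"]:
        banner = imp.get("banner", {})
        impressions.append(
            Impression(imp["id"], banner.get("w", 0), banner.get("h", 0))
        )
    return request["id"], impressions


def build_bid_response(request_id, impressions, price):
    """Answer every impression with the same (hypothetical) flat price."""
    bids = [
        {"id": f"{request_id}-{imp.id}", "impid": imp.id, "price": price}
        for imp in impressions
    ]
    return json.dumps({"id": request_id, "seatbid": [{"bid": bids}]})


if __name__ == "__main__":
    raw = '{"id": "abc123", "imp": [{"id": "1", "banner": {"h": 250, "w": 300}}]}'
    request_id, impressions = parse_impressions(raw)
    print(impressions)
    print(build_bid_response(request_id, impressions, 0.5))
```

A real bidder would replace the flat price with the output of its pricing model and would usually also attach creative markup to each bid.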

Contribution


Contact


I do not have a lot of background in advertising or real-time bidding. If I made a mistake or something does not make sense, please let me know. Feel free to reach me at: mudituppal247[at]gmail[dot]com
