sparklivecodingassignment's Introduction

Spark live coding assignment

Live coding Spark assignment that I did during one of my interviews

Coding assignment


Session Data - Holds aggregated information about a full session. A session is a single visit to a website.

Path: /data/session_data.parquet

Columns:

  • tenant_uuid: UUID of the customer
  • session_ts: timestamp of the start of the given session, in ms
  • session_uuid: unique ID for the session in Glassbox
  • struggle_types: an array of dictionaries representing various struggles and their aggregated values throughout the session.
  • is_converted
  • struggle_converted

user_action - Holds information about specific “actions” (predefined list) performed during a session

Path: /data/user_events/*.parquet

Columns:

  • tenant_uuid: UUID of the customer
  • date
  • dom_element: element on which the action occurred
  • session_ts: timestamp of the start of the given session, in ms
  • session_uuid: unique ID for the session in Glassbox
  • potential_revenue: collected revenue
  • client_action: name of the action that triggered a row in this table (‘click’, ‘scroll’, ‘load’, ‘hover’)
  • struggle_score: struggle score for the session up to this action.

Take a few minutes to get acquainted with the data before proceeding.


Exercise 1

Compute, for each session and each tenant:

  • Number of click actions in the session
  • Number of unique dom_elements in session
  • Number of clicks that have a potential revenue (more than 0) in the session
  • Number of page loads in the session
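The four metrics above can be checked on a small sample with a plain-Python reference implementation. This is only a sketch and not a Spark solution: it assumes each row of the action table is available as a dictionary keyed by the column names listed earlier, and that a page load corresponds to `client_action == 'load'`. The function name `session_action_stats` and the output keys are made up for illustration. In Spark, the same logic would typically be expressed as a grouping on (tenant_uuid, session_uuid) with conditional aggregates.

```python
from collections import defaultdict


def session_action_stats(rows):
    """Aggregate action rows per (tenant_uuid, session_uuid).

    `rows` is an iterable of dicts keyed by the user_action column names.
    Assumption: page loads are the rows whose client_action is 'load'.
    """
    stats = defaultdict(
        lambda: {"clicks": 0, "dom_elements": set(), "revenue_clicks": 0, "page_loads": 0}
    )
    for row in rows:
        session = stats[(row["tenant_uuid"], row["session_uuid"])]

        # Collect distinct DOM elements, skipping missing values.
        if row.get("dom_element") is not None:
            session["dom_elements"].add(row["dom_element"])

        if row["client_action"] == "click":
            session["clicks"] += 1
            # Treat a missing potential_revenue as 0.
            if (row.get("potential_revenue") or 0) > 0:
                session["revenue_clicks"] += 1
        elif row["client_action"] == "load":
            session["page_loads"] += 1

    # Replace the set of elements with its size in the final result.
    return {
        key: {
            "clicks": s["clicks"],
            "unique_dom_elements": len(s["dom_elements"]),
            "revenue_clicks": s["revenue_clicks"],
            "page_loads": s["page_loads"],
        }
        for key, s in stats.items()
    }
```

Comparing this output against the Spark result for a handful of sessions is a quick way to catch mistakes such as counting revenue on non-click actions.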

Exercise 2

Revenue is considered lost if the session is not converted (is_converted=False). We would like to compute the daily average revenue loss per session, for each struggle.

Revenue is stored in the action table, while struggles are stored as dictionaries in the session table, so the struggle revenue loss for a session has to combine the two.

We define a struggle's potential revenue as:

sum(potential_revenue) * (#struggle_type_occurrences / #struggles in session)
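As a worked illustration of the formula, the sketch below splits a session's total potential revenue across its struggles in proportion to how often each one occurred. It assumes the struggles of a session have already been reduced to a mapping from struggle name to occurrence count. The exact layout of struggle_types is not specified here, so that shape, the function name `struggle_potential_revenue`, and the struggle names in the example are all hypothetical.

```python
def struggle_potential_revenue(potential_revenues, struggle_counts):
    """Apply: sum(potential_revenue) * (occurrences of struggle / struggles in session).

    potential_revenues: potential_revenue values of the session's actions.
    struggle_counts: assumed mapping {struggle name: occurrence count}.
    """
    total_revenue = sum(potential_revenues)
    total_struggles = sum(struggle_counts.values())

    # A session without struggles has nothing to attribute revenue to.
    if total_struggles == 0:
        return {}

    return {
        name: total_revenue * count / total_struggles
        for name, count in struggle_counts.items()
    }


# Hypothetical session: 40.0 total revenue, 4 struggles in total.
example = struggle_potential_revenue(
    [10.0, 30.0],
    {"rage_click": 3, "dead_click": 1},
)
# example == {"rage_click": 30.0, "dead_click": 10.0}
```

The per-struggle values always add up to the session's total potential revenue, which is a useful sanity check on the full computation.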


sparklivecodingassignment's People

Contributors

sergeioff
