
gord-nuttall-adswerve / google_analytics_flattener


This project forked from adswerve/google_analytics_flattener


Google Cloud Platform solution that provides an event driven process that flattens (unnests) Google Analytics 360 data that has been exported to BigQuery.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

google_analytics_flattener's Introduction

README

Google Analytics 360 Flattener. A Google Cloud Platform (GCP) solution that unnests (flattens) Google Analytics data stored in BigQuery. The GCP resources for the solution are installed via Deployment Manager.

Dependencies

  • Python 3.7 as base interpreter
  • Create a virtual environment
  • Install packages using cf/requirements.txt
  • pip install google-cloud-pubsub==1.6.0 [for tools/pubsub_message_publish.py only]

Directories

  • cf : Pub/Sub-triggered Cloud Function that runs a destination query to unnest (flatten) the .ga_sessions_yyyymmdd table, immediately upon its arrival in BigQuery, into 5 tables:
    • ga_flat_sessions_yyyymmdd
    • ga_flat_hits_yyyymmdd
    • ga_flat_products_yyyymmdd
    • ga_flat_experiments_yyyymmdd
    • ga_flat_promotions_yyyymmdd
  • tests : unit tests for both Cloud Functions and the Deployment Manager templates
  • cfconfigbuilder : HTTP-triggered Cloud Function that finds all BigQuery datasets that have a ga_sessions table and adds them to the default configuration on Google Cloud Storage in the following location: [DEPLOY NAME]-[PROJECT_NAME]-adswerve-ga-flat-config\config_datasets.json
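
As a rough illustration of the naming and unnesting described above, the sketch below derives the five destination table names from a source table name and builds one example flattening query. The helper names (flat_table_names, hits_query), the selected columns, and the query shape are assumptions made for illustration, not the repository's actual code; the column names follow the standard GA360 BigQuery export schema.

```python
import re

# Kinds of flattened tables listed in the cf bullet above.
FLAT_TABLE_KINDS = ("sessions", "hits", "products", "experiments", "promotions")


def flat_table_names(source_table: str) -> dict:
    """Map each kind to its ga_flat_<kind>_yyyymmdd table name.

    Hypothetical helper: expects a name such as 'ga_sessions_20200101'.
    """
    match = re.fullmatch(r"ga_sessions_(\d{8})", source_table)
    if match is None:
        raise ValueError(f"not a ga_sessions_yyyymmdd table: {source_table!r}")
    date_shard = match.group(1)
    return {kind: f"ga_flat_{kind}_{date_shard}" for kind in FLAT_TABLE_KINDS}


def hits_query(project: str, dataset: str, source_table: str) -> str:
    """Build an illustrative standard-SQL query that unnests the hits array."""
    return (
        "SELECT fullVisitorId, visitStartTime, hit.hitNumber, hit.type\n"
        f"FROM `{project}.{dataset}.{source_table}`,\n"
        "UNNEST(hits) AS hit"
    )


if __name__ == "__main__":
    # Placeholder project and dataset identifiers.
    print(flat_table_names("ga_sessions_20200101")["hits"])
    print(hits_query("my-project", "123456789", "ga_sessions_20200101"))
```

The real cloud function selects far more columns per table; the point here is only that each output row corresponds to one element of a repeated field in the source row.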

Files

  • dm_helper.py: provides consistent names for GCP resources across the solution. Configuration and constants are also found in the class in this file
  • dmt_*: any files prefixed with dmt_ are Python-based Deployment Manager templates
  • ga_flattener.yaml: Deployment Manager configuration file. The entire solution is packaged in this file. Used in the deployment manager create command
  • tools/pubsub_message_publish.py : Python-based utility that publishes a message to simulate an event that's being monitored in GCP logging. Useful for smoke testing and back-filling historical data.
  • LICENSE: BSD 3-Clause open source license

Prerequisites

  1. Create a GCP project or use an existing project that has Google Analytics data flowing to it. Referred to as [PROJECT_NAME]
  2. Enable the Cloud Build API
  3. Enable the Cloud Functions API
  4. Add the "Logs Configuration Writer" and "Cloud Functions Developer" predefined IAM roles to [PROJECT_NUMBER]@cloudservices.gserviceaccount.com (built-in service account); otherwise deployment will fail with permission errors. See https://cloud.google.com/deployment-manager/docs/access-control for a detailed explanation.
  5. Install gcloud locally or use Cloud Shell.
  6. Clone this GitHub repo
  7. Create bucket for staging code during deployment, for example: [PROJECT_NAME]-function-code-staging.
  8. Edit the ga_flattener.yaml file, specifically the properties-->codeLocation value of the function and httpfunction resources. Set both values to the name of the bucket created in step 7.

Installation steps

  1. Execute command: gcloud config set project [PROJECT_NAME]
  2. Execute command: gcloud config set account [ACCOUNT_EMAIL]
  3. Navigate (locally) to root directory of this repository
  4. Execute command: gcloud deployment-manager deployments create [Deploy Name] --config ga_flattener.yaml
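
For repeatable installs, the commands above could be wrapped in a small script. The following is a minimal sketch, not part of the repository: the function names and the dry_run flag are hypothetical, and it assumes gcloud is on the PATH and that the script is run from the repository root (step 3).

```python
import subprocess


def install_commands(project: str, account: str, deploy_name: str) -> list:
    """Return the three gcloud invocations from the steps above as argument lists."""
    return [
        ["gcloud", "config", "set", "project", project],
        ["gcloud", "config", "set", "account", account],
        [
            "gcloud", "deployment-manager", "deployments", "create", deploy_name,
            "--config", "ga_flattener.yaml",
        ],
    ]


def run_install(project: str, account: str, deploy_name: str, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in install_commands(project, account, deploy_name):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing command.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them before a real run.
    run_install("my-project", "user@example.com", "ga-flattener", dry_run=True)
```

Running with dry_run=True only prints the commands, which is a cheap way to check the values before anything is created.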

Testing / Simulating Event

  1. After installation, set values in lines 6-11 of tools/pubsub_message_publish.py
  2. Run tools/pubsub_message_publish.py, which will publish a simulated logging event of GA data being ingested into BigQuery
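
To give a feel for what such a simulated event might look like, here is a hedged sketch. The payload layout is an assumption modeled on the legacy BigQuery audit log entry for a completed load job; the format the cloud function really expects is defined in tools/pubsub_message_publish.py, which remains the authority. The function names and the topic argument are hypothetical.

```python
import json


def build_message(project: str, dataset: str, date_shard: str) -> bytes:
    """Build an illustrative log-entry payload for a loaded ga_sessions table.

    Assumed layout (legacy BigQuery audit log, load job completed); check
    tools/pubsub_message_publish.py for the format the function expects.
    """
    entry = {
        "protoPayload": {
            "serviceData": {
                "jobCompletedEvent": {
                    "job": {
                        "jobConfiguration": {
                            "load": {
                                "destinationTable": {
                                    "projectId": project,
                                    "datasetId": dataset,
                                    "tableId": f"ga_sessions_{date_shard}",
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    # Pub/Sub message data must be bytes.
    return json.dumps(entry).encode("utf-8")


def publish(project: str, topic: str, data: bytes) -> str:
    """Publish the payload; requires google-cloud-pubsub and GCP credentials."""
    # Imported lazily so build_message works without the library installed.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project, topic)
    # publish() returns a future; result() blocks until the message ID is known.
    return publisher.publish(topic_path, data=data).result()


if __name__ == "__main__":
    # Placeholder identifiers; nothing is published here.
    payload = build_message("my-project", "123456789", "20200101")
    print(payload.decode("utf-8"))
```

Looping build_message over a range of dates and publishing each payload is one way the back-filling mentioned earlier could be approached.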

Un-install steps

  1. Optional command to remove solution: gcloud deployment-manager deployments delete [Deploy Name] -q

google_analytics_flattener's People

Contributors

gord-nuttall-adswerve, iamnutzy1975
