snowplow / snowplow-gtm-server-side-client

A Google Tag Manager Server-side Client template for collecting events using the Snowplow JavaScript Tracker

Home Page: https://snowplowanalytics.com

License: Apache License 2.0


snowplow-gtm-server-side-client's Introduction

Snowplow Client for Google Tag Manager Server-side


Snowplow is a scalable open-source platform for rich, high-quality, low-latency data collection. It is designed to collect high-quality, complete behavioral data for enterprise business.

To find out more, please check out the Snowplow website and our documentation.

Snowplow Client for Google Tag Manager Server-side Overview

A Google Tag Manager Server-side Client template for collecting events using the Snowplow JavaScript Tracker.

This Client allows you to collect events in your Google Tag Manager Server container and forward them to Tags.

Event Data

This client populates the "Common Event Data" specified in the Google Tag Manager documentation, which allows it to integrate easily with other Tags. It also populates a number of Snowplow properties, so that Tags can be created which leverage the rich data collected by the Snowplow JavaScript Tracker in the browser.
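As a rough illustration of the idea, the sketch below maps a few Snowplow enriched event fields onto Common Event Data keys and keeps the original Snowplow properties alongside them. It is not the template's actual code: the function name `toCommonEvent`, the `x-sp-` prefix and the exact field pairing are assumptions made for this example, and it is written as plain JavaScript rather than GTM sandboxed JavaScript.

```javascript
// Hypothetical helper, not taken from template.tpl.
// Left-hand keys are GTM "Common Event Data" names; right-hand values are
// Snowplow enriched event fields. The real template's mapping may differ.
function toCommonEvent(sp) {
  const common = {
    event_name: sp.event_name,
    client_id: sp.domain_userid,
    user_id: sp.user_id,
    page_location: sp.page_url,
    page_title: sp.page_title,
    page_referrer: sp.page_referrer,
    user_agent: sp.useragent,
  };

  // Drop keys with no value so downstream Tags only see populated fields.
  Object.keys(common).forEach((key) => {
    if (common[key] === undefined) delete common[key];
  });

  // Keep every Snowplow property under a prefix (the prefix is an assumption).
  Object.keys(sp).forEach((key) => {
    common['x-sp-' + key] = sp[key];
  });

  return common;
}

const example = toCommonEvent({
  event_name: 'page_view',
  domain_userid: 'abc-123',
  page_url: 'https://example.com/',
});
```

Here `example.client_id` is `'abc-123'`, `example.user_id` is absent, and the original URL is still available as `example['x-sp-page_url']`.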

Installation

Installing from the Google Tag Manager Gallery

Coming Soon

Manual Installation

To manually install this Client:

  1. Download template.tpl
  2. Create a new Client in the Templates section of a Google Tag Manager Server container
  3. Click the More Actions menu and select Import
  4. Import template.tpl downloaded in Step 1
  5. Click Save.

Find out more

Technical Docs Setup Guide

Contributing

Feedback and contributions are welcome - if you have identified a bug, please log an issue on this repo. For all other feedback, discussion or questions please open a thread on our discourse forum.


Copyright and license

Snowplow Client for Google Tag Manager Server-side is copyright 2021-2024 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

snowplow-gtm-server-side-client's People

Contributors

adatzer, paulboocock

Forkers

sahava, ulad-k

snowplow-gtm-server-side-client's Issues

Parse `client_session` schema from mobile tracking

When tracking with a mobile tracker, the client_session schema is usually populated. This includes a userId and session information. Mobile tracking rarely populates duid, sid and vid so we should prefer this client_session information when capturing events from mobile trackers.
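A minimal sketch of that preference, assuming the event arrives in the enriched JSON format where the client_session context appears under the key `contexts_com_snowplowanalytics_snowplow_client_session_1`. The function name `resolveIdentifiers` and the returned shape are hypothetical; the fallback fields `domain_userid`, `domain_sessionid` and `domain_sessionidx` are the enriched equivalents of duid, sid and vid.

```javascript
// Assumed key for the client_session context in enriched JSON.
const SESSION_KEY = 'contexts_com_snowplowanalytics_snowplow_client_session_1';

// Hypothetical helper: prefer client_session (mobile) over the web identifiers.
function resolveIdentifiers(sp) {
  const contexts = sp[SESSION_KEY];
  const session =
    Array.isArray(contexts) && contexts.length > 0 ? contexts[0] : undefined;

  if (session) {
    return {
      userId: session.userId,
      sessionId: session.sessionId,
      sessionIndex: session.sessionIndex,
      source: 'client_session',
    };
  }

  // No client_session context: fall back to the web tracker identifiers.
  return {
    userId: sp.domain_userid,
    sessionId: sp.domain_sessionid,
    sessionIndex: sp.domain_sessionidx,
    source: 'domain',
  };
}

const mobileIds = resolveIdentifiers({
  [SESSION_KEY]: [{ userId: 'm-1', sessionId: 's-9', sessionIndex: 3 }],
});
const webIds = resolveIdentifiers({
  domain_userid: 'd-1',
  domain_sessionid: 's-1',
  domain_sessionidx: 1,
});
```

With these inputs `mobileIds.source` is `'client_session'` and `webIds.source` is `'domain'`.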

first_visit and session_start when sending events to GA4.

Hi there, at my work we've been trying to use this Client to receive events in GTM SS and send events to GA4 through a Tag.

I've found that some of the reports end up broken because a few parameters collected by the GA4 Client are not calculated or exposed by the Snowplow Client, mainly the ones behind metrics about a user's first session or first visit.

In a GA4 Client, when a new session starts, "ss" is set to 1, which triggers a "session_start" event, and when a new client_id is set, "fv" is set to 1, which triggers a "first_visit" event. Ignore the "dbg" parameter, which just means debug.

image

"x-ga-system-properties" currently does not exist in events coming from a Snowplow Client, could it be added?

Thanks, looking forward to your reply. :)

Fix tests that throw error

While each individual test succeeds, running them all with "Run tests" makes every test but the first fail with the error "Tried to claim a request after a Client had returned. Calling claimRequest from a callback is not supported."
This can probably be resolved by splitting the callback mocks into their respective tests and also mocking claimRequest.

Add GitHub actions

We could add actions to lint the metadata.yaml file (to prevent syntax errors) and to release on tag push.

Request to sGTM Snowplow Client getting CORS blocked

After having configured a Snowplow JS Tracker with the client-GTM templates (Tag + Variables) and having pointed the endpoint to the sGTM server as explained in this chart, my POST events cannot come through as the console shows the following error:
blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource.

Obviously I also imported and set up the Snowplow client in my server-side GTM as instructed here: https://docs.snowplowanalytics.com/docs/forwarding-events-to-destinations/forwarding-events/google-tag-manager-server-side/snowplow-client-for-gtm-ss/

Did I forget something?
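For context on the error itself: a browser sends an OPTIONS preflight before a cross-origin JSON POST, and it rejects the request unless the preflight response carries matching CORS headers. The sketch below shows, under stated assumptions, what such a response would need to contain. It is not the Client's actual behaviour: `corsResponseFor` and the allow-list are hypothetical, and in a real template the result would have to be applied with the sandboxed APIs `setResponseHeader`, `setResponseStatus` and `returnResponse`.

```javascript
// Hypothetical helper: decide how to answer a request from a given Origin.
// When requests are sent with credentials, the allowed origin must be echoed
// back explicitly; a wildcard "*" is not accepted by browsers in that case.
function corsResponseFor(method, origin, allowedOrigins) {
  const allowed =
    typeof origin === 'string' && allowedOrigins.indexOf(origin) !== -1;

  const headers = allowed
    ? {
        'Access-Control-Allow-Origin': origin,
        'Access-Control-Allow-Credentials': 'true',
        'Access-Control-Allow-Methods': 'POST, OPTIONS',
        'Access-Control-Allow-Headers': 'Content-Type',
      }
    : {};

  if (method === 'OPTIONS') {
    // Preflight: answer immediately, without running the container.
    return { handled: true, status: allowed ? 204 : 403, headers: headers };
  }

  // Actual request: processing continues, with the CORS headers attached.
  return { handled: false, status: null, headers: headers };
}

const preflight = corsResponseFor('OPTIONS', 'https://www.example.com', [
  'https://www.example.com',
]);
```

In this example `preflight.status` is `204` and the `Access-Control-Allow-Origin` header echoes `https://www.example.com`.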

Optimize claiming checks for enriched requests

Currently the client checks to claim an incoming request are:

  1. if is request for JS tracker dist file
  2. if Snowplow tp2 request
  3. if Snowplow enriched request

Since the most recommended, and probably most common, setup is Destinations Hub, we could check for enriched requests first. Also, since these if conditions are mutually exclusive, we could switch to else if, i.e.:

  1. if Snowplow enriched request
  2. else-if Snowplow tp2 request
  3. else-if is request for JS tracker dist file
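The proposed ordering could look roughly like the sketch below. It is an illustration only: `classifyRequest` is a hypothetical function, the tracker file path `/sp.js` is an assumption (the real path depends on the Client configuration), and in the template the path would come from the sandboxed `getRequestPath` API before calling `claimRequest`.

```javascript
const ENRICHED_PATH = '/com.snowplowanalytics.snowplow/enriched';
const TP2_PATH = '/com.snowplowanalytics.snowplow/tp2';
const TRACKER_FILE_PATH = '/sp.js'; // assumed path, configurable in practice

// Hypothetical classifier: returns the kind of request to claim, or null
// when the request is not for this Client. The checks are mutually
// exclusive, so the most common case is tested first.
function classifyRequest(path) {
  if (path === ENRICHED_PATH) {
    return 'enriched';
  } else if (path === TP2_PATH) {
    return 'tp2';
  } else if (path === TRACKER_FILE_PATH) {
    return 'tracker-file';
  }
  return null;
}

const kind = classifyRequest('/com.snowplowanalytics.snowplow/enriched');
```

Here `kind` is `'enriched'`, and an unrelated path such as `/collect` yields `null`, leaving the request for other Clients to claim.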

data.gtmOnSuccess() handling

Remove data.gtmOnSuccess() references from the Client template, as they are not necessary – they're only used in tags.

Fix failing enriched event test

The Enriched test currently fails because the expected object has resolutions of nullxnull, an earlier bug that has already been fixed in the template.

Reduce code lines

It seems we are hitting a limit where additional lines of code (code comments do not count) result in an Invalid Template error in preview/test environments.
As a temporary solution, we could reduce the number of code lines where possible.

client unable to claim requests without a header (400 Bad Request)

I have set up GTM server side following this guide: https://aws-solutions-library-samples.github.io/advertising-marketing/using-google-tag-manager-for-server-side-website-analytics-on-aws.html

Now I have two links to access my GTM server:

Primary Server-Side Container (analytics.example.com)

Preview Server Container (preview-analytics.example.com)

I am using these parameters in the aws_ecs_task_definition setup for GTM:

PreviewContainer:

  container_definitions    = <<TASK_DEFINITION
  [
  {
    "name": "preview",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "RUN_AS_PREVIEW_SERVER",
        "value": "true"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ],

PrimaryContainer:

{
    "name": "primary",
    "image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
    "environment": [
      {
        "name": "PORT",
        "value": "80"
      },
      {
        "name": "PREVIEW_SERVER_URL",
        "value": "${var.PREVIEW_SERVER_URL}"
      },
      {
        "name": "CONTAINER_CONFIG",
        "value": "${var.CONTAINER_CONFIG}"
      }
    ]

Now I am trying to send data to GTM using Snowbridge, which runs in a Docker container on an EC2 instance. Snowbridge reads data from a Kinesis data stream and forwards it to GTM.

https://docs.snowplow.io/docs/destinations/forwarding-events/snowbridge/configuration/targets/http/google-tag-manager/

config.hcl.tmpl

source {
  use "kinesis" {
    stream_name = "${stream_name}"
    region      = "${region}"
    app_name    = "${app_name}"

    role_arn = "${role_arn}"
    read_throttle_delay_ms = 500

    # Maximum concurrent goroutines (lightweight threads) for message processing (default: 50)
    concurrent_writes = 50
  }
}

target {
  use "http" {
    url                        = "https://analytics.xx/com.snowplowanalytics.snowplow/enriched"
    request_timeout_in_seconds = 60
    content_type               = "application/json"

    # this line is optional, in case you want to send events to GTM Preview Mode
    headers                    = "{\"x-gtm-server-preview\": \"AAAAAAAXXXX==\"}"
  }
}

transform {
  use "spEnrichedToJson" {}
}

This works as expected and I am able to see incoming data in the preview mode. From my understanding, data is still being sent to the original mode but it is just being forwarded to the preview mode when this option is enabled.

When I remove this environment variable for the PreviewContainer to deactivate the Preview mode:

 {
        "name": "RUN_AS_PREVIEW_SERVER",
        "value": "true"
 },

Snowbridge still successfully sends data (as shown by the CloudWatch logs). If I am not using the Preview Server, I should no longer need the header in my Snowbridge POST requests either. However, as soon as I remove the optional header (i.e., the x-gtm-server-preview parameter), I get errors in Snowbridge's CloudWatch logs while sending data:

level=warning msg="Retrying func (attempts: 2): target.Write: Error sending http requests: 1 error occurred:\n\t* Got response status: 400 Bad Request\n\n"

I also checked the logs for the ECS service (Primary Container). As long as I have the header in my POST request, I see statements like these in my logs:

“Snowplow enriched request, claimed...”

but as soon as I remove the header, I no longer see these "claim" statements in the log.

Is it possible that the snowplow client is not designed to work without a header?

Provide control over setting the client_id in common event

Currently the client_id is populated by domain_userid or by the mobile_context's user id, if it exists. There may be events where neither exists (e.g. a server-side tracked event), which results in client_id being undefined. Since client_id is a property that downstream tags could rely on, we could provide an option to override the default client_id setting logic to better support those use cases.

Allow option to serve sp.min.js

Currently the Client offers the option to serve the sp.js file of the JS tracker. We could also offer the option to serve sp.min.js instead, which could be desirable in cases like the one described in this discourse post.

Provide control over setting client_id and user_id in common event

At the moment the client maps the uid/user_id Snowplow property to the user_id property of the common event (here and here). Sometimes this may not be desirable (for example, when using the Snowplow Client in combination with the GA4 Tag).

Even though gtm-server-side tags usually provide options to configure any mapping downstream, we could also explore providing an option to disable this mapping from the Client, while also leaving the original behaviour as default.

Edit: We could provide a common pattern to also take into account #26
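One possible shape for such a common pattern is sketched below, purely as an illustration. The option names `clientIdOverride` and `mapUserId`, the function `resolveIds` and the flattened `mobile_user_id` field are all hypothetical; the defaults are chosen so that the original behaviour is preserved when no options are passed.

```javascript
// Hypothetical helper: derive client_id and user_id for the common event.
// Default behaviour: client_id from domain_userid, then the mobile user id;
// user_id from the Snowplow user_id. Both can be adjusted through options.
function resolveIds(sp, options) {
  const opts = Object.assign(
    { clientIdOverride: undefined, mapUserId: true },
    options
  );
  const ids = {};

  // An explicit override wins, which covers events with no tracker-set id.
  const clientId =
    opts.clientIdOverride || sp.domain_userid || sp.mobile_user_id;
  if (clientId) ids.client_id = clientId;

  // The user_id mapping can be switched off, e.g. when a downstream Tag
  // should not receive it.
  if (opts.mapUserId && sp.user_id) ids.user_id = sp.user_id;

  return ids;
}

const defaults = resolveIds({ domain_userid: 'd-1', user_id: 'u-1' });
const serverSide = resolveIds(
  { user_id: 'u-1' },
  { clientIdOverride: 'server-1', mapUserId: false }
);
```

With these inputs `defaults` keeps both `client_id` and `user_id`, while `serverSide` contains only `client_id: 'server-1'`.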

Consider option to map mkt fields to GA4 expected format

This could aid forwarding Snowplow events to GAv4.
Currently the mapping can be made available to the GA4 Tag through GTM-SS Transformations, but we could consider adding a configuration option for the Client to do it (see the "Populate GAv4 Client Properties" config option).

The GAv4 equivalent fields:

image
