mozilla / gud

Mozilla Growth & Usage Dashboard, pronounced "Good"

Home Page: https://gud.telemetry.mozilla.org

CSS 1.77% HTML 0.82% JavaScript 34.43% Dockerfile 0.23% Svelte 62.74%

gud's Introduction

Growth and Usage Dashboard

This is a light, server-powered dashboard showing the smoot growth metrics. The frontend talks to a tiny node server by passing it the segments / usage criteria / etc. necessary for the query, and the tiny web server sends the query to be run by BigQuery.

Community

Post in #gud on Slack for any other questions.

Reporting Issues and Feature Requests

Feel free to file an issue in this repository w/ questions / concerns.

Development

Dependencies:

– Node 11.5.0 / current NPM version

To install:

Make sure you have Node / npm.
Run npm install in the directory where you cloned this repository.

To run locally:

The GCP commands in these instructions will not work unless you work under Katie Parlante. If you want to run this project and you don't work under Katie Parlante, please contact Jason Thomas or Blake Imsland.

Run gcloud auth application-default login
Run gcloud config set project moz-fx-data-shared-prod
To run the server, run node server, which starts a tiny web server on port 3000 (go to localhost:3000 in your browser).
To build / update the frontend, run npm run dev, which builds the dev version of the frontend and also spins up a second web server (one we're not going to use; sorry for the redundancy here). I'll make it so you don't have to run two servers like this at some point, but this works for now!

gud's People

Contributors

Stargazers

Watchers

Forkers

wlach openjck

gud's Issues

Update desktop usage criteria

We should rename the usage criteria "Any Firefox Activity" to "Any Desktop Firefox Activity" to make clearer that it is desktop-only. I believe it is already the latter in the backend tables.

Mentions
@hamilton

clean up default error handler

migrate to a global Svelte store + immer

Once we have an updated roadmap in place, it would be worthwhile to revisit / harmonize how we're handling the server + frontend so any engineer who works on this project can follow a unified set of design principles for similar dashboards. Step 1 of that is to migrate the store handling to use immer. There are some simple patterns being built out that I'll link to here once they're written up.

Support multiple series in Explore Mode

I would like to be able to support multiple series on a single plot in Explore Mode. An example use case is, knowing that a metric is moving in some slice (say Mac OS), to see how the movement looks in key slices across another dimension (say country).

I'm open to suggestions for the UI, but perhaps we could put the dimension selectors (currently platform, country, and channel) in a tabbed pane, with a widget next to the tabs to add a new tab (starting with just one tab). Each tab would allow selection of the slice for a distinct series, and each tab could have a color corresponding to the plotted series.

An example plot with multiple series:

date ranges should rescale flexibly depending on range & size of graph

design and implement body controls

(1) move the date selector over to the top left of the body
(2) add a filters display in the body, of the form "filters: COUNTRY US x GB x"
(3) leave room for other view selection controls

Attributed / Non-Attributed should have some description to them

I've gotten questions about what "attributed" / "non-attributed" means in the interface. We have some documentation about this, but a short description in the menu itself would probably keep people from having to navigate away from GUD.

support somewhere in the UX "comparing slices"

We will also want to support the notion of e.g. us VS ca VS gb specifically, which is distinct from what is in mind for "Compare" mode, where you define slices for two distinct groups, then compare the output of those.

Change "Intensity" metric name to "Average Days Per Week"

Is your feature request related to a problem? Please describe.

Consensus that the "Intensity" name is confusing.

Describe the solution you'd like

Change name to "Average Days Per Week"

Mentions
Add mentions for anybody you'd like to make aware of this issue; likely @hamilton, @jmccrosky, @klukas, and @openjck.

Add some segments to GUD, once they are defined

This ticket is a "heads up" about a possible future request, and is not a direct request yet.

I'm working on developing some canonical user segments for desktop. The goal is to find segments that include/exclude sets of similar clients, so that we can reduce the impact of confounding variables when analysing data.

For example, we'll likely want to isolate "activated" users from non-activated users, for some definition of activated, and by studying the retention of "activated" users we can remove the effects due to bots and computer labs that create short term profiles. Or we might want to study the properties of heavy users.

We would like these segments to be available on GUD, as well as in mozanalysis, and have example queries on DTMO to help people use the common set of segments in manual queries too.

The most eccentric part of this is that I would like the freedom to iterate on the segments - so that we can start using segments soon, and tweak them as we learn what's useful. I imagine having version numbers in the names of segments until they're stable, and I imagine the old versions becoming replaced by the new versions (so presumably GUD would only need the most recent version or two, but the version number should be included in the segment name).

I am starting my search for segments by using features from clients_last_seen, and building heuristics that can be represented in a SQL SELECT expression. In the long term we might want to move past this and involve ML in deciding which clients fit which segments, but that feels a long way away and there's a lot of value we can unlock in the meantime.

Here are some example segments that give a flavour of what we might want to look at:

Users who visited/didn't visit 5 uris on a day 7-13 days before submission_date
Users who visited at least 213 URIs on submission_date
Users who visited x URIs in period y before submission_date
Users from Tier 1 countries

For each segment, we want to be able to plot MAU, DAU, retention rates, etc - the full range of metrics.

Describe the solution you'd like
When I provide some segment definitions (e.g. as a PR to bigquery-etl), I would like the GUD front end to allow people to filter graphs to include or exclude users that fit a certain segment. Some segments will come in pairs ("included by the criteria"/"excluded by the criteria"). Others may have multiple levels (e.g. "low usage"/"medium usage"/"high usage"). Comparing included/excluded users will be a common use case.

It seems like some segments will fit under "Product / usage criteria", some might fit under "Country", and others might require their own dropdown?

Describe alternatives you've considered
Still working out the main proposal, haven't got to the point of multiple alternatives yet!

Additional context
Proposal document where I guessed that implementing segments like "visited 5 uris on a day 7-13 days before submission_date" might take 1-2 weeks from the day I submit a PR to bigquery-etl, and where I pointed out that "time estimations are plucked from my gut and involved no consultations".

Mentions
@hamilton, @jmccrosky, @klukas, and @openjck.

graphs should appropriately display information about the units on y-axis

clean up / simplify query params file to contain all relevant metadata

integrate graph-paper layout components

This involves adding (1) a left drawer and (2) a content body.

migrate over storybook example of GUD from the graph-paper repository for main body

add new GUD logo

y rollover vals should match formatting of y axis

Ensure eslint + prettier are configured properly

Change name/branding

From "Smoot" to "Mozilla Growth & Usage Dashboard" or "GUD" or "MGUD" or "M-GUD" or something ;)

separate out querystring-centric kv pairs from left-side menu-specific ones

This is as simple as having two separate derived stores on the Svelte side – one that is used to compile and format the query string (as well as cache everything) and one that is used for the Explore menu set explicitly.
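
A minimal sketch of that split, written as two plain selector functions over one state object. The state shape and the names QUERY_KEYS, MENU_KEYS, pick, toQueryString, and toMenuState are hypothetical and not taken from the GUD codebase.

```javascript
// Assumed key lists: which parts of the state feed the URL, and which feed the menu.
const QUERY_KEYS = ["mode", "usage", "country", "channel"];
const MENU_KEYS = ["usage", "country", "channel"];

// Copy only the listed keys out of the state object.
function pick(state, keys) {
  return Object.fromEntries(keys.map((key) => [key, state[key]]));
}

// Selector 1: compile and format the query string from query-relevant keys only.
function toQueryString(state) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(pick(state, QUERY_KEYS))) {
    // Arrays are serialized as JSON, mirroring the country=[] style in GUD URLs.
    params.set(key, Array.isArray(value) ? JSON.stringify(value) : String(value));
  }
  return params.toString();
}

// Selector 2: exactly the values the Explore menu needs, set explicitly.
function toMenuState(state) {
  return pick(state, MENU_KEYS);
}
```

On the Svelte side, each selector could presumably be wrapped as derived(store, toQueryString) and derived(store, toMenuState), so UI-only state never leaks into the URL.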

move to hash-based routing for views

This entails moving from ?mode=explore to /#explore, for instance. Each of these hash-based views is likely to have different query params / other possible routes that need to be split up.
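
A rough sketch of how a hash could be split into a view name plus that view's own params. The function name parseHashRoute and the default view are assumptions for illustration.

```javascript
// Split a location.hash value such as "#explore?country=US" into a view and its params.
function parseHashRoute(hash, defaultView = "explore") {
  const trimmed = hash.replace(/^#\/?/, ""); // drop a leading "#" or "#/"
  const [path, query = ""] = trimmed.split("?");
  const view = path.replace(/\/$/, "") || defaultView; // tolerate a trailing slash
  return { view, params: Object.fromEntries(new URLSearchParams(query)) };
}
```

In the browser this would typically be called with window.location.hash inside a hashchange listener, with each view then validating only the params it understands.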

put "coming soon" placeholders for the compare & table views

implement a multiselector

This will be used for dimensions such as country (where a user might want to select multiple countries for aggregation / comparison).

integrate query hitting telemetry.smoot_usage_all_v1 into server.js

The query

SELECT
  `date`,
  usage,
  SUM(dau) AS dau,
  SUM(wau) AS wau,
  SUM(mau) AS mau,
  SAFE_DIVIDE(SUM(active_days_in_week),
    SUM(wau)) AS intensity,
  SAFE_DIVIDE(SUM(active_in_week_1),
    SUM(new_profiles)) AS retention_1_week_new_profile,
  SAFE_DIVIDE(SUM(active_in_weeks_0_and_1),
    SUM(active_in_week_0)) AS retention_1_week_active_in_week_0
FROM
  telemetry.smoot_usage_all_v1
WHERE true
  AND `date` = '2019-05-01'
GROUP BY
  usage,
  `date`
ORDER BY
  1, 2

was presented to me. Let's make sure everything is understood easily before using this.
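
One possible shape for the server.js integration, sketched as a pure function that returns the query text plus a named parameter rather than splicing the date into the SQL. The function name buildUsageQuery is hypothetical, and the CAST assumes the `date` column has type DATE.

```javascript
// Build the options for one day of smoot usage metrics.
// The date travels as a named parameter (@date) instead of being concatenated into SQL.
function buildUsageQuery(date) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`expected a YYYY-MM-DD date, got: ${date}`);
  }
  // Assumption: `date` is a DATE column, so the string parameter is cast explicitly.
  const query = `
    SELECT
      \`date\`,
      usage,
      SUM(dau) AS dau,
      SUM(wau) AS wau,
      SUM(mau) AS mau,
      SAFE_DIVIDE(SUM(active_days_in_week), SUM(wau)) AS intensity,
      SAFE_DIVIDE(SUM(active_in_week_1), SUM(new_profiles))
        AS retention_1_week_new_profile,
      SAFE_DIVIDE(SUM(active_in_weeks_0_and_1), SUM(active_in_week_0))
        AS retention_1_week_active_in_week_0
    FROM telemetry.smoot_usage_all_v1
    WHERE \`date\` = CAST(@date AS DATE)
    GROUP BY usage, \`date\`
    ORDER BY 1, 2`;
  return { query, params: { date } };
}
```

With the @google-cloud/bigquery client, the returned object could presumably be handed to bigquery.query(buildUsageQuery("2019-05-01")), which accepts a query string and a params map.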

re-implement SingleSelector with new selection components

Now that we have MultiSelector, we can easily reuse the CSS to implement the single selector with the same style.

Allow focus on a single metric

We would like to support showing just a single metric using the full content pane (to make the graph as large as possible). This could mean suppressing rendering of the other graphs, or it could be some sort of zoom function, in which case the other metric graphs would be just a scroll away.

Dates are off by one

Describe the bug
The dates attached to values in GUD appear to be one day before the submission_date associated with the values.

To Reproduce
Steps to reproduce the behavior:

Go to https://growth-stage.bespoke.nonprod.dataops.mozgcp.net/?endDate=2020-03-01&mode=explore&usage=Any%20Firefox%20Desktop%20Activity&attributed=%5B%5D&metric=all&os=%5B%5D&language=%5B%5D&country=%5B%5D&channel=%5B%5D&startDate=2020-02-02
Mouse over the last data point on the WAU graph. Even though the end date is specified as 2020-03-02, the last point shown in the mouseover is marked as 2020-03-01
Note the WAU value shown for 2020-03-01 and compare to the query below; it exactly matches the value in the query for submission_date = '2020-03-01'

SELECT
  submission_date,
  COUNTIF(days_since_seen < 7) AS wau
FROM
  `moz-fx-data-shared-prod.telemetry.clients_last_seen`
WHERE
  submission_date >= '2020-03-01'
GROUP BY
  1
ORDER BY
  1

Expected behavior
The dates shown on the graphs should match the submission dates associated with the values.

Mentions
@hamilton, @jmccrosky

Date Picker does not freeze axes before transitions apply

re-implement the jackknife calculations on the client

redesign the GUD url scheme

GUD needs a shorter, easier-to-use URL scheme. This could open the door to much more expressive querying and utility on the client and server sides, including arbitrary comparisons between sets of query params. I'm sure this is reinventing someone's wheel, but if done right we might be able to go another few years without a more involved server component for GUD.

A few improvements:

(1) remove empty kv pairs such as country=[]. We can infer if it is empty that a default value will be set.
(2) consider a short value for each key: a single alphanumeric character, for instance US=>u. This is more meaningful when we can reduce something like All Firefox Desktop Activity to x. These alphanumeric shorts are unique to the dimension or metric. a-zA-Z0-9 contains 62 values, more than enough to represent almost all these dimensions going forward. In the case of usage criteria, a dimension which could go beyond 62 values, we could easily use two alphanumerics, yielding 3,844 values, or go with three (238,328 values) just to be safe. In any case the reduction will still be pretty considerable, and throwing out a delimiter like a comma here keeps the length short.
(3) dimension names can be short and dependent on the usage criterion specifically, reducing even further. If we follow (2) above, then we can make any dimension listing delimited by something like -, leaving something like the full country specification to be Cugb0, where C means country, and the rest of the alphanumerics represent individual countries.
(4) we can leave in startDate and potentially other view filters as-is, since they are not specific to the data itself. For dates, we could easily have sd414, representing number of days since jan 01 2015 or something like that. We can also change something like mode=explore to just be a hash-route.

examples of compression

1

?startDate=2017-06-17&endDate=2020-04-04&mode=explore&usage=Any Firefox Desktop Activity&attributed=[]&metric=all&os=[]&language=[]&country=[]&channel=[] (153 chars)

#explore/?sd905&ed1032&v=e&q=Ufda (33 chars, ~21% the size of original)

2

/?startDate=2017-06-17&endDate=2020-04-04&mode=explore&usage=Any Firefox Desktop Activity&attributed=["TRUE"]&metric=all&os=["Windows_NT"]&language=[]&country=["DE"%2C"GB"%2C"US"]&channel=[] (190 chars)

#/explore/?sd905&ed1032&q=Ufda-AL-Ow-CdgU (41 chars, ~21% of original)
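
A small sketch of the building blocks behind (1), (2), and (4). Everything here is an assumption for illustration: the 2015-01-01 epoch, the codebook contents, and all function names. The numbers it produces depend on the epoch chosen and will not necessarily match the sd / ed values in the examples above.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// (1) Drop keys whose value is an empty array; the default is implied by absence.
function dropEmptyParams(params) {
  return Object.fromEntries(
    Object.entries(params).filter(([, v]) => !(Array.isArray(v) && v.length === 0))
  );
}

// (2) How many distinct values fit in a short code of the given length.
function codeCapacity(length) {
  return ALPHABET.length ** length;
}

// (2)/(3) Encode one dimension as a prefix letter plus one character per value.
function encodeDimension(prefix, values, codebook) {
  const codes = values.map((value) => {
    if (!(value in codebook)) throw new Error(`no short code for ${value}`);
    return codebook[value];
  });
  return prefix + codes.join("");
}

// (4) Encode a YYYY-MM-DD date as whole days since an (assumed) epoch.
// Date-only ISO strings are parsed as UTC, so the difference is a whole number of days.
function daysSinceEpoch(isoDate, epoch = "2015-01-01") {
  return Math.round((Date.parse(isoDate) - Date.parse(epoch)) / MS_PER_DAY);
}
```

For instance, encodeDimension("C", ["DE", "GB", "US"], { DE: "d", GB: "g", US: "U" }) yields the CdgU fragment used in the second example, and codeCapacity(3) gives the three-character headroom mentioned in (2).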

clean up menu selectors

The menu selectors as they are right now were half-implemented to get a POC together. Using the select HTML element, however, is obviously limiting, so we will need to implement a radiobox-like thing (a multi-select dropdown).

generalize menu selection to be per-criterion

All dimension options should be specified by the usage criterion. Default sets (such as OS / channel for Firefox Desktop criteria) should be easy to specify.

each selector requires a tooltip explaining what it is

have a way of clearing all query params in frontend

the date selector is fiddly

implement usage criterion-specific channel specifications (eg for fenix / fennec)

This should be fairly easy to implement. In options.json, for the usage criteria listed below, we should put a new channels option (instead of the flag that disabled the channels) with an array of string values:

Fennec Android: release, beta, Other, nightly, and aurora
Fennec iOS: release, Other, and beta
Focus Android: release, nightly, beta, and Other

Then the channel selector should easily be able to read these values from options.json.
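
A sketch of what that lookup might look like. The structure shown for options.json (a usageCriteria array with key and channels fields) is a guess for illustration and may not match the real file; channelsFor is a hypothetical helper.

```javascript
// Hypothetical excerpt of options.json; the real file's shape may differ.
const options = {
  usageCriteria: [
    { key: "Fennec Android", channels: ["release", "beta", "Other", "nightly", "aurora"] },
    { key: "Fennec iOS", channels: ["release", "Other", "beta"] },
    { key: "Focus Android", channels: ["release", "nightly", "beta", "Other"] },
  ],
};

// Return the channel list for a usage criterion, or [] when none is declared.
function channelsFor(usageKey, opts = options) {
  const criterion = opts.usageCriteria.find((c) => c.key === usageKey);
  return criterion && Array.isArray(criterion.channels) ? criterion.channels : [];
}
```

Returning an empty array for criteria without a channels entry would let the selector hide itself without needing a separate disabling flag.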

ensure querystring -> store update logic works

Update documentation links

The in-product documentation links should point to the new DTMO documentation instead of the old Google doc. https://docs.telemetry.mozilla.org/tools/gud.html

Mentions
@hamilton

Can't seem to generate graphs using dimensions other than country

For example: http://localhost:3000/?mode=explore&usageCriteria=any-desktop&platform=mac&country=ID&channel=all results in an "Uh oh!" error message after a long hang on "Loading Data".

"reset all" not working properly due to recent store changes

delay running query until user hits a "run query" button?

Now that we have a MultiSelector, it's a bit too easy to start, say, 5-6 queries by selecting a bunch of countries. I could see this both causing the server to bottleneck with a bunch of extraneous queries & also possibly cost a bunch of money. Here are some options:

(1) delay running the query for k seconds: this allows for other quick selections before query automatically runs
(2) require user to hit a "run query" button: this would allow a user to select a whole bunch of things, then hit "run query" or something like that. In theory we could also have a "cancel changes" button if they want to go back to the last run query. There is probably a history component that could be used to page between the different (cached) queries a user has run.
(3) forget about this entirely, since this never ends up being that expensive: I think Jeff could answer this question better than I could. If the tables we're querying against are relatively small and cheap, then let's not sweat it too much. The node server is mostly awaiting Promises to resolve, so there isn't much additional weight against the server, I think.
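
Option (1) amounts to debouncing the query trigger. A minimal sketch, with a hypothetical debounce helper whose timer functions are injectable so it can be exercised without real delays:

```javascript
// Delay calling fn until waitMs have passed without another call.
// Only the arguments of the most recent call are used.
function debounce(
  fn,
  waitMs,
  timers = {
    set: (callback, ms) => setTimeout(callback, ms),
    clear: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;
  return (...args) => {
    if (handle !== null) timers.clear(handle); // cancel the previously scheduled run
    handle = timers.set(() => {
      handle = null;
      fn(...args);
    }, waitMs);
  };
}
```

Wrapping the function that fires the query in debounce(runQuery, k * 1000), where runQuery is whatever currently triggers the request, would mean selecting five countries in quick succession results in one query rather than five.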

automatically pull in all mozilla product information (release dates, etc.)

we can use https://product-details.mozilla.org/1.0/all.json to get all release dates for major / minor versions of FF, allowing us to provide additional information in both the date picker and also in the graphs.
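
A sketch of extracting major release dates once the JSON has been fetched. It assumes the payload has a top-level releases object whose entries carry product, category, version, and date fields; that shape should be checked against the live file, and majorReleaseDates is a hypothetical helper.

```javascript
// Pull {version, date} pairs for major releases of one product, oldest first.
// Assumed payload shape: { releases: { "<id>": { product, category, version, date } } }
function majorReleaseDates(details, product = "firefox") {
  return Object.values(details.releases || {})
    .filter((release) => release.product === product && release.category === "major")
    .map((release) => ({ version: release.version, date: release.date }))
    .sort((a, b) => a.date.localeCompare(b.date)); // ISO dates sort correctly as strings
}
```

The resulting list could then be overlaid as markers on the graphs or offered as shortcuts in the date picker.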

investigate more server-side caching methods, eg redis

fix footer contents + links

Date picker does not disappear after clicking "Apply" button

Describe the bug
Date picker does not disappear after clicking "Apply" button

To Reproduce
-click on date range
-optionally, modify date range
-click "Apply" in date picker

Expected behavior
I would expect this to cause the date picker to disappear

@hamilton

graphs should appropriately scale the magnitude

right now, graphs do not scale down the y-axis magnitude in a human-friendly way.
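
One way this could look, as a sketch: a single hypothetical formatter, formatMagnitude, that both the y-axis ticks and the rollover values could share. The suffixes and the one-decimal default are assumptions, not a decided format.

```javascript
// Format a number with a k / M / B suffix, dropping a trailing ".0".
function formatMagnitude(value, digits = 1) {
  const units = [
    { limit: 1e9, suffix: "B" },
    { limit: 1e6, suffix: "M" },
    { limit: 1e3, suffix: "k" },
  ];
  const abs = Math.abs(value);
  for (const { limit, suffix } of units) {
    if (abs >= limit) {
      return (value / limit).toFixed(digits).replace(/\.0+$/, "") + suffix;
    }
  }
  return String(value); // small values (e.g. ratios) pass through unchanged
}
```

The built-in Intl.NumberFormat with notation set to "compact" is another option where browser support allows.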

Implement date selectors

We would like to have a date range picker to allow the graphs to display a limited date range.

figure out the authentication requirements for BigQuery

We hope to send queries from the small node proxy server to BigQuery, returning the results to the frontend. I'm not sure what the requirements here are.

add shortcuts / hotkeys / interactions guide

There are actually some great features we can easily implement w/ graph-paper around shift+click, click & drag, etc that would enable all sorts of great ways of comparing things. We have some of this today in GUD but it is invisible to the end user.

We can show a list of commands or a button that pops up a command list.

support somewhere in the UX "aggregated slices"

We will need to support (somewhere) the idea of aggregated slices on the frontend, e.g. I want us+ca+gb, and I'd like to see this as a single line.
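
A sketch of the aggregation itself, assuming the frontend holds rows shaped like { date, country, dau } (a hypothetical shape) and that aggregateSlices is a helper we would add. Summing is only appropriate for count metrics such as DAU; ratio metrics would need their numerators and denominators summed separately before dividing.

```javascript
// Combine several country slices into one series by summing a count metric per date.
function aggregateSlices(rows, countries, metric = "dau") {
  const wanted = new Set(countries);
  const totals = new Map();
  for (const row of rows) {
    if (!wanted.has(row.country)) continue;
    totals.set(row.date, (totals.get(row.date) || 0) + row[metric]);
  }
  return [...totals.entries()]
    .sort(([a], [b]) => a.localeCompare(b)) // ISO dates sort correctly as strings
    .map(([date, value]) => ({ date, [metric]: value }));
}
```

Calling aggregateSlices(rows, ["US", "CA", "GB"]) would then produce the single us+ca+gb line.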

Replace "FirefoxConnect" string with "Firefox for Echo Show"

FirefoxConnect is listed as one of the products; Roland asked me what FirefoxConnect was and I had to do a Github code search to find out. :)