splunk / corona_virus

This project includes an app that allows users to visualize and analyze information about COVID-19 using data made publicly-available by Johns Hopkins University. For more information on legal disclaimers, please see the README.

License: Apache License 2.0

CSS 1.51% Shell 4.89% HTML 16.02% Python 77.58%

corona_virus's Introduction

Coronavirus App

This is a set of dashboards for analyzing the Corona Virus using Splunk.

Contributors to this app: Ryan O'Connor, Miranda Luna, Caleb Dyck, Anthony Barbato, Giovanni Mola

The Splunk Corona Virus dashboard provided in this GitHub repo is an informational tool provided by Splunk without charge to all those who are working to understand and combat COVID-19. The dashboard is intended for informational purposes only and relies entirely on data provided by various third-parties including, inter alia, Johns Hopkins University and any information entered by the user. https://github.com/CSSEGISandData/COVID-19. This dashboard is not for commercial use and is intended and should be used to provide background and context on the evolving COVID-19 situation. Splunk disclaims any and all representations and warranties with respect to the dashboard, including accuracy, fitness for use and merchantability.

Installing the App

This app should be installed directly into $SPLUNK_HOME/etc/apps. You simply clone the app directly into that directory and it will be self-contained. Please see the install instructions below.

You must use the git clone method for this app to work properly. See Cloning this App

App Requirements

  • This app currently is supported on Linux only.

  • For the Confirmed Cases/Locations Overlay dashboard to load optimally, please ensure you have the Maps+ App installed from Splunkbase.

  • Please also ensure you are installing using the git clone method below.

Cloning this App

This package depends on a submodule from here: https://github.com/CSSEGISandData/COVID-19 which is the main source of data for the Coronavirus. As a result, when you run git clone, please add the --recurse-submodules parameter after the clone. So for example:

git clone --recurse-submodules https://github.com/splunk/corona_virus.git

This will ensure the required submodule is cloned into the correct directory inside of the app. Once you have cloned the app, please restart Splunk.

Dashboard Information

  1. Coronavirus
    1. This is a static analysis of the Coronavirus.
  2. covid-19 Patterns & Trends
    1. This is the same dashboard as the one publicly available on https://covid-19.splunkforgood.com
  3. Coronavirus - Timelapse
    1. This is a timelapse of the Coronavirus from the first day it was detected until the current day.
  4. Confirmed Cases/Locations Overlay
    1. This dashboard overlays locations of your choosing with confirmed cases of COVID-19. By default, it uses U.S. state capitals as an example, but you can modify locations.csv to fit your own purposes.

Lookup Table Updating

confirmed.csv, recovered.csv, and deaths.csv

There is a scripted input inside of this app that is enabled by default. You can find it in the GUI under Settings > Data Inputs > Scripts, listed as "update_git.sh", and re-enable it there if it has been turned off. This script pulls the latest CSSE time series data from JHU and creates a symbolic link to each file in the $SPLUNK_HOME/etc/apps/corona_virus/lookups directory.
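
The symlink step described above can be sketched in Python as follows. This is only an illustration, not the contents of update_git.sh: the function name link_lookups, the mapping from lookup names to source file names, and the demo data are all assumptions made for the example.

```python
import os
import tempfile

def link_lookups(source_dir, lookups_dir, names):
    """Create or refresh one symlink in lookups_dir per (link_name, source_file) pair.

    Hypothetical helper for illustration; not part of the app.
    """
    os.makedirs(lookups_dir, exist_ok=True)
    created = []
    for link_name, source_file in names.items():
        target = os.path.join(source_dir, source_file)
        link = os.path.join(lookups_dir, link_name)
        if not os.path.isfile(target):
            continue  # skip files that upstream renamed or removed
        if os.path.islink(link) or os.path.exists(link):
            os.remove(link)  # drop a stale link so it points at the current file
        os.symlink(target, link)
        created.append(link_name)
    return created

# Demo in a throwaway directory with made-up data.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "time_series")
    os.makedirs(src)
    with open(os.path.join(src, "time_series_19-covid-Confirmed.csv"), "w") as f:
        f.write("Country/Region,3/9/20\nExampleland,10\n")
    linked = link_lookups(src, os.path.join(root, "lookups"), {
        "confirmed.csv": "time_series_19-covid-Confirmed.csv",
        "deaths.csv": "time_series_19-covid-Deaths.csv",  # absent in this demo
    })
    print(linked)
```

Skipping missing source files instead of failing is a design choice for the sketch; it keeps the remaining links usable when an upstream file is renamed.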

By default, this scripted input sends its output to index=main sourcetype=git_update_corona. You can use this index/sourcetype to find out when the latest update to the Coronavirus git repository took place.

A search to find out when the JHU Git repository was last updated would look like the following:

index=main sourcetype=git_update_corona _raw!="*Already up to date.*" 
| head 1 
| eval time=strftime(_time,"%m/%d/%Y %H:%M:%S") 
| table time

An example update would look like this:

2020-03-09 20:59:32	Entering 'git/COVID-19'

Updating 382bda4..473681f
Fast-forward
 .../time_series_19-covid-Confirmed.csv             | 541 ++++++++++-----------
 .../time_series_19-covid-Deaths.csv                | 541 ++++++++++-----------
 .../time_series_19-covid-Recovered.csv             | 541 ++++++++++-----------
 3 files changed, 801 insertions(+), 822 deletions(-)

combined_jhu_us_daily.csv

This file contains US State Level Data including Hospitalizations and Tests. It is a combination of all previous CSSE US Daily Reports. I have also added the script I use to generate it, titled merge_us.py.

combined_jhu.csv

This file contains US State Level Data, County Level Data, and is a combination of all previous CSSE Daily Reports.

I've added a script that I use to merge all of the daily reports into one massive csv file. This can be used to get historical State Level time series data once again. I will keep this file up to date as often as JHU provides daily reports. It is a lookup table called combined_jhu.csv. If you'd like to update it on your own or explore the methodology I use to merge, I am providing more details below.

Background of this file

As mentioned in the documentation:

"This package depends on a submodule from here: https://github.com/CSSEGISandData/COVID-19 which is the main source of data for the Coronavirus."

Methodology for creating combined_jhu.csv

Our merge.py file simply takes the daily reports provided by the aforementioned GitHub repository (specifically the CSSE Daily Reports) and appends all of them onto one another. The only adjustment is standardizing the field names. For example, in some files we have "Province/State" and in others we have "Province_State". Standardizing the field names allows us to search on a common field such as "Country" across all dates, as far back as JHU provides daily reports. I also add a column called "file_name" so you can determine which file each record came from.

This methodology creates a massive CSV with all of the following fields:

  • "Latitude"
  • "Longitude"
  • "Country"
  • "State"
  • "Deaths"
  • "Confirmed"
  • "Recovered"
  • "County"
  • "FIPS"
  • "file_name"

Note: Not all of these fields are filled out for all daily reports. For example, County level data only started coming in very recently. Those fields will be blank for some files.
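
As a rough illustration of the append-and-rename methodology, here is a minimal sketch using only the Python standard library. It is not the actual merge.py (which uses pandas): the function name merge_reports, the RENAMES map of header variants, and the sample rows are assumptions made for the example.

```python
import csv
import io

# Hypothetical rename map: header variants -> standardized field names.
RENAMES = {
    "Province/State": "State", "Province_State": "State",
    "Country/Region": "Country", "Country_Region": "Country",
    "Lat": "Latitude", "Long_": "Longitude",
    "Admin2": "County",
}
FIELDS = ["Latitude", "Longitude", "Country", "State", "Deaths",
          "Confirmed", "Recovered", "County", "FIPS", "file_name"]

def merge_reports(reports):
    """Append rows from (file_name, csv_text) pairs under standardized headers."""
    merged = []
    for file_name, text in reports:
        for row in csv.DictReader(io.StringIO(text)):
            out = dict.fromkeys(FIELDS, "")  # fields missing in a file stay blank
            for key, value in row.items():
                name = RENAMES.get(key, key)
                if name in out:
                    out[name] = value
            out["file_name"] = file_name  # record which file the row came from
            merged.append(out)
    return merged

# Two made-up reports with different header layouts.
reports = [
    ("01-22-2020.csv",
     "Province/State,Country/Region,Confirmed,Deaths,Recovered\nA,X,1,0,0\n"),
    ("03-23-2020.csv",
     "FIPS,Admin2,Province_State,Country_Region,Lat,Long_,Confirmed,Deaths,Recovered\n"
     "1,C,A,X,1.0,2.0,5,1,0\n"),
]
rows = merge_reports(reports)
for r in rows:
    print(r["file_name"], r["Country"], r["State"], repr(r["County"]), r["Confirmed"])
```

Rows from the older layout end up with blank County, FIPS, Latitude, and Longitude values, which matches the note above about fields being empty for some files.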

To Update combined_jhu.csv on your own

This script requires pandas, which does not ship with Splunk today. However, you can run the script via cron on a Linux machine to keep the file up to date. Please note that the daily reports only come in once per day, and I will be updating this file once per day, so you don't need to do this part. But for the sake of being open about our process, I've provided the information below.

* * * * * SPLUNK_HOME="/opt/splunk" /usr/bin/python /opt/splunk/etc/apps/corona_virus/bin/merge.py

Change Record

07/22/20

  • Changes are now mostly daily updates to combined_jhu.csv and combined_jhu_us_daily.csv. Any major changes will be documented in the Change Record, but daily updates will not.

04/04/20

  • Updated combined_jhu.csv
  • Resolved issue with table on main app. Ended up cutting over to using my combined_jhu.csv which is a much easier format to work with.
  • resolved issue #16

04/03/20

  • Updated combined_jhu.csv

04/02/20

  • Updated combined_jhu.csv

04/01/20

  • Updated combined_jhu.csv

3/30/20

  • Updated combined_jhu.csv with the latest daily reports.

3/29/20

  • Updated combined_jhu.csv with the latest daily reports.
  • Formatted some of the main dashboard panels to work in the timelapse dashboard. Some of them include a bubble chart and also an updated table.

3/28/20

  • Updated combined_jhu.csv with the latest daily reports.

3/27/20

  • Updated combined_jhu.csv with the latest daily reports.
  • Fixed the map overlay dashboard with some fancy new colors. SPL help courtesy of Scott Haskell.
  • Updated README to provide more information.
  • Modified main dashboard to use ISO8601 Timestamps in all timeseries charts

3/26/20

  • Added a python script called "merge.py" which you can use to merge all of the Daily reports into one massive csv file. This allows for US State level Data once again.
  • Going to be keeping a lookup table called combined_jhu.csv up to date for people to use. This will be a combination of whatever daily csse reports that are posted publicly.
  • Tried to correct some of the renames of the files that JHU made this week so the symbolic links should be up to date.
  • Updated Dashboards Beta to contain annotations on Area Chart

3/24/20

  • Added a scripted input to take the latest daily report from JHU and symlink it to a lookup table called update_daily.csv

3/23/20

  • Updated app to conform to changes in the JHU Data Repository. See the following link for more information
  • Major callout in this update is the removal of Recovered data with the hopes of more granular county level insights coming soon.

3/19/20

  • Merged in the public Splunk Dashboards Beta into the app. So people downloading this app can look at both Simple XML Dashboards, as well as the new dashboards beta dashboard.
  • Fixed a couple bugs to make the main Simple XML Dashboard Mobile Friendly
  • Turned the scripted input on by default to make setup easier for everyone

corona_virus's People

Contributors

antontsv, back2root, ryanwoconnor

corona_virus's Issues

can't run on local Splunk Enterprise on windows

I cloned the repo as suggested under the apps folder. The application comes up and I can view the dashboard, but no data is available. I noticed that the .sh script does a git pull, which does not seem to work after enabling the input. So I ran it manually from the command line. The data is downloaded and updated, but it still does not show up as described in the documentation.

Let me know if you have any quick tips to enable this to run on windows.

Wrong numbers on Main Dashboard "Coronavirus"

Problem Description

The Dashboard "Coronavirus" displays wrong numbers compared to the numbers you would expect and that are recorded within the CSVs from JHU.
For example the Dashboard records 0 deaths for Italy.

Some more examples: [screenshot omitted]

Error

The different numbers for confirmed cases, deaths and recovered people originate from the different CSV files.
Searches within the dashboard combine these numbers using appendcols like:
| appendcols [ | inputlookup ... ]

Correlating events like this will produce a valid result only if the different datasets are sorted the same way and contain the exact same number of rows.
For the three datasets this is not the case.
The different datasets get sorted, but they differ in size. As a result, death and recovery figures end up attached to the wrong rows.
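
To make the failure mode concrete, here is a small Python sketch that contrasts positional pairing (the way appendcols lines up row 1 with row 1) with a key-based join. The country figures are made up for illustration and the function names are hypothetical.

```python
# Made-up figures; the deaths dataset has no row for Germany.
confirmed = [("Germany", 100), ("Italy", 900), ("Spain", 500)]
deaths = [("Italy", 60), ("Spain", 20)]

def paste_columns(left, right):
    """Pair rows purely by position, ignoring the country key."""
    out = []
    for i, (country, cases) in enumerate(left):
        value = right[i][1] if i < len(right) else None
        out.append((country, cases, value))
    return out

def join_on_country(left, right):
    """Pair rows by country name, leaving None where no match exists."""
    lookup = dict(right)
    return [(country, cases, lookup.get(country)) for country, cases in left]

pasted = paste_columns(sorted(confirmed), sorted(deaths))
joined = join_on_country(confirmed, deaths)

print(pasted)  # Germany receives Italy's deaths, Italy receives Spain's
print(joined)  # each country keeps its own figure
```

Both inputs are sorted before pasting, yet one missing row is enough to shift every later value. In SPL, a key-based approach such as a lookup or join on the country field might avoid this, though the right choice depends on the dashboard searches.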

Solution

Rework the searches...

how do you set up the source type?

I've installed this and updated from git but the dashboard is blank.

I don't use splunk often so what part of the process is missing?

Missing KML Overlay for Dashboard "Confirmed Cases/Location Overlay"

Problem Description

The Dashboard "Confirmed Cases/Location Overlay" is configured to use a kml overlay file which is not provided with the app.

<option name="leaflet_maps_app.maps-plus.kmlOverlay">EarthPointExcel_054900.kml</option>

Error

This throws an HTTP 404 error, and the information from the KML file is not shown on the map.

GET https://.../en-US/static/app/leaflet_maps_app/visualizations/maps-plus/contrib/kml/EarthPointExcel_054900.kml 404
Failed to load resource: the server responded with a status of 404 ()

Solution

Provide the referenced kml file with the app

docs: methodology on data

Great stuff! I wish I could use it for my dashboard Pandemic Estimator but for that your API lacks documentation on methodology. I'm using JHU directly and I know what chaos it is, the most blatant example being that they provide "cumulative data" that's not cumulative quite often in practice. And the whole change of file formats, etc.

Can you please describe methodology in readme how you deal with it, especially what's going on inside merge.py file? What has been omitted, what has been "adjusted" and how? Thank you!

Testing

Describe the bug
A clear and concise description of what the bug is. Testing @ryanwoconnor

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

Smartphone (please complete the following information):

  • Device: [e.g. iPhone6]
  • OS: [e.g. iOS8.1]
  • Browser [e.g. stock browser, safari]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

messages flooding console

Describe the bug
Viewing covid-19 or other dashboards on https://covid-19.splunkforgood.com/dashboard_hub
floods the console with messages that repeat with each autoupdate.js call, thus increasing memory usage in the browser.

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'Open dashboard, such as covid-19'
  2. Open whatever development tool in browser
  3. Check the console output
  4. See messages like the following:

main.d14d1855.chunk.js:1 Current version is 92b13f7
16:21:50.411 11.0380a9aa.chunk.js:2 Failed to calculate layout scale: containerWidth= 0 , width= 1800 ; falling back to scale=1
l.warn @ 11.0380a9aa.chunk.js:2
16:21:50.909 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'seriesColors' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:50.909 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'data.fieldShowList' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:50.909 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'data.fieldHideList' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:50.909 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'legend.labels' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:50.909 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'fieldColors' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:51.686 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'seriesColors' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:51.686 static/js/1.2d69059b.chunk.js:2 The option type 'string' for 'fieldColors' has been deprecated.
l.warn @ static/js/1.2d69059b.chunk.js:2
16:21:52.400 static/js/19.28086849.chunk.js:2 Theme property "LEGEND_FONT_COLOR" not available in default theme
16:21:52.400 static/js/19.28086849.chunk.js:2 Theme property "LEGEND_UNHIGHLIGHT_COLOR" not available in default theme
16:21:52.424 static/js/19.28086849.chunk.js:2 drawing a chart with dimensions: Object
16:21:52.424 static/js/19.28086849.chunk.js:2 drawing a chart with properties: Object
16:21:52.425 static/js/19.28086849.chunk.js:2 drawing a chart with data: Object
16:21:52.440 static/js/19.28086849.chunk.js:2 Theme property "AXIS_LABELS_FONT_COLOR" not available in default theme
16:21:52.440 static/js/19.28086849.chunk.js:2 Theme property "AXIS_TITLE_FONT_COLOR" not available in default theme
16:21:52.440 static/js/19.28086849.chunk.js:2 Theme property "AXIS_GRID_LINE_COLOR" not available in default theme
16:21:52.442 static/js/19.28086849.chunk.js:2 Theme property "AXIS_LABELS_FONT_COLOR" not available in default theme
16:21:52.442 static/js/19.28086849.chunk.js:2 Theme property "AXIS_TITLE_FONT_COLOR" not available in default theme
16:21:52.442 static/js/19.28086849.chunk.js:2 Theme property "AXIS_GRID_LINE_COLOR" not available in default theme
16:21:52.454 static/js/19.28086849.chunk.js:2 config object to be sent to highcharts: Object
16:21:53.033 static/js/19.28086849.chunk.js:2 Theme property "LEGEND_FONT_COLOR" not available in default theme
16:21:53.033 static/js/19.28086849.chunk.js:2 Theme property "LEGEND_UNHIGHLIGHT_COLOR" not available in default theme
16:21:53.109 static/js/19.28086849.chunk.js:2 drawing a chart with dimensions: Object
16:21:53.109 static/js/19.28086849.chunk.js:2 drawing a chart with properties: Object
16:21:53.110 static/js/19.28086849.chunk.js:2 drawing a chart with data: Object
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_LABELS_FONT_COLOR" not available in default theme
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_TITLE_FONT_COLOR" not available in default theme
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_GRID_LINE_COLOR" not available in default theme
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_LABELS_FONT_COLOR" not available in default theme
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_TITLE_FONT_COLOR" not available in default theme
16:21:53.113 static/js/19.28086849.chunk.js:2 Theme property "AXIS_GRID_LINE_COLOR" not available in default theme
16:21:53.125 static/js/19.28086849.chunk.js:2 config object to be sent to highcharts: Object

Expected behavior
Minimal console logging.

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Tried in Chrome and Firefox

earliest and latest elements in corona_virus.xml

corona_virus.xml contains several components with earliest and latest specified, while they actually rely on the base search. See also warnings when opening the xml source in splunk web.

I am also a bit puzzled by the usefulness of the 10s refresh, since the data only updates once a day.

Suggest Detection Rate Metric

Professor Gabriel Leung, who led the SARS pandemic response in Hong Kong, recently estimated that the death rate is a good indicator of the overall true infection rate.
Source: https://www.youtube.com/watch?v=Y7nZ4mw4mXw

The true infection count could be calculated as:

  • lower threshold = (number of deaths) * 80
  • upper threshold = (number of deaths) * 100

Using this count we can calculate the delta (likely asymptomatic carriers) using the upper/lower thresholds and the currently detected cases to determine if we are detecting (testing) enough people.

  • Asymptomatic Delta = ((upper threshold - detected cases) + (lower threshold - detected cases)) / 2
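
The proposed calculation can be written out as a short Python sketch. The function name infection_estimate and the sample inputs are assumptions for illustration; the multipliers 80 and 100 come from the thresholds listed above.

```python
def infection_estimate(deaths, detected_cases, low_factor=80, high_factor=100):
    """Return (lower threshold, upper threshold, asymptomatic delta)."""
    lower = deaths * low_factor
    upper = deaths * high_factor
    # Average gap between each threshold and the detected case count.
    delta = ((upper - detected_cases) + (lower - detected_cases)) / 2
    return lower, upper, delta

# Made-up inputs: 50 deaths and 1000 detected cases.
lower, upper, delta = infection_estimate(deaths=50, detected_cases=1000)
print(lower, upper, delta)  # 4000 5000 3500.0
```

The delta is simply the midpoint of the two thresholds minus the detected cases, so a negative value would mean detected cases exceed the estimated range.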

Problem Sorting Countries by Total Reported Cases

Sorting the countries with the most reported cases is done with "|addtotals | sort 0 - Total" on the lookup of confirmed cases. Each value in each date field is already the cumulative number of cases reported for that country since the start of the pandemic. For example, China had 81,340 cases on March 26 and 81,285 cases on March 25. Adding these two numbers is meaningless: addtotals returns not the total number of cases reported in the country so far, but the sum of the daily cumulative totals. The sort should instead use the maximum cumulative count across the date fields for each country.
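
The difference between the two orderings can be shown with a small Python sketch. The country names and counts below are made up for illustration.

```python
# Made-up cumulative case counts per day for two hypothetical countries.
series = {
    "A": [10, 50, 100],  # late, fast growth
    "B": [80, 90, 95],   # early outbreak, now flat
}

# Summing cumulative values (what addtotals does) rewards early outbreaks.
by_sum = sorted(series, key=lambda c: sum(series[c]), reverse=True)

# Taking the maximum gives the actual total reported so far.
by_max = sorted(series, key=lambda c: max(series[c]), reverse=True)

print(by_sum)  # ['B', 'A'] because 265 > 160
print(by_max)  # ['A', 'B'] because 100 > 95
```

Country A has more reported cases in total, yet the sum-based sort ranks B first because its counts were high for more days.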

Auto deleting when splunk restarts

Hi,

First of all, thank you for this. Can I ask what is the problem behind the deletion of the content of the app except "metadata" folder after Splunk restarts? Thank you!

Regards,
Raj

Static Proto for Map tiles in Dashboard "Confirmed Cases/Location Overlay"

Problem Description

The Dashboard "Confirmed Cases/Location Overlay" is configured in a way that it loads map tiles via http protocol.

<option name="leaflet_maps_app.maps-plus.mapTile">http://{s}.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}.png</option>

Error

In case Splunk is configured with https encryption this causes a mixed content error:

Mixed Content: The page at '' was loaded over HTTPS, but requested an insecure image ''. This content should also be served over HTTPS.
confirmed_cases_location_overlay?form.country=:1 Mixed Content: The page at 'https://.../en-GB/app/corona_virus/confirmed_cases_location_overlay?form.country=' was loaded over HTTPS, but requested an insecure image 'http://c.basemaps.cartocdn.com/dark_all/6/14/24.png'. This content should also be served over HTTPS.

Solution

Define the map tile source without protocol. This will cause the browser to use the protocol that is used for loading the web page.

<option name="leaflet_maps_app.maps-plus.mapTile">//{s}.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}.png</option>
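
For anyone patching several dashboards at once, the same rewrite can be expressed as a tiny Python helper. This is a sketch only; the function name protocol_relative is hypothetical and not part of the app.

```python
import re

def protocol_relative(url):
    """Strip a leading http: or https: so the browser reuses the page's scheme."""
    return re.sub(r"^https?:", "", url, count=1)

tile = "http://{s}.basemaps.cartocdn.com/dark_all/{z}/{x}/{y}.png"
print(protocol_relative(tile))
```

URLs that are already protocol-relative pass through unchanged.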

Suggest/consider state/county dashboard data inputs

Once state & county time-series data are reliably present... one should be able to integrate state and county data inputs for dashboards. Defaults for data inputs could even be set so that states could use this app for a state-specific view (by default).

I'm wondering if anyone is working on such an enhancement already. I'm interested and willing to contribute... probably ready to pursue this in next few days.
