covid19tracking / covid-public-api-build-v2
Version 2 of the public API build tool
License: Apache License 2.0
The COVID Tracking Project was founded as the COVID pandemic first reached the US, and has provided an API from day one. This API receives millions of requests per day, and is used by large and small organizations to inform their users. Our API expands our reach and mission by providing consistently high-quality data to others.
Since March, the data we collect has undergone several big changes. We have twice as many data fields. Definitions of data that seemed solid in March have changed considerably. Some data that was one field now needs more context, or differs from state to state.
Unfortunately, so many apps use our API that changing field names breaks things for our clients. We also support two formats of data, CSV and JSON, which means we can’t have nested or structured data if we want to keep the two formats in parity. We serve data endpoints like states/daily.json that are over 6MB in size, but cannot add pagination because CSV users would miss out on that data.
We get many feature requests, such as providing data as a percentage of population or adding calculations like a 7-day rolling average. While we have built internal tools to do this on our website, we are wary of adding yet more fields that may need to change as the methods for analyzing the pandemic change or as our understanding of our own data improves.
Our proposal is to create a new, versioned API for our COVID data that improves time-to-release of data, prevents changes from breaking well-built applications, and gives space for things like computed fields.
The new API will be served from api.covidtracking.com/, while the old API at covidtracking.com/api/v1 will still be maintained and updated daily. On October 1, V1 of the API will stop receiving updates; it will remain online until January 1, 2021.
CSV files are necessary tools for researchers and the public, but they are the biggest source of issues filed about formatting problems. No modern API service delivers data in CSV format because it is a format for bulk migration of data, not real-time application messaging.
Instead, the covidtracking.com website will build CSV files for users to download from the various sections of our site. Researchers and other users will be able to use these generated CSV files to download the latest data, but these files will not include computed values. We will make a best-effort attempt to keep these files in line with the latest changes in the API.
We have been using BigQuery as a generalized datastore for non-core data, and have a public datastore of our own COVID data. Let’s add all our API data into a public BigQuery dataset that anyone can query against.
Our JSON data is currently a long JSON array of data with no structure or context. We propose standardizing all API responses based on JSONAPI:
{
  "links":{
    "self":"https://api.covidtracking.com/state/ca"
  },
  "meta":{
    "build_time":"2020-07-05T14:00:00Z",
    "data_definitions":"https://covidtracking.com/definitions/state",
    "license":"https://covidtracking.com/license",
    "version":2.1
  },
  "data":[
  ]
}
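As a rough illustration of the envelope above, the following Python sketch wraps a list of records in the `links`, `meta`, and `data` keys. The helper name `build_envelope` and its parameters are hypothetical and are not taken from the build tool; the URLs and version number are copied from the example.

```python
from datetime import datetime, timezone

API_VERSION = 2.1  # value copied from the example response above


def build_envelope(self_url, records, build_time=None):
    """Wrap records in the proposed links/meta/data envelope.

    Hypothetical helper: `build_time` is assumed to be a timezone-aware
    datetime; it defaults to the current UTC time.
    """
    if build_time is None:
        build_time = datetime.now(timezone.utc)
    # Normalize to UTC so the trailing "Z" is accurate.
    stamp = build_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "links": {"self": self_url},
        "meta": {
            "build_time": stamp,
            "data_definitions": "https://covidtracking.com/definitions/state",
            "license": "https://covidtracking.com/license",
            "version": API_VERSION,
        },
        "data": list(records),
    }
```

Keeping the envelope in one place means every endpoint reports its build time, license, and version the same way.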
We would adopt the following standards for naming and formats:
Every endpoint would provide the last time the API data was updated, a link to license and data definitions, and the API version.
All endpoints will include field definitions in the meta object. This will allow us to rename and flag fields for deprecation. Fields will include a formerly array that indicates what the field used to be named, and can be used as a fallback for applications in case a field changes its name.
Fields have an optional “unit” designation that indicates whether the field represents people or samples.
{
  "meta":{
    "field_definitions":[
      {
        "field":"cases.cases.current",
        "deprecated":false,
        "unit":"people",
        "formerly":[
          "positive",
          "positiveCurrent"
        ]
      }
    ]
  },
  "data":[
  ]
}
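A client could use the `formerly` array roughly as follows. This is a minimal Python sketch, assuming the `field_definitions` shape shown above; the function names `build_alias_map` and `resolve_field` are hypothetical.

```python
def build_alias_map(field_definitions):
    """Map every former field name to the field's current name."""
    aliases = {}
    for definition in field_definitions:
        for old_name in definition.get("formerly", []):
            aliases[old_name] = definition["field"]
    return aliases


def resolve_field(name, field_definitions):
    """Return the current name for `name`, falling back to `formerly` entries."""
    current_names = {definition["field"] for definition in field_definitions}
    if name in current_names:
        return name
    aliases = build_alias_map(field_definitions)
    if name in aliases:
        return aliases[name]
    raise KeyError(f"unknown field: {name}")


# Field definitions copied from the example above.
FIELD_DEFINITIONS = [
    {
        "field": "cases.cases.current",
        "deprecated": False,
        "unit": "people",
        "formerly": ["positive", "positiveCurrent"],
    }
]
```

An application written against an old name such as `positive` can call `resolve_field("positive", FIELD_DEFINITIONS)` to find the current field name instead of breaking when a rename ships.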
Each data element will have its own meta object that defines things like edit notes and last-update times:
{
  "data":[
    {
      "state":"CA",
      "date":"2020-04-05T00:00:00Z",
      "meta":{
        "last_update":"2020-04-06T05:00:00Z"
      }
    }
  ]
}
All endpoints will have a data array of objects. Each object can be nested to group like data elements together. Each data element will have a computed object that includes 7-day averages and computed values as a percentage of the population.
Data elements will be nested as [category].[field].values:
{
  "data":[
    {
      "state":"CA",
      "date":"2020-04-05T00:00:00Z",
      "cases":{
        "cases":{
          "current":{
            "value":400,
            "computed":{
              "average_7_day":380,
              "population_percent":0.06
            }
          },
          "cumulative":{
            "value":5000,
            "computed":{
              "population_percent":0.1
            }
          }
        }
      },
      "tests":{
        "negative":{
          "current":{
            "value":4500,
            "computed":{
              "average_7_day":4000,
              "population_percent":0.06
            }
          },
          "cumulative":{
            "value":50000,
            "computed":{
              "population_percent":2.4
            }
          }
        },
        "pending":{
          "current":{
            "value":4500,
            "computed":{
              "average_7_day":4000,
              "population_percent":0.06
            }
          },
          "cumulative":{
            "value":50000,
            "computed":{
              "population_percent":2.4
            }
          }
        }
      }
    }
  ]
}
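The proposal does not spell out how `average_7_day` or `population_percent` would be calculated. One plausible reading, sketched here in Python, takes the mean of the seven most recent daily values and expresses the latest value as a percentage of population. The function names, the two-decimal rounding, and the choice to omit the average when fewer than seven days exist are all assumptions.

```python
def average_7_day(values):
    """Mean of the seven most recent values, or None with fewer than seven."""
    if len(values) < 7:
        return None
    return sum(values[-7:]) / 7


def population_percent(value, population):
    """Value as a percentage of population, rounded to two decimals (assumed)."""
    return round(value / population * 100, 2)


def build_field(values, population):
    """Build a {"value": ..., "computed": {...}} object from a daily series.

    `values` is ordered oldest to newest; the last entry is the reported value.
    """
    latest = values[-1]
    computed = {"population_percent": population_percent(latest, population)}
    average = average_7_day(values)
    if average is not None:
        computed["average_7_day"] = average
    return {"value": latest, "computed": computed}
```

Because these numbers live under `computed`, the way they are derived can change without touching the reported `value`.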
Some fields, such as a simple total of test results, cannot be treated uniformly across all states. In these cases, we will not provide a value for the field. Instead, we will give an object that names the field holding the most complete time series (since March 2020) and the field holding the most accurate time series (where we have data for over 120 days):
[
  {
    "state": "CA",
    "date": "2020-09-01",
    "tests": {
      ...
      "total_test_results": {
        "complete_field": "tests.positive_negative",
        "preferred_field": "tests.viral.total"
      }
      ...
    }
  }
]
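A client might follow these pointers by treating each one as a dotted path into the same record. The Python sketch below assumes that interpretation; `get_path` and `total_test_results` are hypothetical names, and the sample record uses made-up values because the example above elides the fields the pointers refer to.

```python
def get_path(record, dotted_path):
    """Walk a nested dict along a dotted path such as 'tests.viral.total'."""
    node = record
    for key in dotted_path.split("."):
        node = node[key]
    return node


def total_test_results(record, which="preferred_field"):
    """Resolve the complete_field or preferred_field pointer for a record."""
    pointer = record["tests"]["total_test_results"][which]
    return get_path(record, pointer)


# Illustrative record: the numeric values are invented for this sketch.
SAMPLE_RECORD = {
    "state": "CA",
    "date": "2020-09-01",
    "tests": {
        "positive_negative": {"value": 150},
        "viral": {"total": {"value": 120}},
        "total_test_results": {
            "complete_field": "tests.positive_negative",
            "preferred_field": "tests.viral.total",
        },
    },
}
```

Calling `total_test_results(SAMPLE_RECORD, "complete_field")` selects the longer series, while the default follows `preferred_field`.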
Users who just want raw values can request endpoints that return simpler values instead by appending /simple to the URL:
[
  {
    "state":"CA",
    "date":"2020-04-05T00:00:00Z",
    "cases":{
      "cases":{
        "current":400,
        "cumulative":5000
      }
    },
    "tests":{
      "negative":{
        "current":4500,
        "cumulative":50000
      },
      "pending":{
        "current":4500,
        "cumulative":50000
      }
    }
  }
]
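The relationship between the full and simple shapes can be sketched as a recursive transform: any object carrying a `value` key collapses to that value, and everything else is kept. This Python function is an illustration under that assumption, not the build tool's actual code, and the name `simplify` is hypothetical.

```python
def simplify(node):
    """Collapse {"value": x, "computed": {...}} objects to x, recursively."""
    if isinstance(node, dict):
        if "value" in node:
            # Drop the computed block and keep only the raw value.
            return node["value"]
        return {key: simplify(child) for key, child in node.items()}
    if isinstance(node, list):
        return [simplify(item) for item in node]
    return node
```

Applied to the nested California record shown earlier, this produces the `current` and `cumulative` numbers of the simple form.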
Users are currently making multiple API calls to get state metadata and daily or current information. Instead, we can provide a single state endpoint that includes all state information, and then prepend the state metadata for each state to all state API responses.
In addition, we will add unique slug metadata fields to all states and state endpoints.
Fields currently marked as Deprecated in the V1 API will not be brought over to V2.
The new API will have the following endpoints (all prefixed by /v2/):
- /changes - A running changelog of additions and changes to the API
- /status - Information about the last build time and API health
- /fields - A list of all fields, their definitions, and long-names
- /states - A list of all states and their state metadata, same as our current State Metadata.
- /states/history - A list of all historic records for all states
- /state/[state-code] - All the state’s metadata, and their most recent data record
- /state/[state-code]/history - All the state’s metadata, and a list of all historic records for that state
- /us - The most recent record for the US
- /us/history - All the US history
We will no longer use .json at the end of endpoint URLs.
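For illustration, a client could assemble state URLs from the endpoint list along these lines. The helper `state_url` is hypothetical; lower-casing the state code and placing /simple after /history are assumptions rather than documented behavior.

```python
BASE_URL = "https://api.covidtracking.com"  # host named in the proposal
PREFIX = "/v2"  # all endpoints are prefixed by /v2/


def state_url(state_code, history=False, simple=False):
    """Build a URL for /state/[state-code], optionally with history or simple values."""
    path = f"{PREFIX}/state/{state_code.lower()}"  # lower-casing is an assumption
    if history:
        path += "/history"
    if simple:
        path += "/simple"  # assumed to come last, since it is appended to the URL
    return BASE_URL + path
```

Note that no `.json` suffix is added, in line with the new URL scheme.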
Changes to endpoints and API will be communicated through a dedicated Headway page and Twitter account. We will handle changes in fields or field definitions in a consistent manner: