leaderanalytics / vyntix.fred.fredclient

Easy-to-use client for accessing Federal Reserve Bank of St. Louis FRED® API

License: Other

C# 100.00%
fred alfred economic-data vintage fredapi fred-api

Contributors: leaderanalytics, sam-wheat

vyntix.fred.fredclient's Issues

Opportunities to improve the FRED API

From: Sam Wheat
Sent: Thursday, January 5, 2023 8:03 PM
To: [email protected]

Subject: Opportunities to improve the FRED API

As I expand my usage of the API I continue to find inconsistencies that increasingly support my suspicion that the API is built on concepts that are themselves correct but are misapplied throughout the API. I also believe the API returns data that is incorrect, both by my definition and as defined by the current published documentation.

I apologize for the verbosity of this email. It's been nearly four years since I raised my original questions and I have not received a clarifying response from St. Louis Fed. Given that my prior communications have failed to garner attention, I feel it is incumbent on me to provide additional evidence that improvements can and should be made. I also hope to participate in suggesting a path forward.

I am confident that the concepts I present below are well known and understood by all of you. My purpose in articulating them is to demonstrate detailed examples where I believe the API diverges from known practice.

My background in economics is limited but my experience as a software developer is extensive (see samwheat.com). If you would like to further discuss the issues I've raised from the perspective of a developer, please feel free to contact me. I will be glad to discuss these issues via phone or Zoom at your convenience.

This email is a draft for several articles I will be writing for my personal blog. My goal is to share my experience using the API and to assist developers such as myself with learning and using the API effectively. Your feedback is requested, valued, and appreciated. It will be shared with the developer community via my blog as well as relevant social media sites.

Real-time periods versus Vintage dates

Definition of real-time periods

The FRED documentation defines real-time periods as follows:

"The real-time period marks when facts were true or when information was known until it changed".

This definition is accurate but far from complete. Specifically, it fails to identify several criteria that separate the concept of a real-time period from a vintage date.

Real-time periods are:

• A period of time when facts were true or when information was known until it changed.
• Delimited by any arbitrary historical range of dates/times. Real-time periods are unrelated to and unconstrained by any vintage date.
• User defined. Real-time periods are conceived by the user. They are not stored in any FRED database or computed by the API.
• Not identifiers. No unit of information that is available from the API can be identified by a real-time period by itself or as part of a compound key.
• Inputs to queries. Real-time start and end dates are compared to vintage dates to determine if a data element was valid during the user defined real-time period.

Definition of vintage dates

The FRED documentation defines vintage dates as follows:

"Vintage dates are the release dates for a series excluding release dates when the data for the series did not change."

This definition is accurate but also incomplete.

Vintage dates are:

• The moment in time when information is released.
• Not arbitrary. With respect to a given series, not every day in history is a vintage date. Only the dates when new information is released are vintage dates.
• Mark a moment in history when an event occurred and as such are immutable historical facts.
• Not user defined - vintage dates are historical records. Their values are stored in the FRED database.
• Identifiers - a vintage date uniquely identifies a release of data for a given series.
• Outputs from queries - Vintage dates are compared to real-time dates to determine if a data element was valid during the user defined real-time period. If the vintage is valid within the real time period the observation is returned in the result set. It is identified by its vintage date.

Having defined real-time periods and vintage dates we can see that both concepts share two common characteristics: 1.) both identify moments in time and 2.) both can be expressed as calendar date/times (e.g. Jan 1 1980 2:30pm). The fact that these attributes are shared does not in any way give license to use these two concepts interchangeably. They are fundamentally different and must always be used and identified correctly.

It is possible that the term "Real-time" has some other ordained usage in the domain of economics. If that is the case, then my argument still stands and different terminology should be used in this context. In fact, I will suggest that a term such as "User defined period of interest" is much more meaningful to the common man than Real-time. While lacking the mystique enjoyed by Real-time, a term such as "User defined period of interest" is practical and clear in purpose. For example, it is tempting to say "Vintage dates delimit the real-time period when facts were known." This is a misapplication of concepts and is incorrect. The correct statement is "Vintage dates coincide with a real-time period of matching dates when facts were known". If we use the term "User defined period of interest" the missing component becomes more apparent: "Vintage dates delimit the user defined period of interest when facts were known." What user defined period of interest would that be?

Functional Requirement

If we accept the preceding concepts and definitions as correct, we can use them to construct a requirement that describes how the API should function. We can also create criteria for determining if the data that is returned by the API is correct:

Requirement 1. Query inputs (parameters) are user defined real-time periods. The API may assist the user in constructing real-time periods that coincide with Vintage dates but behind the scenes the logic to limit data to a real-time period is the same.

Requirement 2. Data elements returned by the API are identified by valid Vintage dates. Vintage dates are labeled as vintage dates. Real-time start and end dates are not used as identifiers.

Requirement 3. Vintage dates are never interchangeable with real-time periods.
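The three requirements can be illustrated with a minimal Python sketch. It is not how the FRED API is implemented. The function name `vintages_in_effect` and the input shape (a mapping from vintage date to value for one observation date) are assumptions made for illustration. The sample values are the GDP figures for 1975-01-01 quoted later in this email, including the 1991-12-04 vintage date given there. The real-time period is only an input, and every returned row is identified by its vintage date.

```python
from datetime import date

def vintages_in_effect(vintages, period_start, period_end):
    """Return (vintage_date, value) pairs that were in effect at any point
    within the user defined real-time period.

    `vintages` maps each vintage date to the value released on that date
    for a single observation date (hypothetical input shape).
    """
    ordered = sorted(vintages.items())
    result = []
    for i, (vintage_date, value) in enumerate(ordered):
        # A vintage remains in effect until the next vintage supersedes it.
        superseded_on = ordered[i + 1][0] if i + 1 < len(ordered) else date.max
        if vintage_date <= period_end and superseded_on > period_start:
            result.append((vintage_date, value))
    return result

# GDP values for the 1975-01-01 observation, as quoted in this email.
gdp_1975 = {date(1991, 12, 4): "1512.7", date(1991, 12, 20): "1513.6"}
rows = vintages_in_effect(gdp_1975, date(1991, 12, 15), date(1992, 1, 15))
for vintage_date, value in rows:
    print(vintage_date.isoformat(), value)
```

The first vintage predates the period but is still returned, because it was the value known when the period began.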

Testing the current implementation against the requirement

If we accept the preceding requirements as correct we can use them to test the validity of the API and the correctness of the documentation as it exists now:

Documentation

"Economic data sources, releases, series, and observations are all assigned a real-time period."

This statement is incorrect. The correct statement is "...data sources, releases, series, and observations are all assigned a vintage date." (Requirement 2)

"Sources, releases, and series can change their names, and observation data values can be revised."

This statement is incomplete and ambiguous. The correct statement is "Historical sources, releases, and series data are immutable. When a new vintage is created, Sources, releases, and series can change their names, and observation data values can be revised." (Requirement 2)

Documentation

Despite the clearly identified output formats, all data returned by this API is in fact identified by vintage date (as all data provided by FRED should be). The documentation in the section for Observations by Real-Time Period confirms this assertion:

"The real-time period start date defines the first vintage date for which a data value is the latest revision available. The real-time period end date defines the last vintage date for which a data value is the latest revision available."

It is confusing to the user that despite choosing to receive data by Real-Time Period, there is no facility for the user to input a real-time period. How the API determines a real-time period (one that differs from a real-time period corresponding to vintage dates) is unknown.

Documentation

"Sometimes it may be useful to enter a vintage date that is not a date when the data values were revised."

This is an invalid instruction. A vintage date is an identifier. When a user requests data using an invalid identifier the API should return an error. (Requirement 2)

Consider the following sentence: "Sometimes it may be useful to enter a series identifier for a series that is not maintained by FRED." This instruction is functionally equivalent to the instruction provided for vintage dates. It is equally nonsensical. If a user requests observations for series 123XYZ the API will return an error because 123XYZ is an invalid series identifier. Why then, should the API not return an error when an invalid vintage identifier is requested? When the user requests vintages that are valid between a range of dates that they define they should use the real-time start and real-time end date parameters as previously defined.
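The behavior argued for above, treating a vintage date as an identifier and rejecting unknown ones, could look like the following Python sketch. The names `UnknownVintageError` and `require_vintage_dates` are hypothetical and are not part of the FRED API. The list of known dates is the NROU vintage list quoted in the correspondence further down this page.

```python
class UnknownVintageError(ValueError):
    """Raised when a requested vintage date is not a known vintage (hypothetical)."""

def require_vintage_dates(requested, known):
    """Return the requested dates unchanged, or raise if any is not a known vintage."""
    known_set = set(known)
    unknown = [d for d in requested if d not in known_set]
    if unknown:
        raise UnknownVintageError("not a vintage date: " + ", ".join(unknown))
    return list(requested)

# NROU vintage dates as returned by fred/series/vintagedates in March 2019.
NROU_VINTAGES = [
    "2011-02-02", "2012-01-31", "2012-08-22", "2013-02-05", "2014-02-04",
    "2014-08-27", "2015-01-26", "2015-08-25", "2016-01-25", "2017-01-24",
    "2017-06-29", "2018-04-09", "2018-08-13", "2019-01-28",
]

print(require_vintage_dates(["2011-02-02"], NROU_VINTAGES))
try:
    require_vintage_dates(["2011-03-16"], NROU_VINTAGES)
except UnknownVintageError as exc:
    print("rejected:", exc)
```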

Functional Inconsistency

The following request supplies no real-time start or end dates so the API (correctly) assumes the current date (2023-01-04) as the real-time period:

Request:

https://api.stlouisfed.org/fred/series/observations?series_id=GDP&api_key=123&observation_start=1975-01-01&observation_end=1975-01-01

Response:

 <observations realtime_start="2023-01-04" realtime_end="2023-01-04" observation_start="1975-01-01" observation_end="1975-01-01" units="lin" output_type="1" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
    <observation realtime_start="2023-01-04" realtime_end="2023-01-04" date="1975-01-01" value="1616.116"/>
</observations>

The data above is incorrect because the realtime_start date is used to identify the vintage instead of the vintage date, which is 2018-07-27 (Requirements 2/3). Note that the realtime_start and realtime_end dates are reported in the response header. The correct response is:

<observation vintage_date="2018-07-27" date="1975-01-01" value="1616.116"/>

Functional Inconsistency

The following request specifies realtime_start and realtime_end dates that span multiple vintages. The realtime_start and realtime_end dates do not exactly match any vintage dates:

Request:

https://api.stlouisfed.org/fred/series/observations?series_id=GDP&api_key=123&observation_start=1975-01-01&observation_end=1975-01-01&realtime_start=1991-12-15&realtime_end=1992-01-15

Response:

<observations realtime_start="1991-12-15" realtime_end="1992-01-15" observation_start="1975-01-01" observation_end="1975-01-01" units="lin" output_type="1" file_type="xml" order_by="observation_date" sort_order="asc" count="2" offset="0" limit="100000">
    <observation realtime_start="1991-12-15" realtime_end="1991-12-19" date="1975-01-01" value="1512.7"/>
    <observation realtime_start="1991-12-20" realtime_end="1992-01-15" date="1975-01-01" value="1513.6"/>
</observations>

The response shown above is arguably the clearest example of how the API confuses real-time dates with vintage dates. Note that realtime_start and realtime_end dates are correctly reported in the header of the response. This is where they belong as they are inputs to a query (Requirement 1). The data elements, however, are badly constructed:

1991-12-15 is a random, arbitrary historical date. It does not mark the start or end of any vintage. Nothing related to GDP happened on this day.
1991-12-20 is a meaningful historical date. It is a vintage date. On this day in history a revision to GDP was released.

These two dates are profoundly different - yet the API reports them both as realtime_start dates and the user is left to wonder if either, both, or neither of them are of any significance. (Requirement 2/3) The correct response is:

<observation vintage_date="1991-12-04" date="1975-01-01" value="1512.7"/>
<observation vintage_date="1991-12-20" date="1975-01-01" value="1513.6"/>

When the data is reported this way the user receives accurate, useful data. The definitions of real-time versus vintage dates are respected.
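A client can approximate this corrected output today by relabeling each row. The Python sketch below assumes the caller has already fetched the sorted vintage dates for the series (for example from fred/series/vintagedates). The helper name `vintage_for` is hypothetical. It maps a reported realtime_start to the most recent vintage date on or before it, using the two GDP vintage dates quoted above.

```python
from bisect import bisect_right

def vintage_for(realtime_start, vintage_dates):
    """Return the latest vintage date on or before `realtime_start`.

    `vintage_dates` must be ISO-8601 strings sorted ascending, so string
    order matches chronological order. Returns None if none qualifies.
    """
    i = bisect_right(vintage_dates, realtime_start)
    return vintage_dates[i - 1] if i > 0 else None

gdp_vintages = ["1991-12-04", "1991-12-20"]  # dates quoted in this email
print(vintage_for("1991-12-15", gdp_vintages))  # arbitrary period start
print(vintage_for("1991-12-20", gdp_vintages))  # an actual vintage date
```

Under this mapping, the arbitrary date 1991-12-15 resolves to the vintage that was in effect on that day, and a true vintage date resolves to itself.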

Example of Expected Functionality

Example request:

https://api.stlouisfed.org/fred/series/observations?series_id=GDP&api_key=123&realtime_start=1980-01-01&realtime_end=2010-01-01

Expected response (abbreviated):

<observations realtime_start="1980-01-01" realtime_end="2010-01-01" >
    <observation vintage_date="1975-01-01" date="1975-01-01" value="1"/>
    <observation vintage_date="1985-01-01" date="1975-01-01" value="2"/>
    <observation vintage_date="2000-01-01" date="1975-01-01" value="3"/>
</observations>

Data formatting issues

This page describes four output types:

1 = Observations by Real-Time Period
2 = Observations by Vintage Date, All Observations
3 = Observations by Vintage Date, New and Revised Observations Only
4 = Observations, Initial Release Only

It turns out that output type is somewhat of a misnomer as the setting actually controls both the format of output and the actual values that are returned.

Problem 1: Output type 1 (Observations by Real-Time Period) returns mixed data with respect to all observations vs. new and revised observations.
Steps to reproduce:
Submit this request:

https://api.stlouisfed.org/fred/series/observations?series_id=gdp&vintage_dates=2022-09-29,2022-10-27&output_type=1&file_type=json&api_key=123

The first vintage returned contains all observations while the second vintage returns only new and revised observations. This mixed result is functionally useless in all cases except two:
1.) The user requests every vintage for the series. This will return all observations for the first vintage and new/revisions only for every subsequent vintage. Unfortunately this option is impractical as many series have too many vintages to include in a single request.
2.) The user includes only one vintage date per request. This allows the user to at least obtain a consistent result - every response from the API will include all observations for the series. However, if the user wants only new/revised data this option will not work.
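One possible workaround is to rebuild a complete snapshot on the client. The Python sketch below assumes each row carries realtime_start and realtime_end values that bound when its value was current, as in the output type 1 responses shown elsewhere in this email. The function name `observations_as_of` is hypothetical. The sample rows are illustrative: they use the two GNPCA values for 2018-01-01 discussed later on this page, with the interval boundaries derived from the release dates given there.

```python
def observations_as_of(rows, vintage_date):
    """Return {observation_date: value} for rows whose interval covers `vintage_date`.

    Dates are ISO-8601 strings, so string comparison is chronological.
    """
    return {
        row["date"]: row["value"]
        for row in rows
        if row["realtime_start"] <= vintage_date <= row["realtime_end"]
    }

# Illustrative rows; interval boundaries are assumptions derived from release dates.
sample_rows = [
    {"realtime_start": "2019-03-28", "realtime_end": "2019-07-25",
     "date": "2018-01-01", "value": "18815.882"},
    {"realtime_start": "2019-07-26", "realtime_end": "9999-12-31",
     "date": "2018-01-01", "value": "18897.8"},
]
print(observations_as_of(sample_rows, "2019-05-01"))
print(observations_as_of(sample_rows, "2019-07-26"))
```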

Problem 2: Output type 3 returns unparsable json/xml: { "date":"2017-01-01","GDP_20220929":"19148.194"}
Steps to reproduce:
Submit this request:

https://api.stlouisfed.org/fred/series/observations?series_id=gdp&vintage_dates=2022-09-29,2022-10-27&output_type=3&file_type=json&api_key=123

I am not going to quibble over whether the json/xml is invalid per the spec. I can tell you that most deserializers can not handle this format elegantly as the vintage date column can not be mapped to a property on a statically defined object.
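To illustrate the mapping problem, here is a minimal Python sketch that unpivots the wide format into one record per observation and vintage. It assumes the column name is always the series id, an underscore, and the vintage date as YYYYMMDD, which is what the sample above suggests. The function name `unpivot_output_type_3` is hypothetical.

```python
import json
from datetime import datetime

def unpivot_output_type_3(observations, series_id):
    """Turn rows like {"date": ..., "GDP_20220929": ...} into flat records."""
    prefix = series_id + "_"
    records = []
    for row in observations:
        for key, value in row.items():
            if not key.startswith(prefix):
                continue  # skip "date" and any unrelated keys
            vintage = datetime.strptime(key[len(prefix):], "%Y%m%d").date()
            records.append({
                "date": row["date"],
                "vintage_date": vintage.isoformat(),
                "value": value,
            })
    return records

sample = json.loads('[{"date":"2017-01-01","GDP_20220929":"19148.194"}]')
records = unpivot_output_type_3(sample, "GDP")
print(records)
```

A dynamically typed language can do this in a few lines. A statically typed deserializer needs custom handling because the property name changes with every vintage.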

Suggestion:
I suggest a new output type be introduced with the behavior of output type 3 and data format of output type 1. This can be a non-breaking change that gives the user both a consistent result and a usable format.

Nuget Package Install Error (LeaderAnalytics.Caching)

Thanks for this great project.

I am running into the following issue trying to install the nuget.

When I run dotnet add [myproject.csproj] package LeaderAnalytics.Vyntix.Fred.FredClient --version 1.0.14-beta.1, I receive the following error message:

error: NU1102: Unable to find package LeaderAnalytics.Caching with version (>= 1.0.15)
error: - Found 3 version(s) in nuget.org [ Nearest version: 0.0.30 ]
error: Package 'LeaderAnalytics.Vyntix.Fred.FredClient' is incompatible with 'all' frameworks in project

Correspondence with FRED

The emails below are my correspondence with the FRED web team related to questions I had about real-time dates versus vintage dates. The FRED technician who responded did not identify themself.

If you are wondering if you should spend time reading this the answer is "Probably not." The FRED technician who responded seemed more confused than I was about how the API is supposed to work. When pressed to answer the question "... under what circumstances would the real-start/end columns show a date that is not a vintage date?" the reply was "Never". The answer is correct however the API returns dates that are not vintage dates all the time. Even so, the tech was unable or unwilling to acknowledge why this is true.

The last email from FRED shown below is their last correspondence with me. They stopped communicating so I had no choice but to give up on them. I concluded that the API is simply wrong. Instead of repeating my questions I formulated my own definitions and requirements and sent them to FRED. You can and should read that email here. Having done that, I was able to move forward with writing Vyntix.FredClient with clarity.

I post this correspondence because I want to explain and justify the reasons why Vyntix.FredClient does not simply pass along values that are returned by the FRED API. I also want definitive answers on the questions I raised to FRED. If you have an opinion on the subject you are welcome to share it here.

Emails are copy/pasted from my email client so read from the bottom up.


From: Sam Wheat
Sent: Saturday, August 20, 2022 2:23 PM
To: [email protected]
Subject: API still returning incorrect data for over three years

I've been trying to write a dotnet client for Fred for a while now. I've been stuck on this issue for the last three years. I am really hoping you guys can fix it.

This API returns invalid vintage dates:

https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2011-03-16

I made an argument for the way the API should work in the emails below. The argument is summarized in these two statements:

Incorrect statement from someone at St. Louis Fed: "All dates (i.e. single days) on the whole real-time time line are vintages dates"

Correct statement from Fred documentation: "Vintage dates are the release dates for a series excluding release dates when the data for the series did not change." See https://research.stlouisfed.org/docs/api/fred/series_vintagedates.html.

If you respond to this email kindly include your name.

Thank you,


From: Sam Wheat
Sent: Wednesday, May 22, 2019 6:23 PM
To: [email protected] [email protected]
Subject: FW: API Question

Hi Christian,

I am an independent software developer. Over the last six years I have been writing a desktop application that will allow users to use and interact with the valuable vintage data provided by the St. Louis Fed. I have some questions about the data and the FRED API which I have addressed to the general info email address at St. Louis Fed. The responses I've received (shown in the email thread below) do not answer my questions, do not reconcile with the documentation, or do not reconcile with data returned by the API.

I would like to escalate my questions within the St. Louis Fed organization. I hope you are the correct person to escalate these issues to. If not, would you kindly escalate this email to the correct person(s)?

My questions are:

1.) Why does the API ever return a Realtime_Start (or Vintage_Start) date that does not correspond to a valid vintage date?
If we get a list of valid vintage dates from this API: https://api.stlouisfed.org/fred/series/vintagedates?series_id=NROU&api_key=x shouldn't we be able to expect that every real-time start or vintage start from this api: https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2011-03-16 will exist in the list of valid vintage dates?
More generally - the API appears to work inconsistently as is demonstrated by my original email as shown below.

2.) What is the practical difference between Realtime_Start/Realtime_End and Vintage_Start/Vintage_End? The explanation of Real-Time periods found here seems no different than the description of Vintages found here. Are these concepts redundant?

My original email is shown at the very bottom of this email. Replies are shown as you scroll up.

Thank you very much for your insight and assistance!!

Regards,

Sam Wheat


From: Sam Wheat
Sent: Friday, March 22, 2019 3:48 PM
To: STLS FRED
Subject: Re: Re: FW: Re: FW: Re: FW: API Question

"Vintage dates are the release dates for a series excluding release dates when the data for the series did not change."

This statement seems very straightforward and your comments in your last email do not conflict with it or clarify it. In fact, your comments confirm it means what I think it means. What, specifically, is not clear or confusing about the above statement? Would you please re-phrase it exactly as it should appear in its correct form?

Based on your responses to my questions thus far, there does not appear to be anything abstract about vintage dates. If on 2019-03-22 a new (and therefore different) observation is released for NROU then 2019-03-22 is a vintage date. More specifically, if that release is the only one in March 2019, then 2019-03-20 is NOT a vintage date, nor is 2019-03-23.

If I make a call to https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2019-03-22 then I expect to see observations that are valid and current as of that vintage date. I would further expect to see each realtime_start and realtime_end date set to a valid vintage date (2019-03-22), based on your statement "A real-time period starts with a vintage date and ends with a vintage date".

If I make a call to https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2019-03-20 then I would expect the api to return a null set because 2019-03-20 does not mark the start of a realtime period. This all seems fairly simple to grasp - what am I missing?

Thank you for your replies.


From: STLS FRED [email protected]
Sent: Friday, March 22, 2019 1:23 PM
To: sam.wheat
Subject: RE: Re: FW: Re: FW: Re: FW: API Question

NONCONFIDENTIAL // EXTERNAL

The documentation for the fred/series/vintagedates says:

"Get the dates in history when a series' data values were revised or new data values were released. Vintage dates are the release dates for a series excluding release dates when the data for the series did not change."

I admit this is not clear\confusing and should be corrected.

The most gradual notion of time in real-time in the FRED API is a single day or vintage date. The whole universe of real-time is formed from vintage dates. Think of vintage dates as dots or instants on the real-time time line. Real-time periods are intervals on the real-time time line defined by a starting vintage date and an ending vintage date. Release dates are publicly announced dates when the series on a release as a group have observations that are updated. Sometimes on a release date the observations for a particular series don't change- no observations revise and no new observations are initially released.

The fred/series/vintagedates endpoint returns the subset of vintage dates when the observations actually changed.


From: Sam Wheat
Sent: Friday, March 22, 2019 2:08 PM
To: STLS FRED [email protected]
Subject: [External] Re: Re: FW: Re: FW: API Question

NONCONFIDENTIAL // EXTERNAL

PLEASE NOTE: This email is not from a Federal Reserve address.
Do not click on suspicious links. Do not give out personal or bank information to unknown senders.

"All dates (i.e. single days) on the whole real-time time line are vintages dates."

I don't understand that at all...................

Are you saying that this statement is incorrect?:

"Vintage dates are the release dates for a series excluding release dates when the data for the series did not change."

https://research.stlouisfed.org/docs/api/fred/series_vintagedates.html
St. Louis Fed Web Services: fred/series/vintagedates
Federal Reserve Bank of St. Louis, One Federal Reserve Bank Plaza, St. Louis, MO 63102
research.stlouisfed.org


From: STLS FRED [email protected]
Sent: Friday, March 22, 2019 11:51 AM
To: sam.wheat
Subject: RE: Re: FW: Re: FW: API Question

NONCONFIDENTIAL // EXTERNAL
Sam

All dates (i.e. single days) on the whole real-time time line are vintages dates. For a given vintage date, observations may or may not change. Periods are intervals (e.g. '1 month') attached to a specific place on a time line (e.g. 2000-01-01 to 2000-01-31). Periods are defined by a start and end date. Real-time periods exist on the whole real-time time line and can have start and end dates anywhere on this time line.

Given that "A real-time period starts with a vintage date and ends with a vintage date" under what circumstances would the real-start/end columns show a date that is not a vintage date?

Never.


From: Sam Wheat [mailto:sam.wheat]
Sent: Wednesday, March 20, 2019 9:12 PM
To: STLS FRED [email protected]
Subject: [External] Re: Re: FW: API Question

NONCONFIDENTIAL // EXTERNAL

Thanks for your reply.

The vintage dates returned from the fred/series/vintagedates requests are the distinct real-time start dates for a series' observations.

A real-time period starts with a vintage date and ends with a vintage date.

In the example I provide below 2011-03-16 is returned as a realtime_start date.
Question: Is 2011-03-16 a vintage date for the series in the example provided in my original email?

If yes, why does it not appear on the distinct real-time start dates returned by fred/series/vintagedates?
If no, how does the result shown support the statement "A real-time period starts with a vintage date and ends with a vintage date."?
Same question phrased differently: Given that "A real-time period starts with a vintage date and ends with a vintage date" under what circumstances would the real-start/end columns show a date that is not a vintage date?


From: STLS FRED [email protected]
Sent: Wednesday, March 20, 2019 11:54 AM
To: sam.wheat
Subject: RE: Re: FW: API Question

NONCONFIDENTIAL // EXTERNAL

A vintage date is any day on the whole real-time timeline. A vintage date is not necessarily a day when data revised. The fred/series/vintagedates request returns a subset of vintage dates- only the dates when observations values change because these are the interesting dates. A real-time period starts with a vintage date and ends with a vintage date.

The FRED/ALFRED relational database schema that stores revisions does not use standard foreign keys. Real-time periods start and end dates are stored per observation. The vintage dates returned from the fred/series/vintagedates requests are the distinct real-time start dates for a series' observations. Instead of using foreign keys, triggers are used to check that all dates within a real-time period are contained by real-time periods in other tables. In this way, these triggers are stricter than foreign keys because not only the real-time start and end dates are checked.

For more on the database concepts used to store real-time revisions, read chapters 1-7 in Developing Time-Oriented Database Applications in SQL by Richard T. Snodgrass at:

http://www2.cs.arizona.edu/~rts/tdbbook.pdf


From: Sam Wheat [mailto:sam.wheat]
Sent: Wednesday, March 06, 2019 8:04 PM
To: STLS FRED [email protected]
Subject: [External] API Question

NONCONFIDENTIAL // EXTERNAL

Hello,

I have a question about this API:
https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2011-03-16

A vintage with a realtime_start of 2011-03-16 does not exist as shown in the vintage_date list below.

Calling the api with a single realtime_start of 2011-03-16 returns data that neither started nor ended on 2011-03-16 (per the vintage_date list) however the realtime_start and realtime_end dates are set to that date.

Calling the api with two invalid vintage_dates results in data being returned with realtime_start dates that correspond to actual vintage_dates.

I would expect the behavior to be as follows:
I think of a vintage_date as a foreign key so if I pass a foreign key that does not exist I would expect the api to return no data.

In all cases where a vintage_date is passed in and the api returns data I would expect the realtime_start date to always correspond to a valid vintage_date.

At the very least the behavior of api with respect to the two examples below is confusing. Why does the api work this way?

Valid vintage dates:
https://api.stlouisfed.org/fred/series/vintagedates?series_id=NROU&api_key=x

<vintage_dates realtime_start="1776-07-04" realtime_end="9999-12-31" limit="10000" offset="0" sort_order="asc" count="14" order_by="vintage_date">
<vintage_date>2011-02-02</vintage_date>
<vintage_date>2012-01-31</vintage_date>
<vintage_date>2012-08-22</vintage_date>
<vintage_date>2013-02-05</vintage_date>
<vintage_date>2014-02-04</vintage_date>
<vintage_date>2014-08-27</vintage_date>
<vintage_date>2015-01-26</vintage_date>
<vintage_date>2015-08-25</vintage_date>
<vintage_date>2016-01-25</vintage_date>
<vintage_date>2017-01-24</vintage_date>
<vintage_date>2017-06-29</vintage_date>
<vintage_date>2018-04-09</vintage_date>
<vintage_date>2018-08-13</vintage_date>
<vintage_date>2019-01-28</vintage_date>
</vintage_dates>
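For readers who want to work with this list in code, a response of this shape can be read with Python's standard library. This is a small sketch: the function name `parse_vintage_dates` is hypothetical and the embedded XML is an abbreviated copy of the response above.

```python
import xml.etree.ElementTree as ET

def parse_vintage_dates(xml_text):
    """Return the vintage dates from a fred/series/vintagedates XML response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("vintage_date")]

# Abbreviated copy of the response shown above.
SAMPLE_XML = """
<vintage_dates realtime_start="1776-07-04" realtime_end="9999-12-31" count="14">
<vintage_date>2011-02-02</vintage_date>
<vintage_date>2012-01-31</vintage_date>
<vintage_date>2012-08-22</vintage_date>
</vintage_dates>
"""

nrou_vintages = parse_vintage_dates(SAMPLE_XML)
print(nrou_vintages)
print("2011-03-16" in nrou_vintages)
```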

EXAMPLE 1:

Passing one invalid vintage date results in a dataset being returned with a realtime_start that does not match any vintage date in the list above.
https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2011-03-16

EXAMPLE 2:

Passing two invalid vintage dates returns data (presumably) within the range, and the realtime_start dates are set correctly.
https://api.stlouisfed.org/fred/series/observations?series_id=NROU&api_key=x&vintage_dates=2011-01-16,2012-02-04
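The expectation stated in this email, that every realtime_start in a response corresponds to a valid vintage date, can be checked mechanically. The Python sketch below assumes observations have been parsed into dictionaries. The function name `non_vintage_starts` is hypothetical, and the sample rows are illustrative rather than actual API output: their observation dates are made up, and only the realtime_start of 2011-03-16 reflects the behavior described in Example 1.

```python
def non_vintage_starts(observations, vintage_dates):
    """Return the sorted realtime_start values that are not known vintage dates."""
    known = set(vintage_dates)
    return sorted({row["realtime_start"] for row in observations} - known)

# First three NROU vintage dates from the list in this email.
known_vintages = ["2011-02-02", "2012-01-31", "2012-08-22"]

# Illustrative rows only; observation dates are hypothetical.
example_rows = [
    {"realtime_start": "2011-03-16", "date": "1949-01-01"},
    {"realtime_start": "2011-03-16", "date": "1949-04-01"},
]
print(non_vintage_starts(example_rows, known_vintages))
```

An empty list would mean the response honors the documented definition of a vintage date.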

Step-by-step guide to get vintage data for a real-time period using the FRED API

Before reading this article, read this email to FRED that proposes expanded definitions for real-time periods and vintage dates. The proposed definitions are used in this article since they are the only meaningful and consistent way to use the API and interpret its output.

In this article we select a random series of GNPCA, a random observation period of 2018-01-01, and a random real-time period between 2019-05-01 and 2020-05-01. We ask the FRED API this question: Within the example period of interest (aka real-time period) between 2019-05-01 and 2020-05-01, what values did we know for the GNPCA 2018-01-01 observation period and when were those values released?

These are some of the parameters we will use to construct queries. We will discover more parameters as we progress:

series_id=GNPCA
realtime_start=2019-05-01 (aka period of interest start)
realtime_end=2020-05-01 (aka period of interest end)
observation_start=2018-01-01
observation_end=2018-01-01

Note we are asking two questions of the API: 1.) What values for the 2018-01-01 observation period were known within our period of interest and 2.) when were those values released.
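As a convenience, the request URL can be assembled from these parameters in code. The Python sketch below uses only the standard library. The helper name `build_observations_url` is hypothetical, and `123` is the same placeholder API key used in the example URLs in this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.stlouisfed.org/fred/series/observations"

def build_observations_url(api_key, **params):
    """Build a fred/series/observations URL from keyword parameters."""
    query = {**params, "api_key": api_key}  # api_key is appended last
    return BASE_URL + "?" + urlencode(query)

url = build_observations_url(
    "123",  # placeholder key
    series_id="GNPCA",
    realtime_start="2019-05-01",   # period of interest start
    realtime_end="2020-05-01",     # period of interest end
    observation_start="2018-01-01",
    observation_end="2018-01-01",
)
print(url)
```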

Before walking through the API to get the desired information, download a spreadsheet for GNPCA from ALFRED. Select all vintages and output type 2 (Observations by Vintage Date, All Observations). We will use this spreadsheet to determine in advance the responses we expect to see from the API.

Look at the row for the 2018-01-01 observation period. Note that the initial value for that observation period was released on 2019-03-28 (a vintage date). That date is prior to the start of the example real-time period but is the only release so it is the one that is in effect on the start of the real-time period. The only revision prior to the end of the example real-time period occurred on 2019-07-26 (a vintage date).

In summary, the correct response we expect to see from the API to the two questions posed above is:

Vintage 2019-03-28 - 18815.882 // initial release
Vintage 2019-07-26 - 18897.80 // revision, in effect until end of period of interest

Query the API using the parameters we have at hand

The query below is constructed using the observations api. Start and end dates of the period of interest are passed as real-time start and end date parameters. Output type 1 is selected.

https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&observation_start=2018-01-01&observation_end=2018-01-01&output_type=1&api_key=123

<observations realtime_start="2019-05-01" realtime_end="2020-05-01" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="1" file_type="xml" order_by="observation_date" sort_order="asc" count="2" offset="0" limit="100000">
  <observation realtime_start="2019-05-01" realtime_end="2019-07-25" date="2018-01-01" value="18815.882"/>
  <observation realtime_start="2019-07-26" realtime_end="2020-05-01" date="2018-01-01" value="18897.8"/>
</observations>

The above response is incorrect for several reasons. Firstly, it is returning the start of the example period of interest (2019-05-01) as a release date for the value 18815.882. The correct release date (vintage date) of that value is 2019-03-28. Secondly, the one vintage date that it is returning correctly is mis-labeled as a real-time start date. There is only one correct real-time start date and it is 2019-05-01.
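The problem described above can be detected in code. The sketch below is an illustration only: it parses the response shown above (root attributes trimmed for brevity) and flags every row whose realtime_start equals the requested realtime_start. The function name flag_clamped_rows and the term "clamped" are assumptions introduced here, not FRED terminology. A flagged row is ambiguous: its true release date may be earlier than the date reported.

```python
import xml.etree.ElementTree as ET

# The output_type=1 response shown above, with root attributes trimmed.
RESPONSE = """\
<observations realtime_start="2019-05-01" realtime_end="2020-05-01">
  <observation realtime_start="2019-05-01" realtime_end="2019-07-25" date="2018-01-01" value="18815.882"/>
  <observation realtime_start="2019-07-26" realtime_end="2020-05-01" date="2018-01-01" value="18897.8"/>
</observations>"""


def flag_clamped_rows(xml_text):
    """Mark rows whose realtime_start equals the requested start date."""
    root = ET.fromstring(xml_text)
    requested_start = root.get("realtime_start")
    rows = []
    for obs in root.iter("observation"):
        start = obs.get("realtime_start")
        rows.append({
            "realtime_start": start,
            "value": float(obs.get("value")),
            # True means the date may be the query boundary, not a release date.
            "clamped": start == requested_start,
        })
    return rows


rows = flag_clamped_rows(RESPONSE)
for row in rows:
    print(row)
```

Here the first row (18815.882) is flagged and the second (18897.8) is not, which matches the analysis above: only 2019-07-26 is a genuine vintage date.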

Same query with output_type 2:

https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&observation_start=2018-01-01&observation_end=2018-01-01&output_type=2&api_key=123

<observations realtime_start="2019-05-01" realtime_end="2020-05-01" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="2" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
  <observation date="2018-01-01" GNPCA_20190501="18815.882" GNPCA_20190726="18897.8" GNPCA_20200326="18897.8" GNPCA_20200501="18897.8"/>
</observations>

The above response is also incorrect. It commingles real-time periods with actual vintage dates. The response neglects to report the actual date the observation was initially released (2019-03-28) and instead reports the start of the real-time period as the vintage date. Because the attribute names change with the dates returned, most deserializers cannot map this response onto a statically defined object.

Correct way to use the API

As demonstrated above the FRED API becomes confused when trying to differentiate between real-time dates and vintage dates. Fortunately there is a way to use the API to obtain the desired information. It is lengthy but it can be done.

Step 1: Get dates when information was released about observation period 2018-01-01

The question we need to ask FRED is "Within the real-time period between 2019-05-01 and 2020-05-01, what vintages existed or were created that impacted our knowledge of GNPCA for the observation period of 2018-01-01?". To ask this question we construct a vintage date query using the example real-time start and end dates:

 https://api.stlouisfed.org/fred/series/vintagedates?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&offset=0&api_key=123

  <vintage_dates realtime_start="2019-05-01" realtime_end="2020-05-01" order_by="vintage_date" sort_order="asc" count="2" offset="0" limit="10000">
    <vintage_date>2019-07-26</vintage_date>
    <vintage_date>2020-03-26</vintage_date>
  </vintage_dates>

The FRED documentation is not clear about what this query should return, so we cannot say definitively whether the response is right or wrong. The response above appears to answer the question "What vintages were released between 2019-05-01 and 2020-05-01?". Of course this is not the question we intended to ask. We know that this query is not useful for our purpose because it does not return the date of the vintage that was in effect on 2019-05-01.

Unfortunately, there is no way to query FRED for vintages that are effective within a real-time period. The only way to get the vintages we need is to request all vintages, manually or programmatically scan the list, and select the vintages that were in effect during the period of interest (aka real-time period).

https://api.stlouisfed.org/fred/series/vintagedates?series_id=GNPCA&api_key=123

<vintage_dates realtime_start="1776-07-04" realtime_end="9999-12-31" order_by="vintage_date" sort_order="asc" count="181" offset="0" limit="10000">
  // snip
  <vintage_date>2019-03-28</vintage_date>
  <vintage_date>2019-07-26</vintage_date>
  <vintage_date>2020-03-26</vintage_date>
  // snip
</vintage_dates>

The first vintage in the list (2019-03-28) is included because it was in effect on the first day of the example real-time period (2019-05-01). Vintage 2020-03-26 was the last vintage to be released before the end of the example real-time period (2020-05-01).

These vintage dates give us an additional query parameter we can use:

vintage_dates=2019-03-28,2019-07-26,2020-03-26
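The scan described above can be done programmatically. The following is a minimal sketch, assuming the full vintage list has already been downloaded; only the three vintage dates shown above are included here, and the function names are hypothetical. The rule applied is: keep the latest vintage released on or before the start of the period of interest, plus every vintage released after the start and on or before the end.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Excerpt of the vintagedates response above (snipped entries omitted).
VINTAGES_XML = """\
<vintage_dates>
  <vintage_date>2019-03-28</vintage_date>
  <vintage_date>2019-07-26</vintage_date>
  <vintage_date>2020-03-26</vintage_date>
</vintage_dates>"""


def parse_vintage_dates(xml_text):
    """Read every vintage_date element into a sorted list of dates."""
    root = ET.fromstring(xml_text)
    return sorted(date.fromisoformat(e.text.strip()) for e in root.iter("vintage_date"))


def vintages_in_effect(vintages, period_start, period_end):
    """Select vintages that were in effect at some point during the period."""
    ordered = sorted(vintages)
    # The latest vintage on or before the start was in effect on day one.
    before = [v for v in ordered if v <= period_start]
    # Vintages released inside the period replace it as they appear.
    during = [v for v in ordered if period_start < v <= period_end]
    return before[-1:] + during


all_vintages = parse_vintage_dates(VINTAGES_XML)
selected = vintages_in_effect(all_vintages, date(2019, 5, 1), date(2020, 5, 1))
vintage_dates_param = ",".join(v.isoformat() for v in selected)
print(vintage_dates_param)
```

For the example period this produces the same vintage_dates value listed above. For a narrower period such as 2019-08-01 to 2020-01-01, only 2019-07-26 would be selected.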

Step 2: Use vintage dates to construct a query

The following query can now be constructed:

https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&vintage_dates=2019-03-28,2019-07-26,2020-03-26&observation_start=2018-01-01&observation_end=2018-01-01&output_type=3&api_key=123

<observations realtime_start="2019-03-28" realtime_end="2020-03-26" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="3" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
  <observation date="2018-01-01" GNPCA_20190328="18815.882" GNPCA_20190726="18897.8"/>
</observations>

The response above returns the correct vintage of 2019-03-28, which indicates when the information that was in effect at the start of the example real-time period was released. The vintage when the value was revised is also correctly reported. You will need to deserialize the XML or JSON by hand or write code to do it, since most deserializers cannot parse this format into a statically defined object.
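One way to hand-parse this format is to read the attribute names dynamically. The sketch below is illustrative only: it assumes the attribute naming pattern seen in the response above (series id, underscore, vintage date as YYYYMMDD) and assumes every value is numeric. The function name read_vintage_values is hypothetical, and the root attributes of the response are trimmed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The output_type=3 response shown above, with root attributes trimmed.
RESPONSE = """\
<observations observation_start="2018-01-01" observation_end="2018-01-01">
  <observation date="2018-01-01" GNPCA_20190328="18815.882" GNPCA_20190726="18897.8"/>
</observations>"""


def read_vintage_values(xml_text, series_id):
    """Map each observation date to a {vintage date: value} dictionary."""
    prefix = series_id + "_"
    result = {}
    for obs in ET.fromstring(xml_text).iter("observation"):
        by_vintage = {}
        for name, raw in obs.attrib.items():
            if not name.startswith(prefix):
                continue  # skip the "date" attribute
            vintage = datetime.strptime(name[len(prefix):], "%Y%m%d").date()
            by_vintage[vintage] = float(raw)  # assumes a numeric value
        result[obs.get("date")] = by_vintage
    return result


values = read_vintage_values(RESPONSE, "GNPCA")
for vintage, value in values["2018-01-01"].items():
    print(vintage.isoformat(), value)
```

Keying the result by vintage date rather than by attribute name gives the caller a structure that does not change shape from one query to the next.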

FRED API returns inconsistent realtime_start dates

I sent the question below to the FRED support team. I don't expect a reply from them, but in the unlikely event I get one I'll post it here.

API returns inconsistent realtime_start dates

https://api.stlouisfed.org/fred/series/release?series_id=IRA&realtime_start=1776-07-04&api_key=123&file_type=json

The query above returns three rows with the following realtime_start dates:

1996-12-12
1998-12-10
2002-05-02

Running the same query with any date after 2002-05-02 (or without specifying a realtime_start date) returns a single row with the realtime_start set to the date that is passed to the query i.e.:

https://api.stlouisfed.org/fred/series/release?series_id=IRA&realtime_start=2024-02-18&api_key=123&file_type=json

The query above returns one row with the following realtime_start date:

2024-02-18

We know from this document that "The real-time period marks when facts were true or when information was known until it changed".

Based on the results of the queries shown above it appears that the real-time start date is the real-time period when facts were true or when information was known until it changed - unless you query after such date - in which case the realtime_start is the date you query the API.

What is the purpose of reporting a date other than the correct realtime_start date when no date is passed to the query or when the supplied date is after the latest realtime_start date?
