psuedomagi / fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

License: MIT License

Python 100.00%
data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-python python pandas-library pydata

fedcal's Introduction

fedcal, a democratic python calendar enhancement to pandas

Not dead; just taking a break to work on other things.

Available/functional but untested:

  • offsets.py - custom pandas DateOffsets for civilian and military paydays (FedPayDay, MilitaryPayDay), federal holidays including past proclamation holidays (FedHolidays), federal business days accounting for federal holidays (FedBusinessDay), and a class to identify likely military passdays that fall on business days and are adjacent to federal holidays (MilitaryPassDay).
  • _status_factory.py fetch_index - this is really part of the backend, but for anyone who wants accessible time series appropriations data by executive department, you can currently pull a MultiIndex using the fetch_index function. You can also roll your own by pulling status_intervals.json, where the data are stored.

fedcal

fedcal is a simple calendar library with one big goal: enable new perspectives on the U.S. Government to build transparency, improve government, and bolster democracy.

a calendar... why?

The U.S. Government is massive. Over two million civilian and military employees. Roughly 1.7 trillion dollars in FY23 discretionary spending. Even tiny and short-lived changes and shifts in the government workforce and spending can have far-reaching impacts over time.

Time is at the heart of those impacts. I'm a federal manager, and I started writing fedcal because I wanted to understand how shifts in personnel availability, seasonality, and budgetary factors impacted productivity to help me plan and drive better results. For example, I know firsthand how devastating the continuing resolution cycle can be for productivity and outcomes, but I didn't know precisely how devastating, or how subtle and unexpected, those impacts could be. I was shocked to find there weren't off-the-shelf solutions for understanding the basic rhythms and routines of the massive U.S. Government machine.

fedcal is about answering big and small questions, improving predictions, and understanding the U.S. Government and its effect on society and the world.

Note

If you think about it, the U.S. Government is a big control group, of sorts. Two million people are mostly at work... or they aren't because of holidays, weekends, military passdays, or government shutdowns. One minute the government is 'on', and the next it is 'off'. Of course, it's more complicated than that -- sometimes only half, a third, or 90% of the government is impacted, which offers opportunities for even richer differential analysis (and one fedcal hopes to enable). But it's not just shutdowns -- a continuing resolution instead of a full year appropriation is another kind of binary relationship, or even whether a holiday falls on a Monday or Tuesday. fedcal aims to give you the tools to explore these relationships and their significance (...or insignificance).

A few example questions fedcal can help answer:

productivity analysis
  • What impacts do continuing resolutions (CRs), full-year appropriations, and, of course, shutdowns, have on Federal services? How long do those impacts linger, and how quickly, when, and how are they propagated?
  • Which departments feel these impacts the most?
  • How do holidays and military passdays affect seasonal outcomes and workforce productivity? How long do those effects extend?
  • How do those same factors impact Federal contractor productivity? ... private business? ...agriculture? ...industrial security?
economic/financial analysis
  • How do changes in federal appropriations (i.e. full-year appropriations, continuing resolutions, gaps/shutdowns) affect not just the U.S. economy, but state, local, and international economies? Federal contractors and employees? Businesses reliant on federal employees?
  • How fast do those impacts propagate? How quickly do they heal?
  • How do federal paydays and holidays impact small businesses and local economies (e.g. for a military town, or a deli across the street from a federal building)?
business and human resources analysis
  • How do changes in U.S. Government budgeting and productivity impact contracting and demand? Are they accurately predictable? How long does it take those shifts to propagate into financial outcomes for businesses dependent on the Government?
  • How do shifts in government budgeting or personnel affect supply and demand for human capital? Federal hiring? Talent availability?
social science
  • Are Congress' irregularities more routine and predictable than they might appear? Are certain committees more consistently effective at passing appropriations? Why?
  • What are the social, political, and economic costs of irregular Federal budgeting?
  • What unexpected correlations are there for leave or short interruptions in personnel from holidays, military passes, and similar factors? Some shots in the dark: are smokeless tobacco sales measurably impacted by military leave in military communities? Are certain crimes more or less likely when federal employees get Christmas Eve off? Do military passdays alter dependents' educational outcomes in subsequent weeks?
  • Are federal employees more likely to make certain purchases or financial decisions before or after paydays?
more routine stuff

Of course, as a calendar library, fedcal can also help you streamline development for more routine needs like:

  • leave and pay automation systems
  • financial systems (e.g. bank holidays and deposit prediction; hello USAA!)
  • informational websites
  • whatever else you can think up

okay, I'm sold, but what does it do?

The fedcal API bolts Federal date/time information enhancements onto two of the most powerful classes in data science for time series analysis - pandas' Timestamp and DatetimeIndex:

  1. FedStamp - fedcal's enhanced sub-class to pandas' Timestamp class

  2. FedIndex - a companion class that adds similar functionality to pandas' DatetimeIndex class

Important

FedStamp and FedIndex retain all functionality of pandas Timestamp and DatetimeIndex. (... or they should, pending testing and refinement and your issue submissions...)

Core Enhancements to Timestamp and DatetimeIndex

  1. Department-level appropriations/operational status for all top-level Federal Departments over time, with shutdown and appropriations gap data from FY75 to present, and continuing resolutions data from FY99 to present (currently -- I plan to delve back to FY75 eventually)

  2. Federal date attributes 1970 to whenever:

    • Federal Holidays, including historical holidays by Presidential proclamation (also back to FY75).
    • Federal business days (1970 to indefinite future - why 1970? I use POSIX time as a reasonable floor; no other reason.)
    • Federal fiscal years and fiscal quarters (1970 to whenever)
    • fedcal can even take a very rough guess of whether a future President will declare a given Christmas Eve a holiday
  3. Federal civilian biweekly paydays

  4. Military paydays and estimated holiday passdays

  5. Supporting time utilities like to_timestamp and to_datetimeindex

  6. Suggestions?
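
As a rough illustration of the fiscal year and fiscal quarter attributes listed above, here is a minimal sketch using plain pandas and NumPy arithmetic. It is not fedcal's implementation, and the function name fiscal_calendar is hypothetical. It relies only on the fact that the federal fiscal year runs from October 1 through September 30 and is named for the calendar year in which it ends.

```python
import pandas as pd


def fiscal_calendar(dates):
    """Return federal fiscal year and quarter for each date (hypothetical helper)."""
    idx = pd.DatetimeIndex(dates)
    month = idx.month.to_numpy()
    # October through December belong to the next calendar year's fiscal year.
    fiscal_year = idx.year.to_numpy() + (month >= 10)
    # Shift months so October maps to 0, then bucket into quarters 1 through 4.
    fiscal_quarter = (month - 10) % 12 // 3 + 1
    return pd.DataFrame(
        {"fiscal_year": fiscal_year, "fiscal_quarter": fiscal_quarter},
        index=idx,
    )


sample = fiscal_calendar(["2023-09-30", "2023-10-01", "2024-01-15", "2024-07-04"])
print(sample)
```

pandas' own quarterly periods with a September year end (the "Q-SEP" frequency) encode the same convention, so they make a handy cross-check for a helper like this.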

what's next?

  • Lots of bug fixes before initial alpha
  • error handling
  • thorough test coverage
  • CI/CD pipeline and packaging for pypi deployment
  • full documentation
  • performance optimizations (e.g. vectorize date range lookups for department statuses)
  • API enhancements (e.g. feature encoding) and removing artifacts of design changes
  • adding CR data to FY75

how can I help?

Review the CODE_OF_CONDUCT.md and CONTRIBUTING.md to get started. Please help!

fedcal's Issues

Overhaul Front-End to fully use Pandas extensions API accessors and ExtensionArray

As it stands, I used a pretty slick (in my opinion) metaclass to automagically delegate functionality for our DatetimeIndex- and Timestamp-like classes (FedIndex and FedStamp) to the pandas objects they hold as attributes. This provided pretty seamless functionality, and we will likely still need it for point queries for FedStamp/Timestamp, but in line with our goal to be neatly integrated into pandas, it's better to align ourselves with the extension API wherever possible. This will also allow us to integrate easily into Series and DataFrame on top of DatetimeIndex. The plan looks like this:

  • Build a mix-in to handle Fedcal attributes for pandas objects. Use the series, index, and dataframe accessor APIs to feed this mixin (presumably will require some object-specific customization on top of the mixin) into the pandas ecosystem. Pirate and reverse (that's a long 'arrrr' in reverse) engineer pandas' internal functions as needed to make it as pandas-like as possible.

  • Further integrate fedcal into Timestamp using internal Timestamp mechanics to the extent possible without subclassing (we tried that... it didn't play as nice as we needed it to... probably because of the heavy Cython backend to Timestamp and pydatetime - maybe one day we can do a Cython implementation to make it clean). Most likely we'll continue to use a refined version of the MagicDelegator metaclass for this, which I also plan to spin off into its own library at some point because it's really handy.

  • Figure out how best to serve up the appropriations status data and integrate functionality. I'm leaning towards a custom ExtensionArray and dtype(s) that use custom department and status objects for rich functionality. I haven't figured out what this looks like yet. Please send suggestions. In the meantime, _status_factory.py's fetch_index can deliver a functional multiindex with the data.

Once all that is done, we'll have our baseline core functionality and it'll be time to develop robust tests, beta test, and then join the pydata extensions community.
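
One way the mixin-plus-accessor part of that plan might look, sketched with pandas' public register_series_accessor and register_index_accessor decorators. The accessor name fedsketch, the class names, and the single fiscal_year attribute are all placeholders for illustration, not fedcal's actual design.

```python
import pandas as pd


class _FedSketchMixin:
    """Shared attribute logic; each accessor supplies the dates and the wrapping."""

    def _dates(self) -> pd.DatetimeIndex:
        raise NotImplementedError

    def _wrap(self, values, name):
        raise NotImplementedError

    @property
    def fiscal_year(self):
        dates = self._dates()
        # Federal fiscal years start on October 1 and are named for the year they end in.
        values = dates.year.to_numpy() + (dates.month.to_numpy() >= 10)
        return self._wrap(values, "fiscal_year")


@pd.api.extensions.register_series_accessor("fedsketch")  # hypothetical accessor name
class FedSketchSeriesAccessor(_FedSketchMixin):
    def __init__(self, series):
        # pandas' extension docs suggest raising AttributeError for unsupported data.
        if not pd.api.types.is_datetime64_any_dtype(series):
            raise AttributeError("fedsketch requires datetime-like values")
        self._obj = series

    def _dates(self):
        return pd.DatetimeIndex(self._obj)

    def _wrap(self, values, name):
        # Keep the original Series index so results align with the input.
        return pd.Series(values, index=self._obj.index, name=name)


@pd.api.extensions.register_index_accessor("fedsketch")  # hypothetical accessor name
class FedSketchIndexAccessor(_FedSketchMixin):
    def __init__(self, index):
        if not isinstance(index, pd.DatetimeIndex):
            raise AttributeError("fedsketch requires a DatetimeIndex")
        self._obj = index

    def _dates(self):
        return self._obj

    def _wrap(self, values, name):
        return pd.Index(values, name=name)


stamps = pd.Series(pd.to_datetime(["2023-09-30", "2023-10-01"]), index=["a", "b"])
series_fy = stamps.fedsketch.fiscal_year
index_fy = pd.DatetimeIndex(["2024-01-15"]).fedsketch.fiscal_year
```

Keeping the date logic in the mixin means the Series and Index accessors differ only in how they extract dates and wrap results, which is roughly the "object-specific customization on top of the mixin" the plan anticipates.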

Refactor Date Attributes to Make More Effective Use of Pandas offset classes

Currently we use Pandas offsets as attributes, but we can improve both functionality and concision if we make more effective use of Pandas built-in offset classes, namely:

FedHoliday -- create a custom AbstractHolidayCalendar of Holiday objects instead of using prebuilt USFederalHolidayCalendar

FedBusDay -- evaluate directly subclassing CDay.

MilitaryPayDay -- combine offsets from FedBusDay/CDay and SemiMonthBegin, or evaluate subclassing. If we get it right, we can simplify calculations to 'dates += MilitaryPayDay [or its attribute]'

passdays -- this one I'm less sure about. I suspect we can also use offset classes to build this and apply it as a direct offset.
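
A minimal sketch of the first three ideas above using only pandas' built-in pieces, offered as an illustration rather than the planned implementation. The names SketchFedHolidayCalendar, fed_bday, and sketch_military_paydays are hypothetical. Two assumptions are baked in and should be checked against official sources: that federal offices were closed by executive order on 24 December 2018 (used here as an example of a one-off proclamation holiday), and that military paydays nominally fall on the 1st and 15th and move to the preceding federal business day when that date is a weekend or holiday.

```python
import pandas as pd
from pandas.tseries.holiday import (
    AbstractHolidayCalendar,
    Holiday,
    USFederalHolidayCalendar,
)
from pandas.tseries.offsets import CustomBusinessDay, SemiMonthBegin


class SketchFedHolidayCalendar(AbstractHolidayCalendar):
    """Prebuilt federal rules plus one single-year rule (hypothetical class)."""

    rules = USFederalHolidayCalendar.rules + [
        # Assumption: one-off closure by executive order; verify before relying on it.
        Holiday("Christmas Eve 2018 (proclamation)", year=2018, month=12, day=24),
    ]


# Business-day offset that skips weekends and every holiday in the calendar above.
fed_bday = CustomBusinessDay(calendar=SketchFedHolidayCalendar())


def sketch_military_paydays(start, end):
    """Nominal 1st/15th paydays rolled back to the preceding federal business day."""
    nominal = pd.date_range(start, end, freq=SemiMonthBegin())
    # rollback leaves a date alone if it is already a business day.
    return pd.DatetimeIndex([fed_bday.rollback(ts) for ts in nominal])


paydays = sketch_military_paydays("2024-01-01", "2024-02-15")
print(paydays)
```

The list comprehension is the slow, explicit version of the 'dates += MilitaryPayDay' goal; a dedicated offset class would fold the rollback into the offset itself.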

Restructure and Simplify Department Status Management and Delivery

The current implementation that provides department statuses for time intervals is overly complicated and error-prone. I became enamored with the idea of an interval tree serving up this data. While optimal on paper, the subsequent translation to something usable counteracts this advantage while adding substantial complexity. We're also not dealing with noticeable performance differences given the size of the dataset.

I propose/plan a major restructuring on the backend, the principal aspects of which are:

  • Directly load constants.py data into pandas objects
    • Ditch _tree and the interval tree delivery
  • Add time-state and pandas index attributes to FedDepartment objects; add attributes that allow them to represent themselves as various pandas objects to facilitate easy construction
  • Build a functional factory for FedDepartment objects.
  • Tap this pool directly from FedIndex and FedStamp such that they have these objects as attributes.
  • Build FedIndex DataFrames directly from FedDepartment MultiIndex tuples
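
To make the "load straight into pandas" idea concrete, here is a small sketch of interval records held in a DataFrame, with a point lookup and an expansion to a (date, department) MultiIndex. Everything in it is an assumption for illustration: the column names, the helper names status_on and daily_status, and especially the records themselves, which are made-up placeholders rather than real appropriations data or the layout of constants.py.

```python
import pandas as pd

# Placeholder records: department names, dates, and statuses are invented.
records = [
    ("DEPT_A", "2000-01-01", "2000-01-10", "full-year appropriation"),
    ("DEPT_A", "2000-01-11", "2000-01-15", "shutdown"),
    ("DEPT_B", "2000-01-01", "2000-01-15", "continuing resolution"),
]
intervals = pd.DataFrame(records, columns=["department", "start", "end", "status"])
intervals[["start", "end"]] = intervals[["start", "end"]].apply(pd.to_datetime)


def status_on(date, frame=intervals):
    """Status of every department on one date (intervals closed on both ends)."""
    ts = pd.Timestamp(date)
    mask = (frame["start"] <= ts) & (ts <= frame["end"])
    return frame.loc[mask].set_index("department")["status"]


def daily_status(frame=intervals):
    """Expand the intervals into one status per (date, department) pair."""
    pieces = []
    for row in frame.itertuples(index=False):
        days = pd.date_range(row.start, row.end, freq="D")
        index = pd.MultiIndex.from_product(
            [days, [row.department]], names=["date", "department"]
        )
        pieces.append(pd.Series(row.status, index=index, name="status"))
    return pd.concat(pieces).sort_index()


point = status_on("2000-01-12")
daily = daily_status()
```

A boolean mask over a table this small is a plain linear scan, which fits the observation above that the dataset is too small for an interval tree to pay off.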
