psuedomagi / fedcal

A feature-rich Python calendar that enables time series analyses of changes in federal workforce schedules and shifts in executive department funding status.

License: MIT License

Python 100.00%
data-analysis data-science econometrics economic-data economics federal federal-government hr pandas pandas-python python pandas-library pydata

fedcal's Issues

Overhaul Front-End to fully use Pandas extensions API accessors and ExtensionArray

As it stands, I used a pretty slick (in my opinion) metaclass to automagically delegate functionality for our DatetimeIndex-like and Timestamp-like classes (FedIndex and FedStamp) to the pandas objects they hold as attributes. This provided pretty seamless functionality, and we will likely still need it for point queries on FedStamp/Timestamp. But in line with our goal of integrating neatly into pandas, it's better to align ourselves with the extension API wherever possible. This will also let us integrate easily with Series and DataFrame in addition to DatetimeIndex. The plan looks like this:

  • Build a mixin to handle fedcal attributes for pandas objects. Use the Series, Index, and DataFrame accessor APIs to feed this mixin (presumably with some object-specific customization on top) into the pandas ecosystem. Pirate and reverse (that's a long 'arrrr' in reverse) engineer pandas' internal functions as needed to make it as pandas-like as possible.

  • Further integrate fedcal into Timestamp using internal Timestamp mechanics to the extent possible without subclassing (we tried that... it didn't play as nice as we needed it to... probably because of the heavy Cython backend to Timestamp and pydatetime - maybe one day we can do a Cython implementation to make it clean). Most likely we'll continue to use a refined version of the MagicDelegator metaclass for this, which I also plan to spin off into its own library at some point because it's really handy.

  • Figure out how best to serve up the appropriations status data and integrate its functionality. I'm leaning towards a custom ExtensionArray and dtype(s) that use custom department and status objects for rich functionality. I haven't figured out what this looks like yet. Please send suggestions. In the meantime, fetch_index in _status_factory.py can deliver a functional MultiIndex with the data.
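
A minimal sketch of the mixin-plus-accessor idea from the first bullet, using pandas' public `register_index_accessor` and `register_series_accessor` decorators. The names `FedcalMixin`, `fedcal_demo`, and `is_holiday` are hypothetical placeholders, not fedcal's actual API, and the holiday check is just a stand-in for whatever attributes the real mixin would carry.

```python
import pandas as pd
from pandas.api.extensions import (
    register_index_accessor,
    register_series_accessor,
)
from pandas.tseries.holiday import USFederalHolidayCalendar


class FedcalMixin:
    """Hypothetical shared logic; each accessor supplies `_dates` as a DatetimeIndex."""

    @property
    def is_holiday(self):
        # Compare normalized dates against the prebuilt federal holiday calendar.
        dates = self._dates.normalize()
        holidays = USFederalHolidayCalendar().holidays(dates.min(), dates.max())
        return dates.isin(holidays)


@register_index_accessor("fedcal_demo")  # hypothetical accessor name
class FedcalIndexAccessor(FedcalMixin):
    def __init__(self, index):
        if not isinstance(index, pd.DatetimeIndex):
            raise AttributeError("fedcal_demo requires a DatetimeIndex")
        self._dates = index


@register_series_accessor("fedcal_demo")  # hypothetical accessor name
class FedcalSeriesAccessor(FedcalMixin):
    def __init__(self, series):
        if not pd.api.types.is_datetime64_any_dtype(series):
            raise AttributeError("fedcal_demo requires datetime64 values")
        # Object-specific customization: a Series exposes its values, not its index.
        self._dates = pd.DatetimeIndex(series)


idx = pd.date_range("2024-07-03", "2024-07-05")
index_flags = idx.fedcal_demo.is_holiday
series_flags = pd.Series(idx).fedcal_demo.is_holiday
```

Both accessors reuse the same mixin, so the only per-object code is how each one extracts its dates.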

Once all that is done, we'll have our baseline core functionality and it'll be time to develop robust tests, beta test, and then join the pydata extensions community.

Restructure and Simplify Department Status Management and Delivery

The current implementation that provides department statuses for time intervals is overly complicated and error-prone. I became enamored with the idea of an interval tree serving up this data. While optimal on paper, the subsequent translation to something usable counteracts this advantage while adding substantial complexity. We're also not dealing with noticeable performance differences given the size of the dataset.

I propose/plan a major restructuring on the backend, the principal aspects of which are:

  • Directly load constants.py data into pandas objects.
    - Ditch _tree and the interval tree delivery.
  • Add time-state and pandas index attributes to FedDepartment objects; add attributes that allow them to represent themselves as various pandas objects to facilitate easy construction.
  • Build a functional factory for FedDepartment objects.
  • Tap this pool directly from FedIndex and FedStamp such that they have these objects as attributes.
  • Build FedIndex DataFrames directly from FedDepartment MultiIndex tuples.
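
As a rough illustration of loading status records straight into pandas without an interval tree, here is a sketch. The record layout, the `DEPT_A`/`DEPT_B` names, the dates, and the helper functions are all made up for the example; the real structure of constants.py may differ.

```python
import pandas as pd

# Hypothetical stand-in for constants.py: (department, start, end, status).
RECORDS = [
    ("DEPT_A", "2020-01-01", "2020-01-10", "funded"),
    ("DEPT_A", "2020-01-11", "2020-01-20", "shutdown"),
    ("DEPT_B", "2020-01-01", "2020-01-20", "funded"),
]


def load_status_frame(records):
    """Load raw interval records into a DataFrame with datetime bounds."""
    frame = pd.DataFrame(records, columns=["department", "start", "end", "status"])
    frame["start"] = pd.to_datetime(frame["start"])
    frame["end"] = pd.to_datetime(frame["end"])
    return frame


def status_on(frame, date):
    """Point query: status of every department on one date (bounds inclusive)."""
    ts = pd.Timestamp(date)
    active = frame[(frame["start"] <= ts) & (ts <= frame["end"])]
    return active.set_index("department")["status"]


def daily_status(frame):
    """Expand intervals into a Series indexed by (date, department) tuples."""
    rows = [
        (day, row.department, row.status)
        for row in frame.itertuples(index=False)
        for day in pd.date_range(row.start, row.end, freq="D")
    ]
    out = pd.DataFrame(rows, columns=["date", "department", "status"])
    return out.set_index(["date", "department"])["status"].sort_index()


frame = load_status_frame(RECORDS)
snapshot = status_on(frame, "2020-01-15")
daily = daily_status(frame)
```

With a dataset this small, the plain boolean mask in `status_on` is likely fast enough, which is the trade-off the issue argues for.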

Refactor Date Attributes to Make More Effective Use of Pandas offset classes

Currently we use pandas offsets as attributes, but we can improve both functionality and concision if we make more effective use of pandas' built-in offset classes, namely:

FedHoliday -- create a custom AbstractHolidayCalendar of Holiday objects instead of using prebuilt USFederalHolidayCalendar
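
A small sketch of what a custom calendar could look like with pandas' `AbstractHolidayCalendar` and `Holiday`. The class name `DemoFedHolidays` is hypothetical, and the rule list is a deliberately partial subset for illustration, not a complete federal holiday schedule.

```python
import pandas as pd
from pandas.tseries.holiday import (
    AbstractHolidayCalendar,
    Holiday,
    USLaborDay,
    USThanksgivingDay,
    nearest_workday,
)


class DemoFedHolidays(AbstractHolidayCalendar):
    """Hypothetical, partial calendar built from individual Holiday rules."""

    rules = [
        # nearest_workday shifts a Saturday holiday to Friday and a Sunday one to Monday.
        Holiday("New Year's Day", month=1, day=1, observance=nearest_workday),
        Holiday("Independence Day", month=7, day=4, observance=nearest_workday),
        USLaborDay,
        USThanksgivingDay,
        Holiday("Christmas Day", month=12, day=25, observance=nearest_workday),
    ]


holidays_2023 = DemoFedHolidays().holidays("2023-01-01", "2023-12-31")
```

Owning the rule list is what makes it possible to add or adjust individual holidays, which the prebuilt `USFederalHolidayCalendar` does not expose as conveniently.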

FedBusDay -- evaluate directly subclassing CDay.
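
Before subclassing, it may be worth checking how far a plain `CustomBusinessDay` (the class behind the `CDay` alias) gets us when parameterized with a holiday calendar. This sketch uses the prebuilt `USFederalHolidayCalendar` so it stands alone; `fed_bday` is a hypothetical name.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# A business-day offset that skips weekends and federal holidays.
fed_bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Wednesday 2024-07-03 plus one federal business day skips Thursday, July 4.
next_workday = pd.Timestamp("2024-07-03") + fed_bday

# is_on_offset reports whether a date is itself a federal business day.
july4_is_workday = fed_bday.is_on_offset(pd.Timestamp("2024-07-04"))
```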

MilitaryPayDay -- combine offsets from FedBusDay/CDay and SemiMonthBegin, or evaluate subclassing. If we get it right, we can simplify calculations to 'dates += MilitaryPayDay [or its attribute]'.
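
One way the combination might work, sketched under an assumption that is not taken from fedcal: nominal pay dates fall on the 1st and 15th (which is what `SemiMonthBegin` generates by default) and move back to the preceding business day when they land on a weekend or federal holiday. The function name `military_paydays` is hypothetical.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay, SemiMonthBegin

fed_bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())


def military_paydays(start, end):
    """Nominal 1st/15th dates, rolled back to the prior federal business day.

    Assumed rule for illustration only; results can fall before `start`
    when the first nominal date is rolled back.
    """
    nominal = pd.date_range(start, end, freq=SemiMonthBegin())
    return pd.DatetimeIndex([fed_bday.rollback(day) for day in nominal])


# 2024-06-01 and 2024-06-15 are Saturdays; 2024-07-01 is a Monday.
paydays = military_paydays("2024-06-01", "2024-07-01")
```

A dedicated offset subclass would fold the rollback into the offset itself, which is what would allow the `dates += MilitaryPayDay` shorthand.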

passdays -- this one I'm less sure about. I suspect we can also use offset classes to build this and apply it as a direct offset.
