An R package for cleaning and manipulating panel and hierarchical data.
pmdplyr is a suite of tools that extends the dplyr package for data manipulation. These tools are geared towards use with panel data and hierarchical data.
Unlike other suites dealing with panel data, all functions in pmdplyr are designed to work even when considering a set of variables that do not uniquely identify rows. This is handy when working with any kind of hierarchical data, or with panel data where there are multiple observations per individual per time period, such as student/term/class education data.
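To make the "not uniquely identified" case concrete, here is a minimal sketch in plain dplyr (not pmdplyr's own functions). The `enrollment` data and its column names are hypothetical, invented only for illustration. It shows student/term/class data in which the student and term variables together do not pick out a single row.

```r
library(dplyr)

# Hypothetical student/term/class data: one row per class taken.
enrollment <- tibble(
  student = c(1, 1, 1, 2, 2),
  term    = c(1, 1, 2, 1, 2),
  class   = c("Math", "Econ", "Math", "Econ", "Stats"),
  grade   = c(3.7, 3.3, 4.0, 2.7, 3.0)
)

# Count rows per student/term pair. Any count above 1 means the pair
# is not a unique key for the data.
key_counts <- enrollment %>%
  count(student, term, name = "n_rows")

# TRUE here, because student 1 took two classes in term 1.
has_duplicates <- any(key_counts$n_rows > 1)
```

Many panel tools would stop with an error on data like this; the point of the paragraph above is that pmdplyr's functions are meant to keep working on it.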
Install this package using devtools::install_github('NickCH-K/pmdplyr') and use help(pmdplyr) for more information.
Functions included in the package:
between and within: Standard between and within panel calculations.
fixed_check: Checks a list of variables for consistency within a panel structure.
fixed_force: Forces a list of variables to be constant within a panel structure.
id_variable: Takes a list of variables that make up an individual identifier and turns it into a single variable.
time_variable: Takes a time variable, or set of time variables, and turns them into a single well-behaved integer time variable of the kind required by most panel functions.
inexact_join: Set of wrappers for the dplyr join functions which allows for a variable to be matched inexactly, for example joining a time variable in x to the most recent previous value in y.
pdeclare and is_pdeclare: Set the panel structure for a data set, or check if it is already set.
mutate_cascade: A wrapper for dplyr::mutate which runs one period at a time, allowing changes in one period to finalize before the next period is calculated.
mutate_subset: A wrapper for dplyr::mutate that performs a calculation on a subset of data, and then applies the result to all the observations (within group).
panel_fill: Fills in gaps in the panel. Can also fill in at the beginning or end of the data to create a perfectly balanced panel.
panel_locf: A last-observation-carried-forward function for panels. Fills in NAs with recent nonmissing observations.
tlag: Lags a variable in time.
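
For readers unfamiliar with these operations, the sketch below illustrates two of the ideas in the list, between/within calculations and lagging in time, using only plain dplyr. It does not call pmdplyr's functions, and their arguments and exact definitions may differ from what is shown here. The `panel` data, its column names, and the choice of "between" as the individual mean and "within" as the deviation from it are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical panel: individual `id` observed at integer time `t`.
# Individual 1 has a gap: there is no observation at t = 3.
panel <- tibble(
  id = c(1, 1, 1, 2, 2, 2),
  t  = c(1, 2, 4, 1, 2, 3),
  y  = c(10, 12, 17, 5, 7, 9)
)

# Between: each individual's mean of y.
# Within: each observation's deviation from that individual mean.
panel <- panel %>%
  group_by(id) %>%
  mutate(y_between = mean(y), y_within = y - mean(y)) %>%
  ungroup()

# Lag in time rather than in row order: shift t forward by one period
# and join back on id and t. A period whose immediate predecessor is
# missing gets NA instead of silently picking up an older value.
lagged <- panel %>%
  transmute(id, t = t + 1, y_lag = y)

panel <- panel %>%
  left_join(lagged, by = c("id", "t"))
```

In this sketch the row for individual 1 at t = 4 receives an NA lag because t = 3 was never observed, whereas a row-order lag such as dplyr::lag(y) would have returned the t = 2 value. That distinction is why a time-aware lag is useful for panels with gaps.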