
This project forked from snstatcomp/awesome-official-statistics-software


Awesome official statistics software

An awesome list of open source statistical software packages useful for creating and accessing official statistics.

An item on this list is awesome because

  1. it is free, open source, and available for download;
  2. it is confirmed to be used in the production of official statistics by at least one institute, or
  3. it provides access to official statistics publications.

We prefer packages that are reasonably easy to install and use, that have at least one stable version, and that are actively maintained.

Contributions are welcome.


Design frame and sample (GSBPM 2.1)

  • R package SamplingStrata. Optimal Stratification of Sampling Frames for Multipurpose Sampling Surveys.

Sampling (GSBPM 4.1)

  • R package sampling. Several algorithms for drawing (complex) survey samples and calibrating design weights.
  • R package surveyplanning. Tools for sample survey planning, including sample size calculation, estimation of expected precision for the estimates of totals, and calculation of optimal sample size allocation.
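
To illustrate what "optimal sample size allocation" can mean in practice, here is a minimal sketch of Neyman allocation, which distributes a fixed total sample over strata in proportion to stratum size times stratum standard deviation. This is only an illustration of the general idea, not the API or the algorithm of any package listed above; the function name `neyman_allocation` and the example strata are hypothetical.

```python
def neyman_allocation(total_n, strata):
    """Allocate total_n sample units over strata.

    strata maps a stratum name to (population size N_h, standard deviation S_h).
    Each stratum receives a share proportional to N_h * S_h.
    """
    products = {name: size * sd for name, (size, sd) in strata.items()}
    total = sum(products.values())
    return {name: total_n * p / total for name, p in products.items()}


# Hypothetical strata: many small, homogeneous units and few large, variable ones.
strata = {
    "small": (8000, 2.0),
    "medium": (1500, 8.0),
    "large": (500, 24.0),
}
allocation = neyman_allocation(400, strata)
print(allocation)
```

The sketch returns fractional sizes and does not cap a stratum's allocation at its population size; a production tool has to handle both, along with rounding to whole units.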

Scraping for Statistics (GSBPM 4.3)

  • Java application URLSearcher. An application for searching URLs. Can be used to find the websites of enterprises. By ISTAT.
  • Java application URLScorer. Gives a rule-based score to scraped documents in a Solr database. By ISTAT.
  • node.js tool RobotTool. A tool for checking price changes on the web. By Statistics Netherlands.
  • node.js package S4Sroboto. A crawler framework, derived from the general-purpose package roboto and extended with some functionality for statistical scraping. By Statistics Netherlands.

Process (GSBPM 5)

  • Java application Java-VTL. A partial implementation of the Validation and Transformation Language, based on the VTL 1.1 draft specification. By Statistics Norway.
  • Java application ADaMSoft. Implements procedures for data analysis and for data, web, and text mining. Also contains procedures for data validation and imputation, based on the principle of Fellegi and Holt.

Data integration and record linkage (GSBPM 5.1)

  • R package RecordLinkage. Implementation of the Fellegi-Sunter method for record linkage.
  • R packages stringdist and fuzzyjoin allow for matching records based on inaccurate keys.
  • R package XBRL. Extraction of Business Financial Information from XBRL Documents.
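
As a rough illustration of matching records on inaccurate keys, the sketch below links names from one source to the closest name in another using the Levenshtein edit distance. It is a toy version of the idea only and does not reproduce the interface of any package listed above; `levenshtein`, `fuzzy_match`, the `max_distance` threshold, and the example names are all assumptions made for this example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(keys, candidates, max_distance=2):
    """Map each key to its closest candidate (case-insensitive),
    or to None when even the closest one is further than max_distance."""
    matches = {}
    for key in keys:
        best = min(candidates, key=lambda c: levenshtein(key.lower(), c.lower()))
        close_enough = levenshtein(key.lower(), best.lower()) <= max_distance
        matches[key] = best if close_enough else None
    return matches


# Hypothetical enterprise names from a survey and from a business register.
survey_names = ["Acme Holdings", "Globex Corp.", "Initech"]
register_names = ["ACME Holding", "Globex Corp", "Umbrella Ltd"]
matches = fuzzy_match(survey_names, register_names)
print(matches)
```

Comparing every key with every candidate is quadratic, so real record linkage usually adds blocking and, as in the Fellegi-Sunter approach, a probabilistic decision rule rather than a fixed distance threshold.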

Statistical data editing and imputation (GSBPM 5.3 | 5.4)

  • R package validate. Rule management and data validation.
  • R package errorlocate. Error localisation based on the principle of Fellegi and Holt.
    • Uses validate rule definitions
    • Supports categorical and/or numeric data
    • Supports linear equalities, inequalities and conditional rules.
    • Configurable backend for MIP-based error localization.
  • R package VIM. Visualisation and Imputation of missing values.
    • Advanced visualisation of missing data patterns
    • Imputation using (robust) linear regression methods
    • Imputation using several donor-based methods (kNN, hot-deck)
  • R package VIMGUI. Graphical frontend to VIM.
  • R package simputation. Simple imputation: many methods using a uniform interface following the tidy tools manifesto.
    • Makes it easy to combine many imputation methods/strategies.
    • Supports regression (standard, M-estimation, ridge/lasso/elasticnet), hot-deck methods (powered by VIM), randomForest, EM-based, and iterative randomForest imputation. Reuse of fitted models and definition of simple user-defined methods are supported as well.
  • R package SeleMix. Detection of outliers and influential errors using a latent variable model for selective editing.
  • R package extremevalues. Detection of univariate outliers based on modeling the bulk distribution.
  • R package deductive. Deductive correction and imputation using edit rules and (partially) complete data.
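
The idea of rule-based data validation, which several of the packages above build on, can be sketched in a few lines: rules are named conditions, and each record passes, fails, or cannot be evaluated because a value is missing. The code below is a simplified illustration, not the interface of any package listed above; `check_rules`, the rule names, and the records are hypothetical.

```python
# Hypothetical business records; None marks a missing value.
records = [
    {"turnover": 100, "costs": 60, "profit": 40},
    {"turnover": 80, "costs": 90, "profit": -5},
    {"turnover": None, "costs": 20, "profit": 10},
]

# Each rule is a named condition that a valid record should satisfy.
rules = {
    "balance": lambda r: r["turnover"] - r["costs"] == r["profit"],
    "nonnegative_costs": lambda r: r["costs"] >= 0,
}


def check_rules(records, rules):
    """Count, per rule, how many records pass, fail, or cannot be checked."""
    summary = {}
    for name, rule in rules.items():
        counts = {"passes": 0, "fails": 0, "missing": 0}
        for record in records:
            try:
                outcome = rule(record)
            except TypeError:
                # Arithmetic or comparison on None: the rule cannot be evaluated.
                counts["missing"] += 1
                continue
            counts["passes" if outcome else "fails"] += 1
        summary[name] = counts
    return summary


summary = check_rules(records, rules)
print(summary)
```

Error localisation then asks which fields of a failing record should be changed so that all rules can be satisfied, and imputation supplies the new values.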

Estimation and weighting (GSBPM 5.6 | 5.7)

  • R package survey. Weighting and estimation for complex survey designs, possibly under nonresponse. Also computes estimator variance. See also R package srvyr for integration with tidy tools.
  • R package hbsae. Small area estimation based on hierarchical Bayesian models.
  • R package rsae. Small area estimation based on (robust) maximum likelihood estimation.
  • R package calibrateSSB. Calculate weights and estimates for panel data with non-response.
  • R package ReGenesees (only available on joinup). Has an interface similar to that of the R package survey and implements many different estimators with sampling errors.
  • R package vardpoor. Linearization of non-linear statistics and variance estimation.
  • R package convey. Variance estimation on indicators of income concentration and poverty using complex sample survey designs. Wrapper around the survey package.
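
As a small worked example of weighting, the sketch below applies post-stratification: each respondent in a group gets the weight N_g / n_g, where N_g is the known population count of the group and n_g the number of respondents in it, and a population total is estimated as the weighted sum of the observed values. This is one simple weighting scheme chosen for illustration, not the method of any particular package above; `poststratified_weights` and all numbers are hypothetical.

```python
from collections import Counter


def poststratified_weights(sample_groups, population_counts):
    """Return one weight per respondent: population count of the
    respondent's group divided by the number of respondents in that group."""
    respondents = Counter(sample_groups)
    return [population_counts[g] / respondents[g] for g in sample_groups]


# Hypothetical respondents: (group, observed value).
sample = [
    ("young", 10.0),
    ("young", 14.0),
    ("old", 30.0),
    ("old", 20.0),
    ("old", 25.0),
]
# Hypothetical known population counts per group.
population = {"young": 600, "old": 600}

weights = poststratified_weights([group for group, _ in sample], population)
estimated_total = sum(w * y for w, (_, y) in zip(weights, sample))
print(weights, estimated_total)
```

The sketch gives only a point estimate. Variance estimation under a complex design, which the packages above provide, is the harder part.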

Time series and seasonal adjustment (GSBPM 5.6 | 5.7)

  • X-13ARIMA-SEATS. Seasonal adjustment software produced, maintained, and distributed by the US Census Bureau.
  • R package seasonal. Interface to the X-13ARIMA-SEATS program from R, with a Shiny-based GUI.
  • R package x12. Alternative interface to the X-13ARIMA-SEATS program from R, with a focus on batch processing of time series.
  • JDemetra+. The seasonal adjustment software officially recommended for the European Statistical System.

Output validation (GSBPM 6.2)

  • R package validate. Rule management and data validation.

Statistical disclosure control (GSBPM 6.4)

  • Argus and SDC Tools. Tools like Tau-Argus and Mu-Argus for statistical disclosure control from Statistics Netherlands and the Statistical disclosure control network.
  • R package sdcMicro. Disclosure control for statistical microdata.
  • R package sdcTable. Disclosure control for tabulated data.
  • R package simPop. Simulation of synthetic populations from census/survey data considering auxiliary information.
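
One basic check in microdata disclosure control is to count how often each combination of identifying variables occurs, and to flag records whose combination is shared by fewer than k records (a k-anonymity check). The sketch below shows that frequency count only. It is not the API of the tools listed above; `risky_records`, the variable names, and the data are hypothetical.

```python
from collections import Counter


def risky_records(records, key_variables, k=3):
    """Return the positions of records whose combination of key variables
    occurs fewer than k times in the data set."""
    keys = [tuple(record[v] for v in key_variables) for record in records]
    frequencies = Counter(keys)
    return [i for i, key in enumerate(keys) if frequencies[key] < k]


# Hypothetical microdata with three quasi-identifying variables.
microdata = [
    {"region": "North", "age_group": "30-39", "sex": "F"},
    {"region": "North", "age_group": "30-39", "sex": "F"},
    {"region": "North", "age_group": "30-39", "sex": "F"},
    {"region": "South", "age_group": "60-69", "sex": "M"},
    {"region": "North", "age_group": "30-39", "sex": "M"},
]
flagged = risky_records(microdata, ["region", "age_group", "sex"], k=2)
print(flagged)
```

Flagged records are candidates for protection, for example by recoding categories into broader groups or by suppressing individual values.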

Statistical Dissemination (GSBPM 7.2)

  • SDMX Converter. Converter between different versions of SDMX and formats such as CSV, FLR etc. from Eurostat.
  • SDMX-RI. Framework for disseminating data in SDMX webservices from Eurostat.
  • R package rsdmx. Writing SDMX from R.
  • StatMiner. Experimental visualization framework from Statistics Netherlands (github, demo).
  • SDMX-JSON. JSON variant of SDMX. This is still a candidate standard.
  • JSON-Stat. Lightweight JSON based message format for statistical dissemination.
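
To give an impression of how lightweight a JSON-stat message is, the sketch below flattens a small dataset into one row per cell. It assumes a JSON-stat 2.0 style `dataset` object in which `value` is a dense array in row-major order (the last dimension varies fastest) and every dimension carries a `category.index`; sparse values and other optional parts of the format are not handled. The helper names `category_ids` and `flatten` and the example data are hypothetical.

```python
import json
from itertools import product


def category_ids(dimension):
    """Return the category ids of a dimension in their declared order.
    The index may be a list of ids or an object mapping id to position."""
    index = dimension["category"]["index"]
    if isinstance(index, dict):
        return sorted(index, key=index.get)
    return list(index)


def flatten(dataset):
    """Turn a dense dataset into a list of rows, one per cell."""
    ids = dataset["id"]
    categories = [category_ids(dataset["dimension"][d]) for d in ids]
    rows = []
    # product() varies the last dimension fastest, matching row-major order.
    for position, combination in enumerate(product(*categories)):
        row = dict(zip(ids, combination))
        row["value"] = dataset["value"][position]
        rows.append(row)
    return rows


# Hypothetical two-dimensional dataset.
document = json.loads("""
{"version": "2.0", "class": "dataset",
 "id": ["sex", "year"], "size": [2, 2],
 "dimension": {
   "sex": {"category": {"index": {"F": 0, "M": 1}}},
   "year": {"category": {"index": ["2015", "2016"]}}},
 "value": [10, 11, 20, 21]}
""")
rows = flatten(document)
for row in rows:
    print(row)
```

For real data, a dedicated reader such as the rjstat package listed under "Access to official statistics" is the safer choice.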

Visualisation (GSBPM 7.2)

  • R package tabplot. Compare up to about 10-20 variables simultaneously using a tableplot. See also tabplotd3 for a web-based GUI.
  • R package tmap. Thematic geographic maps, including bubble charts, choropleths, and more.
  • cartomap. A (growing) list of simplified maps useful for web cartography of the world, Europe, and individual countries.
  • R package treemap. Space-filling visualisation of hierarchical data.

Access to official statistics (GSBPM 7.4)

  • R package rsdmx. Easy access to data from statistical organisations that support SDMX webservices. The package contains a list of SDMX access points of various national and international statistical institutes.
  • R package oecd. Search and Extract Data from the OECD.
  • R package sorvi. Finnish Open Government Data Toolkit.
  • R package eurostat. Tools to download data from the Eurostat database together with search and manipulation utilities.
  • R package acs. Download, Manipulate, and Present American Community Survey and Decennial Data from the US Census.
  • R package inegiR. Access to data published by INEGI, Mexico's official statistics agency.
  • R package cbsodataR. Access to Statistics Netherlands' (CBS) open data API from R.
  • npm package cbsodata.js. Access to Statistics Netherlands' (CBS) open data API from JavaScript.
  • R package rjstat. Read and write data sets in the JSON-stat format.
  • R package censusapi. A wrapper for the U.S. Census Bureau APIs that returns data frames of Census data and metadata.
  • R package nsoApi builds on other packages to access data from official statistics and tries to harmonize the API.

Contributions

Awesome contributions are welcome. Here are the ways to contribute:

  • The GitHub way: send us a pull request to add directly to this list.
  • Add an item to the issue tracker (you need a GitHub account).
  • Send an e-mail to mark dot vanderloo at gmail dot com or olav dot tenbosch at gmail dot com, or tweet @markvdloo.

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contributors

alexkowa, dickoa, djhurio, edwindj, markvanderloo, olavtenbosch
