horker / psstan

PowerShell module to use Stan nicely in PowerShell

License: MIT License

psstan

psstan is a PowerShell module that provides helper cmdlets for working with Stan from PowerShell.

This module is built on top of CmdStan v2.18.1.

Installation

The module is available on the PowerShell Gallery.

Install-Module -Scope CurrentUser psstan

Configuration

  1. Download and install CmdStan by following the instructions on the official CmdStan site. The "CmdStan Interface User's Guide", available on the release page, contains step-by-step instructions for installing CmdStan on Windows.

  2. Define the variables $PSSTAN_PATH and $PSSTAN_TOOLS_PATHS (in your profile.ps1, for example). The former should be set to the directory where CmdStan is installed. The latter should be an array of the directories that contain g++ and make, which are needed to compile Stan models. For example:

Import-Module psstan
$PSSTAN_PATH = "C:\your_app_path\cmdstan"
$PSSTAN_TOOLS_PATHS = @(
    "C:\RTools\bin"
    "C:\RTools\mingw_64\bin"
)
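Before compiling a model, it can help to confirm that the configured directories actually exist. The following is a minimal sketch and not part of psstan: Get-MissingDirectory is a hypothetical helper name, and the paths are the placeholder values from the example above.

```powershell
# Hypothetical helper (not part of psstan): emits each directory
# in -Path that does not exist on disk.
function Get-MissingDirectory {
    param(
        [Parameter(Mandatory)]
        [string[]]$Path
    )
    $Path | Where-Object { -not (Test-Path -LiteralPath $_ -PathType Container) }
}

# Placeholder values copied from the configuration example; adjust to your setup.
$PSSTAN_PATH = "C:\your_app_path\cmdstan"
$PSSTAN_TOOLS_PATHS = @(
    "C:\RTools\bin"
    "C:\RTools\mingw_64\bin"
)

# @(...) keeps the result an array even when zero or one directory is missing.
$missing = @(Get-MissingDirectory -Path (@($PSSTAN_PATH) + $PSSTAN_TOOLS_PATHS))
if ($missing.Count -gt 0) {
    Write-Warning ("Directories not found: " + ($missing -join ", "))
}
```

Running this from profile.ps1 right after the variable definitions would surface a mistyped path at shell startup rather than at compile time.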

Cmdlet Synopsis

New-StanExecutable [[-Path] <string>] [[-MakeOptions] <string>]

This cmdlet compiles a Stan model file (.stan) and builds an executable for training. Internally, this cmdlet uses the make build tool, which calls stanc and g++ in turn.

As the input file, you can specify either a .stan model file or the target .exe executable.


Start-StanSampling [-ModelFile] <string> [-DataFile] <string> [[-ChainCount] <int>] [[-OutputFile] <string>] [[-CombinedFile] <string>] [[-ConsoleFile] <string>] [[-Parallel]] [[-NumSamples] <int>] [[-NumWarmup] <int>] [[-SaveWarmup] <bool>] [[-Thin] <int>] [[-RandomSeed] <int>] [[-Option] <string>]

This cmdlet starts sampling based on the model file specified by the -ModelFile parameter. When the executable corresponding to the model file does not exist or is older than the model file, the cmdlet compiles the model file first.

The result of sampling is written to the file specified by the -OutputFile parameter. The value of the -OutputFile parameter should contain '{0}' as the placeholder for the chain number. The default value of the -OutputFile parameter is output{0}.csv (the output file names will be output1.csv, output2.csv, and so on).
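The '{0}' placeholder has the same form as a .NET composite format item, so the expansion of an -OutputFile value can be previewed with PowerShell's -f operator. This is only an illustration; whether psstan uses -f internally is an assumption.

```powershell
# Default value of -OutputFile, as stated above.
$outputFile = 'output{0}.csv'

# Expand the placeholder for chains 1 to 3 with the -f format operator.
$names = 1..3 | ForEach-Object { $outputFile -f $_ }
$names
```

This prints output1.csv, output2.csv and output3.csv, matching the naming pattern described above.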

Additionally, stripped versions of the outputs (that is, without any diagnostic information in them) are saved. Their file names carry the _stripped suffix (for example, output_stripped1.csv).

You can specify the number of sampling chains with the -ChainCount parameter. If you add the -Parallel switch parameter, the chains run in parallel.

The outputs of all chains are combined into a single file, which is saved to the path specified by the -CombinedFile parameter. The default value of the -CombinedFile parameter is combined.csv.

The other parameters have the same effect as the corresponding arguments of the original CmdStan executable.
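With this many parameters, splatting keeps a call readable. The sketch below uses only parameter names from the synopsis above; the values are illustrative, and the data file name bernoulli.data.R is an assumption about what sits next to the model file.

```powershell
# Parameter names come from the synopsis above; the values are illustrative.
$samplingArgs = @{
    ModelFile  = 'bernoulli.stan'
    DataFile   = 'bernoulli.data.R'   # assumed data file name
    ChainCount = 4
    Parallel   = $true                # switch parameters are splatted as booleans
    NumSamples = 2000
    NumWarmup  = 500
    RandomSeed = 1234
}

# Run only when psstan is loaded, so the snippet is harmless elsewhere.
if (Get-Command Start-StanSampling -ErrorAction SilentlyContinue) {
    Start-StanSampling @samplingArgs
}
```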


Show-StanSummary [-Path] <string> [[-SigFig] <int>] [[-Autocorr] <int>] [[-CsvFile] <string>]

This cmdlet reads the output file generated by the CmdStan executable and displays its summary. Internally, this cmdlet calls the stansummary command and accepts the same optional arguments.


Get-StanSummary [-Path] <string> [[-Autocorr] <int>]

This cmdlet reads the output file generated by the CmdStan executable and returns its summary as PSObjects. Internally, this cmdlet calls the stansummary command.


ConvertTo-StanData [-InputObject] <psobject> [[-DataCountName] <string>] [[-AsString]]

This cmdlet takes objects from the input stream and converts them to StanData objects from which you can produce output in the R data format that CmdStan requires as training data.


New-StanData [-Name] <string> [-Data] <double[]> [[-Dimensions] <int[]>]
New-StanData [-Name] <string> [-First] <double> [-Last] <double> [[-Dimensions] <int[]>]
New-StanData [-Name] <string> [-Type] {integer | double} [-Count] <int> [[-Dimensions] <int[]>]

This cmdlet creates StanData objects directly to prepare data in the R data format. See the Examples section for more details.

Examples

Example 1

The following session shows how to compile and train the bernoulli.stan model file included in the CmdStan source code.

PS> cd C:\your_app_path\cmdstan\examples\bernoulli
PS> Start-StanSampling bernoulli.stan -ChainCount 2 -Parallel
:
(snip)
:
PS> dir *.csv | fw

    Directory: C:\your_app_path\cmdstan\examples\bernoulli

combined.csv          output1.csv
output2.csv           output_stripped1.csv
output_stripped2.csv

PS> Show-StanSummary output1.csv
Inference for Stan model: bernoulli_model
1 chains: each with iter=(1000); warmup=(0); thin=(1); 1000 iterations saved.

Warmup took (0.011) seconds, 0.011 seconds total
Sampling took (0.043) seconds, 0.043 seconds total

                Mean     MCSE   StdDev     5%   50%   95%    N_Eff  N_Eff/s    R_hat
lp__            -7.3 3.3e-002 7.3e-001   -8.8  -7.0  -6.8 4.8e+002 1.1e+004 1.0e+000
accept_stat__   0.91 4.4e-003 1.4e-001   0.63  0.97   1.0 9.8e+002 2.3e+004 1.0e+000
stepsize__       1.1 2.2e-015 1.6e-015    1.1   1.1   1.1 5.0e-001 1.2e+001 1.0e+000
treedepth__      1.4 1.7e-002 4.9e-001    1.0   1.0   2.0 8.2e+002 1.9e+004 1.0e+000
n_leapfrog__     2.3 3.2e-002 9.7e-001    1.0   3.0   3.0 9.2e+002 2.1e+004 1.0e+000
divergent__     0.00      nan 0.0e+000   0.00  0.00  0.00      nan      nan      nan
energy__         7.8 4.7e-002 9.7e-001    6.8   7.5   9.7 4.3e+002 1.0e+004 1.0e+000
theta           0.24 7.4e-003 1.2e-001  0.075  0.23  0.45 2.6e+002 6.1e+003 1.0e+000

Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).

PS> $params = Get-StanSummary output1.csv
PS> $params.theta

name    : theta
Mean    : 0.244519
MCSE    : 0.00736044
StdDev  : 0.119169
5%      : 0.0748537
50%     : 0.229183
95%     : 0.45154
N_Eff   : 262.133
N_Eff/s : 6096.12
R_hat   : 1.00098

PS>
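Because Get-StanSummary returns objects, the summary values can be used in scripts. The sketch below builds a stand-in object from the values shown above so that it runs without psstan. The 1.1 threshold is a common rule of thumb rather than anything psstan defines, and the type of the R_hat property is an assumption, hence the cast.

```powershell
# Stand-in for the object shown above; with psstan loaded you would use
# (Get-StanSummary output1.csv).theta instead.
$theta = [pscustomobject]@{
    name  = 'theta'
    Mean  = 0.244519
    N_Eff = 262.133
    R_hat = 1.00098
}

# The [double] cast guards against the property arriving as a string.
$converged = [double]$theta.R_hat -lt 1.1
"{0}: R_hat = {1}, converged = {2}" -f $theta.name, $theta.R_hat, $converged
```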

Example 2

The following example shows how to prepare a file in the R data format with the ConvertTo-StanData cmdlet.

PS> Get-Content example.csv
age,income
21,413
34,599
40,779
PS> Import-Csv example.csv | ConvertTo-StanData -DataCountName N | Set-Content example.data.R
PS> Get-Content example.data.R
age <- c(21, 34, 40)
income <- c(413, 599, 779)
N <- 3
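To make the target format concrete, the same three lines can be produced by hand. This is a rough sketch of the conversion and not psstan's implementation: Format-RVector is a hypothetical helper, and it handles only flat numeric columns.

```powershell
# Hypothetical helper (not psstan's implementation): formats one named
# numeric vector as an R assignment such as  age <- c(21, 34, 40)
function Format-RVector {
    param(
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][double[]]$Values
    )
    # Use the invariant culture so the decimal separator is always ".".
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $items = $Values | ForEach-Object { $_.ToString($culture) }
    "{0} <- c({1})" -f $Name, ($items -join ", ")
}

# Same records as example.csv above, inlined so the snippet is self-contained.
$rows = @'
age,income
21,413
34,599
40,779
'@ | ConvertFrom-Csv

$lines = foreach ($column in 'age', 'income') {
    # Member enumeration ($rows.age) requires PowerShell 3.0 or later.
    Format-RVector -Name $column -Values ($rows.$column)
}
$lines
"N <- $($rows.Count)"
```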

Example 3

The following example shows how to generate data records in the R data format programmatically.

PS> New-StanData array 10, 20, 30, 40 | Set-Content example2.data.R
PS> New-StanData struct 1, 0, 0, 0, 1, 0, 0, 0, 1 -Dimensions 3, 3 | Add-Content example2.data.R
PS> New-StanData zero_values -Type double -Count 10 | Add-Content example2.data.R
PS> New-StanData range -First 100 -Last 200 | Add-Content example2.data.R
PS> Get-Content example2.data.R
array <- c(10, 20, 30, 40)
struct <- structure(c(1, 0, 0, 0, 1, 0, 0, 0, 1), .Dim = c(3, 3))
zero_values <- double(10)
range <- 100:200

To-Do

  • Documentation
  • Jagged array support in New-StanData

License

This module is licensed under the MIT License. See LICENSE.txt for more information.
