This package connects to the StatBank API from Statistics Denmark.
This package is in early BETA and new changes will most likely not have backward compatibility.
You can only install the development version from github, using Hadley Wickham's devtools package:
if(!require("devtools")) install.packages("devtools")
library("devtools")
install_github("krose/dkstat")
Here are a few simple examples that will go through the basics of requesting data from the StatBank and the structure of the output.
First, we'll load the library:
library(dkstat)
The dst_meta function retrieves meta data from the table you wan't to take a closer look at. It can be used to create the final request, but if you can figure out the structure of the query you can define it yourself.
We'll get some meta data from the AULAAR table. The AULAAR table has net unemployment numbers.
aulaar_meta <- dst_meta(table = "AULAAR", lang = "en")
The 'dst_meta' function returns a list with 4 objects: - basics - variables - values - basic_query
Let's see what the basics contains:
aulaar_meta$basics
## $id
## [1] "AULAAR"
##
## $text
## [1] "Net unemployed"
##
## $description
## [1] "Net unemployed by sex, persons/pct. and time"
##
## $unit
## [1] "Number"
##
## $updated
## [1] "2014-06-04T09:00:00"
There's a table id, a short description, a unit description and when the table was updated.
The variables in the list has a short description of each variable as well as the id:
aulaar_meta$variables
## id text
## 1 KØN sex
## 2 PERPCT persons/pct.
## 3 Tid time
The values is a list object of all the variable id's you can use to construct your final query:
str(aulaar_meta$values)
## List of 3
## $ KØN :'data.frame': 3 obs. of 2 variables:
## ..$ id : chr [1:3] "TOT" "M" "K"
## ..$ text: chr [1:3] "Total" "Men" "Women"
## $ PERPCT:'data.frame': 2 obs. of 2 variables:
## ..$ id : chr [1:2] "L10" "L9"
## ..$ text: chr [1:2] "Per cent of the labour force" "Unemployed (thousands)"
## $ Tid :'data.frame': 35 obs. of 2 variables:
## ..$ id : chr [1:35] "1979" "1980" "1981" "1982" ...
## ..$ text: chr [1:35] "1979" "1980" "1981" "1982" ...
The basic query is simply a list of the available variables with the first value id for each.
aulaar_meta$basic_query
## $KØN
## [1] "TOT"
##
## $PERPCT
## [1] "L10"
##
## $Tid
## [1] "2013"
If you know the table ids from the table you can simply supply the request through ...
aulaar <- dst_get_data(table = "AULAAR", KØN = "TOT", PERPCT = "L10", Tid = 2013,
lang = "en",
value_presentation = "ValueAndCode")
str(aulaar)
## 'data.frame': 1 obs. of 4 variables:
## $ KØN : chr "Total TOT"
## $ PERPCT: chr "Per cent of the labour force L10"
## $ TID : int 2013
## $ value : num 4.4
Let's use the basic_query from the dst_meta list to make our first query:
aulaar <- dst_get_data(table = "AULAAR",
query = aulaar_meta$basic_query,
lang = "en",
value_presentation = "ValueAndCode")
str(aulaar)
## 'data.frame': 1 obs. of 4 variables:
## $ KØN : chr "Total TOT"
## $ PERPCT: chr "Per cent of the labour force L10"
## $ TID : int 2013
## $ value : num 4.4
This is maybe not really what you want, so let's use the basic_query to construct a new query that might be better. I still want to have the total and percentage unemployed, but I would like all the observations going back to 1979. I'll now construct the final request, query the StatBank and make a plot.
aulaar_meta$basic_query$Tid <- aulaar_meta$values$Tid$id
aulaar <- dst_get_data(table = "AULAAR",
query = aulaar_meta$basic_query,
lang = "en",
format = "CSV",
value_presentation = "ValueAndCode")
plot(x = aulaar$TID,
y = aulaar$value,
main = "Unemployment",
xlab = "Year",
ylab = "Pct",
type = "l")
If you want the complete timeseries you can write "*" in the TID variable in the basic_query or like this:
aulaar <- dst_get_data(table = "AULAAR", KØN = "TOT", PERPCT = "L10", Tid = "*",
lang = "en",
value_presentation = "ValueAndCode")
str(aulaar)
## 'data.frame': 35 obs. of 4 variables:
## $ KØN : chr "Total TOT" "Total TOT" "Total TOT" "Total TOT" ...
## $ PERPCT: chr "Per cent of the labour force L10" "Per cent of the labour force L10" "Per cent of the labour force L10" "Per cent of the labour force L10" ...
## $ TID : int 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 ...
## $ value : num 6.1 6.9 8.9 9.5 10 9.6 8.6 7.5 7.5 8.3 ...