juliastats / rdatasets.jl

Julia package for loading many of the data sets available in R

License: GNU General Public License v3.0

Julia 46.63% R 53.37%

rdatasets.jl's People

Contributors

alyst, andreasnoack, asinghvi17, bjarthur, bkamins, boathit, dilumaluthge, dmbates, garborg, github-actions[bot], ilanreinstein, johnmyleswhite, juliangehring, juliatagbot, laborg, matthieugomez, mneilsen, nalimilan, nignatiadis, pallharaldsson, petershintech, powerdistribution, randyzwitch, simondanisch, stewartwatts, stonecypher, swt30, tkelman, tlienart, tlnagy

rdatasets.jl's Issues

RDatasets.datasets(package_name) returns only one row

julia> RDatasets.datasets("datasets")
1×5 DataFrame
│ Row │ Package  │ Dataset │ Title                     │ Rows │ Columns │
├─────┼──────────┼─────────┼───────────────────────────┼──────┼─────────┤
│ 1   │ datasets │ BOD     │ Biochemical Oxygen Demand │ 6    │ 2       │

julia> RDatasets.datasets("mlmRev")
1×5 DataFrame
│ Row │ Package │ Dataset │ Title                               │ Rows  │ Columns │
├─────┼─────────┼─────────┼─────────────────────────────────────┼───────┼─────────┤
│ 1   │ mlmRev  │ Chem97  │ Scores on A-level Chemistry in 1997 │ 31022 │ 8       │

julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

Error encountered while loading

dataset("datasets","iris")
Error encountered while loading "C:\Users\hafez\.julia\packages\RDatasets\WIQKI\src\..\data\datasets\iris.rda".

Fatal error:
ERROR: UndefVarError: identifier not defined
Stacktrace:
[1] handle_error(::UndefVarError, ::FileIO.File{FileIO.DataFormat{:RData}}) at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\error_handling.jl:82
[2] handle_exceptions(::Array{Any,1}, ::String) at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\error_handling.jl:77
[3] load(::FileIO.Formatted; options::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\loadsave.jl:186
[4] load at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\loadsave.jl:166 [inlined]
[5] #load#13 at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\loadsave.jl:118 [inlined]
[6] load at C:\Users\hafez\.julia\packages\FileIO\ZknoK\src\loadsave.jl:118 [inlined]
[7] dataset(::String, ::String) at C:\Users\hafez\.julia\packages\RDatasets\WIQKI\src\dataset.jl:12
[8] top-level scope at none:0

Unable to locate file .julia/v0.3/RDatasets/data/RDatasets/iris.rda

I was using RDatasets yesterday without a problem, but today I'm getting this error (and I didn't update Julia or anything):

ERROR: Unable to locate file /Users/vitoraguiar/.julia/v0.3/RDatasets/data/RDatasets/iris.rda or /Users/vitoraguiar/.julia/v0.3/RDatasets/data/RDatasets/iris.csv.gz

 in error at error.jl:21
 in dataset at /Users/vitoraguiar/.julia/v0.3/RDatasets/src/dataset.jl:11

The correct location is actually

 /Users/vitoraguiar/.julia/v0.3/RDatasets/data/datasets/iris.rda

So I had to manually rename the directory:

cd /Users/vitoraguiar/.julia/v0.3/RDatasets/data/
mv datasets RDatasets

Now it works!
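The lookup that failed above can be sketched as a small path check. This is only an illustration, not the package's own code: `locate_dataset` is a hypothetical helper, and the two extensions it tries are the ones named in the error message.

```julia
# Hypothetical helper (not part of RDatasets): return the first existing
# file for a dataset, trying the two extensions named in the error message.
function locate_dataset(data_root::AbstractString, package::AbstractString, name::AbstractString)
    for ext in (".rda", ".csv.gz")
        candidate = joinpath(data_root, package, name * ext)
        isfile(candidate) && return candidate
    end
    return nothing  # neither file exists under this package directory
end

# Demonstration on a temporary directory standing in for RDatasets/data
demo_root = mktempdir()
mkpath(joinpath(demo_root, "datasets"))
touch(joinpath(demo_root, "datasets", "iris.rda"))

found = locate_dataset(demo_root, "datasets", "iris")       # file exists here
not_found = locate_dataset(demo_root, "RDatasets", "iris")  # wrong directory name
```

Running a check like this against the real data directory would show whether the package name passed to the loader matches the directory name on disk, which may be less invasive than renaming the directory.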

meta-data lost

I noticed that:
RDatasets.dataset("datasets", "Titanic")

doesn't keep the variable types; it just converts everything to strings and integers (maybe because the data is converted to a text file in between?).

Compare with:
RCall.rcopy(R"as.data.frame(datasets::Titanic)")

DataFrames not installed by default

I don't know enough about the package system, but for some reason DataFrames isn't installed by default. Starting from a clean .julia directory:

julia> Pkg.add("RDatasets")
Installing RDatasets: v0.0.0
Cloning into 'RDatasets'...
remote: Counting objects: 582, done.
remote: Compressing objects: 100% (571/571), done.
remote: Total 582 (delta 7), reused 582 (delta 7)
Receiving objects: 100% (582/582), 10.53 MiB | 1.87 MiB/s, done.
Resolving deltas: 100% (7/7), done.

julia> load("RDatasets") 
Cannot find DataFrames.jl
 in realpath at file.jl:151
 in find_in_path at util.jl:197
 in require at util.jl:174
 in load_now at util.jl:228
 in load_now at util.jl:242
 in require at util.jl:176
at /Users/simon/.julia/RDatasets/src/RDatasets.jl:1
 in load_now at util.jl:228
 in load_now at util.jl:242
 in require at util.jl:176
 in load_now at util.jl:253
 in require at util.jl:176

RDatasets produces NaNs instead of NAs on Windows 10

using RDatasets, DataFrames
df = dataset("mlmRev","Gcsemv");

Produces:
│ 1 │ "20920" │ "16" │ "M" │ 23.0 │ NaN │
│ 2 │ "20920" │ "25" │ "F" │ NaN │ 71.2 │
│ 3 │ "20920" │ "27" │ "F" │ 39.0 │ 76.8 │
│ 4 │ "20920" │ "31" │ "F" │ 36.0 │ 87.9 │
│ 5 │ "20920" │ "42" │ "M" │ 16.0 │ 44.4 │
│ 6 │ "20920" │ "62" │ "F" │ 36.0 │ NaN │
│ 7 │ "20920" │ "101" │ "F" │ 49.0 │ 89.8 │
│ 8 │ "20920" │ "113" │ "M" │ 25.0 │ 17.5 │
│ 9 │ "20920" │ "146" │ "M" │ NaN │ 32.4 │
│ 10 │ "22520" │ "1" │ "F" │ 48.0 │ 84.2 │

On Windows 10, RDatasets produces NaN values for unavailable values (NAs), unlike on Unix systems.
As a result, the functions dropna() and complete_cases() do not work as needed: no rows are filtered out.
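Why the filtering has no effect can be shown with plain vectors. The sketch below uses current Julia, where `missing` plays the role that NA played at the time of this report. `nan_to_missing` is a hypothetical helper, and the values are the fourth column of the rows shown above.

```julia
# Fourth column of the ten rows shown above
scores = [23.0, NaN, 39.0, 36.0, 16.0, 36.0, 49.0, 25.0, NaN, 48.0]

# NaN is an ordinary Float64, so missing-value checks do not see it
has_missing_before = any(ismissing, scores)

# Hypothetical helper: map every NaN to `missing`
nan_to_missing(v) = [isnan(x) ? missing : x for x in v]

cleaned = nan_to_missing(scores)

# Once the values are `missing`, the usual filters work
complete = collect(skipmissing(cleaned))
```

This only works around the symptom; it does not explain why the loader yields NaN on one platform and not the other.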

Update doc of individual datasets, expose to users

I feel like doc access from the REPL, IJulia, and Juno would make the package much more usable. Format-wise, adding HTML just takes a couple of extra lines in an existing R script; I'm not sure how much extra work rst, etc., would be.

Thoughts on format/API?

Some datasets fail to load

package_directory = Pkg.dir("RDatasets", "data")
for directory in readdir(package_directory)
    for file in readdir(joinpath(package_directory, directory))
        dataname = replace(file, ".csv", "")
        dataname = replace(dataname, ".rda", "")
        try
            data(directory, dataname)
        catch
            @printf "Failed: %s - %s\n" directory dataname
        end
    end
end

yields

Failed: Zelig - friendship
Failed: Zelig - sna.ex
Failed: mlmRev - Oxboys
Failed: mlmRev - bdf

Failing due to new version of CSV.jl 0.5.1

On Julia 1.1 with CSV.jl 0.5.1 I get an error running

using RDatasets
cars = dataset("datasets", "cars")
MethodError: no method matching CSV.File(::Base.GenericIOBuffer{Array{UInt8,1}}; delim=',', quotechar='"', missingstring="NA", rows_for_type_detect=200)
Closest candidates are:
  CSV.File(::Any; header, normalizenames, datarow, skipto, footerskip, limit, transpose, comment, use_mmap, missingstrings, missingstring, delim, ignorerepeated, quotechar, openquotechar, closequotechar, escapechar, dateformat, decimal, truestrings, falsestrings, type, types, typemap, categorical, pool, strict, silencewarnings, debug, parsingdebug, allowmissing) at C:\Users\RTX2080\.julia\packages\CSV\iKwnQ\src\CSV.jl:154 got unsupported keyword argument "rows_for_type_detect"
  CSV.File(::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any) at C:\Users\RTX2080\.julia\packages\CSV\iKwnQ\src\CSV.jl:24 got unsupported keyword arguments "delim", "quotechar", "missingstring", "rows_for_type_detect"
  CSV.File(!Matched::String, !Matched::Array{UInt8,1}, !Matched::Array{Symbol,1}, !Matched::Array{Type,1}, !Matched::UInt8, !Matched::Bool, !Matched::Array{Array{String,1},1}, !Matched::Int64, !Matched::Int64, !Matched::Array{UInt64,1}) at C:\Users\RTX2080\.julia\packages\CSV\iKwnQ\src\CSV.jl:24 got unsupported keyword arguments "delim", "quotechar", "missingstring", "rows_for_type_detect"

Stacktrace:
 [1] kwerr(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type, ::Base.GenericIOBuffer{Array{UInt8,1}}) at .\error.jl:125
 [2] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type{CSV.File}, ::Base.GenericIOBuffer{Array{UInt8,1}}) at .\none:0
 [3] (::getfield(RDatasets, Symbol("##1#2")){String,String})(::IOStream) at C:\Users\RTX2080\.julia\packages\RDatasets\1Ih8s\src\dataset.jl:28
 [4] #open#310(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::getfield(RDatasets, Symbol("##1#2")){String,String}, ::String, ::Vararg{String,N} where N) at .\iostream.jl:369
 [5] open(::Function, ::String, ::String) at .\iostream.jl:367
 [6] dataset(::String, ::String) at C:\Users\RTX2080\.julia\packages\RDatasets\1Ih8s\src\dataset.jl:26
 [7] top-level scope at In[89]:2
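The error above is Julia's generic response to a keyword argument that the called method does not declare; here it is `rows_for_type_detect`, which is absent from the keyword list of the first candidate. A minimal sketch of the same failure mode, using a hypothetical `read_rows` function rather than CSV.jl itself:

```julia
# Hypothetical reader that accepts only the keywords it declares
read_rows(io; delim = ',', missingstring = "NA") = split(read(io, String), delim)

# A call that uses only declared keywords succeeds
fields = read_rows(IOBuffer("a,b"); delim = ',')

# Passing a keyword the method does not declare raises a MethodError
result = try
    read_rows(IOBuffer("a,b"); delim = ',', rows_for_type_detect = 200)
catch err
    err
end
```

So a caller that still passes a keyword removed from a newer release of its dependency fails at call time, not at install time, unless version bounds keep the two in step.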

Missing AirlinePassengers?

I'm filing an issue instead of just adding it because that one has been added upstream in the same commit as others (for example datasets/CO2) which are in here. Why? Do you plan to stay in sync or was vincentarelbundock/Rdatasets just to get started?

no method matching getindex(::Void, ::String) on missing RData

I am using Julia 0.6 with RDatasets installed and passing its tests, but without RData installed. When I ask for a dataset, RDatasets correctly detects that RData is missing and offers to install it for me. The installation appears to succeed, but the call ends with

should we install RData for you? (y/n):
y
INFO: Start installing RData...
INFO: Installing RData v0.0.4
INFO: Building Rmath
INFO: Package database updated
ERROR: MethodError: no method matching getindex(::Void, ::String)
 in dataset(::String, ::String) at /home/colin/.julia/v0.6/RDatasets/src/dataset.jl:6

RData seems to be correctly installed, since I can now request a dataset and it is displayed.
I tried a checkout of RDatasets and got the same behaviour.
I also tried Pkg.rm("RData") and the behaviour repeated.
On Julia 0.4.6 I get a depwarn error, which is quite different, but I will hold off on that report.

Can't open Gzipped dataset

julia> dataset("MASS", "Boston")
ERROR: MethodError: no method matching position(::TranscodingStream{CodecZlib.GzipDecompressor,IOStream})
Closest candidates are:
  position(::IOStream) at iostream.jl:188
  position(::Base.Libc.FILE) at libc.jl:101
  position(::Base.Filesystem.File) at filesystem.jl:225
  ...
Stacktrace:
 [1] consumeBOM!(::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}) at C:\Users\ohadl\.julia\packages\CSV\uLyo0\src\CSV.jl:209
 [2] #File#1(::Int64, ::Bool, ::Int64, ::Nothing, ::Int64, ::Nothing, ::Bool, ::Nothing, ::Bool, ::Array{String,1}, ::String, ::Char, ::Bool, ::Char, ::Nothing, ::Nothing, ::Char, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Dict{Type,Type}, ::Symbol, ::Bool, ::Bool, ::Bool, ::Base.Iterators.Pairs{Symbol,Int64,Tuple{Symbol},NamedTuple{(:rows_for_type_detect,),Tuple{Int64}}}, ::Type, ::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}) at C:\Users\ohadl\.julia\packages\CSV\uLyo0\src\CSV.jl:142
 [3] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type{CSV.File}, ::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}) at .\none:0
 [4] #read#101(::Bool, ::Dict{Int64,Function}, ::Base.Iterators.Pairs{Symbol,Any,NTuple{4,Symbol},NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}}, ::Function, ::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}, ::Type) at C:\Users\ohadl\.julia\packages\CSV\uLyo0\src\CSV.jl:304
 [5] (::getfield(CSV, Symbol("#kw##read")))(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::typeof(CSV.read), ::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}, ::Type) at .\none:0 (repeats 2 times)
 [6] (::getfield(RDatasets, Symbol("##1#2")){String,String})(::TranscodingStream{CodecZlib.GzipDecompressor,IOStream}) at C:\Users\ohadl\.julia\packages\RDatasets\mvYPU\src\dataset.jl:27
 [7] open(::getfield(RDatasets, Symbol("##1#2")){String,String}, ::Type{TranscodingStream{CodecZlib.GzipDecompressor,S} where S<:IO}, ::String, ::String) at C:\Users\ohadl\.julia\packages\TranscodingStreams\SaPZ8\src\stream.jl:157
 [8] dataset(::String, ::String) at C:\Users\ohadl\.julia\packages\RDatasets\mvYPU\src\dataset.jl:26
 [9] top-level scope at none:0

Adding various packages didn't help.
Maybe an API changed? This looks similar.

ERROR: UndefVarError: identifier not defined

julia> import Pkg

julia> Pkg.add("RDatasets")
Updating registry at ~/.julia/registries/General
Updating git-repo https://github.com/JuliaRegistries/General.git
Resolving package versions...
Installed RDatasets ─ v0.6.8
Updating ~/.julia/environments/v1.4/Project.toml
[ce6b1742] + RDatasets v0.6.8
Updating ~/.julia/environments/v1.4/Manifest.toml
[78c3b35d] + Mocking v0.7.1
[df47a6cb] + RData v0.6.3
[ce6b1742] + RDatasets v0.6.8
[f269a46b] + TimeZones v1.2.0

julia> using RDatasets
[ Info: Precompiling RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]
WARNING: could not import DataFrames.identifier into RData

julia> using RDatasets

julia> dataset("datasets", "iris")
Error encountered while loading "/home/clyu/.julia/packages/RDatasets/WIQKI/src/../data/datasets/iris.rda".

Fatal error:
ERROR: UndefVarError: identifier not defined
Stacktrace:
[1] handle_error(::UndefVarError, ::FileIO.File{FileIO.DataFormat{:RData}}) at /home/clyu/.julia/packages/FileIO/ZknoK/src/error_handling.jl:82
[2] handle_exceptions(::Array{Any,1}, ::String) at /home/clyu/.julia/packages/FileIO/ZknoK/src/error_handling.jl:77
[3] load(::FileIO.Formatted; options::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/clyu/.julia/packages/FileIO/ZknoK/src/loadsave.jl:186
[4] load at /home/clyu/.julia/packages/FileIO/ZknoK/src/loadsave.jl:166 [inlined]
[5] #load#13 at /home/clyu/.julia/packages/FileIO/ZknoK/src/loadsave.jl:118 [inlined]
[6] load at /home/clyu/.julia/packages/FileIO/ZknoK/src/loadsave.jl:118 [inlined]
[7] dataset(::String, ::String) at /home/clyu/.julia/packages/RDatasets/WIQKI/src/dataset.jl:12
[8] top-level scope at REPL[7]:1

ERROR: type: readtable: in typeassert, expected Int32, got Int64

Hi,
I am facing an error when using the iris dataset (and some others too). I am using the latest 32-bit Julia build on Windows. I tried different versions of DataFrames with no luck.
Regards,
Dinu

julia> iris = data("datasets", "iris")
ERROR: type: readtable: in typeassert, expected Int32, got Int64
in readtable at c:\TrainingInstall\julia\GitHub\julia-win32\usr\share\julia\packages\DataFrames\src\io.jl:664
in data at c:\TrainingInstall\julia\GitHub\julia-win32\usr\share\julia\packages\RDatasets\src\data.jl:6

julia> Pkg.status()
BinDeps 0.2.0
Cairo 0.2.3
Gadfly 0.1.7
JSON master
RDatasets 0.0.1
Color 0.2.3
ArgParse 0.2.4
Codecs 0.0.0
Compose 0.1.5
DataFrames 0.3.6
Distributions 0.2.0
Iterators 0.1.1
Options 0.2.1
TextWrap 0.1.1
Mustache 0.0.0
GZip 0.2.4
Stats 0.2.3
NumericExtensions 0.2.10

Column of row indices left in the iris csv file

julia> iris=data("datasets","iris")
150x6 DataFrame:
              Sepal.Length Sepal.Width Petal.Length Petal.Width     Species
[1,]        1          5.1         3.5          1.4         0.2    "setosa"
[2,]        2          4.9         3.0          1.4         0.2    "setosa"
[3,]        3          4.7         3.2          1.3         0.2    "setosa"
[4,]        4          4.6         3.1          1.5         0.2    "setosa"
[5,]        5          5.0         3.6          1.4         0.2    "setosa"
[6,]        6          5.4         3.9          1.7         0.4    "setosa"
[7,]        7          4.6         3.4          1.4         0.3    "setosa"
[8,]        8          5.0         3.4          1.5         0.2    "setosa"
[9,]        9          4.4         2.9          1.4         0.2    "setosa"
[10,]      10          4.9         3.1          1.5         0.1    "setosa"

The data set should not have the first (and unnamed) column.

I haven't checked other data sets yet. This may be a widespread "infelicity".
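One way to check a CSV file for the stray column, and to drop it, is sketched below. `drop_row_index` is a hypothetical helper, the sample text is an abbreviated stand-in for the iris file, and the comma split is deliberately naive: it assumes no field contains an embedded comma.

```julia
# Hypothetical helper: drop the first field of every line when the header's
# first field is empty (the unnamed row-index column). Naive comma split:
# assumes no field contains an embedded comma.
function drop_row_index(csv_text::AbstractString)
    lines = split(chomp(csv_text), '\n')
    first_field = strip(first(split(lines[1], ',')), '"')
    isempty(first_field) || return String(csv_text)   # nothing to drop
    stripped = [join(split(line, ',')[2:end], ',') for line in lines]
    return join(stripped, '\n') * "\n"
end

# Abbreviated sample with an unnamed leading column of row numbers
sample = "\"\",\"Sepal.Length\",\"Species\"\n\"1\",5.1,\"setosa\"\n\"2\",4.9,\"setosa\"\n"
without_index = drop_row_index(sample)
```

Text whose header already starts with a named column is returned unchanged, so the helper could be run over every data set to see how widespread the problem is.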

No method matching CSV.File

When trying to load datasets from the "datasets" group I am getting the following error:

ERROR: MethodError: no method matching CSV.File(::Base.GenericIOBuffer{Array{UInt8,1}}; delim=',', quotechar='"', missingstring="NA", rows_for_type_detect=200)
Closest candidates are:
  CSV.File(::Any; header, normalizenames, datarow, skipto, footerskip, limit, transpose, comment, use_mmap, ignoreemptylines, threaded, select, drop, missingstrings, missingstring, delim, ignorerepeated, quotechar, openquotechar, closequotechar, escapechar, dateformat, decimal, truestrings, falsestrings, type, types, typemap, categorical, pool, strict, silencewarnings, debug, parsingdebug, allowmissing) at /Users/maxime/.julia/packages/CSV/76SRf/src/CSV.jl:262 got unsupported keyword argument "rows_for_type_detect"
Stacktrace:
 [1] kwerr(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type{T} where T, ::Base.GenericIOBuffer{Array{UInt8,1}}) at ./error.jl:157
 [2] (::RDatasets.var"#1#2"{String,String})(::IOStream) at /Users/maxime/.julia/packages/RDatasets/1Ih8s/src/dataset.jl:28
 [3] open(::RDatasets.var"#1#2"{String,String}, ::String, ::Vararg{String,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:298
 [4] open at ./io.jl:296 [inlined]
 [5] dataset(::String, ::String) at /Users/maxime/.julia/packages/RDatasets/1Ih8s/src/dataset.jl:26
 [6] top-level scope at REPL[3]:1

Minimum example:

using RDatasets
dataset("datasets", "volcano")

Environment:

Status `~/.julia/environments/julia-tour/Project.toml`
  [336ed68f] CSV v0.6.1
  [a93c6f00] DataFrames v0.20.2
  [c601a237] Interact v0.10.3
  [add582a8] MLJ v0.10.3
  [91a5bcdd] Plots v0.29.9
  [ce6b1742] RDatasets v0.6.1
  [f3b207a7] StatsPlots v0.14.4
  [0f1e0344] WebIO v0.8.13

version info:

julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Compilation error: "Declaring __precompile__(false) is not allowed in files that are being precompiled."

I am testing code that uses the Zelig dataset. I run these commands and get this output:

julia> Pkg.update()
INFO: Updating METADATA...
INFO: Computing changes...
INFO: No packages to install, update or remove
 
julia> Pkg.add("RDatasets")
INFO: Package RDatasets is already installed

julia> using RDatasets
INFO: Precompiling module RDatasets.
WARNING: Module Compat with uuid 229436453490 is missing from the cache.
This may mean module Compat does not support precompilation but is imported by a module that does.
ERROR: LoadError: Declaring __precompile__(false) is not allowed in files that are being precompiled.
Stacktrace:
 [1] _require(::Symbol) at ./loading.jl:455
 [2] require(::Symbol) at ./loading.jl:405
 [3] include_from_node1(::String) at ./loading.jl:576
 [4] include(::String) at ./sysimg.jl:14
 [5] anonymous at ./<missing>:2
while loading ~/.julia/v0.6/FileIO/src/FileIO.jl, in expression starting on line 5
ERROR: LoadError: Failed to precompile FileIO to ~/.julia/lib/v0.6/FileIO.ji.
Stacktrace:
 [1] compilecache(::String) at ./loading.jl:710
 [2] _require(::Symbol) at ./loading.jl:463
 [3] require(::Symbol) at ./loading.jl:405
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
 [6] anonymous at ./<missing>:2
while loading ~/.julia/v0.6/RDatasets/src/RDatasets.jl, in expression starting on line 4
ERROR: Failed to precompile RDatasets to ~/.julia/lib/v0.6/RDatasets.ji.
Stacktrace:
 [1] compilecache(::String) at ./loading.jl:710
 [2] _require(::Symbol) at ./loading.jl:497
 [3] require(::Symbol) at ./loading.jl:405
julia> Pkg.status()
12 required packages:
 - CSV                           0.2.5
 - Cairo                         0.5.2
 - Colors                        0.8.2
 - Compose                       0.6.0
 - DataFrames                    0.11.6
 - Documenter                    0.18.0
 - GLM                           0.11.0
 - Gadfly                        0.7.0
 - Gallium                       0.1.0
 - RDatasets                     0.4.0
 - Revise                        0.1.1
 - StatsModels                   0.2.5
...

I am running Julia 0.6.3 on macOS High Sierra, 10.13.5.

pglm data sets

pglm has four data sets that could be added:

  • Fairness
  • HealthIns
  • PatsRD
  • Unions

Examples in doc cause errors

RDatasets.datasets("mlmRev")

errors with

ERROR: `getindex` has no method matching getindex(::DataFrame, ::ASCIIString)

[PkgEval] RDatasets may have a testing issue on Julia 0.3 (2014-07-14)

PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3). The results of this script are used to generate a package listing enhanced with testing results.

On Julia 0.3

  • On 2014-07-12 the testing status was "Tests pass."
  • On 2014-07-14 the testing status changed to "Package doesn't load."

"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.

"Package doesn't load." means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with "using" failed.

This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.

Test log:

INFO: Installing ArrayViews v0.4.6
INFO: Installing DataArrays v0.1.12
INFO: Installing DataFrames v0.5.6
INFO: Installing GZip v0.2.13
INFO: Installing RDatasets v0.1.1
INFO: Installing Reexport v0.0.1
INFO: Installing SortingAlgorithms v0.0.1
INFO: Installing StatsBase v0.5.3
INFO: Package database updated
Warning: could not import Sort.sortby into DataFrames
Warning: could not import Sort.sortby! into DataFrames
ERROR: repl_show not defined
 in include at ./boot.jl:245
 in include_from_node1 at ./loading.jl:128
 in include at ./boot.jl:245
 in include_from_node1 at ./loading.jl:128
 in reload_path at loading.jl:152
 in _require at loading.jl:67
 in require at loading.jl:54
 in include at ./boot.jl:245
 in include_from_node1 at ./loading.jl:128
 in reload_path at loading.jl:152
 in _require at loading.jl:67
 in require at loading.jl:51
 in include at ./boot.jl:245
 in include_from_node1 at loading.jl:128
 in process_options at ./client.jl:285
 in _start at ./client.jl:354
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/dataframe/reshape.jl, in expression starting on line 163
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/DataFrames.jl, in expression starting on line 110
while loading /home/idunning/pkgtest/.julia/v0.3/RDatasets/src/RDatasets.jl, in expression starting on line 2
while loading /home/idunning/pkgtest/.julia/v0.3/RDatasets/testusing.jl, in expression starting on line 1
INFO: Package database updated

Note this is possibly due to removal of deprecated functions in Julia 0.3-rc1: JuliaLang/julia#7609

[PkgEval] RDatasets may have a testing issue on Julia 0.4 (2014-10-08)

PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.

On Julia 0.4

  • On 2014-10-05 the testing status was "Tests pass."
  • On 2014-10-08 the testing status changed to "Tests fail, but package loads."

"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.

"Tests fail, but package loads." means that PackageEvaluator found the tests for your package, executed them, and they didn't pass. However, trying to load your package with "using" worked.

Special message from @IainNZ: This change may be due to breaking changes to Dict in JuliaLang/julia#8521, or the removal of deprecated syntax in JuliaLang/julia#8607.

This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.

Test log:

>>> 'Pkg.add("RDatasets")' log
INFO: Installing ArrayViews v0.4.6
INFO: Installing DataArrays v0.2.2
INFO: Installing DataFrames v0.5.9
INFO: Installing GZip v0.2.13
INFO: Installing RDatasets v0.1.1
INFO: Installing Reexport v0.0.1
INFO: Installing SortingAlgorithms v0.0.2
INFO: Installing StatsBase v0.6.6
INFO: Package database updated
INFO: METADATA is out-of-date a you may not have the latest version of RDatasets
INFO: Use `Pkg.update()` to get the latest versions of your packages

>>> 'using RDatasets' log

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:98.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:122.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Float64)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:162.
Use "Dict{T,Float64}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:192.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>W)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:193.
Use "Dict{T,W}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:66.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:77.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/DataFrames/src/RDA.jl:11.
Use "Dict(a=>b, ...)" instead.
Julia Version 0.4.0-dev+998
Commit e24fac0 (2014-10-07 22:02 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

>>> test log

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:98.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/scalarstats.jl:122.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Float64)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:162.
Use "Dict{T,Float64}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:192.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>W)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/counts.jl:193.
Use "Dict{T,W}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:66.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "(T=>Int)[]" at /home/idunning/pkgtest/.julia/v0.4/StatsBase/src/misc.jl:77.
Use "Dict{T,Int}()" instead.

WARNING: deprecated syntax "[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/DataFrames/src/RDA.jl:11.
Use "Dict(a=>b, ...)" instead.
Running tests:
 * dataset.jl

ERROR: `Dict{Symbol,Union(AbstractArray{Real,1},Real)}` has no method matching Dict{Symbol,Union(AbstractArray{Real,1},Real)}(::(Symbol,Symbol,Symbol,Symbol,Symbol), ::(Int64,Int64,Int64,Int64,Int64))
 in DataFrame at /home/idunning/pkgtest/.julia/v0.4/DataFrames/src/dataframe/dataframe.jl:85
 in DataFrame at /home/idunning/pkgtest/.julia/v0.4/DataFrames/src/RDA.jl:309
 in dataset at /home/idunning/pkgtest/.julia/v0.4/RDatasets/src/dataset.jl:6
 in include at ./boot.jl:245
 in include_from_node1 at ./loading.jl:128
 in anonymous at no file:17
 in include at ./boot.jl:245
 in include_from_node1 at loading.jl:128
 in process_options at ./client.jl:293
 in _start at ./client.jl:362
 in _start_3B_3789 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so
while loading /home/idunning/pkgtest/.julia/v0.4/RDatasets/test/dataset.jl, in expression starting on line 7
while loading /home/idunning/pkgtest/.julia/v0.4/RDatasets/test/runtests.jl, in expression starting on line 15

INFO: Testing RDatasets
==============================[ ERROR: RDatasets ]==============================

failed process: Process(`/home/idunning/julia04/usr/bin/julia /home/idunning/pkgtest/.julia/v0.4/RDatasets/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
INFO: No packages to install, update or remove
ERROR: RDatasets had test errors
 in error at error.jl:21
 in test at pkg/entry.jl:719
 in anonymous at pkg/dir.jl:28
 in cd at ./file.jl:20
 in cd at pkg/dir.jl:28
 in test at pkg.jl:68
 in process_options at ./client.jl:221
 in _start at ./client.jl:362
 in _start_3B_3789 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so

>>> end of log
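The deprecation warnings in the log name two replacement forms for dictionary construction. A short sketch of both in current Julia syntax; the variable names and contents are illustrative only:

```julia
# Replacement for the deprecated "(T=>Int)[]" form: an empty typed Dict
counts = Dict{String,Int}()
for word in ["a", "b", "a"]
    counts[word] = get(counts, word, 0) + 1
end

# Replacement for the deprecated "[a=>b, ...]" form: a Dict built from pairs
codes = Dict("NA" => 0, "ok" => 1)
```

The warnings themselves come from StatsBase and DataFrames, so the change would have to be made in those packages, not in RDatasets.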

Error with CSV 0.6

On RDatasets#master:

julia> using RDatasets
julia> RDatasets.datasets("plm")
MethodError: no method matching Parsers.Options(::Missing, ::UInt8, ::UInt8, ::UInt8, ::UInt8, ::UInt8, ::UInt8, ::UInt8, ::Nothing, ::Nothing, ::Nothing, ::Bool, ::Bool, ::Nothing, ::Bool, ::Bool, ::Bool, ::Bool)
Closest candidates are:
  Parsers.Options(::Union{Missing, Nothing, Array{String,1}}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Nothing, Char, UInt8, String}, ::Union{Char, UInt8}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, String, Dates.DateFormat}, ::Any, ::Any, ::Any, ::Any, ::Any) at ~\.julia\packages\Parsers\GLY4Q\src\Parsers.jl:60
  Parsers.Options(::Union{Missing, Nothing, Array{String,1}}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Nothing, Char, UInt8, String}, ::Union{Char, UInt8}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, String, Dates.DateFormat}, ::Any, ::Any, ::Any, ::Any) at ~\.julia\packages\Parsers\GLY4Q\src\Parsers.jl:60
  Parsers.Options(::Union{Missing, Nothing, Array{String,1}}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Char, UInt8}, ::Union{Nothing, Char, UInt8, String}, ::Union{Char, UInt8}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, Array{String,1}}, ::Union{Nothing, String, Dates.DateFormat}, ::Any, ::Any, ::Any) at ~\.julia\packages\Parsers\GLY4Q\src\Parsers.jl:60

Stacktrace:
 [1] file(::String, ::Int64, ::Bool, ::Int64, ::Nothing, ::Int64, ::Int64, ::Bool, ::Nothing, ::Bool, ::Bool, ::Nothing, ::Nothing, ::Nothing, ::Array{String,1}, ::String, ::Nothing, ::Bool, ::Char, ::Nothing, ::Nothing, ::Char, ::Nothing, ::UInt8, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Dict{Int8,Int8}, ::Bool, ::Float64, ::Bool, ::Bool, ::Bool, ::Bool, ::Nothing) at ~\.julia\packages\CSV\7Dav7\src\CSV.jl:388
...
 [6] datasets() at ~\.julia\packages\RDatasets\1Mcea\src\datasets.jl:9
 [7] datasets(::String) at ~\.julia\packages\RDatasets\1Mcea\src\datasets.jl:2
 [8] top-level scope at In[10]:2

read_table not defined

Following the instructions in the readme:

julia> load("RDatasets")

Lots of warnings...

julia> using RDatasets

julia> iris = data("datasets", "iris")
in data: read_table not defined
 in data at /Users/simon/.julia/RDatasets/src/data.jl:11

CSV MethodError pops up again

  LoadError: MethodError: no method matching CSV.File(::Base.GenericIOBuffer{Array{UInt8,1}}; delim=',', quotechar='"', missingstring="NA", rows_for_type_detect=200)
  Closest candidates are:
    CSV.File(::Any; header, normalizenames, datarow, skipto, footerskip, limit, transpose, comment, use_mmap, ignoreemptylines, missingstrings, missingstring, delim, ignorerepeated, quotechar, openquotechar, closequotechar, escapechar, dateformat, decimal, truestrings, falsestrings, type, types, typemap, categorical, pool, strict, silencewarnings, debug, parsingdebug, allowmissing) at /builds/JuliaGPU/MakieGallery-jl/.julia/packages/CSV/9II7K/src/CSV.jl:160 got unsupported keyword argument "rows_for_type_detect"
    CSV.File(::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any) at /builds/JuliaGPU/MakieGallery-jl/.julia/packages/CSV/9II7K/src/CSV.jl:24 got unsupported keyword arguments "delim", "quotechar", "missingstring", "rows_for_type_detect"
    CSV.File(!Matched::String, !Matched::Array{Symbol,1}, !Matched::Array{Type,1}, !Matched::Int64, !Matched::Int64, !Matched::UInt8, !Matched::Bool, !Matched::Array{Array{String,1},1}, !Matched::Array{UInt8,1}, !Matched::Array{Array{UInt64,1},1}) at /builds/JuliaGPU/MakieGallery-jl/.julia/packages/CSV/9II7K/src/CSV.jl:24 got unsupported keyword arguments "delim", "quotechar", "missingstring", "rows_for_type_detect"
  Stacktrace:
   [1] kwerr(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type, ::Base.GenericIOBuffer{Array{UInt8,1}}) at ./error.jl:125
   [2] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:delim, :quotechar, :missingstring, :rows_for_type_detect),Tuple{Char,Char,String,Int64}}, ::Type{CSV.File}, ::Base.GenericIOBuffer{Array{UInt8,1}}) at ./none:0
   [3] (::getfield(RDatasets, Symbol("##1#2")){String,String})(::IOStream) at /builds/JuliaGPU/MakieGallery-jl/.julia/packages/RDatasets/1Ih8s/src/dataset.jl:28
   [4] #open#310(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::getfield(RDatasets, Symbol("##1#2")){String,String}, ::String, ::Vararg{String,N} where N) at ./iostream.jl:375
   [5] open(::Function, ::String, ::String) at ./iostream.jl:373
   [6] dataset(::String, ::String) at /builds/JuliaGPU/MakieGallery-jl/.julia/packages/RDatasets/1Ih8s/src/dataset.jl:26
   [7] top-level scope at none:0

Slight modifications to original datasets

Hi! I'd like to ask whether there are general guidelines regarding the datasets that can be added to this repository. In particular:

  1. A package implements its own class. An object of this class basically consists of some metadata and a dataframe. Of the included example datasets, I just want to add the corresponding dataframes (and not the metadata) to RDatasets.jl.

  2. Using data("dataname") returns a list of 3 similar dataframes. Instead, I vertically merge those 3 dataframes and add an extra column to distinguish them.

Would such datasets be welcome, or should I refrain from adding them in such a form? And if I add them, how should the "modifications" be annotated?

(The package in question is adehabitatLT.)
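
The second option above (stacking the three data frames and tagging each row with its origin) could look roughly like the following. This is a minimal sketch in plain Julia with no packages. The table names `first`, `second`, `third`, the column `Source`, and the helper `stack_with_source` are all hypothetical, and the rows are invented placeholders rather than data from adehabitatLT.

```julia
# Each "table" is a vector of NamedTuple rows sharing the same fields.
# The values are made-up placeholders, not real adehabitatLT data.
tables = [
    "first"  => [(X = 1.0, Y = 2.0), (X = 1.5, Y = 2.5)],
    "second" => [(X = 3.0, Y = 4.0)],
    "third"  => [(X = 5.0, Y = 6.0), (X = 5.5, Y = 6.5)],
]

# Stack all rows vertically, appending a `Source` field that records
# which original table each row came from.
function stack_with_source(tables)
    stacked = NamedTuple[]
    for (name, rows) in tables
        for row in rows
            push!(stacked, merge(row, (Source = name,)))
        end
    end
    return stacked
end

merged = stack_with_source(tables)
```

With DataFrames.jl installed, recent versions of `vcat` should accept a `source` keyword that adds such an identifying column directly.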

Import fails on a clean Julia 0.2

I am using Julia 0.2, just downloaded from the website. Installation goes fine, but I am unable to import the package.

Maybe it is just that the version on the metadata repo is old.

Thanks

julia> using RDatasets
ERROR: RDatasets not found
 in require at loading.jl:39

julia> Pkg.add("RDatasets")
INFO: Initializing package repository /Users/danielfrg/.julia
INFO: Cloning METADATA from git://github.com/JuliaLang/METADATA.jl
INFO: Cloning cache of Blocks from git://github.com/tanmaykm/Blocks.jl.git
INFO: Cloning cache of DataArrays from git://github.com/JuliaStats/DataArrays.jl.git
INFO: Cloning cache of DataFrames from git://github.com/JuliaStats/DataFrames.jl.git
INFO: Cloning cache of GZip from git://github.com/kmsquire/GZip.jl.git
INFO: Cloning cache of RDatasets from git://github.com/johnmyleswhite/RDatasets.jl.git
INFO: Cloning cache of SortingAlgorithms from git://github.com/JuliaLang/SortingAlgorithms.jl.git
INFO: Cloning cache of Stats from git://github.com/JuliaStats/Stats.jl.git
INFO: Cloning cache of StatsBase from git://github.com/JuliaStats/StatsBase.jl.git
INFO: Installing Blocks v0.0.1
INFO: Installing DataArrays v0.0.1
INFO: Installing DataFrames v0.4.2
INFO: Installing GZip v0.2.7
INFO: Installing RDatasets v0.1.0
INFO: Installing SortingAlgorithms v0.0.1
INFO: Installing Stats v0.1.0
INFO: Installing StatsBase v0.2.10
INFO: REQUIRE updated.

julia> using RDatasets
ERROR: data not defined
 in include at boot.jl:238
 in include_from_node1 at loading.jl:114
 in include at boot.jl:238
 in include_from_node1 at loading.jl:114
 in reload_path at loading.jl:140
 in _require at loading.jl:58
 in require at loading.jl:43
at /Users/danielfrg/.julia/RDatasets/src/data.jl:15
at /Users/danielfrg/.julia/RDatasets/src/RDatasets.jl:8

huge variance in time to load iris

Hello,

I have observed a 10x difference when loading the iris dataset on two different machines.

Loading times are a bit unreasonable. Is there anything I can do to speed this up?

julia> using RDatasets

julia> @time iris = dataset("datasets", "iris"); # a DataFrame
100.068931 seconds (75.23 M allocations: 4.053 GiB, 3.19% gc time)

julia> 102.497734 seconds (75.35 M allocations: 4.062 GiB, 3.33% gc time)
       (v1.2) pkg> status RDatasets
           Status `~/.julia/environments/v1.2/Project.toml`
         [a93c6f00] DataFrames v0.19.4
         [ce6b1742] RDatasets v0.6.4

julia> versioninfo()
Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i5-4278U CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)
Environment:
  JULIA_EDITOR = subl

(v1.2) pkg> status RDatasets
    Status `~/.julia/environments/v1.2/Project.toml`
  [336ed68f] CSV v0.5.14
  [a93c6f00] DataFrames v0.19.4
  [ce6b1742] RDatasets v0.6.4

On the other machine I get:

julia> using RDatasets
[ Info: Recompiling stale cache file /home/david/.julia/compiled/v1.1/RDatasets/JyIbx.ji for RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]

julia> @time iris = dataset("datasets", "iris"); 
 10.544570 seconds (37.27 M allocations: 1.767 GiB, 8.98% gc time)

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

(v1.1) pkg> status RDatasets
    Status `~/.julia/environments/v1.1/Project.toml`
  [336ed68f] CSV v0.5.14
  [a93c6f00] DataFrames v0.18.4
  [ce6b1742] RDatasets v0.6.1
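
One way to check how much of the reported time is one-off compilation rather than the read itself is to time the same call twice in one session. This is a minimal sketch that assumes nothing about RDatasets internals: `load_stub` is a hypothetical stand-in for `dataset("datasets", "iris")`, so the snippet runs without any packages.

```julia
# Hypothetical stand-in for an expensive loader; swap in
# `dataset("datasets", "iris")` when RDatasets is installed.
load_stub() = [parse(Float64, s) for s in split("5.1,3.5,1.4,0.2", ',')]

# The first call includes JIT compilation of everything it touches.
first_call = @elapsed load_stub()

# The second call reuses the compiled code, so it measures the work itself.
second_call = @elapsed load_stub()

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

If the second timing is far smaller than the first, most of the delay is compilation and not file reading, and comparing only first-call timings across machines also compares their compile speed.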

Sync with original RDatasets repo

Vincent has added a bunch of data sets to the original repo from which we grabbed data. We should try to sync back up with his data to include more data sets.

Dimspec not defined

On both OSX and Ubuntu, after a Pkg.init(); Pkg.update(); Pkg.add("RDatasets"); load("RDatasets"); I get the following:

julia> load("RDatasets")
Warning: replacing module DataFrames
Warning: New definition +(BitArray{N},AbstractArray{T,N}) at /home/sabae/src/julia/base/bitarray.jl:992 is ambiguous with +(AbstractArray{T,N},BitArray{N}) at bitarray.jl:993.
         Make sure +(BitArray{N},BitArray{N}) is defined first.
...
<many identical warnings for a large number of operators>
...
Dimspec not defined
 in load_now at util.jl:235
 in require at util.jl:185
 in load_now at util.jl:235
 in load_now at util.jl:235
 in load_now at util.jl:247
at /home/sabae/src/julia/base/bitarray.jl:1406

CodecZlib issue prevents "using RDatasets"

Julia Version 1.1.1
Commit 55e36cc308 (2019-05-16 04:10 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu) - Ubuntu 18.04
CPU: Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)


julia> using RDatasets
[ Info: Precompiling RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]
ERROR: LoadError: CodecZlib.jl is not installed properly, run Pkg.build("CodecZlib") and restart Julia.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] top-level scope at /home/jerry/.julia/packages/CodecZlib/9jDi1/src/CodecZlib.jl:34
[3] include at ./boot.jl:326 [inlined]
[4] include_relative(::Module, ::String) at ./loading.jl:1038
[5] include(::Module, ::String) at ./sysimg.jl:29
[6] top-level scope at none:2
[7] eval at ./boot.jl:328 [inlined]
[8] eval(::Expr) at ./client.jl:404
[9] top-level scope at ./none:3
in expression starting at /home/jerry/.julia/packages/CodecZlib/9jDi1/src/CodecZlib.jl:33
ERROR: LoadError: Failed to precompile CodecZlib [944b1d66-785c-5afd-91f1-9de20f533193] to /home/jerry/.julia/compiled/v1.1/CodecZlib/1TI30.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
[3] _require(::Base.PkgId) at ./loading.jl:960
[4] require(::Base.PkgId) at ./loading.jl:858
[5] require(::Module, ::Symbol) at ./loading.jl:853
[6] include at ./boot.jl:326 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1038
[8] include(::Module, ::String) at ./sysimg.jl:29
[9] top-level scope at none:2
[10] eval at ./boot.jl:328 [inlined]
[11] eval(::Expr) at ./client.jl:404
[12] top-level scope at ./none:3
in expression starting at /home/jerry/.julia/packages/RData/y6mA8/src/RData.jl:3
ERROR: LoadError: Failed to precompile RData [df47a6cb-8c03-5eed-afd8-b6050d6c41da] to /home/jerry/.julia/compiled/v1.1/RData/idMMA.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
[3] _require(::Base.PkgId) at ./loading.jl:960
[4] require(::Base.PkgId) at ./loading.jl:858
[5] require(::Module, ::Symbol) at ./loading.jl:853
[6] include at ./boot.jl:326 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1038
[8] include(::Module, ::String) at ./sysimg.jl:29
[9] top-level scope at none:2
[10] eval at ./boot.jl:328 [inlined]
[11] eval(::Expr) at ./client.jl:404
[12] top-level scope at ./none:3
in expression starting at /home/jerry/.julia/packages/RDatasets/67RP7/src/RDatasets.jl:2
ERROR: Failed to precompile RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b] to /home/jerry/.julia/compiled/v1.1/RDatasets/JyIbx.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
[3] _require(::Base.PkgId) at ./loading.jl:960
[4] require(::Base.PkgId) at ./loading.jl:858
[5] require(::Module, ::Symbol) at ./loading.jl:853

julia> using Pkg

julia> Pkg.build("CodecZlib")
Building CodecZlib → ~/.julia/packages/CodecZlib/9jDi1/deps/build.log
┌ Error: Error building CodecZlib:
│ ERROR: LoadError: LibraryProduct(nothing, ["libz"], :libz, "Prefix(/home/jerry/.julia/packages/CodecZlib/9jDi1/deps/usr)") is not satisfied, cannot generate deps.jl!
│ Stacktrace:
│ [1] error(::String) at ./error.jl:33
│ [2] #write_deps_file#165(::Bool, ::Bool, ::Function, ::String, ::Array{LibraryProduct,1}) at /home/jerry/.julia/packages/BinaryProvider/A0sDa/src/Products.jl:419
│ [3] (::getfield(BinaryProvider, Symbol("#kw##write_deps_file")))(::NamedTuple{(:verbose,),Tuple{Bool}}, ::typeof(write_deps_file), ::String, ::Array{LibraryProduct,1}) at ./none:0
│ [4] top-level scope at /home/jerry/.julia/packages/CodecZlib/9jDi1/deps/build.jl:93
│ [5] include at ./boot.jl:326 [inlined]
│ [6] include_relative(::Module, ::String) at ./loading.jl:1038
│ [7] include(::Module, ::String) at ./sysimg.jl:29
│ [8] include(::String) at ./client.jl:403
│ [9] top-level scope at none:0
│ in expression starting at /home/jerry/.julia/packages/CodecZlib/9jDi1/deps/build.jl:78
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1075

ERROR: Cannot clone FileIO from https://github.com/JuliaIO/FileIO.jl.git. unexpected return value from ssl handshake -9806

I am getting the following error while installing this package, with Pkg.add("RDatasets"), from the Julia REPL (on VS Code):

INFO: Cloning cache of FileIO from https://github.com/JuliaIO/FileIO.jl.git
ERROR: Cannot clone FileIO from https://github.com/JuliaIO/FileIO.jl.git. unexpected return value from ssl handshake -9806
prefetch(::String, ::String, ::Array{String,1}) at ./pkg/cache.jl:56
resolve(::Dict{String,Base.Pkg.Types.VersionSet}, ::Dict{String,Dict{VersionNumber,Base.Pkg.Types.Available}}, ::Dict{String,Tuple{VersionNumber,Bool}}, ::Dict{String,Base.Pkg.Types.Fixed}, ::Dict{String,VersionNumber}, ::Set{String}) at ./pkg/entry.jl:516
resolve(::Dict{String,Base.Pkg.Types.VersionSet}, ::Dict{String,Dict{VersionNumber,Base.Pkg.Types.Available}}, ::Dict{String,Tuple{VersionNumber,Bool}}, ::Dict{String,Base.Pkg.Types.Fixed}) at ./pkg/entry.jl:479
edit(::Function, ::String, ::Base.Pkg.Types.VersionSet, ::Vararg{Base.Pkg.Types.VersionSet,N} where N) at ./pkg/entry.jl:30
(::Base.Pkg.Entry.##1#3{String,Base.Pkg.Types.VersionSet})() at ./task.jl:335
Stacktrace:
 [1] sync_end() at ./task.jl:287
 [2] macro expansion at ./task.jl:303 [inlined]
 [3] add(::String, ::Base.Pkg.Types.VersionSet) at ./pkg/entry.jl:51
 [4] (::Base.Pkg.Dir.##4#7{Array{Any,1},Base.Pkg.Entry.#add,Tuple{String}})() at ./pkg/dir.jl:36
 [5] cd(::Base.Pkg.Dir.##4#7{Array{Any,1},Base.Pkg.Entry.#add,Tuple{String}}, ::String) at ./file.jl:70
 [6] #cd#1(::Array{Any,1}, ::Function, ::Function, ::String, ::Vararg{String,N} where N) at ./pkg/dir.jl:36
 [7] add(::String) at ./pkg/pkg.jl:117

julia> versioninfo()
Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

dir is not defined

require("RDatasets")
using RDatasets
iris = data("datasets", "iris")

fails with error:
dir not defined
in data at /Users/adam/.julia/RDatasets/src/data.jl:4

If I try to log Pkg.dir in the console I get the same error

loading datasets fails on v0.7

julia> VERSION
v"0.7.0-beta.297"

julia> using RDatasets

julia> df = dataset("datasets", "iris") # load the dataset
Error encountered while loading "/home/tamas/.julia/packages/RDatasets/YIiZ/src/../data/datasets/iris.rda".
Fatal error:
ERROR: UndefVarError: is_installed not defined
Stacktrace:
 [1] checked_import(::Symbol) at /home/tamas/.julia/packages/FileIO/5FOv/src/loadsave.jl:24
 [2] #load#27(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::FileIO.File{FileIO.DataFormat{:RData}}) at /home/tamas/.julia/packages/FileIO/5FOv/src/loadsave.jl:175
 [3] load(::FileIO.File{FileIO.DataFormat{:RData}}) at /home/tamas/.julia/packages/FileIO/5FOv/src/loadsave.jl:167
 [4] #load#13(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::String) at /home/tamas/.julia/packages/FileIO/5FOv/src/loadsave.jl:113
 [5] load at /home/tamas/.julia/packages/FileIO/5FOv/src/loadsave.jl:113 [inlined]
 [6] dataset(::String, ::String) at /home/tamas/.julia/packages/RDatasets/YIiZ/src/dataset.jl:21
 [7] top-level scope at none:0

pkg> status
  [ce6b1742] RDatasets v0.4.0

(this is the second run, to suppress deprecation warnings)

error while loading the datasets

I tried to run the example that I found in the README.md of Gadfly and got the following error when I tried to load the R dataset:

julia> mammals = data("MASS", "mammals")
ERROR: type IOBuffer has no field ios
 in extract_string at /home/theodore/.julia/DataFrames/src/io.jl:29
 in read_separated_line at /home/theodore/.julia/DataFrames/src/io.jl:152
 in determine_column_names at /home/theodore/.julia/DataFrames/src/io.jl:245
 in read_table at /home/theodore/.julia/DataFrames/src/io.jl:380
 in data at /home/theodore/.julia/RDatasets/src/data.jl:14

Error when adding the package

add RDatasets
  Resolving package versions...
ERROR: Unsatisfiable requirements detected for package RDatasets [ce6b1742]:
 RDatasets [ce6b1742] log:
 ├─possible versions are: [0.5.0, 0.6.0-0.6.9] or uninstalled
 ├─restricted to versions * by an explicit requirement, leaving only versions [0.5.0, 0.6.0-0.6.9]
 ├─restricted by compatibility requirements with DataFrames [a93c6f00] to versions: 0.6.8-0.6.9 or uninstalled, leaving only versions: 0.6.8-0.6.9
 │ └─DataFrames [a93c6f00] log:
 │   ├─possible versions are: 0.21.4 or uninstalled
 │   └─DataFrames [a93c6f00] is fixed to version 0.21.4
 └─restricted by compatibility requirements with CSV [336ed68f] to versions: [0.5.0, 0.6.0-0.6.1] or uninstalled — no versions left
   └─CSV [336ed68f] log:
     ├─possible versions are: 0.7.1 or uninstalled
     └─CSV [336ed68f] is fixed to version 0.7.1

My Julia version is

v"1.4.1"

Capitalization of variable names

Is it intentional that the capitalization of variables' names differs from that in the R data sets?

I'm currently revising Bates and Watts (1988), Nonlinear Regression Analysis and Its Applications, including examples in R and in Julia. Admittedly, the capitalization of the variable names in R is wildly inconsistent, and the capitalization in the RDatasets package is more consistent, but it still becomes awkward explaining why the formulas are different in the two versions of an example.

For example, in R

> names(Puromycin)
[1] "conc"  "rate"  "state"

whereas in Julia,

julia> colnames(data("datasets","Puromycin"))
3-element Array{Union(ASCIIString,UTF8String),1}:
 "Conc" 
 "Rate" 
 "State"
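
In the Puromycin example, the difference amounts to capitalizing the first letter of each name. A small sketch of that mapping follows, assuming this is the only transformation involved; it is not confirmed to be the rule RDatasets actually applies, and the variable names are hypothetical.

```julia
# Column names as reported by R for Puromycin.
r_names = ["conc", "rate", "state"]

# Capitalize the first letter of each name, leaving the rest untouched.
julia_style = map(uppercasefirst, r_names)

println(julia_style)
```

A lookup like this could sit next to a formula in the text to show which R variable each Julia column corresponds to.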

clean_colnames!()

Would it make sense to clean_colnames!() by default in data.jl?

Creating a Formula with a "." in a colname causes a somewhat cryptic error.

using DataFrames
using RDatasets
using GLM

swiss = data("datasets", "swiss")
fit = lm(:(Fertility ~ Agriculture + Infant.Mortality), swiss)

ERROR: Non-call expression encountered
in dospecials at /home/stewart/.julia/DataFrames/src/formula.jl:68
in map at cell.jl:19
in dospecials at /home/stewart/.julia/DataFrames/src/formula.jl:72
in Terms at /home/stewart/.julia/DataFrames/src/formula.jl:128
in ModelFrame at /home/stewart/.julia/DataFrames/src/formula.jl:172
in lm at /home/stewart/.julia/GLM/src/lm.jl:37
in lm at /home/stewart/.julia/GLM/src/lm.jl:42

clean_colnames!(swiss)
fit = lm(:(Fertility ~ Agriculture + Infant_Mortality), swiss)

Formula: Fertility ~ :(+(Agriculture,Infant_Mortality))
Coefficients:
3x4 DataFrame:
Estimate Std.Error t value Pr(>|t|)
[1,] 21.9546 11.5285 1.90437 0.0634125
[2,] 0.208919 0.0686417 3.04362 0.00393547
[3,] 1.88563 0.535221 3.52308 0.00100803
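
The renaming that makes the second `lm` call work can be sketched in plain Julia as follows. This is only an illustration of the idea, not the actual implementation of `clean_colnames!`; the helper `sanitize_name` is hypothetical, and it assumes that any character other than a letter, digit, or underscore should become an underscore.

```julia
# Hypothetical helper: replace every character that is not a letter,
# digit, or underscore with an underscore.
sanitize_name(name::AbstractString) = replace(name, r"[^A-Za-z0-9_]" => "_")

raw_names = ["Fertility", "Agriculture", "Infant.Mortality"]
clean_names = map(sanitize_name, raw_names)

println(clean_names)
```

Names cleaned this way can be written directly in a formula expression, since `Infant_Mortality` parses as a single symbol whereas `Infant.Mortality` parses as a field access.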

datasets() gives harmless(?) parsing warning

julia> ds = RDatasets.datasets()
warning: failed parsing String on row=161, col=3, error=INVALID: OK, QUOTED, DELIMITED, INVALID_DELIMITER
733×5 DataFrame
...

julia> ds[161, 3]
"Data from A.-M. Guerry, \\"
