TimeSeries aims to provide a lightweight framework for working with time series data in Julia. Documentation is provided here.
A TimeFrame object is a type that includes a DataFrame as an element. The point of this is to simplify how time is handled in relation to DataFrames.
Here is an example of the REPL show output:
julia> s
Column Names: Open High Low Close Volume Adjusted Close
Number of Rows: 10
Date Range: 1950-01-03 to 1950-01-16
Missing Values: 0
The basic sketch of the type is:
type TimeFrame
    dataframe::DataFrame
    typ::String
    format::String
    showtime::String
    # inner constructor here
    # decide on type for time (default to Datetime?)
    # parse first row based on format element
    # show format based on showtime element (default to format?)
    # convert the first row into an IndexedVector
end
Write the following:
gtrows => greater than rows
gterows => greater than or equal rows
ltrows => less than rows
lterows => less than or equal rows
eqrows => equal to rows
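As a rough illustration of what these five helpers might look like, here is a minimal sketch in current Julia (1.x) syntax. It assumes the dates and values sit in two parallel vectors rather than in a TimeFrame. Only the names gtrows, gterows, ltrows, lterows and eqrows come from the list above; the rows helper, the signatures and the sample data are hypothetical.

```julia
using Dates

# Hypothetical helper: keep the (date, value) pairs whose date satisfies
# `cmp(date, cutoff)`.
function rows(cmp, dates::Vector{Date}, vals::Vector{Float64}, cutoff::Date)
    keep = [cmp(d, cutoff) for d in dates]
    return dates[keep], vals[keep]
end

gtrows(dates, vals, cutoff)  = rows(>,  dates, vals, cutoff)
gterows(dates, vals, cutoff) = rows(>=, dates, vals, cutoff)
ltrows(dates, vals, cutoff)  = rows(<,  dates, vals, cutoff)
lterows(dates, vals, cutoff) = rows(<=, dates, vals, cutoff)
eqrows(dates, vals, cutoff)  = rows(==, dates, vals, cutoff)

# Placeholder data, for illustration only.
dates = [Date(1950, 1, 3), Date(1950, 1, 4), Date(1950, 1, 5)]
vals  = [1.0, 2.0, 3.0]

later_dates, later_vals = gtrows(dates, vals, Date(1950, 1, 4))
```

Since all five differ only in the comparison operator, passing the operator as an argument is one way to generalize them into a single method.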
Currently, if you take a TimeArray object and apply one of the transformation methods, the column keeps its original name even though its meaning has changed.
julia> percentchange(cl)[1]
1x1 TimeArray{Float64,2} 1980-01-04 to 1980-01-04
Close
1980-01-04 | 0.01
julia> moving(cl, mean, 10)[1]
1x1 TimeArray{Float64,2} 1980-01-16 to 1980-01-16
Close
1980-01-16 | 108.89
The TimeStamp immutable currently lives in the timestamp branch, but here are some early observations. For our time trial, we're using 15,851 rows of SPX open prices. The first execution is excluded from the time average. We compare three approaches.
#df[:,2] is the 2nd column (open prices) of a DataFrame
#ts is the value element of Array{TimeStamp}
#ar is an Array{Float64} of prices (with no other time identification)
julia> dftime = timetrial(max, df[:,2], 100)
0.0009391599292929288
julia> tstime = timetrial(max, ts, 100)
0.0227938448989899
julia> artime = timetrial(max, ar, 100)
0.0001925187676767677
julia> tstime/artime
118.39804074198092
julia> dftime/artime
4.878277274607043
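The timetrial helper isn't shown above, so the following is only a guess at its shape, written in current Julia (1.x) syntax: time n calls, drop the first one (it includes compilation), and average the rest. The signature and body are assumptions, and maximum stands in for the old reducing max.

```julia
# Hypothetical reconstruction of a timing helper: average wall-clock time of
# `f(x)` over `n` runs, with the first run excluded from the average.
function timetrial(f, x, n::Integer)
    n > 1 || throw(ArgumentError("n must be at least 2"))
    times = Float64[]
    for _ in 1:n
        push!(times, @elapsed f(x))
    end
    return sum(times[2:end]) / (n - 1)
end

ar = rand(15_851)                     # random stand-in for the price data
artime = timetrial(maximum, ar, 100)
```

The absolute numbers will differ from the ones quoted above; only the ratios between approaches are meaningful.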
It's becoming a bit tedious to git clone the package when using 0.2 versus picking it up with Pkg.add().
read_time creates a jumbled mess if the first column isn't named "Date".
julia> foo = read_time("pound.csv");
julia> colnames(foo)
7-element Union(UTF8String,ASCIIString) Array:
"Index"
"GBPUSD.Open"
"GBPUSD.High"
"GBPUSD.Low"
"GBPUSD.Close"
"GBPUSD.Volume"
"Date"
julia> typeof(foo["Index"])
DataArray{UTF8String,1}
julia> typeof(foo["Date"])
DataArray{Any,1}
The type of the "Date" column should be DataArray{CalendarTime,1}. This might have something to do with the recent immutable Calendar change. I fixed it in the REPL; it just needs to be coded.
Here are the results from the test suite with Julia 0.1
julia> @timeseries
Running tests:
** test/returns.jl
** test/lag.jl
** test/moving.jl
** test/upto.jl
** test/indexdate.jl
But with the latest Julia 0.2, the test suite fails
julia> @timeseries
Running tests:
** test/returns.jl
ERROR: no method vector(Array{Any,1},)
at /Users/Administrator/.julia/TimeSeries/test/returns.jl:1
at /Users/Administrator/.julia/TimeSeries/run_tests.jl:19
The culprit is found in the src/testtimeseries.jl file, in the read_csv_for_testing function. Specifically,
12 time_conversion = map(x -> parse("yyyy-MM-dd", x),
13 convert(Array{UTF16String}, vector(df[:,1])))
Line 13 no longer needs casting to vector so the corrected line should look like:
convert(Array{UTF16String}, df[:,1]))
This is easy enough to fix, but then pre-0.2 testing will break. Versioning for the fix needs to specify 0.2 and later.
And while we're at it, change the names and generalize the gterows etc. methods. Doing this now in the date branch.
julia> Pkg.add("TimeSeries")
ERROR: The following packages are incompatible with fixed requirements: TimeSeries
in _resolve at pkg.jl:182
in _resolve at no file
in anonymous at no file:31
in cd at file.jl:25
in cd at pkg/dir.jl:28
in edit at pkg.jl:21
in add at pkg.jl:18
Probably 0.1.0 should be compatible with Julia 0.1, while 0.2.0 should be compatible with Julia 0.2, and so forth. That allows for ten versions within each Julia release, which seems reasonable.
Not sure what conventions other packages are using. DataFrames is up to 0.3.9, so it's not following along with Julia releases.
It's more natural to think that the first data point should be 1.0 versus NA
This is a placeholder for possibly adding a series type to the package. The hacking and trial and error is happening at the DataSeries.jl repo for now.
julia> using TimeSeries
WARNING: deprecated syntax "x[i:]" at /Users/jiahao/.julia/v0.3/DataFrames/src/formula.jl:58.
Use "x[i:end]" instead.
...
julia> spx = readtime(Pkg.dir("TimeSeries/test/data/spx.csv"))
ERROR: IndexedVector not defined
in readtime at /Users/jiahao/.julia/v0.3/TimeSeries/src/io.jl:32
julia> versioninfo()
Julia Version 0.3.0-prerelease+1362
Commit 842536d* (2014-02-02 21:12 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin13.0.2)
CPU: Intel(R) Core(TM) i5-4258U CPU @ 2.40GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
LAPACK: libopenblas
LIBM: libopenlibm
This limits the package to working with the DataFrames family of types.
DataFrames appears to have removed it? Possibly renamed it?
julia> using DataFrames
julia> IndexedVector
ERROR: IndexedVector not defined
The idea behind this is to extend functionality in dependent packages. The padNA issue is an example. Somewhere between DataFrames and its migration to DataArrays it broke, thus breaking many of the methods in TimeSeries, specifically those that offset (via lag or lead) observations. This affects not only moving but, more importantly, entire packages such as MarketTechnicals where, for example, a moving average will not have a value until its n-period is attained.
Instead of depending on padNA in DataArrays, write the method in utils.jl and call it instead. JuliaStats/DataArrays.jl#21
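A local padding helper could be as small as the sketch below. It is written in current Julia (1.x), where missing plays the role that NA played in DataArrays. The name padmissing and its signature are hypothetical and are not the actual padNA API.

```julia
# Hypothetical stand-in for padNA: return a copy of `x` with `front` missing
# values prepended and `back` missing values appended.
function padmissing(x::AbstractVector{T}, front::Integer, back::Integer) where {T}
    out = Vector{Union{Missing,T}}(missing, front + length(x) + back)
    out[front+1:front+length(x)] = x
    return out
end

padded = padmissing([1.0, 2.0, 3.0], 2, 0)   # two leading missings, then the data
```

Owning this helper means that methods such as lag, lead and moving no longer break when the upstream padding function changes.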
We'll use results from R's TTR package and compare them with results from the ema function.
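For reference while comparing, here is a minimal sketch of one common exponential moving average convention, in current Julia (1.x): seed the series with the simple mean of the first n values, then apply the smoothing factor 2/(n+1). Whether this matches TTR's defaults exactly is an assumption to check against its documentation, and the package's own ema may be written differently.

```julia
# Sketch of an exponential moving average under an assumed convention:
# the value at index n is the simple mean of x[1:n], and every later value is
# k * x[i] + (1 - k) * previous, with k = 2 / (n + 1).
function ema(x::AbstractVector{<:Real}, n::Integer)
    1 <= n <= length(x) || throw(ArgumentError("n must be between 1 and length(x)"))
    k = 2 / (n + 1)
    out = Vector{Union{Missing,Float64}}(missing, length(x))
    out[n] = sum(x[1:n]) / n
    for i in n+1:length(x)
        out[i] = k * x[i] + (1 - k) * out[i-1]
    end
    return out
end

e = ema([1.0, 2.0, 3.0, 4.0], 2)
```

The first n-1 entries stay missing, which mirrors the point above that a moving average has no value until its n-period is attained.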
Arrays by themselves are not serialized so it only makes sense to dispatch on DataFrames, DataArrays and the experimental DataSeries.
The plan after this step is to eventually remove dependencies on DataFrames, DataArrays and Datetime and make the package lightweight. This goes to the issue of changing the package name to Series.jl.
Now that Julia has immutable types, TimeSeries can take advantage and create an array of these immutables.
immutable TimeFloat64 <: AbstractTimeDataType
    timestamp::CalendarTime
    value::Float64
end
type TimeArrayFloat64 <: AbstractTimeArray
    values::Array{TimeFloat64}
    # inner constructor that arranges objects in order by CalendarTime
end
type TimeFrame <: AbstractTimeFrame
    # heterogeneous collection of TimeArrays ordered by CalendarTime
end
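For readers on current Julia (1.x), the first two types above might translate roughly as follows. The immutable keyword is now struct, and DateTime from the Dates standard library is used here as a stand-in for CalendarTime. The constructor body is an assumption about what "arranges objects in order" would mean.

```julia
using Dates

abstract type AbstractTimeDataType end
abstract type AbstractTimeArray end

# One observation: a timestamp and a value that travel together.
struct TimeFloat64 <: AbstractTimeDataType
    timestamp::DateTime   # stand-in for CalendarTime
    value::Float64
end

struct TimeArrayFloat64 <: AbstractTimeArray
    values::Vector{TimeFloat64}
    # Inner constructor: store the observations sorted by timestamp.
    TimeArrayFloat64(obs::Vector{TimeFloat64}) =
        new(sort(obs, by = o -> o.timestamp))
end

ta = TimeArrayFloat64([TimeFloat64(DateTime(2013, 3, 2), 2.0),
                       TimeFloat64(DateTime(2013, 3, 1), 1.0)])
```

Because the only constructor sorts its input, every TimeArrayFloat64 is ordered by construction.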
Should time-series related modeling functions such as autocorrelation, ARIMA and GARCH be included in TimeSeries or do they deserve their own package? A future package might be called TimeModels.
TimeSeries might be a good place to start implementing these functions.
So far, I'm impressed with how simple it's been to deal with an Array{TimeStamp} where the TimeStamp immutable is defined as
abstract AbstractTimeStamp
immutable TimeStamp <: AbstractTimeStamp
    timestamp::CalendarTime # possible improvements with Int64
    value::Float64
end
It is a bit slower than DataFrames, but it's still the early stages. The reason that time series data deserves its own type is that it's important to have confidence that values and timestamps travel together. Making those two elements part of an immutable type seems natural.
Also, syntax is proving to be easy to read. Let's say I want to query FRED about when the 10-year closed below 1.5%.
using TimeSeries
d = imfred("DGS10"); #immutable version of fred
lowd = d[v(d) .< 1.5]
6-element TimeStamp Array:
2012-06-01 | 1.47
2012-07-20 | 1.49
2012-07-23 | 1.47
2012-07-24 | 1.44
2012-07-25 | 1.43
2012-07-26 | 1.45
And to thwart any chicanery, we have an immutable wall.
julia> lowd[1]
2012-06-01 | 1.47
julia> lowd[1].value
1.47
julia> lowd[1].value = 2.1
ERROR: type TimeStamp is immutable
It's also natural to get a subset of your original data based on a date.
julia> recentd = d[t(d) .> p("2013-03-12")]
5-element TimeStamp Array:
2013-03-13 | 2.04
2013-03-14 | 2.04
2013-03-15 | 2.01
2013-03-18 | 1.96
2013-03-19 | 1.92
To leverage the methods available to Array, a small (albeit expensive) tweak is done to define methods for Array{TimeStamp}:
import Base.mean
mean(x::Array{TimeStamp}) = mean([v.value for v in x])
# or to use the v() function for shorthand
mean(x::Array{TimeStamp}) = mean(v(x))
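The same pattern can be tried in current Julia (1.x) with the sketch below. It assumes Date from the Dates standard library in place of CalendarTime, and the bodies of v and t are guesses at what those shorthand accessors do. The three observations are taken from the output shown earlier.

```julia
using Dates
using Statistics

struct TimeStamp
    timestamp::Date   # stand-in for CalendarTime
    value::Float64
end

# Assumed shorthand accessors: pull out all values or all timestamps.
v(x::Vector{TimeStamp}) = [s.value for s in x]
t(x::Vector{TimeStamp}) = [s.timestamp for s in x]

# Extend mean for arrays of TimeStamp by delegating to the values.
Statistics.mean(x::Vector{TimeStamp}) = mean(v(x))

d = [TimeStamp(Date(2012, 7, 24), 1.44),
     TimeStamp(Date(2012, 7, 25), 1.43),
     TimeStamp(Date(2013, 3, 13), 2.04)]

lowd    = d[v(d) .< 1.5]                    # rows whose value is below 1.5
recentd = d[t(d) .> Date(2013, 3, 12)]      # rows after a given date
```

Each call to v or t allocates a fresh vector, which is the "expensive" part of the tweak.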
This is a big change to the package. The idea is to implement a lightweight type and have the methods in the package dispatched on it. Already there is a prototype of this in the series branch.
The newly registered TimeData package implements a time-aware type using the DataFrames/DataArrays data structure. Effort will be made to coordinate APIs so users can move between the two packages without too many syntax differences.
This is isolated to the series branch, which uses MarketData for testing.
Per recommendations of @timholy on the Thyme for Julia post, make moving calculate more efficiently.
"after you compute the sum for 1:n, you can compute the sum for 2:n+1 simply by taking the first result and adding x[n+1]-x[1]. This changes it from an order length(x)*n computation to one that is length(x), a big savings."
@wesmckinn confirmed this on twitter as well saying ... " O(1) updates at each time step instead of O(window size)"
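The recommendation quoted above can be sketched in current Julia (1.x) as a rolling sum that is updated in constant time per step. The names moving_sum and moving_mean are hypothetical; the package's moving takes an arbitrary function, which this shortcut does not cover.

```julia
# Rolling sum over windows of length n. After the first window, each new sum
# is the previous one plus the entering value minus the leaving value.
function moving_sum(x::AbstractVector{<:Real}, n::Integer)
    1 <= n <= length(x) || throw(ArgumentError("n must be between 1 and length(x)"))
    out = Vector{Float64}(undef, length(x) - n + 1)
    s = float(sum(x[1:n]))
    out[1] = s
    for i in 2:length(out)
        s += x[i+n-1] - x[i-1]
        out[i] = s
    end
    return out
end

# A moving mean then follows by dividing each window sum by n.
moving_mean(x, n) = moving_sum(x, n) ./ n

m = moving_mean([1.0, 2.0, 3.0, 4.0], 2)
```

One trade-off is that floating-point error can accumulate over very long series, since each sum is derived from the previous one rather than recomputed.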
Now that padNA is wrapping functions and tests are passing, the following TODO is on the short list:
Improve README to document ! version special use.
Get more precise dispatch signatures for all functions.
Update tests for different types.
Clean up commented code that's no longer needed.
Come up with better package description.
This will require rewriting quite a bit, but the good part is the code should look much cleaner. Normally, I would just do this but now that the repo is in a group, I thought I'd file an issue.
I'll start work on this in a branch and wait to merge it.
Perfectly fresh installation of Ubuntu 13.04.
julia> using TimeSeries
Warning: New definition ref(DataArray{T,N},Union(Array{Bool,1},BitArray{1})) is ambiguous with ref(DataArray{T<:Number,N},Union(Ranges{T},Array{T,1},BitArray{1})) at /home/hase/.julia/DataFrames/src/dataarray.jl:339.
Make sure ref(DataArray{T<:Number,N},Union(Array{Bool,1},BitArray{1})) is defined first.
ERROR: Expr: too few arguments (expected 3)
in anonymous at no file:223
in include_from_node1 at loading.jl:76
in reload_path at loading.jl:96
in require at loading.jl:48
in include_from_node1 at loading.jl:76
in reload_path at loading.jl:96
in require at loading.jl:48
at /home/hase/.julia/Calendar/src/Calendar.jl:222
/////////////////////// full log, starting with install of julia
hase@hase-Virtual-Machine:$ sudo apt-get install juliaexp3 [3.309 kB]
[sudo] password for hase:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
git git-man libamd2.2.0 libarpack2 libblas3 libcholmod1.7.1 libcolamd2.7.1
libdouble-conversion0 liberror-perl libfftw3-3 libfftw3-double3 libgfortran3
liblapack3 libllvm3.2 libumfpack5.4.0 libunwind8
Suggested packages:
git-daemon-run git-daemon-sysvinit git-doc git-el git-arch git-cvs git-svn
git-email git-gui gitk gitweb julia-doc ess libfftw3-bin libfftw3-dev
The following NEW packages will be installed:
git git-man julia libamd2.2.0 libarpack2 libblas3 libcholmod1.7.1
libcolamd2.7.1 libdouble-conversion0 liberror-perl libfftw3-3
libfftw3-double3 libgfortran3 liblapack3 libllvm3.2 libumfpack5.4.0
libunwind8
0 upgraded, 17 newly installed, 0 to remove and 81 not upgraded.
Need to get 23,2 MB of archives.
After this operation, 62,9 MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://at.archive.ubuntu.com/ubuntu/ raring/main libfftw3-double3 i386 3.3.3-2ubuntu1 [544 kB]
Get:2 http://at.archive.ubuntu.com/ubuntu/ raring/main libgfortran3 i386 4.7.2-22ubuntu3 [326 kB]
Get:3 http://at.archive.ubuntu.com/ubuntu/ raring/universe libllvm3.2 i386 3.2-2ubuntu4 [8.432 kB]
Get:4 http://at.archive.ubuntu.com/ubuntu/ raring/main libunwind8 i386 1.0.1-4ubuntu2 [52,5 kB]
Get:5 http://at.archive.ubuntu.com/ubuntu/ raring/main libamd2.2.0 i386 1:3.4.0-3ubuntu1 [19,7 kB]
Get:6 http://at.archive.ubuntu.com/ubuntu/ raring/main libblas3 i386 1.2.20110419-5 [205 kB]
Get:7 http://at.archive.ubuntu.com/ubuntu/ raring/main libcolamd2.7.1 i386 1:3.4.0-3ubuntu1 [14,7 kB]
Get:8 http://at.archive.ubuntu.com/ubuntu/ raring/main liblapack3 i386 3.4.2-1
Get:9 http://at.archive.ubuntu.com/ubuntu/ raring/main libcholmod1.7.1 i386 1:3.4.0-3ubuntu1 [343 kB]
Get:10 http://at.archive.ubuntu.com/ubuntu/ raring/universe libdouble-conversion0 i386 1.1.1-1 [35,2 kB]
Get:11 http://at.archive.ubuntu.com/ubuntu/ raring/universe libfftw3-3 i386 3.3.3-2ubuntu1 [1.644 B]
Get:12 http://at.archive.ubuntu.com/ubuntu/ raring/main libumfpack5.4.0 i386 1:3.4.0-3ubuntu1 [310 kB]
Get:13 http://at.archive.ubuntu.com/ubuntu/ raring/universe libarpack2 i386 3.1.2-2exp1 [119 kB]exp3_i386.deb) ...
Get:14 http://at.archive.ubuntu.com/ubuntu/ raring/universe julia i386 0.1.2+dfsg-1 [2.099 kB]
Get:15 http://at.archive.ubuntu.com/ubuntu/ raring/main liberror-perl all 0.17-1 [23,8 kB]
Get:16 http://at.archive.ubuntu.com/ubuntu/ raring/main git-man all 1:1.8.1.2-1 [653 kB]
Get:17 http://at.archive.ubuntu.com/ubuntu/ raring/main git i386 1:1.8.1.2-1 [6.739 kB]
Fetched 23,2 MB in 24s (966 kB/s)
Selecting previously unselected package libfftw3-double3:i386.
(Reading database ... 155860 files and directories currently installed.)
Unpacking libfftw3-double3:i386 (from .../libfftw3-double3_3.3.3-2ubuntu1_i386.deb) ...
Selecting previously unselected package libgfortran3:i386.
Unpacking libgfortran3:i386 (from .../libgfortran3_4.7.2-22ubuntu3_i386.deb) ...
Selecting previously unselected package libllvm3.2:i386.
Unpacking libllvm3.2:i386 (from .../libllvm3.2_3.2-2ubuntu4_i386.deb) ...
Selecting previously unselected package libunwind8.
Unpacking libunwind8 (from .../libunwind8_1.0.1-4ubuntu2_i386.deb) ...
Selecting previously unselected package libamd2.2.0.
Unpacking libamd2.2.0 (from .../libamd2.2.0_1%3a3.4.0-3ubuntu1_i386.deb) ...
Selecting previously unselected package libblas3.
Unpacking libblas3 (from .../libblas3_1.2.20110419-5_i386.deb) ...
Selecting previously unselected package libcolamd2.7.1.
Unpacking libcolamd2.7.1 (from .../libcolamd2.7.1_1%3a3.4.0-3ubuntu1_i386.deb) ...
Selecting previously unselected package liblapack3.
Unpacking liblapack3 (from .../liblapack3_3.4.2-1
Selecting previously unselected package libcholmod1.7.1.
Unpacking libcholmod1.7.1 (from .../libcholmod1.7.1_1%3a3.4.0-3ubuntu1_i386.deb) ...
Selecting previously unselected package libdouble-conversion0:i386.
Unpacking libdouble-conversion0:i386 (from .../libdouble-conversion0_1.1.1-1_i386.deb) ...
Selecting previously unselected package libfftw3-3:i386.
Unpacking libfftw3-3:i386 (from .../libfftw3-3_3.3.3-2ubuntu1_i386.deb) ...
Selecting previously unselected package libumfpack5.4.0.
Unpacking libumfpack5.4.0 (from .../libumfpack5.4.0_1%3a3.4.0-3ubuntu1_i386.deb) ...
Selecting previously unselected package libarpack2.
Unpacking libarpack2 (from .../libarpack2_3.1.2-2exp1_i386.deb) ...exp3) ...
Selecting previously unselected package julia.
Unpacking julia (from .../julia_0.1.2+dfsg-1_i386.deb) ...
Selecting previously unselected package liberror-perl.
Unpacking liberror-perl (from .../liberror-perl_0.17-1_all.deb) ...
Selecting previously unselected package git-man.
Unpacking git-man (from .../git-man_1%3a1.8.1.2-1_all.deb) ...
Selecting previously unselected package git.
Unpacking git (from .../git_1%3a1.8.1.2-1_i386.deb) ...
Processing triggers for man-db ...
Setting up libfftw3-double3:i386 (3.3.3-2ubuntu1) ...
Setting up libgfortran3:i386 (4.7.2-22ubuntu3) ...
Setting up libllvm3.2:i386 (3.2-2ubuntu4) ...
Setting up libunwind8 (1.0.1-4ubuntu2) ...
Setting up libamd2.2.0 (1:3.4.0-3ubuntu1) ...
Setting up libblas3 (1.2.20110419-5) ...
update-alternatives: using /usr/lib/libblas/libblas.so.3 to provide /usr/lib/libblas.so.3 (libblas.so.3) in auto mode
Setting up libcolamd2.7.1 (1:3.4.0-3ubuntu1) ...
Setting up liblapack3 (3.4.2-1
update-alternatives: using /usr/lib/lapack/liblapack.so.3 to provide /usr/lib/liblapack.so.3 (liblapack.so.3) in auto mode
Setting up libcholmod1.7.1 (1:3.4.0-3ubuntu1) ...
Setting up libdouble-conversion0:i386 (1.1.1-1) ...
Setting up libfftw3-3:i386 (3.3.3-2ubuntu1) ...
Setting up libumfpack5.4.0 (1:3.4.0-3ubuntu1) ...
Setting up libarpack2 (3.1.2-2exp1) ...$ julia
Setting up julia (0.1.2+dfsg-1) ...
Setting up liberror-perl (0.17-1) ...
Setting up git-man (1:1.8.1.2-1) ...
Setting up git (1:1.8.1.2-1) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
hase@hase-Virtual-Machine:
_
_ _ ()_ | A fresh approach to technical computing
() | () () | Documentation: http://docs.julialang.org
_ _ | | __ _ | Type "help()" to list help topics
| | | | | | |/ ` | |
| | || | | | (| | | Version 0.1.2
/ |_'|||__'| | 02f3159-Linux-i386 (2013-03-08 18:17:19)
|__/ |
julia> Pkg.update()
MESSAGE: Auto-initializing default package repository /home/hase/.julia.
MESSAGE: Git would like to know your name to initialize your .julia directory.
Enter it below:
[email protected]
MESSAGE: Thank you. You can change it using run(git config --global user.name NAME
)
MESSAGE: Git would like to know your email to initialize your .julia directory.
Enter it below:
[email protected]
MESSAGE: Thank you. You can change it using run(git config --global user.email EMAIL
)
Initialized empty Git repository in /home/hase/.julia/.git/
[master (root-commit) d1aaa3c] Initial empty commit
Cloning into 'METADATA'...
remote: Counting objects: 4067, done.
remote: Compressing objects: 100% (1896/1896), done.
remote: Total 4067 (delta 624), reused 4042 (delta 603)
Receiving objects: 100% (4067/4067), 333.23 KiB | 470 KiB/s, done.
Resolving deltas: 100% (624/624), done.
[master d2d44c4] Empty package repo
3 files changed, 4 insertions(+)
create mode 100644 .gitmodules
create mode 160000 METADATA
create mode 100644 REQUIRE
Already up-to-date.
julia> Pkg.add("TimeSeries")
MESSAGE: Installing Calendar v0.0.0
Cloning into 'Calendar'...
remote: Counting objects: 150, done.
remote: Compressing objects: 100% (112/112), done.
remote: Total 150 (delta 57), reused 123 (delta 30)
Receiving objects: 100% (150/150), 21.76 KiB, done.
Resolving deltas: 100% (57/57), done.
MESSAGE: Installing DataFrames v0.1.0
Cloning into 'DataFrames'...
remote: Counting objects: 3231, done.
remote: Compressing objects: 100% (996/996), done.
remote: Total 3231 (delta 2172), reused 3119 (delta 2070)
Receiving objects: 100% (3231/3231), 3.28 MiB | 1.08 MiB/s, done.
Resolving deltas: 100% (2172/2172), done.
MESSAGE: Installing ICU v0.0.0
Cloning into 'ICU'...
remote: Counting objects: 59, done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 59 (delta 23), reused 54 (delta 18)
Receiving objects: 100% (59/59), 9.02 KiB, done.
Resolving deltas: 100% (23/23), done.
MESSAGE: Installing Options v0.1.0
Cloning into 'Options'...
remote: Counting objects: 61, done.
remote: Compressing objects: 100% (36/36), done.
remote: Total 61 (delta 15), reused 53 (delta 7)
Receiving objects: 100% (61/61), 9.96 KiB, done.
Resolving deltas: 100% (15/15), done.
MESSAGE: Installing Stats v0.0.0
Cloning into 'Stats'...
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 20 (delta 4), reused 18 (delta 2)
Receiving objects: 100% (20/20), 5.35 KiB, done.
Resolving deltas: 100% (4/4), done.
MESSAGE: Installing TimeSeries v0.0.0
Cloning into 'TimeSeries'...
remote: Counting objects: 771, done.
remote: Compressing objects: 100% (336/336), done.
remote: Total 771 (delta 484), reused 709 (delta 422)
Receiving objects: 100% (771/771), 383.18 KiB | 310 KiB/s, done.
Resolving deltas: 100% (484/484), done.
MESSAGE: Installing UTF16 v0.0.0
Cloning into 'UTF16'...
remote: Counting objects: 14, done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 14 (delta 2), reused 13 (delta 1)
Receiving objects: 100% (14/14), done.
Resolving deltas: 100% (2/2), done.
julia> using TimeSeries
Warning: New definition ref(DataArray{T,N},Union(Array{Bool,1},BitArray{1})) is ambiguous with ref(DataArray{T<:Number,N},Union(Ranges{T},Array{T,1},BitArray{1})) at /home/hase/.julia/DataFrames/src/dataarray.jl:339.
Make sure ref(DataArray{T<:Number,N},Union(Array{Bool,1},BitArray{1})) is defined first.
ERROR: Expr: too few arguments (expected 3)
in anonymous at no file:223
in include_from_node1 at loading.jl:76
in reload_path at loading.jl:96
in require at loading.jl:48
in include_from_node1 at loading.jl:76
in reload_path at loading.jl:96
in require at loading.jl:48
at /home/hase/.julia/Calendar/src/Calendar.jl:222
julia>
$ julia ./run_tests.jl
Warning: using FS.rename in module DataFrames conflicts with an existing identifier.
/home/travis/build.sh: line 188: 5990 Segmentation fault julia ./run_tests.jl
Looks to be a problem with DataFrames and not the TimeSeries package
A TimeSeries type would combine an Array with a row index, whose type is any valid time type (as of now, CalendarTime is the only candidate).
The basic idea is to provide an infrastructure for regularly and irregularly spaced rows, all of which are managed by the index value (date/time) associated with the row. The time stamp would replace the row value in the show of the object, while arbitrary strings would replace the show of the column.
There is some work now in DataFrames to use IndexedVector in association with a DataFrame and the jury is still out on designing a DataSeries type.
The important matter is to not duplicate efforts, so a TimeSeries type would need to be justified.
Pkg.add("TimeSeries")
WARNING: Initializing default package repository /Users/arnim/.julia.
Initialized empty Git repository in /Users/somename/.julia/.git/
[master (root-commit) 6387bc3] Initial empty commit
Cloning into 'METADATA'...
remote: Counting objects: 2925, done.
remote: Compressing objects: 100% (1356/1356), done.
remote: Total 2925 (delta 424), reused 2908 (delta 408)
Receiving objects: 100% (2925/2925), 238.19 KiB, done.
Resolving deltas: 100% (424/424), done.
[master 8993fff] Empty package repo
2 files changed, 4 insertions(+)
create mode 100644 .gitmodules
create mode 160000 METADATA
create mode 100644 REQUIRE
Installing Calendar: v0.0.0
Cloning into 'Calendar'...
remote: Counting objects: 146, done.
remote: Compressing objects: 100% (110/110), done.
remote: Total 146 (delta 55), reused 119 (delta 28)
Receiving objects: 100% (146/146), 21.47 KiB, done.
Resolving deltas: 100% (55/55), done.
Installing DataFrames: v0.0.0
Cloning into 'DataFrames'...
remote: Counting objects: 3053, done.
remote: Compressing objects: 100% (990/990), done.
remote: Total 3053 (delta 2037), reused 2911 (delta 1903)
Receiving objects: 100% (3053/3053), 3.25 MiB | 1.09 MiB/s, done.
Resolving deltas: 100% (2037/2037), done.
Installing ICU: v0.0.0
Cloning into 'ICU'...
remote: Counting objects: 55, done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 55 (delta 21), reused 51 (delta 17)
Receiving objects: 100% (55/55), 8.72 KiB, done.
Resolving deltas: 100% (21/21), done.
Installing Options: v0.0.0
Cloning into 'Options'...
remote: Counting objects: 40, done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 40 (delta 7), reused 36 (delta 3)
Receiving objects: 100% (40/40), 8.27 KiB, done.
Resolving deltas: 100% (7/7), done.
Installing Stats: v0.0.0
Cloning into 'Stats'...
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 16 (delta 3), reused 16 (delta 3)
Receiving objects: 100% (16/16), 4.27 KiB, done.
Resolving deltas: 100% (3/3), done.
Installing TimeSeries: v0.0.0
Cloning into 'TimeSeries'...
remote: Counting objects: 771, done.
remote: Compressing objects: 100% (333/333), done.
remote: Total 771 (delta 484), reused 712 (delta 425)
Receiving objects: 100% (771/771), 383.13 KiB | 316 KiB/s, done.
Resolving deltas: 100% (484/484), done.
error: pathspec '6a6a266412534b82e749bed653f4b5c894100693 ' did not match any file(s) known to git.
An invalid SHA1 hash seems to be registered for TimeSeries. Please contact the package maintainer.
in anonymous at no file:251
in cd at file.jl:26
in _resolve at pkg.jl:247
in anonymous at no file:123
in cd at file.jl:26
in cd_pkgdir at pkg.jl:28
in add at pkg.jl:105
in add at pkg.jl:130
julia> using TimeSeries
ERROR: TimeSeries not found
in require at loading.jl:39
I think the recent merge with the TimeArray branch caused an overwrite somewhere in git world, but I haven't tracked it down yet.
Serialized data is often indexed by some date/time type, but there is no reason this is a requirement. You can index on integers or letters. The package should support a more general set of goals.
Instead of using TradingInstrument, use the Quandl package to download data to demonstrate methods.
DataArrays has been moved from DataFrames into its own package, so a new dependency needs to be added.
If a user wants a choice between a DataFrames version and a TimeStamp version of TimeSeries, the least arbitrary approach is to make both versions nested modules within the package. This is a little awkward though, as it requires extra notation, and good submodule names haven't been discovered yet.
julia> using TimeSeries.DataFrameVersion
julia> using TimeSeries.TimeStampVersion
Also, if you load one version and decide you want the other one as well, any shared method names will need to be explicitly called by the package that was loaded last. This example is a little misleading since I took it from the current timestamp branch where TimeSeries by default uses DataFrames and TimeSeries.TimeStamps is required to access the immutable version.
julia> using TimeSeries.TimeStamps
julia> log_return
# methods for generic function log_return
log_return{T<:TimeStamp{T}}(ts::Array{T<:TimeStamp{T},N}) at /Users/Administrator/.julia/TimeSeries/src/ImmutableTimeSeries/tradinginstrument.jl:39
julia> using TimeSeries
Warning: using TimeSeries.log_return in module Main conflicts with an existing identifier.
julia> log_return
# methods for generic function log_return
log_return{T<:TimeStamp{T}}(ts::Array{T<:TimeStamp{T},N}) at /Users/Administrator/.julia/TimeSeries/src/ImmutableTimeSeries/tradinginstrument.jl:39
julia> TimeSeries.log_return
# methods for generic function log_return
log_return(dv::DataArray{T,N}) at /Users/Administrator/.julia/TimeSeries/src/returns.jl:4
log_return(fa::Array{Float64,1}) at /Users/Administrator/.julia/TimeSeries/src/returns.jl:8
The test suite currently uses require("test.jl"), which apparently no longer works in 0.2.
The new framework appears to be to use using Base.Test and including tests inside a let block.
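In current Julia (1.x) the Base.Test module has become the Test standard library, so a test file in that style might look like the sketch below. The simple_return defined here is a hypothetical stand-in for the package function, included only so the example runs on its own.

```julia
using Test

# Hypothetical function under test: period-over-period simple returns.
simple_return(x::Vector{Float64}) = x[2:end] ./ x[1:end-1] .- 1

@testset "returns" begin
    # The let block keeps the fixture local to this group of tests.
    let prices = [100.0, 110.0, 99.0]
        r = simple_return(prices)
        @test length(r) == 2
        @test r[1] ≈ 0.1
        @test r[2] ≈ -0.1
    end
end
```

The let block gives each group of tests its own scope, so fixtures such as prices don't leak between test files.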
Since DataArray meets the requirement of including NAs, it might be a useful path to explore.
immutable TimeArray{T,N}
    timestamp::Vector{Date{ISOCalendar}}
    values::DataArray{T,N}
    # colnames::Vector{Symbol} or Vector{ASCIIString}
    # batteries included for NAs with DataArray
    # inner constructor to enforce obvious invariants (lengths of elements are equal, etc)
end
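To make the "obvious invariants" concrete, here is a sketch in current Julia (1.x). It uses a plain Array in place of DataArray (an element type such as Union{Missing,Float64} would cover the NA case) and Date from the Dates standard library. The choice of colnames as Vector{String} and the specific checks are assumptions.

```julia
using Dates

struct TimeArray{T,N}
    timestamp::Vector{Date}
    values::Array{T,N}
    colnames::Vector{String}

    # Inner constructor: reject inputs that violate the basic invariants.
    function TimeArray{T,N}(timestamp, values, colnames) where {T,N}
        length(timestamp) == size(values, 1) ||
            throw(DimensionMismatch("need one timestamp per row"))
        length(colnames) == size(values, 2) ||
            throw(DimensionMismatch("need one column name per column"))
        issorted(timestamp) ||
            throw(ArgumentError("timestamps must be in ascending order"))
        return new{T,N}(timestamp, values, colnames)
    end
end

# Outer constructor that infers T and N from the values array.
TimeArray(ts::Vector{Date}, vals::Array{T,N}, names::Vector{String}) where {T,N} =
    TimeArray{T,N}(ts, vals, names)

ta = TimeArray([Date(1980, 1, 3), Date(1980, 1, 4)],
               [1.0 2.0; 3.0 4.0],
               ["Open", "Close"])
```

Since the struct is immutable and the inner constructor is the only way in, any TimeArray that exists has already passed these checks.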
The (proposed) purpose of the package is to provide methods for working with serial data in the DataFrames/DataArrays framework.
It would require the Series package and provide a path from an Array{SeriesPair{T,V},1} to DataFrames and back.
The Series package is currently named DataSeries.jl.
The current implementation for function (i.e., moving, upto, simple_return, log_return, lag, lead) is to have it accept and return a DataArray whose length is retained, with NAs padded where it makes sense. function! accepts a DataFrame and mutates that same object with a new padded column.
Going forward, I think it makes sense to leave the function! semantic alone. This would limit its use to DataFrames and that's an acceptable restriction.
But I'd like to change the semantics for the non-bang version of function to accept either an Array{T,1} or a DataArray and return the same type object but without padding. Padding with NAs doesn't work with Array because it's only defined for DataArray and DataFrame, and this limits this family of functions.
To sum up:
function accepts an Array{T,1} or DataArray and returns the same type with a length equal to the original minus what was lost to computation (lag of length 1 would lose one row, for example).
function! only accepts a DataFrame and returns that same DataFrame with a new column that is the result of the computation; because DataFrame columns must all be the same length, it pads with NAs where it makes sense to do so.
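The two semantics can be contrasted with a small sketch in current Julia (1.x). It uses missing in place of NA and a Dict of columns as a stand-in for a DataFrame; padded_lag, the "_lag" column suffix and the signatures are all hypothetical.

```julia
# Non-bang version: no padding, so the result is n elements shorter.
lag(x::AbstractVector, n::Integer = 1) = x[1:end-n]

# Padded variant: keeps the original length by prepending n missing values.
padded_lag(x::AbstractVector, n::Integer = 1) =
    vcat(fill(missing, n), x[1:end-n])

# Bang version: mutates the table by adding a padded column, so that every
# column keeps the same length. A Dict stands in for a DataFrame here.
function lag!(table::Dict{String,Any}, col::String, n::Integer = 1)
    table[col * "_lag"] = padded_lag(table[col], n)
    return table
end

table = Dict{String,Any}("Close" => [1.0, 2.0, 3.0])
lag!(table, "Close")
short = lag([1.0, 2.0, 3.0])
```

The unpadded result drops the alignment with the original rows, so callers working with timestamps would need to trim those by the same amount.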
The Series.jl package was fun while it lasted but plans are to move on to better things with the TimeArrays.jl package, which is planned to move into TimeSeries.
I'm going to experiment with a separate type elsewhere and leave this package depending on DataFrames.
I need to look into this a bit more but it appears you can add a yml file and travis will run test suites.
These are not general enough for the package and would likely fit better in a financial-oriented package (probably JuliaQuant's TradeAnalysis unregistered package).
Here's a list of the methods inside returns.jl
[src (master)]
✈ ack function returns.jl
function log_return(dv::DataArray)
function log_return(fa::Array{Float64, 1})
function log_return!(df::DataFrame, col::String)
function simple_return(dv::DataArray)
function simple_return(fa::Array{Float64, 1})
function simple_return!(df::DataFrame, col::String)
function equity(dv::DataArray)
function equity!(df::DataFrame, col::String)
The index* family has been renamed by* and read_time is now readtime to track its parent readtable function.
deprecated list:
read_time
indexyear
indexmonth
indexday
indexdow
note: there are other index* methods that haven't been implemented, so they probably don't need deprecation warnings.
This file contains code to download data from local files or from Yahoo. These have been moved to TradingInstrument and do not belong in a package that's meant to have more generic time series functionality.
Now that ema results match those from R's TTR package, make the function more general and name it exp_moving.
To get the ema equivalent in this approach, pass in the mean function.
The exp_moving function will weight values to be computed based on the exponential decay algorithm.
Perhaps a reference to a github pages website, and at the very least many more examples of usage. This may clarify if the current method semantics is good or awkward.
For example, moving applies to a DataArray but moving! modifies a DataFrame with a new column. This seems okay, but not sure.
I've started a readtime branch. The current regex to identify a column is very fragile and breaks very easily.