Comments (55)
Yes, but the first issue is to even be able to enter a given Ops function. When there is conflict the internal operator is called, bypassing both defined operators... It works well when interacting with built-in types, but it seems there's a fundamental limitation with the way S3 classes can interact together or am I missing something?
from nanotime.
Yes, but do note that it's S4 derived from S3, so it conveniently falls back on integer64
ops.
I don't believe S3 can provide for correct ops because of the dispatch issue I mentioned previously.
from nanotime.
OK, I see. Will try to get rid of as many of these as possible.
from nanotime.
It's not a show stopper, but proper implementation of a new type should guarantee the usual behaviour. Else it's an idiosyncrasy for these specific types and personally I would call that a bug because it's a behaviour that will surprise and bite a user.
It definitely bit me when I was writing the utests!
from nanotime.
There is a small beginning of Ops()
in the R/
directory. Contributions welcome.
> library(nanotime)
R> nanotime("2017-02-02T17:16:27.633531000+00:00") + 1e9
[1] "2017-02-02T17:16:28.633531000+00:00"
R>
Agree on both points, but someone has to fill it in...
from nanotime.
I don't know how to solve the "Incompatible methods" issue. If both integer64
and nanotime
define +
then the dispatch will conflict. This is from ?groupGeneric
:
"If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used."
and applies whether the ops are group- or individually-defined. So it seems to be a general shortcoming of S3 dispatch. The same happens for POSIXct
for example:
Sys.time() + as.integer64(1)
[1] "2017-02-03 10:17:05 EST"
Warning message:
Incompatible methods ("+.POSIXt", "+.integer64") for "+"
Do you know if there's a workaround for the dispatch?
from nanotime.
Uggh. I learned one nice trick from bit64::integer64 -- you can temporarily remove a class attribute. So we could make the c(nanotime, integer64)
object just an integer64, and then dispatch on the Ops.
Also, xts has some Ops code to look at left/right operand which is what we need to emulate difftime, ie difference between two nanotime objects becomes integer64.
(In other news, I finally built the R package for ztsdb too and managed to connect to an instance but haven't done anything. Been busy...)
from nanotime.
I fear you may be correct. But it is an area I do know not that much about.
@joshuaulrich Any idea?
from nanotime.
I've had some success defining nanotime
as an S4 class that contains the S3 class integer64
. Wrapped in setMethod
, there is little change to the original code. More changes of course with the operators (and some more needed still). There are now more xts
functions working with nanotime
. Here are the changes: 68b4f5e. Also, didn't get round to doing the as.data.table
function, etc. - really just a proof of concept for now.
from nanotime.
Uh-oh. I always found S4 to be somewhat more heavy-handed. So I am a little reluctant.
But if we manage to replicate all current features and then get better ops I could be persuaded.
from nanotime.
from nanotime.
I think we should try to work out Ops first. Working with xts
, zoo
is next. data.table
, which is wicked good (but a little "different") already works.
As for tests, I happened to be a RUnit user too. "Old-school".
from nanotime.
I've got the ops working, plus a decent utest coverage here: 9d415a6
- I believe all the previous functionality is there without change
- can you check the choices I made for operator type checking decisions (i.e. what is permissible and what generates an error)? I've tried to follow R and POSIXct where reasonable
- I'm unsure about the necessity of rbind/cbind; in any case it's not possible to generate a matrix (and this is true for
POSIXct
too, btw) - I added a basic ./nanotime/tests/runTests.R to run the utests with
R CMD check
; I've left the previous tests so you can check I didn't forget anything - I've got basic zoo, data.frame, xts (with my modifications to it) working; I'm not sure what you wanted to test for data.table...
- I'm not sure if we need any conversions, see the comment at the end of nanotime.R for an example of a (minor) issue
from nanotime.
Wow wow wow. About to leave for work so not sure when I can check but that is very promising.
from nanotime.
But it is S4 now? Hm.....
from nanotime.
I glanced at it on the train in. There are a few things we may want to clean up -- eg I do not like roxygen2 to touch my DESCRIPTION or NAMESPACE so I usually do not use @import
, @importFrom
, @export
. But those are all small fry.
Let me clean up what we have over in S3 land. Maybe even make one more release. And then be convert. I think you are correct that we have to go this route to 'tighten' the operations. However, a few of us have some (mostly personal though) "bad taste" in our mouths from some older-and-now-mostly-dead time series classes in S4: its, dateTime, ... so we may want to do this carefully.
from nanotime.
Agreed, I'll take out the @export
. With S4 and the increase in number of exports, it was easier to get them correct with roxygen2.
I was looking at performance numbers, and the S4 version is about 2x slower than the S3 version when the result is a nanotime
. Looks like the constructor makes a copy even where it could potentially avoid one. I briefly looked at C code for S4 generation and it didn't seem immediately obvious if this could be easily alleviated or not.
I also caught a really strange performance issue with c
which had an easy workaround (fixed in a58fc22)
The immediate conclusion though is that in its current state the implementation with S4 is about 2x slower than with S3. That seems to be the trade-off for having correct nanotime
operations.
from nanotime.
Yes, S4 has a reputation for being a little slower. It is also said to be more solid.
from nanotime.
And I guess while we're mucking with the different classes, the @export
tag may make sense. In general I still avoid them...
from nanotime.
I did get random roxygen errors trying to build of your branch.
from nanotime.
I've been having issues too. For some reason I can't run document()
twice in the same R session.
Furthermore, I have an issue in the part where it generates keywords. I assumed that it was my Roxygen version 6.0.0.9000 (the latest on github) and so I ended up modifying Roxygen's code. If my memory is correct I was having that issue with your unmodified package too, that's why I assumed it was some issue introduced between 5.0.1 and 6.0.0.9000. I can't check this right now but will do later today when I get back to my computer.
from nanotime.
Just say no to GitHub. Let's use roxygen 6.0.1 from CRAN.
from nanotime.
I installed 6.0.1, looked at the error I was getting a little closer, and it turns out I had an old version of stringr
. I still can only generate documentation once in a session, but this works for me:
library(devtools)
document("~/prog/nanotime")
from nanotime.
I prefer to do these things at the command-line. Maybe look into littler, link its examples/
script to /usr/local/bin
and then call roxy.r
?
(Which by my preferences only does Rd and no other roclets.)
from nanotime.
roxy.r
makes a direct call to roxygenize
. This produces the error "unable to find required package ‘roxygen_devtest"... The error occurs when defining nanotime
as an S4, specifically inside the setClass
function when the code is loaded via roxygen's function source_package
inside roxygenize
. It's looks like an issue in roxygen2. document
works because it calls roxygenize
with its own source function. There is this note in roxygen2::source_package
:
#' This is a simple attempt to load code in a package used by
#' [roxygenize]. It will work with simple packages, but fail if
#' there are compiled files, data files, etc. In that case, it's better to
#' use [devtools::document].
It looks like document
uses a more sophisticated way of loading the required package using in particular load_all
. That seems to also work for the setClass
...
Do you think there is ground to raise an issue with roxygen2
? My feeling is we'll be told to use document
...
from nanotime.
By the way, it seems it's also possible to specify the roclets with document
, so we can achieve the same as with the direct call to roxygenize
...
from nanotime.
I don't know / I don't care in the sense that I do not use devtools
, but if I must some of the component packages split off it (ie remotes::install_github
).
I am sure you can find a way to let us experiment with s4 as an alternative representation.
from nanotime.
I'm not sure what error you are encountering.
I have no issues installing the package. Either via R CMD build nanotime
and then R CMD INSTALL ...
or via remotes::install_github
. Now that my R packages are up-to-date I'm also not seeing any errors while generating documentation with document
(but I do have the problem mentioned above when using roxygenize
directly).
from nanotime.
I'm not sure what error you are encountering.
Basically R CMD check
. I happen to like command-line helpers and tools, so rcc.r
is just shorthard for calling Gabor's rcmdcheck() function, which, in essence, just prettyprints.
You are getting three NOTES and WARNING. I prefer to keep my packages ready for CRAN with minimal warning. I would not yet commit this to my main branch.
edd@max:~/git/leonardo-nanotime$ rcc.r
────────────────────────────────────────────────────────────────────────────────
─ using log directory ‘/tmp/file420a669023e9/nanotime.Rcheck’
─ using R version 3.3.2 (2016-10-31)
─ using platform: x86_64-pc-linux-gnu (64-bit)
─ using session charset: UTF-8
✔ checking for file ‘nanotime/DESCRIPTION’
─ checking extension type ... Package
─ this is package ‘nanotime’ version ‘0.1.1’
✔ checking package namespace information
✔ checking package dependencies
✔ checking if this is a source package
✔ checking if there is a namespace
✔ checking for executable files
✔ checking for hidden files and directories
✔ checking for portable file names
✔ checking for sufficient/correct file permissions
✔ checking whether package ‘nanotime’ can be installed
✔ checking installed package size
✔ checking package directory
✔ checking DESCRIPTION meta-information
✔ checking top-level files
✔ checking for left-over files
✔ checking index information
✔ checking package subdirectories
✔ checking R files for non-ASCII characters
✔ checking R files for syntax errors
✔ checking whether the package can be loaded
✔ checking whether the package can be loaded with stated dependencies
✔ checking whether the package can be unloaded cleanly
N checking whether the namespace can be loaded with stated dependencies
Warning: no function found corresponding to methods exports from ‘nanotime’ for: ‘Arith’, ‘Compare’, ‘Complex’, ‘Logic’, ‘Math’, ‘Math2’, ‘Summary’, ‘show’
A namespace must be able to be loaded with just the base namespace
loaded: otherwise if the namespace gets loaded by a saved object, the
session will be unable to start.
Probably some imports need to be declared in the NAMESPACE file.
✔ checking whether the namespace can be unloaded cleanly
✔ checking loading without being on the library search path
N checking dependencies in R code
package 'methods' is used but not declared
✔ checking S3 generic/method consistency
✔ checking replacement functions
✔ checking foreign function calls
N checking R code for possible problems
as.data.frame.nanotime: no visible global function definition for
‘S3Part’
as.integer64.nanotime: no visible global function definition for
‘S3Part’
nanotime.matrix: no visible global function definition for ‘new’
+,integer64-nanotime: no visible global function definition for ‘new’
+,integer64-nanotime: no visible global function definition for
‘S3Part’
+,nanotime-integer64: no visible global function definition for ‘new’
+,nanotime-integer64: no visible global function definition for
‘S3Part’
+,nanotime-numeric: no visible global function definition for ‘new’
+,nanotime-numeric: no visible global function definition for ‘S3Part’
+,numeric-nanotime: no visible global function definition for ‘new’
+,numeric-nanotime: no visible global function definition for ‘S3Part’
-,nanotime-integer64: no visible global function definition for ‘new’
-,nanotime-integer64: no visible global function definition for
‘S3Part’
-,nanotime-nanotime: no visible global function definition for ‘S3Part’
-,nanotime-numeric: no visible global function definition for ‘new’
-,nanotime-numeric: no visible global function definition for ‘S3Part’
Arith,nanotime-ANY: no visible global function definition for
‘callNextMethod’
Arith,nanotime-ANY: no visible global function definition for ‘S3Part’
Compare,nanotime-ANY: no visible global function definition for
‘callNextMethod’
Compare,nanotime-ANY: no visible global function definition for
‘S3Part’
[,nanotime: no visible global function definition for ‘new’
[,nanotime: no visible global function definition for ‘callNextMethod’
[<-,nanotime: no visible global function definition for ‘new’
[<-,nanotime: no visible global function definition for
‘callNextMethod’
c,nanotime: no visible global function definition for ‘new’
c,nanotime: no visible global function definition for ‘callNextMethod’
c,nanotime: no visible global function definition for ‘S3Part’
cbind,nanotime: no visible global function definition for ‘new’
cbind,nanotime: no visible global function definition for ‘S3Part’
max,nanotime: no visible global function definition for ‘new’
max,nanotime: no visible global function definition for
‘callNextMethod’
min,nanotime: no visible global function definition for ‘new’
min,nanotime: no visible global function definition for
‘callNextMethod’
nanotime,POSIXct: no visible global function definition for ‘new’
nanotime,character: no visible global function definition for ‘new’
range,nanotime: no visible global function definition for ‘new’
range,nanotime: no visible global function definition for
‘callNextMethod’
rbind,nanotime: no visible global function definition for ‘new’
rbind,nanotime: no visible global function definition for ‘S3Part’
Undefined global functions or variables:
S3Part callNextMethod new
Consider adding
importFrom("methods", "S3Part", "callNextMethod", "new")
to your NAMESPACE file (and ensure that your DESCRIPTION Imports field
contains 'methods').
✔ checking Rd files
✔ checking Rd metadata
✔ checking Rd cross-references
W checking for missing documentation entries
Undocumented code objects:
‘cbind’ ‘rbind’
Undocumented S4 classes:
‘nanotime’
Undocumented S4 methods:
generic '+' and siglist 'ANY,nanotime'
generic '+' and siglist 'integer64,nanotime'
generic '+' and siglist 'nanotime,ANY'
generic '+' and siglist 'nanotime,integer64'
generic '+' and siglist 'nanotime,nanotime'
generic '+' and siglist 'nanotime,numeric'
generic '+' and siglist 'numeric,nanotime'
generic '-' and siglist 'ANY,nanotime'
generic '-' and siglist 'nanotime,character'
generic '-' and siglist 'nanotime,integer64'
generic '-' and siglist 'nanotime,nanotime'
generic '-' and siglist 'nanotime,numeric'
generic 'Arith' and siglist 'nanotime,ANY'
generic 'Compare' and siglist 'nanotime,ANY'
generic 'Complex' and siglist 'nanotime'
generic 'Logic' and siglist 'ANY,nanotime'
generic 'Logic' and siglist 'nanotime,ANY'
generic 'Math2' and siglist 'nanotime'
generic 'Math' and siglist 'nanotime'
generic 'Summary' and siglist 'nanotime'
generic '[' and siglist 'nanotime'
generic '[<-' and siglist 'nanotime'
generic 'c' and siglist 'nanotime'
generic 'cbind' and siglist 'nanotime'
generic 'max' and siglist 'nanotime'
generic 'min' and siglist 'nanotime'
generic 'range' and siglist 'nanotime'
generic 'rbind' and siglist 'nanotime'
generic 'show' and siglist 'nanotime'
All user-level objects in a package (including S4 classes and methods)
should have documentation entries.
See chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.
✔ checking for code/documentation mismatches
W checking Rd \usage sections
Documented arguments not in \usage in documentation object 'nanotime':
‘justify’ ‘digits’ ‘na.encode’ ‘trim’ ‘e1’ ‘e2’
Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.
✔ checking Rd contents
✔ checking for unstated dependencies in examples
✔ checking examples
W checking for unstated dependencies in ‘tests’
'library' or 'require' call not declared from: ‘xts’
✔ checking tests
✔ checking PDF version of manual
── 0 errors ✔ | 3 warnings ✖ | 3 notes ✖
edd@max:~/git/leonardo-nanotime$
from nanotime.
The last one should just be copy-cat'ing; I had to do the same for the format()
and print()
methods. Some of the other stuff is probably easily learnable from other S4 packages.
from nanotime.
No warnings and notes in eadff32. I don't think it's satisfactory because the methods are defined to produce an error end up in the documentation (e.g. methods for group Complex
). We could put a warning that they generate an error, or pull them out and live with a warning during R CMD check. I wasn't able to suppress the warning with @noRd
. You probably have a better feel than me for all this. As soon as I get a minute I'll pull the changes you made - I'm a few commits behind.
from nanotime.
Ok, will catch up later -- trying to go for a run now having spent the morning on two other repos.
Rebasing is probably a good idea if you can. I also prefer the trees to not be too far apart.
from nanotime.
Couldn't resist a really quick peek -- any idea?
✔ checking for unstated dependencies in examples
✔ checking examples
✔ checking for unstated dependencies in ‘tests’
E checking tests
Running the tests in ‘tests/runTests.R’ failed.
Last 13 lines of output:
done successfully.
Executing test function test_rbind.xts ... Timing stopped at: 0.004 0 0.002
Error in xts(matrix(1:30, 10, 3), i1) :
order.by requires an appropriate time-based object
done successfully.
Error in checkForErrors(results) :
error or failure in test_xts.R : list(nTestFunc = 8, nDeactivated = 0, nErr = 8, nFail = 0, dirs = "../nanotime/unitTests", testFileRegexp = "^test_xts.R$", testFuncRegexp = "^test.+", sourceFileResults = list(`../nanotime/unitTests/test_xts.R` = list(test_align.time = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_c.xts = list(kind = "error", msg = "Error in xts(matrix(1:30, 10, 3), i1) : \n order.by requires an appropriate time-based object\n",
checkNum = 0, traceBack = NULL), test_cbind.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_coredata.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_dimnames.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by require
Execution halted
✔ checking PDF version of manual
── 1 error ✖ | 0 warnings ✔ | 0 notes ✔
edd@max:~/git/leonardo-nanotime(master)$
from nanotime.
Ouch... I'm running with my version of xts
which adds nanotime
as a time-based object. I'll disable these tests for now until xts
integrates nanotime
.
from nanotime.
Tried my best to rebase, but I had already previous merges and my two attempts ended up loosing my commit history (although the code was correct). I ended up doing a merge because it's safer in this case. Note that I reverted in the process my changes for automatic namespace generation, so it's back to manual as in your branch.
from nanotime.
Appreciate the man-to-man combat with git.
from nanotime.
Yes, that was definitely scary :)
from nanotime.
I poked you over on slack. Wanna chat there?
from nanotime.
For the rbind and cbind issues, I've done some testing and looked at the source code. It doesn't look like one can do any better.
Just for the record, it's not possible to use cbind2 and rbind2 because there exist rbind.integer64 and cbind.integer64. As documented, if an S3 method can be found, there will never be a dispatch onto methods:rbind. Here is an example to show that:
setOldClass("foo")
as.foo <- function(x) {
oldClass(x) <- "foo"
x
}
setClass("bar", contains="foo")
setMethod("rbind2",
signature("bar", "bar"),
function (x, y, ...) {
print("rbind2 called!")
new("bar", as.foo(c(x,y)))
})
a <- new("bar", as.foo(matrix(1,1,1)))
## as soon as the following is defined, 'rbind2' is not called anymore:
rbind.foo <- function(..., deparse.level = 1) {
print("rbind.foo called")
}
At any rate, I think rbind and cbind are not needed. POSIXct doesn't have them and works in the same way as nanotime would work without rbind and cbind definitions:
cbind(as.POSIXct("1970-01-01 00:00:00") + 1:10)
[,1]
[1,] 18001
[2,] 18002
[3,] 18003
[4,] 18004
[5,] 18005
[6,] 18006
[7,] 18007
[8,] 18008
[9,] 18009
[10,] 18010
What do you think?
from nanotime.
For the names issue with c it's possible to define c.nanotime as an S3 method and the naming issue would normally be resolved except for what could be considered a bug in as.integer64 which loses the names of a named vector:
c(a=as.integer64(1), b=as.integer64(2))
## integer64
## a b
## 1 2
as.integer64(c(a=1, b=2)) ## loses the names
## integer64
## [1] 1 2
If you agree, I'll raise the question with bit64, but I don't think it's a show stopped for nanotime.
from nanotime.
re cbind
/ rbind
: I think we should proceed without them. No name clashing, and as you say, the functionality may not be needed
re c
: I am confused. I thought the c()
operator always dropped in R, irregardless of class or type?
from nanotime.
Yes, it's true it always drops, but I'm just considering if c keeps the names or not. If one builds a named vector one should not expect the names to disappear like in as.integer64. Here is an example with POSIXct:
> as.POSIXct(c(a="1970-01-01 00:00:00", b="1970-01-01 00:00:00"))
## a b
## "1970-01-01 EST" "1970-01-01 EST"
> c(a=as.POSIXct("1970-01-01 00:00:00"), b=as.POSIXct("1970-01-01 00:00:00"))
## a b
## "1970-01-01 EST" "1970-01-01 EST"
In the previous example with as.integer64, the names a and b are dropped.
from nanotime.
[ imagine a shrugs emoji here ]
Do we care about names on a time index vector?
from nanotime.
And a couple of additional precisions: I think that nanotime will not necessarily be used only as a time index, and second, if as.integer64 is fixed, nanotime will also automatically be fixed without needing to change the code.
from nanotime.
I think this can be closed as we folded the change in.
from nanotime.
Any unpleasant consequences of this change so far? It looks like a pretty heavy handed fix for the size of the original problem.
dplyr + frends are norious for bad support of generic S4 vectors. Currently I see
> tibble(x = 1:10, y = nanotime(Sys.time()))
Error in callNextMethod() :
error in evaluating a 'primitive' next method: Error in FUN(x = x, i = i) : could not find function "FUN"
> tibble(x = 1:10, y = integer64(1))
# A tibble: 10 x 2
x y
<int> <S3: integer64>
1 1 0
2 2 0
3 3 0
4 4 0
5 5 0
6 6 0
7 7 0
8 8 0
9 9 0
10 10 0
>
An alternative fix to the original problem might have been to overwrite integer64
arithmetic ops. There are a number of options to do so.
More importantly (and please correct me if I am wrong), it seems that the only reason for +/- methods in nanotime is to preserve the class. In which case, bit64
methods do that already:
> tt <- integer64(1)
> class(tt) <- c("blabla", "integer64")
> class(tt + integer64(1))
[1] "blabla" "integer64"
So, it looks that neither this change nor the original S3 methods are actually necessary.
from nanotime.
I think @lsilvest are not aware of side effects. We just merged a small change needed for a bit64 update. Our use cases of nanotime with data.table and Leonardo's zts work fine.
> library(nanotime)
>
> data.frame(x=1:10, y=nanotime(Sys.time()))
Error in data.frame(x = 1:10, y = nanotime(Sys.time())) :
arguments imply differing number of rows: 10, 1
>
> data.frame(x=1:10, y=nanotime(Sys.time())+0:9)
x y
1 1 2018-04-19T15:15:45.597214000+00:00
2 2 2018-04-19T15:15:45.597214001+00:00
3 3 2018-04-19T15:15:45.597214002+00:00
4 4 2018-04-19T15:15:45.597214003+00:00
5 5 2018-04-19T15:15:45.597214004+00:00
6 6 2018-04-19T15:15:45.597214005+00:00
7 7 2018-04-19T15:15:45.597214006+00:00
8 8 2018-04-19T15:15:45.597214007+00:00
9 9 2018-04-19T15:15:45.597214008+00:00
10 10 2018-04-19T15:15:45.597214009+00:00
>
We worked pretty hard on the S3 and S4 methods, and still see the current ones as correct. I don't recall the excact details but my initial hopes also were to inherit as much as possible from bit64 and its integer64 class -- but some date/time semantics got it in the way, and here we are.
Now, the data.frame
throws an error if I try it with a scalar nanotime and a vector. That is base R, and for me closer to reference behaviour than what a tibble may do. @lsilvest, any thoughts on this?
from nanotime.
With tibble the non-recycling version works but fails on printing:
> tt <- tibble(x=1:10, y=nanotime(Sys.time())+0:9)
> tt
Error in callNextMethod() :
error in evaluating a 'primitive' next method: Error in FUN(x = x, i = i) : could not find function "FUN"
Most likely a tibble issue, but well ...
from nanotime.
Really interesting stuff: thanks!
If we just rely on integer64
and don't define any operators, we get this:
> nanotime(1) + nanotime(2)
integer64
[1] 3
attr(,".S3Class")
[1] integer64
attr(,"class")
[1] nanotime
This is because despite the attributes being right, attributes are not enough to make something an S4 object.
So that's the first rationale for defining the operators with nanotime
. A second rationale is that it's undesirable to have unintended automatic typecasts away from nanotime
, so with S4 it's possible to be tighter and not allow (unwanted) meaningless operations. This is especially important with the forthcoming additional types (duration
, interval
, period
) that are in my nanotime
fork in the temporal_types
branch.
I'm finding more and more that using an S4 object containing an S3 is tricky. We've got two systems interacting in subtle ways. I also had a nasty surprise with seemingly small method dispatch changes breaking some of the nanotime
code between R-3.3.1 and R-3.4.4.
I'm going to go through very carefully the issues above and also carefully read the R code for method dispatch.
@vspinu, I'm really not a fan of overriding. Seems kludgy to me and have been bitten on occasion when doing that. Maybe it's possible to do that safely, but I'm not convinced.
It might be easier in the end to not use integer64
as the base for nanotime
while still defining conversion functions.
from nanotime.
We can always burn it all down and rewrite in R6 :) Until then it mostly works and gives us some integer64-based time ops.
from nanotime.
This is because despite the attributes being right, attributes are not enough to make something an S4 object. So that's the first rationale for defining the operators with nanotime
You assume using S4 as base class. I was suggesting S3 rewrite and defining only those binary operations with both operands being nanotime (+,-). All other operations integer64 handles correctly.
This is especially important with the forthcoming additional types (duration, interval, period) that are in my nanotime fork in the temporal_types branch.
I was not aware of Howard Hinnant's libraries. It's an interesting approach what you took. This time-span prototyping at C++ level doesn't quite smell right to me, but I am really curious where it will bring you and what will be the performance eventually. Will this work on Date and POSIX eventually?
Your branch seems to require very recent compiler. C++14 and gcc6.3 fail with error: ‘literals’ is not a namespace-name
.
On a general note, experience with lubridate suggests that S4 for time spans is more pain than it's worth, inline binary operators are limited and it's more fruitful to focus on functional APIs, period arithmetics is insanely hard to make right, periods should not be limited to basic year - seconds units, etc. I have just done my share of struggle in timechange - first package in lubridate-rewrite series (hopefully done right this time). Will be adding support for nanotime soonish.
from nanotime.
@eddelbuettel, it turns out that the data.frame
issue has nothing to do with S4 or S3: the same failure occurs for integer64
:
> data.frame(x=1:10, y=as.integer64(0))
Error in data.frame(x = 1:10, y = as.integer64(0)) :
arguments imply differing number of rows: 10, 1
Digging into the data.frame
function, it looks like it could be a bug; in the function we find these lines:
for (i in seq_len(n)[nrows < nr]) {
xi <- vlist[[i]]
if (nrows[i] > 0L && (nr%%nrows[i] == 0L)) {
xi <- unclass(xi)
fixed <- TRUE
for (j in seq_along(xi)) {
xi1 <- xi[[j]]
if (is.vector(xi1) || is.factor(xi1))
xi[[j]] <- rep(xi1, length.out = nr)
else if (is.character(xi1) && inherits(xi1, "AsIs"))
xi[[j]] <- structure(rep(xi1, length.out = nr),
class = class(xi1))
else if (inherits(xi1, "Date") || inherits(xi1,
"POSIXct"))
xi[[j]] <- rep(xi1, length.out = nr)
else {
fixed <- FALSE
break
}
}
if (fixed) {
vlist[[i]] <- xi
next
}
}
stop(gettextf("arguments imply differing number of rows: %s",
paste(unique(nrows), collapse = ", ")), domain = NA)
}
The unclass(xi)
is made on columns that have been previously already converted to a data.frame
. So xi
becomes a list, but its contents are still of the original class. That's why the is.vector
will return false (because xi[[j]]
has non-name attributes) and the rep
will not be called. If I change the test for is.vector(xi1)
to is.vector(unclass(xi1))
it works fine. Initially I thought the original intent was to unclass the content, but because of the specific tests afterwards for POSIXct
and Date
(cheating! :) ) that's not the case. But I'm baffled as to why data.frame
would not allow a well-formed type that has rep
to be used in the same way as POSIXct
and Date
... Feels like a bug to me...
from nanotime.
@vspinu :
One of the main benefits of S4 is the fact that it gives better control on dispatch for functions with multiple parameters. S3 has limitations that make it really difficult to implement a set of types that are coherent.
I'm not sure I understand what part you are referring to that doesn't smell good. I'm not especially happy with the use of vectors of integer64
, if that's what you mean, but it seems to work relatively well. If you mean something different, I'm very interested in getting your comments.
I haven't looked at the performance of the R version in detail yet, but I've done some performance testing for the C++ version of the code which is part of my ztsdb
project and it turns out to be quite good. If you are interested, there is some documentation on the types here, but unfortunately I haven't yet written up the results of my performance testing, but will get round to that eventually. You might be interested in what I'm doing in dtts.utils. There is some documentation on the align functionality as implemented in ztsdb
here.
I compile the temporal_types
branch with gcc4.8.5, so it might be the inverse problem. Will try out gcc6.3 and see if I can reproduce the issue you are encountering.
I will look at your work on timechange
as soon as I have some time.
from nanotime.
ut I'm baffled as to why data.frame would not allow a well-formed type that has rep to be used in the same way as POSIXct and Date... Feels like a bug to me...
Yerh. I would say that the issue is that is.vector is not dispatched to is.vector.integer64
. is.vector(integer64(1))
doesn't work for me even in GlobalEnv. is.vector
, unlike as.vector
, is not an internal S3 generic and bit64 registration doesn't have an effect. Would be nice to pitch this to R folks.
S4 is the fact that it gives better control on dispatch for functions with multiple parameters
True, but do you really have use cases for multiparameter dispatch except the arithmetic operations? If not and S3 arithmetics could be made to work, there is little point of moving to S4.
with the use of vectors of integer64, if that's what you mean, but it seems to work relatively well.
Internal representation of period/interval is vector of a length multiple of the conceptual vector. My hunch is that it will be very hard to make this work with R ecosystem, data.frames most importantly. Users will have to stay in nanotime bubble and rely on nanotime methods exclusively.
But I meant something else. What didn't look right to me is the intermediate type layer of periods and intervals at C level. Now I understand that you are reusing the code from ztsdb, which does make total sense. But such design, while cleaner and less coupled with R type system, precludes (or at least makes it harder) the vectorized style optimization. If you would prototype at R level (say period as a data-frame or a matrix) then common operations could have been done once per R call and construction of proxy types avoided altogether.
What I am pursuing with lubridate's rewrite is to first have a set of CCTZ based low level functional utilities in one package (timechange
) then prototype timespans at R level and call utilities from timechange
for the heavy (vectorized) lifting. Extreme R coupling but with best performance out of the box.
You might be interested in what I'm doing in dtts.utils. There is some documentation on the align functionality as implemented in ztsdb here.
That's interesting. I am using my own rolling interval aggregators for irregular time series :) Will definitely have a closer look.
from nanotime.
The "intermediate layer" is not only because the code was carried over from somewhere else. It's because that's one way of preserving the vector in memory in R. I don't see how using matrix
or data.frame
solves anything (same problems arise when trying to use data.frame
or data.table
). At any rate, you'll see that the construction of proxy types is a null operation. It's mostly clean typecasting.
I've looked at data.frame
and data.table
integration. The former can work up to a certain extent, the latter pokes too much at C level for anything that's not built upon a clean normal base type of R to work, and nanoival
and period
are too big to fit in any base type of R. I believe the usefulness of these types remains even if they can't be used in a data frame
or data.table
.
I like your rolling interval aggregators. There's some similarity with what my align function does when you supply a method like mean
.
from nanotime.
Related Issues (20)
- R-devel nags HOT 3
- Filter timestamps between times from specific time zone HOT 2
- Add/subtract methods for nanoduration/difftime durations HOT 1
- Add/subtract methods for nanotime datetimes and difftime durations HOT 1
- Add 'as.nanoduration.Duration()' HOT 5
- Issues with 'nanoival()' checks/tests HOT 7
- 'as.nanotime(NA_character_)' throws an ERROR HOT 5
- 7 out of 1256 tests failed on ppc32 HOT 5
- POSIXct -> nanotime -> POSIXct consistency HOT 13
- nanoduration divided by nanoduration should return double rather than integer64
- as.Date does not handle the case where tz is a vector HOT 5
- Could you implement comparaison on nanoduration and Character HOT 3
- Duration since yesterday's midnight in some timezone HOT 4
- nanotime 0.3.3 release HOT 2
- as.character(nanotime) returns the underlying integer HOT 4
- Export `nanoperiod` C interface for other packages to use
- Incorrect parsing for negative `nanoperiod`
- Support for abs() HOT 13
- Add to awesome-vctrs? HOT 1
- Incorrect subsetting with operator `%in%` HOT 15
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from nanotime.