GithubHelp home page GithubHelp logo

nanotime arithmetic? about nanotime HOT 55 CLOSED

eddelbuettel avatar eddelbuettel commented on June 4, 2024
nanotime arithmetic?

from nanotime.

Comments (55)

lsilvest avatar lsilvest commented on June 4, 2024 1

Yes, but the first issue is to even be able to enter a given Ops function. When there is conflict the internal operator is called, bypassing both defined operators... It works well when interacting with built-in types, but it seems there's a fundamental limitation with the way S3 classes can interact together or am I missing something?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024 1

Yes, but do note that it's S4 derived from S3, so it conveniently falls back on integer64 ops.

I don't believe S3 can provide for correct ops because of the dispatch issue I mentioned previously.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024 1

OK, I see. Will try to get rid of as many of these as possible.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024 1

It's not a show stopper, but proper implementation of a new type should guarantee the usual behaviour. Else it's an idiosyncrasy for these specific types and personally I would call that a bug because it's a behaviour that will surprise and bite a user.

It definitely bit me when I was writing the utests!

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

There is a small beginning of Ops() in the R/ directory. Contributions welcome.

> library(nanotime)
R> nanotime("2017-02-02T17:16:27.633531000+00:00") + 1e9
[1] "2017-02-02T17:16:28.633531000+00:00"
R> 

Agree on both points, but someone has to fill it in...

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I don't know how to solve the "Incompatible methods" issue. If both integer64 and nanotime define + then the dispatch will conflict. This is from ?groupGeneric:

"If a method is found for just one argument or the same method is found for both, it is used. If different methods are found, there is a warning about ‘incompatible methods’: in that case or if no method is found for either argument the internal method is used."

and applies whether the ops are group- or individually-defined. So it seems to be a general shortcoming of S3 dispatch. The same happens for POSIXct for example:

Sys.time() + as.integer64(1)
[1] "2017-02-03 10:17:05 EST"
Warning message:
Incompatible methods ("+.POSIXt", "+.integer64") for "+" 

Do you know if there's a workaround for the dispatch?

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Uggh. I learned one nice trick from bit64::integer64 -- you can temporarily remove a class attribute. So we could make the c(nanotime, integer64) object just an integer64, and then dispatch on the Ops.

Also, xts has some Ops code to look at left/right operand which is what we need to emulate difftime, ie difference between two nanotime objects becomes integer64.

(In other news, I finally built the R package for ztsdb too and managed to connect to an instance but haven't done anything. Been busy...)

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I fear you may be correct. But it is an area I do know not that much about.

@joshuaulrich Any idea?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I've had some success defining nanotime as an S4 class that contains the S3 class integer64. Wrapped in setMethod, there is little change to the original code. More changes of course with the operators (and some more needed still). There are now more xts functions working with nanotime. Here are the changes: 68b4f5e. Also, didn't get round to doing the as.data.table function, etc. - really just a proof of concept for now.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Uh-oh. I always found S4 to be somewhat more heavy-handed. So I am a little reluctant.

But if we manage to replicate all current features and then get better ops I could be persuaded.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I think we should try to work out Ops first. Working with xts, zoo is next. data.table, which is wicked good (but a little "different") already works.

As for tests, I happened to be a RUnit user too. "Old-school".

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I've got the ops working, plus a decent utest coverage here: 9d415a6

  • I believe all the previous functionality is there without change
  • can you check the choices I made for operator type checking decisions (i.e. what is permissible and what generates an error)? I've tried to follow R and POSIXct where reasonable
  • I'm unsure about the necessity of rbind/cbind; in any case it's not possible to generate a matrix (and this is true for POSIXct too, btw)
  • I added a basic ./nanotime/tests/runTests.R to run the utests with R CMD check; I've left the previous tests so you can check I didn't forget anything
  • I've got basic zoo, data.frame, xts (with my modifications to it) working; I'm not sure what you wanted to test for data.table...
  • I'm not sure if we need any conversions, see the comment at the end of nanotime.R for an example of a (minor) issue

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Wow wow wow. About to leave for work so not sure when I can check but that is very promising.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

But it is S4 now? Hm.....

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I glanced at it on the train in. There are a few things we may want to clean up -- eg I do not like roxygen2 to touch my DESCRIPTION or NAMESPACE so I usually do not use @import, @importFrom, @export. But those are all small fry.

Let me clean up what we have over in S3 land. Maybe even make one more release. And then be convert. I think you are correct that we have to go this route to 'tighten' the operations. However, a few of us have some (mostly personal though) "bad taste" in our mouths from some older-and-now-mostly-dead time series classes in S4: its, dateTime, ... so we may want to do this carefully.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Agreed, I'll take out the @export. With S4 and the increase in number of exports, it was easier to get them correct with roxygen2.

I was looking at performance numbers, and the S4 version is about 2x slower than the S3 version when the result is a nanotime. Looks like the constructor makes a copy even where it could potentially avoid one. I briefly looked at C code for S4 generation and it didn't seem immediately obvious if this could be easily alleviated or not.

I also caught a really strange performance issue with c which had an easy workaround (fixed in a58fc22)

The immediate conclusion though is that in its current state the implementation with S4 is about 2x slower than with S3. That seems to be the trade-off for having correct nanotime operations.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Yes, S4 has a reputation for being a little slower. It is also said to be more solid.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

And I guess while we're mucking with the different classes, the @export tag may make sense. In general I still avoid them...

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I did get random roxygen errors trying to build of your branch.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I've been having issues too. For some reason I can't run document() twice in the same R session.

Furthermore, I have an issue in the part where it generates keywords. I assumed that it was my Roxygen version 6.0.0.9000 (the latest on github) and so I ended up modifying Roxygen's code. If my memory is correct I was having that issue with your unmodified package too, that's why I assumed it was some issue introduced between 5.0.1 and 6.0.0.9000. I can't check this right now but will do later today when I get back to my computer.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Just say no to GitHub. Let's use roxygen 6.0.1 from CRAN.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I installed 6.0.1, looked at the error I was getting a little closer, and it turns out I had an old version of stringr. I still can only generate documentation once in a session, but this works for me:

library(devtools)
document("~/prog/nanotime")

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I prefer to do these things at the command-line. Maybe look into littler, link its examples/ script to /usr/local/bin and then call roxy.r ?

(Which by my preferences only does Rd and no other roclets.)

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

roxy.r makes a direct call to roxygenize. This produces the error "unable to find required package ‘roxygen_devtest"... The error occurs when defining nanotime as an S4, specifically inside the setClass function when the code is loaded via roxygen's function source_package inside roxygenize. It's looks like an issue in roxygen2. document works because it calls roxygenize with its own source function. There is this note in roxygen2::source_package:

#' This is a simple attempt to load code in a package used by
#' [roxygenize]. It will work with simple packages, but fail if
#' there are compiled files, data files, etc. In that case, it's better to
#' use [devtools::document].

It looks like document uses a more sophisticated way of loading the required package using in particular load_all. That seems to also work for the setClass...

Do you think there is ground to raise an issue with roxygen2? My feeling is we'll be told to use document...

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

By the way, it seems it's also possible to specify the roclets with document, so we can achieve the same as with the direct call to roxygenize...

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I don't know / I don't care in the sense that I do not use devtools, but if I must some of the component packages split off it (ie remotes::install_github).

I am sure you can find a way to let us experiment with s4 as an alternative representation.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

I'm not sure what error you are encountering.

I have no issues installing the package. Either via R CMD build nanotime and then R CMD INSTALL ... or via remotes::install_github. Now that my R packages are up-to-date I'm also not seeing any errors while generating documentation with document (but I do have the problem mentioned above when using roxygenize directly).

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I'm not sure what error you are encountering.

Basically R CMD check. I happen to like command-line helpers and tools, so rcc.r is just shorthard for calling Gabor's rcmdcheck() function, which, in essence, just prettyprints.

You are getting three NOTES and WARNING. I prefer to keep my packages ready for CRAN with minimal warning. I would not yet commit this to my main branch.

edd@max:~/git/leonardo-nanotime$ rcc.r
────────────────────────────────────────────────────────────────────────────────
─  using log directory ‘/tmp/file420a669023e9/nanotime.Rcheck’
─  using R version 3.3.2 (2016-10-31)
─  using platform: x86_64-pc-linux-gnu (64-bit)
─  using session charset: UTF-8
✔  checking for file ‘nanotime/DESCRIPTION’
─  checking extension type ... Package
─  this is package ‘nanotime’ version ‘0.1.1’
✔  checking package namespace information
✔  checking package dependencies
✔  checking if this is a source package
✔  checking if there is a namespace
✔  checking for executable files
✔  checking for hidden files and directories
✔  checking for portable file names
✔  checking for sufficient/correct file permissions
✔  checking whether package ‘nanotime’ can be installed
✔  checking installed package size
✔  checking package directory
✔  checking DESCRIPTION meta-information
✔  checking top-level files
✔  checking for left-over files
✔  checking index information
✔  checking package subdirectories
✔  checking R files for non-ASCII characters
✔  checking R files for syntax errors
✔  checking whether the package can be loaded
✔  checking whether the package can be loaded with stated dependencies
✔  checking whether the package can be unloaded cleanly
N  checking whether the namespace can be loaded with stated dependencies
   Warning: no function found corresponding to methods exports from ‘nanotime’ for: ‘Arith’, ‘Compare’, ‘Complex’, ‘Logic’, ‘Math’, ‘Math2’, ‘Summary’, ‘show’
   
   A namespace must be able to be loaded with just the base namespace
   loaded: otherwise if the namespace gets loaded by a saved object, the
   session will be unable to start.
   
   Probably some imports need to be declared in the NAMESPACE file.
✔  checking whether the namespace can be unloaded cleanly
✔  checking loading without being on the library search path
N  checking dependencies in R code
   package 'methods' is used but not declared
✔  checking S3 generic/method consistency
✔  checking replacement functions
✔  checking foreign function calls
N  checking R code for possible problems
   as.data.frame.nanotime: no visible global function definition for
     ‘S3Part’
   as.integer64.nanotime: no visible global function definition for
     ‘S3Part’
   nanotime.matrix: no visible global function definition for ‘new’
   +,integer64-nanotime: no visible global function definition for ‘new’
   +,integer64-nanotime: no visible global function definition for
     ‘S3Part’
   +,nanotime-integer64: no visible global function definition for ‘new’
   +,nanotime-integer64: no visible global function definition for
     ‘S3Part’
   +,nanotime-numeric: no visible global function definition for ‘new’
   +,nanotime-numeric: no visible global function definition for ‘S3Part’
   +,numeric-nanotime: no visible global function definition for ‘new’
   +,numeric-nanotime: no visible global function definition for ‘S3Part’
   -,nanotime-integer64: no visible global function definition for ‘new’
   -,nanotime-integer64: no visible global function definition for
     ‘S3Part’
   -,nanotime-nanotime: no visible global function definition for ‘S3Part’
   -,nanotime-numeric: no visible global function definition for ‘new’
   -,nanotime-numeric: no visible global function definition for ‘S3Part’
   Arith,nanotime-ANY: no visible global function definition for
     ‘callNextMethod’
   Arith,nanotime-ANY: no visible global function definition for ‘S3Part’
   Compare,nanotime-ANY: no visible global function definition for
     ‘callNextMethod’
   Compare,nanotime-ANY: no visible global function definition for
     ‘S3Part’
   [,nanotime: no visible global function definition for ‘new’
   [,nanotime: no visible global function definition for ‘callNextMethod’
   [<-,nanotime: no visible global function definition for ‘new’
   [<-,nanotime: no visible global function definition for
     ‘callNextMethod’
   c,nanotime: no visible global function definition for ‘new’
   c,nanotime: no visible global function definition for ‘callNextMethod’
   c,nanotime: no visible global function definition for ‘S3Part’
   cbind,nanotime: no visible global function definition for ‘new’
   cbind,nanotime: no visible global function definition for ‘S3Part’
   max,nanotime: no visible global function definition for ‘new’
   max,nanotime: no visible global function definition for
     ‘callNextMethod’
   min,nanotime: no visible global function definition for ‘new’
   min,nanotime: no visible global function definition for
     ‘callNextMethod’
   nanotime,POSIXct: no visible global function definition for ‘new’
   nanotime,character: no visible global function definition for ‘new’
   range,nanotime: no visible global function definition for ‘new’
   range,nanotime: no visible global function definition for
     ‘callNextMethod’
   rbind,nanotime: no visible global function definition for ‘new’
   rbind,nanotime: no visible global function definition for ‘S3Part’
   Undefined global functions or variables:
     S3Part callNextMethod new
   Consider adding
     importFrom("methods", "S3Part", "callNextMethod", "new")
   to your NAMESPACE file (and ensure that your DESCRIPTION Imports field
   contains 'methods').
✔  checking Rd files
✔  checking Rd metadata
✔  checking Rd cross-references
W  checking for missing documentation entries
   Undocumented code objects:
     ‘cbind’ ‘rbind’
   Undocumented S4 classes:
     ‘nanotime’
   Undocumented S4 methods:
     generic '+' and siglist 'ANY,nanotime'
     generic '+' and siglist 'integer64,nanotime'
     generic '+' and siglist 'nanotime,ANY'
     generic '+' and siglist 'nanotime,integer64'
     generic '+' and siglist 'nanotime,nanotime'
     generic '+' and siglist 'nanotime,numeric'
     generic '+' and siglist 'numeric,nanotime'
     generic '-' and siglist 'ANY,nanotime'
     generic '-' and siglist 'nanotime,character'
     generic '-' and siglist 'nanotime,integer64'
     generic '-' and siglist 'nanotime,nanotime'
     generic '-' and siglist 'nanotime,numeric'
     generic 'Arith' and siglist 'nanotime,ANY'
     generic 'Compare' and siglist 'nanotime,ANY'
     generic 'Complex' and siglist 'nanotime'
     generic 'Logic' and siglist 'ANY,nanotime'
     generic 'Logic' and siglist 'nanotime,ANY'
     generic 'Math2' and siglist 'nanotime'
     generic 'Math' and siglist 'nanotime'
     generic 'Summary' and siglist 'nanotime'
     generic '[' and siglist 'nanotime'
     generic '[<-' and siglist 'nanotime'
     generic 'c' and siglist 'nanotime'
     generic 'cbind' and siglist 'nanotime'
     generic 'max' and siglist 'nanotime'
     generic 'min' and siglist 'nanotime'
     generic 'range' and siglist 'nanotime'
     generic 'rbind' and siglist 'nanotime'
     generic 'show' and siglist 'nanotime'
   All user-level objects in a package (including S4 classes and methods)
   should have documentation entries.
   See chapter ‘Writing R documentation files’ in the ‘Writing R
   Extensions’ manual.
✔  checking for code/documentation mismatches
W  checking Rd \usage sections
   Documented arguments not in \usage in documentation object 'nanotime':
     ‘justify’ ‘digits’ ‘na.encode’ ‘trim’ ‘e1’ ‘e2’
   
   Functions with \usage entries need to have the appropriate \alias
   entries, and all their arguments documented.
   The \usage entries must correspond to syntactically valid R code.
   See chapter ‘Writing R documentation files’ in the ‘Writing R
   Extensions’ manual.
✔  checking Rd contents
✔  checking for unstated dependencies in examples
✔  checking examples
W  checking for unstated dependencies in ‘tests’
   'library' or 'require' call not declared from: ‘xts’
✔  checking tests
✔  checking PDF version of manual

── 0 errors ✔ | 3 warnings ✖ | 3 notes ✖
edd@max:~/git/leonardo-nanotime$ 

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

The last one should just be copy-cat'ing; I had to do the same for the format() and print() methods. Some of the other stuff is probably easily learnable from other S4 packages.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

No warnings and notes in eadff32. I don't think it's satisfactory because the methods are defined to produce an error end up in the documentation (e.g. methods for group Complex). We could put a warning that they generate an error, or pull them out and live with a warning during R CMD check. I wasn't able to suppress the warning with @noRd. You probably have a better feel than me for all this. As soon as I get a minute I'll pull the changes you made - I'm a few commits behind.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Ok, will catch up later -- trying to go for a run now having spent the morning on two other repos.

Rebasing is probably a good idea if you can. I also prefer the trees to not be too far apart.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Couldn't resist a really quick peek -- any idea?

checking for unstated dependencies in exampleschecking exampleschecking for unstated dependencies intestsE  checking tests
   Running the tests intests/runTests.Rfailed.
   Last 13 lines of output:
      done successfully.
     
     
     
     Executing test function test_rbind.xts  ... Timing stopped at: 0.004 0 0.002 
     Error in xts(matrix(1:30, 10, 3), i1) : 
       order.by requires an appropriate time-based object
      done successfully.
     
     Error in checkForErrors(results) : 
       error or failure in test_xts.R :  list(nTestFunc = 8, nDeactivated = 0, nErr = 8, nFail = 0, dirs = "../nanotime/unitTests", testFileRegexp = "^test_xts.R$", testFuncRegexp = "^test.+", sourceFileResults = list(`../nanotime/unitTests/test_xts.R` = list(test_align.time = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_c.xts = list(kind = "error", msg = "Error in xts(matrix(1:30, 10, 3), i1) : \n  order.by requires an appropriate time-based object\n", 
         checkNum = 0, traceBack = NULL), test_cbind.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_coredata.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by requires an appropriate time-based object\n", checkNum = 0, traceBack = NULL), test_dimnames.xts = list(kind = "error", msg = "Error in xts(a1, i1) : order.by require
     Execution halted
✔  checking PDF version of manual

── 1 error ✖ | 0 warnings ✔ | 0 notes ✔
edd@max:~/git/leonardo-nanotime(master)$ 

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Ouch... I'm running with my version of xts which adds nanotime as a time-based object. I'll disable these tests for now until xts integrates nanotime.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Tried my best to rebase, but I had already previous merges and my two attempts ended up loosing my commit history (although the code was correct). I ended up doing a merge because it's safer in this case. Note that I reverted in the process my changes for automatic namespace generation, so it's back to manual as in your branch.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

Appreciate the man-to-man combat with git.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Yes, that was definitely scary :)

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I poked you over on slack. Wanna chat there?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

For the rbind and cbind issues, I've done some testing and looked at the source code. It doesn't look like one can do any better.

Just for the record, it's not possible to use cbind2 and rbind2 because there exist rbind.integer64 and cbind.integer64. As documented, if an S3 method can be found, there will never be a dispatch onto methods:rbind. Here is an example to show that:

setOldClass("foo")
as.foo <- function(x) {
  oldClass(x) <- "foo"
  x
}

setClass("bar", contains="foo")

setMethod("rbind2",
          signature("bar", "bar"),
          function (x, y, ...) {
            print("rbind2 called!")
            new("bar", as.foo(c(x,y)))
          })
a <- new("bar", as.foo(matrix(1,1,1)))

## as soon as the following is defined, 'rbind2' is not called anymore:
rbind.foo <- function(..., deparse.level = 1) {
  print("rbind.foo called")
}

At any rate, I think rbind and cbind are not needed. POSIXct doesn't have them and works in the same way as nanotime would work without rbind and cbind definitions:

cbind(as.POSIXct("1970-01-01 00:00:00") + 1:10)
       [,1]
 [1,] 18001
 [2,] 18002
 [3,] 18003
 [4,] 18004
 [5,] 18005
 [6,] 18006
 [7,] 18007
 [8,] 18008
 [9,] 18009
[10,] 18010

What do you think?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

For the names issue with c it's possible to define c.nanotime as an S3 method and the naming issue would normally be resolved except for what could be considered a bug in as.integer64 which loses the names of a named vector:

c(a=as.integer64(1), b=as.integer64(2))
## integer64
## a b 
## 1 2 

as.integer64(c(a=1, b=2))  ## loses the names
## integer64
## [1] 1 2

If you agree, I'll raise the question with bit64, but I don't think it's a show stopped for nanotime.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

re cbind / rbind: I think we should proceed without them. No name clashing, and as you say, the functionality may not be needed

re c: I am confused. I thought the c() operator always dropped in R, irregardless of class or type?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Yes, it's true it always drops, but I'm just considering if c keeps the names or not. If one builds a named vector one should not expect the names to disappear like in as.integer64. Here is an example with POSIXct:

> as.POSIXct(c(a="1970-01-01 00:00:00", b="1970-01-01 00:00:00"))
##                a                b 
## "1970-01-01 EST" "1970-01-01 EST" 


> c(a=as.POSIXct("1970-01-01 00:00:00"), b=as.POSIXct("1970-01-01 00:00:00"))
##                a                b 
## "1970-01-01 EST" "1970-01-01 EST" 

In the previous example with as.integer64, the names a and b are dropped.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

[ imagine a shrugs emoji here ]

Do we care about names on a time index vector?

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

And a couple of additional precisions: I think that nanotime will not necessarily be used only as a time index, and second, if as.integer64 is fixed, nanotime will also automatically be fixed without needing to change the code.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I think this can be closed as we folded the change in.

from nanotime.

vspinu avatar vspinu commented on June 4, 2024

Any unpleasant consequences of this change so far? It looks like a pretty heavy handed fix for the size of the original problem.

dplyr + frends are norious for bad support of generic S4 vectors. Currently I see

> tibble(x = 1:10, y =  nanotime(Sys.time()))
Error in callNextMethod() : 
  error in evaluating a 'primitive' next method: Error in FUN(x = x, i = i) : could not find function "FUN"
> tibble(x = 1:10, y =  integer64(1))
# A tibble: 10 x 2
       x y              
   <int> <S3: integer64>
 1     1 0              
 2     2 0              
 3     3 0              
 4     4 0              
 5     5 0              
 6     6 0              
 7     7 0              
 8     8 0              
 9     9 0              
10    10 0              
> 

An alternative fix to the original problem might have been to overwrite integer64 arithmetic ops. There are a number of options to do so.

More importantly (and please correct me if I am wrong), it seems that the only reason for +/- methods in nanotime is to preserve the class. In which case, bit64 methods do that already:

> tt <- integer64(1)
> class(tt) <- c("blabla", "integer64")
> class(tt + integer64(1))
[1] "blabla"    "integer64"

So, it looks that neither this change nor the original S3 methods are actually necessary.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

I think @lsilvest are not aware of side effects. We just merged a small change needed for a bit64 update. Our use cases of nanotime with data.table and Leonardo's zts work fine.

> library(nanotime)
>
> data.frame(x=1:10, y=nanotime(Sys.time()))
Error in data.frame(x = 1:10, y = nanotime(Sys.time())) : 
  arguments imply differing number of rows: 10, 1
> 
> data.frame(x=1:10, y=nanotime(Sys.time())+0:9)
    x                                   y
1   1 2018-04-19T15:15:45.597214000+00:00
2   2 2018-04-19T15:15:45.597214001+00:00
3   3 2018-04-19T15:15:45.597214002+00:00
4   4 2018-04-19T15:15:45.597214003+00:00
5   5 2018-04-19T15:15:45.597214004+00:00
6   6 2018-04-19T15:15:45.597214005+00:00
7   7 2018-04-19T15:15:45.597214006+00:00
8   8 2018-04-19T15:15:45.597214007+00:00
9   9 2018-04-19T15:15:45.597214008+00:00
10 10 2018-04-19T15:15:45.597214009+00:00
> 

We worked pretty hard on the S3 and S4 methods, and still see the current ones as correct. I don't recall the excact details but my initial hopes also were to inherit as much as possible from bit64 and its integer64 class -- but some date/time semantics got it in the way, and here we are.

Now, the data.frame throws an error if I try it with a scalar nanotime and a vector. That is base R, and for me closer to reference behaviour than what a tibble may do. @lsilvest, any thoughts on this?

from nanotime.

vspinu avatar vspinu commented on June 4, 2024

With tibble the non-recycling version works but fails on printing:

> tt <- tibble(x=1:10, y=nanotime(Sys.time())+0:9)
> tt
Error in callNextMethod() : 
  error in evaluating a 'primitive' next method: Error in FUN(x = x, i = i) : could not find function "FUN"

Most likely a tibble issue, but well ...

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

Really interesting stuff: thanks!

If we just rely on integer64 and don't define any operators, we get this:

> nanotime(1) + nanotime(2)
integer64
[1] 3
attr(,".S3Class")
[1] integer64
attr(,"class")
[1] nanotime

This is because despite the attributes being right, attributes are not enough to make something an S4 object.

So that's the first rationale for defining the operators with nanotime. A second rationale is that it's undesirable to have unintended automatic typecasts away from nanotime, so with S4 it's possible to be tighter and not allow (unwanted) meaningless operations. This is especially important with the forthcoming additional types (duration, interval, period) that are in my nanotime fork in the temporal_types branch.

I'm finding more and more that using an S4 object containing an S3 is tricky. We've got two systems interacting in subtle ways. I also had a nasty surprise with seemingly small method dispatch changes breaking some of the nanotime code between R-3.3.1 and R-3.4.4.

I'm going to go through very carefully the issues above and also carefully read the R code for method dispatch.

@vspinu, I'm really not a fan of overriding. Seems kludgy to me and have been bitten on occasion when doing that. Maybe it's possible to do that safely, but I'm not convinced.

It might be easier in the end to not use integer64 as the base for nanotime while still defining conversion functions.

from nanotime.

eddelbuettel avatar eddelbuettel commented on June 4, 2024

We can always burn it all down and rewrite in R6 :) Until then it mostly works and gives us some integer64-based time ops.

from nanotime.

vspinu avatar vspinu commented on June 4, 2024

This is because despite the attributes being right, attributes are not enough to make something an S4 object. So that's the first rationale for defining the operators with nanotime

You assume using S4 as base class. I was suggesting S3 rewrite and defining only those binary operations with both operands being nanotime (+,-). All other operations integer64 handles correctly.

This is especially important with the forthcoming additional types (duration, interval, period) that are in my nanotime fork in the temporal_types branch.

I was not aware of Howard Hinnant's libraries. It's an interesting approach what you took. This time-span prototyping at C++ level doesn't quite smell right to me, but I am really curious where it will bring you and what will be the performance eventually. Will this work on Date and POSIX eventually?

Your branch seems to require very recent compiler. C++14 and gcc6.3 fail with error: ‘literals’ is not a namespace-name.

On a general note, experience with lubridate suggests that S4 for time spans is more pain than it's worth, inline binary operators are limited and it's more fruitful to focus on functional APIs, period arithmetics is insanely hard to make right, periods should not be limited to basic year - seconds units, etc. I have just done my share of struggle in timechange - first package in lubridate-rewrite series (hopefully done right this time). Will be adding support for nanotime soonish.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

@eddelbuettel, it turns out that the data.frame issue has nothing to do with S4 or S3: the same failure occurs for integer64:

> data.frame(x=1:10, y=as.integer64(0))
Error in data.frame(x = 1:10, y = as.integer64(0)) : 
  arguments imply differing number of rows: 10, 1

Digging into the data.frame function, it looks like it could be a bug; in the function we find these lines:

    for (i in seq_len(n)[nrows < nr]) {
        xi <- vlist[[i]]
        if (nrows[i] > 0L && (nr%%nrows[i] == 0L)) {
            xi <- unclass(xi)
            fixed <- TRUE
            for (j in seq_along(xi)) {
                xi1 <- xi[[j]]
                if (is.vector(xi1) || is.factor(xi1)) 
                  xi[[j]] <- rep(xi1, length.out = nr)
                else if (is.character(xi1) && inherits(xi1, "AsIs")) 
                  xi[[j]] <- structure(rep(xi1, length.out = nr), 
                    class = class(xi1))
                else if (inherits(xi1, "Date") || inherits(xi1, 
                  "POSIXct")) 
                  xi[[j]] <- rep(xi1, length.out = nr)
                else {
                  fixed <- FALSE
                  break
                }
            }
            if (fixed) {
                vlist[[i]] <- xi
                next
            }
        }
        stop(gettextf("arguments imply differing number of rows: %s", 
            paste(unique(nrows), collapse = ", ")), domain = NA)
    }

The unclass(xi) is made on columns that have been previously already converted to a data.frame. So xi becomes a list, but its contents are still of the original class. That's why the is.vector will return false (because xi[[j]] has non-name attributes) and the rep will not be called. If I change the test for is.vector(xi1) to is.vector(unclass(xi1)) it works fine. Initially I thought the original intent was to unclass the content, but because of the specific tests afterwards for POSIXct and Date (cheating! :) ) that's not the case. But I'm baffled as to why data.frame would not allow a well-formed type that has rep to be used in the same way as POSIXct and Date... Feels like a bug to me...

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

@vspinu :

One of the main benefits of S4 is the fact that it gives better control on dispatch for functions with multiple parameters. S3 has limitations that make it really difficult to implement a set of types that are coherent.

I'm not sure I understand what part you are referring to that doesn't smell good. I'm not especially happy with the use of vectors of integer64, if that's what you mean, but it seems to work relatively well. If you mean something different, I'm very interested in getting your comments.

I haven't looked at the performance of the R version in detail yet, but I've done some performance testing for the C++ version of the code which is part of my ztsdb project and it turns out to be quite good. If you are interested, there is some documentation on the types here, but unfortunately I haven't yet written up the results of my performance testing, but will get round to that eventually. You might be interested in what I'm doing in dtts.utils. There is some documentation on the align functionality as implemented in ztsdb here.

I compile the temporal_types branch with gcc4.8.5, so it might be the inverse problem. Will try out gcc6.3 and see if I can reproduce the issue you are encountering.

I will look at your work on timechange as soon as I have some time.

from nanotime.

vspinu avatar vspinu commented on June 4, 2024

ut I'm baffled as to why data.frame would not allow a well-formed type that has rep to be used in the same way as POSIXct and Date... Feels like a bug to me...

Yerh. I would say that the issue is that is.vector is not dispatched to is.vector.integer64. is.vector(integer64(1)) doesn't work for me even in GlobalEnv. is.vector, unlike as.vector, is not an internal S3 generic and bit64 registration doesn't have an effect. Would be nice to pitch this to R folks.

S4 is the fact that it gives better control on dispatch for functions with multiple parameters

True, but do you really have use cases for multiparameter dispatch except the arithmetic operations? If not and S3 arithmetics could be made to work, there is little point of moving to S4.

with the use of vectors of integer64, if that's what you mean, but it seems to work relatively well.

Internal representation of period/interval is vector of a length multiple of the conceptual vector. My hunch is that it will be very hard to make this work with R ecosystem, data.frames most importantly. Users will have to stay in nanotime bubble and rely on nanotime methods exclusively.

But I meant something else. What didn't look right to me is the intermediate type layer of periods and intervals at C level. Now I understand that you are reusing the code from ztsdb, which does make total sense. But such design, while cleaner and less coupled with R type system, precludes (or at least makes it harder) the vectorized style optimization. If you would prototype at R level (say period as a data-frame or a matrix) then common operations could have been done once per R call and construction of proxy types avoided altogether.

What I am pursuing with lubridate's rewrite is to first have a set of CCTZ based low level functional utilities in one package (timechange) then prototype timespans at R level and call utilities from timechange for the heavy (vectorized) lifting. Extreme R coupling but with best performance out of the box.

You might be interested in what I'm doing in dtts.utils. There is some documentation on the align functionality as implemented in ztsdb here.

That's interesting. I am using my own rolling interval aggregators for irregular time series :) Will definitely have a closer look.

from nanotime.

lsilvest avatar lsilvest commented on June 4, 2024

The "intermediate layer" is not only because the code was carried over from somewhere else. It's because that's one way of preserving the vector in memory in R. I don't see how using matrix or data.frame solves anything (same problems arise when trying to use data.frame or data.table). At any rate, you'll see that the construction of proxy types is a null operation. It's mostly clean typecasting.

I've looked at data.frame and data.table integration. The former can work up to a certain extent, the latter pokes too much at C level for anything that's not built upon a clean normal base type of R to work, and nanoival and period are too big to fit in any base type of R. I believe the usefulness of these types remains even if they can't be used in a data frame or data.table.

I like your rolling interval aggregators. There's some similarity with what my align function does when you supply a method like mean.

from nanotime.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.