Comments (10)

tdhock commented on August 18, 2024

It would be useful if you could create a data set with 1 row and many, many columns that reproduces your issue.
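
One way such a data set could be built is sketched below. The column count (5000), the `V1`, `V2`, ... names, and the random target order are all assumptions chosen for illustration; they are not taken from the reporter's data. Column `Vi` holds the value `i`, so any mismatch between a column's name and its content after `setcolorder()` would be easy to spot.

```r
library(data.table)

# Assumption: 5000 columns is an arbitrary stand-in for "many many columns".
n_cols <- 5000L
col_names <- paste0("V", seq_len(n_cols))

# One row; column Vi holds the integer i.
DT <- as.data.table(setNames(as.list(seq_len(n_cols)), col_names))

# Reorder the columns into a random (but reproducible) order.
set.seed(1)
new_order <- sample(col_names)
setcolorder(DT, new_order)

# The names should follow new_order, and each column should still hold its own value.
stopifnot(identical(names(DT), new_order))
stopifnot(identical(unlist(DT, use.names = FALSE), match(new_order, col_names)))
```

If both checks pass on the affected installation, the problem probably depends on something specific to the real data rather than on the number of columns alone.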

from data.table.

TimothyWillard commented on August 18, 2024

Could you provide a reproducible example? I'm unable to recreate what I understand the issue to be with version 1.15.4:

library(data.table)
DT = data.table(
  'abc'=letters,
  'def'=LETTERS,
  'ghi'=1L:26L
)
str(DT)
#> Classes 'data.table' and 'data.frame':   26 obs. of  3 variables:
#>  $ abc: chr  "a" "b" "c" "d" ...
#>  $ def: chr  "A" "B" "C" "D" ...
#>  $ ghi: int  1 2 3 4 5 6 7 8 9 10 ...
#>  - attr(*, ".internal.selfref")=<externalptr>
setcolorder(DT, c('def', 'ghi', 'abc'))
str(DT)
#> Classes 'data.table' and 'data.frame':   26 obs. of  3 variables:
#>  $ def: chr  "A" "B" "C" "D" ...
#>  $ ghi: int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ abc: chr  "a" "b" "c" "d" ...
#>  - attr(*, ".internal.selfref")=<externalptr>

Created on 2024-06-08 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.5
#>  system   x86_64, darwin23.2.0
#>  ui       unknown
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-06-08
#>  pandoc   3.2 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.3)
#>  data.table  * 1.15.4  2024-03-30 [1] CRAN (R 4.3.3)
#>  digest        0.6.35  2024-03-11 [1] CRAN (R 4.3.3)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.3)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.3)
#>  fs            1.6.4   2024-04-25 [1] CRAN (R 4.3.3)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.3.3)
#>  knitr         1.46    2024-04-06 [1] CRAN (R 4.3.3)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.3.3)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.3)
#>  rmarkdown     2.26    2024-03-05 [1] CRAN (R 4.3.3)
#>  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.3.3)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.3)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.3)
#>  xfun          0.43    2024-03-25 [1] CRAN (R 4.3.3)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.3)
#> 
#>  [1] /usr/local/lib/R/4.3/site-library
#>  [2] /usr/local/Cellar/r/4.3.3/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────


eacabbi commented on August 18, 2024

OK, that's strange. I can confirm that your example runs fine on my computer. Still, I went back, double-checked my data, and can confirm that for my dataset setcolorder truly scrambles the columns...

`─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.4.0 (2024-04-24)
os macOS Sonoma 14.5
system aarch64, darwin20
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Madrid
date 2024-06-08
pandoc NA

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
! package * version date (UTC) lib source
bdsmatrix 1.3-7 2024-03-02 [1] CRAN (R 4.4.0)
bit 4.0.5 2022-11-15 [1] CRAN (R 4.4.0)
bit64 4.0.5 2020-08-30 [1] CRAN (R 4.4.0)
cachem 1.1.0 2024-05-16 [1] CRAN (R 4.4.0)
cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.4.0)
cli 3.6.2 2023-12-11 [1] CRAN (R 4.4.0)
collapse 2.0.14 2024-05-24 [1] CRAN (R 4.4.0)
colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.4.0)
crayon 1.5.2 2022-09-29 [1] CRAN (R 4.4.0)
V data.table * 1.15.99 2024-03-30 [1] CRAN (R 4.4.0) (on disk 1.15.4)
devtools * 2.4.5 2022-10-11 [1] CRAN (R 4.4.0)
digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0)
dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.4.0)
dreamerr 1.4.0 2023-12-21 [1] CRAN (R 4.4.0)
DT * 0.33 2024-04-04 [1] CRAN (R 4.4.0)
ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.4.0)
fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0)
fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0)
fixest 0.12.1 2024-05-18 [1] https://fastverse.r-universe.dev (R 4.4.0)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.4.0)
Formula 1.2-5 2023-02-24 [1] CRAN (R 4.4.0)
fs 1.6.4 2024-04-25 [1] CRAN (R 4.4.0)
V fst * 0.9.8 2024-06-08 [1] Github (fstpackage/fst@6f9ec28) (on disk 0.9.9)
fstcore * 0.9.18 2023-12-02 [1] CRAN (R 4.4.0)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0)
ggplot2 * 3.5.1 2024-04-23 [1] CRAN (R 4.4.0)
glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0)
gtable 0.3.5 2024-04-22 [1] CRAN (R 4.4.0)
haven * 2.5.4 2023-11-30 [1] CRAN (R 4.4.0)
hms 1.1.3 2023-03-21 [1] CRAN (R 4.4.0)
htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.4.0)
httpuv 1.6.15 2024-03-26 [1] CRAN (R 4.4.0)
jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0)
later 1.3.2 2023-12-06 [1] CRAN (R 4.4.0)
lattice 0.22-6 2024-03-20 [1] CRAN (R 4.4.0)
lfe 3.0-0 2024-02-29 [1] CRAN (R 4.4.0)
lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0)
lmtest 0.9-40 2022-03-21 [1] CRAN (R 4.4.0)
lpdensity 2.4 2023-01-21 [1] CRAN (R 4.4.0)
lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.4.0)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0)
MASS 7.3-60.2 2024-04-24 [1] local
Matrix 1.7-0 2024-03-22 [1] CRAN (R 4.4.0)
maxLik 1.5-2.1 2024-03-24 [1] CRAN (R 4.4.0)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.4.0)
mime 0.12 2021-09-28 [1] CRAN (R 4.4.0)
miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.4.0)
miscTools 0.6-28 2023-05-03 [1] CRAN (R 4.4.0)
munsell 0.5.1 2024-04-01 [1] CRAN (R 4.4.0)
nlme 3.1-165 2024-06-06 [1] CRAN (R 4.4.0)
numDeriv 2016.8-1.1 2019-06-06 [1] CRAN (R 4.4.0)
pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0)
pkgbuild 1.4.4 2024-03-17 [1] CRAN (R 4.4.0)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0)
pkgload 1.3.4 2024-01-16 [1] CRAN (R 4.4.0)
plm * 2.6-4 2024-04-01 [1] CRAN (R 4.4.0)
profvis 0.3.8 2023-05-02 [1] CRAN (R 4.4.0)
promises 1.3.0 2024-04-05 [1] CRAN (R 4.4.0)
purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.4.0)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0)
rbibutils 2.2.16 2023-10-25 [1] CRAN (R 4.4.0)
Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0)
rddensity * 2.5 2024-01-22 [1] CRAN (R 4.4.0)
Rdpack 2.6 2023-11-08 [1] CRAN (R 4.4.0)
rdrobust * 2.2 2023-11-03 [1] CRAN (R 4.4.0)
readr * 2.1.5 2024-01-10 [1] CRAN (R 4.4.0)
readxl * 1.4.3 2023-07-06 [1] CRAN (R 4.4.0)
remotes 2.5.0 2024-03-17 [1] CRAN (R 4.4.0)
rlang 1.1.4 2024-06-04 [1] CRAN (R 4.4.0)
sandwich 3.1-0 2023-12-11 [1] CRAN (R 4.4.0)
scales 1.3.0 2023-11-28 [1] CRAN (R 4.4.0)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0)
shiny 1.8.1.1 2024-04-02 [1] CRAN (R 4.4.0)
stringi 1.8.4 2024-05-06 [1] CRAN (R 4.4.0)
stringmagic 1.1.2 2024-04-30 [1] CRAN (R 4.4.0)
stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.4.0)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.4.0)
tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.4.0)
tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.4.0)
timechange 0.3.0 2024-01-18 [1] CRAN (R 4.4.0)
tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.4.0)
urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.4.0)
usethis * 2.2.3 2024-02-19 [1] CRAN (R 4.4.0)
utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0)
vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0)
vroom 1.6.5 2023-12-05 [1] CRAN (R 4.4.0)
withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0)
xtable 1.8-4 2019-04-21 [1] CRAN (R 4.4.0)
zoo 1.8-12 2023-04-13 [1] CRAN (R 4.4.0)

[1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library`

Any idea what else I could provide? Unfortunately I cannot move the data...


TimothyWillard commented on August 18, 2024

  1. Could you provide a subset of the data, or a dataset that has a similar str()?
  2. The code that is causing the issue would be helpful; maybe it's not the setcolorder call that's causing the problem?
  3. The output of sessionInfo().


TimothyWillard commented on August 18, 2024

This line, `V data.table * 1.15.99 2024-03-30 [1] CRAN (R 4.4.0) (on disk 1.15.4)`, suggests that the data.table being used here was maybe installed from source? Trying with the current master branch (a5e2bca):

library(data.table)
DT = data.table(
  'abc'=letters,
  'def'=LETTERS,
  'ghi'=1L:26L
)
str(DT)
#> Classes 'data.table' and 'data.frame':   26 obs. of  3 variables:
#>  $ abc: chr  "a" "b" "c" "d" ...
#>  $ def: chr  "A" "B" "C" "D" ...
#>  $ ghi: int  1 2 3 4 5 6 7 8 9 10 ...
#>  - attr(*, ".internal.selfref")=<externalptr>
setcolorder(DT, c('def', 'ghi', 'abc'))
str(DT)
#> Classes 'data.table' and 'data.frame':   26 obs. of  3 variables:
#>  $ def: chr  "A" "B" "C" "D" ...
#>  $ ghi: int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ abc: chr  "a" "b" "c" "d" ...
#>  - attr(*, ".internal.selfref")=<externalptr>

Created on 2024-06-08 with reprex v2.1.0

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.3 (2024-02-29)
#>  os       macOS Sonoma 14.5
#>  system   x86_64, darwin23.2.0
#>  ui       unknown
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-06-08
#>  pandoc   3.2 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.3)
#>  data.table  * 1.15.99 2024-06-08 [1] local
#>  digest        0.6.35  2024-03-11 [1] CRAN (R 4.3.3)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.3)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.3)
#>  fs            1.6.4   2024-04-25 [1] CRAN (R 4.3.3)
#>  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.3.3)
#>  knitr         1.46    2024-04-06 [1] CRAN (R 4.3.3)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
#>  reprex        2.1.0   2024-01-11 [1] CRAN (R 4.3.3)
#>  rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.3)
#>  rmarkdown     2.26    2024-03-05 [1] CRAN (R 4.3.3)
#>  rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.3.3)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.3)
#>  withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.3)
#>  xfun          0.43    2024-03-25 [1] CRAN (R 4.3.3)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.3)
#> 
#>  [1] /usr/local/lib/R/4.3/site-library
#>  [2] /usr/local/Cellar/r/4.3.3/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────


TysonStanley commented on August 18, 2024

Setting options(datatable.verbose = TRUE) could help diagnose as well.
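
A minimal sketch of how that could look, reusing the small example table from earlier in the thread. Whether `setcolorder()` itself prints anything in verbose mode is not something this sketch assumes; the option simply turns on data.table's diagnostic messages for the calls that have them.

```r
library(data.table)

# Turn on data.table's diagnostic messages, remembering the previous setting.
old <- options(datatable.verbose = TRUE)

DT <- data.table(abc = letters, def = LETTERS, ghi = 1:26)
setcolorder(DT, c("def", "ghi", "abc"))

# Restore the previous setting so later code is not flooded with messages.
options(old)
```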


tdhock commented on August 18, 2024

The original post said "after the last developer update", meaning GitHub master? Could this be related to #6068?


eacabbi commented on August 18, 2024

Sorry for the trouble; it's not simple to reproduce it here (the dataset has a lot of variables).
What I can say is that:

  • yes, compiled from source
  • it is setcolorder, I have worked around the rest of the code and really, the issue happens trying to reorder columns.
  • I tried to only select a subset of variables and then use setcolorder to reorder the columns: everything worked fine there.
  • setting to verbose did not tell me anything useful, especially considering my previous point.

So it must be related to the fact that I have many, many columns, some of which may be causing a problem. It is not obvious to me what that might be; I am experimenting a bit to see whether I can figure it out. If I learn anything more I will let you know.
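
A check along the following lines might help narrow it down without sharing any data. `check_setcolorder()` is a hypothetical helper written for this thread, not part of data.table, and it assumes that column names are unique and that `neworder` is a character vector of names. It reorders a copy of the table and reports any column whose contents no longer match the column of the same name in the original.

```r
library(data.table)

# Hypothetical helper: reorder a copy of DT and compare columns by name.
check_setcolorder <- function(DT, neworder) {
  after <- setcolorder(copy(DT), neworder)
  same <- vapply(
    names(DT),
    function(nm) identical(DT[[nm]], after[[nm]]),
    logical(1)
  )
  list(
    # the requested columns should come first, in the requested order
    names_ok = identical(head(names(after), length(neworder)), neworder),
    # columns whose contents differ from the original column of the same name
    changed_columns = names(DT)[!same]
  )
}

DT <- data.table(abc = letters, def = LETTERS, ghi = 1:26)
res <- check_setcolorder(DT, c("def", "ghi", "abc"))
str(res)
```

Running it on progressively larger subsets of columns would show at which point, if any, `changed_columns` stops being empty.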

Thanks!


MichaelChirico commented on August 18, 2024

Yes, can you reproduce this issue by doing the following?

# ... other code ...
dataset <- dataset[0]
setcolorder(dataset, ...) # the same setcolorder() call

If so, hopefully you're comfortable sharing at least your column names.

Another suggestion: anonymize the data like so:

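# replace every column with a blank vector of the same base type and length
anonymized_data <- dataset |>
  lapply(\(x) vector(typeof(x), length(x))) |>
  setDT()
setcolorder(anonymized_data, ...) # the same setcolorder() call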

Some more care could be taken to reproduce common types like factor/Date/POSIXct, but hopefully this gives you a good idea of how to proceed.
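
As a rough sketch of what that extra care might look like: `blank_like()` below is a hypothetical helper, not a data.table function, and the three-column `dataset` is made up for illustration. It keeps the class of Date and POSIXct columns and turns factors into all-NA factors without their original levels, since levels could leak information.

```r
library(data.table)

# Hypothetical helper: a vector with the same length and broad type as x,
# but with none of the original values.
blank_like <- function(x) {
  n <- length(x)
  if (is.factor(x)) {
    # drop the original levels, which could reveal data
    return(factor(rep(NA_character_, n)))
  }
  out <- vector(typeof(x), n)
  if (inherits(x, c("Date", "POSIXct"))) {
    # keep the class (and time zone, if any) so the column type matches
    class(out) <- class(x)
    attr(out, "tzone") <- attr(x, "tzone")
  }
  out
}

# Made-up stand-in for the real data.
dataset <- data.table(
  id = 1:3,
  when = as.Date("2024-06-08") + 0:2,
  grp = factor(c("a", "b", "a"))
)

anonymized <- lapply(dataset, blank_like)
setDT(anonymized)
setcolorder(anonymized, c("grp", "id", "when"))
str(anonymized)
```

Other classes (for example list columns or integer64) would need their own handling; this only covers the types mentioned above.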


TysonStanley commented on August 18, 2024

Hi @eacabbi any updates on this? I think Michael had a good suggestion for how to share a more reproducible example if it's possible to share very minimal information.
