Comments (13)

Aariq commented on August 17, 2024

@stitam The message from the POST error is "Invalid ID, must be positive integer" if you supply it with a CID like c("1234", "baloon"). I think it makes sense if we enforce early on that CIDs are positive integers (or can be coerced to them), yeah? Anything not a positive integer will get replaced with NA?
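
A minimal sketch of that coercion in base R. The helper name `as_cid()` is hypothetical and not part of webchem; it only illustrates mapping anything that is not a positive whole number to NA.

```r
# Hypothetical helper (not a webchem function): coerce input to positive
# integer CIDs and replace everything else with NA.
as_cid <- function(x) {
  # go through character so numbers, strings and factors are treated alike
  num <- suppressWarnings(as.numeric(as.character(x)))
  ok <- !is.na(num) & is.finite(num) & num > 0 & num == floor(num) &
    num <= .Machine$integer.max
  out <- rep(NA_integer_, length(x))
  out[ok] <- as.integer(num[ok])
  out
}

# only "1234" survives (as 1234L); every other element becomes NA
as_cid(c("1234", "baloon", "NULL", NA, Inf, -5, 2.5))
```

With this approach the output keeps one element per input, so the positions of the bad entries stay visible.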

from webchem.

schymane commented on August 17, 2024

OK, let me scale this up for you and attempt to recreate the situation as far as feasible ...
(I have omitted the workflow I used to download the CSV, as it's part of a routine with a lot of useless overhead for debugging purposes)

Getting the data

Note that this CSV is retrieved directly from PubChem and saved "as is"
(and note that it's 3.6MB, so it is not instantaneous)

library(webchem)
library(data.table)
data_URL <- "https://gitlab.lcsb.uni.lu/eci/pubchem/-/raw/master/annotations/tps/Transformations/NORMAN-SLE_all_transformations.csv"
all_trans <- read.csv(data_URL, stringsAsFactors = FALSE)
# NOTE: this data now comes with the string "NULL" in the successorcid column; previously this was NA
all_CIDs <- unique(na.omit(c(all_trans$predecessorcid, all_trans$successorcid)))
# number of unique CIDs, and the position of the "NULL" entry
length(all_CIDs)
grep("NULL", all_CIDs)

[1] 5411

Here, the function fails with single NA

So, the issue was that this previously seemed to be an NA entry (I already had the routine to omit the NA), which, due to PubChem changes to unify all empty entries on their side, became a NULL this time around. So this is what happened ...

# If we then go to webchem:
selected_properties <- c("MolecularFormula","ExactMass","XlogP",
                         "CanonicalSMILES","IsomericSMILES",
                         "InChI","InChIKey","Title","IUPACName")
# retrieve info with webchem
CID_info_all <- as.data.table(webchem::pc_prop(all_CIDs, selected_properties))
# this fails with a single NA
dim(CID_info_all)

[1] 1 1

Remove the NULL for full output...

Once I figured out what was going on, I removed it in advance and things work out as expected.
(interestingly, I ended up with cache errors from the above the first time and had to restart the session before it worked tho)

# if the NULL is removed, full output
NULL_i <- grep("NULL",all_CIDs)
if (length(NULL_i)>0) {
  all_CIDs <- all_CIDs[-NULL_i]
}
length(all_CIDs)
grep("NULL",all_CIDs)
# retrieve info with webchem
CID_info_all <- as.data.table(webchem::pc_prop(all_CIDs, selected_properties))
# full dataset is returned
dim(CID_info_all)

[1] 5440 10

Q&A

To answer your question "is this going to be a rare situation": probably not. We retrieve a wide variety of tables like this from PubChem, and there are NULL entries everywhere (also in CID columns) because of gaps in the contributing data. So it is unlikely that only I (and my group) will encounter this ... we use this quite frequently in different contexts and are hoping to expand further. For these numbers (1000s of CIDs), one by one is not feasible; we'll overload the PubChem systems.
My preference would be for this option: Enforcing positive integers and converting everything else to NA - I do not think it would be an issue to have NAs in the rows of tables and getting a 5441x10 output with one row of NAs ... but getting a 1x1 NA out had me stumped until I wrote out the CID list ...
Otherwise, a warning or error is an option, so that people are aware they have to clean the CID list before they will get valid output ...

On a total aside, for this dataset I actually fixed the cause of the NA that became NULL here yesterday. But there will be other tables that have this issue.

Regarding the suggestion "Another option could be to vectorise the function like most other functions in the package and make a request for each compound. It would slow down the function and put more pressure on the API but on the other hand, it would be easier to find the compound which produces the error":

I am adding @PaulThiessen as I am pretty sure he would prefer we avoid this option (maybe fine for debugging, but not for routine use). The speed of this function is also exactly why we use it this way (I used to do single calls and it's just not feasible).

schymane commented on August 17, 2024

OK I installed the latest version from GitHub - it was a little stubborn and I had to quit the session before it worked fully, but it seems to put NAs inline now for me too and not as a single output (testing the minimum case example from above).

> library(devtools)
Loading required package: usethis
> install_github("ropensci/webchem")
Skipping install of 'webchem' from a github remote, the SHA1 (cfe02336) has not changed since last install.
  Use `force = TRUE` to force installation
> library(webchem)
> library(data.table)
data.table 1.14.8 using 4 threads (see ?getDTthreads).  Latest news: r-datatable.com
> selected_properties <- c("MolecularFormula","ExactMass","XlogP",
+                          "CanonicalSMILES","IsomericSMILES",
+                          "InChI","InChIKey","Title","IUPACName")
> test_cids <- c("1234","NULL")
> as.data.table(webchem::pc_prop(test_cids, selected_properties))
    CID MolecularFormula
1: 1234       C28H40N2O5
2: NULL             <NA>
                                                    CanonicalSMILES
1: CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
2:                                                             <NA>
                                                     IsomericSMILES
1: CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
2:                                                             <NA>
                                                                                                                                                         InChI
1: InChI=1S/C28H40N2O5/c1-20(2)28(19-29,22-17-25(33-6)27(35-8)26(18-22)34-7)13-9-14-30(3)15-12-21-10-11-23(31-4)24(16-21)32-5/h10-11,16-18,20H,9,12-15H2,1-8H3
2:                                                                                                                                                        <NA>
                      InChIKey
1: XQLWNAFCTODIRK-UHFFFAOYSA-N
2:                        <NA>
                                                                                             IUPACName
1: 5-[2-(3,4-dimethoxyphenyl)ethyl-methylamino]-2-propan-2-yl-2-(3,4,5-trimethoxyphenyl)pentanenitrile
2:                                                                                                <NA>
   XLogP    ExactMass      Title
1:   3.8 484.29372238 Gallopamil
2:    NA         <NA>       <NA>

No version bump (had me a little confused when the new version didn't work at first ... ) but here's the sessionInfo.

> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8   
[3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.utf8    

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.8 webchem_1.3.0     devtools_2.4.5    usethis_2.2.1    

loaded via a namespace (and not attached):
 [1] utf8_1.2.3        generics_0.1.3    xml2_1.3.4        stringi_1.7.12   
 [5] digest_0.6.32     magrittr_2.0.3    evaluate_0.21     pkgload_1.3.2    
 [9] fastmap_1.1.1     jsonlite_1.8.5    processx_3.8.1    sessioninfo_1.2.2
[13] pkgbuild_1.4.2    urlchecker_1.0.1  ps_1.7.5          promises_1.2.0.1 
[17] httr_1.4.6        rvest_1.0.3       purrr_1.0.1       fansi_1.0.4      
[21] cli_3.6.1         shiny_1.7.4       rlang_1.1.1       crayon_1.5.2     
[25] ellipsis_0.3.2    remotes_2.4.2     cachem_1.0.8      yaml_2.3.7       
[29] tools_4.3.1       memoise_2.0.1     dplyr_1.1.2       httpuv_1.6.11    
[33] curl_5.0.1        vctrs_0.6.3       R6_2.5.1          mime_0.12        
[37] lifecycle_1.0.3   stringr_1.5.0     fs_1.6.2          htmlwidgets_1.6.2
[41] miniUI_0.1.1.1    pkgconfig_2.0.3   callr_3.7.3       pillar_1.9.0     
[45] later_1.3.1       glue_1.6.2        profvis_0.3.8     Rcpp_1.0.10      
[49] xfun_0.39         tibble_3.2.1      tidyselect_1.2.0  data.tree_1.0.0  
[53] rstudioapi_0.15.0 knitr_1.43        xtable_1.8-4      htmltools_0.5.5  
[57] rmarkdown_2.23    compiler_4.3.1    prettyunits_1.1.1

Thanks for the quick fix - much appreciated! I'll go ahead and close this ... thanks again!

Aariq commented on August 17, 2024

I'm not able to reproduce this with the dev version of webchem using NULL or NA, but I do get it with Inf or "baloon", e.g.:

library(webchem)
selected_properties <- c("MolecularFormula","ExactMass","XlogP",
                         "CanonicalSMILES","IsomericSMILES",
                         "InChI","InChIKey","Title","IUPACName")
pc_prop(c("1234", NA), selected_properties)
#>    CID MolecularFormula
#> 1 1234       C28H40N2O5
#> 2   NA             <NA>
#>                                                    CanonicalSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#> 2                                                             <NA>
#>                                                     IsomericSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#> 2                                                             <NA>
#>                                                                                                                                                         InChI
#> 1 InChI=1S/C28H40N2O5/c1-20(2)28(19-29,22-17-25(33-6)27(35-8)26(18-22)34-7)13-9-14-30(3)15-12-21-10-11-23(31-4)24(16-21)32-5/h10-11,16-18,20H,9,12-15H2,1-8H3
#> 2                                                                                                                                                        <NA>
#>                      InChIKey
#> 1 XQLWNAFCTODIRK-UHFFFAOYSA-N
#> 2                        <NA>
#>                                                                                             IUPACName
#> 1 5-[2-(3,4-dimethoxyphenyl)ethyl-methylamino]-2-propan-2-yl-2-(3,4,5-trimethoxyphenyl)pentanenitrile
#> 2                                                                                                <NA>
#>   XLogP    ExactMass      Title
#> 1   3.8 484.29372238 Gallopamil
#> 2    NA         <NA>       <NA>
pc_prop(c("1234", NULL), selected_properties)
#>    CID MolecularFormula
#> 1 1234       C28H40N2O5
#>                                                    CanonicalSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#>                                                     IsomericSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#>                                                                                                                                                         InChI
#> 1 InChI=1S/C28H40N2O5/c1-20(2)28(19-29,22-17-25(33-6)27(35-8)26(18-22)34-7)13-9-14-30(3)15-12-21-10-11-23(31-4)24(16-21)32-5/h10-11,16-18,20H,9,12-15H2,1-8H3
#>                      InChIKey
#> 1 XQLWNAFCTODIRK-UHFFFAOYSA-N
#>                                                                                             IUPACName
#> 1 5-[2-(3,4-dimethoxyphenyl)ethyl-methylamino]-2-propan-2-yl-2-(3,4,5-trimethoxyphenyl)pentanenitrile
#>   XLogP    ExactMass      Title
#> 1   3.8 484.29372238 Gallopamil
pc_prop(c("1234", Inf), selected_properties)
#> [1] NA
pc_prop(c("1234", "baloon"), selected_properties)
#> [1] NA

Created on 2023-07-06 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.2 (2022-10-31)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2023-07-06
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.1   2023-03-23 [1] CRAN (R 4.2.0)
#>  curl          5.0.1   2023-06-07 [1] CRAN (R 4.2.0)
#>  data.tree     1.0.0   2020-08-03 [1] CRAN (R 4.2.0)
#>  digest        0.6.31  2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr         1.1.2   2023-04-20 [1] CRAN (R 4.2.0)
#>  evaluate      0.20    2023-01-17 [1] CRAN (R 4.2.0)
#>  fansi         1.0.4   2023-01-22 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.2.0)
#>  fs            1.6.1   2023-02-06 [1] CRAN (R 4.2.0)
#>  generics      0.1.3   2022-07-05 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.5   2023-03-23 [1] CRAN (R 4.2.0)
#>  httr          1.4.6   2023-05-08 [1] CRAN (R 4.2.0)
#>  jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.2.0)
#>  knitr         1.42    2023-01-25 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.20    2023-01-19 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.0)
#>  rvest         1.0.3   2022-08-19 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi       1.7.12  2023-01-11 [1] CRAN (R 4.2.0)
#>  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.2.0)
#>  styler        1.9.1   2023-03-04 [1] CRAN (R 4.2.0)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.2.0)
#>  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.2.0)
#>  utf8          1.2.3   2023-01-31 [1] CRAN (R 4.2.2)
#>  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.2.0)
#>  webchem     * 1.3.0   2023-07-06 [1] local
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.37    2023-01-31 [1] CRAN (R 4.2.0)
#>  xml2          1.3.4   2023-04-27 [1] CRAN (R 4.2.0)
#>  yaml          2.3.7   2023-01-23 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Aariq commented on August 17, 2024

Ah, I see now that the example is "NULL", not NULL

schymane commented on August 17, 2024

Yes, I guess everything was converted to string once it encountered a non-numeric ... before (and/or without the "NULL" in the works) the CIDs were numeric as well.
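
A small illustration of that type promotion, using an inline two-row CSV rather than the real file (the CID values here are made up for the example):

```r
# A numeric-looking column containing one "NULL" string is read as character
csv_text <- "predecessorcid,successorcid\n1234,5678\n2345,NULL\n"
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(df$predecessorcid)  # "integer"
class(df$successorcid)    # "character"

# Combining the two columns promotes everything to character
all_cids <- c(df$predecessorcid, df$successorcid)
class(all_cids)           # "character"

# Declaring "NULL" as a missing-value marker keeps the column integer
df2 <- read.csv(text = csv_text, na.strings = c("", "NA", "NULL"))
class(df2$successorcid)   # "integer"
```

With `na.strings` set, the existing `na.omit()` step would drop the entry as before.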

stitam commented on August 17, 2024

Thanks @schymane for raising this issue! Most webchem functions can print verbose messages but webchem respects your session settings, so the default value is always getOption("verbose"). Invalid query with verbose messages:

webchem::pc_prop(0, properties = "CanonicalSmiles", verbose = TRUE)
#> Querying. Bad Request (HTTP 400).
#> [1] NA

Created on 2023-07-07 with reprex v2.0.2

Though I admit in this situation this verbose message is not super useful :)

Enforcing positive integers and converting everything else to NA could work, however, it is still possible to submit an integer which cannot be linked to a PubChem CID and then the function will still return NA:

webchem::pc_prop(9999999999, properties = "CanonicalSmiles", 
    verbose = TRUE)
#> Querying. Bad Request (HTTP 400).
#> [1] NA

Created on 2023-07-07 with reprex v2.0.2

But maybe this is a rare exception? Another option could be to vectorise the function like most other functions in the package and make a request for each compound. It would slow down the function and put more pressure on the API but on the other hand, it would be easier to find the compound which produces the error:

lapply(c(1234, "balloon"), function(x) webchem::pc_prop(x, properties = "CanonicalSmiles", 
    verbose = TRUE))
#> Querying. OK (HTTP 200).
#> Querying. Bad Request (HTTP 400).
#> [[1]]
#>    CID                                                  CanonicalSMILES
#> 1 1234 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#> 
#> [[2]]
#> [1] NA

Created on 2023-07-07 with reprex v2.0.2

Enforcing integers is simpler to implement, vectorising would probably be more robust. What do you think?

stitam commented on August 17, 2024

"is this going to be a rare situation": I think you may have misunderstood me. What I was wondering about is how often we will still encounter invalid queries "after" enforcing positive integers and converting everything else to NA. In my example above I used a large integer which would pass such a filter (i.e. integer) but would still throw an error because you can't link a PubChem entry to it. If this situation is rare, then it's better to just enforce integers to maintain speed. If not, then vectorising could be a robust alternative.

I understand that getting a single NA for output stumped you @schymane, but this is a PubChem thing. This API allows you to make a single request with as many compounds as you like, but on the downside, if any of your compound queries is invalid, you will receive a single HTTP error, which webchem will translate to a single NA because it cannot do anything better with it.

It would take about 30 mins to query 5k compounds in a vectorised way. I'm not saying that's not much, but it's not terrible either. I'm okay with integers for now and then we'll see. What do you think, @Aariq?
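
A rough sketch of the "enforce integers, keep one batch request" idea. The names `query_valid_batch()` and `fake_query()` are hypothetical, and this is not how webchem implements it. The batch call is passed in as `query_fun` and replaced by a stub here so the example runs without network access; it is assumed to return one row per requested CID, in order.

```r
# Sketch: query only the valid CIDs in one batch, then put all-NA rows back
# in the positions of the invalid inputs.
query_valid_batch <- function(cids, query_fun) {
  num <- suppressWarnings(as.numeric(as.character(cids)))
  valid <- !is.na(num) & num > 0 & num == floor(num) &
    num <= .Machine$integer.max
  res <- query_fun(as.integer(num[valid]))  # one request for all valid CIDs
  idx <- rep(NA_integer_, length(cids))
  idx[valid] <- seq_len(sum(valid))
  out <- res[idx, , drop = FALSE]           # an NA index yields an all-NA row
  out$CID <- as.character(cids)             # keep the original input visible
  rownames(out) <- NULL
  out
}

# Stub standing in for the real batch query (assumption for illustration)
fake_query <- function(cids) {
  data.frame(CID = cids, Title = sprintf("compound_%d", cids))
}

query_valid_batch(c("1234", "NULL", "5678"), fake_query)
```

In real use, `query_fun` could be something like `function(ids) webchem::pc_prop(ids, properties = "CanonicalSMILES")`, which keeps the speed of a single request while the invalid entries still show up as NA rows.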

Aariq commented on August 17, 2024

"Enforcing positive integers and converting everything else to NA could work, however, it is still possible to submit an integer which cannot be linked to a PubChem CID and then the function will still return NA:"

9999999999 is actually not an integer (at least not in R).

> as.integer(9999999999)
[1] NA
Warning message:
NAs introduced by coercion to integer range 

But, if we reduce it to 999999999L we can test pc_prop() and see that it behaves as expected, with NA only for that CID:

library(webchem)
pc_prop(c(1234L, 999999999L))
#>         CID MolecularFormula MolecularWeight
#> 1      1234       C28H40N2O5           484.6
#> 2 999999999             <NA>            <NA>
#>                                                    CanonicalSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#> 2                                                             <NA>
#>                                                     IsomericSMILES
#> 1 CC(C)C(CCCN(C)CCC1=CC(=C(C=C1)OC)OC)(C#N)C2=CC(=C(C(=C2)OC)OC)OC
#> 2                                                             <NA>
#>                                                                                                                                                         InChI
#> 1 InChI=1S/C28H40N2O5/c1-20(2)28(19-29,22-17-25(33-6)27(35-8)26(18-22)34-7)13-9-14-30(3)15-12-21-10-11-23(31-4)24(16-21)32-5/h10-11,16-18,20H,9,12-15H2,1-8H3
#> 2                                                                                                                                                        <NA>
#>                      InChIKey
#> 1 XQLWNAFCTODIRK-UHFFFAOYSA-N
#> 2                        <NA>
#>                                                                                             IUPACName
#> 1 5-[2-(3,4-dimethoxyphenyl)ethyl-methylamino]-2-propan-2-yl-2-(3,4,5-trimethoxyphenyl)pentanenitrile
#> 2                                                                                                <NA>
#>   XLogP    ExactMass MonoisotopicMass TPSA Complexity Charge HBondDonorCount
#> 1   3.8 484.29372238     484.29372238 73.2        639      0               0
#> 2    NA         <NA>             <NA>   NA         NA     NA              NA
#>   HBondAcceptorCount RotatableBondCount HeavyAtomCount IsotopeAtomCount
#> 1                  7                 14             35                0
#> 2                 NA                 NA             NA               NA
#>   AtomStereoCount DefinedAtomStereoCount UndefinedAtomStereoCount
#> 1               1                      0                        1
#> 2               0                     NA                       NA
#>   BondStereoCount DefinedBondStereoCount UndefinedBondStereoCount
#> 1               0                      0                        0
#> 2               0                     NA                       NA
#>   CovalentUnitCount Volume3D XStericQuadrupole3D YStericQuadrupole3D
#> 1                 1    393.6               15.07                5.63
#> 2                NA       NA                  NA                  NA
#>   ZStericQuadrupole3D ConformerModelRMSD3D EffectiveRotorCount3D
#> 1                2.25                  1.2                    14
#> 2                  NA                   NA                    NA
#>   ConformerCount3D
#> 1               10
#> 2                0
#>                                                                                                                                                  Fingerprint2D
#> 1 AAADcfB7OAAAAAAAAAAAAAAAAAAAAAAAAAAwYAAAAAAAAAABQAAAHgAAAAAADwTBmAYyBoMABACQBiBCAAACCAAgIAAIiAAOiIgNpyKEsRuEMCIlwBWKqA+Q8P4PoAABCAAAQABAAAIQAACAAAAAAAAAAA==
#> 2                                                                                                                                                         <NA>

Created on 2023-07-07 with reprex v2.0.2

The query is still formed correctly, so it works. If the query is bad, then you get NA for everything because the API returns nothing.

library(webchem)
pc_prop(c(1234L, "dootdoot"))
#> [1] NA

Created on 2023-07-07 with reprex v2.0.2

stitam commented on August 17, 2024

It never occurred to me that 99..99 is not an integer in R by default, lol, thanks!
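
For reference, the limit comes from R's 32-bit integer type. A quick check in base R:

```r
.Machine$integer.max                      # 2147483647, i.e. 2^31 - 1
is.integer(9999999999)                    # FALSE: plain numeric literals are doubles
is.integer(1234L)                         # TRUE: the L suffix makes an integer
suppressWarnings(as.integer(9999999999))  # NA: outside the integer range
as.integer(999999999)                     # 999999999: fits in the integer range
```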

OK, so it seems all positive integer values are valid queries, great! Then we just need to enforce this. I'll add this in a bit

Aariq commented on August 17, 2024

"probably not, as in, we retrieve a wide variety of tables like this from PubChem, and there are NULL entries everywhere (also in CID columns) because of gaps in the contributing data."

Just to clarify, "NULL" is not the same as NULL, and webchem handles NULL sensibly (it is ignored). I don't think it should be webchem's job to do data cleaning tasks like converting the character value "NULL" to an actual NULL value. What webchem can do is make it impossible to use pc_prop() unless you do those data cleaning tasks first—e.g. requiring that the input is an integer vector (or, maybe, a vector that is coercible to integer) and giving a helpful error if it is not.
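
A sketch of what such an input check might look like. The function name `check_cids()` and the message wording are hypothetical; this is not the actual webchem implementation.

```r
# Hypothetical input check: stop with a message naming the entries that
# cannot be used as CIDs. Missing values (NA) are allowed through as NA.
check_cids <- function(cid) {
  num <- suppressWarnings(as.numeric(as.character(cid)))
  bad <- !is.na(cid) &
    (is.na(num) | num <= 0 | num != floor(num) | num > .Machine$integer.max)
  if (any(bad)) {
    stop(
      "CIDs must be positive integers. Invalid entries: ",
      paste(unique(as.character(cid[bad])), collapse = ", "),
      call. = FALSE
    )
  }
  as.integer(num)
}

check_cids(c("1234", NA))  # passes: returns the integer 1234 and NA
tryCatch(
  check_cids(c("1234", "NULL")),
  error = function(e) conditionMessage(e)
)
#> [1] "CIDs must be positive integers. Invalid entries: NULL"
```

Failing early like this points the user at the offending values before any request is sent.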

For your situation, you could read the data in with readr::read_csv(), where you can specify how NAs are represented as well as what types certain columns should be.

library(readr)
data_URL <- "https://gitlab.lcsb.uni.lu/eci/pubchem/-/raw/master/annotations/tps/Transformations/NORMAN-SLE_all_transformations.csv"

all_trans <- read_csv(data_URL, na = c("", "NULL"), col_types = cols(
  predecessorcid = col_integer(),
  successorcid = col_integer()
))

all_CIDs <- unique(na.omit(c(all_trans$predecessorcid, all_trans$successorcid)))
length(all_CIDs)
#> [1] 5440
is.integer(all_CIDs)
#> [1] TRUE

Created on 2023-07-07 with reprex v2.0.2

schymane commented on August 17, 2024

What webchem can do is make it impossible to use pc_prop() unless you do those data cleaning tasks first—e.g. requiring that the input is an integer vector (or, maybe, a vector that is coercible to integer) and giving a helpful error if it is not.

I completely agree!

I will also update my code (thanks for the tips). As you correctly picked up, it is seriously out of date and needs a good spring cleaning, but of course it needs a good bug to provide sufficient incentive to do so!
We're doing a lot of this in RMarkdown (or similar) workflows to prep files for other parts of workflows, and 30 min is not a feasible timeframe for that. Since many single queries quickly cause issues with server overload, I definitely prefer to avoid single queries for many CIDs at a time if at all possible...

stitam commented on August 17, 2024

I've just merged PR #408, @schymane can you please confirm that the PR fixed the issue?

Note, verbose messages are only printed to the console if you use the verbose = TRUE argument.
