Comments (5)

mgkahn commented on August 12, 2024

@eminty I've forwarded your posting above to my GBQ contacts.

from legendt2dm.

konstjar commented on August 12, 2024

@eminty

Could you check which version of the BigQuery JDBC driver you are using?
Given that you have DatabaseConnector v4.0.5, I assume the versions of the OHDSI R packages are quite old.
Can you try running the analysis on the latest HADES version with the latest version of the JDBC driver?

eminty commented on August 12, 2024

Thanks @konstjar,

The BQ JDBC driver I used was 1.2.21.1025, which was current as of January. I see they are up to 1.2.22.1026; I could try that if we think there have been some material changes.

DatabaseConnector v4.0.2 is used, as per this renv.lock, which is based closely on the original and lists the other OHDSI dependencies and versions. (The only difference is that it references a branch of cohortDiagnostics to address a BQ / permissions issue with getTableNames() when applied to the cdm schema.)

@msuchard would know better than I whether there are reasons for using this particular version of HADES in LEGEND.

I could look to run with the latest HADES version and driver. It pushes us a bit further away from reproducibility in the code base used for the study, and I don't know if it's going to generate other dependency issues, given that they are also specified in the lock file.

When you say 'latest version of HADES', is there a specification (esp. in the form of an renv.lock file) of all those packages and dependencies that can be referenced?

I gather from your suggestion that you don't expect there is actually a quota issue here?

Happy to have the opportunity to learn from you on this. Thanks.

jdposada commented on August 12, 2024

Hi @konstjar,

Echoing what @eminty said, it would be better if you provided an renv.lock file to use instead of the one in this GitHub repository. That significantly reduces the guesswork and the likely back-and-forth over slight dependency changes and compatibility issues.

Thanks a lot for your help.

eminty commented on August 12, 2024

As a (delayed) follow-up to this, I wound up engineering a workaround for Google's quota by substituting the generic 'Codesets' table name within the ATLAS-generated SQL in the inst/sql/sql_server/class folder with 'CodesetsB'. This prevents the quota limit from being triggered by table operations on 'Codesets'.

I did this by calling a perl one-liner from R. Paths are appropriate to a deployed legendT2dm container. I suggest backing up the original SQL files first. Then:

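# Folder holding the ATLAS-generated cohort SQL (path inside the container)
sqlDir <- "/workdir/LegendT2dm/inst/sql/sql_server/class"
cohortSQL <- list.files(sqlDir)

# Select the first half of the files (in list.files() order) for renaming
n <- trunc(length(cohortSQL) / 2)
cohortSQL_toChange <- cohortSQL[seq_len(n)]

setwd(sqlDir)
for (sqlFile in cohortSQL_toChange) {
  # Edit the file in place; perl keeps the original as <file>.bak
  perl_script <- paste0("perl -pi.bak -e 's/Codesets/CodesetsB/g' ", shQuote(sqlFile))
  system(perl_script)
}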

(Note that this also creates an additional backup of each original SQL file, with a .bak extension; modified from this example.)
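For sites where perl is not available, the same substitution can be sketched in base R. This is only an illustration, not the code that was run for the study: the function name rename_table_in_sql and its arguments are hypothetical. It also differs from the one-liner above in one respect: it matches 'Codesets' as a whole word, so running it twice does not turn 'CodesetsB' into 'CodesetsBB'.

```r
# Hypothetical helper (not part of LegendT2dm): rename a table in SQL files.
# `files` is a character vector of paths to edit in place.
rename_table_in_sql <- function(files, from = "Codesets", to = "CodesetsB") {
  # \b word boundaries keep an already-renamed "CodesetsB" from matching again
  pattern <- paste0("\\b", from, "\\b")
  for (f in files) {
    original <- readLines(f, warn = FALSE)
    modified <- gsub(pattern, to, original, perl = TRUE)
    if (!identical(original, modified)) {
      # Keep a .bak copy of the untouched file; never overwrite an older backup
      file.copy(f, paste0(f, ".bak"), overwrite = FALSE)
      writeLines(modified, f)
    }
  }
  invisible(files)
}

# Example call, assuming the same container path and file selection as above:
# sqlDir <- "/workdir/LegendT2dm/inst/sql/sql_server/class"
# files <- list.files(sqlDir, full.names = TRUE)
# rename_table_in_sql(files[seq_len(trunc(length(files) / 2))])
```

Passing the file vector explicitly leaves the choice of which files to change with the caller, as in the loop above.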

While not ideal to be creating a divergence in the code base across sites (including the cohort SQL files), this did allow us to run the cohort characterization phase of LEGENDT2DM.

I suspect that, rather than needing to do this for other large-scale computational epidemiology experiments (an important part of the OHDSI mission!), we would want to discuss this quota limit with those at Google interested in seeing OHDSI thrive on BigQuery. If that doesn't seem possible, those more knowledgeable than I could alternatively consider how ATLAS-generated SQL might be translated for BQ to conform to the limit.
