Comments (13)

Wisser commented on August 23, 2024

Would it really make sense to extract a partial data model without the unrelated tables?
The data model would be incomplete. Wouldn't then the applications that work with the model have problems?

from jailer.

markovendelin commented on August 23, 2024

Excellent questions; the answer depends on the context. For database application testing, probably not. For extracting data from a database used in a scientific lab for publishing, it makes great sense.

In the lab, we use the same database to link data coming from different experiments. The database is used by several smaller applications, each of which works on only part of it. So, if I want to publish datasets describing one particular study, I would like to extract only the data that is relevant to that study. This can be done with Jailer by defining the model, with relevance judged by the researcher. As part of the extraction, I don't need the database schema covering every ongoing experiment, since that is irrelevant to the current publication. It would be preferable to extract only the relevant schema as well.

Please let me know if I missed something in your questions.

Wisser commented on August 23, 2024

I see, that makes sense. Thanks for the insight.
I'll see how this could be realized. I'm just afraid it might be difficult to do this in a DBMS-independent way. Maybe the feature would only be available for some selected DBMSes. Which one would be relevant in this context?

markovendelin commented on August 23, 2024

I agree, that could be difficult to do in a database-independent way. We are using PostgreSQL. Now that I think of it, maybe I would just need the list of exported tables, as I can use pg_dump, specify the tables of interest as its arguments, and dump the schema only. That way Jailer would only provide the list of tables used in the export, and it would be up to the user to extract the schema accordingly.
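
The pg_dump route described here could look roughly like the sketch below. It assumes a file tables.txt with one table name per line and a database named labdb; both names are hypothetical, as is the helper table_args. The options --schema-only and --table are standard pg_dump options.

```shell
# table_args: read table names from stdin (one per line) and print one
# pg_dump "--table=NAME" argument per line. Blank lines are skipped.
# Illustrative helper, not part of Jailer; assumes names without whitespace.
table_args() {
  while IFS= read -r table || [ -n "$table" ]; do
    if [ -n "$table" ]; then
      printf '%s\n' "--table=$table"
    fi
  done
  return 0
}

# Hypothetical usage: tables.txt lists the exported tables, labdb is the database.
# pg_dump --schema-only $(table_args < tables.txt) labdb > schema.sql
```

The command substitution is left unquoted on purpose so that each --table argument becomes a separate word; this only works for table names that contain no whitespace.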

Wisser commented on August 23, 2024

This information is contained in the generated SQL script and could theoretically be extracted from it:

$ cat demo-scott.sql
-- generated by Jailer 9.5.5, Tue Sep 15 11:48:36 CEST 2020 from RalfW@W46810

-- Extraction Model:  EMPLOYEE where T.NAME='SCOTT'
-- Source DBMS:       H2
-- ...

-- Exported Rows:     13
--    DEPARTMENT                        2 
--    EMPLOYEE                          3 
--    PROJECT                           2 
--    PROJECT_PARTICIPATION             2 
--    ROLE                              2 
--    SALARYGRADE                       2 

...


$ grep '^--    ' demo-scott.sql | sed -E 's/^--\s*//; s/\s+[0-9]+\s*$//'
DEPARTMENT
EMPLOYEE
PROJECT
PROJECT_PARTICIPATION
ROLE
SALARYGRADE

However, any table from which no rows were exported (e.g. because the table in the source database happens to be empty) but which is still relevant would be missing. Perhaps a cli-command would be useful, which would return a list of all tables that are potentially (transitively) related to the subject table?

vsgfe commented on August 23, 2024

Perhaps a cli-command would be useful, which would return a list of all tables that are potentially (transitively) related to the subject table?

This list would be useful for me as well. I generate reports about our tables and databases to help manage them. A list of the tables that are transferred between databases using Jailer would be very nice. I could not find an easy way to generate that list.

markovendelin commented on August 23, 2024

Perhaps a cli-command would be useful, which would return a list of all tables that are potentially (transitively) related to the subject table?

It would be useful, indeed. Ideally, it should stop following associations where the extraction model disables them. Thus, if we have tables

A -> B -> C

and the user has disconnected B -> C, this would be respected and the CLI tool would not print table C. Alternatively, this behavior could be an option that can be switched on and off.
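
To make the expected behavior concrete, here is a minimal sketch of such a closure over a simplified directed graph. It is not how Jailer computes it: the function name closure and the plain-text "FROM TO" input format are assumptions made for illustration, and only the associations that remain enabled are fed in.

```shell
# closure SUBJECT: read enabled associations as "FROM TO" pairs on stdin and
# print every table reachable from SUBJECT, breadth-first, SUBJECT included.
# Hypothetical helper; real extraction models are not parsed here.
closure() {
  awk -v start="$1" '
    { edges[$1] = edges[$1] " " $2 }          # adjacency list per source table
    END {
      seen[start] = 1; queue[1] = start; head = 1; tail = 1
      while (head <= tail) {
        n = split(edges[queue[head++]], targets, " ")
        for (i = 1; i <= n; i++)
          if (!(targets[i] in seen)) {        # visit each table only once
            seen[targets[i]] = 1
            queue[++tail] = targets[i]
          }
      }
      for (i = 1; i <= tail; i++) print queue[i]
    }'
}

# With both associations enabled, C is part of the closure of A:
#   printf 'A B\nB C\n' | closure A
# With B -> C disconnected, only A and B remain:
#   printf 'A B\n' | closure A
```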

Wisser commented on August 23, 2024

In the next release there will be the CLI tool "print-closure":

$ jailer.sh
usage:
...
  jailer print-closure <extraction-model> [<separator>] [-datamodel VAL]
    prints a list of all tables that are directly or transitively associated with a subject table,
    taking into account the restrictions on the associations (the so-called "Closure")
    <separator>: optional separator between table names in the output
...

$ jailer.sh print-closure extractionmodel/Demo-Scott.jm
BONUS
DEPARTMENT
EMPLOYEE
SALARYGRADE

$ jailer.sh print-closure extractionmodel/Demo-Scott.jm ", "
BONUS, DEPARTMENT, EMPLOYEE, SALARYGRADE

If you want to test this in advance, you can unzip the file in the attachment and replace the file "jailer.jar" with it.

jailer.zip

vsgfe commented on August 23, 2024

If you want to test this in advance, you can unzip the file in the attachment and replace the file "jailer.jar" with it.

jailer.zip

I did a quick test with our largest model (118 tables) and it works.
Thank you for this new feature!

markovendelin commented on August 23, 2024

Excellent, worked for me as well - exactly as expected. Please feel free to close the issue and thank you very much for your help!

Wisser commented on August 23, 2024

Available in release 9.5.6.

rbeucher commented on August 23, 2024

Hi @Wisser

I am trying to use the CLI tool but I get an error:

2022-03-23 11:44:47,788 [main] ERROR  - './extractionmodel/LT_Canada.jm' does not exist
java.io.FileNotFoundException: './extractionmodel/LT_Canada.jm' does not exist
	at net.sf.jailer.extractionmodel.ExtractionModel.loadDatamodelFolder(ExtractionModel.java:522)
	at net.sf.jailer.Jailer.updateDataModelFolder(Jailer.java:383)
	at net.sf.jailer.Jailer.jailerMain(Jailer.java:274)
	at net.sf.jailer.Jailer.main(Jailer.java:149)
Error: java.io.FileNotFoundException: './extractionmodel/LT_Canada.jm' does not exist

Arguments:  0: {print-closure},  1: {./extractionmodel/LT_Canada.jm}

2022-03-23 11:44:47,797 [main] ERROR  - working directory is /opt/jailer-database-tools/lib/app

The model file definitely exists and is in $HOME/.jailer/extractionmodel

I have installed Jailer from the Arch Linux User Repository. It is installed in /opt. The GUI works fine, but I'm having trouble with the command line tools. It looks like it is using the wrong working directory. Any idea?

Thanks!

Wisser commented on August 23, 2024

Hi @rbeucher

The script jailer.sh changes the working directory to /opt/jailer-database-tools/lib/app, so the path to the extraction model must be absolute.
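
One way to avoid typing the absolute path by hand is to resolve it before the call. This is only a sketch: abs_path is a helper name made up here, and it assumes the directory containing the model file exists.

```shell
# abs_path FILE: print FILE with its directory part resolved to an absolute path.
# Illustrative helper, not part of Jailer. Fails if the directory does not exist.
abs_path() {
  dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd) || return 1
  printf '%s/%s\n' "$dir" "$(basename -- "$1")"
}

# Usage with the model from the report above, run from $HOME/.jailer:
# jailer.sh print-closure "$(abs_path ./extractionmodel/LT_Canada.jm)"
```

Passing "$HOME/.jailer/extractionmodel/LT_Canada.jm" directly works just as well; the helper is only useful when the path is given relative to the current directory.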
