GithubHelp home page GithubHelp logo

rodionovsasha / jfixtures Goto Github PK

View Code? Open in Web Editor NEW
20.0 3.0 2.0 454 KB

A tool for populating relational databases with yml-based test data

License: MIT License

Shell 0.08% Java 25.92% Groovy 74.00%
java testing test-data sql

jfixtures's People

Contributors

rodionovsasha avatar vkorobkov avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

jfixtures's Issues

Configure output SQL formatting parameters

These could be:

  • The amount of empty lines between: tables, clean instructions, inserts, e.t.c.
  • Formatting insert instructions anyhow
  • Adding comments when table starts/ends
  • Adding comments with row aliases

Throw NotSupportedException on Iterable yaml types

When fixture value type is iterable(array, list, map, e.t.c.) we need to throw NotSupportedException(or NotImplementedYesException until we understand how to support these types for different SQL dialects.

Full list of unsopported types/syntax:

  1. list: [LITE, RES_ACID, SUS_DEXT]
  2. map: {hp: 13, sp: 5}
  3. !!omap, !!pairs, !!set, !!seq, !!map

Every type of syntax should be covered with a test.

Also, as a result of this bug, we need to raise epics to investigate/implement collections for particular SQL dialects(like map could be represented as a JSONB field of PG, !!set, !!seq, !!map as arrays of PG)

Computed by row aliases PKs and IDs

Tests need an ability to calculate ID(pk) by row aliases.
Ideally, calculating the id from row alias should be a pure function, like id = by_alias('homer'). This trick allows to use the same ID auto generation mechanism in both JFixtures and the tests within JFixtures.

The algorithm could base on the standard hashCode of String:

final LOWER_BOUND = 100_000;
final HASH_RANGE = Integer.MAX_VALUE - LOWER_BOUND;

public int calculate(String alias) { // do not copy this method name
  int hashWithOffset = alias.hashCode() % HASH_RANGE;
  return LOWER_BOUND + hashWithOffset;
}

Pros:

  • Pure function(no side effects, does not depend on any external state, the result depends only on the argument)
  • Easy
  • Fast
  • User can define their own ID's in the range 0...LOWER_BOUND which will never intersect with auto generated ones

Cons:

  • Collisions of auto generated IDs are possible:
=> 4.6587822898051644e-08

This is a probability of collisions when a table has 100 rows(don't think users would ever add more rows manually). The probability is low, so I think we can this approach. For that rare case of collision, we need to offer users to define their own IDs on 0...LOWER_BOUND range.

I think we need to do in a few small steps. Every steps would need a separate PR:

commentService.findByTopic(TOPIC_ID).id == [4, 5, 6, 8]

will look like

commentService.findByTopic(TOPIC_ID).id == HashId.list("homer", "bart", "lisa", "maggy")
  • Merge develop to master and release
  • Update core version in command line runner
  • Update tests in demo project

Table property for disabling cleanup

By now JFixtures generates clean up SQL for every table(before inserts):

DELETE FROM $table_name;

This behavior is hardcoded but should be switchable off. A new table property is required:

tables:
   no_clean_up:
     applies_to: /.+
     clean_method: none # might be delete or truncate(truncate is out of scope of this ticket)

See com.github.vkorobkov.jfixtures.processor.Processor#cleanupTable

Remove `type:sql` for column values

Currently when we don't want to escape a string we use type: sql in column values in fixture files:

vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name:
    type:  value
    value: Dan # Gets escaped because of type: value
robot:
  age: 0
  name:
    type: sql
    value: DEFAULT # Does not get escaped because of  type: sql

This could be done easier:

vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name: text:Dan # Gets escaped because of text:
robot:
  age: 0
  name: sql:DEFAULT # Does not get escaped because of `sql:` prefix
  • text: or sql: does not go to the resulting SQL
  • values after the prefix is not trimmed, so the spaces go to SQL
  • if no prefix text: or sql: is specified the value will be interpreted as text and will be escaped
  • com.github.vkorobkov.jfixtures.util.RowMergeConflictResolver could be removed

Support .yaml file extension

YML files have two valid possible extensions: .yml and .yaml

Currently we do not support .yaml extension and we need to fix it before the public release.

  1. com.github.vkorobkov.jfixtures.config.ConfigLoader
    Should lookup for .conf.yaml first, in case of failure should lookup fpr .conf.yml

  2. com.github.vkorobkov.jfixtures.loader.FixturesLoader#isYml should be fixed:

.endsWith(".yml") || .endsWith(".yaml")

There is an edge case here: if the same fixture exists with both extensions(like users.yaml and users.yml we need to throw an exception with a readable message)

truncate clean up method

By now JFixtures generates clean up SQL for every table(before inserts):

DELETE FROM $table_name;

Add ability to change the instruction from DELETE to TRUNCATE in order to produce SQL like:

TRUNCATE $table_name;

A new table property is required:

tables:
   no_clean_up:
     tables: /.+
     clean_method: truncate # might be delete or none

Mono YAML file so small dataset

Depends and basing on #167
If we read YAML with com.github.vkorobkov.jfixtures.util.YmlUtil#load then we can just pass the output map into #167

For medium sized datasets(a few small tables) it may be the most valuable option to put the data into a single YAML file, so no need to think about directory structure:

# users table
users:
  homer:
    name: Homer
    age: 40
  bart:
    name: Bart
    age: 12

# admin.roles table
# beware the dots in table name should be escaped
"admin.roles": 
  admin:
    reads: true
    writes: true
  user:
    reads: true
    writes: false

Add loger and minimal verbosity

No new compile-time dependencies should be added - provided scope only.

Probably, at minimum, there might be following logs:

WARN: file ".cong.yml" not found, using defaults
Processing table "users"
Processing table "comments"
Processing table "permissions"
Processing table "photos"

Have a mechanism to define tables order AND is different that relationships

Currently the order tables go to the destination SQL is not defined except one rule: referred tables go strictly before the referring ones.

However, except the relationship, there might be other cases when the tables order is important: table can have dependencies on triggers/stored procedures/default values level.

So there should be a way to say that table A goes before table B and these rules should have the highest priority, higher, than table ordering defined by relationships. The circular references should still be detected in the same manner.

A possible format of describing such dependencies in .conf:

# maybe a new <requires> table property
user_needs_roles:
  apply_to: users
  requires: [roles, permissions] # the required tables go before <users>

Disallow dots in folder names

So sub folders with fixtures can not have dots to avoid collisions like below:

+ old/
  + searches/
    log.yaml        # old.searches.log
+ old.searches/
  log.yaml          # also old.searches.log?

Instead, an exception should occur when dot in a folder name is detected.

Before inserts custom SQL

Problem:
To be able to write a custom SQL before insert instructions(and after cleanup instruction).

Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    before_inserts:
      - // Doing table $TABLE_NAME
      - BEGIN TRANSACTION;

The before-insert section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

Before-insert instructions order must not be changed, duplicates(if there are) must not be removed.

Self defined foreign keys

Probably for small project or database it would be useful to define foreign keys on the place:

Short(or relative) table name:

vlads_comment:
  text: Hello
  # users table should be looked up first in the same directory as current fixture is
  # otherwise the general lookup strategy from fixtures root folder should be user
  user_id: users:vlad # trying to find users table and vlad in it

Table name with schema:

vlads_comment:
  text: Hello
  # the same lookup strategy as above: relative first, canonical after 
  user_id: public.users:vlad # trying to find public.users table and vlad in it

Foreign key to any field other than to PK:

vlads_comment:
  text: Hello
  user_id: public.users:vlad:public_id # refers to public_id column of public.users table

Table relations defined in .conf.yml would be the default behaviour, but manually defined FKs would override it.

Clickhouse dialect

I would use your library for writing clickhouse integration tests. Could you add clickhouse dialect?

Restrict binary and timestamp types

Restrict binary and timestamp types since we don't normally support them at the moment.

It is better to fail gracefully rather than to write to SQL meaningless values.

Add GenericSql dialect

Remove all dialects that just inherit BaseSql in favor of one Sql99 which covers many databases at once.

Support Binary type

Yaml has a binary type - http://yaml.org/type/binary.html

Currently if we have a binary type in YAML:

  children: !!binary |
             R0lGODlhDAAMAIQAAP//9/X17unp5WZmZgAAAOfn515eXvPz7Y6OjuDg4J+fn5
             OTk6enp56enmlpaWNjY6Ojo4SEhP/++f/++f/++f/++f/++f/++f/++f/++f/+
             +f/++f/++f/++f/++f/++SH+Dk1hZGUgd2l0aCBHSU1QACwAAAAADAAMAAAFLC
             AgjoEwnuNAFOhpEMTRiggcz4BNJHrv/zCFcLiwMWYNG84BwwEeECcgggoBADs=

It simply concerts into VALUES (99560979, [B@303cf2ba);

We need to find a way to concert it to correct SQL binary field

http://www.ocelot.ca/datatype.htm
http://www.technowlogeek.com/programming/java/jdbc/the-sql-99-types-blob-binary-large-object/

customize lower bound of auto generated PKs

Currently when JFixtures auto generates primary keys it starts with value 10_000(See com.github.vkorobkov.jfixtures.processor.sequence.IncrementalSequence#LOWER_BOUND) and then increments for every row.

This value should be configurable per table. The minimal acceptable value is 1, otherwise an exception should be thrown.

Computed PK name

Some databases have strange convention for PK name: it is not fixed(like id or ID) but it is dynamic instead and depends on the table name: USER_ID, payment_id.

To support that, we need to inline some scripting when we set a table PK name. It could be a JS since Java has it's interpreter.

To overall design TBD

Aliases for column name

If column names are too long or too difficult there may be useful to use shorter/better aliases to them in fixtures files.

Probably the scope of column aliases must be only for yaml fixtures, all the configuration should use original column names just like these are in the DB

H2 support

We need to add H2 database support for a few reasons:

  1. An ability to write fast end-to-end integration tests without need to install a heavy databases/dockers/etc
  2. To support one more DB

Acceptance criteria:

  • Basing on adding h2 support, README.md file should be extended with an instruction of adding a custom database support
  • H2 support itself (see com.github.vkorobkov.jfixtures.JFixtures, com.github.vkorobkov.jfixtures.sql, com.github.vkorobkov.jfixtures.sql.dialects). I think the main goal is just to escape column/table names properly as well as string literals
  • Let's add one end-to-end test which will just create a single table with a few records - just to reuse it as a template in the future
  • Ensure code coverage remains 100%(see target/site/jacoco/index.html)
  • Add yourself into <developers> section of pom.xml
  • <version>1.0.5</version> should be increased to 1.0.6 in pom.xml

Ability to set custom ID generator

Ability to set a custom ID generator per table(s) and globally. Should be a property like:

id_generator: com.package.IdGenerator.generate

next to other id related properties.

The property describes a path to a static method which will be used for ID generation. The method must be public, static, it must consume String and return Object. Otherwise, an exception should be thrown.

A few standard generator could be added to the project: LongId(the same as IntId but with wider range) and StringId(presumably can just return row aliases)

Write into DB directly in Fluent API?

Add fluent methods to write the result directly in a DB using JDBC driver class, connection url, username, password, or maybe using a provided JDBC connection/pool.

Might be useful in maven plugin so maven can upload the data to the DB by connection properties.

The command line tool should be modified to enable this feature - it should load an external JAR with DB driver.

Handle Date type correctly

SnakeYaml has implicit conversion of strings like 2001-11-23 and 2001-11-23 15:02:31 to Date type. Currently, the Date type generates invalid SQL value:

INSERT INTO "users" ("id","date") VALUES (99560979, Fri Nov 23 21:03:17 MSK 2001);

Perhaps, we need to investigate this question and to generate more convenient dates for SQL in format YYYY-MM-DD / format: YYYY-MM-DD HH:MI:SS / unix timestamp.

More info:
https://bitbucket.org/asomov/snakeyaml/wiki/Documentation
https://bitbucket.org/asomov/snakeyaml/src/tip/src/test/java/examples/resolver/CustomResolverTest.java?fileviewer=file-view-default
https://www.w3schools.com/sql/sql_dates.asp

Clean tables which don't have fixtures

If there are tables in the DB which don't have fixtures it's good to have ability to clean them anyway.

Currently, the workaround is to create an empty fixture so the clean up instruction will be executed.
Though, the more elegant way is to specify such tables right in the config:

clean_tables: ['logs', 'statistics', 'garbage', 'slow_sql']

The instructions for cleaning these tables should go first. The order should be preserved, the duplicates should be skipped.

Uppercase `null`to `NULL`

Yaml has two valid null types: ~ and null as per snake yaml documentation

Both currently translate to null in output SQL code. We need to uppercase null so it becomes NULL and to cover this scenario with tests:

  1. ~ -> NULL
  2. null -> NULL
  3. Null -> NULL
  4. nULL -> NULL
  5. nulla -> 'nulla'

Custom SQL fragments before/after all

Custom SQL fragments before/after all the fixtures generation. In case user needs to do something common special in the database.

There are few ways to define these fragments:

  • In config file(inline SQL as well as reference to external SQL file)
  • File with conventional name which will be auto registered and auto executed:
- .before/
    01-clean-logs.sql
    02-recreate-sequences.sql
    03-apply-migrations.sql
- .after/
    01-do-something.sql

(file are sorted by name ASC and then executed)

Before cleanup custom SQL

Problem:
To be able to write a custom SQL before the cleanup instruction(if disabled, then before-cleanup means the same as before-inserts):
Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    before-cleanup:
      - // Beginning of the $TABLE_NAME
      - BEGIN TRANSACTION;

The before-cleanup section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

The instructions order must not be changed, duplicates(if there are) must not be removed.

After inserts custom SQL

Problem:
To be able to write a custom SQL after the insert instructions
Possible configuration:

tables:
  transactional_users:
    applies_to: users, users_profiles
    after_inserts:
      - // Completed table $TABLE_NAME
      - COMMIT TRANSACTION;

The after-insert section could be: a single value or a list(or list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.

The instructions order must not be changed, duplicates(if there are) must not be removed.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.