rodionovsasha / jfixtures
A tool for populating relational databases with yml-based test data
License: MIT License
Should be TRUE / FALSE instead of true and false
WHEN row has a PK which is defined by user
AND table already has a row with such a PK
THEN jfixtures should throw an exception with a human readable message
Now the step is always 1 but could be configured per table(s)
These could be:
Replace null checks like == null or != null with Optional
When a fixture value type is iterable (array, list, map, etc.) we need to throw NotSupportedException (or NotImplementedYetException) until we understand how to support these types for different SQL dialects.
Full list of unsupported types/syntax:
list: [LITE, RES_ACID, SUS_DEXT]
map: {hp: 13, sp: 5}
!!omap, !!pairs, !!set, !!seq, !!map
Every type of syntax should be covered with a test.
Also, as a result of this bug, we need to raise epics to investigate/implement collections for particular SQL dialects (for example, map could be represented as a JSONB field of PG; !!set, !!seq, !!map as arrays of PG).
Tests need the ability to calculate an ID (PK) from row aliases.
Ideally, calculating the ID from a row alias should be a pure function, like id = by_alias('homer'). This trick allows using the same ID auto generation mechanism in both JFixtures and the tests within JFixtures.
The algorithm could be based on the standard hashCode of String:
static final int LOWER_BOUND = 100_000;
static final int HASH_RANGE = Integer.MAX_VALUE - LOWER_BOUND;

public int calculate(String alias) { // do not copy this method name
    // floorMod keeps the offset non-negative even when hashCode() is negative
    int hashWithOffset = Math.floorMod(alias.hashCode(), HASH_RANGE);
    return LOWER_BOUND + hashWithOffset;
}
Pros:
Cons:
=> 4.6587822898051644e-08
This is the probability of collisions when a table has 100 rows (I don't think users would ever add more rows manually). The probability is low, so I think we can use this approach. For the rare case of a collision, we need to let users define their own IDs in the 0...LOWER_BOUND range.
I think we need to do this in a few small steps. Every step would need a separate PR:
commentService.findByTopic(TOPIC_ID).id == [4, 5, 6, 8]
will look like
commentService.findByTopic(TOPIC_ID).id == HashId.list("homer", "bart", "lisa", "maggy")
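A minimal sketch of what such a helper could look like, assuming the hashCode-based algorithm described above. The names HashId and list come from the example line; the method one and the constants are illustrative assumptions, not existing JFixtures API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a pure function from row alias to ID.
final class HashId {
    private static final int LOWER_BOUND = 100_000;
    private static final int HASH_RANGE = Integer.MAX_VALUE - LOWER_BOUND;

    private HashId() {
    }

    // Same alias always gives the same ID; the result is never below LOWER_BOUND.
    static int one(String alias) {
        return LOWER_BOUND + Math.floorMod(alias.hashCode(), HASH_RANGE);
    }

    // Convenience for assertions over several rows at once.
    static List<Integer> list(String... aliases) {
        List<Integer> ids = new ArrayList<>();
        for (String alias : aliases) {
            ids.add(one(alias));
        }
        return ids;
    }
}
```

Because the function has no state, a test can compute the expected IDs without loading any fixtures.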
Currently JFixtures generates clean up SQL for every table (before inserts):
DELETE FROM $table_name;
This behavior is hardcoded, but it should be possible to switch it off. A new table property is required:
tables:
  no_clean_up:
    applies_to: /.+
    clean_method: none # might be delete or truncate (truncate is out of scope of this ticket)
See com.github.vkorobkov.jfixtures.processor.Processor#cleanupTable
Currently when we don't want to escape a string we use type: sql in column values in fixture files:
vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name:
    type: value
    value: Dan # Gets escaped because of type: value
robot:
  age: 0
  name:
    type: sql
    value: DEFAULT # Does not get escaped because of type: sql
This could be made simpler:
vlad:
  age: 29
  name: Vlad # Gets escaped
dan:
  age: 45
  name: text:Dan # Gets escaped because of text:
robot:
  age: 0
  name: sql:DEFAULT # Does not get escaped because of `sql:` prefix
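The prefix convention from the example above could be parsed roughly as follows. This is only a sketch: the class ColumnValue and its fields are hypothetical names, and values without a prefix are treated as text that gets escaped.

```java
// Hypothetical holder for a parsed fixture value.
final class ColumnValue {
    final String text;      // the value without its prefix
    final boolean escaped;  // true when the value must be escaped as a string literal

    private ColumnValue(String text, boolean escaped) {
        this.text = text;
        this.escaped = escaped;
    }

    static ColumnValue parse(String raw) {
        if (raw.startsWith("sql:")) {
            return new ColumnValue(raw.substring("sql:".length()), false);
        }
        if (raw.startsWith("text:")) {
            return new ColumnValue(raw.substring("text:".length()), true);
        }
        // No prefix: plain text, escaped by default.
        return new ColumnValue(raw, true);
    }
}
```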
The text: or sql: prefix does not go to the resulting SQL
If neither text: nor sql: is specified, the value will be interpreted as text and will be escaped
YML files have two valid possible extensions: .yml and .yaml
Currently we do not support the .yaml extension and we need to fix it before the public release.
com.github.vkorobkov.jfixtures.config.ConfigLoader should look up .conf.yaml first and, in case of failure, look up .conf.yml
com.github.vkorobkov.jfixtures.loader.FixturesLoader#isYml should be fixed:
.endsWith(".yml") || .endsWith(".yaml")
There is an edge case here: if the same fixture exists with both extensions (like users.yaml and users.yml), we need to throw an exception with a readable message.
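The extension check and the duplicate edge case could be sketched like this. FixtureNames and its methods are hypothetical names for illustration and do not reflect the actual FixturesLoader code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FixtureNames {
    static boolean isYml(String fileName) {
        return fileName.endsWith(".yml") || fileName.endsWith(".yaml");
    }

    // Strips the extension: "users.yaml" -> "users".
    static String baseName(String fileName) {
        return fileName.substring(0, fileName.lastIndexOf('.'));
    }

    // Fails when one fixture is present under both extensions in the same folder.
    static void assertNoDuplicates(List<String> fileNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fileNames) {
            if (isYml(name) && !seen.add(baseName(name))) {
                throw new IllegalStateException(
                    "Fixture '" + baseName(name) + "' exists with both .yml and .yaml extensions");
            }
        }
    }
}
```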
Currently JFixtures generates clean up SQL for every table (before inserts):
DELETE FROM $table_name;
Add the ability to change the instruction from DELETE to TRUNCATE in order to produce SQL like:
TRUNCATE $table_name;
A new table property is required:
tables:
  no_clean_up:
    tables: /.+
    clean_method: truncate # might be delete or none
Depends on and is based on #167
If we read YAML with com.github.vkorobkov.jfixtures.util.YmlUtil#load then we can just pass the output map into #167
For medium sized datasets (a few small tables) it may be the most valuable option to put the data into a single YAML file, so there is no need to think about directory structure:
# users table
users:
  homer:
    name: Homer
    age: 40
  bart:
    name: Bart
    age: 12
# admin.roles table
# beware the dots in table name should be escaped
"admin.roles":
  admin:
    reads: true
    writes: true
  user:
    reads: true
    writes: false
No new compile-time dependencies should be added - provided scope only.
Probably, at minimum, there might be the following logs:
WARN: file ".conf.yml" not found, using defaults
Processing table "users"
Processing table "comments"
Processing table "permissions"
Processing table "photos"
Currently the order in which tables go to the destination SQL is not defined except for one rule: referred tables go strictly before the referring ones.
However, apart from relationships, there might be other cases when the table order is important: a table can have dependencies at the level of triggers, stored procedures, or default values.
So there should be a way to say that table A goes before table B, and these rules should have the highest priority, higher than the table ordering defined by relationships. Circular references should still be detected in the same manner.
A possible format for describing such dependencies in .conf:
# maybe a new <requires> table property
user_needs_roles:
  apply_to: users
  requires: [roles, permissions] # the required tables go before <users>
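One way to honour such rules is a depth-first topological sort that also reports cycles. The sketch below is an assumption about how it could be done, not the existing JFixtures ordering code; TableOrder and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TableOrder {
    // requires maps a table to the tables that must go before it.
    static List<String> sort(Map<String, List<String>> requires) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String table : requires.keySet()) {
            visit(table, requires, done, inProgress, ordered);
        }
        return ordered;
    }

    private static void visit(String table, Map<String, List<String>> requires,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(table)) {
            return;
        }
        // Meeting a table that is still being visited means a circular reference.
        if (!inProgress.add(table)) {
            throw new IllegalStateException("Circular dependency: " + inProgress + " -> " + table);
        }
        for (String required : requires.getOrDefault(table, Collections.emptyList())) {
            visit(required, requires, done, inProgress, ordered);
        }
        inProgress.remove(table);
        done.add(table);
        ordered.add(table); // emitted only after everything it requires
    }
}
```

Relationship-based dependencies could be merged into the same map, so that both kinds of rules share one cycle check.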
Subfolders with fixtures cannot have dots in their names, to avoid collisions like the one below:
+ old/
  + searches/
      log.yaml # old.searches.log
+ old.searches/
    log.yaml # also old.searches.log?
Instead, an exception should occur when a dot in a folder name is detected.
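A minimal sketch of such a check, assuming the fixture path is given relative to the fixtures root. FolderNames.validate is a hypothetical name.

```java
import java.nio.file.Path;

final class FolderNames {
    // Rejects any folder segment that contains a dot; the file name itself is not checked.
    static void validate(Path relativeFixturePath) {
        Path folders = relativeFixturePath.getParent();
        if (folders == null) {
            return; // the fixture lies directly in the root folder
        }
        for (Path folder : folders) {
            if (folder.toString().contains(".")) {
                throw new IllegalArgumentException(
                    "Folder name must not contain dots: " + folder);
            }
        }
    }
}
```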
Once this dependency is fixed:
checkstyle/checkstyle#2579
Problem:
To be able to write custom SQL before the insert instructions (and after the cleanup instruction).
Possible configuration:
tables:
  transactional_users:
    applies_to: users, users_profiles
    before_inserts:
      - // Doing table $TABLE_NAME
      - BEGIN TRANSACTION;
The before-insert section could be a single value or a list (or a list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.
The order of before-insert instructions must not be changed; duplicates (if any) must not be removed.
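The flattening rules above (single value or arbitrarily nested lists, order kept, duplicates kept, placeholder replaced) could be sketched as follows. Instructions.flatten is a hypothetical helper, and the input is assumed to be whatever object the YAML parser produced for the section.

```java
import java.util.ArrayList;
import java.util.List;

final class Instructions {
    static List<String> flatten(Object section, String tableName) {
        List<String> result = new ArrayList<>();
        collect(section, tableName, result);
        return result;
    }

    // Depth-first walk keeps the original order and does not drop duplicates.
    private static void collect(Object node, String tableName, List<String> out) {
        if (node instanceof Iterable) {
            for (Object child : (Iterable<?>) node) {
                collect(child, tableName, out);
            }
        } else {
            // String.replace substitutes literally, so "$" needs no escaping here.
            out.add(String.valueOf(node).replace("$TABLE_NAME", tableName));
        }
    }
}
```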
For a small project or database it would probably be useful to define foreign keys in place:
Short (or relative) table name:
vlads_comment:
  text: Hello
  # users table should be looked up first in the same directory as the current fixture
  # otherwise the general lookup strategy from the fixtures root folder should be used
  user_id: users:vlad # trying to find users table and vlad in it
Table name with schema:
vlads_comment:
  text: Hello
  # the same lookup strategy as above: relative first, canonical after
  user_id: public.users:vlad # trying to find public.users table and vlad in it
Foreign key to any field other than the PK:
vlads_comment:
  text: Hello
  user_id: public.users:vlad:public_id # refers to public_id column of public.users table
Table relations defined in .conf.yml would be the default behaviour, but manually defined FKs would override it.
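Parsing the three reference forms shown above could look roughly like this. FkRef is a hypothetical class; a null column is used here to mean "refer to the PK", which is an assumption of this sketch. Table lookup (relative first, canonical after) is left out.

```java
final class FkRef {
    final String table;
    final String rowAlias;
    final String column; // null means the primary key of the referred table

    private FkRef(String table, String rowAlias, String column) {
        this.table = table;
        this.rowAlias = rowAlias;
        this.column = column;
    }

    // Accepts "table:alias" or "table:alias:column".
    static FkRef parse(String raw) {
        String[] parts = raw.split(":", -1);
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Expected table:alias[:column], got: " + raw);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty segment in reference: " + raw);
            }
        }
        return new FkRef(parts[0], parts[1], parts.length == 3 ? parts[2] : null);
    }
}
```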
I would use your library for writing clickhouse integration tests. Could you add clickhouse dialect?
Like "Getting started with JFixtures, Spring Boot and JUnit" with examples of code and with a reference to demo proj
Restrict binary and timestamp types since we don't normally support them at the moment.
It is better to fail gracefully rather than to write to SQL meaningless values.
Remove all dialects that just inherit BaseSql in favor of one Sql99 which covers many databases at once.
Yaml has a binary type - http://yaml.org/type/binary.html
Currently if we have a binary type in YAML:
children: !!binary |
  R0lGODlhDAAMAIQAAP//9/X17unp5WZmZgAAAOfn515eXvPz7Y6OjuDg4J+fn5
  OTk6enp56enmlpaWNjY6Ojo4SEhP/++f/++f/++f/++f/++f/++f/++f/++f/+
  +f/++f/++f/++f/++f/++SH+Dk1hZGUgd2l0aCBHSU1QACwAAAAADAAMAAAFLC
  AgjoEwnuNAFOhpEMTRiggcz4BNJHrv/zCFcLiwMWYNG84BwwEeECcgggoBADs=
It simply converts into VALUES (99560979, [B@303cf2ba);
We need to find a way to convert it to a correct SQL binary value
http://www.ocelot.ca/datatype.htm
http://www.technowlogeek.com/programming/java/jdbc/the-sql-99-types-blob-binary-large-object/
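One candidate is the hexadecimal binary string literal X'...' from standard SQL. The sketch below assumes the YAML parser hands the value over as a byte[] (the [B@... output above suggests so); BinaryLiteral is a hypothetical name, and individual dialects may need a different literal syntax.

```java
final class BinaryLiteral {
    // Renders bytes as a hex literal, for example X'474946'.
    static String toSql(byte[] bytes) {
        StringBuilder sql = new StringBuilder("X'");
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sql.append(String.format("%02X", b & 0xFF));
        }
        return sql.append("'").toString();
    }
}
```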
Pros:
Cons:
Single quote characters ' should be replaced with '' for string literals in SQL statements
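A minimal sketch of the replacement; SqlStrings.quote is a hypothetical helper name.

```java
final class SqlStrings {
    // Doubles every single quote and wraps the value in single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }
}
```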
Currently when JFixtures auto generates primary keys it starts with value 10_000 (see com.github.vkorobkov.jfixtures.processor.sequence.IncrementalSequence#LOWER_BOUND) and then increments for every row.
This value should be configurable per table. The minimal acceptable value is 1, otherwise an exception should be thrown.
Export to custom formats which keeps tables ordering: think of YAML, XML, CSV, JSON
Some databases have a strange convention for the PK name: it is not fixed (like id or ID) but dynamic, and depends on the table name: USER_ID, payment_id.
To support that, we need to inline some scripting when we set a table PK name. It could be JS, since Java has its own interpreter.
The overall design is TBD
If column names are too long or too difficult, it may be useful to use shorter/better aliases for them in fixture files.
The scope of column aliases should probably be limited to YAML fixtures; all the configuration should use the original column names just as they are in the DB
We need to add H2 database support for a few reasons:
Acceptance criteria:
com.github.vkorobkov.jfixtures.JFixtures, com.github.vkorobkov.jfixtures.sql, com.github.vkorobkov.jfixtures.sql.dialects). I think the main goal is just to escape column/table names properly as well as string literals
target/site/jacoco/index.html)
<developers> section of pom.xml
<version>1.0.5</version> should be increased to 1.0.6 in pom.xml
Ability to set a custom ID generator per table(s) and globally. Should be a property like:
id_generator: com.package.IdGenerator.generate
next to other id related properties.
The property describes a path to a static method which will be used for ID generation. The method must be public and static, it must consume a String and return an Object. Otherwise, an exception should be thrown.
A few standard generators could be added to the project: LongId (the same as IntId but with a wider range) and StringId (presumably can just return row aliases)
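Resolving such a property with reflection could be sketched as below. IdGeneratorResolver is a hypothetical class, not part of JFixtures. The usage note calls java.lang.Integer.valueOf only because it is a public static JDK method that takes a String.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class IdGeneratorResolver {
    // path looks like "com.package.IdGenerator.generate": class name, dot, method name.
    static Method resolve(String path) {
        int lastDot = path.lastIndexOf('.');
        if (lastDot <= 0) {
            throw new IllegalArgumentException("Expected <class>.<method>, got: " + path);
        }
        try {
            Class<?> type = Class.forName(path.substring(0, lastDot));
            // getMethod only finds public methods with exactly this parameter list.
            Method method = type.getMethod(path.substring(lastDot + 1), String.class);
            if (!Modifier.isStatic(method.getModifiers())) {
                throw new IllegalArgumentException("ID generator must be static: " + path);
            }
            return method;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot resolve ID generator: " + path, e);
        }
    }

    static Object generate(String path, String rowAlias) {
        try {
            return resolve(path).invoke(null, rowAlias);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("ID generator failed: " + path, e);
        }
    }
}
```

For example, IdGeneratorResolver.generate("java.lang.Integer.valueOf", "42") returns the Integer 42, while "java.lang.String.concat" is rejected because it is not static.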
Add fluent methods to write the result directly in a DB using JDBC driver class, connection url, username, password, or maybe using a provided JDBC connection/pool.
Might be useful in maven plugin so maven can upload the data to the DB by connection properties.
The command line tool should be modified to enable this feature - it should load an external JAR with DB driver.
SnakeYaml has implicit conversion of strings like 2001-11-23 and 2001-11-23 15:02:31 to the Date type. Currently, the Date type generates an invalid SQL value:
INSERT INTO "users" ("id","date") VALUES (99560979, Fri Nov 23 21:03:17 MSK 2001);
Perhaps we need to investigate this question and generate more convenient dates for SQL in the format YYYY-MM-DD / YYYY-MM-DD HH:MI:SS / unix timestamp.
More info:
https://bitbucket.org/asomov/snakeyaml/wiki/Documentation
https://bitbucket.org/asomov/snakeyaml/src/tip/src/test/java/examples/resolver/CustomResolverTest.java?fileviewer=file-view-default
https://www.w3schools.com/sql/sql_dates.asp
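A sketch of the YYYY-MM-DD HH:MI:SS option using java.time. SqlDates is a hypothetical helper, and the choice of time zone is left to the caller because the ticket does not settle it.

```java
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;

final class SqlDates {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Renders a java.util.Date as a quoted SQL timestamp literal in the given zone.
    static String toSql(Date date, ZoneId zone) {
        return "'" + FORMAT.withZone(zone).format(date.toInstant()) + "'";
    }
}
```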
Now it is possible to set up the same name for the PK and for a foreign key within one table. This weird scenario should be checked and an exception should be thrown
Refer to external files with SQL in before-cleanup, before-inserts and after-inserts.
That would allow keeping the YAML cleaner if big custom SQL scripts are required.
The $TABLE_NAME placeholder still must be replaced.
If there are tables in the DB which don't have fixtures it's good to have ability to clean them anyway.
Currently, the workaround is to create an empty fixture so the clean up instruction will be executed.
Though, the more elegant way is to specify such tables right in the config:
clean_tables: ['logs', 'statistics', 'garbage', 'slow_sql']
The instructions for cleaning these tables should go first. The order should be preserved, the duplicates should be skipped.
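The ordering and de-duplication rule could be sketched as follows. CleanTables is a hypothetical name; DELETE is used as the clean-up instruction, and table-name quoting per dialect is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class CleanTables {
    // LinkedHashSet keeps the first occurrence of each table and preserves order.
    static List<String> cleanupSql(List<String> tables) {
        List<String> sql = new ArrayList<>();
        for (String table : new LinkedHashSet<>(tables)) {
            sql.add("DELETE FROM " + table + ";");
        }
        return sql;
    }
}
```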
See https://www.postgresql.org/docs/current/static/sql-truncate.html
Might be really useful when, between test launches, a table (users) refers to another table (logs), and this referred table (logs) does not have a fixture file and therefore does not get cleaned up.
In this case, the standard DELETE or TRUNCATE fails because of the foreign key restriction.
Yaml has two valid null types: ~ and null, as per the SnakeYAML documentation.
Both currently translate to null in the output SQL code. We need to uppercase null so it becomes NULL and to cover this scenario with tests:
~ -> NULL
null -> NULL
Null -> NULL
nULL -> NULL
nulla -> 'nulla'
WHEN JFixtures is auto generating a new PK for a row in the table,
AND a row in the table with such a PK already exists,
THEN the auto generator should generate a new value (next in sequence) and so on
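A minimal sketch of such a sequence, assuming the set of PKs already taken by user-defined rows is known up front. SkippingSequence is a hypothetical name and is not the existing IncrementalSequence.

```java
import java.util.Set;

final class SkippingSequence {
    private final Set<Integer> taken;
    private int next;

    SkippingSequence(int start, Set<Integer> taken) {
        this.next = start;
        this.taken = taken;
    }

    // Returns the next free value, stepping over PKs that already exist.
    int next() {
        while (taken.contains(next)) {
            next++;
        }
        return next++;
    }
}
```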
README should have the overall description of the product;
Documentation should go to WIKI
Custom SQL fragments before/after the generation of all fixtures, in case the user needs to do something common or special in the database.
There are a few ways to define these fragments:
- .before/
    01-clean-logs.sql
    02-recreate-sequences.sql
    03-apply-migrations.sql
- .after/
    01-do-something.sql
(files are sorted by name ASC and then executed)
Now foreign key refers only to a PK of referred table, though, FK in SQL can point to any column of referred table
Problem:
To be able to write custom SQL before the cleanup instruction (if cleanup is disabled, then before-cleanup means the same as before-inserts):
Possible configuration:
tables:
  transactional_users:
    applies_to: users, users_profiles
    before-cleanup:
      - // Beginning of the $TABLE_NAME
      - BEGIN TRANSACTION;
The before-cleanup section could be a single value or a list (or a list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.
The order of instructions must not be changed; duplicates (if any) must not be removed.
Problem:
To be able to write custom SQL after the insert instructions
Possible configuration:
tables:
  transactional_users:
    applies_to: users, users_profiles
    after_inserts:
      - // Completed table $TABLE_NAME
      - COMMIT TRANSACTION;
The after-insert section could be a single value or a list (or a list of lists with any depth). The $TABLE_NAME placeholder should be replaced with the real table name.
The order of instructions must not be changed; duplicates (if any) must not be removed.