prestodb / tempto
A testing framework for Presto
License: Apache License 2.0
Convention-based tests implemented as a single file cannot, for some reason, be selected with the -t flag.
HDFS should not be required to run tests using Tempto. This splits into two things: the tempto-configuration.yaml configuration should not require any properties in the hdfs section, and the tests.hdfs section should not be mandatory either. SELECT * FROM nation fails when no hdfs properties are defined.
Prerequisites: (files attached with the .txt extension because of GitHub limitations)
Steps to reproduce:
Expected result:
Tests ran (with a failure because the output did not match the test1.result file)
Actual result:
A bunch of errors complaining about missing configuration (which should not be required):
No implementation for java.lang.Integer annotated with @com.google.inject.name.Named(value=hdfs.webhdfs.port) was bound.
No implementation for java.lang.String annotated with @com.google.inject.name.Named(value=hdfs.username) was bound.
No implementation for java.lang.String annotated with @com.google.inject.name.Named(value=hdfs.username) was bound.
No implementation for java.lang.String annotated with @com.google.inject.name.Named(value=hdfs.webhdfs.host) was bound.
No implementation for java.lang.String annotated with @com.google.inject.name.Named(value=tests.hdfs.path) was bound.
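A minimal sketch of what a Tempto configuration without any hdfs section might look like for purely JDBC-based tests (the presto connection properties below are illustrative, not taken from this report):

```yaml
# Sketch only: no hdfs or tests.hdfs section present.
databases:
  default:
    alias: presto
  presto:
    jdbc_driver_class: com.facebook.presto.jdbc.PrestoDriver
    jdbc_url: jdbc:presto://localhost:8080/hive/default
    jdbc_user: test
```

With a configuration shaped like this, SELECT * FROM nation should run without any of the hdfs.* bindings listed above being required.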
In certain cases the data file should be able to mark the datatype of the data columns. For example, in Teradata, varbyte (similar to varbinary) columns can only take values that have been explicitly marked as binary.
create table tab1(col1 integer, col2 varbyte(100));
insert into tab1 values(1, '0ab'xb);
select * from tab1;
col1 col2
1 0AB0
Now, in order for Tempto to pass this in a .data file, we will need some support for notation like this:
1|'0ab'xb
If you simply put the values
1|0ab
you will get this error while inserting the data:
[Teradata Database] [TeraJDBC 15.10.00.07] [Error 3532] [SQLState 22018] Conversion between BYTE data and other types is illegal.
When the last query in a convention test file ends with ; then tempto raises the following error:
com.teradata.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20160803_062105_00016_sucsx): line 18:12: extraneous input ';' expecting {<EOF>, '.', ',', '[', 'LIMIT', 'APPROXIMATE', 'AT', 'OR', 'AND', 'IN', 'NOT', 'BETWEEN', 'LIKE', 'IS', 'NULLS', 'ASC', 'DESC', '=', NEQ, '<', '<=', '>', '>=', '+', '-', '*', '/', '%', '||'}
at com.teradata.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:116)
at com.teradata.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:86)
When the data for a table is short, having the DDL, data, and data revision information stored in separate files seems unnecessary.
Currently one has to specify column types in the .data file for convention tables managed by JdbcTableManager.
We could use the standard INFORMATION_SCHEMA to obtain the actual column types.
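For example, the actual column types could be obtained with a standard query along these lines (the schema and table names here are illustrative):

```sql
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'default'
  AND table_name = 'nation'
ORDER BY ordinal_position;
```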
To test PostgreSQL, MySQL, and other Presto connectors that work with external JDBC compliant databases, Tempto needs a table provisioner that can insert data via JDBC.
When no tests scheduled for execution use a database configured in the YAML configuration, that database should not be connected to.
E.g. given this tempto yaml:
databases:
  default:
    alias: presto
  hive:
    host: hadoop-master
  mysql:
    jdbc_driver_class: com.mysql.jdbc.Driver
    jdbc_url: jdbc:mysql://mysql:3306/test
    jdbc_user: root
when running only Hive tests (e.g. TestAllDatatypesFromHiveConnector.java), the MySQL instance shouldn't be expected to be running.
If we want to define a list in the YAML config like:
slaves:
the @Named annotation won't work, since it is only for String, and injecting this into a test thus fails.
There should be a way to register cleanup methods to be invoked when a test method ends (normally or abnormally).
Rationale: prestodb/presto#4985 (comment)
Things to consider:
What to do when a cleanup method throws?
Possible solutions:
There should be support for running a single test that is tagged under the QUARANTINE group when it is specified via the -t <testname> option. At present, if you try to run such a test explicitly, the test doesn't run. For example, when I tried running testInsertIntoValuesToHiveTableAllHiveSimpleTypes, I expected it to run even though it is in the QUARANTINE group, since I was specifying the name of the test explicitly:
./presto-product-tests/bin/run_on_docker.sh singlenode -t testInsertIntoValuesToHiveTableAllHiveSimpleTypes
This is the output I got:
INFO: 0 SUCCEEDED / 0 FAILED / 0 SKIPPED
===============================================
tempto-tests
Total tests run: 0, Failures: 0, Skips: 0
===============================================
Currently the only way to specify a schema is to append it to the JDBC URL in the configuration file. This means that you need to define a different database for each schema that you want to use with a given connection. It would be useful if the functions that create and access tables let you specify the schema to create tables in, for when you don't want to use the default.
Currently a new SSH connection is created and then closed for every remote command. Establishing an SSH connection is time-consuming; it would be nice to reuse one SSH connection for several commands.
Singleton scope only works per injector, and tempto uses a different injector for each context on the stack.
When injecting a configuration entry into a test:
@Inject
@Named("some.conf")
String someConf;
where the entry is given in the configuration YAML files, tempto returns an error message with no details in it:
Exception in thread "main" java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at java.util.Optional.<init>(Optional.java:96)
at java.util.Optional.of(Optional.java:108)
at com.teradata.tempto.internal.configuration.MapConfiguration.getObject(MapConfiguration.java:87)
at com.teradata.tempto.internal.configuration.MapConfiguration.get(MapConfiguration.java:77)
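The stack trace points at Optional.of, which throws a bare, message-less NullPointerException when handed the null returned for a missing (or null-valued) map entry. A minimal standalone sketch of that failure mode (the "some.conf" key is just illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class OptionalNpeSketch {
    public static void main(String[] args) {
        Map<String, Object> configuration = new HashMap<>();
        // The "some.conf" entry is missing, so get() returns null.
        Object value = configuration.get("some.conf");
        // Optional.ofNullable handles the missing key gracefully...
        System.out.println(Optional.ofNullable(value).isPresent()); // false
        // ...but Optional.of(null) throws a NullPointerException with no
        // message, which matches the detail-free error reported above.
        try {
            Optional.of(value);
        } catch (NullPointerException e) {
            System.out.println("NPE message: " + e.getMessage()); // NPE message: null
        }
    }
}
```

Using Optional.ofNullable for the lookup, or failing with an explicit "missing configuration entry: some.conf" message, would make the error actionable.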
Header lines in .data files for Hive tables are not interpreted. Instead, they end up in the target table as data. This is inconsistent with how .data files for jdbc tables are handled.
To reproduce: define a table with type: hive and a single char(30) column, plus a .data file containing a header line, e.g. -- trimValues: false; types: CHAR
I am unable to add a test for the map aggregate histogram function because tempto does not support having a query return a map type.
Query:
presto:default> select histogram(n_regionkey) from nation;
_col0
---------------------------
{0=5, 1=5, 2=5, 3=5, 4=5}
(1 row)
Query 20150909_163252_00146_vfjfr, FINISHED, 2 nodes
Splits: 2 total, 2 done (100.00%)
0:00 [25 rows, 2.17KB] [125 rows/s, 10.9KB/s]
But when run with Tempto:
2015-09-09 23:10:08 DEBUG [MapAggregateTests.testSimpleHistogramInt_1441819507415] c.t.t.i.i.TestInitializationListener - test failure java.lang.RuntimeException: Unsupported sql type JAVA_OBJECT
at com.teradata.tempto.internal.query.QueryResultValueComparator.compare(QueryResultValueComparator.java:100)
at com.teradata.tempto.assertions.QueryAssert.rowsEqual(QueryAssert.java:334)
at com.teradata.tempto.assertions.QueryAssert.containsExactly(QueryAssert.java:227)
at com.teradata.tempto.assertions.QueryAssert.containsExactly(QueryAssert.java:247)
at com.facebook.presto.tests.functions.MapAggregateTests.testSimpleHistogramInt(MapAggregateTests.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
at org.testng.SuiteRunner.run(SuiteRunner.java:240)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1149)
at org.testng.TestNG.run(TestNG.java:1057)
at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:125)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Update docs to include this:
Traceback (most recent call last):
  File "/usr/local/bin/docker-compose", line 9, in <module>
    load_entry_point('docker-compose==1.6.2', 'console_scripts', 'docker-compose')()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 337, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2279, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1989, in load
    entry = __import__(self.module_name, globals(), globals(), ['__name__'])
  File "/usr/local/lib/python2.7/dist-packages/compose/cli/main.py", line 18, in <module>
    from ..config import config
  File "/usr/local/lib/python2.7/dist-packages/compose/config/__init__.py", line 5, in <module>
    from .config import ConfigurationError
  File "/usr/local/lib/python2.7/dist-packages/compose/config/config.py", line 33, in <module>
    from .validation import match_named_volumes
  File "/usr/local/lib/python2.7/dist-packages/compose/config/validation.py", line 12, in <module>
    from jsonschema import Draft4Validator
  File "/usr/local/lib/python2.7/dist-packages/jsonschema/__init__.py", line 12, in <module>
    from jsonschema.exceptions import (
  File "/usr/local/lib/python2.7/dist-packages/jsonschema/exceptions.py", line 6, in <module>
    from jsonschema import _utils
  File "/usr/local/lib/python2.7/dist-packages/jsonschema/_utils.py", line 6, in <module>
    from jsonschema.compat import str_types, MutableMapping, urlsplit
  File "/usr/local/lib/python2.7/dist-packages/jsonschema/compat.py", line 39, in <module>
    from functools32 import lru_cache
ImportError: No module named functools32
With edbbff2, a tempto configuration is necessary even for unit tests, which is very inconvenient. We should make it not mandatory unless it is used.
Regardless of the settings in test-configuration-local.yaml for the schema tag in hive and the jdbc_url in presto, all tables in the default schema are dropped before tests are run.
For example, when using presto and jdbc_url is set to jdbc:presto://10.25.191.40:8080/hive/default all tables in the schema are dropped. When jdbc_url is set to jdbc:presto://10.25.191.40:8080/hive/web, so that the schema used is web, the web schema retains all of its tables, and all tables in default are still dropped.
With Hive the same issue occurs - when schema is set to default, all tables in the schema are dropped. When schema is set to web, all tables in default are still dropped, and web retains its original tables.
When I try to import the project from Gradle I get this message:
No such property: nexusUsername for class: org.gradle.api.publication.maven.internal.ant.DefaultGroovyMavenDeployer Consult IDE log for more details (Help | Show Log)
It looks like there is a problem in the build.gradle file.
There should be a way to define convention-based SQL tests outside resources (to avoid recompiling when they change). We could add a property to the tempto-configuration.yaml file to configure the path to these external tests.
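A sketch of how such a property could look in tempto-configuration.yaml (the property name tests.sql.dirs is hypothetical, not an existing Tempto setting):

```yaml
# Hypothetical property; the actual name would be decided in the implementation.
tests:
  sql:
    dirs:
      - /home/user/my-project/sql-tests
```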
If you specify a requirement for one test file, all other tests will have those requirements fulfilled as well. If you want to specifically write a test without that requirement, the test will work if you run it individually, but not if you run the whole suite.
You could solve this issue by having another test suite, but if you start getting a lot of interdependencies, it would start getting complicated (and you'd have a lot of test suites).
Though tempto now has support for specifying the schema (thanks!), Teradata specifies schemas differently -- it calls schemas "databases" and has different syntax. However, the schema support is baked into JdbcTableManager and assumes that you can create schemas via CREATE SCHEMA IF NOT EXISTS. We need a way to specify database-specific SQL for creating schemas (or generally overriding how the table manager works).
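To illustrate the divergence (exact Teradata options vary by installation; treat the second statement as a sketch):

```sql
-- Standard syntax assumed by JdbcTableManager:
CREATE SCHEMA IF NOT EXISTS test_schema;

-- Teradata calls schemas "databases" and uses different DDL,
-- along the lines of:
CREATE DATABASE test_schema AS PERM = 0;
```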
The schema tag in the hive section of test-configuration-local.yaml does not affect what schema is used.
With two schemas, default and web, each containing tables test0 and test1, and the following query:
select count(*) from test0
In the first case, when the schema tag is set to default, the result of the query is a failed test, as all tables in default are dropped, as described in SWARM-708.
In the second case, when the schema tag is set to web, the test also fails, as the tables were dropped from default.
When the query is instead
select count(*) from web.test0
the test passes, and when the query is
select count(*) from default.test0
the test fails, again because the tables have been dropped.
If the user does not need to create any tables, a TableManager should not be mandatory (read-only tests).
A simple scenario showing why the current approach is problematic:
Prerequisites:
Expected result:
Test ran (probably failing because of wrong results, but at least it ran)
Actual result:
7) No implementation for java.util.Map<java.lang.String, com.teradata.tempto.fulfillment.table.TableManager> was bound.
[currently rest of the errors as in https://github.com//issues/105]
When a JdbcTableDefinition has a data source that returns no rows, a NullPointerException is thrown when executing the test case.
Steps to reproduce: create a JdbcTableDefinition with a JdbcTableDataSource whose iterator is over an empty collection.
Actual behavior: a NullPointerException is thrown.
Expected behavior: the test is executed against an empty table, as defined in the JdbcTableDefinition.
According to #96 (comment), a better default tolerance should be used.
It is suggested to use a tolerance equal to the difference between the expected floating-point value and its nearest representable float; when the user provides a tolerance explicitly, that one should be used.
E.g. when the user compares the value 0.2, the next representable floating-point value bigger than it is 0.20000001788139343262 and the next representable floating-point value smaller than it is 0.19999998807907104492, so we should pick a tolerance as small as 0.0000000119209... (the difference between 0.2 and the representable floating-point value closer to it).
On the other hand, when comparing doubles (binary64) the tolerance is different, because the next representable double after 0.2 is 0.20000000000000003886 and the one before it is 0.19999999999999998335, so the tolerance in that case should be 2.77...E-17.
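The neighboring representable values quoted above can be inspected directly with java.lang.Math, which also gives the gap sizes; a small sketch:

```java
public class ToleranceSketch {
    public static void main(String[] args) {
        float f = 0.2f; // the float nearest to decimal 0.2
        // Neighboring representable floats around 0.2f:
        System.out.println(Math.nextUp(f));   // ~0.20000002
        System.out.println(Math.nextDown(f)); // ~0.19999999
        // Gap between adjacent floats near 0.2 (about 1.49E-8):
        System.out.println(Math.ulp(f));
        // Gap between adjacent doubles near 0.2 (about 2.78E-17),
        // matching the 2.77...E-17 tolerance suggested above:
        System.out.println(Math.ulp(0.2));
    }
}
```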
Typically developers use only the tempto runner executable jar. It would be great to integrate the SQL results generator into the main tempto runner, so that a developer can easily write a convention test SQL file and then generate the result file with the same binary that is used for executing the tests.
I am trying to build tempto using the command ./gradlew install -x signArchives and I get the following error:

FAILURE: Build failed with an exception.

* Where:
Build file '/home/anu/workspace/tempto/build.gradle' line: 161

* What went wrong:
A problem occurred evaluating root project 'tempto'.
No such property: nexusUsername for class: org.gradle.api.publication.maven.internal.ant.DefaultGroovyMavenDeployer

* Try:
Run with the --stacktrace option to get the stack trace. Run with the --info or --debug option to get more log output.

BUILD FAILED
Any ideas?
Right now --help says that this flag takes a URI as its parameter. In fact, the URI form is accepted on Mac OS X but not on GNU/Linux; on the other hand, only GNU/Linux accepts a plain path for this parameter.
It would be nice to clean this up, to avoid needing two different commands for the two platforms and to have a correct --help.
If the query select * from system.jdbc.tables fails, then Tempto is unable to run the tests. This happens when, under some configurations, the connectors do not all report all of their tables.
Exception in thread "main" java.lang.RuntimeException: java.sql.SQLException: Query failed (#20160324_203009_00028_2wc4a): Cannot list tables in all databases
at com.teradata.tempto.internal.fulfillment.table.AbstractTableManager.dropAllMutableTables(AbstractTableManager.java:60)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at com.teradata.tempto.internal.fulfillment.table.MutableTablesCleaner.fulfill(MutableTablesCleaner.java:38)
at com.teradata.tempto.internal.initialization.TestInitializationListener.lambda$doFulfillment$44(TestInitializationListener.java:311)
at com.teradata.tempto.context.TestContextDsl.runWithTestContext(TestContextDsl.java:51)
There should be a way to define test groups hierarchically.
E.g. hive_tests with two children: long_running and joins.
This would make excluding/choosing tests easier. E.g. a super-group kerberos would be excluded for non-kerberized environments, while those working on kerberos could run just a subset of the tests, e.g. kerberos-impersonation only.
Invoking this method should cause row order to be ignored during result matching.
While running the presto-product-tests on Docker I specified both of these arguments by mistake, and the tests started running. I think specifying both doesn't make sense, and we should error out if a user does that. The exact command I ran was presto-product-tests/bin/run_on_docker.sh singlenode -x quarantine,big_query -g hive_connector and all tests ran.
When a .result file is processed, all empty lines are filtered out.
This results in comparison failures when joinAllRowsToOne is used in a .result file and the expected result actually should contain empty lines.
Issue #50 added support for lists of Strings; it would be nice to have support for lists of Integers too. E.g. config:
a:
  - 1
  - 2
  - 3
It would be useful to have a table manager for Presto that does not depend on HDFS and Hive. This will require loading data via individual INSERT statements. Obviously performance will be poor, but for very small tables it should be acceptable.
Some tables (or views) may not require data, as they may use a query to provide it. Like:
CREATE TABLE %name% AS SELECT * FROM other_table;
[2015-11-09 08:20:09] [] 10 SUCCEEDED / 13 FAILED / 0 SKIPPED
===============================================
tempto-tests
Total tests run: 23, Failures: 13, Skips: 0
===============================================
(py27-hfab)kogut@haxu:~/tempto/tempto-examples$ echo $?
0
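The run above reports 13 failures yet exits with status 0, so CI scripts cannot detect the failure from $?. A minimal sketch of the intended contract, with `false` standing in for a failed tempto run:

```shell
# A failing run should propagate a non-zero exit status to the caller.
false   # stand-in for: java -jar tempto-runner.jar ... (with test failures)
status=$?
echo "exit status: $status"   # prints: exit status: 1
```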
Sometimes it could be useful to use tempto with no databases statically set in tempto-configuration.yaml. Currently tempto fails with such a configuration.