GithubHelp home page GithubHelp logo

Comments (6)

wangweicugw avatar wangweicugw commented on June 23, 2024

I think there might be an issue with the TryExecute method in go/vt/vtgate/engine/scalar_aggregation.go. To illustrate with my current example, consider a situation where there is only one data entry, but suppose it is split across two shards. In this case, one shard would return null, null, like this

mysql> SELECT empno, MAX(sal) FROM emp;
+-------+----------+
| empno | max(sal) |
+-------+----------+
|  NULL |     NULL |
+-------+----------+
1 row in set (0.01 sec)

and the other shard would return 1, 5000. , like this

mysql> SELECT empno, MAX(sal) FROM emp;
+-------+----------+
| empno | max(sal) |
+-------+----------+
|     1 |     5000 |
+-------+----------+
1 row in set (0.01 sec)

If null, null is in the first row of result.Rows, the final result would be null, 5000. Conversely, if 1, 5000 is in the first row of result.Rows, the final result would be 1, 5000.

from vitess.

mattlord avatar mattlord commented on June 23, 2024

This is not obviously a Vitess bug to me. I say that as the results for the non-aggregated column in the query you ran are not guaranteed in SQL. If you tried to execute the query directly on one of the mysqld instances you should get this error (same type of query used here with the local example's: ./101_initial_cluster.sh):

mysql> select sku, max(price) from product;
ERROR 1140 (42000): In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'vt_commerce.product.sku'; this is incompatible with sql_mode=only_full_group_by

When I execute this same query in vtgate (v20-dev):

mysql>  select sku, max(price) from product;
ERROR 1140 (42000): target: commerce.0.primary: vttablet: rpc error: code = InvalidArgument desc = In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'vt_commerce.product.sku'; this is incompatible with sql_mode=only_full_group_by (errno 1140) (sqlstate 42000) (CallerID: userData1): Sql: "select sku, max(price) from product", BindVars: {}

Are you explicitly modifying MySQL's sql_mode in your setup to remove the default only_full_group_by flag? I can't try and repeat what you saw as you did not provide a full test case (e.g. using the local examples). Each Vitess cluster is a bit of a snowflake and I have no details on yours.

from vitess.

mattlord avatar mattlord commented on June 23, 2024

To be more clear, this is not guaranteed in SQL -- the language itself (not just Vitess or MySQL). SQL is declarative. And the results you declared in your statement are:
select, any sku column value, the maximum value in the price column, from the employee table

This is the kind of statement that MySQL, many, many years ago, would let you execute w/o any warnings or errors even though the results are not deterministic. The results for that are not deterministic as written, thus them not being deterministic in Vitess is not in and of itself a bug — the same is true in a single stand-alone mysqld instance. https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html

from vitess.

wangweicugw avatar wangweicugw commented on June 23, 2024

Yes, I changed the sql_mode. Apologies for not mentioning it earlier. I set it globally as follows:

SET GLOBAL sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';

from vitess.

wangweicugw avatar wangweicugw commented on June 23, 2024

Actually, this SQL comes from here; the original SQL was SELECT empno, MAX(sal) FROM emp HAVING COUNT(*) > 3. I discovered that this SQL could not pass in my test case, so I simplified the SQL and found that it still could not pass.

using my current example, MySQL returns a consistent result no matter how many times I execute it.

from vitess.

mattlord avatar mattlord commented on June 23, 2024

If you look at the test executing that query, and the function it uses, you will see that it's confirming in this case that we get the same error when executing it against mysqld and against vtgate:

func (mcmp *MySQLCompare) ExecAllowAndCompareError(query string) (*sqltypes.Result, error) {
	mcmp.t.Helper()
	vtQr, vtErr := mcmp.VtConn.ExecuteFetch(query, 1000, true)
	mysqlQr, mysqlErr := mcmp.MySQLConn.ExecuteFetch(query, 1000, true)
	compareVitessAndMySQLErrors(mcmp.t, vtErr, mysqlErr)

	// Since we allow errors, we don't want to compare results if one of the client failed.
	// Vitess and MySQL should always be agreeing whether the query returns an error or not.
	if vtErr == nil && mysqlErr == nil {
		vtErr = compareVitessAndMySQLResults(mcmp.t, query, mcmp.VtConn, vtQr, mysqlQr, false)
	}
	return vtQr, vtErr
}

And as I showed in my example, they do.

I discovered that this SQL could not pass in my test case, so I simplified the SQL and found that it still could not pass.

I don't know what this means.

using my current example, MySQL returns a consistent result no matter how many times I execute it.

Sure, if the state in that table does not change. Just as in if the rows came back from the shards in the same order every time in vtgate you would get the same results. The point is that the query as you wrote it is non-deterministic so you cannot expect it to be deterministic.

I'm going to close this as there's no Vitess bug here that I can see. If you feel it was closed in error, please add a comment with a clear description of what you feel the bug is.

from vitess.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.