[Bug] [savepoints] CDCSOURCE: the operator graph generated on each task submission is inconsistent, so restoring from a savepoint fails with a serialization error. (about dinky, CLOSED)

tttzzzwww commented on August 30, 2024
[Bug] [savepoints] CDCSOURCE: the operator graph generated on each task submission is inconsistent, so restoring from a savepoint fails with a serialization error.

from dinky.

Comments (14)

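Zzm0809 commented on August 30, 2024

> After studying the source code that generates the operator graph, I found that dlink stores the tables to be synchronized in a HashMap, which is unordered, so the operator graph generated on each submission is inevitably unordered as well. Even if this were changed to ordered storage, dlink also supports selecting tables by regular expression, for example ${schema}.*, which means the order of the generated operator graph still cannot be guaranteed to be the same every time. I tried modifying the source code: without regular expressions there is no problem in my tests and the generated operator graph is the same every time, but this does not address the root cause. This one is a tough nut to crack...
> [image: operator graph generated in synced-table order]

What about sorting once after the regex matching?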

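tttzzzwww commented on August 30, 2024

>>>> After studying the source code that generates the operator graph, I found that dlink stores the tables to be synchronized in a HashMap, which is unordered, so the operator graph generated on each submission is inevitably unordered as well. Even if this were changed to ordered storage, dlink also supports selecting tables by regular expression, for example ${schema}.*, which means the order of the generated operator graph still cannot be guaranteed to be the same every time. I tried modifying the source code: without regular expressions there is no problem in my tests and the generated operator graph is the same every time, but this does not address the root cause. This one is a tough nut to crack...
>>>> [image: operator graph generated in synced-table order]

>>> What about sorting once after the regex matching?

>> That works. Sorting the final matched result set directly means the intermediate processing no longer matters. Source location:
>> [image: result-set sorting]

> You could change the code and test it to see whether the build result and the savepoint behave as expected. Contributions of this improvement are welcome.

Testing so far shows no problems. The source change is small, I only added sorting, as follows:
[image: diff of the change]
Still, let's wait for the author's reply; from the context I can't tell why sorting was not added and why a HashMap is used for storage.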
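
The fix discussed here, sorting the final matched result set, might look roughly like the Java sketch below. It is an illustration under stated assumptions and not the actual Dinky patch (which is only visible in the screenshot above): the class `SortedTableMatcher`, the method `matchAndSort`, and the sample table names outside the `jira` schema are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SortedTableMatcher {

    // Hypothetical helper: keeps the table names that fully match the regex
    // and returns them in natural (lexicographic) order, so the result does
    // not depend on the iteration order of the input collection.
    static List<String> matchAndSort(Collection<String> discoveredTables, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return discoveredTables.stream()
                .filter(name -> pattern.matcher(name).matches())
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A HashSet stands in for any source whose iteration order is unspecified.
        Collection<String> discovered = new HashSet<>(Arrays.asList(
                "jira.project", "jira.label", "crm.account", "jira.jiraissue"));
        // Prints the jira tables in sorted order, however the set iterates.
        System.out.println(matchAndSort(discovered, "jira\\..*"));
    }
}
```

Because the sort happens on the final list, every submission with the same set of matched tables yields the same sequence, which is the property the savepoint restore appears to depend on in this thread.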

github-actions commented on August 30, 2024

Hello @tttzzzwww, this issue is about CDC/CDCSOURCE, so I assign it to @aiwenmo. If you have any questions, you can comment and reply.

Zzm0809 commented on August 30, 2024

Adding a table changes the operator graph, so the restore will certainly fail. You need to specify a parameter.

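tttzzzwww commented on August 30, 2024

Sorry, I described it wrong. No table was added; during testing the number of tables did not change. The procedure was the same throughout: with two tables synchronized, the recorded savepoint file restores the cancelled job normally. But if the test is changed to three tables (submit first, record a savepoint, then resubmit from that savepoint), it reports a serialization failure. Whether with two or three tables, the number of tables was fixed from the start of each run. If the number of tables changes, I regenerate the savepoint file before restoring.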

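tttzzzwww commented on August 30, 2024

However, prompted by your reminder, I just looked at the generated operator graphs. For example, when synchronizing the tables jira.jiraaction, jira.fileattachment, jira.issuestatus, jira.jiraissue, jira.label, jira.issuetype, jira.project, the operator graph from the first submission is ordered as follows:

[image: job graph from the first submission]

Operator graph when restoring from the savepoint:
[image: job graph of the failed job]

Although the tables are the same, the order of the tables at the bottom of the operator graph differs.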

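tttzzzwww commented on August 30, 2024

Testing against the operator-graph order: two tables do not restore successfully 100% of the time. Two tables have only 2 possible orderings, so the chance that a resubmission produces the same operator-graph order is 50%. Once more than two tables are synchronized, the number of possible orderings grows quickly, so the probability of a successful restore becomes very small.
Conclusion: restoring from the savepoint fails because the operator graph is not generated in the order of the tables specified in the SQL.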
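
The arithmetic behind this estimate can be sketched as follows. It assumes, as the comment does, that every ordering of the tables is equally likely on each submission; that is a simplification for illustration, since a hash-based map does not promise a uniformly random order. The class and method names are hypothetical.

```java
public class RestoreOdds {

    // Number of distinct orderings of n tables, i.e. n factorial.
    // A long is enough for the small table counts used here (n <= 20).
    static long orderings(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        // 2 tables, 3 tables, and the 7 jira tables listed earlier in the thread.
        for (int n : new int[] {2, 3, 7}) {
            System.out.println(n + " tables: 1 in " + orderings(n)
                    + " orderings matches the savepoint");
        }
    }
}
```

Under that assumption the chance of a matching order drops from 1 in 2 for two tables to 1 in 6 for three and 1 in 5040 for seven.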

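tttzzzwww commented on August 30, 2024

After studying the source code that generates the operator graph, I found that dlink stores the tables to be synchronized in a HashMap, which is unordered, so the operator graph generated on each submission is inevitably unordered as well. Even if this were changed to ordered storage, dlink also supports selecting tables by regular expression, for example ${schema}.*, which means the order of the generated operator graph still cannot be guaranteed to be the same every time. I tried modifying the source code: without regular expressions there is no problem in my tests and the generated operator graph is the same every time, but this does not address the root cause. This one is a tough nut to crack...
[image: operator graph generated in synced-table order]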
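
The difference between unordered and ordered storage described here can be sketched in Java. This is an illustration only, not dlink's actual code: the class `TableOrderDemo`, its method names, and the `sink_` values are hypothetical. `HashMap` makes no guarantee about iteration order, while `LinkedHashMap` iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TableOrderDemo {

    // Registers tables in a HashMap; the iteration order is unspecified
    // and need not match the order in which the tables were listed.
    static List<String> hashOrder(List<String> tables) {
        Map<String, String> sinks = new HashMap<>();
        for (String table : tables) {
            sinks.put(table, "sink_" + table);
        }
        return new ArrayList<>(sinks.keySet());
    }

    // Registers tables in a LinkedHashMap; iteration follows insertion order.
    static List<String> insertionOrder(List<String> tables) {
        Map<String, String> sinks = new LinkedHashMap<>();
        for (String table : tables) {
            sinks.put(table, "sink_" + table);
        }
        return new ArrayList<>(sinks.keySet());
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList(
                "jira.jiraaction", "jira.fileattachment", "jira.issuestatus");
        System.out.println("HashMap:       " + hashOrder(tables));
        System.out.println("LinkedHashMap: " + insertionOrder(tables));
    }
}
```

As the comment points out, insertion order only helps when the tables are listed explicitly; when the list comes from a regex match, the input order itself is not fixed, so ordered storage alone is not enough.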

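tttzzzwww commented on August 30, 2024

>> After studying the source code that generates the operator graph, I found that dlink stores the tables to be synchronized in a HashMap, which is unordered, so the operator graph generated on each submission is inevitably unordered as well. Even if this were changed to ordered storage, dlink also supports selecting tables by regular expression, for example ${schema}.*, which means the order of the generated operator graph still cannot be guaranteed to be the same every time. I tried modifying the source code: without regular expressions there is no problem in my tests and the generated operator graph is the same every time, but this does not address the root cause. This one is a tough nut to crack...
>> [image: operator graph generated in synced-table order]

> What about sorting once after the regex matching?

That works. Sorting the final matched result set directly means the intermediate processing no longer matters. Source location:
[image: result-set sorting]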

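Zzm0809 commented on August 30, 2024

>>> After studying the source code that generates the operator graph, I found that dlink stores the tables to be synchronized in a HashMap, which is unordered, so the operator graph generated on each submission is inevitably unordered as well. Even if this were changed to ordered storage, dlink also supports selecting tables by regular expression, for example ${schema}.*, which means the order of the generated operator graph still cannot be guaranteed to be the same every time. I tried modifying the source code: without regular expressions there is no problem in my tests and the generated operator graph is the same every time, but this does not address the root cause. This one is a tough nut to crack...
>>> [image: operator graph generated in synced-table order]

>> What about sorting once after the regex matching?

> That works. Sorting the final matched result set directly means the intermediate processing no longer matters. Source location:
> [image: result-set sorting]

You could change the code and test it to see whether the build result and the savepoint behave as expected. Contributions of this improvement are welcome.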

aiwenmo commented on August 30, 2024

Sorry, I've been a little busy recently and didn't reply in time.
Sorting was not added and LinkedHashMap was not used for storage because earlier tests overlooked this situation.
Thank you very much for your research and answers. I agree with your solution, and you are welcome to submit a PR to the dev branch. If you run into any problems along the way, feel free to keep discussing them here.

aiwenmo commented on August 30, 2024

Hi. Are you willing to contribute code? If you have no time, I will fix the problem.

tttzzzwww commented on August 30, 2024

> Hi. Are you willing to contribute code? If you have no time, I will fix the problem.

Sorry, my project has been on a tight schedule these days /(ㄒoㄒ)/~~ and I've been working overtime. Could you, as the author, submit the fix instead?

Zzm0809 commented on August 30, 2024

>> Hi. Are you willing to contribute code? If you have no time, I will fix the problem.

> Sorry, my project has been on a tight schedule these days /(ㄒoㄒ)/~~ and I've been working overtime. Could you, as the author, submit the fix instead?

You are on the 0.7 branch, which is no longer maintained. The change can be made on the dev branch.
