didi / es-fastloader
Quickly build large-scale ElasticSearch indices by using the fault tolerance and parallelism of Hadoop
License: Apache License 2.0
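I have a few questions, thanks.
1. Packaging the ES plugin
The wiki page "wiki/使用文档" (usage documentation) says: "After modifying the ES code, just copy the mr directory into the modules directory of the ES source tree and run the ES build-and-package command to get an ES distribution with the appendLucene plugin installed."
At first I copied the ES-Fastloader\mr directory in and ran the build, but no jar was produced. That seemed wrong, so I copied ES-Fastloader\plugin into the modules directory of the ES source tree instead, and the build then produced a jar: modules\plugin\build\distributions\plugin-6.6.1-SNAPSHOT.jar.
Can you confirm that ES-Fastloader\plugin is the directory to use?
2. Deploying ES
Do I have to deploy the ES build with the modified source, or can I deploy an unmodified ES?
3. Usage scenario
The MR job takes esTemplate and time as input parameters when it is run through hadoop. From the code, the index name is composed of these two parts, for example mytable_20200715.
If my existing index name is not composed this way, for example it is just mytable, does that mean I cannot use the tool and have to rebuild the index?
Also, the WeChat QR code on the home page has expired. Could you update it?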
If we use reflection to get the InternalEngine and IndexWriter, we do not need to modify the Elasticsearch source code.
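To illustrate the reflection idea in the previous comment, here is a minimal sketch of reading a private field without touching the owning class's source. It runs against stand-in classes only: StubEngine and StubIndexWriter are hypothetical placeholders, not the real InternalEngine or Lucene IndexWriter, and the field name "indexWriter" is an assumption that would have to be checked against the Elasticsearch version in use.

```java
import java.lang.reflect.Field;

final class ReflectionAccessSketch {
    // Hypothetical stand-in for Lucene's IndexWriter.
    static final class StubIndexWriter {
        String describe() { return "stub-writer"; }
    }

    // Hypothetical stand-in for an engine that keeps its writer private.
    static final class StubEngine {
        private final StubIndexWriter indexWriter = new StubIndexWriter();
    }

    // Walks up the class hierarchy, finds the named field, and reads it
    // even though it is private.
    static Object readPrivateField(Object target, String fieldName)
            throws ReflectiveOperationException {
        Class<?> type = target.getClass();
        while (type != null) {
            try {
                Field field = type.getDeclaredField(fieldName);
                field.setAccessible(true);
                return field.get(target);
            } catch (NoSuchFieldException e) {
                type = type.getSuperclass(); // try the parent class next
            }
        }
        throw new NoSuchFieldException(fieldName);
    }

    public static void main(String[] args) throws Exception {
        StubEngine engine = new StubEngine();
        // "indexWriter" is the assumed field name on the stand-in class.
        StubIndexWriter writer =
                (StubIndexWriter) readPrivateField(engine, "indexWriter");
        System.out.println(writer.describe());
    }
}
```

A plugin using this approach would depend on private field names that can change between Elasticsearch releases, and may also need extra permissions under a security manager or the Java module system, so it trades a source patch for a version-sensitive lookup.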
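The elasticsearch-6.6.1 in the mr module seems to be missing its lib folder. Can ES still start like this?
When will the ES cluster management platform and the gateway that collects and analyzes ES queries be open-sourced?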
ExecutorLostFailure (executor 13 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 20.0 GB of 20 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
master: mvn clean package -Dmaven.test.skip=true -Ppro
[ERROR] Failed to execute goal on project mr2es: Could not resolve dependencies for project com.didi.bigdata:mr2es:jar:1.0.0: Could not find artifact org.apache.hadoop:hadoop-client:jar:2.7.2-2312 in central (https://repo.maven.apache.org/maven2) -> [Help 1]
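Hello, our business scenario is slightly different from Didi's, and I think it is simpler.
For now we have no need to update indices. Instead, we create a new index every day and write the full data set into it.
My question: hadoop fs -get has already pulled the files to the local nodes of the ES cluster according to the shard mapping.
How do I then load the Lucene files into the ES cluster? In other words, this data needs to be loaded into an empty index. As I understand it, the files just have to be matched up with the index on the ES side. Would it work to put the files directly into the index's storage directory?
Or do I need to go through Lucene's IndexWriter? Thanks a lot.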
Here is the code for ES 7.x:
int routingNumShards=MetaDataCreateIndexService.calculateNumRoutingShards(numShards, Version.CURRENT);
int routingFactor = routingNumShards / numShards;
int shardId = Math.floorMod(hash, routingNumShards) / routingFactor;
Hope this helps.
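The three lines above can be tried out without an Elasticsearch dependency using the sketch below. It is only an approximation under stated assumptions: calculateNumRoutingShards here is a local re-implementation of what the 7.x method is understood to do (split up to a cap of 1024 routing shards, with at least one split), not the library call itself, and the hash is taken as a plain int argument rather than computed from the routing value.

```java
final class ShardRoutingSketch {
    // Assumed re-implementation of the 7.x routing-shard calculation:
    // pick a split count so that numShards * 2^numSplits stays near 1024,
    // with at least one split.
    static int calculateNumRoutingShards(int numShards) {
        int log2MaxNumShards = 10; // assumed cap of 2^10 = 1024
        int log2NumShards = 32 - Integer.numberOfLeadingZeros(numShards - 1);
        int numSplits = Math.max(1, log2MaxNumShards - log2NumShards);
        return numShards << numSplits;
    }

    // Same arithmetic as the snippet above: map a routing hash to a shard.
    static int shardId(int hash, int numShards) {
        int routingNumShards = calculateNumRoutingShards(numShards);
        int routingFactor = routingNumShards / numShards;
        return Math.floorMod(hash, routingNumShards) / routingFactor;
    }

    public static void main(String[] args) {
        int numShards = 5;
        System.out.println("routingNumShards = " + calculateNumRoutingShards(numShards));
        // The hash values below are arbitrary illustrations, not real routing hashes.
        for (int hash : new int[] {0, 130, -1}) {
            System.out.println("hash " + hash + " -> shard " + shardId(hash, numShards));
        }
    }
}
```

Math.floorMod matters here because the routing hash can be negative, and the plain % operator would then yield a negative remainder and an invalid shard number.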
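How is the FastIndexLoadDataCollector class, which handles the data relocation for each shard, executed? And how does ZeusUtils run the loadData.sh script, and what technology does it use?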
Lines 135 to 137 in dc00f9b
Recommended upgrade version:2.13.2
After installing the ES plugin, calling the REST URI returns:
"action [indices:append-lucene] is unauthorized for user [elastic]"
Lines 17 to 19 in dc00f9b
CVE-2020-7018 CVE-2019-7614 CVE-2020-7019 CVE-2020-7020
Recommended upgrade version:6.8.13