
bebee4java / ides


Intelligent Data Exploration Service (ides): a one-stop Data + AI data solution!

Home Page: https://www.datalinked.cn

License: Apache License 2.0

Scala 51.76% Shell 1.64% CSS 0.23% JavaScript 10.13% Java 34.24% ANTLR 1.60% Dockerfile 0.41%
ai batch-processing bigdata daas data data-analysis data-science datalink etl ides ml olap spark sql stream-processing

ides's Introduction

Hey, bebee4java here! 👋




🎉 Highlight products: Datalinked

  Datalinked is an open source product that unifies the big data + AI development process. It aims to simplify data processing, data analysis, data mining, machine learning, and related work. Stay tuned!

ides's People

Contributors

bebee4java, dependabot[bot]

ides's Issues

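Defining a Scala UDF in the ides REPL shell throws a serialization exception on the Executor

The ides REPL shell can run Scala and the ides DSL together, so I defined a string-length function in Scala and registered it as a UDF. When the data was processed on the Executor, a serialization exception was thrown (the original issue shows it in a screenshot).
The code I ran is: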

val strLen:String=>Int = (s:String) => if (s == null) 0 else s.length();
spark.udf.register("strLen", strLen);
select strLen('123') as t;
t.show()

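[repl] Dependencies added with --jars in the ides REPL shell do not take effect

When starting the REPL shell, I added the MySQL connector dependency with --jars, but at run time I still got the error java.lang.ClassNotFoundException: com.mysql.jdbc.Driver.
The command I ran is: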

./bin/ides-shell --jars /Users/sgr/.ivy2/jars/mysql_mysql-connector-java-5.1.36.jar

The resulting error is shown in a screenshot in the original issue.

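[server] Request times in the ides RESTful service request log are displayed incorrectly

I enabled the ides REST server with the request-log option turned on, and found that API request times in the request log file are displayed incorrectly (the original issue includes a screenshot).
My request was made at 2022-08-12 14:54:44, but the log file shows 02:54:44, which cannot be distinguished from a time in the early morning. Please fix this!
The main configuration parameters are: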

  • ides.spark.service=true
  • ides.server.request-log.enable=true
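
The symptom reported above (14:54:44 logged as 02:54:44) is what a 12-hour format pattern without an AM/PM marker produces on the JVM. The source does not show how the ides request log formats timestamps, so treating the pattern as the cause is an assumption. The following minimal Java sketch, with a hypothetical class name and pattern strings chosen for illustration, shows the difference between the hh and HH pattern letters in java.time:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; it is not part of ides.
class RequestLogTimeDemo {
    // "hh" is the clock hour within AM/PM (01-12). Without an "a" marker,
    // afternoon and early-morning times look identical.
    static final DateTimeFormatter TWELVE_HOUR =
            DateTimeFormatter.ofPattern("hh:mm:ss");

    // "HH" is the hour of day (00-23) and is unambiguous.
    static final DateTimeFormatter TWENTY_FOUR_HOUR =
            DateTimeFormatter.ofPattern("HH:mm:ss");

    public static void main(String[] args) {
        // The request time quoted in the issue.
        LocalDateTime requestTime = LocalDateTime.of(2022, 8, 12, 14, 54, 44);

        System.out.println(requestTime.format(TWELVE_HOUR));      // 02:54:44
        System.out.println(requestTime.format(TWENTY_FOUR_HOUR)); // 14:54:44
    }
}
```

If the log format is configurable, switching the hour field to the 24-hour form (or appending an AM/PM marker) would remove the ambiguity.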

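How can a CSV file whose fields contain line breaks be read correctly and stored in a Hive table?

Some text fields in the CSV file contain line breaks, for example:

id,name,desc
1,spark,"spark is a distributed computing framework"
2,ides,"ides is a distributed computing framework that unifies the big data + AI development process,
and simplifies data processing, data analysis, data mining, machine learning, and related work"

There are 2 data rows in total; the second row contains a line break.

The questions are:

  1. How can CSV data whose fields contain line breaks be read correctly?
  2. How can the CSV data be saved to a Hive table so that it is parsed correctly on both read and write?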
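
To illustrate why this data is hard to read, the sketch below contrasts a naive line split with a quote-aware record split. It is an illustrative Java example with a hypothetical class name, not the parser that ides or Spark uses, and its sample text is an abbreviated version of the data above. In Apache Spark, the CSV reader's multiLine option enables this kind of quote-aware reading; whether the ides load statement passes that option through is not stated in the source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of ides or Spark.
class QuotedCsvRecords {
    // Splits CSV text into records, treating a newline inside double quotes
    // as field content rather than as a record separator.
    // Simplification: only "\n" line endings are handled.
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
            }
            if (c == '\n' && !inQuotes) {
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        String csv = "id,name,desc\n"
                + "1,spark,\"spark is a distributed computing framework\"\n"
                + "2,ides,\"ides is a distributed computing framework,\n"
                + "it simplifies data processing\"\n";

        System.out.println(csv.split("\n").length);   // 4: the naive split breaks row 2
        System.out.println(splitRecords(csv).size()); // 3: header plus 2 data rows
    }
}
```

The same concern applies on the storage side: a Hive table stored as plain text is split on newlines by default, so a format such as Parquet or ORC is the usual way to keep such fields intact.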

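Support multi-language (Python, SQL) script development

ides defines its own complete DSL with syntax such as load/save/set/connect; for details, see the reference linked in the original issue.
We would like the syntax layer to natively support executing languages such as SQL and Python. This would enrich script development, because some users are likely to be used to processing data with SQL or Python (and algorithm engineers need Python as well).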

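Support native Python execution in the ides REPL

Python is an essential skill for data scientists and is commonly used for data mining, machine learning, and similar work. To integrate Python as a core tool for data mining, ides plans to extend REPL mode with the ability to execute native Python.
Two approaches are under consideration:

  1. Develop a pyides client that independently supports running both the ides DSL and native Python.
  2. In the existing ides-shell client, enter the Python environment with the %python command and leave it with the % command (the original issue illustrates this with a screenshot).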

Comments cause problems in the REPL terminal

When multi-line code entered in the REPL terminal contains comments, a parsing error occurs (the original issue shows this in a screenshot).
