alibaba / easyrec

A framework for large scale recommendation algorithms.

License: Apache License 2.0

Python 96.42% Shell 1.22% Lua 1.81% Dockerfile 0.08% C++ 0.47%

recommendation-algorithms recommender-system dssm esmm mind deepfm dlrm autoint din eges

EasyRec Introduction

What is EasyRec?

EasyRec is an easy-to-use framework for recommendation.

EasyRec implements state-of-the-art deep learning models used in common recommendation tasks: candidate generation (matching), scoring (ranking), and multi-task learning. It makes building high-performance models more efficient through simple configuration and hyperparameter tuning (HPO).

Get Started

Running Platform:

Why EasyRec?

Run everywhere

Local / MaxCompute / EMR-DataScience / DLC
TF1.12-1.15 / TF2.x / PAI-TF

Diversified input data

MaxCompute Table
HDFS files / Hive Table
OSS files
CSV files / Parquet files
Datahub / Kafka Streams

Simple to config

Flexible feature config and simple model config
Build models by combining some components
Efficient and robust feature generation (used in Taobao)
Nice web interface in development

It is smart

EarlyStop / Best Checkpoint Saver
Hyper Parameter Search / AutoFeatureCross / Knowledge Distillation / Features Selection
In development: NAS

Large scale and easy deployment

Support large scale embedding and online learning
Many parallel strategies: ParameterServer, Mirrored, MultiWorker
Easy deployment to EAS: automatic scaling, easy monitoring
Consistency guarantee: train and serving

A variety of models

DSSM / MIND / DropoutNet / CoMetricLearningI2I / PDN
W&D / DeepFM / MultiTower / DCN / FiBiNet / MaskNet / PPNet / CDN
DIN / BST / CL4SRec
MMoE / ESMM / DBMTL / PLE
HighwayNetwork / CMBF / UNITER
More models in development

Easy to customize

Support component-based development
Easy to implement customized models and components
No need to worry about data pipelines

Fast vector retrieval

Run k-NN search over vectors in a distributed environment

Document

Contribute

Any contributions you make are greatly appreciated!

Please report bugs by submitting a GitHub issue.
Please submit contributions using pull requests.
Please refer to the Development document for more details.

Cite

If EasyRec is useful for your research, please cite:

@article{Cheng2022EasyRecAE,
  title={EasyRec: An easy-to-use, extendable and efficient framework for building industrial recommendation systems},
  author={Mengli Cheng and Yue Gao and Guoqiang Liu and Hongsheng Jin and Xiaowen Zhang},
  journal={ArXiv},
  year={2022},
  volume={abs/2209.12766}
}

Contact

Join Us

DingDing Group: 32260796. (EasyRec usage general discussion.)
DingDing Group2: 37930014162, click this url or scan QrCode to join
Email Group: [email protected].

Enterprise Service

If you need EasyRec enterprise service support or want to purchase cloud product services, you can contact us through the DingDing Group.

License

EasyRec is released under Apache License 2.0. Please note that third-party libraries may not have the same license as EasyRec.

easyrec's Issues

Can zip- or tar.gz-compressed CSV files be supported?

Not supported.
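
Since compressed CSV input is not supported, one possible workaround is to decompress the file before training. The following is a minimal sketch using only the Python standard library; the function name `gunzip_csv` and the file paths are hypothetical, and it covers only single-file `.gz` archives (zip and tar.gz archives would need `zipfile` or `tarfile` instead).

```python
import gzip
import shutil


def gunzip_csv(src_path, dst_path):
    """Decompress a .gz file into a plain CSV file, streaming in chunks."""
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

For example, `gunzip_csv('train.csv.gz', 'train.csv')` writes an uncompressed copy that can then be referenced as the input path.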

Does MultiTower require exactly three feature groups?

It can have any number of them, even just one. DeepFM can only have two or three.

Job on ODPS will not exit

ODPS-1202005:Algo Job Failed-System Error-Wait over 30min, not enough resource. [ RequsetId: null ].

Instance 20211017182224640gq00ata2 Failed.
FAILED: Failed 20211017182228723gepjc292_b9f620e9_506a_48cd_ba2d_d0dad28d8e24:ODPS-1202005:Algo Job Failed-System Error-Wait over 30min, not enough resource. [ RequsetId: null ].

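This occurs during distributed training. Does it mean there are not enough resources? Yet the resource monitor shows that CPU and memory are not at 100%.
The hash bucket size is very large, in the tens of millions.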
-Dcluster="{"worker":{"count":8,"gpu":0,"cpu":1500,"memory":60000},"ps":{"count":8,"cpu":400,"memory":10000}}"

Does EasyRec implement EGES (Enhanced Graph Embedding with Side Information)?

Are ComboFeatures also fed in their original format?

Are combo features also fed in their original format?

EasyRec fails to install; is there a required Python version?

https://easyrec.readthedocs.io/en/latest/feature/rtp_fg.html
I followed this document to install EasyRec and it did not work.

After deploying the EasyRec WDL example from DSW to EAS, how do I send requests?

#!/usr/bin/env python
from eas_prediction import PredictClient
from eas_prediction import TFRequest
from eas_prediction import ENDPOINT_TYPE_DIRECT
client = PredictClient('http://xxxx.cn-beijing.pai-eas.aliyuncs.com', 'zhl_deemfm')
client.set_token('M2Fxxxx')
client.init()
req = TFRequest('serving_default')

names = ["c1", "banner_pos", "site_id", "site_domain", "site_category", "app_id", "app_domain", "app_category",
         "device_id", "device_ip", "device_model", "device_type", "device_conn_type", "hour", "c14", "c15", "c16",
         "c17", "c18", "c19", "c20", "c21"]
for name in names:
  req.add_feed(name, [1], TFRequest.DT_STRING, [bytes("1", "utf-8")])
req.add_fetch('probs')

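import time

# Send the request 10 times and accumulate the wall-clock latency,
# so that the average printed at the end is well defined.
timer = 0.0
for _ in range(10):
  start = time.time()
  resp = client.predict(req)
  timer += time.time() - start

print(resp)
print(resp.get_values('probs'))
print(resp.get_tensor_shape('probs'))
print("average response time: %s s" % (timer / 10))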

How does DSSM in EasyRec handle int-typed data?

No module named 'easy_rec.python.protos.train_pb2'

Running fails with an error saying that the easy_rec.python.protos.train_pb2 file is missing.

AssertionError: sep[b','] maybe invalid: field_num=7, required_num=131

`2020-07-30 12:10:38.673426: W tensorflow/core/framework/op_kernel.cc:1261] Unknown: AssertionError: sep[b','] maybe invalid: field_num=7, required_num=131
Traceback (most recent call last):

File "/apsarapangu/disk3/mengli.cml/anaconda3/envs/tf_12_py36/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 206, in call
ret = func(*args)

File "/apsarapangu/disk3/mengli.cml/easy-rec/easy_rec/python/input/csv_input.py", line 21, in _check_data
(sep, field_num, len(record_defaults))

AssertionError: sep[b','] maybe invalid: field_num=7, required_num=131`
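
The assertion reports that splitting a row on `,` produced 7 fields while 131 were required, which suggests that either the separator or the field list in the config does not match the data. A quick way to check is to count fields per line under each candidate separator. This is a minimal standalone sketch, not part of EasyRec; `count_fields` and the sample rows are hypothetical.

```python
def count_fields(lines, sep=','):
    """Return {field_count: number_of_lines} for the given text lines."""
    counts = {}
    for line in lines:
        n = len(line.rstrip('\n').split(sep))
        counts[n] = counts.get(n, 0) + 1
    return counts


# Hypothetical sample: one comma-separated row and one tab-separated row.
sample = ['a,b,c\n', '1\t2\t3\t4\n']
print(count_fields(sample, sep=','))   # {3: 1, 1: 1}
print(count_fields(sample, sep='\t'))  # {1: 1, 4: 1}
```

Running it over the first few thousand lines of the real input file shows whether any separator yields a consistent count equal to the number of fields the config expects.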

I use the DIN algorithm with two sequences; the error is: Input 1 has shape [4096 576 16] and doesn't match input 0 with shape [4096 2810 16].

InvalidArgumentError (see above for traceback): All dimensions except 2 must match. Input 1 has shape [4096 576 16] and doesn't match input 0 with shape [4096 2810 16].
[[node gradients/concat_5_grad/ConcatOffset (defined at /worker/tensorflow_jobs/easy_rec/python/compat/optimizers.py:249) = ConcatOffset[N=2, _class=["loc:@gradients/concat_5_grad/Slice"], _device="/job:worker/replica:0/task:6/device:CPU:0"](gradients/concat_6_grad/mod, gradients/concat_5_grad/ShapeN, gradients/concat_5_grad/ShapeN:1)]]

[Reposted on behalf of others] If you cannot locate the error, please send us error.log so that we can help debug.

Other errors:
Search the logs for Traceback:
yarn logs -applicationId application_id > error.log
grep Traceback error.log -A 100
grep "Starting python process" error.log -A 100

Job hangs after submission and the logs cannot be viewed

Killing container master.RMCommunicator (RMCommunicator.java:onContainersCompleted(40))

2020-09-04 11:43:51,536 INFO [AMRM Callback Handler Thread] master.RMCommunicator (RMCommunicator.java:onContainersCompleted(40)) - got container status for containerID=container_1598507699008_0094_01_000010, state=COMPLETE, exitStatus=-104, diagnostics=Container [pid=8788,containerID=container_1598507699008_0094_01_000010] is running beyond physical memory limits. Current usage: 9.8 GB of 9.8 GB physical memory used; 43.0 GB of 47.6 TB virtual memory used. Killing container.
Dump of the process-tree for container_1598507699008_0094_01_000010 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 8795 8788 8788 8788 (java) 2011 1219 2151108608 64952 /usr/lib/jvm/java-1.8.0/bin/java -Xmx256M com.aliyu

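What does the output score of DSSM mean?

The score sometimes exceeds 1. What is wrong?

How to configure a combo feature in model config

How do I add a combo feature to the model config? What should be filled in as the feature name in the model config's feature group, given that a combo feature is a combination of two independent features?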

InvalidArgumentError (see above for traceback): Column size of the record to be saved: '588' does not match the default record column size: '17'.

Column size of the record to be saved: '6' does not match the default record column size: '2'.

KeyError: 'device_make'

  File "/usr/lib/python3.7/site-packages/easy_rec/python/feature_column/feature_column.py", line 46, in __init__
    self.parse_id_feaure(config)
  File "/usr/lib/python3.7/site-packages/easy_rec/python/feature_column/feature_column.py", line 119, in parse_id_feaure
    if self.is_wide(config):
  File "/usr/lib/python3.7/site-packages/easy_rec/python/feature_column/feature_column.py", line 86, in is_wide
    return self._wide_deep_dict[feature_name] in [ WideOrDeep.WIDE,
KeyError: 'device_make'

How do I set up the environment on Windows?

We recommend using the Ubuntu subsystem: https://zhuanlan.zhihu.com/p/34133795

Are invisible characters such as '\001' and '\002' supported as the Separator?

Word mistake

PAI-DSW DEMO (Rember to select Python 3 kernel)

Rember->Remember

Config file reports a brace error

[[ ------------------Disable OneDNN--------------------- ]]
Init odps proxy io environment success.
[2021-10-21 10:31:36.850331] [INFO] [78#78] [paiio/cc/platform/odps_io_manager/odps_io_config.cc:85] Odps environment init done.
[2021-10-21 10:31:53,176] [INFO] [78#MainThread] [tensorflow/python/util/auto_strategy_utils.py:108] Disable Auto Strategy.
[2021-10-21 10:31:53,176][INFO] Disable Auto Strategy.
[2021-10-21 10:31:53,177][INFO] set on pai environment variable: IS_ON_PAI
Traceback (most recent call last):
File "run.py", line 508, in
tf.app.run()
File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 128, in run
_sys.exit(main(argv))
File "run.py", line 357, in main
pipeline_config = config_util.get_configs_from_pipeline_file(config, False)
File "/worker/tensorflow_jobs/easy_rec/python/utils/config_util.py", line 48, in get_configs_from_pipeline_file
text_format.Merge(config_str, pipeline_config)
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 735, in Merge
allow_unknown_field=allow_unknown_field)
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 803, in MergeLines
return parser.MergeLines(lines, message)
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 828, in MergeLines
self._ParseOrMerge(lines, message)
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 850, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 923, in _MergeField
name = tokenizer.ConsumeIdentifierOrNumber()
File "/worker/venv/lib/python2.7/site-packages/google/protobuf/text_format.py", line 1392, in ConsumeIdentifierOrNumber
raise self.ParseError('Expected identifier or number, got %s.' % result)
google.protobuf.text_format.ParseError: 844:1 : '}': Expected identifier or number, got }.
Failed to execute system command. (exit code: 251.)
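
The ParseError points at line 844, column 1 of the config, where a `}` appears that the parser did not expect; one common cause is an extra or misplaced closing brace. A small brace-balance check can locate such lines before the job is submitted. This is a naive sketch under stated assumptions: `find_unbalanced_braces` is a hypothetical helper, and it also counts braces that appear inside quoted strings or comments.

```python
def find_unbalanced_braces(text):
    """Return (line_no, message) tuples for unmatched '{' or '}'.

    Naive: braces inside quoted strings or comments are counted too.
    """
    problems = []
    open_lines = []  # line numbers of '{' that have not been closed yet
    for line_no, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == '{':
                open_lines.append(line_no)
            elif ch == '}':
                if open_lines:
                    open_lines.pop()
                else:
                    problems.append((line_no, "unmatched '}'"))
    problems.extend((n, "unclosed '{'") for n in open_lines)
    return problems


# Hypothetical config fragment with one closing brace too many.
config = "train_config {\n  num_steps: 100\n}\n}\n"
print(find_unbalanced_braces(config))  # [(4, "unmatched '}'")]
```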
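
Does the "train_config.fine_tune_checkpoint" parameter continue incremental training from the best-performing model of the previous round, or from the model at the end of training?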

InvalidArgumentError (see above for traceback): Column size of the record to be saved: '4' does not match the default record column size: '2'.

What causes this?

English Documentation

Is it possible to provide English documentation on how to use EasyRec?

How do I download the data used in the EasyRec repository?

When I run a command such as CUDA_VISIBLE_DEVICES=0 python -m easy_rec.python.train_eval --pipeline_config_path custom_config/dssm_hard_neg_sampler_on_taobao.config, the data cannot be downloaded.

http status code: 400, error code: InvalidRequest, message: It is forbidden to copy appendable object in versioning state,

Command:
pai -name easy_rec_ext
-project algo_public_dev
-Dres_project=algo_public_dev
-Dconfig=oss://yanzhen1/easy_rec_test/deepfm.config
-Dcmd=export
-Dexport_dir=oss://yanzhen1/easy_rec_test/export/
-Dcluster='{"worker" : {"count":1, "cpu":1000, "memory":40000}}'
-Darn=acs:ram::1730760139076263:role/aliyunodpspaidefaultrole
-Dbuckets=oss://yanzhen1/
-DossHost=oss-cn-beijing-internal.aliyuncs.com;
Error:
Traceback (most recent call last):
File "run.py", line 252, in
tf.app.run()
File "/usr/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 128, in run
_sys.exit(main(argv))
File "run.py", line 246, in main
easy_rec.export(FLAGS.export_dir, config, FLAGS.checkpoint_path)
File "/worker/tensorflow_jobs/easy_rec/python/main.py", line 350, in export
export_dir_base=export_dir, serving_input_receiver_fn=serving_input_fn)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 694, in export_savedmodel
mode=model_fn_lib.ModeKeys.PREDICT)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 820, in _export_saved_model_for_mode
strip_default_attrs=strip_default_attrs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 959, in _export_all_saved_models
gfile.Rename(temp_export_dir.decode("utf-8") + '/', export_dir)
File "/usr/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 456, in rename
compat.as_bytes(oldname), compat.as_bytes(newname), overwrite, status)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnavailableError: req_id: 5F225E6AD0E798313135AFAF, http status code: 400, error code: InvalidRequest, message: It is forbidden to copy appendable object in versioning state, oss host:oss-cn-beijing-internal.aliyuncs.com, path:/yanzhen1/easy_rec_test/export_tmp/temp-1596087897/assets/pipeline.config.

DSSM often hits loss = NaN

run_metadata=run_metadata))
File "/home/xin/anaconda3/envs/tf12/lib/python3.6/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 753, in after_run
raise NanLossDuringTrainingError
tensorflow.python.training.basic_session_run_hooks.NanLossDuringTrainingError: NaN loss during training.

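NaN appears at the second step every time. I am using the taobao data, unmodified.
What I observe is that training on the taobao data can almost always hit NaN; it appears within 2000 steps.
The code is the latest version pulled directly from this repository, and so is the data, so the versions should match.

TensorFlow and Python versions: tf1.12, py3.6
Steps to reproduce: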

cd EasyRec-master
wget https://easyrec.oss-cn-beijing.aliyuncs.com/data/easyrec_data_20210818.tar.gz
tar -zxvf easyrec_data_20210818.tar.gz
pip install -r requirements.txt
bash scripts/ci_test.sh
CUDA_VISIBLE_DEVICES=0 python -m easy_rec.python.train_eval --pipeline_config_path samples/model_config/dssm_neg_sampler_on_taobao.config
The data is identical as well.

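What can I do when EasyRec training is slow, and which parameters should I set?

Either add compute resources, for example:
(1) Increase the number of ps from 1 to 2, then to 4; do not add too many at once.
(2) Set the worker cpu to 1600 to increase data-processing parallelism.
(3) Increase the number of workers.

Or shrink the network: merge the item and room networks, and make the final dnn smaller.

sync_replicas: false. This switches synchronous training to asynchronous training.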

Check failed: dtype() == expected_dtype (1 vs. 2) double expected, got float

[2020-08-10 11:37:14.903966] [FATAL] [70#292] [tensorflow/core/framework/tensor.cc:626] Check failed: dtype() == expected_dtype (1 vs. 2) double expected, got float
xargs: ../python_bin: terminated by signal 6

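Error: tensorflow.python.framework.errors_impl.UnimplementedError: GetChildren not implemented

I got this error while trying the wide&deep algorithm, and I was able to reproduce it. A search of the documentation says it happens because CSVInput was used, but I do not quite understand what the problem is or how to fix it. For test data I used the data in pai_online_project on ODPS. Could anyone help answer this? Thanks.

Can the odps path in negative sampling use a format like this? For example: odps://xxxxxx/tables/xxx_train_sample_table

Do the latest EasyRec install package or SDK and this PAI command come from the same source?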

pai -name easy_rec_ext -project algo_public
-Dcmd=train
-Dconfig=oss://easyrec/config/MultiTower/dwd_avazu_ctr_deepmodel_ext.config
-Dtables=odps://pai_online_project/tables/dwd_avazu_ctr_deepmodel_train,odps://pai_online_project/tables/dwd_avazu_ctr_deepmodel_test
-Dcluster='{"ps":{"count":1, "cpu":1000}, "worker" : {"count":3, "cpu":1000, "gpu":100, "memory":40000}}'
-Dwith_evaluator=1
-Dmodel_dir=oss://easyrec/ckpt/MultiTower
-Darn=acs:ram::xxx:role/xxx
-Dbuckets=oss://easyrec/
-DossHost=oss-cn-beijing-internal.aliyuncs.com;

tensorflow.python.framework.errors_impl.UnimplementedError: GetChildren not implemented

Traceback (most recent call last):
  File "run.py", line 300, in <module>
    tf.app.run()
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 128, in run
    _sys.exit(main(argv))
  File "run.py", line 243, in main
    train_and_evaluate_impl(pipeline_config, continue_train=FLAGS.continue_train)
  File "/worker/tensorflow_jobs/easy_rec/python/main.py", line 289, in _train_and_evaluate_impl
    _train_and_evaluate(estimator, train_spec, eval_spec)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 637, in run
    getattr(self, task_to_run)()
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 642, in run_chief
    return self._start_distributed_training()
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/training.py", line 788, in _start_distributed_training
    saving_listeners=saving_listeners)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 385, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1242, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1337, in _train_model_default
    input_fn, model_fn_lib.ModeKeys.TRAIN))
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1107, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1194, in _call_input_fn
    return input_fn(**kwargs)
  File "/worker/tensorflow_jobs/easy_rec/python/input/input.py", line 315, in _input_fn
    dataset = self._build(mode, params)
  File "/worker/tensorflow_jobs/easy_rec/python/input/csv_input.py", line 52, in _build
    file_paths = tf.gfile.Glob(self._input_path)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 385, in get_matching_files
    compat.as_bytes(single_filename), status)
  File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnimplementedError: GetChildren not implemented
Failed to execute system command. (exit code: 123.)

The pai command cannot be found: pai: command not found

RuntimeError: Collective ops must be configured at program startup

raise NanLossDuringTrainingError

File "/worker/venv/lib/python2.7/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 792, in after_run
raise NanLossDuringTrainingError
tensorflow.python.training.basic_session_run_hooks.NanLossDuringTrainingError: NaN loss during training.
Failed to execute system command. (exit code: 251.)

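How can the num steps parameter in train_config of the config file be passed in dynamically?

I have a problem using EasyRec: I am doing hourly model updates, and the num steps parameter is not fixed because it depends on the data volume. I compute the num step value for each run with SQL and store it in OSS. How do I assign this value to the num_steps parameter?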
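
One generic approach, independent of any EasyRec-specific override mechanism, is to rewrite the `num_steps` value in the config text before each scheduled run. The sketch below assumes the config contains a line of the form `num_steps: <integer>`; `set_num_steps` is a hypothetical helper, and reading the value from OSS and writing the config back are left out.

```python
import re


def set_num_steps(config_text, num_steps):
    """Replace the value of the first `num_steps: <int>` entry in the text."""
    new_text, n = re.subn(
        r'(\bnum_steps\s*:\s*)\d+',
        r'\g<1>%d' % num_steps,
        config_text,
        count=1)
    if n == 0:
        raise ValueError('num_steps not found in config')
    return new_text


# Hypothetical config fragment for illustration.
config = "train_config {\n  num_steps: 1000\n}\n"
print(set_num_steps(config, 2500))
```

The rewritten text can then be saved to the location that the training command reads its config from.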

How should PAI training parameters be set so that no GPU is used?

My parameters:
{ "ps": { "count": 2, "cpu": 1000, "memory": 40000 }, "worker": { "count": 8, "cpu": 1000, "memory"}}
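
The cluster string in an earlier issue on this page sets `"gpu":0` on the worker entry. Assuming that this is what keeps workers on CPU (an assumption, not confirmed here), the cluster JSON can be generated programmatically so that the key is never forgotten. `cpu_only_cluster` is a hypothetical helper, and the worker memory value below is a placeholder, not taken from the parameters above.

```python
import copy
import json


def cpu_only_cluster(cluster):
    """Return a copy of the cluster spec with "gpu": 0 set on the worker entry."""
    spec = copy.deepcopy(cluster)
    spec.setdefault('worker', {})['gpu'] = 0
    return spec


# Hypothetical values for illustration; use your own counts and sizes.
cluster = {
    'ps': {'count': 2, 'cpu': 1000, 'memory': 40000},
    'worker': {'count': 8, 'cpu': 1000, 'memory': 40000},
}
print("-Dcluster='%s'" % json.dumps(cpu_only_cluster(cluster)))
```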

Are invisible characters such as '\001' and '\002' supported as separators?

Supported.
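
For reference, `'\001'` is the octal escape for the single control character with code 1 (the same as `'\x01'`). The short sketch below only illustrates how such a separator behaves when splitting a row in plain Python; the sample row is hypothetical, and how the separator is written in the EasyRec config is not shown here.

```python
# '\001' (octal) and '\x01' (hex) denote the same single character.
SEP = '\001'

# Hypothetical row with three fields joined by the invisible separator.
line = 'u1' + SEP + 'item9' + SEP + '1'
fields = line.split(SEP)
print(fields)     # ['u1', 'item9', '1']
print(repr(SEP))  # '\x01'
```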

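Do matching models not support fg?

Matching only has DSSM and MIND. When converting the fg config, what should this parameter be set to? DSSM reports that it is not supported.

Some settings are specified in the config file while others are passed as command-line arguments; this needs to be unified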

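Export command
Local
python -m easy_rec.python.export --pipeline_config_path dwd_avazu_ctr_deepmodel.config --export_dir ./export
--pipeline_config_path: path to the config file

--model_dir: if specified, it overrides the model_dir in the config; typically used for periodic scheduling

--export_dir: the export directory

For example, export_dir here can only be specified as an argument, which is not recommended; it should instead follow the same overridable mode as model_dir.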

How can I run an algorithm I wrote myself on PAI?

Loss is very large (>9) and does not converge

Raw features have not been discretized or normalized.
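
To illustrate the two preprocessing options named in the answer, the following is a minimal pure-Python sketch of min-max normalization and boundary-based discretization. It is a standalone illustration, not EasyRec's feature pipeline; the function names, sample values, and boundaries are all hypothetical.

```python
import bisect


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / float(hi - lo) for v in values]


def bucketize(value, boundaries):
    """Return the bucket index of value, given sorted bucket boundaries."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical raw feature with a very wide range.
prices = [10, 505, 1000]
print(min_max_normalize(prices))       # [0.0, 0.5, 1.0]
print(bucketize(25, [10, 100, 1000]))  # 1
```

Either transformation keeps large raw magnitudes from dominating the inputs to the network.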

Unable to get element as bytes. terminate called after throwing an instance of 'apsara::odps::algo::BaseException'

tensorflow.python.framework.errors_impl.InternalError: Unable to get element as bytes.
terminate called after throwing an instance of 'apsara::odps::algo::BaseException'
what(): build/release64/algo/data_io/table_writer/cluster/sql_record_writer.cpp(103): BaseException: |Commit to master failed

Memory setting is too small and the job was killed

xargs: /worker/venv/bin/python: terminated by signal 9
Failed to execute system command. (exit code: 253.)
The job has been killed by "OOM Killer", please check your job's memory usage.
total-vm:30193768kB, anon-rss:19770320kB, file-rss:0kB, shmem-rss:0kB
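
The `anon-rss` figure in the last log line is the resident memory of the process when it was killed, in kB. Converting it gives a rough lower bound for the memory setting; the log above works out to roughly 18.9 GiB. A minimal sketch, where `anon_rss_gib` is a hypothetical helper:

```python
import re


def anon_rss_gib(oom_line):
    """Extract anon-rss (in kB) from an OOM log line and convert it to GiB."""
    match = re.search(r'anon-rss:(\d+)kB', oom_line)
    if match is None:
        return None
    return int(match.group(1)) / (1024.0 ** 2)


line = 'total-vm:30193768kB, anon-rss:19770320kB, file-rss:0kB, shmem-rss:0kB'
print('%.1f GiB' % anon_rss_gib(line))
```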