
goldandrabbit / gold-deep-rank


Deep neural network code for CTR/CVR prediction in the ranking stage, implemented in TensorFlow (versions 1.14 and 2.4.1) using the tf.estimator API.

Python 99.81% Shell 0.19%

gold-deep-rank's Introduction

gold-deep-rank

  • Well-organized experimental code for the Ads/RecSys ranking stage, implemented in TensorFlow with the tf.estimator API.
  • Supports flexible parameter customization, suitable for industrial development.
  • TensorFlow version compatibility: supports tf 1.14 and tf 2.4.1.

Why Deep CTR model?

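  • Auto Feature Interaction. The main advantage of deep learning for CTR prediction is that the network design can learn feature interactions automatically. The models covered here are all different network designs for learning feature interactions.
  • Better Sparse id Representation Support. Compared with GBDT models, DNNs are better at representation learning for sparse id features. Business scenarios often involve massive numbers of sparse id features, and embeddings give the model strong representation learning capability for them.
  • Memorization & Generalization. Memorization and generalization are two important capabilities of a recommender system, and the Wide & Deep Learning structure learns both at once: the wide part is implemented with FTRL and provides memorization of id features, while the DNN part provides generalization.
  • Implementation. Everything is packaged in the gold-deep-rank project, repo: https://github.com/GoldAndRabbit/gold-deep-rank. It mainly draws on the original authors' source code and open-source libraries.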

Deep CTR framework

Wide & Deep Learning

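  • Google proposed a general structure that optimizes a linear layer and a DNN jointly; later models optimize or customize the DNN part on top of it.
  • Generalization and memorization are two important basic capabilities of a recommender system.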
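
As a rough illustration of the joint structure described above, the sketch below adds a wide (linear) logit and a deep (MLP) logit before a single sigmoid. It is plain Python rather than the repo's tf.estimator code, and the names `wide_logit`, `deep_logit`, and `wide_and_deep_predict` are hypothetical helpers introduced only for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def wide_logit(active_ids, wide_weights, bias=0.0):
    # Wide part: a linear model over sparse ids. Only ids present in the
    # example contribute, which is what gives per-id memorization.
    return bias + sum(wide_weights.get(i, 0.0) for i in active_ids)

def deep_logit(dense_input, layers):
    # Deep part: a small MLP. `layers` is a list of (weight_matrix, bias_vector);
    # hidden layers use ReLU, the last layer is linear with a single output.
    h = dense_input
    for idx, (W, b) in enumerate(layers):
        h = [sum(w * x for w, x in zip(row, h)) + bj for row, bj in zip(W, b)]
        if idx < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return h[0]

def wide_and_deep_predict(active_ids, dense_input, wide_weights, layers):
    # The two logits are summed and trained jointly under one loss.
    return sigmoid(wide_logit(active_ids, wide_weights) + deep_logit(dense_input, layers))

# Toy usage with made-up weights (assumed values, not from the repo).
wide_weights = {"a": 0.5, "b": -0.25}
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
p = wide_and_deep_predict(["a", "b", "unseen"], [1.0, 2.0], wide_weights, layers)
```

In the real model the deep input would be the concatenated feature embeddings, and the wide part would typically use its own optimizer (FTRL, as noted earlier).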

DeepFM

[Figure: deepfm]

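  • FM is the basic method for second-order feature interaction and can serve as a general baseline.
  • Deriving the reduced-complexity form of FM brings the cost from O(kn^2) down to O(kn). A simple mnemonic: sum_square - square_sum, and don't forget the constant factor 1/2 in front.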
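
The complexity reduction can be checked numerically. The sketch below compares the naive pairwise sum with the "square of the sum minus sum of the squares" form; it assumes each embedding has already been multiplied by its feature value, and the function names are illustrative only.

```python
def fm_pairwise_naive(embeddings):
    # Sum of <v_i, v_j> over all pairs i < j: O(k * n^2).
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
    return total

def fm_pairwise_fast(embeddings):
    # 0.5 * sum over dimensions f of [(sum_i v_if)^2 - sum_i v_if^2]: O(k * n).
    k = len(embeddings[0])
    total = 0.0
    for f in range(k):
        column = [v[f] for v in embeddings]
        square_of_sum = sum(column) ** 2             # "sum_square"
        sum_of_squares = sum(c * c for c in column)  # "square_sum"
        total += square_of_sum - sum_of_squares
    return 0.5 * total

# Three features with embedding size k = 2 (toy values).
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
naive = fm_pairwise_naive(embeddings)  # 67.0
fast = fm_pairwise_fast(embeddings)    # 67.0
```

The factor 1/2 appears because the squared sum counts every pair twice once the diagonal terms are subtracted.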

PNN: Inner/Outer product

  • The inner product and the outer product of vectors define two ways of crossing embeddings, a very straightforward form of feature interaction.
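
A minimal sketch of the two crossing operations, applied to every pair of field embeddings. This only shows the product step; how a PNN compresses and feeds these products into the DNN is not covered here, and the helper names are hypothetical.

```python
def inner_product(u, v):
    # Scalar interaction between two embeddings.
    return sum(a * b for a, b in zip(u, v))

def outer_product(u, v):
    # Matrix interaction: entry [r][c] is u[r] * v[c].
    return [[a * b for b in v] for a in u]

def pairwise_products(embeddings):
    # Collect inner and outer products for every pair of fields i < j.
    inner, outer = [], []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            inner.append(inner_product(embeddings[i], embeddings[j]))
            outer.append(outer_product(embeddings[i], embeddings[j]))
    return inner, outer

inner, outer = pairwise_products([[1.0, 2.0], [3.0, 4.0]])
# inner == [11.0]
# outer == [[[3.0, 4.0], [6.0, 8.0]]]
```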

DCN: Cross Network

  • The cross network implements polynomial-form feature interaction, which actually differs from feature crossing in the usual sense.
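
To make the "polynomial form" concrete, here is a sketch of the cross layer in its commonly cited vector form, x_{l+1} = x_0 * (x_l · w_l) + b_l + x_l. It is written in plain Python under that assumption and is not taken from the repo; `cross_layer` and `cross_network` are illustrative names.

```python
def cross_layer(x0, xl, w, b):
    # (x_l . w) is a scalar, so the layer rescales x0 and adds a residual x_l.
    scale = sum(a * c for a, c in zip(xl, w))
    return [x0_i * scale + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]

def cross_network(x0, weights, biases):
    # Stacking layers raises the polynomial degree in x0 by one per layer.
    x = x0
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x

x0 = [1.0, 2.0]
out = cross_network(x0, weights=[[1.0, 1.0], [1.0, 1.0]], biases=[[0.0, 0.0], [0.0, 0.0]])
# Layer 1: scale = 3  -> [4.0, 8.0]
# Layer 2: scale = 12 -> [16.0, 32.0]
```

Every output stays a scalar multiple of x0 plus residual terms, which is one way to see why this differs from explicit pairwise feature crossing.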

[Figure: dcn]

xDeepFM: CIN

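  • Introduces vector-wise feature crossing instead of bit-wise crossing.
  • Trying to understand CIN from the formulas is not recommended (the formalization is complex); it is easier to follow from the figure together with the source code.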
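
In the spirit of reading code rather than formulas, the following is a loop-based sketch of one CIN layer. Each output feature map is a weighted sum of element-wise products between a row of the previous layer and a row of the input embeddings, so whole embedding vectors are crossed (vector-wise) and the embedding dimension is preserved. Shapes and names are assumptions made for this example.

```python
def cin_layer(x_prev, x0, weights):
    # x_prev:  H_prev rows of length D (previous layer's feature maps)
    # x0:      m rows of length D (raw field embeddings)
    # weights: H_next matrices of shape H_prev x m, one per output feature map
    dim = len(x0[0])
    out = []
    for w_h in weights:
        row = [0.0] * dim
        for i, xi in enumerate(x_prev):
            for j, xj in enumerate(x0):
                coeff = w_h[i][j]
                for d in range(dim):
                    # Element-wise product keeps dimension d separate.
                    row[d] += coeff * xi[d] * xj[d]
        out.append(row)
    return out

def sum_pool(feature_maps):
    # Each feature map is sum-pooled over the embedding dimension.
    return [sum(row) for row in feature_maps]

x0 = [[1.0, 2.0], [3.0, 4.0]]
layer1 = cin_layer(x0, x0, weights=[[[1.0, 1.0], [1.0, 1.0]]])  # [[16.0, 36.0]]
pooled = sum_pool(layer1)                                        # [52.0]
```

In practice the triple loop is usually replaced by an outer product followed by a 1-D convolution, which computes the same quantity.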

AFM: FM based attention

  • Adds an attention layer on top of the existing DeepFM, so that pairwise interactions are weighted rather than summed equally.
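
A minimal sketch of attention-weighted pooling over pairwise interactions, assuming the usual AFM formulation: each pair's element-wise product is scored by a one-hidden-layer network, scores are normalized with softmax, and the products are summed with those weights. The parameters `W`, `b`, and `h` are hypothetical stand-ins for learned weights.

```python
import math

def afm_pooling(embeddings, W, b, h):
    # Element-wise product for every pair of fields i < j.
    pairs = []
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append([a * c for a, c in zip(embeddings[i], embeddings[j])])
    # Attention score per pair: h . ReLU(W e + b).
    scores = []
    for e in pairs:
        hidden = [max(0.0, sum(w * x for w, x in zip(row, e)) + bias)
                  for row, bias in zip(W, b)]
        scores.append(sum(hv * hid for hv, hid in zip(h, hidden)))
    # Softmax over pairs (shifted by the max for numerical stability).
    max_s = max(scores)
    exps = [math.exp(s - max_s) for s in scores]
    z = sum(exps)
    attn = [x / z for x in exps]
    # Attention-weighted sum of the pairwise products.
    dim = len(pairs[0])
    pooled = [sum(a * e[d] for a, e in zip(attn, pairs)) for d in range(dim)]
    return attn, pooled

# With h = 0 every score is 0, so attention is uniform and AFM pooling
# reduces to the plain average of the pairwise products.
attn, pooled = afm_pooling([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                           W=[[1.0, 1.0]], b=[0.0], h=[0.0])
```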

[Figure: afm]

AutoInt: Multi-head attention

  • Introduces multi-head self-attention to learn feature interactions; for multi-head self-attention itself, see how the Transformer works.
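
As a sketch of the mechanism, the code below runs Transformer-style scaled dot-product self-attention over a list of field embeddings and concatenates the heads. The scaling by sqrt(d) follows the Transformer convention and is an assumption here; the residual connection and final activation used in AutoInt are omitted, and all names are illustrative.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attention_head(fields, Wq, Wk, Wv):
    # Project every field embedding into query, key, and value spaces.
    Q = [matvec(Wq, f) for f in fields]
    K = [matvec(Wk, f) for f in fields]
    V = [matvec(Wv, f) for f in fields]
    scale = math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        # Each field attends to all fields, itself included.
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in K])
        out.append([sum(w * v[d] for w, v in zip(weights, V))
                    for d in range(len(V[0]))])
    return out

def multi_head_self_attention(fields, heads):
    # `heads` is a list of (Wq, Wk, Wv) tuples; head outputs are concatenated per field.
    per_head = [attention_head(fields, Wq, Wk, Wv) for Wq, Wk, Wv in heads]
    result = []
    for i in range(len(fields)):
        combined = []
        for head in per_head:
            combined.extend(head[i])
        result.append(combined)
    return result

zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
# Zero query/key projections give uniform attention, so each head outputs the field mean.
out = multi_head_self_attention([[1.0, 0.0], [0.0, 1.0]],
                                heads=[(zero, zero, identity), (zero, zero, identity)])
```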

FiBiNet: SENET & Bi-linear interaction

  • Introduces SENET to learn feature interactions.
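
The two pieces named in the heading can be sketched as follows: a squeeze-and-excitation step that learns one importance weight per field, and a bilinear interaction (v_i W) ∘ v_j between two field embeddings. Mean pooling in the squeeze step, ReLU in both excitation layers, and a single shared `W` are assumptions made for this example, not details confirmed by the repo.

```python
def senet_reweight(embeddings, W1, W2):
    # Squeeze: summarize each field embedding as one scalar (mean pooling).
    z = [sum(e) / len(e) for e in embeddings]
    # Excitation: bottleneck layer, then back to one weight per field.
    hidden = [max(0.0, sum(w * x for w, x in zip(row, z))) for row in W1]
    weights = [max(0.0, sum(w * x for w, x in zip(row, hidden))) for row in W2]
    # Re-weight: scale every field embedding by its learned importance.
    reweighted = [[a * x for x in e] for a, e in zip(weights, embeddings)]
    return weights, reweighted

def bilinear_interaction(vi, vj, W):
    # (v_i W) element-wise multiplied with v_j.
    projected = [sum(vi[r] * W[r][c] for r in range(len(vi)))
                 for c in range(len(W[0]))]
    return [p * x for p, x in zip(projected, vj)]

# Two fields, embedding size 2, bottleneck size 1 (toy weights).
weights, reweighted = senet_reweight([[1.0, 3.0], [2.0, 2.0]],
                                     W1=[[0.5, 0.5]], W2=[[1.0], [0.25]])
# weights == [2.0, 0.5]; reweighted == [[2.0, 6.0], [1.0, 1.0]]
crossed = bilinear_interaction([1.0, 3.0], [2.0, 2.0], W=[[1.0, 0.0], [0.0, 1.0]])
# With W = identity this reduces to the element-wise product: [2.0, 6.0]
```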

Dataset Description

Dataset | Description
Census Income | Extraction was done by Barry Becker from the 1994 Census database. Prediction task is to determine whether a person makes over 50K a year.
Alibaba Display Ads | Alibaba Display Ads
Criteo | To be updated...
Avazu | To be updated...
Tencent Social Ads | To be updated...

Evaluation

To be updated...

Update Log

  • 20210421: Support tf 2.4.1 version.
  • 20210410: Fix census csv data read bug. Update README.md: add deep interaction docs.

gold-deep-rank's Issues

Question about running train_census_ctr_model.py

I ran train_census_ctr_model.py as described in the project:
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./model_dir/ckpt_dir/model.ckpt.
INFO:tensorflow:loss = 0.9890392, step = 1
INFO:tensorflow:global_step/sec: 153.472
INFO:tensorflow:loss = 1.848564e-09, step = 501 (3.258 sec)
INFO:tensorflow:global_step/sec: 263.287
INFO:tensorflow:loss = 2.7345473e-10, step = 1001 (1.899 sec)
INFO:tensorflow:global_step/sec: 265.398
INFO:tensorflow:loss = 3.7554106e-11, step = 1501 (1.884 sec)
INFO:tensorflow:Saving checkpoints for 2000 into ./model_dir/ckpt_dir/model.ckpt.
INFO:tensorflow:global_step/sec: 252.654
INFO:tensorflow:loss = 1.8669908e-11, step = 2001 (1.979 sec)
INFO:tensorflow:Saving checkpoints for 2036 into ./model_dir/ckpt_dir/model.ckpt.
INFO:tensorflow:Loss for final step: 5.3157163e-11.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saving dict for global step 2036: accuracy = 1.0, auc = 1.0, global_step = 2036, loss = 4.4924506e-10
The loss, accuracy, AUC, and other metrics do not look right. What could be the reason?
