GithubHelp home page GithubHelp logo

wechat_big_data_baseline_pytorch's Introduction

2021**高校计算机大赛-微信大数据挑战赛Baseline(Pytorch版)

代码实现基于DeepCTR,只做了简单数据预处理,采用的是基本特征(离散特征:{'userid', 'feedid', 'authorid', 'bgm_song_id', 'bgm_singer_id'},连续特征:{'videoplayseconds'}),单个任务逐个预测,大家可以尝试改进原有的模块,并尝试新的模型和新的建模方法。 baseline纯为学习和参考,有什么做的不对的地方,还请大佬们批评和指正😄

1.环境配置

  • python3
  • torch 1.6
  • deepctr-torch 0.2.6
  • pandas 1.0.1
  • scikit-learn 0.22.1

2.运行配置

  • CPU/GPU均可
  • 最小内存要求
    • 特征/样本生成:6G
    • 模型训练及评估:4G

3.目录结构

  • prepare_data.py 数据集生成
  • baseline.py: 模型训练,评估,提交

4.运行流程

  • 新建data目录,下载比赛数据集,放在data目录下并解压,得到wechat_algo_data1目录
  • 数据集生成:运行prepare_data.py
  • 模型训练,评估,提交:运行baseline.py

5.模型及参数

模型:DeepFM

参数: batch_size: 512

optim: Adagrad

num_epochs: 5

learning_rate: 0.1

6.模型结果

我的线下验证集评价指标还没有改成和官网一致的方法,所以参考意义不大,需要大家重写评估方法

线上:

weight_uauc read_comment like click_avatar forward
0.640712 0.624132 0.61151 0.705983 0.664097

7.相关文献

Guo H, Tang R, Ye Y, et al. Deepfm: a factorization-machine based neural network for ctr prediction[J]. arXiv preprint arXiv:1703.04247, 2017.

wechat_big_data_baseline_pytorch's People

Contributors

dpoqb avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.