
samzhan_tag's Introduction

Dependencies:

tensorflow > 1.0.0

python == 2.7.12

Preprocess:

The parameters are set on the last line of the script:

Data_extraction('raw_data', 'MySheet', './usr.dict', './stop_words.txt', 'all', 'test_raw', True)
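
| Argument | Meaning |
| --- | --- |
| raw_data | Folder of raw training data (may contain several training files) |
| MySheet | Name of the sheet that holds the data. The name my advisor uses keeps changing, so it is best to standardize it |
| usr.dict | User dictionary that jieba loads for word segmentation |
| stop_words.txt | Stop-word list |
| all | Name of the folder for the preprocessed data; the prefix 'data_' is added, so 'all' gives data_all |
| test_raw | Folder of the '看点' (Kandian) source files, used for the final test |
| True | True generates train and dev; False generates test |
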
python Preprocess.py

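This script is a bit rough, mainly because the requirements and the data format were not stable early on. The logic is simple and easy to change, though, and I will revise it once everything is finalized. When this step finishes, it produces training files that are ready to train on, or test files.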
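
The stop-word step of the preprocessing can be sketched roughly as below. This is a minimal illustration and not the code in Preprocess.py: `load_stop_words` and `filter_tokens` are hypothetical helper names, and the one-word-per-line layout of stop_words.txt is an assumption. The segmenter is passed in as a function so the sketch runs without jieba; in the real pipeline one would presumably call `jieba.load_userdict('./usr.dict')` once and pass `jieba.lcut`.

```python
import io


def load_stop_words(path):
    """Read stop words from a UTF-8 file, assuming one word per line."""
    with io.open(path, encoding="utf-8") as handle:
        return set(line.strip() for line in handle if line.strip())


def filter_tokens(text, segment, stop_words):
    """Segment text, then drop stop words and whitespace-only tokens."""
    return [tok for tok in segment(text) if tok.strip() and tok not in stop_words]


# Whitespace splitting stands in for jieba.lcut in this sketch.
tokens = filter_tokens("the market is up today", lambda s: s.split(), {"the", "is"})
print(tokens)
```

Keeping the segmenter as a parameter also makes it easy to swap in a different tokenizer if the data format changes again.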

Train:

Create the model folder

mkdir yourmodeldir

Generate the config file (the hyperparameter table)

python train_test.py --weight-path yourmodeldir

Train

CUDA_VISIBLE_DEVICES=1 python train_test.py --weight-path yourmodeldir --load-config
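
The `CUDA_VISIBLE_DEVICES=1` prefix exposes only the GPU with index 1 to the process. As a sketch, the same restriction can be applied from inside Python, as long as the variable is set before TensorFlow initializes CUDA. `select_gpu` is a hypothetical helper and is not part of train_test.py.

```python
import os


def select_gpu(index):
    """Expose only the GPU with the given index to this process."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


visible = select_gpu(1)
print(visible)
# Import tensorflow only after this point so that the setting takes effect.
```

Setting the variable on the command line, as above, avoids any dependence on import order and is usually the simpler choice.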

Test:

Model Link

https://pan.baidu.com/s/1K4R8kL9UkC2BYBng5aVwmg
CUDA_VISIBLE_DEVICES=1 python train_test.py --weight-path yourmodeldir --load-config --train-test test > all_res.txt

res.txt contains the prediction results.
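
The three invocations of train_test.py in this README differ only in their flags. The sketch below shows how such flags could be parsed with argparse and which action each combination implies. It is an assumption about the behaviour, not the actual code of train_test.py: `build_parser` and `describe` are hypothetical names, and the `train` default for `--train-test` is inferred from the training command, which omits that flag.

```python
import argparse


def build_parser():
    """Flags are taken from the commands in this README; defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Sketch of the train_test.py command line")
    parser.add_argument("--weight-path", required=True,
                        help="model folder that holds the config and checkpoints")
    parser.add_argument("--load-config", action="store_true",
                        help="read the existing config instead of writing a new one")
    parser.add_argument("--train-test", default="train", choices=["train", "test"],
                        help="run mode; the 'train' default is an assumption")
    return parser


def describe(argv):
    """Return the action implied by the flags, mirroring the three commands."""
    args = build_parser().parse_args(argv)
    if not args.load_config:
        return "write config to " + args.weight_path
    return args.train_test + " with config from " + args.weight_path


print(describe(["--weight-path", "yourmodeldir"]))
print(describe(["--weight-path", "yourmodeldir", "--load-config"]))
print(describe(["--weight-path", "yourmodeldir", "--load-config", "--train-test", "test"]))
```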

File layout on my server

.
├── all_model
│   ├── checkpoint
│   ├── classifier.weights.data-00000-of-00001
│   ├── classifier.weights.index
│   ├── classifier.weights.meta
│   ├── config
│   └── run.log
├── all_res.txt
├── Config.py
├── Config.pyc
├── data_all
│   ├── all_res.txt
│   ├── dev
│   ├── dict
│   ├── test
│   ├── tmp_label
│   └── train
├── data_entertain
│   ├── dev
│   ├── dict
│   └── train
├── data_finance
│   ├── dev
│   ├── dict
│   ├── test
│   ├── tmp_label
│   └── train
├── data_helpers.py
├── data_helpers.pyc
├── dict
├── entertain_model
│   ├── checkpoint
│   ├── classifier.weights.data-00000-of-00001
│   ├── classifier.weights.index
│   ├── classifier.weights.meta
│   ├── config
│   └── run.log
├── finance_model
│   ├── checkpoint
│   ├── classifier.weights.data-00000-of-00001
│   ├── classifier.weights.index
│   ├── classifier.weights.meta
│   ├── config
│   └── run.log
├── Model_father.py
├── Model_father.pyc
├── Preproess.py
├── raw_data
│   ├── toutiao_content_image_2.xlsx
│   └── toutiao_content_image.xlsx
├── res.txt
├── SeqLabel_model.py
├── SeqLabel_model.pyc
├── stop_words.txt
├── test_raw
│   └── kd_new_0703-2.xls
├── TfUtils.py
├── TfUtils.pyc
├── Train_father.py
├── Train_father.pyc
├── train_test.py
├── usr.dict
├── util.py
└── util.pyc
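
Before running the test command against a downloaded model, it can help to confirm that the model folder is complete. The sketch below checks for the file names that appear under all_model in the listing above. `missing_model_files` is a hypothetical helper, and treating exactly these five entries as required is an assumption (run.log is left out because it looks like a by-product of training).

```python
import os

# File names taken from the all_model folder in the listing above.
EXPECTED_FILES = [
    "checkpoint",
    "classifier.weights.data-00000-of-00001",
    "classifier.weights.index",
    "classifier.weights.meta",
    "config",
]


def missing_model_files(model_dir):
    """Return the expected entries that are absent from model_dir."""
    return [name for name in EXPECTED_FILES
            if not os.path.exists(os.path.join(model_dir, name))]


missing = missing_model_files("all_model")
print(missing)
```

An empty list means every expected entry was found; anything else names what still has to be copied into the folder.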

Contributors:

gitsamshi
