
zhangbincheng1997 / ebook-lsi-cnn


A deep intelligent reading model based on latent semantic indexing (LSI) and convolutional neural networks (CNN)


ebook-lsi-cnn's Introduction

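Intelligent Reading Model

Overview

The system has two parts: the first performs keyword matching and the second performs precise matching. The answer with the highest confidence is then selected as the correct one.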

The first part can use traditional methods such as TF-IDF and LSI.
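As an illustration of the keyword-matching stage, here is a minimal TF-IDF sketch in pure Python. It is not the repository's code: the function names (`tfidf_vectors`, `cosine`, `rank_passages`), the smoothed IDF formula, and the toy passages are assumptions made for this example, and the input is assumed to be already tokenized (Chinese text would first need a word segmenter).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch), always strictly positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (f / len(doc)) * idf[t] for t, f in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_passages(question, passages):
    """Return (passage_index, score) pairs sorted by descending similarity."""
    vectors, idf = tfidf_vectors(passages)
    # Question terms unseen in the passages carry no weight and are dropped.
    q_tf = Counter(t for t in question if t in idf)
    total = sum(q_tf.values())
    q_vec = {t: (f / total) * idf[t] for t, f in q_tf.items()} if total else {}
    scores = [(i, cosine(q_vec, vec)) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy, hypothetical data: three tokenized passages and one tokenized question.
passages = [
    ["cnn", "text", "classification"],
    ["lstm", "sequence", "model"],
    ["tf", "idf", "keyword", "matching"],
]
ranking = rank_passages(["keyword", "matching"], passages)
best_index = ranking[0][0]
```

Only the third passage shares terms with the question, so it is ranked first and the other two receive a score of zero. LSI would add a dimensionality-reduction step (a truncated SVD of the TF-IDF matrix) before the cosine comparison; that step is omitted here.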

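The second part can use a deep-learning-based question answering system.

This document focuses on building the second part. For the first part, see: https://github.com/littleredhat1997/doc-similarity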
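The two-stage flow described in this overview can be sketched as follows. This is a hedged illustration, not the project's implementation: `keyword_overlap`, `answer`, `jaccard`, and the `top_k` parameter are hypothetical names, the stage-one score is a simple token-overlap stand-in, and the stage-two scorer is passed in as a function where the project would use a trained neural model.

```python
def keyword_overlap(question_tokens, passage_tokens):
    """Stage-one stand-in: fraction of question tokens found in the passage."""
    if not question_tokens:
        return 0.0
    passage_set = set(passage_tokens)
    return sum(t in passage_set for t in question_tokens) / len(question_tokens)


def answer(question_tokens, passages, precise_scorer, top_k=3):
    """Shortlist passages by keyword matching, then pick the most confident one.

    Returns (passage_index, confidence), or None if there are no passages.
    """
    if not passages:
        return None
    # Stage 1: keep the top_k passages by keyword overlap.
    shortlist = sorted(
        range(len(passages)),
        key=lambda i: keyword_overlap(question_tokens, passages[i]),
        reverse=True,
    )[:top_k]
    # Stage 2: score only the shortlist with the (more expensive) precise model.
    scored = [(i, precise_scorer(question_tokens, passages[i])) for i in shortlist]
    # Final decision: highest confidence wins.
    return max(scored, key=lambda pair: pair[1])


def jaccard(question_tokens, passage_tokens):
    """Toy scorer used only to make the example runnable; it stands in for a
    trained model that would return a confidence for the question/passage pair."""
    q, p = set(question_tokens), set(passage_tokens)
    return len(q & p) / len(q | p) if q | p else 0.0


question = ["what", "is", "cnn"]
passages = [
    ["cnn", "is", "a", "convolutional", "network"],
    ["lstm", "is", "recurrent"],
    ["weather", "today"],
    ["cnn"],
]
best = answer(question, passages, precise_scorer=jaccard, top_k=2)
```

Restricting the second stage to a shortlist keeps the expensive scorer off passages that share no keywords with the question.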

Download the Data

  1. http://www.tipdm.org/jingsa/1253.jhtml
    train_data_complete.json, test_data_sample.json, submit_sample.txt => main/data folder

  2. https://spaces.ac.cn/archives/4338
    me_train.json => generalization/data folder

Run the Project

$ ......
1. word2vec/step.ipynb -> 
  word2vec/word2vec.ipynb
2. main/data/data.ipynb -> 
  main/data/1xxxxxx.ipynb 
  main/data/2xxxxxx.ipynb 
  main/data/3xxxxxx.ipynb 
  main/data/4xxxxxx.ipynb 
  main/data/5xxxxxx.ipynb 
  -> main/evaluate.ipynb
3. test/data/newdata.ipynb -> 
  test/data/data.ipynb -> 
  test/predict.ipynb -> 
  test/evaluate.ipynb

$ tree
.
├── word2vec
│   ├── step.ipynb
│   └── word2vec.ipynb
├── main
│   ├── data
│   │   └── data.ipynb
│   ├── 1_FastText.ipynb
│   ├── 2_CNN1.ipynb
│   ├── 3_CNN2.ipynb
│   ├── 4_BiLSTM.ipynb
│   ├── 5_Attention.ipynb
│   └── evaluate.ipynb
└── test
    ├── data
    │   ├── data.ipynb
    │   └── newdata.ipynb
    ├── evaluate.ipynb
    └── predict.ipynb
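The run order above can be automated. The sketch below is one possible way to do it, assuming each notebook runs top to bottom without manual input and that the `jupyter nbconvert` command is installed. The source does not say how the notebooks were actually executed, and the helper names (`nbconvert_command`, `run_in_order`) are hypothetical. Only the step 1 notebooks are listed, because the model notebook names are elided in the run order.

```python
import subprocess


def nbconvert_command(notebook):
    """Command line that executes a notebook and saves the outputs in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_in_order(notebooks, dry_run=True):
    """Build the commands in order; run them only when dry_run is False."""
    commands = [nbconvert_command(nb) for nb in notebooks]
    if not dry_run:
        for command in commands:
            # check=True stops the sequence at the first failing notebook.
            subprocess.run(command, check=True)
    return commands


# Step 1 of the run order: the word2vec notebooks.
STEP_1 = ["word2vec/step.ipynb", "word2vec/word2vec.ipynb"]
commands = run_in_order(STEP_1)  # dry run: nothing is executed
```

Calling `run_in_order(STEP_1, dry_run=False)` from the repository root would execute the two notebooks one after the other.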

Model Design

  1. FastText

  2. CNN

  3. Bi-LSTM

  4. Attention
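The architecture diagrams are not reproduced here, so the following is only a generic, pure-Python sketch of the pooling idea behind three of the listed model families, not the layers used in the notebooks. All names, the toy vectors, the single convolution filter, and the attention query vector are assumptions for illustration. The Bi-LSTM is omitted because a recurrent cell does not fit in a short example.

```python
import math


def mean_pool(vectors):
    """FastText-style sentence vector: element-wise average of token vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def conv_max_pool(vectors, kernel):
    """One CNN filter: slide a window of len(kernel) tokens over the sentence,
    apply ReLU, then keep the maximum activation (max-over-time pooling).
    Assumes the sentence has at least len(kernel) tokens."""
    width = len(kernel)
    activations = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        act = sum(w * x
                  for k_row, vec in zip(kernel, window)
                  for w, x in zip(k_row, vec))
        activations.append(max(0.0, act))
    return max(activations)


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weight each token by softmax(dot(token, query)), then sum the tokens."""
    scores = [sum(a * b for a, b in zip(vec, query)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(dim)]
    return pooled, weights


# Toy 2-dimensional "embeddings" for a three-token sentence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_mean = mean_pool(tokens)
feature = conv_max_pool(tokens, kernel=[[1.0, 0.0], [0.0, 1.0]])
pooled, weights = attention_pool(tokens, query=[1.0, 1.0])
```

In each case a variable-length sequence of token vectors is reduced to a fixed-size representation that a classifier can score. The three approaches differ in how much word order and token importance they capture.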

