段清华DEAN's Projects
High level Python client for Elasticsearch
This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
3D WebGL game engine with online toolset.
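An experimental, demo-level tool for text information extraction (event-triple extraction), which can be a route to event chains and topic graphs. It extracts event triples based on dependency parsing and semantic role labeling, and can be used for text-understanding applications such as document topic chains and event lines.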
Code for paper 'Audio-Driven Emotional Video Portraits'.
Deepfakes Software For All
A library for efficient similarity search and clustering of dense vectors.
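The strongest (SOTA) spell checker for Simplified and Traditional Chinese across industry and academia: FASPell Chinese Spell Checker (Chinese spell check / Chinese spelling error detection / Chinese spelling correction)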
saveTextAs() for all browsers & saveAs() FileSaver for HTML5
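A word-similarity method that combines the extended Tongyici Cilin (同义词词林扩展版) thesaurus with HowNet, giving broader vocabulary coverage and more accurate results.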
Grid based on CSS3 flexbox
Fnlib provides a simple specification that can be used to create and deploy FaaS.
Wechaty Community Assistant Bot
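Chinese and English sensitive words, language detection, location and carrier lookup for Chinese and foreign mobile/landline numbers, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal-name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment scores, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, simulating Chinese pronunciation with English, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, comprehensive company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place-name lexicon, historical figures lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese question-answering dataset, collection of sentence-similarity matching algorithms, BERT resources, text generation and summarization tools, cocoNLP information extraction tool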
:crystal_ball: Tiny and blazing-fast fuzzy search in JavaScript
Toy Python implementation of http://www-nlp.stanford.edu/projects/glove/
Chinese version of GPT2 training code, using BERT or BPE tokenizer.
GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter pretrained Chinese model.
A flexible tool for redirecting a given program's TCP traffic to SOCKS5 proxy.
gStore - a graph based RDF triple store.
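Natural language processing: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, keyword extraction, new word discovery, phrase extraction, automatic summarization, text classification, pinyin conversion, and Simplified/Traditional Chinese conversion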
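A domain-adaptive text mining tool (new word discovery, sentiment analysis, entity linking, etc.), based on a small number of seed words and background knowledge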
A list of everything that goes in the <head> of your document
A high-performance, easily extensible, and fully open-source natural interaction system
PyTorch implementations of neural network models for keyword spotting