haiyang's Projects

distribute_crawler

A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite. A MongoDB cluster provides the underlying storage, Redis handles distribution, and Graphite displays crawler status.

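distributed_spider_pku_java

1. The system has three modules: a crawling module, a data-processing module, and a user module.
2. The crawling module collects football news and user comments about football from Zhibo8 (直播吧), Sina Sports, and NetEase Sports. It uses a Hadoop cluster to fetch pages, analyzes them to obtain the URL set, and extracts feature URLs.
3. Linux scripts filter the pages to obtain the raw web pages; a second filtering pass extracts the text, which is kept in distributed storage.
4. The processing module derives a word segmenter from training-set rules one and two, then applies it to the text to produce training results.
5. A feature script classifies the feature words in the training results, from which fuzzy sets of teams and star players are extracted.
6. Filtering yields exact sets of teams and star players, which are stored in a MySQL database.
7. Team and player information is read from the database for chart analysis, wiki information is displayed dynamically, and the results are loaded into the display module for interaction with the user.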

distributedcrawler

A distributed crawler with Redis caching, MySQL persistence, and RPC for distribution. Can be deployed with Docker.

distribution

Short, simple, direct scripts for creating ASCII graphical histograms in the terminal.

distroless

🥑 Language-focused Docker images, minus the operating system.

ditto

Lightweight Markdown Documentation System

ditto-1

🐘 Ditto is a scripting language implemented in C

dive

A tool for exploring each layer in a Docker image

dive-into-dl-pytorch

This project ports the MXNet code in the original book "Dive into Deep Learning" (《动手学深度学习》) to PyTorch.

dive-into-dl-tensorflow2.0

This project ports the MXNet implementation in the original book "Dive into Deep Learning" (《动手学深度学习》) to TensorFlow 2.0. The project has the approval of Mu Li (李沐).

diveintohtml5

A copy of Mark Pilgrim’s “Dive Into HTML5” book, hosted by HTML5 Doctors. To help improve it, submit a pull request or open an issue. More info at http://html5doctor.com/dive-into-html5-doctor/

diy

Build your own robot in one hour! (This is the entry version "green", NO Bluetooth; for latest updates please go here:

dj-poetry-es

A search engine for Tang and Song dynasty poetry, built with Django and ES (Elasticsearch), with a decoupled front end and back end.

django

The Web framework for perfectionists with deadlines.
