poozhu / crawler-for-github-trending

🕷️ A node crawler for github trending.

Home Page: https://poozhu.cn/project-center/#/github-trending

Languages: JavaScript 100.00%
Topics: crawler, node-crawler, github-trending

crawler-for-github-trending's Introduction

crawler-for-github-trending

A minimalist, 50-line Node crawler for GitHub Trending.
It is a small project for trying out axios, express, and cheerio.

Usage

A short introduction: https://juejin.cn/post/6844903827024396296

First make sure Node 10.0+ is installed, then:

1. Clone this project

git clone https://github.com/poozhu/crawler-for-github-trending.git
cd crawler-for-github-trending
npm i
node index.js

2. Or download the project archive and unzip it

cd crawler-for-github-trending-master  # enter the project folder
npm i
node index.js
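
Because the steps above assume Node 10.0+, checking the runtime version before starting can save a confusing error later. A minimal sketch; the helper name meetsMinimumNode is hypothetical and not part of the project:

```javascript
// Hypothetical helper (not part of the project): returns true when a Node
// version string such as "v10.15.3" or "6.11.3" has a major version >= minMajor.
function meetsMinimumNode(version, minMajor) {
  const major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// process.version is the running Node version, e.g. "v10.15.3".
if (!meetsMinimumNode(process.version, 10)) {
  console.warn(`Node ${process.version} detected; this project expects Node 10.0+.`);
}
```

The same information is available from the command line with node -v.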

Examples

Once the project is running, the console prints:

Listening on port 3000!

Now open a browser and visit http://localhost:3000/

http://localhost:3000/list/:time/:language  // time is the period, language is the programming language. For example:

http://localhost:3000/list/daily  // today's trending; other options: weekly, monthly
http://localhost:3000/list/daily/JavaScript  // today's trending for JavaScript; any language may be used
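
How the two path parameters might map onto a GitHub Trending URL can be sketched as follows. This is an illustration, not the project's actual code: the function name buildTrendingUrl and the validation rules are assumptions, while https://github.com/trending/<language>?since=<period> is the URL form used by GitHub's own Trending page.

```javascript
// Hypothetical helper: maps the :time and :language route parameters to a
// GitHub Trending page URL. Names and validation rules are assumptions.
const PERIODS = ['daily', 'weekly', 'monthly'];

function buildTrendingUrl(time, language) {
  if (!PERIODS.includes(time)) {
    throw new RangeError(`time must be one of: ${PERIODS.join(', ')}`);
  }
  // language is optional; encode it so names like "C++" stay valid in a path.
  const path = language ? `/${encodeURIComponent(language)}` : '';
  return `https://github.com/trending${path}?since=${time}`;
}

console.log(buildTrendingUrl('daily'));
// https://github.com/trending?since=daily
console.log(buildTrendingUrl('weekly', 'JavaScript'));
// https://github.com/trending/JavaScript?since=weekly
```

Rejecting unknown periods up front avoids issuing a crawl request that cannot return useful data.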

After a short wait, the server responds with the crawled data:

[
 {
  "title": "lib-pku / libpku",
  "links": "https://github.com/lib-pku/libpku",
  "description": "贵校课程资料民间整理",
  "language": "JavaScript",
  "stars": "14,297",
  "forks": "4,360",
  "info": "3,121 stars this week"
 },
 {
  "title": "SqueezerIO / squeezer",
  "links": "https://github.com/SqueezerIO/squeezer",
  "description": "Squeezer Framework - Build serverless dApps",
  "language": "JavaScript",
  "stars": "3,212",
  "forks": "80",
  "info": "2,807 stars this week"
 },
 ...
]
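
Note that stars and forks come back as display strings with thousands separators. A consumer that needs numbers could convert them along these lines; toCount and byStarsDesc are hypothetical names, and the sample reuses the two records shown above:

```javascript
// Hypothetical helper: converts a display string such as "14,297" to 14297.
function toCount(text) {
  return Number(String(text).replace(/,/g, ''));
}

// Two records copied from the sample response above (fields trimmed).
const repos = [
  { title: 'SqueezerIO / squeezer', stars: '3,212', forks: '80' },
  { title: 'lib-pku / libpku', stars: '14,297', forks: '4,360' },
];

// Sort a copy by star count, highest first.
const byStarsDesc = repos
  .slice()
  .sort((a, b) => toCount(b.stars) - toCount(a.stars));

console.log(byStarsDesc.map((r) => `${r.title}: ${toCount(r.stars)}`));
// [ 'lib-pku / libpku: 14297', 'SqueezerIO / squeezer: 3212' ]
```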

More

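This project crawls the data live on every request, so responses are fairly slow. If you want to use it as an API, consider crawling on a schedule and storing the results in a database.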
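
One lightweight way to avoid crawling on every request is to keep the last result for a fixed period; a scheduled job that writes to a database follows the same idea. A minimal in-memory sketch, in which cached, fakeCrawl, getTrending, and the 10-minute lifetime are all assumptions made for illustration:

```javascript
// Hypothetical wrapper: caches the result of an async crawl function per key
// and only calls it again once the entry is older than ttlMs.
function cached(crawl, ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, fetchedAt }
  return async function get(key) {
    const hit = entries.get(key);
    if (hit && now() - hit.fetchedAt < ttlMs) {
      return hit.value; // still fresh: skip the crawl
    }
    const value = await crawl(key);
    entries.set(key, { value, fetchedAt: now() });
    return value;
  };
}

// Usage with a stand-in crawl function (a real one would fetch and parse HTML).
let calls = 0;
const fakeCrawl = async (key) => ({ key, crawl: ++calls });
const getTrending = cached(fakeCrawl, 10 * 60 * 1000);

getTrending('daily/JavaScript')
  .then(() => getTrending('daily/JavaScript'))
  .then((result) => console.log(result, 'crawls made:', calls));
// { key: 'daily/JavaScript', crawl: 1 } crawls made: 1
```

The clock is passed in as a parameter so that expiry can be exercised without waiting in real time.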

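Even so, reading the project code is a way to learn the most basic usage and concepts of the Node modules above and of web crawling. I hope it helps.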

crawler-for-github-trending's Issues

TypeError trimStart is not a function

Environment:
Node Version: v6.11.3
NPM Version: 5.8.0
MacOS Version: 10.14.5

Running it on a Mac produces this error:

Listening on port 3000!
TypeError: $(...).find(...).text(...).trimStart is not a function
at Object.<anonymous> (/Users/jacksonshawn/GitHub/Crawler-for-Github-Trending/index.js:16:55)
at initialize.exports.each (/Users/jacksonshawn/GitHub/Crawler-for-Github-Trending/node_modules/cheerio/lib/api/traversing.js:300:24)
at /Users/jacksonshawn/GitHub/Crawler-for-Github-Trending/index.js:13:32

Changing ".trimStart().trimEnd()" in index.js to ".trim()" makes the error go away. What is the cause? Could you help explain?
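
For context: String.prototype.trimStart and trimEnd were standardized in ES2019 and ship with Node 10 and later, so they do not exist on Node v6.11.3, whereas trim() has been available since ES5. Chaining trimStart() and trimEnd() removes the same whitespace as a single trim(), which is why the substitution works. A small sketch of a version-tolerant variant; the helper name trimBoth is hypothetical:

```javascript
// Hypothetical helper: uses trimStart/trimEnd where the runtime provides them
// and falls back to trim(), which yields the same result on older runtimes.
function trimBoth(text) {
  if (typeof text.trimStart === 'function' && typeof text.trimEnd === 'function') {
    return text.trimStart().trimEnd();
  }
  return text.trim();
}

const raw = '\n   lib-pku / libpku  \n';
console.log(JSON.stringify(trimBoth(raw))); // "lib-pku / libpku"
console.log(trimBoth(raw) === raw.trim());  // true
```

Upgrading to the Node 10.0+ runtime named in the Usage section removes the need for any fallback.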
