realdennis / ptt-parser

An asynchronous PTT crawler that works on all major boards and saves the results as JSON, intended for data collection.

Home Page: https://medium.com/@realdennis/crawler-%E4%BD%BF%E7%94%A8puppeteer%E7%88%AC%E5%8F%96ptt%E7%9A%84%E7%B6%B2%E9%A0%81-1684568f6cb4

JavaScript 100.00%

ptt-parser

Introduction

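A crawler written with puppeteer (pptr). The core approach is to write the extraction logic as ordinary DOM operations (querySelector) and then pass that script into pptr to run, so front-end developers can write a crawler with very little effort. This repo is meant for teaching and introduction. To install, follow the steps below, or clone the repo and install it yourself.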
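The approach described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the function names `extractPosts` and `crawlIndex`, the PTT web URL pattern, and the `.r-ent` / `.title a` selectors are all assumptions made for the example.

```javascript
// Hypothetical sketch: extraction logic written as plain DOM code,
// then handed to puppeteer to run inside the page.
// The selectors are assumptions about PTT's board index markup.
function extractPosts(doc = document) {
  return Array.from(doc.querySelectorAll('.r-ent'))
    .map((entry) => {
      const link = entry.querySelector('.title a');
      // Entries without a link (e.g. deleted posts) are skipped.
      if (!link) return null;
      return { title: link.textContent.trim(), href: link.getAttribute('href') };
    })
    .filter(Boolean);
}

async function crawlIndex(board) {
  // Loaded lazily so extractPosts can be exercised without puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Assumed URL pattern; age-gated boards are not handled in this sketch.
    await page.goto(`https://www.ptt.cc/bbs/${board}/index.html`);
    // page.evaluate serializes the function and runs it in the browser,
    // where the default argument resolves to the page's own `document`.
    return await page.evaluate(extractPosts);
  } finally {
    await browser.close();
  }
}
```

Because `extractPosts` only touches the DOM API, it can be developed and debugged in the browser console before being passed to puppeteer.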

Dependency

Only one -- Puppeteer!

Demo

Async request/file save

Usage

$  npm install ptt-parser -g
$  ptt-parser gossiping 1000 false
    or
$  ptt-parser beauty   # by default, crawls starting from the latest page

Explanation: ptt-parser ${board name} ${number of pages to crawl backward} ${whether to run headless}
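As an illustration of how these three positional arguments might be read, here is a minimal sketch. The function name `parseArgs` is hypothetical, and both defaults (one page, headless on) are assumptions; the README only states that crawling starts from the latest page by default.

```javascript
// Hypothetical parser mirroring: ptt-parser <board> <pages> <headless>
function parseArgs(argv) {
  const [board, pages, headless] = argv;
  if (!board) {
    throw new Error('usage: ptt-parser <board> [pages] [headless]');
  }
  return {
    board,
    // Assumed default: a single page when no count is given.
    pages: pages === undefined ? 1 : Number.parseInt(pages, 10),
    // Assumed default: headless unless the literal string "false" is passed.
    headless: headless !== 'false',
  };
}

// In a real CLI entry point, process.argv.slice(2) drops the node binary
// and the script path, leaving only the user-supplied arguments:
// const options = parseArgs(process.argv.slice(2));
```

With this sketch, `ptt-parser gossiping 1000 false` would map to board `gossiping`, 1000 pages, and a visible (non-headless) browser window.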

Result

ptt-parser's Issues

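Question about closing the Browser

Hi Dennis,

While reading about puppeteer, I noticed that their examples usually end with a browser.close() call, whereas your codebase only calls page.close(). Is that because the version you were developing against (v1.7.0) did not yet have a browser.close() method, or did you decide at the time that page.close() alone was enough?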
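For context on the distinction this question raises: in puppeteer's API, page.close() closes a single page, while browser.close() closes the browser together with all of its pages. A minimal sketch of releasing both is shown below. The helper name `withPage` and the injected `launch` parameter are hypothetical and are not taken from this repo.

```javascript
// Sketch: run a task on a fresh page, then release the page and the browser.
// `launch` is injected, e.g. () => require('puppeteer').launch(), so the
// cleanup logic can be exercised without starting a real browser.
async function withPage(launch, task) {
  const browser = await launch();
  try {
    const page = await browser.newPage();
    try {
      return await task(page);
    } finally {
      await page.close(); // closes this page only
    }
  } finally {
    await browser.close(); // closes the browser and every page it still holds
  }
}
```

The nested `finally` blocks ensure the browser is closed even when the task throws, so no browser process is left running after a failed crawl.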
