aqi's Introduction

Uses scrapy + scrapy-redis + selenium to crawl weather information for every city in China from the aqi weather site, nearly 500,000 records in total.

aqistudy

Scrapy mainly provides concurrency: it handles the requests for pages that do not need Selenium rendering, plus the storage I/O.

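Redis can support incremental crawling (by not fingerprinting the city links) or distributed crawling (by inheriting from the scrapy-redis spider classes). This project uses it only for resumable crawling: Redis stores the URL fingerprints in a set and the request queue in a sorted set.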
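
The resumable setup described above can be sketched as a settings fragment. This is a minimal sketch, not the project's actual configuration: the setting names are the ones scrapy-redis documents, while the Redis URL is an assumed local instance.

```python
# settings.py (fragment): a sketch of scrapy-redis options for resumable crawling.

# Schedule requests through Redis instead of Scrapy's in-memory scheduler.
# The default scrapy-redis queue is a priority queue backed by a sorted set.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Store request fingerprints in a Redis set so already-seen URLs are skipped.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and the fingerprint set when the spider stops,
# so the next run resumes where the previous one left off.
SCHEDULER_PERSIST = True

# Assumption: a local Redis instance on the default port.
REDIS_URL = "redis://localhost:6379/0"
```

For the incremental variant mentioned above, a request can opt out of fingerprinting with Scrapy's `dont_filter=True` argument, so that it is fetched again on every run.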

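Selenium renders the month and day pages. Both pages are protected by JS encryption, and rendering them through Selenium gets around that cleanly. The downloader middleware overrides the process_request method to run the Selenium operations with PhantomJS (which rendered these pages faster than Chrome), wraps the retrieved data in a new response and returns it, and the engine hands that response to the spider for parsing.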

The data is saved in JSON format, then cleaned and visualized with the usual data-analysis trio: numpy, pandas, and matplotlib.
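
A cleaning step of this kind could be sketched with pandas as follows. The field names (`city`, `date`, `aqi`), the `--` placeholder for missing values, and the sample records are all made up for illustration; the real scraped schema may differ.

```python
import pandas as pd


def clean_aqi(records):
    """Turn raw scraped records into a typed, de-duplicated DataFrame."""
    df = pd.DataFrame(records)
    # Scraped values arrive as strings; unparseable ones become NaT / NaN.
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    df["aqi"] = pd.to_numeric(df["aqi"], errors="coerce")
    # Drop rows with missing values, then rows repeated by a resumed crawl.
    df = df.dropna(subset=["date", "aqi"])
    df = df.drop_duplicates(subset=["city", "date"])
    return df.sort_values(["city", "date"]).reset_index(drop=True)


# Hypothetical sample records, not real measurements.
sample = [
    {"city": "Shenzhen", "date": "2018-06-01", "aqi": "35"},
    {"city": "Shenzhen", "date": "2018-06-02", "aqi": "112"},
    {"city": "Shenzhen", "date": "2018-06-02", "aqi": "112"},  # duplicate row
    {"city": "Guangzhou", "date": "2018-06-01", "aqi": "--"},  # missing value
    {"city": "Guangzhou", "date": "2018-06-02", "aqi": "58"},
]
df = clean_aqi(sample)
print(df)
```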

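First, a group shot: a comparison of AQI across China's major cities in early July. It actually looks fairly acceptable... The chart is rendered with ECharts, which looks a good deal better than matplotlib (^▽^)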

Air quality in major Chinese cities on July 1

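Shenzhen's AQI trend for June. Shenzhen lives up to its reputation for the best air quality among the first-tier cities: only one day in June had poor air quality.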
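
A monthly trend and a count of poor-air days like the ones above could be derived along these lines. This is only a sketch: the column names, the sample values, and the threshold of 100 for a "poor" day are assumptions, not taken from the project.

```python
import pandas as pd


def monthly_summary(df, city, month, threshold=100):
    """Return one city's daily AQI series for a month and its poor-day count."""
    in_month = df["date"].dt.strftime("%Y-%m") == month
    series = (
        df.loc[(df["city"] == city) & in_month]
        .set_index("date")["aqi"]
        .sort_index()
    )
    # Assumption: a day counts as "poor" when its AQI exceeds the threshold.
    return series, int((series > threshold).sum())


# Hypothetical data, not real measurements.
df = pd.DataFrame(
    {
        "city": ["Shenzhen", "Shenzhen", "Shenzhen", "Guangzhou"],
        "date": pd.to_datetime(
            ["2018-06-01", "2018-06-02", "2018-07-01", "2018-06-01"]
        ),
        "aqi": [35, 112, 40, 58],
    }
)
series, poor_days = monthly_summary(df, "Shenzhen", "2018-06")
print(series)
print("poor days:", poor_days)
```

The returned series can be passed straight to a plotting call such as `series.plot()` to draw the trend line.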

Shenzhen AQI trend for June

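I also compared it with Guangzhou, where I used to work. For all its reputation as a city for good living, Guangzhou falls slightly behind on air quality.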

Guangzhou vs. Shenzhen air quality

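More analysis charts will be added later, and the Jupyter code for cleaning the data and generating the charts will be uploaded as well.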

aqi's People

Contributors

alige32
