intern_email

1. Introduction

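Crawls internship postings from the Shuimu Community (水木社区) forum, classifies and scores them, and emails the postings you are interested in to your mailbox. The crawler is described in a blog post. Postings are currently sorted into only three categories: development, algorithms, and finance.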

2. Deployment

Install MongoDB and start it.

mongod --dbpath /path/of/your/db

Install the third-party Python packages.

pip install bs4 scrapy selenium pymongo schedule

Install PhantomJS; download it from the official website.

3. Running

cd intern
python my_schedule.py
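The source of my_schedule.py is not shown here, so the following is only a minimal sketch of how a daily job could be wired up with the `schedule` package installed above. The helper names `validate_time_str` and `run_daily`, the default time, and the polling interval are assumptions for illustration; only `time_str` comes from this README.

```python
import re
import time


def validate_time_str(time_str):
    """Return time_str if it is a 24-hour HH:MM string, else raise ValueError."""
    if not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", time_str):
        raise ValueError("time_str must look like '09:00', got %r" % time_str)
    return time_str


def run_daily(job, time_str="09:00", poll_seconds=30):
    """Run job once a day at time_str (hypothetical helper; blocks forever)."""
    import schedule  # third-party package from the pip install step

    schedule.every().day.at(validate_time_str(time_str)).do(job)
    while True:
        schedule.run_pending()    # runs the job when its time has come
        time.sleep(poll_seconds)  # avoid busy-waiting between checks
```

Calling `run_daily(crawl_and_mail, "08:30")`, where `crawl_and_mail` is a placeholder for your own function, would block and trigger the job every day at 08:30 local time.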

To deploy on a server:

nohup python my_schedule.py > sm_log.txt 2>&1 &

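Enter your own 126 email address and password (used for 126's SMTP mail relay service). To send from a different email provider, change the SMTP server address to that provider's server.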
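To make the SMTP step concrete, here is a minimal sketch using only the Python standard library. It is not the code in send_2_email.py: the host `smtp.126.com`, port 465 over SSL, and the function names `build_message` and `send_message` are assumptions, so check your provider's settings before relying on them.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.126.com"  # assumption: change this for another provider
SMTP_PORT = 465             # assumption: implicit-SSL port


def build_message(sender, receivers, subject, body):
    """Build a plain-text email addressed to every entry in receivers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(receivers)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(sender, password, msg):
    """Log in to the SMTP server and send msg (needs network access)."""
    with smtplib.SMTP_SSL(SMTP_HOST, SMTP_PORT) as server:
        server.login(sender, password)
        server.send_message(msg)
```

Switching providers in this sketch means changing only `SMTP_HOST` and, if needed, `SMTP_PORT`.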

4. Customization

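Change the destination mailbox: in intern/send_2_email.py, find the receivers list and edit the addresses.

Change the daily start time: in intern/my_schedule.py, set time_str to the time you want.

Customize the scoring criteria: in intern/pipeline.py, find: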

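important_key_dict = {
    'NLP': 5,
    'nlp': 5,
    '自然语言处理': 5,  # natural language processing
    '文本挖掘': 5,  # text mining
    'Spark': 5,
    'spark': 5,
    'LSTM': 5,
    'lstm': 5,
    'word2vec': 5,
    'Tensorflow': 5,
    'tensorflow': 5,
    '机器学习': 4,  # machine learning
    '深度学习': 4,  # deep learning
    '数据挖掘': 4,  # data mining
    '推荐': 4,  # recommendation
    '文本分析': 4,  # text analysis
    '情感识别': 4,  # sentiment recognition
    '计算广告': 3,  # computational advertising
    'python': 3,
    'scala': 3,
    '住房补贴': 2,  # housing subsidy
    '房补': 2,  # housing subsidy (short form)
}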

Edit the keywords and their weights to suit your interests; you can add as many as you like.
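How pipeline.py combines these weights is not shown in this README. One plausible reading, sketched below as an assumption, is that a posting's score is the sum of the weights of every keyword found in its text. The function name `score_posting` and the sample posting are hypothetical.

```python
def score_posting(text, weights):
    """Sum the weights of all keywords that occur in text (case-sensitive)."""
    return sum(weight for keyword, weight in weights.items() if keyword in text)


# A small subset of important_key_dict, for illustration only.
weights = {'NLP': 5, 'nlp': 5, '机器学习': 4, 'python': 3, '房补': 2}

# Hypothetical posting text: matches 'NLP', 'python', and '机器学习'.
posting = '招聘NLP实习生,要求熟悉python和机器学习'
print(score_posting(posting, weights))  # prints 12
```

Matching is case-sensitive in this sketch, which would explain why the dictionary lists both `'NLP'` and `'nlp'`.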

5. Results

Contributors

applenob
