
neea-toefl-testseat-crawler's Introduction

Getting Started with NEEA TOEFL Testseat Crawler

This document briefly describes how to use the NEEA TOEFL Test Seats Selenium Crawler, which runs locally.

Motivation

The NEEA TOEFL test seat website, supported by the Chinese National Education Examinations Authority (NEEA), is inconvenient to use. To find a test seat, you have to search one date and one city at a time, which is a painful experience for anyone who just wants to find a seat as soon as possible.

Why not simply show all test seats in a single table?

Requirements

Install

  • Firefox with Mozilla geckodriver: the default geckodriver path is "C:\Program Files\Mozilla Firefox\geckodriver.exe". To use a different executable path, start the crawler with --webdriver_path='your path'.
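As a rough illustration of how the default path and the --webdriver_path override could fit together, here is a minimal sketch using Python's standard argparse module. It assumes the script parses its flags this way; the actual crawler_toefl.py may be organized differently. The flag names and the default path come from this README, while build_parser and DEFAULT_WEBDRIVER_PATH are hypothetical names chosen for the example.

```python
import argparse

# Default geckodriver location stated in this README.
DEFAULT_WEBDRIVER_PATH = r"C:\Program Files\Mozilla Firefox\geckodriver.exe"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags this README mentions (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="NEEA TOEFL test seat crawler")
    parser.add_argument("--username", required=True, help="NEEA ID number")
    parser.add_argument("--password", required=True, help="NEEA account password")
    parser.add_argument(
        "--webdriver_path",
        default=DEFAULT_WEBDRIVER_PATH,
        help="path to the geckodriver executable",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
# The credentials here are placeholders.
args = build_parser().parse_args(["--username", "12345678", "--password", "secret"])
print(args.webdriver_path)  # falls back to the default path
```

When --webdriver_path is omitted, the default path is used; passing the flag replaces it for that run only.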

Getting started

Default start

python crawler_toefl.py --username='NEEA ID number' --password='password'

When the crawl finishes, the results are saved as a .csv file.
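For readers who want a feel for the output step, the following is a minimal sketch of writing seat records to CSV with Python's standard csv module. The column names and the sample row are made up for illustration; the real crawler's CSV layout is not documented here and may differ.

```python
import csv
import io

# Hypothetical column names; the real crawler's CSV layout may differ.
FIELDNAMES = ["date", "city", "test_center", "seats_available"]


def write_seats_csv(rows, stream):
    """Write seat records (dicts keyed by FIELDNAMES) to a text stream as CSV."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample row, for illustration only.
sample = [
    {
        "date": "2020-01-04",
        "city": "SHANGHAI",
        "test_center": "STN00000A",
        "seats_available": "yes",
    },
]

# An in-memory buffer keeps the sketch free of file-system side effects.
buffer = io.StringIO()
write_seats_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open("seats.csv", "w", newline="", encoding="utf-8") as the stream; newline="" is what the csv module documentation recommends for files it writes.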

Todo:

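  1. Make it faster: crawling all the data currently takes about 30 minutes.
  2. Headless mode: how can the anti-crawler check be bypassed without a browser window?
  3. Get around the anti-crawler check triggered when clicking the 'search seats' button.
  4. Online crawler (run on a server).
  5. Different modes (user-customizable crawling).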

Acknowledgement

The idea originally came from https://www.jianshu.com/p/2541d918869e. Thanks!

neea-toefl-testseat-crawler's People

Contributors

jianqiaomo


neea-toefl-testseat-crawler's Issues

An idea

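This is the official site's own code, and it suggests the seats could be queried without WebDriver. Requests must not be sent too frequently, or the server responds with a 400 error.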

$.getJSON("testSeat/queryTestSeats", {
    city: $("#centerProvinceCity").val(),
    testDay: $("#testDays").val(),
    qryType: "NewOrder"
}, function (data) {
    if (data.status == true) {
        var tmpl = $.templates("testSeatListTemplate", {
            markup: "#testSeatListTpl",
            helpers: {
                formatCurrency: formatTestFee
            }
        }); // Get compiled template
        var html = tmpl.render(data);
        $("#qrySeatResult").html(html);
    } else {
        layer.msg("未查询到考位信息", {time: 2000, icon: 0, shift: 0}); // "No test seat information found"
        $("#qrySeatResult").empty();
    }
});
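The same request could in principle be prepared from Python. The sketch below only builds the query URL and spaces out calls. It does not send anything. The endpoint path and the parameter names (city, testDay, qryType) are taken from the snippet above. Everything else is an assumption: the base URL is a placeholder, the city code and date format are guesses, and a real request would presumably also need the cookies of a logged-in session. build_query_url and Throttle are hypothetical names.

```python
import time
import urllib.parse

# Endpoint path taken from the snippet above; the base URL is supplied by the caller.
QUERY_PATH = "testSeat/queryTestSeats"


def build_query_url(base_url, city, test_day):
    """Return the GET URL for one city/date seat query."""
    params = {"city": city, "testDay": test_day, "qryType": "NewOrder"}
    return urllib.parse.urljoin(base_url, QUERY_PATH) + "?" + urllib.parse.urlencode(params)


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Placeholder base URL, city code and date; none of these are confirmed values.
url = build_query_url("https://example.invalid/", "BEIJING", "2020-01-04")
print(url)
```

Calling wait() on a shared Throttle before each request is one simple way to stay under the rate limit that triggers the 400 error. A suitable interval is not stated in the issue and would have to be found by experiment.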

Also, could you add an option to specify the city and date? That would cut down on unnecessary operations.
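One possible shape for such a feature is to filter the list of (city, date) pairs before any query is made. This is only a sketch of the request above, not something the crawler currently does. select_queries is a hypothetical helper, and the city codes and dates are placeholders.

```python
import itertools


def select_queries(all_cities, all_dates, cities=None, dates=None):
    """Return (city, date) pairs, restricted to the requested cities and dates.

    A filter of None means "no restriction", which reproduces the full crawl.
    """
    chosen_cities = [c for c in all_cities if cities is None or c in cities]
    chosen_dates = [d for d in all_dates if dates is None or d in dates]
    return list(itertools.product(chosen_cities, chosen_dates))


# Placeholder values for illustration; real city codes and dates come from the site.
all_cities = ["BEIJING", "SHANGHAI", "GUANGZHOU"]
all_dates = ["2020-01-04", "2020-01-11"]

pairs = select_queries(all_cities, all_dates, cities={"SHANGHAI"}, dates={"2020-01-11"})
print(pairs)  # one pair instead of six
print(len(select_queries(all_cities, all_dates)))  # full crawl: every city on every date
```

Since every pair costs one search, narrowing the list this way reduces the crawl time roughly in proportion to the number of pairs skipped.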
