leereilly / swot

:school: Identify email addresses or domain names that belong to colleges or universities. Help automate the process of approving or rejecting academic discounts.

License: MIT License

Ruby 100.00%

swot's Introduction

Swot 🍎

If you have a product or service and offer academic discounts, there's a good chance there's some manual component to the approval process. Perhaps .edu email addresses are automatically approved because, for the most part at least, they're associated with American post-secondary educational institutions. Perhaps .ac.uk email addresses are automatically approved because they're guaranteed to belong to British universities and colleges. Unfortunately, not every country has an education-specific TLD (Top Level Domain) and plenty of schools use .com or .net.

Swot is a community-driven or crowdsourced library for verifying that domain names and email addresses are tied to a legitimate university or college - more specifically, an academic institution providing tertiary, quaternary or any other kind of post-secondary education in any country in the world.

Pop quiz: Which of the following domain names should be eligible for an academic discount? stanford.edu, america.edu, duep.edu, gla.ac.uk, unizar.es, usask.ca, hil.no, unze.ba, fu-berlin.de, ecla.de, bvb.de, lsmu.com. Answers at the foot of the page.

Installation

Swot is a Ruby gem, so you'll need a little Ruby-fu to get it working. Simply

gem install swot

Or add this to your Gemfile before doing a bundle install:

gem 'swot'

Requirements

  • Ruby >= 2.0

Usage

Verify Email Addresses

Swot::is_academic? '[email protected]'           # true
Swot::is_academic? '[email protected]'           # true
Swot::is_academic? '[email protected]'  # true
Swot::is_academic? '[email protected]'                   # true
Swot::is_academic? '[email protected]'                 # true
Swot::is_academic? '[email protected]'               # false

Verify Domain Names

Swot::is_academic? 'harvard.edu'              # true
Swot::is_academic? 'www.harvard.edu'          # true
Swot::is_academic? 'http://www.harvard.edu'   # true
Swot::is_academic? 'http://www.github.com'    # false
Swot::is_academic? 'http://www.rangers.co.uk' # false

Find School Names

Swot::school_name '[email protected]'
# => "University of Strathclyde"

Swot::school_name 'http://www.stanford.edu'
# => "Stanford University"
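
The two calls above can be combined into a single approval decision. The following is a minimal sketch, not part of Swot: `discount_decision` is a hypothetical helper, and `verifier` is assumed to be any object that responds to `is_academic?` and `school_name` (in a real app, the `Swot` module itself).

```ruby
# Hypothetical helper, not part of Swot. `verifier` is any object that
# responds to `is_academic?` and `school_name`; pass `Swot` in a real app.
def discount_decision(email_or_domain, verifier)
  # Reject early so school_name is only looked up for academic matches.
  unless verifier.is_academic?(email_or_domain)
    return { approved: false, school: nil }
  end

  { approved: true, school: verifier.school_name(email_or_domain) }
end

# Usage, assuming the gem is installed:
#   require 'swot'
#   decision = discount_decision('http://www.stanford.edu', Swot)
#   decision[:approved]
#   decision[:school]
```

Passing the verifier in as an argument keeps the helper testable without the gem and makes it easy to swap in one of the ports.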

Contributing to Swot

Contributions welcome! Please see the contribution guidelines for details on how to add, update, or delete schools. Code contributions and ports to different languages welcome too.

Thanks to the following people for their contributions: @blutack, @captn3m0, @chrishunt, @johndbritton, @johnotander, @pborreli, @rcurtis, @vikhyat.

Special thanks to @weppos for the public_suffix gem 🤘

Known Issues

  • You can search by email and domain names only. You cannot search by IP.
  • You don't know if the email address belongs to a student, faculty, staff member, alumni, or a contractor.
  • There may be a few false positives, missing institutions... maybe even a couple of typos. Contributions welcome!

Please note: just because someone has verified that they own [email protected] does not mean that they're a student. They could be faculty, staff, alumni, or maybe even an external contractor. If you're suddenly getting a lot of traffic from websites like FatWallet or SlickDeals, you might want to find out why. If you're suddenly getting a lot of requests from a particular school, you should look into that too. It may be good business, word of mouth, or someone may have found a loophole. Swot gives you a high confidence level - not a guarantee. I recommend putting some controls in place, or at least monitoring how it's doing from time to time.
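
One simple control is to count approvals per domain and flag the ones that look unusually busy. This is only a sketch under assumptions: `domains_over_threshold` is a hypothetical helper that is not part of Swot, and the threshold is a number you would have to tune for your own traffic.

```ruby
# Hypothetical monitoring helper, not part of Swot.
# Takes the email addresses approved in some period and returns the
# domains whose approval count exceeds `threshold`, busiest first.
def domains_over_threshold(approved_emails, threshold: 10)
  counts = Hash.new(0)
  approved_emails.each do |email|
    # Everything after the last "@", lowercased so casing doesn't split counts.
    domain = email.to_s.split('@').last.to_s.downcase
    counts[domain] += 1 unless domain.empty?
  end

  counts.select { |_domain, count| count > threshold }
        .sort_by { |_domain, count| -count }
        .to_h
end

# Example with placeholder addresses:
#   domains_over_threshold(['a@example.edu', 'b@example.edu'], threshold: 1)
```

A flagged domain is a prompt to investigate, not proof of abuse; as noted above, a spike may simply be word of mouth.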

What is a swot?

According to UrbanDictionary 📘

A word used by morons to insult a person of superior academic abilities.

or

[verb] To Swot; Revision undertaken preceding an examination.

or

[backronym] Stupid Waste of Time

Pop Quiz Answers

Hopefully, you'll be surprised by some of this:

| Domain | Academic? | Comments |
| --- | --- | --- |
| stanford.edu | ✔️ | OK, this was an easy one so you could get at least one right |
| america.edu | ✖️ | Prior to October 29th 2001, anyone could register a .edu domain name (details) |
| duep.edu | ✔️ | Alfred Nobel University is a Ukrainian university, i.e. not in the USA 🇺🇸 |
| gla.ac.uk | ✔️ | Glasgow University in Scotland |
| unizar.es | ✔️ | The University of Zaragoza in Spain |
| usask.ca | ✔️ | The University of Saskatchewan in Canada |
| hil.no | ✔️ | Lillehammer University College in Norway |
| unze.ba | ✔️ | University of Zenica in Bosnia and Herzegovina |
| fu-berlin.de | ✔️ | Free University of Berlin in Germany |
| ecla.de | ✔️ | ECLA of Bard is a state-recognized liberal arts university in Berlin, Germany |
| bvb.de | ✖️ | It's a soccer team from Germany |
| lsmu.com | ✔️ | Lugansk State Medical University in Ukraine |

If you verified this by visiting all of the websites, how long did it take you? Did you have fun? Imagine you had to do this 10 - 100 times every day. Now you know a little something about the inspiration for Swot. Swot can verify them all in a fraction of a second and remove a 💩 part of someone's job.

swot's People

Contributors

afeld, atcastells, benbalter, bup3, dorothyhelene, ertugrulcetin, fcroc, haroenv, harshadsabne, johndbritton, kasienkanowicka, leereilly, lyzidiamond, martin-rueegg, mateunho, mik-laj, mkcode, nathanshox, oluwasetemi, palaxer, pchaigno, reidweb, rolandspannagl, sannithibalaji, santhoshmenon, steinam, sushramesh, tarebyte, tmcw, zhedar

swot's Issues

School uses subdomain for students

Hi!

I wanted to add my school to the list but they use a subdomain for student emails.

So my email is [student-id]@student.idcollege.nl

Should I create "domains/nl/idcollege/student.txt" or "domains/nl/student.idcollege.nl"?

Thanks in advance!

Add my college

Hi, I want to add my college to your list but I can't do that :( Its name is Killester College of Further Education.

Website:

killestercollege.ie

2 missing web sites

Although I see that odtu.edu.tr is listed, I cannot find metu.edu.tr or metu.edu.

Bohunt School

Please include the domain for this school.
bohunt.hants.sch.uk

Process to validate new domains

I've just been added as a contributor by @leereilly (thanks!) and I'm wondering how you usually proceed to check that a new domain is valid.
I guess you check that there is a website for the school. Do you also check the SMTP server? Do you try to find examples of email addresses? Anything else?
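
The SMTP question could be partly automated with a DNS lookup for MX records. This is a sketch of one possible pre-check, not a description of how Swot maintainers actually review domains: `accepts_mail?` is a hypothetical name, and it relies only on `Resolv::DNS#getresources` from Ruby's standard library.

```ruby
require 'resolv'

# Hypothetical pre-check, not part of Swot's documented review process.
# Returns true when `domain` publishes at least one MX record.
# `resolver` is injectable so the check can be exercised without network access.
def accepts_mail?(domain, resolver: Resolv::DNS.new)
  records = resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
  !records.empty?
end

# Usage (performs a real DNS query):
#   accepts_mail?('harvard.edu')
```

A missing MX record is not conclusive, since mail servers may fall back to a domain's address record, so a result of `false` is best treated as a reason to look closer rather than to reject.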

/cc @tmcw @afeld @lyzidiamond

Give schools some attributes

Per #37 (comment), I think it would be helpful to include a few pieces of information (if possible) about each school:

  • Name
  • Homepage
  • Level (high-school/community-college/etc.)
    • Is this hard to standardize internationally?
  • Accredited (true/false)
    • May be inferrable from level?
  • public/private
  • Location
    • How to handle schools w/ multiple campuses?

Having this information will broadly increase the applicability of this dataset, e.g. I'd love to collaborate w/ Code.org on merging their listing. /cc @brandonbloom @tobyk100

@leereilly is the reason for having the flat files (instead of a big file) that you don't have to load all the information in memory? If we want to stick with that structure, it might make sense to make each entry under https://github.com/leereilly/swot/tree/master/lib/domains its own little YAML/JSON file with the attributes above.
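
To make the proposal concrete, here is a rough sketch of what one per-school YAML file and its loader might look like. Everything in it is an assumption drawn from the attribute list above: the key names, the `funding` field for public/private, and the `load_school_record` helper are hypothetical and do not exist in Swot.

```ruby
require 'yaml'

# Hypothetical: keys proposed in this issue, not an existing Swot format.
def missing_school_attributes(record)
  required = %w[name homepage level accredited funding locations]
  required.reject { |key| record.key?(key) }
end

# Parses one school file and fails loudly if an attribute is absent.
def load_school_record(yaml_text)
  record = YAML.safe_load(yaml_text)
  raise ArgumentError, 'expected a YAML mapping' unless record.is_a?(Hash)

  missing = missing_school_attributes(record)
  raise ArgumentError, "missing attributes: #{missing.join(', ')}" unless missing.empty?

  record
end

# Placeholder data; `locations` is a list so multi-campus schools fit.
example = <<YAML
name: Example University
homepage: http://www.example.edu
level: university
accredited: true
funding: public
locations:
  - Example City
  - Second Campus
YAML

record = load_school_record(example)
```

Keeping one small file per school preserves the current lookup-by-path behaviour while still leaving room for the extra attributes.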

Please, add SPSE a VOS Pardubice

Please, add "Střední průmyslová škola elektrotechnická a Vyšší odborná škola Pardubice",
in english "High Technical School of Electrical Engineering and Further Education College Pardubice"

Domain is www.spse.cz

Thank you!

Pull request generator

Type in the URL and the name of a school and it generates a pull request for you

We have 106 incorrectly formatted pull requests in the queue. This is a problem.
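
The formatting half of such a generator could be as small as the sketch below. It assumes, without confirmation from this page, that a school is stored in a .txt file whose path is the domain's labels in reverse order under lib/domains and whose body is the school name. `school_file_for` is a hypothetical helper, and actually opening the pull request is out of scope here.

```ruby
require 'uri'

# Hypothetical helper for the generator idea above.
# ASSUMPTION: files live under lib/domains with the domain labels reversed
# (e.g. www.spse.cz -> lib/domains/cz/spse.txt) and contain the school name.
def school_file_for(url, name)
  # Accept bare hostnames as well as full URLs.
  with_scheme = url.include?('://') ? url : "http://#{url}"
  host = URI.parse(with_scheme).host.to_s.downcase.sub(/\Awww\./, '')
  raise ArgumentError, "no host in #{url.inspect}" if host.empty?

  path = File.join('lib', 'domains', *host.split('.').reverse) + '.txt'
  { path: path, contents: "#{name}\n" }
end

# Usage:
#   school_file_for('www.spse.cz', 'SPSE a VOS Pardubice')
```

Generating the path and contents mechanically removes the most likely source of malformed submissions, whatever the final file convention turns out to be.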

Idea: Create a "domain" label

To help triage the backlog and differentiate between "add this domain" and "add this feature" or "fix this bug".

Depending on the workflow, other labels may make sense (e.g., needs review), but right now, it looks like we have no labels?

Fixing tests

So, as far as I can read the Travis & Git tea leaves

The failures are

  1) Failure:
Swot#test_0004_test aliased methods [/home/travis/build/leereilly/swot/test/test_swot.rb:65]:
Expected: nil
  Actual: "University of Strathclyde"


  2) Failure:
Swot#test_0001_recognizes academic email addresses and domains [/home/travis/build/leereilly/swot/test/test_swot.rb:12]:
Expected: false
  Actual: true


  3) Failure:
Swot#test_0002_returns name of valid institution [/home/travis/build/leereilly/swot/test/test_swot.rb:55]:
Expected: nil
  Actual: "University of Strathclyde"

These tests pass for me locally using 2.0.0-p643 and 2.2.0 but fail in both on Travis.


/cc @leereilly & @afeld as resident rubyists

No one accepting PRs

For the past month no PR has been accepted, by anyone.

This repository is going to become unmaintained if no one is given the power to accept PRs. The PRs are already piling up.

We need to get some new maintainers selected somehow. Informal election may work? With candidates submitting themselves to help out, and getting plus-oned by the community.

Please Add School

Domain: student.nhvweb.net
School: North Hunterdon-Voorhees High School
For: Both my daughter and son are enrolled in web design classes

add fhstp.ac.at

Hi,
please add fhstp.ac.at (=fh-stpoelten, Fachhochschule St. Pölten)

ibgen-rs

IBGEN - Instituto Brasileiro de Gestão de Negócios - Porto Alegre/RS

Add UEZO

Centro Universitário Estadual da Zona Oeste

New release

It seems the last release was in December.

Looking at v0.4.2...master, it seems the only change (other than naughty_or_nice 1.0) was list updates.

Any objection to pushing out a new Gem? Does it need to be a Major bump since the Swot class is no longer a child of NaughtyOrNice?

/cc @tmcw and @lyzidiamond as I'm not sure if you're using this in Ruby-land or Node-land and don't want to break things.

Support for Luxembourg

Hi there, great script.

Please consider adding Luxembourgish TLDs.
The ones mostly used are "uni.lu", "school.lu" and "education.lu"
Same for e-mail endings.

Thank you.

Tests should assert format

Tests should assert that all files within the schools directory end with .txt, not any other extension.
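
A check along these lines might look like the following sketch. `non_txt_files` is a hypothetical helper, and the directory to scan is an assumption (the issue says "schools directory"; elsewhere on this page the data is referred to as lib/domains).

```ruby
# Returns every regular file under `root` whose extension is not ".txt".
# Note: Dir.glob skips dotfiles by default, so hidden files are not reported.
def non_txt_files(root)
  Dir.glob(File.join(root, '**', '*'))
     .select { |path| File.file?(path) }
     .reject { |path| File.extname(path) == '.txt' }
     .sort
end

# Inside a Minitest test this could be used as:
#   assert_empty non_txt_files('lib/domains')
```

Returning the offending paths, rather than a boolean, makes the failure message tell a contributor exactly which file to rename.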

No more directories

I really want to stop nesting files in swot: the nesting was, I believe, a perf optimization for Ruby, but it really doesn't work very well usability-wise, and isn't necessary for a system like Node.

Backronym

I'm being super nit-picky here (sorry!) but:

[anagram] Stupid Waste of Time

That's a backronym, not an anagram. An example of an anagram of swot is stow.

Not that this matters at all ;)

gymji

Gymnázium Jihlava

Mexican University

Hi, I'm trying to sync this url

www.itlac.mx

but I failed a lot of times. Tried:
*mx
* itlac.txt
--- Instituto Tecnológico de Lázaro Cárdenas (ITLAC)
--- Instituto Tecnologico de Lazaro Cardenas (ITLAC)

Separate repo for data, or script for ports to sync?

I can see that there are a few ports of swot, and I'm not sure what's the strategy for a port to stay up to date with the data component of swot. If not a standalone data repo, could there be a script that pulls & syncs data from this repo?
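
As a starting point for such a script, the nested files could be flattened into a single domain-to-name mapping that ports consume. The sketch below rests on two assumptions not confirmed on this page: that each file's path is the domain's labels in reverse order, and that the first line of each file is the school name. `export_domains` is a hypothetical name.

```ruby
require 'json'
require 'pathname'

# Hypothetical export for ports.
# ASSUMPTIONS: <root>/uk/ac/gla.txt represents gla.ac.uk, and the first
# line of each file holds the school name.
def export_domains(root)
  base = Pathname.new(root)
  Dir.glob(File.join(root, '**', '*.txt')).sort.each_with_object({}) do |path, out|
    relative = Pathname.new(path).relative_path_from(base).to_s.sub(/\.txt\z/, '')
    domain = relative.split('/').reverse.join('.')
    out[domain] = File.readlines(path).first.to_s.strip
  end
end

# Usage: write one JSON file that a port can download.
#   File.write('domains.json', JSON.pretty_generate(export_domains('lib/domains')))
```

A single generated file would let ports stay in sync by fetching one artifact instead of mirroring the directory tree.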
