
This project forked from arthurjohnston/stripe_people


Code to calculate what words Stripe employees use to describe themselves

Python 0.52% HTML 99.48%

stripe_people's Introduction

How Stripe Employees describe themselves

Over the weekend I was looking into using Stripe for a project. While reading their documentation I clicked through to their about page, which seems to have pictures of (I believe) all of their employees. Hovering over each picture pops up a short bio. Viewing the source, you can see that the bios all look like this:

<span class="bio" style="top: -121px; left: -60px;">
                    <span class="inner">
                        <em>Patrick Collison</em>
                              Before Stripe, Patrick co-founded Auctomatic and wrote Encyclopedia. He
      studied math at MIT.

                    </span>
                    <span class="arrow"></span>
                </span>

After looking at a few of these I wondered if there were any common themes. Since I needed a 20-minute break from my actual project, I decided to do some analysis. Because each of these mini biographies is in well-formed HTML, it was easy to use BeautifulSoup to pull out the raw text of each bio. After extracting the data and filtering out stop words using NLTK, I wanted to see which words were most common. The 25 lines to do this are all in main.py.
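To make the extraction step concrete, here is a minimal sketch of pulling the text out of each `<span class="bio">` element. It is not the code from main.py: it uses the standard library's `html.parser` as a dependency-free stand-in for BeautifulSoup, and the names `BioExtractor` and `extract_bios` are made up for this illustration.

```python
from html.parser import HTMLParser


class BioExtractor(HTMLParser):
    """Collect the text inside each <span class="bio"> element."""

    def __init__(self):
        super().__init__()
        self.bios = []     # finished bios, one string each
        self._depth = 0    # <span> nesting depth inside the current bio (0 = outside)
        self._chunks = []  # text fragments of the bio being read

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            # A nested span such as "inner" or "arrow".
            self._depth += 1
        elif "bio" in (dict(attrs).get("class") or "").split():
            # The outer span that starts a new bio.
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse the runs of whitespace left over from the markup.
                self.bios.append(" ".join(" ".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_bios(html):
    """Return the plain text of every bio found in an HTML string."""
    parser = BioExtractor()
    parser.feed(html)
    parser.close()
    return parser.bios
```

Fed the snippet shown earlier, `extract_bios` returns a single string that starts with the name from the `<em>` tag and continues with the bio text, with the indentation and line breaks squeezed to single spaces.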

Results

Since there are 180 bios on the page, I looked for words that appeared in at least 5% of them, i.e. 9 biographies (making the simplifying assumption that no word appears twice in a given bio):

  • stripe appeared 114 times
  • studied appeared 85 times
  • worked appeared 57 times
  • previously appeared 36 times
  • joining appeared 29 times
  • enjoys appeared 26 times
  • university appeared 25 times
  • grew appeared 24 times
  • science appeared 22 times
  • computer appeared 20 times
  • time appeared 20 times
  • works appeared 17 times
  • team appeared 15 times
  • prior appeared 15 times
  • loves appeared 15 times
  • harvard appeared 15 times
  • new appeared 14 times
  • support appeared 14 times
  • stanford appeared 13 times
  • school appeared 13 times
  • also appeared 13 times
  • engineering appeared 12 times
  • mit appeared 11 times
  • things appeared 11 times
  • berkeley appeared 10 times
  • originally appeared 10 times
  • infrastructure appeared 10 times
  • dan appeared 10 times
  • d appeared 9 times (this is an artifact of how I cleaned up the text)
  • lives appeared 9 times
  • facebook appeared 9 times
  • founded appeared 9 times
  • google appeared 9 times
  • likes appeared 9 times
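
The counting rule behind the list above can be sketched as follows. This is an illustration under stated assumptions, not the contents of main.py: the stop-word set is a tiny hand-written stand-in for NLTK's English list, the tokenizer is a simple regular expression, and the function names are hypothetical. Each bio contributes a set of words, so a word is counted at most once per bio, which is a slightly stricter version of the simplifying assumption made above.

```python
import re
from collections import Counter

# Assumption: a small illustrative stop-word set; NLTK's English list is much longer.
STOP_WORDS = {"a", "an", "and", "at", "before", "for", "he", "in", "is",
              "of", "on", "she", "the", "to", "was", "with"}


def bio_words(bio):
    """Return the distinct lowercase non-stop words in one bio."""
    return set(re.findall(r"[a-z]+", bio.lower())) - STOP_WORDS


def min_bio_count(total_bios, min_percent=5):
    """Smallest number of bios that is at least min_percent of total_bios."""
    return -(-total_bios * min_percent // 100)  # ceiling division on integers


def common_words(bios, min_percent=5):
    """Return (word, bio_count) pairs for words found in enough bios."""
    counts = Counter()
    for bio in bios:
        counts.update(bio_words(bio))  # a set, so each bio adds at most 1 per word
    threshold = min_bio_count(len(bios), min_percent)
    return [(word, n) for word, n in counts.most_common() if n >= threshold]
```

With 180 bios and the default 5%, `min_bio_count(180)` gives the cutoff of 9 used for the list. A crude tokenizer like this one splits a contraction such as "she'd" into "she" and "d", which is one plausible way a stray token like 'd' could end up in the results, though the source does not say what actually caused it.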

Ignoring words that are obviously about the transition to Stripe ('stripe', 'previously', 'joining'), the most obvious trend is that everyone talks about their education. 'studied' is the most common word after 'stripe', beating out 'worked' by a comfortable margin, and 'university' and 'science' also make it into the top 10. People also tend to mention their alma mater, with Harvard, MIT, Stanford and Berkeley getting a combined 49 mentions.

Another interesting trend is that they don't seem to have a lot of employees from any particular company. 'google' and 'facebook' are the only two companies with at least 9 mentions, and both are tied with 'founded', as in the person founded or co-founded a company.

