Comments (6)

NickCrews commented on June 26, 2024

Hi! Thanks for opening this! A few things:

  1. The relationship between canonical and nickname is not symmetrical: Matt is a nickname for Matthew, but not vice versa. matt and matthew are already present in the data, so is it possible you are not using the library correctly?
  2. I want to be conservative with what links are added, so that there aren't false positives. For instance, I'm skeptical of how common Barrett-Garrett is. 95% of your suggestions look good, but I want to leave out a few of the weirder ones.
  3. I can add these cases to the code, but only if you help me with the grunt work of formatting them, putting each one in the form CANONICAL,NICK0,NICK1,NICK2, etc.
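
To illustrate the one-way relationship from point 1, here is a minimal sketch. It does not use the library's actual API; the `NICKNAMES_OF` dictionary and the `is_nickname_of` helper are hypothetical names made up for this example, and the data is a single illustrative entry.

```python
# Hypothetical data: canonical name -> set of nicknames (a directed, one-way link).
NICKNAMES_OF = {
    "matthew": {"matt"},
}


def is_nickname_of(nick: str, canonical: str) -> bool:
    """Return True only if `nick` is listed under `canonical`."""
    return nick.lower() in NICKNAMES_OF.get(canonical.lower(), set())


print(is_nickname_of("Matt", "Matthew"))  # True
print(is_nickname_of("Matthew", "Matt"))  # False
```

Looking a pair up in the wrong direction returns no match, which may explain a lookup that appears to be missing from the data.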

Let me know what you'd like to do!

from nicknames.

afeibus commented on June 26, 2024

I'm unclear on the file structure and on how to decide whether something is a name or a nickname (e.g., is Kari a real name or a nickname for Carrie?). I would need more info to help with this.

I'm ok if stuff gets left out; the point of my issue report was to try to close some of the holes. Some of the names I'd never seen before either, but then I looked them up and found they were common in other countries (e.g., Garrett is big in Ireland).

NickCrews commented on June 26, 2024

The file structure is CANONICAL,NICK0,NICK1,NICK2,NICK3, etc., as you can see in the csv. Does that make sense?

Yeah, the Kari/Carrie case is ambiguous. I would lean towards Carrie being the longer one and therefore the canonical one. But the Sara/Sarah case, I think, is symmetrical, so you should have a line sarah,sara as well as a line sara,sarah. Just try your best, and I can go through and give my 2 cents, and we should be able to settle on something. I'm just trying to make it better than it currently is; it doesn't need to be perfect.
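
As a rough sketch of how rows in that layout could be read, and how a symmetric pair differs from a one-way pair, something like the following might work. The sample rows, `load_nicknames`, and `is_symmetric` are hypothetical names for illustration only, not the library's own code.

```python
import csv
import io

# Hypothetical sample rows in the CANONICAL,NICK0,NICK1,... layout.
SAMPLE = """\
carrie,kari
sarah,sara
sara,sarah
"""


def load_nicknames(text: str) -> dict[str, set[str]]:
    """Parse CSV text into a mapping of canonical name -> set of nicknames."""
    table: dict[str, set[str]] = {}
    for row in csv.reader(io.StringIO(text)):
        # Normalize case and drop empty cells (e.g. from trailing commas).
        cells = [cell.strip().lower() for cell in row if cell.strip()]
        if not cells:
            continue
        canonical, *nicks = cells
        table.setdefault(canonical, set()).update(nicks)
    return table


def is_symmetric(table: dict[str, set[str]], a: str, b: str) -> bool:
    """True if each name is listed as a nickname of the other."""
    return b in table.get(a, set()) and a in table.get(b, set())


table = load_nicknames(SAMPLE)
print(is_symmetric(table, "sarah", "sara"))   # True
print(is_symmetric(table, "carrie", "kari"))  # False
```

With only the carrie,kari line present, Kari maps to Carrie but not the other way round, whereas the two sarah/sara lines together make that pair symmetric.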

afeibus commented on June 26, 2024

names.csv
This is close. It may not list every possible canonical, but it is enough that code could also search the nickname columns to find related nicknames that act as canonical names too.
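
One possible reading of that idea, sketched under assumptions: when a name is not listed as a canonical, scan the nickname columns and collect the canonical and sibling nicknames from any row that mentions it. The `related_names` function and the sample table (including the "carey" entry) are hypothetical and exist only for this illustration.

```python
def related_names(table: dict[str, set[str]], name: str) -> set[str]:
    """Collect names linked to `name` in either direction."""
    name = name.lower()
    # Start with the nicknames listed directly under `name`, if any.
    related = set(table.get(name, set()))
    # Then scan every row's nickname columns for `name`.
    for canonical, nicks in table.items():
        if name in nicks:
            related.add(canonical)
            related |= nicks
    related.discard(name)
    return related


# Hypothetical table: canonical name -> set of nicknames.
table = {"carrie": {"kari", "carey"}, "matthew": {"matt"}}
print(sorted(related_names(table, "kari")))  # ['carey', 'carrie']
```

Note that this widens matching beyond the strict canonical-to-nickname direction, so it trades some precision for recall.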

NickCrews commented on June 26, 2024

Thank you @afeibus! I made a few adjustments, but most of them looked great. Your work is very much appreciated! If you want, take a look at the change linked above and double-check that I didn't make any changes to your edits that you disagree with.

NickCrews commented on June 26, 2024

Closing as done, but if you find any problems with the tweaks I made, please raise a new issue (and link to this one).
