bluzi / name-db

:rocket: A multilingual collection of names from around the world

License: MIT License

JavaScript 100.00%
names language translations data

name-db's Introduction

name-db

name-db is a collection of names in all languages. Our goal is to collect as much data as we can, and to provide an open-source free API for name translations.

Specs

name-db currently stores only first names.

Each name is stored in a JSON file, located in collection/. The following is the structure of a name file:

collection/{lowercase name}.json:

{
    "name": "", // English name, lowercase, corresponding to the filename
    "meaning": "", // The meaning of the name, in English
    "aliases": [], // An array of lowercase alias names, such as: richard -> dick, daniel -> dan, etc.
    "translations": {
        "{lowercase ISO-639-3 language code}": "{translation}" 
    },
    "sex": "" // (Optional) Gender of the name. Use a single, lowercase letter: `m` for male, `f` for female or `u` for unisex (names that can be male or female).
}

Example:

collection/jonathan.json:

{
    "name": "jonathan",
    "meaning": "Hebrew for \"YHWH has given\"",
    "aliases": [
        "johnathan",
        "john",
        "yonathan"
    ], 
    "translations": {
        "heb": "ג'ונתן" 
    },
    "sex": "m"
}

The language codes are ISO 639-3 codes. For a list of language codes, please see: https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes

Note that everything except the translations should be in English.
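
The structure above can be checked mechanically. The following is only a minimal sketch, not the project's actual test suite: the function name validateNameEntry and its messages are hypothetical, and the language-code check only looks at the shape of the code (three lowercase letters), not at whether the code is really assigned in ISO 639-3.

```javascript
// Hypothetical helper: returns a list of problems found in one name entry.
function validateNameEntry(entry, fileName) {
    const problems = [];

    if (typeof entry.name !== "string" || entry.name !== entry.name.toLowerCase()) {
        problems.push("name must be a lowercase string");
    } else if (fileName !== `${entry.name}.json`) {
        problems.push("file name must be {name}.json");
    }
    if (typeof entry.meaning !== "string") {
        problems.push("meaning must be a string");
    }
    if (!Array.isArray(entry.aliases) ||
        entry.aliases.some((alias) => typeof alias !== "string" || alias !== alias.toLowerCase())) {
        problems.push("aliases must be an array of lowercase strings");
    }
    const translations = entry.translations;
    if (translations === null || typeof translations !== "object" || Array.isArray(translations)) {
        problems.push("translations must be an object");
    } else {
        for (const code of Object.keys(translations)) {
            // Shape check only: three lowercase letters.
            if (!/^[a-z]{3}$/.test(code)) {
                problems.push(`"${code}" is not shaped like an ISO 639-3 code`);
            }
        }
    }
    // "sex" is optional, but when present it must be m, f or u.
    if (entry.sex !== undefined && !["m", "f", "u"].includes(entry.sex)) {
        problems.push('sex must be "m", "f" or "u"');
    }
    return problems;
}

// The jonathan.json example from above.
const jonathan = {
    name: "jonathan",
    meaning: 'Hebrew for "YHWH has given"',
    aliases: ["johnathan", "john", "yonathan"],
    translations: { heb: "ג'ונתן" },
    sex: "m",
};

console.log(validateNameEntry(jonathan, "jonathan.json")); // prints []
```

Returning a list of messages rather than throwing on the first problem lets a contributor see everything that is wrong with a file in one run.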

API

The API is still under development, but you can see the latest stable version here. Note that you shouldn't use it in production yet: we still don't have enough data, and the endpoint is running on a cheap machine. Feel free to view the code, suggest features, or implement new features in a pull request. We're looking for help with the API.

Contribution (Easy PR, large impact!)

Making a contribution is really easy. Just read the specs, and do one of these:

  • Add your/a name (if it doesn't exist yet)
  • Add a translation to an existing name
  • Add meanings to existing names
  • Correct translations / meanings
  • Come up with a way we can do things better, and create an issue

Also, feel free to take a few aliases that don't have a file yet and create their files.

Just fork the repository, do one of the tasks above, make a pull request and we'll approve it.

License

This project is licensed under the MIT License.

name-db's People

Contributors

achromik, ahmetcetin, alanmcruickshank, arminkhoshbin, bhargodevarya, bluzi, buslov, carnubak, cherkacho, chrisf, codewritingcow, copolycube, dashan124, haykokoryun, heinerenrique, jhalaa, jose4125, krns, lorsabyan, matiasgarciaisaia, neelambugalia, rahulkant13may, riznob, siqueira-ec, tigran-k, tomaszga, vichitr, vsc-github, yong0011, yongcs19

name-db's Issues

Add a name

Hey guys,

We need you to add names to the collection.
See #1 for more info.

Thanks!

Create a script to scrape names

Would it be acceptable to find a website with names, scrape it all, and then use another script to find translations of each name using Google Translate?

I mean, that would grow the database much faster than waiting for someone to submit a PR.

Require lowercase translations

  • Change all translations to be lowercase
  • Add a test that will prevent uppercase letters from appearing in a translation
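
A minimal sketch of the check the second task asks for, assuming translations are plain strings. The function name findUppercaseTranslations is hypothetical, and the deu and fra values are made-up sample data used only for illustration. Scripts without letter case, such as Hebrew, pass unchanged because toLowerCase() leaves them as they are.

```javascript
// Hypothetical helper: returns the language codes whose translation
// contains at least one uppercase letter.
function findUppercaseTranslations(translations) {
    return Object.entries(translations)
        .filter(([, text]) => text !== text.toLowerCase())
        .map(([code]) => code);
}

// Made-up sample data for illustration only.
const sampleTranslations = { heb: "יונתן", deu: "Jonathan", fra: "jonathan" };

console.log(findUppercaseTranslations(sampleTranslations)); // prints [ 'deu' ]
```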

Add tests

  • Add your favorite test engine to the project
  • Write some test cases (validate the JSON, validate the encoding, enforce the specs, etc.)
  • Add package.json with a test script, so that npm run test runs the tests

Add a license

We might want to specify a license somewhere in the project

Add language alphabet test

We need a solution for when someone adds a translation with the wrong alphabet.

For example, a Hebrew translation must always be spelled using the Hebrew alphabet, and a German translation must be spelled using the Latin alphabet.

Please create a generic test or use an external library. (For instance, this library)
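
One possible generic approach, sketched here without any external library, is to use the Unicode script property escapes that JavaScript regular expressions support with the u flag. The table SCRIPT_BY_LANGUAGE and the function usesExpectedScript are hypothetical names, and the table only lists the two languages mentioned above; a real test would need an entry for every language code in the collection.

```javascript
// Hypothetical mapping from ISO 639-3 code to Unicode script name.
const SCRIPT_BY_LANGUAGE = {
    heb: "Hebrew",
    deu: "Latin",
};

// Returns true when every character of the translation belongs to the
// expected script, or is a combining mark, punctuation or a space.
function usesExpectedScript(languageCode, translation) {
    const script = SCRIPT_BY_LANGUAGE[languageCode];
    if (script === undefined) {
        return true; // no rule for this language in the sketch: nothing to check
    }
    const pattern = new RegExp(`^[\\p{Script=${script}}\\p{M}\\p{P}\\p{Zs}]+$`, "u");
    return pattern.test(translation);
}

console.log(usesExpectedScript("heb", "ג'ונתן"));   // prints true
console.log(usesExpectedScript("heb", "jonathan")); // prints false
```

Punctuation is allowed so that transliterations containing an apostrophe, like the Hebrew example in the specs, are not rejected.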

Thanks!

Looking for collaborators

We're looking for people who would like to review and approve changes to the collection.
If you're interested and you already contributed to this repository before, you may apply by submitting a comment to this issue.

The goal is to grow and to make this repository as responsive to pull requests as possible.

Sinitic and Japanese names should be handled differently

First of all, I'd like to say this is a cool project. However, I think the schema is only suitable for names that are not Chinese, Japanese, Korean, or Vietnamese.

  1. Homophones tend to be an issue when dealing with such names.
  • Take the file mei.json for instance. According to the file, "mei" means "plum" in Chinese. This is only true if the word is said with the correct tone (Chinese is a tonal language) and within a context where it would be implied that "plums" are being discussed. Disregarding tone, another possible candidate for the definition would include "beautiful."
  • I will use the file sora.json as another example. At first glance, the file clearly lists multiple candidates for the Japanese translation in the form of kanji. I would suggest using kana instead of kanji. One kanji could have multiple homophones that are names, by either its on or kun reading.
  2. Such names are usually not monosyllabic or just a single character.

While I admit that this is a lot to take in, I would strongly suggest that you research this. Thank you for your time.

Multiple possible translations

What should I do in the case of a name with multiple possible translations? For example:
"Jonathon" can be either "יונתן" or "יהונתן" in Hebrew.

I can think of a few possibilities

  1. "code": "name/otherName"
  2. "code": "name"
     "code": "otherName"
  3. "code": ["name", "otherName"]

What do you think?
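
To illustrate the third option, here is a minimal sketch of how a consumer could read a translation that is either a single string or an array of strings. This is an assumption about one possible design, not the schema the project adopted, and the function name translationsFor is hypothetical.

```javascript
// Hypothetical reader: always returns an array, whether the stored value
// is a single string, an array of strings, or missing.
function translationsFor(entry, languageCode) {
    const value = entry.translations[languageCode];
    if (value === undefined) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

const withTwoSpellings = { translations: { heb: ["יונתן", "יהונתן"] } };
const withOneSpelling = { translations: { heb: "יונתן" } };

console.log(translationsFor(withTwoSpellings, "heb").length); // prints 2
console.log(translationsFor(withOneSpelling, "heb").length);  // prints 1
```

Note that the second option would not survive JSON parsing: when a key is repeated in an object, JSON.parse keeps only the last value.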

Clarify instructions - add names for each language

I noticed that pull requests add a name only in its original language. I'm not sure whether that is because contributors misunderstand the intent of the list, which is to have each name translated into all languages.

Add more tests

Add the following tests to the existing test cases:

  • Test for duplicate names - Solved #78
  • Test for good indentation (compare the JSON as plain text to the result of JSON.stringify(json, null, 4) or something like that)
  • Test all language codes against the list of ISO 639-3 language codes, to make sure they are all valid
  • Test whether a name has non-English characters
  • Make sure the translations object does not contain English (eng) translation - Solved #66
  • etc

These are just ideas, feel free to suggest more tests here, or just create a PR.
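
The indentation idea from the list above could look roughly like this. It is only a sketch: the function name isCanonicallyIndented is hypothetical, surrounding whitespace is ignored so a trailing newline does not fail the check, and files with Windows line endings would fail it as written.

```javascript
// Hypothetical check: the file text must equal its own re-serialization
// with four-space indentation.
function isCanonicallyIndented(fileText) {
    const canonical = JSON.stringify(JSON.parse(fileText), null, 4);
    return fileText.trim() === canonical;
}

const wellFormatted = '{\n    "name": "mei",\n    "aliases": []\n}\n';
const minified = '{"name":"mei","aliases":[]}';

console.log(isCanonicallyIndented(wellFormatted)); // prints true
console.log(isCanonicallyIndented(minified));      // prints false
```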

Names and nicknames/derivations

Hi, as a feature request, would it be worthwhile to include common derivations of a name?

For example Richard can be shortened to Dick
Or Daniel is shortened to Dan or can be Danielle (female form).

What are your thoughts?

Add sex to names

Add genders to existing names. Add new key-value pair "sex": "string" to JSON name files.

Use single, lowercase letters: m for male, f for female and u for unisex names. Don't spell out the whole words.

For example:

{
    "name": "Mary",
    "sex": "f",
    "meaning": "Most likely Egyptian name derived in part from mry \"beloved\" or mr \"love\". Also known by Greek and Hebrew form of \"sea of bitterness\", \"rebelliousness\", and \"wished for child\"",
    ...
}

You can also:

Translation crawler

@bluzi Hi there. I strongly believe that from time to time, a crawler could be run in order to fetch translations from some official providers (for now, let's stick to Wikipedia).

Therefore, the goal would be to write such a scraper that goes through all entries and tries to fetch meaning/ translations/ aliases from given sources.

Let me know what you think. I could give you a hand with Python, if that's alright.

Add gender

Hi @bluzi and everyone: Should we add "sex" or "gender" to the JSON file structure?

Users might want to know whether a name is male, female or unisex, especially if the name is from a language foreign to them.

Just a suggestion ...

English translation should not fail the tests

I've realised that English translations (eng ISO code) for names fail the tests. This is not ideal, since there are names which originate from other countries but may have English meanings.

Is there any reasoning behind this, @bluzi?
PR #244 is a good example.

Review deployment script

I recently added a deployment script and Travis deployment.

I'd love it if someone reviewed it and wrote in this issue about anything that could be implemented better.

Thanks!

Some info about what this script actually does:
Every time a branch or pull request is merged into the master branch, Travis executes the deployment script at deploy/deploy.collections.js.
This script connects to a MySQL database, checks what changed in collection/, and updates the database (removing, changing, or adding data so that it matches collection/ exactly).

The MySQL database above is used by the API to deliver the data. (see code)
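
The synchronization step described above can be pictured as a three-way diff. The sketch below is not the actual deploy/deploy.collections.js: the function diffCollection is hypothetical, and two in-memory Maps stand in for the files in collection/ and the rows in the MySQL database, each mapping a name to its serialized content.

```javascript
// Hypothetical diff: decides which names must be added, changed or removed
// so that the database ends up matching the collection exactly.
function diffCollection(files, rows) {
    const added = [];
    const changed = [];
    const removed = [];

    for (const [name, content] of files) {
        if (!rows.has(name)) {
            added.push(name);        // in collection/, not yet in the database
        } else if (rows.get(name) !== content) {
            changed.push(name);      // present in both, but the content differs
        }
    }
    for (const name of rows.keys()) {
        if (!files.has(name)) {
            removed.push(name);      // in the database, no longer in collection/
        }
    }
    return { added, changed, removed };
}

// Placeholder contents for illustration only.
const collectionFiles = new Map([["jonathan", "v2"], ["mei", "v1"]]);
const databaseRows = new Map([["jonathan", "v1"], ["sora", "v1"]]);

console.log(diffCollection(collectionFiles, databaseRows));
// prints { added: [ 'mei' ], changed: [ 'jonathan' ], removed: [ 'sora' ] }
```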

Add route to test JSON contributions

Maybe we could consider adding a route that allows contributors to paste in their JSON and run it against our tests, that way they can verify it passes before submitting a PR. I think it's possible to run mocha tests on the client side.

Add features to the API

Currently, all the API can do is search for an exact name/alias in the database and return its meaning and translations.
We need more features, such as:

  • Search (/search/:term) - Should return a list of matches by a partial term (Search in both aliases/names)
  • Get translation (/:name/:language) - Should return the translation of :name in :language

Anything else you can think of :)
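
As a rough sketch of the logic behind the two proposed routes, the functions below work on an in-memory array instead of the MySQL database and leave out the HTTP layer entirely. The names search and getTranslation, and the sample array, are hypothetical; only the jonathan entry and the dan alias come from the specs.

```javascript
// Sample data: the jonathan entry and the daniel -> dan alias from the specs.
const sampleEntries = [
    { name: "jonathan", aliases: ["johnathan", "john", "yonathan"], translations: { heb: "ג'ונתן" } },
    { name: "daniel", aliases: ["dan"], translations: {} },
];

// Logic for /search/:term: names whose name or aliases contain the term.
function search(entries, term) {
    const needle = term.toLowerCase();
    return entries
        .filter((entry) => entry.name.includes(needle) ||
            entry.aliases.some((alias) => alias.includes(needle)))
        .map((entry) => entry.name);
}

// Logic for /:name/:language: the translation, or null when the name or
// the language is unknown. Aliases are accepted as well as names.
function getTranslation(entries, name, language) {
    const wanted = name.toLowerCase();
    const entry = entries.find((candidate) =>
        candidate.name === wanted || candidate.aliases.includes(wanted));
    if (entry === undefined) {
        return null;
    }
    const translation = entry.translations[language];
    return translation === undefined ? null : translation;
}

console.log(search(sampleEntries, "dan"));                 // prints [ 'daniel' ]
console.log(getTranslation(sampleEntries, "john", "heb")); // prints ג'ונתן
```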

You can easily run the API by cloning the repo and running npm install && npm start; you should then be able to browse to http://localhost:3000 and see that the API is running.
Your clone will connect to the actual production database; however, the user supplied there can only select, and is limited in its number of queries per hour.

If you need help or have any ideas, feel free to comment here.

Write about deployment and document the API in README.md

Every time a branch or pull request is merged into the master branch, Travis executes the deployment script at deploy/deploy.collections.js.
This script connects to a MySQL database, checks what changed in collection/, and updates the database (removing, changing, or adding data so that it matches collection/ exactly).

The MySQL database above is used by the API to deliver the data. (see code)

I'd love it if someone could write about it in the README file (dive into the code to understand more).

More information about how to run the API (for example, how to run it locally): #198
