BDPROTO

BDPROTO is a database of phonological inventories from ancient and reconstructed languages. The aggregated phonological inventory data and associated metadata are available in a flat CSV file in this directory named bdproto.csv. Bibliographic references for data points are available in the sources.bib file.
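As a minimal sketch of working with a flat CSV file like the one described above, the following counts segments per inventory. The column names (`BdprotoID`, `LanguageName`, `Phoneme`), the one-row-per-segment layout, and the sample rows are assumptions made for illustration; check the header of bdproto.csv before relying on them.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for bdproto.csv. The column names and the
# one-row-per-segment layout are assumptions, not the documented schema.
SAMPLE = """BdprotoID,LanguageName,Phoneme
1,Proto-Example-A,p
1,Proto-Example-A,t
1,Proto-Example-A,a
2,Proto-Example-B,k
2,Proto-Example-B,i
"""


def inventory_sizes(csv_file, id_column="BdprotoID"):
    """Count rows (segments) per inventory ID in a long-format CSV."""
    reader = csv.DictReader(csv_file)
    return Counter(row[id_column] for row in reader)


sizes = inventory_sizes(io.StringIO(SAMPLE))
print(dict(sizes))  # {'1': 3, '2': 2}
```

To run this against the real file, pass an open file handle instead of the in-memory sample, for example `open("bdproto.csv", newline="", encoding="utf-8")`, and adjust `id_column` to match the actual header.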

BDPROTO 1.0 is described in:

  • Marsico, Egidio, Sebastien Flavier, Annemarie Verkerk and Steven Moran. 2018. BDPROTO: A Database of Phonological Inventories from Ancient and Reconstructed Languages. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 1654-1658. May 7-12, Miyazaki, Japan. Online: http://www.lrec-conf.org/proceedings/lrec2018/pdf/534.pdf.

An expanded version, BDPROTO 1.1, is described in:

The BDPROTO data are available under the Creative Commons Attribution Share Alike 4.0 International license. If you use the BDPROTO data in your research, we recommend using the most recent release in your analyses. We archive each release of BDPROTO in Zenodo.

The original source data (and project name) come from:

This legacy resource was converted into Unicode UTF-8 using the principles defined in Moran & Cysouw (2018) and according to the PHOIBLE Unicode IPA conventions.

The original BDPROTO data are available in various formats, along with the extraction and transformation scripts, at: https://github.com/bdproto/bdproto-legacy. BDPROTO-legacy contains a convenience sample aimed at genealogical diversity, and it contains no duplicate inventories for a given reconstruction.

Three additional resources have been compiled to update and extend the coverage of the original BDPROTO sample. The raw data for these three resources are in the src directory.

These data points contain more recent reconstructions, so in some cases there is more than one inventory for a given reconstruction.
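Because a reconstruction can therefore appear more than once, it may help to list the affected languages before running an analysis. The sketch below groups inventory IDs by a language identifier and reports those with more than one inventory. The column names (`BdprotoID`, `Glottocode`, `Source`), the placeholder codes, and the `legacy` label are hypothetical; only the `uz` and `huji` labels come from this README.

```python
import csv
import io
from collections import defaultdict

# Made-up rows; column names and Glottocode values are placeholders,
# not taken from the real bdproto.csv.
SAMPLE = """BdprotoID,Glottocode,Source
1,aaaa1234,legacy
2,aaaa1234,huji
3,bbbb5678,uz
"""


def languages_with_multiple_inventories(
    csv_file, lang_column="Glottocode", id_column="BdprotoID"
):
    """Map each language code to its inventory IDs, keeping only codes with more than one."""
    inventories = defaultdict(set)
    for row in csv.DictReader(csv_file):
        inventories[row[lang_column]].add(row[id_column])
    return {
        lang: sorted(ids)
        for lang, ids in inventories.items()
        if len(ids) > 1
    }


dupes = languages_with_multiple_inventories(io.StringIO(SAMPLE))
print(dupes)  # {'aaaa1234': ['1', '2']}
```

Whether to keep all inventories for a language, or to pick one per language (for example the most recent source), depends on the research question; the function only surfaces the cases that need a decision.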

The ancient-near-east inventories were collected as part of a project on ancient Near East languages at the Department of Comparative Linguistics at the University of Zurich. Additional inventories were also extracted from source references at the Department of Comparative Linguistics (we simply call this source uz). Work by The Hebrew University of Jerusalem includes phonological inventories from recent publications. This source is labeled huji.

For all of the data sources, we have gathered additional metadata (when available) including identifying information such as Glottolog codes and information about estimated time-depths, possible homelands, etc.

The phonological inventory data from the various raw input sources and their metadata are aggregated into a single flat-file table.

We have also collected and curated references for each data point and these are available in the sources.bib file.

Inferred geo-coordinates for homelands, when available, come from:

  • Wichmann, S., Müller, A., & Velupillai, V. 2010. Homelands of the world’s language families: A quantitative approach. Diachronica, 27(2), 247-276.

Preliminary studies based on BDPROTO have been presented at the following conferences:

  • Grossman, Eitan. 2021. Universals of phonological segment borrowing? Questions, evidence, methods, findings. Keynote address at the 43. Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft (DGfS). (Freiburg, Germany, 23-26 February 2021). Slides.

  • Moran, Steven and Eitan Grossman. 2021. Temporal bias: a new type of bias for typologists to worry about. 5th Usage-Based Linguistics Conference (Tel Aviv, Israel, July 5-7 2021). Slides.

Published papers making extensive use of BDPROTO include:

This repository contains the development version of the data, i.e., we continue to add to, edit, and refine BDPROTO. We are also making the BDPROTO data available in the Cross-Linguistic Data Format here:

A website is forthcoming.
