A collection of utilities for handling IPA phones.
Home Page: https://cdminix.me/phones
License: MIT License
I know it's not documented in the API reference, but the property PhoneCollection.dialect_list raises a TypeError:
>>> from phones import PhoneCollection
>>> pc = PhoneCollection(load_dialects=True)
>>> pc.langs("eus").dialect_list
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "[...]/python3.8/site-packages/phones/__init__.py", line 135, in dialect_list
return list(sorted(self.data[self.source.dialect_column].unique()))
TypeError: '<' not supported between instances of 'float' and 'str'
The traceback points to the sorted call in dialect_list.
I am using version 0.0.4
This property may still be in a testing phase, since it is not documented, but I'm reporting it in case you haven't noticed the bug yet.
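For what it's worth, the mixed types seem to come from languages that have no named dialect: pandas stores those entries as NaN floats, which sorted() cannot compare with strings. A minimal sketch of the failure and a possible workaround (safe_dialect_list is a hypothetical helper, not part of the library):

```python
# Mixing NaN (pandas' placeholder for a missing dialect name) with
# strings makes sorted() fail with exactly this TypeError:
try:
    sorted(["Western", float("nan"), "Eastern"])
except TypeError as err:
    print(err)  # '<' not supported between instances of 'float' and 'str'

# Possible fix: skip the non-string entries before sorting.
def safe_dialect_list(values):
    """Sorted dialect names, ignoring NaN placeholders."""
    return sorted(v for v in values if isinstance(v, str))

print(safe_dialect_list(["Western", float("nan"), "Eastern"]))
# ['Eastern', 'Western']
```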
Thanks in advance!
First of all, thanks for the repository; it looks like it will be very helpful.
I am having some issues, though. I installed the package via pip. I tried to check some Iberian languages, and surprisingly neither Catalan nor Asturian was loaded:
>>> from phones import PhoneCollection
>>> pc = PhoneCollection()
>>> pc.langs('cat').values
[]
>>> pc.langs('ast').values
[]
I checked the .csv file the program uses from phoible (https://raw.githubusercontent.com/phoible/dev/master/data/phoible.csv), and both Asturian and Catalan are present with those ISO codes (cat and ast). I don't know what the problem could be in this case. I imagine it could be related to the names of the dialects in some way.
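In case it helps triage, here is a sketch of what I suspect is happening, using a miniature stand-in for phoible.csv (the ISO6393 and SpecificDialect column names are assumptions based on the upstream file): both codes exist in the data, but every one of their inventories carries a specific dialect, so a loader that drops dialect-specific rows would drop these languages entirely.

```python
import pandas as pd

# Miniature stand-in for phoible.csv; column names are assumptions.
df = pd.DataFrame(
    {
        "ISO6393": ["cat", "cat", "ast", "eus"],
        "SpecificDialect": ["Eastern Catalan", "Valencian", "Asturian", None],
    }
)

for code in ("cat", "ast"):
    rows = df[df["ISO6393"] == code]
    # Present in the data, but only as dialect-specific inventories.
    print(code, len(rows), rows["SpecificDialect"].notna().all())
```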
Thanks in advance
To avoid future build failures going undetected, some basic tests should be implemented that import all modules and exercise some basic functionality.
Currently https://cdminix.me/phones/examples/plots/ shows the phones in English and German, but it seems some phones are not recognized as shared between the two languages. Maybe this is some kind of float-equality issue?
Greetings.
I see that PhoneCollection.values fails. Using the commands featured in the basic usage section, I encounter an error:
>>> from phones import PhoneCollection
>>> pc = PhoneCollection()
>>> ph = pc.langs("eng").values[0]
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1490, in array_func
result = self.grouper._cython_operation(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 959, in _cython_operation
return cy_op.cython_operation(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 657, in cython_operation
return self._cython_op_ndim_compat(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 497, in _cython_op_ndim_compat
return self._call_cython_op(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 541, in _call_cython_op
func = self._get_cython_function(self.kind, self.how, values.dtype, is_numeric)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 173, in _get_cython_function
raise NotImplementedError(
NotImplementedError: function is not implemented for this dtype: [how->mean,dtype->object]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 1692, in _ensure_numeric
x = float(x)
ValueError: could not convert string to float: 'a aː aː'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 1696, in _ensure_numeric
x = complex(x)
ValueError: complex() arg is a malformed string
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/site-packages/phones/__init__.py", line 219, in values
self.data.groupby(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1855, in mean
result = self._cython_agg_general(
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1507, in _cython_agg_general
new_mgr = data.grouped_reduce(array_func)
File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 1503, in grouped_reduce
applied = sb.apply(func)
File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 329, in apply
result = func(self.values, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1503, in array_func
result = self._agg_py_fallback(values, ndim=data.ndim, alt=alt)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1457, in _agg_py_fallback
res_values = self.grouper.agg_series(ser, alt, preserve_dtype=True)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 994, in agg_series
result = self._aggregate_series_pure_python(obj, func)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/ops.py", line 1015, in _aggregate_series_pure_python
res = func(group)
File "/usr/local/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1857, in <lambda>
alt=lambda x: Series(x).mean(numeric_only=numeric_only),
File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 11556, in mean
return NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 11201, in mean
return self._stat_function(
File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 11158, in _stat_function
return self._reduce(
File "/usr/local/lib/python3.8/site-packages/pandas/core/series.py", line 4670, in _reduce
return op(delegate, skipna=skipna, **kwds)
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 96, in _f
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 158, in f
result = alt(values, axis=axis, skipna=skipna, **kwds)
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 421, in new_func
result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 727, in nanmean
the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
File "/usr/local/lib/python3.8/site-packages/pandas/core/nanops.py", line 1699, in _ensure_numeric
raise TypeError(f"Could not convert {x} to numeric") from err
TypeError: Could not convert a aː aː to numeric
However, I see that the method that takes allophones into account works as expected:
>>> from phones import PhoneCollection
>>> pc = PhoneCollection()
>>> pc.langs("eng").values_with_allophones[:5]
[aː (eng), b (eng), b (eng), d (eng), d (eng)]
I guess it must be due to the c != self.source.allophone_column part in this line:
Line 220 in a497d50
I encountered this error in versions 0.0.5 and 0.0.6
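The underlying failure is easy to reproduce outside the library: pandas cannot take the mean of an object (string) column, so any groupby(...).mean() that still sees the allophone strings hits this fallback. A minimal sketch on toy data (column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "phoneme": ["a", "a", "b"],
        "allophones": ["a aː aː", "a", "b"],  # strings cannot be averaged
        "height": [1.0, 0.5, 0.0],
    }
)

# Averaging the string column fails (TypeError on recent pandas):
try:
    df.groupby("phoneme")[["allophones"]].mean()
except Exception as err:
    print(type(err).__name__, err)

# Restricting the aggregation to numeric columns avoids the error:
means = df.groupby("phoneme")[["height"]].mean()
print(means)
```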
Thanks in advance
As pointed out in #4, some languages are made up of a group of specific dialects in phoible.
We need to figure out how to communicate this best, and add an option to phones to filter by dialect.
I think it might be best to go with something like this:
from phones import PhoneCollection
pc = PhoneCollection()
pc.langs("ast")
>>> ValueError: Need to select a dialect for "ast". Dialects can be listed using the list_dialects flag
pc.langs("ast", list_dialects=True)
>>> ["Asturian (Western)", "Asturian (North-Eastern)"]
pc.langs("ast", "Asturian (Western)")
It would be nice to allow something like pc.langs("ast", "western"), which could be achieved by simply checking whether the dialect string occurs in only one of the dialect options.
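The substring rule could look roughly like this (resolve_dialect is a hypothetical helper, not an existing API; dialect names are the examples from above):

```python
# Accept a partial dialect name when it matches exactly one of the
# language's dialects; anything ambiguous or unknown raises.
def resolve_dialect(query, dialects):
    matches = [d for d in dialects if query.lower() in d.lower()]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"{query!r} matches {len(matches)} dialects: {matches}")

dialects = ["Asturian (Western)", "Asturian (North-Eastern)"]
print(resolve_dialect("western", dialects))  # Asturian (Western)
```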
Hello again.
I found that no allophones are being loaded, although the column is set by default in the __init__ of PHOIBLE (allophone_column='Allophones').
I checked it in all phonemes:
>>> from phones import PhoneCollection
>>> pc = PhoneCollection()
>>> {tuple(sorted(p.allophones)) for p in pc.langs(pc.lang_list).values}
{()}
>>>
I might be missing something, though...
I am using version 0.0.4
I am also getting this warning, btw:
python3.8/site-packages/phones/__init__.py:219: FutureWarning: The default value of numeric_only
in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False.
Either specify numeric_only or select only columns which should be valid for the function.
self.data.groupby(
Thanks in advance!
Pointed out in #9
python3.8/site-packages/phones/__init__.py:219: FutureWarning: The default value of numeric_only
in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False.
Either specify numeric_only or select only columns which should be valid for the function.
self.data.groupby(
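A possible fix for the call at line 219 is to be explicit about numeric_only, which both silences the warning and preserves the old behaviour; sketched on toy data (column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "phoneme": ["a", "a", "b"],
        "label": ["x", "y", "z"],  # object column that triggered the warning
        "height": [1.0, 0.5, 0.0],
    }
)

# Explicit numeric_only=True: object columns are dropped, no FutureWarning.
means = df.groupby("phoneme").mean(numeric_only=True)
print(means)
```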
I installed the library using the pip install phones command and ran the following script:
from phones.convert import Converter
converter = Converter()
but got this error:
Traceback (most recent call last):
File "ipa_arpa.py", line 3, in <module>
from phones.convert import Converter
File "/Users/yehorsmoliakov/opt/miniconda3/lib/python3.8/site-packages/phones/convert.py", line 22, in <module>
from .phonecodes.src import phonecodes
ModuleNotFoundError: No module named 'phones.phonecodes'
Probably because of the upgrade to Python 3.8+, the documentation build seems to fail.
We should add SAMPA as another phonetic alphabet, using this as the reference: https://www.phon.ucl.ac.uk/home/sampa/
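A few SAMPA-to-IPA correspondences from the UCL reference page could seed the new alphabet table; the helper below is only a sketch of the idea, not the library's converter, and the table is deliberately incomplete:

```python
# Partial SAMPA-to-IPA table (from the UCL SAMPA reference page).
SAMPA_TO_IPA = {
    "A": "ɑ", "E": "ɛ", "@": "ə", "{": "æ",
    "2": "ø", "9": "œ", "S": "ʃ", "Z": "ʒ",
    "T": "θ", "D": "ð",
}

def sampa_to_ipa(text):
    """Convert a space-separated SAMPA string symbol by symbol,
    leaving unknown symbols untouched."""
    return " ".join(SAMPA_TO_IPA.get(s, s) for s in text.split())

print(sampa_to_ipa("D @ S"))  # ð ə ʃ
```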
I'm creating a PhoneCollection with the drop_dialects and merge_same_language flags both set to False in order to load as many languages as possible.
>>> from phones import PhoneCollection
>>> pc=PhoneCollection(drop_dialects=False,merge_same_language=False)
but I get an exception:
>>> pc=PhoneCollection(drop_dialects=False,merge_same_language=False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python310\lib\site-packages\phones\__init__.py", line 77, in __init__
].apply(lambda x: unicodedata.normalize("NFC", x))
File "C:\Python310\lib\site-packages\pandas\core\series.py", line 4433, in apply
return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File "C:\Python310\lib\site-packages\pandas\core\apply.py", line 1088, in apply
return self.apply_standard()
File "C:\Python310\lib\site-packages\pandas\core\apply.py", line 1143, in apply_standard
mapped = lib.map_infer(
File "pandas\_libs\lib.pyx", line 2870, in pandas._libs.lib.map_infer
File "C:\Python310\lib\site-packages\phones\__init__.py", line 77, in <lambda>
].apply(lambda x: unicodedata.normalize("NFC", x))
TypeError: normalize() argument 2 must be str, not float
I don't know if you already have this in hand in #5, but for now I can proceed with a function that wraps unicodedata.normalize in an exception handler, i.e.:
def normalize(x):
    try:
        return unicodedata.normalize("NFC", x)
    except TypeError:
        # NaN entries (floats) cannot be normalized; return them as-is
        return x

if self.source.allophone_column is not None:
    self.data[self.source.allophone_column] = self.data[
        self.source.allophone_column
    ].apply(normalize)
Thanks for a really useful library!!
"ɶ", which appears when converting "&" from xsampa to ipa, does not seem to be found in the phoible database.