sdispater / tomlkit
Style-preserving TOML library for Python
License: MIT License
The following is valid TOML; however, tomlkit isn't able to properly handle the dual scopes:
a.b.c = 12
[a.b]
d = 34
import tomlkit
doc = tomlkit.parse("""\
a.b.c = 12
[a.b]
d = 34
""")
# looking at the doc shows incomplete data
doc # {'a': {'b': {'d': 34}}}
# retrieving data works
doc['a']['b']['c'] # 12
# modifying data fails
doc['a']['b']['c'] = 45
doc['a']['b']['c'] # 12
doc['a']['b']['d'] = 100
doc['a']['b']['d'] # 34
@sdispater I have been working on a refactored version of tomlkit. This new version addresses many of the outstanding issues of the current implementation and makes the TOML objects more natural. I found that in some cases having TOML objects is simply problematic, so I needed a way to quickly convert TOML objects into Python objects (the pyobj API).
I fully understand if you don't want these changes; they are broad. One of the underlying goals of this refactor was to make the parsing more modular, so that the same parser could be used to parse several different versions of TOML (a very possible future once TOML specifies a versioning pragma/scheme). One of the choices made in this refactor was to no longer perfectly preserve whitespace. I found too many whitespace instances to be ridiculous to preserve (e.g. this key: a . b ."foo" .c). Instead, I make sure to preserve the insertion order of comments and key-values but let tomlkit decide how to lay out the TOML object when flattening into a TOML document (str). Some whitespace preservation can be reimplemented without much difficulty (e.g. newlines). We preserve comments and newlines. Block indents (where the entire table gets indented by X spaces) can be added relatively easily. I do not see value in perfectly preserving inconsistent whitespace, and if any whitespace is preserved I would rather see some amount of whitespace standardization (much like black does for Python).
toml: converts Python object into TOML object
pyobj: converts TOML object into Python object
loads/parse: converts TOML document (str) into TOML object
dumps: converts TOML object (convert into base type first) into TOML document (str)
flatten: converts TOML object (use as is) into TOML document (str)
load: reads TOML document (str) from filehandle, uses loads
dump: uses dumps, writes TOML document (str) to filehandle
The refactor also introduces the ability for tables and inline tables to be interchangeable; rendering of one versus the other is based on a table's complexity, which can either be set to true or is derived based on TOML rules (e.g. if a table contains comments it is complex). This same logic is used to toggle between AoT and "inline" AoT.
Tables and Arrays were strongly influenced by collections.OrderedDict.
I'm trying to create
[a.b.c]
d = 10
but I get
[a]
[a.b]
[a.b.c]
d = 10
Looking around, some things I see that might be useful:
tomlkit.items.Table(..., is_super_table), but I am unclear where I apply it and what problems I need to avoid vs. what is enforced for me.
tomlkit.items.Key supports a type and dotted, but it is unclear what should be done. With only dotted=True, it looks like it'll still get a Basic type and be quoted. So I need to specify both Bare and dotted?
Hi. I ran into an issue when using tomlkit on Windows with multiprocessing.
import multiprocessing as mp
import tomlkit
class Worker:
    def run(self):
        print(self.db_conf)
        print(self.db_conf['path'])
        # bug here, get() returns None
        print(self.db_conf.get('path'))
if __name__ == '__main__':
    w = Worker()
    conf = tomlkit.loads("""
[db]
path = '~/files/'
""")
    w.db_conf = conf['db']
    p = mp.Process(target=w.run)
    p.start()
    p.join()
The output is:
{'path': '~/files/'}
~/files/
None
Somehow Container.get() lost track of the values after pickling into another process. On Linux this script works fine.
Hi
>>> import tomlkit
>>> import pickle
>>> example = tomlkit.loads("foo = 0")
>>> pickle.loads(pickle.dumps(example))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __new__() missing 2 required positional arguments: 'trivia' and 'raw'
This is very unexpected from a library that claims that a TOMLDocument "behaves like a standard dictionary".
It's the Integers (and Floats) that cause this issue.
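The error message is consistent with how Python rebuilds instances of int subclasses: int already defines __getnewargs__, which supplies only the numeric value, so a subclass whose __new__ requires extra arguments cannot be reconstructed. The sketch below reproduces this with a hypothetical StyledInt class (an assumption for illustration, not tomlkit's actual Integer) and shows one possible remedy, overriding __getnewargs__. It uses copy.copy, which goes through the same __reduce_ex__ protocol as pickle, to keep the example self-contained.

```python
import copy


class StyledInt(int):
    """Hypothetical int subclass carrying extra formatting data."""

    def __new__(cls, value, trivia, raw):
        self = super().__new__(cls, value)
        self.trivia = trivia
        self.raw = raw
        return self


class CopyableStyledInt(StyledInt):
    def __getnewargs__(self):
        # Supply every argument that __new__ requires, not just the value.
        return (int(self), self.trivia, self.raw)


def can_copy(obj):
    """Return True if obj survives reconstruction via __reduce_ex__."""
    try:
        copy.copy(obj)
    except TypeError:
        return False
    return True
```

Whether tomlkit should fix this through __getnewargs__, __reduce__, or a different mechanism is a design decision for the maintainers; the sketch only shows that the missing constructor arguments are the cause.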
Parsing large TOML files is prohibitively slow. On a large Django application with 230 dependencies, Poetry has generated a 4145 line, 5 KB pyproject.lock file that takes more than 4 minutes to parse on my iMac:
$ time .venv/bin/python -c "import tomlkit; tomlkit.parse(open('pyproject.lock').read())"
250.63s user 2.00s system 99% cpu 4:13.40 total
If I break early, it always seems to be stuck in _restore_idx:
>>> tomlkit.parse(text)
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../.venv/lib/python2.7/site-packages/tomlkit/api.py", line 51, in parse
return Parser(string).parse()
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 144, in parse
key, value = self._parse_table()
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 959, in _parse_table
result = self._parse_aot(result, name)
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 1008, in _parse_aot
_, table = self._parse_table(name_first)
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 927, in _parse_table
key_next, table_next = self._parse_table(name)
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 924, in _parse_table
is_aot_next, name_next = self._peek_table()
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 993, in _peek_table
self._restore_idx(*idx)
File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 301, in _restore_idx
[(i + idx, TOMLChar(c)) for i, c in enumerate(self._src[idx:])]
File ".../.venv/lib/python2.7/site-packages/tomlkit/toml_char.py", line 8, in __init__
super(TOMLChar, self).__init__()
KeyboardInterrupt
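The traceback shows _restore_idx rebuilding a list of (index, TOMLChar) pairs for the remainder of the source. If that happens every time the parser peeks at a table header, the total work grows roughly quadratically with file size. A minimal sketch of an alternative, assuming the parser can index the source string directly: the Cursor class and its method names are hypothetical and are not tomlkit's parser API.

```python
class Cursor:
    """Hypothetical cursor over a source string (not tomlkit's parser)."""

    def __init__(self, src):
        self._src = src
        self._idx = 0

    @property
    def current(self):
        # An empty string signals the end of input.
        return self._src[self._idx] if self._idx < len(self._src) else ""

    def advance(self):
        if self._idx < len(self._src):
            self._idx += 1
            return True
        return False

    def mark(self):
        # Saving a position is just remembering an integer.
        return self._idx

    def restore(self, idx):
        # Restoring is O(1): no per-character objects are rebuilt.
        self._idx = idx
```

With this shape, peeking ahead and backtracking costs the same regardless of how much input remains.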
A bare key must be non-empty, but an empty quoted key is allowed (though discouraged).
I have almost finished a fix for this. Just waiting for #15 to be accepted.
# Python 3.6.5
toml_file = """
[[patterns]]
[patterns.start.0]
name = "name 0"
[patterns.start.1]
name = "name 1"
"""
import toml, tomlkit
toml.loads(toml_file)
# {'patterns': [{'start': {'0': {'name': 'name 0'}, '1': {'name': 'name 1'}}}]}
tomlkit.loads(toml_file)
Traceback (most recent call last):
File "<input>", line 1, in <module>
tomlkit.loads(toml_file)
File "/~/.venv/lib64/python3.6/site-packages/tomlkit/api.py", line 38, in loads
return parse(string)
File "/~/.venv/lib64/python3.6/site-packages/tomlkit/api.py", line 52, in parse
return Parser(string).parse()
File "/~/.venv/lib64/python3.6/site-packages/tomlkit/parser.py", line 170, in parse
key, value = self._parse_table()
File "/~/.venv/lib64/python3.6/site-packages/tomlkit/parser.py", line 935, in _parse_table
values.append(key_next, table_next)
File "/~/.venv/lib64/python3.6/site-packages/tomlkit/container.py", line 103, in append
raise KeyAlreadyPresent(key)
tomlkit.exceptions.KeyAlreadyPresent: Key "start" already exists.
tomlkit.items.Array inherits from list, but overrides only the append method.
The remaining methods execute but do nothing, so it is not possible to remove elements from the list or insert a new one at a specific position, which makes editing files very problematic.
When you subtract two Date objects tomlkit takes the result and makes it into a new Date object:
def __sub__(self, other):
    result = super(Date, self).__sub__(other)
    return self._new(result)
This behavior is incorrect, as the value that should get returned is a timedelta object, as per the Python docs.
The behavior is implemented correctly for Datetime objects in tomlkit, so I'll go ahead and make a pull request with the changes added to the Date object as well.
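For reference, subtracting two dates yields a timedelta, while subtracting a timedelta from a date yields a date. A minimal sketch of a __sub__ that re-wraps only date results follows; SketchDate is a hypothetical stand-in and not tomlkit's Date class.

```python
from datetime import date, timedelta


class SketchDate(date):
    """Hypothetical date subclass (not tomlkit's Date)."""

    def __sub__(self, other):
        result = super().__sub__(other)
        # date - timedelta yields a date; date - date yields a timedelta.
        if isinstance(result, date):
            return SketchDate(result.year, result.month, result.day)
        return result
```

The isinstance check is the whole fix in this sketch: a timedelta result is passed through untouched instead of being forced into a date.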
table.copy appears to return an instance of dict, not table:
$ rm -rf venv; python3.7 -m venv venv; venv/bin/python -m pip install --quiet tomlkit; venv/bin/python -m pip list | grep tomlkit; venv/bin/python -c '
import tomlkit
table = tomlkit.table()
print(type(table.copy()))'
/Users/julian/Desktop/venv/lib/python3.7/site-packages/pip/_vendor/msgpack/fallback.py:133: DeprecationWarning: encoding is deprecated, Use raw=False instead.
unpacker = Unpacker(None, max_buffer_size=len(packed), **kwargs)
tomlkit 0.5.8
<class 'dict'>
Besides having the type change during copying, this makes doing immutable changes to tables more difficult (copying a table and mutating the copy).
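This matches standard dict behavior: dict.copy() always returns a plain dict, even when called on a subclass. A minimal sketch of a type-preserving copy, using a hypothetical SketchTable class in place of tomlkit's table type:

```python
class SketchTable(dict):
    """Hypothetical dict subclass standing in for a table type."""

    def copy(self):
        # dict.copy() always returns a plain dict, so rebuild with our own type.
        return type(self)(self)

    def __copy__(self):
        # Make copy.copy() agree with the copy() method.
        return self.copy()
```

A real table would presumably also need to copy its internal body and trivia; the sketch covers only the return type.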
>>> import tomlkit
>>> contents = """\
... [students]
... tommy = 87
... mary = 66
...
... [subjects]
... maths = "maths"
... english = "english"
...
... [students.bob]
... score = 91
... """
>>> d = tomlkit.loads(contents)
>>> d.get('students')
{'bob': {'score': 91}}
>>> d
{'students': {'tommy': 87, 'mary': 66, 'bob': {'score': 91}},
'subjects': {'maths': 'maths', 'english': 'english'},
'students': {'tommy': 87, 'mary': 66, 'bob': {'score': 91}}}
We defined the [students.bob] section after [subjects], which the TOML spec allows. Even so, get('students') returns only the later fragment, and students appears twice in the document.
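The expected result can be described as merging every table header into one nested mapping, no matter where the header appears in the file. A minimal sketch under simplifying assumptions: headers are given as plain dotted strings without quoted parts, and merge_tables is a hypothetical helper, not a tomlkit function.

```python
def merge_tables(sections):
    """Merge (header, values) pairs into one nested dict, in any order."""
    root = {}
    for header, values in sections:
        node = root
        for part in header.split("."):
            # Reuse the existing sub-table if an earlier header created it.
            node = node.setdefault(part, {})
        node.update(values)
    return root


merged = merge_tables([
    ("students", {"tommy": 87, "mary": 66}),
    ("subjects", {"maths": "maths", "english": "english"}),
    ("students.bob", {"score": 91}),
])
```

Here merged["students"] contains tommy, mary and bob together, which is what get('students') would be expected to return.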
>>> from tomlkit import dumps
>>> from tomlkit import parse
>>> doc = parse("foo=10")
>>> doc["bar"]=11
>>> dumps(doc)
'foo=10bar = 11\n'
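The output suggests that the new key is appended directly after the last character when the parsed document has no trailing newline. A string-level sketch of the expected behavior follows; append_line is a hypothetical helper, and tomlkit would need to do the equivalent on its internal items instead.

```python
def append_line(document, line):
    """Append a line, first terminating an unterminated last line."""
    if document and not document.endswith("\n"):
        document += "\n"
    return document + line + "\n"
```

For the case above, append_line("foo=10", "bar = 11") keeps the two key-value pairs on separate lines.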
TOML has a few versions. The latest version, 0.5.0, was released a week ago.
It would be good if TOML Kit's README specified what version of the spec it supported.
=================================== FAILURES ===================================
_____________________ test_datetimes_behave_like_datetimes _____________________
def test_datetimes_behave_like_datetimes():
i = item(datetime(2018, 7, 22, 12, 34, 56))
assert i == datetime(2018, 7, 22, 12, 34, 56)
assert i.as_string() == "2018-07-22T12:34:56"
> i += timedelta(days=1)
tests/test_items.py:275:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tomlkit/items.py:525: in __add__
result = super(DateTime, self).__add__(other)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'tomlkit.items.DateTime'>, value = 2018
_ = (7, 23, 12, 34, 56, 0, ...)
def __new__(cls, value, *_): # type: (..., datetime, ...) -> datetime
return datetime.__new__(
cls,
> value.year,
value.month,
value.day,
value.hour,
value.minute,
value.second,
value.microsecond,
tzinfo=value.tzinfo,
)
E AttributeError: 'int' object has no attribute 'year'
tomlkit/items.py:498: AttributeError
_________________________ test_dates_behave_like_dates _________________________
def test_dates_behave_like_dates():
i = item(date(2018, 7, 22))
assert i == date(2018, 7, 22)
assert i.as_string() == "2018-07-22"
> i += timedelta(days=1)
tests/test_items.py:295:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tomlkit/items.py:584: in __add__
result = super(Date, self).__add__(other)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'tomlkit.items.Date'>, value = 2018, _ = (7, 23)
def __new__(cls, value, *_): # type: (..., date, ...) -> date
> return date.__new__(cls, value.year, value.month, value.day)
E AttributeError: 'int' object has no attribute 'year'
tomlkit/items.py:565: AttributeError
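The locals in both failures (value = 2018 with the remaining components in _) suggest that date arithmetic called the subclass constructor with integer components instead of a date object; as far as I know, CPython 3.8 and later return subclass instances from this arithmetic. A minimal sketch of a __new__ that tolerates both call forms, using a hypothetical WrappingDate class and not tomlkit's own:

```python
from datetime import date, timedelta


class WrappingDate(date):
    """Hypothetical date subclass built from an existing date object."""

    def __new__(cls, value, *rest):
        if isinstance(value, date):
            # Normal path: wrap an existing date.
            return super().__new__(cls, value.year, value.month, value.day)
        # Path taken when date arithmetic calls cls(year, month, day).
        return super().__new__(cls, value, *rest)
```

An alternative would be to override __add__ and rebuild the result explicitly; which approach fits tomlkit better is left open here.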
I stumbled upon this when poetry version would not update the pyproject.toml.
This is the smallest example with which I can reproduce this on master:
# tests/examples/out_of_order_write.toml
[a.a]
key = "value"
[a.b]
[a.a.a]
def test_write_nested_array(example):
    doc = loads(example("out_of_order_write"))
    doc["a"]["a"]["key"] = "new_value"
    assert doc["a"]["a"]["key"] == "new_value"
The test succeeds when changing the order in the example document to
[a.a]
key = "value"
[a.a.a]
[a.b]
I want to group my tables:
[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543
When I try to manually construct this, I instead get
[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543
I expected no newlines to be added for me, so I explicitly specified a newline between the [first.a.b.c] table and [second]. The added newlines surprised me and made me wonder whether tomlkit was properly preserving the lack of newlines, but it seems to.
So I created the following test case to experiment with how newlines are dealt with
import tomlkit
t = tomlkit.loads("""[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543""")
print("Round-trip")
print("```toml")
print(tomlkit.dumps(t))
print("```")
print()
t["second"].add("extra", tomlkit.table())
t["second"].add("extra1", tomlkit.table())
three = tomlkit.table()
three["foo"] = 2
three["bar"] = 1
child = tomlkit.table()
child["alice"] = 3
child["bob"] = 10
three["child"] = child
t["three"] = three
print("Changed")
print("```toml")
print(tomlkit.dumps(t))
print("```")
print()
The output is:
Round-trip
[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543
Changed
[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543
[second.extra]
[second.extra1]
[three]
foo = 2
bar = 1
[three.child]
alice = 3
bob = 10
I was surprised that second.extra didn't have a newline before it but second.extra1 did. I imagine this just shows how the newlines are being auto-added, but it is still surprising.
This might be partly a bug in poetry but the sdist tarball's setup.py installs the tests package alongside the tomlkit package.
The following test case
import pytest
import tomlkit
@pytest.fixture
def toml_content():
    return """
[major]
name = "Charles Bownam"
[alien]
name = "et"
[major.clerk]
name = "John Barradell"
""".strip()
def test_parse(toml_content):
    doc = tomlkit.loads(toml_content)
    assert doc
raises: tomlkit.exceptions.KeyAlreadyPresent: Key "major" already exists.
According to https://www.tomllint.com/, the content is valid TOML.
Affects python-poetry/poetry#563
In accordance with the grammar
https://github.com/toml-lang/toml/blob/bb47759841ac368d86eb7a459bd7eea7162b9a80/toml.abnf#L213-L221
leading, trailing, and duplicate commas are not allowed.
In other words the following cases should not be allowed (but currently are possible):
No comma:
a = { b = 12 c= 'hello' }
Leading comma:
a = { , b = 12 }
Duplicate comma:
a = { b = 12 ,,,,,, c='hello' }
Trailing comma:
a = { b = 12 , }
I have also addressed this with a fix dependent upon #15.
Could you please add MANIFEST.in and include tests in the sdist archive? In distributions we execute tests on the packages as a sanity check, as otherwise Python packaging is just copying files and we wouldn't know if something broke them.
Hi, I'm getting started with tomlkit. I noticed that Container inherits from dict, which means its behavior may be difficult to predict in different environments.
In particular, quoting from PyPy's docs about subclassing builtin types:
Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by e.g. the built-in get() method.
As far as I can tell, though, most of the methods are already explicitly implemented in Container, so this unpredictability could be eliminated by subclassing collections.abc.MutableMapping instead.
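The difference can be shown with two small classes that uppercase values on lookup. Both are hypothetical illustrations and unrelated to tomlkit's Container; the first subclasses dict, the second implements collections.abc.MutableMapping.

```python
from collections.abc import MutableMapping


class ShoutingDict(dict):
    """dict subclass: built-in methods bypass the overridden __getitem__."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


class ShoutingMapping(MutableMapping):
    """MutableMapping version: mixin methods go through our overrides."""

    def __init__(self, **items):
        self._data = dict(items)

    def __getitem__(self, key):
        return self._data[key].upper()

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

On CPython, ShoutingDict(path="x").get("path") returns the raw "x", while ShoutingMapping(path="x").get("path") returns "X", because the mixin get() is written in terms of __getitem__. The same holds for setdefault, update and pop, which MutableMapping derives from the five abstract methods.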
from tomlkit import dumps, parse
doc = parse("""[target.x86_64-pc-windows-gnu]
linker = "D:/msys64/mingw64/bin/gcc.exe"
ar = "D:/msys64/mingw64/bin/ar.exe"
""")
doc.remove("target.x86_64-pc-windows-gnu")
print(dumps(doc))
tomlkit.exceptions.NonExistentKey: 'Key ""target.x86_64-pc-windows-gnu"" does not exist.'
[TOMLDocument] behaves like a standard dictionary
>>> from tomlkit import parse
>>> tomldoc = parse('''[table]\nfoo="bar"''')
>>> tomldoc
{'table': {'foo': 'bar'}}
>>> data = {'table': {'foo': 'bar'}}
>>> tomldoc['table'].setdefault('baz', 'waldo')
'waldo'
>>> data['table'].setdefault('baz', 'waldo')
'waldo'
>>> tomldoc
{'table': {'foo': 'bar'}}
>>> data
{'table': {'foo': 'bar', 'baz': 'waldo'}}
I expected the TOMLDocument to gain a baz entry after the setdefault call. This would be helpful for adding data to document sections which may or may not exist yet.
tomlkit==0.5.3
I am wondering, would this be a feasible feature to implement? I dug into the code and saw there is an internal _insert_at method implemented; would it be a good idea to expose it as public?
My use case is that there is an important metadata table in the document, but if it's missing, I would like to put it at the very front so the user can be encouraged to fill it out. Appending it at the end makes it much less prominent, but there doesn't seem to be a good way to "clone" a document with internal trivia intact either (or maybe this is what I actually need?).
I realise I can simply write to the TOML file directly and re-parse the document, but that feels more like a hack than a proper solution to me. Plus, that approach would be susceptible to race conditions IMO because the user may be editing the file at the same time.
These warnings are new with python 3.8 (see the "Changed in version 3.8" note at the end of https://docs.python.org/3.8/reference/lexical_analysis.html#string-and-bytes-literals ):
=============================== warnings summary ===============================
tests/test_write.py:8
/builddir/build/BUILD/tomlkit-0.5.3/tests/test_write.py:8: SyntaxWarning: invalid escape sequence \e
d = {"foo": "\e\u25E6\r"}
tests/test_write.py:14
/builddir/build/BUILD/tomlkit-0.5.3/tests/test_write.py:14: SyntaxWarning: invalid escape sequence \e
assert loads(dumps(d))["foo"] == "\e\u25E6\r"
-- Docs: https://docs.pytest.org/en/latest/warnings.html
=============== 2 failed, 225 passed, 2 warnings in 2.28 seconds ===============
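One way to silence the warning without changing the test data is to spell the backslash explicitly. The unrecognized escape \e currently evaluates to a backslash followed by e, so the doubled form below denotes the same four-character string. This is a sketch of the likely fix, not the change that was actually committed.

```python
# "\\e" spells a literal backslash followed by "e" explicitly,
# which is what the unrecognized escape "\e" currently evaluates to.
explicit = {"foo": "\\e\u25E6\r"}

# A raw string is an alternative when no other escapes are needed.
raw_prefix = r"\e"
```

A raw string cannot be used for the whole value here, because \u25E6 and \r must still be interpreted as escapes.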
>>> import tomlkit
>>> doc = tomlkit.parse('[foo]\nbar=1')
>>> doc
{'foo': {'bar': 1}}
>>> import copy
>>> copy.copy(doc)
{}
In accordance with the grammar
https://github.com/toml-lang/toml/blob/bb47759841ac368d86eb7a459bd7eea7162b9a80/toml.abnf#L188-L200
leading and duplicate commas are not allowed (trailing is allowed).
In other words the following cases should not be allowed (but currently are possible):
No comma:
a = [ 12 34 ]
Leading comma:
a = [ , 12 ]
Duplicate comma:
a = [ 12 ,,,,,, 34 ]
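The comma rules for arrays and inline tables differ only in whether a trailing comma is allowed, so one check can cover both. The sketch below works on an already tokenized list in which "," is the separator and every other token is a value; commas_ok and this token representation are assumptions for illustration, not tomlkit internals.

```python
def commas_ok(tokens, allow_trailing):
    """Check comma placement in a flat token list such as ["12", ",", "34"]."""
    expect_value = True   # True at the start and right after a comma
    seen_value = False
    for tok in tokens:
        if tok == ",":
            if expect_value:
                return False  # leading or duplicate comma
            expect_value = True
        else:
            if not expect_value:
                return False  # two values with no comma between them
            expect_value = False
            seen_value = True
    if expect_value and seen_value:
        return allow_trailing  # input ended right after a comma
    return True
```

Arrays would call this with allow_trailing=True, and inline tables with allow_trailing=False.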
Dotted key names are not converted into nested tables.
See https://github.com/toml-lang/toml/tree/v0.5.0#keys
According to the 0.5.0 standard, this:
[VOLUME]
FILE = 'data/interim/volume.tif'
CRS = 'epsg:26949'
Z.MIN = -0.3048
Z.MAX = 1.524
Z.STEPS = 6
should produce (indented for readability)
{ 'VOLUME': {
'FILE': 'data/interim/volume.tif',
'CRS': 'epsg:26949',
'Z': {
'MIN': -0.3048,
'MAX': 1.524,
'STEPS': 6
}
}
}
But it produces
{'VOLUME': {
'FILE': 'data/interim/volume.tif',
'CRS': 'epsg:26949',
'Z.MIN': -0.3048,
'Z.MAX': 1.524,
'Z.STEPS': 6
}
}
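The expected conversion from dotted keys to nested tables can be sketched as a post-processing step over the flat result. The expand_dotted helper is hypothetical and makes a simplifying assumption: keys contain no quoted parts, so every dot is a separator.

```python
def expand_dotted(flat):
    """Turn {'Z.MIN': 1} style keys into nested dicts ({'Z': {'MIN': 1}})."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for part in parents:
            # Create intermediate tables on demand and descend into them.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


volume = expand_dotted({
    "FILE": "data/interim/volume.tif",
    "CRS": "epsg:26949",
    "Z.MIN": -0.3048,
    "Z.MAX": 1.524,
    "Z.STEPS": 6,
})
```

A proper fix would handle this in the parser, where quoted key parts such as "a.b" can still be told apart from dotted ones.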
I am interested in extending the TOML syntax to include a handful of changes that simplify my config files (for example I'm looking to make <value> and (condition) into valid keys without quotes).
The cleanest way for this to be done would be to adjust tomlkit to support a plugin design where I can register additional types of keys and additional types of values. This could further mean that users could for example configure tomlkit to disallow keys/values they do not wish to deal with (like inline tables).
Is this a direction/design change that would be of interest for tomlkit? Or is this outside of the desired scope? In theory this would also make future changes/additions to the TOML syntax easier to implement.
As more projects incorporate PEP 518, many will be aiming to keep compatibility with their existing config files as well (be they ini, yaml, json, etc.). In order to help abstract over config files of different types, it would be useful to be able to convert tomlkit objects into plain old Python objects (POPOs). One of TOML's strengths for Python is that all of its constructs can deserialise to POPOs: could this be made available for tomlkit users?
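Such a conversion could look roughly like the recursive function below. It is a sketch that assumes every value is a subclass of a builtin type (dict, list, int, float, str) or an actual bool; to_plain is a hypothetical name, and wrapper types that do not subclass a builtin would need their own branch.

```python
from collections.abc import Mapping


def to_plain(value):
    """Recursively convert builtin subclasses into exact builtin types."""
    if isinstance(value, Mapping):
        return {str(k): to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    # bool must be checked before int, because bool is an int subclass.
    for base in (bool, int, float, str):
        if isinstance(value, base):
            return base(value)
    # Dates, times and anything unknown are passed through unchanged.
    return value
```

The result contains only exact dict, list and scalar types, so it can be handed to code that was written for json or yaml data.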
It would be helpful if we could get the line numbers of every parsed TOML element.
We could probably add it as a new parameter to the Item class.
The line and column numbers could be stored during parsing. The method used to get line numbers for reporting parsing errors can be reused for this.
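Deriving a line and column from a character offset is cheap if the parser keeps the offset at which each item starts. A minimal sketch with 1-based numbering; line_col is a hypothetical helper and not the function tomlkit uses for error reporting.

```python
def line_col(src, offset):
    """Return the 1-based (line, column) of a character offset in src."""
    # Each newline before the offset starts a new line.
    line = src.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which makes line_start 0.
    line_start = src.rfind("\n", 0, offset) + 1
    return line, offset - line_start + 1
```

For src = "a = 1\nb = 2\n", offset 6 (the b) maps to line 2, column 1.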
The current comparison method does not take into account the possibility that an object without a key attribute could be passed in as other.
What follows is some slightly obfuscated output from one of my pytest runs.
self = <Key "a\lovely\key">
other = 'a\\lovely\\key'
def __eq__(self, other): # type: (Key) -> bool
> return self.key == other.key
E AttributeError: 'str' object has no attribute 'key'
..\..\.virtualenvs\<virtualenv>\lib\site-packages\tomlkit\items.py:164: AttributeError
As you can see, I tried comparing a 'tomlkit.items.Key' against a regular string, which failed spectacularly.
If I'm not mistaken, the worst that should happen when doing a comparison is getting False.
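The usual Python idiom for this is to return NotImplemented for unsupported operand types, which makes == fall back to False instead of raising. A sketch with a hypothetical SketchKey class follows; treating a plain string as equal to a key with the same text is a design choice assumed here, not confirmed tomlkit behavior.

```python
class SketchKey:
    """Hypothetical key wrapper (not tomlkit's Key)."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        if isinstance(other, SketchKey):
            return self.key == other.key
        if isinstance(other, str):
            return self.key == other
        # Let Python try the reflected comparison, then fall back to False.
        return NotImplemented

    def __hash__(self):
        # Keep hashing consistent with equality against plain strings.
        return hash(self.key)
```

With this, comparing a key against an int or None evaluates to False, and no AttributeError can escape from __eq__.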
I was in the process of updating usage of tomlkit over in requirementslib, where I need to stay compatible with an old usage of sources keys in a Pipfile (we now just use the source key in an AoT). I was hoping to simply pop the key and reassign it from sources to its proper name, source, if I encounter it. However, pop does not remove the key from the table, so I wind up retaining the original key and also adding the new key. The original key is not valid for the schema, so the document fails validation.
Here is an example document:
[[sources]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
[dev-packages]
sphinx = "*"
requests = {extras = ["security"], version = "*"}
And for completeness:
>>> toml_data = """
... [[sources]]
... url = "https://pypi.org/simple"
... verify_ssl = true
... name = "pypi"
...
... [packages]
...
... [dev-packages]
... sphinx = "*"
... requests = {extras = ["security"], version = "*"}
... """
>>> data = tomlkit.loads(toml_data)
>>> data["sources"]
<AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>
>>> data["source"] = data.get("source", tomlkit.aot()) + data.pop("sources", tomlkit.aot())
>>> data["source"]
<AoT [{'name': 'pypi', 'url': 'https://pypi.org/simple', 'verify_ssl': True}]>
>>> data["sources"]
<AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>
>>> data._body
[(None, <Whitespace '\n'>), (<Key sources>, <AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>), (<Key packages>, {}), (<Key dev-packages>, {'sphinx': '*', 'requests': {'extras': ['security'], 'version': '*'}}), (<Key source>, <AoT [{'name': 'pypi', 'url': 'https://pypi.org/simple', 'verify_ssl': True}]>)]
So it's effectively duplicating the key and not removing it. I also noticed that pop seems to just pop the _body value of the item and not the item itself (as compared with __getitem__, which returns the actual AoT in this case), so what I thought was an AoT instance was just a list.
Is this the expected behavior or should someone put together a patch?
Whitespace around dot-separated parts is ignored, however, best practice is to not use any extraneous whitespace.
I have almost finished a fix for this. Just waiting for #15 to be accepted.
I sent an email to Sebastien a month ago but got no reply. Here I request again to be a maintainer of tomlkit.
I am also a maintainer of Pipenv, which depends heavily on tomlkit, but due to some bugs (#56), things get broken. This project hasn't seen any new activity for the last few months, so I volunteer to pick it up. PyPI access would be even better.
I appreciate your great work on so many wonderful projects, @sdispater.
__setstate__ is unused anywhere in tomlkit, so if it exists for third-party tools it needs a comment; otherwise it should be removed.
Suppose we have following toml:
[site.user]
name = "John" # Inline comment
age = 28
I want to convert site.user to an inline table:
v = parsed['site']['user']
table = tomlkit.inline_table()
table.update(v)
print(table.as_string())
{age = 21,name = "John"# Inline comment}
What is weird is that even with:
table.update(dict(v))
The bug still exists.
The expected behavior is to drop all comments from the original table.
Via sarugaku/plette#9.
>>> import tomlkit
>>> s = """
... [foo]
... value = false
... """
>>> tomlkit.parse(s) == {'foo': {'value': False}} # This works correctly.
True
>>> tomlkit.parse(s)['foo'] == {'value': False} # This does not.
False
At a quick glance, tomlkit.items.Bool seems to be the problem here.
>>> d = tomlkit.parse(s)
>>> d
{'foo': [{'value': False}]}
>>> d['foo']
{'value': <tomlkit.items.Bool object at 0x10c9ac780>}
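The repr above shows that the boolean is a wrapper object and not a bool (Python does not allow subclassing bool), so it compares by identity unless __eq__ is defined. A minimal sketch of a wrapper that compares equal to plain booleans, using a hypothetical SketchBool class and not tomlkit's Bool:

```python
class SketchBool:
    """Hypothetical boolean wrapper (bool itself cannot be subclassed)."""

    def __init__(self, value):
        self._value = bool(value)

    def __bool__(self):
        return self._value

    def __eq__(self, other):
        if isinstance(other, SketchBool):
            return self._value == other._value
        if isinstance(other, bool):
            return self._value == other
        return NotImplemented

    def __hash__(self):
        # Hash like the wrapped value so equal objects hash equally.
        return hash(self._value)
```

Since dict equality compares values with ==, {'value': SketchBool(False)} == {'value': False} then evaluates to True.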
Hi! I've encountered a parser regression in 0.5.2 on pypy3 (6.0.0): when trying to use poetry, it says that it has a problem parsing the TOML file.
After investigation, I've found that this behavior exists on pypy3 (6.0.0), while normal Python 3.5 environments are not affected.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/api.py", line 51, in parse
return Parser(string).parse()
File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/parser.py", line 153, in parse
key, value = self._parse_table()
File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/parser.py", line 975, in _parse_table
is_aot_next, name_next = self._peek_table()
TypeError: 'NoneType' object is not iterable
This is exposed by the requirementslib testsuite or in isort:
[ 146s] _____________________________ test_pipfile_finder ______________________________
[ 146s]
[ 146s] tmpdir = local('/tmp/pytest-of-abuild/pytest-0/test_pipfile_finder0')
[ 146s]
[ 146s] def test_pipfile_finder(tmpdir):
[ 146s] pipfile = tmpdir.join('Pipfile')
[ 146s] pipfile.write(PIPFILE)
[ 146s] si = SortImports(file_contents="")
[ 146s] finder = finders.PipfileFinder(
[ 146s] config=si.config,
[ 146s] sections=si.sections,
[ 146s] > path=str(tmpdir)
[ 146s] )
[ 146s]
[ 146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/test_isort.py:2685:
[ 146s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[ 146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:199: in __init__
[ 146s] self.names = self._load_names()
[ 146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:221: in _load_names
[ 146s] for name in self._get_names(path):
[ 146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:332: in _get_names
[ 146s] for req in project.packages:
[ 146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:240: in __getattr__
[ 146s] return super(Pipfile, self).__getattribute__(k, *args, **kwargs)
[ 146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:326: in packages
[ 146s] return self.requirements
[ 146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:240: in __getattr__
[ 146s] return super(Pipfile, self).__getattribute__(k, *args, **kwargs)
[ 146s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[ 146s]
[ 146s] self = Pipfile(path=PosixPath('/tmp/pytest-of-abuild/pytest-0/test_pipfile_finder0/Pi...e=None, _pyproject={}, build_system={}, _requirements=[], _dev_requirements=[])
[ 146s]
[ 146s] @property
[ 146s] def requirements(self):
[ 146s] # type: () -> List[Requirement]
[ 146s] if not self._requirements:
[ 146s] > packages = tomlkit_value_to_python(self.pipfile.get("packages", {}))
[ 146s] E AttributeError: 'NoneType' object has no attribute 'get'
[ 146s]
[ 146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:344: AttributeError
The logic in requirementslib actually looks okayish so I suppose it is an issue inside tomlkit.
>>> import tomlkit
>>> t = tomlkit.table()
>>> t.update({'a': 1})
>>> t
{'a': 1}
>>> t.as_string()
''
I think the solution is to implement them to update _value? I'd be interested to work on this (and probably some other inherited methods when I find them) if they are considered acceptable additions.
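The empty as_string() result fits the general rule that dict.update() does not call an overridden __setitem__, so any side storage kept by the subclass is never told about the new keys. A minimal sketch of routing update and setdefault through __setitem__ follows; TrackedDict and its body list are hypothetical stand-ins for the table and its internal storage.

```python
class TrackedDict(dict):
    """Hypothetical dict subclass that mirrors writes into a body list."""

    def __init__(self):
        super().__init__()
        self.body = []  # stand-in for the container's internal storage

    def __setitem__(self, key, value):
        if key not in self:
            self.body.append(key)
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # Route every pair through __setitem__ so the body stays in sync.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]
```

The same treatment would apply to pop, popitem and clear, which also bypass __delitem__ on a dict subclass.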
Is this supposed to do something?
Why are commas being consumed as part of the whitespace before a comment?
import tomlkit
p = tomlkit.parser.Parser("hello = 'world' , # this")
doc = p.parse()
doc
# {'hello': 'world'}
doc.as_string()
# "hello = 'world' , # this"
doc["hello"]._trivia.indent
# ''
doc["hello"]._trivia.comment_ws
# ' , '
doc["hello"]._trivia.comment
# '# this'
doc["hello"]._trivia.trail
# ''
Simply removing the comma from the comment parsing also doesn't break any unittests.
Code:
... # removing many of the keys except the first in the table
table['test'] = 1
Traceback:
File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 528, in __setitem__
self.append(key, value)
File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 175, in append
return self._insert_at(key_after + 1, key, item)
File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 292, in _insert_at
and "\n" not in previous_item.trivia.trail
File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/items.py", line 237, in trivia
return self._trivia
AttributeError: 'Null' object has no attribute '_trivia'
The trouble is somewhere in this loop:
https://github.com/sdispater/tomlkit/blob/master/tomlkit/container.py#L156-L170
After this block executes, idx points at a Null, which should be avoided.
However, I haven't found where the exact error is.
Locals inside Container.append:
{'idx': 3, 'key_after': 2, 'is_table': False, 'v': <Whitespace '\n'>, 'k': None, 'item': {'path': '/home/gram/Documents/dephell/tests/requirements/sdist.tar.gz', 'version': '*'}, 'key': <Key test>, 'self': {'python': '==3.5'}, '__class__': <class 'tomlkit.container.Container'>}
self._body:
(<Key python>, '==3.5')
(None, <tomlkit.items.Null object at 0x7fb1130380b8>)
(None, <tomlkit.items.Null object at 0x7fb113038be0>)
(None, <tomlkit.items.Null object at 0x7fb113038dd8>)
(None, <tomlkit.items.Null object at 0x7fb113038eb8>)
(None, <tomlkit.items.Null object at 0x7fb113038f98>)
(None, <tomlkit.items.Null object at 0x7fb1130385c0>)
(None, <tomlkit.items.Null object at 0x7fb113000048>)
(None, <tomlkit.items.Null object at 0x7fb113000668>)
(None, <tomlkit.items.Null object at 0x7fb1130005f8>)
(None, <tomlkit.items.Null object at 0x7fb113000828>)
(None, <tomlkit.items.Null object at 0x7fb1130007f0>)
(None, <tomlkit.items.Null object at 0x7fb1130008d0>)
(None, <tomlkit.items.Null object at 0x7fb1130002b0>)
(None, <tomlkit.items.Null object at 0x7fb113003128>)
(None, <tomlkit.items.Null object at 0x7fb1130030b8>)
(None, <tomlkit.items.Null object at 0x7fb113003278>)
(None, <tomlkit.items.Null object at 0x7fb113003390>)
(None, <tomlkit.items.Null object at 0x7fb113003320>)
(None, <tomlkit.items.Null object at 0x7fb1130034a8>)
(None, <tomlkit.items.Null object at 0x7fb1130035f8>)
(None, <Whitespace '\n'>)
(None, <tomlkit.items.Comment object at 0x7fb113003860>)
(None, <tomlkit.items.Null object at 0x7fb113003710>)
(None, <tomlkit.items.Null object at 0x7fb1130036a0>)
(None, <tomlkit.items.Null object at 0x7fb113003780>)
(None, <tomlkit.items.Null object at 0x7fb113003908>)
(None, <tomlkit.items.Null object at 0x7fb113003b38>)
(None, <Whitespace '\n'>)
(None, <tomlkit.items.Comment object at 0x7fb113006588>)
(None, <tomlkit.items.Null object at 0x7fb113003fd0>)
(None, <tomlkit.items.Null object at 0x7fb113003d68>)
(None, <tomlkit.items.Null object at 0x7fb113003b00>)
(None, <tomlkit.items.Null object at 0x7fb1130038d0>)
(None, <tomlkit.items.Null object at 0x7fb113003748>)
(None, <tomlkit.items.Null object at 0x7fb113003668>)
(None, <tomlkit.items.Null object at 0x7fb113003588>)
(None, <tomlkit.items.Null object at 0x7fb1130034e0>)
(None, <tomlkit.items.Null object at 0x7fb1130033c8>)
(None, <tomlkit.items.Null object at 0x7fb1130032e8>)
(None, <Whitespace '\n'>)
v points to the last element in the body, so break was never called.
Given this file:
[site.user]
name = "John"
Then re-assign the table to be a string:
parsed = tomlkit.parse(open("my.toml").read())
parsed['site']['user']="Tom"
print(parsed.as_string())
gives:
user = "Tom"
The table header is missing!
tomlkit version: v0.5.1
As a workaround, one needs to reassign a copy of the parent table first, which gives the expected output:
v = parsed['site']
parsed['site'] = v.copy()
parsed['site']['user']="Tom"
print(parsed.as_string())
[site]
user = "Tom"
It'd be nice if Table and related types implemented the relevant ABCs so that they act more like regular Python types.
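As a rough illustration of the request above, here is a minimal sketch of a mapping type built on collections.abc.MutableMapping. The class name TableLike and its internals are hypothetical and are not tomlkit's actual Table; the point is only that defining five methods gives the rest of the dict-like interface (update, setdefault, pop, and so on) for free.

```python
from collections.abc import MutableMapping


class TableLike(MutableMapping):
    """Hypothetical stand-in for a TOML table (not tomlkit's real Table)."""

    def __init__(self, initial=None):
        self._data = {}
        if initial:
            # update() is inherited from MutableMapping and calls __setitem__
            self.update(initial)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


table = TableLike({"name": "John"})
table.setdefault("age", 30)  # mixin method provided by MutableMapping
```

Because TableLike registers as a MutableMapping, code that checks isinstance(obj, MutableMapping) or calls dict(obj) would treat it like a regular mapping.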
pipenv applies a patch to its vendored copy of tomlkit, and it seems that the patch doesn't exist here yet.
The patch is https://github.com/pypa/pipenv/blob/master/tasks/vendoring/patches/vendor/tomlkit-fix.patch
Parts of that have been merged, but the bit that looks missing is:
diff -ru tomlkit-0.5.3-orig/tomlkit/container.py tomlkit-0.5.3/tomlkit/container.py
--- tomlkit-0.5.3-orig/tomlkit/container.py 2018-11-14 23:10:40.697032200 +0700
+++ tomlkit-0.5.3/tomlkit/container.py 2019-03-14 10:57:20.658602016 +0700
@@ -19,6 +19,7 @@
from .items import Key
from .items import Null
from .items import Table
+from .items import Trivia
from .items import Whitespace
from .items import item as _item
@@ -223,7 +224,12 @@
for i in idx:
self._body[i] = (None, Null())
else:
- self._body[idx] = (None, Null())
+ old_data = self._body[idx][1]
+ trivia = getattr(old_data, "trivia", None)
+ if trivia and trivia.comment:
+ self._body[idx] = (None, Comment(Trivia(comment_ws="", comment=trivia.comment)))
+ else:
+ self._body[idx] = (None, Null())
super(Container, self).__delitem__(key.key)
diff -ru tomlkit-0.5.3-orig/tomlkit/items.py tomlkit-0.5.3/tomlkit/items.py
--- tomlkit-0.5.3-orig/tomlkit/items.py 2018-11-20 01:11:57.965421000 +0700
+++ tomlkit-0.5.3/tomlkit/items.py 2019-03-14 10:59:38.451740952 +0700
@@ -20,6 +20,7 @@
from ._compat import long
from ._compat import unicode
from ._utils import escape_string
+from toml.decoder import InlineTableDict
if PY2:
from functools32 import lru_cache
@@ -40,7 +41,10 @@
elif isinstance(value, float):
return Float(value, Trivia(), str(value))
elif isinstance(value, dict):
- val = Table(Container(), Trivia(), False)
+ if isinstance(value, InlineTableDict):
+ val = InlineTable(Container(), Trivia())
+ else:
+ val = Table(Container(), Trivia(), False)
for k, v in sorted(value.items(), key=lambda i: (isinstance(i[1], dict), i[0])):
val[k] = item(v, _parent=val)
Is this an appropriate addition to tomlkit?
It appears the author was @frostming, originally at https://github.com/pypa/pipenv/commits/6df7d8861da841e552049dcde9ff9a0f23edc01e/tasks/vendoring/patches/vendor/tomlkit-dump-inline-table.patch
I don't see any similar patch in https://github.com/sdispater/tomlkit/commits?author=frostming
Ideally they should submit the patch, or someone else should submit it with the patch author set to the correct author. Also pinging @techalchemy, who has been heavily involved in the pipenv vendoring.
Run tomlkit.loads("foo bar") and observe that it raises, as expected:
UnexpectedCharError: Unexpected character: 'b' at line 1 col 4
Run tomlkit.loads("hello there") and observe that it instead raises:
TypeError: __init__() missing 1 required positional argument: 'char'
If a string contains \n, is it possible to dump it as a multi-line string? E.g.:
foo = """
my
string
"""