sdispater / tomlkit

Style-preserving TOML library for Python

License: MIT License

Python 100.00%

tomlkit's Issues

Scope merging

The following is valid TOML; however, tomlkit isn't able to properly handle the dual scopes:

a.b.c = 12

[a.b]
d = 34

import tomlkit

doc = tomlkit.parse("""\
a.b.c = 12

[a.b]
d = 34
""")

# looking at the doc shows incomplete data
doc  # {'a': {'b': {'d': 34}}}

# retrieving data works
doc['a']['b']['c']  # 12

# modifying data fails
doc['a']['b']['c']  = 45
doc['a']['b']['c']  # 12

doc['a']['b']['d']  = 100
doc['a']['b']['d']  # 34

@sdispater I have been working on a refactored version of tomlkit. This new version addresses many of the outstanding issues of the current implementation and makes the TOML objects more natural to use. I found that in some cases having TOML objects is simply problematic, so I needed a way to quickly convert TOML objects into Python objects (the pyobj API).

I fully understand if you don't want these changes; they are broad. One of the underlying goals of this refactor was to make the parsing more modular, so that the same parser could parse several different versions of TOML (a very possible future once TOML specifies a versioning pragma/scheme). One of the choices made in this refactor was to no longer perfectly preserve whitespace. I found too many whitespace variations that would be ridiculous to preserve (e.g. the key a . b ."foo" .c). Instead, I make sure to preserve the insertion order of comments and key-values but let tomlkit decide how to lay out the TOML object when flattening it into a TOML document (str). Comments and newlines are preserved. Some further whitespace preservation can be reimplemented without much difficulty; for example, block indents (where the entire table is indented by X spaces) can be added relatively easily. I do not see value in perfectly preserving inconsistent whitespace, and if any whitespace is preserved I would rather see some amount of whitespace standardization (much like black does for Python).

Issues Addressed:

#17
#18
#19
#25
#37

Porcelain API:

toml: converts Python object into TOML object
pyobj: converts TOML object into Python object
loads/parse: converts TOML document (str) into TOML object
dumps: converts TOML object (convert into base type first) into TOML document (str)
flatten: converts TOML object (use as is) into TOML document (str)
load: reads TOML document (str) from filehandle, uses loads
dump: uses dumps, writes TOML document (str) to filehandle

Other Changes:

The refactor also introduces the ability for tables and inline tables to be interchangeable. Whether one or the other is rendered depends on the table's complexity, which can either be set to true explicitly or be derived from TOML rules (e.g. a table that contains comments is complex). The same logic is used to toggle between AoT and "inline" AoT.

Influences:

Large chunks of the code are just the original tomlkit moved around.
Tables and Arrays were strongly influenced by collections.OrderedDict.

Document how to create a table without the supers

I'm trying to create

[a.b.c]
d = 10

but I get

[a]
[a.b]
[a.b.c]
d = 10

Looking around, I see some things that might be useful:

tomlkit.items.Table(..., is_super_table), but I am unclear where to apply it, what problems I need to avoid, and what is enforced for me.
tomlkit.items.Key supports a type and dotted, but it is unclear what should be done.
- Even if I pass dotted=True, it looks like it'll still get a Basic type and be quoted. So do I need to specify both Bare and dotted?

Bug in Windows with multiprocessing

Hi. I ran into an issue when using tomlkit on Windows with multiprocessing.

import multiprocessing as mp
import tomlkit


class Worker: 
    def run(self): 
        print(self.db_conf) 
        print(self.db_conf['path'])
        # bug here, get() returns None
        print(self.db_conf.get('path')) 

if __name__ == '__main__':
    w = Worker()
    conf = tomlkit.loads("""
    [db]
    path = '~/files/'
    """)
    w.db_conf = conf['db']
    p = mp.Process(target=w.run)
    p.start()
    p.join()

The output is:

{'path': '~/files/'}
~/files/
None

Somehow Container.get() lost track of the values after pickling into another process. On Linux this script works fine.

tomlkit.items not picklable

>>> import tomlkit
>>> import pickle
>>> example = tomlkit.loads("foo = 0")
>>> pickle.loads(pickle.dumps(example))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() missing 2 required positional arguments: 'trivia' and 'raw'

This is very unexpected from a library that claims that a TOMLDocument "behaves like a standard dictionary".

It's the Integers (and Floats) that cause this issue.
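The failure mode can be reproduced without tomlkit. The sketch below uses a hypothetical int subclass (`PlainInteger`, not tomlkit's actual class) whose `__new__` requires extra arguments, and shows one possible remedy, `__getnewargs__`, under the assumption that the extra arguments are stored as attributes.

```python
import pickle


class PlainInteger(int):
    """Hypothetical int subclass whose __new__ needs extra arguments."""

    def __new__(cls, value, trivia, raw):
        return super().__new__(cls, value)

    def __init__(self, value, trivia, raw):
        self.trivia = trivia
        self.raw = raw


class PicklableInteger(PlainInteger):
    def __getnewargs__(self):
        # Tell pickle every argument that __new__ requires on load.
        return (int(self), self.trivia, self.raw)


def round_trip(obj):
    """Serialize and immediately deserialize an object."""
    return pickle.loads(pickle.dumps(obj))


restored = round_trip(PicklableInteger(0, "", "0"))
```

Calling `round_trip(PlainInteger(0, "", "0"))` raises a `TypeError` on load, because pickle only passes the integer value back to `__new__`.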

Parsing large files is slow

Parsing large TOML files is prohibitively slow. On a large Django application with 230 dependencies, Poetry has generated a 4145 line, 5 KB pyproject.lock file that takes more than 4 minutes to parse on my iMac:

$ time .venv/bin/python -c "import tomlkit; tomlkit.parse(open('pyproject.lock').read())"
250.63s user 2.00s system 99% cpu 4:13.40 total

If I break early, it always seems to be stuck in _restore_idx:

>>> tomlkit.parse(text)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.venv/lib/python2.7/site-packages/tomlkit/api.py", line 51, in parse
    return Parser(string).parse()
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 144, in parse
    key, value = self._parse_table()
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 959, in _parse_table
    result = self._parse_aot(result, name)
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 1008, in _parse_aot
    _, table = self._parse_table(name_first)
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 927, in _parse_table
    key_next, table_next = self._parse_table(name)
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 924, in _parse_table
    is_aot_next, name_next = self._peek_table()
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 993, in _peek_table
    self._restore_idx(*idx)
  File ".../.venv/lib/python2.7/site-packages/tomlkit/parser.py", line 301, in _restore_idx
    [(i + idx, TOMLChar(c)) for i, c in enumerate(self._src[idx:])]
  File ".../.venv/lib/python2.7/site-packages/tomlkit/toml_char.py", line 8, in __init__
    super(TOMLChar, self).__init__()
KeyboardInterrupt
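The traceback suggests that every restore rebuilds a list of all remaining characters, which would make each peek cost time proportional to the rest of the file. A minimal sketch of the alternative, assuming the parser can work from an integer position alone: `Cursor` and `peek_table_name` are hypothetical names for illustration and are not tomlkit's parser.

```python
class Cursor:
    """Hypothetical source cursor that saves and restores a plain index."""

    def __init__(self, src):
        self._src = src
        self._idx = 0

    @property
    def current(self):
        # Empty string signals end of input.
        return self._src[self._idx] if self._idx < len(self._src) else ""

    def advance(self):
        if self._idx < len(self._src):
            self._idx += 1

    def mark(self):
        # Saving state is just remembering an integer.
        return self._idx

    def restore(self, mark):
        # Restoring is O(1): no per-character objects are rebuilt.
        self._idx = mark


def peek_table_name(cursor):
    """Read a "[name]" header, then rewind so the caller can parse it again."""
    mark = cursor.mark()
    name = ""
    if cursor.current == "[":
        cursor.advance()
        while cursor.current not in ("]", ""):
            name += cursor.current
            cursor.advance()
    cursor.restore(mark)
    return name
```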

Empty keys are supposed to be allowed

TOML: Keys

A bare key must be non-empty, but an empty quoted key is allowed (though discouraged).

~~I have almost finished a fix for this. Just waiting for #15 to be accepted.~~

KeyAlreadyPresent raised when it should be valid

# Python 3.6.5
toml_file = """
[[patterns]]
[patterns.start.0]
name = "name 0"
[patterns.start.1]
name = "name 1"
"""

import toml, tomlkit

toml.loads(toml_file)
# {'patterns': [{'start': {'0': {'name': 'name 0'}, '1': {'name': 'name 1'}}}]}


tomlkit.loads(toml_file)

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    tomlkit.loads(toml_file)
  File "/~/.venv/lib64/python3.6/site-packages/tomlkit/api.py", line 38, in loads
    return parse(string)
  File "/~/.venv/lib64/python3.6/site-packages/tomlkit/api.py", line 52, in parse
    return Parser(string).parse()
  File "/~/.venv/lib64/python3.6/site-packages/tomlkit/parser.py", line 170, in parse
    key, value = self._parse_table()
  File "/~/.venv/lib64/python3.6/site-packages/tomlkit/parser.py", line 935, in _parse_table
    values.append(key_next, table_next)
  File "/~/.venv/lib64/python3.6/site-packages/tomlkit/container.py", line 103, in append
    raise KeyAlreadyPresent(key)
tomlkit.exceptions.KeyAlreadyPresent: Key "start" already exists.

tomlkit.items.Array missing implementation of insert, extend, remove etc

tomlkit.items.Array inherits from list but overrides only the append method.
The remaining methods execute but have no effect on the document, so it is not possible to remove elements from the list or insert a new one at a specific position, which makes editing files very problematic.

Subtracting two dates incorrectly returns a Date object

When you subtract two Date objects tomlkit takes the result and makes it into a new Date object:

def __sub__(self, other):
        result = super(Date, self).__sub__(other)

        return self._new(result)

This behavior is incorrect as the value that should get returned is a timedelta object as per the Python Docs.

The behavior is implemented correctly for Datetime objects in tomlkit, so I'll go ahead and make a pull request with the changes added to the Date object as well.
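For illustration, here is a minimal sketch of the intended behavior using a hypothetical `date` subclass (`TrackedDate`, not tomlkit's `Date`): only results that are dates get re-wrapped, while a `timedelta` is passed through unchanged.

```python
from datetime import date, timedelta


class TrackedDate(date):
    """Hypothetical date subclass that re-wraps date results."""

    def _new(self, result):
        return TrackedDate(result.year, result.month, result.day)

    def __sub__(self, other):
        result = super().__sub__(other)
        if isinstance(result, date):
            # date - timedelta yields a date, so keep our own type.
            return self._new(result)
        # date - date yields a timedelta (or NotImplemented); return it as-is.
        return result


difference = TrackedDate(2018, 7, 23) - TrackedDate(2018, 7, 22)
previous_day = TrackedDate(2018, 7, 23) - timedelta(days=1)
```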

table.copy should return an instance of table

table.copy appears to return an instance of dict, not table:

$ rm -rf venv; python3.7 -m venv venv; venv/bin/python -m pip install --quiet tomlkit; venv/bin/python -m pip list | grep tomlkit; venv/bin/python -c '
import tomlkit
table = tomlkit.table()
print(type(table.copy()))'
/Users/julian/Desktop/venv/lib/python3.7/site-packages/pip/_vendor/msgpack/fallback.py:133: DeprecationWarning: encoding is deprecated, Use raw=False instead.
  unpacker = Unpacker(None, max_buffer_size=len(packed), **kwargs)
tomlkit    0.5.8
<class 'dict'>

Besides having the type change during copying, this makes doing immutable changes to tables more difficult (copying a table and mutating the copy).

Document gets messed up when subtable is defined after another table

>>> import tomlkit
>>> contents = """\
... [students]
... tommy = 87
... mary = 66
... 
... [subjects]
... maths = "maths"
... english = "english"
... 
... [students.bob]
... score = 91
... """
>>> d = tomlkit.loads(contents)
>>> d.get('students')
{'bob': {'score': 91}}
>>> d
{'students': {'tommy': 87, 'mary': 66, 'bob': {'score': 91}},
 'subjects': {'maths': 'maths', 'english': 'english'},
 'students': {'tommy': 87, 'mary': 66, 'bob': {'score': 91}}}

Here the [students.bob] section is defined after [subjects], an ordering that the TOML spec allows.

invalid tomlkit.dumps() output

>>> from tomlkit import dumps
>>> from tomlkit import parse
>>> doc = parse("foo=10")
>>> doc["bar"]=11
>>> dumps(doc)
'foo=10bar = 11\n'
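The expected result would keep the two entries on separate lines. A string-level sketch of that rule, assuming the only requirement is that existing content ends with a newline before anything is appended (`append_line` is a hypothetical helper, not part of tomlkit):

```python
def append_line(document: str, line: str) -> str:
    """Append a key/value line, terminating the previous line first if needed."""
    if document and not document.endswith("\n"):
        document += "\n"
    return document + line + "\n"


fixed = append_line("foo=10", "bar = 11")
```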

What version of the TOML spec does TOML Kit support?

TOML has a few versions. The latest version, 0.5.0, was released a week ago.

It would be good if TOML Kit's README specified what version of the spec it supported.

python 3.8: `test_date_behaves_like_date` tests fail with AttributeError

=================================== FAILURES ===================================
_____________________ test_datetimes_behave_like_datetimes _____________________

    def test_datetimes_behave_like_datetimes():
        i = item(datetime(2018, 7, 22, 12, 34, 56))
    
        assert i == datetime(2018, 7, 22, 12, 34, 56)
        assert i.as_string() == "2018-07-22T12:34:56"
    
>       i += timedelta(days=1)

tests/test_items.py:275: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tomlkit/items.py:525: in __add__
    result = super(DateTime, self).__add__(other)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'tomlkit.items.DateTime'>, value = 2018
_ = (7, 23, 12, 34, 56, 0, ...)

    def __new__(cls, value, *_):  # type: (..., datetime, ...) -> datetime
        return datetime.__new__(
            cls,
>           value.year,
            value.month,
            value.day,
            value.hour,
            value.minute,
            value.second,
            value.microsecond,
            tzinfo=value.tzinfo,
        )
E       AttributeError: 'int' object has no attribute 'year'

tomlkit/items.py:498: AttributeError
_________________________ test_dates_behave_like_dates _________________________

    def test_dates_behave_like_dates():
        i = item(date(2018, 7, 22))
    
        assert i == date(2018, 7, 22)
        assert i.as_string() == "2018-07-22"
    
>       i += timedelta(days=1)

tests/test_items.py:295: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tomlkit/items.py:584: in __add__
    result = super(Date, self).__add__(other)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'tomlkit.items.Date'>, value = 2018, _ = (7, 23)

    def __new__(cls, value, *_):  # type: (..., date, ...) -> date
>       return date.__new__(cls, value.year, value.month, value.day)
E       AttributeError: 'int' object has no attribute 'year'

tomlkit/items.py:565: AttributeError

Can't set value if sections out of order

I stumbled upon this when poetry version would not update the pyproject.toml.

This is the smallest example with which I can reproduce this on master:

# tests/examples/out_of_order_write.toml
[a.a]
key = "value"

[a.b]

[a.a.a]

def test_write_nested_array(example):
    doc = loads(example("out_of_order_write"))
    doc["a"]["a"]["key"] = "new_value"
    assert doc["a"]["a"]["key"] == "new_value"

The test succeeds when changing the order in the example document to

[a.a]
key = "value"

[a.a.a]

[a.b]

Unwanted newlines being added to output.

I want to group my tables:

[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30

[second]       
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543

When I try to manually construct this, I instead get

[first]
foo = 5
bar = 20

[first.a.b.c]
alice = 10
bob = 30


[second]       
foo = 3
bar = 21

[second.a.b.c]
alice = 15
bob = 3543

I expected no newlines to be added for me; I had explicitly asked for a single newline between the [first.a.b.c] table and [second]. The added newlines surprised me and made me wonder whether tomlkit properly preserves the lack of newlines when round-tripping, but it seems to.

So I created the following test case to experiment with how newlines are dealt with

import tomlkit

t = tomlkit.loads("""[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543""")

print("Round-trip")
print("```toml")
print(tomlkit.dumps(t))
print("```")
print()

t["second"].add("extra", tomlkit.table())
t["second"].add("extra1", tomlkit.table())
three = tomlkit.table()
three["foo"] = 2
three["bar"] = 1
child = tomlkit.table()
child["alice"] = 3
child["bob"] = 10
three["child"] = child
t["three"] = three


print("Changed")
print("```toml")
print(tomlkit.dumps(t))
print("```")
print()

The output is:
Round-trip

[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543

Changed

[first]
foo = 5
bar = 20
[first.a.b.c]
alice = 10
bob = 30
[second]
foo = 3
bar = 21
[second.a.b.c]
alice = 15
bob = 3543
[second.extra]

[second.extra1]

[three]
foo = 2
bar = 1

[three.child]
alice = 3
bob = 10

I was surprised that second.extra didn't have a newline before it but second.extra1 did. I imagine this just shows how the newlines are being auto-added but it is still surprising.

setup.py installs the tests package

This might be partly a bug in poetry but the sdist tarball's setup.py installs the tests package alongside the tomlkit package.

KeyAlreadyPresent, if sections mix their order

The following test case

import pytest
import tomlkit


@pytest.fixture
def toml_content():
    return """
[major]
name = "Charles Bownam"

[alien]
name = "et"

[major.clerk]
name = "John Barradell"
""".strip()


def test_parse(toml_content):
    doc = tomlkit.loads(toml_content)
    assert doc

raises: tomlkit.exceptions.KeyAlreadyPresent: Key "major" already exists.

According to https://www.tomllint.com/, the content is valid TOML.

Affects python-poetry/poetry#563

Inline Table's commas

In accordance with the grammar

https://github.com/toml-lang/toml/blob/bb47759841ac368d86eb7a459bd7eea7162b9a80/toml.abnf#L213-L221

leading, trailing, and duplicate commas are not allowed.

In other words the following cases should not be allowed (but currently are possible):

No comma:
```
a = { b = 12  c= 'hello' }
```
Leading comma:
```
a = { , b = 12 }
```
Duplicate comma:
```
a = { b = 12 ,,,,,, c='hello' }
```
Trailing comma:
```
a = { b = 12 , }
```

~~I have also addressed this with a fix dependent upon #15.~~

Include unittests in pypi sdist archive

Could you please add a MANIFEST.in and include the tests in the sdist archive? In distributions we run the tests on packages as a sanity check; otherwise, packaging a Python project is just copying files, and we wouldn't know if something broke.

Inheriting from dict

Hi, I'm getting started with tomlkit. I noticed that Container inherits from dict, which means its behavior may be difficult to predict in different environments.

In particular, quoting from PyPy's docs about subclassing builtin types:

Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by e.g. the built-in get() method.

As far as I can tell, though, most of the methods are already explicitly implemented in Container, so this unpredictability could be eliminated by subclassing collections.abc.MutableMapping instead.
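The difference can be sketched with two hypothetical classes (neither is tomlkit's Container) that keep their values in a separate `_store` attribute. On CPython, the dict subclass's inherited `get()` bypasses the overridden `__getitem__`, whereas the `MutableMapping` version derives `get()` from it.

```python
from collections.abc import MutableMapping


class DictBacked(dict):
    """Keeps its values in a separate store, outside the built-in dict storage."""

    def __init__(self):
        super().__init__()
        self._store = {}

    def __setitem__(self, key, value):
        self._store[key] = value

    def __getitem__(self, key):
        return self._store[key]


class MappingBacked(MutableMapping):
    """Same storage, but get(), setdefault(), pop() etc. come from the ABC."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


d = DictBacked()
d["path"] = "~/files/"
m = MappingBacked()
m["path"] = "~/files/"
```

With these definitions, `d["path"]` finds the value but `d.get("path")` returns `None`, while `m.get("path")` returns the stored value.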

bug report

code

from tomlkit import dumps, parse

doc = parse("""[target.x86_64-pc-windows-gnu]
linker = "D:/msys64/mingw64/bin/gcc.exe"
ar = "D:/msys64/mingw64/bin/ar.exe"
""")
doc.remove("target.x86_64-pc-windows-gnu")
print(dumps(doc))

remove failure

tomlkit.exceptions.NonExistentKey: 'Key ""target.x86_64-pc-windows-gnu"" does not exist.'

setdefault does not behave as expected

[TOMLDocument] behaves like a standard dictionary

>>> from tomlkit import parse
>>> tomldoc = parse('''[table]\nfoo="bar"''')
>>> tomldoc
{'table': {'foo': 'bar'}}
>>> data = {'table': {'foo': 'bar'}}
>>> tomldoc['table'].setdefault('baz', 'waldo')
'waldo'
>>> data['table'].setdefault('baz', 'waldo')
'waldo'
>>> tomldoc
{'table': {'foo': 'bar'}}
>>> data
{'table': {'foo': 'bar', 'baz': 'waldo'}}

I expected the TOMLDocument to gain a baz entry after the setdefault call. This would be helpful for adding data to document sections which may or may not exist yet.

tomlkit==0.5.3

Insert element in the start or in the middle of a document

I am wondering, would this be a feasible feature to implement? I dug into the code and saw there is an internal _insert_at method implemented; would it be a good idea to expose it as public?

My use case is that there is an important metadata table in the document, but if it’s missing, I would like to put it at the very front so the user is encouraged to fill it out. Appending it at the end makes it much less prominent, but there doesn’t seem to be a good way to “clone” a document with internal trivia intact either (or maybe this is what I actually need?).

I realise I can simply write to the TOML file directly and re-parse the document, but that feels more like a hack than a proper solution to me. Plus, that approach would be susceptible to race conditions IMO because the user may be editing the file at the same time.

new SyntaxWarning with python 3.8: invalid escape sequence \e

These warnings are new with python 3.8 (see the "Changed in version 3.8" note at the end of https://docs.python.org/3.8/reference/lexical_analysis.html#string-and-bytes-literals ):

=============================== warnings summary ===============================
tests/test_write.py:8
  /builddir/build/BUILD/tomlkit-0.5.3/tests/test_write.py:8: SyntaxWarning: invalid escape sequence \e
    d = {"foo": "\e\u25E6\r"}

tests/test_write.py:14
  /builddir/build/BUILD/tomlkit-0.5.3/tests/test_write.py:14: SyntaxWarning: invalid escape sequence \e
    assert loads(dumps(d))["foo"] == "\e\u25E6\r"

-- Docs: https://docs.pytest.org/en/latest/warnings.html
=============== 2 failed, 225 passed, 2 warnings in 2.28 seconds ===============
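Since Python leaves an unrecognized escape such as \e in the string as a backslash followed by e, escaping the backslash explicitly should produce the same value without the warning. A small sketch of the literal written that way:

```python
# Same four characters as the test's literal: backslash, "e", U+25E6, carriage return.
value = "\\e\u25E6\r"
characters = list(value)
```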

Container fails to preserve content on copy

>>> import tomlkit
>>> doc = tomlkit.parse('[foo]\nbar=1')
>>> doc
{'foo': {'bar': 1}}
>>> import copy
>>> copy.copy(doc)
{}

Array's commas

In accordance with the grammar

https://github.com/toml-lang/toml/blob/bb47759841ac368d86eb7a459bd7eea7162b9a80/toml.abnf#L188-L200

leading and duplicate commas are not allowed (trailing is allowed).

In other words the following cases should not be allowed (but currently are possible):

No comma:
```
a = [ 12  34 ]
```
Leading comma:
```
a = [ , 12 ]
```
Duplicate comma:
```
a = [ 12 ,,,,,, 34 ]
```
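The rule can be sketched as a small state machine over an already-tokenized sequence. This is an illustration under simplifying assumptions (each token is either the string "," or some value; the `commas_are_valid` helper is hypothetical and not tomlkit code). Passing `allow_trailing=False` gives the stricter inline-table rule.

```python
def commas_are_valid(tokens, allow_trailing):
    """Check comma placement in a list of tokens, where "," marks a separator."""
    expect_value = True  # True at the start and right after a comma
    for token in tokens:
        if token == ",":
            if expect_value:
                return False  # leading or duplicate comma
            expect_value = True
        else:
            if not expect_value:
                return False  # two values with no comma between them
            expect_value = False
    if tokens and expect_value:
        return allow_trailing  # sequence ended on a comma
    return True
```

For example, `commas_are_valid(["12", ",", "34", ","], allow_trailing=True)` accepts the trailing comma that arrays permit, while the same tokens with `allow_trailing=False` are rejected.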

Dotted keys are not turned into nested dicts

Dotted key names are not converted into nested tables.
See https://github.com/toml-lang/toml/tree/v0.5.0#keys

According to the 0.5.0 standard, this:

[VOLUME]
FILE = 'data/interim/volume.tif'
CRS = 'epsg:26949'

Z.MIN = -0.3048
Z.MAX = 1.524
Z.STEPS = 6

should produce (indented for readability)

{ 'VOLUME': { 
        'FILE': 'data/interim/volume.tif',
        'CRS': 'epsg:26949',
        'Z': { 
            'MIN': -0.3048,
            'MAX': 1.524,
            'STEPS': 6
        }
     }
}

But it produces

{'VOLUME': {
    'FILE': 'data/interim/volume.tif',
    'CRS': 'epsg:26949',
    'Z.MIN': -0.3048,
    'Z.MAX': 1.524,
    'Z.STEPS': 6
    }
}
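The expected expansion can be sketched on plain dicts. This is a simplified illustration, assuming bare keys only (a quoted key containing a dot would need a real tokenizer); `set_dotted` is a hypothetical helper, not tomlkit's implementation.

```python
def set_dotted(table, dotted_key, value):
    """Store value under a dotted key, creating intermediate tables as needed."""
    # Whitespace around the dots is ignored, so strip each part.
    *parents, last = [part.strip() for part in dotted_key.split(".")]
    for name in parents:
        table = table.setdefault(name, {})
        if not isinstance(table, dict):
            raise ValueError(f"{name!r} is already defined as a non-table value")
    table[last] = value


volume = {"FILE": "data/interim/volume.tif", "CRS": "epsg:26949"}
set_dotted(volume, "Z.MIN", -0.3048)
set_dotted(volume, "Z.MAX", 1.524)
set_dotted(volume, "Z.STEPS", 6)
```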

Extending tomlkit

I am interested in extending the TOML syntax to include a handful of changes that simplify my config files (for example I'm looking to make <value> and (condition) into valid keys without quotes).

The cleanest way for this to be done would be to adjust tomlkit to support a plugin design where I can register additional types of keys and additional types of values. This could further mean that users could for example configure tomlkit to disallow keys/values they do not wish to deal with (like inline tables).

Is this a direction/design change that would be of interest for tomlkit? Or is this outside of the desired scope? In theory this would also make future changes/additions to the TOML syntax easier to implement.

Function to convert everything to plain old python objects

As more projects incorporate PEP 518, many will be aiming to keep compatibility with their existing config files as well (be they ini, yaml, json, etc.). In order to help abstract over config files of different types, it would be useful to be able to convert tomlkit objects into plain old Python objects (POPOs). One of TOML's strengths for Python is that all of its constructs can deserialise to POPOs: could this be made available for tomlkit users?
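A generic sketch of such a conversion, assuming the wrapped values are mappings, sequences, or subclasses of the built-in scalar types. The `to_plain` helper is hypothetical, and wrapper types that do not subclass a built-in would need their own branch.

```python
from collections.abc import Mapping


def to_plain(value):
    """Recursively convert container and scalar subclasses to built-in types."""
    if isinstance(value, Mapping):
        return {str(key): to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return bool(value)
    if isinstance(value, int):
        return int(value)
    if isinstance(value, float):
        return float(value)
    if isinstance(value, str):
        return str(value)
    return value  # dates, times, and anything else are returned unchanged
```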

Get the line and column numbers of parsed elements

It would be helpful if we could get the line numbers of every parsed TOML element.
We could probably add it as a new parameter to the Item class.
The line and column numbers could be stored during parsing. The method used to get line numbers for reporting parsing errors can be used to get the line numbers.
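As a rough sketch of the bookkeeping involved, a character offset recorded during parsing can be turned into a 1-based line and column afterwards. `line_and_column` is a hypothetical helper and assumes "\n" line endings.

```python
def line_and_column(source, index):
    """Return the 1-based (line, column) of the character at source[index]."""
    line = source.count("\n", 0, index) + 1
    # rfind returns -1 on the first line, which makes the column 1-based.
    last_newline = source.rfind("\n", 0, index)
    column = index - last_newline
    return line, column
```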

'tomlkit.items.Key.__eq__()' should return 'NotImplemented' when comparing objects without a 'key' attribute

The current comparison method does not take into account the possibility that an object without a key attribute could be passed in as other.

What follows is some slightly obfuscated output from one of my pytest runs.

self = <Key "a\lovely\key">
other = 'a\\lovely\\key'

    def __eq__(self, other):  # type: (Key) -> bool
>       return self.key == other.key
E       AttributeError: 'str' object has no attribute 'key'

..\..\.virtualenvs\<virtualenv>\lib\site-packages\tomlkit\items.py:164: AttributeError

As you can see, I tried comparing a 'tomlkit.items.Key' against a regular string which failed spectacularly.
If I'm not mistaken, the worst that should happen when doing a comparison is getting False.
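A minimal sketch of the suggested behavior, using a simplified stand-in class rather than tomlkit's actual Key: returning `NotImplemented` lets Python try the other operand and finally fall back to an identity check, so the comparison yields False instead of raising.

```python
class Key:
    """Simplified stand-in for a key wrapper; not tomlkit's implementation."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        if isinstance(other, Key):
            return self.key == other.key
        # Let Python try the reflected comparison instead of raising.
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore one.
        return hash(self.key)


same = Key("a\\lovely\\key") == Key("a\\lovely\\key")
against_str = Key("a\\lovely\\key") == "a\\lovely\\key"
```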

document.pop(key) doesn't actually remove the item from the document

I was in the process of updating usage of tomlkit over in requirementslib where I have a need to support compatibility with an old usage of sources keys in a Pipfile (we now just use the source key in an AoT). I was hoping to simply pop the key and reassign it from sources to its proper name, source, if I encounter it. However, pop does not remove the key from the table, so I wind up retaining the original key and also adding the new key. The original key is not valid for the schema so the document fails validation.

Here is an example document:

[[sources]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]

[dev-packages]
sphinx = "*"
requests = {extras = ["security"], version = "*"}

And for completeness:

>>> toml_data = """ 
... [[sources]] 
... url = "https://pypi.org/simple" 
... verify_ssl = true 
... name = "pypi" 
...  
... [packages] 
...  
... [dev-packages] 
... sphinx = "*" 
... requests = {extras = ["security"], version = "*"} 
... """
>>> data = tomlkit.loads(toml_data)
>>> data["sources"]
<AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>

>>> data["source"] = data.get("source", tomlkit.aot()) + data.pop("sources", tomlkit.aot())
>>> data["source"]
<AoT [{'name': 'pypi', 'url': 'https://pypi.org/simple', 'verify_ssl': True}]>

>>> data["sources"]
<AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>

>>> data._body
[(None, <Whitespace '\n'>), (<Key sources>, <AoT [{'url': 'https://pypi.org/simple', 'verify_ssl': True, 'name': 'pypi'}]>), (<Key packages>, {}), (<Key dev-packages>, {'sphinx': '*', 'requests': {'extras': ['security'], 'version': '*'}}), (<Key source>, <AoT [{'name': 'pypi', 'url': 'https://pypi.org/simple', 'verify_ssl': True}]>)]

So it's just effectively duplicating the key and not removing it. I also noticed that pop seems to just pop the _body value of the item and not the item itself (as compared with __getitem__ which returns the actual AoT in this case), so what I thought was an AoT instance was just a list.

Is this the expected behavior or should someone put together a patch?

Dotted Keys can have spaces around the dot delimiter

TOML: Keys

Whitespace around dot-separated parts is ignored, however, best practice is to not use any extraneous whitespace.

~~I have almost finished a fix for this. Just waiting for #15 to be accepted.~~

[Maintenance] Willing to maintain tomlkit

I sent an email to Sebastien a month ago but got no reply, so I am asking again here to become a maintainer of tomlkit.

I am also a maintainer of Pipenv, which depends heavily on tomlkit, but due to some bugs (#56), things get broken. This project hasn't seen any new activity for the last few months, so I volunteer to pick it up. PyPI access would be even better.

I appreciate your great work on so many wonderful projects, @sdispater.

This needs commenting or removing

https://github.com/sdispater/tomlkit/blob/fa6fe3bbdc0d0dbe8ab578f967df1fe3210c1e87/tomlkit/container.py#L549-L552

__setstate__ is not used anywhere in tomlkit, so if it exists for third-party tools it needs a comment; otherwise it should be removed.

Converting a table to an inline table unexpectedly preserves the comment

Suppose we have the following TOML:

[site.user]
name = "John"  # Inline comment
age = 28

I want to convert site.user to an inline table:

v = parsed['site']['user']
table = tomlkit.inline_table()
table.update(v)
print(table.as_string())

{age = 21,name = "John"# Inline comment}

What is weird is that even with:

table.update(dict(v))

The bug still exists.

The expected behavior is to drop all comments from the original table.

Boolean value in document causes erratic comparison results

Via sarugaku/plette#9.

>>> import tomlkit
>>> s = """
... [foo]
... value = false
... """
>>> tomlkit.parse(s) == {'foo': {'value': False}}    # This works correctly.
True
>>> tomlkit.parse(s)['foo'] == {'value': False}      # This does not.
False

At a quick glance, tomlkit.items.Bool seems to be the problem here.

>>> d = tomlkit.parse(s)
>>> d
{'foo': [{'value': False}]}
>>> d['foo']
{'value': <tomlkit.items.Bool object at 0x10c9ac780>}

Parser regression in 0.5.2 in pypy3(6.0.0)

Hi! I've encountered a parser regression in 0.5.2 on pypy3 (6.0.0): when I try to use poetry, it reports a problem parsing the TOML file.

After investigating, I found that this behavior occurs on pypy3 (6.0.0) but does not affect regular Python 3.5 environments.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/api.py", line 51, in parse
    return Parser(string).parse()
  File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/parser.py", line 153, in parse
    key, value = self._parse_table()
  File "/home/user/.pyenv/versions/toml/site-packages/tomlkit/parser.py", line 975, in _parse_table
    is_aot_next, name_next = self._peek_table()
TypeError: 'NoneType' object is not iterable

release 0.5.5 breaks requirementslib

This is exposed by the requirementslib testsuite or in isort:

[  146s] _____________________________ test_pipfile_finder ______________________________
[  146s] 
[  146s] tmpdir = local('/tmp/pytest-of-abuild/pytest-0/test_pipfile_finder0')
[  146s] 
[  146s]     def test_pipfile_finder(tmpdir):
[  146s]         pipfile = tmpdir.join('Pipfile')
[  146s]         pipfile.write(PIPFILE)
[  146s]         si = SortImports(file_contents="")
[  146s]         finder = finders.PipfileFinder(
[  146s]             config=si.config,
[  146s]             sections=si.sections,
[  146s] >           path=str(tmpdir)
[  146s]         )
[  146s] 
[  146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/test_isort.py:2685: 
[  146s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:199: in __init__
[  146s]     self.names = self._load_names()
[  146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:221: in _load_names
[  146s]     for name in self._get_names(path):
[  146s] /home/abuild/rpmbuild/BUILD/isort-4.3.21/isort/finders.py:332: in _get_names
[  146s]     for req in project.packages:
[  146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:240: in __getattr__
[  146s]     return super(Pipfile, self).__getattribute__(k, *args, **kwargs)
[  146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:326: in packages
[  146s]     return self.requirements
[  146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:240: in __getattr__
[  146s]     return super(Pipfile, self).__getattribute__(k, *args, **kwargs)
[  146s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[  146s] 
[  146s] self = Pipfile(path=PosixPath('/tmp/pytest-of-abuild/pytest-0/test_pipfile_finder0/Pi...e=None, _pyproject={}, build_system={}, _requirements=[], _dev_requirements=[])
[  146s] 
[  146s]     @property
[  146s]     def requirements(self):
[  146s]         # type: () -> List[Requirement]
[  146s]         if not self._requirements:
[  146s] >           packages = tomlkit_value_to_python(self.pipfile.get("packages", {}))
[  146s] E           AttributeError: 'NoneType' object has no attribute 'get'
[  146s] 
[  146s] /usr/lib/python2.7/site-packages/requirementslib/models/pipfile.py:344: AttributeError

The logic in requirementslib actually looks okayish so I suppose it is an issue inside tomlkit.

update() for Table and InlineTable yields surprising results

>>> import tomlkit
>>> t = tomlkit.table()
>>> t.update({'a': 1})
>>> t
{'a': 1}
>>> t.as_string()
''

I think the solution is to override these methods so that they also update _value. I'd be interested in working on this (and probably on other inherited methods as I find them) if such additions are considered acceptable.
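
A likely explanation, sketched without tomlkit itself: in CPython, dict.update() on a subclass writes straight into the underlying dict storage and does not go through an overridden __setitem__. The ShadowedTable and FixedTable classes below are hypothetical stand-ins for a dict subclass that renders from a separate _value store; they are not tomlkit's actual implementation.

```python
class ShadowedTable(dict):
    """Hypothetical dict subclass that renders from a separate store."""

    def __init__(self):
        super().__init__()
        self._value = {}  # assumed to be what as_string() renders from

    def __setitem__(self, key, value):
        self._value[key] = value
        super().__setitem__(key, value)

    def as_string(self):
        return "".join(f"{k} = {v!r}\n" for k, v in self._value.items())


class FixedTable(ShadowedTable):
    def update(self, *args, **kwargs):
        # Route every pair through __setitem__ so both stores stay in sync.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


broken = ShadowedTable()
broken.update({"a": 1})  # inherited dict.update() bypasses __setitem__
fixed = FixedTable()
fixed.update({"a": 1})

print(repr(broken), repr(broken.as_string()))
print(repr(fixed), repr(fixed.as_string()))
```

The same bypass would apply to other inherited methods such as setdefault() and pop(), so each of them would need the same treatment under this approach.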

if False?

https://github.com/sdispater/tomlkit/blob/5d949e3956e0f78a9497c6b1e8e62676106ff8d4/tomlkit/container.py#L288-L292

Is this supposed to do something?

Commas consumed as part of a comment?

https://github.com/sdispater/tomlkit/blob/07442dcd5e9e9f44092d34f6f4a437316182c79f/tomlkit/parser.py#L342-L344

Why are commas being consumed as part of the whitespace before a comment?

import tomlkit
p = tomlkit.parser.Parser("hello = 'world'    ,  # this")
doc = p.parse()

doc
# {'hello': 'world'}
doc.as_string()
# "hello = 'world'    ,  # this"
doc["hello"]._trivia.indent
# ''
doc["hello"]._trivia.comment_ws
# '    ,  '
doc["hello"]._trivia.comment
# '# this'
doc["hello"]._trivia.trail
# ''

Simply removing the comma from the comment parsing doesn't break any unit tests either.
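
For illustration, here is a minimal sketch of the stricter behavior, where only spaces and tabs count as whitespace between a value and its comment. The split_comment function is hypothetical and is not taken from tomlkit's parser.

```python
def split_comment(rest):
    """Split the text that follows a value into (comment_ws, comment).

    Only spaces and tabs are accepted before the '#'. Anything else,
    including a comma, is reported instead of being silently absorbed.
    """
    idx = 0
    while idx < len(rest) and rest[idx] in " \t":
        idx += 1
    ws, tail = rest[:idx], rest[idx:]
    if tail and not tail.startswith("#"):
        raise ValueError(f"unexpected character {tail[0]!r} at offset {idx}")
    return ws, tail


print(split_comment("    # this"))
try:
    split_comment("    ,  # this")
except ValueError as exc:
    print(exc)
```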

`AttributeError: 'Null' object has no attribute '_trivia'` after removing elements from table.

Code:

...  # removing many of the keys except the first in the table
table['test'] = 1

Traceback:

  File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 528, in __setitem__
    self.append(key, value)
  File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 175, in append
    return self._insert_at(key_after + 1, key, item)
  File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/container.py", line 292, in _insert_at
    and "\n" not in previous_item.trivia.trail
  File "/home/gram/.local/lib/python3.6/site-packages/tomlkit/items.py", line 237, in trivia
    return self._trivia
AttributeError: 'Null' object has no attribute '_trivia'

The trouble is somewhere in this loop:
https://github.com/sdispater/tomlkit/blob/master/tomlkit/container.py#L156-L170

After this block executes, idx points at a Null, which should not happen.

However, I haven't found exactly where the error is.

Locals inside Container.append:

{'idx': 3, 'key_after': 2, 'is_table': False, 'v': <Whitespace '\n'>, 'k': None, 'item': {'path': '/home/gram/Documents/dephell/tests/requirements/sdist.tar.gz', 'version': '*'}, 'key': <Key test>, 'self': {'python': '==3.5'}, '__class__': <class 'tomlkit.container.Container'>}

self._body:

(<Key python>, '==3.5')
(None, <tomlkit.items.Null object at 0x7fb1130380b8>)
(None, <tomlkit.items.Null object at 0x7fb113038be0>)
(None, <tomlkit.items.Null object at 0x7fb113038dd8>)
(None, <tomlkit.items.Null object at 0x7fb113038eb8>)
(None, <tomlkit.items.Null object at 0x7fb113038f98>)
(None, <tomlkit.items.Null object at 0x7fb1130385c0>)
(None, <tomlkit.items.Null object at 0x7fb113000048>)
(None, <tomlkit.items.Null object at 0x7fb113000668>)
(None, <tomlkit.items.Null object at 0x7fb1130005f8>)
(None, <tomlkit.items.Null object at 0x7fb113000828>)
(None, <tomlkit.items.Null object at 0x7fb1130007f0>)
(None, <tomlkit.items.Null object at 0x7fb1130008d0>)
(None, <tomlkit.items.Null object at 0x7fb1130002b0>)
(None, <tomlkit.items.Null object at 0x7fb113003128>)
(None, <tomlkit.items.Null object at 0x7fb1130030b8>)
(None, <tomlkit.items.Null object at 0x7fb113003278>)
(None, <tomlkit.items.Null object at 0x7fb113003390>)
(None, <tomlkit.items.Null object at 0x7fb113003320>)
(None, <tomlkit.items.Null object at 0x7fb1130034a8>)
(None, <tomlkit.items.Null object at 0x7fb1130035f8>)
(None, <Whitespace '\n'>)
(None, <tomlkit.items.Comment object at 0x7fb113003860>)
(None, <tomlkit.items.Null object at 0x7fb113003710>)
(None, <tomlkit.items.Null object at 0x7fb1130036a0>)
(None, <tomlkit.items.Null object at 0x7fb113003780>)
(None, <tomlkit.items.Null object at 0x7fb113003908>)
(None, <tomlkit.items.Null object at 0x7fb113003b38>)
(None, <Whitespace '\n'>)
(None, <tomlkit.items.Comment object at 0x7fb113006588>)
(None, <tomlkit.items.Null object at 0x7fb113003fd0>)
(None, <tomlkit.items.Null object at 0x7fb113003d68>)
(None, <tomlkit.items.Null object at 0x7fb113003b00>)
(None, <tomlkit.items.Null object at 0x7fb1130038d0>)
(None, <tomlkit.items.Null object at 0x7fb113003748>)
(None, <tomlkit.items.Null object at 0x7fb113003668>)
(None, <tomlkit.items.Null object at 0x7fb113003588>)
(None, <tomlkit.items.Null object at 0x7fb1130034e0>)
(None, <tomlkit.items.Null object at 0x7fb1130033c8>)
(None, <tomlkit.items.Null object at 0x7fb1130032e8>)
(None, <Whitespace '\n'>)

v points at the last element in the body, so break was never called.

🤔
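
One way to avoid landing on a placeholder, sketched with stand-in types: skip Null entries when searching for the previous real item. The Null class and the (key, item) body layout below only mimic the dump above; this is an assumption about a possible fix, not tomlkit's code.

```python
class Null:
    """Hypothetical placeholder left behind when an entry is deleted."""


def last_real_index(body):
    """Return the index of the last entry that is not a Null placeholder.

    Returns -1 when the body holds nothing but placeholders.
    """
    for idx in range(len(body) - 1, -1, -1):
        _key, item = body[idx]
        if not isinstance(item, Null):
            return idx
    return -1


body = [
    ("python", "==3.5"),
    (None, Null()),
    (None, Null()),
    (None, "\n"),
    (None, Null()),
]
idx = last_real_index(body)
print(idx, body[idx])
```

An insertion routine could then inspect the trivia of body[idx] only when idx is not -1, instead of assuming that the preceding entry is always a real item.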

Reassigning the value of an outline table makes the table messy.

Issue

Given this file:

[site.user]
name = "John"

Then re-assign the table to be a string:

import tomlkit

with open("my.toml") as f:
    parsed = tomlkit.parse(f.read())
parsed['site']['user'] = "Tom"
print(parsed.as_string())

gives:

user = "Tom"

The table header is missing!
tomlkit version: v0.5.1

Workaround

One needs to reassign the parent:

v = parsed['site']
parsed['site'] = v.copy()
parsed['site']['user']="Tom"
print(parsed.as_string())

[site]
user = "Tom"

Support common Python ABCs?

It'd be nice if Table and related types implemented the relevant ABCs so that they act more like regular Python types.
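
As a rough sketch of what this could look like, collections.abc.MutableMapping supplies update(), setdefault(), pop() and friends as mixin methods once the five abstract methods are defined. MiniTable is a hypothetical type with a single ordered store; how tomlkit would actually adopt the ABC is left open.

```python
from collections.abc import MutableMapping


class MiniTable(MutableMapping):
    """Hypothetical table type; only the five abstract methods are written."""

    def __init__(self):
        self._body = []  # ordered (key, value) pairs, the single source of truth

    def __getitem__(self, key):
        for k, v in self._body:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        for i, (k, _) in enumerate(self._body):
            if k == key:
                self._body[i] = (key, value)
                return
        self._body.append((key, value))

    def __delitem__(self, key):
        for i, (k, _) in enumerate(self._body):
            if k == key:
                del self._body[i]
                return
        raise KeyError(key)

    def __iter__(self):
        return (k for k, _ in self._body)

    def __len__(self):
        return len(self._body)


t = MiniTable()
t.update({"a": 1})    # mixin method, goes through __setitem__
t.setdefault("b", 2)  # mixin method
popped = t.pop("a")   # mixin method, goes through __delitem__
print(dict(t), popped)
```

Because every mixin method is built on the abstract ones, there is no second storage that can fall out of sync.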

pipenv's Trivia patch to make tomlkit dump toml's inline table

pipenv applies a patch to its vendored copy of tomlkit, and it seems that the patch doesn't exist here yet.

The patch is https://github.com/pypa/pipenv/blob/master/tasks/vendoring/patches/vendor/tomlkit-fix.patch

Parts of that have been merged, but the bit that looks missing is:

diff -ru tomlkit-0.5.3-orig/tomlkit/container.py tomlkit-0.5.3/tomlkit/container.py
--- tomlkit-0.5.3-orig/tomlkit/container.py	2018-11-14 23:10:40.697032200 +0700
+++ tomlkit-0.5.3/tomlkit/container.py	2019-03-14 10:57:20.658602016 +0700
@@ -19,6 +19,7 @@
 from .items import Key
 from .items import Null
 from .items import Table
+from .items import Trivia
 from .items import Whitespace
 from .items import item as _item
 
@@ -223,7 +224,12 @@
             for i in idx:
                 self._body[i] = (None, Null())
         else:
-            self._body[idx] = (None, Null())
+            old_data = self._body[idx][1]
+            trivia = getattr(old_data, "trivia", None)
+            if trivia and trivia.comment:
+                self._body[idx] = (None, Comment(Trivia(comment_ws="", comment=trivia.comment)))
+            else:
+                self._body[idx] = (None, Null())
 
         super(Container, self).__delitem__(key.key)
 
diff -ru tomlkit-0.5.3-orig/tomlkit/items.py tomlkit-0.5.3/tomlkit/items.py
--- tomlkit-0.5.3-orig/tomlkit/items.py	2018-11-20 01:11:57.965421000 +0700
+++ tomlkit-0.5.3/tomlkit/items.py	2019-03-14 10:59:38.451740952 +0700
@@ -20,6 +20,7 @@
 from ._compat import long
 from ._compat import unicode
 from ._utils import escape_string
+from toml.decoder import InlineTableDict
 
 if PY2:
     from functools32 import lru_cache
@@ -40,7 +41,10 @@
     elif isinstance(value, float):
         return Float(value, Trivia(), str(value))
     elif isinstance(value, dict):
-        val = Table(Container(), Trivia(), False)
+        if isinstance(value, InlineTableDict):
+            val = InlineTable(Container(), Trivia())
+        else:
+            val = Table(Container(), Trivia(), False)
         for k, v in sorted(value.items(), key=lambda i: (isinstance(i[1], dict), i[0])):
             val[k] = item(v, _parent=val)

Is this an appropriate addition to tomlkit?

It appears the author was @frostming, originally at https://github.com/pypa/pipenv/commits/6df7d8861da841e552049dcde9ff9a0f23edc01e/tasks/vendoring/patches/vendor/tomlkit-dump-inline-table.patch

I don't see any similar patch in https://github.com/sdispater/tomlkit/commits?author=frostming

Ideally they should submit the patch themselves, or someone else should submit it with the patch author set to the correct author. Ping also @techalchemy, who has been heavily involved in the pipenv vendoring.

Unexpected TypeError while running tomlkit.loads()

Run tomlkit.loads("foo bar"), observe that it raises, as expected,

UnexpectedCharError: Unexpected character: 'b' at line 1 col 4

Run tomlkit.loads("hello there"), observe that it suddenly raises

TypeError: __init__() missing 1 required positional argument: 'char'
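
The message suggests that an exception class is being constructed without one of its required arguments. The sketch below reproduces that failure mode with hypothetical names (ParseError, UnexpectedChar, make_error); whether tomlkit's parser hits exactly this path for "hello there" is an assumption.

```python
class ParseError(ValueError):
    def __init__(self, line, col, message="Parse error"):
        self.line, self.col = line, col
        super().__init__(f"{message} at line {line} col {col}")


class UnexpectedChar(ParseError):
    def __init__(self, line, col, char):
        super().__init__(line, col, f"Unexpected character: {char!r}")


def make_error(exc_class, line, col, *args):
    # Extra arguments are forwarded as-is. Forgetting `char` at a call
    # site fails with TypeError instead of producing the parse error.
    return exc_class(line, col, *args)


ok = make_error(UnexpectedChar, 1, 4, "b")
print(ok)

try:
    make_error(UnexpectedChar, 1, 6)  # `char` omitted
except TypeError as exc:
    print(type(exc).__name__, exc)
```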

Multiline dump?

If a string contains \n, is it possible to dump it as a multiline string? e.g.:

foo = """
my 
string
"""
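
For reference, a stdlib-only sketch of what such a dump would need to produce. dump_multiline is a hypothetical helper, not a tomlkit API; it assumes the key is a valid bare key and follows the TOML rules for multi-line basic strings as I understand them.

```python
def dump_multiline(key, value):
    """Render `value` as a TOML multi-line basic string assigned to `key`.

    Backslashes and double quotes are escaped, tabs and newlines are kept
    literal, and other control characters become Unicode escapes.
    """
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch in "\t\n":
            out.append(ch)
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\u{:04X}".format(ord(ch)))
        else:
            out.append(ch)
    body = "".join(out)
    # A newline right after the opening delimiter is trimmed by TOML parsers,
    # so it does not become part of the value.
    return key + ' = """\n' + body + '"""\n'


print(dump_multiline("foo", "my\nstring\n"), end="")
```

Escaping every double quote is stricter than required, but it rules out an accidental closing delimiter inside the value.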