avakar / pytoml

A TOML-0.4.0 parser/writer for Python.

License: Other

Python 100.00%

pytoml's Introduction

Deprecated

The pytoml project is no longer being actively maintained. Consider using the toml package instead.

pytoml

This project aims at being a specs-conforming and strict parser and writer for TOML files. The library currently supports version 0.4.0 of the specs and runs with Python 2.7+ and 3.5+.

Install:

pip install pytoml

The interface is the same as for the standard json package.

>>> import pytoml as toml
>>> toml.loads('a = 1')
{'a': 1}
>>> with open('file.toml', 'rb') as fin:
...     obj = toml.load(fin)
>>> obj
{'a': 1}

The loads function accepts either a bytes object (that gets decoded as UTF-8 with no BOM allowed), or a unicode object.

Use dump or dumps to serialize a dict into TOML.

>>> print(toml.dumps(obj))
a = 1

tests

To run the tests update the toml-test submodule:

git submodule update --init --recursive

Then run the tests:

python test/test.py

pytoml's Issues

Encoder outputs invalid characters in key

When I run

pytoml.dump({u"\u00c0": 1}, sys.stdout)

it prints

À = 1

which is not a valid TOML document. Bare keys must only contain ASCII characters.
A quoted string needs to be used, for example "\u00c0" = 1.

Similar case:

pytoml.dump({u"\u00c0": { "a": 1}}, sys.stdout)

prints

[À]
a = 1

which is also invalid.
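
One way to avoid this is to emit a key bare only when it consists entirely of the characters the spec allows in bare keys, and to quote it otherwise. The sketch below is not pytoml's actual writer code: format_key is a hypothetical helper, and it assumes non-ASCII characters should be written as \u or \U escapes (keeping them verbatim inside the quotes would also be valid TOML).

```python
import re

# Bare keys may only contain ASCII letters, digits, underscores and dashes.
_BARE_KEY = re.compile(r"[A-Za-z0-9_-]+")


def format_key(key):
    """Return `key` bare if allowed, otherwise as a quoted basic string.

    Hypothetical helper for illustration; not part of pytoml's API.
    """
    if _BARE_KEY.fullmatch(key):
        return key
    out = []
    for ch in key:
        cp = ord(ch)
        if ch in '"\\':
            out.append("\\" + ch)        # escape quote and backslash
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)               # printable ASCII is kept as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)   # control characters and BMP code points
        else:
            out.append("\\U%08x" % cp)   # code points beyond the BMP
    return '"' + "".join(out) + '"'


print(format_key("a-b_1"))     # a-b_1
print(format_key("\u00c0"))    # "\u00c0"
```

The same helper could be applied to each component of a table header, so that the second case would be written as ["\u00c0"].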

pytoml incorrectly strips leading newlines from single-line strings

See below:

>>> toml_content
u'["a.com"]\nusername = "\\n0"\npassword = "0"\n'
>>> print(toml_content)
["a.com"]
username = "\n0"
password = "0"

>>> pytoml.loads(toml_content)
{u'a.com': {u'username': u'0', u'password': u'0'}}

The leading \n in username is stripped in the pytoml.loads function.

Making TomlDecodeError more useful

I have some code parsing TOML doing:

    except toml.TomlDecodeError as e:
        msg = "'{}' is not a valid TOML file (Error given is: {})\n"
        error = e.args[0]
        if error == 'Invalid date or number':
            msg += (
                "One frequent cause of this is forgetting to put quotes "
                "around secret keys. Check the file."
            )

        match = re.search(r"What\? (\w+) ", error)
        duplicate = match and next(iter(match.groups()), None)
        if duplicate:
            msg += (
                "One frequent cause of this is using the same account name "
                "twice. Check that you didn't use '{}' several times."
            ).format(duplicate)

        ctx.fail(msg.format(secrets_file, e))

All that is because I get bare TomlDecodeError exceptions.

When raising an exception, pytoml could attach some data so that the catching code can make more accurate decisions.

E.g.:

A TomlDecodeError raised for "Invalid date or number" or "What? Foo already exists" should have a field attribute so that we can inspect which data caused the problem.

Also, it would be good to have TomlDecodeError subclasses for each particular problem, so that we have more granular control over what we catch.

E.g.:

class InvalidDateOrNumberTomlError(TomlDecodeError, ValueError):
    pass

class DuplicateKeyTomlError(TomlDecodeError, KeyError):
    pass

In the end, my ugly hack code could be replaced by:

    except InvalidDateOrNumberTomlError as e:
        ctx.fail((
            'Invalid date or number on field "{}". '
            'One frequent cause of this is forgetting to put quotes '
            'around secret keys. Check the file.'
        ).format(e.field))
    except DuplicateKeyTomlError as e:
        ctx.fail((
            'Duplicate key: "{}". '
            'One frequent cause of this is using the same account name twice.'
        ).format(e.field))

Which is shorter, clearer, less error-prone, and less likely to break in the future.

It's fair to expect that programmers will want to give feedback to their users if parsing a config file fails, so let's make that easy.
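
To make the proposal concrete, here is a minimal, self-contained sketch of such a hierarchy. It is not pytoml code: the class names come from this issue, while the field constructor argument and the explain helper are assumptions made for illustration.

```python
class TomlDecodeError(Exception):
    """Base class; `field` names the key or value that caused the error."""

    def __init__(self, message, field=None):
        super().__init__(message)
        self.field = field


class InvalidDateOrNumberTomlError(TomlDecodeError, ValueError):
    pass


class DuplicateKeyTomlError(TomlDecodeError, KeyError):
    pass


def explain(exc):
    """Turn a decode error into a user-facing hint (hypothetical helper)."""
    if isinstance(exc, DuplicateKeyTomlError):
        return 'Duplicate key "{}".'.format(exc.field)
    if isinstance(exc, InvalidDateOrNumberTomlError):
        return 'Invalid date or number on field "{}".'.format(exc.field)
    return "Invalid TOML."


try:
    # Stand-in for a parser that has just detected a repeated key.
    raise DuplicateKeyTomlError("What? foo already exists", field="foo")
except TomlDecodeError as e:
    print(explain(e))  # Duplicate key "foo".
```

Because the subclasses also derive from ValueError and KeyError, existing code that catches those built-in types would keep working.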

[Feature Request] Add pretty print/indented output

Hello, would it be possible to add an option for pretty printed/indented output?

So we can have:

[[fruit]]
  name = "apple"

  [fruit.physical]
    color = "red"
    shape = "round"

  [[fruit.variety]]
    name = "red delicious"

  [[fruit.variety]]
    name = "granny smith"

[[fruit]]
  name = "banana"

  [[fruit.variety]]
    name = "plantain"

instead of:

[[fruit]]
name = "apple"

[fruit.physical]
color = "red"
shape = "round"

[[fruit.variety]]
name = "red delicious"

[[fruit.variety]]
name = "granny smith"

[[fruit]]
name = "banana"

[[fruit.variety]]
name = "plantain"
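
For reference, the requested layout can be produced by a small recursive writer. The following is only a sketch under strong assumptions, not a patch for pytoml: dumps_indented is a hypothetical function, keys are assumed to be valid bare keys, and only strings, integers, booleans, nested tables, and arrays of tables are handled.

```python
import json


def _format_scalar(value):
    if isinstance(value, bool):       # bool is a subclass of int, so check it first
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return json.dumps(value)      # JSON escapes are valid TOML escapes for ordinary text
    raise TypeError("unsupported value: %r" % (value,))


def _emit(table, path, depth, indent, out):
    pad = indent * depth
    # Plain key/value pairs come first so they belong to the current table.
    for key, value in table.items():
        if not isinstance(value, (dict, list)):
            out.append("%s%s = %s" % (pad, key, _format_scalar(value)))
    for key, value in table.items():
        name = ".".join(path + (key,))
        if isinstance(value, dict):
            out.extend(["", "%s[%s]" % (pad, name)])
            _emit(value, path + (key,), depth + 1, indent, out)
        elif isinstance(value, list):
            for item in value:
                if not isinstance(item, dict):
                    raise TypeError("only arrays of tables are supported")
                out.extend(["", "%s[[%s]]" % (pad, name)])
                _emit(item, path + (key,), depth + 1, indent, out)


def dumps_indented(obj, indent="  "):
    out = []
    _emit(obj, (), 0, indent, out)
    return "\n".join(out).strip("\n") + "\n"


print(dumps_indented({"fruit": [{"name": "apple", "physical": {"color": "red"}}]}))
```

Indentation is insignificant in TOML, so output written this way parses to the same data as the flat form.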

Pytoml chokes on input from rust thread_local Cargo.toml

I'm not sure exactly what it is about this line, but it causes pytoml to fail. If I remove the middle third, it will pass, but I'm not sure if it's the quotes, or the parens or something else.

[target.'cfg(not(target_os = "emscripten"))'.dependencies]

Source: https://github.com/Amanieu/thread_local-rs/blob/master/Cargo.toml

Failure:

>>> pytoml.loads(open('Cargo.toml').read())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/vagrant/client-dropbox-python/dropbox-virtual-env-3dfc741c0e6a-mac-intel-10.12/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pytoml/parser.py", line 23, in loads
    ast = _p_toml(src)
  File "/Users/vagrant/client-dropbox-python/dropbox-virtual-env-3dfc741c0e6a-mac-intel-10.12/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pytoml/parser.py", line 344, in _p_toml
    s.expect_eof()
  File "/Users/vagrant/client-dropbox-python/dropbox-virtual-env-3dfc741c0e6a-mac-intel-10.12/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pytoml/parser.py", line 124, in expect_eof
    return self._expect(self.consume_eof())
  File "/Users/vagrant/client-dropbox-python/dropbox-virtual-env-3dfc741c0e6a-mac-intel-10.12/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pytoml/parser.py", line 164, in _expect
    raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
pytoml.core.TomlError: <string>(15, 1): msg

Unhelpful error message 'msg' when `_expect` fails

When _Source._expect() fails it raises a TomlError with the unhelpful error message msg:

pytoml/pytoml/parser.py

Line 164 in cb92445

raise TomlError('msg', self._pos[0], self._pos[1], self._filename)

Example to provoke this error:

[a]b
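
One possible direction is to have the check carry a description of what was expected, so that the message names it. The sketch below is a standalone illustration and does not reproduce pytoml's parser: this TomlError class and the expect function (with its what parameter) are stand-ins invented here.

```python
class TomlError(RuntimeError):
    """Stand-in for the parser's error type, not pytoml's actual class."""

    def __init__(self, message, line, col, filename):
        super().__init__(message, line, col, filename)
        self.message = message
        self.line = line
        self.col = col
        self.filename = filename

    def __str__(self):
        return "%s(%d, %d): %s" % (self.filename, self.line, self.col, self.message)


def expect(result, what, line, col, filename="<string>"):
    """Return `result` if truthy, else raise an error that says what was expected."""
    if not result:
        raise TomlError("expected " + what, line, col, filename)
    return result


try:
    # Simulates the parser finding trailing input where the line should end.
    expect(False, "end of line", 1, 4)
    reported = None
except TomlError as e:
    reported = str(e)

print(reported)  # <string>(1, 4): expected end of line
```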

End-of-line comments are not parsed correctly

The following example is taken from the current (0.4) TOML specification (https://github.com/toml-lang/toml#user-content-comment):

    # This is a full-line comment
    key = "value" # This is a comment at the end of a line

Parsing it with pytoml produces an error near the comment on the second line:

import pytoml
pytoml.loads("""
# This is a full-line comment
key = "value" # This is a comment at the end of a line
""")

The error is:

TomlError: <string>(2, 15): msg

How would you feel about pytoml becoming a dependency of pip?

Hello!

There's a discussion happening on distutils-sig about choosing a new standard configuration file format for Python source trees, to be read by tools like pip when they want to build a package. There are multiple threads, but maybe this one is the most relevant.

Currently sentiment seems to be shifting towards TOML, and attention is focused on pytoml because in some initial testing we found that pytoml seems to handle unicode correctly on py2 and the other toml parser we could find doesn't.

So... I wanted to reach out to you and see what you thought about this. I'm super excited about the possibility of having something like TOML become a standard part of our package building toolkit. But... this would also mean a sudden huge influx of users and attention on your little project here, which I know can be a very mixed blessing. And there'd be issues that come up, like, pip needs support for python 2.6 and 3.3 -- I don't know if you're actually interested in supporting those going forward? (I actually have a patch for this that I'll submit as a PR in a moment, but there are also ongoing costs to keeping old python versions working and a PR doesn't solve that.) And we'd probably need to fix the error message situation. And so forth.

What do you think?

load from io streams

I just encountered an issue when trying to pytoml.load from a python io.StringIO (or other stream). PR #22 is a quick fix - NB no extra tests added.

Including commented settings

Hi!

I was wondering whether it would be possible to preserve commented-out key/value pairs. My problem at the moment is that I load a TOML file that has default settings commented out but left in the file, so that a user editing it manually (for example in nano) can simply activate them. pytoml loads that file but drops the commented-out lines, so after dumping they are gone.

Could you tag and push tags on new versions?

It makes packaging slightly easier for downstream.

datetime issue

python 3.4.3, pytoml==0.1.7

>>> import pytoml as toml
>>> import datetime
>>> d={'dt':datetime.datetime.now()}
>>> d
{'dt': datetime.datetime(2016, 2, 25, 11, 18, 17, 536767)}

>>> toml.dumps(d)
'dt = 2016-02-25T11:18:17.536767Z\n'
>>> toml.loads(toml.dumps(d))
{'dt': datetime.datetime(2016, 2, 25, 11, 18, 17, 536767, tzinfo=<pytoml.parser._TimeZone object at 0x7f9aca381978>)}
>>> d == toml.loads(toml.dumps(d))
False
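
A likely explanation, assuming (as the output above suggests) that the writer appends Z to a naive datetime and the parser returns a timezone-aware datetime for it: Python never treats a naive and an aware datetime as equal, even when all other fields match. The standard-library sketch below shows the comparison and one possible workaround, which is to make the value aware before dumping.

```python
from datetime import datetime, timezone

naive = datetime(2016, 2, 25, 11, 18, 17, 536767)   # like datetime.now()
aware = naive.replace(tzinfo=timezone.utc)          # what a "Z" suffix implies

print(naive == aware)                       # False: naive and aware never compare equal
print(aware.replace(tzinfo=None) == naive)  # True: the wall-clock fields are identical

# Workaround: start from an aware datetime so the round trip can compare equal.
now_utc = datetime.now(timezone.utc)
print(now_utc.tzinfo is not None)           # True
```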

Are any changes required for TOML v0.5?

https://github.com/toml-lang/toml/blob/master/README.md
https://github.com/toml-lang/toml/wiki#implementations

String literals containing braces are misparsed

toml.loads('bar=["${BAZ}/quux"]')

produces a mangled:

{u'bar': [u'${BAZ']}

Parser fails on raw quote followed by escape inside multi-line basic string

I get an error when I try to parse the following valid TOML document:

a = """b "\n c"""

Result: TomlError: <string>(1, 1): msg
Expected: {'a': 'b "\\n c'}

Encoder outputs invalid timezone offset in date-time value

When I run

pytoml.dump({'numoffset': datetime.datetime(1977, 6, 28, 7, 32, tzinfo=datetime.timezone(datetime.timedelta(0, -5*60*60)))}, sys.stdout)

it prints

numoffset = 1977-06-28T07:32:00-5.00.0

which is not a valid TOML document.
According to the TOML specification and RFC 3339 section 5.6, the time offset must be formatted differently. For example numoffset = 1977-06-28T07:32:00-05:00 would be a correct TOML encoding.

Note this issue comes up in testcase valid/datetime from https://github.com/BurntSushi/toml-test
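
For illustration, an RFC 3339 offset can be built from utcoffset() as shown below. This is a sketch rather than pytoml's writer code: format_offset is a hypothetical helper, and any sub-minute part of an offset is dropped because RFC 3339 offsets only carry hours and minutes.

```python
import datetime


def format_offset(dt):
    """Return the RFC 3339 offset suffix for `dt` ('' for naive datetimes)."""
    offset = dt.utcoffset()
    if offset is None:
        return ""
    total = int(offset.total_seconds())
    if total == 0:
        return "Z"
    sign = "-" if total < 0 else "+"
    hours, remainder = divmod(abs(total), 3600)
    minutes = remainder // 60            # seconds, if any, are discarded
    return "%s%02d:%02d" % (sign, hours, minutes)


tz = datetime.timezone(datetime.timedelta(0, -5 * 60 * 60))
dt = datetime.datetime(1977, 6, 28, 7, 32, tzinfo=tz)
print(dt.strftime("%Y-%m-%dT%H:%M:%S") + format_offset(dt))
# 1977-06-28T07:32:00-05:00
```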

Leading zeroes in floating-point exponents raise a TomlError

The following code raises a TomlError exception both under Python2 and Python3:

import pytoml
pytoml.loads("""
# This is a full-line comment
maximum_error = 4.85e-06 # 1 arcsec
""")

The error message is the following:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pytoml/parser.py", line 23, in loads
    ast = _p_toml(src)
  File "/usr/local/lib/python2.7/dist-packages/pytoml/parser.py", line 344, in _p_toml
    s.expect_eof()
  File "/usr/local/lib/python2.7/dist-packages/pytoml/parser.py", line 124, in expect_eof
    return self._expect(self.consume_eof())
  File "/usr/local/lib/python2.7/dist-packages/pytoml/parser.py", line 164, in _expect
    raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
pytoml.core.TomlError: <string>(3, 24): msg

Removing the leading zero solves the problem:

import pytoml
pytoml.loads("""
# This is a full-line comment
maximum_error = 4.85e-6 # 1 arcsec
""")
# Result: {'maximum_error': 4.85e-06}

table name with quoted strings fails to parse

According to the TOML spec, table names follow the same rules as keys, which means they can be quoted. The following cases fail:

['bar']
[foo.'bar']
[foo.'bar'.baz]

tabs in multi-line literal strings yield error

The TOML specification states that literal-strings are interpreted as-is. Hence, tab characters in literal strings should be possible. Our TOML files contain partial makefile rules and as such require tab characters.

Multi-line literal strings are surrounded by three single quotes on each side and allow newlines. Like literal strings, there is no escaping whatsoever. A newline immediately following the opening delimiter will be trimmed. All other content between the delimiters is interpreted as-is without modification.

However, decoding fails with the usual generic error. It works in pytoml version 0.1.2.

File ".../pytoml/parser.py", line 10, in load
    return loads(fin.read(), translate=translate, object_pairs_hook=object_pairs_hook, filename=getattr(fin, 'name', repr(fin)))
  File ".../pytoml/parser.py", line 23, in loads
    ast = _p_toml(src, object_pairs_hook=object_pairs_hook)
  File ".../pytoml/parser.py", line 352, in _p_toml
    s.expect_eof()
  File ".../pytoml/parser.py", line 124, in expect_eof
    return self._expect(self.consume_eof())
  File ".../pytoml/parser.py", line 164, in _expect
    raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
pytoml.core.TomlError: somefilename(14, 5): msg

Encoder fails on array of arrays of tables

When I run

pytoml.dumps({'a': [ [ {} ] ] })

it raises RuntimeError: {}

Perhaps this occurs because the specified data structure can only be encoded as an array of arrays of inline tables. For example a = [ [ {} ] ] would be a correct TOML encoding.

Invalid floats such as nan included in output

When the data includes a float such as nan, pytoml writes this out using repr. However, this isn't valid TOML, and indeed the parser raises an error when trying to parse it. Would it make more sense to raise an error in this situation?
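
The "raise an error" option could look roughly like this. format_float is a hypothetical helper, not pytoml's actual function, and the sketch assumes the target is TOML 0.4.0, which has no representation for nan or infinities.

```python
import math


def format_float(value):
    """Format a finite float; refuse values TOML 0.4.0 cannot represent."""
    if math.isnan(value) or math.isinf(value):
        raise ValueError("cannot encode %r as a TOML float" % (value,))
    return repr(value)


print(format_float(1.5))  # 1.5
try:
    format_float(float("nan"))
except ValueError as e:
    print(e)              # cannot encode nan as a TOML float
```

Failing at write time keeps the invalid document from being produced at all, instead of surfacing the problem later when the file is parsed.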

dump() parameters are switched compared to json, pickle

pytoml looks like it's mirroring the API of modules like json and pickle, with load[s] and dump[s] functions. But whereas json and pickle have dump(obj, fd), pytoml has dump(fd, obj).

Obviously fixing this would be a breaking API change, so maybe it's too late. But it's a minor annoyance.
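
For comparison, this is the argument order used by the standard library, with the object first and the file second. The snippet only uses json and pickle and makes no claim about pytoml's signature beyond what is described above.

```python
import io
import json
import pickle

obj = {"a": 1}

text_buf = io.StringIO()
json.dump(obj, text_buf)        # object first, then the text file

bytes_buf = io.BytesIO()
pickle.dump(obj, bytes_buf)     # same order, with a binary file

print(text_buf.getvalue())      # {"a": 1}
```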

Why not de-deprecate pytoml?

Hi! Your README states that pytoml is deprecated in favor of the toml package, yet I have fewer issues using pytoml than toml.

So, first of all thanks and congratulations \o/

Also what about removing the statement? It can mislead people towards a broken implementation, while yours is good.

Dotted keys

a.b.c = 1

# Equivalent to:
[a.b]
c = 1

This is a new feature added to what's going to become TOML 1.0:
toml-lang/toml#505

Of course, TOML 1.0 is not released yet, and pytoml currently declares itself a TOML 0.4.0 parser. So it's not a bug that this doesn't work. But I hope that we can update pytoml once the new spec version is finalised. :-)
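
To illustrate the intended semantics, the sketch below maps a dotted key onto nested dictionaries. It is not parser code: set_dotted is a hypothetical helper, and it assumes every key component is a bare key (quoted components that contain dots would need a real tokenizer).

```python
def set_dotted(root, dotted_key, value):
    """Assign `value` at the nested position named by `dotted_key`."""
    *parents, last = dotted_key.split(".")
    table = root
    for name in parents:
        table = table.setdefault(name, {})   # create intermediate tables on demand
        if not isinstance(table, dict):
            raise ValueError("%r is not a table" % (name,))
    if last in table:
        raise ValueError("duplicate key %r" % (dotted_key,))
    table[last] = value
    return root


doc = {}
set_dotted(doc, "a.b.c", 1)
print(doc)  # {'a': {'b': {'c': 1}}}
```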

Support escaping strings with '' instead of ""

Currently this is hard-coded:

pytoml/pytoml/writer.py

Line 47 in cd54e91

return '"' + ''.join(res) + '"'

Would be nice to pass an option to use the '' format for projects where this is the standard

Parser error if final line has a comment

pytoml doesn't seem to like it if the very last line has a comment on (and there is no blank line at the end of the file).

Parses ok (file ends with a newline):

a=1 #comment

Errors (no newline after the comment):

a=1 #comment

Example:

>>> import pytoml as toml
>>> toml.loads('[table] #comment\n')
{u'table': {}}
>>> toml.loads('[table] #comment')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pytoml/parser.py", line 23, in loads
    ast = _p_toml(src)
  File "/usr/local/lib/python2.7/site-packages/pytoml/parser.py", line 344, in _p_toml
    s.expect_eof()
  File "/usr/local/lib/python2.7/site-packages/pytoml/parser.py", line 124, in expect_eof
    return self._expect(self.consume_eof())
  File "/usr/local/lib/python2.7/site-packages/pytoml/parser.py", line 164, in _expect
    raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
pytoml.core.TomlError: <string>(1, 9): msg
>>> import pkg_resources; pkg_resources.get_distribution("pytoml").version
'0.1.8'

Preceding '\n' at Table serialization

Hi,

I wonder what https://github.com/avakar/pytoml/blob/master/pytoml/writer.py#L123-L124 is for. I recently started using your library and, until I took a peek into the source, had been wondering why an "unnecessary newline" is always prepended when tables are written.

Knowing the "TOML"-spec, my initial guess was:

(...) Tables (...) appear in square brackets on a line by themselves (...)[1]

Yet, isn't this covered by all the appended '\n's in all calls to io.StringIO() [specifically assigned to 'fout' in pytoml's context]?

Thanks for your work and best regards,
Patrick

[1] - https://github.com/toml-lang/toml#table

Literal strings bugs.

I am using python 3.5.

Parsing with pytoml.loads()

This toml:

[table]
key = 'val\nue'

Gives out this error

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/home/simon/venv/3.5/lib/python3.5/site-packages/pytoml/parser.py", line 23, in loads
    ast = _p_toml(src)
  File "/home/simon/venv/3.5/lib/python3.5/site-packages/pytoml/parser.py", line 344, in _p_toml
    s.expect_eof()
  File "/home/simon/venv/3.5/lib/python3.5/site-packages/pytoml/parser.py", line 124, in expect_eof
    return self._expect(self.consume_eof())
  File "/home/simon/venv/3.5/lib/python3.5/site-packages/pytoml/parser.py", line 164, in _expect
    raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
pytoml.core.TomlError: <string>(3, 1): msg

That toml:

[table]
key = 'val\ue'

Gives out this error:

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 19-21: truncated \uXXXX escape

This toml:

[table]
key = '''value\n'''

Gives out:

{'table': {'key': 'value\n'}}

But should give out (by v0.4.0 specs):

{'table': {'key': 'value\\n'}}
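
These results may come from the way the TOML text was written in Python rather than from the parser. Assuming the documents were passed as ordinary (non-raw) string literals, Python processes \n and \u itself before pytoml ever sees the text; the SyntaxError in the second case, in particular, is raised by the Python compiler. The standard-library sketch below shows the difference that a raw string makes.

```python
plain = "key = 'val\nue'"    # Python turns \n into a real newline character
raw = r"key = 'val\nue'"     # backslash and 'n' stay as two separate characters

print("\n" in plain)            # True: the TOML text now contains a line break
print("\n" in raw)              # False
print("\\n" in raw)             # True: a literal backslash followed by 'n'
print(len(raw) - len(plain))    # 1
```

Passing the document as a raw string (or reading it from a file) would let the parser see the backslash sequences exactly as written.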

local date and/or time load error

The following value types cause load errors:

Local Date-Time
Local Date
Local Time

Example

>>> import pytoml as toml
>>> toml.loads('date = 1979-05-27')
---------------------------------------------------------------------------
TomlError                                 Traceback (most recent call last)
<ipython-input-32-55780aac767f> in <module>()
      1 import pytoml as toml
----> 2 toml.loads('date = 1979-05-27')
      3 

/usr/local/lib/python3.6/site-packages/pytoml/parser.py in loads(s, filename, translate)
     21 
     22     src = _Source(s, filename=filename)
---> 23     ast = _p_toml(src)
     24 
     25     def error(msg):

/usr/local/lib/python3.6/site-packages/pytoml/parser.py in _p_toml(s)
    350             stmts.append(_p_stmt(s))
    351     _p_ews(s)
--> 352     s.expect_eof()
    353     return stmts
    354 

/usr/local/lib/python3.6/site-packages/pytoml/parser.py in expect_eof(self)
    122 
    123     def expect_eof(self):
--> 124         return self._expect(self.consume_eof())
    125 
    126     def consume(self, s):

/usr/local/lib/python3.6/site-packages/pytoml/parser.py in _expect(self, r)
    162     def _expect(self, r):
    163         if not r:
--> 164             raise TomlError('msg', self._pos[0], self._pos[1], self._filename)
    165         return r
    166 

TomlError: <string>(1, 12): msg

No description on PyPI

In the PyPI page, there is no description. Does it even support markdown these days?

Why is there another TOML parser?

There are currently two TOML parsers for Python. Can the two be merged?

bare integers don't work as keys

From the spec:

"Keys may be either bare or quoted. Bare keys may only contain letters, numbers, underscores, and dashes (A-Za-z0-9_-)."

Test case:

5="bruce"

Result:

pytoml.core.TomlError: script.bb(1, 1): unexpected

Reason:

By the time pytoml has gotten to the question of whether a token is an id, it has already checked to see if it is a datetime or a float, or fallen back to int if it was neither a datetime nor a float.

However, what it should do is instead check to see if there's an equals sign (possibly after some amount of whitespace) after the int, and if there is, mark it as an id.
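
The lookahead described above can be sketched with a regular expression. This is an illustration, not a patch for pytoml's parser: match_bare_key is a hypothetical helper that only recognises bare keys at the given position.

```python
import re

# A run of bare-key characters that is followed by optional whitespace and '='.
_BARE_KEY_BEFORE_EQUALS = re.compile(r"[A-Za-z0-9_-]+(?=[ \t]*=)")


def match_bare_key(text, pos=0):
    """Return the bare key starting at `pos`, or None if no key starts there."""
    m = _BARE_KEY_BEFORE_EQUALS.match(text, pos)
    return m.group(0) if m else None


print(match_bare_key('5="bruce"'))    # 5
print(match_bare_key("1979-05-27"))   # None: no '=' follows, so this is a value
```

Checking for the equals sign before trying the value grammars means that tokens such as 5 or 1979-05-27 are only read as numbers or dates when they appear on the value side.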

Missing tag for 0.1.18 release

You probably forgot to tag it when uploading to PyPI, due to releasing two versions in quick succession...

RuntimeError on dumps if dict is embedded into list.

Hi,

My understanding is that TOML is equivalent to JSON.
Let's consider the following code:

import pytoml as toml

# this is totally serializable to json but pytoml can't handle it
print toml.dumps({'foo': [{'quux': 'quuuux'}, 'bar']})

The following traceback occurs:

Traceback (most recent call last):
  File "./t", line 5, in <module>
    print toml.dumps({'foo': [{'quux': 'quuuux'}, 'bar']})
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 11, in dumps
    dump(fout, obj, sort_keys=sort_keys)
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 118, in dump
    fout.write('{0} = {1}\n'.format(_escape_id(k), _format_value(v)))
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 86, in _format_value
    return _format_list(v)
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 49, in _format_list
    return '[{0}]'.format(', '.join(_format_value(obj) for obj in v))
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 49, in <genexpr>
    return '[{0}]'.format(', '.join(_format_value(obj) for obj in v))
  File "/Users/b.palmowski/Library/Python/2.7/lib/python/site-packages/pytoml/writer.py", line 88, in _format_value
    raise RuntimeError(v)
RuntimeError: {'quux': 'quuuux'}

Tab in string yielding error

Executing this:

Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pytoml
>>> pytoml.loads('a="\t"')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build\bdist.win-amd64\egg\pytoml\parser.py", line 23, in loads
  File "build\bdist.win-amd64\egg\pytoml\parser.py", line 344, in _p_toml
  File "build\bdist.win-amd64\egg\pytoml\parser.py", line 124, in expect_eof
  File "build\bdist.win-amd64\egg\pytoml\parser.py", line 164, in _expect
pytoml.core.TomlError: <string>(1, 1): msg

Apparently a tab is not allowed in a string, yielding an uninformative error message. It would be nice to see this fixed to prevent others from stumbling over the same message.

translate isn't called for dictionaries

I am trying to use a custom dictionary class for all parsed dictionaries and I thought that I could do that using the translate callback to load/loads. Turns out, translate is only called for scalars and lists, but not for dictionaries (even though there is an elif kind == 'table': in process_value).

It would also be useful if there was a way to provide the dict class and have it done automatically (uiri/toml has such an option).

Thanks!
