jmespath / jmespath.py

JMESPath is a query language for JSON.

Home Page: http://jmespath.org

License: MIT License

Python 100.00%

jmespath.py's Introduction

JMESPath

JMESPath (pronounced "james path") allows you to declaratively specify how to extract elements from a JSON document.

For example, given this document:

{"foo": {"bar": "baz"}}

The jmespath expression foo.bar will return "baz".

JMESPath also supports:

Referencing elements in a list. Given the data:

{"foo": {"bar": ["one", "two"]}}

The expression: foo.bar[0] will return "one". You can also reference all the items in a list using the * syntax:

{"foo": {"bar": [{"name": "one"}, {"name": "two"}]}}

The expression: foo.bar[*].name will return ["one", "two"]. Negative indexing is also supported (-1 refers to the last element in the list). Given the data above, the expression foo.bar[-1].name will return "two".

The * can also be used for hash types:

{"foo": {"bar": {"name": "one"}, "baz": {"name": "two"}}}

The expression: foo.*.name will return ["one", "two"].

Installation

You can install JMESPath from pypi with:

pip install jmespath

API

The jmespath.py library has two functions that operate on python data structures. You can use search and give it the jmespath expression and the data:

>>> import jmespath
>>> jmespath.search('foo.bar', {'foo': {'bar': 'baz'}})
'baz'

Similar to the re module, you can use the compile function to compile the JMESPath expression and use this parsed expression to perform repeated searches:

>>> import jmespath
>>> expression = jmespath.compile('foo.bar')
>>> expression.search({'foo': {'bar': 'baz'}})
'baz'
>>> expression.search({'foo': {'bar': 'other'}})
'other'

This is useful if you're going to use the same jmespath expression to search multiple documents. This avoids having to reparse the JMESPath expression each time you search a new document.

Options

You can provide an instance of jmespath.Options to control how a JMESPath expression is evaluated. The most common scenario for using an Options instance is if you want to have ordered output of your dict keys. To do this you can use either of these options:

>>> import collections
>>> import jmespath
>>> jmespath.search('{a: a, b: b}',
...                 mydata,
...                 jmespath.Options(dict_cls=collections.OrderedDict))


>>> import collections
>>> import jmespath
>>> parsed = jmespath.compile('{a: a, b: b}')
>>> parsed.search(mydata,
...               jmespath.Options(dict_cls=collections.OrderedDict))

Custom Functions

The JMESPath language has numerous built-in functions, but it is also possible to add your own custom functions. Keep in mind that custom function support in jmespath.py is experimental and the API may change based on feedback.

If you have a custom function that you've found useful, consider submitting it to jmespath.site and proposing that it be added to the JMESPath language. You can submit proposals here.

To create custom functions:

Create a subclass of jmespath.functions.Functions.
Create a method with the name _func_<your function name>.
Apply the jmespath.functions.signature decorator that indicates the expected types of the function arguments.
Provide an instance of your subclass in a jmespath.Options object.

Below are a few examples:

import jmespath
from jmespath import functions

# 1. Create a subclass of functions.Functions.
#    The functions.Functions base class has logic
#    that introspects all of its methods and automatically
#    registers your custom functions in its function table.
class CustomFunctions(functions.Functions):

    # 2 and 3.  Create a function that starts with _func_
    # and decorate it with @signature which indicates its
    # expected types.
    # In this example, we're creating a jmespath function
    # called "unique_letters" that accepts a single argument
    # with an expected type "string".
    @functions.signature({'types': ['string']})
    def _func_unique_letters(self, s):
        # Given a string s, return a sorted
        # string of unique letters: 'ccbbadd' ->  'abcd'
        return ''.join(sorted(set(s)))

    # Here's another example.  This is creating
    # a jmespath function called "my_add" that expects
    # two arguments, both of which should be of type number.
    @functions.signature({'types': ['number']}, {'types': ['number']})
    def _func_my_add(self, x, y):
        return x + y

# 4. Provide an instance of your subclass in an Options object.
options = jmespath.Options(custom_functions=CustomFunctions())

# Provide this value to jmespath.search:
# This will print 3
print(
    jmespath.search(
        'my_add(`1`, `2`)', {}, options=options)
)

# This will print "abcd"
print(
    jmespath.search(
        'foo.bar | unique_letters(@)',
        {'foo': {'bar': 'ccbbadd'}},
        options=options)
)

Again, if you come up with useful functions that you think make sense in the JMESPath language (and make sense to implement in all JMESPath libraries, not just python), please let us know at jmespath.site.

Specification

If you'd like to learn more about the JMESPath language, you can check out the JMESPath tutorial. Also check out the JMESPath examples page for examples of more complex jmespath queries.

The grammar is specified using ABNF, as described in RFC4234. You can find the most up to date grammar for JMESPath here.

You can read the full JMESPath specification here.

Testing

In addition to the unit tests for the jmespath modules, there is a tests/compliance directory that contains .json files with test cases. This allows other implementations to verify they are producing the correct output. Each json file is grouped by feature.

Discuss

Join us on our Gitter channel if you want to chat or if you have any questions.

jmespath.py's People


dpippen jamesls imclab sopel traviscross trevorrowe testingci areski stenlarsson bendalexis mitesh91 pombreda gregroberts iamjry erikbgithub knvpk moreati felixonmars yeyuguo aleh-rudzko jstewmon jean luoyufu aaronkalair adamchainz sumitnagal laxmanlax conley sanyer me2d optionalg salewski rflaperuta r3ap3r2004 tnx-limited jbryan drpoggi ophiry master-dhd jansel snjypl prateekmehta dnappier library-collections ksharpdabu gmega shigemk2 guineveresaenger tmshn lostinthefrost jonike bobh66 sksundaram-learning metalerk mrmichalis wjo1212 raymondseger ponach krismolendyke mmichael-s rob-smallshire moolighty hannes-ucsc edhodapp cs-christopher-carsey thuync krkredde hugovk devopseze scottpeterman thilo-maurer jiaju-yang sijoyjoseph hellowangsai flamingo0 apoclast martinzugnoni dc-avasilev inilien hermit-crab inception-insights entactogenesis vanroy86 shadiakiki1986 poeblu pmp-p munyola zh9210 huornlmj merlisk iwankgb stelligent wimg 53ningen rnampaos aptalca henning ethznn fnordahl hailiang-wang

jmespath.py's Issues

Add ability to merge JSON objects together.

Use case:

Given

{"a": {"key1": "val1", "key2": "val2"}, "key3": "val3", "b": {"key4": "val4"}}

I'd like to create:

{"key1": "val1", "key2": "val2", "key3": "val3", "key4": "val4"}

In python it's the equivalent of:

d = {}
d.update(current['a'])
d['key3'] = current['key3']
d.update(current['b'])

Possible syntax:

Add a merge function. It would be variadic: it takes multiple JSON objects and produces a merged JSON object.

merge(a, {key3: key3}, b)

Add syntax to multiselect-hash. We could borrow python's **kwargs:

{**a, key3: key3, **b}
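The semantics being proposed can be sketched in plain Python. This is an illustration of the requested behavior only, not the library's implementation; the helper name `merge` and the "later keys win" rule are assumptions made for the example.

```python
def merge(*objects):
    """Merge any number of dicts left to right; later keys win (assumed rule)."""
    merged = {}
    for obj in objects:
        merged.update(obj)
    return merged


current = {
    "a": {"key1": "val1", "key2": "val2"},
    "key3": "val3",
    "b": {"key4": "val4"},
}

# Mirrors the proposed expression merge(a, {key3: key3}, b).
result = merge(current["a"], {"key3": current["key3"]}, current["b"])
print(result)
```

A variadic function keeps the grammar unchanged, whereas the `**` form would require new multiselect-hash syntax.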

Index expression following filter expression fails

Is this a bug? I.e. search('[0]', search('[?a==1].a', x)) works, but search('[?a==1].a.[0]', x) doesn't.

$ python
Python 2.7.10 (default, Jul 25 2015, 21:04:56) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from jmespath import search
>>> from json import loads
>>> x = loads('[{"a":1}]')
>>> search('[?a==`1`]', x)
[{u'a': 1}]
>>> search('[0]', search('[?a==`1`].a', x))
1
>>> search('[?a==`1`].a.[0]', x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/__init__.py", line 11, in search
    return parser.Parser().parse(expression).search(data)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 81, in parse
    parsed_result = self._do_parse(expression)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 89, in _do_parse
    return self._parse(expression)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 104, in _parse
    parsed = self._expression(binding_power=0)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 117, in _expression
    left = nud_function(left_token)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 159, in _token_nud_filter
    return self._token_led_filter(ast.identity())
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 280, in _token_led_filter
    right = self._parse_projection_rhs(self.BINDING_POWER['filter'])
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 381, in _parse_projection_rhs
    right = self._parse_dot_rhs(binding_power)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 403, in _parse_dot_rhs
    return self._expression(binding_power)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 126, in _expression
    left = led(left)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 236, in _token_led_dot
    right = self._parse_dot_rhs(self.BINDING_POWER['dot'])
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 406, in _parse_dot_rhs
    return self._parse_multi_select_list()
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 342, in _parse_multi_select_list
    expression = self._expression()
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 117, in _expression
    left = nud_function(left_token)
  File "/home/bhyde/.pyenv/versions/2.7.10/lib/python2.7/site-packages/jmespath/parser.py", line 437, in _error_nud_token
    token['type'], 'Invalid token.')
jmespath.exceptions.ParseError: Invalid token.: Parse error at column 13, token "0" (NUMBER), for expression:
"[?a==`1`].a.[0]"
              ^
>>>

ps. Jmespath is very cool and useful.

Indices convert wildcards to null

Given: foo[*].bar[2]
With data:

{
    "foo": [
        {
            "bar": [
                "one",
                "two"
            ]
        },
        {
            "bar": [
                "three",
                "four"
            ]
        },
        {
            "bar": [
                "five"
            ]
        }
    ]
}

The compliance tests say the results should be null. However, I think it should be an empty list because a result of a wildcard projection should not convert the wildcard root result to null.

From the docs:

Note that if any subsequent expression after a wildcard expression returns a null value, it is omitted from the final result list.
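The reading argued for here can be sketched in plain Python: the right-hand side is applied to each projected element, and null results are dropped. The helpers `project` and `index_or_none` are hypothetical names for this illustration, not part of jmespath.py.

```python
def project(items, rhs):
    """Apply rhs to each element and omit None results, as the quoted note describes."""
    results = []
    for item in items:
        value = rhs(item)
        if value is not None:
            results.append(value)
    return results


def index_or_none(seq, i):
    """Return seq[i], or None when the index is out of range."""
    return seq[i] if -len(seq) <= i < len(seq) else None


data = {
    "foo": [
        {"bar": ["one", "two"]},
        {"bar": ["three", "four"]},
        {"bar": ["five"]},
    ]
}

# foo[*].bar[2]: no element has a third item, so every result is dropped.
print(project(data["foo"], lambda el: index_or_none(el["bar"], 2)))  # []

# foo[*].bar[0]: every element contributes its first item.
print(project(data["foo"], lambda el: index_or_none(el["bar"], 0)))
```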

CLI has the wrong shebang

When I install the package in my virtualenv, jp does not work as it has this shebang:

#!/Users/jamessar/.virtualenvs/a16cce535ca9ee79/bin/python

Custom functions

I installed jmespath from pypi (0.9.0) and realized after a bit that the changes to allow custom functions to be registered aren't yet merged. Any idea when that might happen?

Thanks,

Zac

Added "current index" support for projections

It is sometimes useful, when in a projection, to know which element of the thing being projected you are evaluating. When projecting on an array, the index would be the current index of the array (starting at 0), and when projecting an object, the index would be the current key of the object. I would suggest adding a new token to the grammar to represent the "current_node". The character I would suggest is #.

I can flesh this out with more details and a JEP if there's interest in something like this.

Recursive traversal?

Is it possible to recursively traverse the json tree with jmespath?

for example:

{
    "one": {
        "two": {
            "three": [{
                "four": {
                    "name": "four1_name"
                }
            }, {
                "four": {
                    "name": "four2_name"
                }
            }]
        }
    }
}

Parsing it with something like ..four.name instead of one.two.three[].four.name?
For example how xpath would allow you to //four/name/text() instead of one/two/three/four/name/text()

Is that possible to do with Jmespath?
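For comparison, the XPath-style descent being asked about can be sketched outside of JMESPath in a few lines of Python. The generator `find_key` is a hypothetical helper written for this illustration; it says nothing about what the JMESPath language itself supports.

```python
def find_key(node, key):
    """Yield every value stored under `key`, at any depth of dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending so nested matches are found as well.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


doc = {
    "one": {
        "two": {
            "three": [
                {"four": {"name": "four1_name"}},
                {"four": {"name": "four2_name"}},
            ]
        }
    }
}

# Rough equivalent of the XPath //four/name lookup.
names = [four["name"] for four in find_key(doc, "four")]
print(names)  # ['four1_name', 'four2_name']
```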

No copyright notice

This repository declares a license (MIT/Expat in LICENSE.txt and a MIT declaration in PKG-INFO) but does not claim copyright. There is no copyright notice. So who owns the copyright and is issuing the license?

Could an explicit copyright notice be added, please? This would make the copyright and licensing situation unambiguous. Thanks!

Invalid compliance test for a number key?

https://github.com/boto/jmespath/blob/develop/tests/compliance/basic.json#L87

Given: {"foo": {"1": ["one", "two", "three"], "-1": "bar"}}

Expression: foo.-1

I think that this compliance test is invalid because -1 should be parsed as a number. Because JSON objects cannot have number indices, this test should raise an error. I think this test should be updated to use a quoted "-1" which matches up with the string key of "-1".

foo."-1"

'Filter Projections' do not support int?

code:

from jmespath import search
test = {
    'data': {
        'fields': [
            {
                'class_id': 100,
                'orders': [1, 2, 3]
            },
            {
                'class_id': 200,
                'orders': [3, 2, 1]
            }
        ]
    }
}
print search("data.fields[?class_id=='100'].orders", test)
[]

debug:

print search("data.fields[?class_id==100].orders", test)

jmespath.exceptions.ParseError: invalid token: Parse error at column 23, token "100" (NUMBER), for expression:
"data.fields[?class_id==100].orders"

Must be that:

from jmespath import search
test = {
    'data': {
        'fields': [
            {
                'class_id': '100',
                'orders': [1, 2, 3]
            },
            {
                'class_id': 200,
                'orders': [3, 2, 1]
            }
        ]
    }
}
print search("data.fields[?class_id=='100'].orders", test)
[[1, 2, 3]]

This looks like the same bug as #132.

Hypothesis test failure: TypeError: unorderable types

From Travis:

AssertionError: AssertionError: Non JMESPathError raised: unorderable types: list() < NoneType()
Falsifying example: test_search_api(expr='@<A', data=[])

From the spec:

Evaluating any other type with a comparison operator will yield a null value,
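The behavior the spec sentence describes can be sketched as a guard around the ordering operators. This is a plain-Python illustration under the assumption that only numbers are orderable; `ordering_compare` is a hypothetical helper, not the library's code.

```python
import operator
from numbers import Number


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


def ordering_compare(op, left, right):
    """Return op(left, right) for two numbers, otherwise None (JSON null)."""
    if not (_is_number(left) and _is_number(right)):
        return None
    return op(left, right)


# The falsifying example compares a list with None; no TypeError escapes.
print(ordering_compare(operator.lt, [], None))  # None
print(ordering_compare(operator.lt, 1, 2))      # True
```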

to_number does not correctly handle floating point numbers without dots

The JMESPath specification of the to_number function states that everything matching the json-number production must be supported. However, this implementation will handle floating point numbers incorrectly if they do not contain a dot.

Please consider the following example

{"x": "1e+21"}

When searched with the JMESPath

x | to_number(@)

this implementation will return None since the number is not recognized as such. The parser will only try parsing a floating point number if it contains a dot, as can be seen in the code

Therefore, the following (incorrect) behavior can be observed

$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import jmespath
>>> data = {"x": "1e+21"}
>>> jmespath.search("x | to_number(@)", data)
>>>

Please note the None return value which is not printed by the REPL.

The JSON standard allows such numbers, as per http://www.json.org/, where the grammar permits the form number := int exp, with exp := e digits and e := 'e+'.

My package version is 0.9.1.
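A possible fix is to decide between int and float from the full JSON number grammar rather than from the presence of a dot. The sketch below is an assumption about how that could look; `JSON_NUMBER` and this standalone `to_number` are illustrative names, not the code in jmespath.py.

```python
import re

# int, optional fraction, optional exponent, following the json.org number grammar.
JSON_NUMBER = re.compile(r'-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?')


def to_number(value):
    """Convert a JSON-number string to int or float; return None otherwise."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return value
    if not isinstance(value, str):
        return None
    match = JSON_NUMBER.fullmatch(value)
    if match is None:
        return None
    # A fraction or an exponent makes the value a float, even without a dot.
    if match.group(2) or match.group(3):
        return float(value)
    return int(value)


print(to_number("1e+21"))
print(to_number("12"))
print(to_number("not a number"))  # None
```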

Non-rooted expressions support

I may just be missing it, but it doesn't seem to be possible to do the equivalent of // in XPath. I.e. find the desired expression anywhere in the supplied JSON versus from the root of it. If not, can that be made a feature request?

The use case is when you are trying to find subsets of JSON in a document whose overall structure you cannot predict in advance.

Multiselect list parsing accepts invalid expression

This should be a syntax error:

$ jp.py --ast '[foo bar]'
{'type': 'multi_select_list', 'children': [{'type': 'field', 'children': [], 'value': 'foo'}, {'type': 'field', 'children': [], 'value': 'bar'}]}

Add License to Setup.py

The license isn't properly reported in the setup.py file or on pip, so programmatically it looks like this package has no license. Notice the License icon column here: https://requires.io/github/Miserlou/Zappa/requirements/?branch=master - jmespath is the only one missing.

Allow "." to be escaped

A lot of JSON keys contain dots. I suggest adding a "\" escape character for dots so that they can be embedded within keys.

Minus (-) char in expression raises LexerError exception

Hi, if the path contains a minus character, the following error is raised:

jmespath.exceptions.LexerError: Bad jmespath expression: Unknown character:
statistics.destinations-voice
                       ^

Custom functions

What are your thoughts about adding a method to register custom functions directly into RuntimeFunctions class in functions.py?

JMESPath is almost good enough to use as a domain-specific language for general object transformation. You can sneak literals into the multi-select hash. You can filter for values and transform them to booleans using the <= and == operators. There's some support for making sure values are numbers.

However, I don't see any way to do something like "if value is x, return y", where you show a literal if a condition matches. There's no way to convert conditions to an arbitrary literal: if a value in a multi-select hash is going to be a literal, it has to be the same value no matter what.

I can see a possible workaround if custom functions were supported in JMESPath. E.g. if I implement an "if" function, I can do something like:

search("if(bar==`1`, `hello`, `world`)", {'bar': '1'})

This would return the literal hello if the bar key is 1; otherwise it returns world. The only issue is that the current Python implementation makes this hacky. You have to override multiple classes in functions.py, and override the TreeInterpreter and ParsedResult classes as well.

I think if custom functions were desired, it would be much more elegant if there is a method to register them directly into the RuntimeFunctions in functions.py, rather than either forcing a fork or overriding a litany of classes.

What do you think?

Specify a website for the project linked to the docs

Change "or" to "||"

Because JMESPath is related to JSON data and not married to Python, I think it makes more sense for the OR token to be "||" rather than " or ". This also has the added benefit of not being whitespace sensitive.

Can't flatten sub-sub-lists?

Let's say I have this list: [[1, 2, 3, [4]], [5, 6, 7, [8, 9]]]

I want to extract this output: [[1, 2, 3, 4], [5, 6, 7, 8, 9]]

It seems like this query should work: [*][], i.e., "project the input into a list, and flatten each element of the outer list", but it doesn't. I get [1, 2, 3, [4], 5, 6, 7, [8, 9]], which is the same as if I had just passed in []. Oddly, [*][0] does return what I would expect: [1, 5], which is the first element of each element of the outer list. Why is it that in the [*][0] expression the [0] operates on each element of the top-level list, while in [*][] the [] seems to operate on the list as a whole? I would expect that behavior out of [*] | []. Similarly, [0][] returns [1, 2, 3, 4] and [1][] returns [5, 6, 7, 8, 9].

I see the same behavior both on the released 0.7.1 and the current develop branch.
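The two behaviors being contrasted can be reproduced in plain Python. `flatten_once` is a hypothetical helper that removes a single level of nesting; it is a sketch of the semantics described in this report, not the library's flatten implementation.

```python
def flatten_once(items):
    """Merge one level of nested lists into the parent list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


data = [[1, 2, 3, [4]], [5, 6, 7, [8, 9]]]

# Desired: flatten each element of the outer list separately.
per_element = [flatten_once(inner) for inner in data]
print(per_element)  # [[1, 2, 3, 4], [5, 6, 7, 8, 9]]

# Observed: flatten applied to the outer list as a whole.
whole = flatten_once(data)
print(whole)  # [1, 2, 3, [4], 5, 6, 7, [8, 9]]
```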

Thanks!

--Joel

jp script has wrong shebang (in jmespath-0.9.2-py2.py3-none-any.whl on pypi)

This was previously fixed in issue #90, but it seems to have returned. The jmespath-0.9.2-py2.py3-none-any.whl wheel contains a jp file whose shebang line looks like it comes from the developer's virtualenv (#!/Users/jamessar/.virtualenvs/a16cce535ca9ee79/bin/python).

The problem doesn't seem to be the presence of the jp command; it's just that it has the wrong shebang line.

Possible solutions:

Add a jp script that calls, or is the same as, jp.py (the current jp seems to differ from jp.py),

or just have setup.py ignore jp, or remove whatever is causing this file to be included in builds.

Workaround for now:

Install jmespath with pip3.6 install jmespath --no-use-wheel

Ability to set and delete based on a jmespath

For example, if I had the following dict:

data = {
    "foo": {
        "bar": [
            {"name": "one"}, 
            {"name": "two"}
        ]
    }
}

I'd like to be able to change the value of "one" to "three" by using something like:

new_data = jmespath.replace('foo.bar[0].name', 'three', data)

And then remove the first item of the "bar" list:

new_data = jmespath.remove('foo.bar[0]', data)

Raise exception when match failure, to distinguish from “matched, value is None”

The jmespath.search function returns None in two distinct cases:

>>> import jmespath

>>> foo = {'bar': {'lorem': 13, 'ipsum': None}}
>>> repr(jmespath.search('bar.lorem', foo))
'13'
>>> repr(jmespath.search('bar.ipsum', foo))    # Path matches, value None
'None'
>>> repr(jmespath.search('dolor', foo))    # Path does not match
'None'

It appears the JMESPath search API returns None in these two distinct cases. How can the caller know the difference between them?

I would expect no return in the case of a match failure, and instead an exception (such as KeyError or ValueError).
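The distinction being requested can be sketched with a sentinel default. This is a plain-Python illustration for simple dotted paths only; `lookup` and `_MISSING` are hypothetical names and not part of the jmespath API.

```python
_MISSING = object()  # unique sentinel, distinct from any real value including None


def lookup(path, data, default=_MISSING):
    """Follow a dotted path; raise KeyError if it does not match and no default is given."""
    current = data
    for key in path.split('.'):
        if not isinstance(current, dict) or key not in current:
            if default is _MISSING:
                raise KeyError(path)
            return default
        current = current[key]
    return current


foo = {'bar': {'lorem': 13, 'ipsum': None}}

print(repr(lookup('bar.ipsum', foo)))  # None: the path matches and the value is None

try:
    lookup('dolor', foo)
except KeyError as exc:
    print('no match:', exc)
```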

`jp.py` installed from `pip` has `#!/Users/jamessar/.virtualenvs/a16cce535ca9ee79/bin/python` as the interpreter

% mkvirtualenv jp-test
...
% pip install jmespath
...
% head -1 "$( which jp )"
#!/Users/jamessar/.virtualenvs/a16cce535ca9ee79/bin/python

I know I'm not jamessar! 😁 But I'm not sure how that happens. 😕 It doesn't look like it comes from the PyPI download:

% curl >jmespath-0.7.1.tar.gz https://pypi.python.org/packages/source/j/jmespath/jmespath-0.7.1.tar.gz#md5=ca76cb014165306c1eded212cfb78cf5
% md5 jmespath-0.7.1.tar.gz
MD5 (jmespath-0.7.1.tar.gz) = ca76cb014165306c1eded212cfb78cf5
% tar xpvf jmespath-0.7.1.tar.gz
...
% head -1 jmespath-0.7.1/bin/jp.py
#!/usr/bin/env python
% find jmespath-0.7.1 -type f | xargs grep -l jamessar
% # nothing found

Regression in 0.4.0, wrong binding power for LBRACKET

The RBP of LBRACKET is wrong. It needs to be a higher BP to ensure it is consumed when calling _parse_dot_rhs. Testcase:

  {
    "given": {
      "baz": "other",
      "foo": [
        {"bar": 1}, {"bar": 2}, {"bar": 3}, {"bar": 4}, {"bar": 1, "baz": 2}
      ]
    },
    "cases": [
      {
        "expression": "foo[?bar==`1`].bar[0]",
        "result": []
      }
    ]
  }

This fails in 0.4.0 and returns 1, because the outermost AST node is IndexExpression(FilterExpression(...), Index(0)) which is wrong.

Grammar on README is incorrect

There seem to be a couple of typos in your grammar on the README. Also, based on your compliance tests, there seem to be a few things missing. How does this look?

expression : expression ' or ' expression
           | expression '.' expression
           | expression '[' (number | wildcard) ']'
           | expression '.' wildcard
           | expression '.' number
           | identifier
identifier : [a-zA-Z_][a-zA-Z0-9]+
number : -?[1-9]+
wildcard : '*'

Also, according to your grammar, identifier has to be 2 characters or more. Wouldn't [a-zA-Z_][a-zA-Z0-9]* be better?

Are number or wildcard allowed at the root of the expression? If so, then the grammar could be changed to:

expression : expression ' or ' expression
           | expression '.' expression
           | expression '[' (number | wildcard) ']'
           | identifier
           | number
           | wildcard
identifier : [a-zA-Z_][a-zA-Z0-9]*
number : -?[1-9]+
wildcard : '*'
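The identifier point is easy to check with Python's re module, treating the two productions as regular expressions. This is only a quick sanity check of the patterns as written above, not a statement about the actual JMESPath lexer.

```python
import re

old_identifier = re.compile(r'[a-zA-Z_][a-zA-Z0-9]+')
new_identifier = re.compile(r'[a-zA-Z_][a-zA-Z0-9]*')
number = re.compile(r'-?[1-9]+')

# With '+', a one-character identifier such as 'a' is rejected.
print(old_identifier.fullmatch('a'))        # None
print(bool(new_identifier.fullmatch('a')))  # True

# Side observation: as written, the number pattern rejects any digit string containing 0.
print(number.fullmatch('10'))  # None
```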

Check one value exists in a list with specific output

I have the following use case:

Call DescribeAutoScalingGroups to get all ASG information,
Filter in the list of LoadBalancerNames, if one contains "XX-LB",
Output the AutoScalingGroupName

It seems that jp.py cannot support this case directly. Am I right?

Raw literal string does not handle escaped single quote

When using a raw string literal ('foobar'), escaping the single quote results in the \ char still present in the output. For example:

import jmespath
print(jmespath.compile(r"'foo\'bar'"))
# prints: {'type': 'literal', 'children': [], 'value': "foo\\'bar"}

The expected result should be the single quote without the escaped char:

{'type': 'literal', 'children': [], 'value': "foo'bar"}

Consider the case when using double quotes or literals. Both of these handle escaping the delimiting character:

print(jmespath.compile(r'"foo\"bar"'))
# prints: {'type': 'field', 'children': [], 'value': u'foo"bar'}

print(jmespath.compile(r'`foo\`bar`'))
{'type': 'literal', 'children': [], 'value': u'foo`bar'}

When numbers become LONG search won't match records

If the data being searched contains numeric values that are large enough to become LONG (depending on the platform's sys.maxint), then jmespath.search can't find them. To illustrate, let's consider the following simple query that works:

>>> jmespath.search('[?b > `2`]', [{"a": 1, "b": 2}, {"a": 1, "b": 3}])
[{'a': 1, 'b': 3}]

Then simply convert the numeric data type of the field being searched to long and note the result is no longer found:

>>> jmespath.search('[?b > `2`]', [{"a": 1, "b": 2}, {"a": 1, "b": 3L}])
[]

Wildcard issue

jmespath.search("a.*.c", { "a" : { "b" : { "c" : { "d" : "AAAA" } } } } )
[{'d': 'AAAA'}]
jmespath.search("a.*.c.d", { "a" : { "b" : { "c" : { "d" : "AAAA" } } } } )

"a.*.c" works but "a.*.c.d" does not work. It should work like "a.b.c.d" and give "AAAA" back.

`*.*` is valid

Given: *.*.foo

With data:

{
    "top1": {
        "sub1": {
            "foo": "one"
        }
    },
    "top2": {
        "sub1": {
            "foo": "two"
        }
    },
    "top3": {
        "sub3": {
            "notfoo": "notfoo"
        }
    }
}

The compliance tests say the result should be null. However, I think the result should be:

[
    [ "one"],
    ["two"],
    []
]

*.*.foo says: Iterate over the results of each key of the outer element. For each of those results, iterate over each key of those results. For each result, grab the "foo" key. The fact that the intermediate result is an array of the values at each key is irrelevant because it's a projection.

The reason I think this is valid is that * creates a projection over the keys of the current result. Because the values in each projection are hashes, the next .* creates a projection of the sub-elements of each key in the hash created in the outer hash. The projection is finally terminated when told to grab each foo key.

Unicode equal comparison failed to convert both arguments to Unicode

I'm a Chinese Python developer using Python 2.7, and I have a unicode problem.

{
  "machines": [
    {"name": "a", "state": "**"},
    {"name": "b", "state": "你好"},
    {"name": "b", "state": "谢谢"}
  ]
}

I use a filter projection:

machines[?state=='**'].name

I get this warning:

visitor.py:10: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

Of course I can edit the function _equals like this, and it works well:

def _equals(x, y):
    if _is_special_integer_case(x, y):
        return False
    else:
        if type(y) == str:
            y = y.decode('utf-8')
        return x == y

But is there another, non-invasive way to fix this 'bug'?

Thanks

subexpression is one word

https://github.com/boto/jmespath/blob/develop/jmespath/ast.py#L77

This should be subexpression IMO

max/min/max_by/min_by functions should support strings

I think the current inconsistency with sort/sort_by is confusing. You can sort/sort_by with arrays of strings, but you cannot use them with the min_/max_ functions. I'd propose updating the spec to support strings:

# Works
>>> jmespath.search('sort(@)', ['b', 'a', 'd', 'c'])
['a', 'b', 'c', 'd']

# Fails
>>> jmespath.search('max(@)', ['b', 'a', 'd', 'c'])
jmespath.exceptions.JMESPathTypeError: In function max(), invalid type for value: b, expected one of: ['array-number'], received: "str"

cc @mtdowling

sort_by and the list of values

The method signature and implementation requirements for sort_by is kind of awkward. Forcing the internal array to be a list of strings or a list of numbers but allowing either type of list is awkward to implement and probably hard for users to understand. Why can't it just be an array of strings and numbers and then implementations use a natural sorting algorithm?

jmespath does not work if json key has a dash in the name

Observe:

>>> d = {'a': 'b', 'a-b': 'b-c'}
>>> jmespath.search('a', d)
'b'
>>> jmespath.search('a-b', d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dpopes/scratch/python/virtualenv_run/lib/python2.7/site-packages/jmespath/__init__.py", line 12, in search
    return parser.Parser().parse(expression).search(data, options=options)
  File "/Users/dpopes/scratch/python/virtualenv_run/lib/python2.7/site-packages/jmespath/parser.py", line 87, in parse
    parsed_result = self._do_parse(expression)
  File "/Users/dpopes/scratch/python/virtualenv_run/lib/python2.7/site-packages/jmespath/parser.py", line 95, in _do_parse
    return self._parse(expression)
  File "/Users/dpopes/scratch/python/virtualenv_run/lib/python2.7/site-packages/jmespath/parser.py", line 108, in _parse
    self._tokens = list(self.tokenizer)
  File "/Users/dpopes/scratch/python/virtualenv_run/lib/python2.7/site-packages/jmespath/lexer.py", line 71, in tokenize
    yield {'type': 'number', 'value': int(buff),
ValueError: invalid literal for int() with base 10: '-'

Is that behavior by design?

sum() would be a useful built-in function

In AWS CLI, a query like:

aws ec2 describe-volumes --query Volumes[].Size --filters Name=status,Values=available

returns a list of numbers representing the GB's used by EBS volumes that are not attached to any instances. Being able to sum that list in the query would be useful.

Add CLI support for JSON text sequences

RFC 7464 specifies a (fairly simple) format for sequences of JSON values, an alternative to having a giant array of values.

It's got support already in a bunch of places, including within the Python ecosystem (e.g. it is the output format of twisted.logger).

For comparison, e.g. jq already supports jq --seq to process JSON seqs.
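Reading the format is small enough to sketch. In RFC 7464 each value is preceded by an ASCII record separator (0x1E) and followed by a line feed; the helper below, a hypothetical `parse_json_seq`, handles well-formed input only and skips the RFC's recovery rules for truncated records.

```python
import json

RS = '\x1e'  # ASCII record separator that starts every record


def parse_json_seq(text):
    """Parse a well-formed JSON text sequence into a list of Python values."""
    values = []
    for record in text.split(RS):
        record = record.strip()
        if record:  # the chunk before the first RS is empty
            values.append(json.loads(record))
    return values


stream = '\x1e{"a": 1}\n\x1e[1, 2]\n\x1e"three"\n'
print(parse_json_seq(stream))  # [{'a': 1}, [1, 2], 'three']
```

Each parsed value could then be passed to a compiled JMESPath expression in turn, which is roughly what a --seq style CLI mode would do.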

Allow root level wildcards

From your examples, it looks like root level wildcards are not supported.

ValueError: invalid literal for int() with base 10: '-'

Running 0.9.0 I am getting "ValueError: invalid literal for int() with base 10: '-'" errors when trying the following:

JSON

{
    "minimum-stability": "dev",
    "prefer-stable": true,
}

Code

result = jmespath.search('prefer-stable', composer_json)

Stack Trace

Traceback (most recent call last):
  File "./validate.py", line 98, in <module>
    result = jmespath.search('prefer-stable', composer_json)
  File "/Users/dave/.virtualenvs/composer-validator/lib/python2.7/site-packages/jmespath/__init__.py", line 12, in search
    return parser.Parser().parse(expression).search(data, options=options)
  File "/Users/dave/.virtualenvs/composer-validator/lib/python2.7/site-packages/jmespath/parser.py", line 87, in parse
    parsed_result = self._do_parse(expression)
  File "/Users/dave/.virtualenvs/composer-validator/lib/python2.7/site-packages/jmespath/parser.py", line 95, in _do_parse
    return self._parse(expression)
  File "/Users/dave/.virtualenvs/composer-validator/lib/python2.7/site-packages/jmespath/parser.py", line 108, in _parse
    self._tokens = list(self.tokenizer)
  File "/Users/dave/.virtualenvs/composer-validator/lib/python2.7/site-packages/jmespath/lexer.py", line 71, in tokenize
    yield {'type': 'number', 'value': int(buff),
ValueError: invalid literal for int() with base 10: '-'

The JSON above is a snippet from a composer.json file. It passes jsonlint and works with the jmespath tutorial.

Let me know if you need any more info.
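For context, the JMESPath grammar only allows letters, digits and underscores in an unquoted identifier, so a key containing a hyphen has to be written as a quoted identifier (a JSON-style double-quoted string). The helper below is a hypothetical sketch, not part of jmespath.py, that applies that rule to a single key.

```python
import json
import re

# Unquoted identifiers: a letter or underscore, then letters, digits or underscores.
_UNQUOTED_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(key):
    """Return key as a JMESPath identifier, adding double quotes when required."""
    if _UNQUOTED_IDENTIFIER.fullmatch(key):
        return key
    # json.dumps adds the double quotes and escapes any embedded quotes.
    return json.dumps(key)


expression = quote_identifier("prefer-stable")
print(expression)
```

The resulting expression includes the double quotes, so the call from the report would become jmespath.search('"prefer-stable"', composer_json).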

When defining custom functions, built-in functions stop working

code:

# __version__ = '0.9.2'
import jmespath
from jmespath import functions

class CustomFunctions(functions.Functions):
    @functions.signature({'types': ['number']}, {'types': ['number']})
    def _func_my_add(self, x, y):
        return x + y

# options = jmespath.Options(custom_functions=CustomFunctions())

print jmespath.search('length(`test`)', {})

error:

TypeError: unbound method _func_length() must be called with CustomFunctions instance as first argument (got Functions instance instead)

I only define CustomFunctions here; I do not pass it on every call. This is a very common pattern in a big framework.

Preserve JSON order

Hello,

I'm not sure it's a bug, but it would be nice if results kept the order of declaration:
"horaire[].{jour : jourOuverture.content, ouvertmatin : heureOuvertureAM.content, fermmatin : heureFermetureAM.content, ouvertaprem : heureOuverturePM.content, fermaprem : heureFermeturePM.content }"
does not keep the order of the attributes in Python.
The result can come back as { ouvertaprem : heureOuverturePM.content, jour : jourOuverture.content, ouvertmatin : heureOuvertureAM.content, fermaprem : heureFermeturePM.content, fermmatin : heureFermetureAM.content }

Trying to send pull request - seems to fail because of an unrelated issue

#129

A small fix that changes None to NoneType.

When sending the PR I get an error which doesn't appear to be related to this change.

Update CHANGELOG.rst for latest 0.9.x release

Seems it only covers up to 0.8.0 at the moment.

0.9.1 date comparisons not working

From https://gitter.im/jmespath/chat:

aws ec2 describe-snapshots --owner-id self --query 'Snapshots[?StartTime>='2017--02-01']'
The above command returns an empty set ('[]') when run with v0.9.1, and a list of the appropriate snapshots when run with v0.9.0.

I have the same problem fetching AMIs by date. It appears date comparisons (except ==) are broken in 0.9.1 (and work fine in 0.9.0).
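One version-independent workaround is to do the date filter in Python after fetching the unfiltered list. The sketch below relies on the fact that ISO 8601 timestamps in a uniform format sort chronologically when compared as plain strings; the snapshot records are invented sample data.

```python
# Hypothetical snapshot records, shaped like the describe-snapshots output.
snapshots = [
    {"SnapshotId": "snap-a", "StartTime": "2017-01-15T08:00:00.000Z"},
    {"SnapshotId": "snap-b", "StartTime": "2017-02-01T00:00:00.000Z"},
    {"SnapshotId": "snap-c", "StartTime": "2017-03-20T12:30:00.000Z"},
]

cutoff = "2017-02-01"

# Plain string comparison is enough because every timestamp uses the same layout.
recent = [snap["SnapshotId"] for snap in snapshots if snap["StartTime"] >= cutoff]
print(recent)
```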

avg function throws ZeroDivisionError

According to avg function specifications (http://jmespath.org/specification.html#avg) - "An empty array will produce a return value of null."
Instead, for json like this:

{"data": [1,2,3,4], "zero_data": []}

the jmespath expression zero_data | avg(@) raises an exception:

lib/python2.7/site-packages/jmespath/functions.pyc in _func_avg(self, arg)
    177     @builtin_function({'types': ['array-number']})
    178     def _func_avg(self, arg):
--> 179         return sum(arg) / float(len(arg))
    180
    181     @builtin_function({'types': [], 'variadic': True})

ZeroDivisionError: float division by zero
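The behaviour the specification asks for amounts to a single guard before the division. The function below is a standalone sketch of that guard, not the library's actual patch; safe_avg is a hypothetical name.

```python
def safe_avg(values):
    """Average a list of numbers, returning None (JSON null) for an empty list."""
    if not values:
        return None
    return sum(values) / float(len(values))


document = {"data": [1, 2, 3, 4], "zero_data": []}
print(safe_avg(document["data"]))
print(safe_avg(document["zero_data"]))
```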

OR (`||`) behaviour seems odd

From http://jmespath.org/specification.html#or-expressions:

An or expression will evaluate to either the left expression or the right expression. If the evaluation of the left expression is not false it is used as the return value. If the evaluation of the right expression is not false it is used as the return value. If neither the left or right expression are non-null, then a value of null is returned. A false value corresponds to any of the following conditions:

False boolean: false

search('a == `foo`', {"a": "foo", "b": "bar"})  # => True
search('b == `bar`', {"a": "foo", "b": "bar"})  # => True
search('a == `foo` || b == `bar`', {"a": "foo", "b": "bar"})  # => False

I would have expected a True value for all three expressions.
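For reference, the quoted rule can be written out in plain Python. This is only a sketch of the wording above, not the library's implementation; the full list of false values (empty list, empty object, empty string, false, null) is taken from the JMESPath specification rather than from the excerpt, and both function names are hypothetical.

```python
def is_false(value):
    """Return True for the values the JMESPath specification treats as false."""
    # Note that 0 is not a false value in JMESPath, hence the identity checks.
    return (
        value is None
        or value is False
        or value == ""
        or value == []
        or value == {}
    )


def or_expression(left, right):
    """Evaluate 'left || right' from two already-evaluated operands."""
    if not is_false(left):
        return left
    if not is_false(right):
        return right
    return None


# Both comparisons in the report evaluate to True on their own.
result = or_expression(True, True)
print(result)
```

Under this reading, combining two comparisons that are each true gives True, which matches the expectation in the report.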

Escape test escaping issue

The following test does not work correctly when run using PHP. Maybe this is something to do with how PHP handles string escaping when parsing JSON.

https://github.com/boto/jmespath/blob/develop/tests/compliance/escape.json#L37

In PHP, this creates a key containing """", which is not a valid jmespath identifier. It should be "\"\"\"".

For example (php -a):

php > var_dump(json_decode($j, true));
array(2) {
  'expression' =>
  string(5) """""""
  'result' =>
  string(11) "threequotes"
}

Does this particular test case pass in Python?

'pip install --upgrade awscli' makes jp command fail

$ jp -h
bash: /usr/local/bin/jp: /Users/jamessar/.virtualenvs/a16cce535ca9ee79/bin/python: bad interpreter: No such file or directory