russellballestrini / nested-lookup

moved: https://git.unturf.com/python/nested-lookup/

Home Page: https://git.unturf.com/python/nested-lookup/

Python 100.00%

nested-lookup's Introduction

nested_lookup

This repo was archived in favor of GitLab. Permanent redirect:

https://git.unturf.com/python/nested-lookup/

Make working with JSON, YAML, and XML document responses fun again!

The nested_lookup package provides many Python functions for working with deeply nested documents. A document in this case is a mixture of Python dictionary and list objects, typically derived from YAML or JSON.

nested-lookup's Issues

Update dictionaries by update rule map

When many keys need updating, I have to call nested_update once per key.
That is inefficient, because all of the updates could be applied in a single search.

It would be a really nice improvement if nested_update supported
multiple updates in a single batch,
for example through a dictionary that contains the update rules:

# Proposed API: multiple_update_by does not exist yet.
update_rules = {"name": fake.first_name(), "last_name": fake.last_name(), "id": random.randrange(100)}

result = nested_update(my_document, multiple_update_by=update_rules)
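
A minimal sketch of what such a batch update could look like, assuming a hypothetical helper named nested_update_many (it is not part of nested_lookup). It walks the document once and replaces the value of every key listed in the rules dictionary. Replaced values are not searched further, which is a choice made for this sketch only.

```python
def nested_update_many(document, rules):
    """Return a copy of `document` with every key found in `rules` replaced.

    Hypothetical helper for illustration; not part of nested_lookup.
    """
    if isinstance(document, list):
        return [nested_update_many(item, rules) for item in document]
    if isinstance(document, dict):
        updated = {}
        for key, value in document.items():
            if key in rules:
                # Replace the value; the old value is not searched further.
                updated[key] = rules[key]
            else:
                updated[key] = nested_update_many(value, rules)
        return updated
    # Scalars are returned unchanged.
    return document


my_document = {"name": "a", "people": [{"name": "b", "id": 0, "age": 3}]}
update_rules = {"name": "Ada", "id": 42}
result = nested_update_many(my_document, update_rules)
```

Because the sketch builds a new structure, my_document itself stays untouched.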

Have to install "Six" module manually

The 'six' module is imported in nested_lookup.py (from six import iteritems), but it isn't installed along with this module.

Annotations break compatibility with Python 2.x

It looks like version 0.2.20 introduced the use of annotations on one function, and it causes a syntax error when running on Python 2.7. The classifiers still indicate it should be compatible with 2.6 and 2.7, and there isn't a python_requires argument passed to setup.

Is the break in Python 2.x compatibility intentional?

Keys of empty values are not removed

from nested_lookup import nested_delete

document = [{'taco': {}}, {'salsa': [{'burrito': {'taco': 69}}]}]
nested_delete(document, 'taco')

Response: [{'taco': {}}, {'salsa': [{'burrito': {}}]}]

The first 'taco' key, whose value is an empty dict, is not removed.
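
For comparison, a small sketch of a delete that filters by key name only, so empty values are removed too. The helper delete_key is hypothetical and is not the library's nested_delete.

```python
def delete_key(document, key):
    """Return a copy of `document` without any occurrence of `key`.

    Hypothetical helper for illustration; not the library's nested_delete.
    """
    if isinstance(document, list):
        return [delete_key(item, key) for item in document]
    if isinstance(document, dict):
        # Keys are filtered by name, so falsy values such as {}, 0 or
        # False are removed as well.
        return {k: delete_key(v, key) for k, v in document.items() if k != key}
    return document


document = [{"taco": {}}, {"salsa": [{"burrito": {"taco": 69}}]}]
result = delete_key(document, "taco")
```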

nested_alter isn't truly recursive

nested_alter fails to update elements deep in the hierarchy if the structure of the hierarchy higher up has been altered in the same call. Example:

from nested_lookup import nested_alter

def rename_subkeys(data):
    try:
        data['renamed'] = data.pop('rename me')
    except KeyError:
        pass
    return data


document = {'key': {'rename me': 1}, 'sub': {'key': {'rename me': 1}}}
altered_document = nested_alter(document, 'key', rename_subkeys)
# The above call works, since renaming the first key doesn't alter how the second key is accessed. Result:
# {'key': {'renamed': 1}, 'sub': {'key': {'renamed': 1}}}

document = {'key': {'rename me': {'rename me': 1}}}
altered_document = nested_alter(document, 'key', rename_subkeys)
# This does not work; only one of the keys is renamed. Result:
# {'key': {'renamed': {'rename me': 1}}}

Request: Option to return the "path" to the key and the "nested dict" that represents the value

Hello,

Great lib. May I suggest adding a feature that also returns the path to the found value(s)?
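
One way this could work, sketched with a hypothetical generator named lookup_with_paths (not part of nested_lookup): it yields a (path, value) pair for every match, where the path is a tuple of dict keys and list indexes.

```python
def lookup_with_paths(key, document, path=()):
    """Yield (path, value) pairs for every occurrence of `key`.

    `path` is a tuple of dict keys and list indexes leading to the match.
    Hypothetical helper for illustration; not part of nested_lookup.
    """
    if isinstance(document, list):
        for index, item in enumerate(document):
            yield from lookup_with_paths(key, item, path + (index,))
    elif isinstance(document, dict):
        for k, v in document.items():
            if k == key:
                yield path + (k,), v
            # Keep descending so nested matches are reported too.
            yield from lookup_with_paths(key, v, path + (k,))


document = {"a": [{"taco": 1}, {"b": {"taco": {"taco": 2}}}]}
matches = list(lookup_with_paths("taco", document))
```

Each path can be replayed against the document with ordinary indexing, for example document["a"][0]["taco"].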

In _nested_update, the "run" counter was reset on recursion

The following version worked:

from six import iteritems


def _nested_update(document, key, value, val_len, run=None):
    # Share one counter across the whole recursion. A fresh list is created
    # per top-level call instead of relying on a mutable default argument.
    if run is None:
        run = [0]
    if isinstance(document, list):
        for list_items in document:
            _nested_update(document=list_items, key=key, value=value,
                           val_len=val_len, run=run)
    elif isinstance(document, dict):
        if document.get(key):
            # Check whether a value with the corresponding index exists and
            # use it; otherwise recycle the initially given values.
            if run[0] < val_len:
                val = value[run[0]]
            else:
                run[0] = 0
                val = value[run[0]]
            document[key] = val
            run[0] = run[0] + 1
        for dict_key, dict_value in iteritems(document):
            _nested_update(document=dict_value, key=key, value=value,
                           val_len=val_len, run=run)
    return document

get_occurrence_of_value does not return the exact count if the value is in a list

sample4 = {
    "values": [{
        "checks": [{
            "monitoring_zones": [
                "mzdfw", "mzfra", "mzhkg", "mziad",
                "mzlon", "mzord", "mzsyd"
            ]
        }]
    }]
}

For the document above, searching for occurrences of the value 'mzhkg' should return 1, but it returns 0 because the lookup does not look inside a value that is a list.
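
A sketch of a counter that also looks inside lists, using a hypothetical helper named count_value (not the library's implementation). It only compares scalar leaves, so searching for a value that is itself a dict or list is out of scope here.

```python
def count_value(document, value):
    """Count how often `value` occurs as a dict value or list element.

    Hypothetical helper for illustration; not part of nested_lookup.
    """
    if isinstance(document, dict):
        children = list(document.values())
    elif isinstance(document, list):
        children = document
    else:
        return 0
    total = 0
    for child in children:
        if isinstance(child, (dict, list)):
            # Containers are searched recursively, lists included.
            total += count_value(child, value)
        elif child == value:
            total += 1
    return total


sample4 = {
    "values": [{
        "checks": [{
            "monitoring_zones": [
                "mzdfw", "mzfra", "mzhkg", "mziad",
                "mzlon", "mzord", "mzsyd"
            ]
        }]
    }]
}
occurrences = count_value(sample4, "mzhkg")
```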

Tag releases

It would be nice for downstream packaging if releases were tagged.

Enable key lookup after the key is found

Once the key is found, the lookup stops and does not continue inside the value of that key.

sample_data = {
    "build_version": {
        "model_name": "MacBook Pro",
        "build_version": {
            "processor_name": "Intel Core i7",
            "processor_speed": "2.7 GHz",
            "core_details": {
                "build_version": "4",
                "l2_cache(per_core)": "256 KB"
            }
        },
        "number_of_cores": "4",
        "memory": "256 KB",
    },
    "os_details": {
        "product_version": "10.13.6",
        "build_version": "17G65"
    },
    "name": "Test",
    "date": "YYYY-MM-DD HH:MM:SS"
}

For the sample data above, a lookup for the key build_version should return:

result = [
    {
        'build_version': {
            'processor_name': 'Intel Core i7',
            'processor_speed': '2.7 GHz',
            'core_details': {
                'build_version': '4',
                'l2_cache(per_core)': '256 KB'
            }
        },
        'memory': '256 KB',
        'model_name': 'MacBook Pro',
        'number_of_cores': '4'
    },
    {
        'processor_name': 'Intel Core i7',
        'processor_speed': '2.7 GHz',
        'core_details': {
            'build_version': '4',
            'l2_cache(per_core)': '256 KB'
        }
    }, '4', '17G65'
]

But it returns:

result = [
    {
        'build_version': {
            'processor_name': 'Intel Core i7',
            'processor_speed': '2.7 GHz',
            'core_details': {
                'build_version': '4',
                'l2_cache(per_core)': '256 KB'
            }
        },
        'memory': '256 KB',
        'model_name': 'MacBook Pro',
        'number_of_cores': '4'
    }, '17G65'
]
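
The requested behaviour can be sketched as a traversal that records a match and then keeps descending into the matched value. The helper lookup_all is hypothetical, and the sample is a trimmed version of the data above.

```python
def lookup_all(key, document):
    """Return the values of every `key`, including matches nested in matches.

    Hypothetical helper for illustration; not part of nested_lookup.
    """
    results = []
    if isinstance(document, list):
        for item in document:
            results.extend(lookup_all(key, item))
    elif isinstance(document, dict):
        for k, v in document.items():
            if k == key:
                results.append(v)
            # Descend regardless of whether this key matched.
            results.extend(lookup_all(key, v))
    return results


sample = {
    "build_version": {
        "model_name": "MacBook Pro",
        "build_version": {"core_details": {"build_version": "4"}},
    },
    "os_details": {"build_version": "17G65"},
}
found = lookup_all("build_version", sample)
```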

When wild=True, getting Attribute error if keys are not strings

I have complex dict objects that contain dictionaries whose keys are sometimes integers. If I run the following over one of them, it works fine:

nested_results = nested_lookup.nested_lookup(document=my_map, key='amount')

If I change it to:

nested_results = nested_lookup.nested_lookup(document=my_map, key='amount', wild=True)

I get:
  File "/usr/local/lib/python3.7/site-packages/nested_lookup/nested_lookup.py", line 29, in _nested_lookup
    if key == k or (wild and key.lower() in k.lower()):
AttributeError: 'int' object has no attribute 'lower'
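
One possible way around the error is to convert keys with str() before the case-insensitive comparison. The sketch below uses a hypothetical helper named wild_lookup; it is not the library's code, and like the behaviour described elsewhere on this page it does not descend into a value once its key has matched.

```python
def wild_lookup(key, document):
    """Return values whose key contains `key`, ignoring case.

    Keys are converted with str() first, so integer keys do not raise
    AttributeError. Hypothetical helper; not the library's implementation.
    """
    needle = str(key).lower()
    results = []
    if isinstance(document, list):
        for item in document:
            results.extend(wild_lookup(key, item))
    elif isinstance(document, dict):
        for k, v in document.items():
            if needle in str(k).lower():
                results.append(v)
            else:
                # Only values of non-matching keys are searched further.
                results.extend(wild_lookup(key, v))
    return results


my_map = {1: {"Amount": 10}, "total_amount": 25, 2: [{"amount_due": 5}]}
nested_results = wild_lookup("amount", my_map)
```

A side effect of the str() conversion is that a search for "1" would also match the integer key 1, which may or may not be wanted.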

Need volunteer to write a couple unit tests to prevent future regressions

We caught a nasty edge case in pull request #36, but we never took the time to actually codify the test cases embedded in the comments. We really need somebody to add these test cases (and any others you can think of) to our tests, located here:

current tests: https://github.com/russellballestrini/nested-lookup/blob/master/test_nested_lookup.py#L12
tips on getting pull requests approved: https://russell.ballestrini.net/tips-for-getting-pull-requests-approved/
reference: #36 for hints on the defect we want to guard against.

Consider adding an (un)license file

Thanks for this tool!

A little story: this was adopted by executablebooks/jupyter-book#1123, which I'd like to re-package for conda-forge. Part of the process there requires some kind of file which documents the rights that various parties (maintainers, users, repackagers) have with respect to the software. The words, "Public Domain," don't really have a well-established meaning across various legal boundaries, etc. which makes it hard for various entities to use the software.

It would be very helpful if an actual document was added to this repo, and better still, to source distributions. Some examples include:

https://unlicense.org/
https://spdx.org/licenses/CC-PDDC.html
or really any of these that have Public Domain in them: https://spdx.org/licenses/

Happy to make a PR! (e.g. add file, include in MANIFEST.in)

Attribute error in get_occurrences_and_values function

Hi,

I have this dictionary:

topo = {
  "ipv4base": "10.0.0.0",
  "ipv4mask": 30,
  "ipv6base": "fd00::",
  "ipv6mask": 64,
  "link_ip_start": {
    "ipv4": "10.0.0.0",
    "v4mask": 30,
    "ipv6": "fd00::",
    "v6mask": 64
  },
  "lo_prefix": {
    "ipv4": "1.0.",
    "v4mask": 32,
    "ipv6": "2001:DB8:F::",
    "v6mask": 128
  },
  "routers": {
    "r1": {
      "links": {
        "lo": {
          "ipv4": "auto",
          "ipv6": "auto",
          "type": "loopback"
        },
        "r2": {
          "ipv4": "auto",
          "ipv6": "auto"
        },
        "r3": {
          "ipv4": "auto",
          "ipv6": "auto"
        }
      },
      "bgp": {
        "local_as": "100",
        "address_family": {
          "ipv4": {
            "unicast": {
              "neighbor": {
                "r2": {
                  "dest_link": {
                    "r1": {}
                  }
                },
                "r3": {
                  "dest_link": {
                    "r1": {}
                  }
                }
              }
            }
          }
        }
      }
    },
    "r2": {
      "links": {
        "lo": {
          "ipv4": "auto",
          "ipv6": "auto",
          "type": "loopback"
        },
        "r1": {
          "ipv4": "auto",
          "ipv6": "auto"
        },
        "r3": {
          "ipv4": "auto",
          "ipv6": "auto"
        }
      },
      "bgp": {
        "local_as": "100",
        "address_family": {
          "ipv4": {
            "unicast": {
              "neighbor": {
                "r1": {
                  "dest_link": {
                    "r2": {}
                  }
                },
                "r3": {
                  "dest_link": {
                    "r2": {}
                  }
                }
              }
            }
          }
        }
      }
    },
    "r3": {
      "links": {
        "lo": {
          "ipv4": "auto",
          "ipv6": "auto",
          "type": "loopback"
        },
        "r1": {
          "ipv4": "auto",
          "ipv6": "auto"
        },
        "r2": {
          "ipv4": "auto",
          "ipv6": "auto"
        },
        "r4": {
          "ipv4": "auto",
          "ipv6": "auto"
        }
      },
      "bgp": {
        "local_as": "100",
        "address_family": {
          "ipv4": {
            "unicast": {
              "neighbor": {
                "r1": {
                  "dest_link": {
                    "r3": {}
                  }
                },
                "r2": {
                  "dest_link": {
                    "r3": {}
                  }
                },
                "r4": {
                  "dest_link": {
                    "r3": {}
                  }
                }
              }
            }
          }
        }
      }
    },
    "r4": {
      "links": {
        "lo": {
          "ipv4": "auto",
          "ipv6": "auto",
          "type": "loopback"
        },
        "r3": {
          "ipv4": "auto",
          "ipv6": "auto"
        }
      },
      "bgp": {
        "local_as": "200",
        "address_family": {
          "ipv4": {
            "unicast": {
              "neighbor": {
                "r3": {
                  "dest_link": {
                    "r4": {}
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I am searching for the value '200' and I get the error below. Please take a look.
(Pdb) get_occurrences_and_values(my_documents, value='200')
*** AttributeError: 'unicode' object has no attribute 'values'

get_all_keys fails if dicts are part of a list

Example:
[
  {
    "listings": [
      {
        "name": "title",
        "postcode": "postcode",
        "full_address": "fulladdress",
        "city": "city",
        "lat": "latitude",
        "lng": "longitude"
      }
    ]
  }
]

get_all_keys fails on this document.
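
A sketch of a key collector that accepts a list at any level, including the top level. The helper all_keys is hypothetical and is not the library's get_all_keys; the sample is a shortened version of the document above.

```python
def all_keys(document):
    """Return every dict key in `document`, in traversal order.

    Hypothetical helper for illustration; not the library's get_all_keys.
    """
    keys = []
    if isinstance(document, list):
        for item in document:
            keys.extend(all_keys(item))
    elif isinstance(document, dict):
        for k, v in document.items():
            keys.append(k)
            keys.extend(all_keys(v))
    return keys


document = [{"listings": [{"name": "title", "postcode": "postcode", "city": "city"}]}]
keys = all_keys(document)
```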

search for the values instead of keys

I am looking for a way to search a dict for certain keywords that are stored in values, not keys.

Here is an example produced by Esprima. In this example I am looking for the occurrences of each of the following keywords (localStorage queries):

"localStorage"
"sessionStorage"
"setItem"
"getItem"
"removeItem"
"clear"

Do you have a solution for this?
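
A rough sketch of one approach: walk the whole structure and count how often each keyword appears as a string value. The helper count_keywords is hypothetical, and the small AST fragment below is only an illustrative stand-in for real parser output.

```python
from collections import Counter


def count_keywords(document, keywords):
    """Count how often each keyword appears as a string value in `document`.

    Hypothetical helper for illustration; not part of nested_lookup.
    """
    counts = Counter({keyword: 0 for keyword in keywords})
    # An explicit stack avoids recursion limits on very deep documents.
    stack = [document]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)
        elif isinstance(node, str) and node in counts:
            counts[node] += 1
    return counts


# Illustrative fragment, roughly the shape of a parsed
# localStorage.setItem("k", "v") call.
ast = {
    "type": "CallExpression",
    "callee": {
        "type": "MemberExpression",
        "object": {"type": "Identifier", "name": "localStorage"},
        "property": {"type": "Identifier", "name": "setItem"},
    },
    "arguments": [{"type": "Literal", "value": "k"}, {"type": "Literal", "value": "v"}],
}
counts = count_keywords(ast, ["localStorage", "sessionStorage", "setItem"])
```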

PyPI version

Could you please release a version that includes the last merged PR?

get_occurrences_and_values throws when same value in multiple dicts

from nested_lookup import get_occurrences_and_values

l = [
  {
    "item1": 1234,
    "item2": 5678,
    "item4": [1234, 9012]
  },
  {
    "item1": "abcd",
    "item2": "efgh",
    "item4": [1234, "adcf"]
  }
]

get_occurrences_and_values(l, value=1234)

The code above throws TypeError: 'NoneType' object is not iterable.
I believe the recursion is matching 1234 inside the item4 list of the second dict, where it should be matching the entire value, or perhaps it should behave in some other way.
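
One possible behaviour, sketched with a hypothetical helper named occurrences_and_parents: count every match, and report each dict that holds the value either directly or inside one of its lists. The (count, parents) return shape is an assumption of this sketch, not the library's format.

```python
def occurrences_and_parents(document, value):
    """Return (count, parents) for `value`.

    `parents` lists each dict that holds `value` directly or inside one of
    its list values. Hypothetical helper; the return shape is an assumption
    of this sketch, not the library's format.
    """
    count = 0
    parents = []

    def walk(node, parent):
        nonlocal count
        if isinstance(node, dict):
            for child in node.values():
                walk(child, node)
        elif isinstance(node, list):
            for child in node:
                # List elements keep the enclosing dict as their parent.
                walk(child, parent)
        elif node == value:
            count += 1
            # Record each parent dict once, compared by identity.
            if parent is not None and not any(p is parent for p in parents):
                parents.append(parent)

    walk(document, None)
    return count, parents


items = [
    {"item1": 1234, "item2": 5678, "item4": [1234, 9012]},
    {"item1": "abcd", "item2": "efgh", "item4": [1234, "adcf"]},
]
count, parents = occurrences_and_parents(items, 1234)
```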

Suggestion: option to search for keys that have string in them

I have a dict that uses "Email" and "Email__other". It would be great if you could add an option to search with "if in key" instead of requiring the key to be equal.

Thanks for putting this in pip. Saved me some time!

Assistance with deletion of a key/value

Hi,

I'm using this library on a dictionary I receive from the Shodan API. When I list the keys, I get everything I expect:

keys = get_all_keys(shodan_data)

['region_code', 'ip', 'postal_code', 'country_code', 'city', 'dma_code', 'last_update', 'latitude', 'tags', 'area_code', 'country_name', 'hostnames', 'org', 'data', '_shodan', 'id', 'options', 'ptr', 'module', 'crawler', 'product', 'hash', 'os', 'opts', 'vulns', 'heartbleed', 'ip', 'isp', 'http', 'html_hash', 'robots_hash', 'redirects', 'securitytxt', 'title', 'sitemap_hash', 'robots', 'favicon', 'hash', 'data', 'location', 'host', 'html', 'location', 'components', 'React', 'categories', 'webpack', 'categories', 'Stripe', 'categories', 'Gatsby', 'categories', 'Google Font API', 'categories', 'server', 'sitemap', 'securitytxt_hash', 'cpe', 'port', 'ssl', 'dhparams', 'tlsext', 'id', 'name', 'id', 'name', 'id', 'name', 'versions', 'acceptable_cas', 'cert', 'sig_alg', 'issued', 'expires', 'expired', 'version', 'extensions', 'critical', 'data', 'name', 'data', 'name', 'critical', 'data', 'name', 'data', 'name', 'data', 'name', 'data', 'name', 'data', 'name', 'data', 'name', 'data', 'name', 'fingerprint', 'sha256', 'sha1', 'serial', 'subject', 'CN', 'pubkey', 'type', 'bits', 'issuer', 'C', 'CN', 'O', 'cipher', 'version', 'bits', 'name', 'chain', 'alpn', 'hostnames', 'location', 'city', 'region_code', 'area_code', 'longitude', 'country_code3', 'country_name', 'postal_code', 'dma_code', 'country_code', 'latitude', 'timestamp', 'domains', 'org', 'data', 'asn', 'transport', 'ip_str']

When I use
results = nested_delete(shodan_data, "html")
the key is not deleted. Am I using this correctly, or am I making a massive rookie mistake?
If someone could point me in the right direction, that would be great.

Request: nested_lookup multiple keys like nested_lookup('Code,Name,Street1',tmp)

I would like to search for multiple keys in one call, like nested_lookup('Code,Name,Street1',tmp)
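
A sketch of how a multi-key lookup might behave, assuming a hypothetical helper named lookup_many that takes a list of keys rather than a comma-separated string and returns the matches grouped by key.

```python
def lookup_many(keys, document):
    """Return {key: [values...]} for each requested key, in one traversal.

    Hypothetical helper for illustration; not part of nested_lookup.
    """
    results = {key: [] for key in keys}

    def walk(node):
        if isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, dict):
            for k, v in node.items():
                if k in results:
                    results[k].append(v)
                walk(v)

    walk(document)
    return results


tmp = {
    "Code": "A1",
    "Address": {"Name": "Home", "Street1": "1 Main St"},
    "Items": [{"Code": "B2"}],
}
found = lookup_many(["Code", "Name", "Street1"], tmp)
```

Passing a list avoids ambiguity for keys that themselves contain a comma.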

get_occurrence_of_key, nested_delete, and nested_update do not work for keys whose value is False or 0

Example:

>>> import nested_lookup
>>> 
>>> data = {
...             "hardware_details": {
...                 "model_name": 'MacBook Pro',
...                 "total_number_of_cores": 0,
...                 "memory": False
...             }
...         }
>>> 
>>> nested_lookup.nested_update(data, key='total_number_of_cores', value=5)
{'hardware_details': {'model_name': 'MacBook Pro', 'total_number_of_cores': 0, 'memory': False}}
>>> nested_lookup.nested_delete(data, 'memory')
{'hardware_details': {'model_name': 'MacBook Pro', 'total_number_of_cores': 0, 'memory': False}}
>>> 
>>> nested_lookup.get_occurrence_of_key(data, key='total_number_of_cores')
0

The operations are skipped because data.get(key) returns a falsy value for these keys.
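
The difference between the two checks can be shown in a few lines. The update_key helper below is hypothetical and only illustrates testing with the in operator; it is not the library's nested_update.

```python
data = {"hardware_details": {"total_number_of_cores": 0, "memory": False}}
details = data["hardware_details"]

# dict.get returns the stored value, so a falsy value looks like a missing key.
seen_by_get = bool(details.get("total_number_of_cores"))

# The `in` operator tests for the key itself, whatever its value is.
seen_by_in = "total_number_of_cores" in details


def update_key(document, key, value):
    """Set every `key` in `document` to `value`, in place (hypothetical helper)."""
    if isinstance(document, list):
        for item in document:
            update_key(item, key, value)
    elif isinstance(document, dict):
        if key in document:
            document[key] = value
        for k, child in document.items():
            # Skip the entry that was just replaced.
            if k != key:
                update_key(child, key, value)
    return document


updated = update_key(data, "total_number_of_cores", 5)
```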
