
Comments (18)

gvanrossum avatar gvanrossum commented on July 28, 2024 2

IIRC This has been fixed in recent versions. There are explicit dunder attributes. Check the docs (the PEP wasn’t updated). Sorry, I don’t have the details in my head.

from typedload.

ltworf avatar ltworf commented on July 28, 2024 1

I will look more into it


ltworf avatar ltworf commented on July 28, 2024 1

They are different. There can even be types that mypy can resolve (from string to real type) but at runtime can't be resolved at all.

Just how python works.


kkom avatar kkom commented on July 28, 2024

Btw, I probably shouldn't have included this part, as it actually doesn't mean much - the constructor of TypedDict does not do any runtime type assertion:

td = ExampleTypedDict(**d)

print(f"{td=}")

This works just fine and prints td={'a': 0}, even though it contains a type error:

class ExampleTypedDict(TypedDict):
    a: int
    b: int

d = {"a": 0}

print(f"{d=}")

td = ExampleTypedDict(**d)

print(f"{td=}")


ltworf avatar ltworf commented on July 28, 2024

Long explanation of what you are doing wrong. tl;dr: put total=False in your class.

I think TypedDict only makes sense for analyzers such as mypy, which does fail for your code.

Basically it just returns a dict, but mypy will know that certain keys are in this dict and what type they have.

Ensuring the dictionary is correct is up to you (at runtime).

NamedTuple is kinda the same: Python will do absolutely no runtime checks on types, they are there for mypy. typedload is for when you have data coming from external sources and want to make sure it actually respects the types you think it should have.

typedload can do this job.

Take this example:

class ExampleTypedDict(TypedDict):
    a: int
    b: int

td = ExampleTypedDict(a='ciao')

mypy will say:

error: Missing key 'b' for TypedDict "ExampleTypedDict"

that is also the issue typedload is complaining about.

If I go and fix it to:

td = ExampleTypedDict(a='ciao', b=3)

then mypy will notice the wrong type:

/tmp/ciccio.py:8: error: Incompatible types (expression has type "str", TypedDict item "a" has type "int")

If you want to make a dictionary with set keys, but where some of those keys are missing, you need to use total=False

class ExampleTypedDict(TypedDict, total=False):
    a: int
    b: int

td = ExampleTypedDict(a=3)

Incidentally, then typedload will also work.


kkom avatar kkom commented on July 28, 2024

Uffff, I really shouldn't have included the ExampleTypedDict(**d) part – it just confused everyone... My bad for doing this - sorry!

My point was that I want to express the fact that a dictionary has one required and one optional field:

class ExampleTypedDict(TypedDict):
    a: int
    b: Optional[int]

I don't want all fields to be optional, this is what total=False does. In this toy example I want just field b to be optional.

But I still think that this should pass at runtime in typedload:

class ExampleTypedDict(TypedDict):
    a: int
    b: Optional[int]

d = {"a": 0}

tdl = typedload.load(d, ExampleTypedDict)

PS: Python doesn't separate the concept of optional (keys) vs nullable (values), the way JavaScript, HHVM/Hack or Thrift do. Even though the generic type says Optional[X], it's really more akin to how "nullable" is defined in those languages, given that it means just Union[X, None]. But given that this is all that we have, I'm still assuming that this annotation should allow for the key to be completely missing when asserting the type of a TypedDict. I don't think there is a chance I'm interpreting this wrong?
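
For reference, the distinction between a nullable value and a missing key can be inspected at runtime. A minimal sketch, assuming Python 3.9 or later (where TypedDict classes expose __required_keys__ and __optional_keys__); the variable names are only for illustration:

```python
from typing import Optional, TypedDict, Union


class ExampleTypedDict(TypedDict):
    a: int
    b: Optional[int]


# Optional[int] is only shorthand for a union with None (a nullable value).
same_annotation = Optional[int] == Union[int, None]

# Both keys are still listed as required: the annotation says nothing
# about whether the key itself may be left out.
required = ExampleTypedDict.__required_keys__
optional = ExampleTypedDict.__optional_keys__
```

So {"a": 0, "b": None} matches this definition, while {"a": 0} lacks a key that is recorded as required.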


ltworf avatar ltworf commented on July 28, 2024

Optional[something] in Python expresses the fact that the value can be None, for example in a function signature:

def f(i: Optional[int], j: Optional[int]=None): ...

Both i and j support None as valid value, and j has a default value, so it doesn't need to be provided. But for i, if you don't pass it at all, you only obtain an error.
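
The difference between accepting None and having a default can be checked directly. A minimal sketch reusing the signature above; the function body and the variable names are made up for illustration:

```python
from typing import Optional, Tuple


def f(i: Optional[int], j: Optional[int] = None) -> Tuple[Optional[int], Optional[int]]:
    # Return the arguments unchanged so the calls below are easy to inspect.
    return (i, j)


# None is a valid value for i, and j falls back to its default.
explicit_none = f(None)

# Leaving i out entirely is an error: Optional does not imply a default.
try:
    f()  # type: ignore[call-arg]
    missing_i_error = None
except TypeError as exc:
    missing_i_error = exc
```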

Since in general this is the normal behaviour in Python, I didn't think it would be a good general decision to diverge from that and always assign a value of None whenever the actual value is not provided.

With a NamedTuple you have to do:

class Bla(NamedTuple):
    optional_field: Optional[int] = None

And then if you call that constructor with nothing, or give an empty dictionary to typedload to construct that type, it will work fine, but in general Optional doesn't mean that you can assume a default value.
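
The same holds for classes: it is the default, not the annotation, that lets a field be omitted. A small sketch using only the standard library; Strict is a hypothetical counterpart added here for contrast:

```python
from typing import NamedTuple, Optional


class Bla(NamedTuple):
    # The "= None" default is what makes the argument omittable.
    optional_field: Optional[int] = None


class Strict(NamedTuple):
    # Same annotation, but no default: the field must always be passed.
    optional_field: Optional[int]


no_args = Bla()

try:
    Strict()  # type: ignore[call-arg]
    strict_error = None
except TypeError as exc:
    strict_error = exc
```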

Anyway I won't do what you ask because it'd just behave differently than all the other classes in python so I think it would be unreasonable.

If for whatever reason you can't switch to using a NamedTuple or a Dataclass, which do support having default value set to None, you need to write your own handler for the TypedDict and tell your loader object to use your handler rather than the built-in one.

I have documented how to define handlers: https://ltworf.github.io/typedload/examples/#custom-handlers

To get good exceptions with the path of the error, you will need to do annotations like here: https://github.com/ltworf/typedload/blob/master/typedload/dataloader.py#L529


kkom avatar kkom commented on July 28, 2024

Thank you so much for your patient explanation @ltworf, everything finally makes sense to me now!

I didn't appreciate the fact that TypedDict does not support default values (I can understand why – it's not meant to be really used at runtime), which would have been a solution. But, regardless of that, I also agree that the solution I proposed (implicitly providing None as a default for Optional[X]) would be non-standard and unreasonable.

Upon reading PEP 589 and specifically the Totality section I learned that there in fact is a way to express that only a subset of TypedDict fields is optional (edit: "non-required" is the proper term here):

The totality flag only applies to items defined in the body of the TypedDict definition. Inherited items won't be affected, and instead use totality of the TypedDict type where they were defined. This makes it possible to have a combination of required and non-required keys in a single TypedDict type.

The PEP 589 - Rejected alternatives section confirms that this is the only way to do so:

These features were left out from this PEP, but they are potential extensions to be added in the future:

(...)

There is no way to individually specify whether each key is required or not. No proposed syntax was clear enough, and we expect that there is limited need for this.

I think that typedload however does not correctly implement the inheritance of the totality field:

from typing import TypedDict

import typedload

class BaseTypedDict(TypedDict):
    a: int

class PartiallyOptionalTypedDict(BaseTypedDict, total=False):
    b: int

valid_potd_untyped = {"a": 0}

print(f"{valid_potd_untyped=}")

valid_potd = typedload.load(valid_potd_untyped, PartiallyOptionalTypedDict)

print(f"{valid_potd=}")

invalid_potd_untyped = {"b": 0}

print(f"{invalid_potd_untyped=}")

invalid_potd = typedload.load(invalid_potd_untyped, PartiallyOptionalTypedDict)

print(f"{invalid_potd=}")

Executing this file gives the following results:

➜  optional_value pipenv run python main.py
valid_potd_untyped={'a': 0}
valid_potd={'a': 0}
invalid_potd_untyped={'b': 0}
invalid_potd={'b': 0}

According to PEP 589, PartiallyOptionalTypedDict should have a required a: int field and a non-required b: int field. But invalid_potd = typedload.load(invalid_potd_untyped, PartiallyOptionalTypedDict) succeeds at runtime, even though the provided dictionary doesn't contain the key a. It appears that typedload interprets the total=False flag as if it also applied to the inherited fields.

PS: Yes, I am the same person who posted this comment: #100 (comment) :) I didn't get to implement that functionality there, but I might be able to contribute this improvement now!


kkom avatar kkom commented on July 28, 2024

PEP 655 discusses this in depth, I recommend taking a look at it @ltworf if you haven't yet.

As a side note, I realized that my use case would be actually best served by leveraging TypedDict inheritance, with both base and inherited classes annotated as total=True, but with the base class being "open" – analogously to Hack/HHVM's open shapes.

That's because I have an API response with a list of items, all of which share some common keys. Additionally, some items are guaranteed to have extra keys depending on the type. So I want to first check the type, and subsequently use typedload to assert that extra keys are present.

But that's currently not possible, as there is no concept of an open TypedDict in Python. That's actually a really easy addition though, I'll see what can be done there! I feel like open=True annotation would be tremendously useful in a TypedDict inheritance scenario.


ltworf avatar ltworf commented on July 28, 2024

But that PEP is a draft.

from typing import *

class A(TypedDict, total=True):
    a: int

class B(A, total=False):
    b: int
    
B.mro()

It will return [__main__.B, dict, object]

Which means that A is not actually counted as a superclass and when you get a B you have no way of knowing it actually comes from A and that some fields must be there and some can be absent.

So I wouldn't know how to implement this thing. I'm not sure it's possible at the moment.

Do you have any more ideas?


kkom avatar kkom commented on July 28, 2024

This may be a silly question, but shouldn't it be this?

class B(A, total=False):
    b: int


ltworf avatar ltworf commented on July 28, 2024

lol yes, sorry, i tried to make it nice to paste it here and messed it up.

The issue is that the result is what I said it is anyways :D


ltworf avatar ltworf commented on July 28, 2024

So at run time I don't really see a way of inspecting ancestors, since the object created doesn't list them as ancestors.


kkom avatar kkom commented on July 28, 2024

You're right... I tried a few more things and nothing worked:

import inspect
from typing import TypedDict

class A(TypedDict):
    a: int

class B(A):
    b: int

print(f"{B.mro()=}")
print(f"{B.__bases__=}")
print(f"{inspect.getclasstree([B])=}")

➜  typeddict_inheritance python3 main.py
B.mro()=[<class '__main__.B'>, <class 'dict'>, <class 'object'>]
B.__bases__=(<class 'dict'>,)
inspect.getclasstree([B])=[(<class 'dict'>, (<class 'object'>,)), [(<class '__main__.B'>, (<class 'dict'>,))]]

The only lead I can think of now is again the Rejected alternatives section of the PEP:

These are rejected on principle, as incompatible with the spirit of this proposal:

(...)

  • TypedDict type definitions could plausibly used to perform runtime type checking of dictionaries. For example, they could be used to validate that a JSON object conforms to the schema specified by a TypedDict type. This PEP doesn't include such functionality, since the focus of this proposal is static type checking only, and other existing types do not support this, as discussed in Class-based syntax. Such functionality can be provided by a third-party library using the typing_inspect [10] third-party module, for example.

It does suggest looking at typing_inspect to help with implementing runtime checks based on TypedDict classes. The source code is here: https://github.com/ilevkivskyi/typing_inspect

I don't see very many mentions of TypedDict in the source code though: https://github.com/ilevkivskyi/typing_inspect/search?q=typeddict


kkom avatar kkom commented on July 28, 2024

Would seeing if mypy, pyright or pyre correctly implement TypedDict inheritance statically be of any help? Or is the runtime and static environment completely different? I think it doesn't hurt to at least test it, and see where this takes us. (I'm quite new to the implementation details of Python's typing system, be it runtime or static, so I'm really trying random things here.)

Edit: both mypy and pyright do implement this correctly, but I'm not sure if this fact helps us...


kkom avatar kkom commented on July 28, 2024

@JukkaL @davidfstr @gvanrossum, since you authored or sponsored PEP 589 or PEP 655, could you help us figure out if it's possible to reconstruct at runtime the exact set of required and non-required keys of a TypedDict?

From a quick investigation it appears that it's actually impossible. The combination of required and non-required keys is achieved through inheritance of TypedDict classes with different totality flags, but at runtime a TypedDict inheriting from another TypedDict appears to not list it as a base class – see #204 (comment).


ltworf avatar ltworf commented on July 28, 2024

In [2]: B.__required_keys__
Out[2]: frozenset({'a'})

In [3]: B.__optional_keys__
Out[3]: frozenset({'b'})

True :D
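
With these two attributes a loader can tell which keys must be present, even though the base class does not show up in the MRO. A minimal sketch, assuming Python 3.9 or later; missing_required_keys and unknown_keys are hypothetical helpers written for this illustration, not part of typedload, and they check key presence only, not value types:

```python
from typing import Any, Dict, List, TypedDict


class A(TypedDict):
    a: int


class B(A, total=False):
    b: int


def missing_required_keys(data: Dict[str, Any], td_type: Any) -> List[str]:
    """Return the required keys of td_type that are absent from data."""
    return sorted(td_type.__required_keys__ - set(data))


def unknown_keys(data: Dict[str, Any], td_type: Any) -> List[str]:
    """Return the keys of data that td_type does not declare at all."""
    allowed = td_type.__required_keys__ | td_type.__optional_keys__
    return sorted(set(data) - allowed)


# "a" is inherited from a total TypedDict, so it stays required in B.
valid = missing_required_keys({"a": 0}, B)
invalid = missing_required_keys({"b": 0}, B)
extra = unknown_keys({"a": 0, "c": 1}, B)
```

Here valid is empty, invalid reports the missing key a, and extra reports the undeclared key c.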


kkom avatar kkom commented on July 28, 2024

Indeed, thank you @gvanrossum! We're good to go here now :)

Here is the relevant bug tracker link for reference: https://bugs.python.org/issue38834 and here is the PR implementing it: python/cpython#17214

