transpy's People

Contributors

tirasz

transpy's Issues

Class Patterns

How do they work

  • First, the class object Class is looked up, and isinstance(subject, Class) is called. If this is false, the match fails.
  • If it is true, any sub-patterns, given as either positional or keyword arguments, are matched left to right. The match fails as soon as one sub-pattern fails. If all the sub-patterns succeed, the class pattern match succeeds.
  • Positional arguments are matched using the class's __match_args__ attribute.
    • If there are more positional arguments than the length of __match_args__ --> TypeError is raised.
    • If the __match_args__ attribute is absent on the class, it is treated as empty, so any positional argument --> TypeError is raised.
    • The argument at position i is matched against the value looked up by attribute __match_args__[i]
      • e.g.: Pattern: Point(2,1) where Point.__match_args__ == ("x", "y"), the pattern is roughly translated to obj.x == 2 and obj.y == 1
  • Keyword arguments are looked up as attributes on the subject.
    • If the lookup fails, the match fails.
    • Otherwise, the value is matched against the sub-pattern.

__match_args__

  • Used for matching Positional arguments.
  • Only Keyword / Match-by-name arguments work by default.
  • If a class wants to support Positional matching, it should define __match_args__ as a class attribute, which is a tuple of strings naming the allowed positional arguments in order. It is recommended to mirror the constructor.
  • Dataclasses and named tuples also support Positional matching by default, because __match_args__ is generated for them automatically.
  • For the most commonly matched built-in types (bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple) a single Positional sub-pattern is allowed to be passed to the call. It is matched against the subject itself.
    • e.g.: bool(False) matches False (but not 0)
    • tuple((1,2,3)) matches (1,2,3) (but not [1,2,3])
    • int(i) matches any int, and binds its value to i.

Analyzing

  • My first instinct is that I need to find three things for a branch if I want to transform it into a class pattern:

    • Class: kind of obvious, I need a class to do a class pattern.
    • Subject: also obvious, but not trivial.
    • Attributes: I need to find the attributes that the branch is testing. I think it's clear that I cannot transform into a pattern that uses Positional arguments, since that is not supported by default for most classes.
  • Possible subjects in order:

    • If there is an isinstance(subj, Class) call in the branch, then subj should be the subject.
    • If there is only one type of literal comparison like subj.attribute == something then both subj and subj.attribute are possible subjects
    • If there are multiple literal comparisons like subj.attribute == something and subj.other_attribute == something, then we can be sure that only subj is a possible subject.
    • (LiteralCase will be able to transform the second case most of the time)
    • Examples:
      • if isinstance(x, AnyClass) and (x.something == "something" or x.something == 42) and (x.other_attr == "24" or x.other_attr == 24):
        • subject: x --> case AnyClass(something = "something" | 42, other_attr = "24" | 24):
      • if (obj.attr1 == 2 or obj.attr1 == 3) and some_bool():
        • subject: obj --> case object(attr1 = 2 | 3) if some_bool():
        • subject: obj.attr1 --> case 2 | 3 if some_bool() [LiteralCase]
      • if (obj.attr1 == 2 or obj.attr1 == 3) and (obj.attr2 == 4 or obj.attr2 == 5):
        • subject: obj --> case object(attr1 = 2 | 3, attr2 = 4|5):
      • if isinstance(obj, SomeClass) and obj.attr1 == 2 and obj.attr2.x == 1 and obj.attr2.y == 2:
        • subject: obj --> case SomeClass(attr1 = 2, attr2 = object(x = 1, y = 2))

Function calls (and other shenanigans) inside isinstance() calls

While testing, I came across something like this:

if isinstance(obj, type(obj2)):
   something()
elif isinstance(obj, type(obj3)):
   something_else()
elif isinstance(obj, SomeClass):
   some_function()

That got turned into:

match obj:
   case type(obj2)():
       something()
   case type(obj3)():
       something_else()
   case SomeClass():
       some_function()

The produced code has syntax errors, so the transformer resets the transformation, but still, this is an obvious oversight.
I'm not exactly sure what's going on here, but the interpreter sometimes throws a TypeError and sometimes doesn't when I try to do something like this:

match obj:
   case type(obj2):
       something()
   case type(obj3):
       something_else()
   case SomeClass():
       some_function()

Other weird things I've seen while testing regarding isinstance calls:

if isinstance(obj, (variable, (Class1, Class2))):
   something()
elif isinstance(obj, tuple(SOME_DICT['something'])):
   something_else()

This gets turned into:

match obj:
   case variable() | (Class1, Class2)():
       something()
   case tuple(SOME_DICT['something'])():
       something_else()
   case 2:
       asd()

Besides causing brain damage to anyone who reads the transformed version (sorry), it also causes a syntax error, so thankfully it never actually gets transformed.
I'm going to be honest, I have no idea what the first branch is even trying to accomplish in the original code, so I'm not surprised the transformer was also confused in this regard.
The second branch is pretty similar to the isinstance(obj, type(var)) case mentioned before. I'm pretty sure pattern matching cannot support dynamic class patterns, so the solution I propose is:

ClassPattern should only accept isinstance calls where the second argument is either a tuple of Name nodes or a single Name node.
I'm a bit worried that this might be a little too strict, but I definitely cannot transform isinstance calls where the second argument is (or contains) a Call node (type(var)), a Subscript node (SOME_DICT['something']), etc.
As soon as I figure out what this type of isinstance call does: isinstance(obj, (Class1, (Class2, Class3))), I might be able to handle nested tuples as the second argument of an isinstance call.
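
On the nested tuple question: the documentation for isinstance says the second argument may be a tuple of types or, recursively, other such tuples, so isinstance(obj, (Class1, (Class2, Class3))) behaves like isinstance(obj, (Class1, Class2, Class3)). Under that reading, the proposed restriction could be sketched as below. The helper names classinfo_names and second_argument are hypothetical, and this is not the plugin's actual code.

```python
import ast


def classinfo_names(node):
    """Return the class names in an isinstance() second argument, or None.

    Accepts a Name node or a (possibly nested) Tuple of Name nodes and
    rejects everything else, such as Call or Subscript nodes.
    """
    if isinstance(node, ast.Name):
        return [node.id]
    if isinstance(node, ast.Tuple):
        names = []
        for element in node.elts:
            inner = classinfo_names(element)
            if inner is None:
                return None
            names.extend(inner)
        return names or None   # an empty tuple gives nothing to match on
    return None


def second_argument(source):
    # Parse a single isinstance(...) call and return its second argument node.
    call = ast.parse(source, mode="eval").body
    return call.args[1]
```

A static check like this cannot tell whether a name such as variable actually refers to a class at runtime, so it only filters by shape.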

Visit branches instead of If-nodes.

Make plugins visit the branches instead of the main If-node.
The plugins should return with a list of subjects for the given branch.
In the main analyzer, construct a dictionary of branch -> [ (plugin, [subjects]) ]
(A list of tuples saying: PluginXY can transform this branch using these subjects)
By going over and intersecting the returned subjects, determine if it can be transformed into a pattern match.
For example, given the code:

if isinstance(x, SomeClass) and x.prop == "something" and x.other_prop == 42 and something_else():
    ...
elif (x == None or x == "") and (y == "Error" or y == 404) and something_else():
    ...

For the first branch, LiteralCase would return nothing, while ClassCase would return a list: ["x"]
For the second branch, ClassCase would return nothing, while LiteralCase would return a list: ["x", "y"]
In the main analyzer this would look like:
branch1 -> [(ClassCase(), ["x"])]
branch2 -> [(LiteralCase(), ["x", "y"])]
After somehow making sense of the above, we can conclude that:
The whole If-node can be transformed into a pattern match, using the subject x.
Like so:

match x:
    case SomeClass(prop = "something", other_prop=42) if something_else():
        ...
    case None | "" if (y == "Error" or y == 404) and something_else():
        ...

Reworking the one (1!) already existing plugin to work this way shouldn't be too hard.
But implementing this in the main analyzer is going to be a bit more difficult.

  • Some ideas:
    • After the dictionary is done, order it by the length of the lists.
    • Start intersecting - starting with the smallest list. This is the quickest way to make sure a node is not transformable.
    • After an intersection, if you are left with an empty set, try "going back", and try intersecting with the other tuples in the list.
    • If there are no other tuples, try "going back even more" to the previous branch, and selecting a new tuple.
    • If you cannot "go back" anymore, the node is not transformable.
    • If you are done with all the branches, and the intersection is not an empty list, the node can be transformed.
    • Select one of the subjects from the set (Could ask user?)
    • Go through all the branches again, this time delete all the tuples from the list, except one, that contains the selected subject.
    • In the end, you should end up with a dictionary: Branch -> [(Plugin(), [subject])] for every branch of the main If-node.

Even though this solution smells, in practice I don't think many (if any) branches are going to have more than one plugin that can transform them.
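
The "intersect and go back" idea can be sketched as a small backtracking search. This is only an illustration under simplifying assumptions: plugins are represented by plain strings instead of plugin objects, branches are a list instead of a dictionary, and the function name choose_plugins is made up.

```python
def choose_plugins(branches):
    """Pick one (plugin, subjects) option per branch so that a subject is shared.

    `branches` has one entry per branch; each entry is a list of
    (plugin_name, subjects) options. Returns (common_subjects, plugin names
    in branch order), or None if the node is not transformable.
    """
    if not branches:
        return None
    # Visit the branches with the fewest subjects first, so that an
    # untransformable node is rejected as early as possible.
    order = sorted(
        range(len(branches)),
        key=lambda i: min((len(s) for _, s in branches[i]), default=0),
    )

    def search(position, common, chosen):
        if position == len(order):
            return common, chosen
        index = order[position]
        for plugin, subjects in branches[index]:
            narrowed = set(subjects) if common is None else common & set(subjects)
            if narrowed:
                result = search(position + 1, narrowed, {**chosen, index: plugin})
                if result is not None:
                    return result
        return None  # "go back": the caller tries its next option

    found = search(0, None, {})
    if found is None:
        return None
    common, chosen = found
    return common, [chosen[i] for i in range(len(branches))]
```

For the example above, choose_plugins([[("ClassCase", ["x"])], [("LiteralCase", ["x", "y"])]]) leaves x as the only common subject. Picking one subject from the returned set (or asking the user) is left out of the sketch.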

Nested Ifs

Nested If-nodes

if A:
    A_1  #Pre-nest
    if B:
        A AND B
    elif C:
        A AND C (AND not B)
    else:
        A_2 (AND not B AND not C)
    A_3 #Post-nest

'Flattening'

It is possible to "flatten" a nested If-node, but not without code repetition.
The example above can be transformed into:

if A AND B:
    A_1 #Pre-nest
    A AND B
    A_3 #Post-nest
elif A AND C:
    A_1 #Pre-nest
    A AND C
    A_3 #Post-nest
elif A:
    A_1 #Pre-nest
    A_2
    A_3 #Post-nest

It is worth noting that without the pre-nest and post-nest code segments, code repetition is minimal.
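
A minimal sketch of the flattening step on ast nodes follows. The function name flatten is hypothetical, and the sketch assumes an outer If-node with no else block and exactly one nested If-node in its body. It also assumes that the nested tests do not depend on the pre-nest code and that nothing leaves the block early, because flattening evaluates the nested tests before the pre-nest code runs.

```python
import ast
import copy


def flatten(outer):
    """Flatten an If-node whose body contains exactly one nested If-node.

    Returns a new ast.If, or None when the node does not have that shape.
    """
    nested_positions = [i for i, stmt in enumerate(outer.body)
                        if isinstance(stmt, ast.If)]
    if outer.orelse or len(nested_positions) != 1:
        return None
    position = nested_positions[0]
    pre = outer.body[:position]
    nested = outer.body[position]
    post = outer.body[position + 1:]

    # Collect (test, body) pairs of the nested if/elif/else chain.
    # A test of None stands for the else block, which may be empty.
    branches = []
    while True:
        branches.append((nested.test, nested.body))
        if len(nested.orelse) == 1 and isinstance(nested.orelse[0], ast.If):
            nested = nested.orelse[0]
        else:
            branches.append((None, nested.orelse))
            break

    # Build the flat chain back to front, so each new If can hold the
    # following branches in its orelse.
    chain = []
    for test, body in reversed(branches):
        if test is None:
            new_test = outer.test
        else:
            new_test = ast.BoolOp(op=ast.And(), values=[outer.test, test])
        new_body = (pre + body + post) or [ast.Pass()]
        chain = [ast.If(test=new_test, body=new_body, orelse=chain)]
    return ast.fix_missing_locations(copy.deepcopy(chain[0]))
```

Calling ast.unparse on the result gives the flattened source. The sketch always emits a final branch that tests only the outer condition, even when the nested node has no else block.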

Why is this useful

Consider the following code:

if isinstance(cat, Cat):
    if cat.color == 'black' or cat.color == 'gray':
        turn_around()
    elif cat.color == 'orange' and cat.weight == 'a lot':
        give_lasagne()
    else:
        ignore_cat()

This is a perfectly reasonable example.
Before pattern matching, when you wanted to branch your code, and all the branches shared at least one of the conditions, you could just nest the branches inside the shared condition.
But, with pattern matching, you could very easily write:

match cat:
    case Cat(color = 'black' | 'gray'):
        turn_around()
    case Cat(color = 'orange', weight = 'a lot'):
        give_lasagne()
    case Cat():
        ignore_cat()

The transformer cannot recognise nested If-nodes by default, since it doesn't even touch the body of the If-nodes.
So at the moment, the transformer would transform the code into:

match cat:
    case Cat():
        if cat.color == 'black' or cat.color == 'gray':
            turn_around()
        elif cat.color == 'orange' and cat.weight == 'a lot':
            give_lasagne()
        else:
            ignore_cat()

But, the transformer COULD recognise these patterns and produce the correct pattern-matching above, if it got the flattened version of the code as the input:

if isinstance(cat, Cat) and (cat.color == 'black' or cat.color == 'gray'):
    turn_around()
elif isinstance(cat, Cat) and (cat.color == 'orange' and cat.weight == 'a lot'):
    give_lasagne()
elif isinstance(cat, Cat):
    ignore_cat()

Faulty flattening logic

This:

if isinstance(channel, (Thread, TextChannel)) and guild is not None:
    member = guild.get_member(user_id)
    if member is None:
        member_data = data.get('member')
        if member_data:
            member = Member(data=member_data, state=self, guild=guild)

Got turned into this:

match channel:
    case Thread() | TextChannel() if guild is not None and member is None:
        member = guild.get_member(user_id)
        member_data = data.get('member')
        if member_data:
            member = Member(data=member_data, state=self, guild=guild)

I'm not 100% sure, but I think this is because the nested If-node doesn't have an "else:" block, so no case is created for when only the main If-node's test is true, which is a pretty big oversight.
Possible fixes:

  • only flatten if the nested node has an else block
  • or, always create a case where only the parent's test is transformed?
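
The difference can be reproduced with a reduced example. The three functions below are hypothetical stand-ins: original mirrors the shape of the reported code, faulty_flattening mirrors what was generated, and with_parent_only_case applies the second proposed fix.

```python
def original(flag, items):
    log = []
    if flag:
        log.append("pre-nest")
        if items:                   # nested If-node without an else block
            log.append("nested body")
    return log


def faulty_flattening(flag, items):
    log = []
    if flag and items:              # parent test and nested test merged
        log.append("pre-nest")
        log.append("nested body")
    return log


def with_parent_only_case(flag, items):
    log = []
    if flag and items:
        log.append("pre-nest")
        log.append("nested body")
    elif flag:                      # extra case: only the parent's test holds
        log.append("pre-nest")
    return log
```

For flag=True and an empty items list, original and with_parent_only_case both run the pre-nest code, while faulty_flattening runs nothing. Separately, in the reported code the nested test (member is None) reads a variable that the pre-nest code assigns, so the merged guard would be evaluated before that assignment. An extra parent-only case does not address that, so a dependency check of some kind would probably be needed as well.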

Return statements inside flattened If nodes

While testing, I found an oversight regarding the logic of "flattening".

I came across something like this:

def foo(obj):
    if isinstance(obj, SomeClass) and something():
        if something_else():
            return obj.copy()
        return obj

That got turned into:

def foo(obj):
    match obj:
        case SomeClass() if something() and something_else():
            return obj.copy()
            return obj

This is of course wrong. But I'm pretty sure it's quite easily fixable with a simple check.
I need to check if there are any return, continue, break, or yield keywords inside the nested If-node.
The whole idea of flattening is based on the fact that the code before the nested node (pre-nest) and the code after it (post-nest) always get executed if the main condition is true. However, if the nested node contains any of the aforementioned keywords, then this is no longer true.
To be safe, I'm only going to allow flattening if neither the pre-nest and post-nest code segments nor the nested If-node itself contain any of the return, continue, break, yield keywords.
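
The check described above could look roughly like this on ast nodes. The function names are hypothetical. The sketch is deliberately conservative: ast.walk also descends into nested function definitions and lambdas, so a return or yield there blocks flattening too.

```python
import ast

# Node types for the return, continue, break and yield keywords.
_CONTROL_FLOW = (ast.Return, ast.Continue, ast.Break, ast.Yield, ast.YieldFrom)


def has_control_flow(statements):
    """True if any statement is, or contains at any depth, one of the nodes above."""
    return any(
        isinstance(node, _CONTROL_FLOW)
        for stmt in statements
        for node in ast.walk(stmt)
    )


def safe_to_flatten(outer):
    """`outer` is the main ast.If; its body holds pre-nest, nested node and post-nest."""
    return not has_control_flow(outer.body)
```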

Recognize subjects other than Name nodes

Right now the subject must be a string (id), that is used to construct a Name node.
This limits transforming capabilities, for example:

if obj.attr == 2 or obj.attr == 3:
    do_something()
elif (obj.attr == 12 or obj.attr == 24) and something():
    do_something()

Could be transformed with LiteralCase to:

match obj.attr:
    case 2 | 3:
        do_something()
    case 12 | 24 if something():
        do_something()

This would work if the plugin were able to recognize that obj.attr could be a subject, even though it's not just a single Name() node.
I think this could be changed by modifying the get_subject, _get_subject, and get_const_node methods.
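
As a rough illustration of the idea, and not a rewrite of the methods named above, a subject could be identified by its unparsed source text instead of a bare id. The helper names below are hypothetical. The sketch only accepts dotted names (a Name, or a chain of Attribute nodes ending in a Name), so that the subject contains no calls that the match statement would evaluate a different number of times.

```python
import ast


def _is_dotted_name(node):
    # Accept `obj` or `obj.attr.more`, reject e.g. `f().attr` or `d['k'].attr`.
    while isinstance(node, ast.Attribute):
        node = node.value
    return isinstance(node, ast.Name)


def subject_candidates(test_source):
    """Return the left-hand sides of `<dotted name> == <constant>` comparisons."""
    tree = ast.parse(test_source, mode="eval")
    found = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Compare)
                and len(node.ops) == 1
                and isinstance(node.ops[0], ast.Eq)
                and _is_dotted_name(node.left)
                and isinstance(node.comparators[0], ast.Constant)):
            text = ast.unparse(node.left)
            if text not in found:
                found.append(text)
    return found
```

For the branches above, subject_candidates returns a single candidate, obj.attr, which can then be parsed back into an expression for the match subject.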
