You use str as a variable name and, by doing so, hide the underlying built-in. Here i

Refactor snippets to avoid using str as a variable about 30-seconds-of-python HOT 6 CLOSED

chalarangelo commented on May 12, 2024

Refactor snippets to avoid using str as a variable

from 30-seconds-of-python.

Comments (6)

Chalarangelo commented on May 12, 2024

Feel free to open a PR with a better naming convention. Some of these snippets have been written by people that come from a non-Pythonian background and we might have made some mistakes in the process. I would love to see some things like this, that might fly under the radar for us, fixed by the community.

Dekker1967 commented on May 12, 2024

I would like to tweak the functions. Would you do that function by function or in a single pull request?

Going through the examples I found many stumbling blocks. I am not perfect myself, but I could give them a first pass:

Naming is one thing: don't use str (built-in), filter (built-in), or string (standard-library module) as variable names.
Creating a function called zip() is "sub-optimal" to say the least: it shadows the real built-in zip function, which already does exactly what the snippet tries to do: https://docs.python.org/3.5/library/functions.html#zip
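
To make the shadowing point concrete, here is a minimal sketch. The function names shout and shout_fixed are hypothetical and are not taken from the repository's snippets; they only illustrate what happens when a parameter is called str.

```python
def shout(str):
    # Hypothetical snippet: the parameter named `str` hides the built-in
    # inside this function, so str(42) tries to call the argument instead.
    try:
        return str(42)
    except TypeError as exc:
        return f"TypeError: {exc}"


def shout_fixed(text):
    # With a neutral parameter name the built-in str() is reachable again.
    return text.upper() + str(42)


# The built-in zip() already pairs up elements from several iterables,
# so a hand-written zip() is not needed.
pairs = list(zip(["a", "b"], [1, 2]))
```

Calling shout("hi") hits the TypeError branch because the string argument is not callable, while shout_fixed("hi") can still use the built-in.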

For example, I have adapted chunk.md, based on a recipe found on Stack Overflow. Do you mention such sources? # https://stackoverflow.com/questions/9671224/split-a-python-list-into-other-sublists-i-e-smaller-lists

title: chunk
tags: list,intermediate

Chunks a list into smaller lists of a specified size.

By using the step parameter of range() we create the indices
at which each sub-list starts. Using these indices we can create
the sub-lists by slicing lst.

def chunk(lst, chunk_size):
    return [lst[x:x + chunk_size] for x in range(0, len(lst), chunk_size)]

chunk([1,2,3,4,5], 2)  # [[1, 2], [3, 4], [5]]

Or then count_by.md:

title: count_by
tags: list,intermediate

Groups the elements of a list based on the given function and returns the count of elements in each group.

Use map() to map the values of the given list using the given function.
Use collections.defaultdict to avoid the check whether a key is present in the count_dict.
Iterate over the list and increase the counter for each mapped element.
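Freeze count_dict after filling it by setting default_factory to None, so that later lookups of missing keys raise KeyError instead of adding entries.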

from collections import defaultdict

def count_by(lst, fn=None):
    if fn is None:
        fn = lambda x: x
    count_dict = defaultdict(int)
    for el in map(fn, lst):
        count_dict[el] += 1
    count_dict.default_factory = None
    return count_dict

from math import floor
count_by([6.1, 4.2, 6.3], floor) # {4: 1, 6: 2}
count_by(['one', 'two', 'three'], len) # {3: 2, 5: 1}
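
As an aside that is not part of the proposal above, the same tally can probably be expressed with collections.Counter. This is a sketch under the assumption that returning a Counter (a dict subclass) is acceptable; unlike the frozen defaultdict, a Counter returns 0 for missing keys instead of raising KeyError.

```python
from collections import Counter
from math import floor


def count_by(lst, fn=lambda x: x):
    # Counter tallies each mapped value in a single pass.
    return Counter(map(fn, lst))


floor_counts = count_by([6.1, 4.2, 6.3], floor)
len_counts = count_by(["one", "two", "three"], len)
```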

If you allow me to push changes, I will gladly contribute them. Should I open a pull request for each .md file, or can I submit them all together?

Chalarangelo commented on May 12, 2024

@Dekker1967 PR anything you see being "not very pythonic" and we will get right on it. Like I said, some of the people who contributed the content might have been either beginners or people not very well-versed in Python.

fejes713 commented on May 12, 2024

Thanks for the feedback. If you don't feel like opening a PR just let me know the proposed change here and I'll do it for you 👍

Chalarangelo commented on May 12, 2024

@Dekker1967 I would rather see changes bundled up in PRs thematically (e.g. replacing all uses of str variable), so we can more easily check them.

Sources are not something we are doing across repos for now, but you can mention in the PR for further reference.

zip itself is something I didn't know existed, leave it as is and we will figure it out independently (I'll get back to you later about this).

Dekker1967 commented on May 12, 2024

Meanwhile I went through many more examples. The variables named 'str' and 'filter' are the least of the problems, and never mind the inconsistent naming (lst vs arr). But they point to the underlying issue: I realized that the person who wrote all these examples is not very savvy in the Python language. Most of the "issues" the 30-second functions try to solve are already solved (the solutions are mostly "hidden" in the modules itertools, collections, and random).

Often the functions do not feel very pythonic. For example, all_equal.md feels C-like, where we could compare by pointer, but the proposed solution creates two new lists in Python, which is a big expense for large lists. The Python solution would be something more like: return len(set(lst)) <= 1. Much easier to grasp! And it also works for empty lists.

Finally, the desire to cast all results to a list doesn't feel right or very efficient. Is the result really needed as a list? To begin with, maybe the developer should have used a set from the start, because a set can be extended with set.add(). Data structures are here for a purpose.
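
The all_equal suggestion can be sketched as follows, reading the set-based check as len(set(lst)) <= 1. This assumes the elements are hashable; it is an illustration of the idea in the comment, not the repository's snippet.

```python
def all_equal(lst):
    # A set keeps one copy of each distinct value, so at most one
    # distinct value means all elements are equal. An empty list
    # yields an empty set and therefore also counts as "all equal".
    # Assumption: every element is hashable.
    return len(set(lst)) <= 1
```

For lists of unhashable items (for example lists of lists) this version raises TypeError, so a different comparison would be needed there.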

I like the idea of 30-second functions. But as they are now, they are not very helpful. This is not a question of renaming 'str' to something else; all the content is affected. I analyzed the last few functions and added some notes to each one. I withdraw my offer to help, but I wish you all the best for your project!

Here are a few comments on the last few functions (alphabetically):

zip.md - not needed; zip is a built-in Python function
values_only.md - no need to create a function for that: .values() already returns the values (a view object in Python 3), so wrap it in list() only when a list is really needed
unique_elements.md - no need to create a function for that
union_by.md - this is a very specific use case
union.md - why always render result into a list - a set has its advantages and a "cast" to list should happen for a reason
tail.md - a short and concise one-liner... is it worth creating a function for that?
symmetric_difference_by.md - analogous to symmetric_difference.md: list(set(map(fn, a)) ^ set(map(fn, b)))
symmetric_difference.md - the following code does the same: list(set(a) ^ set(b)) --> although I would not return a list
sum_by.md - a very specific use case and in the end almost no code - is it worth a function?
spread.md - a very specific use case - who mixes lists_of_ints and ints into the same list - the "expected" case would be: [[1],[2],[3],[4,5,6],[7],[8],[9]]
split_lines.md - this is not worth a function... split('\n') is enough - no need for indirection; 'str' is a bad variable name (shadows str-function)
some.md - exists already in the form of any()
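l1 = [0, 1, 2, 0]
l2 = [1, 1, 1, 1]
l3 = [0, 0, 1, 0]
l4 = [0, 0, 0, 0]
print(any(map(lambda x: x >= 2, l1)))  # True: one element is >= 2
print(any(map(lambda x: x >= 2, l2)))  # False: no element is >= 2
print(any(l3))  # True: one element is truthy
print(any(l4))  # False: every element is 0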
snake.md - https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
similarity.md - achieved via: set(a) & set(b)
shuffle.md - module random --> random.shuffle(lst) shuffles in place and returns None; if you want a shuffled copy: random.sample(lst, len(lst))
sample.md - module random --> random.sample() is your friend
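
Several of the notes above point at built-ins and standard-library calls. The sketch below shows them side by side; the variable names and sample values are made up for illustration, and the results are left as sets rather than cast to lists, in line with the comment's preference.

```python
import random

a = [1, 2, 3]
b = [2, 3, 4]

common = set(a) & set(b)    # similarity: elements present in both
either = set(a) | set(b)    # union: elements present in at least one
only_one = set(a) ^ set(b)  # symmetric difference: in exactly one

deck = [1, 2, 3, 4, 5]

# random.sample() returns a new list and leaves `deck` untouched,
# so sampling len(deck) items gives a shuffled copy.
shuffled_copy = random.sample(deck, len(deck))

# random.shuffle() reorders the list in place and returns None.
random.shuffle(deck)

# Pick two distinct elements without modifying the list.
picked = random.sample(deck, 2)
```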
