A basic Python cheatsheet, based mostly on Al Sweigart's book Automate the Boring Stuff with Python (released under a Creative Commons license) and on many other sources.
All contributions are welcome:
- Read the issues, fork the project and do a Pull Request.
- Request a new topic by creating a New issue with the enhancement tag.
- Find any kind of errors in the cheat sheet and create a New issue with the details, or fork the project and do a Pull Request.
- Suggest a better or more pythonic way for existing examples.
- The Zen of Python
- Python Basics
- Flow Control
- Comparison Operators
- Boolean evaluation
- Boolean Operators
- Mixing Boolean and Comparison Operators
- if Statements
- else Statements
- elif Statements
- while Loop Statements
- break Statements
- continue Statements
- for Loops and the range() Function
- For else statement
- Importing Modules
- Ending a Program Early with sys.exit()
- Functions
- Exception Handling
- Lists
- Getting Individual Values in a List with Indexes
- Negative Indexes
- Getting Sublists with Slices
- Getting a List’s Length with len()
- Changing Values in a List with Indexes
- List Concatenation and List Replication
- Removing Values from Lists with del Statements
- Using for Loops with Lists
- Looping Through Multiple Lists with zip()
- The in and not in Operators
- The Multiple Assignment Trick
- Augmented Assignment Operators
- Finding a Value in a List with the index() Method
- Adding Values to Lists with the append() and insert() Methods
- Removing Values from Lists with remove()
- Sorting the Values in a List with the sort() Method
- Tuple Data Type
- Converting Types with the list() and tuple() Functions
- Dictionaries and Structuring Data
- sets
- itertools Module
- Comprehensions
- Manipulating Strings
- Escape Characters
- Raw Strings
- Multiline Strings with Triple Quotes
- Indexing and Slicing Strings
- The in and not in Operators with Strings
- The in and not in Operators with list
- The upper(), lower(), isupper(), and islower() String Methods
- The isX String Methods
- The startswith() and endswith() String Methods
- The join() and split() String Methods
- Justifying Text with rjust(), ljust(), and center()
- Removing Whitespace with strip(), rstrip(), and lstrip()
- Copying and Pasting Strings with the pyperclip Module (need pip install)
- String Formatting
- Regular Expressions
- Matching Regex Objects
- Grouping with Parentheses
- Matching Multiple Groups with the Pipe
- Optional Matching with the Question Mark
- Matching Zero or More with the Star
- Matching One or More with the Plus
- Matching Specific Repetitions with Curly Brackets
- Greedy and Nongreedy Matching
- The findall() Method
- Making Your Own Character Classes
- The Caret and Dollar Sign Characters
- The Wildcard Character
- Matching Everything with Dot-Star
- Matching Newlines with the Dot Character
- Review of Regex Symbols
- Case-Insensitive Matching
- Substituting Strings with the sub() Method
- Managing Complex Regexes
- Handling File and Directory Paths
- Backslash on Windows and Forward Slash on OS X and Linux
- The Current Working Directory
- Creating New Folders
- Absolute vs. Relative Paths
- Handling Absolute and Relative Paths
- Checking Path Validity
- Finding File Sizes and Folder Contents
- Copying Files and Folders
- Moving and Renaming Files and Folders
- Permanently Deleting Files and Folders
- Safe Deletes with the send2trash Module
- Walking a Directory Tree
- Reading and Writing Files
- JSON, YAML and configuration files
- Debugging
- Lambda Functions
- Ternary Conditional Operator
- args and kwargs
- Context Manager
- __main__ Top-level script environment
- setup.py
- Dataclasses
- Virtual Environment
From the PEP 20 -- The Zen of Python:
Long time Pythoneer Tim Peters succinctly channels the BDFL's guiding principles for Python's design into 20 aphorisms, only 19 of which have been written down.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
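From highest to lowest precedence (%, //, / and * share one level, as do - and +; operators on the same level are evaluated left to right):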
Operators | Operation | Example |
---|---|---|
** | Exponent | 2 ** 3 = 8 |
% | Modulus/Remainder | 22 % 8 = 6 |
// | Integer division | 22 // 8 = 2 |
/ | Division | 22 / 8 = 2.75 |
* | Multiplication | 3 * 3 = 9 |
- | Subtraction | 5 - 2 = 3 |
+ | Addition | 2 + 2 = 4 |
Examples of expressions in the interactive shell:
>>> 2 + 3 * 6
20
>>> (2 + 3) * 6
30
>>> 2 ** 8
256
>>> 23 // 7
3
>>> 23 % 7
2
>>> (5 - 1) * ((7 + 1) / (3 - 1))
16.0
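As a small sketch of how these rules combine (the variable names are illustrative and not part of the original examples): the exponent is applied first, operators on the same level run left to right, and parentheses override both.

```python
# ** is applied before the multiplication
power_first = 2 * 3 ** 2        # evaluated as 2 * (3 ** 2)

# % and * share one precedence level, so they run left to right
left_to_right = 22 % 8 * 2      # evaluated as (22 % 8) * 2

# parentheses override the default order
grouped = 22 % (8 * 2)          # evaluated as 22 % 16

print(power_first, left_to_right, grouped)
```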
Data Type | Examples |
---|---|
Integers | -2, -1, 0, 1, 2, 3, 4, 5 |
Floating-point numbers | -1.25, -1.0, -0.5, 0.0, 0.5, 1.0, 1.25 |
Strings | 'a', 'aa', 'aaa', 'Hello!', '11 cats' |
String concatenation:
>>> 'Alice' 'Bob'
'AliceBob'
Note: Avoid the + operator for string concatenation. Prefer string formatting.
String Replication:
>>> 'Alice' * 5
'AliceAliceAliceAliceAlice'
You can name a variable anything as long as it obeys the following three rules:
- It can be only one word.
- It can use only letters, numbers, and the underscore (_) character.
- It can’t begin with a number.
- By convention, a variable name starting with an underscore (_) is considered "unuseful".
Example:
>>> spam = 'Hello'
>>> spam
'Hello'
>>> _spam = 'Hello'
_spam should not be used again in the code.
Inline comment:
# This is a comment
Multiline comment:
# This is a
# multiline comment
Code with a comment:
a = 1 # initialization
Please note the two spaces in front of the comment.
Function docstring:
def foo():
    """
    This is a function docstring
    You can also use:
    ''' Function Docstring '''
    """
>>> print('Hello world!')
Hello world!
>>> a = 1
>>> print('Hello world!', a)
Hello world! 1
Example Code:
>>> print('What is your name?') # ask for their name
>>> myName = input()
>>> print('It is good to meet you, {}'.format(myName))
What is your name?
Al
It is good to meet you, Al
Evaluates to the integer value of the number of characters in a string:
>>> len('hello')
5
Note: to test whether a string, list, dictionary, etc. is empty, do not use len(); prefer direct boolean evaluation.
>>> a = [1, 2, 3]
>>> if a:
>>>     print("the list is not empty!")
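A minimal sketch of the same idea for several container types. The helper name describe is hypothetical and only used for illustration; it relies on the fact that empty strings, lists, dictionaries, sets and tuples are all falsy.

```python
def describe(container):
    """Return a short label using implicit boolean evaluation."""
    # Empty containers evaluate to False, non-empty ones to True
    return 'not empty' if container else 'empty'


print(describe([1, 2, 3]))
print(describe([]))
print(describe(''))
print(describe({'a': 1}))
```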
Integer to String or Float:
>>> str(29)
'29'
>>> print('I am {} years old.'.format(str(29)))
I am 29 years old.
>>> str(-3.14)
'-3.14'
Float to Integer:
>>> int(7.7)
7
>>> int(7.7) + 1
8
Operator | Meaning |
---|---|
== | Equal to |
!= | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
These operators evaluate to True or False depending on the values you give them.
Examples:
>>> 42 == 42
True
>>> 40 == 42
False
>>> 'hello' == 'hello'
True
>>> 'hello' == 'Hello'
False
>>> 'dog' != 'cat'
True
>>> 42 == 42.0
True
>>> 42 == '42'
False
Never use the == or != operators to evaluate a boolean operation. Use the is or is not operators, or use implicit boolean evaluation.
NO (even if they are valid Python):
>>> True == True
True
>>> True != False
True
YES:
>>> True is True
True
>>> True is not False
True
These statements are equivalent when a is a Boolean value:
>>> if a is True:
>>>     pass
>>> if a is not False:
>>>     pass
>>> if a:
>>>     pass
And these as well:
>>> if a is False:
>>>     pass
>>> if a is not True:
>>>     pass
>>> if not a:
>>>     pass
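One caveat worth sketching: the is True form only matches the Boolean object itself, whereas implicit evaluation accepts any truthy value. The sample values below are chosen purely for illustration.

```python
values = [True, 1, 'text', [], None]

for value in values:
    # repr, identity check against True, truthiness
    print(repr(value), value is True, bool(value))

# Only the Boolean True passes the identity test
identity_matches = [v for v in values if v is True]

# Any non-zero number or non-empty container is truthy
truthy = [v for v in values if v]
```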
There are three Boolean operators: and, or, and not.
The and Operator’s Truth Table:
Expression | Evaluates to |
---|---|
True and True | True |
True and False | False |
False and True | False |
False and False | False |
The or Operator’s Truth Table:
Expression | Evaluates to |
---|---|
True or True | True |
True or False | True |
False or True | True |
False or False | False |
The not Operator’s Truth Table:
Expression | Evaluates to |
---|---|
not True | False |
not False | True |
>>> (4 < 5) and (5 < 6)
True
>>> (4 < 5) and (9 < 6)
False
>>> (1 == 2) or (2 == 2)
True
You can also use multiple Boolean operators in an expression, along with the comparison operators:
>>> 2 + 2 == 4 and not 2 + 2 == 5 and 2 * 2 == 2 + 2
True
if name == 'Alice':
    print('Hi, Alice.')
name = 'Bob'
if name == 'Alice':
    print('Hi, Alice.')
else:
    print('Hello, stranger.')
name = 'Bob'
age = 5
if name == 'Alice':
    print('Hi, Alice.')
elif age < 12:
    print('You are not Alice, kiddo.')
name = 'Bob'
age = 30
if name == 'Alice':
    print('Hi, Alice.')
elif age < 12:
    print('You are not Alice, kiddo.')
else:
    print('You are neither Alice nor a little kid.')
spam = 0
while spam < 5:
    print('Hello, world.')
    spam = spam + 1
If the execution reaches a break statement, it immediately exits the while loop’s clause:
while True:
    print('Please type your name.')
    name = input()
    if name == 'your name':
        break
print('Thank you!')
When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop.
while True:
    print('Who are you?')
    name = input()
    if name != 'Joe':
        continue
    print('Hello, Joe. What is the password? (It is a fish.)')
    password = input()
    if password == 'swordfish':
        break
print('Access granted.')
>>> print('My name is')
>>> for i in range(5):
>>>     print('Jimmy Five Times ({})'.format(str(i)))
My name is
Jimmy Five Times (0)
Jimmy Five Times (1)
Jimmy Five Times (2)
Jimmy Five Times (3)
Jimmy Five Times (4)
The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.
>>> for i in range(0, 10, 2):
>>>     print(i)
0
2
4
6
8
You can even use a negative number for the step argument to make the for loop count down instead of up.
>>> for i in range(5, -1, -1):
>>>     print(i)
5
4
3
2
1
0
The else clause of a for loop runs only when the loop has finished without hitting a break. It is therefore only useful when a break condition can occur in the loop:
>>> for i in [1, 2, 3, 4, 5]:
>>>     if i == 3:
>>>         break
>>> else:
>>>     print("only executed when no item of the list is equal to 3")
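A typical use is a search loop. The sketch below uses a hypothetical helper, find_first_negative, to show the else branch acting as the "nothing found" case.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for number in numbers:
        if number < 0:
            result = number
            break
    else:
        # Runs only when the loop finished without hitting break
        result = None
    return result


print(find_first_negative([3, -1, 4]))
print(find_first_negative([1, 2]))
```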
import random
for i in range(5):
    print(random.randint(1, 10))
import random, sys, os, math
from random import *
import sys
while True:
    print('Type exit to exit.')
    response = input()
    if response == 'exit':
        sys.exit()
    print('You typed {}.'.format(response))
>>> def hello(name):
>>>     print('Hello {}'.format(name))
>>>
>>> hello('Alice')
>>> hello('Bob')
Hello Alice
Hello Bob
When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the following:
- The return keyword.
- The value or expression that the function should return.
import random
def getAnswer(answerNumber):
    if answerNumber == 1:
        return 'It is certain'
    elif answerNumber == 2:
        return 'It is decidedly so'
    elif answerNumber == 3:
        return 'Yes'
    elif answerNumber == 4:
        return 'Reply hazy try again'
    elif answerNumber == 5:
        return 'Ask again later'
    elif answerNumber == 6:
        return 'Concentrate and ask again'
    elif answerNumber == 7:
        return 'My reply is no'
    elif answerNumber == 8:
        return 'Outlook not so good'
    elif answerNumber == 9:
        return 'Very doubtful'
r = random.randint(1, 9)
fortune = getAnswer(r)
print(fortune)
>>> spam = print('Hello!')
Hello!
>>> spam is None
True
Note: never compare to None with the == operator. Always use is.
>>> print('Hello', end='')
>>> print('World')
HelloWorld
>>> print('cats', 'dogs', 'mice')
cats dogs mice
>>> print('cats', 'dogs', 'mice', sep=',')
cats,dogs,mice
- Code in the global scope cannot use any local variables.
- However, a local scope can access global variables.
- Code in a function’s local scope cannot use variables in any other local scope.
- You can use the same name for different variables if they are in different scopes. That is, there can be a local variable named spam and a global variable also named spam.
If you need to modify a global variable from within a function, use the global statement:
>>> def spam():
>>>     global eggs
>>>     eggs = 'spam'
>>>
>>> eggs = 'global'
>>> spam()
>>> print(eggs)
spam
There are four rules to tell whether a variable is in a local scope or global scope:
- If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.
- If there is a global statement for that variable in a function, it is a global variable.
- Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.
- But if the variable is not used in an assignment statement, it is a global variable.
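The four rules can be sketched with three small functions. The function names read_only, shadow and rebind are illustrative assumptions, not part of the original text.

```python
eggs = 'global'   # used outside all functions, so it is a global variable


def read_only():
    # No assignment here, so eggs refers to the global variable
    return eggs


def shadow():
    # The assignment makes eggs a local variable of this function
    eggs = 'local'
    return eggs


def rebind():
    # The global statement makes the assignment target the global eggs
    global eggs
    eggs = 'changed'


before = read_only()
shadowed = shadow()
after_shadow = eggs      # the global is untouched by shadow()
rebind()
after_rebind = eggs      # the global was modified by rebind()
```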
>>> def spam(divideBy):
>>>     try:
>>>         return 42 / divideBy
>>>     except ZeroDivisionError as e:
>>>         print('Error: Invalid argument: {}'.format(e))
>>>
>>> print(spam(2))
>>> print(spam(12))
>>> print(spam(0))
>>> print(spam(1))
21.0
3.5
Error: Invalid argument: division by zero
None
42.0
Code inside the finally section is always executed, whether or not an exception has been raised, and even if an exception is not caught.
>>> def spam(divideBy):
>>>     try:
>>>         return 42 / divideBy
>>>     except ZeroDivisionError as e:
>>>         print('Error: Invalid argument: {}'.format(e))
>>>     finally:
>>>         print("-- division finished --")
>>> print(spam(12))
>>> print(spam(0))
-- division finished --
3.5
Error: Invalid argument: division by zero
-- division finished --
None
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam
['cat', 'bat', 'rat', 'elephant']
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[0]
'cat'
>>> spam[1]
'bat'
>>> spam[2]
'rat'
>>> spam[3]
'elephant'
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[-1]
'elephant'
>>> spam[-3]
'bat'
>>> 'The {} is afraid of the {}.'.format(spam[-1], spam[-3])
'The elephant is afraid of the bat.'
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[0:4]
['cat', 'bat', 'rat', 'elephant']
>>> spam[1:3]
['bat', 'rat']
>>> spam[0:-1]
['cat', 'bat', 'rat']
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[:2]
['cat', 'bat']
>>> spam[1:]
['bat', 'rat', 'elephant']
Slicing the complete list will perform a copy:
>>> spam2 = spam[:]
>>> spam2
['cat', 'bat', 'rat', 'elephant']
>>> spam.append('dog')
>>> spam
['cat', 'bat', 'rat', 'elephant', 'dog']
>>> spam2
['cat', 'bat', 'rat', 'elephant']
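Note that a full slice makes a shallow copy: the outer list is new, but nested objects are still shared. The sketch below contrasts it with copy.deepcopy from the standard library; the variable names are illustrative.

```python
import copy

nested = [['cat', 'bat'], ['rat']]
shallow = nested[:]             # new outer list, same inner lists
deep = copy.deepcopy(nested)    # inner lists are copied as well

nested[0].append('dog')         # visible through the shallow copy
nested.append(['elephant'])     # not visible: the outer lists differ

print(shallow)
print(deep)
```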
>>> spam = ['cat', 'dog', 'moose']
>>> len(spam)
3
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[1] = 'aardvark'
>>> spam
['cat', 'aardvark', 'rat', 'elephant']
>>> spam[2] = spam[1]
>>> spam
['cat', 'aardvark', 'aardvark', 'elephant']
>>> spam[-1] = 12345
>>> spam
['cat', 'aardvark', 'aardvark', 12345]
>>> [1, 2, 3] + ['A', 'B', 'C']
[1, 2, 3, 'A', 'B', 'C']
>>> ['X', 'Y', 'Z'] * 3
['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z']
>>> spam = [1, 2, 3]
>>> spam = spam + ['A', 'B', 'C']
>>> spam
[1, 2, 3, 'A', 'B', 'C']
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> del spam[2]
>>> spam
['cat', 'bat', 'elephant']
>>> del spam[2]
>>> spam
['cat', 'bat']
>>> supplies = ['pens', 'staplers', 'flame-throwers', 'binders']
>>> for i, supply in enumerate(supplies):
>>>     print('Index {} in supplies is: {}'.format(str(i), supply))
Index 0 in supplies is: pens
Index 1 in supplies is: staplers
Index 2 in supplies is: flame-throwers
Index 3 in supplies is: binders
>>> name = ['Pete', 'John', 'Elizabeth']
>>> age = [6, 23, 44]
>>> for n, a in zip(name, age):
>>>     print('{} is {} years old'.format(n, a))
Pete is 6 years old
John is 23 years old
Elizabeth is 44 years old
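When the lists have different lengths, zip() stops at the shortest one. A short sketch with made-up data, which also shows zip() feeding dict():

```python
names = ['Pete', 'John', 'Elizabeth']
ages = [6, 23]                       # one value short on purpose

# Pairing stops as soon as the shorter list is exhausted
pairs = list(zip(names, ages))

# zip() is also a handy way to build a dictionary from two lists
age_by_name = dict(zip(names, ages))

print(pairs)
print(age_by_name)
```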
>>> 'howdy' in ['hello', 'hi', 'howdy', 'heyas']
True
>>> spam = ['hello', 'hi', 'howdy', 'heyas']
>>> 'cat' in spam
False
>>> 'howdy' not in spam
False
>>> 'cat' not in spam
True
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:
>>> cat = ['fat', 'orange', 'loud']
>>> size = cat[0]
>>> color = cat[1]
>>> disposition = cat[2]
You could type this line of code:
>>> cat = ['fat', 'orange', 'loud']
>>> size, color, disposition = cat
The multiple assignment trick can also be used to swap the values in two variables:
>>> a, b = 'Alice', 'Bob'
>>> a, b = b, a
>>> print(a)
Bob
>>> print(b)
Alice
Operator | Equivalent |
---|---|
spam += 1 | spam = spam + 1 |
spam -= 1 | spam = spam - 1 |
spam *= 1 | spam = spam * 1 |
spam /= 1 | spam = spam / 1 |
spam %= 1 | spam = spam % 1 |
Examples:
>>> spam = 'Hello'
>>> spam += ' world!'
>>> spam
'Hello world!'
>>> bacon = ['Zophie']
>>> bacon *= 3
>>> bacon
['Zophie', 'Zophie', 'Zophie']
>>> spam = ['Zophie', 'Pooka', 'Fat-tail', 'Pooka']
>>> spam.index('Pooka')
1
append():
>>> spam = ['cat', 'dog', 'bat']
>>> spam.append('moose')
>>> spam
['cat', 'dog', 'bat', 'moose']
insert():
>>> spam = ['cat', 'dog', 'bat']
>>> spam.insert(1, 'chicken')
>>> spam
['cat', 'chicken', 'dog', 'bat']
>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam.remove('bat')
>>> spam
['cat', 'rat', 'elephant']
If the value appears multiple times in the list, only the first instance of the value will be removed.
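A brief sketch of that behaviour, with sample data chosen for illustration. To drop every occurrence, one option is a list comprehension that builds a new list.

```python
spam = ['cat', 'bat', 'rat', 'cat', 'hat', 'cat']

spam.remove('cat')            # removes only the first 'cat'
after_remove = list(spam)

# Build a new list without any 'cat' entries
without_cats = [item for item in spam if item != 'cat']

print(after_remove)
print(without_cats)
```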
>>> spam = [2, 5, 3.14, 1, -7]
>>> spam.sort()
>>> spam
[-7, 1, 2, 3.14, 5]
>>> spam = ['ants', 'cats', 'dogs', 'badgers', 'elephants']
>>> spam.sort()
>>> spam
['ants', 'badgers', 'cats', 'dogs', 'elephants']
You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order:
>>> spam.sort(reverse=True)
>>> spam
['elephants', 'dogs', 'cats', 'badgers', 'ants']
If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call:
>>> spam = ['a', 'z', 'A', 'Z']
>>> spam.sort(key=str.lower)
>>> spam
['a', 'A', 'z', 'Z']
You can use the built-in function sorted to return a new list:
>>> spam = ['ants', 'cats', 'dogs', 'badgers', 'elephants']
>>> sorted(spam)
['ants', 'badgers', 'cats', 'dogs', 'elephants']
>>> eggs = ('hello', 42, 0.5)
>>> eggs[0]
'hello'
>>> eggs[1:3]
(42, 0.5)
>>> len(eggs)
3
The main way that tuples are different from lists is that tuples, like strings, are immutable.
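A minimal sketch of what immutability means in practice: item assignment raises a TypeError, so a changed tuple has to be built as a new object. The variable names are illustrative.

```python
eggs = ('hello', 42, 0.5)

try:
    eggs[1] = 99                  # tuples do not allow item assignment
except TypeError as error:
    message = str(error)

# Build a new tuple instead of modifying the existing one
updated = eggs[:1] + (99,) + eggs[2:]

print(message)
print(updated)
```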
>>> tuple(['cat', 'dog', 5])
('cat', 'dog', 5)
>>> list(('cat', 'dog', 5))
['cat', 'dog', 5]
>>> list('hello')
['h', 'e', 'l', 'l', 'o']
Example Dictionary:
myCat = {'size': 'fat', 'color': 'gray', 'disposition': 'loud'}
values():
>>> spam = {'color': 'red', 'age': 42}
>>> for v in spam.values():
>>>     print(v)
red
42
keys():
>>> for k in spam.keys():
>>>     print(k)
color
age
items():
>>> for i in spam.items():
>>>     print(i)
('color', 'red')
('age', 42)
Using the keys(), values(), and items() methods, a for loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively.
>>> spam = {'color': 'red', 'age': 42}
>>>
>>> for k, v in spam.items():
>>>     print('Key: {} Value: {}'.format(k, str(v)))
Key: color Value: red
Key: age Value: 42
>>> spam = {'name': 'Zophie', 'age': 7}
>>> 'name' in spam.keys()
True
>>> 'Zophie' in spam.values()
True
>>> # You can omit the call to keys() when checking for a key
>>> 'color' in spam
False
>>> 'color' not in spam
True
>>> picnic_items = {'apples': 5, 'cups': 2}
>>> 'I am bringing {} cups.'.format(str(picnic_items.get('cups', 0)))
'I am bringing 2 cups.'
>>> 'I am bringing {} eggs.'.format(str(picnic_items.get('eggs', 0)))
'I am bringing 0 eggs.'
Let's consider this code:
spam = {'name': 'Pooka', 'age': 5}
if 'color' not in spam:
    spam['color'] = 'black'
Using setdefault we could write the same code more succinctly:
>>> spam = {'name': 'Pooka', 'age': 5}
>>> spam.setdefault('color', 'black')
'black'
>>> spam
{'name': 'Pooka', 'age': 5, 'color': 'black'}
>>> spam.setdefault('color', 'white')
'black'
>>> spam
{'name': 'Pooka', 'age': 5, 'color': 'black'}
>>> import pprint
>>>
>>> message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
>>> count = {}
>>>
>>> for character in message:
>>>     count.setdefault(character, 0)
>>>     count[character] = count[character] + 1
>>>
>>> pprint.pprint(count)
{' ': 13,
',': 1,
'.': 1,
'A': 1,
'I': 1,
'a': 4,
'b': 1,
'c': 3,
'd': 3,
'e': 5,
'g': 2,
'h': 3,
'i': 6,
'k': 2,
'l': 3,
'n': 4,
'o': 2,
'p': 1,
'r': 5,
's': 3,
't': 6,
'w': 2,
'y': 1}
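As an alternative sketch (not the approach used above), collections.Counter from the standard library performs the same counting without the explicit setdefault() loop:

```python
from collections import Counter

message = 'It was a bright cold day in April, and the clocks were striking thirteen.'

# Counter is a dict subclass that maps each character to its count
count = Counter(message)

print(count[' '])    # number of spaces
print(count['i'])    # number of lowercase i characters
print(count['z'])    # missing keys count as 0 instead of raising KeyError
```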
# in Python 3.5+:
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> z = {**x, **y}
>>> z
{'a': 1, 'b': 3, 'c': 4}
# in Python 2.7
>>> z = dict(x, **y)
>>> z
{'c': 4, 'a': 1, 'b': 3}
From the Python 3 documentation
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
There are two ways to create sets: using curly braces {} and the built-in function set()
>>> s = {1, 2, 3}
>>> s = set([1, 2, 3])
When creating an empty set, be sure not to use the curly braces {} or you will get an empty dictionary instead.
>>> s = {}
>>> type(s)
<class 'dict'>
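A short sketch of the difference, using set() with no arguments to get an actual empty set (the variable names are illustrative):

```python
empty_dict = {}        # curly braces alone create a dictionary
empty_set = set()      # set() with no arguments creates an empty set

empty_set.add('spam')  # works, because this really is a set

print(type(empty_dict))
print(type(empty_set))
```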
A set automatically removes all duplicate values.
>>> s = {1, 2, 3, 2, 3, 4}
>>> s
{1, 2, 3, 4}
And as an unordered data type, sets can't be indexed.
>>> s = {1, 2, 3}
>>> s[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable
>>>
Using the add() method we can add a single element to the set.
>>> s = {1, 2, 3}
>>> s.add(4)
>>> s
{1, 2, 3, 4}
And with update(), multiple ones.
>>> s = {1, 2, 3}
>>> s.update([2, 3, 4, 5, 6])
>>> s
{1, 2, 3, 4, 5, 6} # remember, sets automatically remove duplicates
Both remove() and discard() remove an element from the set, but remove() will raise a KeyError if the value doesn't exist.
>>> s = {1, 2, 3}
>>> s.remove(3)
>>> s
{1, 2}
>>> s.remove(3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 3
discard() won't raise any errors.
>>> s = {1, 2, 3}
>>> s.discard(3)
>>> s
{1, 2}
>>> s.discard(3)
>>>
union() or | will create a new set that contains all the elements from the sets provided.
>>> s1 = {1, 2, 3}
>>> s2 = {3, 4, 5}
>>> s1.union(s2) # or 's1 | s2'
{1, 2, 3, 4, 5}
intersection() or & will return a set containing only the elements that are common to all of them.
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s3 = {3, 4, 5}
>>> s1.intersection(s2, s3) # or 's1 & s2 & s3'
{3}
difference() or - will return only the elements that are in the first set but not in the others.
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.difference(s2) # or 's1 - s2'
{1}
symmetric_difference() or ^ will return all the elements that are not common between them.
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.symmetric_difference(s2) # or 's1 ^ s2'
{1, 4}
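To see the four operations side by side, here is a small sketch with made-up guest lists. Note that difference depends on the order of the operands, while the other three do not.

```python
invited = {'Alice', 'Bob', 'Carol'}
arrived = {'Bob', 'Carol', 'Dave'}

everyone = invited | arrived             # union
invited_and_here = invited & arrived     # intersection
missing = invited - arrived              # invited but did not arrive
unexpected = arrived - invited           # arrived without an invitation
mismatched = invited ^ arrived           # in exactly one of the two sets

print(sorted(everyone))
print(sorted(mismatched))
```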
The itertools module is a collection of tools intended to be fast and to use memory efficiently when handling iterables (like lists or dictionaries).
From the official Python 3.x documentation:
The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.
The itertools module comes in the standard library and must be imported.
The operator module will also be used. This module is not necessary when using itertools, but needed for some of the examples below.
Makes an iterator that returns the accumulated results of a two-argument function applied along the iterable.
itertools.accumulate(iterable[, func])
Example:
>>> data = [1, 2, 3, 4, 5]
>>> result = itertools.accumulate(data, operator.mul)
>>> for each in result:
>>>     print(each)
1
2
6
24
120
The operator.mul takes two numbers and multiplies them:
operator.mul(1, 2)
2
operator.mul(2, 3)
6
operator.mul(6, 4)
24
operator.mul(24, 5)
120
Passing a function is optional:
>>> data = [5, 2, 6, 4, 5, 9, 1]
>>> result = itertools.accumulate(data)
>>> for each in result:
>>>     print(each)
5
7
13
17
22
31
32
If no function is designated the items will be summed:
5
5 + 2 = 7
7 + 6 = 13
13 + 4 = 17
17 + 5 = 22
22 + 9 = 31
31 + 1 = 32
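Any function of two arguments can be passed. As a sketch, the built-in max yields a running maximum over the same data, shown next to the default running sum:

```python
import itertools

data = [5, 2, 6, 4, 5, 9, 1]

# Default behaviour: running sum
running_total = list(itertools.accumulate(data))

# With max as the function: largest value seen so far
running_max = list(itertools.accumulate(data, max))

print(running_total)
print(running_max)
```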
Takes an iterable and an integer r. This will create all the unique combinations that have r members.
itertools.combinations(iterable, r)
Example:
>>> shapes = ['circle', 'triangle', 'square',]
>>> result = itertools.combinations(shapes, 2)
>>> for each in result:
>>>     print(each)
('circle', 'triangle')
('circle', 'square')
('triangle', 'square')
Just like combinations(), but allows individual elements to be repeated more than once.
itertools.combinations_with_replacement(iterable, r)
Example:
>>> shapes = ['circle', 'triangle', 'square']
>>> result = itertools.combinations_with_replacement(shapes, 2)
>>> for each in result:
>>>     print(each)
('circle', 'circle')
('circle', 'triangle')
('circle', 'square')
('triangle', 'triangle')
('triangle', 'square')
('square', 'square')
Makes an iterator that returns evenly spaced values starting with number start.
itertools.count(start=0, step=1)
Example:
>>> for i in itertools.count(10, 3):
>>>     print(i)
>>>     if i > 20:
>>>         break
10
13
16
19
22
This function cycles through an iterator endlessly.
itertools.cycle(iterable)
Example:
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue', 'violet']
>>> for color in itertools.cycle(colors):
>>>     print(color)
red
orange
yellow
green
blue
violet
red
orange
When it reaches the end of the iterable, it starts over again from the beginning.
Take a series of iterables and return them as one long iterable.
itertools.chain(*iterables)
Example:
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> shapes = ['circle', 'triangle', 'square', 'pentagon']
>>> result = itertools.chain(colors, shapes)
>>> for each in result:
>>>     print(each)
red
orange
yellow
green
blue
circle
triangle
square
pentagon
Filters one iterable with another.
itertools.compress(data, selectors)
Example:
>>> shapes = ['circle', 'triangle', 'square', 'pentagon']
>>> selections = [True, False, True, False]
>>> result = itertools.compress(shapes, selections)
>>> for each in result:
>>>     print(each)
circle
square
Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element.
itertools.dropwhile(predicate, iterable)
Example:
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> result = itertools.dropwhile(lambda x: x<5, data)
>>> for each in result:
>>>     print(each)
5
6
7
8
9
10
1
Makes an iterator that filters elements from iterable returning only those for which the predicate is False.
itertools.filterfalse(predicate, iterable)
Example:
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> result = itertools.filterfalse(lambda x: x<5, data)
>>> for each in result:
>>>     print(each)
5
6
7
8
9
10
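Simply put, this function groups consecutive items that share the same key. Sort the iterable by that key first if every matching item should end up in a single group.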
itertools.groupby(iterable, key=None)
Example:
>>> robots = [{
    'name': 'blaster',
    'faction': 'autobot'
}, {
    'name': 'galvatron',
    'faction': 'decepticon'
}, {
    'name': 'jazz',
    'faction': 'autobot'
}, {
    'name': 'metroplex',
    'faction': 'autobot'
}, {
    'name': 'megatron',
    'faction': 'decepticon'
}, {
    'name': 'starcream',
    'faction': 'decepticon'
}]
>>> for key, group in itertools.groupby(robots, key=lambda x: x['faction']):
>>>     print(key)
>>>     print(list(group))
autobot
[{'name': 'blaster', 'faction': 'autobot'}]
decepticon
[{'name': 'galvatron', 'faction': 'decepticon'}]
autobot
[{'name': 'jazz', 'faction': 'autobot'}, {'name': 'metroplex', 'faction': 'autobot'}]
decepticon
[{'name': 'megatron', 'faction': 'decepticon'}, {'name': 'starcream', 'faction': 'decepticon'}]
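Because groupby() only merges neighbouring items, the output above contains each faction more than once. A possible sketch for getting one group per faction is to sort by the same key first; the shortened robots list and the names by_faction and grouped are assumptions made for this example.

```python
import itertools
from operator import itemgetter

robots = [
    {'name': 'blaster', 'faction': 'autobot'},
    {'name': 'galvatron', 'faction': 'decepticon'},
    {'name': 'jazz', 'faction': 'autobot'},
    {'name': 'megatron', 'faction': 'decepticon'},
]

by_faction = itemgetter('faction')

grouped = {}
# sorted() is stable, so names keep their original order within a faction
for faction, group in itertools.groupby(sorted(robots, key=by_faction), key=by_faction):
    grouped[faction] = [robot['name'] for robot in group]

print(grouped)
```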
This function is very much like slices. This allows you to cut out a piece of an iterable.
itertools.islice(iterable, start, stop[, step])
Example:
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue',]
>>> few_colors = itertools.islice(colors, 2)
>>> for each in few_colors:
>>>     print(each)
red
orange
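The start, stop and step arguments behave much like slice indices. A small sketch with the same colors (the result names are illustrative):

```python
import itertools

colors = ['red', 'orange', 'yellow', 'green', 'blue']

# A single number is treated as the stop value
first_two = list(itertools.islice(colors, 2))

# start=1, stop=5, step=2 picks the items at index 1 and 3
every_other = list(itertools.islice(colors, 1, 5, 2))

# stop=None means "until the iterable is exhausted"
from_third = list(itertools.islice(colors, 2, None))

print(first_two, every_other, from_third)
```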
itertools.permutations(iterable, r=None)
Example:
>>> alpha_data = ['a', 'b', 'c']
>>> result = itertools.permutations(alpha_data)
>>> for each in result:
>>>     print(each)
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
Creates the Cartesian product of a series of iterables.
>>> num_data = [1, 2, 3]
>>> alpha_data = ['a', 'b', 'c']
>>> result = itertools.product(num_data, alpha_data)
>>> for each in result:
>>>     print(each)
(1, 'a')
(1, 'b')
(1, 'c')
(2, 'a')
(2, 'b')
(2, 'c')
(3, 'a')
(3, 'b')
(3, 'c')
This function will repeat an object over and over again, unless a times argument is given.
itertools.repeat(object[, times])
Example:
>>> for i in itertools.repeat("spam", 3):
>>>     print(i)
spam
spam
spam
Makes an iterator that computes the function using arguments obtained from the iterable.
itertools.starmap(function, iterable)
Example:
>>> data = [(2, 6), (8, 4), (7, 3)]
>>> result = itertools.starmap(operator.mul, data)
>>> for each in result:
>>>     print(each)
12
32
21
The opposite of dropwhile(). Makes an iterator and returns elements from the iterable as long as the predicate is true.
itertools.takewhile(predicate, iterable)
Example:
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> result = itertools.takewhile(lambda x: x<5, data)
>>> for each in result:
>>>     print(each)
1
2
3
4
Return n independent iterators from a single iterable.
itertools.tee(iterable, n=2)
Example:
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> alpha_colors, beta_colors = itertools.tee(colors)
>>> for each in alpha_colors:
>>>     print(each)
red
orange
yellow
green
blue
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue']
>>> alpha_colors, beta_colors = itertools.tee(colors)
>>> for each in beta_colors:
>>>     print(each)
red
orange
yellow
green
blue
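tee() is most useful when the source is a one-shot iterator rather than a list. In the sketch below (names are illustrative), both copies see every item, while the original iterator should no longer be used directly once it has been split.

```python
import itertools

# An iterator can normally be consumed only once
colors = iter(['red', 'orange', 'yellow'])

first, second = itertools.tee(colors)

first_items = list(first)     # consuming one copy...
second_items = list(second)   # ...does not exhaust the other

print(first_items)
print(second_items)
```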
Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.
itertools.zip_longest(*iterables, fillvalue=None)
Example:
>>> colors = ['red', 'orange', 'yellow', 'green', 'blue',]
>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,]
>>> for each in itertools.zip_longest(colors, data, fillvalue=None):
>>>     print(each)
('red', 1)
('orange', 2)
('yellow', 3)
('green', 4)
('blue', 5)
(None, 6)
(None, 7)
(None, 8)
(None, 9)
(None, 10)
>>> a = [1, 3, 5, 7, 9, 11]
>>> [i - 1 for i in a]
[0, 2, 4, 6, 8, 10]
>>> b = {"abc", "def"}
>>> {s.upper() for s in b}
{'ABC', 'DEF'}
>>> c = {'name': 'Pooka', 'age': 5}
>>> {v: k for k, v in c.items()}
{'Pooka': 'name', 5: 'age'}
A List comprehension can be generated from a dictionary:
>>> c = {'name': 'Pooka', 'first_name': 'Oooka'}
>>> ["{}:{}".format(k.upper(), v.upper()) for k, v in c.items()]
['NAME:POOKA', 'FIRST_NAME:OOOKA']
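Comprehensions can also filter with a trailing if clause. A brief sketch, reusing the list a from above and a made-up dictionary named pets:

```python
a = [1, 3, 5, 7, 9, 11]

# Keep only the items greater than 5, then square them
big_squares = [i * i for i in a if i > 5]

pets = {'Pooka': 5, 'Zophie': 7, 'Fat-tail': 2}

# Dictionary comprehension with a condition on the value
older_pets = {name: age for name, age in pets.items() if age >= 5}

print(big_squares)
print(older_pets)
```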
Escape character | Prints as |
---|---|
\' |
Single quote |
\" |
Double quote |
\t |
Tab |
\n |
Newline (line break) |
\\ |
Backslash |
Example:
>>> print("Hello there!\nHow are you?\nI\'m doing fine.")
Hello there!
How are you?
I'm doing fine.
A raw string completely ignores all escape characters and prints any backslash that appears in the string.
>>> print(r'That is Carol\'s cat.')
That is Carol\'s cat.
Note: mostly used for regular expression definition (see re package)
>>> print('''Dear Alice,
>>>
>>> Eve's cat has been arrested for catnapping, cat burglary, and extortion.
>>>
>>> Sincerely,
>>> Bob''')
Dear Alice,
Eve's cat has been arrested for catnapping, cat burglary, and extortion.
Sincerely,
Bob
To keep a nicer flow in your code, you can use the dedent function from the textwrap standard package.
>>> from textwrap import dedent
>>>
>>> def my_function():
>>>     print(dedent('''
>>>         Dear Alice,
>>>
>>>         Eve's cat has been arrested for catnapping, cat burglary, and extortion.
>>>
>>>         Sincerely,
>>>         Bob
>>>         ''').strip())
This generates the same string as before.
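The sketch below shows what dedent does to an indented triple-quoted string: it removes the whitespace common to every line, and strip() then drops the leading and trailing newlines. The build_letter function name and the shortened letter text are only illustrative.

```python
from textwrap import dedent


def build_letter():
    # The text is indented to match the surrounding code.
    text = '''
        Dear Alice,

        Sincerely,
        Bob
        '''
    # dedent() removes the common leading whitespace from every line;
    # strip() removes the blank first and last lines.
    return dedent(text).strip()


letter = build_letter()
print(letter)
```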
H   e   l   l   o       w   o   r   l   d   !
0   1   2   3   4   5   6   7   8   9   10  11
>>> spam = 'Hello world!'
>>> spam[0]
'H'
>>> spam[4]
'o'
>>> spam[-1]
'!'
Slicing:
>>> spam[0:5]
'Hello'
>>> spam[:5]
'Hello'
>>> spam[6:]
'world!'
>>> spam[6:-1]
'world'
>>> spam[:-1]
'Hello world'
>>> spam[::-1]
'!dlrow olleH'
>>> spam = 'Hello world!'
>>> fizz = spam[0:5]
>>> fizz
'Hello'
>>> 'Hello' in 'Hello World'
True
>>> 'Hello' in 'Hello'
True
>>> 'HELLO' in 'Hello World'
False
>>> '' in 'spam'
True
>>> 'cats' not in 'cats and dogs'
False
>>> a = [1, 2, 3, 4]
>>> 5 in a
False
>>> 2 in a
True
upper() and lower():
>>> spam = 'Hello world!'
>>> spam = spam.upper()
>>> spam
'HELLO WORLD!'
>>> spam = spam.lower()
>>> spam
'hello world!'
isupper() and islower():
>>> spam = 'Hello world!'
>>> spam.islower()
False
>>> spam.isupper()
False
>>> 'HELLO'.isupper()
True
>>> 'abc12345'.islower()
True
>>> '12345'.islower()
False
>>> '12345'.isupper()
False
- isalpha() returns True if the string consists only of letters and is not blank.
- isalnum() returns True if the string consists only of letters and numbers and is not blank.
- isdecimal() returns True if the string consists only of numeric characters and is not blank.
- isspace() returns True if the string consists only of spaces, tabs, and new-lines and is not blank.
- istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.
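The methods listed above can be tried out with a short sketch like the following. The sample strings are arbitrary.

```python
# Each entry maps a description to the result of one isX() call.
results = {
    "'hello'.isalpha()": 'hello'.isalpha(),                 # letters only
    "'hello123'.isalpha()": 'hello123'.isalpha(),           # digits are not letters
    "'hello123'.isalnum()": 'hello123'.isalnum(),           # letters and numbers
    "'123'.isdecimal()": '123'.isdecimal(),                 # numeric characters only
    "' \\t\\n'.isspace()": ' \t\n'.isspace(),               # whitespace only
    "'This Is Title Case'.istitle()": 'This Is Title Case'.istitle(),
    "'This is not'.istitle()": 'This is not'.istitle(),     # 'is' and 'not' are lowercase
    "''.isalpha()": ''.isalpha(),                           # blank strings return False
}

for call, value in results.items():
    print(call, '->', value)
```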
>>> 'Hello world!'.startswith('Hello')
True
>>> 'Hello world!'.endswith('world!')
True
>>> 'abc123'.startswith('abcdef')
False
>>> 'abc123'.endswith('12')
False
>>> 'Hello world!'.startswith('Hello world!')
True
>>> 'Hello world!'.endswith('Hello world!')
True
join():
>>> ', '.join(['cats', 'rats', 'bats'])
'cats, rats, bats'
>>> ' '.join(['My', 'name', 'is', 'Simon'])
'My name is Simon'
>>> 'ABC'.join(['My', 'name', 'is', 'Simon'])
'MyABCnameABCisABCSimon'
split():
>>> 'My name is Simon'.split()
['My', 'name', 'is', 'Simon']
>>> 'MyABCnameABCisABCSimon'.split('ABC')
['My', 'name', 'is', 'Simon']
>>> 'My name is Simon'.split('m')
['My na', 'e is Si', 'on']
rjust() and ljust():
>>> 'Hello'.rjust(10)
'     Hello'
>>> 'Hello'.rjust(20)
'               Hello'
>>> 'Hello World'.rjust(20)
'         Hello World'
>>> 'Hello'.ljust(10)
'Hello     '
An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:
>>> 'Hello'.rjust(20, '*')
'***************Hello'
>>> 'Hello'.ljust(20, '-')
'Hello---------------'
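One common use of these methods is lining up columns of text. The sketch below is one possible way to do that; the format_table helper and the picnic data are hypothetical names chosen for illustration.

```python
def format_table(items, left_width, right_width):
    """Return the table as a list of lines, one per row."""
    # Centered title padded with '-' across the full table width.
    lines = ['PICNIC ITEMS'.center(left_width + right_width, '-')]
    for name, count in items.items():
        # Left column padded with dots, right column right-aligned with spaces.
        lines.append(name.ljust(left_width, '.') + str(count).rjust(right_width))
    return lines


picnic_items = {'sandwiches': 4, 'apples': 12}
table = format_table(picnic_items, 12, 5)
print('\n'.join(table))
# The two data rows print as:
# sandwiches..    4
# apples......   12
```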
center():
>>> 'Hello'.center(20)
'       Hello        '
>>> 'Hello'.center(20, '=')
'=======Hello========'
>>> spam = ' Hello World '
>>> spam.strip()
'Hello World'
>>> spam.lstrip()
'Hello World '
>>> spam.rstrip()
' Hello World'
>>> spam = 'SpamSpamBaconSpamEggsSpamSpam'
>>> spam.strip('ampS')
'BaconSpamEggs'
>>> import pyperclip
>>> pyperclip.copy('Hello world!')
>>> pyperclip.paste()
'Hello world!'
>>> name = 'Pete'
>>> 'Hello %s' % name
'Hello Pete'
We can use the %d format specifier to insert an int value into a string as a decimal number:
>>> num = 5
>>> 'I have %d apples' % num
'I have 5 apples'
Note: For new code, using str.format or f-strings (Python 3.6+) is strongly recommended over the % operator.
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular.
>>> name = 'John'
>>> age = 20
>>> "Hello I'm {}, my age is {}".format(name, age)
"Hello I'm John, my age is 20"
>>> "Hello I'm {0}, my age is {1}".format(name, age)
"Hello I'm John, my age is 20"
The official Python 3.x documentation recommends str.format over the % operator:
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
You would only use %s string formatting with functions that support lazy parameter evaluation, the most common being logging:
Prefer:
>>> import logging
>>> name = "alice"
>>> logging.debug("User name: %s", name)
Over:
>>> logging.debug("User name: {}".format(name))
Or:
>>> logging.debug("User name: " + name)
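The sketch below tries to make the lazy evaluation visible. It assumes a hypothetical Expensive class that counts how often it is converted to a string, and a logger named "lazy_demo" whose level discards DEBUG messages. With the %s style the conversion never runs, while str.format does the work before logging is even called.

```python
import logging


class Expensive:
    """Hypothetical object that counts its string conversions."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return 'expensive value'


logger = logging.getLogger('lazy_demo')
logger.setLevel(logging.WARNING)  # DEBUG messages are discarded

lazy = Expensive()
# The argument is only formatted if the message is actually emitted.
logger.debug('Value: %s', lazy)

eager = Expensive()
# format() runs before debug() is called, even though nothing is logged.
logger.debug('Value: {}'.format(eager))

print(lazy.calls, eager.calls)  # 0 1
```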
>>> name = 'Elizabeth'
>>> f'Hello {name}!'
'Hello Elizabeth!'
It is even possible to do inline arithmetic with it:
>>> a = 5
>>> b = 10
>>> f'Five plus ten is {a + b} and not {2 * (a + b)}.'
'Five plus ten is 15 and not 30.'
Template strings are a simpler and less powerful mechanism, but they are recommended when handling format strings generated by users. Due to their reduced complexity, template strings are a safer choice.
>>> from string import Template
>>> name = 'Elizabeth'
>>> t = Template('Hey $name!')
>>> t.substitute(name=name)
'Hey Elizabeth!'
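As a sketch of how a template behaves when a user-supplied format string mentions a placeholder you did not provide: substitute() raises a KeyError, while safe_substitute() leaves the placeholder untouched. The user_supplied text is an invented example.

```python
from string import Template

# Hypothetical format string that came from a user.
user_supplied = 'Hey $name, you have $count new messages'
template = Template(user_supplied)

# safe_substitute() leaves unknown placeholders in place.
greeting = template.safe_substitute(name='Elizabeth')
print(greeting)  # Hey Elizabeth, you have $count new messages

# substitute() raises KeyError for a missing placeholder.
try:
    template.substitute(name='Elizabeth')
    missing = None
except KeyError as error:
    missing = error.args[0]
print(missing)  # count
```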
- Import the regex module with import re.
- Create a Regex object with the re.compile() function. (Remember to use a raw string.)
- Pass the string you want to search into the Regex object’s search() method. This returns a Match object.
- Call the Match object’s group() method to return a string of the actual matched text.
All the regex functions in Python are in the re module:
>>> import re
>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
>>> mo = phone_num_regex.search('My number is 415-555-4242.')
>>> print('Phone number found: {}'.format(mo.group()))
Phone number found: 415-555-4242
>>> phone_num_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo = phone_num_regex.search('My number is 415-555-4242.')
>>> mo.group(1)
'415'
>>> mo.group(2)
'555-4242'
>>> mo.group(0)
'415-555-4242'
>>> mo.group()
'415-555-4242'
To retrieve all the groups at once, use the groups() method (note the plural form of the name).
>>> mo.groups()
('415', '555-4242')
>>> area_code, main_number = mo.groups()
>>> print(area_code)
415
>>> print(main_number)
555-4242
The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.
>>> hero_regex = re.compile(r'Batman|Tina Fey')
>>> mo1 = hero_regex.search('Batman and Tina Fey.')
>>> mo1.group()
'Batman'
>>> mo2 = hero_regex.search('Tina Fey and Batman.')
>>> mo2.group()
'Tina Fey'
You can also use the pipe to match one of several patterns as part of your regex:
>>> bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')
>>> mo = bat_regex.search('Batmobile lost a wheel')
>>> mo.group()
'Batmobile'
>>> mo.group(1)
'mobile'
The ? character flags the group that precedes it as an optional part of the pattern.
>>> bat_regex = re.compile(r'Bat(wo)?man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
The * (called the star or asterisk) means “match zero or more”—the group that precedes the star can occur any number of times in the text.
>>> bat_regex = re.compile(r'Bat(wo)*man')
>>> mo1 = bat_regex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = bat_regex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo3 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo3.group()
'Batwowowowoman'
While * means “match zero or more,” the + (or plus) means “match one or more”. The group preceding a plus must appear at least once. It is not optional:
>>> bat_regex = re.compile(r'Bat(wo)+man')
>>> mo1 = bat_regex.search('The Adventures of Batwoman')
>>> mo1.group()
'Batwoman'
>>> mo2 = bat_regex.search('The Adventures of Batwowowowoman')
>>> mo2.group()
'Batwowowowoman'
>>> mo3 = bat_regex.search('The Adventures of Batman')
>>> mo3 is None
True
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
>>> ha_regex = re.compile(r'(Ha){3}')
>>> mo1 = ha_regex.search('HaHaHa')
>>> mo1.group()
'HaHaHa'
>>> mo2 = ha_regex.search('Ha')
>>> mo2 is None
True
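The range form described above can be sketched in the same way. The variable names are only illustrative.

```python
import re

# Match between three and five repetitions of 'Ha'.
ha_range_regex = re.compile(r'(Ha){3,5}')

four = ha_range_regex.search('HaHaHaHa')
print(four.group())  # HaHaHaHa

# Two repetitions are below the minimum, so there is no match.
two = ha_range_regex.search('HaHa')
print(two is None)  # True

# With six repetitions, only the first five are matched.
six = ha_range_regex.search('HaHaHaHaHaHa')
print(six.group())  # HaHaHaHaHa
```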
Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.
>>> greedy_ha_regex = re.compile(r'(Ha){3,5}')
>>> mo1 = greedy_ha_regex.search('HaHaHaHaHa')
>>> mo1.group()
'HaHaHaHaHa'
>>> nongreedy_ha_regex = re.compile(r'(Ha){3,5}?')
>>> mo2 = nongreedy_ha_regex.search('HaHaHaHaHa')
>>> mo2.group()
'HaHaHa'
In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string.
>>> phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') # has no groups
>>> phone_num_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
['415-555-9999', '212-555-0000']
To summarize what the findall() method returns, remember the following:
- When called on a regex with no groups, such as \d\d\d-\d\d\d-\d\d\d\d, the method findall() returns a list of string matches, such as ['415-555-9999', '212-555-0000'].
- When called on a regex that has groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), the method findall() returns a list of tuples of strings (one string for each group), such as [('415', '555', '9999'), ('212', '555', '0000')].
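A short sketch of the second case, using the same phone numbers with three groups in the pattern:

```python
import re

# Three groups: area code, prefix, and line number.
grouped_phone_regex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')

matches = grouped_phone_regex.findall('Cell: 415-555-9999 Work: 212-555-0000')
print(matches)  # [('415', '555', '9999'), ('212', '555', '0000')]

# Each tuple can be unpacked into its groups.
for area_code, prefix, line_number in matches:
    print(area_code, prefix, line_number)
```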
There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase.
>>> vowel_regex = re.compile(r'[aeiouAEIOU]')
>>> vowel_regex.findall('Robocop eats baby food. BABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']
You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.
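As a quick sketch of a range-based character class (the sample text is arbitrary):

```python
import re

# Matches any single lowercase letter, uppercase letter, or digit.
alnum_regex = re.compile(r'[a-zA-Z0-9]')

found = alnum_regex.findall('R2-D2 & C-3PO!')
print(found)  # ['R', '2', 'D', '2', 'C', '3', 'P', 'O']
```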
By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:
>>> consonant_regex = re.compile(r'[^aeiouAEIOU]')
>>> consonant_regex.findall('Robocop eats baby food. BABY FOOD.')
['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']
- You can also use the caret symbol (^) at the start of a regex to indicate that a match must occur at the beginning of the searched text.
- Likewise, you can put a dollar sign ($) at the end of the regex to indicate the string must end with this regex pattern.
- And you can use the ^ and $ together to indicate that the entire string must match the regex. That is, it’s not enough for a match to be made on some subset of the string.
The r'^Hello' regular expression string matches strings that begin with 'Hello':
>>> begins_with_hello = re.compile(r'^Hello')
>>> begins_with_hello.search('Hello world!')
<re.Match object; span=(0, 5), match='Hello'>
>>> begins_with_hello.search('He said hello.') is None
True
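The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9, while the r'^\d+$' regular expression string matches only strings that consist entirely of one or more numeric characters: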
>>> whole_string_is_num = re.compile(r'^\d+$')
>>> whole_string_is_num.search('1234567890')
<re.Match object; span=(0, 10), match='1234567890'>
>>> whole_string_is_num.search('12345xyz67890') is None
True
>>> whole_string_is_num.search('12 34567890') is None
True
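For the dollar sign on its own, a minimal sketch could look like this. The ends_with_number name and the sample sentences are illustrative.

```python
import re

# Matches only if the last character of the string is a digit.
ends_with_number = re.compile(r'\d$')

match = ends_with_number.search('Your number is 42')
print(match.group())  # 2

# The string ends with a period, so there is no match.
no_match = ends_with_number.search('Your number is forty two.')
print(no_match is None)  # True
```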
The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline:
>>> at_regex = re.compile(r'.at')
>>> at_regex.findall('The cat in the hat sat on the flat mat.')
['cat', 'hat', 'sat', 'lat', 'mat']
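To illustrate the newline exception, the sketch below matches the same two-line text with and without the re.DOTALL flag, which makes the dot match newlines as well. The sample text is arbitrary.

```python
import re

text = 'Serve the public trust.\nProtect the innocent.'

# By default the dot stops at the first newline.
no_newline_regex = re.compile(r'.*')
first_line = no_newline_regex.search(text).group()
print(first_line)  # Serve the public trust.

# With re.DOTALL the dot also matches the newline character.
newline_regex = re.compile(r'.*', re.DOTALL)
everything = newline_regex.search(text).group()
print(everything == text)  # True
```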