Python is a multipurpose language, beloved by data scientists and full stack engineers alike. It promotes clear and concise code, thanks to its syntactic sugarness and philosophy. In this article, I will explain in depth how the most anticipated Python feature works: structural pattern matching.
Structural Pattern Matching (PEP 634) was introduced for the first time in Python 3.10. Currently it is in beta, and you can get it here, as of 27th of May, 2021
What is structural pattern matching?
The most obvious analogy is the switch-case
operator from other languages, like Java or C#. It lets you perform different actions based on the assertions on the input data. In other words, it is syntactic sugar on top of multiple if-else
statements. Here how switch statements are used in Java:
int day = 3;
switch (day) {
case 6:
System.out.println("Saturday");
break;
case 7:
System.out.println("Sunday");
break;
default:
System.out.println("Almost Weekend");
}
Before Python 3.10, the only way to rewrite the code above in Python was a bunch of if
statements:
day = 3
if day == 6:
print("Saturday")
elif day == 7:
print("Sunday")
else:
print("Almost Weekend")
Of course, that does not seem so bad, but as cases grow and become more complex this construct will be unbearable. Python developers figured it out and implemented a new feature for Python 3.10.
How to use structural pattern matching?
To apply structural pattern matching, you will need to use two new keywords: match
and case
. This is how our earlier example would look like if rewritten in Python 3.10:
day = 3
match day:
case 6:
print("Saturday")
case 7:
print("Sunday")
case _:
print("Almost Weekend")
Firstly, you open the match block using the match
keyword, followed by the variable you want to match on. Then, in the match
block, you write cases with the case
keyword, followed by the value you want the variable to be equal to. Lastly, the case _:
block will match anything: it it the default handler.
Let’s explore some more challenging examples. For example, you can specify multiple conditions for a single case block like this:
month = 'jan'
match month:
case 'jan' | 'mar' | 'may' | 'jul' | 'aug' | 'oct' | 'dec':
print(31)
case 'apr' | 'jun' | 'sep' | 'nov':
print(30)
case 'feb':
print(28)
case _:
print('Bad month!')
This snippet will print out the number of days in a month, given by the month
variable. To use multiple values, we separate them with the |
operator. We could take it even further and check for leap year, using the guard expression:
month = 'jan'
match month:
case 'jan' | 'mar' | 'may' | 'jul' | 'aug' | 'oct' | 'dec':
print(31)
case 'apr' | 'jun' | 'sep' | 'nov':
print(30)
case 'feb' if is_leap_year():
print(29)
case 'feb':
print(28)
case _:
print('Bad month!')
Suppose that is_leap_year
is a function that returns true if the current year is a leap year. In this case, Python will firstly evaluate the condition, and, if met, print 29. If not, it will go to the next condition and print 28.
Packing and unpacking
Structural pattern matching provides much more features than basic switch-case statements. You could use it to evaluate complex data structures and extract data from them. For example, suppose you store date in a tuple of form (day, month, year)
, all integers. This code will print out what season is it, along with day and year:
date = (2, 3, 2000)
match date:
case (day, 12, year) | (day, 1, year) | (day, 2, year):
print(f'Winter of {year}, day {day}')
case (day, 3, year) | (day, 4, year) | (day, 5, year):
print(f'Spring of {year}, day {day}')
case (day, 6, year) | (day, 7, year) | (day, 8, year):
print(f'Summer of {year}, day {day}')
case (day, _, year):
print(f'Fall of {year}, day {day}')
case _:
print('Invalid data')
You can see we are using the tuple syntax to capture the values. By placing day
and year
names in place of values, we are telling Python we want those values extracted and made available as variables. By placing actual number value in place of month, we are checking only for that value. When we check for fall, you can notice we are using the _
symbol again. It is a wildcard and will match all other month values that were missed by the previous checks. Lastly, we have one more wildcard case to account for badly formatted data.
Class matchers
There is one more cool thing you can do with structured pattern matching in Python 3.10. Suppose you have a class like this:
class Vector:
speed: int
acceleration: int
You can use its attributes to match Vector
objects:
vec = Vector()
vec.speed = 4
vec.acceleration = 1
match vec:
case Vector(speed=0, acceleration=0):
print('Object is standing still')
case Vector(speed=speed, acceleration=0):
print('Object is travelling at {speed}m/s')
case Vector(speed=0, acceleration=acceleration):
print('Object is accelerating from standing still at {acceleration}m/s2')
case Vector(speed=speed, acceleration=acceleration):
print('Object is at {speed}m/s, accelerating at {acceleration}m/s2')
case _:
print('Not a vector')
You can see that by passing in parameters as constructor arguments, we can match any kind of object and capture its parameters. Note that no actual objects are created that way and you do not need a corresponding constructor in your class declaration.
When to use structural pattern matching?
This feature will be helpful to those working with big data structures when categorizing or filtering data. It will also be helpful to clean up large chains of if-elif-else
statements to clean up the code. Right now this feature is in beta stream of Python 3.10, so probably don’t use it in production projects right now and wait for a stable release in Fall of 2021.
Closing notes
Thank you for reading, I hope you liked my article. Consider subscribing by email if you find my content useful!