Practicing Clean Code to Accelerate Collaboration in Python
Have you ever looked at a function you wrote one month earlier and found it difficult to understand in 3 minutes? If that is the case, it is time to refactor your code. If it takes you more than 3 minutes to understand your code, imagine how long it would take for your teammates to understand your code.
If you want your code to be reusable, you want it to be readable. Writing clean code is especially important to software engineers who collaborate with other team members in different roles.
You want your Python function to:
- use descriptive names
- be small
- do one thing
- contain code with the same level of abstraction
- have fewer than 3 arguments
- have no duplication
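Of the properties above, "the same level of abstraction" is probably the least self-explanatory, so here is a minimal sketch of what I mean. The function names and the report format are made up for illustration; the point is only that the top-level function reads as a list of steps, while the details live in small helpers.

```python
from typing import List


def parse_scores(raw: str) -> List[int]:
    """Low-level detail: turn '90, 85,70' into [90, 85, 70]."""
    return [int(part) for part in raw.split(",") if part.strip()]


def average(scores: List[int]) -> float:
    """Low-level detail: arithmetic mean, 0.0 for an empty list."""
    return sum(scores) / len(scores) if scores else 0.0


def format_report(mean: float) -> str:
    """Low-level detail: presentation only."""
    return f"Average score: {mean:.1f}"


def build_score_report(raw: str) -> str:
    """High-level steps only; no string splitting or arithmetic here."""
    scores = parse_scores(raw)
    mean = average(scores)
    return format_report(mean)
```

Each helper is small, does one thing, and takes a single argument, so the same sketch also touches the other items in the list.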
Naming Convention
Good names are crucial, whether for functions, classes, or variables. Use names that are clear in their context, and keep the naming format consistent across the whole code base.
Use meaningful and pronounceable variable names
This is the most basic rule, and I think most programmers already know how to name a variable.
import datetime

# Bad
ymdstr = datetime.date.today().strftime("%y-%m-%d")

# Good
current_date: str = datetime.date.today().strftime("%y-%m-%d")
Use the same vocabulary for the same type of variable
If the entity is the same, refer to it consistently across your functions. Here are examples from our repository.
# Bad
def get_user_info(): pass
def get_client_data(): pass
def get_customer_record(): pass

# Good
def get_user_info(): pass
def get_user_data(): pass
def get_user_record(): pass
Don’t add unneeded context
If your class/object name tells you something, don’t repeat that in your variable name.
# Bad
class Bill:
    bill_id: str
    bill_name: str
    bill_created: str

# Good
class Bill:
    id: str
    name: str
    created: str
Good Comment
“A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.”
― Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship
The code should be clear enough that a reader can understand a function without any comments. This also avoids a mismatch between comments and code: it often happens that the code changes but the comments do not, which leaves the reader confused. If a comment is truly necessary, keep it neither too long nor too short, and make it explain what the function does.
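To make the quote concrete, here is a small sketch of the same check written twice. The names (`check`, `is_inactive`, the 30-day limit) are hypothetical and not taken from our repository; they only illustrate how a descriptive name can replace a comment.

```python
SECONDS_PER_DAY = 60 * 60 * 24


# Needs a comment to be understood
def check(u: dict, n: float) -> bool:
    # inactive if the last login was more than 30 days ago
    return n - u["ll"] > 2592000


# The names carry the explanation, so no comment is needed
INACTIVITY_LIMIT_SECONDS = 30 * SECONDS_PER_DAY


def is_inactive(last_login_timestamp: float, now_timestamp: float) -> bool:
    return now_timestamp - last_login_timestamp > INACTIVITY_LIMIT_SECONDS
```

If the limit later changes to 60 days, the second version has one constant to update and no comment that can go stale.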
Use Type Hint
We all know that Python is dynamically typed. You don’t need to declare a variable's type, and in the middle of the code the type can change, for example from str to int. Type hints document the type you intend.
Type Hint Local Variable
def get(self, user_id: int) -> Optional[User]:
    user: User
    try:
        user = User.objects.get(pk=user_id)
    except User.DoesNotExist:
        return None

    return user
In the snippet above, I use a type hint for the local variable user, declaring it to be of type User. This tells readers and tools that user is an instance of User.
Type Hint Function Parameter and Return
Many basic Python functions simply accept parameters without saying anything about their types. This is a bad habit. Parameters should be type-hinted so that other programmers know what to pass to the function. Also, don't forget the return type. Since Python does not distinguish procedures from functions, a return type hint tells the reader what type of data the function returns.
# Bad
def get_user_from_token(self, token):
    decoded_token = self.firebase_provider.decode_token(token)

    return self.get_by_firebase_uid(
        firebase_uid=decoded_token.user_id
    )

# Good
def get_user_from_token(self, token: UUID) -> Optional[User]:
    decoded_token = self.firebase_provider.decode_token(token)

    return self.get_by_firebase_uid(
        firebase_uid=decoded_token.user_id
    )
There are two things we can learn here. The first function has no type hint for the token parameter, which makes another developer wonder, “Am I passing the right parameter to the function?”
If you hover over the second function (the good example) in your editor, it shows you each parameter and its data type.
What is the purpose of type hint?
Besides making your code more readable and less surprising for your teammates, type hints let some code editors show you type information when you hover over a name.
In the example above, I use a type hint for a function parameter. When I hover over spec to inspect its attributes, I can see the type of spec.user_ids.
This makes it easier for other developers to write code, and they stop wondering “Am I passing the right parameter to the function?” or “What data type is this function supposed to return?”
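The snippets in this section depend on our project's ORM model, so here is a self-contained sketch you can run. `User`, `_USERS`, and `find_user` are hypothetical stand-ins invented for this example. It shows that the hints are ordinary metadata that tools can read, and that Python itself does not enforce them at runtime.

```python
from dataclasses import dataclass
from typing import Dict, Optional, get_type_hints


@dataclass
class User:  # hypothetical stand-in for the ORM model
    id: int
    name: str


# Hypothetical in-memory store used instead of a database
_USERS: Dict[int, User] = {1: User(id=1, name="Ada")}


def find_user(user_id: int) -> Optional[User]:
    """Return the user with the given id, or None if there is no such user."""
    return _USERS.get(user_id)


# Editors and type checkers read the same annotations that get_type_hints returns.
hints = get_type_hints(find_user)

# Hints are not enforced at runtime: passing a str raises no error,
# it simply finds nothing. A type checker would flag this call instead.
result_for_wrong_type = find_user("1")
```

Because nothing is checked at runtime, the hints only pay off when the team actually runs an editor or type checker that reads them.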
Function
Do One Thing
This is by far the most important rule in software engineering. When functions do more than one thing, they are harder to compose, test, and reason about. When you can isolate a function to just one action, it can be refactored easily and your code will read much cleaner. If you take nothing else away from this guide other than this, you’ll be ahead of many developers.
This is a bad example
from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def email_clients(clients: List[Client]) -> None:
    """Filter active clients and send them an email.
    """
    for client in clients:
        if client.active:
            email(client)
The email_clients function does two things: first it checks whether each client is active, and then it sends the email. We can break it down into multiple functions. See the good example below.
from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def get_active_clients(clients: List[Client]) -> List[Client]:
    """Filter active clients.
    """
    return [client for client in clients if client.active]


def email_clients(clients: List[Client]) -> None:
    """Send an email to a given list of clients.
    """
    for client in get_active_clients(clients):
        email(client)
The functions are cleaner now, right?
Function arguments (2 or fewer ideally)
Limiting the number of function parameters is incredibly important because it makes testing your function easier. Having more than three leads to a combinatorial explosion where you have to test tons of different cases with each separate argument.
Zero arguments are the ideal case. One or two arguments are ok, and three should be avoided. Anything more than that should be consolidated. Usually, if you have more than two arguments then your function is trying to do too much. In cases where it’s not, most of the time a higher-level object will suffice as an argument.
# Bad
def create_user(self, username, password, name, phone_number):
    pass

# Good
from dataclasses import dataclass

@dataclass
class CreateUserSpec:
    username: str
    password: str
    name: str
    phone_number: str

def create_user(self, spec: CreateUserSpec) -> Optional[User]:
    pass

create_user(
    CreateUserSpec(
        username,
        password,
        name,
        phone_number,
    )
)
The example above comes from one of my software engineering projects. We can use a dataclass, a class that mainly holds attributes, to group the arguments. This is a neat use of Python: it results in a cleaner function call and makes the code more readable.
High Cohesion & Low Coupling
Code written with the lowest possible coupling makes problems easier to fix: when we encounter a problem, the only thing that must be corrected is the faulty part. Coupling is the degree to which a module depends on other modules. We have to make sure that a module depends on, or affects, other modules as little as possible. For example, if a cellphone with high coupling has a battery problem, the entire cellphone must be replaced. On a cellphone with low coupling, we only need to replace the battery.
High cohesion makes the code within classes and components simpler and easier to understand. Cohesion is how closely the functions within a module belong together: the functions in a module should all serve one responsibility. Therefore, high cohesion is very important in software design.
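Here is a minimal sketch of both ideas, in the spirit of the replaceable battery. All class names are hypothetical. `WelcomeService` depends only on a small interface (a `typing.Protocol`, available from Python 3.8), so the sender can be swapped without touching the service, and every method in each class serves a single responsibility.

```python
from typing import List, Protocol


class MessageSender(Protocol):
    """The only thing WelcomeService knows about its dependency."""

    def send(self, recipient: str, body: str) -> None: ...


class InMemorySender:
    """Test double: records messages instead of delivering them."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, recipient: str, body: str) -> None:
        self.sent.append(f"{recipient}: {body}")


class WelcomeService:
    """Cohesive: everything here is about welcoming a new user."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender  # depends on the interface, not on a concrete class

    def welcome(self, recipient: str) -> None:
        self._sender.send(recipient, "Welcome aboard!")


sender = InMemorySender()
WelcomeService(sender).welcome("ada@example.com")
```

Replacing `InMemorySender` with a real email or SMS sender would be the software version of changing only the battery.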
Don’t repeat yourself (DRY)
Try to observe the DRY principle.
Do your absolute best to avoid duplicate code. Duplicate code is bad because it means that there’s more than one place to alter something if you need to change some logic.
Imagine you run a restaurant and keep track of your inventory: all your tomatoes, onions, garlic, spices, etc. If you keep this on multiple lists, then all of them have to be updated when you serve a dish with tomatoes in it. If you only have one list, there’s only one place to update!
Often you have duplicate code because you have two or more slightly different things, that share a lot in common, but their differences force you to have two or more separate functions that do much of the same things. Removing duplicate code means creating an abstraction that can handle this set of different things with just one function/module/class.
Getting the abstraction right is critical. Bad abstractions can be worse than duplicate code, so be careful! Having said this, if you can make a good abstraction, do it! Don’t repeat yourself. Otherwise, you’ll find yourself updating multiple places any time you want to change one thing.
Bad example
from typing import List, Dict
from dataclasses import dataclass


def get_developer_list(developers: List[Developer]) -> List[Dict]:
    developers_list = []
    for developer in developers:
        developers_list.append({
            'experience': developer.experience,
            'github_link': developer.github_link
        })
    return developers_list


def get_manager_list(managers: List[Manager]) -> List[Dict]:
    managers_list = []
    for manager in managers:
        managers_list.append({
            'experience': manager.experience,
            'github_link': manager.github_link
        })
    return managers_list


# create list objects of developers
company_developers = [
    Developer(experience=2.5, github_link='https://github.com/1'),
    Developer(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_developer_list(developers=company_developers)

# create list objects of managers
company_managers = [
    Manager(experience=4.5, github_link='https://github.com/3'),
    Manager(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_manager_list(managers=company_managers)
As you can see, we are just repeating our code to get the manager and developer lists. We can make it clearer and cleaner with an Employee abstraction that covers both managers and developers. See the good example below.
from typing import List, Dict
from dataclasses import dataclass


# frozen=True makes experience and github_link read-only after creation
@dataclass(frozen=True)
class Employee:
    experience: float
    github_link: str


def get_employee_list(employees: List[Employee]) -> List[Dict]:
    employees_list = []
    for employee in employees:
        employees_list.append({
            'experience': employee.experience,
            'github_link': employee.github_link
        })
    return employees_list


# create list objects of developers
company_developers = [
    Employee(experience=2.5, github_link='https://github.com/1'),
    Employee(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_employee_list(employees=company_developers)

# create list objects of managers
company_managers = [
    Employee(experience=4.5, github_link='https://github.com/3'),
    Employee(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_employee_list(employees=company_managers)
The Development Process with These Guidelines
I explained the benefits of each guideline independently, but how does applying all of these guidelines affect the software development process?
Faster Coding
There’s an old joke that famous entrepreneurs like Bill Gates and Mark Zuckerberg wear the same clothes every day to save the time needed to pick what to wear.
This has a similar vibe. By setting guidelines on naming conventions, we no longer waste time thinking, “Hmm, what should I name this new function?” By setting guidelines on using type hints in functions, we no longer have to look up what to pass to a function or what it returns. Therefore, we can focus on developing the actual code with confidence instead of on the little things.
Easier Code Review
With common ways of doing things, it’s easier to review teammates’ code. When reviewing code, I first check whether they followed the guidelines we set. Then, I know where to look for the important logic inside the code.
For example, suppose a teammate creates a merge request for a new endpoint. First, I check whether they wrote any function without type hints. If they did, I automatically request changes on that MR.
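The type-hint check is mechanical, so it can be partly automated. Below is a rough sketch using the standard library's `inspect` module; `missing_type_hints` and the two sample functions are hypothetical names written for this article, not part of our review tooling. In practice, a type checker such as mypy with its `--disallow-untyped-defs` option covers the same ground more thoroughly.

```python
import inspect
from typing import Callable, List


def missing_type_hints(func: Callable) -> List[str]:
    """Return the names of parameters (and 'return') that lack an annotation."""
    signature = inspect.signature(func)
    missing = [
        name
        for name, param in signature.parameters.items()
        if name not in ("self", "cls")
        and param.annotation is inspect.Parameter.empty
    ]
    if signature.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def untyped(token, retries: int = 3):
    return token


def typed(token: str, retries: int = 3) -> str:
    return token


untyped_report = missing_type_hints(untyped)
typed_report = missing_type_hints(typed)
```

A non-empty report is the situation where I would request changes; an empty one means I can move straight on to the logic.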
Finale
Coding in Python can be quite freeing. This article did not discuss the common clean code practices such as “short methods” and “meaningful names”, since there are tons of articles out there for this. This article shared the guidelines our team set based on real problems/disagreements that occurred during the development process.
In my opinion, clean code is not always the shortest or the neatest code, but the best code for the current problem. That’s why setting a team guideline in languages such as Python is essential to smooth the development process and ease collaboration.