Practicing Clean Code to Accelerate Collaboration in Python

Have you ever looked at a function you wrote one month earlier and found it difficult to understand in 3 minutes? If that is the case, it is time to refactor your code. If it takes you more than 3 minutes to understand your code, imagine how long it would take for your teammates to understand your code.

If you want your code to be reusable, you want it to be readable. Writing clean code is especially important to software engineers who collaborate with other team members in different roles.

You want your Python function to:

  • use descriptive names
  • be small
  • do one thing
  • contain code with the same level of abstraction
  • have fewer than 3 arguments
  • have no duplication

Naming Convention

Good names are crucial, whether for functions, classes, or variables. Use names that are clear in their context, and keep the naming format consistent across the whole code base.

This is basic, and I think most programmers out there already know how to name a variable.

import datetime

# Bad
ymdstr = datetime.date.today().strftime("%y-%m-%d")
# Good
current_date: str = datetime.date.today().strftime("%y-%m-%d")

If the entity is the same, refer to it consistently in your functions. Here are examples from our repository.

# Bad
def get_user_info(): pass
def get_client_data(): pass
def get_customer_record(): pass
# Good
def get_user_info(): pass
def get_user_data(): pass
def get_user_record(): pass

If your class/object name tells you something, don’t repeat that in your variable name.

# Bad
class Bill:
    bill_id: str
    bill_name: str
    bill_created: str

# Good
class Bill:
    id: str
    name: str
    created: str

Good Comment

“A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment.”
Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

The code should be clear enough that a function can be understood without any comments. This avoids mismatches between comments and code: it often happens that the code is changed but the comments are not, which leaves the reader confused. If a comment is necessary, keep it neither too long nor too short, and make it explain what the function does.
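As a rough illustration of this point, the sketch below contrasts a comment that compensates for unclear names with code that needs no comment. The 24-hour session rule and the names chk, SECONDS_PER_DAY, and is_session_expired are hypothetical and chosen only for this example.

```python
# Hypothetical example: the session rule and all names are assumptions.

# Bad: the comment has to explain what the names do not.
def chk(t: float, n: float) -> bool:
    # return True if the session is older than one day
    return n - t > 86400


# Good: the names carry the meaning, so no comment is required.
SECONDS_PER_DAY = 24 * 60 * 60


def is_session_expired(started_at: float, now: float) -> bool:
    return now - started_at > SECONDS_PER_DAY
```

Both functions behave the same; only the second one can be read without the comment, and it cannot drift out of sync with it.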

[Figure: example from my software engineering projects]

Use Type Hint

We all know that Python is dynamically typed. You don’t need to declare a variable’s type, and in the middle of the code the data type can change, for example from str to int. Type hints make the intended types explicit.

def get(self, user_id: int) -> Optional[User]:
    user: User
    try:
        user = User.objects.get(pk=user_id)
    except User.DoesNotExist:
        return None
    return user

Here I use a type hint for the local variable user. It tells the reader, and the editor, that user is an instance of User.

Many basic Python functions just accept their parameters without any hints. This is a bad habit. The parameters need to be hinted so that other programmers know what they have to pass to the function. Also, don’t forget the return type. Since Python does not distinguish between procedures and functions, the return type hint tells the reader what type of data, if any, the function returns.

# Bad
def get_user_from_token(self, token):
    decoded_token = self.firebase_provider.decode_token(token)
    return self.get_by_firebase_uid(
        firebase_uid=decoded_token.user_id
    )

# Good
def get_user_from_token(self, token: UUID) -> Optional[User]:
    decoded_token = self.firebase_provider.decode_token(token)
    return self.get_by_firebase_uid(
        firebase_uid=decoded_token.user_id
    )

There are two things we can learn. The first function has no type hint for the token parameter, which makes another developer wonder, “Am I passing the right parameter to the function?”

If you hover over the second function (the good example) in your editor, it shows the parameters and their data types.

Besides making your code more readable and less surprising for your teammates, type hints help code editors that show type information when you hover over a name.

In the example above, I use a type hint for the function parameter. When I hover over an attribute of spec, I can see the type of spec.user_ids.

This makes coding easier for other developers, who can stop wondering “Am I passing the right parameter to the function?” or “What data type is this function supposed to return?”
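To make the benefit concrete, here is a minimal, self-contained sketch of a hinted function whose return type may be None. The User dataclass, the in-memory USERS store, and the functions find_user and greeting are hypothetical stand-ins, not code from our repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class User:
    id: int
    name: str


# Hypothetical in-memory store standing in for a real database.
USERS: Dict[int, User] = {1: User(id=1, name="Alice")}


def find_user(user_id: int) -> Optional[User]:
    """Return the user with the given id, or None if there is none."""
    return USERS.get(user_id)


def greeting(user_id: int) -> str:
    user = find_user(user_id)
    # The Optional[User] return type reminds the caller to handle None.
    if user is None:
        return "Hello, guest"
    return f"Hello, {user.name}"
```

Without the None check, a static type checker such as mypy should report the access to user.name, because the hint says the value may be None.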

Function

Functions should do one thing. This is by far the most important rule in software engineering. When functions do more than one thing, they are harder to compose, test, and reason about. When you can isolate a function to just one action, it can be refactored easily and your code will read much cleaner. If you take nothing else away from this guide other than this, you’ll be ahead of many developers.

This is a bad example

from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def email_clients(clients: List[Client]) -> None:
    """Filter active clients and send them an email."""
    for client in clients:
        if client.active:
            email(client)

The email_clients function does two things: it checks whether each client is active, and it sends the email. We can break it down into multiple functions. See the good example below.

from typing import List


class Client:
    active: bool


def email(client: Client) -> None:
    pass


def get_active_clients(clients: List[Client]) -> List[Client]:
    """Filter active clients."""
    return [client for client in clients if client.active]


def email_clients(clients: List[Client]) -> None:
    """Send an email to a given list of clients."""
    for client in get_active_clients(clients):
        email(client)

The function is cleaner now, right?
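One practical benefit of the split is that the filtering step can be tested on its own, without sending any email. The sketch below assumes Client is a dataclass and adds a hypothetical name field so that clients can be told apart in the assertion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Client:
    name: str  # hypothetical field, added so the clients can be told apart
    active: bool


def get_active_clients(clients: List[Client]) -> List[Client]:
    """Filter active clients."""
    return [client for client in clients if client.active]


def test_get_active_clients() -> None:
    clients = [Client("a", True), Client("b", False), Client("c", True)]
    active = get_active_clients(clients)
    # Only the active clients remain, in their original order.
    assert [client.name for client in active] == ["a", "c"]


test_get_active_clients()
```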

Limiting the number of function parameters is incredibly important because it makes testing your function easier. Having more than three leads to a combinatorial explosion where you have to test tons of different cases with each separate argument.

Zero arguments are the ideal case. One or two arguments are ok, and three should be avoided. Anything more than that should be consolidated. Usually, if you have more than two arguments then your function is trying to do too much. In cases where it’s not, most of the time a higher-level object will suffice as an argument.

# Bad
def create_user(self, username, password, name, phone_number):
    pass

# Good
from dataclasses import dataclass
from typing import Optional

@dataclass
class CreateUserSpec:
    username: str
    password: str
    name: str
    phone_number: str

def create_user(self, spec: CreateUserSpec) -> Optional[User]:
    pass

create_user(
    CreateUserSpec(
        username,
        password,
        name,
        phone_number,
    )
)

The example above comes from my software engineering projects. It uses a dataclass, a class that mainly holds attributes, to group the arguments. This is a neat way to use Python: it results in a cleaner function call and makes the code more readable.
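For readers who want to run the idea, here is a small self-contained sketch of the same pattern. The describe function and the sample values are hypothetical, and frozen=True is an optional assumption that makes the spec read-only after creation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateUserSpec:
    username: str
    password: str
    name: str
    phone_number: str


def describe(spec: CreateUserSpec) -> str:
    """Hypothetical consumer that receives the whole spec as one argument."""
    return f"{spec.username} ({spec.name}, {spec.phone_number})"


# Keyword arguments make each value's meaning visible at the call site.
spec = CreateUserSpec(
    username="jdoe",
    password="s3cret",
    name="John Doe",
    phone_number="555-0100",
)
summary = describe(spec)
```

Adding a fifth field later changes the dataclass only; the signature of every function that takes the spec stays the same.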

High Cohesion & Low Coupling

Coupling is the degree to which a module depends on other modules. Code written with the lowest possible coupling is easier to fix: when we encounter a problem, only the faulty part has to be corrected. We have to make sure that a module depends on or affects other modules as little as possible. For example, if a cellphone with high coupling has a battery problem, the entire phone must be replaced. On a cellphone with low coupling, we only need to change the battery.
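One common way to lower coupling in Python is to depend on a small interface instead of a concrete class. The following is only a sketch under that assumption; Notifier, EmailNotifier, RecordingNotifier, and BillingService are hypothetical names invented for this example.

```python
from typing import List, Protocol


class Notifier(Protocol):
    def send(self, recipient: str, message: str) -> None: ...


class EmailNotifier:
    def send(self, recipient: str, message: str) -> None:
        print(f"Emailing {recipient}: {message}")


class RecordingNotifier:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append(f"{recipient}: {message}")


class BillingService:
    # The service depends only on the Notifier interface, so the
    # notifier can be swapped like a replaceable battery.
    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def charge(self, customer: str, amount: int) -> None:
        self._notifier.send(customer, f"You were charged {amount}")


notifier = RecordingNotifier()
BillingService(notifier).charge("alice", 10)
```

Replacing RecordingNotifier with EmailNotifier requires no change to BillingService, which is the low-coupling property described above.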

High cohesion makes the code within classes and components simpler and easier to understand. Cohesion is how closely the functions within a module belong together. The point is that the functions in a module should all serve one responsibility. Therefore, high cohesion is very important in software design.
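A small sketch may help to picture the difference. The Utils and Invoice classes below are hypothetical and exist only to contrast a low-cohesion class with a more cohesive one.

```python
from typing import List


# Low cohesion: unrelated responsibilities share one class.
class Utils:
    def total(self, prices: List[float]) -> float:
        return sum(prices)

    def format_phone(self, digits: str) -> str:
        return f"{digits[:3]}-{digits[3:]}"


# Higher cohesion: every method works toward one responsibility,
# here the contents and total of a single invoice.
class Invoice:
    def __init__(self) -> None:
        self._prices: List[float] = []

    def add_item(self, price: float) -> None:
        self._prices.append(price)

    def total(self) -> float:
        return sum(self._prices)


invoice = Invoice()
invoice.add_item(2.5)
invoice.add_item(4.0)
```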

Don’t repeat yourself (DRY)

Try to observe the DRY principle.

Do your absolute best to avoid duplicate code. Duplicate code is bad because it means that there’s more than one place to alter something if you need to change some logic.

Imagine if you run a restaurant and you keep track of your inventory: all your tomatoes, onions, garlic, spices, etc. If you have multiple lists that you keep this on, then all have to be updated when you serve a dish with tomatoes in them. If you only have one list, there’s only one place to update!

Often you have duplicate code because you have two or more slightly different things, that share a lot in common, but their differences force you to have two or more separate functions that do much of the same things. Removing duplicate code means creating an abstraction that can handle this set of different things with just one function/module/class.

Getting the abstraction right is critical. Bad abstractions can be worse than duplicate code, so be careful! Having said this, if you can make a good abstraction, do it! Don’t repeat yourself. Otherwise, you’ll find yourself updating multiple places any time you want to change one thing.

Bad example

from dataclasses import dataclass
from typing import Dict, List


# Developer and Manager are not defined in the original snippet;
# these minimal dataclasses are assumed so that the example runs.
@dataclass
class Developer:
    experience: float
    github_link: str


@dataclass
class Manager:
    experience: float
    github_link: str


def get_developer_list(developers: List[Developer]) -> List[Dict]:
    developers_list = []
    for developer in developers:
        developers_list.append({
            'experience': developer.experience,
            'github_link': developer.github_link
        })
    return developers_list


def get_manager_list(managers: List[Manager]) -> List[Dict]:
    managers_list = []
    for manager in managers:
        managers_list.append({
            'experience': manager.experience,
            'github_link': manager.github_link
        })
    return managers_list


# Create a list of developer objects
company_developers = [
    Developer(experience=2.5, github_link='https://github.com/1'),
    Developer(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_developer_list(developers=company_developers)

# Create a list of manager objects
company_managers = [
    Manager(experience=4.5, github_link='https://github.com/3'),
    Manager(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_manager_list(managers=company_managers)

As you can see, we are just repeating our code to get the manager and developer lists. We can make it clearer and cleaner with an Employee abstraction that covers both managers and developers. See the good example below.

from dataclasses import dataclass
from typing import Dict, List


# frozen=True keeps the attributes read-only, as the properties did.
@dataclass(frozen=True)
class Employee:
    experience: float
    github_link: str


def get_employee_list(employees: List[Employee]) -> List[Dict]:
    employees_list = []
    for employee in employees:
        employees_list.append({
            'experience': employee.experience,
            'github_link': employee.github_link
        })
    return employees_list


# Create a list of developer objects
company_developers = [
    Employee(experience=2.5, github_link='https://github.com/1'),
    Employee(experience=1.5, github_link='https://github.com/2')
]
company_developers_list = get_employee_list(employees=company_developers)

# Create a list of manager objects
company_managers = [
    Employee(experience=4.5, github_link='https://github.com/3'),
    Employee(experience=5.7, github_link='https://github.com/4')
]
company_managers_list = get_employee_list(employees=company_managers)
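If the distinction between developers and managers still matters elsewhere in the code, one possible variation keeps both as subclasses of Employee while sharing the single list function. This is a sketch of that assumption, not the version from our repository; the empty Developer and Manager subclasses are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Union


@dataclass(frozen=True)
class Employee:
    experience: float
    github_link: str


# Assumed variation: the subclasses add nothing yet,
# but they keep the two roles distinct.
class Developer(Employee):
    pass


class Manager(Employee):
    pass


def get_employee_list(
    employees: Sequence[Employee],
) -> List[Dict[str, Union[float, str]]]:
    """Convert any mix of employees into plain dictionaries."""
    return [
        {"experience": e.experience, "github_link": e.github_link}
        for e in employees
    ]


staff = [
    Developer(experience=2.5, github_link="https://github.com/1"),
    Manager(experience=4.5, github_link="https://github.com/3"),
]
staff_list = get_employee_list(staff)
```

The function still exists only once, so a change to the dictionary layout has to be made in a single place.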

The Development Process with These Guidelines

I explained the benefits of each guideline independently, but how does applying all of these guidelines affect the software development process?

There’s an old joke that famous entrepreneurs like Bill Gates and Mark Zuckerberg wear the same clothes every day to save the time needed to pick what to wear.

This has a similar vibe. By setting guidelines on naming conventions, we no longer waste time thinking “hmm, what should I name this new function?”. By setting guidelines on using type hints in functions, we no longer have to look up what to pass to a function or what it returns. Therefore, we can focus on developing the actual code with confidence instead of on the little things.

Setting common ways to do things also makes it easier to review teammates’ code. When reviewing code, I first check whether the guidelines we set were followed. Then I know where to look for the important logic inside the code.

For example, say my friend creates a merge request for a new endpoint. First, I check whether they created any function without type hints. If so, that MR automatically gets a change request.
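Part of this check could be automated. The sketch below is one possible approach using the standard inspect module, not the tool our team uses; missing_type_hints and the two sample functions are hypothetical.

```python
import inspect
from typing import Callable, List


def missing_type_hints(func: Callable) -> List[str]:
    """Return the names of unannotated parameters, plus 'return' if needed."""
    signature = inspect.signature(func)
    missing = [
        name
        for name, parameter in signature.parameters.items()
        if name not in ("self", "cls")
        and parameter.annotation is inspect.Parameter.empty
    ]
    if signature.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def untyped(token):
    return token


def typed(token: str) -> str:
    return token
```

Here missing_type_hints(untyped) reports both the parameter and the return type, while missing_type_hints(typed) returns an empty list. A static checker such as mypy offers a comparable check through its --disallow-untyped-defs option.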

Finale

Coding in Python can be quite freeing. This article did not discuss the common clean code practices such as “short methods” and “meaningful names”, since there are tons of articles out there for this. This article shared the guidelines our team set based on real problems/disagreements that occurred during the development process.

In my opinion, clean code is not always the shortest or the neatest code, but the best code for the current problem. That’s why setting a team guideline in languages such as Python is essential to smooth the development process and ease collaboration.
