If you sniff the above code, you can easily spot code smells. There are many including-
- Inconsistent variable naming (email, email_json, email_json_data)
- Magic numbers (indent=2),
- Limited error/exception handling
- Long methods, such as parseContent try to do too many things- perform regex search, fix duplicity in data, convert to JSON, etcetera
Among many things that you can do to clean this code, the simplest is to use the Extract Method, wherein you organize the blocks of code into different functions to make them comprehensible, and maintainable. Here’s how you can do it-
# Extracting Multiple Methods from Long Methods to Avoid such Code Smells
def parseContent(self, page_content):
result = {
"email_json": self.parse_and_convert_to_json(self.parse_emails(page_content)),
"social_json": self.parse_and_convert_to_json(self.parse_social_media(page_content)),
"address_json": None,
"phone_json": None
}
return result
def parse_and_convert_to_json(self, data_list):
if data_list:
data_set = set(data_list)
data_json_data = {str(i): data for i, data in enumerate(data_set)}
return json.dumps(data_json_data, indent=2)
return None
def parse_emails(self, page_content):
email_list = re.findall(r'\b[\w.-]+?@\w+?\.\w+?\b', page_content)
return email_list
def parse_social_media(self, page_content):
social_list = re.findall(r'https?://(?:www\.)?(?:facebook|twitter|instagram|linkedin|youtube)\.\w+/(?:[\w-]+/?)*', page_content)
return social_list
Did you observe how the long method is now broken into 4 methods with singular goals for each? Didn’t avoiding the code smells make the code more readable and understandable?
Data Clumps Code Smells
The identical groups of variables (clumps) that are not organized under a specific custom data type or data class are data clumps code smells. At times, they make code cluttered.
# Data Clump Code Smells Example- Repeated Company Address Information
company1_name = "ABC Inc."
company1_street = "123 Main St"
company1_city = "Cityville"
company1_state = "CA"
company1_zip = "12345"
company2_name = "XYZ Corp."
company2_street = "456 Elm St"
company2_city = "Townsville"
company2_state = "NY"
company2_zip = "54321"
# ... more companies with the same address fields
Data clump code smells could easily be addressed by using a dictionary or a custom data class. For example, you can use a custom data class for the above--
# Using a Custom Data Class to Avoid Code Smells (Data Clump)
class CompanyAddress:
def __init__(self, name, street, city, state, zip_code):
self.name = name
self.street = street
self.city = city
self.state = state
self.zip_code = zip_code
company1 = CompanyAddress("ABC Inc.", "123 Main St", "Cityville", "CA", "12345")
company2 = CompanyAddress("XYZ Corp.", "456 Elm St", "Townsville", "NY", "54321")
# ... more companies as instances of CompanyAddress
Primitive Obsession Code Smells
These are code smells that implement concepts in code using primitives, which are not type-safe and can lead to problems in the code.
# Primitive Obsession Code Smells Example- Using dictionaries for orders
order1 = {
"order_id": 1001,
"customer_id": 500,
"total_amount": 75.99,
"order_date": "2023-11-06",
"items": [
{"product_id": 1, "quantity": 2},
{"product_id": 2, "quantity": 1}
]
}
order2 = {
"order_id": 1002,
"customer_id": 501,
"total_amount": 49.99,
"order_date": "2023-11-07",
"items": [
{"product_id": 3, "quantity": 3}
]
}
# ... more orders represented as dictionaries
A simple solution to primitive obsession code smells could be custom classes/objects.
# Define Custom Classes for Orders to Avoid Code Smells (primitive obsession)
class Order:
def __init__(self, order_id, customer_id, total_amount, order_date, items):
self.order_id = order_id
self.customer_id = customer_id
self.total_amount = total_amount
self.order_date = order_date
self.items = items
class OrderItem:
def __init__(self, product_id, quantity):
self.product_id = product_id
self.quantity = quantity
# Using custom classes for orders
order1 = Order(
order_id=1001,
customer_id=500,
total_amount=75.99,
order_date="2023-11-06",
items=[
OrderItem(product_id=1, quantity=2),
OrderItem(product_id=2, quantity=1)
]
)
order2 = Order(
order_id=1002,
customer_id=501,
total_amount=49.99,
order_date="2023-11-07",
items=[OrderItem(product_id=3, quantity=3)]
)
# ... more orders represented as instances of the Order class
Long Parameter List Code Smells
If you’re passing four or more parameters to a function call, you can consider that to be a case of long parameter list code smells.
# A Function with Long Parameter List Code Smells Example
def create_order(order_id, customer_id, product_id, quantity, price, shipping_address, billing_address, payment_method, promotion_code, order_date):
# Long and complex logic for creating an order
pass
As a solution, you can either split the function into smaller ones aligned with the single responsibility principle or if you need to preserve the variables together, define the variables within an object and pass that as a parameter.
# Configuration Object to Avoid Code Smells (long parameter list)
class OrderConfig:
def __init__(self, order_id, customer_id, product_id, quantity, price, shipping_address, billing_address, payment_method, promotion_code, order_date):
self.order_id = order_id
self.customer_id = customer_id
self.product_id = product_id
self.quantity = quantity
self.price = price
self.shipping_address = shipping_address
self.billing_address = billing_address
self.payment_method = payment_method
self.promotion_code = promotion_code
self.order_date = order_date
def create_order(order_config):
# Access parameters using the config object
order_id = order_config.order_id
customer_id = order_config.customer_id
# ... (access other parameters)
# Long and complex logic for creating an order
pass
That sums up the bloater code smells. Bloaters are a result of poor coding practices, quick hotfixes & immediacy, deadlines, imprudence, and developer oversight.
2. Object-Orientation Abusers
When object-oriented programming (OOP) concepts like inheritance, polymorphism, encapsulation, and abstraction are violated, it paves the way for code smells to creep into your code. Common object-oriented programming-related code smells include-
Refused Bequest Code Smells
These code smells are a phenomenon when a subclass inherits the methods of a parent class but doesn’t use it. In the long run, it creates confusion for those maintaining the code.
# A Function with Refused Bequest Code Smells Example
class MarketingCampaign:
def create_campaign(self):
print("Campaign created.")
def launch_campaign(self):
print("Campaign launched.")
class EmailCampaign(MarketingCampaign):
def send_email(self):
print("Email sent to subscribers.")
class SocialMediaCampaign(MarketingCampaign):
def create_post(self):
print("Posted on social media platforms.")
def post_to_social_media(self):
print("Posted on social media platforms.")
email_campaign = EmailCampaign()
email_campaign.create_campaign()
email_campaign.send_email()
email_campaign.launch_campaign()
social_media_campaign = SocialMediaCampaign()
social_media_campaign.create_post()
social_media_campaign.post_to_social_media()
The class SocialMediaCampaign inherits MarketingCampaign but doesn’t make use of the inherited methods. This is the case of refused bequest code smells. Such code smells are problematic as they result in inefficient use of resources.
You can use delegation instead of inheritance, or depending on what sections of the parent class could be used in the inherited class infected with a refused bequest. You may also consider extracting the parent class methods and objects into a super class. Then, make both parent and subclass inherit that super class.
Switch Statements, aka Conditional Complexity Code Smells
In Python, there is no switch statement. But programming languages like Javascript do have it. Switch statements, or cascading IF, ELSE statements when become complex result in conditional complexity code smells.
Ideally, it is recommended to prefer polymorphism over Switch statements.
- Conditional complexity code smells violate the open-closed principle i.e., the base class should be open for extension but closed for modification.
- Cascading statements force you to think of all possible logical paths, and thus increase the chances of missing some and introducing vulnerabilities.
- Also, it increases the testing complexity, as each branch would need a subsequent test.
Here’s an example of conditional complexity code smells-
# A Class with Conditional Complexity Code Smells Example
class ShapeHandler:
def handle(self, shape_type):
if shape_type == 'circle':
self.handleCircle()
elif shape_type == 'rectangle':
self.handleRectangle()
elif shape_type == 'triangle':
self.handleTriangle()
def handleCircle(self):
print("Handling a circle")
def handleRectangle(self):
print("Handling a rectangle")
def handleTriangle(self):
print("Handling a triangle")
And here’s the recommended way to avoid code smells that arise from conditional complexity.
class ShapeHandler:
def __init__(self):
self.shape_handlers = {
'circle': CircleHandler(),
'rectangle': RectangleHandler(),
'triangle': TriangleHandler(),
# Add more shape types and handlers as needed
}
def handle(self, shape_type):
shape = self.get_shape_handler(shape_type)
shape.handle()
def get_shape_handler(self, shape_type):
if shape_type in self.shape_handlers:
return self.shape_handlers[shape_type]
raise UnsupportedShapeException("Unsupported shape type")
class CircleHandler:
def handle(self):
print("Handling a circle")
class RectangleHandler:
def handle(self):
print("Handling a rectangle")
class TriangleHandler:
def handle(self):
print("Handling a triangle")
Temporary Field Code Smells
This is one of the most common code smells. Developers create temporary fields/variables at the class level, but they are only used in a particular method temporarily, and remain useless the rest of the time. It becomes unnecessary overhead for the developers to remember the class-level variables when this is done at scale.
For example, consider the following shoppingCart class. In this, discount_amount is a temporary field that you can get rid of to avoid code smells.
class ShoppingCart:
def __init__(self):
self.items = []
def add_item(self, item):
self.items.append(item)
def calculate_total(self):
total = 0
for item in self.items:
total += item.price
return total
def apply_discount(self, discount_percent):
# Temporary field to hold the discount amount
discount_amount = 0
if discount_percent > 0:
discount_amount = (discount_percent / 100) * self.calculate_total()
self.items.append({
'name': 'Discount',
'price': -discount_amount # Negative price to represent a discount
})
cart = ShoppingCart()
cart.add_item({'name': 'Item 1', 'price': 50})
cart.add_item({'name': 'Item 2', 'price': 30})
cart.apply_discount(10) # Apply a 10% discount
print(f"Total: ${cart.calculate_total()}")
The apply_discount function without temporary field code smell would be:
# Avoid Code Smells (Temporary Field)
def apply_discount(self, discount_percent):
if discount_percent > 0:
discount_amount =
self.items.append({
'name': 'Discount',
'price': -(discount_percent / 100) * self.calculate_total() # Negative price to represent a discount
})
Alternative Classes with Different Interfaces Code Smells
Often, two or more classes with similar functionality but different implementations are present in the code base for different use cases. This hints at sprouting code smells.
As a remediation, you can refactor the classes to have the same interface and extract logic that could be a superclass to limit code duplication.
An example could be of calculating validity for daily & monthly passes at coworking offices. Let’s say, we have a class to calculate how many days the pass validity would last.
# Alternative Classes with Different Interfaces Code Smells Example
class DailyPass:
def __init__(self, start_date, amount_paid):
self.start_date = start_date
self.amount_paid = amount_paid
def pass_validity_days(self):
return self.amount // 294
def daily_remaining_validity(self):
today = datetime.now()
days_valid = self.pass_validity_days()
pass_expiry_date = self.start_date + timedelta(days=days_valid)
remaining_days = (pass_expiry_date - today).days
return max(0, remaining_days)
# Example usage:
start_date = datetime(2023, 1, 1) # Replace with your start date
amount_paid = 1000 # Replace with the amount you paid
pass_obj = DailyPass(start_date, amount_paid)
remaining_validity = pass_obj.daily_remaining_validity()
print(f"Remaining validity: {remaining_validity} days")
We have another class to calculate how many months and days would a pass be valid.
# Alternative Classes with Different Interfaces Code Smells Example
class MonthlyPass:
def __init__(self, start_date, amount_paid):
self.start_date = start_date
self.amount_paid = amount_paid
def pass_validity_days(self):
return self.amount // 294
def monthly_remaining_validity(self):
today = datetime.now()
days_valid = self.pass_validity()
pass_expiry_date = self.start_date + timedelta(days=days_valid)
remaining_days = (pass_expiry_date - today).days
remaining_months = remaining_days // 30
remaining_days %= 30
return remaining_months, remaining_days
# Example usage:
start_date = datetime(2023, 1, 1) # Replace with your start date
amount_paid = 1150 # Replace with the amount you paid
pass_obj = MonthlyPass(start_date, amount_paid)
remaining_months, remaining_days = pass_obj.monthly_remaining_validity()
print(f"Remaining validity: {remaining_months} months and {remaining_days} days")
If you observe closely, both the classes have similar functionality, but with personalized logic implementation and different interfaces i.e., monthly_remaining_validity and daily_remaining_validity. This is a case of alternative classes with different interface code smells.
To avoid such code smells, look for opportunities to extract super class from the alternative classes, and inherit it in the respective subclasses that need specific code logic. Here’s how it would be done for the above case-
# Avoid Code Smells (Alternative Classes with Different Interfaces)
class Pass:
def __init__(self, start_date, amount_paid):
self.start_date = start_date
self.amount_paid = amount_paid
def pass_validity_days(self):
return self.amount_paid // 294
def pass_remaining_validity(self):
today = datetime.now()
days_valid = self.pass_validity_days()
pass_expiry_date = self.start_date + timedelta(days=days_valid)
remaining_days = (pass_expiry_date - today).days
return remaining_days
class DailyPass(Pass):
def __init__(self, start_date, amount_paid):
super().__init__(start_date, amount_paid)
def pass_remaining_validity(self):
return super().pass_remaining_validity()
class MonthlyPass(Pass):
def __init__(self, start_date, amount_paid):
super().__init__(start_date, amount_paid)
def pass_remaining_validity(self):
remaining_days = super().pass_remaining_validity()
remaining_months = remaining_days // 30
remaining_days %= 30
return remaining_months, remaining_days
# Example usage for DailyPass:
start_date = datetime(2023, 1, 1)
amount_paid = 1000
pass_obj = DailyPass(start_date, amount_paid)
remaining_days = pass_obj.pass_remaining_validity()
print(f"Remaining validity: {remaining_days} days")
# Example usage for MonthlyPass:
start_date = datetime(2023, 1, 1)
amount_paid = 1150
pass_obj = MonthlyPass(start_date, amount_paid)
remaining_months, remaining_days = pass_obj.pass_remaining_validity()
print(f"Remaining validity: {remaining_months} months and {remaining_days} days")
3. Change Preventers
Next in the code smells list is change preventers. Change preventers code smells is a condition wherein if you change something in one place, you are required to make multiple changes across different methods and classes to make it work properly. It makes it tough to manage coherence in the code.
Divergent Change Code Smells
When a small change in one of the methods of your class requires you to change multiple other methods within the same class, it is a case of divergent change code smells.
For example, in the below code, instead of an html_page code if you pass JSON data to the do_crud function, you will have to change not just the do_crud method, but also the get_leads and modify_leads method, depending on your implementation logic.
# Divergent Change Code Smells Example
class LeadCRUD:
def get_leads(self, html_page):
# ...logic to parse leads
return leads
def modify_leads(self, leads_file, leads):
# ...logic to open leads file and write new leads in it, or update
return modified_leads_file
def do_crud(self, html_page, leads_file):
leads = self.get_leads(html_page)
return self.modify_leads(leads_file, leads)
html_page = requests.get(hypothetical_URL)
lead_manager = LeadCRUD(...)
update_leads = lead_manager.do_crud(html_page, 'leads.csv')
# save updated leads file to the s3 bucket in AWS
To avoid divergent change code smells, you can split the class in a way that each new class performs singular responsibilities. It helps, but don’t overdo it.
# Avoid Code Smells (Divergent Change)
class LeadsParser:
def get_leads(self, html_page)
# Logic to parse leads
return leads
class LeadsModifier:
def update_leads(self, leads_file, leads)
# logic to store & update leads in leads_file
return leads_file
leads_parser = LeadsParser(...)
leads = leads_parser.get_leads(html_page)
leads_modifier = LeadsModifier(...)
leads_file = leads_parser.get_leads('leads_file.csv', leads)
# save updated leads file to the s3 bucket in AWS
Shotgun Surgery Code Smells
Do excessive decoupling of divergent change code smell classes, and you may fall prey to shotgun surgery code smells.
This is exactly the opposite of the divergent change code smells. The symptom of this code smell is that you might need to do the same code changes across different classes.
For example, consider this simple example where you send SMS and email updates to your customers when they make a purchase.
# Avoid Code Smells (Shotgun Surgery)
# email_module.py
class EmailSender:
def send_email(self, recipient, message):
# Email sending logic here
pass
# sms_module.py
class SMSSender:
def send_sms(self, recipient, message):
# SMS sending logic here
pass
# main_application.py
from email_module import EmailSender
from sms_module import SMSSender
def main():
email_sender = EmailSender()
email_sender.send_email("nishant@nishant.com", "Thank you for your purchase!")
sms_sender = SMSSender()
sms_sender.send_sms("888-123-0000", "Thank you for your purchase!")
if __name__ == "__main__":
main()
Now, let’s say you want to modify the message parameter and want to send customized discount offers on the products the customer may have viewed in the past, but didn’t purchase.
To make this simple change, you first need to modify how you pass parameters to the Email class & the SMS class.
Next, you will need to change both the classes to process the dictionary parameter (assume) consisting of leads activity data about product purchases, and the logic to compute personalized discounts based on customer lifetime value. If you had separated the message class earlier, this shotgun surgery code smell could have been avoided.
4. Dispensable
The dispensable code smells are pieces of code that if vanished would make the repository more hygienic, clean, beautiful, and efficient.
Poor coding practices incite you into adding excessive comments in the methods and the classes. It hampers code readability and often signals poor variable, function, and class nomenclature. It could also indicate poor implementation of design patterns.
As a best practice, if your methods need you to comment more than a line, you need to refactor methods and improve variable names, method names, and make use of assertions, etcetera.
Here’s an example of comment code smells-
# Comment Code Smells Example
def factorial(n):
# Check if n is a non-negative integer
if not isinstance(n, int) or n < 0:
raise ValueError("Input must be a non-negative integer")
# Calculate the factorial
result = 1
for i in range(1, n + 1):
result *= i
return result
You can use assertions to get rid of the comments, as then it becomes self-explanatory.
# Avoid Code Smells with Assertion (Comment)
def factorial(n):
assert isinstance(n, int) and n >= 0, "Input must be a non-negative integer"
result = 1
for i in range(1, n + 1):
result *= i
return result
Duplicate Code Smells
Earlier, in the code smells example for alternative classes with different interfaces, you may have observed how the same code was repeated in two different classes. That phenomenon is common even for subclasses, methods, and conditional expressions.
You can detect those duplicate code sections and refactor using class & method extraction, implement inheritance, or get rid of deeply nested conditional statements to avoid duplicate code smells.
Such code smells often find a route into your code when multiple developers or teams are working on the same application.
Dead Code Smells
After enough code logic modifications and feature changes, certain pieces of code become redundant and are useless. Developers often comment these out, but the code continues to remain in your files.
This could lead to errors if someone new to the repository accidentally uncomments the code, and makes it alive.
For example, the below piece of code in Python function is commented out and was an alternative option if recursive behavior didn’t yield the expected results. So, this could be removed, or else it might mess up the code in the future.
def parse(self, response):
# If this recursive yield behavior doesn't work use listing_page_urls
# "listing_page_urls": urls,
# listing_page_urls = response.meta.get("listing_page_urls")
# listing_page_urls.append(response.url)
# Logic for recursive behavior
...
An Extended List of Code Smells
We explained the 14 most commonly occurring code smells that rots your codebase with time. But there are multiple others, that couldn’t be ignored and must be factored out of your code-
There is a GitHub page with a comprehensive list of smells, including code smells, anti-patterns, architecture smells, design smells, and implementation smells. Feel free to explore that in case you want to go deeper (recommended).
What Causes Code Smells?
- Violating OOP design principles
- Broken code review processes
- Poor requirement analysis
- Poorly defined standards, and the definition of done
How to Avoid Code Smells?
We already explained several ways to avoid code smells while explicitly discussing code smell examples. In general, code refactoring helps eradicate code smells from your application code. But rigorous code testing, robust PR processes, adherence to agile sprint planning, and embracing DevSecOps could be pivotal to your success against bad code smells.
Refactoring Tools like IntelliJ IDEA, SonarQube, Sourcery, and others help you optimize your code on the go to maintain good hygiene, and keep code smells at bay. Except for dispensable code smells, most of the code smells are rooted in poorly composed methods, poorly organized data, inefficient coupling & delegation between methods, and complex implementation of functionality using classes and methods.
Refactoring helps you navigate these challenges.
Using techniques like extract methods, extract variables, extract classes, inline classes, and inline functions, you can easily improve the quality of your code. It also helps you abide by the OOP principles.
- Simplify Conditional Expressions.
Consolidate or decompose complex long expressions into simplified structures, and make use of assertions in code. It does make calling & maintaining method calls a cakewalk if you remove unwanted parameters, and introduce parameter objects for better outcomes.
Discover more on how refactoring techniques can help you overcome the challenges of using primitives, assist with data handling, move methods between classes, and deal with generalization to wipe out code smells from your repository.
Rigorous Testing is Equally Pivotal
It is imperative to the success of code refactoring. Before you go for code refactoring, you must ensure that the application code is not breaking during execution, and is free from any major errors. First, fix the bugs, and then go for refactoring. Otherwise, you might be investing your time eradicating smells from the code blocks that won’t even be part of your in-production repository after initial refactorings. Testing helps maintain a consistent quality across the codebase.
Github, Gitlab, Bitbucket, and multiple other CI/CD platforms come coupled with features to effectively carry out code reviews. Thorough code reviews by senior developers and peer programmers help discover anti-patterns, and code smells early in the application lifecycle stage and help mitigate them in a timely fashion to keep your code sharp & succinct. On top of that, pull request (PR) analytics tools, such as Hatica, help niche down on root cause analysis (RCA) for any derailment observed in the PR review process. Code reviews are critical to the health of your repositories and overall engineering excellence. Read this blog to know how you may squeeze the most out of code reviews.
To Sum it up!
Code smells can be seen as a symptom of potential issues in the code, as propagated in this insight. But you can also see them as an opportunity to improve your application code quality, readability, performance, maintainability, and scalability. There are a multitude of code smells, and all of them can be avoided by proactively auditing your code, and sniffing for smells. Whilst, refactoring is by far the best approach to mitigating code smells in your repository, don’t shy away from investing in code review tools, PR analytics tools, and SDLC automation tools to streamline the software delivery process. This will buy your developer a high maker time to innovate and lead.
Subscribe to the Hatica Blog for more engineering Insights.