My kids code PEP 8 compliant, therefore they do Clean Code!
Someone really dumb
This is an actual quote from a teacher. Even hobbyists should know how to write proper code, but it’s rarely taught. When I asked a professor about “how to write good programs”, his answer was “Just look at other people’s code.”, but it’s not enough! We will focus on writing proper Python, but it can be applied to other programming languages too.
What is Clean Code?
I will only give a short approximate answer: a clean codebase is easy to read, understand, use, maintain, and extend. This focuses on the technical part: naming the variables (and maybe some formatting is also part of this), organising the functionalities in an intuitive way, removing technical debt, refactoring confusing and/or long parts. Long story short: if you have to explain what your code does, it’s probably not clean enough. We will have examples later!
What is PEP 8?
PEP 8 is the Style Guide for Python Code. PEPs are Python Enhancement Proposals – official documents about various aspects of the language. A bunch of guys sat together to decide on “how Python code should look like”, for example “use 4 space to indent”. A Style Guide describes the look and feel, they are about purely cosmetic stuff – or are they? Surprisingly they also convey a message with the looks, but let’s see those examples!
Two “clean” codes
These two snippets do the same – but I’m not telling what they do! (yet) Try figure out first, please. Spoiler alert! There is a teeny-tiny difference between the two sides, one of them is a little bit buggy, that’s why they give different results, but now focus on understanding the code.
import hashlib
bl = b"This has to be hashed"
no = 0
trgt = 22
while True:
h = hashlib.md5()
h.update(bl)
h.update(no.to_bytes(length=100, signed=False))
r = h.digest()
ri = int.from_bytes(r)
zc = 0
while ri > 0:
if (ri & 1) == 0:
break
zc += 1
ri >>= 1
if trgt <= zc:
print("The found value:", no)
break
no += 1
import hashlib
def countTrailingZeros ( number: int ):
zeroCount=0
while number>0 :
if number&1 :break
zeroCount+=1
number>>=1
return zeroCount
def miner (block,target: int):
current_nonce=0
while True :
hasher=hashlib . md5 ( )
hasher . update ( block )
hasher . update ( current_nonce.to_bytes ( length=100,signed=False ) )
result=hasher . digest ( )
resultInt=int . from_bytes (result)
zeroes=countTrailingZeros (resultInt)
if zeroes>=target : return current_nonce
current_nonce+=1
if __name__=="__main__" :
block=b"This has to be hashed"
target=22
resultNonce=miner ( block,target )
print ( f"The nonce for this block is { resultNonce }" )
What they actually do
Have you tried really hard to decipher the code? If you heard about the “Proof of Work” concept behind Bitcoin, you may find it familiar, because it is the same! The hashing algorithms – like the used MD5 or SHA-256) make one big integer from anything, but the result number is “pretty random”: change only one bit in the input, and the whole result changes!
Now, the fun begins with zeros: it’s hard to determine how many consecutive zeros will be at the end, and by end, I mean the Least Significant Bits (LSB). There may be none, maybe just 2, or even a dozen! You have to try it for each input. You get a goal: “have at least X zeros at the end”, 22 in our case. You also get an input, and you put a “random” value at the end of our original input, a nonce. If the resulting hash has enough trailing zeros, you won! Otherwise try with the next nonce, hash, check, until you find the result. This is the (in)famous brute force: try until succeed.
Okay, this was the “official documentation” and I don’t expect random people to know how crypto actually works. It was a must-read in this case, because it’s not trivial. However, with this information: how would you rate the snippets?
Rating the code
The left one is pythonic as possible in theory. If you run the flake8 linter, it will find no errors! It 100% matches the PEP 8 Style Guide. But let’s face it: really hard to understand.
The right one gets E for effort: parts were organised into functions, they are properly named (not the pythonic way, still the function tells you, what it does), but has a whopping 50+ style errors while being only 27 lines long!
Let’s face it: while the style and organisation can be separated, they mean nothing alone. As an exercise, try to lick the right code snippet into shape first! Also, this is the right time to find the bug and start cursing the writer. Believe me: nothing will teach you a better lesson than debugging someone else’s crappy code. In exchange I hereby vow not to write such atrocities. (you should too)
The importance of linting
Static code analyzers – like the aforementioned flake8 – are helpful tools. They can even find bugs! While they are not trying to replace software developers, unlike some AI tools, they can substitute a reviewer. As a result, real people can focus on nitpicking on your precious merge request. Some companies have extensive rules and it’s often hard to follow all of them! Even senior devs need a reminder after switching workplaces, because so many companies, so many customs. These tools help enforcing good habits, so use them!
Naming conventions
Now this part may cause some confusion, because this is both design and functional at the same time: how to name variables and functions. Of course try to make variable and function names as expressive as possible! It’s the #1 rule – no “a” or “b”, the name should tell you the purpose of that stuff! There were times when the length of a variable or function name was an issue, and we learned our lessons: no more wacky abbreviations. What do you guess, what does “itoa” do? Well, the opposite of “atoi”, that’s sure, but “intToStr” would have been MUCH better! Still, we stuck with names like these. Don’t repeat our ancestor’s mistakes, please.
Naming conventions have 2 major parts. One is the naming itself, like prefixes and suffixes. This is the “tell everyone, what is this code doing” part. Let’s haven an example, the builtin unittest framework: the test class should start with “Test”, and the test cases should start with “test_”.
Strictly following code conventions
Does your code break, if you don’t follow these conventions? Not always. Some components depend on these, and will fail if you don’t follow the rules. Meanwhile some naming conventions can be loose rules to follow and they are there to aide you – it all depends on the application/library!
The other major part is the cosmetic part in the conventions. These purely exist to help you by hinting a lot of things! In Python, if you see “TestStringMethods”, you can tell it’s the name of a class. Dunders like __name__ are special builtin things and it’s the best not to mess with them, meanwhile anything_written_like_this are regular variables. A _do_something() function tells me that it’s intended scope is the inside of a class, and every language has its rules like this!
For example, C macros ARE_WRITTEN_IN_CAPS_WITH_UNDERSCORES, variables are camelCase, and as Python classes, C++ classes are PascalCase. By the way, this is the snake_case, because of obvious reasons.
How to write clean code?
Well, this topic filled books. Sadly, don’t expect a quick summary on this topic, because it would need its own blog. Try reading Uncle Bob’s Clean Code first, I recommend it. Of course, there are some basic rules:
- it probably needs to be modified, if it needs a comment
- or if it’s not clear at first sight
- several screen long stuff definitely needs refactoring
- if it doesn’t feel right, modify until it feels right
Meanwhile, you can get some personal help if you take part in my Python training – writing proper code is an integral part of the course! Learn all the important parts to get a head start!