Managing Technical Debt in Django Projects

This is fine

I’ve been thinking about this subject a lot, and I’ve been meaning to write something. Rather than procrastinate until I have a lot of material, I’m going to just continuously edit this post as I discover things. Many of these principles aren’t specific to Django, but most of this experience comes from dealing with Django.

Some of these tips don’t cost any time, but some involve investing extra time to do things differently. It’s in the name of saving time in the long run. If you’re writing a few lines of JavaScript that’s going to be thrown away in a day, then you shouldn’t waste time building up a castle of tests around it. Managing technical debt is a design tradeoff. You’re sacrificing some agility and features for developer happiness.

Don’t reuse app names and model names

You can have Django apps with the same name. And you can have models with the same name, but your life will be easier if you can reference a model without having to think about which app it came from. This makes it easier to understand the code; makes it easier to to use tools (grep) to analyze and search makes it easier to use shell_plus, which automatically loads every model for you in the shell.

Leave XXX comments

You should leave yourself task comments in your code, and you should have three levels (like critical, error, warning or high, medium, low) for the priority. I commonly use  FIXME for problems,  TODO for todos,  DELETEME for things that really should be deleted,  DEPRECATED for things I really ought to look at later, and  XXX for documenting code smells and anti-patterns. Some examples:

  • FIXME this will break if user is anonymous
  • TODO  make this work in IE9
  • DELETEME This code block is impossible to reach anyways
  • DEPRECATED use new_thing.method instead of this
  • WISHLIST make this work in IE8
  • XXX magic constant
  • FIXME DELETEME this needs csrf

The comment should be on the same line, so when you grep TODO, you’ll be able to quickly scan what kind of todos you have. This is what other tools like Jenkin’s Task Scanner expect too. Many people say you shouldn’t add TODO comments to your code. You should just do them. In practice, that is not practical, and leads to huge diffs that are hard to review.

There is such a thing has too many comments. For example, instead of writing a comment to explain a poorly named variable, you should just rename the variable.

Naming Things

One of Phil Karlton’s famous two hard things, finding the write names can make a huge difference. With Django code, I find that I’m happiest when I’m following the same naming conventions as the Django core. You should be familiar with the concepts of Hungarian notation and coding without comments (remember the previous paragraph?).

Single letter variables are almost always a bad idea, except with simple code. They’re un-greppable and except for the following, have no meaning:

  • i (prefer idx) — a counter
  • x — When iterating through a loop
  • k, v — Key/Value when looping through dict items
  • na count/total/summation (think traditional for-loop)
  • a, b — When looking at two elements, like in a reduce function (very rare)

An exception is math, where x and y, and m and n, etc. are commonplace.

If the name of your variable implies a type, it should be that type. You would not believe how common the name of the variable lies.

When writing code, it should read close to English. Getter functions should begin with “get_”. Booleans and functions that return booleans commonly begin with “is_” (though anything that is readable and obvious will work).

Write tests

My current philosophy is “if you liked it then you should have put a test on it”. The worst part of technical debt is accidentally breaking things. To its credit, Django is the best framework I’ve used for testing. Unit tests are good for TDD, but functional tests are probably better for managing technical debt because they verify the output of your system for various inputs. Doing TDD, getting 100% coverage, and taking into account edge cases… never happens in practice. That does not mean you shouldn’t try. Adding tests to preventing regressions is the second best thing you can do. And the best thing you can do is to write those tests to begin with.

Get coverage

Running coverage is commonly done at the same time as tests. I skip the html reports and use  coverage report from the command line to get faster feedback. When you have good coverage, you can have higher confidence in your tests.

Be cognizant of when you’re creating technical debt

Of course, every line of code is technical debt, but I’ve started adding a “Technical Debt Note” too all pull requests. This is inspired by how legislation will have a fiscal note to assess how much it would cost. Bills can get shot down because they cost to much for what they promise. Features should be the same. Hopefully, you’re already catching these before you even write code, and you’re writing small pull requests. If we find that a pull requests increases technical debt to an unreasonable degree, we revise the requirements and the code.

Clean as you cook

Most people dread cooking because of the mountain of mess that has to get cleaned after the meal. But if you can master cleaning as you cook, there’s a much more reasonable and manageable mess. As you experiment with code, don’t leave behind unused code clean up inconsistencies as you go. Don’t worry about deleting something that might be useful or breaking something obscure. That’s what source control and tests are there for. Plan ahead for the full life cycle. That means if you’re experimenting with a concept, don’t stop when it works: update everything else to be consistent across the project. If it didn’t work, tear it down and kill it. Don’t get into a situation where you have to support two (or three or four or more) different paradigms.

The Boy Scout rule

“Always leave the campground leaner then you found it”. Feel free to break paradigm that a pull request should only do one thing. If you happen to clean something while working on a feature, there’s no shame in saying you took a little diversion.

Make it easy for others to jump in

Projects with a complicated bootstrapping processes are probably also difficult to maintain. Wipe your virtualenv once in a while. Wipe your database. If it’s painful, and you have to do it often enough, you’ll make it better. Code that doesn’t get touched often has the most technical debt.

tetris game over

Educate your organization that they can’t just ask for a parade of features

This problem fixes itself, one way or another. Either you keep building features and technical debt until you’re buried and everything comes to a standstill and you yell at each other, or you find a way to balance adding features.

Prevent Dependency Spaghetti

Just like how you should try to avoid spaghetti code, having a lot of third party apps that try to pull in dependencies will come back to bite you later on.

  1. Specify requirements with ==, not >=. Not every package uses semantic versioning. And using semver does not guarantee that a minor or bugfix release won’t break something.
  2. Don’t specify requirements of requirements. This is to avoid an explosion of requirements to keep track of.
  3. There’s no easy way to know when it’s safe to delete a requirement. Even if you have good test coverage, your test environment is not the same as production. For example, you can safely delete psycopg2 and still run your tests, but have fun trying to connect to your PostgreSQL database in production.

Don’t support multiple paths

If you’re writing a library consumed by many people, supporting get_queryset and get_query_set is a good idea. For yourself, only support one thing. If you have an internal library that’s used by multiple parts of the code base and you want add functionality while preserving backwards compatibility, you can write a compatibility layer, but then you should update it all within the same pull request. Or at least create an issue to clean it up. Supporting multiple code paths is technical debt.

Avoid Customizing the Admin

The moment you start writing customizations for the admin, you’ve now pinned yourself to whatever the admin happened to be doing that version. Unless you’re running automated browser tests to verify their functionality, you’re setting yourself up for things to break in the future. The Django admin always changes in major ways every version, and admin customizations always have weak test coverage.

Do Code Review and Peer Programming

Code review makes sure that more than one person’s input goes into a feature, and peer programming takes that even further. It helps make sure that crazy functionality and hard to read code doesn’t get into the main code base that others will then have to maintain. If you’re a team of one, do pull requests anyways. You’ll be amazed at all the mistakes and inconsistencies you’ll find when you view your feature all at once. Even better: sleep on your own pull requests so you can see them in a new light.

Don’t Write Unreadable Code

Code review and peer programming are supposed to keep you from writing unmaintainable code. We’ve embraced linters so that we can write code in the same style, coverage so we know when we need to write tests, but what if we could automatically know when we were writing complicated code? We can, using radon or PyMetric‘s McCabe’s Cyclomatic Complexity metric.

Additional Resources

  • http://youtu.be/SaHbtEeu37M Docker and DevOps by Gene Kim, Dockercon ’14
  • Inheriting a Sloppy Codebase by Casey Kinsey, DjangoCon US ’14

Special thanks to Noah S.

Leave a Reply

Your email address will not be published. Required fields are marked *