Document to understand
May 16th, 2007 under Technology, Devel, Thoughts, rengolin

You don’t always get the opportunity to write fresh new code. Sometimes you have to face a huge codebase. Be it one big monolithic program or thousands of small scripts and programs, it doesn’t matter: it’ll definitely be a nightmare. So, how do you avoid stress and get through it with the fewest scratches possible?

It all depends on what you can do with the code…

If the last programmers were nice to you, they prepared test cases, documentation, doxygen comments, in-line comments, a wiki with all the steps to compile, test and use the software, and a plethora of resolved (and explained) problem cases that can happen. I have yet to see that happen…

As Kernighan argues in ;login: (Apr 2006), writing tests, not code, should be the programmer’s primary task. With existing code, documenting should be the primary task, rather than changing the program directly.

If the program is already documented, your first task should be to read both the docs and the code, side by side, but chances are you won’t understand them properly. There’s nothing wrong with you; it’s just that programmers tend to document what they don’t understand and fail to document what they do understand. But if you start changing the old docs you might end up screwing everything up, so here’s a quick tip:

  1. Copy the code to a temp area so you can play at will
  2. Compile the code in your mind: read it, understand where every variable comes from, go backwards to check, and add a comment before each important line saying what it does
  3. Group lines into domains and write a bigger comment for each group. This is especially important for balls of mud where the whole code is in a single function
  4. Check against the old docs and see if what’s written is actually what’s happening. Documentation tends to go out of date very quickly
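
As a minimal sketch of steps 2 and 3, here is what an annotated copy might look like. The function `process_orders`, its input format and the discount rule are all hypothetical, invented only to show one comment per important line and one bigger comment per domain.

```python
def process_orders(lines):
    """Hypothetical legacy function, invented purely for illustration."""
    totals = {}
    for line in lines:
        # === Domain 1: parsing =========================================
        # Each record is assumed to look like "customer,quantity,unit_price".

        # Skip blank lines (the input may end with an empty one).
        if not line.strip():
            continue
        # Split the record into its three comma-separated fields.
        customer, quantity, unit_price = line.strip().split(",")
        # Convert the text fields to numbers before doing arithmetic.
        amount = int(quantity) * float(unit_price)

        # === Domain 2: pricing rule ====================================
        # Orders worth 100 or more get 10% off (assumed rule).
        if amount >= 100:
            amount *= 0.9

        # === Domain 3: aggregation =====================================
        # Accumulate the amount under the customer's running total.
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals
```

Once the domains are marked like this, it is much easier to compare each group against what the old docs claim (step 4), and the groups themselves suggest where a single big function could later be split.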

Steps

One side effect of documenting is that eventually you’ll understand the code better than the original coders. Let me explain. The original coders had one objective in mind: fix the bug. Most of the time, and especially with balls of mud, the fixes tend to be dirty, written as temporary, and left running forever. The original coder, most of the time, didn’t know how that bugfix fit into the whole; he/she just knew it worked.

You, on the other hand, know the whole in a way no other coder has known before, because you have documented it from start to finish and know what’s at stake in each change.

But that power has a catch: the day you stop documenting, you’ll become the “old coder”, stop understanding the code as a whole and start writing bad bugfixes yourself! So be diligent, be patient and, most of all, loyal to your greater purpose: to code better.

Back to the bug fixes… Once you find them you’ll notice that most of them are useless or redundant, or that lots of them can be replaced by a very small and simple shell script. Once you have learnt all the steps of a program you can visualize the flow, and all the shortcuts and optimizations available to you become very clear. You now have the power to change it.

But after all those changes a good chunk of the documentation is useless, especially the parts you wrote, so why bother? I’ll tell you why: if you hadn’t documented in the first place you wouldn’t have been able to optimize that deeply, and throwing away most of your past documentation will (should) encourage you to document it again, at a higher level. Doing so, you’ll see things that you couldn’t before (even when you had the whole system in your mind) because the system was not organized! That’s the next step!

Every time you document everything and see the whole system in your mind, you visualize all the optimizations available. This forces you to throw away old code and docs and redo them, which in turn brings you back to the same point, but one step further.

The pure act of documenting allows you to optimize. Throwing away old docs allows you to go one step further.

Documents in tree

Actually, throwing away old docs might not be a good idea, because people will not understand why you optimized that way and will probably go back to the original code once you’re gone. Instead, keep all documents, organize them in a tree and show only the high-level docs at first. Anyone who wants to know more can go deeper into the tree to find out how it was and why you made it that way.

It’ll also help you understand your own optimizations, and change them in the future when they no longer make sense.
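
One possible way to picture such a tree is a small sketch like the one below. The `DocNode` class, the `render` helper and the sample titles are all assumptions made up for illustration; a wiki or a directory of text files would serve just as well.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocNode:
    """One document in the tree: a title, a short summary, and deeper docs."""
    title: str
    summary: str
    children: List["DocNode"] = field(default_factory=list)


def render(node, max_depth, depth=0):
    """Return the docs as indented lines, descending at most max_depth levels."""
    lines = ["  " * depth + f"{node.title}: {node.summary}"]
    if depth < max_depth:
        for child in node.children:
            lines.extend(render(child, max_depth, depth + 1))
    return lines


# Hypothetical content: the high-level doc on top, the history underneath.
docs = DocNode("Billing", "How invoices are produced today", [
    DocNode("Why the nightly script was removed",
            "It duplicated three older bug fixes", [
        DocNode("Original line-by-line notes", "Kept for history"),
    ]),
])

# Show only the high-level doc at first...
print("\n".join(render(docs, max_depth=0)))
# ...and go deeper only when someone asks how it was and why it changed.
print("\n".join(render(docs, max_depth=2)))
```

The point of the depth limit is that a newcomer sees one line, while the reasoning behind each optimization stays one level down instead of being thrown away.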

Wasting time

Some will argue you’re wasting time documenting when you could understand the code just by looking at it (use the source, Luke), but that just isn’t true. People, like computers, have a short-term and a long-term memory, and most code won’t fit entirely in your short-term memory (if it does, get a better job).

Navigating through your long-term memory is expensive and forces a context switch in your short-term memory, so you won’t be able to make connections properly. But if you have everything written down in an organized structure, it’ll be much easier for your brain (or any machine) to reason about the code.

It’s the same with testing: you may spend half of your project’s time writing test cases, but that could (and probably will) save you hundreds of hours (and probably your job) if something goes wrong, which is very likely to happen.

Finally, writing test cases, documentation and proper code is not only part of your job, it’s part of your greater purpose in life: to code better. No job in the world is worth writing bad code. It’s like lying to keep the job… just don’t do it.
