Debugging Code : Problem Solving Revisited

 

A couple of incidents/discussions happened recently that made me think about this topic again. Here are some random thoughts on a subject that should definitely not be approached in a random manner. 🙂

Systematic Approaches Pay Off

I know it is really boring, but a systematic and meticulous approach will always yield better results than randomly jumping at stuff. I’ve discussed this before here.

It’s easy to become focused on what *you know* is the problem, just because of a gut feeling, without any supporting evidence. When you eventually find the real issue, you feel a bit stupid for looking at the wrong thing for so long.

Sometimes you end up focusing on the symptom, not the root cause. “If I do this, it all works again. Great!” Then the same problem happens the next day. Before you know it, you have a bunch of voodoo operational tasks to keep the system running, with nobody knowing how or why they work.

It really does pay to take a scientific approach to fixing things.

A Leap of Faith

Over time you get to spot patterns, which will sometimes allow you to jump straight to the root cause of a problem without doing the usual legwork. There is nothing wrong with doing this, provided you are willing to accept it won’t always pay off, and you don’t become controlled by your hunches. You have to know when to accept your hunch could be wrong, and take a step back to a more meticulous approach.

This is not a contradiction of the first point. It’s something that you will learn to do because of prolonged use of a systematic approach. Be careful when working with more experienced people, as it is easy to believe their seemingly random approach to problem solving is just that. Random.

I mentioned this here.

Instrument Everything

I can’t emphasise enough how important instrumentation is.

You should be able to determine what went wrong just by looking at the instrumentation, without having to know or look at the code. In my opinion, if you are doing it correctly, non-developers should be able to figure it out from your instrumentation.

We have a perfect example of this in Oracle. I have never seen any source code for the database, but I can diagnose and fix issues by using the instrumentation built into that code. Things like SQL Trace, Real-Time SQL Monitoring, ASH, AWR and ADDM are all possible because of instrumentation in the code.

The problem with Googling solutions is you often see cut-down code examples, which can promote bad programming practices. I have almost no instrumentation in the examples on my website. That’s because I’m trying to keep them small and lightweight. I don’t want you to have to install a bunch of tracing, logging and unit testing packages before you paste in a 10 line bit of example code. That doesn’t mean those things are not important in your real solutions. It’s all about context.
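To give a rough idea of what I mean by instrumentation in a real solution, here is a minimal sketch in Python using only the standard `logging` module. Everything in it is hypothetical and made up for illustration: the `order_loader` logger name, the `instrumented` helper and the `load_orders` function are not from any real system. The point is that the log alone tells you which step failed, for which input, and how long each step took.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("order_loader")


@contextmanager
def instrumented(step, **context):
    """Log the start, end, duration and any failure of a named step."""
    details = " ".join(f"{key}={value}" for key, value in context.items())
    start = time.perf_counter()
    logger.info("START %s %s", step, details)
    try:
        yield
    except Exception:
        elapsed = time.perf_counter() - start
        # logger.exception records the traceback along with the message.
        logger.exception("FAIL %s %s elapsed=%.3fs", step, details, elapsed)
        raise
    else:
        elapsed = time.perf_counter() - start
        logger.info("END %s %s elapsed=%.3fs", step, details, elapsed)


def load_orders(rows):
    """Hypothetical example: count rows that have a positive quantity."""
    loaded = 0
    with instrumented("load_orders", row_count=len(rows)):
        for row in rows:
            with instrumented("validate_row", order_id=row["id"]):
                if row["qty"] <= 0:
                    raise ValueError(f"invalid qty {row['qty']}")
            loaded += 1
    return loaded


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    try:
        load_orders([{"id": 1, "qty": 5}, {"id": 2, "qty": 0}])
    except ValueError:
        print("Load failed. The log shows which step and which row.")
```

With something like this in place, the log for the failing run includes a `FAIL validate_row order_id=2` line, so someone who has never seen the code can still tell which row broke the load.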

A Fresh Pair of Eyes

Your brain is a weird thing. You work on something and get nowhere. You walk away and do something completely different, and you get a flash of inspiration. All that time your brain has been churning it over, and it has come up with the solution. Sometimes walking away is enough to solve the problem.

You can also call someone in to help you. Talking through the problem can help for a couple of reasons.

  1. They don’t have the mental baggage you have, so they might spot something obvious you are refusing to see. 🙂
  2. In explaining the issue to them, you are ordering your thoughts and effectively explaining it to yourself. The net result is you sometimes answer your own question. This is one of the reasons why you should learn to ask questions properly, especially on forums. Formulating the question properly can be enough to reveal the answer.

I wrote about the second point here.

Cheers

Tim…

Author: Tim...

DBA, Developer, Author, Trainer.

2 thoughts on “Debugging Code : Problem Solving Revisited”

  1. Couldn’t agree more with you. I am always wondering why these skills of basic problem solving seem to be so rare. People seem to be content with ‘I retried it and it works now’ instead of understanding what went wrong and whether there is a bigger problem lurking, or even just an intermittent bug. So many times, just setting up a simplified reproducible test case reveals where the problem is (or at a minimum what a reasonable workaround is).

  2. Hi,
    I really agree with the point of having to order your thoughts to explain the issue to someone else. I don’t know how many times I have solved a problem while explaining the symptoms to a colleague.
