Wednesday, September 8, 2010

Else vs Elseif - A note about validations

It's 10 o'clock as I write this one (as opposed to the 7 o'clock of the previous post). Today was a productive day too, with a lot of work done at a good pace. Today I had an argument/discussion (the kind you have with a person, not the kind you pass to a function in your program) with my colleague over a very, very simple issue. It started when I was explaining a piece of code I had written.

From here on, I describe the problem and the discussion as a general article rather than as a conversation.

To state the problem briefly: whenever we use a chain of if/else statements, should the last branch be an else or an elseif? Sounds simple, right? Assume that we write these ifs in a system-level program, where breaking with an irrelevant error is critical. The main points to consider are code readability and whether the code breaks (it should not disrupt the flow of the program by doing something unwanted). There definitely has to be a trade-off between these two, otherwise this would not be a topic worth writing an article about.

Consider a simple example of an if condition. If a boolean is true you perform an action and if the boolean is false you perform some other action.

Case 1 - if..else..:
In this case, there is an else at the end. So the code will look like this:

if a == True then return 1 else return 0


First of all, this code will always perform some action (but not necessarily the correct one). The problem is that the else branch is very loose. Even if the variable a holds something other than True or False, the code still returns 0, which may not always be the expected behavior. Though it may seem that this option is never safe, at times it is. Modules of this type are generally internal system code, where you most likely do not worry about things like input validation. So if your variable can have five possible states and you have two different actions based on those states, with this approach you will write a piece of code like this:

if state == 'state1' or state == 'state2' then return 1 else return 0


Remember that there are five possible values in total that the state variable can take. Our objective is to return 1 if the variable is in state1 or state2 and return 0 otherwise. This is exactly what we have written in this if..else.. sequence. Is this the clean way of doing it? Yes and no. When we are sure that input validation has been done in a higher layer and that the state variable cannot contain anything other than the five expected values, this code will definitely suffice. If not, we may have to rethink it, since it will return 0 not only for the three remaining expected values of the state variable, but for every erroneous value too.
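As a rough Python sketch of Case 1 (the function name classify_loose and the state names state3 to state5 are made up for illustration; the example above only names state1 and state2), the else branch swallows every value that is not state1 or state2, including values that are not valid states at all:

```python
def classify_loose(state):
    """Case 1: if..else. Everything that is not state1/state2 lands in else."""
    if state == "state1" or state == "state2":
        return 1
    else:
        # Reached for state3..state5 (hypothetical names), but also for
        # typos, None, numbers, or anything else the caller passes in.
        return 0


print(classify_loose("state1"))  # a value we expect to map to 1
print(classify_loose("state4"))  # a value we expect to map to 0
print(classify_loose("stat1"))   # a typo: silently treated like state3..state5
```

The last call is the risky one: a misspelled state produces the same result as a legitimate one, so the mistake goes unnoticed unless some higher layer has already rejected it.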

Case 2 - if..elseif..:
In this case, there is an elseif at the end. So the code will look like this:

if a == True then return 1 elseif a == False then return 0


Here, even though we are in an internal system-level module, we put in a little extra effort and add an elseif constraint, which makes the code stronger. In the five-state example, we would list all five possible values of the state variable explicitly, so that in future you need not look anywhere else for the values the variable can take (thus enhancing code readability). I have always felt it is good practice to do the validations in all layers of the code, as that really helps if we decide to bypass a layer for some reason.
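A possible Python version of Case 2 for the five-state example might look like the sketch below. The function name classify_strict and the state names state3 to state5 are hypothetical. What to do when no branch matches is not something the discussion settled; raising an error is only one option, chosen here as an assumption. Without that last line, a Python function would quietly return None for an unexpected value.

```python
def classify_strict(state):
    """Case 2: if..elif with every expected value listed explicitly."""
    if state in ("state1", "state2"):
        return 1
    elif state in ("state3", "state4", "state5"):  # hypothetical names
        return 0
    # Assumption: fail loudly when no branch matches. Another policy
    # (logging, a default value) could be used here instead.
    raise ValueError(f"unexpected state: {state!r}")


print(classify_strict("state2"))
print(classify_strict("state5"))
try:
    classify_strict("stat1")
except ValueError as err:
    print("rejected:", err)
```

All five values are now visible in one place, and a value outside that list can no longer pass as a legitimate state.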

Validating in every layer is very similar to doing validations both on the client side (using JavaScript) and on the server side (using a server-side script such as PHP) in a website. We do this because the client-side layer can be bypassed, so invalid inputs may reach the server, and we don't want it to break because validations are missing on the server side.

Here, even though the variable is just a boolean, it is better to always use the most constrained form of the conditions, so that with little effort now you may save a large amount of code change when you decide to bypass a layer in the future.
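For the boolean case, one way to write the most constrained form in Python is sketched here (to_flag is a made-up name, and raising TypeError is an assumption about how to react). One detail specific to Python: 1 == True evaluates to True, so a check written as a == True would also accept the integer 1. Comparing with "is" only accepts the actual boolean values.

```python
def to_flag(a):
    """Return 1 for True and 0 for False; reject everything else."""
    if a is True:
        return 1
    elif a is False:
        return 0
    # Assumption: anything that is not a real boolean is treated as an error.
    raise TypeError(f"expected a bool, got {type(a).__name__}")


print(to_flag(True))
print(to_flag(False))
try:
    to_flag("yes")
except TypeError as err:
    print("rejected:", err)
```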

The actual problem statement now changes to this: when your entire system is architected into different layers (user interface, application logic, backend, etc.), should you validate upon entering each layer, or is validation in the topmost layer alone enough? The answer is again yes and no. The one big point to consider before making such a decision is that if you validate only at the top and later want to bypass a layer, you will have to change the code so that the validations are still done appropriately.

Even otherwise, my opinion is that it is always good practice to perform the validations in all the layers, as there is a possibility of an input being corrupted (when one layer passes it on to the next, or some such). This is how big a question a simple else vs elseif problem can lead to. When you write code that is not going to be owned by you alone, it is always good to think of such issues before you write each and every line.
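To make the layering idea concrete, here is a small Python sketch with two hypothetical layers that both validate. All names in it (VALID_STATES, validate_state, application_layer, backend_layer, and the state names) are invented for illustration and do not come from any real system.

```python
# Hypothetical set of the five expected states.
VALID_STATES = frozenset({"state1", "state2", "state3", "state4", "state5"})


def validate_state(state):
    """Shared check that each layer runs on entry."""
    if state not in VALID_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    return state


def backend_layer(state):
    # Validates again, even though callers are expected to have done so.
    validate_state(state)
    return 1 if state in ("state1", "state2") else 0


def application_layer(raw_input):
    # The upper layer cleans up the raw text and validates it.
    state = validate_state(raw_input.strip().lower())
    return backend_layer(state)


print(application_layer(" State1 "))  # normal path through both layers
print(backend_layer("state3"))        # application layer bypassed, valid input
try:
    backend_layer("bogus")            # application layer bypassed, bad input
except ValueError as err:
    print("rejected by backend:", err)
```

Because backend_layer checks its own input, calling it directly does not let a bad value through. The cost is one repeated check per layer, which is the trade-off discussed above.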

This may seem like the dumbest article you have ever read. But it has taught me a good lesson about where and where not to validate data.

Comments are welcome as always. :-)

-Vignesh