Monday, September 6, 2010

Infamous common problem !! :-)

I have had a productive day so far today. Its 7'o clock in the evening when i am writing this and i really feel like i have learnt a lot of new things today. I had a really long discussion regarding work with my team lead and he bought up an interesting point that i thought is worth sharing. I was wondering why noone had not put this formally yet (may be somewhere in the DBMS course).

It is one of the most recurring scenario in a software engineer's life that he may have to perform an action and print a log message corresponding to that action. The question is, how to keep these two operations atomic. Ok, now for those who are not familiar with the term, atomic operation is one that follows the "all" or "nothing" principle. A good example would be, if i transfer 1 crore to you, it involves two separate processes: 1) Debit 1 crore from my account and 2) Credit 1 crore to your account. This operation definitely have to be atomic as doing just one of them and not the other will make the bank go bankrupt. So thats an atomic operation (I wonder why it is called so as atoms themselves are divisible into 3 sub particles!!).

So whats with this action and log message you were talking about? Yes, coming to that, usually we perform an action (say increasing 1000 rupees in one's account balance) and to indicate that we have done this operation we write a message (probably somewhere to a log file) saying that we have increased the account balance 1000 rupees (and possibly include the timestamp). In this case, the presence of the message in the log file indicates that one's account balance has been increased by 1000 rupees. If that message isn't there in the file, then it means that we haven't increased 1000 rupees in the account.

Lets say that this operation (of increasing 1000 rupees in one's account) is to be done strictly once. No matter how many requests come, the operation has to be performed only once. A layman or a newbie programmer might think, ah, this is simple, i just have to look at the log file and if the message does not exist then i have to increase the balance by 1000 or else i should print an error message. This is what most of the people do without analyzing the possibily that the first part of the operation (increasing the balance) might have completed successfully and the second part of the operation (writing that in the log file) might have failed (due to some reason) in which case the operation might be performed more than once on receiving successive requests which can really result in a nightmare if the singleton-ness of the execution is really critical.

Since nothing is ideally (100%) atomic, it is impossible to achieve the exact singleton-ness of execution. It can be ensured that an operation is carried out no more than one time and it can also be ensured that an operation is carried out atleast once but not both (It is sort of equivalent to saying, <=1 and >=1 is possible but =1 is never possible).

Here I make a comparison of two different approaches to tackle this problem.

Approach 1 (Action First - Log Next):
Action(Completely) -> Log

This is the default approach that many people take without realising the seriousness of this problem. As i mentioned earlier, here there is a possibility that the action might be complete but the logging might fail, resulting in possibly multiple executions of the operation. Since this approach performs the action first, no matter how heavy the action is, the logging is done only after the entire action is complete. So in case of a heavy action (where only certain part of the code is singleton critical whereas the rest of the code can be run multiple times without any problem. Whereas it is not possible to separate the action into singleton critical and non-singleton critical parts in this approach, as the entire action is carried out first followed by the logging. According to me, this is one major drawback of this approach. This method gives the guarantee that the action is executed atleast once.

Approach 2 (Log First - Action Next):
Preparatory Action(Need not be singleton) -> Log -> Action (Singleton)

This is the approach that many people don't even know the existence of. Why not log first and action next? As i already said, now we can divide the action into two parts - singleton critical part and the non-singleton critical part. Now the execution goes like this, 1) non-singleton critical part of the action (lets call it "preparatory action" as it can afford to run multiple times). 2) Log the message and 3) singleton critical part of the action. So what is better about this approach? First of all, as i said earlier, this does not solve the problem completely. This approach gives the guarentee that the action is executed atmost once, which is slightly an expected behavior than the previous approach. Moreover, we are making the action lighter by splitting it into two parts. Even if the split isn't possible (there is no non-singleton critical section in the action), this approach still guarantees that the action is executed no more than once.

This is a simple yet powerful fact. It is never possible to make (as of now and as far as i understand) an action ideally atomic (meaning, executed exactly once). Post your thoughts on comments :-)

P.S.: The basic idea of these two approaches is my colleague's and i have portrayed that in my own words with my own examples.

-Vignesh