Tuesday, August 9, 2011

How I hacked an Android game with Python and OCR!

Math Workout is a famous Android game. In fact, it features in the top 5 Google results for many "math game + Android" queries. The objective of the game is very simple: it fires simple math questions one after the other and you have to tap in the correct answer. It's a race against the clock, competing with other users of the app around the world.

Here's what the app looks like, along with a few screenshots of questions:

As you can see, the game is fairly straightforward, so it's the clock that you have to beat. A naive approach would be to keep a calculator or a computer nearby, feed in each question to work out the answer, and type it back into the phone. Totally manual!

That's when the programming neurons in my brain started itching: surely this could be automated and cheated by some means. Come on, think, think! So I sat down to solve this problem over the weekend and started thinking about ways I could attack it.

These are the steps that first came to mind:

  1. Grab a screenshot of every question
  2. Crop the screenshot so that only the question is visible
  3. Run the cropped image through an OCR engine
  4. Parse the result and evaluate it
  5. Identify the on-screen co-ordinates of each character of the result and simulate the corresponding touch events on the phone

Bummer! Every step looked a bit complex in itself at first sight. Then came a bit of googling and, voila, I found the perfect tool for steps 1, 2 and 5: the monkeyrunner tool that comes with the Android SDK. It exposes a Python API through which I can grab and crop screenshots and simulate touch events at a given (x, y) co-ordinate. Exactly what I wanted.
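As a rough sketch of steps 1, 2 and 5, the monkeyrunner calls involved look something like the following. The helper names `grab_question` and `tap`, the crop rectangle and the tap position are made up for illustration; the real values depend on the phone's screen. monkeyrunner scripts run under Jython, so the `com.android.monkeyrunner` import only works when the script is launched through the SDK's monkeyrunner tool, which is why it is kept inside `main()` here.

```python
def grab_question(device, crop_box, out_path):
    """Steps 1 and 2: take a screenshot and keep only the question area."""
    snapshot = device.takeSnapshot()           # MonkeyImage of the full screen
    question = snapshot.getSubImage(crop_box)  # crop_box is (x, y, width, height)
    question.writeToFile(out_path, "png")
    return out_path


def tap(device, x, y, touch_type):
    """Step 5: simulate a single tap at pixel (x, y)."""
    device.touch(x, y, touch_type)


def main():
    # Only importable when started with the Android SDK's monkeyrunner tool.
    from com.android.monkeyrunner import MonkeyRunner, MonkeyDevice

    device = MonkeyRunner.waitForConnection()
    # Hypothetical crop rectangle and tap position, for illustration only.
    grab_question(device, (0, 100, 480, 120), "question.png")
    tap(device, 240, 600, MonkeyDevice.DOWN_AND_UP)
```

Passing the device in as a parameter keeps the two helpers easy to try out with a stand-in object before pointing them at a real phone.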

Now I have the cropped image containing the question in hand. The next step is to run it through an OCR engine. Again, googling told me that ocrad is a useful command-line OCR tool, available as part of the GNU project. I installed it and found that it cannot process PNG images, so I had to run the image through a converter before passing it to ocrad. This small piece of shell script helped me accomplish that:
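The conversion-plus-OCR step can be sketched roughly as below. This is not the ocr.sh script itself: it is written in Python instead of shell, and it assumes ImageMagick's `convert` is the converter (any PNG-to-PNM tool would do) and that both `convert` and `ocrad` are on the PATH. The function names `ocr_commands` and `ocr_png` are made up for this example.

```python
import subprocess


def ocr_commands(png_path):
    """Build the two commands: PNG -> PNM on stdout, then PNM on stdin -> text."""
    convert_cmd = ["convert", png_path, "pnm:-"]  # assumes ImageMagick is installed
    ocrad_cmd = ["ocrad", "-"]                    # "-" makes ocrad read from stdin
    return convert_cmd, ocrad_cmd


def ocr_png(png_path):
    """Pipe the converted image into ocrad and return the recognised text."""
    convert_cmd, ocrad_cmd = ocr_commands(png_path)
    converter = subprocess.Popen(convert_cmd, stdout=subprocess.PIPE)
    ocrad = subprocess.Popen(ocrad_cmd, stdin=converter.stdout,
                             stdout=subprocess.PIPE)
    converter.stdout.close()  # so ocrad sees end-of-file when convert finishes
    text, _ = ocrad.communicate()
    converter.wait()
    return text.decode("ascii", "replace").strip()
```

Calling `ocr_png("question.png")` would then return the question as a plain string, ready for parsing.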

To keep things simple, the shell script is invoked from Python using os.popen(). Now I have the actual expression as a Python string. As you can see from the sample screenshots, a few questions can be solved by a direct "eval" whereas others require some processing. Basic operations like addition, subtraction, multiplication and division can be solved using "eval", whereas questions like "10% of 20" or "square root of 9" need some processing first. That's what the following if/else block does:
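A minimal sketch of such an if/else block is shown here, assuming the OCR text arrives in exactly the forms quoted above ("10% of 20", "square root of 9", or plain arithmetic). The function name `evaluate` and the regular expressions are illustrative, and unlike a bare "eval" this version first checks that the string contains only digits and basic operators.

```python
import math
import re


def evaluate(question):
    """Turn one OCR'd question string into a number."""
    text = question.strip().lower()
    percent = re.match(r"^(\d+)\s*% of\s*(\d+)$", text)
    if percent:
        # "10% of 20" -> 10 * 20 / 100
        value = int(percent.group(1)) * int(percent.group(2)) / 100.0
    elif text.startswith("square root of"):
        # "square root of 9" -> sqrt(9)
        value = math.sqrt(int(text[len("square root of"):]))
    elif re.match(r"^[\d\s+\-*/().]+$", text):
        # Plain arithmetic: only digits and basic operators reach eval.
        value = eval(text, {"__builtins__": {}}, {})
    else:
        raise ValueError("unrecognised question: %r" % question)
    value = float(value)
    # Whole-number results become ints so they can be typed digit by digit.
    return int(value) if value == int(value) else value
```

For example, `evaluate("10% of 20")` gives 2 and `evaluate("square root of 9")` gives 3. Real OCR output may well contain misread characters, which this sketch does not try to repair.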

Now that the expression is evaluated and we have the result in hand, all that's left is to go through the result character by character and simulate touch events at the corresponding positions on the screen. I identified the co-ordinates of each number on the screen by trial and error and hard-coded those values in two functions named getx() and gety(), which take a character and return its x and y co-ordinates respectively. With those in place, the simulation can happen. Here is the code snippet:
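The idea can be sketched as follows. getx() and gety() are the function names mentioned above, but everything else is an assumption: the `KEY_POSITIONS` table holds invented co-ordinates for a made-up keypad layout, and `taps_for` and `enter_answer` are hypothetical helpers. The real values have to be found by trial and error on the actual device.

```python
# Hypothetical keypad co-ordinates (x, y) in pixels; not the real game's layout.
KEY_POSITIONS = {
    "1": (60, 500), "2": (180, 500), "3": (300, 500),
    "4": (60, 580), "5": (180, 580), "6": (300, 580),
    "7": (60, 660), "8": (180, 660), "9": (300, 660),
    "0": (180, 740),
}


def getx(ch):
    """Return the x co-ordinate of the on-screen key for character ch."""
    return KEY_POSITIONS[ch][0]


def gety(ch):
    """Return the y co-ordinate of the on-screen key for character ch."""
    return KEY_POSITIONS[ch][1]


def taps_for(answer):
    """List the (x, y) taps needed to type the answer, one per character."""
    return [(getx(ch), gety(ch)) for ch in str(answer)]


def enter_answer(device, answer, touch_type):
    """Send each tap to the device; under monkeyrunner, touch_type would be
    MonkeyDevice.DOWN_AND_UP."""
    for x, y in taps_for(answer):
        device.touch(x, y, touch_type)
```

With this table, typing the answer 12 means one tap on the "1" key followed by one tap on the "2" key.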

To orchestrate this whole process and play the game fully automatically, a few other small additions are needed, such as coping with the frame rate of the phone and taking care of screenshot/OCR lags. These are handled by minor if conditions and sleeps for very small amounts of time.
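One way such a loop could be put together is sketched below. The specific check used here (skip a round when the OCR text is empty or identical to the previous question, on the assumption that the screen has not refreshed yet) is a guess at what the "minor if conditions" might look like, not the script's actual logic. The name `play` and its parameters are illustrative; the three callables stand in for the screenshot/OCR, evaluation and tapping steps.

```python
import time


def play(read_question, solve, enter, rounds, pause=0.2, sleep=time.sleep):
    """Run the grab -> solve -> tap cycle and return how many answers were sent."""
    answered = 0
    last_question = None
    for _ in range(rounds):
        question = read_question()
        # Assumed guard: empty or repeated text means the screen is lagging.
        if question and question != last_question:
            enter(solve(question))
            last_question = question
            answered += 1
        sleep(pause)  # short explicit pause between questions
    return answered
```

The 0.2 second default matches the explicit sleep between questions used in the script; making `sleep` a parameter simply allows the loop to be exercised without real delays.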

The end result is what you see in the screenshot below :-P

Here is a video of what the game looks like when it is being played by my script:

Though these steps seem a bit expensive computationally, in practice I found them to be really fast. The script was able to answer approximately 2 questions per second (including an explicit sleep of 0.2 seconds between two questions, which means the actual work for 2 questions takes about 0.8 seconds). A C/C++ program might run faster than this, but I stopped here as I had accomplished what I wanted. Overall it was a fun-filled Sunday! :-)

Here is a link to the full source code of the automated script: auto_math_workout.py (you can find ocr.sh in the gist above on this page; the rest of the source code is in the link)

Any comments/feedback are welcome! :-)