Sunday, October 3, 2010

DBMS - The art that is taught as a science !!

Note: This post is not intended for a general audience. It rambles on about something that current college syllabi lack and suggests some changes to improve them. If that sounds interesting, read on. Otherwise just close this tab and move on with your work instead of wasting time here :-)

We all know that a database is software that is used to store, organize, blah blah blah our data. Anyone taking up a programme related to computer science will have learnt this definition for a short answer question in his/her database management systems course. After having worked with a database that grows by approximately 40000 rows a day, I now realise that college level courses on DBMS could do much better than they are doing now.


How do Indian college DBMS courses work? Let's have a quick bulleted list of the things a student does during a DBMS course spanning one full semester.

  • Reads through the differences between File Management System (FMS) and DBMS

  • Reads about ACID properties and memorises them for the examination

  • Reads about the various normal forms (and memorises them for short answers in the exam). Almost all DBMS courses present only four normal forms (1st, 2nd, 3rd and Boyce-Codd) and never even mention that higher normal forms exist (4th, 5th and 6th)

  • Whiles away time with Entity Relationship (ER) diagrams, which are the student's life saver in exams (as there is no such thing as THE right answer, anything that makes sense will fetch you marks)

  • Learns basic SQL and works through a long list of CRUD (Create, Read, Update, Delete) commands. The most complex query one writes here is a query with a single inner query

  • Argues over whether SQL should be pronounced letter by letter (like an initialism) or as SeQueL (like an acronym)

  • Reads about the various types of joins and memorises each of them, with a single example, for writing in the exam

  • Optionally, develops an information management system (famous ones include Library Management System, Airline Reservation System, etc.) at the end of the course, where the actual system will consist of nothing but data entry. This project will, of course, go into his/her resume during the interviews

  • Towards the tail end of the course, reads something about how the DBMS does some basic operations internally. Most students skip this as it is covered in the last few weeks of the semester, when they are busy with other work

The problem

The main problem here is the gap between theory and practice. The peculiar thing about DBMS is that a design that looks perfect in theory can perform badly in practice. Take normalisation. In theory, and according to college syllabi, Boyce-Codd Normal Form is the best design a database can have. But in practice, a fully normalised schema can hurt a read-heavy database, because even simple reads need more joins, and joins are costly.
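To make the normalisation trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema (cities, customers, orders and a flattened orders_flat table) and the sample rows are made up purely for illustration. It only shows that the same read needs two joins on the normalised schema and none on the denormalised one; it does not measure anything.

```python
import sqlite3

# In-memory database; every table and row here is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised: each city name is stored once and referenced by id.
    CREATE TABLE cities    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT,
                            city_id INTEGER REFERENCES cities(id));
    CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customers(id),
                            amount INTEGER);

    -- Denormalised: customer and city names are repeated on every order row.
    CREATE TABLE orders_flat (id INTEGER PRIMARY KEY, customer TEXT,
                              city TEXT, amount INTEGER);

    INSERT INTO cities VALUES (1, 'Chennai'), (2, 'Mumbai');
    INSERT INTO customers VALUES (1, 'Asha', 1), (2, 'Ravi', 2);
    INSERT INTO orders VALUES (1, 1, 250), (2, 2, 400), (3, 1, 125);
    INSERT INTO orders_flat VALUES
        (1, 'Asha', 'Chennai', 250),
        (2, 'Ravi', 'Mumbai', 400),
        (3, 'Asha', 'Chennai', 125);
""")

# Reading an order with its customer and city takes two joins here.
normalised_read = conn.execute("""
    SELECT o.id, c.name, ci.name, o.amount
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    JOIN cities ci   ON ci.id = c.city_id
    ORDER BY o.id
""").fetchall()

# The same information comes from a single table, with no joins.
flat_read = conn.execute("""
    SELECT id, customer, city, amount FROM orders_flat ORDER BY id
""").fetchall()

for row in flat_read:
    print(row)
print("same result from both schemas:", normalised_read == flat_read)
```

The flat table pays for its simpler reads with duplicated data: renaming a city means touching every matching order row. That is exactly the kind of trade-off that never shows up when a table has 10 rows.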

Almost all colleges teach DBMS in theory and never consider the practical aspect. And yeah, developing an information management system over the period of the course does NOT count as practice. The information management systems which students develop as part of their DBMS course hardly involve more than 10 rows per table. And most students build them with the mindset of fooling the faculty into giving marks rather than actually learning DBMS, which is the real objective. DBMS is all about scale, and hence teaching it at toy scale misses the whole point.

What can be done?

Again, let me make a bulleted list of things that I think can be done to bridge the gap between theory and practice in DBMS courses. I am not an expert in DBMS and these are just my humble views on what I feel could be changed.

  • The first major change is to teach DBMS at the actual scale at which it is used. In the lab classes, don't let the students create their own tables and add 5 or 10 dummy records. Give them a preloaded table with ten lakh (one million) rows and make them query against that table. This will naturally push them to write better queries, as every badly written query will take a noticeable amount of time to spit out the output they want.

  • The above step also makes the student learn about indexes. Indexes are one thing that is really useful in DBMS (I am in the process of writing an article about database indexes and will post it once it's complete). Hardly any DBMS course really teaches the practical benefits of indexes. When a large table is given, students will have to define indexes and exploit them in their queries.

  • Instead of making students worry about JDBC, ODBC, etc., leave the frontend language to their choice and only lay down constraints on the backend DBMS. For example, don't force them to use Visual Basic, which makes a student focus more on learning ODBC and designing forms than on designing databases.

  • Make the joins practical. Don't just teach the various types of joins for the sake of short answer questions. Teach them in such a way that the student is really comfortable with how exactly the different types of joins work.

  • And for god's sake stop teaching stuff about File Management Systems. We are in an era where there is a group that says even SQL and relational DBMS are not needed (for more about this, google NoSQL).
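As a rough sketch of what the large-table lab exercise could look like, the snippet below uses Python's built-in sqlite3 module. The readings table, its columns and the index name are hypothetical, and the row count is scaled down from ten lakh so that it runs in a second or so. It asks SQLite for its query plan before and after creating an index on the filtered column; the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

ROW_COUNT = 100_000  # assumption: scaled down from ten lakh so the sketch runs quickly

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor_id INTEGER, value INTEGER)"
)
# Preload the table: 1000 distinct sensor ids spread evenly over all rows.
conn.executemany(
    "INSERT INTO readings (sensor_id, value) VALUES (?, ?)",
    ((i % 1000, i) for i in range(ROW_COUNT)),
)

QUERY = "SELECT COUNT(*), SUM(value) FROM readings WHERE sensor_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's plan for a query as one string (detail is the last column)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index, the only option is to scan the whole table.
plan_before = query_plan(conn, QUERY, (42,))

# With an index on sensor_id, SQLite can search instead of scanning.
conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor_id)")
plan_after = query_plan(conn, QUERY, (42,))

matching_rows, _ = conn.execute(QUERY, (42,)).fetchone()

print("before index:", plan_before)
print("after index: ", plan_after)
print("rows for sensor 42:", matching_rows)
```

With 10 dummy records the two plans make no visible difference, which is why students never notice indexes. On a preloaded table with ten lakh rows, the difference between a full scan and an index search is something they can actually feel.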

Closing thoughts

Database Management Systems are the arteries and veins of any corporate business. Hence, understanding the complexity and scale involved in them is essential. Making proper use of the DBMS course in college would be a real boost to a student's career. If the changes I suggested here are incorporated in a course, they will also save the time that a student spends preparing DBMS for interviews.

To conclude, "In theory there is no difference between theory and practice. In practice, there is." DBMS is more of an art, yet it is taught in colleges as a science. If this changes, the quality of students coming out of college every year will surely improve.