Module information

CS3223 is a system-oriented module provides an in-depth study of the concepts and implementation issues related to database management systems. It first covers the physical implementation of the relational data model, which includes storage management, access methods, query processing, and optimization. Then it covers issues and techniques dealing with multi-user application environments, namely, transactions, concurrency control, and recovery. The third part covers advanced topics such as on-line analytical processing, in-memory databases, and column stores.

Schedule

  • Venue: i3 Auditorium
  • Lecture time: Friday, 1000-1200
  • Tutorial time: Wednesday, 1600-1700

Module breakdown

  1. Tutorial Participation 5%
  2. Assignments 30
  3. Closed-book Midterms 25%
    1. Until Evaluating Query Evaluation
  4. Closed-book Finals 40%
    1. Everything else

Prerequisites

  • CS2102
  • CS2040

Module details

The module involves:

  1. Data Storage
    1. Data Storage & Disk Drives
    2. Buffer Manager & Files
  2. Tree-Structured Indexing
    1. Database indexes and B+ Trees
  3. Hash-based Indexing
  4. External Sorting
  5. Query Evaluation
  6. Evaluating Relational Operators
  7. Query Optimizer
  8. Transaction Management
  9. Concurrency Control
  10. Crash Recovery

Lectures

Lectures are held every Friday from 10am to 12pm.

Prof Chan is very knowledgeable and is able to explain the concepts very well. He also runs through the algorithms in class using a step by step fashion for our better understanding.

However, due to the amount of content that is covered in the lecture, it is difficult to understand the concepts fully during the lecture itself.

Please do find time to review the notes before / after the class itself.

There is no webcast for this class during my semester.

Tutorials

Tutorials start from week 3.

  • Every week, a subset of tutorial questions will be discussed in class.
  • Each student is pre-assigned to prepare the answer for a question
  • For each discussion question one of the assigned students will be randomly picked to lead the discussion
  • Other students are expected to participate as well

Before the tutorials, we are each assigned 1 question to complete and we will be asked to present our answers in class.

Assignments

The assignments are slightly different this semester. Instead of using SimpleDB, we use PostgreSQL instead.

The assignments are completed in groups of 2 and are required to modify PostgreSQL’s code (Written in C)

Assignment 1

There are 2 parts to assignment 1.

Individual component

For the individual component, we are required to follow a set of simple instructions given by the lecturer to install and run a benchmark on postgresql. The benchmark is a set of queries that are run on the database. The queries are run multiple times and the time taken to run the queries are recorded. The time taken to run the queries are then compared to the time taken to run the queries on a database without the changes made in the assignment.

We will then have to submit the results onto Canvas.

Team component

We have to modify Postgresql’s Clock replacement algorithm and implement a LRU algorithm instead.

A large amount of source code from Postgresql have to be read and understood before any development can begin.

Ratings

Workload 8/10 (High Workload)

Between juggling the tutorial, assignments and the lecture, the workload is quite high.

Some time is required to understand the content of the lecture before the tutorial can be completed. Completing it for the first half of the semester, I spend most of my time on my FYP and this module.

The assignment involves a lot of code reading and understanding of the postgres code base. A few hours is required to understand fully what the source code is doing before any implementation can begin.

The source code is well annotated but it is the sheer amount of code that makes it difficult to understand.

Organization 7/10

The module is relatively well structured. We can see how each lecture builds up to form the data base itself.

Learning 8/10

There are a lot to be learnt from this module regarding database implementation.

Enjoyment 3/10

Personally I did not enjoy the course very much as it is very math heavy and I am not really a math person.

The concepts were confusing to me and a large amount of time is required to dig through the slides and understand the examples which were given.

Usefulness 4/10

I feel that the mod is only useful towards the algorithm side of things. The data structures and algorithms (B-Trees, Index, Sorting) taught in this module might be useful when faced with similar problems in the future.

Overall 5/10

Overall I feel that this mod is ok. Due to the workload for the semester I dropped this mod. It might get better during the 2nd half of the semester.

Actual grade: W (As I dropped it after the first half of the Semester)

Resources

  1. Postgresql LRU
  2. Ramakrishnan Database Management Systems 3rd Edition PDF
  3. Database System Concepts Sixth Edition PDF
  4. Database Systems The Complete Book