Profile Picture
Written ByMd Tanveer

Prior knowledge for database planning

Making a database efficient is extremely important since it determines how fast you could access your data or modify them. Can't forget about space efficiency by eliminating redundancy. If you build an application which stores data in a database then you need to make sure you can tune it in future with minimum effort. This article is not about planning a database rather things you have to consider before starting planning a database.

Coding
176
Aug 4, 2019 @ 7:10 PM

Determine the data for now and future

First thing first. We need to identify what data will be stored in the database. The best way to think of that is thinking about what type of data will be derived from the database. Means when you will create your application in what form you will need your data. Often this can be determined by what kind of reports, forms, statistics and other information must be derived from the data. For instance, if you want to save information of your customers then you need to think beforehand what information will you need to have. You have to be proactive to think about future data requirement as well. Of course you need to keep your database as simple as possible but also have to make sure you will not have to change your database in future as your application progresses . Once you have many tables linked to one another (Relational database) it will become intimidating to change one table (like adding a field/column).

Entity type

Next establish what entity types are there (Customers, Employees, Orders etc). Each entity type will be represented by a separate table. Then determine how the tables are linked. This is crucial and often a very complex task called Normalization.


Here you can see customers and Rental table for a video rental business. They are linked by their primary key.

Breaking down the data

The data should be broken down into individual fields (columns) within each table. Break the data down into smallest significant (Atomic) units.

Example: Instead of putting full name in one column put first_name and last_name in two different columns. Same for address. Split road_number, city, province and postal_code into different columns.

To decide what constitutes a significant unit, think of what criteria you will use to search, sort and filter the data, and what values you need to isolate for reports, form letters etc.

Moving/Changing data overtime

There will be some data which will change over time. It’s called “Moving Target” Like age. It's not a good idea to store someone's age. Instead you should store date of birth so you can calculate the age of that specific person at any given time.

Category of data

When you store data make sure you categorize them properly. For example, "doctor" is a profession, while "Sergen" is a speciality. "Dog" is a type of animal but "Poodle" is a breed. You should not mix and match them as it will put you in trouble to maintain as your database grows.

Data Redundancy

Data redundancy means repeating the same data over and over again. It affects your database efficiency and consumes a lot of unnecessary space. Imagine you have a customer Moe who rented 2 Movies from your video rental store. If you do not use relational data base to break down the entity into tables and link them using primary key then it will look like something this.

On the other had if you have a table for "orders" and there you have quantity and price then you do not need to have a separate column for total quantity or price as it can be calculated by a simple query on the fly.


You are repeating customer Moe's information twice just because he rented 2 different movies. Here comes the relational database into rescue to create different tables for customers and Rental and link them when necessary. See the above picture under Entity.

Properties of Columns

Once the structure has been built now the next part is decide the properties (Column name, data type, size) for your database table.

Column Name: Make sure your column name is not conflicting with the SQL keywords. You need to use column name in your SQL query, so if you use any of SQL keywords it will prone to error.

Data Type: Use the correct data type to accommodate your data within minimum space. For example: Use tinyint to store a boolean value instead of int.).

Size: Specify the size of your data type. If you want to store postal code you should not need more than 7 characters of space in your data base for one postal code. In Canada a postal code looks like something this: "XXX XXX".

Naming convention

Another important thing is being consistent about naming your column. Convention is something known as best practice. First use meaningful name and avoid using abbreviation (For example: first_name for customer instead of fn) and use same method through out your database to separate words within your name. Like: first_name or firstName(camelCase). Your application will work fine if you use different method but it will be hard for you to maintain your application as it will grow bigger.

Every database language has specific rules for naming tables, columns (Spaces or periods may not be allowed in names in some languages). To be on the safe side here are some rules other programmer tend to use and also for you to be consistent:

Start names with a letter, followed by letters and numbers. (email_1 or email1).
Avoid spaces and punctuation marks (Not first name or first,name instead first_name or firstName).
Begin table names with a capital letter followed by small letters (Customer, FirstName).

Validation/Data integrity of data

Some column should not left blank. Like First Name of a customer or Postal code in an address. When you fill out a form on any site you get error when you leave a mandatory field blank. This is because the application needs to have that specific data to store in the database.

Lookup tables

Lookup tables is meant to hold static value like name of provinces, departments etc. Not only does a lookup strategy avoid duplication and save space in your main data tables by using the short codes, it also guarantees consistency and allows you to restrict data entry to only those values on the list.


In the above image the table on the left side is a lookup table. You can see in the customer's table we are just using the short code from lookup table to represent customer's province. If you need the full name of the province you can always join the table by join query and have them displayed.

Primary key

For each table, it should be determined if a primary key is required (almost always). The primary key is important not only for identification purpose, but also as the basis for indexing to make searches faster and to link separate tables together on common keys.

It is mandatory that each row contain a unique value for the primary key field, and it can never be left null (blank).

The primary key should be simple and does not change over time (Social insurance number, Student Number) Not a phone number or email address though.The primary key can be combination of fields as long as the result is unique within the table (like branch code plus an account number for a banking database). This is known as concatenated or compound key.

You can also check