The Ultimate Guide to Normalization in DBMS: From 1NF to BCNF

Imagine stacking up books in a library at random instead of their genre. It may be convenient initially, but you will struggle to find a specific title later. You will then go through the shelves to find that you have stacked the same title twice or forgot an important title. This is what happens to databases without normalization. Now, most Database Management Systems (DBMS) use Structured Query Language (SQL) to store and manage data simultaneously. The role of normalization in DBMS is crucial to prevent data anomalies and improve query performance.
Every aspiring developer must know how to leverage normalization in DBMS if they aim to succeed in the field. So, let’s dive deep into the concept, explore its various forms, and why it is critical for creating scalable and reliable databases.
What is Normalization?

It is important to understand the definition of normalization in DBMS in detail before we learn its various forms. SQL is, without reservation, the backbone of many database systems, making it one of the most popular programming languages, with 51% of developers (1) using it worldwide.
However, these systems need a proper design to prevent databases from becoming inefficient and riddled with errors. Cue normalization, a database design technique that organizes data into tables and columns to minimize redundancy and ensure data integrity.
The concept involves breaking down large, complex tables into smaller, more manageable ones while maintaining relationships between them. Here are a few objectives of normalization in DBMS:
1. Eliminate Redundancy: Normalization reduces storage space and the risk of inconsistencies because it avoids duplicating data.
2. Improve Data Integrity: It helps prevent anomalies such as insertion, deletion, and update anomalies, ensuring that data is accurate and consistent.
3. Enhance Query Performance: Normalized databases are generally efficient for querying and updating data.
4. Facilitate Database Maintenance: A normalized database is easier to maintain and modify over time than one with no normalization.
Normalization chiefly involves a series of steps, each known as a “normal form”. The most common normal forms are as follows:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
ALSO READ: What is Programming and How to Code Your Future Career With It
1NF: First Normal Form
First Normal Form (1NF) is certainly the most basic level of normalization in DBMS. What’s more, it stipulates that every attribute in a table must be atomic. In other words, each column in a table should contain a single value, not a group of values. A table is in 1NF if it meets the following criteria:
1. Key Characteristics
A. Atomic Values
Every field must hold only one piece of data. For example, you cannot have multiple phone numbers in one field. Each phone number must be in its row.
B. Unique Rows
Another key aspect is that each row should be distinct; no two rows should have the same data. The table must also have a primary key to ensure this uniqueness.
C. No Repeating Groups
There cannot be repeated datasets within a table. A customer’s multiple orders should appear in a separate row instead of a single field.
2. Violations of 1NF
Sr. No. | Student Name | Subjects |
1 | Anjali | Maths, Science |
2 | Rehman | English, History, Maths |
Consider the table above. The subject column holds multiple values (Maths, Science for Anjali and English, History, Maths for Rahul). This violates the rule of atomicity because each field should contain a single value. As a result, you have to break down the multi-valued field into separate rows.
Sr. No. | Student Name | Subject |
1 | Anjali | Maths |
1 | Anjali | Science |
2 | Rehman | English |
2 | Rehman | History |
The aforementioned table is in the 1NF, with each cell containing a single value besides no repeating groups.
ALSO WATCH: Importance of Analytics in Digital Business by Pawan Kumar, Head Data Science & Analytics
2NF: Second Normal Form

The second normal form states that every non-key attribute in a relation should significantly depend on the primary key. In short, no part of the primary key should be able to determine a non-key attribute without the other parts. Moreover, a table in 2NF must fulfill all the requirements of 1NF.
1. Characteristics of 2NF
A. Full Dependency
All non-key attributes must depend on the overall primary key, not just a portion of it.
B. No Partial Dependency
It is equally important to have no partial dependencies between non-key attributes and parts of the primary key.
2. Violations of 2NF
Consider a table storing order details to illustrate 2NF:
Order ID | Product ID | Customer ID | Product Name | Customer Name |
1001 | 201 | 100 | Laptop | Ahana |
1002 | 202 | 100 | Smartphone | Ahana |
In the table above, the product name and customer name unquestionably depend on the customer ID portion of the primary key (Order ID, Customer ID). Hence, this is a partial dependency and results in a violation of 2NF. We will, thus, need to create separate tables to normalize the above table. You will create a table for orders (order ID, customer ID), products (product ID, name), and customers (customer ID, customer name). You can then bring data integrity with normalization in DBMS.
ALSO READ: What are the Unique Benefits of Keys in DBMS?
3NF: Third Normal Form
Every table in the Third Normal Form (3NF) must first be in the 2NF. Furthermore, it must eliminate transitive dependencies. This means that non-key attributes should specifically depend on the primary key and not other non-key attributes. There should also be no indirect relationships where one non-key attribute depends on another non-key attribute.
Violations of 3NF
Let’s take an example of a table that is in 2NF but violates 3NF:
Student ID | Student Name | Course ID | Course Name | Instructor ID | Instructor Name |
1 | Alizeh | 101 | Maths | 501 | Dr More |
2 | Benny | 102 | English | 502 | Dr Das |
The primary key could be (Student ID and Course ID) since a student can take multiple courses, and each combination is unique. There is an explicitly transitive dependency between the non-key columns— Instructor Name and Instructor ID.
This is because the instructor’s name depends on the instructor ID (a non-key column), not on the primary key. In essence, the instructor’s name indirectly depends on the primary key through the instructor ID.
You need to split the table into two separate ones to remove the transitive dependency. As a result, you will have a course table with information related to the course and instructor. The instructor’s table will then store information about instructors.
ALSO READ: SQL vs Python: Which is the Better Programming Language for You
BCNF: Boyce-Codd Normal Form
Boyce-Codd Normal Form is a stricter version of the Third Normal Form (3NF). It states that every non-trivial dependency in a relation must undoubtedly be a superkey. A superkey is any set of attributes that can definitively identify a tuple (row) in a relation. Lastly, there should be no non-trivial dependencies that do not involve a superkey.
Violations of BCNF
Here’s a table that particularly deals with course offerings:
Course ID | Department ID | Department Name |
Department Name is undeniably dependent transitively on Course ID through the Department ID. This, therefore, violates BCNF as the Department Name is not fully dependent on the Course ID, which is not a superkey. So, you have to create separate tables to normalize the above example into BCNF.
- Course Table: CourseID, Department ID…
- Department Table: Department ID, Department Name…
Level up Your Skills With Emeritus
In conclusion, the importance of normalization in DBMS cannot be stressed enough. It is evidently indispensable to the process of designing reliable databases. Consequently, you must have a firm grasp of database normalization to become proficient in SQL and database management.
There is a range of online IT courses at Emeritus if you want to deepen your knowledge of SQL, database design, and other IT skills. These courses will not only enhance your expertise but also offer practical examples relevant to the industry. The curriculum covers topics such as SQL, database optimization, and full-stack development. Join one of these courses at Emeritus to unlock your potential and emphatically take your career to soaring heights.
Write to us at content@emeritus.org
Source: