Unit 3: Data Base Design & Normalization
Database Design & Normalization
Database Design is the process of organizing data to reduce redundancy, avoid anomalies, and ensure data integrity.
Main Objectives
- Minimize data redundancy
- Eliminate update, insertion, and deletion anomalies
- Ensure data consistency
- Improve query performance
Phases of Database Design
- Conceptual Design (ER Model)
- Logical Design (Relational Schema)
- Physical Design (Storage, Indexing)
Functional Dependencies (FD)
A Functional Dependency (FD) represents a relationship between attributes in a relation.
An FD X → Y means:
If two tuples have the same value of X, they must have the same value of Y.
Example
STUDENT(RollNo, Name, Course, Marks)
- RollNo → Name
- RollNo → Course
Types of Functional Dependencies
| Type | Description |
|---|---|
| Trivial FD | Y ⊆ X |
| Non-Trivial FD | Y ⊄ X |
| Fully Functional Dependency | Depends on entire key |
| Partial Dependency | Depends on part of composite key |
| Transitive Dependency | X → Y, Y → Z |
Normalization
Normalization is the process of organizing data into normal forms to reduce redundancy.
Advantages
- Removes data duplication
- Avoids anomalies
- Improves data integrity
- Simplifies maintenance
First Normal Form (1NF)
Rule
- All attributes must have atomic (single) values
- No repeating groups or multivalued attributes
Example (Not in 1NF)
STUDENT(RollNo, Name, Subjects)
1NF Form
STUDENT(RollNo, Name, Subject)
Second Normal Form (2NF)
Rule
- Must be in 1NF
- No partial dependency
Partial Dependency
Occurs when a non-key attribute depends on part of a composite key.
Example
ENROLLMENT(StudentID, CourseID, StudentName, CourseName)
- StudentID → StudentName
- CourseID → CourseName
Solution
Split into:
- STUDENT(StudentID, StudentName)
- COURSE(CourseID, CourseName)
- ENROLLMENT(StudentID, CourseID)
Third Normal Form (3NF)
Rule
- Must be in 2NF
- No transitive dependency
Transitive Dependency
X → Y and Y → Z (X indirectly determines Z)
Example
EMP(EmpID, EmpName, DeptID, DeptName)
- EmpID → DeptID
- DeptID → DeptName
Solution
- EMP(EmpID, EmpName, DeptID)
- DEPT(DeptID, DeptName)
Boyce-Codd Normal Form (BCNF)
Rule
For every FD X → Y, X must be a super key.
Difference Between 3NF and BCNF
| Basis | 3NF | BCNF |
|---|---|---|
| Strictness | Less strict | More strict |
| Redundancy | Possible | Eliminated |
Example
TEACH(Student, Course, Teacher)
- (Student, Course) → Teacher
- Teacher → Course
Violates BCNF because Teacher is not a super key.
Inclusion Dependency
An Inclusion Dependency states that values in one relation must exist in another relation.
Example
STUDENT(DeptID) ⊆ DEPARTMENT(DeptID)
This is similar to a foreign key constraint.
Lossless Join Decomposition
A decomposition is lossless if the original relation can be reconstructed by joining the decomposed relations.
Condition for Lossless Join
For R → R1, R2:
-
(R1 ∩ R2) → R1 or
-
(R1 ∩ R2) → R2
Importance
- No data loss
- Preserves original information
Normalization Using FD, MVD, and JD
Functional Dependency (FD)
Used to achieve 1NF, 2NF, 3NF, BCNF.
Multivalued Dependency (MVD)
X →→ Y means Y depends on X independently of other attributes.
Example
STUDENT(StudentID, Hobby, Language)
StudentID →→ Hobby
StudentID →→ Language
Fourth Normal Form (4NF)
Rule
- Must be in BCNF
- No non-trivial MVD
Solution
Split into:
- STUDENT_HOBBY(StudentID, Hobby)
- STUDENT_LANGUAGE(StudentID, Language)
Join Dependency (JD)
A Join Dependency exists when a table can be reconstructed by joining multiple relations.
Fifth Normal Form (5NF)
Rule
- No non-trivial join dependency
- Decomposition cannot be further split without data loss
Alternative Approaches to Database Design
1. Top-Down Approach
- Start with high-level design (ER model)
- Gradually refine
2. Bottom-Up Approach
- Start with attributes
- Combine into relations
3. Mixed Approach
- Combination of top-down and bottom-up
- Most practical approach
Comparison of Approaches
| Approach | Description |
|---|---|
| Top-Down | Conceptual → Logical |
| Bottom-Up | Attributes → Relations |
| Mixed | Balanced and efficient |
Exam-Focused Summary (MCA)
- Functional Dependency is the foundation of normalization
- 1NF removes multivalued attributes
- 2NF removes partial dependency
- 3NF removes transitive dependency
- BCNF is stricter than 3NF
- MVD leads to 4NF
- JD leads to 5NF
- Lossless decomposition ensures data safety
- Mixed approach is best for database design