Unit 3: Data Base Design & Normalization



Database Design & Normalization

Database Design is the process of organizing data to reduce redundancy, avoid anomalies, and ensure data integrity.

Main Objectives

  • Minimize data redundancy
  • Eliminate update, insertion, and deletion anomalies
  • Ensure data consistency
  • Improve query performance

Phases of Database Design

  1. Conceptual Design (ER Model)
  2. Logical Design (Relational Schema)
  3. Physical Design (Storage, Indexing)

Functional Dependencies (FD)

A Functional Dependency (FD) represents a relationship between attributes in a relation.

An FD X → Y means:

If two tuples have the same value of X, they must have the same value of Y.

Example

STUDENT(RollNo, Name, Course, Marks)

  • RollNo → Name
  • RollNo → Course

Types of Functional Dependencies

TypeDescription
Trivial FDY ⊆ X
Non-Trivial FDY ⊄ X
Fully Functional DependencyDepends on entire key
Partial DependencyDepends on part of composite key
Transitive DependencyX → Y, Y → Z

Normalization

Normalization is the process of organizing data into normal forms to reduce redundancy.

Advantages

  • Removes data duplication
  • Avoids anomalies
  • Improves data integrity
  • Simplifies maintenance

First Normal Form (1NF)

Rule

  • All attributes must have atomic (single) values
  • No repeating groups or multivalued attributes

Example (Not in 1NF)

STUDENT(RollNo, Name, Subjects)

1NF Form

STUDENT(RollNo, Name, Subject)

Second Normal Form (2NF)

Rule

  • Must be in 1NF
  • No partial dependency

Partial Dependency

Occurs when a non-key attribute depends on part of a composite key.

Example

ENROLLMENT(StudentID, CourseID, StudentName, CourseName)

  • StudentID → StudentName
  • CourseID → CourseName

Solution

Split into:

  • STUDENT(StudentID, StudentName)
  • COURSE(CourseID, CourseName)
  • ENROLLMENT(StudentID, CourseID)

Third Normal Form (3NF)

Rule

  • Must be in 2NF
  • No transitive dependency

Transitive Dependency

X → Y and Y → Z (X indirectly determines Z)

Example

EMP(EmpID, EmpName, DeptID, DeptName)

  • EmpID → DeptID
  • DeptID → DeptName

Solution

  • EMP(EmpID, EmpName, DeptID)
  • DEPT(DeptID, DeptName)

Boyce-Codd Normal Form (BCNF)

Rule

For every FD X → Y, X must be a super key.

Difference Between 3NF and BCNF

Basis3NFBCNF
StrictnessLess strictMore strict
RedundancyPossibleEliminated

Example

TEACH(Student, Course, Teacher)

  • (Student, Course) → Teacher
  • Teacher → Course

Violates BCNF because Teacher is not a super key.

Inclusion Dependency

An Inclusion Dependency states that values in one relation must exist in another relation.

Example

STUDENT(DeptID) ⊆ DEPARTMENT(DeptID)

This is similar to a foreign key constraint.

Lossless Join Decomposition

A decomposition is lossless if the original relation can be reconstructed by joining the decomposed relations.

Condition for Lossless Join

For R → R1, R2:

  • (R1 ∩ R2) → R1 or

  • (R1 ∩ R2) → R2

Importance

  • No data loss
  • Preserves original information

Normalization Using FD, MVD, and JD

Functional Dependency (FD)

Used to achieve 1NF, 2NF, 3NF, BCNF.

Multivalued Dependency (MVD)

X →→ Y means Y depends on X independently of other attributes.

Example

STUDENT(StudentID, Hobby, Language)

StudentID →→ Hobby
StudentID →→ Language

Fourth Normal Form (4NF)

Rule

  • Must be in BCNF
  • No non-trivial MVD

Solution

Split into:

  • STUDENT_HOBBY(StudentID, Hobby)
  • STUDENT_LANGUAGE(StudentID, Language)

Join Dependency (JD)

A Join Dependency exists when a table can be reconstructed by joining multiple relations.

Fifth Normal Form (5NF)

Rule

  • No non-trivial join dependency
  • Decomposition cannot be further split without data loss

Alternative Approaches to Database Design

1. Top-Down Approach

  • Start with high-level design (ER model)
  • Gradually refine

2. Bottom-Up Approach

  • Start with attributes
  • Combine into relations

3. Mixed Approach

  • Combination of top-down and bottom-up
  • Most practical approach

Comparison of Approaches

ApproachDescription
Top-DownConceptual → Logical
Bottom-UpAttributes → Relations
MixedBalanced and efficient

Exam-Focused Summary (MCA)

  • Functional Dependency is the foundation of normalization
  • 1NF removes multivalued attributes
  • 2NF removes partial dependency
  • 3NF removes transitive dependency
  • BCNF is stricter than 3NF
  • MVD leads to 4NF
  • JD leads to 5NF
  • Lossless decomposition ensures data safety
  • Mixed approach is best for database design