Unit 2: Data Modelling & Database Design

What is Data Modelling?

Data modelling is the process of visually designing how data will be stored, connected, and accessed in a database.

In simple words

Data modelling is like creating a blueprint before building a database.

Just like architects draw a plan before constructing a building, IT teams draw a data model before building a database.

What is the ER (Entity-Relationship) Model?

ER Model (Entity-Relationship Model) represents data and how different data items are related.

ER Model uses ER diagrams to show:

Entities (things)
Attributes (details about things)
Relationships (how things are connected)

Key Concepts

1. Entity

An entity is any real-world object or concept about which data is stored.

Entity	Real-life meaning
Customer	Person who buys products
Product	Item being sold
Employee	Person working in company
Order	Customer’s purchase record

Entity = Noun (Thing)

2. Attributes

Attributes are properties or details of an entity.

Entity	Attributes (Details)
Customer	Customer ID, Name, Phone, Email
Product	Product ID, Name, Price, Category
Employee	Employee ID, Name, Salary, Department

Attribute = Property/Characteristic

3. Types of Attributes

Type	Meaning	Example
Key Attribute	Uniquely identifies an entity	Student Roll Number, Employee ID
Composite Attribute	Can be broken into smaller parts	Name → First Name + Last Name
Multivalued Attribute	Has more than one value	Phone numbers, Email IDs
Derived Attribute	Can be calculated	Age (from Date of Birth), Total Price (qty × rate)

4. Relationships

A relationship shows how entities are connected.

Relationship Type	Meaning	Example
One-to-One (1:1)	One entity linked to one	CEO → One company
One-to-Many (1:M)	One entity linked to many	One customer → many orders
Many-to-Many (M:M)	Many entities linked to many	Students ↔ Courses

Relationship = Verb (links between entities)

5. Constraints

Constraints are rules applied to ensure data correctness.

Type	Example
Primary key	Every Customer ID must be unique
Foreign key	An order must belong to an existing customer
Not null	Phone number cannot be blank
Check	Age must be greater than 18

ER Diagram Example


Customer (CustomerID, Name, Phone)
Orders (OrderID, Date, Amount)
Relationship: Customer places Order (1-to-Many)

ER Diagram Representation:


   Customer 1 -----< places >----- M Order

(One customer can place many orders)

Extended ER Features (Advanced Concepts)

Generalization

Combining similar entities into a higher-level entity
“Bottom-up approach”

Example
Car + Bike + Truck → Vehicle (general entity)

Specialization

Breaking a general entity into sub-entities
“Top-down approach”

Example
Employee → Manager, Worker, Intern

Aggregation

Combining a relationship and entity into a higher-level entity
Used when relationship itself has attributes

Example
Project + Employee = Works-On → Department supervises this

Aggregation is helpful in complex systems like HRMS, banking, ERP.

Why is ER Modelling Important for Business?

Benefit	Business Impact
Clear data structure	Saves time, avoids confusion
Reduces errors	Ensures accuracy in reports
Improves communication	Business & IT understand each other
Better database performance	Smooth operations & transactions
Supports decision-making	Better analytics & dashboards

Summary (Quick Revision Sheet)

Concept	Meaning
Entity	Object about which data is stored
Attribute	Characteristics of entity
Relationship	How entities are linked
Key Attribute	Unique identifier
Composite Attribute	Can be divided
Multivalued Attribute	Multiple values
Derived Attribute	Calculated
Generalization	Merge similar entities
Specialization	Divide into sub-entities
Aggregation	Group entity + relationship

Final Tip

ER Model = Visual plan for a database
It helps companies avoid data errors, duplication, and confusion.

Relational Database Model

Think of a relational database as a set of neat spreadsheets (tables) that are linked together. Each table holds information about one “thing” (entity) — customers, orders, products — and the links between tables keep data accurate and avoid repetition.

1. Structure — what a relational model looks like

Table (Relation): like a spreadsheet with rows and columns.

Row (Tuple): one record / one real-world instance (e.g., one customer).
Column (Attribute): one piece of information about that record (e.g., name, phone).

Relational Schema: the design/blueprint showing all tables and how they relate.

Example: CUSTOMER(CustomerID, Name, Phone, Email)
Example: ORDER(OrderID, OrderDate, CustomerID, TotalAmount)

2. Keys — how tables are linked and how we ensure uniqueness

Primary Key (PK): a column (or set of columns) that uniquely identifies each row in a table.

Example: CustomerID uniquely identifies a customer.
Rules: not null, unique.

Foreign Key (FK): a column in one table that refers to the primary key of another table — this creates the relationship. Example: Order.CustomerID is an FK that points to Customer.CustomerID.
Candidate Key: a column (or combo) that could be a primary key (there may be several), but one is chosen as PK.
Composite Key: PK made of more than one column (e.g., StudentID + CourseID for an enrollment table).
Surrogate Key: an artificial key (like an auto-increment ID) used instead of a natural key.

3. Integrity Constraints — rules to keep data correct

Entity Integrity: primary key must not be null and must be unique.
Referential Integrity: foreign key must either be null (if allowed) or match an existing primary key in the referenced table. This prevents “orphan” records.
Domain Constraint: each column contains values from an allowed set or type (e.g., Age must be numeric between 0–120).
Unique Constraint: column(s) must have unique values (e.g., Email if required unique).
Check Constraints: custom rules (e.g., Salary >= 0).

Business impact: these constraints prevent data errors (like orders with no customer, duplicate IDs, wrong data types).

4. Normalization — cleaning tables to reduce redundancy and anomalies

Normalization breaks a messy table into well-structured tables to avoid:

Insertion anomaly (can’t add certain data without others),
Update anomaly (need to update same data in many places),
Deletion anomaly (deleting one thing removes needed info).

We’ll show a simple running example using a single messy table and transform it.

Initial (unnormalized) table — `SALES_RAW`

SaleID	CustomerName	CustomerPhone	ProductID	ProductName	Qty	UnitPrice	OrderDate
1	Rahul	98765	P100	Pen	10	5	2025-10-01
2	Rahul	98765	P200	Notebook	2	50	2025-10-01
3	Priya	91234	P100	Pen	20	5	2025-10-02

Problems: customer details repeated, product details repeated, price repeated — lots of redundancy.

1NF — First Normal Form (remove repeating groups, atomic values)

Rule: Each column must contain atomic (single) values; table must have a primary key.

Transform by ensuring one product per row (our raw table above is already row-per-product). Ensure columns are atomic (no lists). Choose PK: SaleID + ProductID (or create SaleLineID).

Result: still redundant (customer and product info repeat), but now in 1NF.

2NF — Second Normal Form (remove partial dependency)

Rule: Table must be in 1NF and non-key attributes must depend on the whole primary key (not part of it). Applies when PK is composite.

If our PK is SaleID + ProductID, attributes like CustomerName depend only on SaleID (part of key) → move them out.

Split into:

SALE(SaleID, OrderDate, CustomerID)
SALE_LINE(SaleLineID or SaleID+ProductID, SaleID, ProductID, Qty, UnitPrice)

And create:

CUSTOMER(CustomerID, Name, Phone)
PRODUCT(ProductID, ProductName)

Now attributes are stored where they fully depend on the table’s PK.

3NF — Third Normal Form (remove transitive dependency)

Rule: In 2NF, and no non-key attribute depends on another non-key attribute.

Example problem: if PRODUCT table stored CategoryName AND CategoryDescription in the same table, and CategoryDescription depends on CategoryName (non-key depends on non-key), we should move categories to a separate table.

Resulting tables:

CUSTOMER(CustomerID, Name, Phone)
PRODUCT(ProductID, ProductName, CategoryID)
CATEGORY(CategoryID, CategoryName, CategoryDesc)
SALE(SaleID, OrderDate, CustomerID)
SALE_LINE(SaleLineID, SaleID, ProductID, Qty, UnitPrice)

Now each non-key attribute depends only on the key.

BCNF — Boyce-Codd Normal Form (stronger than 3NF)

Rule: For every functional dependency X → Y, X must be a superkey. It fixes edge cases where 3NF still allows anomalies.

Example: Suppose CourseInstructor table had Course → Instructor and also Instructor → Course (in special contexts). If a non-key determines another column, BCNF enforces further splitting.

In practice: BCNF is applied when you find unusual dependencies; many well-designed schemas already satisfy BCNF.

Relational Schema & Table Design

Start with ER model: identify entities, relationships, attributes.
Choose clear primary keys: stable, unique, preferably surrogate keys (CustomerID auto-number).
Use foreign keys for links: name them clearly (CustomerID in SALE points to CUSTOMER).
Avoid duplicate data: normalize to at least 3NF unless performance reasons require denormalization.
Use meaningful column names: order_date not od.
Define data types & sizes: VARCHAR(100) for name, DECIMAL(10,2) for money.
Add constraints: PK, FK, NOT NULL, UNIQUE, CHECK — these protect data.
Index important columns: for faster searches (e.g., index OrderDate), but don’t over-index.
Document the schema: a simple data dictionary describing each table/column.
Plan for growth and backups: partition large tables, think about archiving old data.

Trade-off: normalization reduces redundancy but may require more joins for queries — balance for performance.

Role & Responsibilities of the Database Administrator (DBA)

The DBA is the person (or team) who keeps the database healthy, secure, and available. Think of the DBA as the “facilities manager” for company data.

Key responsibilities:

Installation & Configuration: install DBMS software, configure for the organization.
Schema Design Support: work with developers/analysts to design efficient schemas and ensure best practices.
Security & Access Control: create users/roles, grant privileges, enforce least privilege, audit logins.
Backup & Recovery: schedule backups, test restores, create disaster recovery plans.
Performance Tuning: monitor queries, add indexes, tune parameters, manage resources.
Capacity Planning: plan storage, CPU, memory needs; forecast growth.
High Availability & Replication: set up replication, clustering, failover solutions.
Maintenance Tasks: patching, upgrades, reorganizing tables/indexes, statistics updates.
Data Integrity & Quality: enforce constraints, help with ETL and data cleansing processes.
Monitoring & Alerts: set up alerts for failures, slow queries, resource saturation.
Support & Troubleshooting: help developers, resolve production incidents.
Documentation & Policies: maintain runbooks, access policies, backup procedures.
Compliance & Audits: support legal/finance audits; ensure regulatory compliance (e.g., data retention).

Business impact: good DBAs reduce downtime, prevent data loss, improve performance, and protect sensitive data — directly affecting company reliability and trust.

Quick Practical Examples (mini checklists)

When designing a `Customer` table

Columns: CustomerID (PK), Name, Phone, Email (UNIQUE?), AddressID (FK)
Constraints: CustomerID PK, Email UNIQUE (if required), Phone CHECK or format rule.

When creating an `Order` table:

Columns: OrderID (PK), OrderDate, CustomerID (FK), TotalAmount
Design: Don’t store customer name here — link to CUSTOMER. Use SALE_LINE for products per order.

Summary

A relational DB = tables + keys + relationships.
Keys (PK/FK) enforce identity and links.
Integrity constraints keep data valid (no orphans, correct types).
Normalization (1NF → 2NF → 3NF → BCNF) removes redundancy and anomalies — aim for 3NF for business apps unless there’s a performance justification.
Good table design balances normalization, query speed, and simplicity.
DBA ensures the database runs, is secure, and recovers from failures.

Unit 2: Data Modelling & Database Design

What is Data Modelling?

In simple words

What is the ER (Entity-Relationship) Model?

Key Concepts

1. Entity

2. Attributes

3. Types of Attributes

4. Relationships

5. Constraints

ER Diagram Example

Extended ER Features (Advanced Concepts)

Generalization

Specialization

Aggregation

Why is ER Modelling Important for Business?

Summary (Quick Revision Sheet)

Final Tip

Relational Database Model

1. Structure — what a relational model looks like

2. Keys — how tables are linked and how we ensure uniqueness

3. Integrity Constraints — rules to keep data correct

4. Normalization — cleaning tables to reduce redundancy and anomalies

Initial (unnormalized) table — `SALES_RAW`

1NF — First Normal Form (remove repeating groups, atomic values)

2NF — Second Normal Form (remove partial dependency)

3NF — Third Normal Form (remove transitive dependency)

BCNF — Boyce-Codd Normal Form (stronger than 3NF)

Relational Schema & Table Design

Role & Responsibilities of the Database Administrator (DBA)

Key responsibilities:

Quick Practical Examples (mini checklists)

When designing a `Customer` table

When creating an `Order` table:

Summary

Unit 1 Fundamental of Management & Planning

MBA AKTU 1 SEM NOTES LINKS (PDF & VIDEO)

(KMBN302) Unit 1: Innovation

Unit 1: Basic Concepts & Principles of Managerial Economics

Categories

Unit 2: Data Modelling & Database Design

What is Data Modelling?

In simple words

What is the ER (Entity-Relationship) Model?

Key Concepts

1. Entity

2. Attributes

3. Types of Attributes

4. Relationships

5. Constraints

ER Diagram Example

Extended ER Features (Advanced Concepts)

Generalization

Specialization

Aggregation

Why is ER Modelling Important for Business?

Summary (Quick Revision Sheet)

Final Tip

Relational Database Model

1. Structure — what a relational model looks like

2. Keys — how tables are linked and how we ensure uniqueness

3. Integrity Constraints — rules to keep data correct

4. Normalization — cleaning tables to reduce redundancy and anomalies

Initial (unnormalized) table — SALES_RAW

1NF — First Normal Form (remove repeating groups, atomic values)

2NF — Second Normal Form (remove partial dependency)

3NF — Third Normal Form (remove transitive dependency)

BCNF — Boyce-Codd Normal Form (stronger than 3NF)

Relational Schema & Table Design

Role & Responsibilities of the Database Administrator (DBA)

Key responsibilities:

Quick Practical Examples (mini checklists)

When designing a Customer table

When creating an Order table:

Summary

You might like

Initial (unnormalized) table — `SALES_RAW`

When designing a `Customer` table

When creating an `Order` table: