Unit 4: Database Implementation & Management

Database Implementation & Management

When we create a database, it’s not just about tables and queries —
there’s a lot going on behind the scenes to store data efficiently, process queries faster, and reduce cost/time.

Let’s understand each topic one by one

Database Storage and Physical Structures

A Database Storage Structure refers to how data is actually stored inside a database system (on hard disk or memory). Databases organize data in a way that makes it easy to access and update quickly.

Levels of Data Storage

Level	Description	Example
Logical Level	How users see data (tables, columns)	`SELECT * FROM Customers;`
Physical Level	How data is stored on disk	Data files, indexes, pages, blocks
Buffer / Cache	Temporary memory area for fast access	RAM used for query results

Indexing

Imagine a book index — instead of reading every page, you check the index to find a topic quickly.
The same concept applies in databases.

Indexing is a technique to speed up data retrieval from large tables.

Example:

Without Index:
To find “EmpID = 105”, the system checks every record one by one.
With Index:
It directly jumps to the location where EmpID = 105 is stored.

Types of Indexes:

Type	Description
Primary Index	Automatically created on primary key column
Secondary Index	Created on non-key columns for faster search
Clustered Index	Data rows are physically arranged according to the index
Non-clustered Index	Index created separately from the actual data table

SQL Example:


CREATE INDEX idx_name ON Employees(Name);

B-Trees (Balanced Trees)

A B-tree (Balanced Tree) is a data structure used to organize and access data efficiently in indexing.

Think of a B-tree as a hierarchical structure like a family tree:

Each node can store multiple keys.
Root node → main entry point.
Leaf nodes → actual data.

Why B-trees are useful:

Keep data sorted
Allow fast searching, insertion, and deletion
Reduce the number of disk reads

Example:

When you search for EmpID = 50:

The B-tree checks top (root) nodes first
Then moves down the branches until it finds the right leaf node.

Used in: Indexes in databases (MySQL, Oracle, etc.)

Hashing

Hashing is another way to locate data quickly. It uses a mathematical function (hash function) to convert a key value into an address.

Example:

If EmployeeID = 105,
A hash function might map it to address 10, so the data for that employee is stored at that location.

Types of Hashing:

Type	Description
Static Hashing	Hash table size is fixed
Dynamic Hashing	Hash table grows/shrinks as data changes

Hashing is best for exact searches, like finding a record by ID.

Query Processing and Optimization

When you run a SQL command like:


SELECT * FROM Employees WHERE Salary > 50000;

The DBMS does several steps internally to process and optimize the query for faster results.

Steps in Query Processing

Step	Description
1. Parsing	SQL query is checked for syntax errors
2. Translation	SQL is converted into a low-level query (relational algebra)
3. Optimization	DBMS chooses the best way (fastest path) to execute the query
4. Execution	Query runs and retrieves data from disk or memory

Query Optimization

Query Optimization means improving the performance of SQL queries —
so that results come faster and with minimum resource use (CPU, memory, I/O).

Techniques Used:

Technique	Description
Using Indexes	Access fewer rows by using indexed columns
Join Optimization	Choose the most efficient join method (nested loop, merge join, hash join)
Predicate Pushdown	Filter data early (apply WHERE condition first)
Query Rewriting	Rewrite query in a faster equivalent form
Caching	Store frequently used results in memory

Query Cost Analysis

When the database runs a query, it consumes resources —
like CPU time, memory, and disk I/O (input/output operations).

Query Cost Analysis measures these to find the most efficient execution plan.

Factors Affecting Query Cost:

Factor	Description
Disk Access Time	Time to read/write data from storage
CPU Usage	Time for processing and sorting data
Network Cost	For distributed databases
Memory Usage	Amount of RAM used for temporary data

Goal: Choose the query plan with the lowest cost (fastest execution).

Summary Table

Topic	Meaning	Example / Use
Database Storage	How data is stored physically	Pages, Blocks, Files
Indexing	Data lookup shortcut	Index on Employee Name
B-Trees	Hierarchical data structure for indexes	Used in MySQL indexing
Hashing	Uses hash function to locate data	Hash(EmployeeID) = memory address
Query Processing	Steps DBMS takes to run a query	Parsing → Optimization → Execution
Query Optimization	Improve speed & reduce cost	Use indexes, rewrite query
Query Cost Analysis	Measure of efficiency	Disk + CPU + Memory cost

In Simple Words:

Database Implementation = How data is stored.
Query Optimization = How queries run faster.
Goal = Fast access + Less cost + Accurate results.

Database Security, Recovery & Transaction Management

Backup, Recovery, and Disaster Management

Backup

A backup is a copy of database data that can be used to restore information if something goes wrong.

Think of backup like saving a copy of your project file on Google Drive — if your laptop crashes, you can still recover your work.

Types of Backups

Type	Description	Example
Full Backup	Complete copy of the entire database	Weekly full database copy
Incremental Backup	Backs up only the changes made after the last backup	Daily small backups
Differential Backup	Backs up changes made after the last full backup	Every 2–3 days backup of all recent changes

SQL Example:


BACKUP DATABASE myDB TO DISK = 'D:\Backup\myDB.bak';

Recovery

Recovery is the process of restoring the database after a failure using backup files and logs.

There are two main types of failures:

System Crash – Power failure, hardware crash
Transaction Failure – Error during transaction (e.g., division by zero, data conflict)

Recovery Techniques

Technique	Description
Roll Forward	Restore database using backup + log files to bring it to latest state
Roll Back	Undo incomplete transactions using logs (like CTRL + Z for database)

Disaster Management

When something serious happens (like a fire, server crash, or cyber attack), disaster management ensures the database can be quickly restored.

Steps in Disaster Management:

Regular Backups – Store copies in multiple locations (onsite + cloud)
Redundancy – Use duplicate servers (mirroring/replication)
Disaster Recovery Plan (DRP) – A formal plan for what to do after failure
Testing – Regularly test recovery methods to ensure they work

Example: Banks keep real-time mirror servers in different cities to avoid data loss.

Concurrency Control

In a multi-user system, many users access the database at the same time.

Concurrency Control ensures that:

Multiple users can work together without conflicting
The accuracy and consistency of data are maintained

Example:

Two employees withdraw money from the same bank account at the same time.
Without control → Wrong balance
With control → Transactions handled one after another safely.

Locking

Locking prevents multiple users from changing the same data at once.

Type	Description	Example
Shared Lock	Data can be read by many users but not modified	Viewing balance
Exclusive Lock	Data can be modified by only one user	Updating salary

When a user locks a row, others must wait until it’s released.

Timestamping

Each transaction is given a unique timestamp (time of start).
The database uses these timestamps to decide the order of transactions.

Case	Result
Older transaction → executes first	Allowed
Newer one → waits or rolls back	Ensures consistency

Used in systems where time order matters (like online ticket booking).

Deadlocks

A deadlock happens when two or more transactions wait for each other forever.

Example

Transaction 1 locks “Account A” and needs “Account B”.
Transaction 2 locks “Account B” and needs “Account A”.→ Both wait endlessly.

Deadlock Solutions:

Deadlock Prevention: Stop conditions before they happen (give timeout or order of locking).
Deadlock Detection: DBMS checks for cycles of waiting and stops one transaction.
Deadlock Recovery: Abort one transaction and restart it.

Transaction Management

A transaction is a small logical unit of work —
like transferring ₹1000 from one account to another.

Example:


1. Deduct ₹1000 from Account A  
2. Add ₹1000 to Account B

Both steps must complete together — if one fails, both should be cancelled.

ACID Properties

To ensure reliability, every transaction must follow ACID properties 👇

Property	Meaning	Example
A – Atomicity	All steps of a transaction must succeed or none at all	Money deducted & added both or neither
C – Consistency	Database must remain valid before & after transaction	Total money before = total after
I – Isolation	Transactions must not interfere with each other	Two users withdrawing same time → no conflict
D – Durability	Once a transaction is committed, it remains saved even after crash	System crash → committed data still safe

Transaction States

State	Meaning
Active	Transaction is running
Partially Committed	All operations done, waiting to save
Committed	Successfully completed
Failed	Error occurred
Aborted	Rolled back or undone

Transaction Control Commands (TCL)

Command	Purpose
COMMIT	Saves all changes permanently
ROLLBACK	Undoes recent changes
SAVEPOINT	Sets a checkpoint in a transaction to roll back to

Summary Table

Topic	Meaning	Key Point
Backup	Copy of database	Used for recovery
Recovery	Restoring data after failure	Rollback / Rollforward
Disaster Management	Plan to restore system after major failure	Replication, DRP
Locking	Prevents simultaneous data modification	Shared / Exclusive
Timestamping	Orders transactions by time	Prevents conflict
Deadlock	Two transactions waiting for each other	Detection & recovery
Transaction	Unit of work	Must follow ACID
ACID	Reliability principles	Atomicity, Consistency, Isolation, Durability

In Short:

Backup & Recovery → Protect data
Concurrency Control → Prevent conflicts
Transaction Management → Ensure reliability

ACID → The golden rule of database safety

Unit 4: Database Implementation & Management

Database Implementation & Management

Database Storage and Physical Structures

Levels of Data Storage

Indexing

Example:

Types of Indexes:

B-Trees (Balanced Trees)

Why B-trees are useful:

Example:

Hashing

Example:

Types of Hashing:

Query Processing and Optimization

Steps in Query Processing

Query Optimization

Techniques Used:

Query Cost Analysis

Factors Affecting Query Cost:

Summary Table

In Simple Words:

Database Security, Recovery & Transaction Management

Backup, Recovery, and Disaster Management

Backup

Types of Backups

Recovery

Recovery Techniques

Disaster Management

Steps in Disaster Management:

Concurrency Control

Example:

Locking

Timestamping

Deadlocks

Example

Deadlock Solutions:

Transaction Management

ACID Properties

Transaction States

Transaction Control Commands (TCL)

Summary Table

In Short:

You might like