Unit 4: Database Implementation & Management
Database Implementation & Management
When we create a database, it’s not just about tables and queries —
there’s a lot going on behind the scenes to store data efficiently, process queries faster, and reduce cost/time.
Let’s understand each topic one by one
Database Storage and Physical Structures
A Database Storage Structure refers to how data is actually stored inside a database system (on hard disk or memory). Databases organize data in a way that makes it easy to access and update quickly.
Levels of Data Storage
| Level | Description | Example |
|---|---|---|
| Logical Level | How users see data (tables, columns) | SELECT * FROM Customers; |
| Physical Level | How data is stored on disk | Data files, indexes, pages, blocks |
| Buffer / Cache | Temporary memory area for fast access | RAM used for query results |
Indexing
Imagine a book index — instead of reading every page, you check the index to find a topic quickly.
The same concept applies in databases.
Indexing is a technique to speed up data retrieval from large tables.
Example:
Without Index:
To find “EmpID = 105”, the system checks every record one by one.
With Index:
It directly jumps to the location where EmpID = 105 is stored.
Types of Indexes:
| Type | Description |
|---|---|
| Primary Index | Automatically created on primary key column |
| Secondary Index | Created on non-key columns for faster search |
| Clustered Index | Data rows are physically arranged according to the index |
| Non-clustered Index | Index created separately from the actual data table |
SQL Example:
B-Trees (Balanced Trees)
A B-tree (Balanced Tree) is a data structure used to organize and access data efficiently in indexing.
Think of a B-tree as a hierarchical structure like a family tree:
- Each node can store multiple keys.
- Root node → main entry point.
- Leaf nodes → actual data.
Why B-trees are useful:
- Keep data sorted
- Allow fast searching, insertion, and deletion
- Reduce the number of disk reads
Example:
When you search for EmpID = 50:
- The B-tree checks top (root) nodes first
- Then moves down the branches until it finds the right leaf node.
Used in: Indexes in databases (MySQL, Oracle, etc.)
Hashing
Hashing is another way to locate data quickly. It uses a mathematical function (hash function) to convert a key value into an address.
Example:
If EmployeeID = 105,
A hash function might map it to address 10, so the data for that employee is stored at that location.
Types of Hashing:
| Type | Description |
|---|---|
| Static Hashing | Hash table size is fixed |
| Dynamic Hashing | Hash table grows/shrinks as data changes |
Hashing is best for exact searches, like finding a record by ID.
Query Processing and Optimization
When you run a SQL command like:
The DBMS does several steps internally to process and optimize the query for faster results.
Steps in Query Processing
| Step | Description |
|---|---|
| 1. Parsing | SQL query is checked for syntax errors |
| 2. Translation | SQL is converted into a low-level query (relational algebra) |
| 3. Optimization | DBMS chooses the best way (fastest path) to execute the query |
| 4. Execution | Query runs and retrieves data from disk or memory |
Query Optimization
Query Optimization means improving the performance of SQL queries —
so that results come faster and with minimum resource use (CPU, memory, I/O).
Techniques Used:
| Technique | Description |
|---|---|
| Using Indexes | Access fewer rows by using indexed columns |
| Join Optimization | Choose the most efficient join method (nested loop, merge join, hash join) |
| Predicate Pushdown | Filter data early (apply WHERE condition first) |
| Query Rewriting | Rewrite query in a faster equivalent form |
| Caching | Store frequently used results in memory |
Query Cost Analysis
When the database runs a query, it consumes resources —
like CPU time, memory, and disk I/O (input/output operations).
Query Cost Analysis measures these to find the most efficient execution plan.
Factors Affecting Query Cost:
| Factor | Description |
|---|---|
| Disk Access Time | Time to read/write data from storage |
| CPU Usage | Time for processing and sorting data |
| Network Cost | For distributed databases |
| Memory Usage | Amount of RAM used for temporary data |
Goal: Choose the query plan with the lowest cost (fastest execution).
Summary Table
| Topic | Meaning | Example / Use |
|---|---|---|
| Database Storage | How data is stored physically | Pages, Blocks, Files |
| Indexing | Data lookup shortcut | Index on Employee Name |
| B-Trees | Hierarchical data structure for indexes | Used in MySQL indexing |
| Hashing | Uses hash function to locate data | Hash(EmployeeID) = memory address |
| Query Processing | Steps DBMS takes to run a query | Parsing → Optimization → Execution |
| Query Optimization | Improve speed & reduce cost | Use indexes, rewrite query |
| Query Cost Analysis | Measure of efficiency | Disk + CPU + Memory cost |
In Simple Words:
Database Implementation = How data is stored.
Query Optimization = How queries run faster.
Goal = Fast access + Less cost + Accurate results.
Database Security, Recovery & Transaction Management
Backup, Recovery, and Disaster Management
Backup
A backup is a copy of database data that can be used to restore information if something goes wrong.
Think of backup like saving a copy of your project file on Google Drive — if your laptop crashes, you can still recover your work.
Types of Backups
| Type | Description | Example |
|---|---|---|
| Full Backup | Complete copy of the entire database | Weekly full database copy |
| Incremental Backup | Backs up only the changes made after the last backup | Daily small backups |
| Differential Backup | Backs up changes made after the last full backup | Every 2–3 days backup of all recent changes |
SQL Example:
Recovery
Recovery is the process of restoring the database after a failure using backup files and logs.
There are two main types of failures:
- System Crash – Power failure, hardware crash
- Transaction Failure – Error during transaction (e.g., division by zero, data conflict)
Recovery Techniques
| Technique | Description |
|---|---|
| Roll Forward | Restore database using backup + log files to bring it to latest state |
| Roll Back | Undo incomplete transactions using logs (like CTRL + Z for database) |
Disaster Management
When something serious happens (like a fire, server crash, or cyber attack), disaster management ensures the database can be quickly restored.
Steps in Disaster Management:
- Regular Backups – Store copies in multiple locations (onsite + cloud)
- Redundancy – Use duplicate servers (mirroring/replication)
- Disaster Recovery Plan (DRP) – A formal plan for what to do after failure
- Testing – Regularly test recovery methods to ensure they work
Example: Banks keep real-time mirror servers in different cities to avoid data loss.
Concurrency Control
In a multi-user system, many users access the database at the same time.
Concurrency Control ensures that:
- Multiple users can work together without conflicting
- The accuracy and consistency of data are maintained
Example:
Two employees withdraw money from the same bank account at the same time.
Without control → Wrong balance
With control → Transactions handled one after another safely.
Locking
Locking prevents multiple users from changing the same data at once.
| Type | Description | Example |
|---|---|---|
| Shared Lock | Data can be read by many users but not modified | Viewing balance |
| Exclusive Lock | Data can be modified by only one user | Updating salary |
When a user locks a row, others must wait until it’s released.
Timestamping
Each transaction is given a unique timestamp (time of start).
The database uses these timestamps to decide the order of transactions.
| Case | Result |
|---|---|
| Older transaction → executes first | Allowed |
| Newer one → waits or rolls back | Ensures consistency |
Used in systems where time order matters (like online ticket booking).
Deadlocks
A deadlock happens when two or more transactions wait for each other forever.
Example
- Transaction 1 locks “Account A” and needs “Account B”.
- Transaction 2 locks “Account B” and needs “Account A”.→ Both wait endlessly.
Deadlock Solutions:
- Deadlock Prevention: Stop conditions before they happen (give timeout or order of locking).
- Deadlock Detection: DBMS checks for cycles of waiting and stops one transaction.
- Deadlock Recovery: Abort one transaction and restart it.
Transaction Management
A transaction is a small logical unit of work —
like transferring ₹1000 from one account to another.
Example:
Both steps must complete together — if one fails, both should be cancelled.
ACID Properties
To ensure reliability, every transaction must follow ACID properties 👇
| Property | Meaning | Example |
|---|---|---|
| A – Atomicity | All steps of a transaction must succeed or none at all | Money deducted & added both or neither |
| C – Consistency | Database must remain valid before & after transaction | Total money before = total after |
| I – Isolation | Transactions must not interfere with each other | Two users withdrawing same time → no conflict |
| D – Durability | Once a transaction is committed, it remains saved even after crash | System crash → committed data still safe |
Transaction States
| State | Meaning |
|---|---|
| Active | Transaction is running |
| Partially Committed | All operations done, waiting to save |
| Committed | Successfully completed |
| Failed | Error occurred |
| Aborted | Rolled back or undone |
Transaction Control Commands (TCL)
| Command | Purpose |
|---|---|
| COMMIT | Saves all changes permanently |
| ROLLBACK | Undoes recent changes |
| SAVEPOINT | Sets a checkpoint in a transaction to roll back to |
Summary Table
| Topic | Meaning | Key Point |
|---|---|---|
| Backup | Copy of database | Used for recovery |
| Recovery | Restoring data after failure | Rollback / Rollforward |
| Disaster Management | Plan to restore system after major failure | Replication, DRP |
| Locking | Prevents simultaneous data modification | Shared / Exclusive |
| Timestamping | Orders transactions by time | Prevents conflict |
| Deadlock | Two transactions waiting for each other | Detection & recovery |
| Transaction | Unit of work | Must follow ACID |
| ACID | Reliability principles | Atomicity, Consistency, Isolation, Durability |
In Short:
Backup & Recovery → Protect data
Concurrency Control → Prevent conflicts
Transaction Management → Ensure reliability
ACID → The golden rule of database safety