DECISION TREE LEARNING & INSTANCE-BASED LEARNING



Introduction to Decision Tree Learning

Decision Tree Learning is a supervised machine learning method used for classification and prediction. It represents decisions in a tree-like structure where:

  • Nodes represent tests or conditions
  • Branches represent possible outcomes
  • Leaf nodes represent final decisions or classes

Example: Suppose we want to predict whether a person will buy a product.

Decision tree example:

Age > 30?
├── Yes → Income > 50k?
│         ├── Yes → Buy Product
│         └── No → No Buy
└── No → No Buy

Components of Decision Tree

Component     | Meaning
Root Node     | Starting point of the tree
Internal Node | Decision condition
Branch        | Outcome of a condition
Leaf Node     | Final result or class

Applications

  • Medical diagnosis
  • Customer behavior prediction
  • Fraud detection
  • Credit risk analysis

Decision Tree Learning Algorithm

The decision tree learning algorithm builds a tree by recursively dividing the dataset based on attributes.

Steps of the Algorithm

  1. Start with the entire dataset as the root node.
  2. Select the best attribute that splits the data.
  3. Divide the dataset into subsets based on attribute values.
  4. Create child nodes.
  5. Repeat the process until:
    • All data belongs to same class
    • No attributes remain

This process is called recursive partitioning.
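The steps above can be sketched in Python. This is a minimal illustration of recursive partitioning on a toy "buy a product" dataset; the attribute-selection rule here (take the first remaining attribute) is only a placeholder, since the actual selection criterion (information gain) is covered later.

```python
def partition(rows, attr):
    """Group rows into subsets keyed by the value of one attribute."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attr], []).append(row)
    return subsets

def build_tree(rows, attrs, label="label"):
    labels = {row[label] for row in rows}
    if len(labels) == 1:          # stop: all data belongs to the same class
        return labels.pop()
    if not attrs:                 # stop: no attributes remain -> majority class
        return max(labels, key=lambda c: sum(r[label] == c for r in rows))
    attr = attrs[0]               # placeholder selection rule (not information gain)
    rest = attrs[1:]
    return {attr: {v: build_tree(sub, rest, label)
                   for v, sub in partition(rows, attr).items()}}

# Toy dataset matching the Age/Income example; values are illustrative only.
data = [
    {"Age>30": "Yes", "Income>50k": "Yes", "label": "Buy"},
    {"Age>30": "Yes", "Income>50k": "No",  "label": "No Buy"},
    {"Age>30": "No",  "Income>50k": "Yes", "label": "No Buy"},
]
tree = build_tree(data, ["Age>30", "Income>50k"])
print(tree)
```

Each internal node becomes a nested dictionary keyed by attribute values, and each leaf is a class label.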

Inductive Bias

Inductive bias refers to the assumptions made by a learning algorithm to generalize from training data.

In simple words: When a machine learning model learns from limited examples, it assumes certain patterns to predict unseen data.

Example: Suppose a decision tree learns from its training data that birds with wings and feathers can fly. When it encounters a new bird it has never seen before, it assumes the bird can fly, because the bird matches the learned pattern.

Importance

Inductive bias helps the model:

  • Make predictions on unseen data
  • Avoid overfitting
  • Learn general patterns

Inductive Inference with Decision Trees

Inductive inference means drawing general conclusions from specific examples.

Decision trees perform inductive inference by:

  1. Studying training examples
  2. Finding patterns
  3. Creating decision rules

Example

Training Data:

Weather  | Play Cricket
Sunny    | No
Rainy    | Yes
Overcast | Yes

The decision tree may infer the rules:

If Weather = Sunny → No Play
If Weather = Rainy → Play
If Weather = Overcast → Play

This rule is generalized for future predictions.

Entropy (Information Theory)

Entropy measures the uncertainty or impurity in data.

If the data is mixed, entropy is high.
If the data is pure, entropy is low.

Entropy Formula

Entropy(S) = - \sum_{i=1}^{c} p_i \log_2 p_i

Where:

Symbol | Meaning
S      | Dataset
pᵢ     | Probability of class i
c      | Number of classes

Example

Dataset:

Class | Count
Yes   | 9
No    | 5

For this dataset: Entropy(S) = -(9/14) log₂(9/14) - (5/14) log₂(5/14) ≈ 0.940.

If both classes are equally distributed → entropy is at its maximum (1 for two classes).

If only one class exists → entropy = 0.
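The 9-Yes/5-No example can be checked numerically. A minimal sketch in Python; the helper name `entropy` is ours:

```python
import math

def entropy(counts):
    """Entropy(S) = -sum(p_i * log2(p_i)) over nonzero class counts."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([9, 5]), 3))  # mixed classes (9 Yes, 5 No): about 0.94
print(entropy([7, 7]))            # equal split: maximum entropy for 2 classes, 1.0
print(entropy([14]))              # one class only: 0.0
```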

Information Gain

Information Gain measures how much information an attribute gives about the class. It helps determine which attribute should be used to split the dataset.

Formula

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)

Where:

Symbol | Meaning
S      | Dataset
A      | Attribute
Sᵥ     | Subset of S where attribute A has value v

Interpretation

Higher information gain → better attribute for splitting.

Example attributes:

  • Age
  • Income
  • Education

The attribute with highest information gain becomes the root node.
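A small sketch of the gain formula, applied to the Weather/Play table from the earlier inductive-inference example (function names are ours):

```python
import math

def entropy(labels):
    total = len(labels)
    return sum(-(labels.count(c) / total) * math.log2(labels.count(c) / total)
               for c in set(labels))

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
    gain = entropy([r[target] for r in rows])
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

rows = [
    {"Weather": "Sunny",    "Play": "No"},
    {"Weather": "Rainy",    "Play": "Yes"},
    {"Weather": "Overcast", "Play": "Yes"},
]
# Each weather value yields a pure subset, so splitting on Weather removes
# all uncertainty: the gain equals Entropy(S).
print(round(info_gain(rows, "Weather", "Play"), 3))  # about 0.918
```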

ID3 Algorithm

ID3 (Iterative Dichotomiser 3) is a decision tree learning algorithm developed by Ross Quinlan.

It builds the decision tree using information gain.

Steps of ID3 Algorithm

  1. Calculate entropy of dataset.
  2. Compute information gain for each attribute.
  3. Choose attribute with highest information gain.
  4. Create node and split dataset.
  5. Repeat process for each subset.
  6. Stop when:
    • All examples belong to same class
    • No attributes remain
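The six steps above can be combined into a compact ID3 sketch. The toy dataset reuses the earlier Weather/Play example; the structure and function names are our illustration, not Quinlan's original code.

```python
import math

def entropy(labels):
    total = len(labels)
    return sum(-(labels.count(c) / total) * math.log2(labels.count(c) / total)
               for c in set(labels))

def info_gain(rows, attr, target):
    gain = entropy([r[target] for r in rows])          # step 1: dataset entropy
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain                                        # step 2: gain per attribute

def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                          # stop: all one class
        return labels[0]
    if not attrs:                                      # stop: no attributes remain
        return max(set(labels), key=labels.count)
    best = max(attrs, key=lambda a: info_gain(rows, a, target))  # step 3
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in {r[best] for r in rows}}}  # steps 4-5: split, recurse

rows = [
    {"Weather": "Sunny",    "Play": "No"},
    {"Weather": "Rainy",    "Play": "Yes"},
    {"Weather": "Overcast", "Play": "Yes"},
]
print(id3(rows, ["Weather"], "Play"))
```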

Advantages of ID3

  • Simple to implement
  • Works well with categorical data
  • Easy to interpret

Limitation

  • Cannot handle continuous data directly
  • Prone to overfitting

Issues in Decision Tree Learning

Decision tree algorithms face several challenges.

Issue | Explanation
Overfitting | Tree becomes too complex, fits training data perfectly but fails on new data
Handling Continuous Attributes | Decision trees work better with categorical data
Missing Values | Difficult to process incomplete data
Bias toward Many-Valued Attributes | Attributes with more values may appear more informative
Tree Complexity | Very large trees are difficult to interpret

Solutions

  • Pruning (reducing tree size)
  • Handling missing values
  • Using improved algorithms like:
    • C4.5
    • CART

Summary Table

Topic | Key Idea
Decision Tree | Tree structure for classification
Inductive Bias | Assumptions for generalization
Inductive Inference | Learning general rules from examples
Entropy | Measure of data impurity
Information Gain | Measures usefulness of an attribute
ID3 Algorithm | Builds tree using information gain
Issues | Overfitting, missing data, complexity

Important MCA Exam Questions

  1. Explain Decision Tree Learning with diagram.
  2. Define Entropy and Information Gain.
  3. Explain ID3 Algorithm with steps.
  4. What is Inductive Bias in Machine Learning?
  5. Explain Issues in Decision Tree Learning.

INSTANCE-BASED LEARNING

Instance-Based Learning (IBL) is a type of machine learning method where the model learns by storing training examples and comparing new data with those examples.

Instead of creating a general model, the system remembers past instances (examples) and uses them to make predictions.

In simple words: Learning happens by comparing new cases with previously stored cases.

Example: Suppose a system stores medical records of patients.

When a new patient arrives, the system compares the new patient with similar previous patients to predict disease.

Key Idea

Prediction is based on similarity between instances.

Characteristics

Feature | Explanation
Lazy Learning | Model waits until prediction time
Memory-Based | Stores training examples
Similarity Measure | Uses distance measures like Euclidean distance

k-Nearest Neighbour (k-NN) Learning

k-Nearest Neighbour (k-NN) is the most popular instance-based learning algorithm.

It classifies a new data point by looking at the k nearest data points in the training dataset.

How k-NN Works

Steps:

  1. Choose the value of k (number of neighbors).
  2. Calculate the distance between new data and all training data.
  3. Select the k closest neighbors.
  4. Assign the class that appears most frequently among those neighbors.

Distance Formula (Euclidean Distance)

d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}

Where:

Symbol | Meaning
x      | New data point
y      | Training data point
n      | Number of features

Example

Suppose we classify fruit type using features:

  • Weight
  • Color

If most neighbors are Apples, the new fruit is classified as Apple.
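The fruit example can be sketched directly from the four steps above. The feature values (weight in grams, a colour score) are illustrative assumptions:

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_classify(train, query, k=3):
    """train: list of (features, label) pairs; return majority label of k nearest."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy fruit data: (weight, colour score) -> label. Values are made up.
train = [
    ((150, 0.8), "Apple"), ((170, 0.7), "Apple"), ((160, 0.9), "Apple"),
    ((120, 0.3), "Orange"), ((110, 0.2), "Orange"),
]
print(knn_classify(train, (155, 0.75), k=3))  # nearest neighbours are apples
```

Note there is no training step: all work happens at prediction time, which is why k-NN is called a lazy learner.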

Advantages

  • Simple to implement
  • No training phase
  • Works well with small datasets

Disadvantages

  • Slow for large datasets
  • Sensitive to irrelevant features
  • Requires large memory

Locally Weighted Regression (LWR)

Locally Weighted Regression is a type of regression algorithm where predictions are made using nearby data points.

Instead of fitting a single global model, LWR creates local models for each prediction.

Idea

Points closer to the query point receive higher importance (weight).

Example: Suppose we want to predict house prices.

A house price prediction depends more on nearby houses rather than houses in another city.

So the algorithm gives more weight to closer houses.
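A minimal one-dimensional sketch of this idea, fitting a weighted straight line around each query point. The Gaussian kernel and the bandwidth parameter `tau` are common choices assumed here, not taken from the text:

```python
import math

def lwr_predict(xs, ys, x_query, tau=1.0):
    """Locally weighted linear regression: fit a line near x_query and evaluate it."""
    # Gaussian kernel: points near the query get weights close to 1.
    w = [math.exp(-((x - x_query) ** 2) / (2 * tau ** 2)) for x in xs]
    sw   = sum(w)
    swx  = sum(wi * x for wi, x in zip(w, xs))
    swy  = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    # Solve the 2x2 weighted least-squares normal equations for slope b, intercept a.
    b = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a = (swy - b * swx) / sw
    return a + b * x_query

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]               # points on the line y = 2x + 1
print(round(lwr_predict(xs, ys, 2.5), 6))  # recovers the line: 6.0
```

Because the model is rebuilt for every query, LWR is lazy in the same sense as k-NN.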

Key Characteristics

Feature | Explanation
Local Model | Model built around nearby data
Weighted Data | Nearby points have higher weights
Lazy Learning | Model built during prediction

Applications

  • Economic forecasting
  • Demand prediction
  • Time-series analysis

Radial Basis Function (RBF) Networks

Radial Basis Function Networks are a type of Artificial Neural Network used for function approximation and classification.

They use radial basis functions as activation functions.

Structure of RBF Network

An RBF network has three layers:

  1. Input Layer
  2. Hidden Layer (Radial basis functions)
  3. Output Layer

Radial Basis Function

\phi(x) = e^{-\gamma \|x - c\|^2}

Where:

Symbol | Meaning
x      | Input vector
c      | Center
γ      | Parameter controlling the spread

The function gives higher values for points closer to the center.
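A direct sketch of the basis function itself (the function name and test points are ours):

```python
import math

def rbf(x, c, gamma=1.0):
    """phi(x) = exp(-gamma * ||x - c||^2); peaks at 1.0 when x equals the center c."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-gamma * sq_dist)

center = (0.0, 0.0)
print(rbf((0.0, 0.0), center))  # at the center: 1.0
print(rbf((1.0, 0.0), center))  # one unit away: exp(-1), about 0.368
```

In a full RBF network, each hidden unit computes one such phi around its own center, and the output layer takes a weighted sum of these activations.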

Applications

  • Pattern recognition
  • Speech recognition
  • Image processing
  • Function approximation

Advantages

  • Fast learning
  • Good for nonlinear problems

Case-Based Learning

Case-Based Learning (CBL) is a method where the system solves new problems using solutions from similar past cases.

It is widely used in expert systems.

Process of Case-Based Learning

The system follows four steps:

Step     | Explanation
Retrieve | Find similar past cases
Reuse    | Apply solution from past case
Revise   | Modify solution if necessary
Retain   | Store new case for future use

Example: Legal system: A lawyer solves a case by studying similar past court cases.

Similarly, case-based learning uses previous cases to solve new problems.
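The four-step cycle can be sketched on a toy medical example. The similarity measure here (count of shared symptoms) and all case data are illustrative assumptions; real systems use much richer representations:

```python
def similarity(case_a, case_b):
    """Toy similarity: number of symptoms the two cases share."""
    return len(set(case_a["symptoms"]) & set(case_b["symptoms"]))

def solve(case_base, new_case):
    best = max(case_base, key=lambda c: similarity(c, new_case))  # Retrieve
    solution = best["diagnosis"]                                  # Reuse
    new_case["diagnosis"] = solution                              # Revise would adjust it here
    case_base.append(new_case)                                    # Retain for future queries
    return solution

case_base = [
    {"symptoms": ["fever", "cough"], "diagnosis": "flu"},
    {"symptoms": ["rash", "itch"],   "diagnosis": "allergy"},
]
print(solve(case_base, {"symptoms": ["fever", "cough", "fatigue"]}))  # flu
```

Note that the case base grows with every solved problem, which is exactly the Retain step.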

Applications

  • Medical diagnosis systems
  • Customer support systems
  • Legal reasoning systems
  • Help desk systems

Comparison of Instance-Based Learning Methods

Method | Purpose
k-NN | Classification based on nearest neighbors
Locally Weighted Regression | Local regression prediction
RBF Networks | Neural network using radial basis functions
Case-Based Learning | Solve problems using past cases

Advantages of Instance-Based Learning

  1. Simple concept
  2. No training phase required
  3. Adapts quickly to new data
  4. Effective for pattern recognition

Limitations of Instance-Based Learning

Problem | Explanation
High Memory Requirement | Stores the entire dataset
Slow Prediction | Needs to compare many instances
Sensitive to Noise | Outliers affect results
Curse of Dimensionality | Performance decreases with many features

Summary

Instance-Based Learning is a memory-based machine learning approach where predictions are made by comparing new instances with stored examples. Algorithms like k-NN, Locally Weighted Regression, RBF networks, and Case-Based Learning rely on similarity measures to make predictions. These methods are simple and flexible but can require large storage and computational resources.

Important MCA Exam Questions

  1. Explain Instance-Based Learning with characteristics.
  2. Explain k-Nearest Neighbour algorithm with example.
  3. What is Locally Weighted Regression?
  4. Explain Radial Basis Function Networks.
  5. Explain Case-Based Learning with steps.