Unit 4: Sampling
Sampling: Basic Concepts
Sampling is the process of selecting a small group (sample) from a large group (population) to study and draw conclusions.
Defining the Universe
The universe (also called the target population) refers to the entire group of people, items, or data that a researcher is interested in studying.
Types of Universe
- Finite Universe: Limited number of items. Example: Number of students in a school.
- Infinite Universe: The number of items is unlimited or cannot be counted in practice. Example: Number of stars in the sky.
Example: If a company wants to study customer satisfaction, the universe could be all customers who purchased their product in the last year.
Concepts of Statistical Population
A statistical population is the complete set of observations or data that have something in common and are the focus of a statistical study.
Types of Population
- Real Population: Actually exists (e.g., employees in a company).
- Hypothetical Population: Does not exist physically but is assumed for study (e.g., the results of tossing a coin an infinite number of times).
Example: In a survey about the reading habits of college students, the statistical population is all college students in the city or region.
Sample
A sample is a subset of the population selected for analysis. It should represent the characteristics of the whole population.
Why use a sample?
- Saves time and cost.
- Practical and manageable.
Example: Selecting 500 students to survey out of 5,000 college students creates a sample.
Characteristics of a Good Sample
A good sample must have the following characteristics:
Summary
Sampling Frame: Definition and Practical Approach
A sampling frame is a list or database that includes all the elements (individuals, items, or units) in the population from which a sample is actually drawn. It serves as a bridge between the target population and the sample.
In simple terms: A sampling frame is a list of people or things you can choose your sample from.
📚 Example of Sampling Frame
Practical Approach to Determine the Sampling Frame
Step 1: Define the Target Population Clearly
Step 2: Identify Available Sources of Information
Step 3: Ensure the Frame is Updated and Complete
- Recent and up-to-date
- Complete (includes all relevant units)
- Free of duplicates
Step 4: Remove Ineligible Units (if any)
Step 5: Choose the Sampling Technique
- Simple Random Sampling
- Stratified Sampling
- Systematic Sampling, etc.
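Steps 3 and 4 above can be sketched in a few lines of Python. This is only an illustration: the function name `build_sampling_frame`, the `is_eligible` check, and the record fields (`id`, `name`, `enrolled`) are assumptions made for the example, not part of any standard tool.

```python
def build_sampling_frame(records, is_eligible):
    """Return a de-duplicated list of eligible units, keeping the original order."""
    seen_ids = set()
    frame = []
    for record in records:
        unit_id = record["id"]
        if unit_id in seen_ids:
            continue  # Step 3: skip duplicate entries
        seen_ids.add(unit_id)
        if is_eligible(record):  # Step 4: keep only eligible units
            frame.append(record)
    return frame


# Hypothetical raw list with one duplicate and one ineligible unit.
raw_list = [
    {"id": 101, "name": "Asha", "enrolled": True},
    {"id": 102, "name": "Ben", "enrolled": False},
    {"id": 101, "name": "Asha", "enrolled": True},
    {"id": 103, "name": "Chen", "enrolled": True},
]

frame = build_sampling_frame(raw_list, lambda r: r["enrolled"])
print([r["id"] for r in frame])  # [101, 103]
```

The cleaned frame is what the chosen sampling technique (Step 5) would then draw from.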
🔍 Importance of a Good Sampling Frame
- High accuracy in results
- Representation of the whole population
- Reduction of sampling bias
- Reliable and valid conclusions
❌ Common Problems in Sampling Frames
Example of a well-defined sampling frame:
- Target Population: All MBA students enrolled in 2024–2025.
- Sampling Frame: Official list of enrolled students from the registrar's office.
- Sampling Method: Randomly select 200 students from the list.
Sampling Errors
Causes:
- Small sample size
- Improper sampling technique
- Non-random selection
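The link between small sample size and sampling error can be seen in a small simulation. The sketch below assumes a made-up population of 5,000 values (mean about 50) and compares how far the sample mean typically falls from the population mean for n = 10 and n = 1,000. The function name and all numbers are illustrative assumptions.

```python
import random
import statistics


def mean_abs_sampling_error(population, sample_size, repeats, rng):
    """Average distance between the sample mean and the population mean."""
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)  # random, without replacement
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)


rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(5000)]  # hypothetical population

small = mean_abs_sampling_error(population, 10, 200, rng)
large = mean_abs_sampling_error(population, 1000, 200, rng)
print(f"n=10: {small:.2f}   n=1000: {large:.2f}")
```

The larger sample should show a much smaller average error, which is the sense in which a small sample size causes sampling error.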
Non-Sampling Errors
Types of Non-Sampling Errors:
Sample Size Constraints
Common Constraints
- Budget: Limited money for data collection
- Time: Less time to gather responses
- Resources: Limited access to people or databases
Non-Response
Types:
- Unit Non-Response: A selected participant does not respond at all.
- Item Non-Response: Respondent skips some questions.
How to Reduce Non-Response:
- Send follow-up reminders
- Offer incentives
- Keep survey short and easy
- Ensure confidentiality
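The two types of non-response can be measured separately. The sketch below assumes a simple data layout chosen for illustration: each sampled person is either `None` (unit non-response) or a dictionary of answers in which a skipped question is stored as `None` (item non-response). The function name and field names are hypothetical.

```python
def nonresponse_rates(responses, questions):
    """Return (unit non-response rate, {question: item non-response rate})."""
    sampled = len(responses)
    if sampled == 0:
        raise ValueError("no sampled units")
    respondents = [r for r in responses if r is not None]
    unit_rate = (sampled - len(respondents)) / sampled

    item_rates = {}
    for question in questions:
        # Item non-response is counted only among people who responded at all.
        missing = sum(1 for r in respondents if r.get(question) is None)
        item_rates[question] = missing / len(respondents) if respondents else 0.0
    return unit_rate, item_rates


# Hypothetical survey of 4 sampled people.
responses = [
    {"age": 21, "income": 30000},
    None,                          # unit non-response
    {"age": 25, "income": None},   # item non-response on "income"
    {"age": 30, "income": 45000},
]

unit_rate, item_rates = nonresponse_rates(responses, ["age", "income"])
print(unit_rate)             # 0.25
print(item_rates["age"])     # 0.0
print(round(item_rates["income"], 2))  # 0.33
```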
✅ Quick Summary Table
Probability Sampling
Simple Random Sampling
How it works:
- Use random number tables or computer software
- No pattern is followed
Advantages:
- Simple to understand
- No bias in selection
Disadvantages:
- Requires a complete list of the population, which is hard to obtain for large populations
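A minimal sketch of simple random sampling with Python's standard `random` module, reusing the earlier figures of 500 students drawn from 5,000. The frame of student IDs is made up for illustration, and `simple_random_sample` is a hypothetical helper name.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame: 5,000 student IDs.
frame = [f"student_{i:04d}" for i in range(1, 5001)]

sample = simple_random_sample(frame, 500, seed=1)
print(len(sample), len(set(sample)))  # 500 500 (no student is picked twice)
```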
Systematic Sampling
Formula: Sampling Interval (k) = Population size / Sample size
Advantages
- Easy to implement
- Quick and cost-effective
Disadvantages
- If there's a hidden pattern, results may be biased
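The interval formula above can be turned into a short sketch: compute k, pick a random starting point within the first k units, then take every k-th unit. The helper name `systematic_sample` and the numbered frame are assumptions for illustration; when the population size is not an exact multiple of the sample size, this sketch rounds k down.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th unit after a random start, where k = N // n."""
    population_size = len(frame)
    k = population_size // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = random.Random(seed).randrange(k)  # random start in 0 .. k-1
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame of 1,000 numbered units; sample of 100 gives k = 10.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=3)
print(len(sample))            # 100
print(sample[1] - sample[0])  # 10
```

If the frame is ordered in a way that repeats every k units, every selected unit falls at the same point of the cycle, which is the hidden-pattern risk listed under disadvantages.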
Stratified Random Sampling
Advantages
- Ensures representation of all groups
- Usually more precise than simple random sampling of the same size, provided units within each stratum are similar
Disadvantages
- Requires detailed population information
- More complex to administer
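One common way to carry out stratified random sampling is proportional allocation, where each stratum contributes to the sample in proportion to its share of the population. The sketch below assumes that approach; the helper name, the `stratum_of` function, and the UG/PG student frame are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(frame, stratum_of, n, seed=None):
    """Proportional allocation: simple random sample inside each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    total = len(frame)
    sample = []
    for units in strata.values():
        # Rounding means the final total can differ slightly from n.
        size = round(n * len(units) / total)
        sample.extend(rng.sample(units, min(size, len(units))))
    return sample


# Hypothetical frame: 600 undergraduate (UG) and 400 postgraduate (PG) students.
frame = [("UG", i) for i in range(600)] + [("PG", i) for i in range(400)]

sample = stratified_sample(frame, lambda unit: unit[0], 100, seed=7)
counts = Counter(stratum for stratum, _ in sample)
print(counts["UG"], counts["PG"])  # 60 40
```

The need to know each unit's stratum in advance is the "detailed population information" mentioned under disadvantages.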
Area Sampling
Area sampling is used when the population is spread across a large geographical area. It divides the area into sections, then samples are taken from selected sections.
Used in: Field surveys, national census, rural marketing research
Advantages
- Practical for large-scale studies
- Cost-effective for dispersed populations
Disadvantages
- May miss variation within each area if not sampled well
Cluster Sampling
Difference from Stratified Sampling:
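- In stratified sampling, elements are similar within each group but different across groups, and units are sampled from every group.
- In cluster sampling, each cluster is ideally a mini-version of the population, and only some clusters are selected.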
Advantages
- Saves time and cost
- Useful when population list is not available
Disadvantages
- Less accurate than stratified sampling
- High chance of sampling error if clusters are not well chosen
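A minimal sketch of one-stage cluster sampling, assuming the simplest variant: choose a few clusters at random and include every unit in the chosen clusters. The ward and household names are invented for illustration, and `cluster_sample` is a hypothetical helper.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters and keep all units inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # sample cluster names
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical population: 5 wards (clusters) with 20 households each.
clusters = {
    f"Ward {letter}": [f"{letter}-household-{i}" for i in range(1, 21)]
    for letter in "ABCDE"
}

chosen, units = cluster_sample(clusters, 2, seed=5)
print(len(chosen), len(units))  # 2 40
```

Only a list of clusters is needed to start, not a list of every unit, which is why the method is useful when a full population list is unavailable.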
✅ Comparison Table
Non-Probability Sampling
Judgment Sampling (Expert Sampling)
Advantages
- Useful when only experts or informed individuals are needed
- Saves time
Disadvantages
- Highly subjective
- Risk of bias
Convenience Sampling
Advantages
- Very easy and quick
- Low cost
Disadvantages
- Not representative of the population
- High chance of bias
Purposive Sampling
Advantages
- Focused data collection
- Useful for studying a specific subgroup
Disadvantages
- Limited generalizability
- Risk of excluding relevant views
Quota Sampling
Advantages
- Ensures subgroup representation
- Easier to manage than stratified sampling
Disadvantages
- Not random within quotas
- Still prone to bias
Snowball Sampling
Advantages
- Effective for hidden or hard-to-reach groups
- Builds trust through referrals
Disadvantages
- Limited control over sample composition
- Potential for over-representation of connected groups
✅ Comparison Table
Determining Sample Size
🛠️ Practical Considerations in Sampling and Sample Size
Sample Size Determination (Formula-Based)
n = (Z² × p × q) / E²
- n = Required sample size
- Z = Z value (based on confidence level, e.g., 1.96 for 95%)
- p = Estimated proportion of the population having the attribute
- q = 1 − p
- E = Margin of error (allowed error rate, like 5% = 0.05)
📊 Example: Suppose you want to find out how many people like a new product.
- Confidence level: 95%, so Z = 1.96
- Estimated proportion (p): 0.5
- q = 1 − 0.5 = 0.5
- Margin of error (E): 5% = 0.05
- n = (1.96² × 0.5 × 0.5) / 0.05²
- n = (3.8416 × 0.25) / 0.0025
- n = 0.9604 / 0.0025 = 384.16, rounded up to 385
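The same calculation can be checked with a short Python sketch of the formula n = (Z² × p × q) / E². The function name `sample_size_for_proportion` is an assumption for illustration. The result is rounded up because a fraction of a respondent cannot be surveyed.

```python
import math


def sample_size_for_proportion(z, p, e):
    """n = (Z^2 * p * q) / E^2, rounded up to the next whole respondent."""
    q = 1 - p
    return math.ceil((z ** 2) * p * q / e ** 2)


# Values from the example: 95% confidence (Z = 1.96), p = 0.5, E = 0.05.
print(sample_size_for_proportion(1.96, 0.5, 0.05))  # 385
```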
Tips for Determining Sample Size
Balance Between Accuracy and Cost
- Accuracy needed (bigger sample)
- Time & money available (smaller sample)
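The trade-off can be made concrete by rearranging the sample size formula to E = Z × √(p × q / n), which gives the margin of error that a given sample size can deliver. The sketch below is illustrative only; the function name and the three sample sizes are assumptions.

```python
import math


def margin_of_error(z, p, n):
    """E = Z * sqrt(p * q / n): the margin of error for a sample of size n."""
    q = 1 - p
    return z * math.sqrt(p * q / n)


# 95% confidence (Z = 1.96) and p = 0.5, for three hypothetical sample sizes.
for n in (100, 385, 1000):
    print(n, f"{margin_of_error(1.96, 0.5, n):.3f}")
# 100 0.098
# 385 0.050
# 1000 0.031
```

Going from 385 to 1,000 respondents more than doubles the data collection effort but only reduces the margin of error from about 5% to about 3%.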