Symbol Tables



Symbol Tables

A symbol table is a central data structure used by a compiler to store information about identifiers (variables, functions, arrays, labels, constants).

Why Symbol Tables Are Needed

  • Track declarations and uses
  • Support type checking
  • Enable storage allocation
  • Facilitate scope management

Typical Information Stored

FieldDescription
NameIdentifier lexeme
Typeint, float, array, function
ScopeGlobal / local / block
SizeMemory requirement
AddressStorage location
ParametersFor procedures/functions

Data Structures for Symbol Tables

Choosing the right data structure affects lookup speed and compiler performance.

Common Data Structures

Data StructureAdvantagesLimitations
Linear ListSimpleSlow lookup (O(n))
Hash TableFast average lookup (O(1))Collision handling
Binary Search TreeOrdered traversalCan degrade to O(n)
TrieEfficient prefix searchHigher memory usage

Most compilers use hash tables due to efficiency.

Representing Scope Information

Scope defines the region of the program where an identifier is valid.

Types of Scope

  • Global scope: visible throughout the program
  • Local scope: visible within a function/block
  • Nested (block) scope: inner blocks shadow outer names

Scope Handling Techniques

TechniqueDescription
Separate TablesOne table per scope
Scope StackPush on entry, pop on exit
Chained Hash TablesParent pointer to enclosing scope

Run-Time Administration

Run-time administration manages memory allocation, activation records, and control flow during program execution.

Key Responsibilities

  • Allocate/deallocate storage
  • Manage procedure calls/returns
  • Support recursion

Implementation of Simple Stack Allocation Scheme

A stack allocation scheme uses a run-time stack to manage memory for procedure calls.

Activation Record (Stack Frame)

FieldPurpose
ParametersPassed values
Return AddressControl transfer
Local VariablesLocal storage
TemporariesIntermediate values
Control LinkAccess caller frame

Characteristics

  • LIFO discipline
  • Efficient for block-structured and recursive programs
  • Used by most modern languages (C, Java)

Storage Allocation in Block-Structured Languages

Block-structured languages (C, Pascal) allow nested blocks with local declarations.

Allocation Strategies

StrategyDescription
Static AllocationFixed at compile time
Stack AllocationDynamic, supports recursion
Heap AllocationDynamic objects with unknown lifetime

Scope-Based Allocation

  • Enter block → allocate locals
  • Exit block → deallocate locals automatically

Error Detection & Recovery

Compilers must detect, report, and recover from errors to continue processing.

Lexical Phase Errors

Detected by the lexical analyzer while scanning characters.

Examples

  • Invalid character: @
  • Unterminated string
  • Illegal numeric format

Recovery Techniques

TechniqueDescription
Panic ModeSkip invalid characters
Error TokenReplace with dummy token
Ignore CharacterContinue scanning

Syntactic Phase Errors

Detected by the parser when token sequences violate grammar.

Examples

  • Missing semicolon
  • Unmatched parentheses
  • Incorrect keyword order

Recovery Methods

MethodDescription
Panic ModeSkip tokens until synchronizing symbol
Phrase-LevelInsert/delete tokens
Error ProductionsGrammar rules for errors

Semantic Errors

Detected during semantic analysis.

Examples

  • Undeclared variable
  • Type mismatch (int + float)
  • Wrong number of arguments
  • Array index out of bounds (partial)

Detection Mechanism

  • Symbol table lookup
  • Type checking rules
  • Attribute evaluation

Error Handling Comparison

PhaseError TypeExamples
LexicalInvalid tokenIllegal character
SyntacticGrammar violationMissing ;
SemanticMeaning errorType mismatch

Exam-Oriented Summary Table

TopicKey Points
Symbol TableStores identifier info
Data StructuresHash table preferred
ScopeGlobal, local, nested
Stack AllocationActivation records
Block StructureScope-based storage
Lexical ErrorsInvalid tokens
Syntax ErrorsGrammar violations
Semantic ErrorsType/scope issues