Grammars and Parsing



Grammars and Sentence Structure

In Natural Language Processing, a grammar is a set of rules that defines how words combine to form valid sentences in a language.

Grammars and Parsing

Grammar helps a computer determine whether a sentence is syntactically correct.

Grammars and Parsing

Grammar rules are derived from Linguistics.

Components of Grammar

ComponentDescriptionExample
Terminal SymbolsActual words in the sentenceboy, apple
Non-Terminal SymbolsGrammatical categoriesNP, VP
Production RulesRules to generate sentencesS → NP + VP
Start SymbolStarting point of grammarS

Example Grammar Rules

RuleMeaning
S → NP VPSentence consists of Noun Phrase + Verb Phrase
NP → Det NNoun phrase
VP → V NPVerb phrase
Det → the, aDeterminer
N → boy, appleNoun
V → eatsVerb

Example Sentence Structure

Sentence: “The boy eats an apple.”

Parse representation:

S
├── NP
│ ├── Det → The
│ └── N → boy
└── VP
├── V → eats
└── NP
├── Det → an
└── N → apple

This hierarchical structure helps computers understand sentence syntax.

Parsing in NLP

Parsing is the process of analyzing a sentence according to grammar rules to determine its syntactic structure.

Parsing identifies:

  • sentence components
  • grammatical relationships
  • hierarchical structure

Example sentence: “Ram reads a book.”

Parser identifies:

ComponentRole
RamSubject
readsVerb
bookObject

Top-Down Parser

Grammars and Parsing
Grammars and Parsing

Grammars and Parsing

A Top-Down Parser starts parsing from the start symbol (S) and tries to generate the input sentence using grammar rules. It expands grammar rules until it matches the input sentence.

Steps of Top-Down Parsing

  1. Start with S
  2. Expand grammar rules
  3. Match terminal symbols with input words
  4. Continue until full sentence is generated

Example sentence: “The boy eats apple.”

Parsing process:

S
→ NP VP
→ Det N VP
→ The boy VP
→ The boy V NP
→ The boy eats apple

Advantages

AdvantageExplanation
Simple designEasy to implement
Clear derivationShows sentence generation

Disadvantages

DisadvantageExplanation
Backtracking requiredWrong rule selection
InefficientMany unnecessary expansions

Bottom-Up Parser

A Bottom-Up Parser starts from the input words and gradually builds the parse tree upwards toward the start symbol (S). It reduces groups of words into grammar symbols.

Grammars and Parsing

Parsing Process

Example: Sentence: “The boy eats apple.”

Step process:

StepReduction
TheDet
boyN
Det + NNP
eatsV
appleN
V + NVP
NP + VPS

Advantages

AdvantageExplanation
Efficient for many grammarsLess backtracking
Works well with ambiguous grammar

Disadvantages

DisadvantageExplanation
Complex implementationHarder than top-down
Large search spaceMany reductions possible

Grammars and Parsing

Difference Between Top-Down and Bottom-Up Parsing

FeatureTop-Down ParsingBottom-Up Parsing
Starting PointStart symbol (S)Input words
DirectionRoot → LeavesLeaves → Root
StrategyPredict structureBuild structure
EfficiencyLess efficientMore efficient

Transition Network Grammars

A Transition Network Grammar represents grammar rules using a network of states and transitions. It is based on finite state machines used in Artificial Intelligence.

Grammars and Parsing

Grammars and Parsing

Grammars and Parsing

Grammars and Parsing

Components

ComponentMeaning
NodeState
ArcTransition between states
LabelWord or category

Example - Sentence structure:

Start → NP → VP → End

Graph representation:

(Start) → NP → VP → (End)

Transitions represent grammatical categories.

Augmented Transition Network (ATN)

An advanced version of transition network grammar.

Features:

  • recursion
  • memory registers
  • conditions

Used for complex sentence parsing.

Top-Down Chart Parsing

Top-Down Chart Parsing is an improved version of top-down parsing that uses a chart data structure to store intermediate results and avoid repeated computations. It is widely used in syntactic parsing systems.

Grammars and Parsing
Grammars and Parsing

Grammars and Parsing

Grammars and Parsing

Components of Chart Parser

ComponentDescription
ChartData structure storing partial parses
EdgePartial grammar rule
AgendaList of operations to perform

Example Chart Entry

Sentence: “The boy eats apple.”

Chart entries may include:

PositionGrammar Rule
0–2NP
2–3V
3–4N
2–4VP

Finally:

NP + VP → S

Advantages

AdvantageExplanation
Avoids repeated workStores partial results
Efficient parsingHandles ambiguity
Works with large grammarsSuitable for NLP systems

Summary Table

TopicKey Idea
GrammarRules to form valid sentences
Sentence StructureHierarchical representation
Top-Down ParserStart from root symbol
Bottom-Up ParserStart from input words
Transition Network GrammarGrammar as state network
Chart ParsingEfficient parsing using charts

Short 

Grammars and Parsing: Grammar is a set of rules that define the structure of sentences in a language. Parsing is the process of analyzing a sentence according to these grammatical rules to determine its syntactic structure. Major parsing techniques include Top-Down parsing, which starts from the start symbol and expands grammar rules, and Bottom-Up parsing, which begins with input words and builds the structure upward. Transition Network Grammars represent grammar using states and transitions, while Top-Down Chart Parsing improves efficiency by storing intermediate parsing results in a chart.

Feature Systems and Augmented Grammars

In Natural Language Processing, feature systems are used to represent grammatical properties of words such as number, gender, tense, and person.

Feature systems help parsers enforce grammatical agreement rules in sentences.

Feature Systems and Augmented Grammars


Feature Systems and Augmented Grammars
Feature Systems and Augmented Grammars

Feature Systems and Augmented Grammars

Feature Systems and Augmented Grammars

Example sentence: “She eats apples.”

The system checks:

WordFeature
ShePerson = 3rd
eatsVerb agreement = 3rd person
applesNumber = plural

These grammatical properties are represented using feature structures.

Basic Feature System for English

A feature system represents grammatical information using attribute–value pairs.

Example feature structure:

Noun
[
Number = Singular
Gender = Masculine
Person = Third
]

Common Features in English Grammar

FeatureMeaningExample
NumberSingular or pluralboy / boys
PersonFirst, second, thirdI, you, he
TenseTime of actionpast, present
GenderMasculine/feminineactor / actress
CaseGrammatical rolesubject/object

Example: Agreement Checking

Sentence: “The boy runs.”

Feature matching:

WordFeature
boyNumber = singular
runsVerb form = singular

Correct agreement → sentence valid.

Incorrect example“The boy run.”

Feature mismatch → grammar error.

Morphological Analysis and the Lexicon

Morphological Analysis

Morphology studies the internal structure of words.

Morphological analysis identifies:

  • root word
  • prefixes
  • suffixes
  • grammatical properties

Example word: “unhappiness”

Structure:

ComponentMeaning
unprefix
happyroot
nesssuffix

Types of Morphology

TypeDescriptionExample
Inflectional MorphologyChanges grammatical formplay → played
Derivational MorphologyCreates new wordshappy → happiness

Grammars and parsing
Grammars and Parsing
Grammars and Parsing
Grammars and Parsing

Grammars and Parsing

Lexicon in NLP

The lexicon is a dictionary of words used by NLP systems.

Each word entry contains:

FieldDescription
Wordactual word
Part of Speechnoun, verb, adjective
Featuresnumber, tense
Meaningsemantic information

Example Lexicon Entry

WordPOSFeature
boynounsingular
eatverbbase form
eatsverbpresent, 3rd person

Lexicons help the system identify word properties during parsing.

Parsing with Features

Feature-based parsing ensures grammatical agreement using feature matching. The parser checks whether features of words match grammar rules.

Grammars and Parsing

Grammars and Parsing

Grammars and Parsing

Example Grammar Rule with Features

Traditional rule:

S → NP VP

Feature-based rule:

S → NP[NUM=?n] VP[NUM=?n]

Meaning:

  • The number feature of NP and VP must match.

Example Parsing

Sentence: “The boys play football.”

Feature analysis:

WordFeature
boysplural
playplural verb

Feature match → sentence accepted.

Incorrect example: “The boys plays football.”

Plural subject + singular verb → parser rejects.

Augmented Transition Networks (ATN)

An Augmented Transition Network (ATN) is an advanced form of Transition Network Grammar used in Artificial Intelligence for parsing natural language.ATN extends finite state networks with additional capabilities.

Augmented Transition Networks (ATN)

Augmented Transition Networks (ATN)

Components of ATN

ComponentDescription
Nodesstates of parsing
Arcstransitions between states
Registersmemory variables
Testsconditions to check grammar

ATN Parsing Example

Sentence structure:

Start → NP → VP → End

Network representation:

(Start)
|
NP
|
VP
|
(End)

During parsing, the system:

  1. Moves through states
  2. Checks grammatical conditions
  3. Stores information in registers

Advantages of ATN

AdvantageExplanation
Handles complex grammarSupports recursion
Efficient parsingReduces ambiguity
Supports feature checkingWorks with feature systems

Relationship Between Feature Systems and Augmented Grammars

ConceptRole
Feature SystemStores grammatical properties
LexiconStores word information
Morphological AnalysisBreaks words into components
Parsing with FeaturesChecks agreement
ATNUses features during parsing

Together they allow NLP systems to process language more accurately.

Key Points

TopicImportant Idea
Feature SystemsAttribute–value grammatical representation
Morphological AnalysisStudy of word structure
LexiconNLP dictionary of words
Parsing with FeaturesAgreement checking during parsing
ATNAdvanced parsing network with memory

Short 

Feature systems represent grammatical properties of words using attribute–value pairs such as number, tense, and gender. Morphological analysis studies the internal structure of words and uses a lexicon that stores word information. Parsing with features ensures grammatical agreement between words during sentence analysis. Augmented Transition Networks are advanced parsing models that extend transition networks by adding memory registers, conditions, and recursive capabilities to process complex natural language sentences.