Home »
DBMS
What is Chomsky Normal Form in DBMS?
In this tutorial, we are going to learn about the chomsky normal form, why it is used, explanation with examples.
By IncludeHelp Last updated : May 29, 2023
Overview
A grammar where each output is either of the form A → BC or A → c Example: S → AS | an A → SA | b
Where, the non-terminals are A, B, C and the terminal is a, we infer, from the above description, that in order to be in CNF, either two non-terminals or a single terminal must be derived from all production.
CNF limits the number of symbols to two on the right side of an output.
- Non-terminals or a single terminal must be the two symbols.
What is Chomsky Normal Form in DBMS?
The standard form of Chomksy is a useful form for dealing with context-free grammar. This is a basic method of writing a CFG that is useful for understanding and explaining things about CFGs. It also allows a binary tree of the parse tree for derivations that use this form of the CFG.
What is the standard Chomsky form, then? When each rule is of the form A → BC and A → a, where an is a terminal, and A, B and C are variables, a CFG is natural in Chomsky form. Moreover the starting variable is not B or C. In addition, for technical reasons, we allow the S → ε rule where S is the starting variable. Notice that this means that, as one of several possible laws, we allow S → ε.
Why Chomsky's Normal Form?
The key advantage is that every derivation of a string of n letters has exactly 2n-1 steps in the Chomsky Normal Form. Thus, by an exhaustive search of all derivations, one can determine if a string is in the language.
For instance—
S x AB AB
A → a
B → b
Free grammar in this context is in normal Chomsky form.
Algorithm
To standardize the grammar using CNF, the following steps are followed—
Step 1:
Absolutely minimize your grammar by-
Elimination of productivity
Eliminating unit manufacturing
Elimination of obsolete inventions
Step 2:
Replace each A → B1B2B3 production....Bn where n → 2 with A → B1C where C → B2B3....Bn.
Repeat this step for all productions that have more than two RHS variables.
Step 3:
Substitute A x XB and X x a for each output of type A x aB.
For all productions having the form A x aB, repeat this step.
Practice Solution
Example 1
Convert the grammar given to CNF-
S → aADD
A → aB / bAB
B → b
D →d
ANSWER
Step 1:
The grammar given is already completely diminished.
Step 2:
The productions that are already in normal Chomsky form are—
B → b .......... (1)
D → d .......... (2)
Such productions will stay as they are.
Productions that are not in the normal Chomsky form are—
S → aAD .......... (3)
A → aB / bAB .......... (4)
We're going to convert these productions to the normal Chomsky form.
Step 3:
Replace the terminal symbols a and b with the new Ca and Cb variables.
This is achieved by incorporating the grammar of the following two recent productions—
Ca → a .......... (5)
Cb b .......... (6)
The productions (3) and (4) are now adapted to—
S → CaAD .......... (7)
A → CaB / CbAB .......... (8)
Step 4:
Replace AD and AB with new CAD and CAB variables.
This is achieved by incorporating the grammar of the following two recent productions—
CAD → AD .......... (9)
CAB →AB .......... (10)
The productions (7) and (8) are now altered to—
S → CaCAD .......... (11)
A → CaB / CbCAB .......... (12)
Step 5:
The resulting grammar is—from (1), (2), (5), (6), (9), (10), (11) and (12),
S CaCAD CaCAD
A → Ta→i / CbCAB
B → b b
D to d
Ca an a
Cb → b b
CAD → ADD
CAB to AB
This grammar is in the normal Chomsky form.
Example 2
Consider the following CFG: S → aXbXX
X → aY | bY | ε
Y → X | c
ANSWER
Step 1:
The vector X is nullable, and thus Y is nullable. After ε has been deleted, we get:
S → aXbX | abX | aXb | ab XbX | ab
X → aY | bY | a | b
Y → X | c
Step 2:
After the development unit Y → X is deleted, we get:
S → aXbX | abX | aXb | ab XbX | ab
X → aY | bY | a | b
Y → aY | bY | a | b | c | c
Step 3:
Now, sever S's RHSs; and replace a with A, b with B and c with C, wherever the units are not:
S → EF | AF | EB | AB EF | AF | AB
X for AY | BY | a | b
Y → AY | BY | a | b | c
E → AX
F → BX
A → a
B → b
C → c