This section will be about context-free grammars and languages. It will also feature the notion of ambiguous context-free grammars. I will present the definition of context-free grammars and languages. I will also discuss what it means for them to be ambiguous. Lots of exercises about creating CFGs to generate languages can be found at the end of this section. You can find some notes related to this section here. Please note that I don’t follow these notes as closely as previous sections, but you might still want to refer to them as an extra resource.

What is a context-free grammar?

So far we have seen that FAs accept or recognize languages and that regular expressions denote or describe them. What about generation? Is there a mechanism that generates strings, rather than accepting them? Yes, absolutely! These objects are known as grammars and I introduce a special type of grammar, context-free grammars, in the next video.

Derivation order and ambiguity

You might have noticed that there are grammars \(G\) that can generate the same string \(w\) in multiple different ways. This notion of having multiple ways of generating strings is formalized by introducing the concepts of derivation order and ambiguity. Another closely related concept is that of parse trees.

Inherently ambiguous languages

Ambiguity is a property of grammars because, in many cases, it’s possible to disambiguate a grammar by eliminating its alternate paths to generating the same string. Is this always possible? No! In some cases, it’s the languages which are inherently ambiguous, i.e. regardless of what context-free grammar \(G\) we create to generate \(L\), \(G\) will be ambiguous. More in the following video!

Exercises

Exercise. Give a context-free grammar \(G\) that generates the language \(L(a^*b^*)\).

Exercise. What language is be generated by the following grammar \(G = (\{S, A\}, \{a, b\}, S, P)\), where \(P\) is \begin{align*} S & \rightarrow aS | baA | abaA \\ A & \rightarrow aAbb | aAb | \varepsilon \end{align*}

Exercise. Give a grammar \(G\) that generates the language \(L = \{ ww^R : w \in \{a,b\}^*\}\). Is this language context-free?

Exercise. Design a context-free grammar \(G\) that generates the language \(L = \{ w \in \{a,b\}^* : n_a(w) = n_b(w)\}\) and prove the correctness of your grammar.

Exercise. Give a context-free grammar \(G\) that generates the language \(L = \{vwvw^R : v,w \in \{a,b\}^*, |v| = 2 \}\).

Exercise. Give a context-free grammar \(G\) that generates the language \(L = \{uawb : u,w \in \{a,b\}^*, |u| = |w| \}\).

Exercise. Design a context-free grammar \(G\) that generates the language \(L = \{ w \in \{a,b,c\}^* : n_a(w) + n_c(w) = n_b(w)\}\).

Exercise. We did not discuss other types of grammars beyond context-free grammars. Consult these slides about the different types of grammars. Answer the questions on slides 14, 15 and 17 about regular grammars. Can regular grammars be ambiguous?