Recitation 3 latex source
This file is provided to help students create crib sheets
in latex if they wish. Some formatting was removed for clarity.
It is missing some macros, but you can figure out what they are
by looking at the pdf image. It is not a working latex file.
\chapter{Recitation 03: CFLs, TMs, T-recognizability, Decidability}
In today's recitation, we will review some things about CFLs and gain more
practice working with Turing Machines. First we will review closure properties for CFLs and do an example problem with the pumping lemma for CFLs, then we'll review the TM variants that
we saw in class this week and finally we'll work through a few examples for showing
that a language is Turing-decidable. We will also learn about closure properties
for Turing recognizable and decidable languages.
\section{Context Free Languages (CFLs)}
We saw in class that a CFL is a language generated by a CFG (context-free grammar) or, equivalently, a language recognized by a PDA (pushdown automaton).
\begin{theorem}
(CFL Closure Properties) Let $A$ and $B$ be CFLs. Then $A \cup B$, $A \circ B$ and $A^*$ are CFLs
(closure under $\cup$, $\circ$, and $*$).
\end{theorem}
\begin{proof}
Let $A$ and $B$ be recognized by PDAs $P_A$ and $P_B$, or generated by grammars with start variables $G_A$ and $G_B$, respectively.
\begin{itemize}
\item For $\cup$, we can use the non-determinism of a PDA to guess which language the input is from and run $P_A$ or $P_B$ on the input. Or using CFGs, we can construct a grammar with an initial rule $S \rightarrow G_A \mid G_B$.
\item For $\circ$, we can non-deterministically guess where to split the string, run $P_A$ on the first portion and $P_B$ on the second, and accept if both accept. Or using CFGs, construct a grammar with $S \rightarrow G_A G_B$.
\item For $*$, we can non-deterministically guess where the string splits into segments and run $P_A$ on each segment. Or using CFGs, make $S \rightarrow S G_A \mid \varepsilon$.
\end{itemize}
\end{proof}
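As a concrete illustration of the grammar constructions in the proof, suppose (a made-up example) $G_A$ generates $\{a^nb^n \mid n \geq 0\}$ with start variable $S_A$, and $G_B$ generates $\{c^md^m \mid m \geq 0\}$ with start variable $S_B$. Each construction keeps all the rules of $G_A$ and $G_B$ (with disjoint variables) and adds one new start rule:

```latex
\begin{align*}
  S_A &\rightarrow a\,S_A\,b \mid \varepsilon
  &
  S_B &\rightarrow c\,S_B\,d \mid \varepsilon
  \\
  S_\cup &\rightarrow S_A \mid S_B
  &&
  \text{(union)}
  \\
  S_\circ &\rightarrow S_A S_B
  &&
  \text{(concatenation)}
  \\
  S_* &\rightarrow S_*\, S_A \mid \varepsilon
  &&
  \text{(star)}
\end{align*}
```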
Note that CFLs are not closed under $\cap$ or complement, as you will prove in Pset 2 problem 1b. One helpful fact, however, is that a CFL $\cap$ a regular language is still a CFL.
Next, let's look at an example problem using the pumping lemma for CFLs. First, recall the pumping lemma for CFLs.
\begin{theorem}
(Pumping Lemma for CFLs) If $A$ is a CFL, then there exists a pumping length $p$ such that every $s \in A$ with $|s| \geq p$ can be written as $s = uvxyz$ where
\begin{enumerate}
\item $uv^ixy^iz \in A \;\;\forall i \geq 0$
\item $|vy| > 0$
\item $|vxy| \leq p$
\end{enumerate}
\end{theorem}
Now let's apply it in a proof by contradiction.
\begin{example}
Show that $A = \{ a^i b^j c^k \mid i > j > k \}$ is not a CFL.
\end{example}
\begin{proof}
Suppose that $A$ is a CFL. Then the pumping lemma applies and there exists a pumping length $p$. Take the string $s = a^{p+2}b^{p+1}c^p$. Clearly, $s \in A$ and $|s| = 3p + 3 \geq p$.
Since $|vxy| \leq p$, $vxy$ cannot contain both $a$'s and $c$'s, leaving two cases.
First, suppose $vxy$ contains no $c$'s. Pump down and consider $uv^0xy^0z = uxz$, which removes at least one $a$ or $b$. If any $b$ was removed, then $uxz$ has at most $p$ $b$'s but exactly $p$ $c$'s, violating $j > k$. Otherwise only $a$'s were removed, so $uxz$ has at most $p+1$ $a$'s but exactly $p+1$ $b$'s, violating $i > j$. Either way, $uxz \notin A$.
Next, suppose $vxy$ contains no $a$'s. Pump up and consider $uv^2xy^2z$, which adds at least one $b$ or $c$. If any $b$ was added, then $uv^2xy^2z$ has at least $p+2$ $b$'s but exactly $p+2$ $a$'s, violating $i > j$. Otherwise only $c$'s were added, so it has at least $p+1$ $c$'s but exactly $p+1$ $b$'s, violating $j > k$. Either way, $uv^2xy^2z \notin A$.
These cases cover all possible decompositions $uvxyz$, contradicting the pumping lemma, so $A$ cannot be a CFL.
\end{proof}
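The case analysis above can be sanity-checked mechanically. Here is a small Python sketch; the membership test in_A and the choice $p = 4$ are our own, purely for illustration:

```python
def in_A(s):
    """Membership test for A = { a^i b^j c^k : i > j > k }."""
    i = len(s) - len(s.lstrip("a")); s = s.lstrip("a")
    j = len(s) - len(s.lstrip("b")); s = s.lstrip("b")
    k = len(s) - len(s.lstrip("c")); s = s.lstrip("c")
    return s == "" and i > j > k   # s must be exhausted: a's, then b's, then c's

p = 4  # a hypothetical pumping length, just for the demonstration
s = "a" * (p + 2) + "b" * (p + 1) + "c" * p
assert in_A(s)

# Case 1: vxy contains no c's (say v = "a", y = "b"); pumping down fails.
assert not in_A("a" * (p + 1) + "b" * p + "c" * p)
# Case 2: vxy contains no a's (say v = "b", y = "c"); pumping up fails.
assert not in_A("a" * (p + 2) + "b" * (p + 2) + "c" * (p + 1))
```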
\section{Turing Machine variants}
We saw in class that there are several equivalent variants of Turing machines. We can use any of
these equivalent formulations when proving that a language is Turing-recognizable.
\begin{theorem}
The following machines are all equivalent in computational power to single-tape,
deterministic TMs.
\marginnote{%
To review the proofs that all of these are equivalent,
see \textbf{Section 3.2} of the textbook.}
\begin{enumerate}
\item Nondeterministic TMs (NTM)
\marginnote{%
Remember that an NTM accepts if \emph{at least one} branch of its computation accepts.}
\item Multi-tape TMs
\item Enumerators (i.e.\ a two-tape TM where the second tape is a printer)
\end{enumerate}
\end{theorem}
Another way to formulate the above theorem is that the following four statements are equivalent
(so there is an ``if and only if'' relationship between each pair of these statements):
\begin{enumerate}
\item Language \(B\) is Turing-recognizable.
\item \(B = L(N)\) for some NTM \(N\).
\item \(B = L(M)\) for some multi-tape TM \(M\).
\item \(B = L(E)\) for some enumerator \(E\).
\end{enumerate}
There is a similar theorem for variants of Turing deciders.
\begin{theorem}
The following machines are all equivalent in computational power to single-tape,
deterministic Turing deciders.
\begin{enumerate}
\item Nondeterministic Turing deciders.
\marginnote{%
A nondeterministic Turing decider must halt on \emph{every branch} of its computation.
It accepts if \emph{at least one} branch of its computation accepts.}
\item Multi-tape Turing deciders.
\item Enumerators that enumerate strings in lexicographic order.
\marginnote{%
\#3 is proved as problem 4 in Pset 2 (a language is decidable if and only if there is an enumerator that enumerates it in lexicographic
order).
\end{enumerate}
\end{theorem}
\begin{example}
Prove that there is an enumerator $E$ that generates a language if and only if there is a Turing machine $M$ that recognizes it. (This is part of the equivalence stated above.)
\end{example}
\begin{solution}
The standard way to solve this kind of problem is to use each of the machines to construct the other. To construct $M$ using $E$, we have $M$ run $E$ and compare every string that $E$ prints against the input; if $E$ ever prints the input, $M$ accepts. ($M$ rejects all strings not in the language by looping.)
The main subtlety is in using $M$ to construct an enumerator; this requires two standard tricks:
\begin{enumerate}
\item Simulating multiple instances of $M$ running on different input in a single machine, by repeatedly running a step in each. (to avoid getting stuck in one that never halts). This is similar to the proof that none-deterministic machines can not recognize that deterministic Turing machine can not recognize. But as a reminder it goes as follows:
\begin{enumerate}
\item Separate your tape into finite regions separated by $\#$s so that each machine gets a region. If a machine tries to read outside is region while being simulated, we exit simulation and shift everything to the right to grow it's the region.
\item Store the current state of the simulated machine at the beginning of each region.
\item Denoted where the simulated head is by putting a dot on it's position. Each region should have one and only one dot.
\end{enumerate}
\item Doing the above, but adding a new thread simulating $M$ on a new word after each iterations on all the simulated words. So that every machine gets an infinite number of steps. One any of the simulated machines accepts, it's string is printed.
\end{enumerate}
\end{solution}
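To make the second trick concrete, here is a minimal Python sketch of the dovetailing loop. The step-bounded predicate accepts_within(w, t), standing in for ``simulate $M$ on $w$ for $t$ steps and report whether it has accepted,'' replaces the real tape-region bookkeeping; the toy recognizer at the end is our own example:

```python
def strings(alphabet="ab"):
    """Yield all strings over `alphabet` in length order."""
    frontier = [""]
    while True:
        for w in frontier:
            yield w
        frontier = [w + c for w in frontier for c in alphabet]

def enumerate_language(accepts_within, alphabet="ab", rounds=6):
    """Dovetailing: in round t, start one new simulation and give every
    running simulation a step budget of t."""
    gen = strings(alphabet)
    pool, printed = [], []
    for t in range(1, rounds + 1):
        pool.append(next(gen))              # one new input per round
        for w in pool:
            if w not in printed and accepts_within(w, t):
                printed.append(w)           # "print" w
    return printed

# Toy stand-in recognizer: balanced a's and b's, accepting after len(w)+1 steps.
demo = lambda w, t: len(w) < t and w.count("a") == w.count("b")
print(enumerate_language(demo))             # ['', 'ab', 'ba']
```

A real enumerator runs rounds forever; we cap at a fixed number of rounds only so the demonstration terminates.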
Next, let's work through an example showing that TMs are also computationally equivalent to a ``broken''
TM where trying to move left would move the tape head all the way to the left.
\begin{example}
Let a \textbf{left-reset TM} be a TM, but when the transition function returns an \(L\), the TM
actually goes all the way to the left instead of moving left by one cell. Then left-reset TMs are
equivalent to standard TMs.
\end{example}
\begin{solution}
We can prove that left-reset TMs are equivalent to standard TMs by showing that each can simulate the other. A standard TM can simulate a left-reset TM directly (a reset is just a sequence of left moves until the left end of the tape). For the converse, since the only difference in left-reset TMs is the moving-left operation,
it is enough to show that left-reset TMs can simulate moving left by one cell.
There are two ways of showing this:
\begin{enumerate}
\item
To move left by one cell, the left-reset TM shifts every symbol right by one.
More specifically, the left-reset TM does the following to move left one cell:
\begin{enumerate}
\item Mark the current cell.
\item Move all the way to the left.
\item Shift every symbol on the tape right by one cell,
taking care to not actually shift the mark.
\item Move all the way to the left.
\item Go right until we reach the mark.
\end{enumerate}
Another detail to be careful about is how the left-reset TM knows where the tape symbols end.
This can be resolved by keeping a mark at the rightmost cell ever visited.
\item
To move left by one cell, the left-reset TM makes a mark on the current cell, and advances
using another mark slowly to get to the cell before the first marked cell.
More specifically, the left-reset TM does the following to move left one cell:
\begin{enumerate}
\item Mark the current cell with a \(\bullet\).
\item Go all the way to the left and mark the leftmost cell with a \(\star\).
\item Repeat the following loop:
\begin{enumerate}
\item Go to the left and scan right until finding the \(\star\).
\item Move right one cell.
\item If the cell has a \(\bullet\), remove it and go back to the \(\star\) and change it to a \(\bullet\) exit the loop. Otherwise, the
\(\star\) is not left of the original cell. Advance the \(\star\) by one cell.
\end{enumerate}
\item Go to the \(\star\) and remove it. The left-reset TM should be on the cell one left of
the original cell.
\end{enumerate}
\end{enumerate}
\end{solution}
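The first approach can be sketched in Python, modeling the tape as a list whose head has only two primitives, right() and reset(). The class and helper names are our own, purely illustrative:

```python
BLANK = "_"

class LeftResetTape:
    """A tape whose head can only move right or reset to the left end."""
    def __init__(self, contents):
        self.tape = list(contents) or [BLANK]
        self.head = 0
        self.rightmost = len(self.tape) - 1   # mark of rightmost cell visited

    def right(self):
        self.head += 1
        if self.head == len(self.tape):
            self.tape.append(BLANK)
        self.rightmost = max(self.rightmost, self.head)

    def reset(self):
        self.head = 0

def move_left(t):
    """Simulate one ordinary left move using only right() and reset()."""
    if t.head == 0:
        return                                  # at the left edge, stay put
    t.tape[t.head] = ("mark", t.tape[t.head])   # (a) mark the current cell
    t.reset()                                   # (b) go all the way left
    carry, end = BLANK, t.rightmost             # (c) shift symbols right by
    while t.head <= end:                        #     one, keeping the mark
        sym = t.tape[t.head]                    #     in place
        if isinstance(sym, tuple):
            t.tape[t.head] = ("mark", carry)
            carry = sym[1]
        else:
            t.tape[t.head] = carry
            carry = sym
        t.right()
    t.tape[t.head] = carry
    t.reset()                                   # (d) back to the left
    while not isinstance(t.tape[t.head], tuple):
        t.right()                               # (e) scan right to the mark
    t.tape[t.head] = t.tape[t.head][1]          # unmark; the head now reads
                                                # the symbol formerly one cell
                                                # to its left

t = LeftResetTape("abc")
t.right(); t.right()                 # head on 'c'
move_left(t)
assert t.tape[t.head] == "b"
move_left(t)
assert t.tape[t.head] == "a"
```

The point of the trick: the whole tape slides right under a stationary mark, which is equivalent to the head moving left by one cell.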
\section{Hint for PSet 2 \#5}
This PSet problem is difficult, so we will go over a special case to help.
\begin{proposition}
Let
\[
C =
\set{\angles{p} \mid \text{\(p\) is a multivariable polynomial where
\(p(x_1, x_2, \dots, x_n) = 0\) has an integer solution}}.
\]
Then there is some decidable \(D\) such that
\(C = \set{\angles{p} \mid \exists y,\, \angles{p, y} \in D}\).
\end{proposition}
\begin{solution}
First, note that \(C\) is Turing-recognizable because we can test all possible integer tuples to
see if \(p\) has that tuple as a root.
In order to construct some decidable \(D\), the intuition is that we want \(y\)
to be an extra piece of information that helps us decide if \(\angles{p} \in C\).
The extra piece of information that will help us here is the integer root of \(p\).
In particular, we have
\[
D =
\set{
\angles{p, (x_1, \dots, x_n)} \mid
\text{%
\(p\) is a polynomial,
\(x_1, \dots, x_n\) are integers, and
\(p(x_1, \dots, x_n) = 0\)%
}
}.
\]
We can see \(C = \set{\angles{p} \mid \exists y,\, \angles{p, y} \in D}\),
and \(D\) is decidable: given \(\angles{p, (x_1, \dots, x_n)}\), evaluate \(p\) at the point and check whether the result is \(0\), which always halts.
\end{solution}
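The decidability of \(D\) really is just polynomial evaluation. Here is a small Python sketch; the sparse monomial encoding, the helper names, and the example polynomial are our own illustration:

```python
from itertools import product
from math import prod

def eval_poly(terms, point):
    """Evaluate a polynomial given as [(coeff, (e1, ..., en)), ...]."""
    return sum(c * prod(x ** e for x, e in zip(point, exps))
               for c, exps in terms)

def in_D(terms, point):
    """Decider for D: does p vanish at this integer point?  Always halts."""
    return all(isinstance(x, int) for x in point) and eval_poly(terms, point) == 0

def search_C(terms, bound):
    """Recognizer-style search for an integer root of p.  A true recognizer
    for C searches all integer tuples forever (and so may loop); here we
    truncate the search at |x_i| <= bound so the demo halts."""
    n = len(terms[0][1])
    for point in product(range(-bound, bound + 1), repeat=n):
        if in_D(terms, point):
            return point
    return None

# p(x, y) = x^2 - y - 3, which has integer roots, e.g. (2, 1).
p = [(1, (2, 0)), (-1, (0, 1)), (-3, (0, 0))]
assert in_D(p, (2, 1)) and not in_D(p, (0, 0))
root = search_C(p, bound=3)
assert root is not None and eval_poly(p, root) == 0
```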
\section{T-recognizable and decidable languages}
\begin{figure}\label{turing-decidable-venn-diagram}
\caption{Examples of Turing decidable and Turing recognizable languages from class.}
\begin{tikzpicture}
\draw[] (-1, -2.2) rectangle(10, 7.5);
\node[at={(0.5, 7)}] {\textbf{All languages}};
\node[at={(3, 6.5)}] {\(EQ_{\textsf{CFG}}\)};
\node[at={(5, 6.5)}] {\(\overline{A_{\textsf{TM}}}\)};
\draw[] (4.5,2) ellipse[x radius=4.5, y radius=4];
\node[at={(4.5,5.4)}] {\textbf{Turing recognizable}};
\node[at={(4,4.9)}] {\(A_{\textsf{TM}}\)};
\draw[] (5,1.6) ellipse [x radius=3.5, y radius=3];
\node[at = {(5,4)}] {\textbf{Turing decidable}};
\node[at={(3.5, 3.5)}] {\(A_{\textsf{DFA}}\)};
\node[at={(3.5, 3)}] {\(A_{\textsf{NFA}}\)};
\node[at={(5, 3.5)}] {\(E_{\textsf{DFA}}\)};
\node[at={(5, 3)}] {\(EQ_{\textsf{DFA}}\)};
\node[at={(6.5, 3.5)}] {\(A_{\textsf{CFG}}\)};
\node[at={(6.5, 3)}] {\(E_{\textsf{CFG}}\)};
\node[at={(3.5, 2.25)}] {\(\{0^k 1^k 2^k \mid k \geq 0\}\)};
\node[at={(6.5, 2.25)}] {\(\{ww \mid w \in \{0, 1\}^*\}\)};
\draw[] (5,0.4) ellipse [x radius=2, y radius=1.2];
\node[at = {(5,1.3)}] {\textbf{CFLs}};
\draw[] (5,0.2) ellipse [x radius=1.6, y radius=0.7];
\node[at = {(5,0.2)}] {\textbf{Regular languages}};
\end{tikzpicture}
\end{figure}
Turing machines are stronger than the other models of computation we have seen in class
(such as finite automata or PDAs).
In fact, the \textbf{Church-Turing thesis} says that any real-world
algorithm can be computed by a Turing machine.
\marginnote{%
For a quick review of Turing machines and the definitions of Turing recognizable and Turing
decidable, read \textbf{Section 3.1} of the
textbook.}
To show that a language \(B\) is \textbf{Turing-recognizable}, we will want to construct a TM \(M\)
that recognizes \(B\). This means that \(M\) accepts every string in \(B\), and either enters the
reject state or loops forever on strings not in \(B\).
To show that a language \(B\) is \textbf{Turing-decidable}, we will want to construct a TM \(M\)
that \emph{decides} \(B\). This means that \(M\) accepts every string in \(B\), and halts in the
reject state on strings not in \(B\). We will need to make sure that \(M\) always halts,
and does not loop forever on any input.
Note that every Turing-decidable language is Turing-recognizable,
since every Turing decider is a TM.
We have seen several examples of Turing decidable languages in class,
listed in \cref{turing-decidable-venn-diagram}.
We can use the deciders for these languages as subroutines to
prove that other languages are decidable. We will see this in the examples. We have also seen one example of a language that is Turing recognizable but not decidable:
\begin{align*}
A_{TM} = \{\angles{M, w} \mid \text{\(M\) is a TM and \(M\) accepts \(w\)}\}.
\end{align*}
There are also some languages such as \(EQ_{CFG}\) and \(\overline{A_{TM}}\) that are not
T-recognizable. We will learn more about undecidable and unrecognizable languages next week.
Here are some example problems for proving a language is Turing-decidable.
\begin{example}
Show that
\[
\set{\angles{M, w} \mid \text{\(M\) is a TM such that \(M\) on \(w\) moves left at some point}}
\]
is decidable.
\end{example}
\begin{solution}
The idea is that we can build a decider that simulates \(M\) on \(w\) for some finite number of
steps, checking to see if \(M\) ever moves left. The key observation is that if \(M\) only moves
right for enough steps, the decider can be sure that \(M\) never moves left: once \(M\) has read past the input, it can only read blank tape symbols, so it must eventually loop.
How many steps are enough? Even after reading the entirety of \(w\), \(M\) could still decide to move left some finite number of steps later. However, once \(M\) has spent more steps than the number of states in its control while reading blanks, it must have repeated a state, and it will keep doing the same thing forever.
We build the following decider:
\begin{mdframed}
On input \(\angles{M, w}\):
\begin{enumerate}
\item Simulate \(M\) on \(w\) for \(\mathsf{length}(w)\) steps.
If \(M\) ever moved left, then \textsc{accept}.
Otherwise, we know that \(M\) is now at the blank portion of the tape.
\item Simulate \(M\) for \(\size{Q_M}\) more steps, where \(\size{Q_M}\) is the number of
states in \(M\). If \(M\) ever moved left, then \textsc{accept}.
Otherwise, we know that \(M\) must have repeated a state,
and it only moved right over blanks between the two occurrences of that state.
Since the rest of the tape is blank,
\(M\) will keep moving right through the same sequence of states.
In particular, \(M\) will never move left, so we \textsc{reject}.
\end{enumerate}
\end{mdframed}
\end{solution}
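The decider can be sketched directly in Python. A TM is encoded as a transition table mapping (state, symbol) to (state, symbol, direction); this encoding is our own simplification (there are no explicit accept/reject states, so a missing transition means \(M\) halts):

```python
BLANK = "_"

def moves_left(tm, start, w):
    """Decide whether the TM ever moves left when run on w, by simulating
    len(w) + |Q| steps.  tm: {(state, sym): (state, sym, 'L' or 'R')}."""
    states = {q for q, _ in tm} | {q for q, _, _ in tm.values()}
    tape = dict(enumerate(w))
    q, head = start, 0
    for _ in range(len(w) + len(states)):
        key = (q, tape.get(head, BLANK))
        if key not in tm:
            return False                # M halted without ever moving left
        q, tape[head], d = tm[key]
        if d == "L":
            return True
        head += 1
    return False    # only right moves over blanks: M has repeated a state
                    # and will move right forever

# Moves right to the first blank, then steps left once.
turner = {("q0", "a"): ("q0", "a", "R"), ("q0", BLANK): ("qa", BLANK, "L")}
# Moves right forever, never turning.
runner = {("q0", "a"): ("q0", "a", "R"), ("q0", BLANK): ("q0", BLANK, "R")}

assert moves_left(turner, "q0", "aa")
assert not moves_left(runner, "q0", "aa")
```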
\section{Closure properties for T-recognizable and T-decidable languages}
Turing-decidable languages are closed under union, concatenation, star, intersection, and
complement.
Turing-recognizable languages are closed under union, concatenation, star, and intersection. Here is
a counterexample showing that T-recognizable languages are \emph{not} closed under complement:
\(A_{TM}\) is T-recognizable, but we will see later in class that \(\overline{A_{TM}}\) is
T-unrecognizable.
\begin{theorem}
Turing-decidable languages are closed under union, concatenation, star, intersection, and
complement.
\end{theorem}
\begin{proof*}
Suppose \(A\) and \(B\) are Turing-decidable languages. Then there exists a TM \(M_A\) that
decides \(A\) and a TM \(M_B\) that decides \(B\). Since \(M_A\) and \(M_B\) are deciders, they
are guaranteed to halt on any input (either accepting or rejecting by halting).
\textbf{Union.} We construct a Turing decider \(M_\union\) for \(A \union B\).
\begin{mdframed}
On input string \(w\):
\begin{enumerate}
\item Run \(M_A\) on \(w\).
\item Run \(M_B\) on \(w\).
\item \textbf{Accept} if either \(M_A\) or \(M_B\) accepts. \textbf{Reject} if both reject.
\end{enumerate}
\end{mdframed}
\textbf{Concatenation.} We construct a Turing decider \(M_\circ\) for \({A \circ B}\).
\begin{mdframed}
On input \(w\):
\begin{enumerate}
\item Nondeterministically guess where we split \(w\) into two strings: \(w = w_1 w_2\).
\item Run \(M_A\) on \(w_1\).
\item Run \(M_B\) on \(w_2\).
\item \textbf{Accept} if on some branch of the computation, both \(M_A\) and \(M_B\) accept.
\textbf{Reject} otherwise.
\end{enumerate}
\end{mdframed}
\textbf{Star.} We construct a Turing decider \(M_*\) for \(A^*\).
\begin{mdframed}
On input \(w\):
\begin{enumerate}
\item Nondeterministically guess the number \(k\) of partitions.
Then guess where we split \(w\) into \(k\) strings: \(w = w_1 w_2 \dots w_k\).
\item Sequentially run \(M_A\) on \(w_1\), \(w_2\), \dots, \(w_k\).
\item \textbf{Accept} if on some branch of the computation, \(M_A\) accepts on all the
strings. \textbf{Reject} otherwise.
\end{enumerate}
\end{mdframed}
\textbf{Intersection.} We construct a Turing decider \(M_\intersect\) for \({A \intersect B}\).
\begin{mdframed}
On input string \(w\):
\begin{enumerate}
\item Run \(M_A\) on \(w\).
\item Run \(M_B\) on \(w\).
\item \textbf{Accept} if both \(M_A\) and \(M_B\) accept. \textbf{Reject} if either rejects.
\end{enumerate}
\end{mdframed}
\textbf{Complement.} We construct a Turing decider \(M'\) for \(\overline{A}\).
\begin{mdframed}
On input string \(w\):
\begin{enumerate}
\item Run \(M_A\) on \(w\).
\item \textbf{Accept} if \(M_A\) rejects. \textbf{Reject} if \(M_A\) accepts.
\end{enumerate}
\end{mdframed}
The above algorithm for \(M'\) would not work if \(A\) were Turing-recognizable but not decidable,
because \(M_A\) might reject \(w\) by looping forever. Then \(M'\) would also loop forever and never accept
\(w\), even though \(w\) is in \(\overline{A}\).
\end{proof*}
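The five constructions in the proof can be mirrored in Python, representing a decider as a total function from strings to bool; deterministically trying every split index plays the role of the nondeterministic guess. All names here are our own illustration:

```python
def union(dA, dB):
    return lambda w: dA(w) or dB(w)

def intersection(dA, dB):
    return lambda w: dA(w) and dB(w)

def complement(dA):
    return lambda w: not dA(w)     # only sound because dA always halts

def concat(dA, dB):
    # Try every split of w -- the deterministic stand-in for guessing one.
    return lambda w: any(dA(w[:i]) and dB(w[i:]) for i in range(len(w) + 1))

def star(dA):
    def d(w):
        if w == "":
            return True            # the empty string is always in A*
        return any(dA(w[:i]) and d(w[i:]) for i in range(1, len(w) + 1))
    return d

dA = lambda w: set(w) <= {"a"}     # decider for a*
dB = lambda w: set(w) <= {"b"}     # decider for b*
assert union(dA, dB)("aaa") and not union(dA, dB)("ab")
assert concat(dA, dB)("aabb") and not concat(dA, dB)("aba")
assert star(lambda w: w in {"a", "ab"})("aab")
assert complement(dA)("ab") and not complement(dA)("aa")
```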
\begin{theorem}
Turing-recognizable languages are closed under union, concatenation, star, and intersection.
\end{theorem}
\begin{proof*}
Suppose \(A\) and \(B\) are Turing-recognizable languages. Then there exists a TM \(M_A\) that
recognizes \(A\) and a TM \(M_B\) that recognizes \(B\). Since \(M_A\) and \(M_B\) are TMs, they
can accept, reject by halting, or reject by looping.
\textbf{Union.}
We construct a TM \(M_\union\) that recognizes \(A \union B\). We need to slightly modify
our proof for Turing-decidable languages. Since \(M_A\) or \(M_B\) might reject by looping
forever, we have to run the two machines in parallel on the input (rather than sequentially).
\begin{mdframed}
On input string \(w\):
\begin{enumerate}
\item Run \(M_A\) and \(M_B\) on \(w\) in parallel.
\item \textbf{Accept} if either \(M_A\) or \(M_B\) accepts. \textbf{Reject} otherwise.
\end{enumerate}
\end{mdframed}
\textbf{Concatenation.}
Same algorithm as the proof for Turing-decidable languages. We can run the machines sequentially
as before. Note that if \(M_A\) or \(M_B\) (or both) reject by looping, then the new machine also
rejects by looping (in that branch). This is okay because in any given branch, we want to reject
if either machine rejects on its section of \(w\).
\textbf{Star.}
Same algorithm as the proof for Turing-decidable languages. We can run \(M_A\) sequentially on the
strings \(w_1, w_2, \dots w_k\) as before. Note that if \(M_A\) rejects by looping on one of the
strings \(w_i\), then the new machine rejects the whole string \(w=w_1 w_2 \dots w_k\) by looping
(in that branch). This is okay because in any given branch, we want to reject if \(M_A\) rejects
any of \(w_1\), \(w_2\), \(\dots\), \(w_k\).
\textbf{Intersection.}
Same algorithm as the proof for Turing-decidable languages. Note that if \(M_A\) or \(M_B\) (or
both) reject by looping, then our machine \(M_\cap\) would also reject by looping. This is okay
because we want to reject if \(M_A\) or \(M_B\) rejects anyway.
\end{proof*}
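The ``run in parallel'' trick for recognizers can also be sketched with step budgets: rather than running \(M_A\) to completion (it might loop), alternate between the two machines with ever-larger budgets. The step-bounded interface accepts_within(w, n) is our stand-in for a real simulation, and the toy recognizers are our own examples:

```python
def union_recognizer(recA, recB, max_budget=100):
    """Recognizer for the union of two T-recognizable languages.
    recX(w, n) answers: does machine X accept w within n steps?
    A true recognizer increases n forever; we cap the budget at
    max_budget only so the demo terminates on non-members."""
    def rec(w):
        for n in range(1, max_budget + 1):
            if recA(w, n) or recB(w, n):
                return True         # accept as soon as either branch does
        return None                 # undetermined: a real recognizer loops
    return rec

# Toy step-bounded recognizers: each "accepts" after len(w) + 1 steps.
recA = lambda w, n: n > len(w) and set(w) <= {"a"}     # recognizes a*
recB = lambda w, n: n > len(w) and set(w) <= {"b"}     # recognizes b*

rec = union_recognizer(recA, recB)
assert rec("aaa") is True
assert rec("bb") is True
assert rec("ab") is None    # the real machine would loop here
```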
The following table summarizes important closure properties for the classes of languages we have
learned about.
\begin{center}
\begin{tabular}{r c c c c c c}
\toprule
class & \(A^*\) & \(A \circ B\) & \(A \union B\) & \(A \intersect B\) & \(\overline A\) & \(A \setminus B\) \\
\midrule
regular & yes & yes & yes & yes & yes & yes \\
context-free & yes & yes & yes & no\(^\dagger\) & no & no \\
decidable & yes & yes & yes & yes & yes & yes \\
recognizable & yes & yes & yes & yes & no & no \\
\bottomrule
\end{tabular}
\smallskip
{\footnotesize \(\dagger\): yes if intersected with a regular language}
\end{center}