| # | Expression | Truth table | Notations and name |
|--:|:--|:--|:--|
| 0 | $x \land \lnot x$ | `0000` | $\bot$; *contradiction* |
| 1 | $x \land y$ | `0001` | *conjunction* (**AND**) |
| 2 | $x \land \lnot y$ | `0010` | $x \not\to y$, $x \not\supset y$, $\lnot(x \to y)$, $\lnot(x \supset y)$; *nonimplication*, *abjunction* |
| 3 | $x$ | `0011` | *identity* of $x$ |
| 4 | $\lnot x \land y$ | `0100` | $x \not\leftarrow y$, $x \not\subset y$, $\lnot(x \leftarrow y)$, $\lnot(x \subset y)$; *converse-nonimplication* |
| 5 | $y$ | `0101` | *identity* of $y$ |
| 6 | $\lnot(\lnot x \land \lnot y) \land \lnot(x \land y)$ | `0110` | $x \oplus y$, $x \not\leftrightarrow y$, $x \not\equiv y$; *exclusive disjunction* (**XOR**) |
| 7 | $\lnot(\lnot x \land \lnot y)$ | `0111` | $x \lor y$; *disjunction* (**OR**) |
| 8 | $\lnot x \land \lnot y$ | `1000` | $x \downarrow y$; *negation of disjunction* (**NOR**) |
| 9 | $\lnot(\lnot(\lnot x \land \lnot y) \land \lnot(x \land y))$ | `1001` | $x \leftrightarrow y$, $x \equiv y$; *biconditional*, *equality* (**XNOR**) |
| 10 | $\lnot y$ | `1010` | *negation* of $y$ |
| 11 | $\lnot(\lnot x \land y)$ | `1011` | $x \leftarrow y$, $x \subset y$; *converse-implication* |
| 12 | $\lnot x$ | `1100` | *negation* of $x$ |
| 13 | $\lnot(x \land \lnot y)$ | `1101` | $x \to y$, $x \supset y$; *implication* |
| 14 | $\lnot(x \land y)$ | `1110` | $x \uparrow y$; *negation of conjunction* (**NAND**) |
| 15 | $\lnot(x \land \lnot x)$ (alternatively, $\lnot(y \land \lnot y)$) | `1111` | $\top$; *tautology* |

With 2 input bits ($x$ and $y$) and 2 elementary logic operators ($\land$ and $\lnot$), we can construct infinitely many 2-bit Boolean expressions, but the number of possible truth tables is finite ($2^{2^2} = 16$). If two expressions yield the same truth table, we consider their corresponding logic gates equivalent (and we can replace one by another, usually its simplest form). [QED]

By Claim 5 and Claim 6, we are now convinced that Boolean circuits, as our very first model of computation, are capable of the computation $\mathcal{C}$ for a finite language as we defined it. Claim 6 further says that AND gates and NOT gates are sufficient to construct Boolean circuits for any computation, so even though we have $2^{2^1}=4$ types of 1-bit gates and $2^{2^2}=16$ types of 2-bit gates, if we choose a correct **functionally complete** (or **expressively adequate**) set, i.e., $\{\land, \lnot\}$, two types of gates are sufficient. Using the same technique we applied when justifying Claim 6, we may prove that $\{\lor, \lnot\}$ is also a functionally complete set. Even better, $\{\uparrow\}$ (the "*Sheffer stroke*"), even though it is a singleton containing only the NAND gate, is also a functionally complete set sufficient to implement any Boolean circuit; so is $\{\downarrow\}$ (the "*Peirce arrow*"), the singleton of the NOR gate. There is a deeper reason for this [@pelletier1990post], but I won't go into details since neither logic nor Boolean algebra is my main topic here.

Notice that Claim 6 justifies that there *exists* a Boolean circuit for every computation, but it does not provide much of a clue for constructing such a circuit for a specific computation. However, for most of us this is no rocket science: Given an arbitrary truth table, we write down its disjunctive (or conjunctive) normal form, (optionally) simplify it, and rewrite it using the prescribed functionally complete set of logic gates ($\{\land, \lnot\}$, in our case).
Consider a concrete example:

| $x$ | $y$ | $z$ | Output $b$ |
| - | - | - | -: |
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

\begin{align*}
b &= (\lnot x \land \lnot y \land \lnot z) \lor (x \land \lnot y \land \lnot z) \lor (x \land y \land z) \\
&= \lnot\big(\lnot(\lnot x \land \lnot y \land \lnot z) \land \lnot(x \land \lnot y \land \lnot z)\big) \lor (x \land y \land z) \\
&= \lnot\Big(\big(\lnot(\lnot x \land \lnot y \land \lnot z) \land \lnot(x \land \lnot y \land \lnot z)\big) \land \lnot(x \land y \land z)\Big)
\end{align*}

(In most engineering treatments, notations like $\bar{x}\bar{y}\bar{z} + x\bar{y}\bar{z} + xyz$ are commonly used, but in algebra -- e.g., in a Boolean ring -- the operator "$+$" has a different meaning that is analogous to XOR rather than OR, so we stick with the conventional logic symbols here.)

The best thing about Boolean circuits is that people can implement arbitrary computation with finite input by building sophisticated combinations from only a minimal set of elementary logic gates, so that the production of such circuits can be industrialized -- that's why this model is sometimes called **combinational logic**, although digital circuits may be used for many general-purpose computing tasks rather than just representing "logic". However, there is a price to pay in this design: by allowing only logic gates from a minimal set, the complexity of circuits grows rapidly. Reducing the number of gates required and optimizing the size of the circuit are engineering problems, but from a theoretical perspective they are inherently hard -- there is a family of **circuit complexity** classes and related theories that study them.
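The normal-form construction above is easy to mechanize. Here is a minimal sketch in Python (the helper name `dnf_from_table` is mine, not from the text), which rebuilds the example's circuit as a disjunctive normal form and checks it against the table:

```python
from itertools import product

# The example truth table from the text: b = 1 exactly on 000, 100, 111.
table = {bits: bits in {(0, 0, 0), (1, 0, 0), (1, 1, 1)}
         for bits in product((0, 1), repeat=3)}

def dnf_from_table(table):
    """Build a circuit in disjunctive normal form: one conjunctive
    clause per input row whose output is 1."""
    rows = [row for row, out in table.items() if out]
    def circuit(*bits):
        # A clause fires when every input bit matches its row's pattern.
        return any(all(b == v for b, v in zip(bits, row)) for row in rows)
    return circuit

b = dnf_from_table(table)
# The constructed circuit agrees with the lookup table on every input.
assert all(b(*bits) == table[bits] for bits in product((0, 1), repeat=3))
```

The same loop works for any truth table, which is exactly why the normal-form recipe shows a circuit *exists* for every finite mapping, even if it is far from the smallest one.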
It has been proved that a general computation involving $n$ input bits requires a Boolean circuit of size $\Theta\left(\frac{2^n}{n}\right)$ [@shannon1949synthesis], but it is not known whether a specific computation $\mathcal{C}$ has a superpolynomial lower bound on the sizes of its corresponding Boolean circuits. Why are some circuits so complex? Because *computation can be hard*.

We can think of this unavoidable intractability in an alternative way: computation for a finite language is trivial using a lookup table (which can be seen as an equivalent model of computation); deciding a finite language whose strings have length $n$ requires at most $2^n$ entries in the lookup table for a total mapping, assuming that this mapping (of computation) is completely random. Sometimes we can find a "pattern" in the computation, so we don't need to look at the whole exponentially large table -- we can do a lookup in a polynomial-size or even constant-size sub-table and decide the language efficiently. Sometimes we can't: there are still many computationally hard problems; if we could ever write down a lookup table for such a problem-solution mapping, the solutions would exhibit complete pseudorandomness (just think of a decision problem like "Generate a random bit from any input"). If we try to convert the lookup table to its equivalent Boolean circuit, or even simulate the computation using a much more advanced model (e.g., Turing machine, x86 instructions, programming in Java, ...), easy problems are still easy^[The complexity class of problems decidable by uniform Boolean circuits of polynomial size and polylogarithmic depth is $\mathsf{NC}$. Such problems may also be decided efficiently on modern computers. However, it is an open question whether $\mathsf{NC}=\mathsf{P}$.], while hard problems will always be hard, because our choice of model of computation does not essentially change the algorithmic hardness of a problem; it only changes our view of dealing with that problem.
As shown by the above thought experiment, there is often a trade-off between certain complexities: either we have a big lookup table that exhausts time and space, or we try to build up a circuit with a great number of wires and gates. But the most important conclusions we reached in this section are that:

1. A finite language corresponds to the simplest form of computation.
2. If we can state a problem in a finite language, then it is decidable (logically).
3. Decision problems for finite languages can be solved by Boolean circuits, our first model of computation.
4. Boolean circuits can be constructed using a minimal set of logic gates, e.g., AND gates and NOT gates.

# Information Towards Simulation

Before finishing the introduction to the first model of computation, we recap with some fundamental (but inspiring) questions:

1. **Is our model of computation *sound*? In other words, does it correctly simulate the computation $\mathcal{C}$?**

   Yes. Boolean circuits are sound by construction, as long as Boolean algebra is deterministic.

2. **Is our model of computation *complete*? In other words, can it simulate every computation $\mathcal{C}$ (by deciding a finite language $\mathcal{L}_\mathsf{problem}$)?**

   Yes, as argued by Claim 5. As long as the mapping of the computation $\mathcal{C}$ is total, we can always construct a Boolean circuit that decides the finite language $\mathcal{L}_\mathsf{problem}$.

3. **What is the limitation of Boolean circuits? In other words, what kinds of computation are they incapable of simulating?**

   It shouldn't be too hard to see the fatal issue with Boolean circuits: they only take *finite* inputs. So if we have a language containing an unbounded number of strings, no Boolean circuit will ever be able to decide it, let alone do any useful computation. However, even if we consider only finite languages, a fixed Boolean circuit still has a bounded ability for computation. We show this by proposing some more questions.

4. 
**Does there exist a Boolean circuit that simulates the computation of every possible Boolean circuit?**

   No. This is an easy question, since "the set of all Boolean circuits" is infinite (I would argue that it is in fact *countably infinite*), and clearly the set of all finite languages is infinite as well. As a fun piece of informal thinking: if there were such an omnipotent Boolean circuit that could simulate every other circuit, someone would already have patented it and earned deserved fame. Certainly this has not happened!

5. **Does there exist a Boolean circuit with $n+1$ input bits that simulates the computation of a Boolean circuit with $n$ input bits?**

   Yes, there is. Our new Boolean circuit should work like this: take the first $n$ bits, replicate the original Boolean circuit with $n$ input bits that we already have, and use its output as our output. Our constructed circuit also takes an extra bit, but that doesn't matter, and the bit can be safely dropped; if we must use it, we could use it to construct a $\top$ and combine that with the original circuit's output using an AND gate.

6. **Does there exist a Boolean circuit with $n$ input bits that simulates the computation of a Boolean circuit with $n+1$ input bits?**

   Intuitively, one would say no, because there is no way for a circuit with only $n$ input bits to accommodate $n+1$ input bits. But that reasoning can be hard to justify -- what if we just construct a circuit that sends the extra input bit to a null device as a "dummy" placeholder? How would you argue that it can never do the computation that the original circuit does? In fact, there is an obvious counterexample: if a Boolean circuit with $n+1$ input bits implements a tautology (i.e., always outputs 1), then it surely can be simulated by a Boolean circuit with $n$ input bits, because we don't expect the extra bit to offer any more useful "information" for this computation. Whenever we fail to justify something convincingly, a new theory emerges.
*Information theory* is the one we are looking for. The *information entropy* (Shannon entropy) of a random variable $X$ is defined as [@shannon2001mathematical]

$$\operatorname{H}(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$$

Given a Boolean circuit of input size $n$, consider its equivalent lookup table, which contains $2^n$ entries of solutions to the corresponding decision problems. We need a $2^n$-bit string to encode this table. Let $X$ be this $2^n$-bit string. What does information theory tell us?

If we *already know* with certainty that every problem in our finite language $\mathcal{L}_\mathsf{problem}$ has answer $1$ (tautology), then our encoded lookup table would be the string $x = \underbrace{11\cdots1}_{2^n\ \textrm{1s}}$. There is no point in computing here (since the answer is assumed a priori), and since $p(x) = 1$, we get the entropy of this lookup table $\operatorname{H}(X) = 0$; that is, this "computation" does not provide any extra information of our interest.

The usefulness of computation dramatically emerges when we are *not sure* about the answers. If we know that every problem in our finite language $\mathcal{L}_\mathsf{problem}$ has exactly the same answer, but we don't know what it is (so we must find out), then our encoded lookup table would be either $x_0 = \underbrace{00\cdots0}_{2^n\ \textrm{0s}}$ or $x_1 = \underbrace{11\cdots1}_{2^n\ \textrm{1s}}$. Since we have no a priori knowledge about the universal answer, we cannot assert which version of the lookup table is more likely to be yielded by the computation. Thus, it is natural to assume that $p(x_0) = p(x_1) = \frac{1}{2}$. The entropy of this lookup table is $\operatorname{H}(X) = -(p(x_0)\log_2 p(x_0) + p(x_1)\log_2 p(x_1)) = 1$; that is, this computation can provide as much information as exactly 1 *bit* (or "1 shannon", to emphasize that this is a measure of information entropy), even though its lookup table contains strings of $2^n$ bits.
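These entropy figures are easy to check numerically; here is a small sketch, with `entropy` as my own helper name:

```python
from math import log2

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A tautology: one possible table with probability 1 => 0 bits of information.
assert entropy([1.0]) == 0.0
# Two equally likely tables (all 0s or all 1s) => exactly 1 bit,
# no matter how long (2^n) the encoded table itself is.
assert entropy([0.5, 0.5]) == 1.0
```

The helper works for any distribution over possible lookup tables, e.g. `entropy([2 ** -4] * 16)` gives `4.0` for a uniform choice among 16 tables.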
In most interesting computations, we have no idea how the problem space $\mathcal{L}_\mathsf{problem}$ is related to the solutions at all, and the encoded lookup table, *in the scope of our a priori knowledge*, is a *uniformly random* string. Practically we are unable to enumerate all its possibilities, but we can still assume that, with all $2^n$ entries each equally likely to be 0 or 1, any particular lookup table appears with probability $p(x) = 2^{-2^n}$. The entropy of this lookup table is

$$\operatorname{H}(X) = -2^{2^n} \cdot 2^{-2^n} \log_2 2^{-2^n} = 2^n$$

This is an interesting result. It shows that as our input size (in the finite language $\mathcal{L}_\mathsf{problem}$) increases, the information provided by the naïve "lookup table approach" to computation grows exponentially, assuming that we have no a priori knowledge of how to do the computation efficiently. Notice that this is merely a theoretical *upper bound*; in reality, if we can ever build a Boolean circuit to decide a problem so that we do not have to use a lookup table, then we have already found some "pattern" in the computation, and thus our model of computation need not carry as much useful information (as an exponentially large lookup table does).

With the notion of information entropy, we can now justify the statement we made before: the information contained in a Boolean circuit with $n$ input bits has an upper bound of $2^n$ bits (equivalent to a lookup table model), but a Boolean circuit with $n+1$ input bits can carry as much information as $2^{n+1}$ bits. Thus, it is generally impossible to simulate an arbitrary Boolean circuit with $n+1$ input bits using a circuit with $n$ or fewer input bits.
Although in some special cases we can: e.g., imagine a circuit that has $n+1$ input bits but carries an information entropy of only $2^n$ bits; then the simulation should be tractable for an $n$-bit circuit. Moreover, if a big circuit always gives the same answer, either "0" or "1", for every problem in the language, then it can be simulated easily by a trivial circuit, since its information entropy is merely $1$ bit. Clearly, the ability of a model of computation to simulate another is restricted by the information it maintains.

Thus, we can sketch a hierarchy for all Boolean circuits: any 1-bit circuit can be simulated by a 2-bit circuit, any 2-bit circuit can be simulated by a 3-bit circuit, ..., any $n$-bit circuit can be simulated by an $(n+1)$-bit circuit, but generally not conversely. Thinking about this hierarchy intuitively, we would conclude that every Boolean circuit with more input bits $n$ (corresponding to a larger finite language $\mathcal{L}_\mathsf{problem} \subseteq \bigcup_{k=0}^{n} \Sigma^k$) has a stronger ability of computation than its predecessors. As this chain goes on forever, a Boolean circuit with $\infty$ input bits would have the ultimate ability to decide any finite language (thus it could simulate every other Boolean circuit), which cannot be achieved because we can never build such a circuit of infinitely many input bits in reality. Therefore, there is no such model that we can use to decide problems in an "infinite" language.

Is that true? In a way, yes. Remember that our choice of model of computation does not essentially change the algorithmic tractability of a problem, so if a problem asks for an impossible decision on arbitrary, recursively large things, we just cannot solve it, no matter which model of computation we turn to for help. However, if a problem asks for decisions on very specific things, e.g.: Does the string "`a`" consist of only the letter `a`? Does the string "`aaa`" consist of only the letter `a`?
Does the string "`aaaba`" consist of only the letter `a`? We can immediately tell that there is a general, mechanizable way of deciding this, regardless of the length of the questioned string. Such a language $\mathcal{L}_\mathsf{problem} = \{\mathtt{a}^k\ |\ k \in \mathbb{N}\}$ is infinitely large, yet every problem defined by it is decidable; although it is not a finite language, so we cannot build a Boolean circuit for it, that doesn't mean we can't solve it! The implication is that we would miss out on so much between *finite languages* and *arbitrarily infinite languages*^[I invented that term; such languages are formally called *recursively enumerable languages*.], if we just restricted our mindset to finite languages and the pure Boolean circuits that recognize them.

What we have achieved so far is to argue that *Boolean circuits* are a sound and complete model of computation for deciding *finite languages*, with all the reasoning done in two powerful *arbitrarily infinite languages*: English, an ambiguous, informal natural language; and mathematics, a less ambiguous, semi-formal symbolic language. Naturally, we would think that it is feasible to describe and reason about a language via a more expressive one, as well as to simulate a model of computation using a more powerful one. We will demonstrate this in the next section: first we formalize our model of computation (Boolean circuits), with its syntax defined as a *context-free grammar*; then we simulate the computation of Boolean circuits using *total functional programming*, by implementing it as an embedded domain-specific language; lastly, we show the semantic equivalence of two programs (combinational circuits), using notions from *deductive logic*. All these approaches correspond to something in between finite languages and arbitrarily infinite languages; they have sufficient power for formalizing Boolean circuits, our first simple model of computation.
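The "only letter `a`" decision above can indeed be written as a single scan whose code does not grow with the input; a quick sketch, with `all_as` as a hypothetical name for the decider:

```python
def all_as(s: str) -> bool:
    """Decide membership in { a^k | k in N }: scan one symbol at a time,
    with a fixed procedure that works for inputs of any length."""
    return all(ch == 'a' for ch in s)

# The three questions from the text, plus the empty string (a^0).
assert all_as("a")
assert all_as("aaa")
assert not all_as("aaaba")
assert all_as("")
```

The scanner keeps no history at all, which foreshadows the finite-state machines discussed at the end of this post.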
# Formalizing Combinational Logic

In Section 1 we introduced finite languages as the simplest form of representing computation. In Section 2 we demonstrated that Boolean circuits, as a model of computation, may be used to decide finite languages, although they can carry an exponential level of complexity in the worst cases. Building a circuit physically may be challenging, but simulating one (either with a bigger circuit or with a more powerful model of computation) is possible, as we concluded in Section 3.

To enable some formal thinking, and, for practical reasons, to simulate this model on a computer, we must eliminate any ambiguity and present our precise ideas in a symbolic way. The finite language we defined in Section 1 is already a formal language, but the model of computation which decides it, that is, Boolean circuits, has not been formalized by us yet. Thinking of it as a minimal programming language, we may describe our model formally in terms of two supporting components: its syntax and its semantics. Both are critical for any concrete implementation of a model of computation.

## Syntax

Visually, Boolean circuits are planar structures. In order to give a formal description of a circuit, instead of drawing the circuit diagram, we use the language of combinational logic, which is a more mathematical treatment and more easily typed into a computer. Its entire syntax is simple, in the form of a context-free grammar:

\begin{align*}
t &::= \mathbf{0}\ |\ \mathbf{1} \\
b &::= t\ |\ \lnot b_0\ |\ (b_0 \land b_1) \\
\end{align*}

(The first production of $b$, i.e., $b ::= t$, must be applied exactly $n$ times.) The restriction on the number of terminals is necessary since the Boolean circuit generated by the grammar has exactly $n$ input bits.
Note that even with that restriction, the resulting language is still not a finite one -- the combination of $\{\land, \lnot\}$ can generate infinitely many Boolean expressions; correspondingly, we can build infinitely many circuits receiving a finite number of input bits. However, as shown before, a Boolean circuit with $n$ input bits can have at most $2^{2^n}$ distinct truth tables; by the pigeonhole principle, if we write down more than $2^{2^n}$ Boolean expressions, some of them must have identical truth tables, and we consider such expressions as equivalent. In this way, this seemingly context-free syntax of Boolean circuits can be interpreted as a description of finite languages that determines some unique computations, as we desired:

$$\mathcal{L}(b) = \{ w\ |\ w \textrm{ is a bit-string accepted by the Boolean circuit of } b \} $$

The above syntax is easily translated into Standard ML definitions:

```sml
datatype truth = T | F

datatype boolean = Value of truth
                 | Not of boolean
                 | And of (boolean * boolean)
```

## Semantics

Now we formally define the semantics of Boolean circuits with respect to their syntactic forms.

$$\textrm{EB-Cst} : \frac{}{\langle t, \sigma \rangle \downarrow t} $$

$$\textrm{EB-NegT} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{1}}{\langle \lnot b_0, \sigma \rangle \downarrow \mathbf{0}} $$

$$\textrm{EB-NegF} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{0}}{\langle \lnot b_0, \sigma \rangle \downarrow \mathbf{1}} $$

$$\textrm{EB-AndT} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{1} \qquad \langle b_1, \sigma \rangle \downarrow t}{\langle b_0 \land b_1, \sigma \rangle \downarrow t} $$

$$\textrm{EB-AndF} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{0}}{\langle b_0 \land b_1, \sigma \rangle \downarrow \mathbf{0}} $$

Notice that since the Boolean circuit is a stateless model, the environment $\sigma$ for evaluation is vacuous.
We preserve it in the derivation rules anyway, since almost all our future models of computation will be stateful. The semantics reveals some important properties of our model of computation: (1) The functionally complete set $\{\land, \lnot\}$ is used for this model. (2) The rule $\textrm{EB-AndF}$ implements short-circuit evaluation. (3) The evaluation is deterministic: given a Boolean expression with all variables bound, one can construct a unique derivation tree that leads to a single terminal of either $\mathbf{0}$ or $\mathbf{1}$. (4) The evaluation is guaranteed to terminate: if we simulate it in a practical programming language with a recursive function (or an equivalent loop), we clearly see that every recursive call receives a structurally smaller argument.

The formal semantics is translated into a Standard ML function that simulates the evaluation:

```sml
fun eval (Value T) = T
  | eval (Value F) = F
  | eval (Not b0) = (case eval b0 of T => F | F => T)
  | eval (And (b0, b1)) = (case eval b0 of T => eval b1 | F => F)
```

## Proof of Circuit Equivalence

The formalization allows us to prove the equivalence of two circuits, or to verify that a circuit meets some specification, using purely logical derivations, without having to write down and compare their truth tables. We roughly demonstrate this with a quick example.
> **Example 7.** Extend Boolean circuits with an OR gate ($\lor$), given the following rules of semantics:
>
> $$\textrm{EB-OrT} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{1}}{\langle b_0 \lor b_1, \sigma \rangle \downarrow \mathbf{1}}$$
> $$\textrm{EB-OrF} : \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{0} \qquad \langle b_1, \sigma \rangle \downarrow t}{\langle b_0 \lor b_1, \sigma \rangle \downarrow t}$$
>
> Show the equivalence of the following circuits:
>
> $$b_0 \lor b_1 \sim \lnot(\lnot b_0 \land \lnot b_1)$$

To show that $\langle b_0 \lor b_1, \sigma \rangle \downarrow t \iff \langle \lnot(\lnot b_0 \land \lnot b_1), \sigma \rangle \downarrow t$, we reason by structural induction on derivations, in both directions.

*Left-to-right* ("$\Rightarrow$"): Suppose we have $\langle b_0 \lor b_1, \sigma \rangle \downarrow t$. Then there must be a derivation of $\langle b_0 \lor b_1, \sigma \rangle \downarrow t$. By $\textrm{EB-OrT}$ and $\textrm{EB-OrF}$, the derivation must have either of the following forms:

* *Case* $$\mathcal{E} = \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{1}}{\langle b_0 \lor b_1, \sigma \rangle \downarrow \mathbf{1}}$$ Here we have $t = \mathbf{1}$. By $\textrm{EB-NegT}$ on the derivation of $\langle b_0, \sigma \rangle \downarrow \mathbf{1}$, we get $\langle \lnot b_0, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-AndF}$ on the previous derivation, we get $\langle \lnot b_0 \land \lnot b_1, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-NegF}$ on the previous derivation, we get $\langle \lnot(\lnot b_0 \land \lnot b_1), \sigma \rangle \downarrow \mathbf{1}$, as desired.

* *Case* $$\mathcal{E} = \frac{\langle b_0, \sigma \rangle \downarrow \mathbf{0} \qquad \langle b_1, \sigma \rangle \downarrow t}{\langle b_0 \lor b_1, \sigma \rangle \downarrow t}$$ By $\textrm{EB-NegF}$ on the derivation of $\langle b_0, \sigma \rangle \downarrow \mathbf{0}$, we get $\langle \lnot{b_0}, \sigma \rangle \downarrow \mathbf{1}$. We then proceed by cases on the derivation of $\langle b_1, \sigma \rangle \downarrow t$:
    * *Case* $\langle b_1, \sigma \rangle \downarrow \mathbf{0}$. Here we have $t = \mathbf{0}$. By $\textrm{EB-NegF}$, we get $\langle \lnot{b_1}, \sigma \rangle \downarrow \mathbf{1}$. By $\textrm{EB-AndT}$, we get $\langle \lnot{b_0} \land \lnot{b_1}, \sigma \rangle \downarrow \mathbf{1}$. By $\textrm{EB-NegT}$, we get $\langle \lnot(\lnot{b_0} \land \lnot{b_1}), \sigma \rangle \downarrow \mathbf{0}$, as desired.

    * *Case* $\langle b_1, \sigma \rangle \downarrow \mathbf{1}$. Here we have $t = \mathbf{1}$. By $\textrm{EB-NegT}$, we get $\langle \lnot{b_1}, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-AndF}$, we get $\langle \lnot{b_0} \land \lnot{b_1}, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-NegF}$, we get $\langle \lnot(\lnot{b_0} \land \lnot{b_1}), \sigma \rangle \downarrow \mathbf{1}$, as desired.

*Right-to-left* ("$\Leftarrow$"): Suppose we have a derivation of $\langle \lnot(\lnot b_0 \land \lnot b_1), \sigma \rangle \downarrow t$.

* *Case* $t = \mathbf{0}$. By $\textrm{EB-NegT}$, we must have $\langle \lnot{b_0} \land \lnot{b_1}, \sigma \rangle \downarrow \mathbf{1}$. By $\textrm{EB-AndT}$ and $\textrm{EB-NegF}$, we must have $\langle b_0, \sigma \rangle \downarrow \mathbf{0}$ and $\langle b_1, \sigma \rangle \downarrow \mathbf{0}$. Then we can apply $\textrm{EB-OrF}$ to get the desired derivation of $\langle b_0 \lor b_1, \sigma \rangle \downarrow \mathbf{0}$.

* *Case* $t = \mathbf{1}$. By $\textrm{EB-NegF}$, we must have $\langle \lnot{b_0} \land \lnot{b_1}, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-AndT}$ and $\textrm{EB-AndF}$, two cases emerge:

    * *Case* $\langle \lnot{b_0}, \sigma \rangle \downarrow \mathbf{0}$. By $\textrm{EB-NegT}$, we must have $\langle b_0, \sigma \rangle \downarrow \mathbf{1}$. Then we can apply $\textrm{EB-OrT}$ to get the desired derivation of $\langle b_0 \lor b_1, \sigma \rangle \downarrow \mathbf{1}$.

    * *Case* $\langle \lnot{b_1}, \sigma \rangle \downarrow \mathbf{0}$. 
By $\textrm{EB-NegT}$, we must have $\langle b_1, \sigma \rangle \downarrow \mathbf{1}$. Then by either $\textrm{EB-OrT}$ or $\textrm{EB-OrF}$, we get the desired derivation of $\langle b_0 \lor b_1, \sigma \rangle \downarrow \mathbf{1}$. [QED]

Now that we have verified the equivalence of $b_0 \lor b_1$ and $\lnot(\lnot b_0 \land \lnot b_1)$, we may just implement an OR gate as a combination of AND gates and NOT gates -- in Section 2 we reached the same conclusion by looking at truth tables of Boolean expressions, but that approach won't scale as well as this one: this time we proved the correctness of a "program" without enumerating its (potentially) exponentially large truth table.

```sml
(* OR-gate implementation *)
fun or (b0, b1) = Not (And (Not b0, Not b1))
```

As shown by the above example, even the most obvious equivalence can require an involved semi-formal proof (and it indeed takes a significant amount of work to check its validity). Unlike proofs of theorems done by mathematicians, formal verification of circuits (as well as programs) is almost always mechanized in practice. Example 7 provides a side proof of *De Morgan's law* in Boolean algebra; in mathematics, laws and theorems tend to have relatively simple, comprehensible forms (that's why they can be useful in constituting proofs of other bigger, harder propositions); in electronic or software engineering, however, circuits or programs can have very complicated constructions. By relating logical deductions to real-world verifications, one may find that such proof techniques provide an intuitive way of understanding the basis for correctness of systems; conversely, the mechanization of verifying engineered systems may shed some light on how we attack difficult mathematical problems.
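For a circuit this small, the deductive proof can also be cross-checked by brute-force enumeration of the truth table; a quick sketch in Python (helper names are mine):

```python
from itertools import product

def or_gate(b0, b1):
    """The OR gate as specified."""
    return b0 or b1

def or_via_and_not(b0, b1):
    """The claimed equivalent circuit: not (not b0 and not b1)."""
    return not ((not b0) and (not b1))

# Exhaustively check the equivalence b0 OR b1 ~ NOT(NOT b0 AND NOT b1).
for b0, b1 in product((False, True), repeat=2):
    assert or_gate(b0, b1) == or_via_and_not(b0, b1)
```

Enumeration takes $2^n$ evaluations and thus only works for small circuits, which is exactly why the deductive approach above is the one that scales.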
> **Exercise 8.** Extend Boolean circuits with the semantics of $\bot$:
>
> $$\textrm{EB-Contra} : \frac{}{\langle \bot, \sigma \rangle \downarrow \mathbf{0}}$$
>
> Then show that
>
> $$\bot \sim b_0 \land \lnot b_0$$

The above equivalence shows that the law of noncontradiction holds in combinational logic. From an engineer's perspective, this gives us the possibility of constructing a gate that always outputs 0 (although there's little realistic necessity), using purely AND gates and NOT gates:

```sml
(* a gate that always outputs 0 *)
fun contra b0 = And (b0, Not b0)
```

Instead of *semantics*, engineers have *specifications*. However, if we can formalize our goal, it's not terribly hard to give it some semantics and verify that our circuit (program) is equivalent to the desired one. Let's see one last example:

> **Exercise 9.** (**Multiplexer**) Consider the semantics of a 2-to-1 multiplexer:
>
> $$\textrm{EB-Mux0} : \frac{\langle s, \sigma \rangle \downarrow \mathbf{0} \qquad \langle a, \sigma \rangle \downarrow t}{\langle \mathsf{Mux}(a, b, s), \sigma \rangle \downarrow t}$$
> $$\textrm{EB-Mux1} : \frac{\langle s, \sigma \rangle \downarrow \mathbf{1} \qquad \langle b, \sigma \rangle \downarrow t}{\langle \mathsf{Mux}(a, b, s), \sigma \rangle \downarrow t}$$
>
> Then show that
>
> $$\mathsf{Mux}(a, b, s) \sim (a \land \lnot{s}) \lor (b \land s)$$

It may be formally verified that a 2-to-1 multiplexer (according to the specified semantics) is correctly implemented by the equivalent circuit $(a \land \lnot{s}) \lor (b \land s)$:

```sml
(* 2-to-1 multiplexer, built from the or gate defined above *)
fun multiplexer (a, b, s) = or (And (a, Not s), And (b, s))
```

# Where to Go from Here?

In Section 4, we briefly studied Boolean circuits in their mathematical form, combinational logic, from three formal aspects:

1. *Syntax*, which is defined as a limited form of *context-free grammar*.
2. *Semantics*, which can be implemented as a *recursive* (but decidable) *function* in a programming language like Standard ML. 
Furthermore, we know that it is total, so it will always terminate with a deterministic output.
3. *Circuit equivalence*, which can be justified by *logical deduction* (instead of the naïve approach of checking truth tables).

Without all these formalization tools, we would not be able to tell much more about a Boolean circuit, not to mention simulate it in software or verify its correctness. The conclusion we reached in Section 3 is a vital one here; it claims that we can talk about a model of computation using a more expressive language and simulate it with a more powerful model.

Technically speaking, the limitation of Boolean circuits as a model of computation is due to their finite number of input bits. However, modern computers have a finite number of input bits too; they can, undoubtedly, do much more useful computation than any plain Boolean circuit. To fill in the gap between their disparate abilities to compute, we just need one more thing: **memory**. Memory implies *states*, which might not look "purely mathematical", but they are natural to physicists and engineers, and they are easily implemented and used in the construction of real machines. Consider a flip-flop circuit allowing for feedback, where the output depends not only on the current input bits but also on previous input bits; in other words, the computation maintains a "state" for its future inputs:

![**Figure 3.** An animated SR latch. (Source: [Wikimedia](https://en.wikipedia.org/wiki/File:SRLatch-lowres.gif). Created by Marble machine.)](https://i0.wp.com/dl.dropboxusercontent.com/s/k99861w2hy0se7d/sr-latch-lowres.gif){ width=25% }

Such improved Boolean circuits are time-dependent, in the sense that a computation can be done on an unlimited *chronological sequence* of input bits, rather than on a finite set of input bits without any reference to its history.
This gives us the intuition to decompose a long, possibly infinite string into a chronological sequence, feed it into the machine, and ask whether it is accepted (or rejected). In this way, our ability to compute is no longer limited to finite languages. Circuits with states implement a *sequential logic*, and the resulting model of computation is often called a *finite state machine* (alternatively, a *finite automaton*). The class of languages that can be decided by such a model of computation is the *regular languages*, which are much more expressive than finite languages (i.e., finite sets of strings), since a regular language can accommodate an infinite number of strings matching a given pattern.

Regular languages are of both theoretical and practical importance. In my next post of this series, I will review regular languages and their corresponding models of DFAs/NFAs, as well as how to deal with regular expressions (which is very much a practical programming issue) from scratch. Then I'll go down the Chomsky hierarchy and expand my collection of languages and their machines:

* **Regular languages**.
* **Context-free languages**.
* **Context-sensitive languages**.
* **Recursive (decidable) languages**.
  * Presburger arithmetic and decidability.
* **Recursively enumerable languages**.
  * Peano arithmetic and undecidability (Gödel's incompleteness theorem).
  * Turing machines and the halting problem. Church-Turing thesis.

## More Circuits...

Despite their limited power of computation, Boolean circuits remain a useful theoretical model in **complexity theory**, since a circuit can be used to represent a specific computational mapping for given finite input bits (although not for a whole language). In a normal setting of computation, we construct a circuit that implements our desired mapping from input to output; the $\textsf{CIRCUIT-SAT}$ (circuit satisfiability) problem goes the other way: given a known circuit, it asks whether there exists an assignment of input bits that makes the circuit evaluate to a certain output.
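The obvious decision procedure is exhaustive: try all $2^n$ input assignments. Here is a brute-force sketch in Python (the function name `circuit_sat` and the circuit-as-function encoding are my own assumptions for illustration, not the post's formal machinery):

```python
from itertools import product

# Brute-force decision procedure for CIRCUIT-SAT: enumerate all 2^n
# input assignments and test each one. This shows the problem is
# decidable, but the running time grows exponentially in n.

def circuit_sat(circuit, n):
    """Return a satisfying assignment of the n-input `circuit`, or None."""
    for bits in product([0, 1], repeat=n):
        if circuit(*bits):
            return bits
    return None

# Example: the multiplexer circuit (a AND NOT s) OR (b AND s) from above.
mux = lambda a, b, s: (a and not s) or (b and s)

print(circuit_sat(mux, 3))                    # (0, 1, 1): s selects b, b = 1
print(circuit_sat(lambda x: x and not x, 1))  # contradiction: None
```

The exponential enumeration is, as far as we know, essentially unavoidable in the worst case, which is the next point.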
Although we can easily show that this problem is *decidable*, no currently known algorithm can decide it efficiently (it is proven to be NP-complete). Naturally, we would also ask, for a known circuit, whether we can correctly recover any (or even all) of the input bits from its output. Generally this is not possible for decision problems, but if the number of output bits equals the number of input bits, or if we can show that the information entropy contained in the input equals the entropy contained in the output, then things can be different. While we can construct a reversible circuit trivially, it is sometimes desirable to design a scheme of computation such that one can recover the input bits only with a secret key; without the right key, recovering the input from the output must be as difficult as solving an instance of $\textsf{CIRCUIT-SAT}$. These are very interesting topics in **cryptography**, where computational hardness assumptions form the theoretical basis for practical information security.

On the other hand, most of the Boolean logic gates we used are irreversible (except for a few like $\lnot$); this result may be shown using the concept of information entropy. However, in the research field of **reversible computing**, it is required that every step (gate) of the computation be perfectly *reversible*. An example of a nontrivial reversible gate is $\textsf{XOR}_\textsf{reversible}(x, y) = (x, x \oplus y)$. Looking at its truth table, all possible 2-bit output combinations $\{(0, 0), (0, 1), (1, 0), (1, 1)\}$ appear with the same probability (over uniformly random inputs), so we can immediately tell that the output maintains the same entropy (2 bits) as the input does.
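A quick exhaustive check (again a Python sketch, with names of my own choosing) confirms both properties of $\textsf{XOR}_\textsf{reversible}$: it is a bijection on 2-bit values, so no information is lost, and it is in fact its own inverse, since $x \oplus (x \oplus y) = y$:

```python
from itertools import product

# XOR_reversible(x, y) = (x, x XOR y). We verify over all four inputs
# that it permutes the 2-bit values (bijective, entropy-preserving)
# and that applying it twice recovers the original input.

def xor_reversible(x, y):
    return (x, x ^ y)

inputs = list(product([0, 1], repeat=2))
outputs = [xor_reversible(x, y) for x, y in inputs]

assert sorted(outputs) == inputs            # bijective: no information lost
assert all(xor_reversible(*xor_reversible(x, y)) == (x, y)
           for x, y in inputs)              # the gate is its own inverse

print(outputs)  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```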
Given any output $(x, x \oplus y)$, we can uniquely recover the input value $(x, y)$:

| $x$ | $y$ | Output $x$ | Output $x \oplus y$ |
| --- | --- | ---------: | ------------------: |
| 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 1 | 0 |

The quantum counterpart of $\textsf{XOR}_\textsf{reversible}$ is the [controlled NOT gate](https://en.wikipedia.org/wiki/Controlled_NOT_gate) (also CNOT), which is essential to the construction of **quantum computers**. Furthermore, as quantum circuits work in a *probabilistic* way, in contrast to deterministic Boolean circuits, we are inspired to rethink computation with an alternative mindset: from quantum circuits to finite automata, then [quantum Turing machines](https://en.wikipedia.org/wiki/Quantum_Turing_machine) and λ-calculus. Just because digital circuits are everywhere today, and most of our software systems as well as programming languages work deterministically without any probabilistic outcome, we should not take them for granted: What if quantum circuits had been invented first? How would computation be any different?

After all, our abilities to compute are still bounded in two respects: (1) Logically, Turing machines (and their equivalent models) cannot decide the halting problem, nor can their quantum counterparts; (2) Physically, we can never implement a Turing machine with infinite memory, not even with a quantum computer, as constrained by the [Bekenstein bound](https://en.wikipedia.org/wiki/Bekenstein_bound). Nevertheless, quantum computers are of particular interest in complexity theory^[Since quantum computers are probabilistic, we can only talk about $\mathsf{BQP}$, which is the complexity class of problems that can be solved on a quantum computer in polynomial time *with a bounded error probability of at most $\frac{1}{3}$ for all instances*. Its classical counterpart is $\mathsf{BPP}$.
But again, our knowledge about the relations among $\mathsf{BQP}$, $\mathsf{BPP}$, $\mathsf{P}$, and $\mathsf{NP}$ is woefully inadequate. Some believe that $\mathsf{NP} \subseteq \mathsf{BQP}$, that is, that every problem that can be decided by a nondeterministic Turing machine may also be decided by a quantum computer efficiently with only a bounded error probability; however, this has not been proven.], since dedicated algorithms running on such circuits would allow us to attack many computationally hard problems in a more efficient, physically implementable way.

(To be continued.)

# References