Folding in ZF

I have finally finished the tool for converting Isabelle verified IsarMathLib source to a blog entry. Writing a parser with Haskell’s Parsec library was a pleasure as usual. The post below was generated from the Fold_ZF.thy theory file that will be included in the next (1.6.2) release of IsarMathLib.

Having a parser opens the possibility of exporting the IsarMathLib source to different formats. I have been thinking what my next target should be and decided it should be TiddlyWiki.

Why TiddlyWiki?

There are two most important features that a good presentation of formalized mathematics should have.

1. Mathematical notation including mathematical symbols.

You can not effectively communicate mathematics without mathematical symbols. Most theorem provers designers seem not to understand this. It is a mystery for me why the designers of programming languages are stuck at the set of symbols that were present on those typewriters that people used to talk to IBM mainframes in the sixties. It obviously wastes the bandwith of human visual cognition channel that can recognize thousands of symbols easily and fast. Fortunately Isabelle is an exception to this – it allows to define any number of custom symbols and specify how they should be typeset in the proof document. As for TiddlyWiki the jsMath plugin allows beautiful display of LaTeX math notation, much better that what you can see in this blog.

2. Ways to deal with excessive verbosity of formalized math

Formalized proofs contain much more details than informal ones. It is good in some way that a machine will not accept proofs written in style “using methods similar to those in [1] modified in a way analogous to [2] we can show that …” . On the other hand having too much detail can make it difficult to see the main idea and the big picture of a proof. Isabelle’s Isar language allows to write structured proofs where the proofs of the main points can be can be separated in the whole proof text so they can be skipped by the reader if the fact is obvious to him/her. TiddlyWiki’s NestedSlidersPlugin would allow to expand such sub-proofs only on demand, so that they don’t obstruct the flow of the main proof.

Another aspect of verbosity of formal proofs is that they typically reference many more other theorems than a typical informal proof. It is sometimes difficult to see which of the referenced theorems are essential to understanding of the proof and which ones are only the glue logic. Isabelle allows to add some theorems to the simplifier where they effectively become the background knowledge that is not necessary to reference directly in proofs. The NestedSlidersPlugin does something even better – it can make the theorem reference clickable, so that the reader can easily examine what the referenced theorem is saying in a window floating over the main text.

And now some formalized mathematics about folding in ZF set theory.

theory Fold_ZF imports InductiveSeq_ZF

begin

Suppose we have a binary operation P: X\times X \rightarrow X written multiplicatively as P\langle x, y \rangle= x\cdot y. In informal mathematics we can take a sequence \{ x_k \}_{k\in 0.. n} of elements of X and consider the product x_0\cdot x_1 \cdot .. \cdot x_n. To do the same thing in formalized mathematics we have to define precisely what is meant by that “\cdot .. \cdot“. The definitition we want to use is based on the notion of sequence defined by induction discussed in InductiveSeq_ZF. We don’t really want to derive the terminology for this from the word “product” as that would tie it conceptually to the multiplicative notation. This would be awkward when we want to reuse the same notions to talk about sums like x_0 + x_1 + .. + x_n.
In functional programming there is something called “fold”. Namely for a function f, initial point a and list \left[ b, c, d\right] the expression fold(f, a, [b,c,d]) is defined to be f(f(f(a,b),c),d) (in Haskell something like this is called foldl). If we write f in multiplicative notation we get a\cdot b \cdot c\cdot d, so this is exactly what we need. The notion of folds in functional programming is actually much more general that what we need here (not that I know anything about that). In this theory file we just make a slight generalization and talk about folding a list with a binary operation f:X\times Y \rightarrow X with X not necessarily the same as Y.

Folding in ZF

Suppose we have a binary operation f : X\times Y \rightarrow X. Then every y\in Y defines a transformation of X defined by T_y(x) = f\langle x,y\rangle. In IsarMathLib such transformation is called as Fix2ndVar(f,y). Using this notion, given a function f: X\times Y\rightarrow X and a sequence y = \{y_k\}_{k\in N} of elements of X we can get a sequence of transformations of X. This is defined in Seq2TransSeq below. Then we use that sequence of tranformations to define the sequence of partial folds (called FoldSeq) by means of InductiveSeqVarFN (defined in InductiveSeq_ZF theory) which implements the inductive sequence determined by a starting point and a sequence of transformations. Finally, we define the fold of a sequence as the last element of the sequence of the partial folds.

Definition that specifies how to convert a sequence a of elements of Y into a sequence of transformations of X, given a binary operation f :X\times Y \rightarrow X.

definition

Seq2TrSeq(f,a) \equiv  \{ \langle k,Fix2ndVar(f,a(k))\rangle .\  k \in  domain(a)\}

Definition of a sequence of partial folds.

definition

FoldSeq(f,x,a) \equiv
InductiveSeqVarFN(x,fstdom(f),Seq2TrSeq(f,a),domain(a))

Definition of a fold.

definition

Fold(f,x,a) \equiv  Last(FoldSeq(f,x,a))

If X is a set with a binary operation f:X\times Y \rightarrow X then Seq2TransSeqN(f,a) converts a sequence a of elements of Y into the sequence of corresponding transformations of X.


lemma seq2trans_seq_props:
   assumes A1: n \in  nat  and
   A2: f : X\times Y \rightarrow  X  and
   A3: a:n\rightarrow Y  and A4: T = Seq2TrSeq(f,a)
   shows T : n \rightarrow  (X\rightarrow X)  and
   \forall k\in n.\  \forall x\in X.\  (T(k))(x) = f\langle x,a(k)\rangle 
proof -
   from a:n\rightarrow Y have D: domain(a) = n
      using func1_1_L1
   with A2 A3 A4 show T : n \rightarrow  (X\rightarrow X)
      using apply_funtype fix_2nd_var_fun ZF_fun_from_total Seq2TrSeq_def
   with A4 D have I: \forall k \in  n.\  T(k) = Fix2ndVar(f,a(k))
      using Seq2TrSeq_def ZF_fun_from_tot_val0
   {
      fix k
      fix x
      assume A5: k\in n,   x\in X
      with A1 A3 have a(k) \in  Y
         using apply_funtype
      with A2 A5 I have (T(k))(x) = f\langle x,a(k)\rangle 
         using fix_var_val
   }
   thus \forall k\in n.\  \forall x\in X.\  (T(k))(x) = f\langle x,a(k)\rangle 
qed

Basic properties of the sequence of partial folds of a sequence a = \{y_k\}_{k\in \{0,..,n\} }.


theorem fold_seq_props:
   assumes A1: n \in  nat  and
   A2: f : X\times Y \rightarrow  X  and
   A3: y:n\rightarrow Y  and A4: x\in X  and
   A5: Y\neq 0  and A6: F = FoldSeq(f,x,y)
   shows F: succ(n) \rightarrow  X,   F(0) = x  and
   \forall k\in n.\  F(succ(k)) = f\langle F(k), y(k)\rangle 
proof -
   let T = Seq2TrSeq(f,y)
   from A1 A3 have D: domain(y) = n
      using func1_1_L1
   from f : X\times Y \rightarrow  X Y\neq 0 have I: fstdom(f) = X
      using fstdomdef
   with A1 A2 A3 A4 A6 D show II: F: succ(n) \rightarrow  X  and F(0) = x
      using seq2trans_seq_props FoldSeq_def fin_indseq_var_f_props
   from A1 A2 A3 A4 A6 I D have \forall k\in n.\  F(succ(k)) = T(k)(F(k))
      using seq2trans_seq_props FoldSeq_def fin_indseq_var_f_props
   moreover    {
      fix k
      assume A5: k\in n
      hence k \in  succ(n)
      with A1 A2 A3 II A5 have (T(k))(F(k)) = f\langle F(k),y(k)\rangle 
         using apply_funtype seq2trans_seq_props
   }
   ultimately show \forall k\in n.\  F(succ(k)) = f\langle F(k), y(k)\rangle 
qed

A consistency condition: if we make the list shorter, then we get a shorter sequence of partial folds with the same values as in the original sequence. This can be proven as a special case of fin_indseq_var_f_restrict but a proof using fold_seq_props and induction turns out to be shorter.


lemma foldseq_restrict:
   assumes n \in  nat,   k \in  succ(n)  and
   i \in  nat,   f : X\times Y \rightarrow  X,
   a : n \rightarrow  Y,   b : i \rightarrow  Y  and
   n \subseteq  i,
   \forall j \in  n.\  b(j) = a(j),   x \in  X,   Y \neq  0
   shows FoldSeq(f,x,b)(k) = FoldSeq(f,x,a)(k)
proof -
   let P = FoldSeq(f,x,a)
   let Q = FoldSeq(f,x,b)
   from assms have n \in  nat,   k \in  succ(n),   Q(0) = P(0)  and
   \forall j \in  n.\  Q(j) = P(j) \longrightarrow  Q(succ(j)) = P(succ(j))
      using fold_seq_props
   then show Q(k) = P(k)
      by (rule fin_nat_ind)
qed

A special case of foldseq_restrict when the longer sequence is created from the shorter one by appending one element.


corollary fold_seq_append:
   assumes n \in  nat,
   f : X\times Y \rightarrow  X,   a:n \rightarrow  Y  and
   x\in X,   k \in  succ(n),   y\in Y
   shows FoldSeq(f,x,Append(a,y))(k) = FoldSeq(f,x,a)(k)
proof -
   let b = Append(a,y)
   from assms have b : succ(n) \rightarrow  Y,   \forall j \in  n.\  b(j) = a(j)
      using append_props
   with assms show thesis
      using foldseq_restrict
qed

What we really will be using is the notion of the fold of a sequence, which we define as the last element of (inductively defined) sequence of partial folds. The next theorem lists some properties of the product of the fold operation.


theorem fold_props:
   assumes A1: n \in  nat  and
   A2: f : X\times Y \rightarrow  X,
   a:n \rightarrow  Y,   x\in X,   Y\neq 0
   shows Fold(f,x,a) =  FoldSeq(f,x,a)(n)  and Fold(f,x,a) \in  X
proof -
   from assms have FoldSeq(f,x,a) : succ(n) \rightarrow  X
      using fold_seq_props
   with A1 show Fold(f,x,a) =  FoldSeq(f,x,a)(n)  and Fold(f,x,a) \in  X
      using last_seq_elem apply_funtype Fold_def
qed

The next theorem tells us what happens to the fold of a sequence when we add one more element to it.


theorem fold_append:
   assumes A1: n \in  nat  and
   A2: f : X\times Y \rightarrow  X  and
   A3: a:n \rightarrow  Y  and
   A4: x\in X  and A5: y\in Y
   shows FoldSeq(f,x,Append(a,y))(n) = Fold(f,x,a)  and
   Fold(f,x,Append(a,y)) = f\langle Fold(f,x,a), y\rangle 
proof -
   let b = Append(a,y)
   let P = FoldSeq(f,x,b)
   from A5 have I: Y \neq  0
   with assms show thesis1: P(n) = Fold(f,x,a)
      using fold_seq_append fold_props
   from assms I have II: succ(n) \in  nat,
   f : X\times Y \rightarrow  X,   b : succ(n) \rightarrow  Y,
   x\in X,   Y \neq  0,   P = FoldSeq(f,x,b)
      using append_props
   then have \forall k \in  succ(n).\  P(succ(k)) =  f\langle P(k), b(k)\rangle 
      by (rule fold_seq_props)
   with A3 A5 thesis1 have P(succ(n)) =  f\langle  Fold(f,x,a), y\rangle 
      using append_props
   moreover
   from II have P : succ(succ(n)) \rightarrow  X
      by (rule fold_seq_props)
   then have Fold(f,x,b) = P(succ(n))
      using last_seq_elem Fold_def
   ultimately show Fold(f,x,Append(a,y)) = f\langle Fold(f,x,a), y\rangle 
qed

end

The formal part of this blog post has been generated from IsarMathLib’s Fold_ZF.thy theory file, see the relevant pages of the IsarMathLib proof document.

Update Sept. 19 2008: See also the TiddlyWiki rendering of this post.

Tags:

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.