Summing up in Kaitai Struct

January 8, 2022

I recently wanted to understand the Parquet binary format so I attempted to write a specification for it in Kaitai Struct. Kaitai Struct is a declarative binary format specification language, embedded in YAML. After writing such specification for a format one can generate parsers for it in several supported languages. Another thing that one can use such specification for is to visualize various elements of the format in the Kaitai Web IDE, which shows output somewhat similar to what Wireshark shows for network protocols. That’s what I want to use (if I ever finish the spec).

One stumbling block that I encountered was related to the way a SchemaElement is encoded in Parquet. A SchemaElement is a struct with a couple of fields selected from a fixed collection. The fields have numerical identifiers which are not provided directly but through deltas between their values. To obtain actual field id one has to compute the sum of the deltas provided (as 4-bit integers) in the file. For a while it was unclear if there is a way to encode such sum in Kaitai Struct and there was an issue about it on Kaitai GitHub repository opened more that four years ago. However, some three weeks ago cher-nov came up with a solution to this problem and posted it to the issue linked above. In this post I want to present an application of this solution to a toy format as another example that one can look at.

Read the rest of this entry »

IsarMathLib 1.22.1: Update to Isabelle2021-1

December 25, 2021

This release updates IsarMathLib to Isabelle2021-1. There was actually not much to update – only a couple of places in the informal text that referenced Isabelle2021 now reference Isabelle2021-1. The main difference is in the HTML presentation that Isabelle generates from IsarMathLib sources. This presentation improved somewhat and now has links to referenced theorems in proofs. In one respect it is even better than the presentation at the site – the locale names in the statements of theorems are linked to the definitions, which is a feature missing on I remember some statements by Makarius Wentzel on the Isabelle mailing list that the goal is the “publication quality” of officially generated HTML presentation. I hope that this includes what I think is needed the most – folding of structured proofs, support for mathematical notation with custom symbols and options for reduction of syntactic noise so that characters like the question mark or the quotes separating the object logic statements are omitted. When this happens I would be happy to retire the tool that generates HTML files. I wrote that tool 14 years ago with only a scant knowledge of HTML, CSS or Java Script and the web pages it generates are probably hopelessly old-style.

In other news, Pedro Sánchez Terraf announced that his team has completed a formalization of the independence of CH in Isabelle/ZF. This is a good news for IsarMathLib as well. Isabelle/ZF is used by very few people and I worry that at some point the Isabelle team stops supporting it, which would be the end of IsarMathLib. Every application like this makes it less likely.

IsarMathLib 1.22.0: Uniform covers part II

October 5, 2021

The material added in this release is again about uniform spaces, more specifically about the equivalence of defining uniform space through (diagonal) uniformities and through uniform covers. The previous release concluded with the theorem stating that if \Theta is a family of uniform covers of X then the following collection \Phi of subsets of X\times X is a (diagonal) uniformity:

\Phi = \{A\in Pow(X\times X): \exists_{P\in \Theta} \bigcup\{U\times U: U\in P\} \subseteq A\}.

This shows that if we define a uniform space through a family of uniform covers then we can always get the usual definition by transforming that family into a uniformity.

The current release adds the dual fact (unicov_from_uniformity): if \Phi is a uniformity on X then the following collection of covers of X is a family of uniform covers:

\{P\in \text{Covers}(X). \exists_{A\in \Phi} \forall_{x\in X} \exists_{U\in P} A\{x \} \subseteq U\}.

The next two theorems show that these two transformations are inverses of each other.

I should probably quote sources for my formalization so that not to create an impression that I claim I am inventing some new stuff here. However the truth is that I found the Willard’s “General Topology” book that I bought for this nearly useless. So I ended up mostly getting definitions and facts from the “Uniform cover definition” section on the Wikipedia page on uniform spaces, and figuring out the proofs by myself. The only exception was a nice (even if not completely correct) proof that barycentric refinement of a barycentric refinement is a star refinement by Henno Brandsma that I found on Stack Exchange and converted to the bary_bary_star lemma.

Other news

Read the rest of this entry »

IsarMathLib 1.21.0: Uniform covers

August 22, 2021

There are three approaches to defining uniform spaces. The first one, through diagonal uniformities is presented in IsarMathLib in the UniformSpace_ZF theory. This release adds a theory about defining uniform spaces through uniform covers.

Consider a set X with its powerset (the set of subsets) \text{Pow}(X). A collection P of subsets of X whose union is equal to X is called a cover of X. So, we can define the set \text{Covers}(X) = \{ P\in  \text{Pow}( \text{Pow} (X)): \bigcup P =X \} .

For a set U\subseteq X and a collection of \mathcal{U} of subsets of X we can define the star of U in \mathcal{U} as \text{st}(U,\mathcal{U}) = \bigcup\{ W\in  \mathcal{U} : W\cap U\ne \emptyset \}.

For two covers \mathcal{R},\mathcal{C} \in \text{Covers}(X) we say that the cover \mathcal{R} is a star-refinement of the cover \mathcal{C} and write \mathcal{R} <^*  \mathcal{C} if for every R\in  \mathcal{R} we can find a C\in  \mathcal{C} such that \text{st}(R, \mathcal{R}) is contained in C.

Finally we are able to define uniform covers. This notion is a bit tricky as there is really no separate notion of a uniform cover. If we look at cover of X we cannot say if this is a uniform cover or not. What actually is defined is a family of uniform covers. A member of such family can only be called a uniform cover if it is clear from the context what family of uniform covers it belongs to.

A nonempty collection of covers \Theta of X is called a family of uniform covers iff the following two conditions hold:

a) If \mathcal{R} \in  \Theta, \mathcal{C} \in  \text{Covers}(X) and \mathcal{R} <^*  \mathcal{C}, then \mathcal{C} \in  \Theta

b) For every two covers \mathcal{R},  \mathcal{C}\in  \Theta we can find a cover \mathcal{D}\in  \Theta that is a star-refinement of both \mathcal{R} and \mathcal{C}

This definition is slightly different that the one on the Wikipedia page on uniform spaces, which specifically requires that the trivial cover \{ X\} is a member of \Theta. However, since every cover is a star-refinement of the trivial cover \{ X\} and \Theta is not empty, condition a) implies that \{ X\}\in \Theta.

The key fact here is that if \Theta is a family of uniform covers of X then the following collection \Phi of subsets of X\times X is a (diagonal) uniformity:

\Phi = \{A\in Pow(X\times X): \exists_{P\in \Theta} \bigcup\{U\times U: U\in P\} \subseteq A\}

In words, \Phi is the collection of supersets of sets of the form \bigcup\{U\times U: U\in P\} as P ranges over \Theta.

In Willard’s “General Topology” book the proof that \Phi is a uniformity is left as a “straightforward exercise” for the reader. Typically that means that the proof is long and tedious and the author does not feel like typing all that down. In this case the proof is indeed of mostly straightforward “follow your nose” kind right up to the moment when given a set A\in \Phi one needs to find a set B\in \Phi such that B\circ B \subseteq A (here B\circ B is the composition of relation B with itself). For that, I found the following identity useful:

(  \bigcup\{U\times U: U\in P\} ) \circ (  \bigcup\{U\times U: U\in P\} ) = \bigcup\{U\times \text{st}(U,P): U\in P\}

It took me some effort to figure out this identity and I was ready to type the proof of that in Isabelle, split into two inclusions etc. To my amazement though Isabelle accepted this fact without an explicit proof, only with the reference to the definition of star. This was probably the most unclear to me fact that was obvious to Isabelle that I have seen writing proofs for IsarMathLib so far. I left the proof of this identity as an exercise for the reader. This time the straightforwardness of this fact got certified by Isabelle.

Which brings us to the subject of

Automation in proof assistants

Read the rest of this entry »

IsarMathLib 1.20.0: Ordered loops and loop valued metrics

April 28, 2021

Just to be clear, the loops in the title are algebraic structures and are not related to continuous images of circles or flow control constructs in programming languages.

Some time ago I got curious about a minimal setup in which a (pseudo) metric defined on a set would result in making this set a topological space. It turned out that if the metric is defined as a function with values in an partially ordered group we can use the standard definition of an open disk and the collection of unions of such open disks is a topology. However, I noticed then that this should be possible to be pushed a step farther and one should be able to define a metric as a function valued in an ordered loop and still get a topological space. In this release I added some material on ordered loops and rewrote the metric spaces section to be based on metrics valued in ordered loops.

Loops (the algebraic structures) are similar to groups, except that their binary operation does not have to be associative (I wrote more about loops in the previous post). An ordered loop (see for example Lattice-ordered loops and quasigroups by T. Evans) is a loop with a partial order such that x\leq y if and only if x+z\leq y+z and x\leq y if and only if z+x\leq z+y. The “if and only if” is worth noting here as this is different from the ordered group definition where one-sided implications are sufficient (I lost a couple of hours working with a wrong definition).

IsarMathLib 1.19.0: Loops

April 6, 2021

This release adds a theory file about loops. Loops (the kind we talk about here, not to be confused with for example loop algebras) are algebraic structures that are almost like groups, except that we don’t require the operation to be associative. Wikipedia has a nice picture illustrating dependencies between various algebraic structures between magmas and groups which I copy below:

As the picture shows we get a loop when we take a quasigroup and require that an identity (a neutral element of the operation) exists. It can be shown that that identity is unique in the loop. If we denote the loop identity as e, then we can define the left and right inverses of a loop element a as the only solutions to the equations x\cdot a = e and a\cdot x = e. In a loop these inverses do not have to be the same in general. If the loop operation is associative then we get a group and the left and right inverses unify to become the group inverse.

IsarMathLib 1.18.0: Quasigroups

March 24, 2021

In this release I added a small theory about quasigroups.

A quasigroup is a set with a binary operation (usually written with the multiplicative notation) that supports a certain kind of “divisibility”: every equation of the form a\cdot x = b or x\cdot a = b has a unique solution. So, a quasigroup is like a group except we do not require associativity or existence of a neutral element. If we denote those solution as a\backslash b and b/a respectively we get the identities a\cdot (a\backslash b) = b and (b/a)\cdot a = b by definition and, with the uniqueness, we get a\backslash (a\cdot b) =  b and (b\cdot a)/a =b, which may serve as an alternative definition of quasigroup.

According to this paper “One of the mathematical subfields where automated theorem provers are heavily used is the field of quasigroup and loop theory”.

IsarMathLib 1.17.0: Real numbers as a metric and topological space

March 7, 2021

In this release I added a new theory about real numbers. It’s mostly about setting up notation and validating other contexts (related to ordered fields, rings, groups, metric spaces etc.) so that theorems formulated in those contexts can be used when considering real numbers. At the end there are two theorems about the distance function (x,y) \mapsto |x-y| being a metric and one stating that the collection of unions of open disks is a topology on reals.

IsarMathLib had been upgraded to Isabelle2021 in the previous release 1.16.1 (not announced on this blog) but at that time I didn’t notice that the proof documents were not generated properly. The pdf files generated by Isabelle didn’t get the formal text right, some symbols (like \{ and \}, very important in mathematics based on set theory) were replaced by some placeholder symbol. Errors related to LaTeX are difficult to debug in Isabelle (at least for me), but this time I was lucky. The Isabelle2021 release notes contained information that the “The standard LaTeX engine is now lualatex, according to settings
variable ISABELLE_PDFLATEX.” . Indeed changing the ISABELLE_PDFLATEX="lualatex --file-line-error" line in the settings file back to ISABELLE_PDFLATEX="pdflatex -file-line-error" like it was in Isabelle2020 fixed the problem.

Read the rest of this entry »

IsarMathLib version 1.16.0: metric spaces

February 1, 2021

In this release I added a new theory on (pseudo-)metric spaces. Daniel de la Concepción Sáez contributed a theorem stating that in topological groups the group inverse is uniformly continuous with respect to the Roelcke uniformity.

A pseudo-metric is typically defined as a function d: X\times X \rightarrow [0,\infty) that satisfies the conditions d(x,x) = 0, d(x,y) = 0 \Rightarrow x=y and d(x,z) \leq d(x,y)+d(y,z) for all x,y,z \in X. This is sufficient to prove that the collection of subsets of X that are unions of disks (defined as \text{disk}(c,r) = \{x\in X: d(c,x)<r\}) is a topology, making this metric space a topological space. If we replace the interval [0,\infty) in the definition of pseudo-metric by the nonnegative set of a linearly ordered, abelian, Archimedean group then we are not changing anything as such group is isomorphic to a subgroup of the additive group of real numbers. I was curious: what is really needed to get a topological space from a pseudo-metric space this way? Do we really need the group to be Archimedean? Does the group operation have to be commutative? Do we need the order to be total? Amusingly, the answer to these questions is “no”. The group does not have to be abelian and the Archimedean property is not needed. The linear order condition can be relaxed by assuming that the positive set is a meet semi-lattice (every two-element subset has a greatest lower bound) under the group order relation. I don’t know if this is useful to know, but surely my personal curiosity got satisfied.

IsarMathLib version 1.15.0: comparison with Mizar

December 13, 2020

Some time ago Victor Makarov asked on the Mizar forum about two theorems in group theory. The first theorem states that the union of two subgroups is a subgroup iff one of the subgroups is a subset of the other subgroup. The other one gives a criterion for when for two subgroups H,K the set H\cdot K = \{x\cdot y | x\in H,y\in K \} is also a subgroup. These theorems turned out not to be in the Mizar Mathematical Library, but Roland Coghetto posted formalization of those two theorems in the Mizar proof language. In this release I added the formalization of those two theorems to IsarMathLib so that one can easily compare the syntax and style of Isabelle’s Isar proof language (as it is used in IsarMathLib) and the Mizar proof language.

The result can be viewed as theorems named union_subgroups and prod_subgr_subgr. For the proof of the first one I followed the informal source on Stack Exchange provided by Roland, so it can be directly compared to its Mizar version. The proof of the second one is probably different than the Mizar version, but I think they are still useful as examples of the same theorem in different proof assistants.