Compression of enumerations and gain

George Barmpalias

Xiaoyan Zhang

Bohua Zhan

From omega to Omega – Summer 2023 – IMS, Singapore

Abstract. We study the compressibility of enumerations, and its role in the relative Kolmogorov complexity of computably enumerable sets, with respect to density. With respect to a strong and a weak form of compression, we examine the gain: the amount of auxiliary information embedded in the compressed enumeration. Strong compression and weak gainless compression is constructed for any computably enumerable set, and a positional game is studied toward understanding strong gainless compression.

Full paper at:

Supported by NSFC grant No. 11971501.

Extended abstract. Consider the even number game: given k>1, in each round:

Any player that is unable to make a move loses the game. Who wins?

This problem is open for k>3 (player 1 has a winning strategy for k\leq 3).

Such positional games are key in understanding compressibility of enumerations.

Problem: given an effective enumeration of A\subseteq\mathbb{N}, obtain a compression of it, in the form of an effective enumeration of another set D which:

essentially contains the information in A, but in a compact form.

This turns out to be the key to an open problem in Kolmogorov complexity (see below) but it is probably more interesting in its own right. By ‘essentially’ we mean indifference to finite errors: if D\upharpoonright_{\ell_n}, \ell_n\leq n is supposed to encode A\upharpoonright_{n}, we merely require that some computable f maps D\upharpoonright_{\ell_n} to an n-bit string f(D\upharpoonright_{\ell_n}) whose Hamming-distance from A\upharpoonright_n is bounded. Since A, D are computably enumerable (c.e.), this can be concisely expressed as C(A\upharpoonright_n\mid D\upharpoonright_{\ell_n})=\textrm{\textup O}\hspace{0.02cm}(1) where C(\sigma\mid\tau) denotes the conditional Kolmogorov complexity: the length of the shortest program that can generate \sigma from input \tau. By D being more compact than A we mean that \ell_n is considerably smaller than n or, at least, |\hspace{0.03cm}{D\upharpoonright_n}\hspace{0.03cm}| is considerably smaller than |\hspace{0.03cm}{A\upharpoonright_n}\hspace{0.03cm}|.

Given c.e. A,D, we say that D is a

Replacing n/2 by \epsilon n for \epsilon\in (0,1), in the same way we can define \epsilon-compression. If every c.e. set has a compression, by iteration it also has an \epsilon-compression; the same is true of strong compression. The simpler form often suffices for our purpose, as it can be iterated.

Theorem 1. Given any c.e. A we can effectively enumerate a strong compression of it.

This process is relatively simple but introduces additional information, and which we call the gain, which is not recoverable from the source enumeration.

We say that a compression is gainless if it has finite gain, where the

Obtaining gainless compressions is considerably harder and, useful for resolving problems about the Kolmogorov complexity of c.e. sets.

Theorem 2. Given any c.e. A we can enumerate a gainless compression D\subseteq A of it.

The halting problem has a gainless strong \epsilon-compression, for arbitrary \epsilon\in (0,1).

Open: we do not know if every c.e. set has a gainless strong compression.

Three preorders comparing the descriptive complexity of reals are \leq_{rK}, \leq_K, \leq_C: A\leq_{rK} B\iff C(A\upharpoonright_n\ \mid B\upharpoonright_n)=O(1) A\leq_{K} B\iff \forall n,\ K(A\upharpoonright_n)= K(B\upharpoonright_n)+O(1) where K denotes the Kolmogorov complexity with respect to prefix-free machines and \leq_{C} is defined similarly to \leq_K.

The question of density of c.e. sets with respect to \leq_{rK}, \leq_K, \leq_C is open.

We show that density holds amongst the c.e. sets that have a gainless strong compression.