Functional Information
I will review here the basic concepts of functional information and of its different forms.
Information is a rather vague concept. Here, and in all the discussions on this site, I will use a rather specific and well defined meaning of the word “information”, which is well summarized in this paragraph from the related Wikipedia page:
Information resolves uncertainty. The uncertainty of an event is measured by its probability of occurrence and is inversely proportional to that. The more uncertain an event, the more information is required to resolve uncertainty of that event. The bit is a typical unit of information, but other units such as the nat may be used. Example: information in one “fair” coin flip: log2(2/1) = 1 bit, and in two fair coin flips is log2(4/1) = 2 bits.
Let’s try to understand this better. Let’s say that I have a safe which can be opened by an eight-digit key. If we have no other specific information about the key, any eight-digit sequence could be it. In other words, if we want to find the key by a random search, there are 10^8 different configurations to try. The probability of getting the right sequence in one random attempt is:
1:10^8 = 0.00000001 = 1.00E-08 = 2^-26.57
This probability is usually expressed as information in bits, computed as -log2(probability). In this case, we therefore have 26.57 bits of information. The meaning is simple:
If we only know that the key is an eight-digit number, our uncertainty about its true value is 2^26.57 (that is, 10^8 equally likely possibilities), and the probability of finding it in a single random attempt is 2^-26.57. If we know the specific sequence, our uncertainty becomes 0, so it is reduced by 26.57 bits. Therefore, we can say that the information conveyed by knowing the specific solution, as compared to knowing only the generic fact that the solution is an eight-digit sequence, is 26.57 bits.
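The arithmetic of the safe example can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative and not part of the discussion above. Note that log2(10^8) is roughly 26.575, which the text writes as 26.57.

```python
import math

digits = 8                       # length of the key in the safe example
search_space = 10 ** digits      # 100,000,000 possible eight-digit sequences
probability = 1 / search_space   # chance of hitting the key in one random attempt
bits = -math.log2(probability)   # the same probability expressed as bits of information

print(f"configurations: {search_space}")
print(f"probability:    {probability:.2E}")
print(f"information:    {bits:.3f} bits")
```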
This example allows us to define some basic concepts:
a) The search space (SS) is the whole set of different configurations which can be found in the specified system. In our example, the search space contains 10^8 (2^26.57) sequences, because we know that the solution is an eight-digit sequence, but we have no further information about it.
b) The target space (TS) is the whole set of different configurations in the SS which are a solution to the requirement which defines the search. In our case, the target space is made of one sequence. However, let’s say that there are 5 different sequences which work as a key to the safe. In that case, the target space is the set of those 5 sequences. The TS is, by definition, a subset of the SS.
c) The specification is the requirement which defines the TS in the SS. In our case, it is the ability of a sequence to be a key for our safe.
d) The system is the physical setting where the search takes place. In our case, the system requires at least: the safe, and some environment where eight-digit sequences can be randomly generated.
So, we can give our first important definition:
- 1. Given the above definitions, Specified Information (SI) in bits is -log2 of the ratio of the size of the target space to the size of the search space (TS/SS, the probability of finding a solution in one random attempt). In our case, SI for the search of a key to the safe is 26.57 bits if the TS is of 1 sequence, 24.25 bits if the TS is of 5 sequences.
And the important corollary:
- 1a) Any measurement of SI is relative to a specific system, and to a specific specification.
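Definition 1 and its corollary can be sketched as a small Python function. The function name and the validation rule are assumptions made for illustration, and the six-digit safe is a hypothetical second system, added only to show that the same kind of target gives a different SI when the system changes.

```python
import math

def specified_information(target_size: int, search_size: int) -> float:
    """Return SI in bits, i.e. -log2(TS / SS), for the given space sizes."""
    # The target space must be a non-empty subset of the search space.
    if not 0 < target_size <= search_size:
        raise ValueError("need 0 < target_size <= search_size")
    # -log2(TS / SS) rewritten as log2(SS) - log2(TS)
    return math.log2(search_size) - math.log2(target_size)

# The eight-digit safe from the text, with one or five working keys.
one_key = specified_information(1, 10**8)
five_keys = specified_information(5, 10**8)

# Hypothetical six-digit safe: a different system, hence a different SI.
six_digit_safe = specified_information(1, 10**6)

print(f"{one_key:.3f} {five_keys:.3f} {six_digit_safe:.3f}")
```

Working with the two space sizes rather than a precomputed probability keeps the inputs as exact integers until the logarithm is taken.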
Now, let’s define another important aspect: complexity. In continuous form, the complexity of some SI is simply its value in bits: the higher the value, the higher the complexity. However, it is often useful to categorize complexity as a binary variable. To do that, we must choose some threshold, which must have some particular meaning for us in the context we are analyzing. Given the threshold, all values of SI which are, say, higher than the threshold will be categorized as “complex”, while all other values will be categorized as “non complex”. So, if our threshold is, say, 20 bits, then our value of 26.57 bits will be considered complex. That brings us to our second important definition:
- 2. Complex Specified Information (CSI) is any measurement of SI, in a defined system and for a defined specification, which is higher than a pre-defined threshold.
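The binary categorization in definition 2 amounts to a single comparison. The sketch below uses the 20-bit example threshold from the text; the names `THRESHOLD_BITS` and `is_complex` are illustrative assumptions, and any real threshold has to be chosen for the context being analyzed.

```python
import math

THRESHOLD_BITS = 20.0  # example threshold from the text, not a universal value

def is_complex(si_bits: float, threshold_bits: float = THRESHOLD_BITS) -> bool:
    """Return True when the SI value is strictly higher than the threshold."""
    return si_bits > threshold_bits

# SI of the eight-digit safe with a single working key: log2(10^8) bits.
si_safe = math.log2(10**8)

print(is_complex(si_safe))        # safe example against the 20-bit threshold
print(is_complex(si_safe, 30.0))  # same SI against a stricter, hypothetical threshold
```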