SpringerBriefs in Philosophy
This work is subject to copyright. All rights are solely and exclusively
licensed by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in
any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The publisher, the authors and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
1. Logical Entropy
David Ellerman1
(1) Faculty of Social Sciences, University of Ljubljana, Ljubljana,
Slovenia
Abstract
This book presents a new foundation for information theory where the
notion of information is defined in terms of distinctions, differences,
distinguishability, and diversity. The direct measure is logical entropy
which is the quantitative measure of the distinctions made by a
partition. Shannon entropy is a transform or re-quantification of logical
entropy for Claude Shannon’s “mathematical theory of
communications.” The interpretation of the logical entropy of a partition is the two-draw probability of getting a distinction of the partition (a pair of elements distinguished by the partition), so it realizes a dictum of Gian-Carlo Rota. […]
1.1 Introduction
This book develops the logical theory of information-as-distinctions.
The atom of information is: “This is different from that.” It can be seen
as the application of the logic of partitions [7] to information theory.
Partitions are dual (in a category-theoretic sense) to subsets. George
Boole developed the notion of logical probability [5] as the normalized
counting measure on subsets in his logic of subsets. This chapter
develops the normalized counting measure on partitions as the
analogous quantitative treatment in the logic of partitions—so that
measure will be called “logical entropy.” That measure is a new logical
derivation of an old formula measuring diversity and distinctions, e.g.,
Corrado Gini’s index of mutability or diversity [10], that goes back to
the early twentieth century.1
This raises the question of the relationship of logical entropy to
Claude Shannon’s entropy [24]. The entropies are closely related since they are both ultimately based on the concept of information-as-distinctions, but they represent two different ways to quantify
distinctions. Logical entropy directly counts the distinctions (as defined
in partition logic) whereas Shannon entropy, in effect, counts the
minimum number of binary partitions (or yes/no questions) it takes,
on average, to make the same distinctions. Since that gives (in standard
examples) a binary code for the distinct entities, the Shannon theory is
perfectly adapted for applications to the theory of coding and
communications.
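To make that counting interpretation concrete, here is a minimal sketch (the distribution and the prefix-free code below are illustrative choices, not taken from the text): when the probabilities are powers of 1/2, the Shannon entropy coincides exactly with the average number of yes/no questions, i.e., code-word bits, needed to identify an outcome.

```python
# A minimal sketch: for a distribution whose probabilities are powers of 1/2,
# the Shannon entropy H(p) = sum_i p_i * log2(1/p_i) equals the average number
# of yes/no questions (code-word bits) needed to identify an outcome.
from math import log2

p = {"a": 1/2, "b": 1/4, "c": 1/8, "d": 1/8}          # hypothetical distribution
code = {"a": "0", "b": "10", "c": "110", "d": "111"}  # a prefix-free binary code

H = sum(pi * log2(1 / pi) for pi in p.values())
avg_len = sum(p[x] * len(code[x]) for x in p)

print(H, avg_len)   # both are 1.75 bits
```

For general distributions the entropy is a lower bound on the achievable average code length rather than an exact match.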
The logical theory and the Shannon theory are also related in their
compound notions of joint entropy, conditional entropy, and mutual
information. Logical entropy is defined as a (probability) measure on a
set in the usual mathematical sense (e.g., it is non-negative), so, as with
any measure, the compound formulas satisfy the usual Venn diagram
relationships. The compound notions of Shannon entropy were defined,
without any mention of a set, so that they also satisfy similar Venn
diagram relationships. When extended to three or more partitions or
random variables, the Shannon mutual information can have negative
values. There is a more general measure notion, a signed measure, that
can take on negative values and signed measures also satisfy all the
usual Venn diagram relationships (where some areas in the diagrams
may be negative). Thus logical entropy is defined as a certain
(probability) measure on a set, while Shannon entropy is defined
independently of any given set. But given the Venn diagram
relationships between the compound Shannon entropies, a set can be
constructed ex post so that those Shannon entropies are the values of a
signed measure on that set.
There is also a monotonic but non-linear transformation of formulas
that transforms each of the logical entropy compound formulas into the
corresponding Shannon entropy compound formula, and the transform
preserves the Venn diagram relationships. This “dit-bit transform” is
heuristically motivated by showing how average counts of distinctions
(“dits”) can be converted into average counts of binary partitions (“bits”).
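As a rough sketch of that conversion (using only the standard simple-entropy formulas h(p) = Σ_i p_i (1 − p_i) and H(p) = Σ_i p_i log_2(1/p_i); the function names below are mine), the transform amounts to replacing the average dit-count (1 − p_i) by the bit-count log_2(1/p_i) inside the same probability-weighted sum:

```python
# Sketch of the dit-bit transform idea for the simple entropies:
# logical entropy h(p) = sum_i p_i * (1 - p_i) and Shannon entropy
# H(p) = sum_i p_i * log2(1/p_i) have the same "average over outcomes" shape;
# replacing each (1 - p_i) by log2(1/p_i) turns the first formula into the second.
from math import log2

def logical_entropy(p):
    return sum(pi * (1 - pi) for pi in p)

def shannon_entropy(p):
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

def transformed(p, dit_to_bit=lambda pi: log2(1 / pi)):
    # same summation as logical_entropy, with (1 - p_i) swapped for log2(1/p_i)
    return sum(pi * dit_to_bit(pi) for pi in p if pi > 0)

p = [0.5, 0.25, 0.25]                      # hypothetical distribution
print(logical_entropy(p))                  # 0.625
print(shannon_entropy(p), transformed(p))  # both 1.5
```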
This view even has an interesting history. In his book The Information: A History, A Theory, A Flood, James Gleick noted the focus on differences in the work of the seventeenth-century polymath John Wilkins, a founder of the Royal Society. In 1641, the year before Isaac
Newton was born, Wilkins published one of the earliest books on
cryptography, Mercury or the Secret and Swift Messenger, which not only
pointed out the fundamental role of differences but noted that any
(finite) set of different things could be encoded by words in a binary
code.
For in the general we must note, That whatever is capable of a
competent Difference, perceptible to any Sense, may be a
sufficient Means whereby to express the Cogitations. It is more
convenient, indeed, that these Differences should be of as great
Variety as the Letters of the Alphabet; but it is sufficient if they
be but twofold, because Two alone may, with somewhat more
Labour and Time, be well enough contrived to express all the
rest. [27, Chap. XVII, p. 69]
As Gleick noted: […]
Classical finite probability theory [18] also dealt with the case where the outcomes were assigned real point probabilities p = (p_1, …, p_n), so rather than summing the equal probabilities 1/|U|, the point probabilities of the outcomes in the event are summed: Pr(S) = Σ_{u_i ∈ S} p_i.
Our claim is quite simple; the analogue to the size of a subset is the size of the ditset, the set of distinctions, of a partition.3 The normalized size of a subset is the logical probability of the event, and the normalized size of the ditset of a partition is, in the sense of measure theory, “the measure of the amount of information” in a partition. Thus we define the logical entropy of a partition π, denoted h(π), as the normalized size of its ditset: h(π) = |dit(π)| / |U × U|.
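A minimal computational sketch of this definition (the function and variable names are my own): represent a partition as a list of blocks, count the ordered pairs of elements that lie in different blocks, and normalize by |U × U|.

```python
# Sketch of h(pi) = |dit(pi)| / |U x U|, where dit(pi) is the set of ordered
# pairs (u, u') whose elements lie in different blocks of the partition pi.
from itertools import product

def logical_entropy(blocks):
    U = [u for block in blocks for u in block]
    block_of = {u: i for i, block in enumerate(blocks) for u in block}
    dits = [(u, v) for u, v in product(U, U) if block_of[u] != block_of[v]]
    return len(dits) / (len(U) ** 2)

# Example: U = {1,...,4} partitioned into blocks {1,2} and {3,4}.
print(logical_entropy([{1, 2}, {3, 4}]))   # 8/16 = 0.5
# Equivalently, 1 minus the sum of squared block probabilities:
# 1 - (1/2)**2 - (1/2)**2 = 0.5.
```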
Since we have an easy set-of-blocks definition for the join, we also have dit(π ∨ σ) = dit(π) ∪ dit(σ): the ditset of the join is the union of the ditsets.
[Table: the analogy between subsets S ⊆ U and ditsets dit(π) ⊆ U × U]
where:

Atoms          x ≠ x′   y ≠ y′   X ≢ Y   X ⊃ Y
S_{X∧Y}          T        T        F       T
S_{X∧¬Y}         T        F        T       F
S_{¬X∧Y}         F        T        T       T
S_{¬X∧¬Y}        F        F        F       T
For n = 2 variables X and Y, there are 2^(2^2) = 16 ways to fill in the T’s and F’s to define all the possible Boolean combinations of the two propositions, so there are 16 subsets in the information algebra I(X × Y). The 15 nonempty subsets in I(X × Y) are defined in disjunctive normal form by the union of the atoms that have a T in their row. For instance, the set S_{X≢Y} corresponding to the symmetric difference or inequivalence is S_{X≢Y} = S_{X∧¬Y} ∪ S_{¬X∧Y}.
The information algebra I(X × Y) is a finite combinatorial structure defined solely in terms of X × Y using only the distinctions and indistinctions between the elements of X and Y. Any equivalence between Boolean expressions that is a tautology, e.g., X ≢ Y ≡ (X ∧ ¬Y) ∨ (¬X ∧ Y), gives a set identity in the information Boolean algebra, e.g., S_{X≢Y} = S_{X∧¬Y} ∪ S_{¬X∧Y}. Since that union is disjoint, any probability distribution on X × Y gives the logically necessary identity h(X ≢ Y) = h(X|Y) + h(Y|X) (see below).
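The following sketch (under my reading of the atom table above; the finite sets X and Y are arbitrary illustrative choices) builds the four atoms on (X × Y) × (X × Y) explicitly and checks that S_{X≢Y} is the disjoint union of S_{X∧¬Y} and S_{¬X∧Y}:

```python
# Each atom collects the pairs of pairs ((x, y), (x', y')) classified by
# whether there is an X-distinction (x != x') and/or a Y-distinction (y != y').
from itertools import product

X, Y = ["x1", "x2"], ["y1", "y2", "y3"]          # hypothetical finite sets
pairs = list(product(X, Y))
universe = list(product(pairs, pairs))

atom = {
    ("X", "Y"):       {(p, q) for p, q in universe if p[0] != q[0] and p[1] != q[1]},
    ("X", "notY"):    {(p, q) for p, q in universe if p[0] != q[0] and p[1] == q[1]},
    ("notX", "Y"):    {(p, q) for p, q in universe if p[0] == q[0] and p[1] != q[1]},
    ("notX", "notY"): {(p, q) for p, q in universe if p[0] == q[0] and p[1] == q[1]},
}

# The symmetric-difference set is the disjoint union of two of the atoms,
# and the four atoms together partition the whole universe of pairs of pairs.
S_xor = atom[("X", "notY")] | atom[("notX", "Y")]
assert atom[("X", "notY")].isdisjoint(atom[("notX", "Y")])
assert sum(len(a) for a in atom.values()) == len(universe)
print(len(S_xor), len(universe))   # 18 of the 36 pairs of pairs
```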
In addition to the logically necessary relationships between the logical entropies, other relationships may hold depending on the particular probability distribution on X × Y. Even though all 15 subsets in the information algebra aside from the empty set ∅ are always nonempty, some of the logical entropies can still be 0. Indeed, h(X) = 0 iff the marginal distribution on X has p(x) = 1 for some x ∈ X. These more specific relationships will depend not just on the infosets but also on their positive supports (which depend on the probability distribution).
[…] ρ.” [13, p. 561] As noted by Bhargava and Uppuluri [4], the formula Σ_i p_i^2 was used by Gini in 1912 [10] as a measure of “mutability” or diversity. During World War II, Alan Turing used the same formula in his cryptanalysis work and called it the repeat rate since it is the probability of a repeat in a pair of independent draws from a population with those probabilities (i.e., the identification probability Σ_i p_i^2). Polish cryptanalysts had independently used the repeat rate in their work on the Enigma [21]. After the war, Edward H. Simpson, a British statistician, proposed Σ_i p_i^2 as a measure of species concentration (the opposite of diversity).
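A quick illustrative check (the population frequencies are made up) that Σ_i p_i^2 is exactly the two-draw repeat probability:

```python
# Verify that sum_i p_i**2 is the probability of drawing the same outcome
# twice in two independent draws.
from itertools import product

p = {"A": 0.5, "B": 0.3, "C": 0.2}   # hypothetical population frequencies

repeat_rate = sum(pi ** 2 for pi in p.values())
by_enumeration = sum(p[i] * p[j] for i, j in product(p, p) if i == j)

print(repeat_rate, by_enumeration)   # both 0.38
# The complement, 1 - 0.38 = 0.62, is the logical entropy: the probability
# that two independent draws are distinct.
```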
References
1. Aczel, J., and Z. Daroczy. 1975. On Measures of Information and Their Characterization. New York: Academic Press.
2. Adriaans, Pieter, and Johan van Benthem, eds. 2008. Philosophy of Information. Vol. 8, Handbook of the Philosophy of Science. Amsterdam: North-Holland.
3. Bennett, Charles H. 2003. Quantum Information: Qubits and Quantum Error Correction. International Journal of Theoretical Physics 42: 153–176. https://doi.org/10.1023/A:1024439131297.
4. Bhargava, T. N., and V. R. R. Uppuluri. 1975. On an axiomatic derivation of Gini diversity, with applications. Metron 33: 41–53.
5. Boole, George. 1854. An Investigation of the Laws of Thought on which are founded the Mathematical Theories of Logic and Probabilities. Cambridge: Macmillan and Co.
6. Ellerman, David. 2009. Counting Distinctions: On the Conceptual Foundations of Shannon’s Information Theory. Synthese 168: 119–149. https://doi.org/10.1007/s11229-008-9333-7.
7. Ellerman, David. 2014. An introduction to partition logic. Logic Journal of the IGPL 22: 94–125. https://doi.org/10.1093/jigpal/jzt036.
8. Ellerman, David. 2017. Logical Information Theory: New Foundations for Information Theory. Logic Journal of the IGPL 25 (5 Oct.): 806–835.
9. Friedman, William F. 1922. The Index of Coincidence and Its Applications in Cryptography. Geneva, IL: Riverbank Laboratories.
10. Gini, Corrado. 1912. Variabilità e mutabilità. Bologna: Tipografia di Paolo Cuppini.
11. Gleick, James. 2011. The Information: A History, A Theory, A Flood. New York: Pantheon.
12. Good, I. J. 1979. A. M. Turing’s statistical work in World War II. Biometrika 66: 393–396.
13. Good, I. J. 1982. Comment (on Patil and Taillie: Diversity as a Concept and its Measurement). Journal of the American Statistical Association 77: 561–563.
14. Havrda, Jan, and Frantisek Charvat. 1967. Quantification Methods of Classification Processes: Concept of Structural α-Entropy. Kybernetika (Prague) 3: 30–35.
15. Kolmogorov, Andrei N. 1983. Combinatorial Foundations of Information Theory and the Calculus of Probabilities. Russian Math. Surveys 38: 29–40.
16. Kullback, Solomon. 1976. Statistical Methods in Cryptanalysis. Walnut Creek, CA: Aegean Park Press.
17. Kung, Joseph P. S., Gian-Carlo Rota, and Catherine H. Yan. 2009. Combinatorics: The Rota Way. New York: Cambridge University Press.
18. Laplace, Pierre-Simon. 1995. Philosophical Essay on Probabilities. Translated by A. I. Dale. New York: Springer Verlag.
19. Rao, C. R. 1982a. Gini-Simpson Index of Diversity: A Characterization, Generalization and Applications. Utilitas Mathematica B 21: 273–282.
20. Rao, C. R. 1982b. Diversity and Dissimilarity Coefficients: A Unified Approach. Theoretical Population Biology 21: 24–43.
21. Rejewski, M. 1981. How Polish Mathematicians Deciphered the Enigma. Annals of the History of Computing 3: 213–234.
22. Ricotta, Carlo, and Laszlo Szeidl. 2006. Towards a unifying approach to diversity measures: Bridging the gap between the Shannon entropy and Rao’s quadratic index. Theoretical Population Biology 70: 237–243. https://doi.org/10.1016/j.tpb.2006.06.003.
23. Rota, Gian-Carlo. 2001. Twelve problems in probability no one likes to bring up. In Algebraic Combinatorics and Computer Science: A Tribute to Gian-Carlo Rota, ed. Henry Crapo and Domenico Senato, 57–93. Milano: Springer.
24. Shannon, Claude E. 1948. A Mathematical Theory of Communication. Bell System Technical Journal 27: 379–423, 623–656.
25. Simpson, Edward Hugh. 1949. Measurement of Diversity. Nature 163: 688.
26. Tsallis, Constantino. 1988. Possible Generalization of Boltzmann-Gibbs Statistics. J. Stat. Physics 52: 479–487.
27. Wilkins, John. 1707 (1641). Mercury or the Secret and Swift Messenger. London.
Footnotes
1 Many of the results about logical entropy were developed in [6] and [8].
2 Logical information theory is about what Adriaans and van Benthem call
“Information B: Probabilistic, information-theoretic, measured quantitatively”, not
about “Information A: knowledge, logic, what is conveyed in informative answers”
where the connection to philosophy and logic is built-in from the beginning.
Likewise, this book is not about Kolmogorov-style “Information C: Algorithmic, code
compression, measured quantitatively.” [2, p. 11]
Abstract
This chapter focuses on developing the basic notion of Shannon entropy and its interpretation in terms of distinctions, i.e., the minimum
average number of yes-or-no questions that must be answered to
distinguish all the “messages.” Thus Shannon entropy is also a
quantitative indicator of information-as-distinctions, and, accordingly, a
“dit-bit transform” is defined that turns any simple, joint, conditional,
or mutual logical entropy into the corresponding notion of Shannon
entropy. One of the delicate points is that while logical entropy is
always a non-negative measure in the sense of measure theory (indeed,
a probability measure), we will later see that for three or more random
variables, the Shannon mutual information can be negative. This means
that Shannon entropy can in general be characterized only as a signed
measure, i.e., a measure that can take on negative values.
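As an illustration of that last point (the standard XOR example, not taken from this book), take X and Y to be independent fair bits and Z = X XOR Y; the inclusion-exclusion expression for the tri-variate mutual information then comes out to minus one bit:

```python
# For X, Y independent fair bits and Z = X XOR Y, the tri-variate Shannon
# mutual information I(X;Y;Z) is negative, so the Shannon quantities can only
# be the values of a *signed* measure.
from itertools import product
from math import log2

def H(joint):
    # Shannon entropy of a dict mapping outcomes to probabilities.
    return sum(p * log2(1 / p) for p in joint.values() if p > 0)

# Joint distribution of (X, Y, Z) with Z = X XOR Y.
joint = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], [0, 1])}

def marginal(keep):
    m = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        m[key] = m.get(key, 0) + p
    return m

Hx, Hy, Hz = (H(marginal([i])) for i in range(3))
Hxy, Hxz, Hyz = H(marginal([0, 1])), H(marginal([0, 2])), H(marginal([1, 2]))
Hxyz = H(joint)

# Inclusion-exclusion for the tri-variate mutual information.
I3 = Hx + Hy + Hz - Hxy - Hxz - Hyz + Hxyz
print(I3)   # -1.0 bit
```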
A common claim is that Shannon’s entropy has the same functional form as entropy in statistical mechanics, but that claim is simply false: the connection is only through an approximation (Stirling’s approximation of the factorials in the statistical-mechanics formula). Taking a few more terms would give a humongous formula that is a “more accurate approximation” [8, p. 2].
There is another intuitive approach to Shannon entropy that might be mentioned. When an outcome with probability p_i occurs, it is said that the “surprise factor” is log_2(1/p_i). These surprise factors then need to be averaged, weighting each by its probability of occurring, so H(p) = Σ_i p_i log_2(1/p_i).
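A one-line numerical check of this "average surprise" reading (the distribution is an illustrative choice):

```python
# H(p) is the probability-weighted average of the surprise factors log2(1/p_i).
from math import log2

p = [0.7, 0.2, 0.1]                        # hypothetical distribution
surprises = [log2(1 / pi) for pi in p]     # surprise factor of each outcome
H = sum(pi * s for pi, s in zip(p, surprises))
print(H)                                   # about 1.16 bits
```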
References
1. Campbell, L. Lorne. 1965. Entropy as a Measure. IEEE Trans. on Information Theory IT-11: 112–114.
2. Csiszar, Imre, and Janos Körner. 1981. Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic Press.
3. Doob, J. L. 1994. Measure Theory. New York: Springer Science+Business Media.
4. Halmos, Paul R. 1974. Measure Theory. New York: Springer-Verlag.
5. Hartley, R. V. L. 1928. Transmission of information. Bell System Technical Journal 7: 535–563.
6. Hu, Kuo Ting. 1962. On the Amount of Information. Probability Theory & Its Applications 7: 439–447. https://doi.org/10.1137/1107041.
7. MacArthur, Robert H. 1965. Patterns of Species Diversity. Biol. Rev. 40: 510–533.
8. MacKay, D. J. C. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press.
9. Polya, George, and Gabor Szego. 1998. Problems and Theorems in Analysis Vol. II. Berlin: Springer-Verlag.
10. Rao, K. P. S. Bhaskara, and M. Bhaskara Rao. 1983. Theory of Charges: A Study of Finitely Additive Measures. London: Academic Press.
11. Rozeboom, William W. 1968. The Theory of Abstract Partials: An Introduction. Psychometrika 33: 133–167.
12. Ryser, Herbert John. 1963. Combinatorial Mathematics. Washington, DC: Mathematical Association of America.
13. Takacs, Lajos. 1967. On the Method of Inclusion and Exclusion. Journal of the American Statistical Association 62: 102–113. https://doi.org/10.1080/01621459.1967.10482891.
14. Yeung, Raymond W. 1991. A New Outlook on Shannon’s Information Measures. IEEE Trans. on Information Theory 37: 466–474. https://doi.org/10.1109/18.79902.