What is “Information Theory”?
“Information Theory” answers two fundamental questions in communication theory:
• What is the ultimate data compression? (the entropy H)
• What is the ultimate transmission rate of communication? (the channel capacity C)
It provides the most basic theoretical foundations of communication theory.
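These two quantities can be made concrete in a few lines. A minimal sketch (not part of the original slides) computing the entropy H of a discrete distribution directly from its definition, H = -Σ p·log2(p):

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# A fair coin carries 1 bit per toss; a biased coin carries less,
# so its outcomes can, on average, be compressed below 1 bit each.
print(entropy([0.5, 0.5]))  # 1.0
print(entropy([0.9, 0.1]))  # ~0.469
```

The second value is the compression floor for a Bernoulli(0.9) source: no lossless code can average fewer bits per symbol.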
Moreover, “Information Theory” intersects:
• Physics (Statistical Mechanics)
• Mathematics (Probability Theory)
• Electrical Engineering (Communication Theory)
• Computer Science (Algorithmic Complexity)
• Economics (Portfolio / Game Theory)
This is why you should learn “Information Theory”.
Electrical Engineering (Communication Theory)

In the early 1940s, Shannon proved that the probability of transmission error could be made nearly zero for all communication rates below the “Channel Capacity”.

Source (entropy H) → Channel (capacity C) → Destination

The capacity C can be computed simply from the noise characteristics of the channel (described by conditional probabilities).
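For the simplest noisy channel, the binary symmetric channel, this computation has a closed form, C = 1 - H(p). The channel choice below is my assumption for illustration, not from the text:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel that flips each bit
    with crossover probability p: C = 1 - H(p)."""
    return 1.0 - h2(p)

print(bsc_capacity(0.0))  # 1.0  (noiseless: one full bit per use)
print(bsc_capacity(0.1))  # ~0.531
print(bsc_capacity(0.5))  # 0.0  (output independent of input)
```

Note the capacity depends only on the conditional probabilities p(y|x), exactly as the slide states.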
Shannon further argued that random processes (signals) such as music and speech have an irreducible complexity below which the signal cannot be compressed. This he named the “Entropy”.

Shannon argued that if the entropy of the source is less than the capacity of the channel, then asymptotically (in a probabilistic sense) error-free communication can be achieved.
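A quick numeric check of this condition, with source and channel parameters assumed for the sketch: a Bernoulli(0.9) source against a binary symmetric channel with crossover 0.1:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

source_entropy = h2(0.9)        # ~0.469 bits/symbol
channel_capacity = 1 - h2(0.1)  # ~0.531 bits/channel use

# H < C, so by Shannon's argument this source can be carried over this
# channel (one symbol per use) with asymptotically vanishing error.
print(source_entropy < channel_capacity)  # True
```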
Computer Science (Kolmogorov Complexity)

Kolmogorov, Chaitin, and Solomonoff put forward the idea that the complexity of a string of data can be defined by the length of the shortest binary computer program for computing the string.

The “Complexity” is the “Minimum Description Length”!

This definition of complexity is universal, that is, computer-independent, and is of fundamental importance. “Kolmogorov Complexity” lays the foundation for the theory of “descriptive complexity”.
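Kolmogorov complexity itself is uncomputable, but a crude, computable stand-in (my illustration, not from the slides) is the output length of a general-purpose compressor, which upper-bounds the minimum description length:

```python
import os
import zlib

# A highly regular string has a very short description
# ("repeat b'01' 5000 times"), while typical random bytes have none;
# compressed length is a computable upper bound on description length.
regular = b"01" * 5000             # 10,000 bytes, near-zero complexity
random_bytes = os.urandom(10_000)  # 10,000 bytes, essentially incompressible

print(len(zlib.compress(regular, 9)))       # a few dozen bytes
print(len(zlib.compress(random_bytes, 9)))  # roughly 10,000 bytes
```

The gap between the two outputs is the point: same length, wildly different shortest descriptions.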
Gratifyingly, the Kolmogorov complexity K is approximately equal to the Shannon entropy H if the sequence is drawn at random from a distribution that has entropy H.

Kolmogorov complexity is considered to be more fundamental than Shannon entropy. It is the ultimate data compression and leads to a logically consistent procedure for inference.
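This correspondence can be probed numerically; the sketch below is my own (parameters assumed), and since deflate is not an entropy-optimal code, the measured rate lands somewhere between H and the raw 8 bits/symbol, not exactly at H:

```python
import math
import random
import zlib

random.seed(0)
n = 100_000
p = 0.9  # Bernoulli(0.9) source; entropy rate h2(0.9) ≈ 0.469 bits/symbol

data = bytes(1 if random.random() < p else 0 for _ in range(n))
entropy_rate = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Compressed length upper-bounds Kolmogorov complexity up to an additive
# constant; no lossless code can beat the entropy rate on average.
compressed_rate = 8 * len(zlib.compress(data, 9)) / n
print(entropy_rate, compressed_rate)
```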
One can think about computational complexity (time complexity) and Kolmogorov complexity (program length, or descriptive complexity) as two axes corresponding to program running time and program length. Kolmogorov complexity focuses on minimizing along the second axis, and computational complexity focuses on minimizing along the first.

Little work has been done on the simultaneous minimization of the two.
Mathematics (Probability Theory and Statistics)

The fundamental quantities of Information Theory – Entropy, Relative Entropy, and Mutual Information – are defined as functionals of probability distributions.

In turn, they characterize the behavior of long sequences of random variables and allow us to estimate the probabilities of rare events and to find the best error exponent in hypothesis tests.
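As one concrete functional, mutual information I(X;Y) = Σ p(x,y) log2[p(x,y) / (p(x)p(y))] can be computed directly from a joint pmf. A minimal sketch, with distributions chosen only for illustration:

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a joint pmf given as a dict {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal of X
        py[y] = py.get(y, 0.0) + p  # marginal of Y
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# X = Y uniformly on {0, 1}: knowing Y reveals X fully, so I = H(X) = 1 bit.
perfectly_dependent = {(0, 0): 0.5, (1, 1): 0.5}
# Independent fair bits: knowing Y says nothing about X, so I = 0.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(mutual_information(perfectly_dependent))  # 1.0
print(mutual_information(independent))          # 0.0
```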
Computation vs. Communication

As we build larger computers out of smaller components, we encounter both a computation limit and a communication limit. Computation is communication-limited, and communication is computation-limited. These become intertwined, and thus all of the developments in communication theory via information theory should have a direct impact on the theory of computation.
New Trends in Information Theory

• Compress each of many sources separately and then put the compressed descriptions together into a joint reconstruction of the sources (the Slepian-Wolf theorem).
• If one has many senders sending information independently to a common receiver, what is the channel capacity of this “Multiple-Access Channel”? (the Liao and Ahlswede theorem)
• If one has one sender and many receivers and wishes to communicate (perhaps different) information simultaneously to each receiver, what is the channel capacity of this “Broadcast Channel”?
• If one has an arbitrary number of senders and receivers in an environment of interference and noise, what is the capacity region of achievable rates from the various senders to the receivers?
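The Slepian-Wolf result mentioned above has a precise rate region: (R1, R2) is achievable iff R1 ≥ H(X|Y), R2 ≥ H(Y|X), and R1 + R2 ≥ H(X,Y). A small sketch with an assumed joint distribution (the numbers are mine, not from the text):

```python
import math

def H(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative joint pmf of two correlated binary sources.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
px = [0.5, 0.5]  # marginal of X
py = [0.5, 0.5]  # marginal of Y

Hxy = H(joint.values())   # joint entropy, ~1.469 bits
Hx_given_y = Hxy - H(py)  # ~0.469 bits
Hy_given_x = Hxy - H(px)  # ~0.469 bits

def achievable(r1, r2):
    """Slepian-Wolf region: separate encoders, one joint decoder."""
    return r1 >= Hx_given_y and r2 >= Hy_given_x and r1 + r2 >= Hxy

# Compressing separately needs H(X) + H(Y) = 2 bits total; jointly,
# H(X,Y) ≈ 1.469 bits total suffices with no link between the encoders.
print(achievable(1.0, 0.5), achievable(0.4, 0.4))  # True False
```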