31.07.2015 Views

The Arimoto-Blahut Algorithm for Calculation of Channel Capacity


$$J(Q, \Phi) \le \log\Big(\sum_i r(i)\Big), \qquad r(i) = \exp\Big[\sum_j P(j/i) \log \Phi(i/j)\Big],$$

with equality iff $Q(i) = r(i) / \sum_k r(k)$.

The double maximum in (4) can be taken in any order. Therefore,

$$C = \max_\Phi \max_Q J(Q, \Phi) = \max_\Phi \log\Big(\sum_i r(i)\Big) = \max_\Phi \log\Big(\sum_i \exp \sum_j P(j/i) \log \Phi(i/j)\Big). \tag{5}$$

The point of introducing $J(Q, \Phi)$ is that C can be calculated through the double maximization procedure (4), each step of which can be solved in closed form, as given in Lemmas 1 and 2. By contrast, the maximization of $J(Q)$ with respect to Q alone has no closed-form solution.

We can doubly maximize $J(Q, \Phi)$ in any order. For example, let us fix Q to some initial value $Q^0$. By Lemma 1, $J(Q^0, \Phi)$ is maximized over $\Phi$ at $\Phi = \Phi^0$, where

$$\Phi^0(i/j) = \frac{P(j/i) Q^0(i)}{\sum_k P(j/k) Q^0(k)}.$$

Now consider $J(Q, \Phi^0)$ for $\Phi^0$ fixed and vary Q to produce a maximum. By Lemma 2, that maximum is

$$J(Q, \Phi^0) = \log\Big(\sum_i r(i)\Big) = \log \sum_i \exp \sum_j P(j/i) \log \Phi^0(i/j)$$

for $Q(i) = Q^1(i) = r(i) / \sum_k r(k)$. Then for that $Q^1$ we can find the maximum of $J(Q^1, \Phi)$, attained at $\Phi = \Phi^1$ as given in Lemma 1 for $Q^1$ and $P(j/i)$. Fix $\Phi^1$ and maximize $J(Q, \Phi^1)$, which gives $Q = Q^2$ as in Lemma 2. Now maximize $J(Q^2, \Phi)$, and so on. At each step $J(Q, \Phi)$ does not decrease, and it converges to the capacity C.

So, we can now formulate the Arimoto-Blahut algorithm to accomplish the double maximization in (4).
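The alternation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `lemma1_phi`, `lemma2_q` and `J`, the 2x2 channel matrix, and the starting distribution are all hypothetical choices, with `P[i][j]` standing for $P(j/i)$ and natural logarithms used throughout.

```python
import math

def lemma1_phi(P, Q):
    """Lemma 1 step: Phi(i/j) = P(j/i) Q(i) / sum_k P(j/k) Q(k)."""
    n, m = len(P), len(P[0])
    out = [sum(P[k][j] * Q[k] for k in range(n)) for j in range(m)]
    return [[P[i][j] * Q[i] / out[j] if out[j] > 0 else 0.0
             for j in range(m)] for i in range(n)]

def lemma2_q(P, Phi):
    """Lemma 2 step: Q(i) = r(i) / sum_k r(k), r(i) = exp sum_j P(j/i) ln Phi(i/j)."""
    r = [math.exp(sum(p * math.log(f) for p, f in zip(P[i], Phi[i]) if p > 0))
         for i in range(len(P))]
    total = sum(r)
    return [ri / total for ri in r]

def J(P, Q, Phi):
    """J(Q, Phi) = sum_i sum_j P(j/i) Q(i) ln(Phi(i/j) / Q(i)), in nats."""
    return sum(P[i][j] * Q[i] * math.log(Phi[i][j] / Q[i])
               for i in range(len(P)) for j in range(len(P[0]))
               if P[i][j] > 0 and Q[i] > 0)

# Hypothetical example channel and strictly positive starting point Q^0.
P = [[0.9, 0.1], [0.2, 0.8]]
Q = [0.9, 0.1]

values = []
for _ in range(5):
    Phi = lemma1_phi(P, Q)          # maximize over Phi with Q fixed
    values.append(J(P, Q, Phi))
    Q = lemma2_q(P, Phi)            # maximize over Q with Phi fixed
    values.append(J(P, Q, Phi))
```

Under these assumptions, `values` records $J$ after every half-step, so it should form a nondecreasing sequence, which is the behavior the lemmas guarantee.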


Algorithm

Here l is the iteration index.

1. Set $l = 0$ and choose an initial set of input probabilities $Q^0(i) > 0$ for all i.

2. Compute

$$\Phi^l(i/j) = \frac{Q^l(i) P(j/i)}{\sum_k Q^l(k) P(j/k)} \quad \text{for all } i, j,$$

$$r^l(i) = \exp \sum_j P(j/i) \ln \Phi^l(i/j),$$

$$Q^{l+1}(i) = \frac{r^l(i)}{\sum_k r^l(k)},$$

$$J(Q^{l+1}, \Phi^l) = \ln\Big(\sum_i r^l(i)\Big).$$

3. Set $l = l + 1$ and go to 2.

This algorithm carries out the alternating maximization already explained, but it does not say how to stop the recursion. The question now is how we know when we are close to capacity. We have to introduce one more lemma.

For any l, $J(Q^{l+1}, \Phi^l) = \ln\big(\sum_i r^l(i)\big) \le C$. Let us define

$$c^l(i) \equiv \frac{r^l(i)}{Q^l(i)}.$$

Then

$$J(Q^{l+1}, \Phi^l) = \ln\Big(\sum_i Q^l(i) c^l(i)\Big).$$

So $J(Q^{l+1}, \Phi^l)$ is the logarithm of the average of the $c^l(i)$'s.

Lemma 3

$$C \le \max_i \ln c^l(i)$$
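As a rough illustration of steps 1 to 3 together with the two quantities just introduced, the sketch below runs a fixed number of iterations and records $\ln \sum_i r^l(i)$ and $\max_i \ln c^l(i)$ at each one. The function name `blahut_iterations`, the fixed iteration count, and the binary symmetric channel used as input are assumptions made for this example only. `P[i][j]` stands for $P(j/i)$.

```python
import math

def blahut_iterations(P, Q0, steps):
    """Run step 2 a fixed number of times; returns the last Q and the bound history."""
    n, m = len(P), len(P[0])
    Q = list(Q0)                      # step 1: Q^0(i) > 0 for all i
    history = []
    for _ in range(steps):
        # Output probabilities sum_k Q^l(k) P(j/k), the denominator of Phi^l(i/j).
        out = [sum(Q[k] * P[k][j] for k in range(n)) for j in range(m)]
        # ln r^l(i) = sum_j P(j/i) ln Phi^l(i/j); terms with P(j/i) = 0 contribute 0.
        ln_r = [sum(P[i][j] * math.log(Q[i] * P[i][j] / out[j])
                    for j in range(m) if P[i][j] > 0) for i in range(n)]
        r = [math.exp(v) for v in ln_r]
        total = sum(r)
        lower = math.log(total)                                   # J(Q^{l+1}, Phi^l)
        upper = max(ln_r[i] - math.log(Q[i]) for i in range(n))   # max_i ln c^l(i)
        history.append((lower, upper))
        Q = [ri / total for ri in r]                              # Q^{l+1}; step 3 loops
    return Q, history

# Hypothetical input: binary symmetric channel with crossover probability 0.1.
P_bsc = [[0.9, 0.1], [0.1, 0.9]]
Q_final, history = blahut_iterations(P_bsc, [0.7, 0.3], 30)
for lower, upper in history[:3]:
    print(lower, upper)
```

For this channel the capacity is known in closed form ($\ln 2$ minus the binary entropy of 0.1, in nats), so each recorded pair can be checked against it: the first entry should never exceed that value and the second should never fall below it.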


Therefore, the channel capacity C is bounded from below and above as follows:

$$\ln\Big(\sum_i Q^l(i) c^l(i)\Big) \le C \le \max_i \ln c^l(i).$$

In order to find C within accuracy $\epsilon$, stop the iteration when

$$\max_i \ln c^l(i) - \ln\Big(\sum_i Q^l(i) c^l(i)\Big) < \epsilon$$

or, equivalently, when

$$J(Q^{l+1}, \Phi^l) > \max_i \ln c^l(i) - \epsilon.$$

Insert in the algorithm:

2'. Calculate

$$T = \max_i \ln \frac{r^l(i)}{Q^l(i)} - J(Q^{l+1}, \Phi^l).$$

If $T \ge \epsilon$, go to 3. If $T < \epsilon$, go to 4.

4. Take $C = J(Q^{l+1}, \Phi^l)$, which is within $\epsilon$ of capacity.

It is interesting to note that $\ln c^l(i) = I(i; Y/Q^l)$, the average information that the output ensemble Y gives about the input event i when the input probabilities are $Q^l$. Recall that if

$$I(i; Y/Q) = \gamma \quad \text{for all } i \text{ s.t. } Q(i) > 0,$$

$$I(i; Y/Q) \le \gamma \quad \text{for all } i \text{ s.t. } Q(i) = 0$$

(the Kuhn-Tucker conditions), then $J(Q) = C$ and $C = \gamma$, so that $I(i; Y/Q) = C$ for every i with $Q(i) > 0$. So, when we are nearing capacity C, the information given by the output ensemble Y about each input event of nonzero probability is nearing equality, since


$$\sum_i Q^l(i) \ln c^l(i) \le \ln \sum_i Q^l(i) c^l(i) \le C \le \max_i \ln c^l(i)$$

by the convex-∩ (concave) property of ln. So we have

$$\sum_i Q^l(i) I(i; Y/Q^l) \le C \le \max_i I(i; Y/Q^l).$$

We now prove the lemmas.

Proof of Lemma 1:

Let $P(j) = \sum_k P(j/k) Q(k)$ and $P(i/j) = P(j/i) Q(i) / P(j)$. Using $\ln x \le x - 1$,

$$J(Q, \Phi) - J(Q) = \sum_i \sum_j P(j/i) Q(i) \log \frac{\Phi(i/j)}{P(i/j)}$$

$$\le \sum_i \sum_j P(j/i) Q(i) \Big[\frac{\Phi(i/j)}{P(i/j)} - 1\Big]$$

$$= \sum_i \sum_j \big[P(j) \Phi(i/j) - P(j/i) Q(i)\big] = 1 - 1 = 0,$$

with equality iff $\Phi(i/j) = P(i/j)$ for all i and j.

Proof of Lemma 2:


$$J(Q, \Phi) = \sum_i \sum_j P(j/i) Q(i) \log \frac{\Phi(i/j)}{Q(i)}$$

$$= \sum_i \sum_j P(j/i) Q(i) \log \frac{1}{Q(i)} + \sum_i \sum_j P(j/i) Q(i) \log \Phi(i/j)$$

$$= \sum_i Q(i) \log \frac{1}{Q(i)} + \sum_i Q(i) \sum_j P(j/i) \log \Phi(i/j).$$

Hence

$$J(Q, \Phi) = \sum_i Q(i) \log \frac{\exp\big(\sum_j P(j/i) \log \Phi(i/j)\big)}{Q(i)}$$

$$= \sum_i Q(i) \log \frac{r(i)}{Q(i)}, \qquad r(i) = \exp \sum_j P(j/i) \log \Phi(i/j),$$

$$= \sum_i Q(i) \log\Big(\sum_k r(k)\Big) + \sum_i Q(i) \log \frac{r(i) / \sum_k r(k)}{Q(i)}$$

$$\le \log\Big(\sum_k r(k)\Big).$$

The last step results from $\ln x \le x - 1$ applied to the second sum, with equality iff $Q(i) = r(i) / \sum_k r(k)$.

Proof of Lemma 3:

Suppose $Q^*$ achieves capacity. Then

$$C = \sum_i \sum_j P(j/i) Q^*(i) \log \frac{P(j/i)}{P^*(j)}, \qquad P^*(j) = \sum_k P(j/k) Q^*(k),$$

$$C = \sum_i \sum_j P(j/i) Q^*(i) \log \Big[\frac{P(j/i)}{P^l(j)} \times \frac{P^l(j)}{P^*(j)}\Big],$$

where $P^l(j) = \sum_k P(j/k) Q^l(k)$ are the output probabilities for $Q^l(k)$.

Continuing,


$$C = \sum_i \sum_j P(j/i) Q^*(i) \log \frac{P^l(j)}{P^*(j)} + \sum_i \sum_j P(j/i) Q^*(i) \log \frac{P(j/i)}{P^l(j)}$$

$$= \sum_j P^*(j) \log \frac{P^l(j)}{P^*(j)} + \sum_i Q^*(i) \sum_j P(j/i) \log \frac{P(j/i)}{P^l(j)}.$$

Using $\ln x \le x - 1$, the first term is $\le 0$. In the second term, we can overbound $\sum_j P(j/i) \log \frac{P(j/i)}{P^l(j)}$ by its maximum over i. Therefore,

$$C \le \max_i \sum_j P(j/i) \log \frac{P(j/i)}{P^l(j)} = \max_i I(i; Y/Q^l).$$

Recall

$$r^l(i) = \exp \sum_j P(j/i) \ln \Phi^l(i/j) = \exp \sum_j P(j/i) \ln \frac{Q^l(i) P(j/i)}{P^l(j)},$$

$$\ln c^l(i) = \ln r^l(i) - \ln Q^l(i) = \sum_j P(j/i) \ln \frac{Q^l(i) P(j/i)}{P^l(j)} - \sum_j P(j/i) \ln Q^l(i)$$

$$= \sum_j P(j/i) \ln \frac{P(j/i)}{P^l(j)} = I(i; Y/Q^l).$$

Therefore, the conclusion of the lemma can be expressed as

$$C \le \max_i \ln c^l(i),$$

as given.

Note the condition for equality. We must have $P^l(j) = P^*(j)$, or $Q^l(i) = Q^*(i)$ for nonzero $Q^*(i)$, and $I(i; Y/Q^l)$ must take the same maximum value for all i such that $Q^l(i) > 0$. These conditions are both necessary and sufficient for equality. So we have corroborated the Kuhn-Tucker conditions.
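To tie the stopping rule and the identity $\ln c^l(i) = I(i; Y/Q^l)$ together, here is a minimal sketch of the iteration with the test $T < \epsilon$ included. It is not a reference implementation: the function name `channel_capacity`, the uniform default for $Q^0$, the `max_iter` safety limit, and the Z-channel used as input are assumptions introduced for this example. `P[i][j]` stands for $P(j/i)$, and the result is in nats.

```python
import math

def channel_capacity(P, Q0=None, eps=1e-9, max_iter=10_000):
    """Iterate until T = max_i ln c^l(i) - J(Q^{l+1}, Phi^l) < eps.

    Returns (J, Q^l, [ln c^l(i)]) for the iterate at which the test passed.
    Assumes every Q^l(i) stays strictly positive (no floating-point underflow).
    """
    n, m = len(P), len(P[0])
    Q = list(Q0) if Q0 is not None else [1.0 / n] * n   # assumed default: uniform
    for _ in range(max_iter):
        out = [sum(Q[k] * P[k][j] for k in range(n)) for j in range(m)]  # P^l(j)
        # ln c^l(i) = I(i; Y/Q^l) = sum_j P(j/i) ln(P(j/i) / P^l(j))
        ln_c = [sum(P[i][j] * math.log(P[i][j] / out[j])
                    for j in range(m) if P[i][j] > 0) for i in range(n)]
        r = [Q[i] * math.exp(ln_c[i]) for i in range(n)]   # r^l(i) = Q^l(i) c^l(i)
        total = sum(r)
        J = math.log(total)          # lower bound J(Q^{l+1}, Phi^l)
        T = max(ln_c) - J            # step 2': gap between upper and lower bound
        if T < eps:
            return J, Q, ln_c        # step 4
        Q = [ri / total for ri in r] # Q^{l+1}
    raise RuntimeError("gap did not fall below eps within max_iter iterations")

# Hypothetical input: a Z-channel whose second input is flipped with probability 0.5.
P_z = [[1.0, 0.0], [0.5, 0.5]]
C_est, Q_opt, info = channel_capacity(P_z)
print(C_est, Q_opt, info)
```

At termination the entries of `info` for inputs with nonzero probability should agree closely with each other and with `C_est`, which is the numerical counterpart of the Kuhn-Tucker conditions discussed above.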
