
$$\mathcal{L}^{(D)} = -\int_{x} p_{data}(x)\log\mathcal{D}(x)\,dx - \int_{x} p_{g}(x)\log\big(1-\mathcal{D}(x)\big)\,dx \quad \text{(Equation 5.1.6)}$$

$$\mathcal{L}^{(D)} = -\int_{x} \Big(p_{data}(x)\log\mathcal{D}(x) + p_{g}(x)\log\big(1-\mathcal{D}(x)\big)\Big)\,dx \quad \text{(Equation 5.1.7)}$$

The term inside the integral is of the form $y \rightarrow a\log y + b\log(1-y)$, which has a known maximum value at $y = \frac{a}{a+b}$ for $y \in [0,1]$, for any $a, b \in \mathbb{R}$ not including $\{0,0\}$.
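As a brief worked step (standard calculus, added here for completeness), the maximizer follows by setting the derivative with respect to $y$ to zero:

$$\frac{d}{dy}\big(a\log y + b\log(1-y)\big) = \frac{a}{y} - \frac{b}{1-y} = 0 \;\Rightarrow\; a(1-y) = by \;\Rightarrow\; y = \frac{a}{a+b}$$

The second derivative, $-\frac{a}{y^{2}} - \frac{b}{(1-y)^{2}}$, is negative for $a, b > 0$, confirming a maximum. Substituting $a = p_{data}(x)$ and $b = p_{g}(x)$ at each $x$ gives the optimal discriminator below.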

Since the integral does not change the location of this maximum (or the minimum of $\mathcal{L}^{(D)}$), the optimal discriminator is:

$$\mathcal{D}^{*}(x) = \frac{p_{data}}{p_{data} + p_{g}} \quad \text{(Equation 5.1.8)}$$

Consequently, given the optimal discriminator, the loss function is:

$$\mathcal{L}^{(D^{*})} = -\mathbb{E}_{x \sim p_{data}} \log\frac{p_{data}}{p_{data}+p_{g}} - \mathbb{E}_{x \sim p_{g}} \log\left(1 - \frac{p_{data}}{p_{data}+p_{g}}\right) \quad \text{(Equation 5.1.9)}$$

$$\mathcal{L}^{(D^{*})} = -\mathbb{E}_{x \sim p_{data}} \log\frac{p_{data}}{p_{data}+p_{g}} - \mathbb{E}_{x \sim p_{g}} \log\frac{p_{g}}{p_{data}+p_{g}} \quad \text{(Equation 5.1.10)}$$

Adding and subtracting $\log 2$ inside each expectation rewrites each term as a Kullback-Leibler divergence from the mixture distribution $\frac{p_{data}+p_{g}}{2}$:

$$\mathcal{L}^{(D^{*})} = 2\log 2 - D_{KL}\left(p_{data} \,\middle\|\, \frac{p_{data}+p_{g}}{2}\right) - D_{KL}\left(p_{g} \,\middle\|\, \frac{p_{data}+p_{g}}{2}\right) \quad \text{(Equation 5.1.11)}$$

$$\mathcal{L}^{(D^{*})} = 2\log 2 - 2\,D_{JS}\left(p_{data} \,\|\, p_{g}\right) \quad \text{(Equation 5.1.12)}$$

We can observe from Equation 5.1.12 that the loss function of the optimal discriminator is a constant minus twice the Jensen-Shannon divergence between the true distribution, $p_{data}$, and any generator distribution, $p_{g}$. Minimizing $\mathcal{L}^{(D^{*})}$ implies maximizing $D_{JS}(p_{data} \,\|\, p_{g})$, or that the discriminator must correctly classify fake from real data.
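As a quick numerical sanity check of Equation 5.1.12 (a minimal sketch using NumPy and toy discrete distributions, not code from the book), we can compare the loss at the optimal discriminator against the constant-minus-divergence form:

import numpy as np

def kl(p, q):
    # Discrete KL divergence D_KL(p || q); assumes q > 0 wherever p > 0.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js(p, q):
    # Jensen-Shannon divergence via the mixture m = (p + q) / 2,
    # matching the two KL terms in Equation 5.1.11.
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy distributions standing in for p_data and p_g (hypothetical values).
p_data = np.array([0.1, 0.4, 0.5])
p_g = np.array([0.3, 0.3, 0.4])

# Optimal discriminator from Equation 5.1.8, evaluated pointwise.
d_star = p_data / (p_data + p_g)

# Loss at the optimal discriminator: the discrete form of Equation 5.1.9.
loss = -np.sum(p_data * np.log(d_star)) - np.sum(p_g * np.log(1.0 - d_star))

# Equation 5.1.12: the same loss as 2 log 2 minus twice the JS divergence.
print(loss, 2.0 * np.log(2.0) - 2.0 * js(p_data, p_g))  # both ≈ 1.3212

Both printed values agree, which is exactly the identity derived above.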

Meanwhile, we can safely argue that the optimal generator is reached when the generator distribution is equal to the true data distribution:

$$\mathcal{G}^{*}(x) \rightarrow p_{g} = p_{data} \quad \text{(Equation 5.1.13)}$$
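As a one-line check that follows directly from Equation 5.1.12, substituting $p_{g} = p_{data}$ gives:

$$\mathcal{L}^{(D^{*})} = 2\log 2 - 2\,D_{JS}(p_{data} \,\|\, p_{data}) = 2\log 2$$

since the Jensen-Shannon divergence of a distribution with itself is zero. Because $D_{JS} \geq 0$, this is the largest value $\mathcal{L}^{(D^{*})}$ can attain, so the generator cannot do better than matching $p_{data}$.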
