Recurrent Neural Networks

Gated recurrent unit (GRU)

The GRU is a variant of the LSTM and was introduced by Cho et al. [17]. It retains the LSTM's resistance to the vanishing gradient problem, but its internal structure is simpler, so it is faster to train, since fewer computations are needed to update its hidden state.

Instead of the input (i), forget (f), and output (o) gates of the LSTM cell, the GRU cell has two gates, an update gate z and a reset gate r. The update gate defines how much previous memory to keep around, and the reset gate defines how to combine the new input with the previous memory. There is no persistent cell state distinct from the hidden state as there is in the LSTM.

The GRU cell defines the computation of the hidden state h_t at time t from the hidden state h_{t-1} at the previous time step using the following set of equations:

$$z = \sigma(W_z h_{t-1} + U_z x_t)$$
$$r = \sigma(W_r h_{t-1} + U_r x_t)$$
$$c = \tanh(W_c (h_{t-1} \ast r) + U_c x_t)$$
$$h_t = (z \ast c) + ((1 - z) \ast h_{t-1})$$

The outputs of the update gate z and the reset gate r are both computed using a combination of the previous hidden state h_{t-1} and the current input x_t. The sigmoid function modulates the outputs of these gates between 0 and 1. The cell state c is computed as a function of the output of the reset gate r and the input x_t. Finally, the hidden state h_t at time t is computed as a function of the cell state c and the previous hidden state h_{t-1}. The parameters W_z, U_z, W_r, U_r, W_c, and U_c are learned during training.
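
To make the update concrete, here is a minimal NumPy sketch of a single GRU step that follows the four equations above directly; the function name gru_step, the omission of bias terms (as in the equations), and the array sizes are assumptions chosen only for illustration.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, W_z, U_z, W_r, U_r, W_c, U_c):
    # One GRU update, following the equations above (biases omitted, as in the text)
    z = sigmoid(W_z @ h_prev + U_z @ x_t)        # update gate
    r = sigmoid(W_r @ h_prev + U_r @ x_t)        # reset gate
    c = np.tanh(W_c @ (h_prev * r) + U_c @ x_t)  # candidate cell state
    return (z * c) + ((1.0 - z) * h_prev)        # new hidden state h_t

# Example with assumed sizes: hidden dimension 4, input dimension 3
rng = np.random.default_rng(0)
hidden_dim, input_dim = 4, 3
W_z, W_r, W_c = (rng.standard_normal((hidden_dim, hidden_dim)) for _ in range(3))
U_z, U_r, U_c = (rng.standard_normal((hidden_dim, input_dim)) for _ in range(3))
h_prev = np.zeros(hidden_dim)
x_t = rng.standard_normal(input_dim)
h_t = gru_step(h_prev, x_t, W_z, U_z, W_r, U_r, W_c, U_c)
print(h_t.shape)   # (4,)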

Similar to the LSTM, TensorFlow 2.0 (tf.keras) provides an implementation of the basic GRU layer as well, which is a drop-in replacement for the RNN cell.
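
For example, an illustrative sequence classifier using the tf.keras GRU layer might look as follows; the vocabulary size, embedding width, and number of units are arbitrary choices for this sketch.

import tensorflow as tf

# A GRU layer used as a drop-in replacement for a SimpleRNN or LSTM layer
# in a small binary sequence classifier.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,)),                        # variable-length token sequences
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.GRU(128),                             # returns the final hidden state
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()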

Peephole LSTM

The peephole LSTM is an LSTM variant that was first proposed by Gers and

Schmidhuber [19]. It adds "peepholes" to the input, forget, and output gates, so they

can see the previous cell state c_{t-1}. The equations for computing the hidden state h_t at time t from the hidden state h_{t-1} at the previous time step in a peephole LSTM are shown next.
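
As a hedged illustration of the peephole idea described above, the following NumPy sketch follows a standard peephole formulation in which each gate additionally receives an elementwise peephole term from the previous cell state; the peephole weights p_i, p_f, p_o, the omitted biases, and all shapes are assumptions for this example, and some formulations let the output gate peek at the updated cell state c_t instead.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def peephole_lstm_step(h_prev, c_prev, x_t,
                       W_i, U_i, p_i, W_f, U_f, p_f,
                       W_o, U_o, p_o, W_c, U_c):
    # One peephole LSTM update: each gate also "peeks" at the previous cell state
    i = sigmoid(W_i @ h_prev + U_i @ x_t + p_i * c_prev)  # input gate with peephole
    f = sigmoid(W_f @ h_prev + U_f @ x_t + p_f * c_prev)  # forget gate with peephole
    o = sigmoid(W_o @ h_prev + U_o @ x_t + p_o * c_prev)  # output gate with peephole
    g = np.tanh(W_c @ h_prev + U_c @ x_t)                 # candidate cell state
    c_t = (f * c_prev) + (i * g)                          # new cell state
    h_t = o * np.tanh(c_t)                                # new hidden state
    return h_t, c_t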
