
In code, we can use split() to get tensors for each of the components:

Wxr, Wxz, Wxn = Wx.split(hidden_dim, dim=0)
bxr, bxz, bxn = bx.split(hidden_dim, dim=0)
Whr, Whz, Whn = Wh.split(hidden_dim, dim=0)
bhr, bhz, bhn = bh.split(hidden_dim, dim=0)

Wxr, bxr

Output

(tensor([[-0.0930, 0.0497],
         [ 0.4670, -0.5319]]),
 tensor([-0.4316, 0.4019]))

Next, let’s use the weights and biases to create the corresponding linear layers:

def linear_layers(Wx, bx, Wh, bh):
    hidden_dim, n_features = Wx.size()
    lin_input = nn.Linear(n_features, hidden_dim)
    lin_input.load_state_dict({'weight': Wx, 'bias': bx})
    lin_hidden = nn.Linear(hidden_dim, hidden_dim)
    lin_hidden.load_state_dict({'weight': Wh, 'bias': bh})
    return lin_hidden, lin_input

# reset gate - red
r_hidden, r_input = linear_layers(Wxr, bxr, Whr, bhr)
# update gate - blue
z_hidden, z_input = linear_layers(Wxz, bxz, Whz, bhz)
# candidate state - black
n_hidden, n_input = linear_layers(Wxn, bxn, Whn, bhn)
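In case you are running this snippet on its own, the packed tensors Wx, bx, Wh, and bh used above can be read straight out of a GRU cell’s state dictionary, since PyTorch stacks the r, z, and n components along the first dimension (that’s exactly why split() works). Here is a minimal sketch, assuming a freshly created nn.GRUCell; the seed is illustrative, so its random weights will not match the numbers shown in the outputs:

import torch
import torch.nn as nn

torch.manual_seed(17)  # illustrative seed; actual values will differ
n_features, hidden_dim = 2, 2

# the cell stores r, z, and n stacked along dim 0 (3 * hidden_dim rows)
gru_cell = nn.GRUCell(input_size=n_features, hidden_size=hidden_dim)
gru_state = gru_cell.state_dict()

Wx, bx = gru_state['weight_ih'], gru_state['bias_ih']  # input transformations
Wh, bh = gru_state['weight_hh'], gru_state['bias_hh']  # hidden transformations

Wx.shape, Wh.shape  # (torch.Size([6, 2]), torch.Size([6, 2]))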


Then, let’s use these layers to create functions that replicate both gates (r and z) and the candidate hidden state (n):

def reset_gate(h, x):
    thr = r_hidden(h)
    txr = r_input(x)
    r = torch.sigmoid(thr + txr)
    return r  # red

def update_gate(h, x):
    thz = z_hidden(h)
    txz = z_input(x)
    z = torch.sigmoid(thz + txz)
    return z  # blue

def candidate_n(h, x, r):
    thn = n_hidden(h)
    txn = n_input(x)
    n = torch.tanh(r * thn + txn)
    return n  # black

Cool—all the transformations and activations are handled by the functions above. This means we can replicate the mechanics of a GRU cell at its component level (r, z, and n). We also need an initial hidden state and the first data point (corner) of a sequence:

initial_hidden = torch.zeros(1, hidden_dim)
X = torch.as_tensor(points[0]).float()
first_corner = X[0:1]

We use both values to get the output from the reset gate (r):

r = reset_gate(initial_hidden, first_corner)
r

Output

tensor([[0.2387, 0.6928]], grad_fn=<SigmoidBackward>)
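For reference, here is a minimal sketch of the remaining steps of the cell, using the functions defined above: the update gate (z), the candidate hidden state (n), and the GRU update rule that blends the old and new hidden states. The sanity check at the end assumes the original nn.GRUCell the weights were taken from is still in scope, so treat it as optional:

z = update_gate(initial_hidden, first_corner)     # update gate - blue
n = candidate_n(initial_hidden, first_corner, r)  # candidate state - black
new_hidden = (1 - z) * n + z * initial_hidden     # h' = (1 - z) * n + z * h

# optional sanity check, if gru_cell is in scope:
# torch.allclose(new_hidden, gru_cell(first_corner, initial_hidden))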

