in Chapter 9, and I reproduce it below for your convenience:

"…greedy decoding because each prediction is deemed final. "No backsies":

Once it’s done, it’s really done, you just move along to the next prediction and

never look back. In the context of our sequence-to-sequence problem, a

regression, it wouldn’t make much sense to do otherwise anyway.

But that may not be the case for other types of sequence-to-sequence

problems. In machine translation, for example, the decoder outputs

probabilities for the next word in the sentence at each step. The greedy

approach would simply take the word with the highest probability and move

on to the next.

However, since each prediction is an input to the next step, taking the top

word at every step is not necessarily the winning approach (translating from

one language to another is not exactly "linear"). It is probably wiser to keep a

handful of candidates at every step and try their combinations to choose the

best one: That’s called beam search…"
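To make the idea a bit more concrete, here is a minimal, purely illustrative sketch of beam search over per-step probabilities (the toy vocabulary and scoring function below are made up for this example, not part of our model; greedy decoding is simply the special case of a beam of size one):

import torch
import torch.nn.functional as F

def beam_search(step_logits_fn, start_token, n_steps, beam_width=3):
    # each beam is a (sequence, cumulative log-probability) pair
    beams = [([start_token], 0.0)]
    for _ in range(n_steps):
        candidates = []
        for seq, score in beams:
            # step_logits_fn returns logits over the vocabulary for a given prefix
            log_probs = F.log_softmax(step_logits_fn(seq), dim=-1)
            top_lp, top_idx = log_probs.topk(beam_width)
            for lp, idx in zip(top_lp.tolist(), top_idx.tolist()):
                candidates.append((seq + [idx], score + lp))
        # keep only the `beam_width` best partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]  # the highest-scoring sequence

# toy "model": a fixed distribution over ten tokens at each step
vocab_size = 10
def toy_logits(seq):
    g = torch.Generator().manual_seed(len(seq))
    return torch.randn(vocab_size, generator=g)

print(beam_search(toy_logits, start_token=0, n_steps=5))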

By the way, if you try using greedy decoding instead (setting do_sample=False), the generated text simply and annoyingly repeats the same text over and over again:

'What is the use of a daisy-chain?'
'I don't know,' said Alice, 'but I think it is a good idea.'
'What is the use of a daisy-chain?'
'I don't know,' said Alice, 'but I think it is a good idea.'
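If you want to reproduce that behavior yourself, a call along the following lines should do it (the checkpoint, prompt, and max_length below are assumptions for illustration, not the exact code used to produce the output above):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = AutoModelForCausalLM.from_pretrained('gpt2')

prompt = "What is the use of a daisy-chain?"
input_ids = tokenizer(prompt, return_tensors='pt').input_ids

# do_sample=False makes generate() take the single most likely token at every
# step (greedy decoding), which is what produces the repetitive output above
greedy_ids = model.generate(input_ids, max_length=64, do_sample=False)
print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))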

For more details on the different arguments that can be used for text generation, including a more detailed explanation of both greedy decoding and beam search, please check HuggingFace’s blog post "How to generate text: Using different decoding methods for language generation with Transformers" [222] by Patrick von Platen.
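Just as a rough sketch of the alternatives discussed there (reusing the model, tokenizer, and input_ids from the snippet above; the argument values are assumptions for illustration, not recommendations):

# beam search instead of greedy decoding
beam_ids = model.generate(
    input_ids,
    max_length=64,
    num_beams=5,               # keep five candidate sequences at every step
    no_repeat_ngram_size=2,    # forbid repeating any two-token sequence
    early_stopping=True,
)

# sampling instead of always taking the most likely token
sampled_ids = model.generate(
    input_ids,
    max_length=64,
    do_sample=True,
    top_k=50,                  # sample only among the 50 most likely tokens
    top_p=0.92,                # ...or the smallest set covering 92% of the probability mass
    temperature=0.7,           # sharpen the distribution before sampling
)
print(tokenizer.decode(beam_ids[0], skip_special_tokens=True))
print(tokenizer.decode(sampled_ids[0], skip_special_tokens=True))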

"Wait a minute! Aren’t we fine-tuning GPT-2 so it can write text in a

given style?"

I thought you would never ask… Yes, we are. It’s the final example, and we’re

