
Maybe you filled this blank in with "too," or maybe you chose a different word like "here" or "now," depending on what you assumed to be preceding the first word.

Figure 11.7 - Many options for filling in the [BLANK]

That’s easy, right? How did you do it, though? How do you know that "you" should follow "nice to meet"? You’ve probably read and said "nice to meet you" thousands of times. But have you ever read or said: "Nice to meet aardvark"? Me neither!

What about the second sentence? It’s not that obvious anymore, but I bet you can still rule out "to meet you aardvark" (or at least admit that’s very unlikely to be the case).

It turns out, we have a language model in our heads too, and it’s straightforward to guess which words are good choices to fill in the blanks using sequences that are familiar to us.


N-grams

The structure, in the examples above, is composed of three words and a blank: a four-gram. If we were using two words and a blank, that would be a trigram, and, in general, for a given number of words (n-1) followed by a blank, an n-gram.

Figure 11.8 - N-grams
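
To make the idea concrete, here is a minimal sketch in plain Python (the sentence and function below are just illustrative, not code from the book) that slices a tokenized sentence into its n-grams:

```python
# A minimal sketch (plain Python, not code from the book): slicing a tokenized
# sentence into n-grams, i.e., windows of n consecutive words.
sentence = "nice to meet you too".split()

def ngrams(tokens, n):
    # every window of n consecutive tokens is one n-gram
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams(sentence, 4))
# [('nice', 'to', 'meet', 'you'), ('to', 'meet', 'you', 'too')]
print(ngrams(sentence, 3))
# [('nice', 'to', 'meet'), ('to', 'meet', 'you'), ('meet', 'you', 'too')]
```

The last word of each n-gram is the one that fills the blank; the n-1 words before it are its context.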

N-gram models are based on pure statistics: They fill in the blanks using the most common sequence that matches the words preceding the blank (that’s called the context). On the one hand, larger values of n (longer sequences of words) may yield better predictions; on the other hand, they may yield no predictions, since a particular sequence of words may have never been observed. In the latter case, one can always fall back to a shorter n-gram and try again (that’s called a stupid back-off, by the way).
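
That fill-in-the-blank procedure can be sketched in a few lines of plain Python (an illustration with a made-up toy corpus and made-up function names, not the book’s implementation): count how often each word followed every observed context and, if a given context was never seen, drop its leftmost word and try the shorter context.

```python
from collections import Counter, defaultdict

# Illustrative sketch of an n-gram "fill in the blank" with back-off
# (toy corpus; the function names are made up for this example).
corpus = "nice to meet you . nice to see you . good to see you too .".split()

def build_counts(tokens, max_n=4):
    counts = defaultdict(Counter)
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            context, word = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
            counts[context][word] += 1  # how often `word` followed `context`
    return counts

def fill_in_the_blank(counts, context):
    context = tuple(context)
    while context:                      # back off to shorter and shorter contexts
        if context in counts:
            # the most common word observed after this context
            return counts[context].most_common(1)[0][0]
        context = context[1:]           # drop the leftmost word and try again
    return None                         # even a single word was never seen

counts = build_counts(corpus)
print(fill_in_the_blank(counts, ["nice", "to", "meet"]))  # you
print(fill_in_the_blank(counts, ["glad", "to", "see"]))   # backs off to ('to', 'see') -> you
```

In the actual stupid back-off formulation, scores computed after backing off are multiplied by a constant (typically 0.4); since this sketch only compares candidates within a single back-off level, that factor would not change which word gets picked.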

For a more detailed explanation of n-gram models, please check the "N-gram Language Models" [178] section of Lena Voita’s amazing "NLP Course | For You." [179]

These models are simple, but they are somewhat limited because they can only look back.

"Can we look ahead too?"

Sure, we can!
