
become more and more sparse (that is, have more zeros in them) as the vocabulary grows, but also every word is orthogonal to all the other words.

"What do you mean by one word being orthogonal to the others?"

Remember the cosine similarity from Chapter 9? Two vectors are said to be orthogonal to each other if there is a right angle between them, corresponding to a similarity of zero. So, if we use one-hot-encoded vectors to represent words, we're basically saying that no two words are similar to each other. This is obviously wrong (take synonyms, for example).
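A quick sketch of this point (the three-word vocabulary here is hypothetical, chosen only for illustration): one-hot vectors are just the rows of an identity matrix, so the cosine similarity between any two distinct words comes out as zero, even for near-synonyms like "dog" and "puppy."

```python
import torch
import torch.nn.functional as F

# Hypothetical three-word vocabulary; each word becomes a one-hot vector
vocab = ['dog', 'puppy', 'cat']
one_hot = torch.eye(len(vocab))  # identity matrix: row i encodes word i

# Cosine similarity between "dog" and "puppy" (synonym-like words)
sim = F.cosine_similarity(one_hot[0], one_hot[1], dim=0)
print(sim.item())  # 0.0 — orthogonal, despite the words being related
```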

"How can we get better vectors to represent words then?"

Well, we can try to explore the structure and the relationships between words in a given sentence. That's the role of…

Language Models

A language model (LM) is a model that estimates the probability of a token or sequence of tokens. We've been using token and word somewhat interchangeably, but a token can be a single character or a sub-word too. In other words, a language model will predict which tokens are more likely to fill in a blank.
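As a minimal sketch of this idea (not the book's implementation), we can estimate such probabilities by simply counting which token follows which in a toy corpus; the corpus and the `predict` helper below are assumptions made for illustration only:

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical); a bigram model estimates P(next token | current token)
corpus = "to be or not to be that is the question".split()

# Count how often each token follows each other token
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def predict(word):
    # Most likely token to follow `word`, with its estimated probability
    nxt, c = counts[word].most_common(1)[0]
    return nxt, c / sum(counts[word].values())

print(predict('to'))  # ('be', 1.0) — "be" always follows "to" in this corpus
```

Real language models replace these raw counts with learned parameters, but the goal is the same: given some context, assign probabilities to the candidate next tokens.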

Now, pretend you're a language model and fill in the blank in the following sentence.

Figure 11.4 - What’s the next word?

You probably filled the blank in with the word "you."

Figure 11.5 - Filling in the [BLANK]

What about this sentence?

Figure 11.6 - What’s the next word?

916 | Chapter 11: Down the Yellow Brick Rabbit Hole
