09.05.2023 Views

pdfcoffee

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Chapter 5

MuseNet can produce up to 4-minute musical compositions with 10 different

instruments, and can combine styles from country, to Mozart, to the Beatles. For

instance, I generated a remake of Beethoven's "fur Elise" in the style of Lady Gaga

with Piano, Drums, Guitar, and Bass. You can try this for yourself at https://

openai.com/blog/musenet/, under the section Try MuseNet:

A summary of convolution operations

In this section we present a summary of different convolution operations.

A convolutional layer has I input channels and produces O output channels.

I × O × K parameters are used, where K is the number of values in the kernel.

Basic convolutional neural networks

(CNN or ConvNet)

Let's remind ourselves briefly what a CNN is. CNNs take in an input image (two

dimensions) or a text (two dimensions) or a video (three dimensions) and apply

multiple filters to the input. Each filter is a like a flashlight sliding across the areas

of the input and the areas that it is shining over is called the receptive field. Each

filter is a tensor of the same depth of the input (for instance if the image has a depth

of 3, then the filter must also have a depth of 3).

When the filter is sliding, or convolving, around the input image, the values

in the filter are multiplied by the values of the input. The multiplications are

then summarized into one single value. This process is repeated for each location

producing an activation map (also known as a feature map).

[ 183 ]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!