Elektronika 2009-11.pdf - Instytut Systemów Elektronicznych
In addition, one has to remember that a musical work always begins with a tonic (T), and each tension built up by a dominant (D) is resolved to the tonic (T). In short, creating music is a matter of choosing such progressions.
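The two rules above can be encoded as a simple progression validator. This is a hypothetical sketch for illustration only; the function and symbol names are not taken from the paper.

```python
# Simplified tonal-harmony check: a progression is valid when it
# starts on the tonic and every dominant resolves to a tonic.
TONIC, SUBDOMINANT, DOMINANT = "T", "S", "D"

def is_valid_progression(progression):
    """Check the two rules stated above:
    1. the work begins with a tonic (T);
    2. every dominant (D) is resolved to a tonic (T)."""
    if not progression or progression[0] != TONIC:
        return False
    for current, following in zip(progression, progression[1:]):
        if current == DOMINANT and following != TONIC:
            return False
    # a trailing unresolved dominant also violates rule 2
    return progression[-1] != DOMINANT

print(is_valid_progression(["T", "S", "D", "T"]))  # True (classic cadence)
print(is_valid_progression(["S", "D", "T"]))       # False (does not start on T)
```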
Input data preparation

In line with Computer Generated Music trends, the idea was to implement (simplified) tonal-harmony rules in the generation of harmonic sequences, as an initial phase of further research in the area of Computer Generated Music. To achieve this, neural networks were used. After extensive evaluation and testing, a three-layer perceptron with 32 neurons in the hidden layer was chosen. Each of the 16 input neurons and 16 output neurons was assigned a corresponding value from the learning sets. An example of such a mapping is presented in Table 2.
Tabl. 2. Exemplary values of corresponding input and output neurons used in the network architecture

Input neurons
In1  In2  In3  In4  In5  ...  In14  In15  In16
 1    0    0    0    1   ...    1     0     0

Output neurons
Ou1  Ou2  ...  Ou8  Ou9  ...  Ou14  Ou15  Ou16
 0    0   ...   1    0   ...    1     0     0
To represent the tones in chords of a given harmonic meaning, MIDI notation was used (see Table 3), more precisely the part of the information stored in the voice messages. The occurrence of a given value in a voice message yielded a '1' as the input for the corresponding neuron; the absence of a tone was equivalent to a '0'.
Tabl. 3. Values corresponding to sounds in MIDI notation

Note   MIDI value (hex)   MIDI value (dec)
C      3C                 60
D      3E                 62
E      40                 64
F      41                 65
G      43                 67
A      45                 69
H      47                 71
Data was prepared in accordance with tonal-harmony rules using a synthesizer, which was used to obtain the desired groups of dependencies as teaching sets for the neural network. The sets contained from 50 sequences in the simplest case up to 300 in the case that considered dependencies between triads in the I and II positions of the secondary functions. Input data was tested according to the rule that the three tones making up a chord have a specified harmonic meaning. When the trained network was used to generate harmonic sequences, the input consisted of two tones plus the harmonic meaning of the previous tonal sequence, in the hope that the network would indicate the correct tone to fill in the specified harmonic value. The tonal range was limited to a single octave to simplify the occurring dependencies.
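The query step described above can be sketched as follows. How the harmonic meaning is packed into the input neurons is not specified in the paper, so the layout below (notes in the first 12 neurons, a function code in the last 4) and all names are purely an illustrative assumption; the fake output vector stands in for a trained network's answer.

```python
# Hypothetical layout: 16 input neurons = 12 note slots + 4 bits of
# harmonic-function code. Two tones are given; the network's output is
# decoded to the missing third tone of the chord.
BASE_NOTE = 60  # MIDI C, assumed mapping to the first neuron

def build_query(known_notes, function_code):
    """known_notes: two MIDI values; function_code: 4-element 0/1 list."""
    vector = [0] * 16
    for note in known_notes:
        vector[note - BASE_NOTE] = 1
    vector[12:16] = function_code
    return vector

def decode_missing_tone(output_vector, known_notes):
    """Pick the most strongly activated note neuron not already given."""
    known = {n - BASE_NOTE for n in known_notes}
    candidates = [(act, i) for i, act in enumerate(output_vector[:12])
                  if i not in known]
    best_activation, best_index = max(candidates)
    return BASE_NOTE + best_index

# Known tones C (60) and G (67); a fabricated output peaking at E (index 4):
fake_output = [0.1, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0,
               0.2, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 1]
print(decode_missing_tone(fake_output, [60, 67]))  # 64, i.e. E
```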
The research

The learning process, depending on the method used, consisted of up to 40 000 iterations. The following methods were taken into consideration [6,7]:
• gradient-descent back-propagation (with and without momentum),
• resilient back-propagation.
In addition, the following transfer functions were used [5]:
• log-sigmoid transfer function (Matlab: logsig(n)),
• hyperbolic tangent sigmoid transfer function (Matlab: tansig(n)).
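For reference, the two Matlab transfer functions named above can be re-implemented directly from their standard definitions; tansig is mathematically identical to the hyperbolic tangent.

```python
import math

def logsig(n):
    """Log-sigmoid transfer function: 1 / (1 + e^-n), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + e^-2n) - 1, output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

print(logsig(0.0))             # 0.5
print(round(tansig(1.0), 6))   # 0.761594, same as tanh(1)
```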
Teaching the neural network yielded a good match, measured with the mean squared error (MSE) and the number of erroneous classifications (see Table 4). The magnitude of the classification error Er is the share of incorrect matches with respect to the expected precise answers and can be defined in the following manner:

Er = er / L

where: er - number of erroneous classifications, L - test set size.
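The two error measures can be illustrated as follows; the numbers below are invented for the example, not taken from the paper.

```python
# MSE over output values, and the classification error Er = er / L
# (erroneous classifications over test set size).
def mse(expected, produced):
    """Mean squared error between expected and produced outputs."""
    return sum((e - p) ** 2 for e, p in zip(expected, produced)) / len(expected)

def classification_error(erratic, test_set_size):
    """Er = er / L."""
    return erratic / test_set_size

print(round(mse([0, 1, 0, 1], [0.1, 0.8, 0.2, 0.9]), 6))  # 0.025
print(classification_error(8, 100))                        # 0.08
```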
Tabl. 4. The results of learning processes and their Er error values

Network       Learning time   Transfer              Teaching   MSE         Er
architecture  (iterations)    function              function
16/32/16      40 000          tansig/tansig/tansig  traingd    0.0214148   8
16/32/16      40 000          tansig/tansig/tansig  traingdm   0.021857    8
16/32/16      500             tansig/tansig/tansig  trainrp    0.0182302   8
The most interesting results, for an architecture with one hidden layer of 32 neurons, together with the obtained adjustment and mean squared error (MSE), are shown in Table 4. The architecture of the presented solution was chosen empirically. All of the sets were prepared with default network parameters. Various network configurations with different numbers of neurons per hidden layer were tried out. Many more setups were investigated, but not all of them are included here because of the unsatisfactory results they produced.

As Table 4 shows, the reached level of the mean squared error of the learning process for this problem is, in fact, the same in every case. The same can be observed for the Er classification error, which shows similar values across the network configurations. Consequently, more attention should be paid to the configurations that reach a similar level of adjustment with less computational overhead.
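The gradient-descent back-propagation learning used in the experiments can be sketched in a scaled-down form. The real setup was a 16/32/16 network trained in Matlab; the 2/3/1 toy network and XOR data below are stand-ins chosen only to keep the sketch short, and all names are illustrative.

```python
import math
import random

random.seed(1)

def tansig(n):
    return math.tanh(n)

def d_tansig(y):
    """Derivative of tansig expressed via the output value y."""
    return 1.0 - y * y

# XOR training pairs as a tiny stand-in for the harmonic-sequence sets
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

# random weights: 2 inputs -> 3 hidden -> 1 output (with biases)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
b_h = [random.uniform(-1, 1) for _ in range(3)]
w_o = [random.uniform(-1, 1) for _ in range(3)]
b_o = random.uniform(-1, 1)

def forward(x):
    h = [tansig(sum(x[i] * w_h[i][j] for i in range(2)) + b_h[j])
         for j in range(3)]
    o = tansig(sum(h[j] * w_o[j] for j in range(3)) + b_o)
    return h, o

def epoch_mse(lr=0.3):
    """One pass of per-pattern gradient-descent back-propagation."""
    global b_o
    total = 0.0
    for x, t in data:
        h, o = forward(x)
        err = t - o
        total += err * err
        delta_o = err * d_tansig(o)
        for j in range(3):
            delta_h = delta_o * w_o[j] * d_tansig(h[j])  # before w_o update
            w_o[j] += lr * delta_o * h[j]
            for i in range(2):
                w_h[i][j] += lr * delta_h * x[i]
            b_h[j] += lr * delta_h
        b_o += lr * delta_o
    return total / len(data)

first = epoch_mse()
for _ in range(3000):
    last = epoch_mse()
print(round(first, 3), round(last, 3))
```

Resilient back-propagation (trainrp) differs in that the update step depends only on the sign of the gradient, which is consistent with it reaching a comparable MSE in far fewer iterations in Table 4.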