DEliverable 2.3 - the School of Engineering and Design - Brunel ...


ICT Project 3D VIVANT – Deliverable 2.3
Contract no.: 248420
User Acceptance Validation Plan

2.3 QUALITY ASSESSMENT ON 3D AUDIO

Listening tests are the current state of the art for studying the perceived audio quality of content reproduction, processing algorithms and content generation.

Several standardised and established methods exist for conducting listening tests. Most of them are used to validate the quality level of coding systems. There are no up-to-date standardised methods for 3D audio assessment, but the known and established principles can be transferred to assess certain attributes of 3D audio.

All methods present the test subjects with one or more test signals, so-called stimuli, which they have to rate on the basis of predefined attributes. Multiple stimuli that have to be rated together form a so-called trial. The trials of a listening test should be rated back-to-back, and their order has to be random and unknown to the subjects (ITU-R BS.1284-1).

Test subjects are grouped into expert listeners and normal listeners. The category depends on the subject’s experience with listening tests and the methods used. If the differences between the stimuli under test are small and difficult to detect, the listening test should be conducted with expert listeners. Tests with normal listeners may be used to evaluate the overall perceived quality of a new audio reproduction system. In the evaluation, the ratings of expert and normal listeners have to be kept separate so that the results can be related to the listening experience of the test subjects. In principle the number of test subjects should be as large as possible, but in practice 12 to 30 test subjects are considered sufficient.

Listening tests can be conducted in three different environments:
• Anechoic chamber: Advantageous for tests without any influence of the room.
• Headphone listening: Avoids most room influences and allows subtle differences to be detected.
• Room with predefined conditions according to ITU-R BS.1116-1: Volume, reverberation time, etc. are specified to prevent unwanted room influences. These parameters also ensure the reproducibility of the listening tests.

In preparing the listening test it is necessary to define the attributes to be rated, for example the sound colour or the spatial impression. A list of possible attributes can be found in ITU-R BS.1284-1, but custom attributes may also be used as long as they are explicitly and clearly defined.

2.3.1 Methods for the Assessment of Sound Colour

Every listening test method has advantages and disadvantages and has to be selected on the basis of the criteria to be evaluated, the number of stimuli and the expected perceptibility of the differences. Typical methods are:
• “ABX”: Two known stimuli “A” and “B” and a third stimulus “X”, which is identical to either “A” or “B”, are presented to the test subject. The test subject has to assign “X” to the corresponding stimulus. The stimuli can be listened to as often as desired. This method is suited to subtle differences. The number of correct and incorrect assignments is a measure of the perceptibility of the differences, but it does not quantify how large the differences are.
• “Double-blind triple-stimulus with hidden reference” according to ITU-R BS.1116-1: Three stimuli “A”, “B” and “C” are presented to the test subject. “A” is always the known reference. “B” and “C” are the stimulus under test and a hidden reference, in random order. The test subject has to rate the stimuli “B” and “C” in comparison to the reference “A” and should identify the hidden reference. The stimuli can be listened to as often as desired. The ratings given and the recognition rate of the hidden reference provide information about small differences in, or degradation of, the tested signal.
• “Multiple stimulus test with hidden reference and anchor” (MUSHRA) according to ITU-R BS.1534-1: The test subject is presented with several stimuli and one known reference. The stimuli include a hidden reference and a “low anchor”. The test subject has to rate the individual stimuli in comparison to the reference with regard to a defined attribute using the “quality scale” (ITU-R BS.1284-1). The hidden reference has to be identified and rated correspondingly high, and the “low anchor”, an artificially degraded signal, should also be identified and rated low. MUSHRA is sometimes used without a “low anchor” and/or an explicit reference, as such signals may be hard to define under certain circumstances.

2.3.2 Methods for the Assessment of Localisation Quality

For spatial audio in particular, the localisation accuracy of an audio reproduction or generation system is very important. In practice two methods are established for its assessment (Farag 2003):
• Pointing method: The test subject is presented with a stimulus and has to indicate the perceived position by pointing with a laser at a scale or marking it with a pencil on a map. This method is easy to conduct, but the test subject may introduce additional inaccuracies. This applies especially to positions behind or above the test subject, i.e. positions outside the test subject’s visual field.
• Acoustic pointer method: The test subject controls a sound source (e.g. a loudspeaker array with only one active speaker) and has to “point” at the perceived position by positioning the sound source. The test subject can switch between the presented stimulus and the positioned sound source as often as desired.

For both methods, the correlation between the real and the perceived position is evaluated. Often only the perceived direction, i.e. the angle of incidence, is evaluated instead of the position.

2.4 QUALITY ASSESSMENT ON INTERACTIVE SOFTWARE

This section provides an overview of the methods that will be employed for testing the user acceptance of interactive features. There are numerous methods for evaluating the usability and usefulness of information technology. As 3D VIVANT’s development outcomes are primarily combinatory innovations, models which validate usability and usefulness in comparison with existing technologies are applicable only to a limited extent. The Motivational Model (MM) focuses on predicting the users’ interest in using the features in question, while Innovation Diffusion Theory (IDT, after Rogers 1985) and the Model of PC Utilization (MPCU, after Thompson et al. 1994) focus on evaluating usability in terms of improvements over previous experience and (especially) working situations. 3D VIVANT’s tests will instead focus on the Technology Acceptance Model (TAM, after Davis 1989), as it addresses two aspects of major interest in 3D VIVANT: 1) Perceived Usefulness (PU), and 2) Perceived Ease-of-Use (PEOU). Perceived Usefulness is the degree to which a person believes that using a particular system would enhance or improve his or her situation, whereas Perceived Ease of Use measures “the degree to which a person believes that using a particular system would be free of effort” (Davis 1989).

01.09.11
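As an illustration of how the two TAM constructs named above might be scored, the following is a minimal sketch only. It assumes a questionnaire with three items per construct rated on a 7-point scale; the item ids (pu1, peou1, ...), the scale range, the function names and the sample responses are all hypothetical and are not taken from the validation plan.

```python
from statistics import mean

# Hypothetical questionnaire layout: item id -> TAM construct.
# Item ids and the 7-point scale are assumptions for illustration only.
ITEM_CONSTRUCT = {
    "pu1": "PU", "pu2": "PU", "pu3": "PU",
    "peou1": "PEOU", "peou2": "PEOU", "peou3": "PEOU",
}
SCALE_MIN, SCALE_MAX = 1, 7


def construct_scores(response):
    """Return the mean rating per construct for one participant."""
    grouped = {}
    for item, rating in response.items():
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating out of range for {item}: {rating}")
        grouped.setdefault(ITEM_CONSTRUCT[item], []).append(rating)
    return {name: mean(values) for name, values in grouped.items()}


def sample_scores(responses):
    """Average the per-participant construct scores over all participants."""
    per_person = [construct_scores(r) for r in responses]
    return {
        name: mean(p[name] for p in per_person)
        for name in ("PU", "PEOU")
    }


# Made-up responses from two participants, purely for demonstration.
responses = [
    {"pu1": 6, "pu2": 5, "pu3": 7, "peou1": 4, "peou2": 5, "peou3": 3},
    {"pu1": 4, "pu2": 4, "pu3": 4, "peou1": 6, "peou2": 6, "peou3": 6},
]
print(sample_scores(responses))
```

Averaging within each participant first keeps PU and PEOU as separate scores per person, so the two aspects can be reported and compared independently rather than being merged into a single acceptance figure.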

