08.05.2014 Views

Panel monitoring using Senstools 3.3


Panel monitoring<br />

using Senstools v3.3<br />

August 2005


Introduction<br />

‣ QDA panels<br />

‣ function, setup and maintenance<br />

‣ quality criteria<br />

‣ which requirements<br />

‣ how to monitor<br />

‣ monitoring panelist performance using Senstools


QDA or descriptive panels<br />

‣ trained assessors (6 to 12)<br />

‣ product characterization by a fixed vocabulary<br />

‣ individual assessments, specified presentation<br />

design, controlled environment<br />

‣ often repeated measures (each product is rated two<br />

or three times by all assessors)


Structure of QDA (descriptive) data<br />

(N × M) data matrix X_k: data from one assessor (often with replicates)<br />

N products, K assessors, M attributes<br />

Three-mode data structure in conventional profiling: N products are rated by K assessors on M attributes for p presentations (replicates)


The dataset<br />

‣ 10 assessors (‘sets’), 10 products (‘objects’), 2 presentations<br />

‣ 11 attributes: flower, rose, evergreen, wood, burnt,<br />

alcohol, pungent, medicinal, sulphur, grape, bitter<br />

‣ in total 2200 data points<br />

Dataset courtesy of Compusense: data from an expert wine panel. The original attribute names have been recoded and only part of the data is used.
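As a quick check of the total, and assuming every assessor rated every product on every attribute in both presentations, the count follows from the three-mode structure with replicates (N products, K assessors, p presentations, M attributes):

```latex
\[
N \times K \times p \times M = 10 \times 10 \times 2 \times 11 = 2200
\]
```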


Data example<br />

Replicate Assessor Product flower rose evergreen wood burnt alcohol pungent medicinal sulphur<br />

1 1 4 22 12 24 1 2 71 61 8 1<br />

2 1 4 25 7 7 1 1 50 16 8 1<br />

1 1 5 2 1 1 1 1 91 62 10 2<br />

2 1 5 1 1 2 10 1 47 20 10 1<br />

1 1 7 21 0 1 1 1 74 50 10 1<br />

2 1 7 1 6 8 0 2 62 38 7 0<br />

1 1 8 24 1 36 1 1 50 25 6 1<br />

2 1 8 1 1 2 9 1 61 43 7 14<br />

1 1 10 49 17 8 1 1 74 62 6 9<br />

2 1 10 21 2 11 2 1 74 62 8 1<br />

1 1 11 1 1 8 1 1 61 42 1 0<br />

2 1 11 18 1 7 1 1 51 20 6 1<br />

1 1 13 50 25 10 1 2 74 60 13 4<br />

2 1 13 43 15 7 2 0 99 86 2 1<br />

1 1 17 39 20 8 1 8 89 74 7 1<br />

2 1 17 47 19 6 1 1 72 31 7 1<br />

1 1 18 0 1 1 9 15 52 25 9 26<br />

2 1 18 21 1 7 9 13 71 33 8 31<br />

1 1 20 51 18 8 1 4 87 86 10 2<br />

2 1 20 15 1 6 1 0 57 11 7 1<br />

1 4 4 18 0 11 0 0 47 80 0 22<br />

2 4 4 12 25 4 0 0 47 82 0 8<br />

1 4 5 16 0 0 0 0 43 78 0 17<br />

2 4 5 18 0 0 0 5 77 81 0 0<br />

1 4 7 15 0 7 0 0 57 64 0 0<br />

2 4 7 9 0 0 0 4 65 78 34 0<br />

1 4 8 8 0 8 0 2 78 83 0 0<br />

2 4 8 10 0 14 0 0 75 80 14 0<br />

1 4 10 12 28 0 0 0 69 77 0 5<br />

2 4 10 28 25 6 0 0 78 89 0 0<br />

1 4 11 13 15 17 0 0 73 82 0 0<br />

2 4 11 17 24 7 0 0 57 56 0 0<br />

1 4 13 21 28 0 0 2 43 81 6 12<br />

2 4 13 27 31 4 0 0 57 86 0 20
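A minimal sketch of how rows in this layout could be read for further analysis. It assumes whitespace-separated integer fields in the column order of the header above; the function names `parse_rows` and `ratings_by_cell` are hypothetical helpers for illustration, not part of Senstools.

```python
# Column order as in the data example (replicate, assessor, product, then 9 attributes).
COLUMNS = ["replicate", "assessor", "product", "flower", "rose", "evergreen",
           "wood", "burnt", "alcohol", "pungent", "medicinal", "sulphur"]


def parse_rows(text):
    """Turn whitespace-separated rows into a list of dicts keyed by column name."""
    records = []
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines between rows
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        records.append(dict(zip(COLUMNS, (int(f) for f in fields))))
    return records


def ratings_by_cell(records, attribute):
    """Group the replicate ratings of one attribute by (assessor, product)."""
    cells = {}
    for rec in records:
        cells.setdefault((rec["assessor"], rec["product"]), []).append(rec[attribute])
    return cells


# First four rows of the data example (assessor 1, products 4 and 5).
SAMPLE = """
1 1 4 22 12 24 1 2 71 61 8 1

2 1 4 25 7 7 1 1 50 16 8 1

1 1 5 2 1 1 1 1 91 62 10 2

2 1 5 1 1 2 10 1 47 20 10 1
"""

sample_cells = ratings_by_cell(parse_rows(SAMPLE), "flower")
```

Grouping by (assessor, product) keeps the replicates of one product together, which is the form the variance ratios discussed later need.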


Panel performance measures<br />

‣ Reliability and repeatability of assessors: do they give the same<br />

ratings to the same products? This requires repeated measures!<br />

‣ Validity: do the assessors rate the products in a similar way<br />

(correlations between each individual assessor and the others:<br />

do they agree?)<br />

‣ Discrimination: do the individual assessor and the panel as a<br />

whole rate different products as different?


Reliability<br />

‣ in order to establish whether an individual can discriminate<br />

between products (i.e. give the same ratings to identical<br />

products and different ratings to different products) we need a<br />

measure of variability<br />

‣ when products are rated more than once, we can compute the<br />

variance of ratings of the same product (within-product<br />

variance, or MS within) and of ratings of different products<br />

(between-product variance, or MS between)<br />

‣ an assessor is considered to be discriminating when the<br />

within-product variance is smaller than the between-product<br />

variance<br />

‣ this is the ‘assessor statistic’, the ratio<br />

MS between / MS within<br />

‣ without replicates there is no MS within
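The ratio described above can be sketched as a one-way analysis of variance for one assessor and one attribute, with products as groups. This is an illustration under that assumption, not necessarily the exact computation in Senstools; `assessor_statistic` is a hypothetical name. The example data are the flower ratings of assessor 1 from the data example.

```python
def assessor_statistic(ratings_by_product):
    """MS between / MS within for ONE assessor on ONE attribute.

    ratings_by_product maps each product to its list of replicate ratings.
    """
    groups = [list(values) for values in ratings_by_product.values()]
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    if n_groups < 2 or n_total <= n_groups:
        # Without replicates there are no within-product degrees of freedom.
        raise ValueError("need at least two products and replicated ratings")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    if ms_within == 0:
        return float("inf")  # identical replicates: ratio is unbounded
    return ms_between / ms_within


# Assessor 1, attribute flower: product -> [replicate 1, replicate 2]
FLOWER_ASSESSOR_1 = {
    4: [22, 25], 5: [2, 1], 7: [21, 1], 8: [24, 1], 10: [49, 21],
    11: [1, 18], 13: [50, 43], 17: [39, 47], 18: [0, 21], 20: [51, 15],
}

flower_ratio = assessor_statistic(FLOWER_ASSESSOR_1)
```

A ratio above 1 means the assessor's ratings vary more between products than between replicates of the same product.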


Assessor statistics in Senstools<br />

Assessor statistics for sulphur: ratio of between/within variance for each assessor (Mssq object / Mssq error)<br />

ratio > 1: more variance for different products than for same products<br />

ratio = 1: same variance for different or same products<br />


Repeated measures<br />

Visualization of assessor statistics in Senstools<br />

view by attribute (for each assessor)


Visualization of assessor statistics in Senstools<br />

view by subject (for each attribute), 11 attributes


Validity<br />

‣ for each subject: all subjects must show comparable rating<br />

behavior (they have been trained for that); the ratings of the<br />

subjects should at least correlate positively<br />

‣ furthermore, the ‘between-subject’ variance should be small<br />

(MS panelist / MS error < 1)<br />

‣ for the panel as a whole: panelists should not disagree too much; in<br />

other words, there should be little or no interaction between subject<br />

and product


Correlations between individual and panel<br />

For each subject, the object ratings for each attribute are standardized and the correlation coefficients are computed between the subject and the rest of the panel.<br />

In this example, there are 11 attributes and 10 products, resulting in 110 z-scores for each subject.<br />

[Figure: Agreement Between Assessors (Correlations). Histogram with the number of subjects on the vertical axis and the correlation coefficient on the horizontal axis, negative correlations on the left and positive correlations on the right.]<br />

Distribution of the correlations of subject X with the panel without subject X (a total of 10 coefficients when there are 10 subjects)
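A minimal sketch of the subject-versus-panel correlation. Two points are assumptions rather than documented Senstools behavior: the "rest of the panel" is taken as the mean z-score of the other subjects, and each subject contributes one rating per product and attribute (for example the mean over replicates). The function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one subject's ratings on one attribute across products."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0] * len(values)  # constant ratings carry no profile information
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def subject_vs_panel(ratings):
    """ratings[subject][attribute] -> list of ratings, one per product.

    Returns, per subject, the correlation between that subject's z-scores
    (all attributes concatenated) and the mean z-scores of the other subjects.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two subjects")
    profiles = {}
    for subject, by_attribute in ratings.items():
        profile = []
        for attribute in sorted(by_attribute):  # same attribute order for everyone
            profile.extend(z_scores(by_attribute[attribute]))
        profiles[subject] = profile

    result = {}
    for subject, profile in profiles.items():
        others = [p for s, p in profiles.items() if s != subject]
        rest_of_panel = [mean(column) for column in zip(*others)]
        result[subject] = pearson(profile, rest_of_panel)
    return result
```

With 11 attributes and 10 products, each concatenated profile has 110 z-scores, matching the count given above. A subject with a negative coefficient rates the products in the opposite direction to the rest of the panel.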


Between-subject variance<br />

‣ the ‘between-subject’ variance should be small: all assessors should<br />

give similar ratings for a given attribute to a product<br />

(MS panelist / MS error small)<br />

‣ for each object and attribute, the variability between assessors is<br />

computed (the object statistics in Senstools) and graphically<br />

represented: see the example for attribute bitter<br />

[Figure: object statistics for attribute bitter on a logarithmic axis (0.1 to 100), one value for each of objects 4, 5, 7, 8, 10, 11, 13, 17, 18 and 20. Annotations: no disagreement for object 5; much disagreement for object 11.]


Variance between subjects<br />

‣ example of the individual ratings on attribute bitter for object 5<br />

(F-ratio 1.1) and object 11 (F-ratio 14.9)<br />

[Figure: individual ratings of the judges (scale 0 to 100) on attribute bitter for object 5 and object 11.]


Visualization of variance between subjects<br />

view by attribute (for each object)


Visualization of variance between subjects<br />

view by object (for each attribute)


Interaction effects<br />

‣ for the panel as a whole: panelists should not disagree too much; in<br />

other words, there should be little or no interaction between subject<br />

and product<br />

‣ Senstools shows the F-ratios of the assessor and object variance<br />

for the MS-error and MS-interaction terms
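To make the terms concrete, here is a minimal sketch of a balanced two-way analysis of variance (assessor × object, with replicates) for one attribute. The key names `F_obj`, `F_int_obj`, `F_ass` and `F_int_ass` borrow the series labels used in the Senstools plots; which denominator each label uses (MS error or MS interaction) is an assumption made here for illustration, as is the function name.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def _ratio(numerator, denominator):
    # A zero denominator (e.g. identical replicates) gives an unbounded ratio.
    return float("inf") if denominator == 0 else numerator / denominator


def rm_anova_f_ratios(cells):
    """cells[(assessor, obj)] -> list of replicate ratings for ONE attribute."""
    assessors = sorted({a for a, _ in cells})
    objects = sorted({o for _, o in cells})
    n_a, n_o = len(assessors), len(objects)
    n_rep = len(next(iter(cells.values())))
    complete = all((a, o) in cells and len(cells[(a, o)]) == n_rep
                   for a in assessors for o in objects)
    if not complete or n_a < 2 or n_o < 2 or n_rep < 2:
        raise ValueError("need a complete, balanced design with replicates")

    cell_mean = {key: _mean(vals) for key, vals in cells.items()}
    grand = _mean(cell_mean.values())
    a_mean = {a: _mean(cell_mean[(a, o)] for o in objects) for a in assessors}
    o_mean = {o: _mean(cell_mean[(a, o)] for a in assessors) for o in objects}

    # Sums of squares for objects, assessors, their interaction, and error.
    ss_obj = n_a * n_rep * sum((o_mean[o] - grand) ** 2 for o in objects)
    ss_ass = n_o * n_rep * sum((a_mean[a] - grand) ** 2 for a in assessors)
    ss_int = n_rep * sum(
        (cell_mean[(a, o)] - a_mean[a] - o_mean[o] + grand) ** 2
        for a in assessors for o in objects)
    ss_err = sum((x - cell_mean[key]) ** 2
                 for key, vals in cells.items() for x in vals)

    ms_obj = ss_obj / (n_o - 1)
    ms_ass = ss_ass / (n_a - 1)
    ms_int = ss_int / ((n_a - 1) * (n_o - 1))
    ms_err = ss_err / (n_a * n_o * (n_rep - 1))

    return {
        "F_obj": _ratio(ms_obj, ms_err),          # objects vs. replicate error
        "F_int_obj": _ratio(ms_obj, ms_int),      # objects vs. interaction (assumed)
        "F_ass": _ratio(ms_ass, ms_err),          # assessors vs. replicate error
        "F_int_ass": _ratio(ms_ass, ms_int),      # assessors vs. interaction (assumed)
        "F_interaction": _ratio(ms_int, ms_err),  # assessor x object interaction
    }
```

A large `F_interaction` indicates that assessors rank the objects differently on this attribute, which is the disagreement the panel as a whole should avoid.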


Interaction effects: F-ratio objects<br />

[Figure: RM ANOVA Object Interaction F Ratios by Attribute. F-ratio on a logarithmic axis (0.1 to 100) for the series F obj and F int obj, by attribute (flowers, rose, evergreen, wood, burnt, alcohol, pungent, medicinal, sulphur, grape, bitter). Annotations: significant interactions between assessors and products; significant difference between objects; no significant difference between objects.]


Interaction effects: F-ratio assessors<br />

[Figure: RM ANOVA Set Interaction F Ratios by Attribute. F-ratio on a logarithmic axis (0.1 to 100) for the series F ass and F int ass, by attribute (flowers, rose, evergreen, wood, burnt, alcohol, pungent, medicinal, sulphur, grape, bitter). Annotations: significant interactions between assessors and products; significant difference between assessors; no significant difference between assessors.]


Interaction effects: burnt, sulphur & wood<br />

[Figure: attribute ratings (scale 0 to 100) for burnt, sulphur and wood by subject (Subj 1 to Subj 10), one panel per attribute, across objects 4, 5, 7, 8, 10, 11, 13, 17, 18 and 20. Panel annotations: burnt, no significant interaction; sulphur, significant interaction; wood, significant interaction.]


Table of means<br />

attribute  obj 4  obj 5  obj 7  obj 8  obj 10  obj 11  obj 13  obj 17  obj 18  obj 20<br />

flowers    17     15     19     18     26      16      28      22      13      24

rose       13     13     18     13     18      15      23      16      10      17

evergreen  17     11     10     16     12      13      13      15      10      15

wood       8      3      6      4      4       3       4       3       12      2

burnt      9      3      4      7      3       3       5       3       9       4

alcohol    51     54     53     55     62      51      61      62      54      56

pungent    55     55     54     59     65      53      67      62      59      60

medicinal  4      3      9      5      5       2       6       4       6       4

sulphur    20     3      5      4      3       2       7       2       25      6

grape      37     39     41     39     41      41      39      39      30      41

red: significant at 1%<br />

blue: significant at 5%



To summarize<br />

‣ without individual reliability no valid panel and no valid measuring<br />

instrument<br />

‣ full diagnosis requires repeated measures<br />

‣ performance of assessors can be diagnosed very accurately,<br />

provided the right data and the right analysis tools are available<br />

Questions about panel monitoring or Senstools v3.3?<br />

Contact Pieter Punter at pieter@opp.nl<br />

see also: Example-GPA Using Senstools v3.3
