Exploring the Use of Speech Input by Blind People on ... - Washington
[ ] Yes
[ ] No
If the participant answered “No” to the question above, the survey
concluded with a final question that asked why not. If the
participant answered “Yes,” she was asked to recall a specific
instance in which she used dictation. She was then asked several
questions about this instance, such as when the instance occurred,
and what she dictated (a question to Siri, a text message, etc.).
The penultimate question presented the user with three statements
and asked her to rate her agreement with each statement on a
Likert scale. The statements were:
• Dictation on a smartphone is accurate.
• Using dictation on a smartphone (including the time it
takes to correct errors) is fast relative to an on-screen
keyboard.
• I am satisfied with dictation on my smartphone.
The survey concluded with a prompt for “other comments” and a
free-form text box for the participant's response.
Surveys were completed on the Internet and responses were
anonymized.
To analyze the results, we graphed the data and computed
descriptive statistics for all questions. We used Wilcoxon rank-sum
tests to compare Likert scale responses between groups. We
modeled the data with one factor, SightAbility, with two levels:
BLV and Sighted. The measures corresponded to the three Likert
response statements: Accurate, Fast, and Satisfied.
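The rank-sum comparison described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the survey: it assumes a two-sided test with a tie-corrected normal approximation and no continuity correction, and it reports the statistic in its Mann-Whitney U form. The function names and the sample responses are hypothetical.

```python
import math
from collections import Counter


def midranks(values):
    """Map each distinct value to its average rank in the pooled sample."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i..j-1 hold ranks i+1..j; tied values share the mean.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U, two-sided p) for a Wilcoxon rank-sum test.

    U is the rank sum of group_a minus its smallest possible value,
    n_a * (n_a + 1) / 2. The p-value uses a normal approximation with
    a tie correction, which matters for coarse Likert data.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    rank_sum_a = sum(ranks[x] for x in group_a)
    u = rank_sum_a - n_a * (n_a + 1) / 2

    n = n_a + n_b
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # every response identical: no evidence of a difference
    z = (u - n_a * n_b / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical 5-point Likert responses, for illustration only.
blv_satisfied = [5, 4, 4, 5, 3, 4, 5, 4]
sighted_satisfied = [3, 2, 4, 3, 3, 2, 4, 3]
u, p = rank_sum_test(blv_satisfied, sighted_satisfied)
print(f"U = {u}, p = {p:.4f}")
```

Because Likert responses take only five values, ties are unavoidable, which is why the sketch uses average ranks and a tie-corrected variance rather than an exact test.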
3.2 Results<br />
The survey responses showed that BLV people used dictation far
more frequently than sighted people. Specifically, 58 BLV
participants (90.6%) and 58 sighted participants (55.2%) had used
dictation recently. Among BLV participants, only one used
speech input on an Android device and the rest used it on iOS
devices. Among sighted people, 21 participants used speech input
on an Android device and 34 on an iOS device. Most BLV people
had used speech input within the last day, while most sighted people
had used it within the last week.
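As a quick consistency check on the counts and percentages above, the group sizes they imply can be recovered by dividing each count by its percentage. The sketch below is only arithmetic on the reported figures; the function name is hypothetical.

```python
def implied_group_size(count, percent):
    """Return the whole-number group size n for which count / n is about percent%."""
    return round(count / (percent / 100))


blv_total = implied_group_size(58, 90.6)
sighted_total = implied_group_size(58, 55.2)
print(blv_total, sighted_total)

# Recompute the percentages from the implied sizes, to one decimal place.
print(round(58 / blv_total * 100, 1), round(58 / sighted_total * 100, 1))
```

The same number of recent dictation users in each group therefore corresponds to very different proportions, because the sighted group is considerably larger.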
Both BLV and sighted participants used speech most for
composing text messages. Table 1 shows the kinds of messages
participants composed. Many more BLV than sighted people used
speech to compose emails. Figure 2 supports this finding,
showing that BLV people composed longer messages.
Table 1. Number of responses for a survey question.
What did you use speech input for?                         BLV  Sighted
A command (e.g., "Call bob smith")                           8       14
A question (e.g., "Siri, what's the weather like today?")   13       14
An email                                                    12        4
A text message                                              20       19
Other                                                        5        7
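To make the comparison in Table 1 easier to read, the counts can be converted to within-group shares. The sketch below only re-expresses the table's numbers; the variable and function names are hypothetical. Each column sums to 58, the number of recent dictation users in each group.

```python
# Counts copied from Table 1: label -> (BLV, Sighted).
TABLE_1 = {
    "A command": (8, 14),
    "A question": (13, 14),
    "An email": (12, 4),
    "A text message": (20, 19),
    "Other": (5, 7),
}


def column_shares(table):
    """Return each row's share of its column total as (BLV, Sighted) fractions."""
    blv_total = sum(blv for blv, _ in table.values())
    sighted_total = sum(sighted for _, sighted in table.values())
    return {
        label: (blv / blv_total, sighted / sighted_total)
        for label, (blv, sighted) in table.items()
    }


shares = column_shares(TABLE_1)
for label, (blv_share, sighted_share) in shares.items():
    print(f"{label:<16} BLV {blv_share:6.1%}   Sighted {sighted_share:6.1%}")
```

Expressed this way, email accounts for 12 of 58 BLV responses but only 4 of 58 sighted responses, while text messages make up the largest share in both groups.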
Figure 3 shows the means and standard deviations (SDs) of
participant Likert scale responses to the penultimate question of
the survey. Histograms of the data showed the distributions of
responses were roughly normal, so the means and SDs represent
the responses appropriately. As Figure 3 shows, BLV people
were more satisfied with speech than sighted people were, and they
more strongly agreed that it was faster than the on-screen keyboard.
This resulted in a significant effect of SightAbility on Satisfaction
(W = 2296, p < 0.001) and Speed (W = 2240, p = 0.001). There was no
significant effect of SightAbility on Accuracy, but there was a
strong trend (W = 1977, p = 0.067). Perhaps BLV people
encountered fewer recognition errors because they had more
practice using speech for input.
[Figure 2: bar chart. Axis: Responses (0 to 30). Categories: 1−5, 6−10, >10. Series: BLV, Sighted.]
Figure 2. Survey responses to the question, “About
how long was your dictated text?”
[Figure 3: bar chart. Axis: 0 to 5. Categories: Accurate, Fast, Satisfied. Series: BLV, Sighted.]
Figure 3. Survey responses on a 5-point Likert scale:
1 is strongly disagree, and 5 is strongly agree. Responses
were roughly normally distributed.
When prompted for other comments, many participants noted the
challenges of editing the recognizer’s output and speaking in
noisy environments. One blind participant explained,
“Accuracy in noisy environments is the biggest
challenge I feel. I prefer to dictate short commands
and text, saving long e-mail responses for a standard
computer. Editing can be a challenge.”
Three sighted participants felt awkward speaking to their
device. Another sighted participant echoed this concern and was
frustrated by the lack of feedback: “I find it hard to talk to the
device. Do you yell at it and hope it understands better?”
Participants who did not use speech for input recently were
mostly concerned with accuracy and errors. Some were also
concerned about privacy or social appropriateness, since other
people can hear what they say when they speak to their devices.
Some sighted participants said that they simply “don’t need” to
use speech for input or haven’t figured out how to use it yet.
3.3 Discussion
The survey results suggest that speech is already a widely used
eyes-free alternative to keyboard input. Blind people seem more
satisfied with speech than sighted people. This is probably
because keyboard input with VoiceOver is so much slower than