Response to VizWiz Comments

[Image: iPhone 4, by Witer via Flickr]

Bjoern Hartmann's Crowdsourcing Seminar at Berkeley read my VizWiz paper today, and his students provided written comments.  He asked me to share a response, which I did and which I have included below.

Looking back on VizWiz, I think what was interesting was that it showed (i) you can do crowdsourcing in something close to real-time (regardless of how you do it) and (ii) you can use resulting "deployable Wizard-of-Oz" prototypes to learn more about your population.  Through the VizWiz prototype, we found and verified (Soylent ref?) a number of visual questions that blind people might want answered, and isolated problems that effective tools will need to address to make them feasible.

I think nearly all of Bjoern's students found such problems in reading through the VizWiz paper, which is great!  I hope they'll be inspired to go solve them to make a VizWiz-like tool even better and more useful.


==Enabling Blind People to Take Pictures==
Blind people don't have nearly the trouble taking pictures that one might imagine. Think about all of the great non-visual cues that are available!

Nevertheless, there are a number of interesting approaches one might take to either help blind people take better pictures or lessen the impact of poor-quality pictures. We explored some simple approaches in the paper (darkness/blur detection), and have expanded the capabilities a lot since then while working with some local blind photographers (who already take some pretty great pictures). Our current version gives users the option to record a video instead of a still photo, although at the cost of the latency to send a larger file. The best improvement so far came from simply upgrading to the iPhone 4, with its better camera and flash!

To me, what is interesting about the VizWiz study is that it showed how far you can get with low-fidelity input and crowdsourcing, especially when the capture of that input is mediated by human intelligence (the blind user). Generally, when a question comes back with an answer like "the picture is bad," the user will take another picture and ask again.
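To give a feel for what darkness/blur detection can look like, here is a toy sketch in the spirit of what the paper describes. The function names, thresholds, and the variance-of-Laplacian sharpness heuristic are all illustrative choices for this post, not the checks our system actually shipped with:

```python
import numpy as np


def is_too_dark(gray: np.ndarray, threshold: float = 40.0) -> bool:
    """Flag images whose mean luminance (0-255 scale) falls below a threshold."""
    return float(gray.mean()) < threshold


def is_blurry(gray: np.ndarray, threshold: float = 100.0) -> bool:
    """Estimate sharpness as the variance of a 3x3 Laplacian response:
    blurry images have few strong edges, so the variance is low."""
    # Valid-mode convolution with the Laplacian kernel [[0,1,0],[1,-4,1],[0,1,0]]
    lap = (
        -4.0 * gray[1:-1, 1:-1]
        + gray[:-2, 1:-1] + gray[2:, 1:-1]
        + gray[1:-1, :-2] + gray[1:-1, 2:]
    )
    return float(lap.var()) < threshold
```

A check like this can run on the phone before upload, prompting the user to retake the photo instead of spending latency (and money) on an unanswerable question.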

==Answer Quality==
Our mechanism for dealing with answer quality was to present multiple answers to users. Most strategies you might consider for ensuring answer quality end up delaying the answer -- for instance, waiting for other workers to verify it. We decided to rely on the user to make sense of the answers, especially given that answers were correct the majority of the time. We actually saw zero malicious answers.
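The core of the strategy is simply to forward each answer the moment it arrives instead of holding answers back for a verification round. A minimal sketch (the names, answer cap, and timeout are illustrative, not our actual implementation):

```python
import queue
import time


def stream_answers(answers: queue.Queue, max_answers: int = 3,
                   timeout_s: float = 30.0):
    """Yield each worker answer to the user as soon as it arrives,
    rather than waiting for consensus or verification by other workers."""
    deadline = time.monotonic() + timeout_s
    received = 0
    while received < max_answers:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # overall timeout expired
        try:
            yield answers.get(timeout=remaining)
            received += 1
        except queue.Empty:
            break  # no further answers arrived in time
```

Because the user sees every answer, an occasional wrong one can be judged against the others -- the "quality control" happens in the user's head, at zero added latency.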

Depending on how you look at it, a correct (and quick) "there's nothing in this image" is a great answer because it signals that the person should try again. But, users wanted more interactivity. VizWiz highlights challenges that research could address in facilitating such interactivity -- you can get very interactive responses by pairing a user with a particular worker, but how do you keep that worker around for the whole interaction?  What if that worker ends up not being a "good" worker? How do you pair a user with a group of workers and have the interactions still make sense?  All great questions.


==Other Latency-Reducing Strategies==
What I like about this paper is that (I think) it introduces the idea that crowdsourcing could happen in something like real-time. I agree that strategies like signaling to workers when work is available may be more cost-effective than keeping them busy-waiting, but busy-waiting ends up being cost-effective if you have enough users. The additional complexity of a signaling system may not be worth it in the end. Yet another great problem to explore more.
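The busy-waiting idea reduces to keeping a small buffer of open tasks so that a few workers are always idling, ready to grab a new question the instant it arrives. A toy sketch of that policy (the function name, pool size, and formula are all illustrative, not what our system actually does):

```python
def hits_to_post(active_hits: int, pending_questions: int,
                 pool_size: int = 5) -> int:
    """Busy-waiting policy: keep enough HITs open that, beyond the
    questions currently pending, a small pool of workers is always
    waiting. Returns how many new HITs to post right now."""
    target = pending_questions + pool_size
    return max(0, target - active_hits)
```

The cost of the idle pool is fixed, so as question volume grows it is amortized over more real work -- which is why busy-waiting wins with enough users, while a signaling system pays its complexity cost regardless of volume.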




About this Entry

This page contains a single entry by Jeff published on February 7, 2011 4:45 PM.
