Your task is to evaluate the subjective quality of the speech in short (2-3 second) audio files. Each HIT can be completed in 90 seconds.
We have methods that check the consistency of your answers against each other, against those of your fellow workers, and against references we know to be accurate. We will use these methods to rank the submitted assignments by quality.
For this experiment we will pay a base reward of $0.10/HIT for every accepted HIT. We have made available a set of 12 different HITs. You will receive a bonus of:
Bonuses will be paid up to 7 days after submission, because we can only rank the submissions once we have a statistically significant number of answers. The base reward will always be paid within 24 hours of submission.
Each file should be given a score according to the following scale, known as the MOS (mean opinion score) scale:
Score | Quality of the Speech | Level of Distortion |
5 | Excellent | Imperceptible |
4 | Good | Just perceptible, but not annoying |
3 | Fair | Perceptible and slightly annoying |
2 | Poor | Annoying, but not objectionable |
1 | Bad | Very annoying and objectionable |
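For readers curious how individual ratings on this scale are combined, the sketch below averages the 1-5 scores given to a single file, which is what "mean opinion score" refers to. The function name `mean_opinion_score` and its validation rules are illustrative assumptions, not the requesters' actual tooling.

```python
from statistics import mean

# The five integer scores allowed by the scale above.
VALID_SCORES = {1, 2, 3, 4, 5}


def mean_opinion_score(ratings):
    """Return the average of the 1-5 ratings collected for one audio file.

    Hypothetical helper: the name and the error handling are assumptions
    made for illustration only.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in VALID_SCORES:
            raise ValueError(f"rating {rating!r} is outside the 1-5 scale")
    return mean(ratings)
```

For example, four workers rating the same file 5, 4, 4 and 3 would give that file a mean opinion score of 4.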
The following references illustrate the meaning of each score. Please note that you will encounter many other types of noise and distortion. Therefore, these examples do not exhaust the range of conditions you can expect to hear.
The following recording represents clean speech with imperceptible noise or distortion, which is given a reference score of 5.0.
The following represents the best possible quality which can be obtained with a conventional telephone, and has a reference score of 4.5.
This file contains speech corrupted by background noise, and has a reference score of 2.5.
Finally, this is an example of significantly distorted speech, with a reference score of 1.5.
To obtain accurate results, we strongly recommend that you wear headphones and work in a quiet environment. Otherwise, you might not be able to discriminate between files with clearly different features.
Your results will be collected and evaluated for consistency. We (the requesters) have an estimate of each file's subjective quality that conforms to the references above. Thus, we can detect when someone submits random scores or does not rate according to these instructions, and such work may be rejected. You can rest assured that your work will be approved if you rate according to the chart and examples above.
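As a rough illustration of the kind of consistency check described above, the sketch below compares one worker's scores with reference estimates for the same files, using the Pearson correlation and the mean absolute error. This is only one plausible approach. The function names and the thresholds `min_corr` and `max_mae` are hypothetical and do not describe the requesters' actual method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant list (e.g. the same score for every file) carries
        # no ranking information, so treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def looks_consistent(worker_scores, reference_scores,
                     min_corr=0.5, max_mae=1.0):
    """Return True if a worker's scores track the reference estimates.

    Both thresholds are assumed values chosen for illustration.
    """
    corr = pearson(worker_scores, reference_scores)
    mae = sum(abs(w - r) for w, r in zip(worker_scores, reference_scores))
    mae /= len(worker_scores)
    return corr >= min_corr and mae <= max_mae
```

With the four reference scores listed above (5.0, 4.5, 2.5, 1.5), a worker who answers 5, 4, 3, 1 passes this check, while one who answers 1, 5, 1, 5 or gives every file a 3 does not.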
Answers will be either reviewed or automatically approved within 24 hours.