Smartsourcing: How to Obtain High Quality Lists through Mechanical Turk

Brent Frei's blog - Mar 12 2009 - 4:22pm

Many of our customers are interested in a category of work best described as “I need a list of information.” For example:

1.  Get me the top 10 Publishers in China and their contact information

2.  Find 4 Disc Jockeys in Dallas, Texas along with their profile, rates, and contact information

3.  Get me a list of Salesforce’s AppExchange applications along with each app’s profile data

You get the idea…

When several people work on the same list, you tend to end up with duplicate information. Here is a slick workflow that avoids that by handing the whole list to a single worker. This approach has proven to provide better-than-average quality results, and it reduces the overhead associated with quality control.

As our example, let’s use the work that I did for the blog post on Google and Salesforce Apps: having someone compile the full list of Salesforce AppExchange applications in the Office Productivity section (shown below), including the useful data associated with each one.


Step One: Create a Smartsheet with the columns of data you wish the worker to fill in. You can even preload the sheet with data and instructions if that makes sense. It can be anything from an empty sheet to a sheet that includes some data examples of how you want it filled out.


Step Two: Publish the Smartsheet as an Editable Sheet, and copy the “Anyone can edit this sheet online at:” string from the dialog. You will use this link in the Smartsourcing work request Instructions.


Step Three: Create the Smartsourcing request from a row or set of rows. In this case, I’ve done it from the main parent row. I included a link to the Salesforce catalog in this row. I could also have included the Published link so that it would be passed along as part of the Include Data, but since I am launching the request from within the sheet to be filled out, I simply pasted that link into the instructions.


It’s important to write the Instructions so that the worker understands they will be entering the work into the Sheet that opens from the link you provide (legible instruction text is included at the bottom of this post). Given the potential volume of the work, carefully consider the price you assign to this bundle. It must be sufficient to reward the worker for taking the risk of committing a large block of time to one massive request. One good approach is to offer a good fee for some baseline of work (say, the first 100 rows), with a bonus clause describing the upside for additional work (for example, “and I’ll bonus you an additional 2 cents per additional acceptable row”).
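To make the baseline-plus-bonus idea concrete, here is a minimal sketch of the arithmetic. It assumes the $5 fee from my example covers the first 100 rows and that the bonus is the 2 cents per additional acceptable row mentioned above. Pairing those particular numbers is just for illustration, and the function name is made up.

```python
from decimal import Decimal


def payout(acceptable_rows,
           base_fee=Decimal("5.00"),       # assumed fee for the baseline block
           baseline_rows=100,              # assumed size of the baseline block
           bonus_per_row=Decimal("0.02")): # assumed bonus per extra acceptable row
    """Return the total owed for one bundled request (illustrative only)."""
    # Only rows beyond the baseline earn the bonus; never go negative.
    extra_rows = max(0, acceptable_rows - baseline_rows)
    return base_fee + extra_rows * bonus_per_row


if __name__ == "__main__":
    for rows in (100, 150):
        print(rows, "rows ->", payout(rows))
```

Decimal is used instead of float so that the cents add up exactly. Whatever numbers you pick, state the baseline and the bonus rate in the Instructions so the worker can do the same calculation before accepting.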

The text I chose to use in this case is reproduced at the bottom of this post (it comes from the confirmation email you get when you submit a job).


By bundling all this work into one worker’s request and pricing it at an appropriately high number (in my example, $5), you can expect the work to be accepted within minutes. By stating that the worker should accept the work only if they intend to complete all of it accurately, and that you will reject it otherwise, you increase the odds that all the work is done well. Few workers will dedicate a large block of time to a task if they are not confident of getting paid. And workers who routinely abuse the process by trying to hide shoddy work within large-volume jobs generally do not accept this type of work.

The worker on Amazon’s Mechanical Turk will see this:


Step Four: If you wish, you can view your Smartsheet while the worker is saving information to it in the published form. When the worker is finished, per your instructions, the worker will return to the HIT in MTurk and put in the required information and push Submit. Review the results and be fair about paying for the work performed.
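If the list is long, a quick script can help with the review. The sketch below assumes you have exported the finished sheet to CSV text (the export step, the column names, and the function name are my assumptions for illustration). It checks the row count against the range you told the worker to expect (85 to 100 in my example) and flags duplicate or incomplete rows for a closer look.

```python
import csv
import io


def review_rows(csv_text, key_column, expected_min=85, expected_max=100):
    """Summarize an exported sheet: row count, duplicate keys, problem rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    seen = set()
    duplicates = []    # file line numbers whose key repeats an earlier row
    problem_rows = []  # file line numbers with blank, missing, or extra cells
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        # DictReader yields None for missing cells and a list for extra ones;
        # anything that is not a non-blank string gets flagged.
        if any(not isinstance(v, str) or not v.strip() for v in row.values()):
            problem_rows.append(line_no)
        key = (row.get(key_column) or "").strip().lower()
        if key:
            if key in seen:
                duplicates.append(line_no)
            else:
                seen.add(key)
    return {
        "row_count": len(rows),
        "count_in_range": expected_min <= len(rows) <= expected_max,
        "duplicates": duplicates,
        "problem_rows": problem_rows,
    }


if __name__ == "__main__":
    # Hypothetical three-row export with one blank cell and one repeated name.
    sample = "App Name,Publisher\nAlpha,Acme\nBeta,\nalpha,Acme\n"
    print(review_rows(sample, "App Name"))
```

This only points you at rows worth a second look. Whether the work is acceptable is still your call, so spot-check a few entries against the source before you approve or reject.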


Have fun. This should open up an entire world of possibilities for you.

 - Brent


Text Associated with the Example: 

Title: Smartsourcing Work Request: Salesforce AppExchange Application Information

Instructions: Please click on this Data Entry Sheet link   

to open a separate window in which to input all the information requested for the Salesforce AppExchange applications listed in the AppExchange Weblink. There should be between 85 and 100 applications listed in the resulting category. Complete the list in the same format in which the samples have been entered into the sheet. When you have entered all the apps listed, BE SURE TO PUSH THE SAVE button in the upper left hand corner to commit the information to the sheet. This can safely be done several times along the way to ensure no loss of work. Then note the count of apps entered (including the sample provided and headers), and returning to the HIT, enter that number into the Comments field along with anything you might want to say, and Submit. Please only accept this HIT if you intend to do all the work or it will be rejected entirely.

AppExchange Weblink:

An answer includes: Comments

Options: Pay $5.00 per approved answer submitted within the next 12 hours. Limit to workers with a 95% approval rating and allow them to reserve a question for 1 hour. Answers auto-approve after 1 day. Up to $5.00 (plus $1.01 in commissions) will be charged to your account.

