Taskflow - AI Dataset Annotation

How to Distribute a Large Dataset for Annotation

Taskflow lets you distribute large datasets across multiple annotation tasks within a single Prolific study. Participants are automatically allocated to different URLs, so your dataset can be annotated in parallel.


Before you start: plan your tasks

Divide your dataset into manageable subsets before creating your study. Each subset needs a hosted URL where participants will complete their annotation work.
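
As a rough illustration of this planning step, the sketch below splits a list of item IDs into fixed-size subsets and pairs each subset with a task URL. The item IDs, the subset size, and the "?task=N" URL scheme are all hypothetical; use whatever scheme your annotation platform expects.

```python
from typing import List, Sequence


def split_into_subsets(items: Sequence[str], subset_size: int) -> List[List[str]]:
    """Split items into consecutive subsets of at most subset_size items."""
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    return [list(items[i:i + subset_size]) for i in range(0, len(items), subset_size)]


# Hypothetical dataset of 10 item IDs, annotated in subsets of 4.
item_ids = [f"item-{n}" for n in range(1, 11)]
subsets = split_into_subsets(item_ids, subset_size=4)

# Hypothetical URL scheme: one hosted task page per subset.
BASE_URL = "https://yourannotationplatform.com"
task_urls = [f"{BASE_URL}?task={index}" for index, _ in enumerate(subsets, start=1)]

for url, subset in zip(task_urls, subsets):
    print(url, len(subset))
```

The last subset may be smaller than the others, so it is worth checking that your task pages handle a short final batch.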

If you're using the API to create or manage your Taskflow study, see our API documentation.


Setting up Taskflow

Via the web app

  1. Log in to your Prolific researcher account

  2. Go to Projects and select Add new study

  3. In Data collection type, select Taskflow

  4. Upload a CSV file structured as follows:

    • Column A: one URL per row

    • Column B: the number of participants to allocate to that URL (optional, defaults to 1 if left blank)

  5. Taskflow will distribute participants according to your chosen allocation strategy. You can manually adjust per-URL allocations after upload
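
The CSV layout described in step 4 can be generated with a short script. This is a minimal sketch: the task URLs and counts are made up, and it assumes the file has no header row, which the steps above do not state either way.

```python
import csv
import io

# Hypothetical task URLs and per-URL participant counts.
# None leaves column B blank, which per the steps above defaults to 1.
tasks = [
    ("https://yourannotationplatform.com?task=1", 3),
    ("https://yourannotationplatform.com?task=2", 3),
    ("https://yourannotationplatform.com?task=3", None),
]


def build_taskflow_csv(rows):
    """Return CSV text with the URL in column A and the allocation in column B."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for url, allocation in rows:
        writer.writerow([url, "" if allocation is None else allocation])
    return buffer.getvalue()


csv_text = build_taskflow_csv(tasks)
print(csv_text)
```

To upload the result, write csv_text to a file such as taskflow_urls.csv (a name chosen here only for illustration).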

Via the API

Set the access_details attribute on your study creation request as an array of objects, each with:

  • external_url — the task URL

  • total_allocation — the number of participants to allocate to that URL

For full request body details, see Create study and Get Taskflow progress in the API documentation.
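
The sketch below builds only the access_details portion of a study creation request from a mapping of URLs to participant counts. The URLs and counts are hypothetical, and every other field a real request needs is deliberately left out; take those from the API documentation.

```python
import json


def build_access_details(allocations):
    """Turn a {url: participant_count} mapping into an access_details array."""
    return [
        {"external_url": url, "total_allocation": count}
        for url, count in allocations.items()
    ]


# Hypothetical task URLs and allocations.
allocations = {
    "https://yourannotationplatform.com?task=1": 50,
    "https://yourannotationplatform.com?task=2": 50,
}

# Only the access_details fragment; the rest of the request body is omitted.
body_fragment = {"access_details": build_access_details(allocations)}
print(json.dumps(body_fragment, indent=2))
```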


Allocation strategies

Taskflow offers three allocation strategies, configurable via the API using the data_collection_metadata.allocation_strategy field.

  • Round robin (default)
    How it works: distributes participants evenly across URLs, prioritizing URLs that haven't been allocated yet.
    Best for: most annotation studies.

  • Random
    How it works: selects randomly from any URL that hasn't reached capacity.
    Best for: studies where even distribution isn't required.

  • Deallocated first round robin
    How it works: like round robin, but prioritizes refilling slots vacated by returns, rejections, or timeouts before moving to unallocated URLs.
    Best for: studies where full coverage across all URLs is critical.
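
To build intuition for how round robin and random differ, here is a toy simulation. It is one reading of the descriptions above, not Prolific's actual implementation: the function names, the tie-breaking rule, and the short URL labels are all assumptions, and the deallocated first variant is not modeled.

```python
import random


def open_urls(capacity, allocated):
    """URLs that have not yet reached their capacity."""
    return [url for url in capacity if allocated[url] < capacity[url]]


def round_robin_pick(capacity, allocated):
    """Pick the open URL with the fewest allocations (ties: first listed)."""
    candidates = open_urls(capacity, allocated)
    if not candidates:
        return None
    return min(candidates, key=lambda url: allocated[url])


def random_pick(capacity, allocated, rng):
    """Pick any open URL at random."""
    candidates = open_urls(capacity, allocated)
    return rng.choice(candidates) if candidates else None


# Hypothetical study: three URLs with two places each.
capacity = {"task=1": 2, "task=2": 2, "task=3": 2}
allocated = {url: 0 for url in capacity}

order = []
for _ in range(4):
    url = round_robin_pick(capacity, allocated)
    allocated[url] += 1
    order.append(url)

print(order)
```

In this toy model, round robin gives every URL one participant before any URL gets a second, whereas random_pick could fill one URL completely before another is touched.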


Adding participant tracking to your URLs

You can embed Prolific's built-in placeholders directly into your CSV URLs. These are automatically substituted with real values when a participant is allocated:

  • {{%PROLIFIC_PID%}} — the participant's Prolific ID

  • {{%STUDY_ID%}} — your study ID

  • {{%SESSION_ID%}} — the session ID

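Example (the host and query parameter names are illustrative):

https://yourannotationplatform.com?task=1&PROLIFIC_PID={{%PROLIFIC_PID%}}&STUDY_ID={{%STUDY_ID%}}&SESSION_ID={{%SESSION_ID%}}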

This lets you track which participant completed which task without any additional configuration. These placeholders work whether you're uploading a CSV via the web app or setting URLs via the API.
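
If you generate your URLs with a script, the placeholders can be appended as query parameters. In this sketch the parameter names (PROLIFIC_PID, STUDY_ID, SESSION_ID) are an assumption; name them whatever your annotation platform reads. The query string is joined by hand because a helper such as urllib.parse.urlencode would percent-encode the braces and percent signs, and the placeholders would then presumably no longer be recognized.

```python
# Placeholder tokens as listed above; the parameter names (keys) are assumed.
PLACEHOLDERS = {
    "PROLIFIC_PID": "{{%PROLIFIC_PID%}}",
    "STUDY_ID": "{{%STUDY_ID%}}",
    "SESSION_ID": "{{%SESSION_ID%}}",
}


def add_tracking(url: str) -> str:
    """Append the three placeholders as query parameters, left unencoded."""
    separator = "&" if "?" in url else "?"
    query = "&".join(f"{name}={token}" for name, token in PLACEHOLDERS.items())
    return f"{url}{separator}{query}"


# Hypothetical task URL.
tracked_url = add_tracking("https://yourannotationplatform.com?task=1")
print(tracked_url)
```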


How Taskflow handles incomplete submissions

Taskflow automatically releases a URL slot back into the available pool when a participant returns their submission, times out, or is rejected. Released slots are reallocated to other available participants, so your study can still reach its completion targets. The order in which slots are refilled depends on your chosen allocation strategy.

A participant will never be allocated the same URL twice, even if they previously returned that submission. This is guaranteed platform behavior and requires no configuration.


Screen-outs

When a participant is screened out of a task, Taskflow automatically adds 1 to the capacity for that URL, preserving your intended completion targets.
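
The bookkeeping for released slots and screen-outs can be pictured with a toy model of a single URL. This is only an interpretation of the rules above: the class and method names are hypothetical, and it assumes a screened-out participant keeps occupying their slot while the capacity grows by 1.

```python
from dataclasses import dataclass


@dataclass
class UrlSlots:
    """Toy bookkeeping for one task URL (not Prolific's implementation)."""
    capacity: int
    allocated: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.allocated

    def allocate(self) -> None:
        if self.free < 1:
            raise RuntimeError("no free slots on this URL")
        self.allocated += 1

    def release(self) -> None:
        """A return, timeout, or rejection frees the slot again."""
        if self.allocated < 1:
            raise RuntimeError("nothing to release")
        self.allocated -= 1

    def screen_out(self) -> None:
        """Assumed model: the slot stays used, and capacity grows by 1."""
        self.capacity += 1


slots = UrlSlots(capacity=3)
for _ in range(3):
    slots.allocate()   # all three places taken, free == 0
slots.release()        # one participant returns, free == 1
slots.screen_out()     # one participant is screened out, capacity == 4
print(slots.capacity, slots.allocated, slots.free)
```

After these events one participant is still working and two places are open, so the URL can still collect its original three completions.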


Increasing places on an active study

The total number of available places in a Taskflow study is calculated automatically from the sum of all per-URL total_allocation values.
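
As a quick check before publishing, you can compute the same sum yourself. The URLs and allocations below are hypothetical.

```python
# Hypothetical access_details entries for a three-URL study.
access_details = [
    {"external_url": "https://yourannotationplatform.com?task=1", "total_allocation": 40},
    {"external_url": "https://yourannotationplatform.com?task=2", "total_allocation": 40},
    {"external_url": "https://yourannotationplatform.com?task=3", "total_allocation": 20},
]

# Total places is the sum of every per-URL total_allocation.
total_places = sum(entry["total_allocation"] for entry in access_details)
print(total_places)
```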

Via the web app

  1. Go to your study's submissions page

  2. Select Increase places from the Actions menu

  3. Specify the total number of places you want to add

  4. Distribute the new places across your existing URLs

  5. Select Confirm to apply the update

Important: you can't add new URLs to a published Taskflow study via the web app.

Via the API

Update the access_details for the relevant URLs by increasing their total_allocation value. Note that you can't set a total_allocation lower than the number of slots already allocated for that URL.

You can also increase total places by adding new URLs to a published Taskflow study. To do so, send a PATCH request to https://api.prolific.com/api/v1/studies/{study_id}/ with only the URLs you're adding or updating. For example, to add a new URL with 100 places:

{
  "access_details": [
    {
      "external_url": "https://yourannotationplatform.com?task=6",
      "total_allocation": 100
    }
  ]
}

You only need to include the URLs being added or updated, not your full list of existing URLs.
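
One way to prepare that PATCH request from a script is sketched below using only the Python standard library. The study ID and token are placeholders, and the "Token" authorization scheme is an assumption you should confirm against the API documentation. The sketch builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.prolific.com/api/v1"


def build_add_url_request(study_id, api_token, external_url, total_allocation):
    """Build (but do not send) a PATCH request that adds or updates one URL."""
    body = {
        "access_details": [
            {"external_url": external_url, "total_allocation": total_allocation}
        ]
    }
    return urllib.request.Request(
        url=f"{API_BASE}/studies/{study_id}/",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            # Assumed auth scheme; check the API documentation.
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder study ID and token; replace with your own values.
request = build_add_url_request(
    "YOUR_STUDY_ID",
    "YOUR_API_TOKEN",
    "https://yourannotationplatform.com?task=6",
    100,
)
print(request.get_method(), request.full_url)

# To send it, call urllib.request.urlopen(request) once real values are in place.
```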


Finding which URL a participant was allocated

The URL allocated to each submission is recorded in the demographic export under the URL column. Use this to match completed submissions back to specific tasks or dataset subsets.
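
A small script can group the export by task URL. In this sketch only the URL column name comes from the description above; the "Participant id" and "Status" columns and all of the sample rows are assumptions, so adjust them to match your actual export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a demographic export. Only the "URL" column name
# is taken from the article; the other columns and the rows are made up.
export_text = """Participant id,Status,URL
p1,APPROVED,https://yourannotationplatform.com?task=1
p2,APPROVED,https://yourannotationplatform.com?task=2
p3,RETURNED,https://yourannotationplatform.com?task=1
"""


def participants_by_url(csv_text, id_column="Participant id"):
    """Map each allocated URL to the participant IDs recorded against it."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["URL"]].append(row[id_column])
    return dict(grouped)


by_url = participants_by_url(export_text)
for url, participants in by_url.items():
    print(url, participants)
```

With a real export, read the downloaded file's contents and pass them to participants_by_url instead of the inline sample.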


How many URLs can I add?

Taskflow collections support up to 20,000 URLs. If your dataset requires more, contact Prolific via the Support icon at the bottom right of the screen, or reach out to your Prolific contact.


Which study types does Taskflow support?

Taskflow is compatible with representative sample and quota studies.

Via the web app, select your sample type in the Study distribution and screener selection section as you normally would.

Via the API, Taskflow automatically applies your study's sample type across all sub-studies when you publish.

If you need to screen participants using custom criteria, we recommend the two-study screening approach before launching your Taskflow study.
