Methodological Justification Pack: Responding to Reviewer and Editor Concerns

Journals and ethics boards increasingly ask researchers to describe the quality controls behind their online data collection. This reflects growing standards around methodological transparency.

This document covers the questions that come up most often during peer review when a study has used online participant recruitment. Each entry provides context on why the question matters, followed by a clear response that can be adapted for revision letters, cover letters, or correspondence with editors.

Common Questions and Responses

"How do you know your participants aren't bots?"

This question comes up frequently as awareness of automated survey fraud grows. The short answer is that Prolific makes it very difficult for bots to enter the participant pool in the first place, and then adds further checks during and after studies.

Before anyone can access a study on Prolific, they have to pass identity verification that includes a live video selfie and a government-issued document check. This is not a one-time checkbox: participants are also subject to surprise re-verification at random intervals, where they must complete a new video selfie. Even an account that somehow slipped through the initial check would therefore face detection at a later re-verification.

On top of this, Prolific's quality system (known as Protocol) runs 50+ automated checks. These cover things like detecting whether someone is using automation tools, monitoring for unusual behavioural patterns, and flagging submissions that are completed suspiciously fast. A dedicated data quality team also conducts hundreds of manual reviews every day.

The combined effect of these protections is measurable: the identity verification system, powered by Entrust’s technology, maintains a fraud rate below 0.1% (false acceptance of fraudulent documents or identities). This figure reflects the performance of the verification step specifically, not a platform-wide fraud rate.

For researchers who want an additional layer of protection, Prolific offers Authenticity Checks, an optional tool that runs during data collection and analyses behavioural signals in real time to flag responses that show signs of automated responding.

"What safeguards are in place against AI-generated responses?"

This question is related to concerns about bots but addresses a different set of issues. There are two threats relevant here, each requiring different detection approaches.

The first is AI-assisted participation, where a real, verified human participant uses a tool such as ChatGPT to help generate their answers, most commonly for open-ended text questions. The participant is genuine, but part of their response is not. This is currently the more commonly observed form of AI-related concern in online research.

The second is AI-powered bots (sometimes referred to as agentic AI or autonomous agents), where an automated system completes the study entirely without human involvement. These are more sophisticated than traditional bots and may attempt to simulate human-like behaviour. Current evidence suggests this remains rare on platforms with robust identity verification, though it’s an area of active monitoring and development.

Prolific addresses both threats at multiple levels. At the platform level, onboarding screening includes checks for AI misuse, and participants identified as having used AI in previous studies are removed from the pool. This means the participant pool is continuously maintained rather than verified only at registration.

At the study level, researchers can enable Authenticity Checks, which have two components targeting these different threats:

  1. LLM authenticity checks detect when a human participant is using AI tools to generate free-text responses. The system looks for patterns associated with AI-assisted responding; for example, switching between browser tabs (which could indicate copying questions into a chatbot) or pasting text into response fields (which could indicate pasting answers from one). The system analyses 15+ behavioural signals and achieves 98.7% precision, 78.9% recall, and 88.4% overall accuracy (the sketch after this list shows how these three metrics relate).

  2. Bot authenticity checks detect when an AI agent or automated script is completing the study. These analyse behavioural patterns across all question types to identify non-human environments, and achieved over 99.9% accuracy in testing.
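
For readers who want to unpack these figures: precision, recall, and accuracy all derive from a standard confusion matrix over flagged and unflagged responses. The sketch below is purely illustrative, with hypothetical counts chosen to roughly reproduce the reported numbers; it is not Prolific's evaluation data or code.

```python
# Hypothetical confusion-matrix counts chosen to roughly reproduce the
# figures reported above; these are NOT Prolific's evaluation data.
def precision_recall_accuracy(tp: int, fp: int, fn: int, tn: int):
    precision = tp / (tp + fp)                    # flagged responses that were truly AI-assisted
    recall = tp / (tp + fn)                       # AI-assisted responses that were caught
    accuracy = (tp + tn) / (tp + fp + fn + tn)    # overall agreement with ground truth
    return precision, recall, accuracy

# tp=750, fp=10, fn=201, tn=858 gives ~98.7% precision, ~78.9% recall,
# and ~88.4% accuracy: flags are rarely false alarms, but some
# AI-assisted responses go undetected.
print(precision_recall_accuracy(tp=750, fp=10, fn=201, tn=858))
```

In other words, high precision with lower recall means a flag is strong evidence of AI use, while an absence of flags is not proof of its absence; this is why the checks are best combined with the other safeguards described here.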

A January 2026 internal audit found that 0.8% of responses across the platform were flagged for AI-generated content. This is a low rate, and it reflects the fact that most participants flagged for AI use are removed from the pool before they reach subsequent studies.

Prolific does not publicly disclose every detection method it uses. This is intentional, since sharing the full details of how detection works would make it easier for bad actors to find ways around it. The measures described here represent what can be shared publicly.

"How is Prolific different from MTurk or other online platforms?"

Different online recruitment platforms work in fundamentally different ways, and these differences affect data quality.

Prolific is a closed platform: participants have to apply, pass identity verification (including a video selfie), and be accepted before they can access any studies. Only about 13% of applicants are invited from the waitlist, and of those, roughly 55% pass the full onboarding process. Once on the platform, participants are continuously monitored and periodically re-verified.

Amazon Mechanical Turk (MTurk), by contrast, is an open marketplace where workers self-register with less rigorous identity verification. MTurk also has no minimum pay requirement, which creates a different incentive structure; very low pay can attract participants who are motivated to complete as many tasks as possible with minimal effort.

Marketplace aggregators (platforms like Cint, PureSpectrum, or Qualtrics panel services) route participants from multiple third-party panels, rather than maintaining their own participant pool. This means the platform has less direct control over who participates and how they were verified.

These differences show up in data quality. Peer et al. (2022) and Douglas et al. (2023) both compared data quality across multiple platforms. Peer et al. (2022) found that Prolific was the only platform to deliver high data quality across all four measures (attention, comprehension, honesty, and reliability) without requiring quality filters. Douglas et al. (2023) found that Prolific offered the lowest cost per high-quality respondent at $1.90, compared to $4.36 on MTurk and $8.17 on Qualtrics, with recall accuracy of 83.47% versus MTurk's 52.20%. A smaller direct comparison also found that MTurk participants showed higher attentional disengagement than Prolific participants (Albert & Smilek, 2023).
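
The "cost per high-quality respondent" metric divides total spend by the number of respondents who survive quality screening, which is why a platform with low per-submission pay can still be expensive once low-quality submissions are discarded. A minimal sketch with made-up numbers (not the figures behind the Douglas et al., 2023 comparison):

```python
# Hypothetical numbers for illustration; not the Douglas et al. (2023) data.
def cost_per_high_quality_respondent(pay_per_submission: float,
                                     n_submissions: int,
                                     n_passing_quality_checks: int) -> float:
    return (pay_per_submission * n_submissions) / n_passing_quality_checks

# A low per-submission price is offset by a low quality pass rate...
print(cost_per_high_quality_respondent(1.00, 100, 25))  # 4.00 per usable respondent
# ...while a higher price with a high pass rate can cost less overall.
print(cost_per_high_quality_respondent(1.60, 100, 90))  # ~1.78 per usable respondent
```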

Response consistency tells a similar story. Kay (2025) administered pairs of semantically contradictory items (e.g., “I talk a lot” and “I rarely talk”) to samples from Connect, Prolific, and MTurk. On Connect and Prolific, these item pairs correlated negatively, as expected. On MTurk, over 96% of the pairs were positively correlated, indicating widespread inconsistent responding. Notably, this pattern could not be remedied by applying standard attention checks or by filtering for high-reputation participants.
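
Researchers can run a comparable consistency check on their own data. The sketch below assumes a pandas DataFrame with hypothetical column names for paired contradictory items; it is a minimal reconstruction of the logic, not Kay's (2025) analysis code.

```python
import pandas as pd

# Hypothetical column names; replace with your own paired items.
# Each pair contains two semantically contradictory statements,
# e.g. "I talk a lot" vs "I rarely talk", rated on the same scale.
CONTRADICTORY_PAIRS = [
    ("talk_a_lot", "rarely_talk"),
    ("like_order", "leave_mess"),
]

def contradictory_pair_correlations(df: pd.DataFrame) -> dict:
    """Return the Pearson correlation for each contradictory item pair.

    Contradictory items should correlate negatively across consistent
    responders; correlations near or above zero suggest widespread
    careless or inconsistent responding in the sample."""
    return {pair: df[pair[0]].corr(df[pair[1]]) for pair in CONTRADICTORY_PAIRS}

# Usage (assuming a CSV of item responses):
# df = pd.read_csv("responses.csv")
# print(contradictory_pair_correlations(df))
```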

"Can you prove the data wasn't contaminated by fraud?"

This is a fair question, and the honest answer is that no platform, online or offline, can guarantee zero contamination. Lab-based studies, for instance, are subject to demand characteristics, experimenter bias, and participant misrepresentation. The relevant standard is not whether contamination is theoretically possible, but whether adequate, documented safeguards are in place and whether the measurable outcomes demonstrate those safeguards are working.

On that measure, Prolific's quality infrastructure produces measurable outcomes at each layer:

  • Entry barrier: The identity verification system, powered by Entrust’s technology, maintains a false acceptance rate below 0.1% for fraudulent documents or identities.

  • AI-assisted responding: A January 2026 internal audit found that 0.8% of responses across the platform were flagged for AI-generated content. This rate reflects the cumulative effect of onboarding screening, ongoing monitoring, and removal of flagged participants from the pool.

  • Overall rejection rate: Across all studies in 2025, 0.5% of submissions were rejected by researchers. A low rejection rate should not be interpreted as an absence of quality controls, but as an indication that upstream filtering is removing most problematic participants before they reach each study.

Researchers can further strengthen their evidence by enabling Authenticity Checks and reporting the results in their manuscript, and by including their own attention and comprehension checks. The data quality reporting template in Writing About Prolific in Your Research provides ready-to-use language for this.

"Why should we trust an online sample over a lab sample?"

This question is about online data collection in general rather than about Prolific specifically, but it comes up often enough that it's worth addressing.

A large body of research has examined whether online samples produce comparable results to lab-based samples, and the overall finding is that for many types of studies, online data is comparable in quality and replicability. This holds across a range of paradigms: Prolific participants were statistically indistinguishable from web-tested university students on a demanding working memory task (Uittenhove et al., 2023). Similar equivalence has been found for theory of mind tasks (Krendl et al., 2024), economic decision-making (Gupta et al., 2021), auditory tasks (Zhao et al., 2022), and simulation-based problem-solving (Wang et al., 2024). More broadly, Kay and Vlasceanu (2026) found that Prolific participants produced response distributions consistent with established findings on moral reasoning, anchoring, and framing effects, and showed the expected range of individual variability, patterns that synthetic respondents generated by six LLMs failed to replicate.

Online and lab samples each have their strengths. Lab settings offer more control over the physical environment (lighting, noise, distractions), which matters for some study types. Online samples offer greater demographic diversity (Zhao et al., 2022), access to larger sample sizes, and faster data collection. Many research programmes use both; for instance, running an initial study online and replicating key findings in the lab, or the reverse.

The key question for any online study is whether the platform provides adequate quality assurance to ensure the data is trustworthy. That is what the rest of this resource pack addresses.

"Are Prolific participants just professional survey-takers?"

This question reflects a concern that if participants complete lots of studies, they become "too experienced": they learn what researchers are looking for, respond on autopilot, or are no longer representative of a general population.

Prolific manages this in a few ways. It tracks how many studies each participant has completed, and researchers can use this as a filter when setting up a study; for instance, recruiting only participants who have completed fewer than a certain number of prior studies.

It's also worth putting this concern in context. The "professional participant" issue exists in any panel-based recruitment method. University participant pools, for example, often consist of psychology students who complete dozens of studies per semester and arguably have more exposure to common experimental paradigms than Prolific participants do.

The available research on this topic actually suggests that more experienced participants tend to produce higher quality data. In one study of a demanding working memory task, a participant's prior approval rating on Prolific, which correlates with platform experience, was the only variable that predicted data quality; demographics such as age, gender, nationality, and education did not (Uittenhove et al., 2023). The authors suggest Prolific's reputation system itself acts as a quality filter, gradually selecting for engaged participants. This is consistent with findings that Prolific's attention-check compensation policy, which allows researchers to reject careless responses, both motivates engagement and selects against habitually disengaged participants over time (Albert & Smilek, 2023).

In a separate perceptions study using incentivised truthful reporting, over 90% of Prolific participants rated statements about honest responding and task diligence as accurate (Aksoy & Nevo, 2025). Perhaps most strikingly, in an interactive physics problem-solving study, Prolific participants who could not solve the problem spent four times longer attempting it than university students who also failed, pausing more deliberately rather than rushing through (Wang et al., 2024). This could suggest that platform experience fosters accountability, not apathy. That said, researchers who are concerned about this for a specific study can use Prolific's screening tools to limit participation history.

"How do you ensure participants are who they say they are?"

This question is about whether the demographic and identity information associated with each participant can be trusted.

All Prolific participants complete identity verification at registration, which includes a live video selfie and a government-issued document check. This verification step is designed to achieve a false acceptance rate below 0.1% for fraudulent documents or identities.

Demographic information (age, sex, location, education, etc.) is collected during onboarding through self-report questionnaires and is subject to periodic re-verification. Additionally, Prolific recommends that researchers re-validate key screening criteria at the beginning of their studies, as some participant answers may change over time.

Participants also go through surprise identity re-checks at random intervals; these require a new video selfie and cannot be predicted or prepared for. IP validation and deduplication checks run continuously to prevent individuals from maintaining multiple accounts. Participants whose identity or demographic information is found to be inconsistent are investigated and, where appropriate, removed.

A Note on What Can and Cannot Be Disclosed

Some reviewers or ethics boards may ask for more specific technical detail on how Prolific's detection systems work; for instance, exactly what behavioural signals are monitored or what thresholds trigger a flag.

Prolific does not publicly disclose the full technical details of its detection methods, for the same reason that banks don't publish the rules their fraud detection uses or exam boards don't share the exact criteria for flagging plagiarism. Revealing this information would make it easier to circumvent the protections.

The information in this document and in "How Prolific Protects Data Quality" represents what can be shared publicly. If a journal or ethics board requires additional technical detail beyond what is published here, researchers can contact Prolific's support team to discuss what can be provided on a confidential basis, using the Support icon at the bottom right of this page.

Responding to a Reviewer or Editor: Email Template

If you have received specific feedback from a reviewer or editor and need to write a response, the following template can be adapted. Include or remove the optional blocks depending on the specific concerns raised.

Dear [Editor / Reviewer],

Thank you for the opportunity to address the questions raised regarding our participant recruitment.

Participants for this study were recruited through Prolific (www.prolific.com), a dedicated research recruitment platform. Prolific maintains a pre-screened participant pool in which all members have completed quality and identity verification (including a live video selfie and document check) before accessing any study. Participants are subject to ongoing behavioural monitoring, periodic re-verification, and performance-based quality tracking. The platform applies 50+ automated checks through its quality assurance system, covering identity verification, fraud detection, and behavioural monitoring.

▸ Add if the concern is about bots or AI: Regarding automated and AI-generated responding: Prolific's identity verification system, powered by Entrust’s technology, maintains a false acceptance rate below 0.1% for fraudulent documents or identities. A January 2026 internal audit found that 0.8% of responses were flagged for AI-generated content. These low rates reflect the combined effect of identity verification at onboarding, continuous monitoring, and in-study detection tools.

▸ Add if you used Authenticity Checks: We also enabled Prolific's Authenticity Checks during data collection. [Both LLM and bot authenticity checks were used / LLM authenticity checks were used / Bot authenticity checks were used.] LLM authenticity checks detect when participants use AI tools to generate free-text responses; bot authenticity checks detect when an AI agent or automated script is completing the study. Of [NUMBER] submissions, [NUMBER] ([PERCENTAGE]) were flagged as low authenticity by the LLM check [and/or] [NUMBER] ([PERCENTAGE]) were flagged as low authenticity by the bot check. In total, [NUMBER] submissions ([PERCENTAGE]) were [excluded from analysis / reviewed manually / describe approach].

▸ Add if you used additional quality measures: Additional quality measures included [attention checks / comprehension checks / completion time thresholds / other]. [NUMBER] responses ([PERCENTAGE]) were excluded based on [criteria], resulting in a final analysed sample of N = [NUMBER].

▸ Add if you want to reference comparative evidence: Several independent peer-reviewed studies have compared data quality across online recruitment platforms and found Prolific to perform at or near the top of these comparisons (Douglas et al., 2023; Esch et al., 2025). For a full description of Prolific's quality infrastructure and citable metrics, see Prolific (2026).

I hope this addresses the concerns raised. I am happy to provide further documentation if helpful.

Kind regards,

[NAME]

This document is maintained by Prolific and updated as new evidence, platform features, or common questions emerge. Last updated: 26 March 2026.

For questions about specific wording for ethics applications or journal submissions, contact Support using the icon at the bottom right of this page.
