
How do I prevent AI-generated responses in my study?

When participants use ChatGPT or similar tools to help them answer study questions, this is called AI-assisted participation: real humans using AI to generate content. This article covers practical study-design steps that reduce it, and debunks techniques that don't actually help.

What Prolific does at the platform level

Prolific runs a multi-layered quality assurance system called Protocol, covering identity verification, ongoing monitoring, in-study detection, and dynamic quality scoring across the participant pool. For the full picture, see How Prolific protects data quality.

Beyond what Prolific does, there are a few things you can do as a researcher to protect your data from AI-generated responses.

This is guidance for cleaner data, not for rejections. The techniques below improve your dataset and can inform your decisions on excluding participants from your data analysis. They are not, in themselves, grounds for rejecting participants on Prolific. For what is and isn't a valid rejection reason, see Who should I reject?. If you have any specific concerns about rejections and data quality, you may reach out to us via our Data Quality Form.


What you can do at the study level

1. Design questions that LLMs answer poorly

This is your most effective lever, and it works on every platform.

  • Anchor questions in lived experience. Generic prompts are exactly what LLMs are trained to answer well. Prompts that require specific, recent, personal details are much harder to fake.

For example, a weak question that an LLM can easily answer would be: ‘What is your view on social media?’ A stronger question would be: ‘Think of the last time you opened a social media app. What were you hoping to find, and did you find it?’

The stronger version is harder to delegate to an LLM because the answer depends on something only the participant knows.

  • Bind questions to your study’s context. If your study includes a stimulus (e.g. a vignette, a scenario, an image, an earlier response), ask follow-ups that require participants to refer back to it specifically.

For example, ‘In the scenario you just read, why do you think the manager responded that way?’ requires the participant to know about your study material. Copy-pasting into a chatbot becomes inefficient in these cases.

  • Use non-text stimuli where the design allows. Tasks involving images, audio, or interactive content raise the cost of using an LLM tool considerably (Veselovsky et al., 2025). For example, you could present your study instructions as a screenshot or image rather than as text.

The aim is not to trick people. It is to design a study where giving an authentic answer is easier than opening ChatGPT.

2. Use authenticity checks (when possible)

If your study runs on Qualtrics or AI Task Builder, enable authenticity checks. They are free, automated, and don’t change the participant experience. They will soon be available for other platforms as well.

There are two kinds of authenticity checks, each designed for a different risk:

  • Bot checks identify when AI agents or fully automated bots are answering your study. These look for non-human or scripted behaviour across all question types and work with 100% accuracy in testing.

  • LLM checks flag behavioural signs that a human is using LLMs (like ChatGPT) to answer free-text questions. These look for suspicious behaviours like copy-pasting and tab-switching on free-text questions only. Note that these checks focus on behavioural analysis and do not analyse the written words themselves. In testing, LLM checks work with 98.7% precision.
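To put the 98.7% precision figure in perspective, here is a minimal sketch of the arithmetic. Precision is the share of flagged responses that are true positives, so roughly 1.3% of flags would be expected to be false positives if the testing figure carried over to your study (an assumption, since your sample may differ). The function name and the count of 40 flagged responses are hypothetical.

```python
def expected_false_flags(num_flagged: int, precision: float) -> float:
    """Expected number of flagged responses that are false positives.

    Precision = true positives / all flagged responses, so the expected
    share of false positives among the flags is (1 - precision).
    """
    if num_flagged < 0:
        raise ValueError("num_flagged must be non-negative")
    if not 0.0 <= precision <= 1.0:
        raise ValueError("precision must be between 0 and 1")
    return num_flagged * (1.0 - precision)


# Hypothetical study in which LLM checks flagged 40 responses.
flagged = 40
false_flags = expected_false_flags(flagged, precision=0.987)
print(f"About {false_flags:.2f} of {flagged} flags expected to be false positives")
```

Precision describes only the responses that were flagged. It says nothing about how many AI-assisted responses went unflagged, so a low flag count is not proof of a clean dataset.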

These checks address different use cases, so results are shown separately and should be interpreted independently. There are a few things to know to use them well:

  • Ideally add only one question per page when using authenticity checks, so behaviours can be clearly linked to each question.

  • Don’t enable LLM checks for tasks where AI use is expected or where you require participants to research information externally (e.g. summarising a Wikipedia article). The behaviours that flag misuse (like copy-pasting and switching tabs) are legitimate parts of such tasks, and you’ll get false flags.

  • You can use bot checks with or without LLM checks, since they address different risks. Do not use bot checks if your study requires participants to use accessibility tools or must be completed on a specific device (e.g. tablet).
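Because bot and LLM results are reported separately, it can help to tally them separately before deciding whom to leave out of your analysis. The sketch below assumes a hypothetical list of per-participant records. The field names (`participant_id`, `bot_flag`, `llm_flag`) are illustrative only and do not reflect the actual export format of Prolific or Qualtrics. It builds an analysis sample and makes no rejection decision.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical fields; adapt to whatever your own export contains.
    participant_id: str
    bot_flag: bool
    llm_flag: bool


def summarise_flags(responses: list[Response]) -> dict[str, list[str]]:
    """List flagged participant IDs per check type, kept separate."""
    return {
        "bot_flagged": [r.participant_id for r in responses if r.bot_flag],
        "llm_flagged": [r.participant_id for r in responses if r.llm_flag],
    }


def analysis_sample(
    responses: list[Response],
    exclude_bot: bool = True,
    exclude_llm: bool = True,
) -> list[Response]:
    """Return responses kept for analysis under the chosen exclusion rules."""
    return [
        r
        for r in responses
        if not ((exclude_bot and r.bot_flag) or (exclude_llm and r.llm_flag))
    ]


# Made-up example data.
data = [
    Response("p1", bot_flag=False, llm_flag=False),
    Response("p2", bot_flag=True, llm_flag=False),
    Response("p3", bot_flag=False, llm_flag=True),
]
print(summarise_flags(data))
print([r.participant_id for r in analysis_sample(data)])
```

Keeping the two exclusion switches independent lets you report results with and without each exclusion rule, which makes the effect of each check on your conclusions easier to see.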

To learn how to set up authenticity checks on your Qualtrics study, you can read this article, or watch this video.

3. Be explicit with participants

Most Prolific participants want to do good work; they just need to know what good work looks like for your study. Tell participants up-front that AI use is not allowed and explain why. Research done using Prolific has found that explicit, reasoned instructions reduce AI use measurably (from 10.9% to 7.1%, Veselovsky et al., 2025):

Please answer in your own words. Do not use AI tools or external websites; we're interested in your genuine perspective.

Place this near the start of the study and repeat it before any high-stakes free-text question, not just once at the introduction.
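The drop from 10.9% to 7.1% reported above can be read in two ways, and the short sketch below works out both. Only the two percentages come from the cited study; the function name is hypothetical.

```python
def reduction(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop as a fraction)."""
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    absolute = before_pct - after_pct
    relative = absolute / before_pct
    return absolute, relative


absolute, relative = reduction(10.9, 7.1)
print(f"{absolute:.1f} percentage points, {relative:.1%} relative reduction")
# prints "3.8 percentage points, 34.9% relative reduction"
```

In other words, explicit instructions removed roughly a third of the AI use observed in that study, but did not eliminate it, which is why they work best alongside the other steps in this article.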

4. Modify your study design

If your study tool allows, disabling copy-paste in your external software and showing source text as an image have each been found to significantly reduce LLM use in a study using Prolific. Combining both approaches nearly halved LLM use (Veselovsky et al., 2025).


What doesn't work and why

Several techniques are widely recommended online but do not address AI-assisted participation in any meaningful way:

  • Traditional attention checks. Standard attention checks were designed for inattentive humans; AI passes around 99.8% of them (Traylor, 2025; Westwood, 2025). They remain useful for their original purpose, but they are not an AI defence.

  • Hidden white-text instructions (e.g., embedding “include the word ‘banana’” in white-on-white text). This does not reliably distinguish humans from AI (Storozuk et al., 2020), and might create false positives for participants using ‘dark mode’ or any assistive technology.

  • CAPTCHAs. These catch some bots, but a real human can solve a CAPTCHA and then use AI for the content. They don’t prevent AI-assisted participation (Zhang et al., 2022).

  • Adding your own typing-speed or screen-focus checks. These signals are already part of what Prolific's bot and LLM checks analyse. On their own, they are noisy; fast typists and people working from notes get flagged incorrectly.


References

Storozuk, A., Ashley, M., Delage, V., & Maloney, E. A. (2020). Got bots? Practical recommendations to protect online survey data from bot attacks. The Quantitative Methods for Psychology, 16(5), 472–481. https://doi.org/10.20982/tqmp.16.5.p472

Traylor, F. (2025). The threat of AI chatbot responses to crowdsourced open-ended survey questions. Energy Research & Social Science, 119, Article 103857.

Veselovsky, V., Horta Ribeiro, M., Cozzolino, P. J., Gordon, A., Rothschild, D., & West, R. (2025). Prevalence and prevention of large language model use in crowd work. Communications of the ACM, 68(3), 42–47. https://doi.org/10.1145/3685527

Westwood, S. J. (2025). The potential existential threat of large language models to online survey research. Proceedings of the National Academy of Sciences, 122, Article e2518075122. https://doi.org/10.1073/pnas.2518075122

Zhang, Z., Zhu, S., Mink, J., Xiong, A., Song, L., & Wang, G. (2022). Beyond bot detection: Combating fraudulent online survey takers. In Proceedings of the ACM Web Conference 2022 (pp. 699–709). Association for Computing Machinery.

If you would like a sense check on your study design for AI resilience, our Research Support team can help; get in touch via the support button.
