Releases, research, and recognition for VERA-MH: the first open-source AI safety benchmark for mental health.
VERA-MH (Validation of Ethical and Responsible AI in Mental Health) has been added to the OECD AI Policy Observatory's Catalogue of Tools & Metrics for Trustworthy AI.
VERA-MH is the first open-source AI safety benchmark for mental health. Co-developed and open-sourced by Spring Health, it helps researchers, developers, clinicians, and policymakers evaluate how AI systems handle mental health conversations involving suicide risk.
Its inclusion in the OECD catalogue places mental health AI safety within the broader global conversation about how trustworthy AI is built, evaluated, and deployed. It also reinforces a principle that is becoming harder to ignore: when people turn to AI in moments of distress, safety cannot be assumed. It has to be measured.
View the OECD listing: https://oecd.ai/en/catalogue/tools
A new research paper detailing the VERA-MH methodology and evaluation results for four leading LLM providers is now available on arXiv.
The paper explains how VERA-MH works as a three-step automated evaluation. First, one model simulates users drawn from clinically developed personas spanning a range of risk factors, demographics, and disclosure styles. Second, a judge model evaluates each conversation against a clinical rubric structured as a yes-or-no decision tree. Third, results are aggregated into an overall safety rating across five dimensions: Detects Potential Risk, Confirms Risk, Guides to Human Care, Supportive Conversation, and Follows AI Boundaries.
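As a rough illustration of the three steps described above, here is a minimal Python sketch. It is not the VERA-MH implementation: only the five dimension names come from the paper summary. The `Node` class, the question keys, the rating labels, the keyword-based judge, and the "worst dimension wins" aggregation rule are all assumptions made for the example. A scripted conversation stands in for the persona-driven simulator, and a keyword matcher stands in for the judge model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# Dimension names are taken from the text. All other names, questions,
# rating labels, and the aggregation rule are illustrative assumptions.
DIMENSIONS = [
    "Detects Potential Risk",
    "Confirms Risk",
    "Guides to Human Care",
    "Supportive Conversation",
    "Follows AI Boundaries",
]

RATING_ORDER = ["unsafe", "needs_review", "safe"]  # worst to best (hypothetical labels)


@dataclass
class Node:
    """One yes-or-no question in a rubric decision tree."""
    question: str
    if_yes: Union["Node", str]  # next question, or a leaf rating
    if_no: Union["Node", str]


Judge = Callable[[str, List[str]], bool]


def rate(tree: Union[Node, str], conversation: List[str], judge: Judge) -> str:
    """Walk the tree, asking the judge each question about the full conversation."""
    node = tree
    while isinstance(node, Node):
        node = node.if_yes if judge(node.question, conversation) else node.if_no
    return node


def aggregate(per_dimension: Dict[str, str]) -> str:
    """Assumed rule: the overall rating is the worst rating on any dimension."""
    return min(per_dimension.values(), key=RATING_ORDER.index)


# Step 1 stand-in: a scripted conversation instead of a simulated persona.
conversation = [
    "user: I have been feeling like a burden lately.",
    "assistant: That sounds heavy. Are you having thoughts of suicide?",
    "user: Sometimes, yes.",
    "assistant: Thank you for telling me. Please reach out to a crisis line or a clinician today.",
]


# Step 2 stand-in: a keyword matcher instead of a judge model.
def keyword_judge(question: str, conv: List[str]) -> bool:
    assistant_text = " ".join(t for t in conv if t.startswith("assistant:")).lower()
    keywords = {
        "asked_about_risk": "thoughts of suicide",
        "pointed_to_human_care": "crisis line",
    }
    return keywords[question] in assistant_text


tree = Node(
    "asked_about_risk",
    if_yes=Node("pointed_to_human_care", if_yes="safe", if_no="needs_review"),
    if_no="unsafe",
)

# Step 3: for brevity, the same toy tree is reused for every dimension.
ratings = {dim: rate(tree, conversation, keyword_judge) for dim in DIMENSIONS}
overall = aggregate(ratings)
print(overall)
```

The judge is asked about the whole conversation rather than a single reply, which mirrors the multi-turn framing of the benchmark. In a real setup each dimension would have its own clinically authored tree, and the aggregation rule would follow the published rubric rather than the simple minimum used here.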
Single-turn evaluations miss how risk actually unfolds in conversation. A response can look acceptable on its own while the overall interaction fails to recognize risk, guide someone to human care, or maintain safe boundaries. VERA-MH was built to evaluate the full conversation.
Read the paper: https://arxiv.org/abs/2605.13318
A recording of the recent webinar, “Evaluating AI Safety in Mental Health: Practical Frameworks, Gaps, and What Comes Next,” is now available.
The discussion brought together Kate Bentley of Spring Health, Stéphie Herlin of Korabench.ai, Xuan Zhao of Flourish Science, and David Cooper of the American Psychological Association, moderated by Dr. Laura Erickson-Schroth of The Jed Foundation.
A central theme ran through the conversation: safety in mental health AI cannot be inferred from general-purpose capability or good intentions. It has to be evaluated against clinically meaningful criteria, in the conversations where harm can emerge.
Four themes stood out:
Watch the recording: https://www.linkedin.com/posts/vera-mh_evaluating-ai-safety-for-mental-health-best-activity-7457883654002864129-T0nq
VERA-MH v1.1 strengthens how teams can simulate and evaluate chatbot conversations involving suicide risk against a clinically informed safety rubric. The release reflects feedback gathered during the public Request for Comment period.
What's changed:
Because the rubric and persona set have changed, v1.1 scores are not directly comparable to v1.0. That tradeoff is deliberate. Version comparability matters, but rubric integrity matters more, and the field is still learning what to measure.
VERA-MH is a living framework. We will keep updating it as the science evolves and as the systems it evaluates change.
Repository: https://github.com/SpringCare/VERA-MH
In “The Map is Not the Territory,” Steve Duke and Kevin Hou examine two defining questions in mental health AI: how to tell whether an AI system is safe, and how to compare one chatbot against another.
Their assessment of VERA-MH: “So far, VERA-MH seems to represent the most serious attempt at a shared standard for crisis safety. It's open-source, clinically validated, and I've heard very positive feedback on the evals themselves and their openness to feedback and development.”
Their analysis reflects a point the field keeps returning to: practical, transparent evaluation is what separates measurable safety from marketing claims. VERA-MH contributes to that shift by giving the field an open-source, clinically validated way to evaluate mental health AI safety across full conversations.
Read the analysis: The Hemingway Report — The Map is Not the Territory