A Large Language Model Framework to Advance Evidence-Based Strategies for Suicide Prevention and Early Intervention

Suicide is a major global public health concern. The underlying causes of suicidal behaviors and thoughts stem from a complex interplay of social, economic, political, and physical factors. Understanding these interactions is crucial to developing evidence-based strategies for suicide prevention and early intervention. While there is interest in integrating these factors into structured electronic health records, much of the information is embedded within unstructured text narratives, rendering it inaccessible or incomplete. 

Advancements in natural language processing have enabled the extraction of socioeconomic, political, and physical factors from clinical notes, but challenges remain in capturing rarer factors and temporal context. Importantly, many deep learning models also do not explain their reasoning, which is particularly problematic in suicide studies, where explainability is crucial.

In a study in Communications Medicine, Dr. Yifan Peng, associate professor of population health sciences, and colleagues at Columbia University and the University of Texas at Austin developed a large language model (LLM) framework to extract this information from unstructured narratives. To address the lack of temporal context in other models, this study focuses on capturing relevant information occurring within the two weeks preceding suicide incidents.  

The framework breaks down the extraction task into context retrieval, relevance verification, and factor extraction. The LLM can explain its reasoning via self-explanation, capture nuanced factors, and generalize information effectively.  
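The three steps can be pictured as a simple pipeline. The sketch below is an illustrative assumption, not the authors' implementation: the function names are hypothetical, and the keyword heuristics stand in for what would, in the actual framework, be prompts to an LLM.

```python
# Hypothetical sketch of the three-step framework: context retrieval,
# relevance verification, and factor extraction. Keyword matching here
# stands in for LLM calls; all names are illustrative assumptions.

def retrieve_context(note: str, factor: str) -> list[str]:
    """Step 1: retrieve sentences that may mention the factor."""
    return [s.strip() for s in note.split(".") if factor in s.lower()]

def verify_relevance(sentence: str) -> bool:
    """Step 2: verify the sentence falls in the recent time window.
    (A real system would ask the LLM to judge temporal relevance.)"""
    cues = ("last week", "yesterday", "two weeks", "recently")
    return any(cue in sentence.lower() for cue in cues)

def extract_factor(sentence: str, factor: str) -> dict:
    """Step 3: extract the factor along with a self-explanation."""
    return {
        "factor": factor,
        "evidence": sentence,
        "explanation": f"Sentence mentions {factor!r} in the recent window.",
    }

def run_pipeline(note: str, factor: str) -> list[dict]:
    candidates = retrieve_context(note, factor)
    relevant = [s for s in candidates if verify_relevance(s)]
    return [extract_factor(s, factor) for s in relevant]

note = ("Patient lost his job last week and reports insomnia. "
        "He enjoyed hiking years ago.")
results = run_pipeline(note, "job")
```

Decomposing the task this way lets each stage be checked (and explained) separately, rather than asking one opaque model for a final label.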

Researchers compared their approach to other state-of-the-art language models and reasoning models. They further evaluated how the model’s explanations help people annotate factors more quickly. The three-step framework outperformed baseline models and reduced annotation time without sacrificing accuracy.  

The framework demonstrates performance gains both in extracting factors and in retrieving relevant context, which could aid the early identification of at-risk individuals and inform more effective prevention strategies. Future research should further evaluate the causal impact of artificial intelligence assistance on annotation quality and efficiency.

 
