Trustworthy Assertion Classification Through Prompting

Assertion classification is the task of determining the assertion status of clinical concepts expressed in natural language, such as whether a diagnosis or condition is present, absent, or possible. It helps healthcare professionals quickly extract crucial clinical information from unstructured notes. However, recent rule-based and machine-learning approaches require labor-intensive pattern engineering and are severely biased toward majority classes. In a new study published in the Journal of Biomedical Informatics, Dr. Yifan Peng, assistant professor of population health sciences, and colleagues propose a prompt-based learning approach that treats assertion classification as a masked language auto-completion problem. When evaluated on six datasets, the model excelled at detecting classes with few instances. Compared to other methods, the prompt-based model is better able to identify comprehensive and sufficient linguistic features in free text. Further, the results suggest closer agreement between the rationales used by the model and those used by humans, which supports the model's superior trustworthiness.
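To make the idea of "masked language auto-completion" concrete, here is a minimal sketch of how a clinical sentence might be turned into a fill-in-the-blank prompt whose predicted word is mapped back to an assertion label. The template, the label words, and the names `build_prompt`, `classify`, and `toy_scorer` are all illustrative assumptions, not the study's actual prompts or code. In particular, `toy_scorer` is a keyword-based stand-in for a real masked language model, which would instead return the model's probability for each candidate word at the `[MASK]` position.

```python
import re
from typing import Callable, Dict, List

# Hypothetical cloze template and label words; not taken from the study.
TEMPLATE = "{sentence} The {concept} is [MASK]."
LABEL_WORDS: Dict[str, str] = {
    "present": "present",    # candidate word -> assertion label
    "absent": "absent",
    "possible": "possible",
}

Scorer = Callable[[str, List[str]], Dict[str, float]]


def build_prompt(sentence: str, concept: str) -> str:
    """Wrap the clinical sentence in a fill-in-the-blank template."""
    return TEMPLATE.format(sentence=sentence.strip(), concept=concept)


def classify(sentence: str, concept: str, score_mask: Scorer) -> str:
    """Pick the label whose word scores highest in the [MASK] slot."""
    prompt = build_prompt(sentence, concept)
    scores = score_mask(prompt, list(LABEL_WORDS))
    best_word = max(scores, key=lambda word: scores[word])
    return LABEL_WORDS[best_word]


def toy_scorer(prompt: str, candidates: List[str]) -> Dict[str, float]:
    """Stand-in for a masked language model (assumption for illustration).

    Scores candidates from simple cue words so the sketch runs without
    any model download; a real system would query a language model.
    """
    tokens = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {word: 0.0 for word in candidates}
    scores["present"] = 0.5  # default when no cue word is found
    if tokens & {"no", "denies", "without"}:
        scores["absent"] = 1.0
    if tokens & {"may", "possibly", "suspected"}:
        scores["possible"] = 1.0
    return scores


if __name__ == "__main__":
    print(build_prompt("Patient denies chest pain.", "chest pain"))
    print(classify("Patient denies chest pain.", "chest pain", toy_scorer))
    print(classify("Findings may represent pneumonia.", "pneumonia", toy_scorer))
```

Because the scorer is passed in as a function, the same prompt-and-label-word logic could be reused with an actual masked language model in place of `toy_scorer`.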
