
Reproducible Reporting of the Collection and Evaluation of Annotations for Artificial Intelligence Models

  • Katherine Elfer
  • Emma Gardecki
  • Victor Garcia
  • Amy Ly
  • Evangelos Hytopoulos
  • Si Wen
  • Matthew G. Hanna
  • Dieter J.E. Peeters
  • Joel Saltz
  • Anna Ehinger
  • Sarah N. Dudgeon
  • Xiaoxian Li
  • Kim R.M. Blenman
  • Weijie Chen
  • Ursula Green
  • Ryan Birmingham
  • Tony Pan
  • Jochen K. Lennerz
  • Roberto Salgado
  • Brandon D. Gallas
  • United States Food and Drug Administration
  • National Institutes of Health
  • Massachusetts General Hospital
  • iRhythm Technologies Inc.
  • Memorial Sloan-Kettering Cancer Center
  • University of Antwerp
  • Sint-Maarten Hospital
  • Lund University
  • Yale University
  • Emory University
  • Harvard University
  • Peter MacCallum Cancer Centre
  • GZA-ZNA Hospitals

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

This work puts forth and demonstrates the utility of a reporting framework for collecting and evaluating annotations of medical images used for training and testing artificial intelligence (AI) models in assisting detection and diagnosis. AI has unique reporting requirements, as shown by the AI extensions to the Consolidated Standards of Reporting Trials (CONSORT) and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) checklists and the proposed AI extensions to the Standards for Reporting Diagnostic Accuracy (STARD) and Transparent Reporting of a Multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklists. AI for detection and/or diagnostic image analysis requires complete, reproducible, and transparent reporting of the annotations and metadata used in training and testing data sets. In an earlier work by other researchers, an annotation workflow and quality checklist for computational pathology annotations were proposed. In this manuscript, we operationalize this workflow into an evaluable quality checklist that applies to any reader-interpreted medical images, and we demonstrate its use for an annotation effort in digital pathology. We refer to this quality framework as the Collection and Evaluation of Annotations for Reproducible Reporting of Artificial Intelligence (CLEARR-AI).

Original language: English
Article number: 100439
Journal: Modern Pathology
Volume: 37
Issue number: 4
State: Published - Apr 2024

Keywords

  • Annotation Study
  • Artificial Intelligence Validation
  • Data set
  • Reference Standard
  • Reproducible Research
  • digital pathology
