April 3, 2026

AI matches dermatologists in detecting melanoma, but questions remain about real-world use

Key Takeaways

  • Pooled prospective accuracy for melanoma detection by dermoscopy was similar for AI and dermatologists, with AI showing marginally higher sensitivity and comparable specificity overall.
  • Within matched clinical comparisons, AI often improved specificity without sacrificing sensitivity, implying potential to reduce false positives and avoid unnecessary biopsies.

A new meta-analysis of prospective studies finds that artificial intelligence performs similarly to dermatologists in melanoma detection, although more rigorous research is needed before it can be considered a reliable decision-support tool in routine care.

Artificial intelligence (AI) tools are moving closer to the dermatology clinic, but whether they can reliably match clinicians in real-world settings has remained unclear. A systematic review and meta-analysis published online in March 2026 in JAMA Dermatology suggests that AI systems can perform at levels comparable to dermatologists when diagnosing melanoma using dermoscopy, with some evidence that AI-assisted care may further improve accuracy.

Melanoma is one of the most aggressive forms of skin cancer, and early detection is critical to improving survival. Dermoscopy, which allows clinicians to visualize subsurface skin structures, is considered the standard of care but relies heavily on clinician expertise. In recent years, AI, particularly convolutional neural networks, has shown promise in analyzing dermoscopic images, often matching or exceeding expert performance in retrospective studies. However, those studies may not reflect the complexity of real-world clinical practice.

To better understand how AI performs in routine settings, Sara Laiouar-Pedari, Ph.D., of the German Cancer Research Center in Heidelberg, Germany, and colleagues conducted a systematic review focusing exclusively on prospective studies. The researchers analyzed data from 11 studies involving more than 2,500 patients and 50 dermatologists, comparing diagnostic performance across dermatologists, AI systems and, in one study, dermatologists assisted by AI.

Overall, the findings showed similar performance between AI and clinicians. Dermatologists achieved a pooled sensitivity of 78.6% and specificity of 75.2%, while AI systems demonstrated a sensitivity of 80.9% and specificity of 75.6%. In direct comparisons within the same clinical settings, AI often showed higher specificity with comparable sensitivity, suggesting it may help reduce unnecessary biopsies.
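To make the pooled percentages more concrete, the short Python sketch below converts sensitivity and specificity into expected counts of missed melanomas and false positives. It is only an illustration and not part of the meta-analysis: the cohort of 1,000 lesions and the 10% melanoma prevalence are assumed values chosen for easy arithmetic, and the function name `expected_counts` is hypothetical. Because the pooled estimates for dermatologists and AI come from different sets of studies, the two results should not be read as a head-to-head comparison.

```python
def expected_counts(sensitivity, specificity, n_lesions=1000, prevalence=0.10):
    """Convert sensitivity/specificity into expected counts.

    n_lesions and prevalence are illustrative assumptions,
    not figures reported in the meta-analysis.
    """
    melanomas = n_lesions * prevalence
    benign = n_lesions - melanomas
    true_positives = sensitivity * melanomas    # melanomas correctly flagged
    false_negatives = melanomas - true_positives  # melanomas missed
    true_negatives = specificity * benign       # benign lesions correctly cleared
    false_positives = benign - true_negatives   # benign lesions flagged
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "true_negatives": true_negatives,
        "false_positives": false_positives,
    }


if __name__ == "__main__":
    # Pooled estimates quoted in the article.
    pooled = {
        "Dermatologists": (0.786, 0.752),
        "AI systems": (0.809, 0.756),
    }
    for label, (sens, spec) in pooled.items():
        counts = expected_counts(sens, spec)
        print(
            f"{label}: missed melanomas = {counts['false_negatives']:.1f}, "
            f"false positives = {counts['false_positives']:.1f}"
        )
```

Under these assumed numbers, the pooled estimates differ by only a few lesions per 1,000 in either direction, which is consistent with the authors' description of overall performance as similar.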

Notably, the single study evaluating AI-assisted dermatologists reported even higher performance, with sensitivity reaching 91.9% and specificity 83.7%. The authors suggest this points to a potential role for AI as a decision-support tool rather than a replacement for clinicians.

Still, the authors urge caution in interpreting the findings. Many of the included studies carried a high risk of bias, particularly because patients were often selected based on lesions already suspected to be melanoma. This preselection can make diagnostic performance seem better than it might be in everyday practice, where clinicians encounter a much broader mix of skin findings. Additionally, differences in study design, such as varying classification approaches and reference standards, complicate direct comparisons across studies.

Another key limitation is the imbalance in available data. While most studies evaluated dermatologist performance, fewer assessed AI alone and only one examined AI-assisted care, making it difficult to understand how much value AI might add in routine care.

Despite these caveats, the findings represent an important step forward. By focusing on prospective studies, the analysis offers a more realistic view of how AI performs in the clinic than retrospective studies of curated image datasets, which have tended to overestimate accuracy. The authors note that AI models trained on larger and more diverse datasets tended to perform better, underscoring the importance of robust training data for real-world implementation.

For now, AI appears to be a promising assistive tool rather than a replacement for clinicians. The authors emphasize that larger, multicenter studies in real-world patient populations will be needed to determine whether AI can be safely and reliably integrated into everyday dermatology practice.
