Cutaneous squamous cell carcinoma (cSCC) is the second most common form of skin cancer. While most cases are treatable, a small number can become serious and spread, leading to worse outcomes.
A risk prediction tool for cutaneous squamous cell carcinoma (cSCC) built with the GPT-4 large language model performed better than current staging systems at identifying patients more likely to have poor outcomes, according to a new study published in JAMA Dermatology.
Accurately identifying which tumors are more dangerous is important for deciding how to treat patients, according to the report.
Existing tools or models, such as the AJCC8 and BWH staging systems, group tumors by certain traits, but they tend to miss important risk factors and can group very different tumors together, making it harder to predict who might do poorly.
Many factors increase the risk of developing cSCC, including immunosuppression, chronic wounds, fair skin, male gender, older age, certain genetic conditions, ultraviolet (UV) radiation exposure and a history of prior squamous cell carcinoma, according to the National Institutes of Health.
In 2012, the estimated incidence was 140 cases per 100,000 American men and 50 per 100,000 women.
To address the limitations of existing staging systems, researchers searched PubMed, Embase and the Cochrane Library for studies from 1999 through the end of 2023.
After applying strict criteria, 10 studies that linked risk factors to serious outcomes such as recurrence, spread or death were selected.
These studies were used to inform AIRIS, a model built on GPT-4, through a process called retrieval-augmented generation (RAG), in which relevant source material is retrieved and supplied to the model alongside the question.
The AI created a new scoring system to predict which cSCC tumors are more dangerous.
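For readers unfamiliar with RAG, the general idea can be sketched in a few lines of Python. This is a minimal illustration only, not the study's pipeline: the passages are placeholder strings, the word-overlap ranking stands in for whatever retrieval method the researchers used, and the function names `retrieve` and `build_prompt` are hypothetical. No language model is called; the sketch stops at the prompt that would be sent to one.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, top_k=2):
    """Rank passages by how many words they share with the query (toy retrieval)."""
    query_words = tokenize(query)
    ranked = sorted(
        passages,
        key=lambda passage: len(query_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """Place the retrieved passages ahead of the question so the model can ground its answer."""
    context = "\n".join(f"- {passage}" for passage in retrieved)
    return (
        "Use only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Placeholder passages (hypothetical, not text from the ten selected studies).
passages = [
    "PLACEHOLDER: text of a study passage discussing nodal metastasis risk factors",
    "PLACEHOLDER: text of a study passage discussing local recurrence",
    "PLACEHOLDER: text of an unrelated passage",
]
query = "Which factors predict nodal metastasis?"

top_passages = retrieve(query, passages, top_k=1)
prompt = build_prompt(query, top_passages)
print(prompt)
```

The point of the design is that the model answers from supplied literature instead of relying only on what it absorbed during training.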
AIRIS was tested using tumor data from NYU Langone Health and Mayo Clinic.
The dataset included 2,379 biopsy-proven cSCC cases with full clinical information.
The AI model’s predictions were compared to AJCC8 and BWH systems using statistical tests.
Researchers measured how well AIRIS could predict poor outcomes using standard metrics like sensitivity, specificity and AUC. AIRIS was also tested for consistency and ability to separate high- and low-risk cases.
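As a rough guide to what these metrics mean, the Python sketch below computes sensitivity, specificity and AUC on a small made-up dataset. The risk scores, outcomes and the cutoff of 2 are invented for illustration and are not taken from the study. Sensitivity is the share of patients with a poor outcome who were flagged as high risk, specificity is the share of patients without a poor outcome who were flagged as low risk, and AUC is the probability that a randomly chosen patient with a poor outcome received a higher score than a randomly chosen patient without one.

```python
def confusion_counts(flagged_high_risk, had_poor_outcome):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for flagged, poor in zip(flagged_high_risk, had_poor_outcome):
        if flagged and poor:
            tp += 1
        elif flagged and not poor:
            fp += 1
        elif not flagged and poor:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def auc(scores, had_poor_outcome):
    """Pairwise AUC: fraction of (poor, not-poor) pairs ranked correctly; ties count half."""
    poor = [s for s, y in zip(scores, had_poor_outcome) if y]
    not_poor = [s for s, y in zip(scores, had_poor_outcome) if not y]
    wins = 0.0
    for p in poor:
        for n in not_poor:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(poor) * len(not_poor))


# Invented toy data: one risk score and one outcome flag per tumor.
scores = [0, 0, 1, 1, 2, 2, 3, 3]
had_poor_outcome = [0, 0, 0, 1, 0, 1, 1, 1]

# Hypothetical cutoff: scores of 2 or more are treated as high risk.
flagged_high_risk = [score >= 2 for score in scores]

tp, fp, tn, fn = confusion_counts(flagged_high_risk, had_poor_outcome)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc_value = auc(scores, had_poor_outcome)

print(sensitivity, specificity, auc_value)  # 0.75 0.75 0.875
```

Raising the cutoff would flag fewer tumors, which generally trades sensitivity for specificity; AUC summarizes performance across all possible cutoffs.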
AIRIS outperformed BWH and AJCC8 in several key areas for predicting poor outcomes in patients with cSCC.
In low-risk groups, AIRIS showed fewer poor outcomes: 50.9% for local recurrence (LR), 26.3% for nodal metastasis (NM), 17.5% for distant metastasis (DM) and 27.8% for disease-specific death (DSD).
In comparison, the BWH and AJCC8 systems had nearly twice as many poor outcomes in their low-risk groups, indicating less consistent results.
Overall, AIRIS also showed a clearer progression in poor outcomes from its lower-risk to its higher-risk classes.
For high-risk AIRIS classes, the poor outcome rates increased significantly: LR (49.1%), NM (73.7%), DM (82.5%) and DSD (72.2%).
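One way to read the low-risk and high-risk figures, offered here as an interpretation and not something the article states outright, is that each percentage is the share of all poor outcomes of that type that fell into the given risk group. The short Python check below uses only the numbers reported above and shows that the two shares add up to 100% for every outcome. Under that reading, the high-risk share is the same quantity as sensitivity, which is consistent with the 49.1% to 82.5% sensitivity range reported for AIRIS.

```python
# Percentages as reported in the article for AIRIS.
low_risk_share = {"LR": 50.9, "NM": 26.3, "DM": 17.5, "DSD": 27.8}
high_risk_share = {"LR": 49.1, "NM": 73.7, "DM": 82.5, "DSD": 72.2}

# Add the two shares per outcome; round to one decimal to avoid float noise.
totals = {
    outcome: round(low_risk_share[outcome] + high_risk_share[outcome], 1)
    for outcome in low_risk_share
}

# Range of the high-risk shares, for comparison with the reported sensitivity range.
lowest_high_risk_share = min(high_risk_share.values())
highest_high_risk_share = max(high_risk_share.values())

for outcome, total in totals.items():
    print(outcome, total)
print(lowest_high_risk_share, highest_high_risk_share)
```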
In terms of diagnostic performance, AIRIS had higher sensitivity for all outcomes, ranging from 49.1% to 82.5%, but slightly lower specificity compared to BWH and AJCC8.
Although overall accuracy was lower, AIRIS demonstrated stronger predictive power, with AUC values of 0.69 (LR), 0.81 (NM), 0.85 (DM), and 0.80 (DSD)—all higher than the traditional systems.
The study had several strengths, including the large amount of data collected.
For example, researchers reviewed more than 2,000 primary tumors to validate AIRIS. AIRIS also included important patient risk factors such as immunosuppression, lymphovascular invasion and in-transit metastasis, which are often missing from traditional staging systems, the study authors noted.
This helped AIRIS better predict poor outcomes and show improved sensitivity and risk discrimination compared to current standards.
However, limitations include the relatively low event rate of poor outcomes in cSCC, which can make validation challenging.
In addition, large language models such as GPT rely on probabilistic predictions and can have biases based on their training data and inputs.
While RAG helps ground the model in reliable literature, AI-generated outputs still require careful validation, the authors suggest.
The authors recommend future improvements such as weighting immunosuppression categories and integrating multimodal data, including imaging or gene profiles, to further personalize risk predictions.