Dec 11, 2025

Poster ECP 2025

AI-assisted TPS and CPS scoring of PD-L1 expression in kidney cancer

By: M. SOCKEEL (Primaa), S. SOCKEEL (Primaa), C. LIU (Primaa), R. PEYRET (Primaa), C. SIMMAT (Primaa), L. ALBIGES (Institut Gustave Roussy), S.F. KAMMERER-JACQUET (Rennes University Hospital), N. RIOUX-LECLERCQ (Rennes University Hospital)

 

Introduction

As part of the European CARE1 clinical trial on metastatic kidney cancer, programmed death-ligand 1 (PD-L1) expression is being studied as a predictive biomarker for immune checkpoint inhibitor (ICI) treatment. Its evaluation relies on visual assessment of immunohistochemistry (IHC) stains, a tedious and time-consuming process that is also subject to inter-observer variability. In kidney cancer, interpretation is further complicated by the absence of a well-defined PD-L1 threshold, which makes consistent scoring and clinical decision-making more challenging.

Quantifying the Tumor Proportion Score (TPS) and the Combined Positive Score (CPS) for PD-L1 in IHC-stained slides requires:

  • Identifying tumor regions and the surrounding inflammatory regions in hematoxylin-eosin (HE) slides.
  • Locating those regions in the associated IHC-stained slides.
  • Counting the PD-L1 positive tumor and immune cells in those regions relative to the total number of viable tumor cells.
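
Once the cells are counted, both scores reduce to simple ratios. The sketch below uses the commonly cited definitions: TPS counts PD-L1 positive viable tumor cells over all viable tumor cells, while CPS also counts PD-L1 positive immune cells in the numerator and is capped at 100. The function and argument names are illustrative assumptions, not the pipeline's actual code.

```python
def tps(positive_tumor_cells: int, viable_tumor_cells: int) -> float:
    """Tumor Proportion Score, expressed as a percentage."""
    if viable_tumor_cells <= 0:
        raise ValueError("at least one viable tumor cell is required")
    return 100.0 * positive_tumor_cells / viable_tumor_cells


def cps(positive_tumor_cells: int, positive_immune_cells: int,
        viable_tumor_cells: int) -> float:
    """Combined Positive Score, capped at 100 by convention."""
    if viable_tumor_cells <= 0:
        raise ValueError("at least one viable tumor cell is required")
    positive = positive_tumor_cells + positive_immune_cells
    return min(100.0 * positive / viable_tumor_cells, 100.0)


# Hypothetical counts: 200 viable tumor cells, 30 of them PD-L1 positive,
# plus 20 PD-L1 positive immune cells.
print(tps(30, 200))      # 15.0
print(cps(30, 20, 200))  # 25.0
```

Because both scores share the same denominator, a small error in the tumor cell count affects them equally, whereas misclassifying immune cells only affects CPS.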

Pathologists could therefore benefit from Artificial Intelligence (AI) assistance in clinical diagnosis scenarios. This work aims to develop an end-to-end inference pipeline capable of computing TPS and CPS scores directly from IHC slides.

 

Inference Pipeline

Materials & Methods

 

Data Specification :

The dataset includes 193 training cases and 65 testing cases with paired HE and IHC slides.
However, due to annotation availability constraints, different subsets of cases were used at each stage of the pipeline:

  • Registration and lesion detection: 36 cases for training and 20 for testing, each with annotated regions of interest on HE slides.
  • Cell classification: 62 cases for training and 19 for testing, with all cells manually annotated using bounding boxes within four small frames per slide.

Lesion Detection :

Once the region annotations were available on the IHC slides, a U-Net segmentation model was trained directly on 5,103 IHC patches with online augmentations, including color variation, noise injection, and random flips.
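
To illustrate what such online augmentations could look like, here is a minimal sketch in plain Python applied to a patch and its lesion mask. The function name, the parameter values (`max_shift`, `noise_sigma`), and the list-of-tuples image representation are assumptions made for the example; the poster does not specify how the augmentations were implemented.

```python
import random


def augment(image, mask, rng, max_shift=20.0, noise_sigma=5.0):
    """Return a randomly flipped, color-shifted, noisy copy of (image, mask).

    image: list of rows of (r, g, b) tuples with values in 0..255
    mask:  list of rows of 0/1 lesion labels, same height and width
    """
    # Random flips: geometric transforms must hit image and mask alike.
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]
        mask = [row[::-1] for row in mask]
    if rng.random() < 0.5:
        image = image[::-1]
        mask = mask[::-1]

    # Color variation: one random offset per channel for the whole patch.
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]

    def clip(value):
        return max(0, min(255, int(round(value))))

    # Noise injection: independent Gaussian noise per pixel and channel.
    image = [
        [tuple(clip(c + s + rng.gauss(0.0, noise_sigma))
               for c, s in zip(px, shift))
         for px in row]
        for row in image
    ]
    mask = [list(row) for row in mask]
    return image, mask


# Hypothetical 2x2 patch with a single lesion pixel.
rng = random.Random(0)
patch = [[(10, 20, 30), (200, 210, 220)], [(0, 0, 0), (255, 255, 255)]]
labels = [[1, 0], [0, 0]]
aug_patch, aug_labels = augment(patch, labels, rng)
```

"Online" here means the transform is drawn afresh each time a patch is loaded, so the model rarely sees the exact same pixels twice. Color and noise changes leave the mask untouched, while flips are applied to both.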

Cell Positivity :

Cell positivity is determined by applying a threshold to the DAB channel obtained by color deconvolution.
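
As a rough illustration of this step, the sketch below separates the DAB contribution of a single RGB pixel using the widely used Ruifrok and Johnston stain vectors for hematoxylin, eosin, and DAB, then compares it to a threshold. The stain matrix, the threshold value of 0.2, and the function names are assumptions for the example; the poster does not state which values were used.

```python
import math

# Optical-density stain vectors (rows): hematoxylin, eosin, DAB.
STAINS = [
    (0.65, 0.70, 0.29),
    (0.07, 0.99, 0.11),
    (0.27, 0.57, 0.78),
]


def _inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


_UNMIX = _inverse_3x3(STAINS)


def dab_concentration(rgb):
    """Estimate the DAB stain amount of one (r, g, b) pixel in 0..255."""
    # Beer-Lambert: optical density is the negative log of transmitted light.
    od = [-math.log10(max(channel, 1) / 255.0) for channel in rgb]
    # od = concentrations x STAINS, so concentrations = od x inverse(STAINS);
    # column 2 of the inverse yields the DAB component.
    return sum(od[k] * _UNMIX[k][2] for k in range(3))


def is_dab_positive(rgb, threshold=0.2):
    """Placeholder decision rule: positive if DAB exceeds the threshold."""
    return dab_concentration(rgb) > threshold


print(is_dab_positive((137, 69, 42)))   # brown, DAB-like pixel: True
print(is_dab_positive((57, 51, 131)))   # blue, hematoxylin-like pixel: False
```

In practice the decision would be taken per detected cell, for example on the mean DAB value inside its bounding box, and the threshold would need calibration against pathologist annotations.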

Registration :

HE region annotations are registered to the IHC slides by maximizing the Jaccard index (overlap) with simulated annealing, so that the annotated regions can be transferred onto the IHC slides.
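
The idea can be sketched on toy data as follows. This example assumes binary masks stored as sets of pixel coordinates and searches over integer translations only; the transformation model, cooling schedule, and all parameter values are illustrative assumptions, since the poster does not describe them.

```python
import math
import random


def jaccard(a, b):
    """Jaccard index of two pixel sets: size of intersection over union."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def translate(mask, dx, dy):
    """Shift every pixel of a mask by (dx, dy)."""
    return {(x + dx, y + dy) for x, y in mask}


def register_by_annealing(moving, fixed, steps=2000, start_temp=0.1,
                          max_step=3, seed=0):
    """Search for the translation of `moving` that best overlaps `fixed`."""
    rng = random.Random(seed)
    current = (0, 0)
    current_score = jaccard(moving, fixed)
    best, best_score = current, current_score
    for step in range(steps):
        # Linear cooling; the small constant avoids division by zero.
        temp = start_temp * (1.0 - step / steps) + 1e-6
        candidate = (current[0] + rng.randint(-max_step, max_step),
                     current[1] + rng.randint(-max_step, max_step))
        score = jaccard(translate(moving, *candidate), fixed)
        # Accept improvements; accept worse moves with Boltzmann probability.
        delta = score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


# Toy example: a 20x20 square and a shifted copy of it.
fixed_mask = {(x, y) for x in range(20) for y in range(20)}
moving_mask = translate(fixed_mask, -5, 3)
shift, score = register_by_annealing(moving_mask, fixed_mask)
print(shift, round(score, 3))
```

Accepting occasional worse moves early on lets the search escape local optima in the overlap landscape, which a purely greedy search could get stuck in. A real slide registration would likely also need rotation and scaling.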

Cell Classification :

A RetinaNet-based detection and classification model was then trained on 3,588 lesion patches with data augmentation, including color variation, noise injection, and random flips. The model distinguishes tumor cells, immune cells, and other cells, the latter category comprising normal, polymorphonuclear, and apoptotic cells.

 

Experiments & Results

Lesion Detection : 

The lesion detection model was tested on 2,720 patches. A pixel-wise confusion matrix was used to evaluate how accurately each pixel was classified as lesion or non-lesion.
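
For reference, the sketch below shows how a pixel-wise confusion matrix and the derived precision, recall, and F1 score can be computed from a predicted mask and a ground-truth mask. The masks are made-up toy data and the function names are illustrative; this is not the evaluation code used for the poster.

```python
def pixel_confusion(pred, truth):
    """Count (tp, fp, fn, tn) over two same-sized 0/1 masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # lesion pixel correctly predicted
            elif p:
                fp += 1      # non-lesion pixel predicted as lesion
            elif t:
                fn += 1      # lesion pixel missed
            else:
                tn += 1      # non-lesion pixel correctly predicted
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


# Toy 2x3 masks (1 = lesion, 0 = non-lesion).
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [1, 1, 0]]
tp, fp, fn, tn = pixel_confusion(predicted, reference)
print(tp, fp, fn, tn)  # 2 1 1 2
print(precision_recall_f1(tp, fp, fn))
```

F1 is the harmonic mean of precision and recall, so it ignores true negatives. That is useful when non-lesion pixels dominate a slide.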

 

Cell Classification and TPS Scoring :

The cell classification model was tested on 1,575 patches.

Discussion

For lesion detection, the model reached an F1 score of 0.788, reflecting balanced precision and recall.

For cell classification, the model shows promising performance on tumor and immune cells.
However, adding more annotations for the "other cells" category would help address the training imbalance and improve overall accuracy.

At this stage, the automated scoring lacks sufficient precision for low TPS and CPS values, which remains one of the current limitations. The assessment of cell positivity is still at an early stage and will benefit from further evaluation or complementary methods. Continued improvements across pipeline stages are expected to enhance overall performance and clinical relevance.

 

Conclusion

The pipeline effectively identifies regions of interest in IHC images and shows promising results in cell classification. A comprehensive evaluation, including full pipeline testing on multi-annotator datasets to address inter-observer variability in TPS and CPS scoring, is ongoing. This step is currently limited by the availability of ground truth annotations, which are still being collected. Future work will focus on completing this validation and extending the study to a larger cohort to ensure robustness and generalizability.

 

 

This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101104801. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.