Ph. D. Project
Supervised statistical and machine learning methods for multi-class skin cancer classification exploiting multidimensional datasets from in vivo UV-Vis-NIR spectroscopy, hyperspectral imaging and optical coherence tomography
2021/11/22 - 2024/11/21
Other supervisor(s):
Prof. Yuri Kistenev (
Multimodal in vivo optical biopsy combines the strengths of different spectrally and spatially resolved methods to enhance the non-invasive characterization of the optical properties of biological tissues. Metabolic and morphological pathological modifications in biological tissues induce changes in their optical properties at subcellular, cellular and tissue scales. Skin cancer is the most common form of cancer in humans; among skin cancers, advanced Melanoma Skin Cancer (MSC) has the poorest prognosis and Non-Melanoma Skin Cancers (NMSC) have the highest morbidity. The standard clinical procedure for diagnosis is based on a visual examination of the skin surface, followed, in case of a suspicious lesion, by a surgical biopsy for histopathological grading. However, this procedure has low sensitivity and specificity. It is also invasive and time-consuming, and therefore raises health and economic issues. To address these issues, non-invasive spectro-imaging techniques have been developed to provide earlier, more accurate detection of skin cancers, thus improving the efficiency of diagnosis and care.
AutoFluorescence (AFS) and Diffuse Reflectance (DRS) spectroscopies have been investigated in a number of clinical studies as non-invasive diagnostic tools for skin cancers, as they provide access to structural and functional information of interest for non-invasive diagnosis and for the follow-up of pathological evolution. DR spectra carry information about the absorption and scattering properties of the tissue, which are related to its structure and organization. Multiply excited AF (mAF) spectra convey information about tissue composition and metabolism through intrinsic fluorescing compounds present in cells and in the extracellular matrix. Combining these complementary techniques enables the detection of changes appearing in MSC and NMSC, as well as the accurate differentiation between dysplastic and benign lesions. Spatially resolved (SR) spectroscopy provides depth sensing in a reflectance configuration by using a multiple optical fiber probe with several excitation-to-collecting fiber separations. Further discriminant information can be extracted from absorption, scattering and AF changes in the various tissue layers, which are affected differently depending on the stage of the pathology. The challenge consists in exploiting this spatial resolution and in extracting, from these multidimensional data sets, discriminant information related to the characteristics and concentrations of the absorbing, scattering and fluorescing constituents at various depths in the probed tissue.
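As an illustration of what such a multidimensional data set may look like before feature extraction, the sketch below flattens one measurement, indexed by fiber separation and by modality, into a single named feature vector. The separations (400 and 600), the modality labels ("DR", "AF365") and all intensity values are hypothetical placeholders, not values from the project.

```python
def flatten_measurement(spectra_by_distance):
    """Flatten {fiber separation: {modality: [intensities]}} into
    parallel lists of feature names and feature values.

    Keys are visited in sorted order so that every measurement
    produces its features in the same, reproducible order.
    """
    names, values = [], []
    for distance in sorted(spectra_by_distance):
        modalities = spectra_by_distance[distance]
        for modality in sorted(modalities):
            for k, intensity in enumerate(modalities[modality]):
                names.append(f"d{distance}_{modality}_{k}")
                values.append(intensity)
    return names, values


# Hypothetical measurement: two fiber separations, two modalities.
measurement = {
    400: {"DR": [0.5, 0.4], "AF365": [0.1]},
    600: {"DR": [0.3, 0.2], "AF365": [0.05]},
}
names, values = flatten_measurement(measurement)
```

Keeping the feature names alongside the values makes it possible to trace a selected feature back to a given depth (fiber separation) and modality.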
Line-field Confocal Optical Coherence Tomography (LCOCT) is an emerging technique in dermatology that combines the advantages of Reflectance Confocal Microscopy and OCT, i.e. high spatial resolution and appropriate penetration depth. LCOCT provides a comprehensive mapping of the skin based on morphological information, with image contrast arising from variations in the tissue refractive index. The resulting 3D images, viewed as vertical and en face sections, are closer to histological images than those of any other non-invasive imaging technology. However, the small changes in refractive index between normal and pathological tissues in the early stages of disease are difficult to distinguish. To solve this problem, functional information on the local biochemistry of the skin should be collected along with LCOCT images.
Hyperspectral imaging (HSI) is an emerging spatially resolved spectral imaging modality for medical applications, especially in disease diagnosis and image-guided surgery. A 3D data set (hypercube) is acquired, which provides diagnostic information about tissue physiology, morphology and composition. HSI offers possibilities for remote mapping of specific skin parameters, facilitating better diagnosis of skin pathologies. A large number of proteins have vibrational modes in the near and far IR ranges.
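A hypercube can be pictured as a stack of images, one per wavelength, or equivalently as one spectrum per pixel. The following minimal sketch, assuming a toy 2 x 3 pixel cube with four made-up bands, shows the three access patterns typically needed: the spectrum at a pixel, the image at a band, and the mean spectrum over a region. The wavelengths and reflectance values are illustrative only.

```python
# Hypothetical band centers (nm); real instruments record many more bands.
wavelengths_nm = [450, 550, 650, 750]


def make_hypercube(height, width, n_bands, fill=0.0):
    """Create cube[y][x][k]: value at pixel (y, x) and band index k."""
    return [[[fill for _ in range(n_bands)] for _ in range(width)]
            for _ in range(height)]


def pixel_spectrum(cube, y, x):
    """Return a copy of the full spectrum recorded at one pixel."""
    return list(cube[y][x])


def band_image(cube, k):
    """Return the 2D image recorded at one band index."""
    return [[pixel[k] for pixel in row] for row in cube]


def mean_spectrum(cube, region):
    """Average the spectra over a list of (y, x) pixel coordinates."""
    n_bands = len(cube[0][0])
    sums = [0.0] * n_bands
    for y, x in region:
        for k in range(n_bands):
            sums[k] += cube[y][x][k]
    return [s / len(region) for s in sums]


cube = make_hypercube(2, 3, len(wavelengths_nm))
cube[0][1] = [0.10, 0.20, 0.30, 0.40]  # made-up reflectance values
cube[1][2] = [0.30, 0.40, 0.50, 0.60]
roi_mean = mean_spectrum(cube, [(0, 1), (1, 2)])
```

Nested lists keep the sketch dependency-free; an array library would normally be used for cubes of realistic size.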
Artificial intelligence (AI) is being increasingly studied for its potential use in medicine, particularly in dermatology. It has been demonstrated that AI models, namely classification models based on deep learning, can match the accuracy of board-certified dermatologists in classifying skin lesions as benign or malignant from dermoscopic images. In supervised learning classification (SLC), a training set of already labeled data is used to build a description of the data (the classifier). The trained classifier is then used to predict the class to which new input data most likely belong. In our case, useful and reliable multi-class supervised classification depends on extracting and selecting biologically meaningful features and on accounting for the uncertainty in the 'reference' histological classification. Supervised statistical and machine learning methods, including deep neural network-based approaches, should be implemented for fast inference on the acquired data and to improve performance in differentiating between cancer stages. The expected improvements in the diagnostic efficiency for skin pathologies will be studied and quantified through experimental validations on optical phantoms, on ex vivo skin samples and in clinics.
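The train-then-predict workflow of supervised classification can be sketched with a deliberately simple stand-in, a nearest-centroid classifier, which is not one of the methods targeted by the project. The two-dimensional features, the class labels and the helper names `fit_centroids` and `predict` are all hypothetical and chosen only for illustration.

```python
import math


def fit_centroids(features, labels):
    """Training step: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Prediction step: return the class with the nearest centroid."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Made-up labeled training set: two features per sample, three classes.
train_x = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.2],
           [0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
train_y = ["benign", "benign", "malignant",
           "malignant", "dysplastic", "dysplastic"]

centroids = fit_centroids(train_x, train_y)
new_sample_class = predict(centroids, [0.9, 0.2])
```

The same two-step interface (fit on labeled data, predict on new data) carries over to more capable classifiers; only the internal description of the data changes.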
The aim of the present PhD thesis is to couple three complementary spectro-imaging modalities in order to enhance the in vivo diagnostic efficiency for skin lesions, including cancers (MSC and NMSC). LCOCT, HSI and SR-DRmAFS will respectively record (i) highly spatially resolved 3D images of skin microstructures, (ii) highly spectrally and spatially resolved images of skin reflectance and (iii) highly spectrally resolved functional information on skin absorption, scattering and autofluorescence. Experimental work will be conducted on layered absorbing, scattering and fluorescing optical phantoms, on hybrid ex vivo skin-gel models and in clinics on patients with MSC and NMSC. Various supervised machine learning classification strategies will be studied, including kernel Support Vector Machines, decision trees, Random Forests and neural networks, and a dedicated AI algorithm for multidimensional data processing will be developed, taking into account the specificities of the collected data (variety, dimension, etc.).
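Comparing classification strategies in a multi-class setting requires a common figure of merit. One common option, shown here as a sketch rather than as the project's evaluation protocol, is to compute sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP), for each class in a one-vs-rest fashion. The labels and predictions below are made up.

```python
def per_class_metrics(y_true, y_pred):
    """Return {class: (sensitivity, specificity)} computed one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = sum(1 for t, p in pairs if t != c and p != c)
        # Undefined ratios (no positives or no negatives) are reported as NaN.
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[c] = (sensitivity, specificity)
    return metrics


# Hypothetical reference labels and classifier outputs for six lesions.
y_true = ["MSC", "MSC", "NMSC", "NMSC", "benign", "benign"]
y_pred = ["MSC", "NMSC", "NMSC", "NMSC", "benign", "MSC"]

metrics = per_class_metrics(y_true, y_pred)
# metrics["MSC"]    -> (0.5, 0.75)
# metrics["NMSC"]   -> (1.0, 0.75)
# metrics["benign"] -> (0.5, 1.0)
```

Reporting both values per class avoids hiding a poorly detected class behind a good overall accuracy.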
Biology, Signals and Systems in Cancer and Neuroscience