Publication

A bioinformatics analysis of the lncRNA-associated antigen load landscape across multiple cancer types

Malik, Sumaira
Abstract
This thesis explores the potential of long non-coding RNAs (lncRNAs) to produce tumor antigens. Ribosome profiling has provided broad evidence of short open reading frames (sORFs) in lncRNAs, yet most of the available literature on tumor antigen load has explored antigens arising from mutated coding regions. In Chapter 2, we present a novel resource, the “lnc.IM pipeline”, which identifies the patient-specific tumor antigen load arising from translatable lncRNAs; we refer to these loads as lnc.IM scores. In the subsequent chapters (Chapters 3 and 4), the lnc.IM pipeline is employed to study the association of lnc.IM scores with predictions of immune checkpoint inhibitor (ICI) efficacy and with the tumor immune microenvironment (TIM). ICI therapy is one of the most promising treatments for cancers. ICI response, however, varies among patients, emphasizing the importance of identifying genomic biomarkers that predict likely therapeutic efficacy in advance of treatment. In Chapter 3, we explore the usability of lnc.IM scores to predict ICI response in melanoma. We report improved overall survival among patients with low lnc.IM scores (HR=0.39, p=0.009) in the skin cutaneous melanoma (TCGA-SKCM) cohort. We hypothesized that integrating lnc.IM scores with the tumor mutation burden (TMB)-associated neoantigen load could augment the current criteria for ICI patient selection, which rely primarily on TMB. Using the ICI-treated cohorts, we demonstrate that a classifier using both lnc.IM scores and neoantigen load as predictors improves the prediction of immunotherapy outcomes compared with TMB alone, yielding an area under the curve (AUC) of 0.89. We also demonstrate a lower rate of false negatives with the combined antigen score (14%) than with TMB alone (33%) in ICI-treated cohorts. Next, we examine lnc.IM scores in 29 different TCGA cancers (Chapter 4).
We observed that the cancers with the highest median lnc.IM scores (acute myeloid leukemia (LAML), colon adenocarcinoma (COAD), rectum adenocarcinoma (READ), and liver hepatocellular carcinoma (LIHC)) were also associated with low median TMB, suggesting that translatable lncRNAs provide an alternative source of cancer antigens in these cancers. Survival analysis revealed that lnc.IM scores are associated with survival outcomes in 9 out of 29 cancers. In four of these (LAML, brain lower grade glioma (LGG), lung adenocarcinoma (LUAD), and glioblastoma multiforme (GBM)), high lnc.IM scores were linked to better survival. Additionally, three of these four cancers exhibited a positive correlation between lnc.IM scores and the transcriptome-derived abundance of anti-tumor immune cells. These results suggest that lnc.IM scores are associated with an elevated anti-tumor immune response in these cancers. Finally, we conducted a comprehensive in silico analysis comparing the distribution of transposable elements (TEs) among the lncRNA sORFs that were used for lnc.IM scoring in the previous chapters. Our findings revealed that Endogenous Retroviruses (ERVs) are the most represented class among lncRNA sORFs (58%), followed by Long Interspersed Nuclear Elements (LINEs, 11%), Short Interspersed Nuclear Elements (SINEs, 7%), and SINE-VNTR-Alu elements (SVAs, 2%). Within the ERV class, ERV1 emerged as the most represented sub-class. A comparison of predicted immunogenicity across the different ERV classes revealed no significant differences. Lastly, we extended our analysis to predict the biochemical properties of peptides derived from these lncRNA sORFs. We predicted 25 TE-associated immunogenic peptides based on their major histocompatibility complex class I (MHC-I) binding predictions, T-cell reactivity predictions, and hydrophobicity and stability predictions.
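The abstract reports classifier performance as an AUC and a false-negative rate. As a reading aid, the sketch below shows one common way these two metrics can be computed from per-patient scores. It is not the thesis's implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen responder (label 1)
    scores higher than a randomly chosen non-responder (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def false_negative_rate(labels, predictions):
    """Fraction of true responders (label 1) predicted as non-responders (0)."""
    preds_for_responders = [p for y, p in zip(labels, predictions) if y == 1]
    if not preds_for_responders:
        raise ValueError("need at least one responder")
    missed = sum(1 for p in preds_for_responders if p == 0)
    return missed / len(preds_for_responders)


# Hypothetical toy data (NOT from the thesis): 1 = responder, 0 = non-responder.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.7, 0.8, 0.6, 0.3, 0.5, 0.2]

# Hypothetical decision threshold; a score at or above it predicts "responder".
THRESHOLD = 0.5
toy_predictions = [1 if s >= THRESHOLD else 0 for s in toy_scores]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_fnr = false_negative_rate(toy_labels, toy_predictions)
```

Under this definition, a lower false-negative rate means fewer patients who would respond to ICI therapy are wrongly excluded, which is the sense in which the combined antigen score is compared with TMB alone.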
Publisher
University of Galway
Rights
CC BY-NC-ND