Identification and characterisation of germline variants in an Irish cohort with breast cancer

McVeigh, Úna
Breast cancer is the most common female malignancy worldwide, and the second most common cause of cancer-related deaths in women. Twin studies estimate that up to one third of breast cancers are the result of hereditary factors; approximately 50% of the inherited genetic predisposition to the disease is yet to be fully elucidated. Next-generation sequencing (NGS) is a rapidly evolving technology that enables sequencing of many genes across multiple samples simultaneously. NGS, particularly multi-gene panels, is an accessible and practical option for clinicians and research groups investigating the genetic basis of inherited disease. However, variant interpretation remains a significant challenge, particularly in genes with an ill-defined association to disease. This research focused on the development of a custom NGS multi-gene panel to investigate the prevalence of variants in putative breast cancer susceptibility genes in Irish women with breast cancer, and to assess the pathogenicity of variants based on in silico predictions and biochemical analysis. A custom multi-gene panel comprising 282 genes was designed, including known breast cancer susceptibility genes along with candidate genes identified by extensive literature review. Germline DNA was obtained from patients affected by breast cancer (n=91) and ethnically matched controls (n=77). Bioinformatic analysis was conducted in line with GATK best practices. Variant annotation was performed with SnpEff, VEP, and ANNOVAR. Loss-of-function variants were verified using the UCSC Genome Browser. Missense variants were prioritised using five in silico algorithms. Population frequency and variant classification data were obtained from public databases. Common (MAF>1%) and benign variants were removed from analysis. Analysis was restricted to genes appearing on clinical breast ± ovarian cancer panels (n=83).
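The three filtering steps described above (removal of common variants with MAF>1%, removal of benign variants, and restriction to clinical panel genes) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the `Variant` record, the `filter_variants` function, and the classification labels are hypothetical names chosen for illustration. The sketch also assumes that variants absent from population databases (no MAF available) are retained as potentially novel.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant record (illustration only)."""
    gene: str
    maf: Optional[float]   # minor allele frequency; None if not in any database
    classification: str    # e.g. "benign", "uncertain", "pathogenic"


def filter_variants(
    variants: Iterable[Variant],
    panel_genes: Set[str],
    maf_threshold: float = 0.01,
    excluded_classes: FrozenSet[str] = frozenset({"benign"}),
) -> List[Variant]:
    """Keep rare, non-benign variants in the genes of interest."""
    kept = []
    for v in variants:
        # Restrict analysis to genes on the chosen panel.
        if v.gene not in panel_genes:
            continue
        # Remove common variants (MAF above the threshold).
        # Assumption: variants with no frequency data are kept as possibly novel.
        if v.maf is not None and v.maf > maf_threshold:
            continue
        # Remove variants already classified as benign.
        if v.classification.lower() in excluded_classes:
            continue
        kept.append(v)
    return kept


# Invented example data; not taken from the study.
example = [
    Variant("ATM", 0.0002, "pathogenic"),
    Variant("ATM", 0.05, "uncertain"),      # common, removed
    Variant("FH", None, "uncertain"),       # no frequency data, kept
    Variant("CHEK2", 0.001, "benign"),      # benign, removed
    Variant("GENE_X", None, "uncertain"),   # not on panel, removed
]
remaining = filter_variants(example, {"ATM", "FH", "CHEK2"})
```

Whether "likely benign" variants should also be excluded is a design choice the abstract does not specify; in this sketch it can be adjusted through the `excluded_classes` parameter.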
Novel variants were prioritised for biochemical characterisation by the availability of 3D protein structures and functional assays. Recombinant wildtype and mutant proteins were expressed and compared with respect to structure and function. Forty-five variants in 28 genes were selected for further investigation. At least one variant was identified in 26 patients with breast cancer and 25 unaffected controls. Almost half of the prioritised variants (n=22), lacking sufficient evidence for pathogenicity, were classified as variants of uncertain significance. Nine variants classified as pathogenic/likely pathogenic were identified; three occurred in ATM, BRCA1, and CHEK2. Six variants were identified in the mismatch repair genes MLH1, MSH2, MSH6, and PMS2. Eight novel variants were identified, including a frameshift variant in NF1. Functional investigation of a novel variant in FH demonstrated significantly impaired enzymatic activity. This work has demonstrated the process of developing and applying a multi-gene panel in assessing genetic predisposition to breast cancer. The challenges encountered during bioinformatics analysis stress the need for a standardised, validated bioinformatics workflow, with safeguards to ensure correct sample identification at all points of the pipeline. Many of the challenges in variant interpretation are highlighted in this work, particularly for novel variants and variants in genes of moderate, low, or uncertain risk. Detailed structure-function analysis of variants may provide further evidence for pathogenicity but must be carefully considered in a phenotype-specific setting. These results highlight the need for caution when considering the use of multi-gene testing in the clinic, particularly when genes with an ill-defined association to disease are sequenced, or when testing is undertaken in an unaffected individual.
NUI Galway
Attribution-NonCommercial-NoDerivs 3.0 Ireland