Essays on exploring heterogeneity in effectiveness and cost-effectiveness in health applications using a machine learning approach
Hattab, Zaid
Publication Date
2024-12-12
Type
doctoral thesis
Abstract
Personalized (also known as stratified or precision) medicine aims to improve population health by providing the right treatment to the right patient at the right time, recognizing that the effectiveness and harms of treatments can vary based on individuals' baseline characteristics (such as age, gender, and disease severity).
Traditional methods for identifying subgroup effects are limited, often focusing on characteristics ‘1-variable-at-a-time’ or using simple linear parametric models. More flexible, data-driven approaches from the Machine Learning (ML) literature have been proposed to improve estimates of heterogeneous treatment effects (HTE), which can lead to better treatment decisions for individual patients.
The overall aim of this thesis is to enhance the understanding and application of ML methods for estimating HTE in the context of real-world Randomized (or quasi-randomized) Controlled Trials (RCTs). The knowledge gap addressed in this thesis lies in the limited application of advanced ML methods, specifically forest-based methods, in estimating HTE within RCTs. ML methods have several advantages, such as the ability to handle complex interactions between variables, adaptability to various data structures, and the potential for higher accuracy in predicting outcomes. Among these, as described below, forest-based methods are particularly attractive due to their robustness, interpretability, and capability to capture non-linear relationships and interactions without the need for strict parametric assumptions.
This thesis addresses this knowledge gap by applying a recently proposed ML approach, the causal forest method, to estimate HTE and identify subgroups of patients who are most likely to benefit from interventions or policies through a series of papers based on data from real-world experiments. This research makes significant contributions in three critical contexts: examining HTE in RCTs, analyzing observational studies, and incorporating cost-effectiveness analysis. Focusing on RCTs, which are the gold standard of causal inference, enhances the credibility of this thesis's findings by ensuring the estimated HTE are theoretically well-defined. Additionally, Chapter 5 of this research demonstrates the capacity of the employed ML methods to handle observational data effectively when selection on observables is plausible or a valid instrumental variable is available, yielding reliable results. Moreover, integrating cost-effectiveness analyses ensures that treatments are evaluated not only for their clinical benefits but also for their economic implications. This dual approach is vital for healthcare systems that must balance quality of care with budgetary constraints and opportunity costs. Assessing the cost-effectiveness of treatments allows us to identify interventions that offer the best value, thereby optimizing the allocation of healthcare resources. Overall, estimating heterogeneity in these contexts is crucial for developing personalized treatment strategies, making informed policy decisions, and optimizing healthcare resource utilization. To advance these objectives, the thesis undertakes a series of empirical analyses presented as follows:
Chapter 2 examines heterogeneity in the cost-effectiveness of high flow nasal cannula (HFNC) therapy compared with continuous positive airway pressure (CPAP) in children following extubation (Step-Down trial). This chapter uses data from the FIRST-line support for Assistance in Breathing in Children (FIRST-ABC) trial to identify heterogeneity at the individual and subgroup levels using a cost-effectiveness analysis causal forest (CEA forest) approach, alongside a seemingly unrelated regression (SUR) approach for comparison. The primary outcome of this study is the incremental net monetary benefit (INB) of HFNC compared to CPAP at a willingness-to-pay threshold of £20,000 per quality-adjusted life year (QALY) gained. INB is calculated from total costs and QALYs at six months. The findings suggest modest heterogeneity in the cost-effectiveness of HFNC compared to CPAP at the subgroup level, while greater heterogeneity is detected at the individual level. The overall estimated INB of HFNC is smaller than the INB estimated for patients with better baseline status, suggesting that HFNC may be more cost-effective among less severely ill patients.
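The INB outcome described for Chapter 2 can be illustrated with a minimal sketch. It assumes the standard net-benefit definition (INB = threshold × incremental QALYs − incremental cost), which the abstract does not spell out; the function name and the input values are hypothetical and are not results from the thesis.

```python
def incremental_net_benefit(delta_qalys, delta_cost, threshold=20_000.0):
    """Incremental net monetary benefit of a treatment versus its comparator.

    delta_qalys: difference in QALYs (treatment minus comparator)
    delta_cost:  difference in total cost in pounds (treatment minus comparator)
    threshold:   willingness-to-pay per QALY gained (the abstract cites 20,000)
    """
    return threshold * delta_qalys - delta_cost


# Hypothetical illustration only: a small QALY gain at a small extra cost.
example_inb = incremental_net_benefit(delta_qalys=0.01, delta_cost=150.0)
print(f"Illustrative INB: {example_inb:.2f}")
```

A positive INB means the treatment is cost-effective at the chosen threshold; heterogeneity analysis then asks how this quantity varies with patients' baseline characteristics.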
Chapter 3 uses data and analysis complementary to Chapter 2. The main difference is that Chapter 2 uses data from the FIRST-ABC trial focusing on children who require non-invasive respiratory support following extubation (Step-Down trial), while Chapter 3 uses data from the FIRST-ABC trial focusing on acutely ill children (Step-Up trial). Additionally, this chapter not only utilizes the CEA forest approach but also introduces the Bayesian causal forest method to provide a more comprehensive analysis. The findings reveal that more subgroup effects are significant in the Step-Up population than in the Step-Down population. This suggests a higher degree of heterogeneity in the cost-effectiveness of HFNC in acutely ill children compared to those in the post-extubation phase. The combined analysis of Step-Up and Step-Down trials adds significant value by providing a holistic view of HFNC's cost-effectiveness heterogeneity across different stages of respiratory support in children, in addition to the heterogeneity at the patient level.
Chapter 4 evaluates heterogeneity in the treatment effect of cemented hemiarthroplasty in the WHiTE 5 multicentre, randomized, controlled trial conducted in England and Wales using an ML approach, Causal Forests (CF); the study compared cemented with modern, uncemented hemiarthroplasty in patients 60 years of age or older with an intracapsular hip fracture. The analysis revealed a complex landscape of response to cemented hemiarthroplasty over a 12-month period. Findings suggest greater variability in treatment effects at the 1-month mark than at subsequent follow-up periods, with particular regard to subgroups based on age. Results showed that conclusions regarding heterogeneity of effects with respect to baseline characteristics, including age, health status, and lifestyle factors like alcohol consumption, depend on the timepoint considered. In almost all cases, the overall effect estimate lies within the confidence intervals of the subgroup estimates, which suggests that one cannot be confident that effects are heterogeneous by subgroup or timepoint.
In Chapter 5, the thesis applies causal forest and instrumental forest methods to data from the Oregon Health Insurance Experiment (OHIE) to explore heterogeneity in the uptake of health insurance, and in the effects of (a) lottery selection and (b) health insurance on a range of health-related outcomes. The findings of this study suggest that the impact of winning the lottery on health insurance uptake varies across subgroups defined by age and race. This highlights the need for targeted policy interventions to address specific barriers faced by these subgroups, ultimately aiming to improve health insurance enrollment and equity in healthcare access. In addition, the results generally coincide with findings in the literature regarding the overall effects: lottery selection (and insurance) reduces out-of-pocket spending and increases physician visits and drug prescriptions, with little (short-term) impact on the number of emergency department visits and hospital admissions. Overall, only weak evidence of heterogeneity in the effects of the lottery and of health insurance is detected across the outcomes considered.
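The distinction between the effect of lottery selection and the effect of health insurance itself can be illustrated with a minimal sketch of the simple Wald (instrumental variable) ratio. This is not the instrumental forest method used in the thesis, only the basic idea of scaling the lottery effect on an outcome by the lottery effect on insurance uptake. The function name and the toy data are hypothetical.

```python
from statistics import mean


def wald_estimate(won_lottery, insured, outcome):
    """Wald ratio: effect of the instrument on the outcome divided by
    the effect of the instrument on treatment uptake.

    won_lottery: 0/1 instrument (lottery selection)
    insured:     0/1 treatment (has health insurance)
    outcome:     numeric outcome for each person
    """
    def group_mean(values, flag):
        return mean(v for z, v in zip(won_lottery, values) if z == flag)

    # Effect of lottery selection on the outcome (intention-to-treat effect).
    itt = group_mean(outcome, 1) - group_mean(outcome, 0)
    # Effect of lottery selection on insurance uptake (first stage).
    first_stage = group_mean(insured, 1) - group_mean(insured, 0)
    if first_stage == 0:
        raise ValueError("Instrument has no effect on uptake; ratio undefined.")
    return itt / first_stage


# Toy data (hypothetical): half of the lottery winners take up insurance.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]
y = [4, 4, 2, 2, 2, 2, 2, 2]
print(wald_estimate(z, d, y))
```

Because only some winners take up insurance, the effect of insurance among those induced to enrol is larger in magnitude than the effect of merely winning the lottery.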
Overall, this thesis demonstrates the flexibility of causal forests in the realm of personalized medicine (Chapters 2-4) and policy making more broadly (Chapter 5), contributing to their increasing use in applied research. Further, I consider the role of heterogeneity when assessing cost-effectiveness in Chapters 2 & 3 and account for unobserved heterogeneity in Chapter 5. The application of ML methods, particularly non-parametric forest-based methods, has been widely recommended in the methodological literature for their proficiency in managing complex covariate-treatment relationships. However, their potential has yet to be fully explored in applied health economic studies. This thesis applies these methods to real-world health economic problems, advancing knowledge on their applicability in health economics.
Publisher
University of Galway
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International