Dynamic Bayesian modelling of biological networks
Ajmal, Hamda
Identifiers
http://hdl.handle.net/10379/16990
https://doi.org/10.13025/16905
Publication Date
2021-10-25
Type
Thesis
Abstract
In this thesis we review, analyse and develop algorithms that model the dynamic variables of a system from data or domain knowledge, in order to identify the regulatory relationships between those variables. We apply these algorithms to infer network structures from biological data or domain knowledge, and to perform real-time temporal data mining with the resulting models. We review and evaluate existing methods for Bayesian network structure learning and propose new methods to elucidate dynamic networks from data.
The first contribution of this thesis is the development and evaluation of a new software tool, PROFET, that automatically derives probabilistic networks from existing domain knowledge expressed as mathematical models. It builds dynamic Bayesian network (DBN) structures from ordinary differential equations (ODEs) and handles both model and data uncertainty in a principled manner. It can be used for temporal data mining with noisy and missing variables. We demonstrate PROFET’s functionality by using it to estimate the model parameters, and thereby infer the model variables, of four benchmark ODE systems, three of which are biological systems.
The second contribution of this thesis is the analysis of different statistical methods for learning Bayesian network structures from data. We analyse methods based on low-order conditional independence graphs and on Lasso regression, and apply them to infer gene regulatory networks (GRNs) from high-dimensional time series data. We highlight the advantages and drawbacks of these methods for learning network structures by applying them to simulated as well as real gene expression data.
The third contribution of this thesis is the development, analysis and evaluation of two novel Bayesian scoring methods for inferring sparse network models from high-dimensional temporal gene expression data. Both scoring methods are based on the Bayesian information criterion (BIC) score: ScoreLOPC combines the BIC score with first-order conditional independence values, and ScoreLASSO combines the BIC score with L1-regularised Lasso regression. We use these scoring functions in conjunction with different structure search algorithms to infer GRNs from gene expression data. We conclude that the network structures learned using ScoreLASSO and ScoreLOPC have fewer spurious edges than those learned using the BIC scoring function.
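As background for the baseline BIC scoring function named in the abstract, the following is a minimal sketch of how a BIC score can rank candidate parent sets for one variable in a time series. It is not the thesis's ScoreLOPC or ScoreLASSO, whose definitions are not given in this record. The function name `bic_score`, the linear-Gaussian model, and the simulated three-gene data are all illustrative assumptions.

```python
import numpy as np

def bic_score(target, parents):
    """BIC of a linear-Gaussian model of `target` given `parents` (lower is better).

    target  : shape (n,)   values of one variable at time t+1
    parents : shape (n, p) values of candidate parents at time t (p may be 0)
    Assumption: constant terms of the Gaussian likelihood are dropped.
    """
    n = target.shape[0]
    design = np.column_stack([np.ones(n), parents])    # intercept + parents
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    rss = float(np.sum((target - design @ coef) ** 2))
    k = design.shape[1]                                 # number of fitted coefficients
    return n * np.log(rss / n) + k * np.log(n)          # fit term + complexity penalty

# Hypothetical toy data: gene 2 is driven by gene 0; gene 1 is unrelated noise.
rng = np.random.default_rng(0)
T = 200
x = np.zeros((T, 3))
x[0] = rng.normal(size=3)
for t in range(1, T):
    x[t, 0] = 0.8 * x[t - 1, 0] + rng.normal(scale=0.5)
    x[t, 1] = rng.normal(scale=0.5)
    x[t, 2] = 0.9 * x[t - 1, 0] + rng.normal(scale=0.5)

target = x[1:, 2]                                       # gene 2 at time t+1
for parent_set in ([], [0], [0, 1]):
    lagged = x[:-1, np.array(parent_set, dtype=int)]    # parents at time t
    print(parent_set, bic_score(target, lagged))
```

Each extra parent adds `log(n)` to the score, so an edge is kept only if it improves the fit by more than that penalty. This is the general mechanism by which BIC discourages dense networks.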
Publisher
NUI Galway