Input and output data analysis for system dynamics modelling using the tidyverse libraries of R

Duggan, Jim
Duggan, Jim. (2018). Input and output data analysis for system dynamics modelling using the tidyverse libraries of R. System Dynamics Review, 34(3), 438-461. doi: 10.1002/sdr.1600
System dynamics is a data‐intensive activity from several perspectives. First, each modelling challenge addresses behaviour over time, where historical time series data inform the model‐building process, and techniques such as calibration and optimisation (Rahmandad et al., 2015) are deployed to estimate parameters and enhance user confidence in model outputs. Second, the simulation of higher‐order models (Forrester, 1987) typically yields many time‐based observations across a significant number of variables. These results must be interpreted and analysed as part of the model‐building and policy analysis process. Third, simulation methods such as sensitivity analysis (Hekimoğlu and Barlas, 2016; Walrave, 2016; Jadun et al., 2017) generate large datasets that must be processed for further analysis, for example with techniques such as statistical screening (Ford and Flynn, 2005; Taylor et al., 2010; Yasaman and Ford, 2016). In the context of these data‐intensive modelling processes, there are therefore opportunities for system dynamics modellers to leverage complementary data exploration technologies such as R (Duggan, 2016b). The R programming language provides a flexible framework for supporting system dynamics modelling. In particular, R now contains a suite of libraries, collectively known as the tidyverse, that are specifically designed to process rectangular data, that is, highly structured data in which rows are individual observations and columns are variables. In a system dynamics context, a row represents the simulation output at a unique point in time, and columns contain model variables (i.e., stocks, flows and auxiliaries). Given this perspective, the output from a system dynamics model with n time steps and m variables can be viewed as a single rectangular data set with dimensions (n × m).
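The rectangular view described above can be illustrated with a minimal sketch. The one-stock growth model, its parameter values and the names `population`, `net_growth` and `sim` are hypothetical and chosen only for illustration; they are not taken from the article. The sketch integrates the stock with Euler's method in base R and stores the result in a tibble, with one row per time step and one column per variable.

```r
library(tibble)

# Hypothetical one-stock model: a population growing at a constant
# fractional rate. All names and values are illustrative assumptions.
dt          <- 0.25                  # time step
time        <- seq(0, 10, by = dt)   # simulation time points
growth_rate <- 0.05                  # fractional growth rate per time unit
n           <- length(time)

population <- numeric(n)             # stock
net_growth <- numeric(n)             # flow
population[1] <- 1000                # initial value of the stock

# Euler integration: stock(t + dt) = stock(t) + dt * flow(t)
for (i in seq_len(n)) {
  net_growth[i] <- growth_rate * population[i]
  if (i < n) {
    population[i + 1] <- population[i] + dt * net_growth[i]
  }
}

# Rectangular output: one row per time point, one column per variable
sim <- tibble(time = time, population = population, net_growth = net_growth)
dim(sim)
```

Here `dim(sim)` reports 41 rows and 3 columns: the 41 time points from 0 to 10, and a time column alongside the two model variables.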
Many of the tidyverse libraries provide quick and efficient ways to process rectangular data, including importing data from external sources (e.g., comma‐separated files), summarising and aggregating observations (frequency counts, summary functions) and visualising large datasets. Before describing these libraries, an overview of R is provided.
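As a rough sketch of the three tasks just listed (importing, summarising and visualising), the following example uses readr, dplyr and ggplot2. The data values, the run labels `base` and `policy`, and the temporary file are all invented for illustration; a real workflow would import output exported from a simulation tool instead.

```r
library(readr)
library(dplyr)
library(ggplot2)

# Illustrative data: two hypothetical simulation runs in long (tidy) form.
runs <- tibble::tibble(
  run   = rep(c("base", "policy"), each = 5),
  time  = rep(0:4, times = 2),
  stock = c(100, 110, 121, 133.1, 146.41,
            100, 105, 110.25, 115.7625, 121.550625)
)

# Import: round-trip through a comma-separated file to mimic reading
# simulation output from an external source.
path <- tempfile(fileext = ".csv")
write_csv(runs, path)
imported <- read_csv(path, show_col_types = FALSE)

# Summarise: collapse each run to a single row of summary statistics.
summary_tbl <- imported %>%
  group_by(run) %>%
  summarise(n_obs = n(), peak_stock = max(stock), mean_stock = mean(stock))

# Visualise: one line per run, behaviour over time.
p <- ggplot(imported, aes(x = time, y = stock, colour = run)) +
  geom_line()
```

Printing `summary_tbl` shows one row per run, and printing `p` draws the plot. Keeping the runs in long form, with a `run` column rather than one column per run, is a design choice that lets `group_by()` and the `colour` aesthetic work without reshaping.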
Attribution-NonCommercial-NoDerivs 3.0 Ireland