Just tweet it - The collection, processing, classification and analysis of 2 million fitness tweets

Vickey, Theodore
In 2013, the World Health Organization coined the term “Globesity” to highlight the scale of the obesity epidemic and its impact as a major health problem in many parts of the world (World Health Organization, 2013). Many believe that technology has been a key driver of the decrease in physical activity and the increase in sedentary behaviors. However, the effective use of technology may be one of the keys to better health, fitness and wellness for the global population, using a device people already carry: a mobile phone. This research investigates a novel and scalable method of physical activity data collection through the use of Twitter. Publicly available fitness tweets can provide a wealth of demographic and activity information about Twitter users from around the world. Specifically, this dissertation reviews the use of five mobile fitness apps that allow participants to share their workouts from the app through Twitter. At the time of the data collection, this was the first known academic research in the world that combined Twitter, mobile fitness apps and physical activity. The underlying hypotheses and research questions explore whether researchers can learn enough to help decrease the incidence of physical inactivity at a population level. There were three research questions to be answered. The first was to determine whether an automated data collector could be created to accurately identify, from the Twitter stream, fitness tweets shared from a user’s mobile fitness app. If so, how can these fitness tweets be collected and processed, and what are the limitations in the data processing of these fitness tweets? The second was to determine whether an automated fitness tweet classification model could accurately quantify characteristics of physical activity such as, but not limited to, duration, type and intensity. If so, what is the process, and what are the limitations in the collection of physical-activity minutes using Twitter?
The third and final question was to determine whether additional demographic information about those who share their workouts online using Twitter could be generated to give further insight into the characteristics of the type of person who posts fitness tweets. Building on existing research tools and literature on data collection, analysis and classification from computer science research, a Fitness Tweet Collection Tool and a Fitness Tweet Classification Model were created to process and catalogue over 2 million fitness tweets. The tools were designed so that future health and wellness researchers, without an in-depth knowledge of programming or coding, could modify them to conduct their own Twitter and health research. One research project that used this data collection model was conducted by researchers at Harvard studying skin cancer. Four hypotheses are presented from the resulting fitness tweets: that as one’s online influence increases, the busier one will be and thus the less time one will have to exercise; that women are more expressive in their tweets and would tend to fitness tweet more; that people feel better when they exercise, and this will be reflected in their tweets; and that those who use mobile fitness apps are more interested in their physical activity and thus will report more minutes of activity than are self-reported through surveys. The findings of this research and the three data experiments conducted on the collected data have implications not only for future researchers but also for mobile fitness developers. This research demonstrates that it is possible to collect self-reported health information, in this case physical activity, from a diverse population at different levels of physical activity from around the world.
NUI Galway
Attribution-NonCommercial-NoDerivs 3.0 Ireland