Publication

Probabilistic metadata generation for places based on user data

Hegde, Vinod Kumar Gajanana
Abstract
In recent years, mobile devices such as smartphones and tablets have been widely adopted. This adoption is supported by numerous mobile and Web applications that help users consume and generate data on the go. Using these applications, users generate large volumes of data that represent real-time contextual information about them. Most current mobile and Web applications analyse user data, such as social interests and the physical presence of users at places, to deliver better services and a better user experience. However, studies have shown that spatial databases lack sufficient metadata for places because users are required to provide this information manually. Since this is time-consuming work, users rarely annotate places despite having knowledge about them. Automatically generating annotations for places by exploiting user-generated data from mobile and Web applications could overcome this lack of metadata. Rich metadata about places can be used by geospatial web services and location-based services to provide accurate results. Automatic generation of place metadata requires new, sophisticated data mining algorithms. This thesis focuses on unsolved questions regarding the use of physical presence and social data of users to generate metadata for places. Specifically, we have developed probabilistic models and text processing algorithms that generate short text snippets, or tags, for locations using the social interest profiles and check-ins of users at places. We have then studied how user presence data alone can be used to infer real-world events at those places. To this end, we discuss a probabilistic outlier detection model and an algorithm to detect unusually large crowds at places. Finally, we have defined and implemented an approach to generate tags by analysing textual data generated during events held at locations.
We have evaluated all the discussed models and algorithms with both synthetic and real-world data. Our experiments show that rich metadata for places can be derived by analysing user-generated data.
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland