Exploring advanced methodologies for the generation of synthetic data

Duignan, Samuel
In computer vision and machine learning, high-quality, diverse data sets are crucial. However, acquiring such data sets with detailed ground truth can be challenging. This problem has led to increased attention to synthetic data generation as a cost-effective and scalable solution. This research introduces a novel pipeline for generating synthetic data sets, leveraging both open-source and commercially available software. This approach democratizes synthetic data creation, which is often restricted to large organizations with extensive resources. The pipeline employs an innovative use of OpenPose, a tool for estimating facial landmarks, to fit an animatable mesh to each 3D scan. Refining and filtering the OpenPose predictions significantly improves the accuracy of mesh fitting, leading to the generation of more diverse synthetic data. Further, we use a custom rendering pipeline built on Blender, an open-source 3D creation suite, to simulate intricate lighting scenarios, improving the realism of the synthetic data. The data set is rich with variations in expression, lighting, eye gaze, and head pose that mirror the complexities of real-world situations. Alongside synthetic facial images, our pipeline generates a wealth of metadata, such as bounding boxes, expression/action values, and eye gaze and head pose values. This accurately labeled metadata serves as a rich source of ground truth, extending the utility of our data set to various computer vision and image analysis tasks.
NUI Galway
Publisher DOI
Attribution-NonCommercial-NoDerivs 3.0 Ireland