Delving into Synthetic Data: An Introduction

4 September 2023

Part of our Synthetic Data Workshop Series.

Join us for the inaugural workshop series at the University of Exeter, focusing on the evolving and intriguing domain of Synthetic Data. This series welcomes a diverse set of speakers and audience members from academia, industry, and governmental sectors.

Watch the workshop

Synthetic Data is an evolving field of data science, pushing the bounds of what is possible with AI and machine learning. This seminar introduces our series on synthetic data, with an overview of the topic, and a look at research currently underway.

We focus on use cases of synthetic data for data augmentation, and for privacy, looking at current and emerging methods for synthetic data creation and validation. Synthetic data for privacy enables private data to be shared anonymously while retaining key characteristics and statistical features. Augmenting datasets allows machine learning models to be trained on larger and broader datasets, addressing imbalances and minimising the requirement for real-data collection.
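As a rough illustration of augmentation for an imbalanced dataset, the sketch below creates new minority-class points by interpolating between existing ones, in the spirit of SMOTE (one of the methods on the agenda). It is not code from the workshop: the function name `smote_like_oversample`, its parameters, and the toy data are all hypothetical, and real projects would normally use an established library implementation.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority-class
    point and one of its k nearest minority-class neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class (assumed data, for illustration only).
minority_class = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority_class, n_new=6)
```

Because every synthetic point lies on a line segment between two real minority points, the new data stays inside the region the minority class already occupies. That keeps the example simple, but it also means this approach cannot add genuinely new variation.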

This is an opportunity for learning and engagement amongst professionals interested in, or working with, the science of synthetic data.

Speakers

Prof. Richard Everson – Professor of Machine Learning, University of Exeter

Prof. Everson’s research interests are in machine learning, statistical pattern recognition, multi-objective optimisation and the links between them. Particular current interests are in optimisation in wireless and mobile networks to maintain quality of service, in automatic analysis of video and accelerometer data for inferring behaviour of animals (funded by NERC and the Open Innovation Platform) and people (with the Royal Devon and Exeter Hospital), and in modelling big data storage systems (with the Met Office).

Andrew Kennedy – Graduate Research Assistant, University of Exeter

Andrew Kennedy is a researcher for the Defence Data Research Centre (DDRC), primarily focusing on the science and application of synthetic data within the UK’s defence sector.

Agenda

◦ What is synthetic data?
◦ Different data types – images, audio, other media, tabular, text
◦ Main use-cases – privacy, augmentation
◦ Data creation methods – geometric methods e.g. SMOTE, adversarial methods e.g. GANs, DDRC’s work with ImageGPT
◦ Validation methods
◦ Other case studies
◦ Alternatives to synthetic data