Keynote (by invitation only) at the Digital Twin Symposium 2019, presented by Amanda S. Barnard
A fundamental aim of materials research is to identify features of materials that can be tuned to control how the material performs under specific application conditions. The combination of computational materials science with machine learning provides a powerful way of relating structural features to functional properties, but combining these fundamentally different scientific approaches is not as straightforward as it seems. Machine learning methods were developed for large data sets with small numbers of consistent features. Typical materials data sets are small, with high dimensionality and high variance in the feature space, and suffer from numerous destructive biases. None of the established data science or machine learning methods in widespread use today were devised with materials data sets in mind, but there are ways to overcome these issues and apply the methods reliably. In this presentation, we will discuss the impact of domain-specific constraints on data-driven materials design, and explore the differences between materials simulation and materials informatics that can be leveraged for greater impact. We will review the differences between machine learning models and simulation models, and discuss feature engineering, dimension reduction, prediction and visualisation. We will conclude by discussing surrogate models and types of model-informed machine learning, and their potential role in a digital twin.