Data Analytics in Chemistry

Data Analytics in Chemistry (CHEM70012) gives an introduction to statistical learning, data analysis, visualisation, and research communication for chemistry. It is designed for Master's-level undergraduate students in the Department of Chemistry at Imperial College London.

The course will introduce how modern machine learning approaches can be applied to chemical datasets. The generation of chemical features is not discussed in detail. Instead, the focus is on the application of statistical models to extract relationships from data and build models that can be used for chemical discoveries.

This website is intended as a complement to the course lectures. It hosts the notebooks that will be used in the workshop sessions. Familiarity with Python is assumed. For a refresher, see the introductory chapters of Scientific Computing for Chemists with Python.

We welcome feedback and suggestions in the form of a pull request or issue on GitHub.

Learning Outcomes

At the end of this course, you will be able to:

  • Describe and contrast methods for learning from data

  • Extract relationships between chemical features and properties

  • Apply machine learning models for chemical problems

  • Understand the challenges of working with realistic data

  • Explain the process of communicating and justifying data analysis