• Funding: Artois
• Start year: 2023

Design of diffusion-based generative models for the discovery of new crystallographic structures

The discovery of innovative materials in the field of energy is crucial for economic and societal reasons. However, the trial-and-error strategies typically used to identify new materials are lengthy and tedious, since even slight modifications of the experimental conditions can have drastic effects on a material's properties. Combining recent innovations in computing with current approaches in chemistry (materials synthesis, modeling, and characterization) is an opportunity to revolutionize and accelerate the research and development of new materials. This approach, which integrates tools such as high-throughput screening and machine learning (ML) algorithms, should ultimately reduce the cost of materials discovery and therefore lower the price of finished products (solar panels, batteries, etc.).

Indeed, ML techniques in materials science can help improve the prediction and optimization of material properties and clarify the relationships between structure and properties [1-6]. For example, these techniques are used to predict the corrosion resistance of a material or to optimize the mechanical properties of an alloy. They can also be used to predict the electrical conductivity of a material from its chemical composition.

There are many generative models (VAEs, normalizing flows, GANs, etc.) for predicting the structures of molecular systems, but few exist for crystalline systems, and their effectiveness remains very limited. This is because, in addition to the chemical composition, the generative model must take into account the periodicity and geometric equivariance of the crystalline system. A denoising architecture has recently been proposed to address this aspect, but it has not yet been used to generate structures [7-8]. In this project, we propose to develop a generative algorithm based on this architecture.
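
To illustrate why periodicity matters, here is a minimal sketch of how distances between atoms can be computed under periodic boundary conditions. It is not taken from the architecture of [7-8]; the function names (`wrap_fractional`, `minimum_image_distance`) and the cubic example cell are assumptions made for illustration, and the rounding trick used here is only exact for orthogonal or mildly skewed cells.

```python
import numpy as np


def wrap_fractional(frac):
    """Map fractional coordinates back into the unit cell, i.e. into [0, 1)."""
    frac = np.asarray(frac, dtype=float)
    return frac - np.floor(frac)


def minimum_image_distance(frac_a, frac_b, lattice):
    """Distance between two sites, taking periodic images into account.

    frac_a, frac_b: fractional coordinates, shape (3,).
    lattice: 3x3 array whose rows are the lattice vectors (hypothetical layout).
    Assumption: the cell is orthogonal or only mildly skewed, so rounding the
    fractional difference selects the nearest periodic image.
    """
    delta = np.asarray(frac_b, dtype=float) - np.asarray(frac_a, dtype=float)
    delta = delta - np.round(delta)  # fold each component into [-0.5, 0.5]
    return float(np.linalg.norm(delta @ np.asarray(lattice, dtype=float)))


# Hypothetical cubic cell with a 4 Angstrom edge.
lattice = 4.0 * np.eye(3)
site_a = np.array([0.05, 0.0, 0.0])
site_b = np.array([0.95, 0.0, 0.0])

# The two sites are neighbours across the cell boundary: the periodic distance
# is about 0.4 Angstrom, whereas a naive Cartesian distance would give about 3.6.
print(minimum_image_distance(site_a, site_b, lattice))
```

A generative model that ignores this wrapping would treat the two sites above as far apart, which is one reason molecular models do not transfer directly to crystals.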

Recently, diffusion models have gained popularity. For example, image diffusion models such as DALL-E 2 have shown remarkable results compared to classical models such as GANs [9]. Diffusion models are generative models that create data similar to the data on which they are trained. During training, random noise is progressively added to ‘corrupt’ the training data, and the model learns to invert this noising process, recovering the original data by minimizing a reconstruction error. Once trained, the model can generate new data simply by passing randomly sampled noise through the learned denoising process.
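
The noising and denoising steps described above can be sketched as follows. This is a minimal illustration that assumes a standard Gaussian (DDPM-style) formulation with a linear noise schedule, which the text does not specify; the function names, the schedule values, and the use of a noise-prediction loss are all assumptions, and the neural network that would produce `eps_pred` is left out.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values) and cumulative signal fractions."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Forward process: corrupt clean data x0 up to step t in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps


def denoising_loss(eps_pred, eps):
    """Training objective: mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))


def reverse_step(xt, t, eps_pred, betas, alpha_bars, rng):
    """One denoising step from x_t to x_{t-1}, given the predicted noise."""
    alpha_t = 1.0 - betas[t]
    mean = (xt - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_pred) / np.sqrt(alpha_t)
    if t == 0:
        return mean  # no noise is added at the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)


rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()

# Toy "data": four points in 3D, standing in for atomic coordinates.
x0 = rng.standard_normal((4, 3))
xt, eps = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)

# A trained network would predict eps from (xt, t); here the true noise is used
# as a stand-in, so the loss is zero by construction.
print(denoising_loss(eps, eps))
```

Generation would start from pure noise at the last step and apply `reverse_step` repeatedly down to step 0. For crystals, the coordinates would additionally have to respect periodicity, which this sketch does not handle.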


  1. Ward, L., et al. “Accelerating the discovery of new materials with machine learning.” Nature 585 (2020): 773-778.
  2. Raccuglia, P., et al. “Machine learning predictions of crystalline structures and their stability.” Physical review letters 122 (2019): 105502.
  3. Choudhary, A., et al. “Machine learning for crystal structure prediction.” The Journal of Chemical Physics 149 (2018): 241722.
  4. Zhang, Y., et al. “Crystal structure prediction using machine learning.” Nature communications 8 (2017): 13915.
  5. De, S., et al. “Crystal structure prediction using a data-driven approach.” Scientific reports 6 (2016): 23607.
  6. Risi, C., Ramprasad, R., & Chen, W. “Machine learning in materials science.” Materials Today 33 (2020): 6-17.
  7. Klipfel, A., Bouraoui, Z., Frégier, Y. & Sayede, A. Equivariant Graph Neural Network for Crystalline Materials. STRL@IJCAI 2022.
  8. Klipfel, A., Peltre, O., Frégier, Y., Sayede, A. & Bouraoui, Z. Equivariant Message Passing Neural Network for Crystal Material Discovery. AAAI 2023.
  9. Dhariwal, P. & Nichol, A. Diffusion Models Beat GANs on Image Synthesis. NeurIPS 2021.