Funded Projects

ChemGenRNA

Full title: Chemistry-informed deep generative models for catalytic RNA

Involved researchers: Vaitea Opuu (ESPCI Paris - PSL), Guillaume Stirnemann (ENS - PSL)

Catalytic RNAs play essential roles in various biological processes. However, our limited understanding of their sequence-function relationship hampers the rational design of RNA-based molecular systems—systems that hold significant potential in synthetic biology, therapeutics, and biotechnology. Unlike proteins, for which extensive molecular structure data is available, catalytic RNAs suffer from a scarcity of detailed experimental data on their catalytic sites. Consequently, deep learning (DL) models that excel for proteins often underperform when applied to RNA design or functional prediction. Preliminary work demonstrated the potential of hybrid biophysics-DL approaches to address data scarcity but was limited to capturing RNA’s global shape, missing atomic-level details critical for catalysis. Using this as a starting point, the current project aims to integrate simulation, data science, and high-throughput experimental measurements to advance state-of-the-art RNA design by

  • employing modern DL algorithms that can learn from diverse experimental data sources
  • integrating chemistry-based model constraints to ensure the generated RNAs form catalytically sound molecular structures

Our ultimate goal is to overcome data limitations and enable the rational design of catalytic RNAs with enhanced activity and expanded sequence and structural diversity, followed by rigorous experimental validation. This approach will pave the way for engineering novel RNA-based catalysts with superior functionality, unlocking new opportunities in synthetic biology, therapeutics, and biotechnology.

------------------------------------------------------------------------------------------------------

Active-Multi-MOF

Full title: Active-MOF

Involved researchers: François-Xavier Coudert (Chimie ParisTech - PSL), Georges Mouchaham (ENS - PSL)

This project aims to develop a data-driven approach for optimizing the synthesis of multivariate metal-organic frameworks (MOFs) using active learning methods to drive high-throughput robotic synthesis. Traditional MOF discovery methods rely on serendipity, with computational screening unable to predict synthesizability or control synthesis conditions. By integrating experimental data acquisition with machine learning, this project aims to systematically explore the effect of synthesis parameters and help shorten the MOF discovery loop. Bayesian optimization will guide experiment selection, minimizing costs and maximizing insights. We will develop this methodology on a specific family of materials: multivariate MOFs. Mixed-ligand or mixed-metal MOFs can offer enhanced adsorption and catalytic activity over their parent compounds, but are difficult to characterize due to their structural complexity. The study is divided into three phases:

  • optimizing CALF-20 MOF synthesis
  • controlling polymorphism in mixed-linker CALF-20 MOFs
  • extending methods to heterometallic MOFs in the MIP-212 family for improved CO2 adsorption

This innovative integration of active learning and robotics offers a more systematic approach to materials discovery, focusing on efficiency in terms of both data and chemicals.

----------------------------------------------------------------------------------------------------

React-IR

Full title: Enhancing Data-Driven Chemical Process Optimization with Real-Time Monitoring Using React-IR

Involved researchers: Jean-François Soulé (Chimie ParisTech - PSL), Guillaume Lefèvre (Chimie ParisTech - PSL), Phannarath Phansavath (Chimie ParisTech - PSL), Virginie Vidal (Chimie ParisTech - PSL), Amandine Guérinot (ESPCI Paris - PSL), Benjamin Laroche (ESPCI Paris - PSL), Christophe Meyer (ESPCI Paris - PSL), Renaud Nicolaÿ (ESPCI Paris - PSL), Nathan Van Zee (ESPCI Paris - PSL), Laurence Grimaud (ENS - PSL), Maxime Vitale (ENS - PSL)

This infrastructure project seeks to acquire a React-IR system, a real-time infrared spectroscopy tool for continuous reaction monitoring. By providing high-resolution in situ kinetic data, it will improve the reproducibility and reliability of experimental results, which is essential for building the medium-sized datasets used in AI-assisted reaction optimization and mechanistic studies. Beyond its experimental applications, React-IR will be integrated into PSL’s data infrastructure, ensuring structured storage and accessibility of experimental data. The standardized datasets generated will contribute to the PSL Data Hub, creating a valuable resource for AI-driven reaction modeling and predictive analytics. This initiative will strengthen PSL’s leadership in data-driven chemistry, fostering collaboration between experimental chemists, data scientists, and AI researchers. The transportable equipment will serve multiple research groups across Chimie ParisTech-PSL, ESPCI Paris-PSL, and ENS-PSL, advancing research in organic synthesis, catalysis, and materials science.

------------------------------------------------------------------------------------------------------

Active-MOF

Full title: Active-MOF

Involved researchers: Georges Mouchaham (ENS - PSL), François-Xavier Coudert (Chimie ParisTech - PSL)

This infrastructure project seeks to upgrade our automated high-throughput synthesis robot to double its screening capacity. It supports the Active-Multi-MOF PhD project, which aims to develop a data-driven approach for optimizing the synthesis of multivariate metal-organic frameworks (MOFs) using active learning methods to drive high-throughput robotic synthesis. To date, the vast majority of MOF discovery methods rely on serendipity, with computational screening unable to predict synthesizability or control synthesis conditions. By integrating experimental data acquisition with machine learning, the project aims to systematically explore the effect of synthesis parameters and help shorten the MOF discovery loop. Bayesian optimization will guide experiment selection, minimizing costs and maximizing insights. We will develop this methodology on a specific family of materials: multivariate MOFs. Mixed-ligand or mixed-metal MOFs can offer enhanced adsorption, separation, or catalytic activity over their parent compounds, but are difficult to characterize due to their structural complexity and/or the need to apply careful activation processes. This innovative integration of active learning and robotics offers a more systematic approach to materials discovery, focusing on efficiency in terms of both data and chemicals.