Turning Greenhouse Gases into Useful Chemicals: Catalyst Design via AI and High-Throughput Calculations


CO2 is a critical pollutant of the atmosphere with noticeable climatic consequences. However, routes already exist to convert CO2 into useful chemicals or fuels. The key is catalysis: the acceleration of desired chemical reactions by special materials (catalysts).

Using recently developed AI methods, NOMAD researchers have identified the materials genes important for CO2 activation and rules for finding improved or novel catalyst materials.

Because of its chemical stability, CO2 is presently one of the critical man-made greenhouse gases. However, CO2 may well become a raw material for producing fuels and valuable chemicals. Figure 1 displays a reaction network that starts, for example, from CO2 and shows the catalytic conversion to methane (usable for combustion engines and heating) and other important chemicals. All of this is already possible today, but the process is very inefficient. We need better catalysts.


The NOMAD Laboratory of the Fritz Haber Institute developed and advanced artificial intelligence (AI) methods [2] that identify the basic materials parameters that correlate with materials properties and functions of interest (here, the activation of CO2). These parameters are also called materials genes because, much like genes in biology, they correlate with mechanisms that may trigger, facilitate, or even hinder the processes at play, depending on their combination. In the coordinate system of these genes, regions are identified where good catalytic materials can be found. Specifically, this CO2 study employed the AI method called “subgroup discovery” [3], and it focused on the wide class of metal oxides. Catalysis happens at surfaces. Thus, the study also considered various surfaces of all these materials. Altogether, 141 different surfaces of 71 different materials were calculated with state-of-the-art density-functional-theory high-throughput methodology. The results were then used for training the AI.

In general terms, this study also represents a conceptual change in the modeling of heterogeneous catalysis (and other materials functions). In the past, attempts were made to calculate the full catalytic process [1]. However, it became clear that the many aspects that govern heterogeneous catalysis, e.g. the dynamical restructuring of the surface under reaction conditions, are too intricate for a full theoretical description to make sense. Thus, a combined approach linking high-throughput calculations, AI, and experimental results appears more appropriate and was suggested in this study.

The idea of CO2 activation is well established. In the gas phase, it is achieved by charging the molecule. This weakens the C-O bonds, which shows up as a change of geometry from linear to bent (see top of Fig. 2). The bent molecule will then readily react. Analogously, for heterogeneous catalysis it was assumed that surfaces that bind CO2 in a bent geometry may be good catalysts, while other surfaces need not be considered. The new study took a more general perspective and considered five different properties (indicators) that signal a weakening of the chemical bond between C and O in adsorbed CO2. The O-C-O angle was one of the five; here we mention only one other indicator of bond weakening, namely the C-O bond length. More can be found in the respective publication [5].
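
The two indicators named above are simple geometric quantities. The following minimal sketch shows how they could be evaluated from atomic positions. The coordinates are illustrative values chosen for this example and are not taken from the study, and the function names are hypothetical.

```python
import math

def bond_length(a, b):
    """Euclidean distance between two atomic positions (in angstrom)."""
    return math.dist(a, b)

def oco_angle(c, o1, o2):
    """O-C-O angle in degrees, with the carbon atom at the vertex."""
    v1 = [p - q for p, q in zip(o1, c)]
    v2 = [p - q for p, q in zip(o2, c)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp against rounding errors before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

# Illustrative geometries (assumed values, not results from the study).
linear = {"C": (0.0, 0.0, 0.0), "O1": (-1.16, 0.0, 0.0), "O2": (1.16, 0.0, 0.0)}
bent = {"C": (0.0, 0.0, 0.0), "O1": (-1.05, 0.65, 0.0), "O2": (1.05, 0.65, 0.0)}

for label, mol in (("linear", linear), ("bent", bent)):
    angle = oco_angle(mol["C"], mol["O1"], mol["O2"])
    length = bond_length(mol["C"], mol["O1"])
    print(f"{label}: O-C-O angle = {angle:.1f} deg, C-O length = {length:.3f} A")
```

In this toy example the bent geometry has both a smaller O-C-O angle and a longer C-O bond than the linear one, which is the kind of signal the indicators are meant to capture.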

Our AI method of choice, subgroup discovery, finds the coordinate systems in which subgroups of materials with pronounced bond weakening live, i.e., it identifies the important parameters (or materials genes). For this purpose we considered 46 different features as potential genes. These are basic physical parameters of the free atoms composing the material and of the pristine surfaces. They don’t carry explicit information about the adsorption or the catalytic processes. In the end, subgroup discovery automatically selects the relevant genes based on the data. Put simply, essentially every feature that might be relevant was offered.
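
To illustrate the general idea of subgroup discovery, here is a deliberately simplified sketch. It is not the algorithm or the quality function used in the study. It searches exhaustively over conjunctions of threshold conditions and scores each subgroup by its coverage times the reduction of the mean target value. The feature names (feature_a, feature_b), the toy data, and all function names are assumptions made for this example.

```python
from itertools import combinations

def candidate_selectors(rows, features):
    """Build threshold conditions (feature <= t, feature > t) from observed values."""
    selectors = []
    for f in features:
        values = sorted({r[f] for r in rows})
        # Midpoints between consecutive distinct values serve as cut points.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            selectors.append((f, "<=", t))
            selectors.append((f, ">", t))
    return selectors

def matches(row, selector):
    f, op, t = selector
    return row[f] <= t if op == "<=" else row[f] > t

def quality(rows, subgroup, target):
    """Coverage times the reduction of the mean target relative to all data."""
    if not subgroup:
        return 0.0
    mean_all = sum(r[target] for r in rows) / len(rows)
    mean_sub = sum(r[target] for r in subgroup) / len(subgroup)
    return (len(subgroup) / len(rows)) * (mean_all - mean_sub)

def discover_subgroup(rows, features, target, max_conditions=2):
    """Exhaustively search conjunctions of up to max_conditions selectors."""
    selectors = candidate_selectors(rows, features)
    best = (0.0, ())
    for k in range(1, max_conditions + 1):
        for combo in combinations(selectors, k):
            subgroup = [r for r in rows if all(matches(r, s) for s in combo)]
            q = quality(rows, subgroup, target)
            if q > best[0]:  # strict comparison keeps the simplest rule on ties
                best = (q, combo)
    return best

# Toy data (assumed): two hypothetical features and the O-C-O angle in degrees.
rows = [
    {"feature_a": 1.0, "feature_b": 5.0, "oco_angle": 120},
    {"feature_a": 1.2, "feature_b": 2.0, "oco_angle": 125},
    {"feature_a": 1.4, "feature_b": 4.0, "oco_angle": 122},
    {"feature_a": 3.0, "feature_b": 3.0, "oco_angle": 178},
    {"feature_a": 3.5, "feature_b": 1.0, "oco_angle": 180},
    {"feature_a": 4.0, "feature_b": 6.0, "oco_angle": 179},
]

best_quality, best_rule = discover_subgroup(rows, ["feature_a", "feature_b"], "oco_angle")
print(best_rule, round(best_quality, 2))
```

On this toy data the search selects a single condition on feature_a and ignores feature_b, which mirrors how the relevant genes are picked from a larger pool of offered features. An exhaustive search is only feasible for a handful of features; with 46 candidate features a more efficient search strategy would be needed.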

Figure 3 shows the results for all 255 calculated data points, displayed in terms of the O-C-O angle and the C-O bond length. The blue data points belong to the “small O-C-O angle” subgroup, and the green data points belong to the “large C-O-bond length” subgroup. The black points belong to neither. In the coordinate system of this plot, the blue and the green dots are scattered. They don’t look like groups. Clearly, the group character is defined not by these coordinates but by the identified materials genes (see Ref. [5] for details).

As noted above, evaluating the catalytic efficiency under realistic conditions numerically and reliably is hardly possible. Too many different processes are at play. Thus, after identifying the two subgroups (blue and green), the study considered experimental information. It turns out that most known materials with good catalytic performance belong to these two groups. Thus, when searching for a good catalyst, one should look at the region of materials space that is defined by these two subgroups. This reduces the search space significantly and improves the search process.
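
The screening step described above can be sketched as a simple filter: a candidate is kept if it satisfies the rule of at least one subgroup. The rules, feature names, thresholds, and materials below are placeholders invented for this example and do not reproduce the genes or thresholds identified in the study.

```python
import operator

OPS = {"<=": operator.le, ">": operator.gt}

def in_subgroup(material, rule):
    """True if the material satisfies every condition of the rule."""
    return all(OPS[op](material[feature], threshold) for feature, op, threshold in rule)

# Hypothetical rules with placeholder features and thresholds.
small_angle_rule = [("feature_a", "<=", 2.2)]
long_bond_rule = [("feature_b", ">", 4.5), ("feature_c", "<=", 0.3)]

# Hypothetical candidate materials described by the same placeholder features.
candidates = [
    {"name": "material_1", "feature_a": 1.1, "feature_b": 3.0, "feature_c": 0.5},
    {"name": "material_2", "feature_a": 3.4, "feature_b": 5.0, "feature_c": 0.2},
    {"name": "material_3", "feature_a": 3.0, "feature_b": 5.2, "feature_c": 0.9},
    {"name": "material_4", "feature_a": 2.9, "feature_b": 1.0, "feature_c": 0.1},
]

# Keep every candidate that falls into at least one of the two subgroups.
shortlist = [
    m["name"]
    for m in candidates
    if in_subgroup(m, small_angle_rule) or in_subgroup(m, long_bond_rule)
]
print(shortlist)
```

Because the rules depend only on properties of the free atoms and pristine surfaces, such a filter could in principle be applied to materials for which no adsorption calculation has been performed yet.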

1.     H.-J. Freund, G. Meijer, M. Scheffler, R. Schlögl, and M. Wolf, CO oxidation as a prototypical reaction for heterogeneous processes. Angew. Chem. Int. Ed. 50, 10064 (2011).
2.     C. Draxl and M. Scheffler, Big-Data-Driven Materials Science and its FAIR Data Infrastructure. Plenary Chapter in Handbook of Materials Modeling (eds. S. Yip and W. Andreoni), Springer (2020).
3.     B.R. Goldsmith, M. Boley, J. Vreeken, M. Scheffler, and L.M. Ghiringhelli, Uncovering structure-property relationships of materials by subgroup discovery. New J. Phys. 19, 013031 (2017).
4.     H.-J. Freund, M.W. Roberts, Surface chemistry of carbon dioxide. Surface Science Reports, Volume 25, 225 (1996).
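5.     A. Mazheika, Y. Wang, R. Valero, F. Viñes, F. Illas, L.M. Ghiringhelli, S.V. Levchenko, and M. Scheffler, Artificial-intelligence-driven discovery of catalyst “genes” with application to CO2 activation on semiconductor oxides. Nature Communications 13, 419 (2022).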

The study was published in Nature Communications in January 2022.

A. Mazheika, Y. Wang, R. Valero, F. Viñes, F. Illas, L. Ghiringhelli, S. Levchenko, and M. Scheffler (2021)
Artificial-intelligence-driven discovery of catalyst “genes” with application to CO2 activation on semiconductor oxides
Nature Communications 13, 419 (2022)