10.07.2015 Views

Catalysis of Organic..

Catalysis of Organic..

Catalysis of Organic..

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Rothenberg et al. 265pre-define the weight <strong>of</strong> each figure <strong>of</strong> merit. For example, if you know beforehandthat the catalytic activity is the most important parameter, you can assign a heavierweight to the TOF. In this way, the computer searches for the most suitable catalyst.To demonstrate this concept, let us consider the first optimization cycle for thepalladium-catalyzed Heck reaction in the presence <strong>of</strong> bidentate ligands. Theliterature shows us some promising leads (14-18), with yields > 95% and TOFs >1000, but much <strong>of</strong> the catalyst space is unexplored territory.In this example, we will assume that each catalyst consists <strong>of</strong> one Pd atom andone bidentate ligand. The ligand includes two ligating groups L 1 and L 2 , a backbonegroup B, and three residue groups R 1 , R 2 , and R 3 . To simplify, we will limit theligating groups to 1–14, the backbone groups to 15–21, and the residue groups to 22–29 (Figure 3). Further, we will constrain the R groups to one per ligating or backbonegroup. There is no restriction on group similarity, i.e. it is possible that L 1 ≡ L 2 andso forth. Each ligand has a unique {L 1 (R 1 )-B(R 2 )-L 2 (R 3 )} identifier. The connectionpoints for the R groups and between the L and B groups are predefined for eachbuilding block (for example, the tetrahydrothiophene ligating group 9 connects to thePd via the S atom, and to the backbone and the residue group on positions 2, 3, or 4).The total number <strong>of</strong> ligand-Pd complexes one can assemble from the above 29building blocks (connecting only via the specified connection points and limitingourselves to the L 1 (R 1 )-B(R 2 )-L 2 (R 3 ) form) is 2.61 × 10 17 . This is a huge number,well beyond the combined synthetic capabilities <strong>of</strong> all <strong>of</strong> the laboratories in theworld. Note that these building blocks were chosen specifically for this example,mimicking some <strong>of</strong> the ligand types in the following dataset to enable goodintrapolation. The precise relationship between building block structure andconnectivity and the resulting catalyst diversity is very complicated and will bediscussed in a separate paper (19).To simplify things, we will show here only one iteration (the meta-modelling isnot necessary for demonstrating the two-stage screening). As a starting point, weassemble a dataset containing 253 published Heck reactions performed using 58different catalysts and/or under different reaction conditions (Table 1 shows a partialrepresentation <strong>of</strong> this dataset). For each reaction, we include the substrates, catalyst,reaction conditions, and three figures <strong>of</strong> merit: Product yield, TON, and TOF. Wethen calculate a set <strong>of</strong> thirty-one 2D descriptors (20). This gives a 253 × 31 matrix.As we showed earlier, both linear regression models (such as partial least squares,PLS), and nonlinear ones (e.g. artificial neural networks) can be used (10). In thiscase, however, there are not enough initial data to ‘feed’ a neural network, so we usea PLS model, which is also more robust (21). This model is used for correlating the2D descriptors and the reaction conditions (temperature, Pd concentration, andsolvent) with the above three figures <strong>of</strong> merit.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!