Computational Chemistry

Operations Research in Molecular Science

From AI-guided retrosynthesis planning that discovers drug synthesis routes in seconds, to Bayesian optimisation that intelligently navigates vast catalyst design spaces, to molecular docking that screens millions of drug candidates computationally — three operations research problems accelerating drug discovery, materials design, and the quest for novel catalysts.

Science Context

Synthesising a new drug molecule typically requires 5–12 reaction steps, each chosen from thousands of known transformations. Expert chemists can spend weeks planning viable routes. Computer-aided retrosynthesis tools like AiZynthFinder can typically find a synthesis route in under 10 seconds by searching backwards from the target molecule to commercially available building blocks, using neural network policies to guide the search through an enormous AND-OR tree of possible disconnections.

Problem type: AND-OR tree search with learned heuristics. Decompose a target molecule into purchasable building blocks by applying reaction templates in reverse. Monte Carlo Tree Search (MCTS) with neural network expansion policies explores the combinatorial space efficiently.
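The AND-OR structure can be illustrated with a small recursive check. This is a toy sketch, not AiZynthFinder's implementation: the molecule names, the `TEMPLATES` table and the `STOCK` set are hypothetical placeholders, and plain depth-first search stands in for the learned policy. A molecule (OR node) is solved if it is in stock or if any one of its reactions is solved; a reaction (AND node) is solved only if every precursor is solved.

```python
# Toy AND-OR retrosynthesis check. All names below are hypothetical placeholders.
STOCK = {"A", "B", "C"}  # purchasable building blocks

# molecule -> list of candidate reactions; each reaction is a tuple of precursors
TEMPLATES = {
    "target": [("X", "Y"), ("Z",)],
    "X": [("A", "B")],
    "Y": [("unavailable",)],
    "Z": [("A", "C")],
}

def solve(mol, depth=0, max_depth=5):
    """Return a route as a nested dict, or None if no route exists within max_depth."""
    if mol in STOCK:
        return {"mol": mol, "buy": True}
    if depth >= max_depth:
        return None
    for precursors in TEMPLATES.get(mol, []):       # OR: any one reaction suffices
        children = [solve(p, depth + 1, max_depth) for p in precursors]
        if all(c is not None for c in children):    # AND: every precursor must be solved
            return {"mol": mol, "precursors": children}
    return None

route = solve("target")
```

Here the first disconnection of `target` fails because `Y` cannot be traced back to stock, so the search falls through to the second disconnection via `Z`.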

AND-OR Tree & MCTS Formulation

// AND-OR tree: OR nodes = molecules, AND nodes = reactions
// Goal: find a subtree rooted at the target whose leaves are all stock molecules

// MCTS iteration (Q = cumulative reward, N = visit count, c = exploration constant):
1. Select: UCB1 = Q/N + c√(ln N_parent / N)
2. Expand: apply reaction templates proposed by policy_nn(mol)
3. Simulate: rollout to stock or max depth
4. Backprop: update Q, N along path
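The selection and backpropagation steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Node` class, the route names and the reward values are hypothetical, and the expansion and rollout steps (which would call the policy network) are left out.

```python
import math

class Node:
    """Minimal search-tree node; a hypothetical structure for illustration."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.Q = 0.0  # cumulative reward
        self.N = 0    # visit count

def ucb1(child, c=1.4):
    """UCB1 = Q/N + c * sqrt(ln N_parent / N); unvisited children are tried first."""
    if child.N == 0:
        return math.inf
    return child.Q / child.N + c * math.sqrt(math.log(child.parent.N) / child.N)

def select(node, c=1.4):
    """Pick the child with the highest UCB1 score."""
    return max(node.children, key=lambda ch: ucb1(ch, c))

def backpropagate(node, reward):
    """Update Q and N on every node from the leaf back up to the root."""
    while node is not None:
        node.N += 1
        node.Q += reward
        node = node.parent

root = Node("target")
a, b = Node("route_a", root), Node("route_b", root)
root.children = [a, b]
backpropagate(a, 1.0)  # route_a: one successful rollout
backpropagate(a, 0.0)  # route_a: one failed rollout
backpropagate(b, 0.0)  # route_b: one failed rollout
```

With these visit counts, a moderate exploration constant favours `route_a` (higher mean reward), while a large one such as `c=3.0` favours the less-visited `route_b`.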

Retrosynthesis Solver

Target molecules: Aspirin (3 colours), Ibuprofen (4 colours), Caffeine (5 colours), Paracetamol (3 colours), Penicillin (6 colours)

Algorithms: BFS, DFS, Greedy, MCTS (10 rollouts)

Evidence Base

Corey, E. J. & Wipke, W. T. (1969). Computer-assisted design of complex organic syntheses. Science, 166(3902), 178–192. (Status: Published)
Genheden, S., et al. (2020). AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. Journal of Cheminformatics, 12, 70. (Status: Operational)

Science Context

Discovering a new catalyst or reaction condition often requires expensive wet-lab experiments, each costing hours of time and thousands of dollars. The design space is enormous — temperature, pressure, solvent, ligand, concentration. Bayesian Optimisation (BO) fits a Gaussian Process surrogate to observed data and uses an acquisition function to intelligently decide which experiment to run next, balancing exploitation of known good regions with exploration of uncertain areas.

Problem type: Sequential black-box optimisation. Fit a Gaussian Process (GP) surrogate model to observed experiment outcomes, then optimise an acquisition function (Expected Improvement or Upper Confidence Bound) to select the next experiment, iterating until the budget is exhausted.
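The surrogate step can be made concrete with a small Gaussian Process posterior in plain Python. This is a sketch under stated assumptions rather than a production model: a zero prior mean, a one-dimensional input, an RBF kernel with a hypothetical length scale of 0.3, and a tiny noise term for numerical stability. The helper names `rbf`, `solve_linear` and `gp_posterior` are illustrative.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel with unit signal variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation at x_new given observations (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    alpha = solve_linear(K, ys)                   # K^-1 y
    mu = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_linear(K, k)                        # K^-1 k
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v))
    return mu, math.sqrt(max(var, 0.0))

# Hypothetical observations: normalised reaction condition -> yield
xs, ys = [0.1, 0.5, 0.9], [0.2, 0.8, 0.4]
mu_seen, sd_seen = gp_posterior(xs, ys, 0.5)   # at an observed point
mu_far, sd_far = gp_posterior(xs, ys, 5.0)     # far from all data
```

At an observed point the posterior mean reproduces the measurement and the uncertainty collapses; far from the data the prediction reverts to the prior (mean 0, standard deviation 1). The acquisition function exploits exactly this contrast.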

Bayesian Optimisation Loop

for t = 1, 2, ..., T:
  1. Fit GP to {(x_i, y_i)}_{i=1..t-1}
  2. x_t = argmax_x α(x) // acquisition function
  3. y_t = f(x_t) + ε // run experiment
  4. Update dataset with (x_t, y_t)

// EI(x) = E[max(f(x) - f_best, 0)]
// UCB(x) = μ(x) + κσ(x)
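For a Gaussian posterior with mean μ(x) and standard deviation σ(x), the EI expectation has the standard closed form (μ - f_best)Φ(z) + σφ(z) with z = (μ - f_best)/σ. The sketch below evaluates both acquisition functions for a maximisation problem. The candidate names and their posterior values are hypothetical, and κ = 2 is an arbitrary illustrative choice.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI(x) = E[max(f(x) - f_best, 0)] for a Gaussian posterior."""
    if sigma <= 0.0:
        return max(mu - f_best, 0.0)
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm_cdf(z) + sigma * norm_pdf(z)

def upper_confidence_bound(mu, sigma, kappa=2.0):
    """UCB(x) = mu(x) + kappa * sigma(x)."""
    return mu + kappa * sigma

# Hypothetical posterior (mean, std) of the yield at three candidate conditions
candidates = {"cond_1": (0.80, 0.01), "cond_2": (0.70, 0.20), "cond_3": (0.50, 0.05)}
f_best = 0.78  # best yield observed so far

next_ei = max(candidates, key=lambda name: expected_improvement(*candidates[name], f_best))
next_ucb = max(candidates, key=lambda name: upper_confidence_bound(*candidates[name]))
```

Both rules pick `cond_2` here: its mean is below the incumbent, but its large uncertainty leaves more room for improvement than the near-certain small gain at `cond_1`.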

Bayesian Optimisation Solver

Acquisition strategies: EI (Expected Improvement), UCB (Upper Confidence Bound), Random

Evidence Base

Shields, B. J., et al. (2021). Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590, 89–96. (Status: Published)
Goldsmith, B. R., et al. (2018). Machine learning for heterogeneous catalyst design and discovery. AIChE Journal, 64(7), 2311–2323. (Status: Published)
Häse, F., Roch, L. M., Kreisbeck, C., & Aspuru-Guzik, A. (2018). Phoenics: A Bayesian optimizer for chemistry. ACS Central Science, 4(9), 1134–1145. (Status: Published)

Science Context

Virtual screening allows researchers to evaluate millions of drug candidates computationally rather than through expensive and time-consuming wet-lab assays. Molecular docking predicts how a small molecule (ligand) binds to a protein target by searching over torsion angles and translational/rotational degrees of freedom to find the lowest-energy pose. Tools like AutoDock Vina explore the conformational space with an iterated local search: a random perturbation of the pose is followed by gradient-based (BFGS) local optimisation, and each resulting step is accepted or rejected by a Metropolis criterion.

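Problem type: Continuous optimisation over torsion angles. Minimise the estimated binding energy of a ligand in a protein binding site by searching over rotatable bond angles, subject to steric and electrostatic constraints. In this simplified formulation the pose is described by the torsion angles alone; a full docking search also varies the ligand's position and orientation.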
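A coarse grid search followed by gradient refinement from the best grid point can be sketched on a two-torsion toy problem. Everything here is an assumption for illustration: `toy_energy` is a hypothetical smooth periodic surface in arbitrary units, not a docking scoring function, and the grid resolution, step size and iteration count are arbitrary.

```python
import math

TWO_PI = 2.0 * math.pi

def toy_energy(t1, t2):
    """Hypothetical periodic energy surface with several local minima."""
    return (-math.cos(t1 - 1.0) - math.cos(t2 - 2.5)
            + 0.2 * math.cos(3.0 * t1) * math.cos(3.0 * t2))

def grid_search(n=12):
    """Evaluate an n x n grid over [0, 2*pi)^2 and return (energy, t1, t2) of the best point."""
    step = TWO_PI / n
    return min((toy_energy(i * step, j * step), i * step, j * step)
               for i in range(n) for j in range(n))

def refine(t1, t2, steps=200, lr=0.05, h=1e-5):
    """Gradient descent with central finite differences; returns the best point visited."""
    best = (toy_energy(t1, t2), t1, t2)
    for _ in range(steps):
        g1 = (toy_energy(t1 + h, t2) - toy_energy(t1 - h, t2)) / (2.0 * h)
        g2 = (toy_energy(t1, t2 + h) - toy_energy(t1, t2 - h)) / (2.0 * h)
        t1 = (t1 - lr * g1) % TWO_PI  # angles wrap around
        t2 = (t2 - lr * g2) % TWO_PI
        best = min(best, (toy_energy(t1, t2), t1, t2))
    return best

e_grid, g1_start, g2_start = grid_search()
e_refined, t1_opt, t2_opt = refine(g1_start, g2_start)
```

The grid guards against starting the local search in a poor basin, and the gradient step recovers the precision that a 30° grid spacing cannot provide. Local refinement alone would only find the nearest minimum.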

Docking Energy Formulation

min E_binding(pose) = E_vdw + E_elec + E_hbond + E_desolv

where pose = (θ₁, θ₂, ..., θ_n)
θ_i ∈ [0, 2π] // rotatable bond angles

// E_vdw: van der Waals (Lennard-Jones 12-6)
// E_elec: Coulombic electrostatics
// E_hbond: hydrogen bonding
// E_desolv: desolvation penalty
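The first two terms can be written out as pairwise sums over ligand and protein atoms. The sketch below covers only E_vdw and E_elec; hydrogen bonding and desolvation are omitted. The well depth, contact distance, charges and coordinates are hypothetical values chosen for illustration, and the Coulomb constant is the usual conversion factor for energies in kcal/mol with distances in Å and charges in units of e.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), standard molecular-mechanics conversion factor

def lennard_jones(r, epsilon, r_min):
    """12-6 potential with well depth epsilon at separation r_min."""
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)

def coulomb(r, q1, q2, dielectric=1.0):
    """Coulomb energy between point charges q1 and q2 at distance r."""
    return COULOMB_K * q1 * q2 / (dielectric * r)

def pair_energy(ligand, protein, epsilon=0.15, r_min=3.8):
    """Sum E_vdw + E_elec over all ligand-protein atom pairs.

    Each atom is ((x, y, z), partial_charge); epsilon and r_min are
    hypothetical uniform parameters rather than per-atom-type values.
    """
    total = 0.0
    for pos_l, q_l in ligand:
        for pos_p, q_p in protein:
            r = math.dist(pos_l, pos_p)
            total += lennard_jones(r, epsilon, r_min) + coulomb(r, q_l, q_p)
    return total

# One ligand atom and one protein atom at the contact distance, with opposite partial charges
ligand = [((0.0, 0.0, 0.0), -0.5)]
protein = [((3.8, 0.0, 0.0), 0.5)]
energy = pair_energy(ligand, protein)
```

At the contact distance the Lennard-Jones term sits at its minimum of -epsilon, while pushing the atoms closer makes the r⁻¹² repulsion dominate. That repulsion is how the steric constraint enters the objective.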

Molecular Docking Solver

Torsion angles: θ₁ = 90°, θ₂ = 180°

Search strategies: Random 500, Grid Search, Gradient from Best Grid

Evidence Base

Trott, O. & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2), 455–461. (Status: Operational)
Friesner, R. A., et al. (2004). Glide: a new approach for rapid, accurate docking and scoring. Journal of Medicinal Chemistry, 47(7), 1739–1749. (Status: Operational)
