DFT, Semi-empirical Methods & Tight-binding Models¶

Quick-start decision tree¶

Use this decision tree when selecting a QC method for geometry optimization and/or fast energetics.

1) What is the goal?¶

High-throughput geometry screening (hundreds–millions of structures)?

Start with GFN2-xTB (or g-xTB) for optimizations; use CREST for conformers if needed. If the system is extremely large (1000+ atoms), consider GFN-FF for initial geometries.

Production-quality energies/structures for small/medium molecules (< ~50 atoms)?

Use DFT + dispersion with a production basis (e.g., ωB97X-D/def2-TZVP or r2SCAN-D4/def2-TZVP).

Noncovalent interactions (binding energies, conformers, host–guest)?

Prefer a method with robust dispersion handling (e.g., ωB97M-V for smaller systems; GFN2-xTB for larger ones).

Reaction barriers / transition states?

Start with a fast method (e.g., g-xTB) to get a reasonable guess, then refine with hybrid DFT + dispersion (e.g., PBE0-D3(BJ) or ωB97X-D).

2) Are there red flags?¶

Transition metals / unusual coordination?

Be cautious with semi-empirical methods (PM6/PM7) and validate with DFT; false coordination can occur. (https://pubs.acs.org/doi/10.1021/acs.jctc.8b00018) Consider DFT with dispersion as the "sanity check" even if xTB is used for screening.

Highly charged species / charge transfer / polarons?

Expect larger errors from approximate DFT due to delocalization/self-interaction issues; prefer range-separated hybrids when charge transfer is important. (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.100.146401 , https://pubs.acs.org/doi/10.1021/acs.jctc.1c01307)

3) Minimum checklist¶

If using most GGAs/hybrids, include a dispersion model (D3(BJ) or D4), or use a functional with built-in nonlocal dispersion (e.g., "-V" functionals).
Use at least triple-zeta quality for final DFT energies where feasible (e.g., def2-TZVP).
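
The decision tree above can be sketched as a small lookup function. This is only an illustration: the function name, the goal labels, and the 50-atom cut-off used for the noncovalent branch are assumptions made for the example, and the fallback for production work on larger systems is not part of the tree itself.

```python
def suggest_method(goal: str, n_atoms: int) -> str:
    """Return a starting-point suggestion following the quick-start tree.

    `goal` is one of "screening", "production", "noncovalent", "barrier".
    These labels are hypothetical and only used in this sketch.
    """
    if goal == "screening":
        # Extremely large systems start from the force field.
        if n_atoms >= 1000:
            return "GFN-FF for initial geometries, then GFN2-xTB"
        return "GFN2-xTB (CREST for conformers if needed)"
    if goal == "production":
        if n_atoms < 50:
            return "wB97X-D/def2-TZVP or r2SCAN-D4/def2-TZVP"
        # Assumption: the tree gives no rule here, so fall back to screening + refinement.
        return "GFN2-xTB geometries, then DFT + dispersion on a subset"
    if goal == "noncovalent":
        # Assumption: 50 atoms separates 'smaller' from 'larger' systems.
        return "wB97M-V" if n_atoms < 50 else "GFN2-xTB"
    if goal == "barrier":
        return "g-xTB guess, then PBE0-D3(BJ) or wB97X-D refinement"
    raise ValueError(f"unknown goal: {goal!r}")


print(suggest_method("screening", 5000))
print(suggest_method("barrier", 40))
```

Whatever the function returns, the red flags in step 2 still apply: transition metals and highly charged species call for a DFT cross-check.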

DFT fundamentals: from Hohenberg-Kohn and Kohn-Sham to practical calculations¶

Density Functional Theory represents a paradigm shift from traditional wavefunction methods by expressing all ground-state properties through the electron density \(n(\mathbf{r})\), a function of only 3 spatial coordinates rather than \(3N\) coordinates for \(N\) electrons.

The theoretical foundation rests on two Hohenberg-Kohn theorems: the first establishes that the external potential (and hence total energy) is a unique functional of electron density, while the second proves that the correct ground-state density minimizes the energy functional.

The Kohn-Sham framework makes DFT computationally tractable by mapping the intractable many-body problem onto a fictitious system of non-interacting electrons moving in an effective potential that reproduces the real ground-state density.

The Kohn-Sham equation takes the familiar eigenvalue form:

\[\left[-\frac{\hbar^2}{2m} \nabla^2 + v_{\text{eff}}(\mathbf{r})\right] \varphi_i(\mathbf{r}) = \varepsilon_i \varphi_i(\mathbf{r})\]

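where the effective potential \(v_{\text{eff}}(\mathbf{r}) = v_{\text{ext}}(\mathbf{r}) + v_{\text{H}}(\mathbf{r}) + v_{\text{xc}}(\mathbf{r})\) is the sum of the external (nuclear), Hartree, and exchange-correlation potentials, and the many-body physics is contained in \(v_{\text{xc}}(\mathbf{r})\).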

The self-consistent field (SCF) procedure iteratively solves these equations: an initial density guess generates an effective potential, which yields new orbitals, which produce a new density—repeating until convergence.
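
The SCF cycle can be illustrated with a deliberately tiny model. The sketch below is not a DFT code: it assumes a two-site, two-electron mean-field model (on-site energies `e1`, `e2`, hopping `t`, repulsion `u`, all hypothetical parameters) in which the effective Hamiltonian depends on the density, and it uses simple linear density mixing to reach self-consistency.

```python
import math


def lowest_orbital_density(a: float, d: float, b: float) -> float:
    """Site-1 population (two electrons) of the lowest eigenvector of [[a, b], [b, d]].

    Requires b != 0 so that the eigenvector (b, lam - a) is well defined.
    """
    lam = 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    x, y = b, lam - a  # unnormalized eigenvector for eigenvalue lam
    return 2.0 * x * x / (x * x + y * y)


def scf_two_site(e1=0.0, e2=1.0, t=1.0, u=2.0, mix=0.5, tol=1e-10, max_iter=200):
    """Iterate density -> effective Hamiltonian -> orbitals -> density until stable."""
    n1 = 1.0  # initial guess: density spread evenly over both sites
    for iteration in range(1, max_iter + 1):
        # Effective (mean-field) on-site energies built from the current density.
        a = e1 + 0.5 * u * n1
        d = e2 + 0.5 * u * (2.0 - n1)
        n1_new = lowest_orbital_density(a, d, -t)
        if abs(n1_new - n1) < tol:
            return n1_new, 2.0 - n1_new, iteration
        # Linear mixing damps oscillations between successive densities.
        n1 = (1.0 - mix) * n1 + mix * n1_new
    raise RuntimeError("SCF did not converge")


n1, n2, n_iter = scf_two_site()
print(f"converged in {n_iter} iterations: n1={n1:.4f}, n2={n2:.4f}")
```

Production codes replace linear mixing with more robust schemes such as DIIS, but the loop structure is the same.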

The Jacob's ladder: choosing the right exchange-correlation functional¶

In practice, the key approximation in Kohn-Sham DFT is the unknown exchange-correlation functional. John Perdew organized the available approximations into a hierarchy called Jacob's Ladder, with sophistication (and usually accuracy) increasing from rung to rung.

Rung 1 (LDA) depends only on local electron density and is exact for the uniform electron gas, but overbinds molecules. Rung 2 (GGA) adds density gradients, with PBE being non-empirical and widely used for solids, while BLYP suits organic molecules. Rung 3 (meta-GGA) includes kinetic energy density; SCAN was constructed to satisfy all 17 known exact constraints applicable to a meta-GGA, and r2SCAN retains nearly all of them while offering much better numerical stability.

Hybrid functionals (Rung 4) incorporate Hartree-Fock exchange. B3LYP remains the most widely used functional in computational chemistry with 20% HF exchange, though it must always be paired with dispersion corrections. PBE0 (25% HF exchange) provides better barrier heights. M06-2X (54% HF exchange) excels for thermochemistry and noncovalent interactions but requires fine integration grids.

Although B3LYP is very widely used, it is best regarded as a functional for obtaining first results; alternatives such as ωB97X-D, ωB97M-V, and r2SCAN-D4 generally give more accurate results.

Range-separated hybrids vary the HF exchange fraction with interelectronic distance. ωB97X-D (22% short-range → 100% long-range HF exchange) includes built-in dispersion and represents an excellent general-purpose choice. For highest accuracy, double-hybrid functionals like revDSD-PBEP86-D4 incorporate MP2-like correlation but cost 10-100× more than standard hybrids; they are recommended only for systems of up to about 30–40 atoms and when benchmark-level accuracy is needed.
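
To make the "22% short-range → 100% long-range" statement concrete, the sketch below evaluates the HF exchange fraction as a function of interelectronic distance, assuming the usual error-function partition of the Coulomb operator. The range-separation parameter ω = 0.2 bohr⁻¹ is an assumption for this example; check the functional definition in your quantum chemistry package before relying on it.

```python
import math


def hf_exchange_fraction(r_bohr: float, c_short: float = 0.22, omega: float = 0.2) -> float:
    """Fraction of HF exchange at distance r for an erf-partitioned hybrid.

    c_short: HF fraction at r = 0 (22%, as quoted in the text).
    omega:   range-separation parameter in 1/bohr (assumed value).
    """
    return c_short + (1.0 - c_short) * math.erf(omega * r_bohr)


for r in (0.0, 2.0, 5.0, 10.0, 20.0):
    print(f"r = {r:5.1f} bohr -> {100 * hf_exchange_fraction(r):5.1f}% HF exchange")
```

The smooth rise toward 100% at long range is what restores the correct asymptotic behavior needed for charge-transfer problems.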

Dispersion corrections: essential for modern DFT¶

Standard DFT functionals fundamentally cannot describe London dispersion forces—the long-range correlation responsible for van der Waals interactions.

Dispersion corrections are mandatory for non-covalent complexes, conformational energies, crystal structures, and any system larger than ~10 atoms.

DFT-D3(BJ) with Becke-Johnson damping uses coordination-number-dependent \(C_6\) coefficients and includes \(R^{-8}\) terms. Its dispersion energy stays finite as \(R \to 0\), which is more physical behavior than zero-damping provides.

Parameters exist for over 60 functionals. DFT-D4 represents the current state-of-the-art, using charge-dependent polarizabilities that improve performance for metal-containing systems, with 3.8% mean relative deviation for \(C_6\) coefficients versus 4.7% for D3.
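
The Becke-Johnson damped pair term can be written down in a few lines. The sketch below evaluates only the two-body \(C_6\)/\(C_8\) contribution for a single atom pair in atomic units and takes the coefficients as inputs; it does not compute coordination-number-dependent \(C_6\) values. The default damping parameters are the values commonly quoted for PBE and the \(C_6\)/\(C_8\) numbers in the usage lines are purely illustrative, so take production values from the official D3 parameter files.

```python
import math


def d3bj_pair_energy(r, c6, c8, s6=1.0, s8=0.7875, a1=0.4289, a2=4.4407):
    """Two-body D3(BJ) dispersion energy (hartree) for one atom pair.

    r, a2 in bohr; c6 in hartree*bohr^6; c8 in hartree*bohr^8.
    Default s6/s8/a1/a2 are assumed PBE-type values, for illustration only.
    """
    r0 = math.sqrt(c8 / c6)          # pair-specific cutoff radius
    damp = a1 * r0 + a2              # Becke-Johnson damping length
    e6 = s6 * c6 / (r ** 6 + damp ** 6)
    e8 = s8 * c8 / (r ** 8 + damp ** 8)
    return -(e6 + e8)


# Illustrative coefficients (not tabulated values for any real element pair).
for r in (0.0, 3.0, 6.0, 12.0):
    print(f"R = {r:4.1f} bohr  E_disp = {d3bj_pair_energy(r, 40.0, 1200.0):.3e} Eh")
```

Because the damping length enters the denominator additively, the energy approaches a finite constant at short range instead of diverging or being switched off.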

VV10 non-local correlation takes a fundamentally different approach: it is a true density functional rather than an atom-pairwise correction. Functionals ending in "-V" (ωB97X-V, ωB97M-V) have VV10 built in, so they are already dispersion-corrected and should never be combined with D3/D4 corrections.

Basis set selection for production calculations¶

Karlsruhe def2 basis sets are recommended for general-purpose DFT work. def2-SVP (split-valence polarized) suits initial optimizations, def2-TZVP (triple-zeta polarized) represents the production standard, and def2-QZVP approaches complete basis set accuracy. Adding "D" (e.g., def2-SVPD, def2-TZVPD) provides diffuse functions, which are essential for anions, excited states, and hydrogen bonding but raise the computational cost and can occasionally make SCF convergence trickier.

Dunning correlation-consistent basis sets (cc-pVXZ) enable systematic CBS extrapolation—crucial for coupled-cluster benchmarks—with aug- prefixes adding diffuse functions.
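
As a worked example of CBS extrapolation, the sketch below applies the widely used two-point inverse-cubic formula to correlation energies from two consecutive cardinal numbers. It assumes the model \(E_X = E_{\text{CBS}} + A X^{-3}\), which is appropriate for correlation energies; HF and DFT energies converge differently and are usually not extrapolated this way. The function name and the synthetic numbers are illustrative.

```python
def cbs_two_point(e_x: float, e_y: float, x: int, y: int) -> float:
    """Extrapolate correlation energies at cardinal numbers x < y to the CBS limit.

    Assumes E_X = E_CBS + A / X**3 and eliminates A from the two equations.
    """
    if not x < y:
        raise ValueError("expected x < y, e.g. x=3 (TZ) and y=4 (QZ)")
    return (y ** 3 * e_y - x ** 3 * e_x) / (y ** 3 - x ** 3)


# Synthetic data generated from the model itself (not real calculations).
e_cbs_true, a = -1.0, 0.5
e_tz = e_cbs_true + a / 3 ** 3
e_qz = e_cbs_true + a / 4 ** 3
print(cbs_two_point(e_tz, e_qz, 3, 4))
```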

Pople basis sets (6-31G, 6-311++G*) remain common historically but have known issues: the 6-311G family has functions that are too tight, describing core rather than valence regions.

Application	Recommended Basis
Geometry screening	def2-SVP
Production optimization	def2-TZVP
Accurate energies	def2-TZVPP(D) or aug-cc-pVTZ
Anions/weak interactions	def2-SVPD with counterpoise

PM6 and PM7: legacy semi-empirical methods¶

Semi-empirical quantum chemistry methods based on the NDDO (Neglect of Diatomic Differential Overlap) approximation achieve 100–1000× speedup over DFT by simplifying two-electron integrals and parameterizing against experimental and high-level computational data. While PM6 and PM7 were workhorses for large-system calculations for many years, they have been largely superseded by the GFN-xTB family for most applications.

PM6¶

PM6 (Stewart, 2007) represented a major advancement in semi-empirical methods, parameterized for 70 elements using approximately 9,000 reference species. Key features include:

Average unsigned error of 8.0 kcal/mol for heats of formation overall and 4.4 kcal/mol for organic compounds
d-orbitals for hypervalent elements (S, P, halogens)
Voityuk-Rösch core-core corrections with diatomic parameters
Specific corrections for O-H, N-H, C-C, and Si-O interactions

Dispersion-corrected variants such as PM6-D3H4X provide improved noncovalent interaction accuracy, making them competitive with early xTB methods for certain applications.

PM7¶

PM7 (Stewart, 2013) incorporated dispersion and hydrogen bonding corrections before parameter optimization rather than as post-hoc additions. This approach yields:

~10% improvement in heats of formation versus PM6
60% reduction in errors for organic solids and crystals
S22 noncovalent benchmark error reduced from 3.38 kcal/mol (PM6) to 0.74 kcal/mol (PM7)

Both methods are implemented in MOPAC, which is now open-source (https://github.com/openmopac/mopac).

When to still consider PM6/PM7¶

While GFN-xTB methods are generally preferred today, PM6/PM7 may still be useful when:

Working with elements not well-covered by xTB parameterization
Reproducing legacy calculations or workflows
Specific applications where PM6/PM7 has been validated for your system class

Known limitations¶

Transition metal complexes: PM6/PM7 can produce false coordination geometries and significant energetic errors for realistic transition-metal systems. Always validate with DFT.
Conformer ranking: Potential energy surface shapes can differ qualitatively from DFT, leading to incorrect conformer orderings for metal complexes.
Charged species: Performance degrades for highly charged systems.

References:

PM6: Stewart, J. J. P. J. Mol. Model. 2007, 13, 1173–1213. https://link.springer.com/article/10.1007/s00894-007-0233-4
PM7: Stewart, J. J. P. J. Mol. Model. 2013, 19, 1–32. https://link.springer.com/article/10.1007/s00894-012-1667-x

GFN-xTB: fast and accurate for large systems¶

The GFN-xTB (Geometry, Frequency, Noncovalent, eXtended Tight-Binding) family from Stefan Grimme's group represents the current state-of-the-art in semi-empirical methods, targeting 1000+ atom systems while maintaining quantum mechanical accuracy. These methods have largely replaced PM6/PM7 as the default choice for rapid geometry optimization and conformer sampling.

GFN2-xTB: the current standard¶

GFN2-xTB (Bannwarth et al., 2019) is the recommended method for most applications. Key features include:

Anisotropic multipole electrostatics for improved description of polar interactions
Self-consistent D4 dispersion integrated into the SCF, eliminating the need for classical halogen/hydrogen bonding corrections
Parameterized for all elements up to radon (Z = 86) using only global and element-specific parameters—no pairwise fitting required
Excellent performance on the PLF547 drug design benchmark: 8.1% average error for noncovalent interactions, outperforming all other semi-empirical methods tested

The total energy in GFN2-xTB is expressed as:

\[E_{\text{GFN2-xTB}} = E_{\text{rep}} + E_{\text{disp}} + E_{\text{EHT}} + E_{\text{IES+IXC}} + E_{\text{AES}} + E_{\text{AXC}}\]

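where the terms represent repulsion, dispersion, extended Hückel-type electronic energy, isotropic electrostatic/exchange-correlation, anisotropic electrostatic, and anisotropic exchange-correlation contributions. The original publication additionally includes an electronic free-energy term from Fermi smearing, \(G_{\text{Fermi}}\), which is omitted here.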

g-xTB: improved thermochemistry and barriers¶

g-xTB (2024/2025) represents the next generation, halving errors on the GMTKN55 benchmark compared to GFN2-xTB:

WTMAD-2 of 9.3 kcal/mol (versus ~18 kcal/mol for GFN2-xTB)
Approximate range-separated Fock exchange dramatically improves reaction barriers and HOMO-LUMO gaps
Only ~30% slower than GFN2-xTB
Can substitute for low/mid-accuracy DFT calculations in many workflows

GFN1-xTB and GFN-FF¶

GFN1-xTB uses a simpler Hamiltonian than GFN2-xTB and may be preferred for:

Systems containing very heavy elements
Transition metal complexes (where it sometimes outperforms GFN2-xTB)
Cases where GFN2-xTB shows convergence issues

GFN-FF is a force-field derived from GFN2-xTB, suitable for:

Initial geometry generation for 1000+ atom systems
Rapid pre-screening before xTB optimization
Molecular dynamics of very large systems

The xtb program¶

The xtb program (https://github.com/grimme-lab/xtb, LGPL-licensed) implements all GFN-xTB variants with:

Geometry optimization (including transition state search)
Vibrational frequencies (from numerical Hessians) and thermochemistry
GBSA/ALPB implicit solvation
Metadynamics for reaction path exploration
Integration with CREST for automated conformer sampling
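
A typical xtb run can be driven from Python through the command line. The sketch below only assembles and optionally launches the command; the helper names are hypothetical, the input file name is a placeholder, and the flags used (`--gfn`, `--chrg`, `--uhf`, `--opt`, `--hess`, `--ohess`, `--alpb`) should be checked against `xtb --help` for the installed version.

```python
import shutil
import subprocess


def xtb_command(xyz_path, gfn=2, optimize=True, hessian=False,
                solvent=None, charge=0, unpaired=0):
    """Build an xtb command line as a list of arguments."""
    cmd = ["xtb", xyz_path, "--gfn", str(gfn),
           "--chrg", str(charge), "--uhf", str(unpaired)]
    if optimize and hessian:
        cmd.append("--ohess")      # optimize, then compute the Hessian
    elif optimize:
        cmd.append("--opt")
    elif hessian:
        cmd.append("--hess")
    if solvent:
        cmd += ["--alpb", solvent]  # ALPB implicit solvation
    return cmd


def run_xtb(cmd):
    """Run xtb if it is on PATH; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


cmd = xtb_command("mol.xyz", solvent="water")
print(" ".join(cmd))
```

Keeping command construction separate from execution makes the workflow easy to test on machines where xtb is not installed.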

Practical recommendations¶

Application	Recommended method
Geometry optimization	GFN2-xTB
Conformer sampling	GFN2-xTB + CREST
Reaction barriers	g-xTB (then refine with DFT)
Very large systems (>1000 atoms)	GFN-FF → GFN2-xTB
Transition metals	GFN1-xTB or GFN2-xTB (validate with DFT)

Limitations¶

Element coverage: GFN2-xTB does not support elements beyond radon (Z > 86)
Electronic properties: Band gaps and ionization potentials are approximate; use DFT for accurate electronic structure
Reaction energetics: While geometries are excellent, reaction energies and barriers should be refined with DFT for quantitative accuracy
Strong correlation: Multi-reference systems require carefully validated DFT or, preferably, multireference wavefunction methods

References:

GFN-xTB: Grimme, S. et al. J. Chem. Theory Comput. 2017, 13, 1989–2009. https://pubs.acs.org/doi/full/10.1021/acs.jctc.7b00118
GFN2-xTB: Bannwarth, C. et al. J. Chem. Theory Comput. 2019, 15, 1652–1671. https://pubs.acs.org/doi/10.1021/acs.jctc.8b01176
g-xTB: https://chemrxiv.org/engage/chemrxiv/article-details/685434533ba0887c335fc974

DFTB and specialized methods¶

Density Functional Tight Binding (DFTB)¶

DFTB derives from DFT via a Taylor expansion of the Kohn-Sham energy around a reference density, providing a systematic approximation hierarchy:

DFTB1 (non-self-consistent): Uses pre-computed Hamiltonian and overlap matrix elements from DFT calculations on atom pairs. Fast but limited accuracy for polar systems.

DFTB2 (self-consistent charge, SCC-DFTB): Includes second-order charge fluctuation effects through a self-consistent treatment of Mulliken charges. Significantly improves electrostatics.

DFTB3 (2011): Adds third-order terms for improved hydrogen bonding, proton affinities, and charged species. The energy expression becomes:

\[E_{\text{DFTB3}} = E_{\text{rep}} + E_{\text{band}} + E_{\gamma} + E_{\Gamma}\]

where \(E_{\Gamma}\) captures the third-order diagonal contribution.

Key characteristics of DFTB:

Approximately 1000× faster than DFT
Requires pre-computed Slater-Koster parameter files for each element pair
Less "black-box" than GFN-xTB due to parameter file requirements
Excellent for specific, well-parameterized systems (organic molecules, DNA, certain materials)

DFTB+ (https://dftbplus.org/) is the primary implementation, supporting:

Periodic systems with k-point sampling
TD-DFTB for excited states
QM/MM coupling
Electron transport calculations

OMx methods (OM1, OM2, OM3)¶

The OMx family from Walter Thiel's group includes explicit orthogonalization corrections that capture Pauli repulsion and penetration effects absent from standard NDDO methods:

Limited to H, C, N, O, F elements only
Excel for excited-state molecular dynamics when combined with MRCI
ODM2/ODM3 dispersion-corrected versions achieve 7.9–8.3 kcal/mol MAD on GMTKN24 versus 18.2 kcal/mol for PM6

When to use specialized methods¶

Method	Best for	Limitations
DFTB3	Well-parameterized systems, periodic calculations	Requires parameter files, limited element coverage
OM2/OM3	Excited states, photochemistry	Only H, C, N, O, F
PM6-D3H4X	Legacy workflows, specific validations	Superseded by xTB for most uses

For most new projects, GFN2-xTB or g-xTB should be the default semi-empirical choice due to broader element coverage, no parameter file requirements, and competitive accuracy.

GPU accelerated methods¶

GPU acceleration has transformed computational chemistry, enabling 10–100× speedups for DFT calculations and reducing costs by approximately 90%. Modern implementations deliver performance equivalent to 600–1000 CPU cores on a single NVIDIA A100 GPU.

Which calculations benefit from GPU acceleration?¶

High GPU benefit:

Two-electron integral evaluation (ERIs): 10–100× speedup
SCF iterations: 20–50× speedup typical
Hessian calculations: up to 50× speedup
Hybrid functional calculations (due to HF exchange)

Moderate GPU benefit:

Gradient calculations: ~70% cost savings
Pure GGA DFT (less compute-intensive baseline)

Limited/no GPU benefit:

Very small molecules (<10–15 atoms): GPU overhead dominates
Standard post-HF without specific GPU implementations
Memory-bound calculations exceeding GPU VRAM

GPU-accelerated DFT codes¶

TeraChem (commercial, https://petachem.com/):

First quantum chemistry code designed for GPUs (2010)
Supports HF, DFT, TDDFT, CASSCF, CCSD
8–50× speedup versus CPU clusters
B3LYP-D3/6-31G(d) optimization of vancomycin (176 atoms) in ~2 minutes

QUICK (open-source, https://github.com/merzlab/QUICK):

Free under Mozilla Public License 2.0
Supports both NVIDIA (CUDA) and AMD (HIP) GPUs
82% parallel efficiency on 16 GPUs for Kohn-Sham matrix
Integrates with AMBER for QM/MM

GPU4PySCF (open-source, 2024):

Python-based, 30× speedup over 32-core CPU
~90% cost reduction on cloud platforms
Supports basis sets through g functions

VASP (GPU version):

OpenACC implementation, up to 15× speedup
Full support for hybrid functionals (HSE06)
Requires 1 MPI rank per GPU

Quantum ESPRESSO:

CUDA Fortran/OpenACC implementation
Best for large periodic systems
GPU-accelerated hybrid functionals available

GPU-accelerated semi-empirical methods¶

xtb with GPU support:

Native GPU acceleration in version 6.8.0+ via NVIDIA HPC SDK
Performance varies with system size—benchmark for your application

GPU-MOPAC:

200× total speedup for large systems (3000+ atoms)
Bacteriorhodopsin (3,352 atoms): 2,199 min → 11.8 min
Does NOT work with MOZYME localized orbital method

GPU-DFTB+ / PySEQM:

Enables quantum MD on explicitly solvated biomolecules (4000+ atoms)
PySEQM: PyTorch-based, excited states for ~1000 atoms in under a minute

Machine learning potentials on GPU¶

All modern MLIPs leverage GPU acceleration through deep learning frameworks:

Framework	Backend	Key features
MACE	PyTorch	10× faster than NequIP, Apple Silicon support
NequIP/Allegro	PyTorch	E(3)-equivariant, excellent data efficiency
DeePMD-kit	TF/PyTorch/JAX	Achieved 86 PFLOPS on Summit, 100M atoms
SchNetPack	PyTorch	Integrated GPU-accelerated MD
JAX-MD	JAX	Hardware-agnostic (CPU/GPU/TPU)

Practical GPU recommendations¶

Use GPU acceleration when:

Systems exceed 20–30 atoms
Using triple-zeta or larger basis sets
Running geometry optimizations or scans
Using hybrid or range-separated functionals
GPU memory (VRAM) can accommodate your system

CPU may be preferred when:

Systems smaller than 10–15 atoms
Running embarrassingly parallel small jobs
Memory requirements exceed GPU VRAM
Using methods without GPU support
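
The rules of thumb above can be collected into a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the 15- and 20-atom thresholds are taken from the lower ends of the ranges quoted above, and the memory estimate has to come from the user or the code's own output.

```python
def suggest_hardware(n_atoms: int, code_has_gpu_support: bool,
                     est_memory_gb: float, vram_gb: float) -> str:
    """Return "gpu", "cpu", or "benchmark" following the heuristics above."""
    if not code_has_gpu_support:
        return "cpu"
    if est_memory_gb > vram_gb:
        return "cpu"          # job does not fit into GPU memory
    if n_atoms < 15:
        return "cpu"          # GPU overhead dominates for very small systems
    if n_atoms > 20:
        return "gpu"
    return "benchmark"        # grey zone: time both on a representative job


print(suggest_hardware(120, True, est_memory_gb=10.0, vram_gb=24.0))
print(suggest_hardware(8, True, est_memory_gb=0.5, vram_gb=24.0))
```

For many small, independent jobs, running them in parallel on CPU cores can still give higher total throughput than a single GPU.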

Memory considerations¶

GPU VRAM limits maximum system sizes:

GPU	VRAM	Approximate limits
RTX 3090/4090	24 GB	~2000 basis functions (DFT)
A100-40G	40 GB	~4000 basis functions (DFT)
A100-80G	80 GB	~6000 basis functions (DFT)
H100/H200	80–141 GB	Largest molecular systems

For systems exceeding VRAM, codes like GPU4PySCF implement memory-efficient batching with some performance penalty.

Method selection guide¶

Use case	Primary recommendation	Alternative
Small molecules (<50 atoms), high accuracy	ωB97X-D/def2-TZVP	r2SCAN-D4/def2-TZVP
Large molecules (50–1000 atoms)	GFN2-xTB	PM7
High-throughput screening	GFN2-xTB or PM6-D3H4	GFN-FF
Conformer sampling	GFN2-xTB + CREST	PM7 + screening
Reaction barriers	g-xTB or PBE0-D3(BJ)	ωB97X-D
Noncovalent interactions	ωB97M-V or GFN2-xTB	PM6-D3H4X
Transition metals	GFN1-xTB or PBE0-D3	BP86
Periodic solids	PW-DFT or PM7	DFTB3

Method limitations & failure modes¶

No single method is universally reliable; each has well-known pathological cases that can silently yield plausible-looking but wrong results.

DFT (general)¶

Delocalization / self-interaction error (SIE): Many common DFAs show systematic delocalization error, which contributes to issues like underestimated band gaps and incorrect charge localization. (https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.100.146401 , https://arxiv.org/pdf/2102.12992)
Charge-transfer sensitivity: Approximate functionals can fail for charge-transfer excitations and related phenomena; self-interaction is a major contributor to these failures. (https://pubs.acs.org/doi/10.1021/acs.jctc.1c01307)
Dispersion omission: Standard functionals do not capture London dispersion well without an explicit correction or a nonlocal correlation term, so noncovalent interactions and conformers can be qualitatively wrong if dispersion is neglected. (https://pubs.acs.org/doi/10.1021/acs.jctc.4c00689)

Dispersion models (D3/D4/VV10)¶

Double counting risk: Functionals with built-in nonlocal dispersion (often "-V"/VV10-type) should not be combined with D3/D4-style corrections because it can double-count dispersion contributions.
Parameter dependence: D3/D4 parameters are functional-specific; using mismatched combinations can degrade accuracy even when "dispersion is on."

Semi-empirical methods (PM6/PM7)¶

Unpredictable performance for transition metals: For realistic transition-metal complexes, PM6/PM7 can deviate strongly from DFT energetics and may distort coordination geometries, including false coordination/chemical rearrangements during optimization. (https://pubs.acs.org/doi/10.1021/acs.jctc.8b00018)
Out-of-the-box conformer sampling risk: Because PES shapes can differ from DFT, conformer ranking for metal complexes may be unreliable without careful validation. (https://pubs.acs.org/doi/10.1021/acs.jctc.8b00018)

Tight-binding methods (GFN-xTB family)¶

Element coverage limits: Practical workflows should confirm whether the chosen xTB variant supports the required elements; for example, GFN2-xTB has reported limitations for very heavy elements (e.g., no support for \(Z > 86\) in some tooling contexts). (https://pmc.ncbi.nlm.nih.gov/articles/PMC10185541/)
"Good geometries, approximate energetics": xTB methods often excel at rapid structure generation, but reaction barriers, subtle electronic effects, and fine energy ordering can require DFT refinement.

Practical diagnostics (recommended)¶

Run a cheap cross-check: If a result drives a decision, re-evaluate a subset with a higher-level method (e.g., xTB → hybrid DFT + dispersion).
Watch for warning signs: Unexpected bond formation/breaking during optimization, unusual coordination changes, or large charge migration are common indicators that the chosen method is outside its comfort zone.
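
One of the warning signs above, unexpected bond formation or breaking, is easy to check automatically. The sketch below compares bond lists before and after an optimization using a simple distance criterion: two atoms count as bonded when their separation is below a scaled sum of covalent radii. The radii table covers only H, C, N, and O, and the scale factor of 1.25 is an assumption; both should be adapted to the chemistry at hand.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstrom (subset for illustration).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bonded_pairs(symbols, coords, scale=1.25):
    """Return the set of index pairs (i, j), i < j, that satisfy the distance criterion."""
    pairs = set()
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = scale * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
        if math.dist(coords[i], coords[j]) <= cutoff:
            pairs.add((i, j))
    return pairs


def bond_changes(symbols, before, after, scale=1.25):
    """List bonds formed and broken between two geometries with the same atom order."""
    b0 = bonded_pairs(symbols, before, scale)
    b1 = bonded_pairs(symbols, after, scale)
    return {"formed": sorted(b1 - b0), "broken": sorted(b0 - b1)}


symbols = ["O", "H", "H"]
start = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
end = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-2.00, 0.93, 0.0]]  # one H has drifted away
print(bond_changes(symbols, start, end))
```

Any non-empty result is a prompt to inspect the structure and re-run with a higher-level method, not proof that the optimization is wrong.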

Computational complexity/scaling behaviour of methods:¶

Method	Scaling	System size
CCSD(T)	\(O(N^7)\)	<20 atoms
DFT	\(O(N^{3-4})\)	50-200 atoms
GFN2-xTB or g-xTB	\(O(N^{2-3})\)	100-1000 atoms
PM6/PM7	\(O(N^2)\)	100-1000 atoms
GFN-FF	\(O(N^2)\)	1000+ atoms
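
The practical meaning of these exponents is easiest to see by asking how the cost changes when the system doubles in size. The short sketch below does this arithmetic for the formal exponents in the table; prefactors, which differ by orders of magnitude between methods, are deliberately ignored, and the labels are only for this example.

```python
def relative_cost(size_factor: float, exponent: float) -> float:
    """Cost multiplier when the system grows by `size_factor` under O(N**exponent) scaling."""
    return size_factor ** exponent


EXPONENTS = {
    "CCSD(T)": 7,
    "DFT (upper bound)": 4,
    "GFN2-xTB (upper bound)": 3,
    "GFN-FF": 2,
}

for method, p in EXPONENTS.items():
    print(f"{method}: doubling the system costs about {relative_cost(2, p):.0f}x more")
```

Doubling a CCSD(T) calculation therefore costs about 128 times more, while the same doubling costs about 4 times more for a quadratic-scaling method.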

Geometry optimization via semi-empirical methods and machine-learned interatomic potentials (MLIPs)¶

Semi-empirical methods and machine-learned interatomic potentials (MLIPs) serve as computationally efficient alternatives to density functional theory (DFT), offering rapid yet reasonably accurate predictions of molecular structures and energetics.

Semi-empirical methods, such as the g-xTB, GFN2-xTB, and AM1 models, balance quantum mechanical accuracy and computational cost by employing parameterized approximations derived from experimental or higher-level theoretical data. g-xTB and GFN2-xTB are modern tight-binding approaches with anisotropic electrostatics and built-in dispersion corrections, offering excellent geometry predictions and non-covalent interaction energies; g-xTB can in most cases be a substitute for low/mid-accuracy DFT calculations. AM1 (Austin Model 1) is a classic method using modified core-core repulsion functions, originally parameterized for organic molecules.

Similarly, several MLIPs are available:

The ORB models apply machine learning (graph neural networks, transformers) to infer interatomic interactions from reference datasets, enabling accurate force and energy evaluations for geometry optimizations at a fraction of DFT's computational expense. The newer ORB V3 offers over 10x lower latency and 8x reduced memory requirements compared to ORB V2, while maintaining or improving accuracy across a range of chemical systems. Both ORB versions were trained on extensive datasets covering diverse chemical space: ORB V2 was trained on a combination of the MPtrj and Alexandria datasets (containing approximately 30 million calculations on crystalline materials) at the DFT PBE level of theory, while ORB V3 incorporates additional data from the OMAT24 dataset, which includes high-energy configurations, molecular dynamics trajectories, and relaxation paths for a more comprehensive representation of potential energy surfaces and out-of-equilibrium structures. The OMAT24 dataset is particularly valuable as it contains DFT calculations for over 110 million structures with diverse elemental compositions covering most of the periodic table, with energy, force, and stress distributions much wider than previous datasets.
The MACE-MP (Multi-Atomic Cluster Expansion - Materials Project) model provides accuracy for crystalline materials by leveraging the complete Many-Body Expansion and equivariant message passing, trained on the MPtrj database containing over 1.6 million bulk crystal structures from DFT relaxation trajectories.
The PET-MAD (Point Edge Transformer trained on the Massive Atomic Diversity dataset) model combines transformer architectures with physics-informed learning to achieve high accuracy across molecular and materials systems, trained on 95,595 structures, including 3D and 2D inorganic crystals, surfaces, molecular crystals, nanoclusters, and molecules.
Meta AI's UMA (Universal Models for Atoms) is trained on diverse datasets including crystalline materials, catalysts with adsorbed species, and molecular systems (over 30 billion atoms across all training data from Meta datasets released in the last 5 years). It provides specialized task heads for different applications including catalysis (oc20), inorganic materials (omat), metal-organic frameworks (mof), molecules (omol), and molecular crystals (omc), enabling unified modeling across multiple domains of materials science.
The GRACE-2L-OMAT model is a two-layer machine learning interatomic potential that was pre-fitted on the Meta Open Materials 2024 dataset and fine-tuned on the sAlex and MPtrj datasets. The two-layer models include semi-local interactions mediated by equivariant message passing and employ chemical embedding for efficiently condensing chemical interactions into low-rank representations.

Cebule SDK TaskType: GEOMETRY_OPT¶

Semi-empirical/MLIP geometry optimization of molecule 3D coords after either initial force field optimization or user-defined geometry
Inputs:

optimization_method: str from [g_xtb, gfn2_xtb, am1, uma]

smiles_list: List[str] SMILES list

force_field: str from [mmff94, ghemical] for initial optimization

geometry_list: List[List[List[float]]] of 3D coordinates can be provided

symbols_list: List[List[str]] of atomic symbols.

Cebule max_processors: Used to limit concurrency of optimization
Output:

List containing each molecule's optimized 3D coords (see Atom Order which defines the order of atoms in this outputted geometry list)

Example (SMILES Input):

# Assumes `session` is an authenticated SDK session and `TaskType` is already imported.
task_geometry_opt = session.cebule.create_task("Geometry Opt Example",
                                               TaskType.GEOMETRY_OPT,
                                               smiles_list=["CCO", "O"],  # ethanol and water
                                               # Optimize with MMFF94 force field followed by GFN2-xTB.
                                               force_field="mmff94",
                                               optimization_method="gfn2_xtb",
                                               max_processors=4)

Example (Geometry and Symbols Input):

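# One entry per molecule: here a single O2 molecule from explicit coordinates (assumed to be in Å).
# The 0.96 Å O-O separation is only a starting guess that the optimizer relaxes.
task_geometry_opt_coords = session.cebule.create_task("Geometry Opt Coords Example",
                                                     TaskType.GEOMETRY_OPT,
                                                     geometry_list=[[[0.0, 0.0, 0.0], [0.96, 0.0, 0.0]]],
                                                     symbols_list=[["O", "O"]],
                                                     optimization_method="gfn2_xtb",
                                                     max_processors=4)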
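
Before submitting coordinate-based input, it can help to check that the nested lists have the shapes described under Inputs. The helper below is a hypothetical pre-flight check written for this page, not part of the SDK; it only verifies list lengths and does not validate element symbols or units.

```python
from typing import List


def check_geometry_inputs(geometry_list: List[List[List[float]]],
                          symbols_list: List[List[str]]) -> None:
    """Raise ValueError if geometry_list and symbols_list are inconsistent."""
    if len(geometry_list) != len(symbols_list):
        raise ValueError("geometry_list and symbols_list must describe the same number of molecules")
    for idx, (coords, symbols) in enumerate(zip(geometry_list, symbols_list)):
        if len(coords) != len(symbols):
            raise ValueError(f"molecule {idx}: {len(coords)} coordinates but {len(symbols)} symbols")
        for xyz in coords:
            if len(xyz) != 3:
                raise ValueError(f"molecule {idx}: each atom needs exactly three coordinates")


# Same shapes as the coordinate example above: one molecule with two atoms.
check_geometry_inputs([[[0.0, 0.0, 0.0], [0.96, 0.0, 0.0]]], [["O", "O"]])
print("inputs look consistent")
```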

Cebule SDK TaskType: PERIODIC_GEOMETRY_OPT¶

MLIP/semi-empirical geometry optimization of periodic systems (crystals, surfaces, interfaces) applying suitable models (e.g. MLIPs, tight-binding) for extended systems.
Inputs:

optimization_method: str from [mace_mp, orb_v2, orb_v3, uma, pet_mad, grace-2l-omat, gfn1_xtb, gfn2_xtb]

geometry: List[List[float]] of atomic coordinates

symbols: List[str] of atomic symbols

cell: List[List[float]] as 3×3 cell parameter matrix

pbc: List[bool] for periodic boundary conditions (UMA and ORB require [True, True, True])

Optional Inputs:

fmax: float force convergence threshold (default 0.10)

fixed: List[bool] to specify which atoms should be fixed during optimization (same length as geometry/symbols; by default all atoms are allowed to move)

structure_type: str required for UMA only from [catalysis, inorganic_material, metal_organic_framework, molecular_crystal]

max_processors: int Used to limit concurrency of optimization

Output:

geometry: dict optimized atomic coordinates

symbols: list preserved atomic symbols

energy: dict final energy in eV

Examples:

Ni(111) surface with MACE-MP:

# 4-atom Ni(111) surface slab
geometry = [[0.0, 0.0, 7.5], [2.49, 0.0, 7.5], [1.245, 2.156, 7.5], [3.735, 2.156, 7.5]]
symbols = ["Ni", "Ni", "Ni", "Ni"] 
cell = [[4.98, 0.0, 0.0], [0.0, 4.312, 0.0], [0.0, 0.0, 15.0]]
pbc = [True, True, False]

task_geometry_opt_slab = session.cebule.create_task("Periodic Geometry Opt Ni Example",
                                             TaskType.PERIODIC_GEOMETRY_OPT,
                                             geometry=geometry,
                                             symbols=symbols, 
                                             cell=cell,
                                             pbc=pbc,
                                             optimization_method="mace_mp",
                                             fmax=0.10,
                                             max_processors=4)

Pt(111) surface with adsorbed H using UMA:

# Pt(111) surface (6 Pt atoms in two layers) with one H adsorbate
geometry = [[0.0, 0.0, 5.0], [2.77, 0.0, 5.0], [1.385, 2.40, 5.0],     # Bottom Pt layer
            [0.0, 0.0, 7.77], [2.77, 0.0, 7.77], [1.385, 2.40, 7.77],   # Top Pt layer
            [1.385, 0.80, 9.0]]                                           # H adsorbate
symbols = ["Pt", "Pt", "Pt", "Pt", "Pt", "Pt", "H"]
cell = [[5.54, 0.0, 0.0], [0.0, 4.80, 0.0], [0.0, 0.0, 15.0]]
pbc = [True, True, True]  # UMA requires full periodicity; the 15 Å cell height leaves vacuum above the slab
fixed = [True, True, True, False, False, False, False]  # Fix the bottom Pt layer; top layer and H may relax

task_geometry_opt_catalyst = session.cebule.create_task("Periodic Geometry Opt Pt Example",
                                                TaskType.PERIODIC_GEOMETRY_OPT,
                                                geometry=geometry,
                                                symbols=symbols,
                                                cell=cell,
                                                pbc=pbc,
                                                optimization_method="uma",
                                                structure_type="catalysis",
                                                fmax=0.10,
                                                fixed=fixed,
                                                max_processors=4)
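
The input rules for periodic tasks (3×3 cell, three pbc flags, full periodicity for UMA and ORB, `structure_type` for UMA, one `fixed` flag per atom) can be checked locally before a task is created. The function below is a hypothetical helper written for this page, not an SDK call; it returns a list of problems instead of raising so that all issues are reported at once.

```python
from typing import List, Optional, Sequence

# Methods that, according to the input description above, need pbc = [True, True, True].
FULL_PBC_METHODS = {"uma", "orb_v2", "orb_v3"}


def check_periodic_inputs(geometry: Sequence[Sequence[float]],
                          symbols: Sequence[str],
                          cell: Sequence[Sequence[float]],
                          pbc: Sequence[bool],
                          optimization_method: str,
                          fixed: Optional[Sequence[bool]] = None,
                          structure_type: Optional[str] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means the shapes look fine."""
    problems = []
    if len(geometry) != len(symbols):
        problems.append("geometry and symbols differ in length")
    if any(len(xyz) != 3 for xyz in geometry):
        problems.append("every atom needs three coordinates")
    if len(cell) != 3 or any(len(row) != 3 for row in cell):
        problems.append("cell must be a 3x3 matrix")
    if len(pbc) != 3:
        problems.append("pbc needs three entries")
    elif optimization_method in FULL_PBC_METHODS and not all(pbc):
        problems.append(f"{optimization_method} requires pbc=[True, True, True]")
    if fixed is not None and len(fixed) != len(symbols):
        problems.append("fixed must have one entry per atom")
    if optimization_method == "uma" and structure_type is None:
        problems.append("uma requires structure_type")
    return problems


ni_geometry = [[0.0, 0.0, 7.5], [2.49, 0.0, 7.5], [1.245, 2.156, 7.5], [3.735, 2.156, 7.5]]
ni_cell = [[4.98, 0.0, 0.0], [0.0, 4.312, 0.0], [0.0, 0.0, 15.0]]
print(check_periodic_inputs(ni_geometry, ["Ni"] * 4, ni_cell, [True, True, False], "mace_mp"))
print(check_periodic_inputs(ni_geometry, ["Ni"] * 4, ni_cell, [True, True, False], "uma"))
```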

We have worked on various projects utilizing these models and their extensions/modifications/re-parameterizations.

Let us know if you want to discuss your use-case with us and evaluate which models should be used with respect to the molecules and structures you work with (contact [at] mqs (dot) dk).