Molecular dynamics with franken and ASE

In this notebook we show how to perform MD with a trained franken model. Once again we use the water dataset as a simple example, and finetune the MACE-MP0 foundation model which was trained on the materials-project dataset to accurately predict a potential field at a high level of theory.

[1]:
try:
    import franken
except ImportError:
    %pip install franken[mace]
    import franken
[ ]:
from pathlib import Path
import sys
import torch
import numpy as np
import matplotlib.pyplot as plt
import ase
import ase.md
import ase.md.velocitydistribution
from ase.geometry.rdf import get_recommended_r_max, get_rdf

from franken.calculators import FrankenCalculator
from franken.datasets.registry import DATASET_REGISTRY
from franken.backbones.utils import CacheDir
from franken.autotune import autotune
from franken.config import (
    MaceBackboneConfig,
    MultiscaleGaussianRFConfig,
    DatasetConfig,
    SolverConfig,
    HPSearchConfig,
    AutotuneConfig,
)
/home/lbonati@iit.local/software/miniforge3-mamba/envs/franken/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

Run autotune to train a franken model

The first step is to obtain a trained franken model, with which to perform the MD.

We do this using the autotune interface, for which a dedicated tutorial is available.

[3]:
gnn_config = MaceBackboneConfig(
    path_or_id="mace_mp/small",
    interaction_block=2,
)

dataset_cfg = DatasetConfig(name="water", max_train_samples=8,
                            # or specify train/val/test paths:
                            # train_path="train.xyz",
                            # val_path="val.xyz",
                            )
rf_config = MultiscaleGaussianRFConfig(
    num_random_features=512,
    length_scale_low=8,
    length_scale_high=32,
    length_scale_num=4,
    rng_seed=42,
)
solver_cfg = SolverConfig(
    l2_penalty=HPSearchConfig(start=-10, stop=-5, num=5, scale='log'),  # equivalent of numpy.logspace
    force_weight=HPSearchConfig(start=0.01, stop=0.99, num=5, scale='linear'),  # equivalent of numpy.linspace
)
autotune_cfg = AutotuneConfig(
    dataset=dataset_cfg,
    solver=solver_cfg,
    backbone=gnn_config,
    rfs=rf_config,
    seed=42,
    jac_chunk_size='auto',
    run_dir="./results",
)
[4]:
run_path = autotune(autotune_cfg)
console_logging_level: INFO
dtype: float64
jac_chunk_size: auto
rf_normalization: leading_eig
run_dir: ./results
save_every_model: False
save_fmaps: False
scale_by_species: True
seed: 42
backbone:
    family: mace
    interaction_block: 2
    path_or_id: mace_mp/small
dataset:
    max_train_samples: 8
    name: water
    test_path: null
    train_path: null
    val_path: null
rfs:
    length_scale_high: 32
    length_scale_low: 8
    length_scale_num: 4
    num_random_features: 512
    rf_type: multiscale-gaussian
    rng_seed: 42
    use_offset: true
solver:
    force_weight:
      num: 5
      scale: linear
      start: 0.01
      stop: 0.99
      value: null
      values: null
    l2_penalty:
      num: 5
      scale: log
      start: -10
      stop: -5
      value: null
      values: null

2026-01-25 14:11:06.787 INFO (rank 0): Initializing default cache directory at /home/lbonati@iit.local/.franken
2026-01-25 14:11:06.793 INFO (rank 0): Run folder: results/run_260125_141106_19bfafa6
cuequivariance or cuequivariance_torch is not available. Cuequivariance acceleration will be disabled.
ASE -> MACE (train): 100%|██████████| 8/8 [00:00<00:00, 258.97it/s]
ASE -> MACE (val): 100%|██████████| 189/189 [00:00<00:00, 242.35it/s]
Computing dataset statistics: 100%|██████████| 8/8 [00:00<00:00, 16.00it/s]
2026-01-25 14:11:15.739 INFO (rank 0): jacobian chunk size automatically set to 32
2026-01-25 14:11:46.246 WARNING (rank 0): `leading_eig` normalization has high memory usage. If you encounter OOM errors try to disable it.
2026-01-25 14:12:32.102 INFO (rank 0): Trial   1 | rf_type: multiscale-gaussian | num_random_features:   512   | length_scale_low:    8    | length_scale_high:   32    | length_scale_num:    4    | use_offset:  True   | rng_seed:   42    | Best trial 1 (energy 0.32 meV/atom - forces 25.1 meV/Ang)

Run molecular dynamics

Next we can run the molecular dynamics simulation.

Most of the code below is boiler-plate for the simulation

We start by defining some parameters of the molecular dynamics:

  • The MD length (in nanoseconds)

  • The timestep (in femtoseconds)

  • The simulation temperature (in kelvin)

  • Friction of the Langevin integrator (in units 1/fs)

[5]:
md_length_ns = 0.01
timestep_fs = 0.5
temperature_K = 325
friction = 0.01
device = "cuda:0" if torch.cuda.is_available() else "cpu"

Load the initial configuration: a random one taken from the water training set

[6]:
data_path = DATASET_REGISTRY.get_path("water", "train", CacheDir.get())
initial_configuration = ase.io.read(data_path, index=567)

Initialize the trajectory object. It will be used to save the outputs of the MD.

[7]:
md_path = run_path / "md"
md_path.mkdir(exist_ok=True)
# Trajectory will contain the output data
trajectory_path = md_path / f"md_output.traj"
trajectory = ase.io.Trajectory(trajectory_path, "w", initial_configuration)

Initialize the calculator object which wraps the franken model we have learned previously.

[8]:
model_path = run_path / "best_ckpt.pt"
calc = FrankenCalculator(model_path, device=device)
initial_configuration.calc = calc

Initialize ASE-specific MD-related objects:

  • the velocity initialization

  • the Langevin integrator

  • and a logger to record results.

[9]:
ase.md.velocitydistribution.MaxwellBoltzmannDistribution(
    atoms=initial_configuration,
    temperature_K=temperature_K
)
integrator = ase.md.Langevin(
    atoms=initial_configuration,
    temperature_K=temperature_K,
    friction=friction / ase.units.fs,
    timestep=timestep_fs * ase.units.fs
)
md_logger = ase.md.MDLogger(
    integrator,
    initial_configuration,
    sys.stdout,  # log directly to console
    header=True,
    stress=False,
    peratom=False,
)
integrator.attach(md_logger, interval=2000)
integrator.attach(trajectory.write, interval=20)
Time[ps]      Etot[eV]     Epot[eV]     Ekin[eV]    T[K]

Finally run the MD for the desired number of time-steps

[10]:
integrator.run(md_length_ns * 1e6 / timestep_fs)
0.0000         -915.772     -923.371        7.600   311.1
1.0000         -914.456     -922.815        8.359   342.1
2.0000         -915.416     -923.301        7.884   322.7
3.0000         -914.253     -922.529        8.276   338.8
4.0000         -914.204     -921.829        7.624   312.1
5.0000         -915.925     -923.492        7.567   309.7
6.0000         -914.147     -922.450        8.302   339.8
7.0000         -914.538     -922.829        8.291   339.4
8.0000         -915.881     -923.429        7.548   309.0
9.0000         -915.414     -923.459        8.045   329.3
10.0000        -915.183     -922.423        7.240   296.3
[10]:
True

Analyse MD results

The trajectory can be loaded with ASE, we can then plot the radial-distribution function of the simulation.

[11]:
traj = ase.io.read(trajectory_path, index=":")
# Skip first time-step which is just the initial configuration
traj = traj[1:]
print(f"Loaded MD trajectory of length {len(traj)}")
Loaded MD trajectory of length 1000

We compute the RDF of the O-O atoms within the trajectory, and then compute the average and plot it. We’re using ASE’s get_rdf for the actual distance calculations. The trajectory is very short, so the RDF is very noisy, but by increasing the simulation time it’s easy to get a clean curve.

[12]:
equilibration_fs = 0.5 * 100  # deliberately short
nbins_per_angstrom = 50

# Skip the beginning of the trajectory where simulation is not at equilibrium
traj_len_fs = timestep_fs * len(traj)
# equilibration time / 100 (sampling interval)
traj_eq = traj[int(equilibration_fs / timestep_fs / 20) :]

rmax = 6.3
nbins = int(rmax * nbins_per_angstrom)

element_pair = (8, 8)  # O-O distances

# Compute the average RDF
rdf_list = []
for i, atoms in enumerate(traj_eq):
    rdf, radii = get_rdf(
        atoms,
        rmax,
        nbins,
        no_dists=False,
        elements=element_pair,
    )
    rdf_list.append(rdf)
avg_rdf = np.mean(rdf_list, axis=0)
[13]:
fig, ax = plt.subplots()
ax.plot(radii, avg_rdf, lw=3)
ax.set_xlabel("Radius (A)")
ax.set_ylabel("g(r)")
ax.set_title(f"O-O RDF of water at {temperature_K}K")
ax.set_xlim([1.5, rmax])
[13]:
(1.5, 6.3)
../_images/notebooks_molecular_dynamics_23_1.png