MultiScaleOrthogonalRFF

class franken.rf.heads.MultiScaleOrthogonalRFF(input_dim, num_random_features=2**10, num_species=None, chemically_informed_ratio=None, use_offset=True, length_scale_low=1.0, length_scale_high=10.0, length_scale_num=4, rng_seed=None)

Bases: RandomFeaturesHead

A multi-scale version of franken.rf.heads.OrthogonalRFF that splits the available random features among multiple length-scales. This approximates a mixture of Gaussian kernels at different scales, which simplifies hyper-parameter tuning. Each component of the mixture is a Gaussian kernel with its own length-scale \(\ell\):

\[\text{exp}\left(-\frac{\| x - y \|^{2}}{2\ell^{2}}\right)\]
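The mixture idea can be checked numerically. The following is a minimal sketch, not the library's implementation: it assumes plain i.i.d. Gaussian frequencies rather than orthogonal ones, equal-sized blocks, and an equally weighted mixture. The helper names `sample_block`, `multiscale_features` and `gaussian_kernel` are hypothetical and exist only for this illustration.

```python
import math
import random


def sample_block(dim, length_scale, num_features, rng):
    """Sample frequencies w ~ N(0, I / length_scale^2) and offsets b ~ U[0, 2*pi)."""
    return [
        ([rng.gauss(0.0, 1.0 / length_scale) for _ in range(dim)],
         rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(num_features)
    ]


def multiscale_features(x, blocks):
    """Concatenate cos(w.x + b) features from every block.

    The sqrt(2 / total) scaling makes the inner product of two feature
    vectors an estimate of the average of the per-scale Gaussian kernels
    (assuming equal block sizes).
    """
    total = sum(len(block) for block in blocks)
    scale = math.sqrt(2.0 / total)
    return [
        scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for block in blocks
        for (w, b) in block
    ]


def gaussian_kernel(x, y, length_scale):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * length_scale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


rng = random.Random(0)
length_scales = [1.0, 4.0, 7.0, 10.0]  # illustrative values
blocks = [sample_block(3, ls, 2000, rng) for ls in length_scales]

x, y = [0.5, -1.0, 2.0], [1.5, 0.0, 1.0]
fx, fy = multiscale_features(x, blocks), multiscale_features(y, blocks)
approx = sum(p * q for p, q in zip(fx, fy))
exact = sum(gaussian_kernel(x, y, ls) for ls in length_scales) / len(length_scales)
print(abs(approx - exact))  # approximation error; shrinks as features are added
```

In this sketch the concatenated feature vector has a fixed total size, so adding more length-scales leaves fewer features per scale.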

The scales are specified with the arguments length_scale_low, length_scale_high and length_scale_num. The available num_random_features random features are subdivided into length_scale_num blocks, one per length-scale, with the length-scales equally spaced between length_scale_low and length_scale_high. Each length-scale therefore uses only a fraction of the total random features, but in practice we found this to have a very small impact on overall accuracy. This kernel can be seen as implicitly doing a grid-search over linearly spaced length-scales.
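As a rough illustration of the subdivision described above, the sketch below computes a linearly spaced grid of length-scales and a near-equal split of the feature budget. It assumes both endpoints are included in the grid and that any remainder is spread over the first blocks; the library may handle either detail differently. The function names are hypothetical.

```python
def length_scale_grid(low, high, num):
    """Return `num` equally spaced length-scales from `low` to `high` inclusive."""
    if num == 1:
        return [float(low)]
    step = (high - low) / (num - 1)
    return [low + i * step for i in range(num)]


def split_features(total, num_blocks):
    """Split `total` features into `num_blocks` blocks of near-equal size."""
    base, extra = divmod(total, num_blocks)
    # The first `extra` blocks receive one additional feature (an assumption).
    return [base + (1 if i < extra else 0) for i in range(num_blocks)]


# Default arguments from the class signature.
scales = length_scale_grid(1.0, 10.0, 4)
block_sizes = split_features(2**10, 4)
print(scales)       # [1.0, 4.0, 7.0, 10.0]
print(block_sizes)  # [256, 256, 256, 256]
```

With the defaults, each of the four length-scales would receive 256 of the 1024 random features under these assumptions.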

Parameters:
  • input_dim (int) – Dimensionality of the input features.

  • num_random_features (int) – The number of random features to use in the feature mapping. Defaults to \(2^{10} = 1024\).

  • num_species (int | None) – The number of chemical species for which the kernel is computed. This parameter is relevant for systems with multiple chemical species. Defaults to None.

  • chemically_informed_ratio (float | None) – The relative weight of chemically-informed kernels with respect to the all-species kernel. Ignored if num_species is None. Defaults to None.

  • use_offset (bool) – A flag indicating whether to use an offset in the random feature generation. Using an offset reduces the number of random features by half but increases variance. Defaults to True.

  • length_scale_low (float) – The lower end of the interval of length-scales considered. Defaults to 1.0.

  • length_scale_high (float) – The upper end of the interval of length-scales considered. Defaults to 10.0.

  • length_scale_num (int) – The number of different length-scales, equally spaced between length_scale_low and length_scale_high, that are considered in the multi-scale approximation. Defaults to 4.

  • rng_seed (int | None) – A seed for the random number generator used in generating random features. Setting this ensures reproducibility of results. Defaults to None.
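One way to read the use_offset description is as a choice between the two standard random-feature constructions: a single feature cos(w·x + b) per frequency with a random offset b, or a pair [cos(w·x), sin(w·x)] per frequency with no offset. The sketch below compares the two for a single length-scale. It is an assumption that the library follows exactly these constructions. The code uses plain Gaussian frequencies rather than orthogonal ones, and all helper names are hypothetical.

```python
import math
import random


def features_with_offset(x, freqs, offsets):
    """One feature per frequency: sqrt(2 / D) * cos(w.x + b)."""
    d = len(freqs)
    return [
        math.sqrt(2.0 / d) * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b in zip(freqs, offsets)
    ]


def features_without_offset(x, freqs):
    """Two features per frequency: sqrt(1 / D) * [cos(w.x), sin(w.x)]."""
    d = len(freqs)
    proj = [sum(wi * xi for wi, xi in zip(w, x)) for w in freqs]
    norm = math.sqrt(d)
    return [math.cos(p) / norm for p in proj] + [math.sin(p) / norm for p in proj]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(42)  # a fixed seed makes the sampled features reproducible
length_scale, dim, num_freqs = 2.0, 3, 4000
freqs = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(dim)]
         for _ in range(num_freqs)]
offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_freqs)]

x, y = [0.5, -1.0, 2.0], [1.5, 0.0, 1.0]
exact = math.exp(-3.0 / (2.0 * length_scale ** 2))  # here ||x - y||^2 = 3

phi_x = features_with_offset(x, freqs, offsets)
psi_x = features_without_offset(x, freqs)
k_offset = dot(phi_x, features_with_offset(y, freqs, offsets))
k_paired = dot(psi_x, features_without_offset(y, freqs))
print(len(phi_x), len(psi_x))  # 4000 8000
print(abs(k_offset - exact), abs(k_paired - exact))
```

Both variants estimate the same Gaussian kernel. For the same number of sampled frequencies, the offset variant produces half as many features.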

feature_map(h, atomic_numbers=None, batch_ids=None)

Computes the random-feature map for a given configuration h.

Parameters:
  • h (torch.Tensor) – Descriptors for a single configuration, with shape [natoms, descriptors].

  • atomic_numbers (torch.Tensor) – Atomic numbers for a single configuration, with shape [natoms].