class gpr::RFFModel

Overview

Random Fourier Features (RFF) approximation to the SE kernel.

#include <RFFModel.h>
 
class RFFModel {
public:
    // methods
 
    void build(SexpatCF& kernel, const AtomsConfiguration& conf_info, const Observation& training_data, int D_rff, double noise_var = 1e-6, double grad_noise_var = 1e-4);
    void calculatePotential(Observation& image) const;
    void calculateVariance(Observation& image) const;
    bool isBuilt() const;
};

Detailed Documentation

Random Fourier Features (RFF) approximation to the squared-exponential (SE) kernel.

Approximates the SexpatCF kernel with D_rff random cosine features, then solves Bayesian linear regression in the feature space. Training costs O(N * D * D_rff + D_rff^3) and prediction costs O(D_rff * D) per test point, compared to O((N*(D+1))^3) for the exact GP.
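
To make the "random cosine features plus Bayesian linear regression" idea concrete, here is a minimal sketch on a 1-D toy problem. It is not the RFFModel implementation: it fits energies only (no gradients, no descriptor), and ToyRff and fit_toy_rff are hypothetical names introduced for illustration. It assumes unit prior variance on the weights, so the posterior mean solves (Phi^T Phi + noise_var * I) alpha = Phi^T y.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Toy 1-D RFF regressor (hypothetical, energies only).
struct ToyRff {
    std::vector<double> w, b, alpha;  // frequencies, phases, posterior-mean weights

    // Feature map phi_j(x) = sqrt(2/m) * cos(w_j * x + b_j).
    std::vector<double> features(double x) const {
        const std::size_t m = w.size();
        const double scale = std::sqrt(2.0 / static_cast<double>(m));
        std::vector<double> phi(m);
        for (std::size_t j = 0; j < m; ++j) phi[j] = scale * std::cos(w[j] * x + b[j]);
        return phi;
    }

    // Posterior-mean prediction phi(x) . alpha.
    double predict(double x) const {
        const std::vector<double> phi = features(x);
        double y = 0.0;
        for (std::size_t j = 0; j < phi.size(); ++j) y += phi[j] * alpha[j];
        return y;
    }
};

// Fit with m random features; frequencies are drawn from N(0, 2/l^2).
ToyRff fit_toy_rff(const std::vector<double>& x, const std::vector<double>& y,
                   std::size_t m, double lengthscale, double noise_var, unsigned seed) {
    assert(x.size() == y.size() && m > 0 && noise_var > 0.0);
    const double pi = std::acos(-1.0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> freq(0.0, std::sqrt(2.0) / lengthscale);
    std::uniform_real_distribution<double> phase(0.0, 2.0 * pi);
    ToyRff model;
    model.w.resize(m);
    model.b.resize(m);
    for (std::size_t j = 0; j < m; ++j) {
        model.w[j] = freq(rng);
        model.b[j] = phase(rng);
    }

    // Accumulate the lower triangle of A = Phi^T Phi + noise_var * I and rhs = Phi^T y.
    std::vector<double> a(m * m, 0.0), rhs(m, 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::vector<double> phi = model.features(x[i]);
        for (std::size_t r = 0; r < m; ++r) {
            rhs[r] += phi[r] * y[i];
            for (std::size_t c = 0; c <= r; ++c) a[r * m + c] += phi[r] * phi[c];
        }
    }
    for (std::size_t j = 0; j < m; ++j) a[j * m + j] += noise_var;

    // In-place Cholesky factorization A = L L^T (lower triangle holds L).
    for (std::size_t r = 0; r < m; ++r) {
        for (std::size_t c = 0; c <= r; ++c) {
            double s = a[r * m + c];
            for (std::size_t k = 0; k < c; ++k) s -= a[r * m + k] * a[c * m + k];
            assert(r != c || s > 0.0);  // A must be positive definite
            a[r * m + c] = (r == c) ? std::sqrt(s) : s / a[c * m + c];
        }
    }

    // Forward solve L z = rhs, then backward solve L^T alpha = z.
    model.alpha = rhs;
    for (std::size_t r = 0; r < m; ++r) {
        for (std::size_t k = 0; k < r; ++k) model.alpha[r] -= a[r * m + k] * model.alpha[k];
        model.alpha[r] /= a[r * m + r];
    }
    for (std::size_t r = m; r-- > 0;) {
        for (std::size_t k = r + 1; k < m; ++k) model.alpha[r] -= a[k * m + r] * model.alpha[k];
        model.alpha[r] /= a[r * m + r];
    }
    return model;
}
```

In this sketch the D_rff^3 term of the training cost corresponds to the Cholesky factorization, and a prediction only needs one feature evaluation and one dot product.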

Written per feature, the kernel is k = sigma^2 * exp(-sum delta^2 / l^2), i.e. an SE kernel with effective squared lengthscale l^2/2, so its spectral density is N(0, 2/l^2). Per-pairtype ARD lengthscales are applied: each feature f has pairtype p(f), and W[:,f] ~ N(0, 2/l_p^2).
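
The sampling rule above can be sketched as follows. This is an illustration rather than the library's code: sample_frequencies is a hypothetical helper, and the row-major D_rff x D layout (one row per random feature, one column per descriptor feature) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, not part of RFFModel: draw a D_rff x D frequency
// matrix (row-major) whose column f has variance 2 / l_{p(f)}^2.
std::vector<double> sample_frequencies(const std::vector<int>& pairtype,        // p(f), one per descriptor feature
                                       const std::vector<double>& lengthscale,  // l_p, one per pair type
                                       std::size_t d_rff, unsigned seed) {
    const std::size_t d = pairtype.size();
    std::mt19937 rng(seed);
    std::normal_distribution<double> standard_normal(0.0, 1.0);
    std::vector<double> w(d_rff * d);
    for (std::size_t r = 0; r < d_rff; ++r) {
        for (std::size_t f = 0; f < d; ++f) {
            const int p = pairtype[f];
            assert(p >= 0 && static_cast<std::size_t>(p) < lengthscale.size());
            // Standard deviation sqrt(2) / l_p gives variance 2 / l_p^2.
            const double stddev = std::sqrt(2.0) / lengthscale[p];
            w[r * d + f] = stddev * standard_normal(rng);
        }
    }
    return w;
}
```

For example, with pairtype = {0, 1} and lengthscale = {1.0, 2.0}, the first column is drawn with variance 2 and the second with variance 0.5.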

Workflow:

  1. Train exact GP on subset to optimize hyperparameters

  2. Call build() with trained kernel and all training data

  3. Call calculatePotential() / calculateVariance() for prediction

Note

The factor-of-2 convention: dist_at computes d = sqrt(sum 2*s2*delta^2), and the kernel is k = sigma^2 * exp(-0.5*d^2). Expanding: k = sigma^2 * exp(-sum s2*delta^2), where s2 = 1/l^2. This is an SE kernel with effective squared lengthscale 1/(2*s2) = l^2/2 per feature, so the spectral density is N(0, 2*s2*I) = N(0, 2/l^2 * I).
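
The convention in this note can be checked numerically. The sketch below uses a single shared s2 (no per-pairtype ARD) and two hypothetical helpers, exact_kernel and rff_kernel_estimate. It relies on the identity E[cos(w . delta)] = exp(-0.5 * delta^T Sigma delta) for w ~ N(0, Sigma), which with Sigma = 2*s2*I reproduces exp(-sum s2*delta^2).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Exact kernel under the documented convention:
// d = sqrt(sum 2*s2*delta^2), k = sigma2 * exp(-0.5 * d^2).
double exact_kernel(const std::vector<double>& delta, double s2, double sigma2) {
    double d2 = 0.0;
    for (double v : delta) d2 += 2.0 * s2 * v * v;
    return sigma2 * std::exp(-0.5 * d2);
}

// Monte Carlo estimate sigma2 * mean(cos(w . delta)) with w ~ N(0, 2*s2*I).
double rff_kernel_estimate(const std::vector<double>& delta, double s2, double sigma2,
                           std::size_t n_samples, unsigned seed) {
    assert(n_samples > 0 && s2 > 0.0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> freq(0.0, std::sqrt(2.0 * s2));
    double sum = 0.0;
    for (std::size_t i = 0; i < n_samples; ++i) {
        double phase = 0.0;
        for (double v : delta) phase += freq(rng) * v;  // fresh w for every sample
        sum += std::cos(phase);
    }
    return sigma2 * sum / static_cast<double>(n_samples);
}
```

With enough samples the estimate converges to the exact value. Drawing w from N(0, s2*I) instead would converge to exp(-0.5 * sum s2*delta^2), which is the mismatch the factor of 2 guards against.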

See also:

InverseDistanceDescriptor for the feature map from Cartesian coordinates to inverse distances.

SexpatCF for the exact kernel this approximates.

Methods

void build(SexpatCF& kernel, const AtomsConfiguration& conf_info, const Observation& training_data, int D_rff, double noise_var = 1e-6, double grad_noise_var = 1e-4)

Build the RFF model from trained hyperparameters and data.

Parameters:

kernel

Trained SexpatCF with optimized magnSigma2 and lengthScale values.

conf_info

Atom configuration (for descriptor computation).

training_data

All training observations: positions (R), energies (E), and gradients (G).

D_rff

Number of random features.

noise_var

Energy observation noise variance.

grad_noise_var

Gradient observation noise variance.

void calculatePotential(Observation& image) const

Predict energy and gradient at test configurations.

Fills image.E and image.G from the RFF model.

Parameters:

image

Observation with R set; E and G are filled.

void calculateVariance(Observation& image) const

Predict energy/gradient variance at test configurations.

Fills image.E and image.G with predictive variances.

Parameters:

image

Observation with R set; E and G filled with variances.