Flexible and consistent simulation of a matrix of Monte Carlo variates

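Consider the Monte Carlo simulation of a matrix of i.i.d. normal random variables, with one row per simulation and one column per variable. We show how rTRNG can be used to simulate only a subset of the variables and simulations in a consistent (fair-playing) way, i.e. with results identical to the corresponding entries of the full simulation.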

Consistent simulation in R

We rely on the TRNG engines exposed to R as reference classes by rTRNG.

library(rTRNG)

The mcMatR function below performs the full sequential Monte Carlo simulation of nrow normal i.i.d. samples of ncol variables using the yarn2 generator.

mcMatR <- function(nrow, ncol) {
  r <- yarn2$new(12358)
  M <- matrix(rnorm_trng(nrow * ncol, engine = r),
              nrow = nrow, ncol = ncol, byrow = TRUE)
  M
}

A second function mcSubMatR relies on jump and split operations to perform only a chunk [startRow, endRow] of simulations for a subset subCols of the variables.

mcSubMatR <- function(nrow, ncol,
                      startRow, endRow, subCols) {
  r <- yarn2$new(12358)
  # skip the draws used by the rows before startRow
  r$jump((startRow - 1) * ncol)
  nSubRows <- endRow - startRow + 1
  S <- matrix(0.0, nrow, ncol)
  S[startRow:endRow, subCols] <-
    vapply(subCols,
           function(j) {
             # copy of the engine, restricted to the draws of column j
             rj <- r$copy()
             rj$split(ncol, j)
             rnorm_trng(nSubRows, engine = rj)
           },
           FUN.VALUE = numeric(nSubRows))
  S
}
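
The index bookkeeping behind jump and split can be illustrated without TRNG. The following is a minimal sketch, assuming a hypothetical IndexStream whose k-th "draw" is simply k, so that each value reveals the position it occupies in the sequential stream. The names IndexStream and subColumnPositions are illustrative only and are not part of rTRNG or TRNG.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a random stream: the k-th "draw" is k itself.
// This is NOT a TRNG engine; it only mimics the jump/split bookkeeping.
struct IndexStream {
  std::size_t next = 0;    // position of the next draw in the original stream
  std::size_t stride = 1;  // distance between consecutive draws

  // Skip n draws of the current (possibly already split) stream.
  void jump(std::size_t n) { next += n * stride; }

  // Keep every p-th draw, starting from the s-th (0-based, C++ convention).
  void split(std::size_t p, std::size_t s) {
    assert(s < p);
    next += s * stride;
    stride *= p;
  }

  std::size_t draw() {
    std::size_t k = next;
    next += stride;
    return k;
  }
};

// Stream positions consumed for column j (0-based) in rows
// [startRow, endRow] (1-based, inclusive) of a row-major fill with ncol columns.
std::vector<std::size_t> subColumnPositions(std::size_t ncol,
                                            std::size_t startRow,
                                            std::size_t endRow,
                                            std::size_t j) {
  IndexStream r;
  r.jump((startRow - 1) * ncol);  // skip the rows before the chunk
  r.split(ncol, j);               // keep only the draws of column j
  std::vector<std::size_t> out;
  for (std::size_t i = startRow; i <= endRow; ++i) out.push_back(r.draw());
  return out;
}
```

With ncol = 5, startRow = 4, endRow = 6 and j = 1 (column 2 in R), subColumnPositions returns the 0-based positions 16, 21 and 26, which are exactly the positions (i - 1) * ncol + j that a row-major sequential fill uses for rows i = 4, 5, 6 of that column.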

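Since yarn2 is a parallel generator, the jump and split operations give access to exactly the draws that the full sequential simulation assigns to the selected rows and columns. The sub-simulation obtained via mcSubMatR is therefore consistent with the full sequential simulation.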

rows <- 9
cols <- 5
M <- mcMatR(rows, cols)
startRow <- 4
endRow <- 6
subCols <- c(2, 4:5)
S <- mcSubMatR(rows, cols,
               startRow, endRow, subCols)
identical(M[startRow:endRow, subCols],
          S[startRow:endRow, subCols])
## [1] TRUE

	M.1	M.2	M.3	M.4	M.5	S.2	S.4	S.5
1	0.20256	-0.41401	-0.76749	-0.33344	0.10718	0.00000	0.00000	0.00000
2	-2.76439	-1.15524	-0.39394	-1.16604	1.61759	0.00000	0.00000	0.00000
3	-0.42199	-1.13148	-0.30448	0.12741	-0.16111	0.00000	0.00000	0.00000
4	-0.94448	-1.86384	-1.03244	-0.41155	1.31009	-1.86384	-0.41155	1.31009
5	-0.09614	-0.16366	-0.31964	0.87053	0.77996	-0.16366	0.87053	0.77996
6	1.42049	-0.73062	-1.19459	-1.02146	0.07202	-0.73062	-1.02146	0.07202
7	-0.61202	0.02906	-0.29100	0.10095	-0.74647	0.00000	0.00000	0.00000
8	1.10246	-0.50507	0.01286	0.63140	-1.28893	0.00000	0.00000	0.00000
9	-0.08732	-0.32545	0.29099	0.62003	-0.94617	0.00000	0.00000	0.00000

Consistent simulation with Rcpp

We now use Rcpp to define functions mcMatRcpp and mcSubMatRcpp for the full sequential simulation and the sub-simulation, respectively. The Rcpp::depends attribute makes sure the TRNG library and headers shipped with rTRNG are available to the C++ code. Moreover, Rcpp::plugins(cpp11) enforces the C++11 standard required by TRNG >= 4.22.

// [[Rcpp::depends(rTRNG)]]
// TRNG >= 4.22 requires C++11
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <trng/normal_dist.hpp>
#include <trng/yarn2.hpp>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix mcMatRcpp(const int nrow, const int ncol) {
  NumericMatrix M(nrow, ncol);
  trng::yarn2 r(12358);
  trng::normal_dist<> normal(0.0, 1.0);
  for (int i = 0; i < nrow; i++) {
    for (int j = 0; j < ncol; j++) {
      M(i, j) = normal(r);
    }
  }
  return M;
}

// [[Rcpp::export]]
NumericMatrix mcSubMatRcpp(const int nrow, const int ncol,
                           const int startRow,
                           const int endRow,
                           const IntegerVector subCols) {
  NumericMatrix M(nrow, ncol);
  trng::yarn2 r(12358), rj;
  trng::normal_dist<> normal(0.0, 1.0);
  // skip the draws used by the rows before startRow
  r.jump((startRow - 1) * ncol);
  for (IntegerVector::const_iterator jSub = subCols.begin();
       jSub != subCols.end(); ++jSub) {
    int j = *jSub - 1;  // 0-based column index
    rj = r;
    rj.split(ncol, j);  // keep only the draws of column j
    for (int i = startRow - 1; i < endRow; i++) {
      M(i, j) = normal(rj);
    }
  }
  return M;
}
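
Consistency is possible because jump and split act exactly on the generator state. As a rough illustration only, the sketch below implements both operations for a toy linear congruential generator. ToyLcg, fullMatrix and subColumn are hypothetical names introduced here; the recurrence is not the one used by yarn2, and the toy generator is not meant to have adequate statistical quality.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy 64-bit LCG x -> a*x + c (mod 2^64), returning the state before each step.
// Illustration only: this is neither TRNG's yarn2 nor a production generator.
struct ToyLcg {
  std::uint64_t a = 6364136223846793005ULL;
  std::uint64_t c = 1442695040888963407ULL;
  std::uint64_t x;

  explicit ToyLcg(std::uint64_t seed) : x(seed) {}

  std::uint64_t operator()() {
    std::uint64_t out = x;
    x = a * x + c;
    return out;
  }

  // Coefficients (A, C) of the n-fold composition of x -> a*x + c,
  // obtained by repeated squaring.
  void power(std::uint64_t n, std::uint64_t& A, std::uint64_t& C) const {
    A = 1;
    C = 0;                         // start from the identity map
    std::uint64_t pa = a, pc = c;  // current power-of-two step
    while (n > 0) {
      if (n & 1) {
        C = pa * C + pc;           // apply the current step after (A, C)
        A = pa * A;
      }
      pc = pa * pc + pc;           // compose the step with itself
      pa = pa * pa;
      n >>= 1;
    }
  }

  // Advance the state by n draws without generating them.
  void jump(std::uint64_t n) {
    std::uint64_t A, C;
    power(n, A, C);
    x = A * x + C;
  }

  // Leapfrog: keep every p-th draw, starting from the s-th (0-based).
  void split(std::uint64_t p, std::uint64_t s) {
    assert(s < p);
    jump(s);                       // move to the first draw of the sub-stream
    std::uint64_t A, C;
    power(p, A, C);                // one new step equals p old steps
    a = A;
    c = C;
  }
};

// Full row-major simulation of an nrow x ncol matrix of raw draws.
std::vector<std::uint64_t> fullMatrix(std::size_t nrow, std::size_t ncol) {
  ToyLcg r(12358);
  std::vector<std::uint64_t> M(nrow * ncol);
  for (auto& m : M) m = r();
  return M;
}

// Only rows [startRow, endRow] (1-based, inclusive) of column j (0-based).
std::vector<std::uint64_t> subColumn(std::size_t ncol, std::size_t startRow,
                                     std::size_t endRow, std::size_t j) {
  ToyLcg r(12358);
  r.jump((startRow - 1) * ncol);
  r.split(ncol, j);
  std::vector<std::uint64_t> out;
  for (std::size_t i = startRow; i <= endRow; ++i) out.push_back(r());
  return out;
}
```

For any valid arguments, the k-th value returned by subColumn(ncol, startRow, endRow, j) equals the element at row-major index (startRow - 1 + k) * ncol + j of fullMatrix(nrow, ncol), which mirrors the relationship between mcSubMatRcpp and mcMatRcpp.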

As seen above for the R case, consistency of the simulation obtained via mcSubMatRcpp with the full sequential simulation is guaranteed.

rows <- 9
cols <- 5
startRow <- 4
endRow <- 6
subCols <- c(2, 4:5)
M <- mcMatRcpp(rows, cols)
S <- mcSubMatRcpp(rows, cols, startRow, endRow, subCols)
identical(M[startRow:endRow, subCols],
          S[startRow:endRow, subCols])
## [1] TRUE

	M.1	M.2	M.3	M.4	M.5	S.2	S.4	S.5
1	0.20256	-0.41401	-0.76749	-0.33344	0.10718	0.00000	0.00000	0.00000
2	-2.76439	-1.15524	-0.39394	-1.16604	1.61759	0.00000	0.00000	0.00000
3	-0.42199	-1.13148	-0.30448	0.12741	-0.16111	0.00000	0.00000	0.00000
4	-0.94448	-1.86384	-1.03244	-0.41155	1.31009	-1.86384	-0.41155	1.31009
5	-0.09614	-0.16366	-0.31964	0.87053	0.77996	-0.16366	0.87053	0.77996
6	1.42049	-0.73062	-1.19459	-1.02146	0.07202	-0.73062	-1.02146	0.07202
7	-0.61202	0.02906	-0.29100	0.10095	-0.74647	0.00000	0.00000	0.00000
8	1.10246	-0.50507	0.01286	0.63140	-1.28893	0.00000	0.00000	0.00000
9	-0.08732	-0.32545	0.29099	0.62003	-0.94617	0.00000	0.00000	0.00000

Consistent parallel simulation with RcppParallel

The technique used above for generating a subset of the simulations can also be exploited to run the simulation in parallel in C++. We embed the body of mcSubMatRcpp into an RcppParallel::Worker, so that chunks of rows of the Monte Carlo simulation are processed in parallel, for any subset subCols of the variables.

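// In addition to the attributes and headers of the Rcpp listing above
// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>
using namespace RcppParallel;

struct MCMatWorker : public Worker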
{
  RMatrix<double> M;
  const RVector<int> subCols;

  // constructor
  MCMatWorker(NumericMatrix M,
              const IntegerVector subCols)
    : M(M), subCols(subCols) {}

  // operator processing an exclusive range of indices
  void operator()(std::size_t begin, std::size_t end) {
    trng::yarn2 r(12358), rj;
    trng::normal_dist<> normal(0.0, 1.0);
    r.jump((int)begin * M.ncol());
    for (IntegerVector::const_iterator jSub = subCols.begin();
         jSub < subCols.end(); jSub++) {
      int j = *jSub - 1;
      rj = r;
      rj.split(M.ncol(), j);
      for (int i = (int)begin; i < (int)end; i++) {
        M(i, j) = normal(rj);
      }
    }
  }
};

// [[Rcpp::export]]
NumericMatrix mcMatRcppParallel(const int nrow, const int ncol,
                                const IntegerVector subCols) {
  NumericMatrix M(nrow, ncol);
  MCMatWorker w(M, subCols);
  parallelFor(0, M.nrow(), w);
  return M;
}
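
The worker relies on each chunk of rows being computable independently of the others, so that the way rows are partitioned does not affect the result. The following TRNG-free sketch illustrates this under simplifying assumptions: HashedStream is a hypothetical counter-based stream whose k-th draw is an arbitrary bit-mixing of k, and the chunks are processed in a plain loop (last chunk first) rather than by threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy counter-based stream: the k-th draw is a fixed bit-mixing of k.
// Illustration only; this is not a TRNG engine.
struct HashedStream {
  std::uint64_t next = 0;    // position of the next draw
  std::uint64_t stride = 1;  // distance between consecutive draws

  void jump(std::uint64_t n) { next += n * stride; }
  void split(std::uint64_t p, std::uint64_t s) {
    assert(s < p);
    next += s * stride;
    stride *= p;
  }
  std::uint64_t operator()() {
    std::uint64_t z = next + 0x9E3779B97F4A7C15ULL;
    next += stride;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
  }
};

using Matrix = std::vector<std::vector<std::uint64_t>>;

// Reference: plain row-major sequential fill.
Matrix simulateSequentially(std::size_t nrow, std::size_t ncol) {
  Matrix M(nrow, std::vector<std::uint64_t>(ncol, 0));
  HashedStream r;
  for (std::size_t i = 0; i < nrow; ++i)
    for (std::size_t j = 0; j < ncol; ++j) M[i][j] = r();
  return M;
}

// Fill rows [begin, end) (0-based), following the structure of the worker.
void fillRows(Matrix& M, std::size_t begin, std::size_t end) {
  const std::size_t ncol = M.empty() ? 0 : M[0].size();
  HashedStream r;
  r.jump(begin * ncol);  // skip the rows before the chunk
  for (std::size_t j = 0; j < ncol; ++j) {
    HashedStream rj = r;
    rj.split(ncol, j);   // keep only the draws of column j
    for (std::size_t i = begin; i < end; ++i) M[i][j] = rj();
  }
}

// Process the rows in chunks of at most `grain` rows, last chunk first.
Matrix simulateInChunks(std::size_t nrow, std::size_t ncol, std::size_t grain) {
  assert(grain > 0);
  Matrix M(nrow, std::vector<std::uint64_t>(ncol, 0));
  std::size_t end = nrow;
  while (end > 0) {
    std::size_t begin = end > grain ? end - grain : 0;
    fillRows(M, begin, end);
    end = begin;
  }
  return M;
}
```

For a 9 x 5 matrix, simulateInChunks(9, 5, grain) returns the same matrix as simulateSequentially(9, 5) for any positive grain. Since no chunk depends on the state left behind by another, the chunks could equally be dispatched to separate threads, which is the job parallelFor performs for MCMatWorker.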

The parallel nature of the yarn2 generator ensures that the parallel simulation plays fair, i.e. it is consistent with the sequential simulation regardless of how the rows are split into chunks.

M <- mcMatRcpp(rows, cols)
Mp <- mcMatRcppParallel(rows, cols, seq_len(ncol(M)))
identical(M, Mp)
## [1] TRUE

Similarly, we can achieve a consistent parallel simulation of a subset of the variables only.

Sp <- mcMatRcppParallel(rows, cols, subCols)
identical(M[, subCols],
          Sp[, subCols])
## [1] TRUE

	M.1	M.2	M.3	M.4	M.5	Sp.2	Sp.4	Sp.5
1	0.20256	-0.41401	-0.76749	-0.33344	0.10718	-0.41401	-0.33344	0.10718
2	-2.76439	-1.15524	-0.39394	-1.16604	1.61759	-1.15524	-1.16604	1.61759
3	-0.42199	-1.13148	-0.30448	0.12741	-0.16111	-1.13148	0.12741	-0.16111
4	-0.94448	-1.86384	-1.03244	-0.41155	1.31009	-1.86384	-0.41155	1.31009
5	-0.09614	-0.16366	-0.31964	0.87053	0.77996	-0.16366	0.87053	0.77996
6	1.42049	-0.73062	-1.19459	-1.02146	0.07202	-0.73062	-1.02146	0.07202
7	-0.61202	0.02906	-0.29100	0.10095	-0.74647	0.02906	0.10095	-0.74647
8	1.10246	-0.50507	0.01286	0.63140	-1.28893	-0.50507	0.63140	-1.28893
9	-0.08732	-0.32545	0.29099	0.62003	-0.94617	-0.32545	0.62003	-0.94617

Riccardo Porreca, Roland Schmid

2022-03-14