Progress for Multi-Processing

Robert M Flight

2024-07-23

Multi-Processing

It is often useful to be able to also monitor the progress of jobs that are carried out using multi-processing. There are two functions used for that in this package: set_progress_mp and watch_progress_mp that are used in conjunction with each other.

The code below uses the parallel package, but in theory, it should work with any of the parallel architectures available for R.

Setting Up

To set up progress monitoring, you will again need to add a progress argument to your function. Notice that you can still use the update_progress function to update your progress correctly.

library(knitrProgressBar)
library(parallel)

arduously_long_nchar <- function(input_var, .pb=NULL) {
  
  update_progress(.pb) # this is a function provided by the package
  
  Sys.sleep(0.1)
  
  nchar(input_var)
  
}

Then you set-up the actual progress indicator function using set_progress_mp:

set_kpb <- set_progress_mp("progress_file.log")

This object will write a “.” to the file progress_file.log each time it is called. Therefore, the file location should be accessible to all of the threads.

Running

To use this, start your multi-processing code.

options(mc.cores = 2)
mclapply(seq(1, 100), arduously_long_nchar, .pb = set_kpb)

And then in a separate R process, setup the watcher:

kpb_watch <- watch_progress_mp(100, watch_location = "progress_file.log")
kpb_watch

In this separate process, you should see the progress indicator keeping up with the multi-processing code.