3  Parallel Processing

3.1 TLDR

To set up, start, and end parallel processing for either tidymodels or future_map(), use the following code:

library(future) # include among other packages that you load at top of script

# set plan when ready to begin parallel processing.  
plan(multisession, workers = parallel::detectCores(logical = FALSE))

#
# do your parallel processing here
#

# end parallel processing when done  
plan(sequential) 

For parallel processing with foreach(), you also need to load the doFuture package and use %dofuture% instead of %do% or %dopar% in your foreach() code chunk. The same plan setup and ending is used as for future_map().

library(future) # include among other packages that you load at top of script
library(doFuture) 

# set plan when ready to begin parallel processing.  
plan(multisession, workers = parallel::detectCores(logical = FALSE))

# an example foreach loop in parallel mode
x <- foreach(time = c(2, 2, 2), .combine = "c") %dofuture% {
  Sys.sleep(time)
  time
}


# end parallel processing when done  
plan(sequential) 

3.2 Some introductory notes

The furrr package provides a parallel version of the map functions for iteration. The developers provide useful documentation and deep dives that are worth reading when you start using future_map() and its variants.

foreach provides an alternative to the for loop that can run sequentially or in parallel as requested. Previously, foreach was used under the hood to do resampling by fit_resamples() and tune_grid() in tidymodels. However, this has been replaced by the same backend used by furrr (future) as of tune 1.2.1.

Michael Hallquist has provided a useful and detailed overview of parallel processing. It is a good first read to orient to terms and concepts. However, it does not describe either the future package or the furrr package. It does provide a brief introduction to foreach.

Info on future ecosystem and more

Also see documentation on parallel processing in tidymodels

3.3 Begin test bed to confirm that this code works as expected

First, load tidyverse and tidymodels, then the packages for parallel processing, and then the specific packages for timing code, future_map(), and foreach().

library(tidyverse)
library(tidymodels)

# for setting up parallel processing plans for tidymodels and other parallel workflows
library(future)  
# for foreach(); load before doFuture, which otherwise attaches foreach without these exclusions
library(foreach, exclude = c("accumulate", "when"))
# for foreach() parallel processing with future backend
library(doFuture) 

library(tictoc)  # for crude timing to evaluate benefits
library(furrr) # for future_map()

# source for demo ML workflow in tidymodels
source("https://github.com/jjcurtin/lab_support/blob/main/fun_ml.R?raw=true")

3.3.1 future_map()

Here is map() (from purrr), which uses sequential processing:

tic()
x <- map(c(2, 2, 2), \(time) Sys.sleep(time))
toc()
6.014 sec elapsed

Using future_map() with a plan

# library(future)  # need this loaded at top of script to use plan()

# using detectCores from parallel package to set number of workers to number of physical cores on machine.  You may want to set this to a lower number if you want to leave some cores free for other tasks.
plan(multisession, workers = parallel::detectCores(logical = FALSE))

tic()
x <- future_map(c(2, 2, 2), \(time) Sys.sleep(time))
toc()
2.558 sec elapsed
plan(sequential)
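
The comment above about leaving some cores free can be sketched as follows. This is a minimal sketch rather than part of the tested workflow: `n_free`, `n_cores`, and `n_workers` are hypothetical names, and the fallback to a single worker is an assumption to cover `detectCores()` returning `NA` on platforms where the core count cannot be determined.

```r
library(future)

# hypothetical setting: number of physical cores to leave free for other tasks
n_free <- 1
n_cores <- parallel::detectCores(logical = FALSE)

# detectCores() can return NA; fall back to one worker in that case
# and never request fewer than one worker
n_workers <- if (is.na(n_cores)) 1 else max(1, n_cores - n_free)

plan(multisession, workers = n_workers)

# do your parallel processing here

plan(sequential)
```

On a machine with a single physical core this still requests one worker, so the code runs but without any speedup.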

3.4 tune_grid() in tidymodels

Set up data, resamples, recipe, and tuning grid. We will use 10-fold CV to tune an ElasticNet glm with a sample size of 1000 and 30 features.

# set up data
n_obs <- 1000
n_x <- 30
irr_err <- 5
d <- MASS::mvrnorm(n = n_obs, mu = rep(0,n_x), Sigma = diag(n_x)) %>% 
    magrittr::set_colnames(str_c("x", 1:n_x)) %>% 
    as_tibble() %>% 
    mutate(error = rnorm(n_obs, 0, irr_err),
           y = rowSums(across(everything()))) %>% 
    select(-error)

# recipe
rec <- recipe(y ~ ., data = d)

# 10-fold CV
set.seed(19690127)
splits <- d %>% 
  vfold_cv(v = 10, strata = "y")

# tuning grid
tune_grid <- expand_grid(penalty = exp(seq(0, 6, length.out = 200)),
                           mixture = seq(0, 1, length.out = 11))

First, let’s benchmark without parallel processing. The default for tune_grid() (and fit_resamples()) is to allow parallel processing if it is available. Since our plan is currently set to sequential, this code will not use parallel processing. Note that you can also turn off parallel processing in tidymodels functions by using control_grid(). I’ve left that code below in comments in case it is ever needed.

tic()
linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet") %>% 
  tune_grid(preprocessor = rec, 
            # control = control_grid(allow_par = FALSE), # turn off pp
            resamples = splits, grid = tune_grid, 
            metrics = metric_set(rmse))
# Tuning results
# 10-fold cross-validation using stratification 
# A tibble: 10 × 4
   splits            id     .metrics             .notes          
   <list>            <chr>  <list>               <list>          
 1 <split [900/100]> Fold01 <tibble [2,200 × 6]> <tibble [0 × 4]>
 2 <split [900/100]> Fold02 <tibble [2,200 × 6]> <tibble [0 × 4]>
 3 <split [900/100]> Fold03 <tibble [2,200 × 6]> <tibble [0 × 4]>
 4 <split [900/100]> Fold04 <tibble [2,200 × 6]> <tibble [0 × 4]>
 5 <split [900/100]> Fold05 <tibble [2,200 × 6]> <tibble [0 × 4]>
 6 <split [900/100]> Fold06 <tibble [2,200 × 6]> <tibble [0 × 4]>
 7 <split [900/100]> Fold07 <tibble [2,200 × 6]> <tibble [0 × 4]>
 8 <split [900/100]> Fold08 <tibble [2,200 × 6]> <tibble [0 × 4]>
 9 <split [900/100]> Fold09 <tibble [2,200 × 6]> <tibble [0 × 4]>
10 <split [900/100]> Fold10 <tibble [2,200 × 6]> <tibble [0 × 4]>
toc()
1523.651 sec elapsed

Now we set up a parallel plan (multisession) and run again.

# library(future)  # need this loaded at top of script to use plan()
plan(multisession, workers = parallel::detectCores(logical = FALSE))

tic()
linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet") %>% 
  tune_grid(preprocessor = rec, 
            resamples = splits, grid = tune_grid, 
            metrics = metric_set(rmse))
# Tuning results
# 10-fold cross-validation using stratification 
# A tibble: 10 × 4
   splits            id     .metrics             .notes          
   <list>            <chr>  <list>               <list>          
 1 <split [900/100]> Fold01 <tibble [2,200 × 6]> <tibble [0 × 4]>
 2 <split [900/100]> Fold02 <tibble [2,200 × 6]> <tibble [0 × 4]>
 3 <split [900/100]> Fold03 <tibble [2,200 × 6]> <tibble [0 × 4]>
 4 <split [900/100]> Fold04 <tibble [2,200 × 6]> <tibble [0 × 4]>
 5 <split [900/100]> Fold05 <tibble [2,200 × 6]> <tibble [0 × 4]>
 6 <split [900/100]> Fold06 <tibble [2,200 × 6]> <tibble [0 × 4]>
 7 <split [900/100]> Fold07 <tibble [2,200 × 6]> <tibble [0 × 4]>
 8 <split [900/100]> Fold08 <tibble [2,200 × 6]> <tibble [0 × 4]>
 9 <split [900/100]> Fold09 <tibble [2,200 × 6]> <tibble [0 × 4]>
10 <split [900/100]> Fold10 <tibble [2,200 × 6]> <tibble [0 × 4]>
toc()
207.339 sec elapsed
plan(sequential)
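
To put the two timings side by side, the speedup can be computed directly from the elapsed times reported above. This is only arithmetic on those two numbers; `seq_secs`, `par_secs`, and `speedup` are hypothetical names, and the result will differ across machines and worker counts.

```r
seq_secs <- 1523.651  # sequential elapsed time reported above
par_secs <- 207.339   # multisession elapsed time reported above

# ratio of sequential to parallel time (about 7.3 for these runs)
speedup <- seq_secs / par_secs
round(speedup, 2)
```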

3.4.1 foreach()

foreach() can run natively in sequential mode using %do%

tic()
x <- foreach(time = c(2, 2, 2), .combine = "c") %do% {
  Sys.sleep(time)
  time
}
toc()
6.02 sec elapsed

foreach() supports parallel processing with the future backend, like future_map() and tidymodels functions. However, there are two additional adjustments needed:

  • The doFuture package is needed.
  • %dofuture% replaces %do% (or the previously used %dopar%) to run code in parallel.

# library(future)  # need this loaded at top of script to use plan as before
library(doFuture) # also need doFuture to use %dofuture% with foreach()

plan(multisession, workers = parallel::detectCores(logical = FALSE))

tic()
x <- foreach(time = c(2, 2, 2), .combine = "c") %dofuture% {
  Sys.sleep(time)
  time
}
toc()
2.394 sec elapsed
plan(sequential)

There are some nuances about using random numbers inside of foreach() loops in parallel. I still need a demo on this. There is some discussion of doFuture handling this better natively. I have also read that %dorng% is recommended but this may be dated.

set.seed(20140102)
x <- foreach(time = c(2, 2, 2), .combine = "c") %dofuture% {
  Sys.sleep(time)
  rnorm(1)
}
print(x)
[1] -0.6978376  0.4016699 -0.4128653
# reproducible?
set.seed(20140102)
x <- foreach(time = c(2, 2, 2), .combine = "c") %dofuture% {
  Sys.sleep(time)
  rnorm(1)
}
print(x)
[1] -0.6978376  0.4016699 -0.4128653
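
Note that the demo above runs after plan(sequential), so it does not exercise random number generation across parallel workers. A minimal sketch of one way to check that case is below. It assumes the `.options.future = list(seed = TRUE)` option described in the doFuture documentation, which requests parallel-safe random number streams derived from the current seed. The names `x1` and `x2` and the choice of two workers are only for this illustration, and I have not verified this against the timings or results elsewhere in this chapter.

```r
library(future)
library(doFuture)  # also attaches foreach

plan(multisession, workers = 2)  # two workers chosen only for this demo

set.seed(20140102)
x1 <- foreach(i = 1:3, .combine = "c",
              .options.future = list(seed = TRUE)) %dofuture% {
  rnorm(1)
}

# reset the seed and repeat the same loop
set.seed(20140102)
x2 <- foreach(i = 1:3, .combine = "c",
              .options.future = list(seed = TRUE)) %dofuture% {
  rnorm(1)
}

plan(sequential)

# should be TRUE if the seed option behaves as documented
identical(x1, x2)
```

For future_map(), furrr offers a comparable setting through `.options = furrr_options(seed = TRUE)`.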