7  Flowcharts

This is a short demo for making a flowchart in R.

Here are two links to more information on the DiagrammeR function:
https://bookdown.org/yihui/rmarkdown-cookbook/diagrams.html
https://rich-iannone.github.io/DiagrammeR/graphviz_and_mermaid.html

Here is a basic example of using DiagrammeR to create a flowchart of participant retention.
The process has 3 main parts:
1. Define the nodes
2. Define the edges
3. Define node labels (content of nodes) using footnotes.

Note: for this simple flowchart I am manually entering the number of particpants as a string. You can also use data in your dataset as content for the label.
ex. ‘Starting Sample = 216’ could also be str_c(‘Starting Sample =’, nrow(sample))


DiagrammeR::grViz("
  digraph {
  graph []
  
  node [fontname = Helvetica, shape = rectangle, fixedsize = true, width = 2.5]
  a [label = '@@1']
  b [label = '@@2']
  c [label = '@@3']
  d [label = '@@4']
  e [label = '@@5']
  f [label = '@@6']
  g [label = '@@7']

  a -> b -> d -> f
  a -> c
  b -> e
  d -> g
  }
  
  [1]: 'Participants screened \\n n = 300'
  [2]: 'Participants eligible \\n n = 250'
  [3]: 'Participants ineligible \\n n = 50'
  [4]: 'Participants enrolled \\n n = 215'
  [5]: 'Participants did not enroll \\n n = 35'
  [6]: 'Participants completed study \\n n= 175'
  [7]: 'Participants discontinued \\n n = 40'
  ") 

Here is a more complicated example of participant enrollment flow code, demonstrating how to code a) edges (lines) that connect to more than one node (box), b) more descriptive node labels, and c) manual line breaks in the node text.

grViz("digraph {

graph[layout = dot, rankdir = TB]  #top to bottom vs LR for left to right

# general node definition
node [shape = rectangle, style = filled, fillcolor = LightBlue, fontsize=14]

#specific nodes with label text specified later
reddit [label = '@@1']
community [label = '@@2']
clinician [label = '@@3']
craigslist [label = '@@15']
unknown [label = '@@4']
screen [width = 3,label = '@@5']
screen_fail [shape = octagon, fillcolor= Red, height = 1.3, width = 1, label = '@@6']
screen_pass [width = 4, label = '@@7']
fail_reasons [label = '@@8']
enrolled [label = '@@9']
not_enrolled [shape = octagon, fillcolor= Red, height = 1.5, label = '@@10']
not_enrolled_why [label = '@@11']
on_study [label = '@@12']
off_study [label = '@@13']
disposition [label = '@@14']


#edge definitions (lines between boxes)
{reddit community craigslist clinician unknown} -> screen -> {screen_fail screen_pass}
screen_fail -> fail_reasons
screen_pass -> {not_enrolled enrolled}
not_enrolled -> not_enrolled_why
enrolled -> {on_study off_study}
off_study -> disposition
}


[1]: str_c('Reddit \\n(n = ', n_reddit, ')')
[2]: str_c('Community \\n(n = ', n_community, ')')
[3]: str_c('Clinician \\n(n = ', n_clinician, ')')
[4]: str_c('Unknown \\n(n = ', n_unknown, ')')
[5]: str_c('Screened \\n(n = ', n_reddit+n_community+n_clinician+n_craigsl+n_unknown, ')')
[6]: str_c('Screen fail \\n(n = ', n_ineligible, ')')
[7]: str_c('Screen pass \\n(n = ', n_eligible, ')')
[8]: str_c('Screen fail reasons: \\n', 'No Android (n = ', sum(the_hist$no_android), ')\\n', 'Internet Number (n = ', sum(the_hist$yes_internet_num), ')\\n', 'Multiple or unreliable phones (n = ', sum(the_hist$maintain_phone, the_hist$multiple_phones), ')\\n', 'Under 18 (n = ', sum(the_hist$under_18), ')\\n', 'No MAT (n = ', sum(the_hist$no_mat), ')\\n', 'Not MAT adherent (n = ', sum(the_hist$daily_not_adhere, the_hist$monthly_not_adhere), ')\\n', 'MAT too short (n = ', sum(the_hist$mat_too_short), ')\\n', 'MAT too long (n = ', sum(the_hist$mat_too_long), ')\\n', 'Relapsed (n = ', sum(the_hist$relapse), ')\\n\\n', mult_inel, ' participants screened out\\non 2+ conditions')
[9]: str_c('Enrolled \\n(n = ', nrow(all_digital), ')\\n', 'Scheduled (n = ', n_eligible_pending, ')')
[10]: str_c('Not enrolled \\n(n = ', not_enrolled_digital, ')')
[11]: str_c('Reasons not enrolled: Needs further exploration')
[12]: str_c('On study \\n(n = ', all_digital %>% filter(study_end_date > today()) %>% nrow(), ')')
[13]: str_c('Off study \\n(n = ', all_digital %>% filter(study_end_date <= today()) %>% nrow(), ')')
[14]: str_c('Disposition: \\nCompleted (n = ', disposition %>% filter(subid %in% all_digital$subid, disposition == 'complete') %>% nrow(), ')\\n', 'Usable data (n = ', all_digital %>% filter(study_end_date <= today(), subid %in% usable_subs$subid) %>% nrow(), ')\\n', 'Unusable (n = ', all_digital %>% filter(study_end_date <= today(), !(subid %in% usable_subs$subid)) %>% nrow(), ')')
[15]: str_c('Craigslist \\n(n = ', n_craigsl, ')')
")