Microbiome AB/BA Crossover Design


The motivation for this post was also recent question on twitter asking how one might approach the analysis of an AB/BA crossover study when the goal is to estimate the difference in the relative abundance of microbial taxa under treatments A and B. I recently consulted on the analysis of such a project, so I thought I would share the approach we used in case it might be of help to others.

A good introduction to the design and analysis of the crossover design is described in the online course notes developed by Penn State’s Department of Statistics for STAT 509: Design and Analysis of Clinical Trials. For a deeper dive, check out Stephen Senn’s Cross-over Trials In Clinical Research text. In general, a crossover study utilizes a repeated measurements design where participants receive each treatment, in differing (randomized) sequences, and inference is focused on the main effect of treatment. The major advantage of the crossover design is that the within-subjects design allows for increased precision relative to a parallel arm trial. The most common example, and the one I focus on here, is the 2 x 2 crossover. In this setup, there are two treatments (A and B), and two periods (1 and 2), resulting in two sequences (AB or BA) and hence the name “AB/BA design”.

A major limitation of the crossover design is that it cannot be used to study curative treatments (i.e., no treatments with a lasting effect). Thus, in clinical studies it is often used to study palliative treatments (of chronic disease), treatments with short effect durations, and bioequivalence studies. A limitation of the AB/BA design in particular, is that the treatment effect is confounded by the treatment sequence in the presence of non-equal carryover effects (i.e., when an effect from period one carries over to period two and is not the same for treatments A and B). See this paper by Dr. Senn for a more detailed explanation or Chapter 17 in his text Statistical Issues in Drug Development for more detail. Thus, the design should only be used when one is certain that the washout period is sufficient to eliminate any carryover effect.

Should potential carryover be of concern, alternative crossover designs including additional sequences, or perhaps conducting the parallel arm trial, should be considered. One can (and often do) test for an interaction between treatment and sequence in an attempt to detect differential carryover; however, this test is expected to have low statistical power since (see sources above) it is based on a sample size requirement for a within-subjects comparison, likely of smaller magnitude than the expected treatment effect, and only manifests in the second period (i.e., half of observations). So, testing for the interaction has been criticized for being of limited use, as has the formerly popular two-stage approach of Grizzle since it does not control the nominal significance level at the claimed value. Thus, if one were to conduct an AB/BA study and realize a clear carryover effect, the best one might be able to do to in an attempt to salvage something from the study is to analyze the data collected during the first period as if it were obtained from a parallel arm trial (i.e., throw out half your data). This analysis would be expected to be unbiased, but the limited sample size will lead to much lower precision/power in the estimated treatment effect.

I mention all this only to make clear that this post is not meant to condone the use of the AB/BA design to study the effects of treatments on microbial differential abundance, but rather to provide an example of how one might go about the analysis should they decide the design is suitable for their purposes. I have no idea how long a washout period might need to be for microbial communities at a given site (i.e., stool, skin, soil, etc.) to return to their “baseline state” after some given treatment. I expect this decision will be highly study specific. However, for environments like the human gut, skin, saliva, etc., I do often expect it will return (perhaps rather quickly) to some stable state after most treatments. Thus, when one can safely rule out a differential carryover effect of treatment, the precision/power gains from the crossover design can be appreciable (see STAT 509 link above).


Load libraries

library(curatedMetagenomicData); packageVersion("curatedMetagenomicData") 
## [1] '3.0.10'
library(tidyverse); packageVersion("tidyverse")                                                               
## [1] '1.3.1'
library(phyloseq); packageVersion("phyloseq")                              
## [1] '1.36.0'
library(microbiome); packageVersion("microbiome")
## [1] '1.14.0'
library(broom); packageVersion("broom")
## [1] '1.0.1'


Pull down example data

Since I do not own (i.e., cannot share) the data from the project I consulted on, and I do not know of any easily accessible data from a published AB/BA design, I will just grab some pre-post data in IBD patients from the curatedMetagenomicData package I used before and set up a “mock” AB/BA design so that you can see the approach and have access to the code. I use time 1 as treatment A and time 2 as treatment B, and then randomly assign participants to either sequence “AB” or “BA”.

I do that here.

meta_df <- curatedMetagenomicData::sampleMetadata

mydata <- meta_df %>%
  filter(study_name == "HallAB_2017") %>%
  filter(visit_number %in% c(1, 6)) %>%
  filter(study_condition == "IBD") %>%
  arrange(subject_id, visit_number) %>%
  select(sample_id, subject_id, visit_number, body_site, study_condition, days_from_first_collection)

count_ids <- mydata %>%
  group_by(subject_id) %>%
  summarise(n = n()) %>%
  filter(n == 2)

keep_ids <- count_ids$subject_id

mydata <- mydata %>%
  filter(subject_id %in% keep_ids) %>%
  mutate(treatment = ifelse(visit_number == 1, "A", "B"),
         treatment = factor(treatment, levels = c("A", "B")))

set.seed(123)
mydata <- mydata %>%
  group_by(subject_id) %>% 
  mutate(period = sample.int(2, 2, replace = FALSE)) %>%
  ungroup() %>%
  mutate(seq = ifelse(period == 1 & treatment == "A", "AB",
                      ifelse(period == 1 & treatment == "B", "BA", NA))) %>%
  group_by(subject_id) %>% 
  fill(seq, .direction = "downup") %>%
  select(sample_id, subject_id, treatment, period, seq)
  
rm(keep_ids, count_ids, meta_df)

keep_id <- mydata$sample_id

my_se <- sampleMetadata %>%
  dplyr::filter(sample_id %in% keep_id) %>%
  curatedMetagenomicData::returnSamples("relative_abundance", counts = TRUE)

sp_df <- as.data.frame(as.matrix(assay(my_se)))

sp_df <- sp_df %>%
  rownames_to_column(var = "Species")

otu_tab <- sp_df %>%
  tidyr::separate(Species, into = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = "\\|") %>%
  dplyr::select(-Kingdom, -Phylum, -Class, -Order, -Family, -Genus) %>%
  dplyr::mutate(Species = gsub("s__", "", Species)) %>%
  tibble::column_to_rownames(var = "Species")

tax_tab <- sp_df %>%
  dplyr::select(Species) %>%
  tidyr::separate(Species, into = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep = "\\|") %>%
  dplyr::mutate(Kingdom = gsub("k__", "", Kingdom),
                Phylum = gsub("p__", "", Phylum),
                Class = gsub("c__", "", Class),
                Order = gsub("o__", "", Order),
                Family = gsub("f__", "", Family),
                Genus = gsub("g__", "", Genus),
                Species = gsub("s__", "", Species)) %>%
  dplyr::mutate(spec_row = Species) %>%
  tibble::column_to_rownames(var = "spec_row")

meta_df <- mydata %>%
  tibble::column_to_rownames(var = "sample_id")

(ps <- phyloseq(sample_data(meta_df),
                otu_table(otu_tab, taxa_are_rows = TRUE),
                tax_table(as.matrix(tax_tab))))
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 507 taxa and 30 samples ]
## sample_data() Sample Data:       [ 30 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 507 taxa by 7 taxonomic ranks ]
minTotRelAbun <- 1e-6           
x <- taxa_sums(ps)
keepTaxa <- (x / sum(x)) > minTotRelAbun
(ps <- prune_taxa(keepTaxa, ps))
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 292 taxa and 30 samples ]
## sample_data() Sample Data:       [ 30 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 292 taxa by 7 taxonomic ranks ]
rm(list= ls()[!(ls() %in% c("ps"))])


You can see that we end up with data on 292 taxa from 30 samples.


Examine sample metadata for AB/BA design

In this code chunk, we look at the sample metadata from the AB/BA design. We can see that the data are in long format and we have variables for subject, treatment, period, and sequence (sometimes referred to as group).

We have an even number of treatments and periods and 9 assigned to treatment A, and 6 to treatment B, in period 1.

meta_df <- ps %>%
  sample_data(.) %>%
  data.frame(.) %>%
  arrange(subject_id)

head(meta_df, 10)
##           subject_id treatment period seq
## p8582_mo6      p8582         B      2  AB
## p8582_mo1      p8582         A      1  AB
## p8585_mo6      p8585         B      2  AB
## p8585_mo1      p8585         A      1  AB
## p8600_mo1      p8600         A      1  AB
## p8600_mo6      p8600         B      2  AB
## p8646_mo6      p8646         B      1  BA
## p8646_mo1      p8646         A      2  BA
## p8649_mo1      p8649         A      1  AB
## p8649_mo6      p8649         B      2  AB
meta_df %>% 
  count(treatment)
##   treatment  n
## 1         A 15
## 2         B 15
meta_df %>% 
  count(period)
##   period  n
## 1      1 15
## 2      2 15
meta_df %>% 
  distinct(subject_id, .keep_all = TRUE) %>%
  count(seq)
##   seq n
## 1  AB 9
## 2  BA 6


CLR transform

Given the uneven sequencing depths, we likely want to consider some approach to try and either normalize or transform the read counts across samples prior to analysis. Here I convert the counts to the centered log-ratio scale, but one could use whatever approach makes the most sense for their specific experiment.

ps_clr <- microbiome::transform(ps, 'clr')


Analysis of AB/BA design assuming equal carry over

Section 15.4 of the STAT509 link above shows the treatment effect, under the assumption of equal carryover, can be obtained as the mean difference of:

  • one half the difference in the effect of A – B when treatment A is given in period 1
  • one half the difference in the effect of B – A when treatment B is given in period 1

The indicator term for sequence can then be used to form the contrast of interest. Here I obtain estimates and conduct formal tests using simple linear regression (i.e., equal variance t-test). One could also consider using a test that does not assume equal variance or a non-parametric alternative (Wilcoxon rank-sum) etc.

The indicator term, as entered into regression here, provides a contrast of treatment B when compared to treatment A.

ps_to_df <- function(ps_obj){
  my_otus <- data.frame(t(data.frame(otu_table(ps_obj))))
  my_otus <- my_otus %>%
    rownames_to_column(var = "id")
  
  meta_df <- data.frame(sample_data(ps_obj)) %>%
    rownames_to_column(var = "id") %>%
    rename_all(tolower)
  
  out_df <- left_join(meta_df, my_otus)
  return(out_df)
}

clr_df <- ps_to_df(ps_clr) %>%
  select(-id)

fp_df <- clr_df %>%
  select(subject_id, treatment, period, seq, Bacteroides_uniformis)
  
senn_df <- fp_df %>%
  select(-period) %>%
  pivot_wider(names_from = treatment, values_from = Bacteroides_uniformis) %>%
  mutate(diff = ifelse(seq == "AB", 0.5 * (A - B), 0.5 * (B - A))) %>%
  arrange(seq)

tidy(lm(diff ~ seq, data = senn_df))[2, ]
## # A tibble: 1 × 5
##   term  estimate std.error statistic p.value
##   <chr>    <dbl>     <dbl>     <dbl>   <dbl>
## 1 seqBA    -1.59      1.07     -1.48   0.163


We see that the relative abundance of B.uniformis is decreased under treatment B by -1.59 (1.07) CLR units.


One can also set up this test using a linear mixed-effect model with appropriate contrasts. An example is given on page 356 of John Lawson’s text Design and Analysis of Experiments with R. Below is an example of the code for those interested (not run). One would also need an approach to get the corrected degrees of freedom for testing (i.e., Satterthwaite or Kenward-Roger methods, etc.). The default contrast should be for A vs B here.

c1 <- c(.5, -.5)
lmer1 <- lmer(Bacteroides_uniformis ~ seq + period + treatment + (1 | subject_id:seq), 
             contrasts = list(seq = c1, period = c1, treatment = c1),
             data = fp_df)

summary(lmer)


Iterate AB/BA testing over all species

Here I just provide some example code to show how to iterate the approach over each species in the dataset (and generate an FDR p-value and convert estimates to the log2 fold-change scale) should that be of help.

clr_long_df <- clr_df %>%
  pivot_longer(cols = c(-subject_id,  -treatment, -period, -seq), 
               names_to = "species", values_to = "abundance") %>%
  arrange(species)

fit_senn <- function(my_df){
  senn_df <- my_df %>%
    select(-period) %>%
    pivot_wider(names_from = treatment, values_from = abundance) %>%
    mutate(diff = ifelse(seq == "AB", 0.5 * (A - B), 0.5 * (B - A)))
  
  m <- lm(diff ~ seq, data = senn_df)
  my_out <- tidy(m)[2, ]

  return(my_out)
}

res_df <- clr_long_df %>%
  tidyr::nest(data = c(-species)) %>%
  dplyr::mutate(model = map(data, ~ fit_senn(my_df = .))) %>%
  select(-data) %>%
  unnest(cols = c(model)) %>%
  arrange(p.value) %>%
  mutate(padj = p.adjust(p.value, method = "fdr"),
         log2FoldChange = log2(exp(estimate)),
         se = log2(exp(std.error))) %>%
  dplyr::select(species, term, log2FoldChange, se, statistic, p.value, padj)

head(res_df, 10)
## # A tibble: 10 × 7
##    species                        term  log2FoldCh…¹    se stati…² p.value  padj
##    <chr>                          <chr>        <dbl> <dbl>   <dbl>   <dbl> <dbl>
##  1 Turicibacter_sanguinis         seqBA        2.63  1.04     2.52  0.0256 0.915
##  2 Clostridium_bolteae            seqBA       -3.07  1.28    -2.39  0.0325 0.915
##  3 Eubacterium_sp_CAG_38          seqBA        4.50  1.98     2.28  0.0404 0.915
##  4 Collinsella_aerofaciens        seqBA        2.89  1.34     2.16  0.0503 0.915
##  5 Lawsonibacter_asaccharolyticus seqBA        0.936 0.443    2.11  0.0544 0.915
##  6 Lachnospira_pectinoschiza      seqBA       -2.49  1.24    -2.01  0.0662 0.915
##  7 Bacteroides_xylanisolvens      seqBA       -2.54  1.28    -1.98  0.0691 0.915
##  8 Alistipes_shahii               seqBA       -1.77  0.917   -1.94  0.0750 0.915
##  9 Lactobacillus_rogosae          seqBA        0.637 0.346    1.84  0.0888 0.915
## 10 Salmonella_enterica            seqBA        1.06  0.586    1.80  0.0948 0.915
## # … with abbreviated variable names ¹​log2FoldChange, ²​statistic


Session Info

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] broom_1.0.1                    microbiome_1.14.0             
##  [3] phyloseq_1.36.0                forcats_0.5.1                 
##  [5] stringr_1.4.0                  dplyr_1.0.9                   
##  [7] purrr_0.3.4                    readr_2.0.2                   
##  [9] tidyr_1.2.0                    tibble_3.1.8                  
## [11] ggplot2_3.3.5                  tidyverse_1.3.1               
## [13] curatedMetagenomicData_3.0.10  TreeSummarizedExperiment_2.0.3
## [15] Biostrings_2.60.2              XVector_0.32.0                
## [17] SingleCellExperiment_1.14.1    SummarizedExperiment_1.22.0   
## [19] Biobase_2.52.0                 GenomicRanges_1.44.0          
## [21] GenomeInfoDb_1.28.4            IRanges_2.26.0                
## [23] S4Vectors_0.30.2               BiocGenerics_0.38.0           
## [25] MatrixGenerics_1.4.3           matrixStats_0.61.0            
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1                  backports_1.4.1              
##   [3] AnnotationHub_3.0.2           BiocFileCache_2.0.0          
##   [5] plyr_1.8.6                    igraph_1.2.7                 
##   [7] lazyeval_0.2.2                splines_4.1.1                
##   [9] BiocParallel_1.26.2           digest_0.6.28                
##  [11] foreach_1.5.1                 htmltools_0.5.2              
##  [13] fansi_0.5.0                   magrittr_2.0.3               
##  [15] memoise_2.0.0                 cluster_2.1.2                
##  [17] tzdb_0.1.2                    modelr_0.1.8                 
##  [19] colorspace_2.0-2              blob_1.2.2                   
##  [21] rvest_1.0.2                   rappdirs_0.3.3               
##  [23] haven_2.4.3                   xfun_0.29                    
##  [25] crayon_1.4.1                  RCurl_1.98-1.5               
##  [27] jsonlite_1.7.2                survival_3.2-11              
##  [29] iterators_1.0.13              ape_5.5                      
##  [31] glue_1.6.2                    gtable_0.3.0                 
##  [33] zlibbioc_1.38.0               DelayedArray_0.18.0          
##  [35] Rhdf5lib_1.14.2               scales_1.2.1                 
##  [37] DBI_1.1.1                     Rcpp_1.0.7                   
##  [39] xtable_1.8-4                  tidytree_0.3.5               
##  [41] bit_4.0.4                     httr_1.4.2                   
##  [43] ellipsis_0.3.2                pkgconfig_2.0.3              
##  [45] sass_0.4.2                    dbplyr_2.1.1                 
##  [47] utf8_1.2.2                    tidyselect_1.1.2             
##  [49] rlang_1.0.4                   reshape2_1.4.4               
##  [51] later_1.3.0                   AnnotationDbi_1.54.1         
##  [53] munsell_0.5.0                 BiocVersion_3.13.1           
##  [55] cellranger_1.1.0              tools_4.1.1                  
##  [57] cachem_1.0.6                  cli_3.3.0                    
##  [59] generics_0.1.3                RSQLite_2.2.8                
##  [61] ExperimentHub_2.0.0           ade4_1.7-18                  
##  [63] evaluate_0.16                 biomformat_1.20.0            
##  [65] fastmap_1.1.0                 yaml_2.2.1                   
##  [67] knitr_1.40                    bit64_4.0.5                  
##  [69] fs_1.5.2                      KEGGREST_1.32.0              
##  [71] nlme_3.1-152                  mime_0.12                    
##  [73] xml2_1.3.2                    compiler_4.1.1               
##  [75] rstudioapi_0.13               filelock_1.0.2               
##  [77] curl_4.3.2                    png_0.1-7                    
##  [79] interactiveDisplayBase_1.30.0 reprex_2.0.1                 
##  [81] treeio_1.16.2                 bslib_0.3.1                  
##  [83] stringi_1.7.5                 blogdown_1.7                 
##  [85] lattice_0.20-44               Matrix_1.3-4                 
##  [87] permute_0.9-5                 vegan_2.5-7                  
##  [89] multtest_2.48.0               vctrs_0.4.1                  
##  [91] pillar_1.8.0                  lifecycle_1.0.1              
##  [93] rhdf5filters_1.4.0            BiocManager_1.30.16          
##  [95] jquerylib_0.1.4               data.table_1.14.2            
##  [97] bitops_1.0-7                  httpuv_1.6.3                 
##  [99] R6_2.5.1                      bookdown_0.24                
## [101] promises_1.2.0.1              codetools_0.2-18             
## [103] MASS_7.3-54                   assertthat_0.2.1             
## [105] rhdf5_2.36.0                  withr_2.4.2                  
## [107] GenomeInfoDbData_1.2.6        mgcv_1.8-36                  
## [109] hms_1.1.1                     grid_4.1.1                   
## [111] rmarkdown_2.11                Rtsne_0.15                   
## [113] shiny_1.7.1                   lubridate_1.8.0
Associate Professor of Pediatrics

Related