Title: | Matching Methods for Time-Varying Observational Studies |
---|---|
Description: | Implements popular methods for matching in time-varying observational studies. Matching is difficult in this scenario because participants can be treated at different times, which may influence the outcomes. The core methods include: "Balanced Risk Set Matching" from Li, Propert, and Rosenbaum (2001) <doi:10.1198/016214501753208573> and "Propensity Score Matching with Time-Dependent Covariates" from Lu (2005) <doi:10.1111/j.1541-0420.2005.00356.x>. Some functions can use the 'Gurobi' optimization back-end to solve the optimization problem faster; the 'gurobi' R package and associated software can be downloaded from <https://www.gurobi.com> after obtaining a license. |
Authors: | Sean Kent [aut, cre, cph] , Mitchell Paukner [aut, cph] |
Maintainer: | Sean Kent <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.1 |
Built: | 2024-11-16 04:50:24 UTC |
Source: | https://github.com/skent259/rsmatch |
Perform balanced risk set matching as described in Li et al. (2001) "Balanced Risk Set Matching". Given a longitudinal data frame with covariate information, along with treatment time, build a MIP problem that matches treated individuals to those that haven't been treated yet (or are never treated) based on minimizing the Mahalanobis distance between covariates. If balancing is desired, the model will try to minimize the imbalance in terms of specified balancing covariates in the final pair output. Each treated individual is matched to one other individual.
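The distance idea described above can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: the toy data frame, its column names, and the coding of never-treated subjects as NA are all assumptions made for illustration, and covariates are treated as fixed rather than time-varying to keep it short.

```r
# Toy data (hypothetical): one row per subject, covariates held fixed
toy <- data.frame(
  id       = 1:5,
  trt_time = c(1, NA, 2, 3, NA),   # NA = never treated (assumed coding)
  age      = c(70, 72, 65, 80, 71),
  educ     = c(12, 16, 12, 18, 14)
)

x <- as.matrix(toy[, c("age", "educ")])
S <- cov(x)                        # covariance used to scale the distance

# The subject treated at time 1, and the risk set: everyone treated
# later or never treated
treated_at <- 1
treated    <- toy$id[which(toy$trt_time == treated_at)]
risk_set   <- toy$id[is.na(toy$trt_time) | toy$trt_time > treated_at]

# stats::mahalanobis() returns the squared Mahalanobis distance
d2 <- mahalanobis(
  x[match(risk_set, toy$id), , drop = FALSE],
  center = x[match(treated, toy$id), ],
  cov    = S
)

# Closest candidate control for this one treated subject
best <- risk_set[which.min(d2)]
```

This greedy nearest-neighbour choice is only meant to show what is being measured. The function itself poses the choice of all pairs jointly as an optimization problem, so a pair selected this way need not appear in its output.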
brsmatch( n_pairs, data, id = "id", time = "time", trt_time = "trt_time", covariates = NULL, balance = TRUE, balance_covariates = NULL, exact_match = NULL, options = list(time_lag = FALSE, verbose = FALSE, optimizer = c("glpk", "gurobi")) )
n_pairs | The number of pairs desired from matching. |
data | A data.frame or similar containing columns matching the |
id | A character specifying the id column name (default "id"). |
time | A character specifying the time column name (default "time"). |
trt_time | A character specifying the treatment time column name (default "trt_time"). |
covariates | A character vector specifying the covariates to use for matching (default NULL). |
balance | A logical value indicating whether to include balancing constraints in the matching process. |
balance_covariates | A character vector specifying the covariates to use for balancing (default NULL). |
exact_match | A vector of optional covariates to perform exact matching on. If |
options | A list of additional parameters with the following components: time_lag (default FALSE), verbose (default FALSE), and optimizer ("glpk" or "gurobi"). |
Note that when using exact matching, the n_pairs are split roughly in proportion to the number of treated subjects in each exact matching group. If you would like to control n_pairs exactly, we suggest manually performing exact matching, for example with split(), and selecting n_pairs for each group interactively.
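The manual approach suggested in the note can be sketched as follows. The toy data, the exact-matching variable `sex`, and the chosen pair counts are hypothetical, and never-treated subjects are assumed to be coded with an NA treatment time.

```r
# Toy long-format data (hypothetical): one row per subject-visit
toy <- data.frame(
  id       = rep(1:6, each = 2),
  time     = rep(1:2, times = 6),
  trt_time = rep(c(2, NA, 2, NA, NA, 2), each = 2),  # NA = never treated
  sex      = rep(c("F", "F", "F", "M", "M", "M"), each = 2),
  stringsAsFactors = FALSE
)

# Split the data by the exact-matching variable
groups <- split(toy, toy$sex)

# Count treated subjects per group to inform the choice of n_pairs
n_treated <- vapply(
  groups,
  function(g) length(unique(g$id[!is.na(g$trt_time)])),
  integer(1)
)

# Pair counts chosen by hand; each cannot exceed the treated count
n_pairs_by_group <- c(F = 1, M = 1)
stopifnot(all(n_pairs_by_group <= n_treated[names(n_pairs_by_group)]))
```

Each element of `groups` could then be passed to brsmatch() with its own entry of `n_pairs_by_group`, and the resulting pair tables combined afterwards.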
A data frame containing the pair information. The data frame has columns id, pair_id, and type. id matches the input parameter and will contain all ids from the input data frame. pair_id refers to the id of the computed pairs; NA values indicate unmatched individuals. type indicates whether the individual in the pair is considered as treatment ("trt") or control ("all") in that pair.
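To make the shape of this output concrete, here is a small sketch that reshapes a pair table into one row per pair. The `pairs` data frame is hand-written to mimic the documented columns; it is not real output, and the NA in `type` for the unmatched subject is an assumption.

```r
# Hand-written stand-in for the documented output (not real results)
pairs <- data.frame(
  id      = c("a", "b", "c", "d", "e"),
  pair_id = c(1, 1, NA, 2, 2),
  type    = c("trt", "all", NA, "trt", "all"),  # NA for unmatched is assumed
  stringsAsFactors = FALSE
)

# Drop unmatched individuals (those with pair_id = NA)
matched <- pairs[!is.na(pairs$pair_id), ]

# One row per pair: treated id next to its matched control id
wide <- merge(
  matched[matched$type == "trt", c("pair_id", "id")],
  matched[matched$type == "all", c("pair_id", "id")],
  by       = "pair_id",
  suffixes = c("_trt", "_all")
)
```

The resulting `wide` table has columns `pair_id`, `id_trt`, and `id_all`, which is often a convenient form for joining outcomes back on.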
Sean Kent
Li, Yunfei Paul, Kathleen J Propert, and Paul R Rosenbaum. 2001. "Balanced Risk Set Matching." Journal of the American Statistical Association 96 (455): 870-82. doi:10.1198/016214501753208573
    if (requireNamespace("Rglpk", quietly = TRUE)) {
      library(dplyr, quietly = TRUE)
      pairs <- brsmatch(
        n_pairs = 13,
        data = oasis,
        id = "subject_id",
        time = "visit",
        trt_time = "time_of_ad",
        balance = FALSE
      )
      na.omit(pairs)

      # evaluate the first match
      first_match <- pairs$subject_id[which(pairs$pair_id == 1)]
      oasis %>% dplyr::filter(subject_id %in% first_match)
    }
Perform propensity score matching as described in Lu (2005) "Propensity Score Matching with Time-Dependent Covariates". Given a longitudinal data frame with covariate information, along with treatment time, match treated individuals to those that haven't been treated yet (or are never treated) based on time-dependent propensity scores from a Cox proportional hazards model. Each treated individual is matched to one other individual, unless the number of pairs is specified.
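As a rough illustration of a time-dependent propensity score from a Cox model, the sketch below fits `survival::coxph()` to simulated data in counting-process form and uses the linear predictor as the score. Everything about the data is made up (the covariate `x`, the visit structure, the NA coding for never-treated subjects), and this is not claimed to be the package's internal implementation.

```r
library(survival)

set.seed(1)

# Hypothetical toy cohort: 40 subjects, up to 3 visits each.
# trt_time = NA means never treated (an assumed coding).
subjects <- data.frame(
  id       = 1:40,
  trt_time = sample(c(1, 2, 3, NA), 40, replace = TRUE)
)

# One row per subject-visit (merge with no shared columns is a cross join),
# dropping visits that fall after treatment
long <- merge(subjects, data.frame(time = 1:3))
long <- long[is.na(long$trt_time) | long$time <= long$trt_time, ]
long$x     <- rnorm(nrow(long))  # made-up time-varying covariate
long$event <- as.integer(!is.na(long$trt_time) & long$time == long$trt_time)

# Cox model for time to treatment; each row covers the interval (time - 1, time]
fit <- coxph(Surv(time - 1, time, event) ~ x, data = long)

# Linear predictor, used here as the time-dependent propensity score
long$score <- predict(fit, type = "lp")

# Squared score differences among everyone still at risk at visit 1
s <- long$score[long$time == 1]
d <- outer(s, s, function(a, b) (a - b)^2)
```

A matrix like `d` is the kind of input a matching routine could work from: small entries mark subjects with similar estimated hazards of treatment at that visit.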
coxpsmatch( n_pairs = 10^10, data, id = "id", time = "time", trt_time = "trt_time", covariates = NULL, exact_match = NULL, options = list(time_lag = FALSE) )
n_pairs | The number of pairs desired from matching. |
data | A data.frame or similar containing columns matching the |
id | A character specifying the id column name (default "id"). |
time | A character specifying the time column name (default "time"). |
trt_time | A character specifying the treatment time column name (default "trt_time"). |
covariates | A character vector specifying the covariates to use for matching (default NULL). |
exact_match | A vector of optional covariates to perform exact matching on. If |
options | A list of additional parameters with the following components: time_lag (default FALSE). |
A data frame containing the pair information. The data frame has columns id, pair_id, and type. id matches the input parameter and will contain all ids from the input data frame. pair_id refers to the id of the computed pairs; NA values indicate unmatched individuals. type indicates whether the individual in the pair is considered as treatment ("trt") or control ("all") in that pair.
Mitchell Paukner
Lu, Bo. 2005. "Propensity Score Matching with Time-Dependent Covariates." Biometrics 61 (3): 721-28. doi:10.1111/j.1541-0420.2005.00356.x
    if (requireNamespace("survival", quietly = TRUE) &&
        requireNamespace("nbpMatching", quietly = TRUE)) {
      library(dplyr, quietly = TRUE)
      pairs <- coxpsmatch(
        n_pairs = 13,
        data = oasis,
        id = "subject_id",
        time = "visit",
        trt_time = "time_of_ad"
      )
      na.omit(pairs)

      # evaluate the first match
      first_match <- pairs$subject_id[which(pairs$pair_id == 1)]
      oasis %>% dplyr::filter(subject_id %in% first_match)
    }
A dataset containing baseline and time-varying information relating to Alzheimer's disease (AD) based on the Open Access Series of Imaging Studies (OASIS). This set consists of a longitudinal collection of 51 subjects aged 62 to 92. Each subject was scanned on two or more visits, separated by at least one year for a total of 115 imaging sessions. For each subject, 3 or 4 individual T1-weighted MRI scans obtained in single scan sessions are included.
oasis
A data frame with 115 rows and 11 variables:
unique subject identifier
visit order
visit in which a patient first had AD diagnosis
male or female
years of education
socioeconomic status (-1 for missing)
age of patient at visit
MR delay time (contrast)
estimated total intracranial volume
normalized whole brain volume
atlas scaling factor
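Because socioeconomic status uses -1 as a missing-value code, it may be worth recoding it explicitly before treating the variable as numeric. The sketch below uses a hand-made data frame, and the column name `ses` is a hypothetical stand-in, since the variable names are not shown in this listing. Whether the matching functions accept NA covariate values is not stated here, so this only makes the coding explicit.

```r
# Hand-made stand-in for a few visits (not the real data)
visits <- data.frame(
  subject_id = c("s1", "s1", "s2"),
  ses        = c(2, 2, -1),        # hypothetical column name; -1 = missing
  stringsAsFactors = FALSE
)

# Recode the sentinel value to NA so it is not read as a real score
visits$ses[visits$ses == -1] <- NA

# Count how many values were missing
n_missing <- sum(is.na(visits$ses))
```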
The data was originally hosted in this Kaggle repository: https://www.kaggle.com/jboysen/mri-and-alzheimers?select=oasis_longitudinal.csv. It has been harmonized for an example analysis for risk set matching based on a reduced sample including patients who go from mild cognitive impairment (MCI) to AD and those patients with MCI throughout.