Causal exposure-response curve estimation with surrogate confounders: a study of air pollution and children's health in Medicaid claims data
In this paper, we undertake a case study in which interest lies in estimating a causal exposure-response function (ERF) for long-term exposure to fine particulate matter (PM_2.5) and respiratory hospitalizations in socioeconomically disadvantaged children using nationwide Medicaid claims data. New methods are needed to address the specific challenges the Medicaid data present. First, Medicaid eligibility criteria, which are largely based on family income for children, differ by state, creating socioeconomically distinct populations and leading to clustered data, where zip codes (our units of analysis) are nested within states. Second, Medicaid enrollees' individual-level socioeconomic status, which is known to be a confounder and an effect modifier of the exposure-response relationships under study, is not available. However, two useful surrogates are available: median household income of each enrollee's zip code of residence and state-level Medicaid family income eligibility thresholds for children. In this paper, we introduce a customized approach, called MedMatch, that builds on generalized propensity score matching methods for estimating causal ERFs, adapting these approaches to leverage our two surrogate variables to account for potential confounding and/or effect modification by socioeconomic status. We conduct extensive simulation studies, consistently demonstrating the strong performance of MedMatch relative to conventional approaches to handling the surrogate variables. We apply MedMatch to estimate the causal ERF between long-term PM_2.5 exposure and first respiratory hospitalization among children in Medicaid from 2000 to 2012. We find a positive association, with a steeper curve at PM_2.5 ≤ 8 μg/m^3 that levels off at higher concentrations.
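The abstract does not describe the MedMatch algorithm itself, so the following is only a minimal sketch of the generic idea it is said to build on: generalized propensity score (GPS) matching for a continuous exposure. It is not MedMatch and does not use the state-level eligibility thresholds. The function name `gps_matching_erf`, the parameters `caliper` and `scale`, the linear-Gaussian exposure model, and the simulated data are all assumptions made for illustration; the single covariate `x` stands in for a zip-code-level surrogate such as median household income.

```python
import numpy as np

def gps_matching_erf(w, x, y, levels, caliper=0.5, scale=0.5):
    """Generic GPS caliper matching for a continuous exposure (illustrative only).

    w: exposure, x: covariate matrix (n, p), y: outcome, levels: exposure grid.
    Assumes exposure given covariates is Gaussian with a linear mean.
    """
    n = len(w)
    design = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(design, w, rcond=None)
    mean_w = design @ beta                 # fitted E[W | X]
    sigma = np.std(w - mean_w)             # residual standard deviation

    def density(level, mean):
        z = (level - mean) / sigma
        return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    gps_obs = density(w, mean_w)           # GPS at each unit's observed exposure
    gps_lo, gps_rng = gps_obs.min(), gps_obs.max() - gps_obs.min()
    w_lo, w_rng = w.min(), w.max() - w.min()

    erf = []
    for level in levels:
        pool = np.flatnonzero(np.abs(w - level) <= caliper)
        if pool.size == 0:                 # no observed units near this level
            erf.append(np.nan)
            continue
        # GPS each unit would have had at this exposure level
        gps_target = density(level, mean_w)
        d_gps = np.abs((gps_target[:, None] - gps_obs[pool][None, :]) / gps_rng)
        d_w = np.abs((level - w[pool]) / w_rng)[None, :]
        dist = scale * d_gps + (1.0 - scale) * d_w
        matched = pool[np.argmin(dist, axis=1)]   # nearest neighbour per unit
        erf.append(y[matched].mean())      # average outcome in matched set
    return np.array(erf)

# Simulated example (hypothetical data, not Medicaid claims):
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))                       # surrogate confounder
w = x[:, 0] + rng.normal(size=n)                  # exposure depends on x
y = 2.0 * w + 3.0 * x[:, 0] + rng.normal(size=n)  # outcome depends on both
levels = np.array([-1.0, 0.0, 1.0])
erf = gps_matching_erf(w, x, y, levels)
```

Each entry of `erf` estimates the average outcome had every unit been exposed at the corresponding level. In this sketch the surrogate enters only as an ordinary covariate in the exposure model, which corresponds to one conventional way of handling it rather than to the customized approach the paper proposes.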