Selection of invalid instruments can improve estimation in Mendelian randomization
Mendelian randomization (MR) is a widely used method to identify causal links between a risk factor and disease. A fundamental part of any MR analysis is to choose appropriate genetic variants as instrumental variables. Current practice usually involves selecting only those genetic variants that are deemed to satisfy certain exclusion restrictions, in a bid to remove bias from unobserved confounding. Many more genetic variants may violate these exclusion restrictions due to unknown pleiotropic effects (i.e. direct effects on the outcome not via the exposure), but their inclusion could increase the precision of causal effect estimates at the cost of allowing some bias. We explore how to optimally tackle this bias-variance trade-off by carefully choosing from many weak and locally invalid instruments. Specifically, we study a focused instrument selection approach for publicly available two-sample summary data on genetic associations, whereby genetic variants are selected on the basis of how they impact the asymptotic mean squared error of causal effect estimates. We show how different restrictions on the nature of pleiotropic effects have important implications for the quality of post-selection inferences. In particular, a focused selection approach under systematic pleiotropy allows for consistent model selection, but in practice can be susceptible to winner's curse biases. In contrast, a more general form of idiosyncratic pleiotropy allows only conservative model selection, but offers uniformly valid confidence intervals. We propose a novel method to tighten honest confidence intervals through support restrictions on pleiotropy. We apply our results to several real data examples, which suggest that the optimal selection of instruments involves not only biologically justified valid instruments, but also hundreds of potentially pleiotropic variants.
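The bias-variance trade-off described above can be illustrated with a small sketch. It is not the paper's procedure: it assumes a plain inverse-variance weighted (IVW) estimator on two-sample summary statistics, treats SNP-exposure associations as fixed, and uses a simple plug-in estimate of mean squared error in which the estimate from the trusted valid instruments serves as the benchmark. The function names (`ivw`, `estimated_mse`) and the toy numbers are hypothetical and chosen only for illustration.

```python
def ivw(bx, by, se):
    """IVW causal estimate and its approximate variance.

    bx: SNP-exposure associations, by: SNP-outcome associations,
    se: standard errors of by (bx is treated as fixed; an assumption).
    """
    num = sum(x * y / (s * s) for x, y, s in zip(bx, by, se))
    den = sum(x * x / (s * s) for x, s in zip(bx, se))
    return num / den, 1.0 / den


def estimated_mse(bx, by, se, valid_idx, extra_idx):
    """Plug-in MSE of the IVW estimate using valid plus extra instruments."""
    def pick(values, idx):
        return [values[i] for i in idx]

    # Benchmark: estimate from instruments believed to be valid.
    theta_v, var_v = ivw(pick(bx, valid_idx), pick(by, valid_idx), pick(se, valid_idx))
    # Candidate: valid instruments plus possibly pleiotropic ones.
    idx = list(valid_idx) + list(extra_idx)
    theta_s, var_s = ivw(pick(bx, idx), pick(by, idx), pick(se, idx))

    bias = theta_s - theta_v
    # The squared difference overstates squared bias by the variance of the
    # difference (var_v - var_s for nested IVW sets), so subtract it and
    # truncate at zero.
    bias_sq = max(bias * bias - (var_v - var_s), 0.0)
    return bias_sq + var_s


# Toy summary data with a true effect of 0.5 (made-up numbers).
bx = [0.10, 0.12, 0.08, 0.10, 0.09]
se = [0.02] * 5
by = [0.050, 0.060, 0.040,  # valid instruments: by = 0.5 * bx
      0.051,                # mildly pleiotropic variant
      0.095]                # strongly pleiotropic variant
valid = [0, 1, 2]

mse_valid_only = estimated_mse(bx, by, se, valid, [])
mse_add_mild = estimated_mse(bx, by, se, valid, [3])
mse_add_strong = estimated_mse(bx, by, se, valid, [4])
print(mse_valid_only, mse_add_mild, mse_add_strong)
```

In this toy example, adding the mildly pleiotropic variant lowers the estimated mean squared error relative to using the valid instruments alone, while adding the strongly pleiotropic variant raises it. This is the kind of comparison a focused selection rule would exploit.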