Generating knockoffs via conditional independence
Let X be a p-variate random vector and \tilde{X} a knockoff copy of X (in the sense of <cit.>). A new approach for constructing \tilde{X} (henceforth, NA) has been introduced in <cit.>. NA has essentially three advantages: (i) building \tilde{X} is straightforward; (ii) the joint distribution of (X,\tilde{X}) can be written in closed form; (iii) \tilde{X} is often optimal under various criteria, including mean absolute correlation and reconstructability. However, for NA to apply, the distribution of X needs to be of the form (*) P(X_1∈ A_1,…,X_p∈ A_p)=E{∏_{i=1}^p P(X_i∈ A_i | Z)} for some random element Z. Our first result is that any probability measure μ on ℝ^p can be approximated by a probability measure μ_0 that satisfies condition (*). If μ is absolutely continuous, the approximation holds in total variation distance. In applications, regarding μ as the distribution of X, this result suggests using the knockoffs based on μ_0 instead of those based on μ (which are generally unknown). Our second result is a characterization of the pairs (X,\tilde{X}) where \tilde{X} is obtained via NA. It turns out that (X,\tilde{X}) is of this type if and only if it can be extended to an infinite sequence that satisfies certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences.
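To make condition (*) concrete, the following is a minimal sketch under stated assumptions, not the construction from the cited paper. It assumes Z is a finite latent label with known weights and that, given Z = k, the coordinates of X are independent Gaussians, so (*) holds by design. A knockoff is then drawn by sampling Z from its conditional distribution given the observed X and drawing each knockoff coordinate independently from the law of X_i given Z. The two-step recipe, the function name `sample_knockoffs`, and the parameters `weights`, `means`, `sds` are all illustrative assumptions.

```python
import numpy as np

def sample_knockoffs(x, weights, means, sds, rng):
    """Hypothetical helper: knockoffs for a Gaussian mixture with
    conditionally independent coordinates.
    x: (n, p) data; weights: (K,); means, sds: (K, p)."""
    # Log-likelihood of each row under each component, up to a constant
    # that does not depend on the component.
    diff = (x[:, None, :] - means[None, :, :]) / sds[None, :, :]
    log_lik = -0.5 * np.sum(diff ** 2, axis=2) - np.sum(np.log(sds), axis=1)[None, :]
    log_post = np.log(weights)[None, :] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Sample the latent label Z from its conditional law given X = x.
    u = rng.random((x.shape[0], 1))
    z = (u > np.cumsum(post, axis=1)).sum(axis=1)
    z = np.minimum(z, len(weights) - 1)  # guard against rounding in cumsum
    # Given Z, draw every knockoff coordinate independently.
    return means[z] + sds[z] * rng.standard_normal(x.shape)

# Toy setting (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, -2.0, -2.0], [2.0, 2.0, 2.0]])
sds = np.ones((2, 3))

n = 20000
labels = rng.choice(2, size=n, p=weights)
x = means[labels] + sds[labels] * rng.standard_normal((n, 3))
xk = sample_knockoffs(x, weights, means, sds, rng)

# Swapping X_1 with its knockoff should leave pairwise moments unchanged.
cov_orig = np.cov(x[:, 0], x[:, 1])[0, 1]
cov_swap = np.cov(xk[:, 0], x[:, 1])[0, 1]
```

In this toy setting, `cov_orig` and `cov_swap` should agree up to sampling error, which is a necessary (not sufficient) check of the exchangeability property that defines a knockoff copy.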