Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning
We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems: Given sample access to two Bayesian networks P_1 and P_2 over known directed acyclic graphs G_1 and G_2 having n nodes and bounded in-degree, approximate d_tv(P_1,P_2) to within additive error ϵ using poly(n,ϵ) samples and time Given sample access to two ferromagnetic Ising models P_1 and P_2 on n variables with bounded width, approximate d_tv(P_1, P_2) to within additive error ϵ using poly(n,ϵ) samples and time Given sample access to two n-dimensional Gaussians P_1 and P_2, approximate d_tv(P_1, P_2) to within additive error ϵ using poly(n,ϵ) samples and time Given access to observations from two causal models P and Q on n variables that are defined over known causal graphs, approximate d_tv(P_a, Q_a) to within additive error ϵ using poly(n,ϵ) samples, where P_a and Q_a are the interventional distributions obtained by the intervention do(A=a) on P and Q respectively for a particular variable A Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for tolerant testing of closeness of the above-mentioned structured high-dimensional distributions.
READ FULL TEXT