Robustness against data loss with Algebraic Statistics
The paper describes an algorithm that, given an initial design ℱ_n of size n and a linear model with p parameters, provides a sequence ℱ_n ⊃ … ⊃ ℱ_{n-k} ⊃ … ⊃ ℱ_p of nested robust designs. The sequence is obtained by removing the runs of ℱ_n one at a time until a p-run saturated design ℱ_p is reached. The potential impact of the algorithm on real applications is high. The initial fraction ℱ_n can be of any type, and the output sequence can be used to organize the experimental activity: the experiments can start with the runs of ℱ_p and continue by adding one run at a time (from ℱ_{n-k} to ℱ_{n-k+1}) until the initial design ℱ_n is recovered. In this way, if the experimental activity must be stopped unexpectedly when only n-k runs have been completed, the corresponding design ℱ_{n-k} still has a high value of robustness, for every k ∈ {1, …, n-p}. The algorithm uses the circuit basis, a special representation of the kernel of a matrix with integer entries. The effectiveness of the algorithm is demonstrated through simulations.
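The idea of a nested sequence obtained by removing one run at a time can be illustrated with a minimal Python sketch. It is not the paper's circuit-based algorithm: it is a brute-force greedy stand-in that only works for very small designs. The abstract does not define robustness, so the sketch assumes it is the share of p-run subsets of a design whose model matrix is non-singular. The function names `det`, `robustness`, and `nested_designs`, and the example model matrix `X` (a 3×2 factorial with intercept and two main effects), are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    n = len(m)
    d = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def robustness(X, runs):
    """ASSUMED measure: share of p-run subsets of `runs` that are non-singular."""
    p = len(X[0])
    subsets = list(combinations(runs, p))
    good = sum(1 for s in subsets if det([X[i] for i in s]) != 0)
    return Fraction(good, len(subsets))


def nested_designs(X):
    """Greedy backward removal (brute force, not the circuit-basis method).

    At each step, drop the run whose removal leaves the most robust design,
    until only p runs remain. Returns the designs as tuples of row indices.
    """
    p = len(X[0])
    runs = tuple(range(len(X)))
    sequence = [runs]
    while len(runs) > p:
        candidates = [tuple(r for r in runs if r != i) for i in runs]
        runs = max(candidates, key=lambda c: robustness(X, c))
        sequence.append(runs)
    return sequence


# Hypothetical example: columns are intercept, factor A in {-1, 0, 1}, factor B in {-1, 1}.
X = [[1, a, b] for b in (-1, 1) for a in (-1, 0, 1)]
sequence = nested_designs(X)
for design in sequence:
    print(len(design), design, robustness(X, design))
```

Reading `sequence` from last to first gives a run order in the spirit of the abstract: start with the saturated design and add one run at a time. The cost of enumerating all p-run subsets grows combinatorially, which is why a sketch like this is only practical for toy examples.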