Learning to extrapolate using continued fractions: Predicting the critical temperature of superconductor materials
In Artificial Intelligence we often seek to identify an unknown target function of many variables, $y = f(\mathbf{x})$, given a limited set of instances $S = \{(\mathbf{x}^{(i)}, y^{(i)})\}$ with $\mathbf{x}^{(i)} \in D$, where $D$ is a domain of interest. We refer to $S$ as the training set, and the goal is to identify a mathematical model that approximates the target function on new instances drawn from a test set $T = \{\mathbf{x}^{(j)}\} \subset D$ with $T \neq S$, thus assessing the model's generalisation. For some applications, however, the main interest is approximating the unknown function well on a larger domain $D'$ that contains $D$. In cases involving the design of new structures, for instance, we may be interested in maximising $f$; the model derived from $S$ alone should then also generalise well in $D'$ for samples whose values of $y$ are larger than the largest observed in $S$. In that sense, the AI system would provide important information to guide the design process, e.g., by using the learned model as a surrogate function to plan new lab experiments. We introduce a method for multivariate regression based on the iterative fitting of a continued fraction that incorporates additive spline models. We compare it with established methods such as AdaBoost, Kernel Ridge, Linear Regression, Lasso Lars, Linear Support Vector Regression, Multi-Layer Perceptrons, Random Forests, Stochastic Gradient Descent and XGBoost, and we test the performance on the important problem of predicting the critical temperature of superconductors from their physical-chemical characteristics.
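To make the model class concrete, one common analytic form for a depth-$n$ continued-fraction regressor is sketched below; treating each term $g_k$ and $h_k$ as an additive spline model of $\mathbf{x}$ is our reading of the abstract, not necessarily the paper's exact parameterisation:

$$
f(\mathbf{x}) \;\approx\; g_0(\mathbf{x}) + \cfrac{h_1(\mathbf{x})}{g_1(\mathbf{x}) + \cfrac{h_2(\mathbf{x})}{g_2(\mathbf{x}) + \cfrac{h_3(\mathbf{x})}{\;\ddots\; + \cfrac{h_n(\mathbf{x})}{g_n(\mathbf{x})}}}}
$$

Truncating at small $n$ keeps the model compact, while each additional level introduces a rational, rather than purely additive, interaction between the terms, which is what makes continued fractions attractive for extrapolation.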
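For concreteness, a minimal sketch of the extrapolation protocol the abstract describes (train on $S \subset D$, evaluate on samples whose targets exceed everything seen in training) might look like the following. The file name `superconduct.csv`, the `critical_temp` target column (following the UCI Superconductivity dataset), the 90th-percentile cut-off, and the choice of Random Forest as a stand-in baseline are all illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Assumption: UCI Superconductivity data with a `critical_temp` target column.
df = pd.read_csv("superconduct.csv")
X = df.drop(columns="critical_temp").to_numpy()
y = df["critical_temp"].to_numpy()

# Extrapolation split: train on the lower 90% of target values (the domain D),
# test on samples whose critical temperature exceeds everything seen in
# training (the larger domain D'). The 0.9 quantile is an illustrative choice.
threshold = np.quantile(y, 0.9)
train, test = y <= threshold, y > threshold

# Random Forest stands in here for any of the baselines compared in the paper.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], y[train])

rmse = mean_squared_error(y[test], model.predict(X[test])) ** 0.5
print(f"extrapolation RMSE above the {threshold:.1f} K cut-off: {rmse:.2f}")
```

Under such a split, a model fitted only on low-temperature samples is scored exclusively on targets larger than any it saw, which is the out-of-domain setting the abstract contrasts with ordinary in-domain generalisation.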