Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality on Hölder Class
In this paper, we construct neural networks with ReLU, sine and 2^x as activation functions. For a general continuous f defined on [0,1]^d with modulus of continuity ω_f(·), we construct ReLU-sine-2^x networks that enjoy an approximation rate 𝒪(ω_f(√d)·2^(-M) + ω_f(√d/N)), where M,N∈ℕ^+ denote hyperparameters related to the widths of the networks. As a consequence, we can construct a ReLU-sine-2^x network with depth 5 and width max{⌈2d^(3/2)(3μ/ϵ)^(1/α)⌉, 2⌈log_2(3μd^(α/2)/(2ϵ))⌉+2} that approximates f∈ℋ_μ^α([0,1]^d) within a given tolerance ϵ>0 measured in the L^p norm, p∈[1,∞), where ℋ_μ^α([0,1]^d) denotes the class of Hölder continuous functions defined on [0,1]^d with order α∈(0,1] and constant μ>0. Therefore, ReLU-sine-2^x networks overcome the curse of dimensionality on ℋ_μ^α([0,1]^d). In addition to their super expressive power, functions implemented by ReLU-sine-2^x networks are (generalized) differentiable, enabling us to apply SGD for training.
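As a rough illustration of the quantities in the abstract, the sketch below computes the stated width bound and runs a forward pass through a small network whose hidden layers mix ReLU, sine and 2^x activations. This is only a minimal sketch under our own reading of the formula; it is not the paper's construction, and the weights, layer assignment of activations, and function names (width_bound, relu_sine_exp_forward) are hypothetical.

```python
import math
import numpy as np

def width_bound(d, alpha, mu, eps):
    """Width stated in the abstract (assumed grouping of the formula):
    max{ceil(2 d^(3/2) (3 mu / eps)^(1/alpha)),
        2 * ceil(log2(3 mu d^(alpha/2) / (2 eps))) + 2}."""
    w1 = math.ceil(2 * d ** 1.5 * (3 * mu / eps) ** (1 / alpha))
    w2 = 2 * math.ceil(math.log2(3 * mu * d ** (alpha / 2) / (2 * eps))) + 2
    return max(w1, w2)

def relu_sine_exp_forward(x, weights, biases, acts):
    """Forward pass mixing ReLU, sine and 2^x activations.
    weights/biases/acts are hypothetical placeholders, one entry per layer."""
    h = x
    for W, b, act in zip(weights, biases, acts):
        z = W @ h + b
        if act == "relu":
            h = np.maximum(z, 0.0)
        elif act == "sine":
            h = np.sin(z)
        elif act == "exp2":
            h = np.exp2(z)   # the 2^x activation
        else:
            h = z            # identity on the output layer
    return h

# Example: width bound for d = 10, alpha = 0.5, mu = 1, eps = 0.1.
print(width_bound(d=10, alpha=0.5, mu=1.0, eps=0.1))
```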