Characterization of anomalous diffusion through convolutional transformers
The results of the Anomalous Diffusion Challenge (AnDi Challenge) have shown that machine learning methods can outperform classical statistical methodology at the characterization of anomalous diffusion, both in inferring the anomalous diffusion exponent α associated with each trajectory (Task 1) and in determining the underlying diffusive regime that produced each trajectory (Task 2). Furthermore, of the five teams that finished in the top three across both tasks of the AnDi Challenge, three used recurrent neural networks (RNNs). While RNNs, such as the long short-term memory (LSTM) network, are effective at learning long-term dependencies in sequential data, their key disadvantage is that they must be trained sequentially. To facilitate training on larger data sets through parallel training, we propose a new transformer-based neural network architecture for the characterization of anomalous diffusion. Our new architecture, the Convolutional Transformer (ConvTransformer), uses a two-layer convolutional neural network to extract features from diffusive trajectories; these features can be thought of as words in a sentence. The features are then fed to two transformer encoding blocks that perform either regression or classification. To our knowledge, this is the first time transformers have been used for characterizing anomalous diffusion. Moreover, this may be the first time that a transformer encoding block has been used with a convolutional neural network and without the need for a transformer decoding block or positional encoding. In addition to training in parallel, we show that the ConvTransformer outperforms the previous state of the art at determining the underlying diffusive regime in short trajectories (length 10-50 steps), which are the most important for experimental researchers.
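The pipeline described in the abstract (a two-layer convolutional feature extractor, followed by two transformer encoding blocks with no positional encoding and no decoder, followed by a regression or classification output) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name `ConvTransformerSketch`, the layer widths, kernel size, number of attention heads, the mean pooling over time steps, the linear output head, and the choice of five output classes are all assumptions made for illustration only.

```python
import torch
from torch import nn


class ConvTransformerSketch(nn.Module):
    """Illustrative only: two conv layers -> two encoder blocks -> linear head."""

    def __init__(self, n_outputs=1, d_model=32, n_heads=4, kernel_size=3):
        super().__init__()
        # Two-layer 1D CNN: maps a scalar trajectory to a sequence of
        # d_model-dimensional feature vectors (the "words" of the sentence).
        # Padding keeps the sequence length unchanged (assumed choice).
        self.features = nn.Sequential(
            nn.Conv1d(1, d_model, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
        )
        # Two transformer encoding blocks. No positional encoding is added
        # to the features and no decoding block is used.
        block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(block, num_layers=2)
        # Output head (assumed): one value for regression, or class logits.
        self.head = nn.Linear(d_model, n_outputs)

    def forward(self, x):
        # x has shape (batch, length): one-dimensional trajectories.
        h = self.features(x.unsqueeze(1))    # (batch, d_model, length)
        h = self.encoder(h.transpose(1, 2))  # (batch, length, d_model)
        return self.head(h.mean(dim=1))      # average over time (assumed pooling)


torch.manual_seed(0)
# Toy input: 8 random walks of 50 steps each (not AnDi Challenge data).
trajectories = torch.randn(8, 50).cumsum(dim=1)

regressor = ConvTransformerSketch(n_outputs=1)   # Task 1: one exponent per trajectory
classifier = ConvTransformerSketch(n_outputs=5)  # Task 2: logits, 5 classes assumed

alpha_hat = regressor(trajectories)
logits = classifier(trajectories)
short_logits = classifier(trajectories[:, :10])  # same model, 10-step trajectories
```

Because every time step is processed in one batched pass rather than recurrently, the whole sequence can be handled in parallel, which is the training advantage the abstract contrasts with RNNs. The untrained models above only demonstrate the data flow and tensor shapes; they make no claim about accuracy.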