Recurrence required to capture the dynamic computations of the human ventral visual stream
The visual system is an intricate network of brain regions that enables us to recognize the world around us. Despite its abundant lateral and feedback connections, human object processing is commonly viewed and studied as a feedforward process. Here, we measure and model the rapid representational dynamics across multiple stages of the human ventral stream using time-resolved brain imaging and deep learning. We observe substantial representational transformations during the first 300 ms of processing within and across ventral-stream regions. Categorical divisions emerge in sequence, cascading forward and in reverse across regions, and Granger causality analysis suggests bidirectional information flow between regions. Finally, recurrent deep neural network models clearly outperform feedforward models in terms of their ability to jointly capture the multi-region cortical dynamics. These results establish that recurrent models are required to understand information processing in the human ventral stream.