Convolutional Neural Networks Trained to Identify Words Provide a Good Account of Visual Form Priming Effects
A wide variety of orthographic coding schemes and models of visual word identification have been developed to account for masked priming data that provide a measure of orthographic similarity between letter strings. These models tend to include hand-coded orthographic representations with single unit coding for specific forms of knowledge (e.g., units coding for a letter in a given position or a letter sequence). Here we assess how well a range of these coding schemes and models account for the pattern of form priming effects taken from the Form Priming Project and compare these findings to results observed in with 11 standard deep neural network models (DNNs) developed in computer science. We find that deep convolutional networks perform as well or better than the coding schemes and word recognition models, whereas transformer networks did less well. The success of convolutional networks is remarkable as their architectures were not developed to support word recognition (they were designed to perform well on object recognition) and they classify pixel images of words (rather artificial encodings of letter strings). The findings add to the recent work of (Hannagan et al., 2021) suggesting that convolutional networks may capture key aspects of visual word identification.
READ FULL TEXT