Creative Artificial Intelligence – Algorithms vs. humans in an incentivized writing competition
The release of openly available, robust text generation algorithms has spurred much public attention and debate, due to algorithm's purported ability to generate human-like text across various domains. Yet, empirical evidence using incentivized tasks to assess human behavioral reactions to such algorithms is lacking. We conducted two experiments assessing behavioral reactions to the state-of-the-art Natural Language Generation algorithm GPT-2 (Ntotal = 830). Using the identical starting lines of human poems, GPT-2 produced samples of multiple algorithmically-generated poems. From these samples, either a random poem was chosen (Human-out-of-the-loop) or the best one was selected (Human-in-the-loop) and in turn matched with a human written poem. Taking part in a new incentivized version of the Turing Test, participants failed to reliably detect the algorithmically-generated poems in the human-in-the-loop treatment, yet succeeded in the Human-out-of-the-loop treatment. Further, the results reveal a general aversion towards algorithmic poetry, independent on whether participants were informed about the algorithmic origin of the poem (Transparency) or not (Opacity). We discuss what these results convey about the performance of NLG algorithms to produce human-like text and propose methodologies to study such learning algorithms in experimental settings.
READ FULL TEXT 
  
  
     share
 share