The acquisition of high-quality human annotations through crowdsourcing
...
We observe a severe under-reporting of the different kinds of errors tha...
Machine learning approaches applied to NLP are often evaluated by summar...
We introduce GEM, a living benchmark for natural language Generation (NL...