How language models process complex input that requires multiple steps o...
Bi-encoder architectures for distantly-supervised relation extraction ar...
We present Semi-Structured Explanations for COPA (COPA-SSE), a new crowd...
Improving model generalization on held-out data is one of the core objec...
Pretrained language models have been suggested as a possible alternative...
Pretrained language models, such as BERT and RoBERTa, have shown large i...
Constructive feedback is an effective method for improving critical thin...
Recent work has validated the importance of subword information for word...
How can we represent hierarchical information present in large type inve...
Pretrained contextual and non-contextual subword embeddings have become ...
We present BPEmb, a collection of pre-trained subword unit embeddings in...
Selectional preferences have long been claimed to be essential for coref...