Real-world graphs naturally exhibit hierarchical or cyclical structures ...
Pretraining molecular representations from large unlabeled data is essen...
In-context learning (ICL) operates by showing language models (LMs) exam...
This paper studies the problem of open-domain question answering, with t...
In the context of multi-step reasoning, language models (LMs) probabilit...
This work explores the problem of generating task graphs of real-world a...
Real-world tasks consist of multiple inter-dependent subtasks (e.g., a d...
Recently, Language Models (LMs) instruction-tuned on multiple tasks, als...
Since the recent advent of regulations for data protection (e.g., the Ge...
Despite surprising performance on zero-shot transfer, pre-training a lar...
To overcome the quadratic cost of self-attention, recent works have prop...
Pretrained Language Models (LMs) memorize a vast amount of knowledge dur...
Graph pooling is a crucial operation for encoding hierarchical structure...
Since most music has repetitive structures from motifs to phrases, re...
We show that standard Transformers without graph-specific modifications ...
Continual learning (CL) aims to learn from sequentially arriving tasks w...
Pre-trained large language models have shown successful progress in many...
We study unsupervised multi-hop reranking for multi-hop QA (MQA) with op...
Batch Normalization (BN) is an essential layer for training neural netwo...
Across many data domains, co-occurrence statistics about the joint appea...
Spectral topic modeling algorithms operate on matrices/tensors of word c...
The anchor words algorithm performs provably efficient topic model infer...
Spectral inference provides fast algorithms and provable optimality for ...
In this paper we present the initial development of a general theory for...
Question answering tasks have shown remarkable progress with distributed...