Model efficiency is a critical aspect of developing and deploying machin...
There remain many open questions pertaining to the scaling behaviour of...
Dense retrieval has been shown to be effective for retrieving relevant d...
Self-attention has the promise of improving computer vision systems due ...
We present BoTNet, a conceptually simple yet powerful backbone architect...
Self-attention has recently been adopted for a wide range of sequence mo...
Convolutions are a fundamental building block of modern computer vision...
Advances in learning and representations have reinvigorated work that co...
Convolutional networks have been the paradigm of choice in many computer...
Batch-splitting (data-parallelism) is the dominant distributed Deep Neur...
Music relies heavily on self-reference to build structure and meaning. W...
Music relies heavily on repetition to build structure and meaning. Self-...
Artificial intelligence (AI) has undergone a renaissance recently, makin...
Deep neural networks with discrete latent variables offer the promise of...
Tensor2Tensor is a library for deep learning models that is well-suited ...
Autoregressive sequence models based on deep neural networks, such as RN...
Relying entirely on an attention mechanism, the Transformer introduced b...
Image generation has been successfully cast as an autoregressive sequenc...
Image generation has been successfully cast as an autoregressive sequenc...
Deep learning yields great results across many fields, from speech recog...
The dominant sequence transduction models are based on complex recurrent...