The high computational and memory requirements of large language model (...
Recent advances in deep learning have relied heavily on the use of large...
State space models (SSMs) have high performance on long sequence modelin...
State space models (SSMs) have demonstrated state-of-the-art sequence
mo...
We study how networking corruptions–data corruptions caused by networkin...
Transformers are slow and memory-hungry on long sequences, since the tim...
Entity retrieval–retrieving information about entity mentions in a query...
An ideal learned representation should display transferability and
robus...
Foundation models offer an exciting new paradigm for constructing models...
Cable TV news reaches millions of U.S. households each day, meaning that...
Our goal is to enable machine learning systems to be trained interactive...
Weak supervision is a popular method for building machine learning model...
Since manually labeling training data is slow and expensive, recent
indu...
Many real-world video analysis applications require the ability to ident...
Prior work on Automatically Scalable Computation (ASC) suggests that it ...
Flocking is a coordinated collective behavior that results from local se...