Zalán Borsos

research

∙ 08/21/2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition

We present TokenSplit, a speech separation model that acts on discrete t...

0 Hakan Erdogan, et al. ∙

research

∙ 06/22/2023

AudioPaLM: A Large Language Model That Can Speak and Listen

We introduce AudioPaLM, a large language model for speech understanding ...

0 Paul K. Rubenstein, et al. ∙

research

∙ 05/16/2023

SoundStorm: Efficient Parallel Audio Generation

We present SoundStorm, a model for efficient, non-autoregressive audio g...

0 Zalán Borsos, et al. ∙

research

∙ 03/23/2023

LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models

We introduce LMCodec, a causal neural speech codec that provides high qu...

0 Teerapat Jenrungrot, et al. ∙

research

∙ 02/07/2023

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision

We introduce SPEAR-TTS, a multi-speaker text-to-speech (TTS) system that...

0 Eugene Kharitonov, et al. ∙

research

∙ 01/26/2023

MusicLM: Generating Music From Text

We introduce MusicLM, a model generating high-fidelity music from text d...

0 Andrea Agostinelli, et al. ∙

research

∙ 09/07/2022

AudioLM: a Language Modeling Approach to Audio Generation

We introduce AudioLM, a framework for high-quality audio generation with...

17 Zalán Borsos, et al. ∙

research

∙ 03/29/2022

Disentangling speech from surroundings in a neural audio codec

We present a method to separate speech signals from noisy environments i...

11 Ahmed Omran, et al. ∙

research

∙ 02/15/2022

SpeechPainter: Text-conditioned Speech Inpainting

We propose SpeechPainter, a model for filling in gaps of up to one secon...

0 Zalán Borsos, et al. ∙

research

∙ 09/26/2021

Data Summarization via Bilevel Optimization

The increasing availability of massive data sets poses a series of chall...

0 Zalán Borsos, et al. ∙

research

∙ 10/19/2020

MicAugment: One-shot Microphone Style Transfer

A crucial aspect for the successful deployment of audio-based models "in...

0 Zalán Borsos, et al. ∙

research

∙ 10/19/2020

Semi-supervised Batch Active Learning via Bilevel Optimization

Active learning is an effective technique for reducing the labeling cost...

0 Zalán Borsos, et al. ∙

research

∙ 06/06/2020

Coresets via Bilevel Optimization for Continual Learning and Streaming

Coresets are small data summaries that are sufficient for model training...

9 Zalán Borsos, et al. ∙

research

∙ 06/19/2019

Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents

Recent advances in Neural Architecture Search (NAS) have produced state-...

0 Zalán Borsos, et al. ∙

research

∙ 03/29/2019

Online Variance Reduction with Mixtures

Adaptive importance sampling for stochastic optimization is a promising ...

20 Zalán Borsos, et al. ∙

research

∙ 11/22/2018

Inference of the three-dimensional chromatin structure and its temporal behavior

Understanding the three-dimensional (3D) structure of the genome is esse...

0 Bianca-Cristina Cristescu, et al. ∙

research

∙ 02/13/2018

Online Variance Reduction for Stochastic Optimization

Modern stochastic optimization methods often rely on uniform sampling wh...

0 Zalán Borsos, et al. ∙
