Weakly-supervised learning has emerged as a promising approach to levera...
Contrastive Language-Audio Pretraining (CLAP) is pre-trained to associat...
Learning meaningful frame-wise features on a partially labeled dataset i...
Text-based audio generation models have limitations as they cannot encom...
Recently, the ability of language models (LMs) has attracted increasing...
The significance of multi-scale features has been gradually recognized b...
In this paper, we describe in detail our system for DCASE 2022 Task 4. Th...
Recently, an event-based end-to-end model (SEDT) has been proposed for s...
The recently proposed Mean Teacher has achieved state-of-the-art results...
Sound event detection (SED) has gained increasing attention with its wid...
Due to the limited availability of strongly labeled sound event detection datasets, ...
In this paper, we describe in detail our systems for DCASE 2020 Task 4. ...
The dominant automatic lexical stress detection method is to split the u...
In this paper, we describe in detail the system we submitted to DCASE201...
We propose a simple and efficient method to combine semi-supervised lear...
Sound event detection (SED) aims to recognize the presence of sound events...
We propose a disentangled feature for weakly supervised multiclass sound...