Sakriani Sakti

research

∙ 01/08/2023

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

This paper introduces SpeeChain, an open-source PyTorch-based toolkit de...

0 Heli Qi, et al. ∙

research

∙ 12/19/2022

NusaCrowd: Open Source Initiative for Indonesian NLP Resources

We present NusaCrowd, a collaborative initiative to collect and unite ex...

0 Samuel Cahyawijaya, et al. ∙

research

∙ 08/27/2022

Actor-identified Spatiotemporal Action Detection – Detecting Who Is Doing What in Videos

The success of deep learning on video Action Recognition (AR) has motiva...

20 Fan Yang, et al. ∙

research

∙ 06/01/2022

Speech Artifact Removal from EEG Recordings of Spoken Word Production with Tensor Decomposition

Research about brain activities involving spoken word production is cons...

0 Holy Lovenia, et al. ∙

research

∙ 05/14/2022

Improved Consistency Training for Semi-Supervised Sequence-to-Sequence ASR via Speech Chain Reconstruction and Self-Transcribing

Consistency regularization has recently been applied to semi-supervised ...

0 Heli Qi, et al. ∙

research

∙ 11/10/2020

Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS

This paper presents a newly developed, simultaneous neural speech-to-spe...

0 Katsuhito Sudoh, et al. ∙

research

∙ 11/04/2020

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

Even though over seven hundred ethnic languages are spoken in Indonesia,...

0 Sashi Novitasari, et al. ∙

research

∙ 11/04/2020

Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition

Attention-based sequence-to-sequence automatic speech recognition (ASR) ...

0 Sashi Novitasari, et al. ∙

research

∙ 11/04/2020

Incremental Machine Speech Chain Towards Enabling Listening while Speaking in Real-time

Inspired by a human speech chain mechanism, a machine speech chain frame...

0 Sashi Novitasari, et al. ∙

research

∙ 11/04/2020

Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework

Previous research has proposed a machine speech chain to enable automati...

0 Johanes Effendi, et al. ∙

research

∙ 10/12/2020

The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units

We present the Zero Resource Speech Challenge 2020, which aims at learni...

0 Ewan Dunbar, et al. ∙

research

∙ 07/07/2020

ReMOTS: Self-Supervised Refining Multi-Object Tracking and Segmentation

We aim to improve the performance of Multiple Object Tracking and Segmen...

0 Fan Yang, et al. ∙

research

∙ 07/07/2020

ReMOTS: Refining Multi-Object Tracking and Segmentation

We aim to improve the performance of Multiple Object Tracking and Segmen...

0 Fan Yang, et al. ∙

research

∙ 05/24/2020

Transformer VQ-VAE for Unsupervised Unit Discovery and Speech Synthesis: ZeroSpeech 2020 Challenge

In this paper, we report our submitted system for the ZeroSpeech 2020 ch...

0 Andros Tjandra, et al. ∙

research

∙ 11/24/2019

Using panoramic videos for multi-person localization and tracking in a 3D panoramic coordinate

This work proposes a new human-related video processing task named 3D pa...

0 Fan Yang, et al. ∙

research

∙ 10/02/2019

Speech-to-speech Translation between Untranscribed Unknown Languages

In this paper, we explore a method for training speech-to-speech transla...

0 Andros Tjandra, et al. ∙

research

∙ 07/23/2019

Make Skeleton-based Action Recognition Model Smaller, Faster and Better

Although skeleton-based action recognition has achieved great success in...

0 Fan Yang, et al. ∙

research

∙ 06/03/2019

From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning

The most common way for humans to communicate is by speech. But perhaps ...

0 Johanes Effendi, et al. ∙

research

∙ 05/27/2019

VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019

We describe our submitted system for the ZeroSpeech Challenge 2019. The ...

0 Andros Tjandra, et al. ∙

research

∙ 04/25/2019

The Zero Resource Speech Challenge 2019: TTS without T

We present the Zero Resource Speech Challenge 2019, which proposes to bu...

0 Ewan Dunbar, et al. ∙

research

∙ 10/31/2018

End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator

The speech chain mechanism integrates automatic speech recognition (ASR)...

0 Andros Tjandra, et al. ∙

research

∙ 07/22/2018

Multi-scale Alignment and Contextual History for Attention Mechanism in Sequence-to-sequence Model

A sequence-to-sequence model is a neural network module for mapping two ...

0 Andros Tjandra, et al. ∙

research

∙ 03/28/2018

Machine Speech Chain with One-shot Speaker Adaptation

In previous work, we developed a closed-loop speech chain model based on...

0 Andros Tjandra, et al. ∙

research

∙ 02/28/2018

Tensor Decomposition for Compressing Recurrent Neural Network

In the machine learning fields, Recurrent Neural Network (RNN) has becom...

0 Andros Tjandra, et al. ∙

research

∙ 02/23/2018

Interactive Image Manipulation with Natural Language Instruction Commands

We propose an interactive image-manipulation system with natural languag...

2 Seitaro Shinagawa, et al. ∙

research

∙ 02/13/2018

Structured-based Curriculum Learning for End-to-end English-Japanese Speech Translation

Sequence-to-sequence attentional-based neural network architectures have...

0 Takatomo Kano, et al. ∙

research

∙ 10/30/2017

Sequence-to-Sequence ASR Optimization via Reinforcement Learning

Despite the success of sequence-to-sequence approaches in automatic spee...

0 Andros Tjandra, et al. ∙

research

∙ 09/22/2017

Attention-based Wav2Text with Feature Transfer Learning

Conventional automatic speech recognition (ASR) typically performs multi...

0 Andros Tjandra, et al. ∙

research

∙ 07/16/2017

Listening while Speaking: Speech Chain by Deep Learning

Despite the close relationship between speech perception and production,...

0 Andros Tjandra, et al. ∙

research

∙ 06/07/2017

Gated Recurrent Neural Tensor Network

Recurrent Neural Networks (RNNs), which are a powerful scheme for modeli...

0 Andros Tjandra, et al. ∙

research

∙ 05/23/2017

Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing

Recently, encoder-decoder neural networks have shown impressive performa...

0 Andros Tjandra, et al. ∙

