International Conference on Computer and Knowledge Engineering

12th International Conference on Computer and Knowledge Engineering

Deep Deterministic Policy Gradient in Acoustic-to-Articulatory Inversion

Authors :

Farzane Abdoli¹, Hamid Sheikhzade², Vahid Pourahmadi³

1- Amirkabir University of Technology 2- Amirkabir University of Technology 3- Amirkabir University of Technology

Keywords :

Acoustic-to-articulatory inversion, DDPG, Reinforcement learning, Speaker-independent, MFCC

Abstract :

This paper applies a deep reinforcement learning algorithm to the acoustic-to-articulatory inversion problem. A method based on deep deterministic policy gradient (DDPG) adjusts the articulatory parameters of a speaker to minimize the cepstral difference between the original speech and the synthesized speech. Traditional methods, such as neural networks (NNs) and Gaussian mixture models (GMMs), need parallel acoustic and articulatory training data for each speaker. The proposed iterative DDPG instead explores the articulatory space to find the point that maximizes the desired reward, without requiring such paired data for the speaker. Acoustic signals are synthesized by VocalTractLab (VTL), a three-dimensional articulatory synthesizer, and are represented by Mel-frequency cepstral coefficients (MFCCs). The method yields estimated parameters very close to those obtained from MRI and advanced processing.
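The reward and parameter-adjustment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: the reward is taken to be the negative Euclidean distance between MFCC vectors, articulatory parameters are normalized to [-1, 1], and each action is a bounded increment. `ToySynthesizer` is a hypothetical stand-in and does not reflect the VocalTractLab API. The DDPG actor and critic networks are omitted, so random actions take the place of the learned policy.

```python
import math
import random


def cepstral_reward(target_mfcc, synth_mfcc):
    """Assumed reward: negative Euclidean distance between two MFCC vectors."""
    return -math.sqrt(sum((t - s) ** 2 for t, s in zip(target_mfcc, synth_mfcc)))


class ToySynthesizer:
    """Hypothetical stand-in for an articulatory synthesizer (NOT the VTL API)."""

    def __init__(self, n_params=4, n_mfcc=13, seed=0):
        rng = random.Random(seed)
        # Fixed random weights mapping articulatory parameters to pseudo-MFCCs.
        self.weights = [[rng.gauss(0.0, 1.0) for _ in range(n_params)]
                        for _ in range(n_mfcc)]

    def mfcc(self, params):
        return [math.tanh(sum(w * p for w, p in zip(row, params)))
                for row in self.weights]


class InversionEnv:
    """State: current articulatory parameters. Action: bounded increment."""

    def __init__(self, synth, target_mfcc, n_params, max_step=0.1):
        self.synth = synth
        self.target = list(target_mfcc)
        self.n_params = n_params
        self.max_step = max_step  # assumed per-step bound on each parameter
        self.params = [0.0] * n_params

    def reset(self):
        self.params = [0.0] * self.n_params
        return list(self.params)

    def step(self, action):
        # Clip the action, apply it, and keep parameters inside [-1, 1].
        clipped = [max(-self.max_step, min(self.max_step, a)) for a in action]
        self.params = [max(-1.0, min(1.0, p + a))
                       for p, a in zip(self.params, clipped)]
        reward = cepstral_reward(self.target, self.synth.mfcc(self.params))
        return list(self.params), reward


if __name__ == "__main__":
    synth = ToySynthesizer()
    target = synth.mfcc([0.3, -0.2, 0.5, 0.1])  # pretend "original speech"
    env = InversionEnv(synth, target, n_params=4)
    state = env.reset()
    rng = random.Random(1)
    for _ in range(5):
        # A trained DDPG actor would choose this action; random here.
        action = [rng.uniform(-0.1, 0.1) for _ in range(4)]
        state, reward = env.step(action)
        print(round(reward, 4))
```

In a full DDPG setup, the actor would map the state to an action and the critic would be trained on the rewards returned by `step`. Only the acoustic target is needed to compute the reward, which is why no paired articulatory recordings of the speaker enter this loop.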