Segregation of voiced and unvoiced components from residual of speech signal

Cheol-woo Jo; Jae-hee Kim

doi:10.1007/s11771-012-1031-4

Journal of Central South University, 2012, Vol. 19, Issue 2: 496-503. DOI: 10.1007/s11771-012-1031-4

Article



Abstract

Conventional source-filter models treat the voiced and unvoiced components independently. In practice, however, it is difficult to separate the source into two parts: an actual source is a mixture of the two, and the mixing ratio varies with the content or the intention of the speaker. This work investigates the separation of voiced and unvoiced components under different source models. Source signals were modeled from the residual signal obtained by inverse filtering. Three different source models were assumed. The parameters of each model were optimized against the original speech signal using a genetic algorithm. The resulting parameters were compared in terms of the mel-cepstral distance to the original signal, and the spectrogram and spectral envelope of the synthesized signal. The optimization method achieved an improvement of 15% for the Klatt model, but little improvement in the modified residual case.
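As a rough illustration of the residual extraction step mentioned above, the following sketch estimates an all-pole vocal tract filter by linear prediction and applies its inverse to a signal. The abstract does not state which analysis method or model order was used, so the autocorrelation-method LPC, the function names `lpc_coefficients` and `inverse_filter`, and the synthetic second-order test signal are all assumptions made for this example only.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (assumed here, not taken from the paper).

    Returns a[0..order] with a[0] = 1, so that A(z) = 1 + sum_k a[k] z^-k.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    predictor = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], predictor))

def inverse_filter(frame, a):
    """Apply the FIR inverse filter A(z) to obtain the residual."""
    return np.convolve(frame, a)[:len(frame)]

# Synthetic stand-in for speech: white noise through a known all-pole filter
# s[n] = 1.3 s[n-1] - 0.8 s[n-2] + e[n], i.e. A(z) = 1 - 1.3 z^-1 + 0.8 z^-2
rng = np.random.default_rng(0)
excitation = rng.standard_normal(4000)
signal = np.zeros_like(excitation)
for n in range(len(signal)):
    signal[n] = excitation[n]
    if n >= 1:
        signal[n] += 1.3 * signal[n - 1]
    if n >= 2:
        signal[n] -= 0.8 * signal[n - 2]

a_est = lpc_coefficients(signal, order=2)
residual = inverse_filter(signal, a_est)
```

On real speech the residual would then be the input to whichever source model is being fitted; the synthetic signal here only checks that the inverse filter removes the known resonance.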

Keywords

voice source / model / synthesis / optimization / genetic algorithm

Cite this article


Cheol-woo Jo, Jae-hee Kim. Segregation of voiced and unvoiced components from residual of speech signal. Journal of Central South University, 2012, 19(2): 496-503. DOI: 10.1007/s11771-012-1031-4

