第22回ロボット聴覚特集 - 奥乃研究室 - 京都大学

More documents

Recommendations

Info

Frequency (kHz)2102102102101) Mixture20 40 60 802) Only ICA20 40 60 803) ICA and T−F masking20 40 60 804) Target source component20 40 60 80Time frameFigure 11: : 1)2) ICA 3) ICA (T-F masking)4) Table 2: 9 3 : SIR (dB) a120 b120 c170 SIR −3.9 −3.6 −5.9ICA 12.5 13.6 14.5ICA 15.1 16.5 17.65 BSS 2permutation ICA BSS permutation [16][1] B. D. Van Veen and K. M. Buckley, “Beamforming: a versatileapproach to spatial filtering,” IEEE ASSP Magazine, vol. 5, pp.4–24, Apr. 1988.[2] S. Haykin, Ed., Unsupervised Adaptive Filtering (Volume I:Blind Source Separation). John Wiley & Sons, 2000.[3] A. Hyvärinen, J. Karhunen, and E. Oja, Independent ComponentAnalysis. John Wiley & Sons, 2001.[4] K. Matsuoka and S. Nakashima, “Minimal distortion principlefor blind source separation,” in Proc. ICA 2001, Dec. 2001, pp.722–727.[5] S. C. Douglas and X. Sun, “Convolutive blind separation ofspeech mixtures using the natural gradient,” Speech Communication,vol. 39, pp. 65–78, 2003.[6] S. C. Douglas, H. Sawada, and S. Makino, “A spatio-temporalFastICA algorithm for separating convolutive mixtures,” inProc. of 2005 IEEE International Conference on Acoustics,Speech, and Signal Processing (ICASSP 2005), vol. V, Mar.2005, pp. 165–168.[7] P. Smaragdis, “Blind separation of convolved mixtures inthe frequency domain,” Neurocomputing, vol. 22, pp. 21–34,1998.[8] N. Murata, S. Ikeda, and A. Ziehe, “An approach to blindsource separation based on temporal structure of speech signals,”Neurocomputing, vol. 41, no. 1-4, pp. 1–24, Oct. 2001.[9] F. Asano, S. Ikeda, M. Ogawa, H. Asoh, and N. Kitawaki,“Combined approach of array processing and independentcomponent analysis for blind separation of acoustic signals,”IEEE Trans. Speech Audio Processing, vol. 11, no. 3, pp. 204–215, May 2003.[10] H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, T. Nishikawa,and K. Shikano, “Blind source separation combining independentcomponent analysis and beamforming,,” EURASIP Journalon Applied Signal Processing, vol. 2003, no. 11, pp. 1135–1146, Nov. 2003.[11] H. Sawada, R. Mukai, S. Araki, and S. Makino, “Polar coordinatebased nonlinear function for frequency domain blindsource separation,” IEICE Trans. Fundamentals, vol. E86-A,no. 3, pp. 590–596, Mar. 2003.[12] ——, “A robust and precise method for solving the permutationproblem of frequency-domain blind source separation,”IEEE Trans. Speech Audio Processing, vol. 12, no. 5, pp. 530–538, Sept. 2004.[13] R. Mukai, H. Sawada, S. Araki, and S. Makino, “Frequencydomain blind source separation for many speech signals,”in Proc. of the 5th International Conference on IndependentComponent Analysis and Blind Signal Separation (ICA 2004 /LNCS 3195). Springer-Verlag, Sept. 2004, pp. 461–469.[14] H. Sawada, R. Mukai, S. Araki, and S. Makino, “Frequencydomainblind source separation,” in Speech Enhancement,J. Benesty, S. Makino, and J. Chen, Eds. Springer, Mar. 2005,pp. 299–327.[15] H. Sawada, S. Araki, R. Mukai, and S. Makino, “Blind extractionof a dominant source from mixtures of many sources usingICA and time-frequency masking,” in Proc. of 2005 IEEE InternationalSymposium on Circuits and Systems (ISCAS 2005),May 2005, pp. 5882–5885.[16] S. Araki, H. Sawada, R. Mukai, and S. Makino, “A novel blindsource separation method with observation vector clustering,”in Proc. 2005 International Workshop on Acoustic Echo andNoise Control (IWAENC 2005), Sept. 2005.22
社団法人人工知能学会Japanese Society forArtificial Intelligence人工知能学会JSAI Technical ReportSIG-CHallege-0522-4 (10/14)SIMO-ICA 2 Two-Stage Real-Time Blind Source Separation Combining SIMO-ICA and Binary Mask Processing † † † † ‡ ‡Y. Mori † , T. Takatani † , H. Saruwatari † , K. Shikano † ,T.Hiekata ‡ ,T.Morita ‡† Nara Institute of Science and Technology‡() Kobe Steel,Ltd.E-mail: †{yoshim-m, tomoya-t, sawatari, shikano}@is.naist.jp,‡{t-hiekata, takashi-morita}@kobelco.jpAbstractWe newly propose a real-time two-stage blindsource separation (BSS) for binaural mixed signalsobserved at the ears of humanoid robot, inwhich a Single-Input Multiple-Output (SIMO)-model-based independent component analysis(ICA) and binary mask processing are combined.SIMO-model-based ICA can separatethe mixed signals, not into monaural source signalsbut into SIMO-model-based signals fromindependent sources as they are at the microphones.Thus, the separated signals ofSIMO-model-based ICA can maintain the spatialqualities of each sound source, and thisyields that binary mask processing can be appliedto efficiently remove the residual interferencecomponents after SIMO-model-basedICA. The experimental results obtained witha human-like head reveal that the separationperformance can be considerably improved byusing the proposed method in comparison tothe conventional ICA-based and binary-maskbasedBSS methods.1 (BSS) (DOA) BSS BSS [1] [2, 3] (ICA) [4] BSS [5,6,7,8] BSS ICA ICA [9] ICA () 2 BSS BSS 2 Single-Input Multiple-Output (SIMO) ICA (SIMO-ICA) [10, 11] SIMO-ICA SIMO – [12, 13, 14] “SIMO” SIMO-ICA SIMO SIMO-ICA SIMO-ICA BSS23
Page 4: SCOT(Smoothed Coherence Transform)P
Page 8 and 9: Particle (a)(b)φ12(τ )[14]x ( t )
Page 10 and 11: - 8 -
Page 12 and 13: 1 () 2 SIMO-ICA 3 SIMO-ICA tele
Page 14: ICAy FCy FCy SIMO-ICAs 1(t)x 1(t)1(
Page 17 and 18: [15] Y. Mori, H. Saruwatari, T. Tak
Page 19 and 20: 社団法人人工知能学
Page 21 and 22: • 音源位置マイク配置
Page 23: Table 1: 6 : SIR (dB)SIR 1 SIR 2 S
Page 27 and 28: SIMO-ICA SIMO Figure 2(a)SIMO-ICA
Page 29 and 30: Binary maskConventional ICAConventi
Page 33 and 34: k lo (l), k c (l), k hi (l) l k c
Page 35 and 36: 5.75 m4.33 mNoise1.15 mUser 40°2.1
Page 38 and 39: おける方法論に関し
Page 40 and 41: Fig.6 は幼児の ABR (Auditory
Page 42 and 43: ンターフェースはスパイ
Page 44 and 45: マイクロホン[ 正面 ][ 左
Page 46 and 47: s(k)Crosstalkn(k)R S(k)X P(k)X R(k)
Page 48 and 49: する隠れマルコフモデル
Page 50 and 51: 123ÙÖ ½ ¾º¾ ´º ½µ ´º
Page 52 and 53: ÌÐ ½ ¿º¾ ÅÎÆÇÂ
Page 54 and 55: ÁÒØÖÒØÓÒÐ ÓÒÖÒ ÓÒ Á
Page 56 and 57: 例えば、同一時間差
Page 58 and 59: いて、θの絶対値が大
Page 60 and 61: Fig.11 にこのシステムの処
Page 62 and 63: 5 , 2 EMIEWFig.1 EMIEW EMIEW 6 ,
Page 64: 0 P th , (14) 4.4 3 4 4 1 , 3
Page 69 and 70: 3.1. 3.2. Fig. 3. The
Page 71 and 72: 4.1. Fig. 5. The time co
Page 75 and 76:
modal (m, ), whispery (w, ), aspir
Page 77 and 78:
Aperiodicity rate (APR)TLR (Time-La
Page 79 and 80:
社団法人人工知能学
Page 81 and 82:
, À, WDS-BF Ñ À℄·
Page 83 and 84:
Table 1: Localization Error of A Si
Page 85 and 86:
Page 87 and 88:
を行い, 閾値処理を
Page 89 and 90:
4. 音声対話制御実験H
Page 91 and 92:
Page 93 and 94:
3 HLDAMLLR [3] (Useful Information
Page 95 and 96:
Class 10degClass 20degClass 10degCl
Page 97 and 98:
Page 99 and 100:
赤い長方形内 ). 以下
Page 101 and 102:
5.2 音場計測結果(dB SPL)
Page 103 and 104:
Page 105 and 106:
a) 90 b) 90 MFMc) d) MFMe) 9
Page 107 and 108:
(3) MFT Julius 7.1 Figure 4: SIG2
show all

第22回 ロボット聴覚特集 - 奥乃研究室 - 京都大学

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?

第22回ロボット聴覚特集 - 奥乃研究室 - 京都大学