References

[KS10] C. Kim and R. M. Stern, “Nonlinear enhancement of onset for robust speech recognition,” in Proc. Interspeech, 2010.
[HAY02] S. Haykin, “Adaptive filter theory,” Prentice Hall, 2002.
[VSR13] T. Virtanen, R. Singh, B. Raj, editors, “Techniques for noise robustness in automatic speech recognition,” Wiley, 2013.
[TRE02] H. L. Van Trees, “Optimum Array Processing,” Wiley Interscience, 2002.
[VAD92] P. P. Vaidyanathan, “Multirate systems and filter banks,” Prentice Hall, 1992.
[KMSK+08] K. Kumatani, J. W. McDonough, S. Schacht, D. Klakow, P. N. Garner and W. Li, “Filter bank design based on minimization of individual aliasing terms for minimum mutual information subband adaptive beamforming,” in Proc. ICASSP, Las Vegas, USA, 2008.
[DGCN03] J. M. de Haan, N. Grbic, I. Claesson and S. E. Nordholm, “Filter bank design for subband adaptive microphone arrays,” IEEE Transactions on Speech and Audio Processing, pp. 14-23, 2003.
[CBH06] J. Chen, J. Benesty, Y. Huang, “Time Delay Estimation in Room Acoustic Environments: An Overview,” EURASIP J. Adv. Sig. Proc., 2006.
[CAR81] G. Carter, “Time delay estimation for passive sonar signal processing,” IEEE Transactions on Acoustics, Speech, and Signal Processing, pp. 463-469, 1981.
[OS94] M. Omologo, P. Svaizer, “Acoustic event localization using a crosspower-spectrum phase based technique,” in Proc. ICASSP, 1994.
[DSB01] J. H. DiBiase, H. F. Silverman, M. S. Brandstein, “Robust Localization in Reverberant Rooms,” in Microphone Arrays, M. Brandstein and D. Ward, Eds., Springer, Berlin, Heidelberg, 2001.
[AWPA05] X. Anguera, C. Wooters, B. Peskin, M. Aguilo, “Robust Speaker Segmentation for Meetings: The ICSI-SRI Spring 2005 Diarization System,” in Proc. MLMI, 2005.
[SR87] H. Schau, A. Robinson, “Passive source localization employing intersecting spherical surfaces from time-of-arrival differences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, pp. 1223-1225, 1987.
[AS87] J. Abel, J. Smith, “The spherical interpolation method for closed-form passive source localization using range difference measurements,” in Proc. ICASSP, 1987.
[BRA95] M. S. Brandstein, “A Framework for Speech Source Localization Using Sensor Arrays,” Ph.D. thesis, Brown University, 1995.
[YKA96] J. Yli-Hietanen, K. Kalliojarvi, J. Astola, “Low-complexity angle of arrival estimation of wideband signals using small arrays,” in Proc. IEEE Signal Processing Workshop on Statistical Signal and Array Processing, 1996.
[KGM06] U. Klee, T. Gehrig and J. W. McDonough, “Kalman Filters for Time Delay of Arrival-Based Source Localization,” EURASIP J. Adv. Sig. Proc., 2006.
[WM09] M. Woelfel and J. W. McDonough, “Distant Speech Recognition,” New York: Wiley, 2009.
[SBM01] K. U. Simmer, J. Bitzer and C. Marro, “Post-Filtering Techniques,” in Microphone Arrays, Heidelberg, Germany, Springer Verlag, 2001, pp. 39-60.
[SU96] T. M. Sullivan, “Multi-Microphone Correlation-Based Processing for Robust Automatic Speech Recognition,” Ph.D. thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, August 1996.
[MB03] I. McCowan and H. Bourlard, “Microphone array post-filter based on noise field coherence,” IEEE Transactions on Speech and Audio Processing, pp. 709-716, 2003.
[LM07] S. Lefkimmiatis and P. Maragos, “A generalized estimation approach for linear and nonlinear microphone array post-filters,” Speech Communication, vol. 49, no. 7-8, 2007.
[SKMC12] R. Singh, K. Kumatani, J. McDonough, L. Chen, “A signal-separation-based array postfilter for distant speech recognition,” in Proc. Interspeech, 2012.
[BS01] J. Bitzer and K. U. Simmer, “Superdirective Microphone Arrays,” in Microphone Arrays, Heidelberg, Germany, Springer Verlag, 2001, pp. 19-38.
[SBA10] M. Souden, J. Benesty and S. Affes, “On optimal frequency-domain multichannel linear filtering for noise reduction,” IEEE Trans. Audio, Speech, Language Process., pp. 260-276, 2010.
[KMB12] K. Kumatani, J. McDonough and B. Raj, “Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors,” IEEE Signal Processing Magazine, pp. 127-140, 2012.
[KMRK+09] K. Kumatani, J. W. McDonough, B. Rauch, D. Klakow, P. N. Garner and W. Li, “Beamforming With a Maximum Negentropy Criterion,” IEEE Transactions on Audio, Speech & Language Processing, pp. 994-1008, 2009.
[KMRG+08] K. Kumatani, J. W. McDonough, B. Rauch, P. N. Garner, W. Li and J. Dines, “Maximum kurtosis beamforming with the generalized sidelobe canceller,” in Proc. Interspeech, Brisbane, Australia, 2008.
[ME04] J. Meyer and G. W. Elko, “Spherical Microphone Arrays for 3D Sound Recording,” in Audio Signal Processing for Next-Generation Multimedia Communication Systems, 2004, pp. 67-89.
[MKAY+13] J. McDonough, K. Kumatani, T. Arakawa, K. Yamamoto and B. Raj, “Speaker tracking with spherical microphone arrays,” in Proc. ICASSP, Vancouver, Canada, 2013.
[TA05] I. Tashev and D. Allred, “Reverberation reduction for improved speech recognition,” in Proceedings of Hands-Free Communication and Microphone Arrays, Piscataway, USA, 2005.
[KDGH+16] K. Kinoshita, M. Delcroix, S. Gannot, E. Habets, R. Haeb-Umbach, W. Kellermann, V. Leutnant, R. Maas, T. Nakatani, B. Raj, A. Sehr and T. Yoshioka, “A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research,” EURASIP Journal on Advances in Signal Processing, 2016.
[YN12] T. Yoshioka and T. Nakatani, “Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening,” IEEE Trans. Audio, Speech, Language Process., pp. 2707-2720, 2012.
[TAS09] I. Tashev, “Sound Capture and Processing: Practical Approaches,” Wiley, 2009.
[HS04] E. Haensler and G. Schmidt, “Acoustic Echo and Noise Control - A Practical Approach,” Wiley Interscience, 2004.
[KEL97] W. Kellermann, “Strategies for combining acoustic echo cancellation and adaptive beamforming microphone arrays,” in Proc. ICASSP, 1997.
[HNK05] W. Herbordt, S. Nakamura and W. Kellermann, “Joint optimization of LCMV beamforming and acoustic echo cancellation for automatic speech recognition,” in Proc. ICASSP, Philadelphia, PA, USA, 2005.
[MCKR+11] J. McDonough, W. Chu, K. Kumatani, B. Raj and J. Lehman, “An Information Filter for Voice Prompt Suppression,” in Proc. Asilomar, Pacific Grove, CA, 2011.
[MKR11] J. McDonough, K. Kumatani and B. Raj, “On the Combination of Voice Prompt Suppression with Maximum Kurtosis Beamforming,” in Proc. WASPAA, New Paltz, NY, 2011.
[EV06] G. Enzner and P. Vary, “Frequency-domain adaptive Kalman filter for acoustic echo control in hands-free telephones,” Signal Processing, pp. 1140-1156, 2006.
[FF18] J. Franzen, T. Fingscheidt, “An Efficient Residual Echo Suppression for Multi-Channel Acoustic Echo Cancellation Based on the Frequency-Domain Adaptive Kalman Filter,” in Proc. ICASSP, 2018.
[CSVH18] G. Carbajal, R. Serizel, E. Vincent, E. Humbert, “Multiple-input neural network-based residual echo suppression,” in Proc. ICASSP, 2018.
[WM05] M. Wölfel and J. McDonough, “Minimum variance distortionless response spectral estimation, review and refinements,” IEEE Signal Processing Magazine, pp. 117-126, 2005.
[MKGS+07] J. McDonough, K. Kumatani, T. Gehrig, E. Stoimenov, U. Mayer, S. Schacht, M. Woelfel and D. Klakow, “To separate speech: A system for recognizing simultaneous speech,” in Proceedings of the 4th International Conference on Machine Learning for Multimodal Interaction, Brno, Czech Republic, 2007.
[WH07] E. Warsitz and R. Haeb-Umbach, “Blind Acoustic Beamforming based on Generalized Eigenvalue Decomposition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, 2007.
[HDH16] J. Heymann, L. Drude, R. Haeb-Umbach, “Neural network based spectral mask estimation for acoustic beamforming,” in Proc. ICASSP, 2016.
[KMR11] K. Kumatani, J. McDonough, B. Raj, “Block-wise incremental adaptation algorithm for maximum kurtosis beamforming,” in Proc. WASPAA, 2011.
[HKIK+18] T. Higuchi, K. Kinoshita, N. Ito, S. Karita, and T. Nakatani, “Frame-by-frame closed-form update for mask-based adaptive MVDR beamforming,” in Proc. ICASSP, 2018.