BTK / Millennium ASR

Open-source C++ and Python libraries for research and development in distant speech recognition (DSR)

Introduction

The BTK contains C++ and Python libraries that implement speech processing and microphone array techniques:

  • Speaker tracking,
  • Beamforming,
  • Post-filtering,
  • Speech enhancement,
  • Dereverberation,
  • Echo cancellation and
  • Speech feature extraction.
Millennium ASR implements a weighted finite-state transducer (WFST) decoder along with training and adaptation methods. Together, the two toolkits are intended to facilitate research and development in automatic distant speech recognition.

The basic components of the BTK and Millennium ASR were originally developed at the University of Karlsruhe under the European integrated project CHIL. Significant improvements and new functionality were later added at LSV, Saarland University, and MLSP, Carnegie Mellon University.

Features

  • Portable to Unix-like systems with the GNU g++ compiler and SWIG
  • Both C++ and Python interfaces
  • Efficient block-wise handling of incoming audio samples, which makes BTK suitable for real-time prototypes
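
To illustrate what block-wise handling of incoming audio means in practice, the following is a minimal sketch in plain Python of a delay-and-sum beamformer that consumes multichannel audio one block at a time. It does not use or reproduce the BTK API: the class name `BlockDelayAndSum`, its `process` method, and the restriction to integer sample delays are all assumptions made for this illustration. The point it shows is that each channel keeps a small delay line between calls, so the output does not depend on how the stream is cut into blocks.

```python
from collections import deque
from typing import List, Sequence


class BlockDelayAndSum:
    """Hypothetical delay-and-sum beamformer with integer sample delays.

    Illustration only; this is not part of the BTK API.
    """

    def __init__(self, delays: Sequence[int]) -> None:
        if not delays:
            raise ValueError("at least one channel is required")
        if any(d < 0 for d in delays):
            raise ValueError("delays must be non-negative")
        self.delays = list(delays)
        # One FIFO per channel, pre-filled with d zeros, so that channel
        # is delayed by exactly d samples. State persists across blocks.
        self.lines = [deque([0.0] * d) for d in self.delays]

    def process(self, block: Sequence[Sequence[float]]) -> List[float]:
        """Process one block, where block[c][n] is sample n of channel c."""
        if len(block) != len(self.delays):
            raise ValueError("block must contain one row per channel")
        n_samples = len(block[0])
        if any(len(channel) != n_samples for channel in block):
            raise ValueError("all channels must have the same block length")

        out = [0.0] * n_samples
        for line, channel in zip(self.lines, block):
            for n, x in enumerate(channel):
                line.append(x)              # newest sample enters the delay line
                out[n] += line.popleft()    # delayed sample leaves it

        # Average the aligned channels.
        scale = 1.0 / len(self.delays)
        return [y * scale for y in out]


def run_in_blocks(beamformer: BlockDelayAndSum,
                  channels: Sequence[Sequence[float]],
                  block_size: int) -> List[float]:
    """Feed a recorded multichannel signal to the beamformer block by block."""
    total = len(channels[0])
    output: List[float] = []
    for start in range(0, total, block_size):
        block = [ch[start:start + block_size] for ch in channels]
        output.extend(beamformer.process(block))
    return output
```

A real-time prototype would call `process` once per block delivered by the audio device instead of slicing a recorded signal as `run_in_blocks` does. Practical beamformers generally need fractional delays and frequency-domain processing, which this sketch leaves out.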