Skip to content

Latest commit

 

History

History
25 lines (15 loc) · 1.99 KB

README.md

File metadata and controls

25 lines (15 loc) · 1.99 KB

Deep Non-linear Filters for Multi-channel Speech Enhancement and Separation

This repository contains code for the papers

[1] Kristina Tesch, Nils-Hendrik Mohrmann, and Timo Gerkmann, "On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement", Proceedings of Interspeech, pp. 2908-2912, 2022, [arxiv], [audio examples]

[2] Kristina Tesch and Timo Gerkmann, "Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement", IEEE/ACM Transactions of Audio, Speech and Language Processing, vol 31. pp.563-575, 2023, [audio examples]

[3] Kristina Tesch and Timo Gerkmann, "Spatially Selective Deep Non-linear filters for Speaker Extraction", accepted for ICASSP 2023, [audio examples]

[4] Kristina Tesch and Timo Gerkmann, "Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters", IEEE/ACM Transactions of Audio, Speech and Language Processing, vol. 32, pp. 542-553, 2024 [audio examples]

Take a look at a video of our real-time multi-channel enhancement demo: http://uhh.de/inf-sp-jnf-demo

Train JNF with a fixed look direction

  1. Prepare a dataset by running data_gen_fixed_pos.py.
  2. Prepare a config file. Examples can be found in the config folder.
  3. Run the training script in the scripts folder (replace the path to your config file).

Train steerable JNF-SSF

  1. Prepare a dataset by running data_gen_var_pos.py.
  2. Prepare a config file. Examples can be found in the config folder.
  3. Run the training script in the scripts folder (replace the path to your config file).