decoder for open vocabulary keyword spotting #505

pkufool · 2023-12-27T02:39:05Z

No description provided.

pkufool · 2024-01-02T01:42:57Z

I am busy these days, so this PR won't be merged for several days. Here is the current progress, in case someone wants to try it.

The C++ binary (sherpa-onnx-keyword-spotter) works now. You have to build the project yourself; the binary will then be in build/bin.

I uploaded a Chinese model to https://www.modelscope.cn/models/pkufool/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/summary. You can try it as follows:

Clone the model:

git lfs install
git clone https://www.modelscope.cn/pkufool/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.git
ln -s sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01 exp-ppinyin

Prepare the keywords file (ppy_keywords.txt). It looks like this:

w én s ēn t è k ǎ s uǒ  @文森特卡索
zh ōu w àng j ūn @周望军
zh ū l ì n án @朱丽楠
j iǎng y ǒu b ó @蒋友伯
n ǚ ér @女儿
f ǎ g uó @法国
j iàn m iàn h uì @见面会
l uò sh í @落实
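As a rough illustration of the file format above, each line can be read as a space-separated token sequence followed by an `@`-prefixed display phrase. The helper below is a hypothetical sketch (`parse_keyword_line` is not part of sherpa-onnx), and it assumes the phrase is everything after the last `@`; the actual parser in the runtime may behave differently.

```python
def parse_keyword_line(line):
    """Split 'tok1 tok2 ... @phrase' into (tokens, phrase).

    Hypothetical helper for illustration only; not the sherpa-onnx parser.
    """
    line = line.strip()
    # Assumption: the display phrase is everything after the last '@'.
    tokens_part, sep, phrase = line.rpartition("@")
    if not sep:
        # No '@' marker on this line: treat the whole line as tokens.
        return line.split(), None
    # str.split() with no argument also absorbs repeated spaces.
    return tokens_part.split(), phrase.strip()


if __name__ == "__main__":
    tokens, phrase = parse_keyword_line("f ǎ g uó @法国")
    print(tokens, phrase)
```

Splitting on whitespace without an explicit separator keeps the sketch tolerant of the double space that appears before `@` in the first example line.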

For the pinyin part, you can use scripts/text2token.py to generate the tokens:

python scripts/text2token.py \
  --text keywords_raw.txt \
  --tokens exp-ppinyin/tokens.txt \
  --tokens-type ppinyin \
  --output ppy_keywords.txt

where keywords_raw.txt contains:

文森特卡索
周望军
朱丽楠
蒋友伯
女儿
法国
见面会
落实

Note: for now, you have to fill in the @xxx part by hand (in ppy_keywords.txt), because the output of text2token.py does not contain it. I will improve this later.
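Until the script handles this, the manual step could be automated with a few lines of Python. This is a minimal sketch, not part of the PR: `attach_phrases` is a hypothetical helper, and it assumes that text2token.py writes exactly one tokenized line per input line, in the same order as keywords_raw.txt.

```python
def attach_phrases(raw_lines, token_lines):
    """Append ' @phrase' to each tokenized line.

    Assumption: the i-th tokenized line corresponds to the i-th raw phrase.
    """
    raw = [line.strip() for line in raw_lines if line.strip()]
    toks = [line.strip() for line in token_lines if line.strip()]
    if len(raw) != len(toks):
        raise ValueError(
            f"line count mismatch: {len(raw)} phrases vs {len(toks)} token lines"
        )
    return [f"{t} @{r}" for r, t in zip(raw, toks)]


if __name__ == "__main__":
    # In-memory demo; with real files, pass open(...) handles instead, e.g.
    # attach_phrases(open("keywords_raw.txt", encoding="utf-8"),
    #                open("ppy_keywords.txt", encoding="utf-8"))
    for line in attach_phrases(["法国\n", "落实\n"], ["f ǎ g uó\n", "l uò sh í\n"]):
        print(line)
```

The length check is there so that a mismatch between the two files fails loudly instead of silently pairing phrases with the wrong tokens.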

Run the keyword spotter:

./build/bin/sherpa-onnx-keyword-spotter \
  --tokens=exp-ppinyin/tokens.txt \
  --encoder=exp-ppinyin/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \
  --decoder=exp-ppinyin/decoder-epoch-12-avg-2-chunk-16-left-64.onnx \
  --joiner=exp-ppinyin/joiner-epoch-12-avg-2-chunk-16-left-64.onnx \
  --keywords-file=ppy_keywords.txt \
  --max-active-paths=4 \
  --keywords-score=1.0 \
  --keywords-threshold=0.25 \
  --num-threads=8 \
  exp-ppinyin/test_wavs/3.wav exp-ppinyin/test_wavs/4.wav exp-ppinyin/test_wavs/5.wav exp-ppinyin/test_wavs/6.wav

The output is:

/star-kw/kangwei/code/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./build/bin/sherpa-onnx-keyword-spotter --tokens=exp-ppinyin/tokens.txt --encoder=exp-ppinyin/encoder-epoch-12-avg-2-chunk-16-left-64.onnx --decoder=exp-ppinyin/decoder-epoch-12-avg-2-chunk-16-left-64.onnx --joiner=exp-ppinyin/joiner-epoch-12-avg-2-chunk-16-left-64.onnx --keywords-file=ppy_keywords.txt --max-active-paths=4 --keywords-score=1.0 --keywords-threshold=0.25 --num-threads=8 exp-ppinyin/test_wavs/3.wav exp-ppinyin/test_wavs/4.wav exp-ppinyin/test_wavs/5.wav exp-ppinyin/test_wavs/6.wav

KeywordSpotterConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OnlineModelConfig(transducer=OnlineTransducerModelConfig(encoder="exp-ppinyin/encoder-epoch-12-avg-2-chunk-16-left-64.onnx", decoder="exp-ppinyin/decoder-epoch-12-avg-2-chunk-16-left-64.onnx", joiner="exp-ppinyin/joiner-epoch-12-avg-2-chunk-16-left-64.onnx"), paraformer=OnlineParaformerModelConfig(encoder="", decoder=""), wenet_ctc=OnlineWenetCtcModelConfig(model="", chunk_size=16, num_left_chunks=4), zipformer2_ctc=OnlineZipformer2CtcModelConfig(model=""), tokens="exp-ppinyin/tokens.txt", num_threads=8, debug=False, provider="cpu", model_type=""), endpoint_config=EndpointConfig(rule1=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=2.4, min_utterance_length=0), rule2=EndpointRule(must_contain_nonsilence=True, min_trailing_silence=1.2, min_utterance_length=0), rule3=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=0, min_utterance_length=20)), enable_endpoint=True, max_active_paths=4, num_trailing_blanks=1, keywords_score=1, keywords_threshold=0.25, keywords_file="ppy_keywords.txt",
2024-01-02 09:39:07.708196448 [E:onnxruntime:, env.cc:254 ThreadMain] pthread_setaffinity_np failed for thread: 3556038, index: 15, mask: {16, 52, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
2024-01-02 09:39:07.709450780 [E:onnxruntime:, env.cc:254 ThreadMain] pthread_setaffinity_np failed for thread: 3556039, index: 16, mask: {17, 53, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.
exp-ppinyin/test_wavs/4.wav
{"start_time":0.00, "keyword": "蒋友伯", "timestamps": [0.64, 0.68, 0.84, 0.96, 1.12, 1.16], "tokens":["j", "iǎng", "y", "ǒu", "b", "ó"]}

exp-ppinyin/test_wavs/5.wav
{"start_time":0.00, "keyword": "周望军", "timestamps": [0.64, 0.68, 0.76, 0.84, 1.00, 1.08], "tokens":["zh", "ōu", "w", "àng", "j", "ūn"]}

exp-ppinyin/test_wavs/6.wav
{"start_time":0.00, "keyword": "朱丽楠", "timestamps": [0.64, 0.68, 0.76, 0.80, 1.00, 1.04], "tokens":["zh", "ū", "l", "ì", "n", "án"]}

exp-ppinyin/test_wavs/3.wav
{"start_time":0.00, "keyword": "文森特卡索", "timestamps": [0.32, 0.72, 0.96, 1.00, 1.28, 1.36, 1.52, 1.60, 1.92, 1.96], "tokens":["w", "én", "s", "ēn", "t", "è", "k", "ǎ", "s", "uǒ"]}

exp-ppinyin/test_wavs/5.wav
{"start_time":0.00, "keyword": "落实", "timestamps": [1.80, 1.92, 2.12, 2.20], "tokens":["l", "uò", "sh", "í"]}

exp-ppinyin/test_wavs/6.wav
{"start_time":0.00, "keyword": "见面会", "timestamps": [2.16, 2.24, 2.28, 2.36, 2.48, 2.52], "tokens":["j", "iàn", "m", "iàn", "h", "uì"]}

exp-ppinyin/test_wavs/4.wav
{"start_time":0.00, "keyword": "女儿", "timestamps": [3.08, 3.20, 3.24], "tokens":["n", "ǚ", "ér"]}

exp-ppinyin/test_wavs/3.wav
{"start_time":0.00, "keyword": "法国", "timestamps": [4.56, 4.64, 4.80, 4.88], "tokens":["f", "ǎ", "g", "uó"]}
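Each result line above is a JSON object, so it can be consumed with a standard JSON parser. The sketch below uses a hypothetical helper (`summarize_detection`) and assumes that `timestamps` holds one time per emitted token, in seconds from the start of the wave file; that interpretation is inferred from the sample output, not confirmed by the PR.

```python
import json


def summarize_detection(json_line):
    """Extract the keyword and its first/last token times from one result line."""
    result = json.loads(json_line)
    timestamps = result["timestamps"]
    return {
        "keyword": result["keyword"],
        # Assumption: values are per-token times in seconds.
        "first_token_time": timestamps[0],
        "last_token_time": timestamps[-1],
        "num_tokens": len(result["tokens"]),
    }


if __name__ == "__main__":
    line = (
        '{"start_time":0.00, "keyword": "法国", '
        '"timestamps": [4.56, 4.64, 4.80, 4.88], '
        '"tokens":["f", "ǎ", "g", "uó"]}'
    )
    print(summarize_detection(line))
```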

build-kws-apk.sh

sherpa-onnx/csrc/transducer-keywords-decoder.h

sherpa-onnx/jni/jni.cc

sherpa-onnx/csrc/sherpa-onnx-keyword-spotter.cc

pkufool · 2024-01-16T08:45:30Z

@csukuangfj Could you have a look again?

.github/scripts/test-kws.sh

android/SherpaOnnxKws/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt

sherpa-onnx/csrc/context-graph.h

sherpa-onnx/csrc/keyword-spotter-transducer-impl.h

android/SherpaOnnxKws/app/src/main/java/com/k2fsa/sherpa/onnx/SherpaOnnx.kt

* various fixes to ContextGraph to support open vocabulary keywords decoder
* Add keyword spotter runtime
* Add binary
* First version works
* Minor fixes
* update text2token
* default values
* Add jni for kws
* add kws android project
* Minor fixes
* Remove unused interface
* Minor fixes
* Add workflow
* handle extra info in texts
* Minor fixes
* Add more comments
* Fix ci
* fix cpp style
* Add input box in android demo so that users can specify their keywords
* Fix cpp style
* Fix comments
* Minor fixes
* Minor fixes
* minor fixes
* Minor fixes
* Minor fixes
* Add CI
* Fix code style
* cpplint
* Fix comments
* Fix error

various fixes to ContextGraph to support open vocabulary keywords decoder

a955d23

pkufool marked this pull request as draft December 27, 2023 02:39

pkufool added 6 commits December 28, 2023 18:39

Add keyword spotter runtime

ed54451

Add binary

ee03ade

First version works

539dd71

Minor fixes

d6124dc

update text2token

fdf7369

default values

9200960

pkufool added 12 commits January 5, 2024 15:11

Add jni for kws

dc06d83

add kws android project

d7b203a

Merge with master

28c5b0c

Minor fixes

0d36d48

Remove unused interface

5f3215b

Minor fixes

47bcbe4

Add workflow

246c832

handle extra info in texts

e6994eb

Minor fixes

24c69ab

Add more comments

645f49e

Fix ci

dc68be6

fix cpp style

c02284c

pkufool marked this pull request as ready for review January 11, 2024 03:13

pkufool and others added 2 commits January 11, 2024 19:02

Add input box in android demo so that users can specify their keywords

d63cc88

Fix cpp style

29bd76f

pkufool changed the title ~~[WIP] decoder for open vocabulary keyword spotting~~ decoder for open vocabulary keyword spotting Jan 11, 2024

pkufool requested a review from csukuangfj January 13, 2024 03:46

csukuangfj requested changes Jan 13, 2024

csukuangfj reviewed Jan 13, 2024

sherpa-onnx/csrc/sherpa-onnx-keyword-spotter.cc

pkufool added 2 commits January 15, 2024 15:12

Fix comments

35b594f

Minor fixes

5a2001d

pkufool added 3 commits January 15, 2024 16:20

Minor fixes

05bf820

minor fixes

b063e3e

Minor fixes

29a1b2c

pkufool mentioned this pull request Jan 15, 2024

Recipes for open vocabulary keyword spotting k2-fsa/icefall#1428

Merged

pkufool added 4 commits January 15, 2024 19:49

Minor fixes

1912a46

Add CI

46ea698

Fix code style

3d85169

cpplint

561c5ed

csukuangfj requested changes Jan 16, 2024

csukuangfj reviewed Jan 17, 2024

android/SherpaOnnxKws/app/src/main/java/com/k2fsa/sherpa/onnx/SherpaOnnx.kt

pkufool added 2 commits January 17, 2024 11:58

Fix comments

6b32571

Fix error

36b7aea

csukuangfj reviewed Jan 17, 2024

.github/scripts/test-kws.sh

pkufool merged commit b6c0209 into k2-fsa:master Jan 20, 2024
175 of 181 checks passed
