Speech Recognition for Uyghur using deep learning
Training:
This model is trained with CTC (Connectionist Temporal Classification) loss.
Download the pretrained model and the dataset from https://github.com/gheyret/uyghur-asr-ctc/releases. Extract results.7z and thuyg20_data.7z into the same folder as the Python source files, then run:
python train.py
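For readers new to CTC, the following is a minimal pure-Python sketch of what the loss computes: the negative log of the total probability of all frame-level paths that collapse to the target text. It is an illustration only, not the code in train.py. The names collapse, ctc_neg_log_likelihood and BLANK are hypothetical, and the blank index 0 is an assumption.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def collapse(path, blank=BLANK):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return out


def ctc_neg_log_likelihood(probs, target, blank=BLANK):
    """Return -log P(target | probs) using the CTC forward algorithm.

    probs[t][k] is the probability of symbol k at frame t (rows sum to 1).
    target is a list of label indices without blanks.
    Works in probability space for clarity; practical implementations
    use log space to avoid underflow on long utterances.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s]: total probability of all partial paths ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping the blank is allowed only between different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or on the trailing blank
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf
```

Applying collapse to the most probable symbol of each frame is the simplest (greedy) way to turn network output into text. Whether tonu.py decodes this way is not stated here.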
Recognition:
For recognition, download only the pretrained model (results.7z), then run:
python tonu.py test2.wav
The result will be:
Model loaded: results/UModel_last.pth
Best CER: 7.21%
Trained: 473 epochs
The model has 26,389,282 trainable parameters
======================
Recognizing file .\test2.wav
test2.wav -> bu öy eslide xotunining xush tebessumi oghlining omaq külküsi bilen güzel idi
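The "Best CER" line in the output above is a character error rate. As a rough sketch of how such a figure is commonly computed, the block below divides the Levenshtein edit distance between the recognized text and the reference text by the reference length. The function names are hypothetical, and the exact normalization used in this project (for example, whether spaces are counted) is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up hypothesis with one wrong character, for illustration only
reference = "bu öy eslide"
hypothesis = "bu oy eslide"
print(f"CER: {cer(reference, hypothesis):.2%}")
```

A corpus-level figure is usually obtained by summing the edit distances over all test utterances and dividing by the total number of reference characters, rather than averaging per-utterance rates.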
This project uses THUYG-20, a free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University (http://www.openslr.org/22/).