Multi-Window Multi-Head Attention implementation for ASR transformer #2675
Conversation
Hey guys! Hope you are doing great --- this is a very nice PR! I have converted this PR to a draft for now; please mark it ready when you think it is ready to be reviewed. You can ping me as well so that I can have a closer look as soon as possible :) Thanks for your contribution :) Best,

Hey @Adel-Moumen! Thanks for your comment. We have now finished the draft and marked it ready for review :) Best,
## Transformer

| Language | CV version | Hyperparams file | LM | Val. CER | Val. WER | Test CER | Test WER | Hugging Face link | Model link | GPUs |
| ------------- |:-------------:|:---------------------------:| -----:| -----:| -----:| -----:| -----:|:-----------:|:-----------:|:-----------:|
| English | 16.1 | mwmha_transformer_large.yaml | No | 4.72 | 10.97 | 6.68 | 13.69 | - | [model](https://1drv.ms/f/c/039f8ffe91e06416/Et7KEbSlWNdJhkjLIi7_vGQBMVhGwRRBzCSljh6aA4sJSw?e=dXeuiY) | 1xL40 48GB |
Why is the Val. WER so high? I think you swapped CER and WER, right?
No, that's right; I just double-checked. It is the same for the Conformer English model on CV 16.1 :)
@Adel-Moumen The Val. WER for MWMHA (10.97) follows the same trend, is quite close to that of the Conformer model (10.48), and is reported correctly; CER and WER are not swapped.
We've been waiting for a review for some time now. Any chance you can take a look at it soon? :) |
Added a Multi-Window Multi-Head Attention (MWMHA) module for the Transformer ASR recipe (https://openreview.net/forum?id=Q53QLftNkA).
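For reviewers unfamiliar with the idea, here is a minimal, hypothetical PyTorch sketch of multi-window attention: each head (or group of heads) restricts its attention to a different local window size, so the model mixes short- and long-range context in one layer. This is only an illustration of the general technique; the class name, `window_sizes` parameter, and masking scheme are assumptions, not the code added by this PR.

```python
import torch
import torch.nn as nn


class MultiWindowSelfAttention(nn.Module):
    """Toy multi-window self-attention: one head per window size,
    where ``None`` means an unrestricted (global) window.
    Illustrative only -- not the PR's implementation."""

    def __init__(self, d_model, window_sizes=(4, 16, None)):
        super().__init__()
        assert d_model % len(window_sizes) == 0
        self.window_sizes = window_sizes
        self.d_head = d_model // len(window_sizes)
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, D = x.shape
        H = len(self.window_sizes)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (B, H, T, d_head) so each head gets its own window.
        q = q.view(B, T, H, self.d_head).transpose(1, 2)
        k = k.view(B, T, H, self.d_head).transpose(1, 2)
        v = v.view(B, T, H, self.d_head).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (B, H, T, T)
        # Distance matrix |i - j| used to mask out-of-window positions.
        idx = torch.arange(T)
        dist = (idx[:, None] - idx[None, :]).abs()
        for h, w in enumerate(self.window_sizes):
            if w is not None:
                scores[:, h] = scores[:, h].masked_fill(dist > w // 2,
                                                        float("-inf"))
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, T, D)
        return self.out(out)


x = torch.randn(2, 32, 48)
y = MultiWindowSelfAttention(48)(x)
print(y.shape)  # torch.Size([2, 32, 48])
```

One head attends within ±2 frames, one within ±8, and one globally; the output projection then mixes the three context scales.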
In general, this contribution adds: