---
title: README
emoji: π
colorFrom: pink
colorTo: red
sdk: streamlit
pinned: false
sdk_version: 1.43.2
thumbnail: >-
https://cdn-uploads.huggingface.co/production/uploads/629e1b71bb6419817ed7566c/jeUU2sPSuMRP9IIqVnufk.png
---
- GenSEC: Text-based Generative Audio & Speech Recognition with Cascaded ASR-LLMs
  - Task 1: ASR N-best hypotheses correction
  - Task 2: Speaker Tagging from N-best hypotheses
  - Task 3: Emotion Recognition from N-best hypotheses
- Open Source Model
  - Llama-7b pre-training for ASR correction
    - https://huggingface.co/GenSEC-LLM/SLT-Task1-Llama2-7b-HyPo-baseline
- Reference: [Paper](https://arxiv.org/abs/2409.09785), IEEE SLT 2024. See the resources below for baseline models and datasets.
```bib
@inproceedings{yang2024large,
title={Large language model based generative error correction: A challenge and baselines for speech recognition, speaker tagging, and emotion recognition},
author={Yang, Chao-Han Huck and Park, Taejin and Gong, Yuan and Li, Yuanchao and Chen, Zhehuai and Lin, Yen-Ting and Chen, Chen and Hu, Yuchen and Dhawan, Kunal and {\.Z}elasko, Piotr and others},
booktitle={2024 IEEE Spoken Language Technology Workshop (SLT)},
pages={371--378},
year={2024},
organization={IEEE}
}
```
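Task 1 takes the N-best hypotheses from an ASR system as text and asks a language model for a corrected transcription. The snippet below is a minimal sketch of how such an input might be assembled. The function name `build_correction_prompt`, the prompt wording, and the sample hypotheses are all hypothetical and are not taken from the challenge or from the baseline checkpoint.

```python
from typing import List


def build_correction_prompt(hypotheses: List[str]) -> str:
    """Format an N-best list as one text prompt (hypothetical template)."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    # Number each hypothesis so the candidates stay distinguishable in the prompt.
    numbered = "\n".join(
        f"{i}. {h.strip()}" for i, h in enumerate(hypotheses, start=1)
    )
    return (
        "Below are the N-best hypotheses from a speech recognizer.\n"
        f"{numbered}\n"
        "Corrected transcription:"
    )


# Made-up N-best list, for illustration only.
nbest = [
    "i want to recognise speech",
    "i want to wreck a nice beach",
    "i want to recognize speech",
]
prompt = build_correction_prompt(nbest)
print(prompt)
```

The resulting string could then be passed to a text-generation model such as the baseline linked above. The prompt format that checkpoint expects is not stated here, so consult its model card before relying on this template.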