---
license: apache-2.0
metrics:
- accuracy
language:
- en
- zh
- ko
- ja
- de
- fr
- es
- pt
- vi
- tr
- it
- ru
- id
tags:
- keras
- tensorflow
libraries: TensorBoard
pipeline_tag: audio-classification
---
# Spoken_language_identification
## Model description
This is a spoken language recognition model trained with TensorFlow on about 2,000 hours of private data, with approximately 150 hours of speech supervision per language.
The model uses the CRNN-Attention architecture, which has previously been used for extracting utterance-level feature representations.
The system is trained on recordings sampled at 16 kHz, single channel, with 16-bit signed integer PCM encoding.
More details can be found here: [**GitHub**](https://github.com/SpeechFlow-io/Spoken_language_identification)
The model can classify a speech utterance according to the language spoken. It covers 13 different languages.
| Model Parameters | Supported Languages |
|----------|--------------------------|
| 1 M | Chinese, English, French, German, Indonesian, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Turkish, Vietnamese |
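
Because the model was trained on 16 kHz, single-channel, 16-bit signed PCM recordings, it can help to check that an input file matches that format before running inference. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_wav_format` is hypothetical and is not part of this model's code.

```python
import wave


def check_wav_format(path):
    """Return a list of mismatches against 16 kHz / mono / 16-bit PCM.

    Hypothetical helper for illustration; an empty list means the file
    matches the recording format described in this model card.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != 16000:
            problems.append(f"sample rate is {wf.getframerate()} Hz, expected 16000")
        if wf.getnchannels() != 1:
            problems.append(f"{wf.getnchannels()} channels, expected 1")
        # Sample width is reported in bytes: 2 bytes = 16 bits.
        if wf.getsampwidth() != 2:
            problems.append(f"{8 * wf.getsampwidth()}-bit samples, expected 16-bit")
    return problems
```

Calling `check_wav_format("audio.wav")` on your own file returns an empty list when the format matches. A file that does not match can still be resampled and downmixed at load time, for example with `librosa.load(path, sr=16000)`.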
## Example
[![ Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/16-Nre8aDvn0wN2dsgGa3xUsZ7S61e1h8#scrollTo=Is60zUMuPqSi)
Please see the provided Colab for details on running an example.
#### How to use
```python
import librosa
from huggingface_hub import from_pretrained_keras
from featurizers.speech_featurizers import TFSpeechFeaturizer

wav_path = "audio.wav"  # placeholder: path to your own recording

model = from_pretrained_keras("SpeechFlow/spoken_language_identification")
# librosa resamples to 16 kHz and downmixes to mono on load
signal, _ = librosa.load(wav_path, sr=16000)
output, prob = model.predict_pb(signal)
print(output)
```