kotoba-tech/kotoba-whisper-v2.1

@@ -26,7 +26,7 @@ datasets:
 # Kotoba-Whisper-v2.1
 _Kotoba-Whisper-v2.1_ is a Japanese ASR model based on [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0), with
 additional postprocessing stacks integrated as [`pipeline`](https://huggingface.co/docs/transformers/en/main_classes/pipelines). The new features include
-(i) improved timestamp achieved by [stable-ts](https://github.com/jianfch/stable-ts) and (ii) adding punctuation with [punctuators](https://github.com/1-800-BAD-CODE/punctuators/tree/main).
 These libraries are merged into Kotoba-Whisper-v2.1 via pipeline and will be applied seamlessly to the predicted transcription from [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0).
 The pipeline has been developed through the collaboration between [Asahi Ushio](https://asahiushio.com) and [Kotoba Technologies](https://twitter.com/kotoba_tech)
@@ -38,15 +38,9 @@ along with the.
 | model                                                                                                                                             |   [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) |   [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) |   [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
 | [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0)                                                         |                                                                                                        17.6 |                                                                                    15.4 |                                                                                                        17.4 |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1)                                                         |                                                                                                        17.7 |                                                                                    15.4 |                                                                                                        17   |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator + stable-ts)                                |                                                                                                        17.7 |                                                                                    15.4 |                                                                                                        17   |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator)                                            |                                                                                                        17.7 |                                                                                    15.4 |                                                                                                        17   |
-| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (stable-ts)                                             |                                                                                                        17.7 |                                                                                    15.4 |                                                                                                        17   |
 | [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0)                                                         |                                                                                                        17.8 |                                                                                    15.2 |                                                                                                        17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1)                                                         |                                                                                                        17.9 |                                                                                    15   |                                                                                                        17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator + stable-ts)                                |                                                                                                        17.9 |                                                                                    15   |                                                                                                        17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator)                                            |                                                                                                        17.9 |                                                                                    15   |                                                                                                        17.8 |
-| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (stable-ts)                                             |                                                                                                        17.9 |                                                                                    15   |                                                                                                        17.8 |
 | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)                                                                         |                                                                                                        15.3 |                                                                                    13.4 |                                                                                                        20.5 |
 | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)                                                                         |                                                                                                        15.9 |                                                                                    10.6 |                                                                                                        34.6 |
 | [openai/whisper-large](https://huggingface.co/openai/whisper-large)                                                                               |                                                                                                        16.6 |                                                                                    11.3 |                                                                                                        40.7 |
@@ -97,7 +91,6 @@ pipe = pipeline(
     chunk_length_s=15,
     batch_size=16,
     trust_remote_code=True,
-    stable_ts=True,
     punctuator=True
 )
@@ -116,12 +109,6 @@ print(result)
 + result = pipe("audio.mp3", return_timestamps=True, generate_kwargs=generate_kwargs)
 ```
-- To deactivate stable-ts:
-```diff
--     stable_ts=True,
-+     stable_ts=False,
-```
 - To deactivate punctuator:
 ```diff
 -     punctuator=True,

 # Kotoba-Whisper-v2.1
 _Kotoba-Whisper-v2.1_ is a Japanese ASR model based on [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0), with
 additional postprocessing stacks integrated as [`pipeline`](https://huggingface.co/docs/transformers/en/main_classes/pipelines). The new features include
+adding punctuation with [punctuators](https://github.com/1-800-BAD-CODE/punctuators/tree/main).
 These libraries are merged into Kotoba-Whisper-v2.1 via pipeline and will be applied seamlessly to the predicted transcription from [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0).
 The pipeline has been developed through the collaboration between [Asahi Ushio](https://asahiushio.com) and [Kotoba Technologies](https://twitter.com/kotoba_tech)
 | model                                                                                                                                             |   [CommonVoice 8 (Japanese test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0) |   [JSUT Basic 5000](https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000) |   [ReazonSpeech (held out test set)](https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test) |
 |:--------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------------:|
 | [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0)                                                         |                                                                                                        17.6 |                                                                                    15.4 |                                                                                                        17.4 |
+| [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1)                                           |                                                                                                        17.7 |                                                                                    15.4 |                                                                                                        17   |
 | [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0)                                                         |                                                                                                        17.8 |                                                                                    15.2 |                                                                                                        17.8 |
+| [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1)                                           |                                                                                                        17.9 |                                                                                    15   |                                                                                                        17.8 |
 | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)                                                                         |                                                                                                        15.3 |                                                                                    13.4 |                                                                                                        20.5 |
 | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2)                                                                         |                                                                                                        15.9 |                                                                                    10.6 |                                                                                                        34.6 |
 | [openai/whisper-large](https://huggingface.co/openai/whisper-large)                                                                               |                                                                                                        16.6 |                                                                                    11.3 |                                                                                                        40.7 |
     chunk_length_s=15,
     batch_size=16,
     trust_remote_code=True,
     punctuator=True
 )
 + result = pipe("audio.mp3", return_timestamps=True, generate_kwargs=generate_kwargs)
 ```
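
Putting the constructor arguments and the call shown above together, a minimal sketch of the post-change usage might look like the following. The helper names `build_pipeline_kwargs` and `transcribe` are hypothetical, the `model` argument is assumed to be the repository id, and `generate_kwargs` is left out because its definition is not part of the excerpt.

```python
from typing import Any, Dict


def build_pipeline_kwargs(punctuator: bool = True) -> Dict[str, Any]:
    """Collect the pipeline arguments that remain after this change (hypothetical helper)."""
    return {
        # Assumption: the model is addressed by its Hub repository id.
        "model": "kotoba-tech/kotoba-whisper-v2.1",
        "chunk_length_s": 15,
        "batch_size": 16,
        # Allows transformers to run pipeline code shipped in the model repository.
        "trust_remote_code": True,
        # `stable_ts` is no longer passed; only the punctuator toggle remains.
        "punctuator": punctuator,
    }


def transcribe(audio_path: str, punctuator: bool = True):
    """Run the pipeline on one audio file (hypothetical helper)."""
    # Imported lazily so the kwargs helper works without transformers installed.
    from transformers import pipeline

    pipe = pipeline(**build_pipeline_kwargs(punctuator))
    return pipe(audio_path, return_timestamps=True)
```

Calling `transcribe("audio.mp3", punctuator=False)` would correspond to the "deactivate punctuator" variant described in the next item.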
 - To deactivate punctuator:
 ```diff
 -     punctuator=True,