sulaimank/mms-1b-all-lg-CVGRAIN-v1

@@ -1,199 +1,145 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+model-index:
+- name: mms-1b-all-lg-CVGRAIN-v1
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mms-1b-all-lg-CVGRAIN-v1
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.0628
+- Wer: 0.0835
+- Cer: 0.0156
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.001
+- train_batch_size: 8
+- eval_batch_size: 4
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 80
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step   | Validation Loss | Wer    | Cer    |
+|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
+| 0.356         | 1.0   | 5827   | 0.1269          | 0.1730 | 0.0304 |
+| 0.2237        | 2.0   | 11654  | 0.1178          | 0.1670 | 0.0300 |
+| 0.2186        | 3.0   | 17481  | 0.1145          | 0.1566 | 0.0275 |
+| 0.2139        | 4.0   | 23308  | 0.1101          | 0.1537 | 0.0270 |
+| 0.211         | 5.0   | 29135  | 0.1062          | 0.1479 | 0.0266 |
+| 0.2088        | 6.0   | 34962  | 0.1060          | 0.1469 | 0.0258 |
+| 0.2072        | 7.0   | 40789  | 0.1032          | 0.1452 | 0.0253 |
+| 0.2043        | 8.0   | 46616  | 0.1002          | 0.1427 | 0.0252 |
+| 0.2039        | 9.0   | 52443  | 0.1015          | 0.1431 | 0.0249 |
+| 0.2024        | 10.0  | 58270  | 0.0990          | 0.1398 | 0.0244 |
+| 0.2006        | 11.0  | 64097  | 0.0934          | 0.1328 | 0.0232 |
+| 0.1994        | 12.0  | 69924  | 0.0921          | 0.1330 | 0.0236 |
+| 0.1981        | 13.0  | 75751  | 0.0955          | 0.1249 | 0.0227 |
+| 0.196         | 14.0  | 81578  | 0.0924          | 0.1276 | 0.0227 |
+| 0.1958        | 15.0  | 87405  | 0.0919          | 0.1231 | 0.0219 |
+| 0.1948        | 16.0  | 93232  | 0.0881          | 0.1216 | 0.0218 |
+| 0.1939        | 17.0  | 99059  | 0.0902          | 0.1210 | 0.0218 |
+| 0.1933        | 18.0  | 104886 | 0.0873          | 0.1216 | 0.0218 |
+| 0.1921        | 19.0  | 110713 | 0.0877          | 0.1231 | 0.0218 |
+| 0.1921        | 20.0  | 116540 | 0.0878          | 0.1179 | 0.0212 |
+| 0.1905        | 21.0  | 122367 | 0.0852          | 0.1148 | 0.0208 |
+| 0.1901        | 22.0  | 128194 | 0.0832          | 0.1121 | 0.0206 |
+| 0.189         | 23.0  | 134021 | 0.0809          | 0.1110 | 0.0206 |
+| 0.188         | 24.0  | 139848 | 0.0797          | 0.1086 | 0.0197 |
+| 0.1873        | 25.0  | 145675 | 0.0809          | 0.1083 | 0.0200 |
+| 0.1864        | 26.0  | 151502 | 0.0813          | 0.1110 | 0.0203 |
+| 0.1858        | 27.0  | 157329 | 0.0824          | 0.1030 | 0.0188 |
+| 0.1854        | 28.0  | 163156 | 0.0820          | 0.1098 | 0.0202 |
+| 0.1847        | 29.0  | 168983 | 0.0798          | 0.1065 | 0.0194 |
+| 0.1842        | 30.0  | 174810 | 0.0774          | 0.1044 | 0.0188 |
+| 0.1827        | 31.0  | 180637 | 0.0769          | 0.1063 | 0.0193 |
+| 0.1818        | 32.0  | 186464 | 0.0767          | 0.1032 | 0.0190 |
+| 0.1815        | 33.0  | 192291 | 0.0754          | 0.1001 | 0.0184 |
+| 0.1811        | 34.0  | 198118 | 0.0745          | 0.1011 | 0.0185 |
+| 0.1806        | 35.0  | 203945 | 0.0758          | 0.1032 | 0.0184 |
+| 0.1797        | 36.0  | 209772 | 0.0771          | 0.0982 | 0.0185 |
+| 0.1792        | 37.0  | 215599 | 0.0744          | 0.0982 | 0.0181 |
+| 0.1788        | 38.0  | 221426 | 0.0730          | 0.0957 | 0.0178 |
+| 0.1776        | 39.0  | 227253 | 0.0730          | 0.0965 | 0.0180 |
+| 0.1772        | 40.0  | 233080 | 0.0742          | 0.0986 | 0.0181 |
+| 0.1765        | 41.0  | 238907 | 0.0721          | 0.0951 | 0.0176 |
+| 0.1757        | 42.0  | 244734 | 0.0719          | 0.0976 | 0.0180 |
+| 0.1748        | 43.0  | 250561 | 0.0713          | 0.0934 | 0.0171 |
+| 0.1747        | 44.0  | 256388 | 0.0718          | 0.0947 | 0.0174 |
+| 0.1742        | 45.0  | 262215 | 0.0702          | 0.0939 | 0.0176 |
+| 0.1732        | 46.0  | 268042 | 0.0705          | 0.0943 | 0.0173 |
+| 0.1726        | 47.0  | 273869 | 0.0695          | 0.0939 | 0.0176 |
+| 0.1725        | 48.0  | 279696 | 0.0700          | 0.0930 | 0.0177 |
+| 0.1711        | 49.0  | 285523 | 0.0696          | 0.0914 | 0.0172 |
+| 0.1713        | 50.0  | 291350 | 0.0696          | 0.0920 | 0.0170 |
+| 0.1705        | 51.0  | 297177 | 0.0689          | 0.0938 | 0.0172 |
+| 0.1698        | 52.0  | 303004 | 0.0705          | 0.0932 | 0.0174 |
+| 0.1691        | 53.0  | 308831 | 0.0672          | 0.0914 | 0.0170 |
+| 0.1685        | 54.0  | 314658 | 0.0673          | 0.0883 | 0.0165 |
+| 0.1685        | 55.0  | 320485 | 0.0686          | 0.0912 | 0.0170 |
+| 0.1674        | 56.0  | 326312 | 0.0684          | 0.0907 | 0.0167 |
+| 0.1667        | 57.0  | 332139 | 0.0692          | 0.0895 | 0.0167 |
+| 0.1667        | 58.0  | 337966 | 0.0682          | 0.0870 | 0.0164 |
+| 0.1661        | 59.0  | 343793 | 0.0667          | 0.0864 | 0.0161 |
+| 0.1651        | 60.0  | 349620 | 0.0665          | 0.0868 | 0.0163 |
+| 0.1649        | 61.0  | 355447 | 0.0660          | 0.0866 | 0.0164 |
+| 0.1642        | 62.0  | 361274 | 0.0644          | 0.0876 | 0.0161 |
+| 0.164         | 63.0  | 367101 | 0.0655          | 0.0858 | 0.0161 |
+| 0.1639        | 64.0  | 372928 | 0.0650          | 0.0874 | 0.0160 |
+| 0.1639        | 65.0  | 378755 | 0.0652          | 0.0868 | 0.0160 |
+| 0.1625        | 66.0  | 384582 | 0.0650          | 0.0882 | 0.0161 |
+| 0.1624        | 67.0  | 390409 | 0.0648          | 0.0866 | 0.0158 |
+| 0.1617        | 68.0  | 396236 | 0.0649          | 0.0853 | 0.0158 |
+| 0.1608        | 69.0  | 402063 | 0.0639          | 0.0841 | 0.0159 |
+| 0.1605        | 70.0  | 407890 | 0.0647          | 0.0889 | 0.0161 |
+| 0.1604        | 71.0  | 413717 | 0.0635          | 0.0858 | 0.0157 |
+| 0.1598        | 72.0  | 419544 | 0.0644          | 0.0868 | 0.0160 |
+| 0.1593        | 73.0  | 425371 | 0.0640          | 0.0849 | 0.0158 |
+| 0.1591        | 74.0  | 431198 | 0.0642          | 0.0849 | 0.0156 |
+| 0.1584        | 75.0  | 437025 | 0.0639          | 0.0847 | 0.0158 |
+| 0.1584        | 76.0  | 442852 | 0.0623          | 0.0841 | 0.0155 |
+| 0.1577        | 77.0  | 448679 | 0.0631          | 0.0841 | 0.0155 |
+| 0.1578        | 78.0  | 454506 | 0.0633          | 0.0839 | 0.0155 |
+| 0.1575        | 79.0  | 460333 | 0.0630          | 0.0835 | 0.0154 |
+| 0.157         | 80.0  | 466160 | 0.0628          | 0.0835 | 0.0156 |
+### Framework versions
+- Transformers 4.47.0
+- Pytorch 2.1.0+cu118
+- Datasets 3.2.0
+- Tokenizers 0.21.0
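
The hyperparameters and the results table in the model card diff above are consistent with each other, and the arithmetic can be checked directly. The sketch below is not taken from the training script. It assumes a single training device, that the `Step` column counts optimizer updates, and that `lr_scheduler_type: linear` means linear warmup followed by linear decay to zero. The function names are hypothetical.

```python
# Values copied from the model card; the single-device assumption is ours.
TRAIN_BATCH_SIZE = 8        # train_batch_size
GRAD_ACCUM_STEPS = 2        # gradient_accumulation_steps
NUM_EPOCHS = 80             # num_epochs
STEPS_PER_EPOCH = 5827      # Step value at epoch 1.0 in the results table
PEAK_LR = 1e-3              # learning_rate
WARMUP_STEPS = 100          # lr_scheduler_warmup_steps


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_warmup_decay_lr(step, peak_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)

print("effective batch size:", batch)    # matches total_train_batch_size: 16
print("total optimizer steps:", total_steps)  # matches the final Step: 466160
# Upper bound on examples seen per epoch (the last batch may be partial).
print("examples per epoch (at most):", STEPS_PER_EPOCH * batch)
# Assumed learning rate halfway through training.
print("lr at midpoint:", linear_warmup_decay_lr(total_steps // 2, PEAK_LR,
                                                WARMUP_STEPS, total_steps))
```

Under these assumptions the warmup covers only 100 of the 466160 updates, so the learning rate decays almost linearly from 0.001 over the whole run.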

adapter.lug.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9063d5af82b5dd7041be8deaf5d6712afe28564d05eb5a2f20d847595f4e2d12
+size 8870276

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ba55e18e15d83b1b201a4eaea0be99e10c696afd9df6fa1dfa8b7c7a75296b57
 size 3858962660

 version https://git-lfs.github.com/spec/v1
+oid sha256:c197357583684a29494204bfd7766858cae9619dff8e679167c9ff016b03a660
 size 3858962660
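
Both `.safetensors` entries in this diff are Git LFS pointer files rather than the weights themselves. Each pointer records a spec version, a SHA-256 object ID, and a size in bytes. For `model.safetensors` only the `oid` changed while the size stayed at 3858962660 bytes. As a minimal sketch, a pointer can be parsed with plain string handling. The helper name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the `adapter.lug.safetensors` hunk above.

```python
def parse_lfs_pointer(text):
    """Parse Git LFS pointer text into version, hash algorithm, digest, size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Pointer contents as shown in the diff for adapter.lug.safetensors.
ADAPTER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9063d5af82b5dd7041be8deaf5d6712afe28564d05eb5a2f20d847595f4e2d12
size 8870276
"""

adapter = parse_lfs_pointer(ADAPTER_POINTER)
print(adapter["oid_algo"], adapter["oid"][:12])
print(f"adapter size: {adapter['size'] / 2**20:.2f} MiB")
print(f"model size:   {3858962660 / 2**30:.2f} GiB")
```

The sizes give a quick sanity check: the adapter file is a few megabytes, while the full model checkpoint is several gigabytes.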