jgrosjean committed
Commit
9abb0dd
1 Parent(s): dbc65e7

Update README.md

Files changed (1)
  1. README.md +20 -33
README.md CHANGED
@@ -21,7 +21,7 @@ The fine-tuning script can be accessed [here](Link).
 
 - **Developed by:** [Juri Grosjean](https://huggingface.co/jgrosjean)
 - **Model type:** [XMOD](https://huggingface.co/facebook/xmod-base)
- - **Language(s) (NLP):** [de_CH, fr_CH, it_CH, rm_CH]
+ - **Language(s) (NLP):** de_CH, fr_CH, it_CH, rm_CH
 - **License:** [More Information Needed]
 - **Finetuned from model:** [SwissBERT](https://huggingface.co/ZurichNLP/swissbert)
 
@@ -70,32 +70,12 @@ tensor([[ 5.6306e-02, -2.8375e-01, -4.1495e-02, 7.4393e-02, -3.1552e-01,
 ...]])
 ```
 
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
 ## Bias, Risks, and Limitations
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
+ This model has been trained on news articles only. Hence, it might not perform as well on other text classes.
 This multilingual model has not been fine-tuned for cross-lingual transfer. It is intended for computing sentence embeddings that can be compared mono-lingually.
 
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
 ## Training Details
 
 ### Training Data
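The hunk above keeps the card's claim that the embeddings are meant for mono-lingual comparison, but the usage snippet it refers to is cut off in this diff (only part of the tensor output survives). As a rough sketch of what such a comparison could look like, assuming the base SwissBERT checkpoint, the de_CH adapter, and the average pooling used during fine-tuning (the checkpoint name and the helper below are illustrative, not taken from the card):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: the base SwissBERT checkpoint stands in for the fine-tuned model,
# whose released repository name is not visible in this diff.
tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModel.from_pretrained("ZurichNLP/swissbert")
model.set_default_language("de_CH")  # X-MOD language adapter for the texts being compared

def embed(sentences):
    """Average-pool the last hidden states into one vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Mono-lingual comparison: both sentences are in the same language (German here).
a, b = embed(["Die Stadt baut neue Velowege.", "Neue Fahrradwege werden gebaut."])
print(torch.cosine_similarity(a, b, dim=0))
```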
@@ -115,11 +95,24 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+ - **Training regime:** python3 train_simcse_multilingual.py \
+ --seed 54699 \
+ --model_name_or_path zurichNLP/swissbert \
+ --train_file /srv/scratch2/grosjean/Masterarbeit/data_subsets \
+ --output_dir /srv/scratch2/grosjean/Masterarbeit/model \
+ --overwrite_output_dir \
+ --save_strategy no \
+ --do_train \
+ --num_train_epochs 1 \
+ --learning_rate 1e-5 \
+ --per_device_train_batch_size 4 \
+ --gradient_accumulation_steps 128 \
+ --max_seq_length 512 \
+ --overwrite_cache \
+ --pooler_type avg \
+ --pad_to_max_length \
+ --temp 0.05 \
+ --fp16 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 [More Information Needed]
 
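The new training-regime entry points at a SimCSE-style contrastive setup (`--pooler_type avg`, `--temp 0.05`), but train_simcse_multilingual.py itself is not part of this commit. A minimal sketch of the temperature-scaled, in-batch-negative objective those flags suggest (the function name and the pairing of inputs are assumptions, not the script's actual code):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a: torch.Tensor, emb_b: torch.Tensor, temp: float = 0.05) -> torch.Tensor:
    """In-batch InfoNCE: row i of emb_b is the positive for row i of emb_a,
    every other row in the batch acts as a negative. `temp` mirrors --temp 0.05."""
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    sim = emb_a @ emb_b.T / temp                      # cosine similarities / temperature
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```

Note that with `--per_device_train_batch_size 4` and `--gradient_accumulation_steps 128`, the effective batch size works out to 4 × 128 = 512 examples per optimizer step (per device), and `--fp16` enables mixed-precision training.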
@@ -155,12 +148,6 @@
 
 
 
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
 ## Environmental Impact
 
 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 