---
language: fr
license: mit
tags:
- deberta-v2
- text-classification
base_model: almanach/camembertav2-base
datasets:
- FLUE-PAWS-X
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
model-index:
- name: almanach/camembertav2-base-pawsx
  results:
  - task:
      type: text-classification
      name: Paraphrase Identification
    dataset:
      type: flue-paws-x
      name: FLUE-PAWS-X
    metrics:
    - name: accuracy
      type: accuracy
      value: 0.93511
      verified: false
---

# Model Card for almanach/camembertav2-base-pawsx

almanach/camembertav2-base-pawsx is a deberta-v2 model for text classification. It is fine-tuned on the FLUE-PAWS-X dataset for paraphrase identification and achieves an accuracy of 0.93511 on that dataset.

The model is one of the fine-tuned models in the almanach/camembertav2-base family.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at Almanach, Inria-Paris)
- **Model type:** deberta-v2
- **Language(s) (NLP):** French
- **License:** MIT
- **Finetuned from model:** almanach/camembertav2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model is intended for paraphrase identification in French: given a pair of sentences, it classifies whether they are paraphrases of each other.

## Bias, Risks, and Limitations

The model may reflect biases present in its training data, and it may not generalize well to other datasets or tasks.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("almanach/camembertav2-base-pawsx")
tokenizer = AutoTokenizer.from_pretrained("almanach/camembertav2-base-pawsx")
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Pass the two sentences to compare as "text" and "text_pair".
classifier({
    "text": "Le livre est très intéressant et j'ai appris beaucoup de choses.",
    "text_pair": "Le livre est très ennuyeux et je n'ai rien appris.",
})
```

## Training Details

### Training Data

The model is trained on the FLUE-PAWS-X dataset.

- Dataset Name: FLUE-PAWS-X
- Dataset Size:
    - Train: 49399
    - Dev: 1988
    - Test: 2000

### Training Procedure

The model was trained with the `run_classification.py` script from the Hugging Face repository.

#### Training Hyperparameters

```yml
accelerator_config: '{''split_batches'': False, ''dispatch_batches'': None, ''even_batches'': True, ''use_seedable_sampler'': True, ''non_blocking'': False, ''gradient_accumulation_kwargs'': None}'
adafactor: false
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1.0e-08
auto_find_batch_size: false
base_model: camembertv2
base_model_name: camembertav2-base-bf16-p2-17000
batch_eval_metrics: false
bf16: false
bf16_full_eval: false
data_seed: 666.0
dataloader_drop_last: false
dataloader_num_workers: 0
dataloader_persistent_workers: false
dataloader_pin_memory: true
dataloader_prefetch_factor: .nan
ddp_backend: .nan
ddp_broadcast_buffers: .nan
ddp_bucket_cap_mb: .nan
ddp_find_unused_parameters: .nan
ddp_timeout: 1800
debug: '[]'
deepspeed: .nan
disable_tqdm: false
dispatch_batches: .nan
do_eval: true
do_predict: false
do_train: true
epoch: 5.999028340080971
eval_accumulation_steps: 4
eval_accuracy: 0.9351106639839034
eval_delay: 0
eval_do_concat_batches: true
eval_loss: 0.4311606884002685
eval_on_start: false
eval_runtime: 5.8632
eval_samples: 1988
eval_samples_per_second: 339.064
eval_steps: .nan
eval_steps_per_second: 42.468
eval_strategy: epoch
eval_use_gather_object: false
evaluation_strategy: epoch
fp16: false
fp16_backend: auto
fp16_full_eval: false
fp16_opt_level: O1
fsdp: '[]'
fsdp_config: '{''min_num_params'': 0, ''xla'': False, ''xla_fsdp_v2'': False, ''xla_fsdp_grad_ckpt'': False}'
fsdp_min_num_params: 0
fsdp_transformer_layer_cls_to_wrap: .nan
full_determinism: false
gradient_accumulation_steps: 2
gradient_checkpointing: false
gradient_checkpointing_kwargs: .nan
greater_is_better: true
group_by_length: false
half_precision_backend: auto
hub_always_push: false
hub_model_id: .nan
hub_private_repo: false
hub_strategy: every_save
hub_token:
ignore_data_skip: false
include_inputs_for_metrics: false
include_num_input_tokens_seen: false
include_tokens_per_second: false
jit_mode_eval: false
label_names: .nan
label_smoothing_factor: 0.0
learning_rate: 5.0e-05
length_column_name: length
load_best_model_at_end: true
local_rank: 0
log_level: debug
log_level_replica: warning
log_on_each_node: true
logging_dir: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertav2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-5e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-666/logs
logging_first_step: false
logging_nan_inf_filter: true
logging_steps: 100
logging_strategy: steps
lr_scheduler_kwargs: '{}'
lr_scheduler_type: linear
max_grad_norm: 1.0
max_steps: -1
metric_for_best_model: accuracy
mp_parameters: .nan
name: camembertv2/runs/results/flue-PAWS-X/camembertav2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-5e-05-epochs-6-lr_scheduler-linear-warmup_steps-0
neftune_noise_alpha: .nan
no_cuda: false
num_train_epochs: 6.0
optim: adamw_torch
optim_args: .nan
optim_target_modules: .nan
output_dir: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertav2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-5e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-666
overwrite_output_dir: false
past_index: -1
per_device_eval_batch_size: 8
per_device_train_batch_size: 8
per_gpu_eval_batch_size: .nan
per_gpu_train_batch_size: .nan
prediction_loss_only: false
push_to_hub: false
push_to_hub_model_id: .nan
push_to_hub_organization: .nan
push_to_hub_token:
ray_scope: last
remove_unused_columns: true
report_to: '[''tensorboard'']'
restore_callback_states_from_checkpoint: false
resume_from_checkpoint: .nan
run_name: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertav2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-5e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-666
save_on_each_node: false
save_only_model: false
save_safetensors: true
save_steps: 500
save_strategy: epoch
save_total_limit: .nan
seed: 666
skip_memory_metrics: true
split_batches: .nan
tf32: .nan
torch_compile: true
torch_compile_backend: inductor
torch_compile_mode: .nan
torch_empty_cache_steps: .nan
torchdynamo: .nan
total_flos: 1.3373133118742268e+16
tpu_metrics_debug: false
tpu_num_cores: .nan
train_loss: 0.1195580537627343
train_runtime: 3073.2453
train_samples: 49399
train_samples_per_second: 96.443
train_steps_per_second: 6.027
use_cpu: false
use_ipex: false
use_legacy_prediction_loop: false
use_mps_device: false
warmup_ratio: 0.0
warmup_steps: 0
weight_decay: 0.0
```

#### Results

**Accuracy:** 0.93511 (`eval_accuracy` in the log above, computed on 1988 evaluation samples, which matches the size of the dev split)

## Technical Specifications

### Model Architecture and Objective

deberta-v2 for sequence classification.

## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
      title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
      author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2024},
      eprint={2411.08868},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.08868},
}
```
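
## Appendix: Checking the Training Log

The values logged under Training Hyperparameters can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the original training code. It assumes a single training device (`num_devices = 1`), which the card does not state, and it assumes one optimizer update per `gradient_accumulation_steps` full batches, with a trailing incomplete group dropped. Under those assumptions the derived numbers agree with the logged `epoch`, `train_steps_per_second`, and `eval_accuracy`.

```python
import math

# Values copied from the Training Hyperparameters log.
train_samples = 49399
per_device_train_batch_size = 8
gradient_accumulation_steps = 2
num_train_epochs = 6
train_runtime = 3073.2453          # seconds
eval_samples = 1988
eval_accuracy = 0.9351106639839034

# Assumption (not stated in the card): training ran on one device.
num_devices = 1

# Number of samples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Batches per epoch; the last batch may be partial (dataloader_drop_last is false).
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)

# Assumed rule: one update per full group of accumulated batches.
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_updates = updates_per_epoch * num_train_epochs

# Quantities to compare against the log.
implied_epoch = total_updates / (batches_per_epoch / gradient_accumulation_steps)
updates_per_second = total_updates / train_runtime
correct_predictions = round(eval_accuracy * eval_samples)

print("effective batch size:", effective_batch_size)   # 16
print("updates per epoch:", updates_per_epoch)         # 3087
print("total updates:", total_updates)                 # 18522
print("implied epoch:", implied_epoch)
print("updates per second:", round(updates_per_second, 3))
print("correct dev predictions:", correct_predictions)
```

The implied epoch comes out slightly below 6 because the odd number of batches per epoch leaves one batch outside a full accumulation group, which is consistent with the logged `epoch: 5.999028340080971`.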