---
language: fr
license: mit
tags:
- deberta-v2
- text-classification
- review-classification
base_model: almanach/camembertav2-base
datasets:
- FLUE-CLS
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
widget: # example for the french classification model
- text: "Le livre est très intéressant et j'ai appris beaucoup de choses."
  example_title: Books Review
- text: "Le film était ennuyeux et je n'ai pas aimé les acteurs."
  example_title: DVD Review
- text: "La musique était très bonne et j'ai adoré les paroles."
  example_title: Music Review
model-index:
- name: almanach/camembertav2-base-cls
  results:
  - task:
      type: text-classification
      name: Amazon Review Classification
    dataset:
      type: flue-cls
      name: FLUE-CLS
    metrics:
    - name: accuracy
      type: accuracy
      value: 0.95849
      verified: false
---

# Model Card for almanach/camembertav2-base-cls

almanach/camembertav2-base-cls is a deberta-v2 model for text classification, fine-tuned on the FLUE-CLS dataset for Amazon review classification. It achieves an accuracy of 0.95849 on the FLUE-CLS dataset. The model is part of the family of models fine-tuned from almanach/camembertav2-base.

## Model Details

### Model Description

- **Developed by:** Wissam Antoun (PhD student at Almanach, Inria-Paris)
- **Model type:** deberta-v2
- **Language(s) (NLP):** French
- **License:** MIT
- **Finetuned from model:** almanach/camembertav2-base

### Model Sources

- **Repository:** https://github.com/WissamAntoun/camemberta
- **Paper:** https://arxiv.org/abs/2411.08868

## Uses

The model can be used to classify French-language Amazon reviews of movies, music, and books.

## Bias, Risks, and Limitations

The model may reflect biases present in its training data. It was fine-tuned only on Amazon reviews of movies, music, and books, so it may not generalize well to other domains, datasets, or tasks.
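
Because the model may not transfer to other domains, it can help to measure accuracy on a small labeled sample from the target domain before relying on it. The following is a minimal sketch, not part of the original training or evaluation code: `accuracy_on_sample`, `toy_classify`, and the label names `"positive"` / `"negative"` are hypothetical and only illustrate the check. The toy classifier stands in for the real model so the sketch runs offline.

```python
from typing import Callable, Sequence


def accuracy_on_sample(
    classify: Callable[[str], str],
    texts: Sequence[str],
    gold_labels: Sequence[str],
) -> float:
    """Return the fraction of texts whose predicted label equals the gold label."""
    if len(texts) != len(gold_labels):
        raise ValueError("texts and gold_labels must have the same length")
    if not texts:
        raise ValueError("need at least one labeled example")
    correct = sum(classify(t) == g for t, g in zip(texts, gold_labels))
    return correct / len(texts)


def toy_classify(text: str) -> str:
    """Hypothetical stand-in for the real model: a single keyword rule."""
    return "negative" if "ennuyeux" in text else "positive"


# Tiny hand-labeled sample (illustrative only, not taken from FLUE-CLS).
sample_texts = [
    "Le livre est très intéressant.",
    "Le film était ennuyeux.",
    "Je n'ai pas aimé ce disque.",  # the keyword rule misses this one
]
sample_labels = ["positive", "negative", "negative"]

sample_accuracy = accuracy_on_sample(toy_classify, sample_texts, sample_labels)
print(f"accuracy on sample: {sample_accuracy:.2f}")
```

To try this with the actual model, `classify` could wrap a `transformers` text-classification pipeline, for example `lambda t: classifier(t)[0]["label"]`. The label strings the model returns depend on its configuration, so the gold labels would need to use the same names.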
## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
model = AutoModelForSequenceClassification.from_pretrained("almanach/camembertav2-base-cls")
tokenizer = AutoTokenizer.from_pretrained("almanach/camembertav2-base-cls")

# Wrap both in a text-classification pipeline and classify a French review.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("Le livre est très intéressant et j'ai appris beaucoup de choses.")
```

## Training Details

### Training Data

The model is trained on the FLUE-CLS dataset.

- Dataset Name: FLUE-CLS
- Dataset Size:
    - Train: 5997
    - Test: 5999

### Training Procedure

The model was trained with the `run_classification.py` script from the Hugging Face repository.

#### Training Hyperparameters

```yml
accelerator_config: '{''split_batches'': False, ''dispatch_batches'': None, ''even_batches'': True, ''use_seedable_sampler'': True, ''non_blocking'': False, ''gradient_accumulation_kwargs'': None}'
adafactor: false
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1.0e-08
auto_find_batch_size: false
base_model: camembertv2
base_model_name: camembertav2-base-bf16-p2-17000
batch_eval_metrics: false
bf16: false
bf16_full_eval: false
data_seed: 1.0
dataloader_drop_last: false
dataloader_num_workers: 0
dataloader_persistent_workers: false
dataloader_pin_memory: true
dataloader_prefetch_factor: .nan
ddp_backend: .nan
ddp_broadcast_buffers: .nan
ddp_bucket_cap_mb: .nan
ddp_find_unused_parameters: .nan
ddp_timeout: 1800
debug: '[]'
deepspeed: .nan
disable_tqdm: false
dispatch_batches: .nan
do_eval: true
do_predict: false
do_train: true
epoch: 5.984
eval_accumulation_steps: 4
eval_accuracy: 0.9584930821803634
eval_delay: 0
eval_do_concat_batches: true
eval_loss: 0.1653172671794891
eval_on_start: false
eval_runtime: 85.3752
eval_samples: 5999
eval_samples_per_second: 70.266
eval_steps: .nan
eval_steps_per_second: 8.785
eval_strategy: epoch
eval_use_gather_object: false
evaluation_strategy: epoch
fp16: false
fp16_backend: auto
fp16_full_eval: false
fp16_opt_level: O1
fsdp: '[]'
fsdp_config: '{''min_num_params'': 0, ''xla'': False, ''xla_fsdp_v2'': False, ''xla_fsdp_grad_ckpt'': False}'
fsdp_min_num_params: 0
fsdp_transformer_layer_cls_to_wrap: .nan
full_determinism: false
gradient_accumulation_steps: 4
gradient_checkpointing: false
gradient_checkpointing_kwargs: .nan
greater_is_better: true
group_by_length: false
half_precision_backend: auto
hub_always_push: false
hub_model_id: .nan
hub_private_repo: false
hub_strategy: every_save
hub_token:
ignore_data_skip: false
include_inputs_for_metrics: false
include_num_input_tokens_seen: false
include_tokens_per_second: false
jit_mode_eval: false
label_names: .nan
label_smoothing_factor: 0.0
learning_rate: 3.0e-05
length_column_name: length
load_best_model_at_end: true
local_rank: 0
log_level: debug
log_level_replica: warning
log_on_each_node: true
logging_dir: /scratch/camembertv2/runs/results/flue-CLS/camembertav2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1/logs
logging_first_step: false
logging_nan_inf_filter: true
logging_steps: 100
logging_strategy: steps
lr_scheduler_kwargs: '{}'
lr_scheduler_type: linear
max_grad_norm: 1.0
max_steps: -1
metric_for_best_model: accuracy
mp_parameters: .nan
name: camembertv2/runs/results/flue-CLS/camembertav2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0
neftune_noise_alpha: .nan
no_cuda: false
num_train_epochs: 6.0
optim: adamw_torch
optim_args: .nan
optim_target_modules: .nan
output_dir: /scratch/camembertv2/runs/results/flue-CLS/camembertav2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1
overwrite_output_dir: false
past_index: -1
per_device_eval_batch_size: 8
per_device_train_batch_size: 8
per_gpu_eval_batch_size: .nan
per_gpu_train_batch_size: .nan
prediction_loss_only: false
push_to_hub: false
push_to_hub_model_id: .nan
push_to_hub_organization: .nan
push_to_hub_token:
ray_scope: last
remove_unused_columns: true
report_to: '[''tensorboard'']'
restore_callback_states_from_checkpoint: false
resume_from_checkpoint: .nan
run_name: /scratch/camembertv2/runs/results/flue-CLS/camembertav2-base-bf16-p2-17000/max_seq_length-1024-gradient_accumulation_steps-4-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1
save_on_each_node: false
save_only_model: false
save_safetensors: true
save_steps: 500
save_strategy: epoch
save_total_limit: .nan
seed: 1
skip_memory_metrics: true
split_batches: .nan
tf32: .nan
torch_compile: true
torch_compile_backend: inductor
torch_compile_mode: .nan
torch_empty_cache_steps: .nan
torchdynamo: .nan
total_flos: 6620583341429724.0
tpu_metrics_debug: false
tpu_num_cores: .nan
train_loss: 0.0933089647276091
train_runtime: 1923.7045
train_samples: 5997
train_samples_per_second: 18.705
train_steps_per_second: 0.583
use_cpu: false
use_ipex: false
use_legacy_prediction_loop: false
use_mps_device: false
warmup_ratio: 0.0
warmup_steps: 0
weight_decay: 0.0
```

#### Results

**Accuracy:** 0.95849

## Technical Specifications

### Model Architecture and Objective

deberta-v2 for sequence classification.

## Citation

**BibTeX:**

```bibtex
@misc{antoun2024camembert20smarterfrench,
      title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
      author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
      year={2024},
      eprint={2411.08868},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.08868},
}
```