	---
	language: fr
	license: mit
	tags:
	- roberta
	- text-classification
	base_model: almanach/camembertv2-base
	datasets:
	- FLUE-PAWS-X
	metrics:
	- accuracy
	pipeline_tag: text-classification
	library_name: transformers
	model-index:
	- name: almanach/camembertv2-base-pawsx
	  results:
	  - task:
	      type: text-classification
	      name: Paraphrase Identification
	    dataset:
	      type: flue-paws-x
	      name: FLUE-PAWS-X
	    metrics:
	    - name: accuracy
	      type: accuracy
	      value: 0.92254
	      verified: false
	---

	# Model Card for almanach/camembertv2-base-pawsx

	almanach/camembertv2-base-pawsx is a RoBERTa-architecture model fine-tuned for paraphrase identification, a sentence-pair text classification task. It was trained on the French FLUE-PAWS-X dataset and achieves an accuracy of 0.92254 on the FLUE-PAWS-X dev set (1,988 examples).

	The model is one of several fine-tuned models derived from almanach/camembertv2-base.

	## Model Details

	### Model Description

	- Developed by: Wissam Antoun (PhD student at Almanach, Inria-Paris)
	- Model type: roberta
	- Language(s) (NLP): French
	- License: MIT
	- Fine-tuned from model: almanach/camembertv2-base

	### Model Sources

	- Repository: https://github.com/WissamAntoun/camemberta
	- Paper: https://arxiv.org/abs/2411.08868

	## Uses

	The model takes a pair of French sentences and classifies whether they are paraphrases of each other (paraphrase identification).

	## Bias, Risks, and Limitations

	The model may exhibit biases present in its training data. Because it was fine-tuned only on FLUE-PAWS-X, it may not generalize well to other datasets, domains, or tasks.


	## How to Get Started with the Model

	Use the code below to get started with the model.

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

	# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
	model = AutoModelForSequenceClassification.from_pretrained("almanach/camembertv2-base-pawsx")
	tokenizer = AutoTokenizer.from_pretrained("almanach/camembertv2-base-pawsx")

	classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

	# Pass the sentence pair as a dict so both sentences are encoded together.
	result = classifier({
	    "text": "Le livre est très intéressant et j'ai appris beaucoup de choses.",
	    "text_pair": "Le livre est très ennuyeux et je n'ai rien appris.",
	})
	print(result)
	```
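To score several sentence pairs in one call, the `text`/`text_pair` dictionary format can be extended to a list of such dictionaries. The following is a minimal sketch: `build_inputs` and `classify_pairs` are hypothetical helper names, not part of `transformers`, and the label names in the output depend on the model's configuration, which is not documented in this card.

```python
from typing import Dict, List, Tuple


def build_inputs(pairs: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert (sentence_a, sentence_b) tuples into text/text_pair dicts."""
    return [{"text": a, "text_pair": b} for a, b in pairs]


def classify_pairs(pairs: List[Tuple[str, str]],
                   model_id: str = "almanach/camembertv2-base-pawsx"):
    """Run the text-classification pipeline on a batch of sentence pairs."""
    # Imported lazily so build_inputs can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # A list of dicts is classified as a batch of sentence pairs.
    return classifier(build_inputs(pairs))
```

Calling `classify_pairs([(sentence_a, sentence_b)])` downloads the model on first use and is expected to return one prediction (a label and a score) per pair.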


	## Training Details

	### Training Data

	The model is trained on the FLUE-PAWS-X dataset.

	- Dataset Name: FLUE-PAWS-X
	- Dataset Size:
	    - Train: 49399
	    - Dev: 1988
	    - Test: 2000


	### Training Procedure

	The model was trained with the `run_classification.py` script from the Hugging Face `transformers` repository.



	#### Training Hyperparameters

	```yml
	accelerator_config: '{''split_batches'': False, ''dispatch_batches'': None, ''even_batches'':
	  True, ''use_seedable_sampler'': True, ''non_blocking'': False, ''gradient_accumulation_kwargs'':
	  None}'
	adafactor: false
	adam_beta1: 0.9
	adam_beta2: 0.999
	adam_epsilon: 1.0e-08
	auto_find_batch_size: false
	base_model: camembertv2
	base_model_name: camembertv2-base-bf16-p2-17000
	batch_eval_metrics: false
	bf16: false
	bf16_full_eval: false
	data_seed: 1.0
	dataloader_drop_last: false
	dataloader_num_workers: 0
	dataloader_persistent_workers: false
	dataloader_pin_memory: true
	dataloader_prefetch_factor: .nan
	ddp_backend: .nan
	ddp_broadcast_buffers: .nan
	ddp_bucket_cap_mb: .nan
	ddp_find_unused_parameters: .nan
	ddp_timeout: 1800
	debug: '[]'
	deepspeed: .nan
	disable_tqdm: false
	dispatch_batches: .nan
	do_eval: true
	do_predict: false
	do_train: true
	epoch: 5.999028340080971
	eval_accumulation_steps: 4
	eval_accuracy: 0.9225352112676056
	eval_delay: 0
	eval_do_concat_batches: true
	eval_loss: 0.3642682433128357
	eval_on_start: false
	eval_runtime: 4.0364
	eval_samples: 1988
	eval_samples_per_second: 492.519
	eval_steps: .nan
	eval_steps_per_second: 61.689
	eval_strategy: epoch
	eval_use_gather_object: false
	evaluation_strategy: epoch
	fp16: false
	fp16_backend: auto
	fp16_full_eval: false
	fp16_opt_level: O1
	fsdp: '[]'
	fsdp_config: '{''min_num_params'': 0, ''xla'': False, ''xla_fsdp_v2'': False, ''xla_fsdp_grad_ckpt'':
	  False}'
	fsdp_min_num_params: 0
	fsdp_transformer_layer_cls_to_wrap: .nan
	full_determinism: false
	gradient_accumulation_steps: 2
	gradient_checkpointing: false
	gradient_checkpointing_kwargs: .nan
	greater_is_better: true
	group_by_length: false
	half_precision_backend: auto
	hub_always_push: false
	hub_model_id: .nan
	hub_private_repo: false
	hub_strategy: every_save
	hub_token: <HUB_TOKEN>
	ignore_data_skip: false
	include_inputs_for_metrics: false
	include_num_input_tokens_seen: false
	include_tokens_per_second: false
	jit_mode_eval: false
	label_names: .nan
	label_smoothing_factor: 0.0
	learning_rate: 3.0e-05
	length_column_name: length
	load_best_model_at_end: true
	local_rank: 0
	log_level: debug
	log_level_replica: warning
	log_on_each_node: true
	logging_dir: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertv2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1/logs
	logging_first_step: false
	logging_nan_inf_filter: true
	logging_steps: 100
	logging_strategy: steps
	lr_scheduler_kwargs: '{}'
	lr_scheduler_type: linear
	max_grad_norm: 1.0
	max_steps: -1
	metric_for_best_model: accuracy
	mp_parameters: .nan
	name: camembertv2/runs/results/flue-PAWS-X/camembertv2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0
	neftune_noise_alpha: .nan
	no_cuda: false
	num_train_epochs: 6.0
	optim: adamw_torch
	optim_args: .nan
	optim_target_modules: .nan
	output_dir: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertv2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1
	overwrite_output_dir: false
	past_index: -1
	per_device_eval_batch_size: 8
	per_device_train_batch_size: 8
	per_gpu_eval_batch_size: .nan
	per_gpu_train_batch_size: .nan
	prediction_loss_only: false
	push_to_hub: false
	push_to_hub_model_id: .nan
	push_to_hub_organization: .nan
	push_to_hub_token: <PUSH_TO_HUB_TOKEN>
	ray_scope: last
	remove_unused_columns: true
	report_to: '[''tensorboard'']'
	restore_callback_states_from_checkpoint: false
	resume_from_checkpoint: .nan
	run_name: /scratch/camembertv2/runs/results/flue-PAWS-X/camembertv2-base-bf16-p2-17000/max_seq_length-148-gradient_accumulation_steps-2-precision-fp32-learning_rate-3e-05-epochs-6-lr_scheduler-linear-warmup_steps-0/SEED-1
	save_on_each_node: false
	save_only_model: false
	save_safetensors: true
	save_steps: 500
	save_strategy: epoch
	save_total_limit: .nan
	seed: 1
	skip_memory_metrics: true
	split_batches: .nan
	tf32: .nan
	torch_compile: true
	torch_compile_backend: inductor
	torch_compile_mode: .nan
	torch_empty_cache_steps: .nan
	torchdynamo: .nan
	total_flos: 1.33712370278538e+16
	tpu_metrics_debug: false
	tpu_num_cores: .nan
	train_loss: 0.1708474308300605
	train_runtime: 2225.7449
	train_samples: 49399
	train_samples_per_second: 133.166
	train_steps_per_second: 8.322
	use_cpu: false
	use_ipex: false
	use_legacy_prediction_loop: false
	use_mps_device: false
	warmup_ratio: 0.0
	warmup_steps: 0
	weight_decay: 0.0

	```
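The logged values imply a particular number of optimizer updates and a particular learning-rate schedule. The sketch below reproduces that arithmetic from `train_samples`, `per_device_train_batch_size`, `gradient_accumulation_steps`, `num_train_epochs`, `learning_rate`, and `warmup_steps`. The single-device setting (`num_devices = 1`) is an assumption, since the device count is not listed, and `linear_lr` is a hypothetical helper that mirrors a linear decay to zero rather than the scheduler code itself.

```python
import math

# Values copied from the hyperparameter dump.
train_samples = 49399
per_device_train_batch_size = 8
gradient_accumulation_steps = 2
num_train_epochs = 6
learning_rate = 3.0e-05
warmup_steps = 0

# Assumption: training ran on a single device (not stated in the dump).
num_devices = 1

# Examples consumed per optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Mini-batches per epoch; the last, smaller batch is kept
# because dataloader_drop_last is false.
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)

# Optimizer updates per epoch and over the whole run.
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_updates = updates_per_epoch * num_train_epochs


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_updates - step)
    return learning_rate * remaining / max(1, total_updates - warmup_steps)


print(effective_batch_size, batches_per_epoch, updates_per_epoch, total_updates)
print(linear_lr(0), linear_lr(total_updates // 2), linear_lr(total_updates))
```

Under these assumptions the run performs 18522 updates with an effective batch size of 16. That is consistent with the logged `train_steps_per_second` (8.322) multiplied by `train_runtime` (2225.7449 s), which is about 18523, and with the logged `epoch` value of 5.999, which equals 18522 divided by 3087.5 accumulated batches per epoch.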

	#### Results

	Accuracy: 0.92254
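
As a sanity check on the reported figure, the logged `eval_accuracy` can be converted back into a count of correct predictions using `eval_samples`. This is a small arithmetic sketch using only values from the training log; the variable names are illustrative.

```python
eval_accuracy = 0.9225352112676056  # eval_accuracy from the training log
eval_samples = 1988                 # eval_samples from the training log (dev split size)

# Accuracy is correct / total, so the number of correct predictions is recoverable.
correct = round(eval_accuracy * eval_samples)
errors = eval_samples - correct

print(correct, errors)
print(round(correct / eval_samples, 5))
```

This gives 1834 correct predictions out of 1988 (154 errors), which rounds to the reported accuracy of 0.92254.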

	## Technical Specifications

	### Model Architecture and Objective

	RoBERTa architecture with a sequence classification head.

	## Citation

	BibTeX:

	```bibtex
	@misc{antoun2024camembert20smarterfrench,
	    title={CamemBERT 2.0: A Smarter French Language Model Aged to Perfection},
	    author={Wissam Antoun and Francis Kulumba and Rian Touchent and Éric de la Clergerie and Benoît Sagot and Djamé Seddah},
	    year={2024},
	    eprint={2411.08868},
	    archivePrefix={arXiv},
	    primaryClass={cs.CL},
	    url={https://arxiv.org/abs/2411.08868},
	}
	```