ashaduzzaman committed on
Commit dd736db · verified · 1 Parent(s): d37c5e3

Update README.md

Files changed (1)
  1. README.md +68 -25
README.md CHANGED
@@ -8,43 +8,66 @@ datasets:
  model-index:
  - name: distilroberta-finetuned-eli5
    results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

  # distilroberta-finetuned-eli5

- This model is a fine-tuned version of [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base) on the eli5_category dataset.
- It achieves the following results on the evaluation set:
- - Loss: 2.0173

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3

- ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
@@ -52,10 +75,30 @@ The following hyperparameters were used during training:
  | 2.1761 | 2.0 | 2664 | 2.0300 |
  | 2.1281 | 3.0 | 3996 | 2.0227 |

- ### Framework versions

- - Transformers 4.42.4
- - Pytorch 2.3.1+cu121
- - Datasets 2.21.0
- - Tokenizers 0.19.1
  model-index:
  - name: distilroberta-finetuned-eli5
    results: []
+ language:
+ - en
+ metrics:
+ - perplexity
+ library_name: transformers
+ pipeline_tag: fill-mask
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+
  # distilroberta-finetuned-eli5

+ This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base), adapted to the ELI5 (Explain Like I'm Five) domain. It was trained on the `eli5_category` dataset with a masked language modeling objective, so it specializes in the plain-language, explanatory text found in ELI5 questions and answers.
+
+ ## Model Description
+
+ DistilRoBERTa is a smaller, faster, and lighter distillation of RoBERTa that keeps a good balance between efficiency and performance. This fine-tuned variant adapts DistilRoBERTa to ELI5-style text, in which complex topics are explained in simple, easy-to-understand language. Because fine-tuning used a masked language modeling objective, the model is best suited to fill-mask prediction and contextual understanding rather than free-form text generation.

+ ### Key Features:
+ - **Base Model**: [distilroberta-base](https://huggingface.co/distilroberta-base), a distilled version of the robustly optimized BERT approach (RoBERTa).
+ - **Fine-Tuned For**: ELI5 (Explain Like I'm Five) content, where complex topics are explained simply and coherently.
+ - **Architecture**: Transformer encoder with 6 layers, 768 hidden units, and 12 attention heads, making it lighter and faster than full-scale RoBERTa models.
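The architecture figures quoted above can be checked directly from the checkpoint's configuration. A minimal sketch, assuming the repository id from the Usage section below:

```python
from transformers import AutoConfig

# Load only the configuration (no weights) to inspect the model's shape.
config = AutoConfig.from_pretrained("ashaduzzaman/distilroberta-finetuned-eli5")
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
# A distilroberta-base-derived checkpoint is expected to report: 6 768 12
```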

+ ## Intended Uses & Limitations

+ ### Intended Uses:
+ - **Text Completion**: Fill in missing (masked) words in a sentence or passage, particularly in educational or explanatory content.
+ - **Simplified Explanations**: Favor plain-language, explanatory phrasing when predicting words, reflecting the ELI5-style text it was fine-tuned on.
+ - **Masked Language Modeling**: Predict masked words in a sentence (see the Usage section below), making it useful for filling in blanks and understanding context.

+ ### Limitations:
+ - **Domain-Specific Knowledge**: The model's understanding is limited to the domains present in the training data. It may not perform well on highly specialized or technical topics not covered during training.
+ - **Output Quality**: Although its predictions are usually fluent, they are not guaranteed to be accurate. Users should verify and refine the outputs, especially for critical applications.
+ - **Biases**: As with all language models, this model may reflect biases present in the training data, which is sourced from Reddit. Care should be taken when using it in sensitive or diverse contexts.

+ ## Training and Evaluation Data

+ ### Dataset:
+ - **Training Data**: The model was fine-tuned on the `eli5_category` dataset, a categorized collection of question-answer pairs and explanatory text sourced from the ELI5 subreddit (a loading sketch follows the next list). Its focus on plain-language explanations makes it well suited to this task.
+ - **Evaluation Data**: The model was evaluated on a held-out validation split of the same dataset, so the evaluation questions and explanations match the training distribution.

+ ### Data Characteristics:
+ - **Topics Covered**: A wide range of topics, including science, technology, health, and general knowledge.
+ - **Language**: Primarily English.
+ - **Data Size**: The dataset consists of thousands of question-answer pairs, providing a robust training ground for learning explanatory language patterns.
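For reference, the dataset named above can be inspected with the `datasets` library. This is a minimal sketch only; the nested `answers.text` field and the split layout are assumptions based on the public `eli5_category` dataset and may need adjusting:

```python
from datasets import load_dataset

# Download the ELI5 category dataset and look at its structure.
eli5 = load_dataset("eli5_category")
print(eli5)  # splits, column names, and example counts

example = eli5["train"][0]
print(example["title"])            # the question, e.g. an "Explain like I'm five" prompt
print(example["answers"]["text"])  # assumed nested field: list of free-text answers
```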

+ ## Training Procedure

+ ### Training Hyperparameters:
+ The following hyperparameters were used during training (a minimal `Trainer` sketch matching them follows the list):
+ - **Learning Rate**: 2e-05
+ - **Train Batch Size**: 8
+ - **Eval Batch Size**: 8
+ - **Seed**: 42
+ - **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - **Learning Rate Scheduler Type**: Linear
+ - **Number of Epochs**: 3
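A minimal sketch of a `TrainingArguments` setup matching these values, assuming the output directory name and a per-epoch evaluation strategy; the dataset preprocessing (tokenizing the answers and grouping them into fixed-length blocks, as in the standard masked-LM recipe) is omitted:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    TrainingArguments,
)

base_model = "distilroberta-base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForMaskedLM.from_pretrained(base_model)

# Randomly masks tokens at collation time; 15% is the usual default.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="distilroberta-finetuned-eli5",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    seed=42,
    lr_scheduler_type="linear",  # the Trainer default, stated explicitly
    eval_strategy="epoch",       # assumption: matches the per-epoch results table
)

# The tokenized eli5_category splits would then be passed to Trainer along with
# `model`, `training_args`, and `data_collator`, followed by `trainer.train()`.
```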

+ ### Training Results:

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|

  | 2.1761 | 2.0 | 2664 | 2.0300 |
  | 2.1281 | 3.0 | 3996 | 2.0227 |

+ - **Final Validation Loss**: 2.0173
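Perplexity, the metric listed in the card metadata, is typically reported as the exponential of the evaluation cross-entropy loss, so this final loss corresponds to a validation perplexity of roughly 7.5:

```python
import math

print(math.exp(2.0173))  # ≈ 7.52, the perplexity implied by the final validation loss
```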
+
+ ### Framework Versions
+ - **Transformers**: 4.42.4
+ - **PyTorch**: 2.3.1+cu121
+ - **Datasets**: 2.21.0
+ - **Tokenizers**: 0.19.1
+
+ ## Usage
+
+ You can use this model in a Hugging Face pipeline for fill-mask tasks:
+
+ ```python
+ from transformers import pipeline
+
+ fill_mask = pipeline(
+     "fill-mask", model="ashaduzzaman/distilroberta-finetuned-eli5"
+ )
+
+ # Example usage: return the top 3 candidates for the masked token
+ text = "The quick brown <mask> jumps over the lazy dog."
+ fill_mask(text, top_k=3)
+ ```
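The pipeline returns a list of candidate fills, each a dictionary with the predicted `token_str`, its `score`, and the completed `sequence`; inspect the top entry or all `top_k` candidates as needed.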

+ ## Acknowledgments

+ This model was developed using the [Hugging Face Transformers](https://huggingface.co/transformers) library and fine-tuned using the `eli5_category` dataset. Special thanks to the contributors of the ELI5 subreddit for providing a rich source of explanatory content.