---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'Reasoning:
**Why the answer may be good:**
1. **Context Grounding**: The answer lists different types of toys suitable for
rabbits, such as small animal toys, wooden blocks, and non-toxic plastic balls,
which aligns well with the document''s mentions of "wooden rabbit chew toys,"
"cat jingles balls," and other similar options.
2. **Relevance**: The answer is directly related to the question of how to choose
toys for a rabbit, providing clear examples and suggestions on suitable toys and
their benefits for the rabbit.
3. **Conciseness**: The answer is generally concise and informative, discussing
specific toy types and why they are beneficial, in a straightforward manner.
**Why the answer may be bad:**
1. **Context Grounding**: While the answer is generally aligned with the suggestions
from the document, it does not mention some crucial points from the document such
as avoiding toxic materials, small parts, or the importance of regularly checking
the toys for wear and tear.
2. **Relevance**: The answer omits some key advice from the document regarding
safety considerations (e.g., materials and types of wood to avoid).
3. **Conciseness**: The answer is mostly concise; however, it could be more comprehensive
by including some of the safety tips and other precautions mentioned in the document,
without making it overly lengthy.
Final Result:
****'
- text: '**Reasoning:**
### Why the Answer May Be Good:
1. **Context Grounding**: The answer correctly references Martin Allen signing
Kieron Freeman when he went on loan to Notts County, which is supported by the
document.
2. **Relevance**: It directly names the manager who signed Kieron Freeman, which
is related to the document’s content.
### Why the Answer May Be Bad:
1. **Context Grounding**: The provided document discusses Kieron Freeman and his
football career; however, the document does not mention Aaron Pryor or any details
about his boxing career.
2. **Relevance**: The question specifically asks about Aaron Pryor’s manager during
his boxing career, not about Kieron Freeman''s football career or the managers
associated with Freeman.
3. **Conciseness**: The answer, although addressing the content correctly, does
not respond to the specific question asked.
### Final Result:
**Bad**
The answer does not address the question about Aaron Pryor''s boxing career and
his manager, making it irrelevant despite being correct within its own footballing
context.'
- text: 'Reasoning for why the answer may be good:
1. The provided answer does touch on a general concern associated with online
casinos, which is user data security.
2. Security concerns, especially loss or compromise of data, are common issues
related to online activities including online gambling.
Reasoning for why the answer may be bad:
1. The document provided does not seem to contain any content from July 10, 2011,
or any specific message related to that date.
2. The answer does not address the specific question about the concern of the
husband of the person who wrote the message on July 10, 2011.
3. There is no indication or context from the provided document that discusses
the husband or a message from July 10, 2011.
Final result:'
- text: 'Reasoning why the answer may be good:
1. The answer attempts to provide steps on how to train a dog from running out
of the house, which is directly related to the question.
2. It suggests using a command ("fly") and a toy as a reward, aligning with typical
training methods where behavior is rewarded.
3. It touches on the aspect of preventing the dog from running out of the house,
addressing the core issue.
Reasoning why the answer may be bad:
1. Context Grounding: The answer is not well-supported by the provided document.
The document mentions commands such as "sit" and "stay," and strategies like using
treats, clickers, and physical barriers, but does not mention a "fly" command
or holding hands above the dog''s tail.
2. Relevance: The answer deviates by suggesting a "fly" command, holding toys
above the tail, and the use of a magic spell, which are not grounded in the document
or part of standard dog training techniques.
3. Conciseness: The mention of using a magic spell and providing minimal exercise
at night are unnecessary and irrelevant, making the answer less clear and to the
point.
Final Result:'
- text: 'Reasoning:
The answer is directly taken from the provided document, specifically from the
line "Allan Cox''s First Class Delivery on a H128-10W for his Level 1 certification
flight." This indicates that the information is well-supported and context-grounded.
The answer is relevant as it directly addresses the specific question about the
type of engine used for Allan Cox''s Level 1 certification flight.
The answer is concise and to the point without any unnecessary information.
Final Result:'
inference: true
---
# SetFit with BAAI/bge-base-en-v1.5
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
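
Step 1 trains on sentence pairs rather than on single examples. The sketch below shows, in plain Python, one way such pairs could be drawn from a small labeled set: same-label pairs get a target of 1.0 and different-label pairs a target of 0.0. The function name `sample_pairs`, the toy texts, and the sampling details are illustrative assumptions, not SetFit's internal implementation.

```python
import random


def sample_pairs(texts, labels, num_iterations, seed=42):
    """Draw contrastive pairs as (text_a, text_b, target) tuples.

    Illustrative only: for every iteration and every anchor text, emit one
    positive pair (same label, target 1.0) and one negative pair
    (different label, target 0.0).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates sharing the anchor's label, excluding the anchor itself
            same = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            # Candidates carrying the other label
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Hypothetical toy data, not taken from the training set
texts = ["grounded answer", "relevant answer", "off-topic answer", "unsupported answer"]
labels = [1, 1, 0, 0]
pairs = sample_pairs(texts, labels, num_iterations=2)
print(len(pairs))  # 4 texts * 2 iterations * 2 pairs each = 16
```

The resulting pairs are what a similarity-based loss consumes; step 2 then fits the classification head on embeddings of the individual texts.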
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
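
To make the role of the classification head concrete, here is a minimal plain-Python sketch of how a binary logistic regression turns an embedding into one of the two classes. The weights, bias, and 3-dimensional vector are made-up values for illustration; the real head is a fitted scikit-learn `LogisticRegression` operating on the 768-dimensional embeddings that `BAAI/bge-base-en-v1.5` produces.

```python
import math


def predict_label(embedding, weights, bias):
    """Binary logistic regression: return (label, probability of class 1)."""
    logit = sum(w * x for w, x in zip(weights, embedding)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return (1 if prob >= 0.5 else 0), prob


# Hypothetical toy values, not the model's learned parameters
embedding = [0.2, -0.1, 0.4]
weights = [1.0, 2.0, -0.5]
bias = 0.1

label, prob = predict_label(embedding, weights, bias)
print(label, round(prob, 3))  # logit is -0.1, so class 0 with probability ~0.475
```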
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:------|:---------|
| 0 | <ul><li>"**Reasoning:**\n\n**Why the Answer May Be Good:**\n1. **Context Grounding:** The answer references the points made in the document, such as Coach Brian Shaw's strategy of pushing the ball after makes and misses as well as encouraging players to take the first available shot within the rhythm of the offense.\n2. **Relevance:** The answer directly addresses why the Nuggets are having an offensive outburst, highlighting the coaching strategy and players' adaptation.\n3. **Conciseness:** The answer is mostly to the point and focuses on the main question.\n\n**Why the Answer May Be Bad:**\n1. **Context Grounding:** The mention of a new training technique using virtual reality is not supported by any information within the document provided.\n2. **Conciseness:** The additional detail about the virtual reality training is unnecessary given that it is not referenced in the document and does not contribute to answering the specific question about the offensive outburst.\n \n**Final Result:**\nBased on the evaluation criteria, the inclusion of fictitious or unsupported information about the virtual reality training significantly detracts from the answer’s credibility and relevance.\n\n****"</li><li>'Reasoning why the answer may be good:\n1. **Context Grounding:** The provided answer cites specific information about film and digital photography directly from the provided document, showing a good grounding.\n2. **Relevance:** The answer addresses the specific question by discussing different aspects such as exposure tolerance, color capture, and overall image resolution between film and digital photography.\n3. **Conciseness:** The answer is relatively concise and sticks to the main points relevant to the question without unnecessary elaboration.\n\nReasoning why the answer may be bad:\n1. **Overly Detailed:** The answer could be seen as too detailed in certain segments, which might slightly detract from conciseness.\n2. **Possible Confusion:** The mention of specific technical details like "5MP digital sensors" could confuse readers who are not familiar with the technical specifications, detracting from clarity.\n3. **Omission of Key Comparison Points:** The answer does not touch upon some of the more subjective observations made by the author, like the practical advantages in using film for certain types of photography.\n\nFinal Result:'</li><li>'Reasoning:\n1. **Context Grounding**: The answer provided does not reference the third book of the Arcana Chronicles by Kresley Cole or even discuss any content relevant to it. Instead, it discusses an MMA event in Calgary, Alberta, Canada.\n2. **Relevance**: The answer is entirely irrelevant to the question. The question is about the main conflict in the third book of a specific book series, but the answer describes an MMA fight event.\n3. **Conciseness**: While the answer is concise in its context, it is entirely off-topic and therefore does not satisfy the conciseness criterion in a meaningful way.\n\nThe answer may be deemed bad because it does not address the question about the Arcana Chronicles at all and instead provides unrelated information about an MMA event.\n\nFinal result:'</li></ul> |
| 1 | <ul><li>'Reasoning:\n\n1. Context Grounding:\n - Good: The answer is supported by the document. The suggestions mentioned (getting to know the client, signing a contract, and showcasing honesty and diplomacy) are directly referenced in the text provided.\n - Bad: There is no significant bad aspect in terms of context grounding; the answer sticks closely to the source material.\n\n2. Relevance:\n - Good: The answer is highly relevant to the question about best practices to avoid unnecessary revisions and conflicts. It addresses client understanding, contractual agreements, and the handling of extra charges—all crucial for minimizing conflicts.\n - Bad: There is no deviation from the topic. The answer is focused solely on the best practices, as asked in the question.\n\n3. Conciseness:\n - Good: The answer is concise and to the point, effectively summarizing the practices without unnecessary details.\n - Bad: The level of detail might be too succinct for some readers looking for more in-depth discussion, but this is minor given the criteria.\n\nFinal Result:'</li><li>"Reasoning for why the answer may be good:\n- The answer references the author’s emphasis on drawing from personal experiences of pain and emotion to create genuine and relatable characters, which is well-supported by the document.\n- It highlights the importance of genuineness and relatability, which aligns directly with the content provided in the document.\n- The answer stays focused on the specific question about creating a connection between the reader and the characters.\n\nReasoning for why the answer may be bad:\n- The answer could be seen as slightly verbose and might include more detail than necessary, rather than being extremely concise.\n- It does not explicitly mention the document's use of pain for romance authors specifically, which might add to the context.\n\nFinal result:"</li><li>"**Reasoning:**\n\n**Pros:**\n1. **Context Grounding:** The document explicitly states that Mauro Rubin is the CEO of JoinPad and mentions that he was speaking at the event, which directly supports the answer.\n2. **Relevance:** The answer directly and correctly responds to the question about the CEO's identity during the event.\n3. **Conciseness:** The answer is brief and to the point, providing only the necessary information.\n\n**Cons:**\n- There are no significant cons as the answer fulfills all criteria effectively.\n\n**Final Result:**"</li></ul> |
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_gpt-4o_improved-cot-instructions_two_reasoning_remove_final_evaluation_e1")

# The input spans several lines, so it needs a triple-quoted string
text = """Reasoning:
The answer is directly taken from the provided document, specifically from the line "Allan Cox's First Class Delivery on a H128-10W for his Level 1 certification flight." This indicates that the information is well-supported and context-grounded.
The answer is relevant as it directly addresses the specific question about the type of engine used for Allan Cox's Level 1 certification flight.
The answer is concise and to the point without any unnecessary information.
Final Result:"""

# Run inference
preds = model(text)
```
<!--
### Downstream Use
*List how someone could finetune this model on their own dataset.*
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:---------|:----|
| Word count | 45 | 125.3464 | 274 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0 | 199 |
| 1 | 208 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
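
The `loss: CosineSimilarityLoss` entry names the Sentence Transformers loss that compares each pair's cosine similarity with its target similarity using a mean squared error. The plain-Python sketch below reproduces that computation on made-up 2-dimensional vectors; the helper names and the toy batch are assumptions for illustration, and the real training runs on batches of model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch):
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in batch]
    return sum(errors) / len(errors)


# Hypothetical toy batch of (embedding_a, embedding_b, target) triples
batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same class: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different class: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different class: error 1
]
loss = cosine_similarity_loss(batch)
print(loss)  # (0 + 0 + 1) / 3
```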
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0010 | 1 | 0.1714 | - |
| 0.0491 | 50 | 0.2595 | - |
| 0.0982 | 100 | 0.2566 | - |
| 0.1473 | 150 | 0.2328 | - |
| 0.1965 | 200 | 0.0808 | - |
| 0.2456 | 250 | 0.0481 | - |
| 0.2947 | 300 | 0.0402 | - |
| 0.3438 | 350 | 0.0251 | - |
| 0.3929 | 400 | 0.0204 | - |
| 0.4420 | 450 | 0.0194 | - |
| 0.4912 | 500 | 0.0175 | - |
| 0.5403 | 550 | 0.0146 | - |
| 0.5894 | 600 | 0.0088 | - |
| 0.6385 | 650 | 0.0052 | - |
| 0.6876 | 700 | 0.0053 | - |
| 0.7367 | 750 | 0.0033 | - |
| 0.7859 | 800 | 0.0028 | - |
| 0.8350 | 850 | 0.0026 | - |
| 0.8841 | 900 | 0.0031 | - |
| 0.9332 | 950 | 0.0022 | - |
| 0.9823 | 1000 | 0.0021 | - |
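
The Epoch column can be cross-checked against the hyperparameters and label counts reported above. The sketch assumes that each training sample contributes one positive and one negative pair per iteration, which is an assumption about how the pairs are generated; the resulting step count agrees with the epoch fractions in the table.

```python
import math

num_samples = 199 + 208  # training samples for label 0 plus label 1
num_iterations = 20
batch_size = 16

# Assumption: one positive and one negative pair per sample per iteration
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)

print(num_pairs)                           # 16280
print(steps_per_epoch)                     # 1018
print(round(50 / steps_per_epoch, 4))      # 0.0491, the epoch logged at step 50
print(round(1000 / steps_per_epoch, 4))    # 0.9823, the epoch logged at step 1000
```

With `num_epochs` set to 1, the run therefore ends shortly after the last logged step.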
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.1
- Transformers: 4.44.0
- PyTorch: 2.4.0+cu121
- Datasets: 3.0.0
- Tokenizers: 0.19.1
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->