---
base_model: BAAI/bge-large-en-v1.5
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4370
- loss:CosineSimilarityLoss
widget:
- source_sentence: ' Construct: Recognise a linear graph from its shape Subject: Finding the Gradient and Intercept of a Line from the Equation Question: Use a graphing program (e.g. Desmos) to plot the following pairs of functions. \[ y=3 \text { and } y=-2 \] Tom says both functions are linear Katie says both functions are vertical lines Who is correct? Incorrect Answer: Neither is correct Correct Answer: Only Tom '
  sentences:
  - Believes the coefficent of x in an expanded quadratic comes from multiplying the two numbers in the brackets
  - Does not know the properties of a linear graph
  - Misremembers the quadratic formula
- source_sentence: ' Construct: Multiply two decimals together with the same number of decimal places Subject: Multiplying and Dividing with Decimals Question: \( 0.6 \times 0.4= \) Incorrect Answer: \( 2.4 \) Correct Answer: \( 0.24 \) '
  sentences:
  - When asked to solve simultaneous equations, believes they can just find values that work in one equation
  - Believes the solutions of a quadratic equation are the constants in the factorised form
  - When multiplying decimals, divides by the wrong power of 10 when reinserting the decimal
- source_sentence: ' Construct: Estimate the volume or capacity of an object Subject: Volume of Prisms Question: Each of these measurements matches one of these objects. ![An image of 4 objects and 4 measurements. The objects are an egg cup, a cereal box, a chest of drawers and a piggy bank. And, the measurements are 87 cm^3, 1013 cm^3, 4172 cm^3 and 197,177 cm^3.]() Which measurement most likely matches the egg cup? Incorrect Answer: \( 197177 \mathrm{~cm}^{3} \) Correct Answer: \( 87 \mathrm{~cm}^{3} \) '
  sentences:
  - Confuses quadratic and exponential graphs
  - Cannot estimate the relative volume order, for different objects
  - Does not know how many days are in a leap year
- source_sentence: ' Construct: Carry out division problems involving one negative integer Subject: Multiplying and Dividing Negative Numbers Question: \( 12 \div(-4)= \) Incorrect Answer: \( 3 \) Correct Answer: \( -3 \) '
  sentences:
  - Believes dividing a positive by a negative gives a positive answer
  - Believes -a is always smaller than a, ignoring the possibility that a is negative
  - Subtracts instead of divides
- source_sentence: ' Construct: Construct frequency tables Subject: Frequency tables Question: Dave has recorded the number of pets his classmates have in the frequency table on the right. \begin{tabular}{|c|c|} \hline Number of pets & Frequency \\ \hline \( 0 \) & \( 4 \) \\ \hline \( 1 \) & \( 6 \) \\ \hline \( 2 \) & \( 3 \) \\ \hline \( 3 \) & \( 2 \) \\ \hline \( 4 \) & \( 5 \) \\ \hline \end{tabular} If Dave wanted to work out the total number of pets own by his classmates, what would be a useful column to include? Incorrect Answer: Number of pets - Frequency Correct Answer: Number of pets \( x \) Frequency '
  sentences:
  - Subtracts rather than multiplies when calculating total frequency
  - Does not follow the arrows through a function machine, changes the order of the operations asked.
  - 'Believes the intersection in a prime factor venn diagram does not contribute to the size of the number represented by a circle '
---

# SentenceTransformer based on BAAI/bge-large-en-v1.5

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5).
It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("VaggP/bge-fine-tuned")
# Run inference
sentences = [
    '\nConstruct: Construct frequency tables\nSubject: Frequency tables\nQuestion: Dave has recorded the number of pets his classmates have in the frequency table on the right. \\begin{tabular}{|c|c|}\n\\hline Number of pets & Frequency \\\\\n\\hline \\( 0 \\) & \\( 4 \\) \\\\\n\\hline \\( 1 \\) & \\( 6 \\) \\\\\n\\hline \\( 2 \\) & \\( 3 \\) \\\\\n\\hline \\( 3 \\) & \\( 2 \\) \\\\\n\\hline \\( 4 \\) & \\( 5 \\) \\\\\n\\hline\n\\end{tabular} If Dave wanted to work out the total number of pets own by his classmates, what would be a useful column to include?\nIncorrect Answer: Number of pets -\nFrequency\nCorrect Answer: Number of pets \\( x \\) Frequency\n',
    'Subtracts rather than multiplies when calculating total frequency',
    'Does not follow the arrows through a function machine, changes the order of the operations asked.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,370 training samples
* Columns: sentence_0, sentence_1, and label
* Approximate statistics based on the first 1000 samples:

|         | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type    | string     | string     | float |
| details |            |            |       |

* Samples:

| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| Construct: Construct a pictogram involving fractions of symbols<br>Subject: Pictogram<br>Question: This pictogram shows the different types of music Bob has in his music collection.<br>Bob has \( 2 \) rave CDs.<br><br>How would he display this on the pictogram? ![A pictogram showing the number of CDs Bob has in his musical collection. Pop has 3 and a half symbols, rock has 2 symbols, blues has 2 and a quarter symbols, jazz has 3 and a quarter symbols and classical has 1 and three-quarter symbols. Each symbol represents 4 CDs.]()<br>Incorrect Answer: ![\( 00 \)]()<br>Correct Answer: ![\( 0 \)]() | When interpreting a pictogram, thinks each symbol stands for 1 | 1.0 |
| Construct: Use brackets to write function machines as calculations<br>Subject: Writing Expressions<br>Question: Tom and Katie are arguing about the result of this Function Machine:<br>Tom says the output is: \( 3 n-12 \)<br>Katie says the output is: \( 3(n-4) \)<br>Who is correct? ![A function machine with input n and operations subtract 4, multiply by 3]()<br>Incorrect Answer: Only Tom<br>Correct Answer: Both Tom and Katie | Does not think a factorised expression is equivalent to its multiplied out form | 1.0 |
| Construct: Interpret linear sections of real life graphs<br>Subject: Real Life Graphs<br>Question: The graph on the right shows the mass of sand in a bucket over time<br><br>What might the horizontal section represent? ![A graph with time (secs) on the horizontal axis and mass (g) on the vertical axis. The graph starts at the origin, travels in a straight line up and right, travels horizontally, then travels in a straight line down and right back to the x-axis, more steeply than the start. ]()<br>Incorrect Answer: Sand is being tipped out<br>Correct Answer: The bucket is full | Believes a horizontal line can show a constant rate of change | 1.0 |

* Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `num_train_epochs`: 1
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
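
The training setup above can be illustrated with a small plain-Python sketch. It makes no library calls, and the function names (`cosine_similarity_mse`, `steps_per_epoch`) are hypothetical helpers introduced here, not part of Sentence Transformers. The first function mirrors the idea behind `CosineSimilarityLoss` with `MSELoss`: the mean squared error between the cosine similarity of each embedding pair and its float label. The second estimates the number of optimizer steps in one epoch, assuming a single device and the listed `gradient_accumulation_steps` of 1, `per_device_train_batch_size` of 8, and `dataloader_drop_last` of False.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_mse(pairs, labels):
    """Mean squared error between pairwise cosine similarity and the labels."""
    errors = [
        (cosine_similarity(u, v) - label) ** 2
        for (u, v), label in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Batches per epoch on one device with no gradient accumulation."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


# Toy 2-D "embeddings" (made up for illustration): one identical pair and
# one orthogonal pair, both labelled 1.0 like the samples in this card.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [1.0, 1.0]
loss = cosine_similarity_mse(pairs, labels)

# 4,370 samples at batch size 8, as listed in this card.
steps = steps_per_epoch(4370, 8)
epoch_at_step_500 = round(500 / steps, 4)

print(loss, steps, epoch_at_step_500)
```

Under these assumptions one epoch is 547 steps, so step 500 falls at roughly epoch 0.9141, which agrees with the value reported in the Training Logs table.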

### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.9141 | 500  | 0.0055        |

### Framework Versions

- Python: 3.10.14
- Sentence Transformers: 3.2.0
- Transformers: 4.45.1
- PyTorch: 2.4.0
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```