NECOUDBFM/Jellyfish-8B

@@ -10,6 +10,9 @@ language:
 -->
 <img src="https://i.imgur.com/E1vqCIw.png" alt="PicToModel" width="330"/>
+Other versions of Jellyfish:
+[Jellyfish-7B](https://huggingface.co/NECOUDBFM/Jellyfish-7B)
+[Jellyfish-13B](https://huggingface.co/NECOUDBFM/Jellyfish-13B)
 ## Model Details
 Jellyfish-8B is a large language model equipped with 8 billion parameters.
@@ -26,6 +29,7 @@ More details about the model can be found in the [Jellyfish paper](https://arxiv
 - **Language(s) (NLP):** English
 - **License:** Non-Commercial Creative Commons license (CC BY-NC-4.0)
 - **Finetuned from model:** [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
 ## Citation
 If you find our work useful, please give us credit by citing:
@@ -105,3 +109,84 @@ _Few-shot is disabled for Jellyfish models._
 <|start_header_id|>user<|end_header_id|>{prompt}<|eot_id|>
 <|start_header_id|>assistant<|end_header_id|>
 ```
+## Prompts
+We provide the prompts used for both the model's fine-tuning and inference.
+You can structure your data according to these prompts.
+### System Message
+```
+You are an AI assistant that follows instruction extremely well.
+User will give you a question. Your task is to answer as faithfully as you can.
+```
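As a rough illustration of how the system message pairs with a task prompt, the sketch below builds a role/content message list. The helper name `build_messages` is hypothetical, and joining the two lines of the system message with a newline is an assumption, not something the model card states.

```python
from typing import Dict, List

# The system message as shown above. Joining its two lines with a newline
# is an assumption of this sketch.
SYSTEM_MESSAGE = (
    "You are an AI assistant that follows instruction extremely well.\n"
    "User will give you a question. Your task is to answer as faithfully as you can."
)


def build_messages(task_prompt: str) -> List[Dict[str, str]]:
    """Pair the fixed system message with one task prompt (hypothetical helper)."""
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": task_prompt},
    ]


messages = build_messages(
    "Are record A and record B the same entity? Choose your answer from: [Yes, No]."
)
```

A list in this shape is what the `apply_chat_template` method of a Transformers tokenizer accepts. Whether the rendered string matches the prompt template in this card character for character (for example, in whitespace) is worth checking before relying on it.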
+### For Entity Matching
+```
+You are tasked with determining whether two records listed below are the same based on the information provided.
+Carefully compare the {attribute 1}, {attribute 2}... for each record before making your decision.
+Note that missing values (N/A or \"nan\") should not be used as a basis for your decision.
+Record A: [{attribute 1}: {attribute 1 value}, {attribute 2}: {attribute 2 value}, ...]
+Record B: [{attribute 1}: {attribute 1 value}, {attribute 2}: {attribute 2 value}, ...]
+Are record A and record B the same entity? Choose your answer from: [Yes, No].
+```
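One possible way to fill the entity matching template from two dictionaries is sketched below. The function names and the example records are hypothetical. The sketch also assumes that both records share the same attribute names and that the backslashes around "nan" in the template are escaping residue rather than part of the prompt.

```python
def format_record(record: dict) -> str:
    """Render a record as "[name: value, name: value, ...]"."""
    return "[" + ", ".join(f"{name}: {value}" for name, value in record.items()) + "]"


def entity_matching_prompt(record_a: dict, record_b: dict) -> str:
    """Fill the entity matching template (hypothetical helper)."""
    # Assumption: both records use the same attribute names, so record A's
    # keys are enough for the "Carefully compare ..." line.
    attributes = ", ".join(record_a)
    return "\n".join([
        "You are tasked with determining whether two records listed below are "
        "the same based on the information provided.",
        f"Carefully compare the {attributes} for each record before making your decision.",
        # Assumption: plain quotes around nan (backslashes treated as residue).
        'Note that missing values (N/A or "nan") should not be used as a basis for your decision.',
        f"Record A: {format_record(record_a)}",
        f"Record B: {format_record(record_b)}",
        "Are record A and record B the same entity? Choose your answer from: [Yes, No].",
    ])


# Made-up records, for illustration only.
prompt = entity_matching_prompt(
    {"title": "blue widget", "price": "9.99"},
    {"title": "widget (blue)", "price": "nan"},
)
print(prompt)
```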
+### For Data Imputation
+```
+You are presented with a {keyword} record that is missing a specific attribute: {attribute X}.
+Your task is to deduce or infer the value of {attribute X} using the available information in the record.
+You may be provided with fields like {attribute 1}, {attribute 2}, ... to help you in the inference.
+Record: [{attribute 1}: {attribute 1 value}, {attribute 2}: {attribute 2 value}, ...]
+Based on the provided record, what would you infer is the value for the missing attribute {attribute X}?
+Answer only the value of {attribute X}.
+```
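The data imputation template could be filled along the following lines. This is a minimal sketch: the function name and example record are hypothetical, and leaving the missing attribute out of the rendered record (rather than including it with an empty value) is an assumption.

```python
def imputation_prompt(keyword: str, record: dict, missing: str) -> str:
    """Fill the data imputation template (hypothetical helper)."""
    # Assumption: the attribute to infer is omitted from the rendered record.
    known = {name: value for name, value in record.items() if name != missing}
    fields = ", ".join(known)
    rendered = ", ".join(f"{name}: {value}" for name, value in known.items())
    return "\n".join([
        f"You are presented with a {keyword} record that is missing a specific attribute: {missing}.",
        f"Your task is to deduce or infer the value of {missing} using the available information in the record.",
        f"You may be provided with fields like {fields} to help you in the inference.",
        f"Record: [{rendered}]",
        f"Based on the provided record, what would you infer is the value for the missing attribute {missing}?",
        f"Answer only the value of {missing}.",
    ])


# Made-up record, for illustration only.
prompt = imputation_prompt(
    "restaurant",
    {"name": "Blue Door Cafe", "city": None, "phone": "555-0100"},
    missing="city",
)
print(prompt)
```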
+### For Error Detection
+_There are two forms of the error detection task.
+In the first form, a complete record row is provided, and the task is to determine if a specific value is erroneous.
+In the second form, only the value of a specific attribute is given, and the decision about its correctness is based solely on the attribute's name and value.
+The subsequent prompt examples pertain to these two forms, respectively._
+```
+Your task is to determine if there is an error in the value of a specific attribute within the whole record provided.
+The attributes may include {attribute 1}, {attribute 2}, ...
+Errors may include, but are not limited to, spelling errors, inconsistencies, or values that don't make sense given the context of the whole record.
+Record [{attribute 1}: {attribute 1 value}, {attribute 2}: {attribute 2 value}, ...]
+Attribute for Verification: [{attribute X}: {attribute X value}]
+Question: Is there an error in the value of {attribute X}? Choose your answer from: [Yes, No].
+```
+```
+Your task is to determine if there is an error in the value of a specific attribute.
+The attributes may belong to a {keyword} record and could be one of the following: {attribute 1}, {attribute 2}, ...
+Errors can include, but are not limited to, spelling errors, inconsistencies, or values that don't make sense for that attribute.
+Note: Missing values (N/A or \"nan\") are not considered errors.
+Attribute for Verification: [{attribute X}: {attribute X value}]
+Question: Is there an error in the value of {attribute X}? Choose your answer from: [Yes, No].
+```
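The two error detection forms differ only in whether the whole record is shown. The sketch below builds both, under a few assumptions: all function names and example values are hypothetical, the first form keeps `Record [` without a colon as written in the template, and the backslashes around "nan" are treated as escaping residue.

```python
from typing import List


def _verification_lines(attribute: str, value) -> List[str]:
    """The two closing lines shared by both forms."""
    return [
        f"Attribute for Verification: [{attribute}: {value}]",
        f"Question: Is there an error in the value of {attribute}? Choose your answer from: [Yes, No].",
    ]


def record_level_prompt(record: dict, attribute: str) -> str:
    """First form: the whole record row is given as context."""
    names = ", ".join(record)
    rendered = ", ".join(f"{name}: {value}" for name, value in record.items())
    return "\n".join([
        "Your task is to determine if there is an error in the value of a specific attribute within the whole record provided.",
        f"The attributes may include {names}",
        "Errors may include, but are not limited to, spelling errors, inconsistencies, or values that don't make sense given the context of the whole record.",
        f"Record [{rendered}]",
        *_verification_lines(attribute, record[attribute]),
    ])


def value_level_prompt(keyword: str, attribute_names: List[str], attribute: str, value) -> str:
    """Second form: only the attribute name and value are given."""
    names = ", ".join(attribute_names)
    return "\n".join([
        "Your task is to determine if there is an error in the value of a specific attribute.",
        f"The attributes may belong to a {keyword} record and could be one of the following: {names}",
        "Errors can include, but are not limited to, spelling errors, inconsistencies, or values that don't make sense for that attribute.",
        'Note: Missing values (N/A or "nan") are not considered errors.',
        *_verification_lines(attribute, value),
    ])


# Made-up values, for illustration only.
row = {"city": "Sprngfield", "state": "IL"}
first_form = record_level_prompt(row, "city")
second_form = value_level_prompt("address", list(row), "state", row["state"])
```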
+### For Schema Matching
+```
+Your task is to determine if the two attributes (columns) are semantically equivalent in the context of merging two tables.
+Each attribute will be provided by its name and a brief description.
+Your goal is to assess if they refer to the same information based on these names and descriptions provided.
+Attribute A is [name: {value of name}, description: {value of description}].
+Attribute B is [name: {value of name}, description: {value of description}].
+Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].
+```
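A small sketch of filling the schema matching template and reading back a Yes/No answer follows. Both helper names and the example attributes are hypothetical. The parser assumes the model's reply begins with "Yes" or "No", which the prompt requests but the model card does not guarantee.

```python
from typing import Optional


def schema_matching_prompt(attr_a: dict, attr_b: dict) -> str:
    """Fill the schema matching template from two {"name", "description"} dicts."""
    def describe(attr: dict) -> str:
        return f"[name: {attr['name']}, description: {attr['description']}]"

    return "\n".join([
        "Your task is to determine if the two attributes (columns) are semantically equivalent in the context of merging two tables.",
        "Each attribute will be provided by its name and a brief description.",
        "Your goal is to assess if they refer to the same information based on these names and descriptions provided.",
        f"Attribute A is {describe(attr_a)}.",
        f"Attribute B is {describe(attr_b)}.",
        "Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].",
    ])


def parse_yes_no(answer: str) -> Optional[bool]:
    """Map a reply to True/False, or None when its first word is neither yes nor no."""
    words = answer.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    return {"yes": True, "no": False}.get(first)


# Made-up attributes, for illustration only.
prompt = schema_matching_prompt(
    {"name": "dob", "description": "date of birth of the patient"},
    {"name": "birth_date", "description": "the day the person was born"},
)
```

Returning `None` for unexpected replies keeps malformed outputs visible to the caller instead of silently counting them as "No".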
+### For Column Type Annotation
+We follow the prompt in [Column Type Annotation using ChatGPT](https://arxiv.org/abs/2306.00745) (text+inst+2-step).
+### For Attribute Value Extraction
+We follow the prompt in [Product Attribute Value Extraction using Large Language Models](https://arxiv.org/abs/2310.12537) (textual, w/o examples).