DeepSeek R1 Distilled Qwen 1.5B as an SLM Judge (Binary)

Developed by: 1rsh
License: apache-2.0
Finetuned from model: unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit

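This model was finetuned 2x faster with Unsloth and Hugging Face's TRL library for LLM-as-a-Judge tasks. Given a metric, a scoring rubric, a question, and an answer, it returns its reasoning followed by a binary score (1 = approved, 0 = disapproved).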

Input Format:

Score the question answer pair on the following metric: Is the response factually accurate, based on verifiable financial data, trends, and principles? Does it align with established financial knowledge, avoiding misrepresentation or incorrect information?

# Scoring Rubric
| Score | Description                                                                                    |
|-------|------------------------------------------------------------------------------------------------|
| 1     | Approved: The response is fully factually accurate, with no errors or omissions.               |
| 0     | Disapproved: The response contains significant factual inaccuracies or misleading information. |

# Question
"How does diversification in an investment portfolio reduce risk?"

# Answer
"Diversification reduces risk by spreading investments across various assets, which means if one asset fails, others will likely succeed. This strategy ensures that your entire portfolio doesn't collapse because it's not reliant on a single type of investment. For example, investing in both stocks and bonds can balance out risks since they often perform differently under the same market conditions."

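The input layout above can be assembled programmatically. The following is a minimal sketch, not the author's code: the helper name `build_judge_prompt` and its parameters are hypothetical, and it assumes the model expects the same section order and table padding as the example shown here.

```python
def build_judge_prompt(metric: str, rubric: dict[int, str],
                       question: str, answer: str) -> str:
    """Assemble a judge prompt in the layout shown in this card.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Pad the description column so every table row has the same width.
    width = max(len("Description"), *(len(d) for d in rubric.values()))
    lines = [
        f"Score the question answer pair on the following metric: {metric}",
        "",
        "# Scoring Rubric",
        f"| Score | {'Description'.ljust(width)} |",
        f"|-------|{'-' * (width + 2)}|",
    ]
    # Highest score first, matching the example (1, then 0).
    for score, description in sorted(rubric.items(), reverse=True):
        lines.append(f"| {str(score).ljust(5)} | {description.ljust(width)} |")
    # Question and answer are wrapped in double quotes, as in the example.
    lines += ["", "# Question", f'"{question}"', "", "# Answer", f'"{answer}"']
    return "\n".join(lines)


prompt = build_judge_prompt(
    metric=(
        "Is the response factually accurate, based on verifiable financial "
        "data, trends, and principles? Does it align with established "
        "financial knowledge, avoiding misrepresentation or incorrect "
        "information?"
    ),
    rubric={
        1: "Approved: The response is fully factually accurate, "
           "with no errors or omissions.",
        0: "Disapproved: The response contains significant factual "
           "inaccuracies or misleading information.",
    },
    question="How does diversification in an investment portfolio reduce risk?",
    # Shortened placeholder answer, used here for illustration only.
    answer="Diversification spreads investments across assets.",
)
print(prompt)
```

Whether the prompt should additionally be wrapped in the tokenizer's chat template is not stated in this card, so the sketch stops at the raw prompt text.
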
Output Format:

<reasoning>
The response accurately explains how diversification reduces risk by spreading investments across different assets. It correctly states that if one asset fails, others will succeed, and that the entire portfolio isn't reliant on a single type of investment. The example of stocks and bonds is appropriate and clear. There are no factual errors or omissions, so the response is fully accurate and should receive a score of 1.
</reasoning>
<score>
1
</score>

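Because the output is tagged, the reasoning and the score can be pulled out with two regular expressions. This is a minimal sketch under the assumption that generations follow the `<reasoning>` / `<score>` layout shown above. The names `Judgement` and `parse_judgement` are hypothetical, and returning `None` for a missing or non-binary score is a design choice of this sketch, not documented behaviour of the model.

```python
import re
from typing import NamedTuple, Optional


class Judgement(NamedTuple):
    reasoning: str
    score: Optional[int]  # None when no valid 0/1 score was found


# DOTALL lets the reasoning span several lines.
_REASONING_RE = re.compile(r"<reasoning>(.*?)</reasoning>", re.DOTALL)
# Only the binary scores from the rubric (0 or 1) are accepted.
_SCORE_RE = re.compile(r"<score>\s*([01])\s*</score>")


def parse_judgement(text: str) -> Judgement:
    """Extract the reasoning text and the binary score from a generation."""
    reasoning_match = _REASONING_RE.search(text)
    score_match = _SCORE_RE.search(text)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else ""
    score = int(score_match.group(1)) if score_match else None
    return Judgement(reasoning, score)


sample_output = """<reasoning>
The response accurately explains how diversification reduces risk.
</reasoning>
<score>
1
</score>"""

result = parse_judgement(sample_output)
print(result.score)      # 1
print(result.reasoning)  # The response accurately explains how diversification reduces risk.
```

Treating an unparseable generation as `None` rather than as 0 keeps malformed outputs distinguishable from genuine disapprovals when aggregating scores.
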
Model repository: 1rsh/DeepSeek-R1-Distill-Qwen-1.5B-SLMJ-binary