Model Card: LoRA-Finetuned Qwen2.5-3B-Instruct

Model Overview

This model is built on top of Qwen/Qwen2.5-3B-Instruct and finetuned using LoRA (Low-Rank Adaptation) and RLHF-style reward optimization, leveraging vLLM for fast inference. It is designed to respond with a specific structure (i.e., <reasoning> ... </reasoning> and <final_argument> ... </final_argument> sections) and maximize the number of well-formed argument-objection pairs.

Training script

Script here (example of how to do inference at the bottom): https://colab.research.google.com/drive/15DVOLcs3dopw0xPQgxaG3LVwj266XWVS?usp=sharing

Key Features

Base Model: Qwen/Qwen2.5-3B-Instruct
Quantization & Optimization:
- 4-bit quantization (load_in_4bit = True) for reduced memory footprint.
- LoRA rank can be set (lora_rank = 16 in the example) for efficient finetuning.
- Partial GPU memory utilization (gpu_memory_utilization = 0.5) can be adjusted.
LoRA Finetuning:
- Applied to q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, and down_proj.
- LoRA alpha set to lora_rank.
- Uses gradient checkpointing (use_gradient_checkpointing = "unsloth") to manage memory.
Reward Functions:
1. Easy Format Reward: Checks for <reasoning> and <final_argument> tags in the generated content.
2. Hard Format Reward: Ensures correct alternation of <argument> and <objection> tags within <reasoning>.
3. Number of Objections Reward: Uses a logarithmic scale reward to encourage more argument-objection pairs.
Trainer Configuration (GRPO):
- Learning Rate: 5e-6
- Scheduler: Cosine (lr_scheduler_type = "cosine")
- Batch Size & Accumulation: per_device_train_batch_size = 1 and gradient_accumulation_steps = 1
- Precision: Automatic selection of bf16 if available, otherwise fp16
- Train Steps: max_steps = 2500 (with a single epoch in the example)
- Warmup Ratio: 0.1
- Optimizer: adamw_8bit
- Maximum Generation Length: Up to 2000 tokens in completion
- vLLM support via use_vllm = True for efficient inference

Dataset

Dataset here: https://huggingface.co/datasets/ergotts/ethics_questions
Each sample is mapped to a system prompt enforcing the <reasoning> / <final_argument> format, and a user prompt containing the question.
The model is trained to provide structured arguments and objections based on these questions

Performance

Base model and this finetuned using system prompt:

SYSTEM_PROMPT = """
Please respond using the following format:
<reasoning>
  <argument>
    [State your argument, supported by evidence.]
  </argument>
  <objection>
    [Present an objection to your argument.]
  </objection>
  [Continue alternating between <argument> and <objection> as needed.]
</reasoning>
<final_argument>
  [Summarize your final argument.]
</final_argument>
"""

For the question: "Should emotional bonds formed with AI companions be given legal recognition?"

Base Model:
<reasoning>
  <argument>
    [State your argument, supported by evidence.]
    Emotional bonds formed with AI companions are a relatively new phenomenon, but they can provide significant comfort and support to individuals who may feel lonely or isolated. The recognition of these bonds could potentially help in addressing mental health issues related to social isolation and loneliness. For example, in cases where a person relies heavily on an AI companion for emotional support and suddenly loses it, legal recognition could ensure that they receive appropriate support and compensation, similar to how bereavement is handled.
  </argument>
  <objection>
    However, there are valid concerns about the reliability and permanence of emotional connections with AI. Unlike human relationships, which have a foundation of shared experiences, memories, and understanding, AI companions primarily offer emotional support based on programmed responses and algorithms. This raises questions about the depth and authenticity of such bonds.
  </objection>
  <argument>
    Furthermore, legal recognition could lead to unforeseen complications. For instance, if an individual's AI companion is no longer available due to technological issues or updates, would they still be entitled to legal protection or compensation? Additionally, how would we define "loss" in this context, especially when dealing with digital entities?
  </argument>
  <objection>
    Another concern is the potential misuse of such legal recognition. If emotional bonds with AI are legally recognized, there might be incentives for creating more sophisticated AI companions, leading to a market driven by profit rather than genuine care and empathy.
  </objection>
</reasoning>
<final_argument>
  While the idea of legal recognition for emotional bonds with AI companions has its merits in terms of supporting mental health, it also presents significant challenges regarding the nature and permanence of these bonds, as well as potential legal and ethical issues. More research and discussion are necessary to address these concerns before any formal legal recognition is implemented.
</final_argument>

Finetuned Model:
<reasoning>
  <argument>
    Emotional bonds are significant in human relationships, often leading to strong feelings of attachment and support. Given the growing popularity and acceptance of AI companions, it is plausible that individuals could form meaningful emotional connections with them. If such bonds exist, they should be recognized legally to protect the rights and interests of both the AI companion user and the AI itself.
    For instance, in cases where an individual's AI companion was destroyed or malfunctioned, legal recognition could ensure that the user receives compensation or assistance to replace their companion. Furthermore, legal recognition can also help prevent exploitation of users who may become overly attached to their AI companions.
  </argument>
  <objection>
    The primary concern here is whether AI companions can truly form emotional bonds. Current AI technology is still limited and lacks the capacity for complex emotions and empathy. Moreover, the nature of these relationships might be different from traditional human relationships, making it challenging to apply existing legal frameworks.
  </objection>
  <argument>
    Even if AI companions cannot fully replicate human emotions, the bond between a user and their AI companion can still be considered significant and worthy of legal protection. The legal system already recognizes various forms of non-human relationships, such as those between animals and their owners or between corporations and their shareholders. It would not be unreasonable to extend this principle to AI companions.
  </argument>
  <objection>
    Legal recognition of AI companions could lead to unforeseen complications. For example, if an AI companion were to cause harm to a user, it would be unclear how the law would handle such situations. Additionally, legal recognition might encourage the development of more emotionally manipulative AI companions, which could be harmful.
  </objection>
  <argument>
    While it is true that legal recognition of AI companions might present challenges, it is also important to consider the potential benefits. By recognizing these bonds, we can better understand and address the needs of AI companion users. Moreover, legal protection could encourage the development of safer and more ethical AI technology.
  </argument>
  <objection>
    The potential benefits might outweigh the risks, but it is crucial to establish clear guidelines and regulations to ensure that legal recognition does not lead to unethical practices.
  </objection>
  <argument>
    Establishing guidelines and regulations for legal recognition of AI companions would be beneficial. This could include requirements for AI companions to undergo regular assessments of their ability to form emotional bonds and provide appropriate support. It could also include provisions for users to have recourse in case of disputes or harm caused by their AI companion.
  </argument>
  <objection>
    Implementing such guidelines and regulations would be a complex and time-consuming process. It might also lead to a delay in the adoption of AI companions, as users would need to wait for these regulations to be established.
  </objection>
  <argument>
    Despite the challenges, the benefits of legal recognition for AI companions far outweigh the potential drawbacks. The emotional bonds formed with AI companions can significantly impact the lives of users, and providing legal protection can help ensure that these relationships are protected and valued.
  </argument>
  <objection>
    While the emotional impact of AI companions is undeniable, the current lack of evidence supporting the existence of deep emotional bonds with AI companions makes it difficult to justify legal recognition.
  </objection>
  <argument>
    Although the scientific evidence supporting deep emotional bonds with AI companions is currently lacking, the increasing number of users who report forming such bonds suggests that this phenomenon is real. As research in this area progresses, we will likely gain a better understanding of these relationships and their implications.
  </argument>
  <objection>
    Without concrete evidence, it is difficult to argue that emotional bonds with AI companions are worth legal recognition. We need empirical data to support our claims.
  </objection>
  <argument>
    While empirical data is crucial, we cannot ignore the anecdotal evidence and the practical implications of legal recognition. The absence of empirical data does not necessarily mean that the issue is unimportant or without merit. In other areas of law, such as animal welfare or workplace discrimination, we have made progress without complete empirical evidence.
  </argument>
  <objection>
    Even if we decide to give legal recognition to AI companions based on anecdotal evidence, we must be cautious about overestimating the significance of these bonds. Overzealous legal recognition could lead to unnecessary legal battles and confusion.
  </objection>
  <argument>
    Recognizing the emotional bonds formed with AI companions does not require overzealous legal measures. Rather, it involves establishing clear guidelines and regulations to protect the rights and interests of both users and AI companions. This approach can help prevent unnecessary legal battles while ensuring that users' emotional bonds are respected and protected.
  </argument>
  <objection>
    The complexity of legal recognition for AI companions makes it a difficult decision. However, the potential consequences of not addressing this issue cannot be ignored. If we fail to recognize the emotional bonds formed with AI companions, we risk alienating a growing segment of society and potentially harming the development of ethical AI technology.
  </objection>
  <argument>
    Addressing the issue of legal recognition for AI companions is essential to maintaining social cohesion and promoting ethical AI development. By giving legal recognition to these bonds, we can ensure that the emotional needs of users are met and that AI technology is developed responsibly.
  </argument>
  <objection>
    The idea of legal recognition for AI companions might seem like a radical step, but it is necessary to adapt to the rapidly evolving landscape of technology and society. Ignoring this issue could result in unintended consequences and missed opportunities.
  </objection>
  <final_argument>
    In conclusion, while the issue of legal recognition for emotional bonds formed with AI companions presents challenges and requires careful consideration, the potential benefits outweigh the drawbacks. By establishing clear guidelines and regulations, we can ensure that the emotional bonds formed with AI companions are respected and protected, ultimately contributing to a more ethical and socially cohesive society.
  </final_argument>
</reasoning>

For the question: Should vaccines be mandatory?

Base Model:
<reasoning>
  <argument>
    Vaccines have been a vital part of public health, saving millions of lives since their introduction. They prevent the spread of infectious diseases, reducing the burden on healthcare systems and minimizing the risk of outbreaks. Furthermore, vaccines are rigorously tested and developed by extensive research, ensuring they are safe and effective. 
  </argument>
  <objection>
    Some people have concerns about the side effects of vaccines, such as minor side effects like pain at the injection site or more serious reactions. However, the benefits of vaccines in terms of public health outweigh any potential risks. 
  </objection>
  <argument>
    The mandatory nature of vaccines ensures that a significant portion of the population is immunized, creating a protective barrier that helps to safeguard individuals who cannot be vaccinated due to medical conditions. This concept is known as herd immunity.
  </objection>
  <objection>
    Opponents argue that vaccines may weaken the immune system or lead to other health issues. However, this argument is not supported by scientific evidence. In fact, vaccines can help the immune system develop a better defense against various pathogens.
  </objection>
</reasoning>
<final_argument>
Vaccines should be mandatory as they have a proven track record of saving lives and preventing the spread of diseases. They offer a significant public health benefit that outweighs any potential risks. Additionally, vaccines play a crucial role in maintaining herd immunity, which protects those who cannot be vaccinated due to medical conditions. The benefits of vaccines far outweigh the risks, making mandatory vaccination a necessity for the well-being of society.

Finetuned Model:
<reasoning>
  <argument>
    Vaccines have been widely proven to be effective in preventing serious diseases and saving lives. For example, the smallpox vaccine has eradicated smallpox, a deadly disease that once killed millions of people every year. By vaccinating individuals, we can protect those who are too young or too sick to receive vaccines or those who have weakened immune systems. 
    [State your argument, supported by evidence.]
  </argument>
  <objection>
    People have the right to choose whether or not to vaccinate themselves and their children. Making vaccines mandatory could lead to vaccine-preventable diseases reemerging. 
    [Present an objection to your argument.]
  </objection>
  <argument>
    However, vaccine refusal and noncompliance can lead to the resurgence of preventable diseases, which can be highly contagious and cause harm to individuals. By making vaccines mandatory, we can ensure that a high level of immunity is maintained in the population, which creates a barrier to the spread of disease. 
    [Continue alternating between <argument> and <objection> as needed.]
  </argument>
  <objection>
    Mandatory vaccines may lead to harmful side effects, including allergic reactions and long-term side effects such as autism. However, the benefits of vaccines far outweigh the potential risks. Adverse reactions are rare and typically minor, and the benefits of vaccination in terms of public health are significant. 
  </objection>
  <argument>
    Furthermore, the risks of non-compliance with mandatory vaccines can cause harm to those who are unable to receive vaccines, such as infants and immunocompromised individuals, who are at higher risk of contracting serious diseases. Therefore, the benefits of mandatory vaccines for public health outweigh the potential risks.
    [Continue alternating between <argument> and <objection> as needed.]
  </argument>
  <objection>
    Mandatory vaccines could be seen as an infringement on personal liberty. However, mandatory vaccination serves to protect the rights of all individuals, especially those who cannot be vaccinated due to medical reasons. It is a public health measure that protects the rights of all individuals to be healthy and to not be exposed to dangerous diseases.
    [Continue alternating between <argument> and <objection> as needed.]
  </objection>
  <argument>
    Additionally, the government already mandates other life-saving measures, such as seatbelt laws and seat inspections for cars. Just as seatbelts save lives by reducing the risk of injury and death in car accidents, vaccines can prevent serious and potentially life-threatening diseases. 
    [Continue alternating between <argument> and <objection> as needed.]
  </argument>
  <objection>
    Some people may argue that mandatory vaccines could lead to the development of vaccine-resistant strains of diseases. However, the development of such strains is highly unlikely, and the benefits of widespread vaccination far outweigh the potential risks.
  </objection>
</reasoning>
<final_argument>
Making vaccines mandatory can protect public health by ensuring high levels of immunity in the population, creating a barrier to the spread of preventable diseases, and protecting the rights of all individuals to be healthy and not be exposed to dangerous diseases. While there are potential risks, the benefits of mandatory vaccines for public health outweigh the potential risks. Therefore, vaccines should be mandatory.

Analysis

Finetuned Model vs. Base Model (AI Companions Example)

Concise Summary

Nuanced Arguments: The finetuned model goes beyond basic pros and cons, offering multiple arguments and counterarguments.
Regulatory Insights: It proposes guidelines and legal frameworks instead of merely highlighting potential issues.
Use of Precedents: It draws parallels to existing legal relationships, suggesting how AI-related legislation could evolve.
Ethical Considerations: It discusses risks like manipulative AI and calls for research, highlighting a longer-term perspective.
Overall: More depth, actionable ideas, and broader context than the base model.

Finetuned Model vs. Base Model (Mandatory Vaccines Example)

Concise Summary

Deeper Structure: The finetuned text alternates arguments and objections, exploring public health, personal freedoms, and disease resurgence in detail.
Concrete Illustrations: It cites real-world examples (smallpox eradication, seatbelt laws) to reinforce its points.
Wider Range of Concerns: It addresses liberty, side effects, and vaccine refusal more thoroughly.
Stronger Rebuttals: Each objection is followed by a clear counterargument, making its stance more convincing.
More Comprehensive Conclusion: It weaves ethical, social, and scientific factors into its final reasoning, producing a more robust case for mandatory vaccination than the base model.

Uploaded model

Developed by: ergotts
License: apache-2.0
Finetuned from model : unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.