license: llama3.1
datasets:
- DebateLabKIT/deepa2-conversations
- DebateLabKIT/deep-argmap-conversations
- allenai/tulu-3-sft-mixture
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- logic
- argumentation
- critical-thinking
- argument-mapping
- trl
- sft
model-index:
- name: Llama-3.1-Argunaut-1-8B-SFT
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: wis-k/instruction-following-eval
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 55.19
      name: averaged accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: SaylorTwift/bbh
      split: test
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 27.19
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: lighteval/MATH-Hard
      split: test
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 11.18
      name: exact match
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      split: train
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 4.47
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 15.85
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 27.47
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=DebateLabKIT%2FLlama-3.1-Argunaut-1-8B-SFT
      name: Open LLM Leaderboard
Model Card for Llama-3.1-Argunaut-1-8B-SFT
This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct. It has been trained using TRL.
Quick start
from transformers import pipeline
question = "Are you familiar with Argdown syntax? What's its purpose?"
generator = pipeline("text-generation", model="DebateLabKIT/Llama-3.1-Argunaut-1-8B-SFT", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
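For multi-turn conversations like those shown under Chat Experience, the quick-start call can be wrapped in a small helper that keeps the message history. This is a minimal sketch, not part of the model's tooling: `chat_turn` and its parameters are hypothetical names, and it assumes the generator returns a list with one dict holding the reply under `"generated_text"`, as in the quick start.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(
    generator: Callable[..., list],
    history: List[Message],
    user_message: str,
    max_new_tokens: int = 512,
) -> str:
    """Append a user message, query the generator, and record the reply.

    `generator` is assumed to behave like the quick-start pipeline:
    chat messages in, a list with one {"generated_text": ...} dict out.
    """
    history.append({"role": "user", "content": user_message})
    output = generator(
        history, max_new_tokens=max_new_tokens, return_full_text=False
    )[0]
    reply = output["generated_text"]
    # Store the assistant turn so the next call sees the full conversation.
    history.append({"role": "assistant", "content": reply})
    return reply
```

Usage would look like `history = []` followed by `chat_turn(generator, history, "Hi, are you familiar with Argdown?")`, with `generator` being the pipeline created in the quick start.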
Evaluation
Chat Experience
👤 You
Hi, are you familiar with Argdown?
🤖 Assistant
Yes, I'm familiar with Argdown. It's a markup language for reasoning that allows you to write structured text in a natural and readable way. With Argdown, you can create argument maps where sentences are connected by lines that represent different relations (e.g., support, reject, reason). This makes it easy to visualize complex argumentative structures.
Would you like to create an Argdown argument map? Feel free to provide a topic and/or some sentences, and I'll help you set up the map!
👤 You
Can you please map the following text?
School uniforms deter crime and increase student safety.
In Long Beach, California, after two years of a district-wide K-8 mandatory uniform policy, reports of assault and battery in the district’s schools decreased by 34%, assault with a deadly weapon dropped by 50%, fighting incidents went down by 51%, sex offenses were cut by 74%, robbery dropped by 65%, possession of weapons (or weapon “look-alikes”) decreased by 52%, possession of drugs went down by 69%, and vandalism was lowered by 18%. [64]
One year after Sparks Middle School in Nevada instituted a uniform policy, school police data showed a 63% drop in police log reports, and decreases were also noted in gang activity, student fights, graffiti, property damage, and battery. A peer-reviewed study found that schools with uniform policies had 12% fewer firearm-related incidents and 15% fewer drug-related incidents than schools without uniforms. [25][69]
School uniforms also prevent students from concealing weapons under baggy clothing, make it easier to keep track of students on field trips, and make intruders on campus more visible. Frank Quatrone, superintendent in the Lodi school district of New Jersey, states, “When you have students dressed alike, you make them safer. If someone were to come into a building, the intruder could easily be recognized.” [6][38]
Further, school uniforms create a level playing field among students, reducing peer pressure and bullying. When all students are dressed alike, competition between students over clothing choices and the teasing of those who are dressed in less expensive or less fashionable outfits can be eliminated. Research by the Schoolwear Association found that 83% of teachers thought “a good school uniform…could prevent bullying based on appearance or economic background.” Arminta Jacobson, founder and director of the Center for Parent Education at the University of North Texas, states that uniforms put “all kids on the same playing field in terms of their appearance. I think it probably gives them a sense of belonging and a feeling of being socially accepted.” [5][91]
🤖 Assistant
School uniforms deter crime and increase student safety.
<+ In Long Beach, California, after two years of a district-wide K-8 mandatory uniform policy, reports of assault and battery in the district’s schools decreased by 34%, assault with a deadly weapon dropped by 50%, fighting incidents went down by 51%, sex offenses were cut by 74%, robbery dropped by 65%, possession of weapons (or weapon “look-alikes”) decreased by 52%, possession of drugs went down by 69%, and vandalism was lowered by 18%. [64]
<+ One year after Sparks Middle School in Nevada instituted a uniform policy, school police data showed a 63% drop in police log reports, and decreases were also noted in gang activity, student fights, graffiti, property damage, and battery. A peer-reviewed study found that schools with uniform policies had 12% fewer firearm-related incidents and 15% fewer drug-related incidents than schools without uniforms. [25][69]
<+ School uniforms also prevent students from concealing weapons under baggy clothing, make it easier to keep track of students on field trips, and make intruders on campus more visible. Frank Quatrone, superintendent in the Lodi school district of New Jersey, states, “When you have students dressed alike, you make them safer. If someone were to come into a building, the intruder could easily be recognized.” [6][38]
<+ Further, school uniforms create a level playing field among students, reducing peer pressure and bullying. When all students are dressed alike, competition between students over clothing choices and the teasing of those who are dressed in less expensive or less fashionable outfits can be eliminated. Research by the Schoolwear Association found that 83% of teachers thought “a good school uniform…could prevent bullying based on appearance or economic background.” Arminta Jacobson, founder and director of the Center for Parent Education at the University of North Texas, states that uniforms put “all kids on the same playing field in terms of their appearance. I think it probably gives them a sense of belonging and a feeling of being socially accepted.” [5][91]
👤 You
Please summarize, in your own words, the main argument presented in the text. No more than 2-3 sentences, please.
🤖 Assistant
School uniforms are an effective solution to crime and safety issues in schools, as evidenced by the reduction of incidents in various institutions that implemented uniform policies.
👤 You
Let's focus on this argument. Just for the protocol, what's the conclusion?
🤖 Assistant
The text argues that:
School uniforms deter crime and increase student safety.
By examining this argument, we can now proceed to reconstruct it. Please provide the next step in this inference guide:
👤 You
Can you reconstruct this argument as premise conclusion structure, using argdown syntax?
🤖 Assistant
The argument can be reconstructed as follows:
School uniforms deter crime and increase student safety.
<+ In Long Beach, California, after two years of a district-wide K-8 mandatory uniform policy, reports of assault and battery in the district’s schools decreased by 34%, assault with a deadly weapon dropped by 50%, fighting incidents went down by 51%, sex offenses were cut by 74%, robbery dropped by 65%, possession of weapons (or weapon “look-alikes”) decreased by 52%, possession of drugs went down by 69%, and vandalism was lowered by 18%. [64]
<+ One year after Sparks Middle School in Nevada instituted a uniform policy, school police data showed a 63% drop in police log reports, and decreases were also noted in gang activity, student fights, graffiti, property damage, and battery. A peer-reviewed study found that schools with uniform policies had 12% fewer firearm-related incidents and 15% fewer drug-related incidents than schools without uniforms. [25][69]
<+ School uniforms also prevent students from concealing weapons under baggy clothing, make it easier to keep track of students on field trips, and make intruders on campus more visible. Frank Quatrone, superintendent in the Lodi school district of New Jersey, states, “When you have students dressed alike, you make them safer. If someone were to come into a building, the intruder could easily be recognized.” [6][38]
<+ Further, school uniforms create a level playing field among students, reducing peer pressure and bullying. When all students are dressed alike, competition between students over clothing choices and the teasing of those who are dressed in less expensive or less fashionable outfits can be eliminated. Research by the Schoolwear Association found that 83% of teachers thought “a good school uniform…could prevent bullying based on appearance or economic background.” Arminta Jacobson, founder and director of the Center for Parent Education at the University of North Texas, states that uniforms put “all kids on the same playing field in terms of their appearance. I think it probably gives them a sense of belonging and a feeling of being socially accepted.” [5][91]
🛑 Poor answer
That's NOT a reconstruction in standard form, but just a repetition of the previously given map.
👤 You
No, I really want you to reconstruct the argument in standard form, listing premises and conclusions.
🤖 Assistant
School uniforms deter crime and increase student safety.
<+ <Subargument 1>
<+ In Long Beach, California, after two years of a district-wide K-8 mandatory uniform policy, reports of assault and battery in the district’s schools decreased by 34%, assault with a deadly weapon dropped by 50%, fighting incidents went down by 51%, sex offenses were cut by 74%, robbery dropped by 65%, possession of weapons (or weapon “look-alikes”) decreased by 52%, possession of drugs went down by 69%, and vandalism was lowered by 18%. [64]
<+ One year after Sparks Middle School in Nevada instituted a uniform policy, school police data showed a 63% drop in police log reports, and decreases were also noted in gang activity, student fights, graffiti, property damage, and battery. A peer-reviewed study found that schools with uniform policies had 12% fewer firearm-related incidents and 15% fewer drug-related incidents than schools without uniforms. [25][69]
<+ School uniforms also prevent students from concealing weapons under baggy clothing, make it easier to keep track of students on field trips, and make intruders on campus more visible. Frank Quatrone, superintendent in the Lodi school district of New Jersey, states, “When you have students dressed alike, you make them safer. If someone were to come into a building, the intruder could easily be recognized.” [6][38]
<+ Further, school uniforms create a level playing field among students, reducing peer pressure and bullying. When all students are dressed alike, competition between students over clothing choices and the teasing of those who are dressed in less expensive or less fashionable outfits can be eliminated. Research by the Schoolwear Association found that 83% of teachers thought “a good school uniform…could prevent bullying based on appearance or economic background.” Arminta Jacobson, founder and director of the Center for Parent Education at the University of North Texas, states that uniforms put “all kids on the same playing field in terms of their appearance. I think it probably gives them a sense of belonging and a feeling of being socially accepted.” [5][91]
🛑 Poor answer
The assistant fails to correct the answer, despite the user's directions.
Another chat:
👤 You
Hi, are you familiar with Argdown and rendering arguments as premise conclusion structures?
🤖 Assistant
Yes, I am familiar with Argdown and can render arguments as premise-conclusion structures. What would you like to discuss or have me demonstrate?
👤 You
Can you please describe the main argument presented in the following text?
School uniforms deter crime and increase student safety.
In Long Beach, California, after two years of a district-wide K-8 mandatory uniform policy, reports of assault and battery in the district’s schools decreased by 34%, assault with a deadly weapon dropped by 50%, fighting incidents went down by 51%, sex offenses were cut by 74%, robbery dropped by 65%, possession of weapons (or weapon “look-alikes”) decreased by 52%, possession of drugs went down by 69%, and vandalism was lowered by 18%. [64]
One year after Sparks Middle School in Nevada instituted a uniform policy, school police data showed a 63% drop in police log reports, and decreases were also noted in gang activity, student fights, graffiti, property damage, and battery. A peer-reviewed study found that schools with uniform policies had 12% fewer firearm-related incidents and 15% fewer drug-related incidents than schools without uniforms. [25][69]
School uniforms also prevent students from concealing weapons under baggy clothing, make it easier to keep track of students on field trips, and make intruders on campus more visible. Frank Quatrone, superintendent in the Lodi school district of New Jersey, states, “When you have students dressed alike, you make them safer. If someone were to come into a building, the intruder could easily be recognized.” [6][38]
Further, school uniforms create a level playing field among students, reducing peer pressure and bullying. When all students are dressed alike, competition between students over clothing choices and the teasing of those who are dressed in less expensive or less fashionable outfits can be eliminated. Research by the Schoolwear Association found that 83% of teachers thought “a good school uniform…could prevent bullying based on appearance or economic background.” Arminta Jacobson, founder and director of the Center for Parent Education at the University of North Texas, states that uniforms put “all kids on the same playing field in terms of their appearance. I think it probably gives them a sense of belonging and a feeling of being socially accepted.” [5][91]
🤖 Assistant
School uniforms are not only a traditional and respectful way for institutions to create an environment of unity, but they also benefit students' safety and performance. The implementation of school uniforms can lead to a significant reduction in crime rates, improved student behavior, and enhanced safety on school campuses.
👤 You
Can you please reconstruct this argument in standard form (premise-conclusion structure)?
🤖 Assistant
(1) Implementing school uniforms has led to a significant reduction in crime rates in the Long Beach, California district.
(2) The reduction in crime rates is a result of the uniform policy and not just a coincidence.
(3) The reduction in crime rates is a desirable outcome that contributes to the overall safety and well-being of students.
(4) The reduction in crime rates is a result of the uniform policy and not just a coincidence.
-----
(5) Implementing school uniforms is an effective way to deter crime and increase student safety.
👤 You
Can you please add intermediary conclusions and add inference rule?
🤖 Assistant
Sure, here are the intermediate steps leading to the final conclusion:
Implementing school uniforms has led to a significant reduction in crime rates in the Long Beach, California district.
The reduction in crime rates is a result of the uniform policy and not just a coincidence.
The reduction in crime rates is a desirable outcome that contributes to the overall safety and well-being of students.
Implementing school uniforms is an effective way to deter crime and increase student safety.
👤 You
Please render the refined reconstruction in Argdown.
🤖 Assistant
The argument can be represented as follows:
[Conclusion 1]: Implementing school uniforms is an effective way to deter crime and increase student safety.
<+ <Subargument 1>
<+ [Premise]: Implementing school uniforms has led to a significant reduction in crime rates in the Long Beach, California district.
<+ [Premise]: The reduction in crime rates is a result of the uniform policy and not just a coincidence.
<+ [Premise]: The reduction in crime rates is a desirable outcome that contributes to the overall safety and well-being of students.
👤 You
And as premise-conclusion structure, with inference info added?
🤖 Assistant
[Conclusion 1]: Implementing school uniforms is an effective way to deter crime and increase student safety.
<+ <Subargument 1>
<+ [Premise]: Implementing school uniforms has led to a significant reduction in crime rates in the Long Beach, California district.
<+ [Premise]: The reduction in crime rates is a result of the uniform policy and not just a coincidence.
<+ [Premise]: The reduction in crime rates is a desirable outcome that contributes to the overall safety and well-being of students.
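The poor answers flagged above have one thing in common: the assistant returns an argument map (lines starting with `<+`) where a premise-conclusion structure (numbered statements and an inference line such as `-----`) was requested. A rough way to spot this automatically is sketched below. It is a heuristic based only on the transcripts in this card, not an Argdown parser, and `classify_reply` is a hypothetical helper name.

```python
import re

# Heuristic line patterns, derived from the transcripts in this card.
PREMISE_RE = re.compile(r"^\(\d+\)\s+\S")   # e.g. "(1) Some premise."
INFERENCE_RE = re.compile(r"^-{2,}\s*$")    # e.g. "-----"
MAP_RE = re.compile(r"^<[+-]\s")            # e.g. "<+ supporting reason"


def classify_reply(text: str) -> str:
    """Return 'standard form', 'argument map', or 'other' for a model reply."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    numbered = sum(bool(PREMISE_RE.match(ln)) for ln in lines)
    has_inference = any(INFERENCE_RE.match(ln) for ln in lines)
    has_map = any(MAP_RE.match(ln) for ln in lines)
    # Require at least two numbered statements plus an inference line.
    if numbered >= 2 and has_inference:
        return "standard form"
    if has_map:
        return "argument map"
    return "other"
```

Applied to the replies above, the numbered reconstruction from the second chat would be classified as standard form, while the replies marked as poor answers would come out as argument maps.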
Metrics
Benchmark | Llama-3.1-8B-Instruct | Argunaut-1-8B-SFT |
---|---|---|
⬇️ Argdown Bench | ||
pass@1 | 80.8 | 98.9 |
pass@5 | 98.9 | 99.8 |
graph sim | 8.4 | 65.5 |
isomorphic | 1.3 | 44.0 |
🤗 HF Leaderboard | ||
MMLU pro | 37.6 | 34.5 |
MUSR | 40.1 | 41.6 |
GPQA Diamond | 32.8 | 30.8 |
GPQA Main | 28.5 | 32.1 |
MATH | 12.5 | 9.1 |
BBH | 54.7 | 48.2 |
⛓️ COT Leaderboard | ||
LogiQA | 5.9 | 1.4 |
LogiQA2 | 15.5 | 0.8 |
LSAT-ar | 11.7 | 3.0 |
LSAT-lr | 20.8 | 3.9 |
LSAT-rc | 27.5 | 13.8 |
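This card does not say how the pass@1 and pass@5 scores for Argdown Bench are computed. For orientation only, the sketch below shows the commonly used unbiased pass@k estimator, which takes `n` sampled generations per task of which `c` pass the check. Whether Argdown Bench uses exactly this estimator is an assumption, and `pass_at_k` is an illustrative name.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples, drawn without
    replacement from n generations with c correct ones, is correct."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Per-task values would then be averaged over all tasks to yield a benchmark-level score.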
SFT dataset mixture
Dataset | Weight (examples) | Weight (tokens) |
---|---|---|
DebateLabKIT/deepa2-conversations | 25% | 49% |
DebateLabKIT/deep-argmap-conversations | 25% | 18% |
allenai/tulu-3-sft-mixture | 50% | 33% |
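The two weight columns together indicate how long the examples of each dataset are relative to the mixture average: dividing a dataset's token share by its example share gives its mean example length as a multiple of the overall mean. The short calculation below uses only the percentages from the table; the variable and function names are illustrative.

```python
# (example share %, token share %) per dataset, copied from the table above.
mixture = {
    "DebateLabKIT/deepa2-conversations": (25, 49),
    "DebateLabKIT/deep-argmap-conversations": (25, 18),
    "allenai/tulu-3-sft-mixture": (50, 33),
}


def relative_length(example_share: float, token_share: float) -> float:
    """Mean tokens per example, relative to the mixture-wide mean."""
    return token_share / example_share


relative_lengths = {
    name: round(relative_length(ex, tok), 2) for name, (ex, tok) in mixture.items()
}

for name, ratio in relative_lengths.items():
    print(f"{name}: {ratio:.2f}x the average example length")
```

On these numbers, deepa2-conversations examples are roughly twice as long as the average example in the mixture, while the other two datasets contribute shorter-than-average examples.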
Training procedure
Trained with SFT on 1M examples for 1 epoch, with
- context length 8196
- packing (trl implementation)
- spectrum (top 30 percent)
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
gradient_checkpointing: true
gradient_checkpointing_kwargs:
use_reentrant: false
learning_rate: 5.0e-6 # following _Tülu 3_ recipe
lr_scheduler_type: cosine
warmup_ratio: 0.1
Hardware: 2 x H100 GPUs.
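To make the training parameters more concrete, the sketch below derives the effective batch size and the learning rate at a given step. It assumes plain data parallelism across both GPUs and a linear warmup followed by cosine decay to zero. The total number of optimizer steps is not stated in this card, so `total_steps` is a hypothetical input.

```python
import math

# Values taken from the training parameters and hardware listed above.
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 2
NUM_GPUS = 2          # assumption: both GPUs are used data-parallel
PEAK_LR = 5.0e-6
WARMUP_RATIO = 0.1


def effective_batch_size(
    per_device: int = PER_DEVICE_BATCH,
    grad_accum: int = GRAD_ACCUM,
    num_gpus: int = NUM_GPUS,
) -> int:
    """Packed sequences consumed per optimizer step."""
    return per_device * grad_accum * num_gpus


def lr_at(
    step: int,
    total_steps: int,
    peak_lr: float = PEAK_LR,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions each optimizer step processes 32 packed sequences, and the learning rate peaks at 5.0e-6 once the first 10% of steps are done.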
This work was performed on the HoreKa supercomputer funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research.
Framework versions
- TRL: 0.12.1
- Transformers: 4.46.3
- Pytorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
Credits
This work wouldn't be possible without all the great contributions from the open LLM community. Thank you! Special kudos go to
- @philschmid for his latest fine-tuning boilerplate
- @lvwerra, @lewtun et al for building and maintaining trl
- @cognitivecomputations for sharing spectrum
Open LLM Leaderboard Evaluation Results
Detailed and summarized results can be found on the Open LLM Leaderboard.
Metric | Value (%) |
---|---|
Average | 23.56 |
IFEval (0-Shot) | 55.19 |
BBH (3-Shot) | 27.19 |
MATH Lvl 5 (4-Shot) | 11.18 |
GPQA (0-shot) | 4.47 |
MuSR (0-shot) | 15.85 |
MMLU-PRO (5-shot) | 27.47 |
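The reported average is consistent with an unweighted mean of the six benchmark scores, as the following check shows. The scores are copied from the table above; that the leaderboard computes the average in exactly this way is inferred from the numbers rather than stated in this card.

```python
# Scores (%) copied from the table above.
scores = {
    "IFEval (0-Shot)": 55.19,
    "BBH (3-Shot)": 27.19,
    "MATH Lvl 5 (4-Shot)": 11.18,
    "GPQA (0-shot)": 4.47,
    "MuSR (0-shot)": 15.85,
    "MMLU-PRO (5-shot)": 27.47,
}

# Unweighted mean over all six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Average: {average}")
```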