Nordavind-7B

An instruction-tuned model trained on top of https://huggingface.co/norallm/normistral-7b-warm using various Norwegian instruct datasets, with a context length of 4096. The name "nordavind", a cold wind from the north, is a nod to Mistral, which is itself named after a wind (see https://en.wikipedia.org/wiki/Mistral_(wind)). Training runtime: ~20 hours on a V100 32GB. Final loss: 1.301.

Prompt format:

<s>{system_prompt} [INST] {inst} [/INST] \\n {res} </s>

Inference prompt:

<s>{system_prompt} [INST] {inst} [/INST] \\n

Examples:

generate("Lag en skikkelig norsk setning", tokens=50, first_sent=True)

Jeg liker å gå på tur i skogen.

generate("Lag en liste med tre forskjellige bruksområder (inkludert eksempler) for kunstig intelligens.")

  1. Automatisering av oppgaver: AI kan brukes til å automatisere oppgaver som å svare på kundehenvendelser, behandle medisinske bilder og diagnostisere sykdommer, og til og med kjøre biler.
  2. Forbedre kundeservice: AI kan brukes for å forbedre kundeservicen ved å gi personlige anbefalinger, gi raskere svar og gi mer nøyaktige svar.
  3. Forbedre sikkerhet: AI kan brukes i sikkerhetssystemer for å oppdage og forhindre cyberangrep, oppdage svindel og oppdage mistenkelige aktiviteter.

generate("Hvem tror du vinner valget i Bergen?", tokens=60)

Det er vanskelig å forutsi hvem som vinner valget i Bergen, men det er sannsynlig at enten Arbeiderpartiet eller Høyre vil vinne. Arbeiderpartiet har hatt makten i Bergen i mange år, men Høyre har hatt en sterk tilstedeværelse i byen de siste årene.

generate("Forklar dyp læring til en femteklassing")

Dyp læring er en type maskinlæring som bruker nevrale nettverk for å lære komplekse mønstre i data. Det innebærer å trene en datamaskin til å gjenkjenne mønstre i data og ta beslutninger basert på disse mønstrene.

Model loading

from transformers import pipeline
pipe = pipeline("text-generation", model="tollefj/nordavind-7b-instruct-warm")

Inference

system_prompt = 'Du er "Nordavind", en hjelpsom assistent.'

def make_prompt(inst, res=None):
    if not res:
        return f"""<s>{system_prompt} [INST] {inst} [/INST] \\n"""
    return f"""<s>{system_prompt} [INST] {inst} [/INST] \\n {res} </s>"""

def postprocess(output, first_sent=False):
    output = output.split("\\n")[-1].strip()
    # ignore hashtags as we often see #no_output
    output = output.split("#")[0].strip()
    # ignore incomplete sentences
    if not output.endswith("."):
        output = output.rsplit(".", 1)[0] + "."
    if first_sent:
        return output.split(".")[0] + "."
    return output

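def generate(prompt, tokens=100, first_sent=False, sample=False, temperature=1.0):
    prompt = make_prompt(prompt)
    output = pipe(
        prompt,
        # max_length counts the prompt tokens as well as the generated ones
        max_length=tokens,
        do_sample=sample,
        # temperature only takes effect when do_sample=True
        temperature=temperature,
    )
    # generated_text contains the prompt followed by the completion
    output = output[0]["generated_text"]
    output = postprocess(output, first_sent=first_sent)
    print(output)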
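
The effect of postprocess is easiest to see on canned strings, without loading the model. The sketch below repeats the function from above so that it runs on its own. The raw strings are made-up stand-ins for pipeline output (hypothetical, not real generations), and they use the same literal `\\n` marker as the prompt format.

```python
def postprocess(output, first_sent=False):
    # same logic as the postprocess function defined above
    output = output.split("\\n")[-1].strip()
    output = output.split("#")[0].strip()
    if not output.endswith("."):
        output = output.rsplit(".", 1)[0] + "."
    if first_sent:
        return output.split(".")[0] + "."
    return output


# Hypothetical raw outputs: the prompt, the marker, then a completion.
prompt = 'Du er "Nordavind", en hjelpsom assistent. [INST] Lag en setning [/INST] \\n'
truncated = prompt + " Jeg liker å gå på tur i skogen. Jeg liker også å"
tagged = prompt + " Jeg liker å gå på tur. #no_output"

# The unfinished trailing sentence is dropped.
cleaned_truncated = postprocess(truncated)
# Everything from the first "#" onwards is dropped.
cleaned_tagged = postprocess(tagged)
# Only the first sentence of the completion is kept.
first_only = postprocess(prompt + " En setning. To setninger.", first_sent=True)

print(cleaned_truncated)
print(cleaned_tagged)
print(first_only)
```

Note that if a completion contains no period at all, the rsplit branch simply appends one to the whole string.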

Training details

The model was fine-tuned with a 4-bit BitsAndBytes quantization config:

import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

and the following LoRA configuration:

from peft import LoraConfig

config = LoraConfig(
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    bias="none",
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
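
As a rough back-of-the-envelope illustration of what 4-bit loading saves, the sketch below estimates the storage needed for the base weights alone. It assumes about 7.25 billion parameters (an assumption for a model of this size) and counts only the nominal bit width. The helper name weight_memory_gib is made up for this example. Quantization constants, layers kept in higher precision, LoRA adapters, activations, and optimizer state are all ignored, so real memory use during training will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate storage for n_params weights at a given bit width, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: approximate parameter count, not an exact figure.
N_PARAMS = 7.25e9

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # half-precision weights
nf4_gib = weight_memory_gib(N_PARAMS, 4)    # nominal 4-bit weights

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")
print(f"4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the quantized weights take a quarter of the half-precision footprint, leaving more of a 32 GB card for activations and adapter training.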