---
pipeline_tag: text-generation
inference: true
widget:
  - text: >-
      <commit_before>def has_close_elements(numbers: List[float], threshold:
      float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for
      idx2, elem2 in enumerate(numbers):\n            if idx !=
      idx2:\n                distance = elem - elem2\n                if
      distance < threshold:\n                    return True\n\n    return
      False<commit_msg>Fix bugs in has_close_elements.<commit_after>
    example_title: Fix has_close_elements
    group: Python
license: bigcode-openrail-m
datasets:
  - bigcode/commitpack-subset-cf
metrics:
  - code_eval
library_name: transformers
tags:
  - code
model-index:
  - name: SantaCoderPack
    results:
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Python
        metrics:
          - name: pass@1
            type: pass@1
            value: 3.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix JavaScript
        metrics:
          - name: pass@1
            type: pass@1
            value: 4.9
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Java
        metrics:
          - name: pass@1
            type: pass@1
            value: 1.8
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Go
        metrics:
          - name: pass@1
            type: pass@1
            value: 3.6
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix C++
        metrics:
          - name: pass@1
            type: pass@1
            value: 4.2
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Rust
        metrics:
          - name: pass@1
            type: pass@1
            value: 1.7
            verified: false
      - task:
          type: text-generation
        dataset:
          type: bigcode/humanevalpack
          name: HumanEvalFix Average
        metrics:
          - name: pass@1
            type: pass@1
            value: 3.3
            verified: false
---

# Octopack

# Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Training](#training)
4. [Citation](#citation)

# Model Summary

SantaCoderPack is a pre-trained model with the same architecture as SantaCoder, trained on CommitPack using the following format:

`<commit_before>code_before<commit_msg>message<commit_after>code_after`

- **Repository:** [bigcode/octopack](https://github.com/bigcode-project/octopack)
- **Paper:** TODO
- **Languages:** Python, JavaScript, Java, C++, Go, Rust

| SantaCoderPack |  |
| --- | --- |
| Data | [CommitPack](https://huggingface.co/datasets/bigcode/commitpack), 4TB of GitHub commits across 350 programming languages |
| Model | SantaCoderPack (1.1B parameters) pre-trained on CommitPack |
| Evaluation | [HumanEvalPack](https://huggingface.co/datasets/bigcode/humanevalpack) (HumanEvalFix), an extension of OpenAI's HumanEval to bug fixing |
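The scores in the metadata above are pass@1 on HumanEvalFix: a fix counts only if it passes the problem's unit tests. As a minimal sketch of how such a score can be computed, assuming the Hugging Face `evaluate` library's `code_eval` metric (this is illustrative, not the exact OctoPack evaluation harness; the test case and candidate below are made up):

```python
# Illustrative pass@1 computation with the `evaluate` library's code_eval
# metric -- not the exact OctoPack harness. code_eval executes
# model-generated code, so it must be enabled explicitly.
import os

import evaluate

os.environ["HF_ALLOW_CODE_EVAL"] = "1"
code_eval = evaluate.load("code_eval")

# One problem: a unit test and a list of candidate solutions for it.
test_cases = ["assert has_close_elements([1.0, 2.0, 3.9], 0.3) == False"]
candidates = [[
    "def has_close_elements(numbers, threshold):\n"
    "    return any(abs(a - b) < threshold\n"
    "               for i, a in enumerate(numbers)\n"
    "               for j, b in enumerate(numbers) if i != j)"
]]

pass_at_k, results = code_eval.compute(
    references=test_cases, predictions=candidates, k=[1]
)
print(pass_at_k)  # {'pass@1': 1.0}
```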

# Use

## Intended use

The model follows the instruction given as the commit message in its input. We recommend formatting your input in the commit format it was pre-trained on, e.g.:

`<commit_before>def has_close_elements(numbers: List[float], threshold: float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for idx2, elem2 in enumerate(numbers):\n            if idx != idx2:\n                distance = elem - elem2\n                if distance < threshold:\n                    return True\n\n    return False<commit_msg>Fix bugs in has_close_elements.<commit_after>`
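For programmatic use, a small helper can assemble such prompts. The `build_commitpack_prompt` function below is a hypothetical convenience wrapper, not part of the model or the `transformers` API:

```python
# Hypothetical helper for assembling SantaCoderPack prompts; the function
# name and signature are illustrative, not part of any released API.
def build_commitpack_prompt(code_before: str, commit_message: str) -> str:
    """Wrap buggy code and a commit message in the CommitPack format.

    The prompt ends with <commit_after> so the model continues with the
    fixed code.
    """
    return (
        f"<commit_before>{code_before}"
        f"<commit_msg>{commit_message}"
        f"<commit_after>"
    )


buggy = (
    "def has_close_elements(numbers, threshold):\n"
    "    ...\n"  # buggy function body, as in the example above
)
prompt = build_commitpack_prompt(buggy, "Fix bugs in has_close_elements.")
```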

Feel free to share your generations in the Community tab!

## Generation

```python
# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoderpack"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

# Prompt in the commit format used during pre-training; the model
# generates the fixed code after <commit_after>.
inputs = tokenizer.encode("<commit_before>def has_close_elements(numbers: List[float], threshold: float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for idx2, elem2 in enumerate(numbers):\n            if idx != idx2:\n                distance = elem - elem2\n                if distance < threshold:\n                    return True\n\n    return False<commit_msg>Fix bugs in has_close_elements.<commit_after>", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
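By default `generate` stops at a small `max_length`, which can cut the fix short. A common adjustment, using standard `transformers` arguments (the token budget below is illustrative), is to cap new tokens explicitly and decode only the completion:

```python
# Generate up to 192 new tokens (an illustrative budget) and decode only
# the tokens after the prompt, i.e. the code following <commit_after>.
outputs = model.generate(inputs, max_new_tokens=192)
completion = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(completion)
```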

# Training

## Model

- **Architecture:** GPT-2 model with multi-query attention (see the sketch below)
- **Steps:** 250k pretraining
- **Pretraining tokens:** 131B
- **Precision:** bfloat16
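For illustration, multi-query attention shares a single key/value head across all query heads, which shrinks the KV cache during generation by a factor of the head count. A minimal PyTorch sketch of the idea (not the actual SantaCoder implementation):

```python
# Minimal multi-query attention sketch -- illustrative only, not the
# SantaCoder source. All query heads attend with one shared K/V head.
import torch
from torch import nn


class MultiQueryAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)             # per-head queries
        self.kv_proj = nn.Linear(d_model, 2 * self.head_dim)  # one shared K/V head
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k, v = self.kv_proj(x).split(self.head_dim, dim=-1)    # (b, t, head_dim) each
        k, v = k.unsqueeze(1), v.unsqueeze(1)                  # broadcast over heads
        scores = q @ k.transpose(-2, -1) / self.head_dim**0.5  # (b, heads, t, t)
        causal = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), 1)
        scores = scores.masked_fill(causal, float("-inf"))
        out = scores.softmax(dim=-1) @ v                       # (b, heads, t, head_dim)
        return self.out_proj(out.transpose(1, 2).reshape(b, t, -1))


x = torch.randn(2, 16, 256)
print(MultiQueryAttention(256, 8)(x).shape)  # torch.Size([2, 16, 256])
```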

## Hardware

- **Pretraining:**
  - **GPUs:** 32 Tesla A100
  - **Training time:** 15 days

## Software

# Citation

TODO