
LoNAS Model Card: lonas-bloomz-7b-math

A super-network fine-tuned from BLOOMZ-7B on math reasoning datasets using LoNAS.

Model Details

Information

Adapter Configuration

  • LoRA rank: 32
  • LoRA alpha: 64
  • LoRA target modules: query_key_value, dense_h_to_4h, dense_4h_to_h
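
For reference, the values above map onto a standard PEFT LoraConfig as in the sketch below. This is not the LoNAS training entry point (training uses the scripts in the repository linked under Model Sources), and task_type plus lora_dropout are assumptions not listed in this card.

from peft import LoraConfig

# Minimal sketch of the adapter configuration listed above.
# The actual LoNAS setup additionally attaches elastic (NNCF-controlled)
# structure to these adapters, which is not shown here.
lora_config = LoraConfig(
    r=32,                        # LoRA rank
    lora_alpha=64,               # LoRA alpha
    target_modules=["query_key_value", "dense_h_to_4h", "dense_4h_to_h"],
    lora_dropout=0.0,            # assumption: dropout not stated in this card
    task_type="CAUSAL_LM",       # assumption: causal language modeling
)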

Training Hyperparameters

  • Batch size: 16
  • Learning rate: 3e-4
  • Epoch: 8
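
As a rough illustration only, these hyperparameters correspond to a Hugging Face TrainingArguments block like the following; everything other than batch size, learning rate, and epochs is a placeholder, and actual LoNAS training goes through the repository's scripts.

from transformers import TrainingArguments

# Sketch of the listed hyperparameters; output_dir is a placeholder and
# not taken from the LoNAS training scripts.
training_args = TrainingArguments(
    output_dir="lonas-bloomz-7b-math",
    per_device_train_batch_size=16,   # batch size: 16
    learning_rate=3e-4,               # learning rate: 3e-4
    num_train_epochs=8,               # epochs: 8
)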

Training Data

Unified math reasoning dataset: math_10k.json (collected from the training sets of GSM8K, MAWPS, and AQuA).

Evaluation Data

GSM8K, AQuA, MAWPS, and SVAMP

How to use

Refer to https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_math.py \
    --dataset_path None \
    --model_name_or_path bigscience/bloomz-7b1 \
    --lora \
    --lora_weights lonas-bloomz-7b-math \
    --nncf_config nncf_config/unified_math/nncf_lonas_bloomz_7b.json \
    --do_test \
    --output_dir lonas-bloomz-7b-math/results
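
The run_math.py command above is the supported evaluation path. If you only want to load the adapter weights for quick inference, a plain transformers + PEFT load along the following lines may work; note that it ignores the NNCF sub-network configuration and treats the checkpoint as an ordinary LoRA adapter, and the path lonas-bloomz-7b-math is assumed to be a local copy of this repository.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the BLOOMZ-7B base model and attach the LoNAS adapter weights.
base_model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "lonas-bloomz-7b-math")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")

# Example prompt; the exact prompt template used by run_math.py may differ.
prompt = "A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts does it take in total?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))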

Evaluation Results

Results of the heuristic sub-network discovered from the super-network (accuracy in % on each benchmark):

Method   Total Params.   TFLOPs   GSM8K   AQuA   MAWPS   SVAMP   Average
LoRA     7.1B            1.8      17.4    21.3   70.2    41.0    37.5
LoNAS    6.1B            1.5      18.6    22.0   76.5    31.8    37.2

Model Sources

Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models (LREC-COLING 2024), https://aclanthology.org/2024.lrec-main.940

Citation

@inproceedings{munoz-etal-2024-lonas,
    title = "{L}o{NAS}: Elastic Low-Rank Adapters for Efficient Large Language Models",
    author = "Munoz, Juan Pablo  and
      Yuan, Jinjie  and
      Zheng, Yi  and
      Jain, Nilesh",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.940",
    pages = "10760--10776",
}

License

Apache-2.0
