
OpenLLaMA 7Bv2 Model Card

Model Description

OpenLLaMA 7Bv2 is a language model trained to produce high-quality, contextually relevant text predictions. It is trained on a composite dataset of web-crawled data, scholarly articles, literature, and question-answer pairs, chosen for broad domain coverage.

Training Data

The model was trained on a composite dataset that includes:

  • Falcon RefinedWeb dataset
  • StarCoder datasets
  • Contributions from Wikipedia for encyclopedic knowledge
  • Academic papers from arXiv for scientific understanding
  • A vast collection of books spanning multiple genres
  • Stack Exchange data curated by RedPajama

Training Procedure

  • Learning Rate: Maximum learning rate of 3e-4, minimum learning rate of 3e-5.
  • Batch Size: 4 million tokens per batch.
  • Learning Rate Scheduler: The learning rate schedule closely follows the one used in Llama 2.
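
The learning rate settings above can be illustrated with a minimal sketch. The Llama 2 paper describes a cosine schedule with linear warmup that decays to 10% of the peak rate, which matches the 3e-4 and 3e-5 bounds listed here. The warmup length and total step count below are assumptions chosen for illustration, not values from this model card, and the function name `learning_rate` is hypothetical rather than taken from the training code.

```python
import math

MAX_LR = 3e-4  # maximum learning rate, from the model card
MIN_LR = 3e-5  # minimum learning rate, from the model card

# Assumptions for illustration only; the model card does not state these.
WARMUP_STEPS = 2_000
TOTAL_STEPS = 100_000


def learning_rate(step, warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Linear warmup to MAX_LR, then cosine decay down to MIN_LR."""
    if step < warmup_steps:
        # Ramp up linearly so the first step uses a small, nonzero rate.
        return MAX_LR * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 past the end.
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    # Cosine factor goes from 1.0 (start of decay) to 0.0 (end of decay).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (MAX_LR - MIN_LR) * cosine


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 51_000, 100_000):
        print(f"step {step:>7}: lr = {learning_rate(step):.3e}")
```

Clamping `progress` keeps the rate at the minimum if training runs past `total_steps`, which is one reasonable design choice among several.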
