library_name: transformers
tags:
  - retrieval-augmented-generation
  - finetuning
  - llm
  - huggingface

Model Card for Finetuned Llama 3.2 (ROS Query System)

This model is a finetuned version of Llama 3.2 specifically designed to answer questions related to the Robot Operating System (ROS). It was finetuned on Kaggle using domain-specific data scraped from GitHub repositories and Medium articles. The model powers a Retrieval-Augmented Generation (RAG) pipeline in our AI final project.


Model Details

Model Description

  • Developed by: Krish Murjani (netid: km6520) & Shresth Kapoor (netid: sk11677)
  • Project Name: CS-GY-6613 AI Final Project: ROS Query System
  • Finetuned From: Llama 3.2 (retrieval embeddings use sentence-transformers/all-MiniLM-L6-v2)
  • Language(s): English
  • License: Apache 2.0

Model Sources


Uses

Direct Use

The model is used in a Retrieval-Augmented Generation (RAG) pipeline for answering questions related to the Robot Operating System (ROS). It integrates with a vector search engine (Qdrant) and MongoDB for efficient retrieval and query response generation.
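The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: a toy bag-of-words vector stands in for the real sentence embeddings, an in-memory list stands in for the Qdrant collection, and the helper names (`embed`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query (the vector-search step)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt for the model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}\nAnswer:"


# Hypothetical chunks; in the project these would come from the scraped ROS data.
chunks = [
    "Use the navigation stack to send a goal pose to the robot.",
    "A ROS topic carries messages between publisher and subscriber nodes.",
    "Medium articles often cover installing ROS on Ubuntu.",
]
query = "How can I navigate to a specific pose using ROS?"
context = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, context)
print(prompt)
```

The resulting prompt is what would be passed to the finetuned model for generation; in a real deployment the similarity search would be delegated to the vector store rather than computed in Python.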

Downstream Use

The model can be extended for other technical domains through additional finetuning or plug-in integration into larger AI systems.

Out-of-Scope Use

The model is not designed for tasks outside of technical documentation retrieval and answering ROS-related queries.


Bias, Risks, and Limitations

  • Bias: The model may reflect biases inherent in the scraped ROS documentation and articles.
  • Limitations: Answers are limited to what the scraped finetuning data covers, so the model may not generalize to broader queries.

Recommendations

  • Use the model for educational and research purposes in robotics and ROS-specific domains.
  • Avoid using the model in high-stakes applications where critical decisions rely on the accuracy of generated responses.

How to Get Started with the Model

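from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace "your-model-id" with this model's Hub repository ID.
model_id = "your-model-id"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

input_text = "How can I navigate to a specific pose using ROS?"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate an answer and decode the token IDs back into text.
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))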

Training Details

Training Data

  • Sources:
    • GitHub repositories related to the Robot Operating System (ROS).
    • Medium articles discussing ROS topics.

Training Procedure

  • Preprocessing:

    • Data cleaning, text chunking, and embedding using Sentence-BERT (all-MiniLM-L6-v2).
    • Used ClearML orchestrator for ETL and finetuning pipelines.
  • Training Framework:

    • Hugging Face Transformers, PyTorch
  • Training Regime:

    • fp16 mixed precision (for efficiency and memory optimization)
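As an illustration of the text-chunking step listed under preprocessing, the sketch below splits a document into overlapping word windows before embedding. The card does not state the chunk size, overlap, or splitting unit actually used, so the `chunk_text` helper and its default values are assumptions chosen only for demonstration.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks of chunk_size words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()  # also collapses repeated whitespace (a basic cleaning step)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical input: twelve placeholder words, chunked in windows of five with overlap two.
sample = " ".join(f"w{i}" for i in range(12))
pieces = chunk_text(sample, chunk_size=5, overlap=2)
print(pieces)
```

Overlapping windows keep sentences that straddle a boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated storage.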

Evaluation

Testing Data

  • Dataset:
    • Internal evaluation dataset created from project-specific queries and generated question-answer pairs.

Factors & Metrics

  • Metrics:

    • Query relevance, answer accuracy, and completeness.
  • Evaluation Results:

    • Achieved high relevance and precision on domain-specific ROS questions from the internal evaluation set (no quantitative scores are reported).
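The card does not define how answer accuracy and completeness were scored. One simple way such metrics could be approximated against reference answers is token overlap, with precision as a rough accuracy proxy and recall as a rough completeness proxy. The `overlap_scores` helper and the example strings below are hypothetical and are not the project's evaluation code.

```python
from collections import Counter


def overlap_scores(predicted: str, reference: str) -> dict:
    """Token-overlap precision, recall, and F1 between a generated and a reference answer."""
    pred = Counter(predicted.lower().split())
    ref = Counter(reference.lower().split())
    common = sum((pred & ref).values())  # tokens shared by both, counted with multiplicity
    precision = common / sum(pred.values()) if pred else 0.0
    recall = common / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if common else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical generated answer and reference answer.
scores = overlap_scores("publish a goal pose", "publish a goal pose to move_base")
print(scores)
```

Token overlap ignores paraphrase and factual correctness, so it would complement rather than replace a manual review of relevance.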

Environmental Impact

  • Hardware Type:

    • NVIDIA Tesla T4 (Kaggle)
  • Hours Used:

    • Approximately 15-20 hours of training
  • Compute Region:

    • US Central (Kaggle Cloud)
  • Carbon Emitted:


Technical Specifications

  • Model Architecture:

    • Transformer-based language model (Llama 3.2)
  • Compute Infrastructure:

    • Kaggle Cloud with NVIDIA Tesla T4 GPUs
  • Frameworks:

    • Hugging Face Transformers, PyTorch, ClearML

Citation

@misc{kapoor2024rosquery,
  title={ROS Query System: A Retrieval-Augmented Generation Pipeline},
  author={Shresth Kapoor and Krish Murjani},
  year={2024},
  note={CS-GY-6613 AI Final Project, NYU Tandon School of Engineering}
}

Model Card Authors

Model Card Contact

For any inquiries, please contact us through our GitHub repository.