Spaces: MakiAi/Llama-finetune-sandbox

MakiAi committed on Oct 28, 2024

Commit

0b24ea8

2 Parent(s): 85dd626 89556f9

Merge docs/update-readme

Files changed (3)

README.md +9 -7
docs/README.en.md +40 -14
sandbox/efficient-ollama-colab-setup-with-litellm-guide.md +125 -0

README.md CHANGED

@@ -64,18 +64,20 @@ license: mit
    - Memory usage optimization
    - Visualization of experimental results
 ## 📚 Implementation Examples
 This repository includes the following implementation examples:
-1. **High-speed fine-tuning using Unsloth**
-   - Implementation of high-speed fine-tuning for Llama-3.2-1B/3B models
-     - → See [`Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`](sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md) for details.
-     - → [Use this to convert from Markdown to Notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI)
-   - [📒Notebook here](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing)
-2. Other implementation examples will be added over time
 ## 🛠️ Environment Setup

    - Memory usage optimization
    - Visualization of experimental results
 ## 📚 Implementation Examples
 This repository includes the following implementation examples:
+### High-speed fine-tuning using Unsloth
+ - Implementation of high-speed fine-tuning for Llama-3.2-1B/3B models
+   - → See [`Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`](sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md) for details.
+   - → [Use this to convert from Markdown to Notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI)
+ - [📒Notebook here](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing)
+### Efficient model operation using Ollama and LiteLLM
+ - Setup and operation guide for Google Colab
+ - → See [`efficient-ollama-colab-setup-with-litellm-guide.md`](sandbox/efficient-ollama-colab-setup-with-litellm-guide.md) for details.
+ - [📒Notebook here](https://colab.research.google.com/drive/1buTPds1Go1NbZOLlpG94VG22GyK-F4GW?usp=sharing)
 ## 🛠️ Environment Setup

docs/README.en.md CHANGED

@@ -31,7 +31,7 @@ license: mit
 </p>
 <h2 align="center">
-  Llama Model Fine-tuning Experiment Environment
 </h2>
 <p align="center">
@@ -44,41 +44,67 @@ license: mit
 ## 🚀 Project Overview
-**Llama-finetune-sandbox** provides an experimental environment for learning and verifying Llama model fine-tuning.  You can try various fine-tuning methods, customize models, and evaluate performance.  It caters to a wide range of users, from beginners to researchers. Version 0.1.0 includes a repository name change, a significantly updated README, and the addition of a Llama model fine-tuning tutorial.
 ## ✨ Key Features
-1. **Diverse Fine-tuning Methods:**
    - LoRA (Low-Rank Adaptation)
    - QLoRA (Quantized LoRA)
-   - Full Fine-tuning
-   - Parameter-Efficient Fine-tuning (PEFT)
-2. **Flexible Model Configuration:**
    - Customizable maximum sequence length
    - Various quantization options
    - Multiple attention mechanisms
-3. **Experiment Environment Setup:**
    - Performance evaluation tools
    - Memory usage optimization
    - Visualization of experimental results
-## 🔧 How to Use
-This repository includes a tutorial on high-speed fine-tuning using the Unsloth library (`sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`). This tutorial explains the fine-tuning process step-by-step with numerous code examples.  The tutorial is written in Japanese. [Use this to convert from Markdown to Notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI). A [Google Colab notebook](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing) is also available.
-## 📦 Installation Instructions
-Information not available.
-## 🆕 Latest News
-- 🎉 Added Llama model fine-tuning tutorial.
-## 📄 License
 This project is licensed under the MIT License.

 </p>
 <h2 align="center">
+  ～ Llama Model Fine-tuning Experiment Environment ～
 </h2>
 <p align="center">
 ## 🚀 Project Overview
+**Llama-finetune-sandbox** is an experimental environment for learning and verifying Llama model fine-tuning. You can try various fine-tuning methods, customize models, and evaluate performance. It caters to a wide range of users, from beginners to researchers. Version 0.1.0 includes a repository name change, a significantly updated README, and a new Llama model fine-tuning tutorial.
 ## ✨ Key Features
+1. **Various Fine-tuning Methods:**
    - LoRA (Low-Rank Adaptation)
    - QLoRA (Quantized LoRA)
+   - ⚠️~Full Fine-tuning~
+   - ⚠️~Parameter-Efficient Fine-tuning (PEFT)~
+2. **Flexible Model Settings:**
    - Customizable maximum sequence length
    - Various quantization options
    - Multiple attention mechanisms
+3. **Experimental Environment Setup:**
    - Performance evaluation tools
    - Memory usage optimization
    - Visualization of experimental results
+## 📚 Implementation Examples
+This repository includes the following implementation examples:
+1. **High-speed fine-tuning using Unsloth:**
+   - Implementation of high-speed fine-tuning for Llama-3.2-1B/3B models.
+     - → See [`Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`](sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md) for details.
+     - → [Use this to convert from Markdown to Notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI)
+   - [📒Notebook here](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing)
+2. Other implementation examples will be added periodically.
+## 🛠️ Environment Setup
+1. Clone the repository:
+```bash
+git clone https://github.com/Sunwood-ai-labs/Llama-finetune-sandbox.git
+cd Llama-finetune-sandbox
+```
+## 📝 Adding Example Experiments
+1. Add new implementations to the `examples/` directory.
+2. Add necessary settings and utilities to `utils/`.
+3. Update documentation and tests.
+4. Create a pull request.
+## 🤝 Contributions
+- Implementation of new fine-tuning methods
+- Bug fixes and feature improvements
+- Documentation improvements
+- Addition of usage examples
+## 📚 References
+- [HuggingFace PEFT Documentation](https://huggingface.co/docs/peft)
+- [About Llama Models](https://github.com/facebookresearch/llama)
+- [Fine-tuning Best Practices](https://github.com/Sunwood-ai-labs/Llama-finetune-sandbox/wiki)
+## ⚖️ License
 This project is licensed under the MIT License.

sandbox/efficient-ollama-colab-setup-with-litellm-guide.md ADDED

@@ -0,0 +1,125 @@

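+# How to run Ollama efficiently on Google Colab with LiteLLM
+## Introduction
+For running local LLMs, the combination of Ollama and LiteLLM is a very powerful solution. This article explains how to integrate these tools efficiently in a Google Colab environment.
+## What is Ollama?
+Ollama is an open-source tool that makes it easy to run LLMs (large language models) in a local environment. Its main features are:
+- A simple command-line interface
+- Efficient model management
+- A lightweight runtime
+- The ability to act as an API server
+## Benefits of using LiteLLM
+The main benefits of adopting LiteLLM are:
+1. **A unified interface**
+   - OpenAI
+   - Anthropic
+   - Ollama
+   - Other major LLM providers, all reachable with the same code
+2. **Easy provider switching**
+   - Switch to a different provider just by changing the model name
+   - Flexible switching between development and production environments
+3. **Standardized error handling**
+   - Provider-specific errors are handled in a uniform way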
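When provider-specific errors are handled uniformly, one retry helper can wrap a call to any provider. The following is a minimal sketch under stated assumptions: `call_with_retry` is a hypothetical helper, not part of LiteLLM, and it takes both the call and the retryable exception types as arguments so that it does not depend on which exception classes a given LiteLLM version exposes.

```python
import time


def call_with_retry(call, retryable, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run `call()` and retry on the given exception types with exponential backoff.

    `call_with_retry` and its parameters are illustrative names, not a LiteLLM API.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2 ** attempt)
```

With LiteLLM, `call` could be something like `lambda: completion(model=..., messages=...)`, and `retryable` a tuple of the connection or rate-limit exception classes exported by the installed LiteLLM version (check its documentation for the exact names).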
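+## Implementation steps
+### Setting up the environment
+```python
+# Install Ollama (-fsSL: fail on HTTP errors, stay quiet, follow redirects)
+!curl -fsSL https://ollama.ai/install.sh | sh
+# Install the CUDA drivers without interactive prompts
+!echo 'debconf debconf/frontend select Noninteractive' | sudo debconf-set-selections
+!sudo apt-get update && sudo apt-get install -y cuda-drivers
+```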
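+### Starting the server and downloading the model
+```python
+# Start the Ollama server in the background, logging to ollama.log
+!nohup ollama serve > ollama.log 2>&1 &
+# Give the server a few seconds to start listening
+!sleep 5
+# Download the model
+!ollama pull llama3:8b-instruct-fp16
+```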
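Because `ollama serve` runs in the background, commands that follow it can run before the server is listening. As a minimal sketch, the helper below polls the address used in this guide (`http://localhost:11434`) until it answers. `wait_for_server` and its default timeout values are illustrative, and the sketch assumes the server's root URL returns HTTP 200 once it is ready.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:11434", timeout=30.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    Returns True if the server answered in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        time.sleep(interval)
    return False
```

In a notebook cell, `assert wait_for_server()` placed between the server start and the model download would stop the run early if the server never came up.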
+### Running the model through LiteLLM
+```python
+from litellm import completion
+# Send a chat request to the local Ollama server
+response = completion(
+    model="ollama/llama3:8b-instruct-fp16",
+    messages=[{"content": "respond in 20 words. who are you?", "role": "user"}],
+    api_base="http://localhost:11434"
+)
+print(response)
+```
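Printing the whole response shows metadata as well as the reply. A small helper can pull out only the reply text, as sketched below. This assumes the OpenAI-style response layout (`choices[0].message.content`) with dictionary-style access, which LiteLLM documents for its responses. `reply_text` is a hypothetical name.

```python
def reply_text(response):
    """Return the assistant message text from an OpenAI-style chat response.

    Works on any mapping-like object with the layout
    {"choices": [{"message": {"content": ...}}]}.
    """
    choices = response["choices"]
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]
```

For example, `print(reply_text(response))` would print only the model's answer rather than the full response object.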
+## Example: switching providers
+With LiteLLM, switching to a different provider is as simple as the following:
+```python
+from litellm import completion
+# OpenAI (reads the API key from the OPENAI_API_KEY environment variable)
+response = completion(
+    model="gpt-3.5-turbo",
+    messages=[{"content": "Hello!", "role": "user"}]
+)
+# Anthropic (reads the API key from the ANTHROPIC_API_KEY environment variable)
+response = completion(
+    model="claude-3-opus-20240229",
+    messages=[{"content": "Hello!", "role": "user"}]
+)
+# Ollama (runs locally)
+response = completion(
+    model="ollama/llama3:8b-instruct-fp16",
+    messages=[{"content": "Hello!", "role": "user"}],
+    api_base="http://localhost:11434"
+)
+```
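The three calls above differ only in the model name and, for Ollama, the `api_base`. One way to keep that difference in a single place is a helper that builds the keyword arguments from the model name. This is a sketch: `completion_kwargs` and `OLLAMA_API_BASE` are hypothetical names, and the `ollama/` prefix check simply mirrors the model strings used in this guide.

```python
OLLAMA_API_BASE = "http://localhost:11434"  # address used throughout this guide


def completion_kwargs(model, prompt):
    """Build keyword arguments for a chat completion call to `model`."""
    kwargs = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Only local Ollama models need an explicit api_base here;
    # hosted providers fall back to their default endpoints.
    if model.startswith("ollama/"):
        kwargs["api_base"] = OLLAMA_API_BASE
    return kwargs
```

A call would then look like `completion(**completion_kwargs("ollama/llama3:8b-instruct-fp16", "Hello!"))`, and switching between development and production becomes a change to one string.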
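+## Caveats and best practices
+1. **Resource management**
+   - Can also run on the Google Colab free tier
+   - Keep an eye on GPU memory usage
+2. **Session management**
+   - When a Colab session disconnects, the setup must be run again
+   - For long-running jobs, the Pro plan is recommended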
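To watch GPU memory usage from a notebook, one option is to query `nvidia-smi` and parse its CSV output. The sketch below assumes `nvidia-smi` is on the path, as it normally is on a Colab GPU runtime. `parse_gpu_memory` and `gpu_memory` are hypothetical helper names.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs, one GPU per line, into a list of tuples."""
    result = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        result.append((used, total))
    return result


def gpu_memory():
    """Query nvidia-smi for used and total memory (MiB) of each visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)
```

Calling `gpu_memory()` before and after loading a model gives a rough picture of how much headroom remains on the runtime's GPU.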
+## Summary
+The combination of Ollama and LiteLLM greatly simplifies running local LLMs. In particular:
+- Better development efficiency through a unified interface
+- Easy switching between providers
+- Simple execution in Google Colab
+These advantages allow flexible use of LLMs from prototyping through production.
+## Notebook
+https://colab.research.google.com/drive/1buTPds1Go1NbZOLlpG94VG22GyK-F4GW?usp=sharing
+## Repository
+https://github.com/Sunwood-ai-labs/Llama-finetune-sandbox
+## Reference
+https://note.com/masayuki_abe/n/n9640e08492ac