iris-s-coon committed on
Commit cdb6ab8 · 1 Parent(s): 729aeb2

📖 [docs] Update English README

Files changed (1)
  1. docs/README.en.md +35 -35
docs/README.en.md CHANGED
@@ -31,7 +31,7 @@ license: mit
  </p>

  <h2 align="center">
- Llama model fine-tuning experiment environment
  </h2>

  <p align="center">
@@ -44,42 +44,42 @@ license: mit

  ## 🚀 Project Overview

- **Llama-finetune-sandbox** provides an experimental environment for learning and verifying Llama model fine-tuning. You can try various fine-tuning methods, customize models, and evaluate performance. It caters to a wide range of users, from beginners to researchers. Version 0.6.0 includes updated documentation and the implementation of an LLM evaluation system. This system automatically assesses the quality of LLM responses and generates detailed evaluation reports.


- ## ✨ Main Features

- 1. **Various Fine-tuning Methods:**
  - LoRA (Low-Rank Adaptation)
  - QLoRA (Quantized LoRA)

- 2. **Flexible Model Settings:**
  - Customizable maximum sequence length
  - Various quantization options
  - Multiple attention mechanisms

- 3. **Experiment Environment Setup:**
- - Optimized memory usage
  - Visualization of experimental results

- 4. **Context-Aware Reflective QA Generation System:**
  - Generates high-quality Q&A datasets from Wikipedia data.
- - Uses LLMs to generate context-aware questions and answers, automatically evaluate quality, and iteratively improve them.
- - Employs a reflective approach, quantifying factuality, question quality, and answer completeness to enable iterative improvements.
- - Provides comprehensive code and explanations covering environment setup, model selection, data preprocessing, Q&A pair generation, quality evaluation, and the improvement process.
  - Uses libraries such as `litellm`, `wikipedia`, and `transformers`.
  - Generated Q&A pairs are saved in JSON format and can be easily uploaded to the Hugging Face Hub.

- 5. **LLM Evaluation System:**
  - Automatically evaluates the quality of LLM responses.
- - Evaluates questions, model answers, and LLM responses on a 4-level scale, generating detailed evaluation reports.
  - Features error handling, retry functionality, logging, customizable evaluation criteria, and report generation in CSV and HTML formats.
- - Also includes functionality for uploading to the HuggingFace Hub.


  ## 🔧 Usage

- Please refer to the notebooks in this repository.


  ## 📦 Installation Instructions
@@ -87,25 +87,25 @@ Please refer to the notebooks in this repository.
  Refer to `requirements.txt` and install the necessary packages.


- ## 📚 Examples

- This repository includes the following examples:

- ### Fast Fine-tuning using Unsloth
- - Fast fine-tuning implementation for Llama-3.2-1B/3B models
  - → See [`Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`](sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md) for details.
- - → [Use this to convert from markdown to notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI)
  - [📒Notebook here](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing)

- ### Fast Inference using Unsloth
- - Fast inference implementation for Llama-3.2 models
  - → See [`Unsloth_inference_llama3-2.md`](sandbox/Unsloth_inference_llama3-2.md) for details.
- - → Implementation of efficient inference processing for Llama-3.2 models using Unsloth
  - [📒Notebook here](https://colab.research.google.com/drive/1FkAYiX2fbGPTRUopYw39Qt5UE2tWJRpa?usp=sharing)

- - Fast inference implementation for LLM-JP models
  - → See [`Unsloth_inference_llm_jp.md`](sandbox/Unsloth_inference_llm_jp.md) for details.
- - → Implementation and performance optimization of fast inference processing for Japanese LLMs
  - [📒Notebook here](https://colab.research.google.com/drive/1lbMKv7NzXQ1ynCg7DGQ6PcCFPK-zlSEG?usp=sharing)

  ### Efficient Model Operation using Ollama and LiteLLM
@@ -114,33 +114,33 @@ This repository includes the following examples:
  - [📒Notebook here](https://colab.research.google.com/drive/1buTPds1Go1NbZOLlpG94VG22GyK-F4GW?usp=sharing)

  ### Q&A Dataset Generation from Wikipedia Data (Sentence Pool QA Method)
- - High-quality Q&A dataset generation using the Sentence Pool QA method
- - → A new dataset creation method that generates Q&A pairs while preserving context by pooling sentences separated by periods.
- - → Chunk size can be flexibly adjusted (default 200 characters) to generate Q&A pairs with optimal context ranges for various applications.
  - → See [`wikipedia-qa-dataset-generator.md`](sandbox/wikipedia-qa-dataset-generator.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1mmK5vxUzjk3lI6OnEPrQqyjSzqsEoXpk?usp=sharing)

- ### Context-Aware Reflective QA Generation System
- - Q&A dataset generation with reflective quality improvement
  - → A new method that automatically evaluates the quality of generated Q&A pairs and iteratively improves them.
  - → Quantifies factuality, question quality, and answer completeness for evaluation.
- - → Uses contextual information for high-accuracy question generation and answer consistency checks.
  - → See [`context_aware_Reflexive_qa_generator_V2.md`](sandbox/context_aware_Reflexive_qa_generator_V2.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1OYdgAuXHbl-0LUJgkLl_VqknaAEmAm0S?usp=sharing)

  ### LLM Evaluation System (LLMs as a Judge)
  - Advanced quality evaluation system utilizing LLMs as evaluators
- - → Automatically evaluates questions, model answers, and LLM responses on a 4-level scale.
  - → Robust design with error handling and retry functionality.
  - → Generates detailed evaluation reports in CSV and HTML formats.
  - → See [`LLMs_as_a_Judge_TOHO_V2.md`](sandbox/LLMs_as_a_Judge_TOHO_V2.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1Zjw3sOMa2v5RFD8dFfxMZ4NDGFoQOL7s?usp=sharing)


- ## 🆕 Latest Information (v0.6.0)

- - **Implementation of the LLM Evaluation System:** Added a system that automatically evaluates the quality of LLM responses. Questions, model answers, and LLM responses are compared and evaluated on a 4-level scale. Features error handling, retry functionality, logging, customizable evaluation criteria, and report generation in CSV and HTML formats.
- - Added information about the LLM evaluation system to README.md


  ## 🤝 Contributions
 
  </p>

  <h2 align="center">
+ Llama Model Fine-tuning Experiment Environment
  </h2>

  <p align="center">
 

  ## 🚀 Project Overview

+ **Llama-finetune-sandbox** provides an experimental environment for learning and verifying Llama model fine-tuning. You can try various fine-tuning methods, customize models, and evaluate performance. It caters to a wide range of users, from beginners to researchers. Version 0.7.0 includes updated documentation and the addition of a guide for implementing high-speed inference using Unsloth.


+ ## ✨ Key Features

+ 1. **Diverse Fine-tuning Methods**:
  - LoRA (Low-Rank Adaptation)
  - QLoRA (Quantized LoRA)

+ 2. **Flexible Model Configuration**:
  - Customizable maximum sequence length
  - Various quantization options
  - Multiple attention mechanisms

+ 3. **Experiment Environment Setup**:
+ - Memory usage optimization
  - Visualization of experimental results

+ 4. **Context-Aware Reflexive QA Generation System**:
  - Generates high-quality Q&A datasets from Wikipedia data.
+ - Uses LLMs to automatically generate context-aware questions and answers, evaluate quality, and iteratively improve them.
+ - Employs a reflexive approach, quantifying factuality, question quality, and answer completeness to evaluate and iteratively improve the generated Q&A pairs.
+ - Provides code and explanations covering environment setup, model selection, data preprocessing, Q&A pair generation, quality evaluation, and the improvement process.
  - Uses libraries such as `litellm`, `wikipedia`, and `transformers`.
  - Generated Q&A pairs are saved in JSON format and can be easily uploaded to the Hugging Face Hub.

+ 5. **LLM Evaluation System**:
  - Automatically evaluates the quality of LLM responses.
+ - Evaluates questions, model answers, and LLM responses on a 4-point scale, generating detailed evaluation reports.
  - Features error handling, retry functionality, logging, customizable evaluation criteria, and report generation in CSV and HTML formats.
+ - Also includes functionality for uploading to the Hugging Face Hub.

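As a rough illustration of what the LoRA/QLoRA setup listed above involves, here is a minimal sketch using the Hugging Face `transformers` and `peft` libraries. It is not code from this repository (the notebooks here build on Unsloth); the model name and hyperparameters are placeholders.

```python
# Illustrative LoRA/QLoRA sketch (not the repository's code); the model name
# and hyperparameters below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-1B"  # placeholder base model

# 4-bit quantization config: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach low-rank adapters to the attention projections (the LoRA part);
# only these adapter weights are updated during fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
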
  ## 🔧 Usage

+ Refer to the notebooks in this repository.


  ## 📦 Installation Instructions
 
  Refer to `requirements.txt` and install the necessary packages.


+ ## 📚 Implementation Examples

+ This repository includes the following implementation examples:

+ ### High-Speed Fine-tuning using Unsloth
+ - High-speed fine-tuning implementation for Llama-3.2-1B/3B models
  - → See [`Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md`](sandbox/Llama_3_2_1B+3B_Conversational_+_2x_faster_finetuning_JP.md) for details.
+ - → [Use this to convert from Markdown to Notebook format](https://huggingface.co/spaces/MakiAi/JupytextWebUI)
  - [📒Notebook here](https://colab.research.google.com/drive/1AjtWF2vOEwzIoCMmlQfSTYCVgy4Y78Wi?usp=sharing)

+ ### High-Speed Inference using Unsloth
+ - High-speed inference implementation for Llama-3.2 models
  - → See [`Unsloth_inference_llama3-2.md`](sandbox/Unsloth_inference_llama3-2.md) for details.
+ - → Implementation of efficient inference processing for the Llama-3.2 model using Unsloth
  - [📒Notebook here](https://colab.research.google.com/drive/1FkAYiX2fbGPTRUopYw39Qt5UE2tWJRpa?usp=sharing)

+ - High-speed inference implementation for LLM-JP models
  - → See [`Unsloth_inference_llm_jp.md`](sandbox/Unsloth_inference_llm_jp.md) for details.
+ - → Implementation and performance optimization of high-speed inference processing for Japanese LLMs
  - [📒Notebook here](https://colab.research.google.com/drive/1lbMKv7NzXQ1ynCg7DGQ6PcCFPK-zlSEG?usp=sharing)

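As a rough sketch of what such a high-speed inference setup looks like (assuming Unsloth's `FastLanguageModel` API as shown in its public examples; the model name is a placeholder and details may differ from the linked notebooks):

```python
# Illustrative sketch only; see the linked notebooks for the actual implementations.
from unsloth import FastLanguageModel

# Placeholder model name; 4-bit loading keeps GPU memory usage low.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch the model into inference mode

inputs = tokenizer(["Explain LoRA in one sentence."], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
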
  ### Efficient Model Operation using Ollama and LiteLLM
 
  - [📒Notebook here](https://colab.research.google.com/drive/1buTPds1Go1NbZOLlpG94VG22GyK-F4GW?usp=sharing)

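The general pattern can be pictured with a minimal, hypothetical call that routes a request to a locally running Ollama model through LiteLLM's `completion` interface (model name and endpoint are placeholders, not taken from the linked notebook):

```python
# Minimal sketch: calling a local Ollama model through LiteLLM.
# Assumes `ollama serve` is running and the model has been pulled locally.
from litellm import completion

response = completion(
    model="ollama/llama3.2",            # placeholder Ollama model name
    messages=[{"role": "user", "content": "Summarize what QLoRA is in one sentence."}],
    api_base="http://localhost:11434",  # default local Ollama endpoint
)
print(response.choices[0].message.content)
```
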
  ### Q&A Dataset Generation from Wikipedia Data (Sentence Pool QA Method)
+ - High-quality Q&A dataset generation using the sentence pool QA method
+ - → A new dataset creation method that generates Q&A pairs while maintaining context by pooling sentences separated by punctuation marks.
+ - → Chunk size is flexibly adjustable (default 200 characters) to generate Q&A pairs with optimal context range depending on the application.
  - → See [`wikipedia-qa-dataset-generator.md`](sandbox/wikipedia-qa-dataset-generator.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1mmK5vxUzjk3lI6OnEPrQqyjSzqsEoXpk?usp=sharing)

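The pooling step itself can be pictured with a small sketch like the one below (illustrative only; the actual logic lives in the linked notebook, and the 200-character default is taken from the description above):

```python
# Illustrative sketch of the sentence-pool idea: split text into sentences,
# then pool consecutive sentences into context-preserving chunks of roughly
# `chunk_size` characters, from which Q&A pairs can later be generated.
import re

def pool_sentences(text: str, chunk_size: int = 200) -> list[str]:
    # Split on Japanese and Western sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[。．.!?])\s*", text) if s.strip()]
    chunks, pool = [], ""
    for sentence in sentences:
        if pool and len(pool) + len(sentence) > chunk_size:
            chunks.append(pool)
            pool = sentence
        else:
            pool = pool + sentence if pool else sentence
    if pool:
        chunks.append(pool)
    return chunks

print(pool_sentences("First sentence. Second sentence. Third sentence.", chunk_size=30))
```
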
+ ### Context-Aware Reflexive QA Generation System
+ - Q&A dataset generation with reflexive quality improvement
  - → A new method that automatically evaluates the quality of generated Q&A pairs and iteratively improves them.
  - → Quantifies factuality, question quality, and answer completeness for evaluation.
+ - → Generates highly accurate questions and checks the consistency of answers using contextual information.
  - → See [`context_aware_Reflexive_qa_generator_V2.md`](sandbox/context_aware_Reflexive_qa_generator_V2.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1OYdgAuXHbl-0LUJgkLl_VqknaAEmAm0S?usp=sharing)

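Because the generated Q&A pairs are stored as JSON, publishing them follows the usual `datasets` workflow; a minimal sketch (the file name and repository ID are placeholders):

```python
# Minimal sketch: load locally generated Q&A pairs from JSON and push them to
# the Hugging Face Hub. File name and repo ID are placeholders; requires a
# prior `huggingface-cli login`.
from datasets import load_dataset

qa_dataset = load_dataset("json", data_files="generated_qa_pairs.json")
qa_dataset.push_to_hub("your-username/wikipedia-qa-dataset")
```
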
  ### LLM Evaluation System (LLMs as a Judge)
  - Advanced quality evaluation system utilizing LLMs as evaluators
+ - → Automatically evaluates questions, model answers, and LLM responses on a 4-point scale.
  - → Robust design with error handling and retry functionality.
  - → Generates detailed evaluation reports in CSV and HTML formats.
  - → See [`LLMs_as_a_Judge_TOHO_V2.md`](sandbox/LLMs_as_a_Judge_TOHO_V2.md) for details.
  - [📒Notebook here](https://colab.research.google.com/drive/1Zjw3sOMa2v5RFD8dFfxMZ4NDGFoQOL7s?usp=sharing)


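Conceptually, the core of such a judge is a single scoring call per response; the following is an illustrative sketch (not the repository's implementation; the judge model, prompt, and parsing are placeholder assumptions) of grading one response on the 4-point scale described above:

```python
# Illustrative "LLM as a judge" sketch; the judge model, prompt wording, and
# numeric parsing are placeholder assumptions, not the repository's code.
from litellm import completion

JUDGE_MODEL = "gpt-4o-mini"  # placeholder judge model

def judge(question: str, reference_answer: str, model_response: str) -> int:
    """Return a 1-4 quality score for the model response (4 = fully correct)."""
    prompt = (
        "You are a strict evaluator. Compare the model response with the reference "
        "answer and rate it on a 4-point scale (1 = wrong, 4 = fully correct). "
        "Reply with the number only.\n\n"
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Model response: {model_response}"
    )
    resp = completion(model=JUDGE_MODEL, messages=[{"role": "user", "content": prompt}])
    # Assumes the judge replies with a single digit, as instructed.
    return int(resp.choices[0].message.content.strip())

print(judge("What is the capital of Japan?", "Tokyo", "The capital of Japan is Tokyo."))
```
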
+ ## 🆕 What's New (v0.7.0)

+ - **Addition of a guide for high-speed inference implementation using Unsloth**: Added information on high-speed inference implementation for Llama-3.2 and LLM-JP models, how to use each model, and links to Colab notebooks.
+ - Updated documentation


  ## 🤝 Contributions