Update README.md
---
library_name: transformers
tags:
- rag
- security
- legal
- ai4good
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- google/gemma-3-4b-it
pipeline_tag: text-generation
---

# GEMMA Document Rewriter for RAG Pipeline

## Overview

The **GEMMA Document Rewriter for RAG Pipeline** is a text rewriting model built on top of the pre-trained [Google Gemma 3 4B](https://huggingface.co/unsloth/gemma-3-4b-it) language model. It has been fine-tuned with LoRA (Low-Rank Adaptation); the adapter weights are provided by [ZySec-AI/gemma-3-4b-document-writer-lora](https://huggingface.co/ZySec-AI/gemma-3-4b-document-writer-lora). The model rewrites documents by removing unnecessary information, stray whitespace, and redundant content; it extracts and emphasizes the information that matters for Retrieval-Augmented Generation (RAG) pipelines and outputs a clean, structured version of the document in Markdown with appropriate headings.
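Below is a minimal loading sketch in Python, assuming the adapter is attached with the `peft` library on top of the base checkpoint; the repo ids are taken from the links above, and exact class choices may differ from the notebook linked under Usage.

```python
# Minimal sketch: load the Gemma 3 4B base model and attach the LoRA adapter.
# Assumes `transformers`, `peft`, and `accelerate` are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "google/gemma-3-4b-it"
ADAPTER_ID = "ZySec-AI/gemma-3-4b-document-writer-lora"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPU(s)/CPU
)
# Wrap the base model with the fine-tuned LoRA adapter weights.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()
```

For lower latency in production, the adapter can also be folded into the base weights with `model.merge_and_unload()`, yielding a plain `transformers` model.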

## Key Features

- **Efficient Document Rewriting:** Extracts the essential content from lengthy documents, removing extraneous details and whitespace to create a more concise version ideal for RAG systems.
- **Markdown Output:** Reformats content into Markdown, automatically generating headings and subheadings for improved readability and further processing.
- **Cost-Effective and Speed Optimized:** Built on a relatively small language model (Gemma 3 4B), offering a cost-effective solution with inference speeds fast enough for production pipelines.
- **LoRA Fine-Tuning:** Uses LoRA adapter layers to fine-tune the base model efficiently, enabling rapid adaptation to the document rewriting task without full-scale retraining.
- **State-of-the-Art Performance:** Designed to integrate seamlessly into modern RAG pipelines, ensuring that only the most relevant and structured information is preserved and highlighted.

## Intended Use Cases

This model is well suited to a range of document processing and natural language understanding tasks, including:

- **Document Summarization & Rewriting:** Simplify and restructure long documents or articles by extracting key information and presenting it in an organized, Markdown-formatted style.
- **Data Preprocessing for RAG Pipelines:** Serve as a preprocessing step in retrieval-augmented generation systems, providing clean, condensed documents that improve retrieval quality and downstream performance.
- **Content Cleanup & Standardization:** Remove noise such as extra whitespace, stray byte sequences, and redundant verbiage, ensuring that documents conform to a standardized format before further processing.
- **Cost-Effective Deployment:** For organizations that need document rewriting without the overhead of large, resource-intensive models, this solution balances performance and efficiency.

## Model Architecture

The model is built on the [Google Gemma 3 4B](https://huggingface.co/unsloth/gemma-3-4b-it) architecture, a transformer-based language model designed for high-speed inference. On top of this base model, LoRA adapter layers specialize it for document rewriting. The adapter mechanism lets the model learn task-specific modifications while updating only a fraction of the parameters, making fine-tuning both memory- and compute-efficient.
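As an illustration of the adapter setup, the following `peft` sketch shows how a LoRA configuration of this kind is typically defined and attached at training time. The rank, alpha, dropout, and target modules below are assumed values, not the published training recipe; `base_model` is a freshly loaded base checkpoint, as in the Overview sketch before the adapter is attached.

```python
# Illustrative LoRA configuration; the hyperparameters are assumptions,
# not the documented recipe for this adapter.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                    # low-rank dimension of the adapter matrices
    lora_alpha=32,           # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of all weights
```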

## How It Works

1. **Input Processing:** The model accepts a raw text string, which can be an entire document or a section of text. It tokenizes the input and flags extraneous content such as stray whitespace and redundant sentences.
2. **Information Extraction:** Using its fine-tuned attention mechanisms, the model extracts content that is semantically important for the intended downstream RAG tasks, evaluating context and relevance to determine what to retain.
3. **Content Rewriting & Formatting:** The extracted information is rewritten into a concise form and organized as Markdown, with appropriate headings and subheadings added automatically based on the structure and flow of the content.
4. **Output Generation:** The final output is a clean, structured document that preserves key insights and removes unnecessary noise, ready for use in RAG pipelines or other downstream applications, as sketched below.
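In practice, the four steps above reduce to a single prompted generation call. The sketch below reuses `model` and `tokenizer` from the Overview sketch; the instruction wording is an assumption, since the exact prompt template the fine-tune expects is not documented here.

```python
# Sketch of a rewrite call; the prompt wording is an assumed instruction,
# not the documented training template.
raw_document = open("report.txt").read()  # any long, noisy source text

messages = [{
    "role": "user",
    "content": (
        "Rewrite the following document for a RAG pipeline: drop redundant "
        "content and extra whitespace, keep the significant information, and "
        "format the result as Markdown with appropriate headings.\n\n"
        + raw_document
    ),
}]

# Steps 1-2: tokenize the input using the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Steps 3-4: generate the rewritten, Markdown-formatted document.
outputs = model.generate(inputs, max_new_tokens=1024, do_sample=False)
rewritten = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(rewritten)
```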

## Usage

A Colab notebook for this model is available here:

https://colab.research.google.com/drive/11yIG9FFp3cU5G5iUXxHjJrXEXH-7zOYw?usp=sharing
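For quick local experiments, the adapter can be merged into the base weights and wrapped in a standard `transformers` pipeline. This sketch reuses `model`, `tokenizer`, and `raw_document` from the examples above; the prompt wording is again an assumption.

```python
from transformers import pipeline

# Merge the LoRA adapter into the base weights for a plain transformers model,
# then wrap it in a text-generation pipeline.
merged = model.merge_and_unload()
rewriter = pipeline("text-generation", model=merged, tokenizer=tokenizer)

prompt = "Rewrite the following document as clean Markdown:\n\n" + raw_document
result = rewriter(prompt, max_new_tokens=1024, return_full_text=False)
print(result[0]["generated_text"])
```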