---
library_name: transformers
license: apache-2.0
base_model:
- llm-jp/llm-jp-3-13b
pipeline_tag: text-generation
---

# Model Card for 1kbooks/llm-jp-3-13b-finetuned-ver2

<!-- Provide a quick summary of what the model is/does. -->

A version of llm-jp-3-13b fine-tuned on the ichikara instruction dataset.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

Fine-tuning was performed with NEFTune (noisy embedding fine-tuning).

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Causal language model for text generation
- **Language(s) (NLP):** Japanese
- **License:** Apache-2.0
- **Finetuned from model [optional]:** llm-jp/llm-jp-3-13b

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "1kbooks/llm-jp-3-13b-finetuned-ver2"

# Load the model with 4-bit NF4 quantization to reduce memory usage.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

instruction = "ここに指示を入力"  # put your instruction here

# Build the prompt in the instruction/response format used during fine-tuning.
prompt = f"### 指示\n{instruction}\n### 回答\n"

tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
attention_mask = torch.ones_like(tokenized_input)

with torch.no_grad():
    outputs = model.generate(
        tokenized_input,
        attention_mask=attention_mask,
        max_new_tokens=512,
        use_cache=True,
        do_sample=False,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )

# Keep only the text generated after the "### 回答" marker.
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split("\n### 回答")[-1]
print(prediction)
```
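
Note: 4-bit loading with `BitsAndBytesConfig` additionally requires the `bitsandbytes` and `accelerate` packages, and NF4 quantization with a bfloat16 compute dtype assumes a reasonably recent GPU; on hardware without bfloat16 support, switching `bnb_4bit_compute_dtype` to `torch.float16` is a plausible substitute.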

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the quantized loading and generation example in the [Direct Use](#direct-use) section above to get started with the model.

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- ichikara-instruction, a Japanese instruction-tuning dataset; each record is assumed to be rendered into the instruction/response template sketched below.
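
The preprocessing is not documented here; the following is a minimal sketch of how each record could be rendered into the template that the Direct Use example expects. The template string comes from that example, while the function name and field names are hypothetical:

```python
def format_example(instruction: str, response: str) -> str:
    """Render one training record into the prompt template used at inference time."""
    return f"### 指示\n{instruction}\n### 回答\n{response}"

# Example: format a single instruction/response pair.
print(format_example("日本の首都はどこですか。", "東京です。"))
```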

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- NEFTune (noisy embedding fine-tuning): uniform random noise is added to the embedding vectors during training, which typically improves instruction-tuning quality; see the sketch below.
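
The exact training setup is not documented. Below is a minimal sketch of how a NEFTune run could be set up with 🤗 transformers, which exposes NEFTune through `TrainingArguments(neftune_noise_alpha=...)`; the hyperparameters, the noise strength, and the toy record are assumptions, not the values used for this model.

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model_id = "llm-jp/llm-jp-3-13b"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.bfloat16)

# Toy stand-in for the ichikara data, formatted with the template above.
records = [{"text": "### 指示\n日本の首都はどこですか。\n### 回答\n東京です。" + tokenizer.eos_token}]
dataset = Dataset.from_list(records).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="llm-jp-3-13b-finetuned",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=2e-5,
    bf16=True,
    neftune_noise_alpha=5.0,  # enables NEFTune; the actual alpha used is not documented (assumption)
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal-LM labels
)
trainer.train()
```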
|