
This is an experimental model and may not perform well. It was trained on a modified version of the NilanE/ParallelFiction-Ja_En-100k dataset.

Training with an 8k context length did not appear to improve performance much. I'm not sure whether to keep training it (which is costly), fix some issues with the dataset first (such as text starting with "Ch" or "Chapter"), or go back to finetuning Finnish models.

Prompt format: Alpaca

Below is a translation task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}
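
As a minimal sketch of how the template above might be filled in for inference, the three `{}` slots can be populated with `str.format`, leaving the response slot empty so the model writes the translation. The helper name `build_prompt` and the instruction wording are hypothetical; the card does not state which instruction text was used during training.

```python
# Alpaca-style template copied from the prompt format above.
ALPACA_TEMPLATE = (
    "Below is a translation task, paired with an input that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{}\n\n"
    "### Input:\n{}\n\n"
    "### Response:\n{}"
)


def build_prompt(instruction: str, text: str, response: str = "") -> str:
    """Fill the template; keep `response` empty when prompting for a translation."""
    return ALPACA_TEMPLATE.format(instruction, text, response)


# Hypothetical instruction wording (an assumption, not taken from the model card).
prompt = build_prompt(
    "Translate the following Japanese text to English.",
    "吾輩は猫である。",
)
print(prompt)
```

The resulting string ends right after `### Response:`, so whatever the model generates next should be the translation.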

Uploaded model

  • Developed by: mpasila
  • License: apache-2.0
  • Finetuned from model: augmxnt/shisa-base-7b-v1

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model size: 7.96B params (Safetensors), tensor type BF16

Model repository: mpasila/JP-EN-Translator-2K-steps-7B