🚨 T-pro is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.
This repository contains the T-pro-it-1.0 model, which has been quantized into the GGUF format using the llama.cpp repository.
Detailed evaluation results of the original model can be found in our Habr post.
Benchmark | T-pro-it-1.0 | T-pro-it-1.0-Q4_K_M | T-pro-it-1.0-Q5_K_M | T-pro-it-1.0-Q6_K | T-pro-it-1.0-Q8_0 |
---|---|---|---|---|---|
Arena-Hard-Ru | 90.17 (-1.3, 1.5) | 89.0 (-1.5, 1.3) | 89.29 (-1.6, 1.3) | 88.5 (-1.3, 1.3) | 89.35 (-1.2, 1.2) |
From HF:
llama-server --hf-repo t-tech/T-pro-it-1.0-Q8_0-GGUF --hf-file t-pro-it-1.0-q8_0.gguf -c 8192
Or locally:
./build/bin/llama-server -m t-pro-it-1.0-q8_0.gguf -c 8192
curl --request POST \
--url http://localhost:8080/completion \
--header "Content-Type: application/json" \
--data '{
"prompt": "<|im_start|>user\nРасскажи мне чем отличается Python от C++?\n<|im_end|>\n<|im_start|>assistant\n",
"n_predict": 256
}'
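The same request can also be sent from a script. The following is a minimal Python sketch, assuming the server started above is listening on `http://localhost:8080` and that the `/completion` response is a JSON object with the generated text in a `content` field. The helper names `build_chatml_prompt`, `build_completion_request`, and `complete` are hypothetical and are not part of llama.cpp.

```python
import json
import urllib.request


def build_chatml_prompt(user_message: str) -> str:
    """Wrap one user turn in the same ChatML markers as the curl example."""
    return (
        "<|im_start|>user\n"
        f"{user_message}\n"
        "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def build_completion_request(
    user_message: str,
    n_predict: int = 256,
    url: str = "http://localhost:8080/completion",  # assumed default address
) -> urllib.request.Request:
    """Build (but do not send) the POST request for llama-server."""
    payload = {
        "prompt": build_chatml_prompt(user_message),
        "n_predict": n_predict,
    }
    return urllib.request.Request(
        url,
        # ensure_ascii=False keeps Cyrillic text readable in the body
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(user_message: str, n_predict: int = 256) -> str:
    """Send the request and return the generated text.

    Assumption: the reply JSON carries the text under "content".
    """
    request = build_completion_request(user_message, n_predict)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["content"]


# Usage (requires a running llama-server):
# print(complete("Расскажи мне чем отличается Python от C++?"))
```

Building the request separately from sending it makes it possible to inspect the exact prompt and payload before any server is running.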
ollama serve
From HF:
ollama run hf.co/t-tech/T-pro-it-1.0-Q8_0-GGUF:Q8_0 "Расскажи мне про отличия C++ и Python"
Or locally:
ollama create example -f Modelfile
ollama run example "Расскажи мне про отличия C++ и Python"
where Modelfile is:
FROM ./t-pro-it-1.0-q8_0.gguf
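Once `ollama serve` is running and the `example` model has been created, the model can also be queried over Ollama's HTTP API instead of `ollama run`. The sketch below is a hedged illustration, assuming the default Ollama address `http://localhost:11434`, the `/api/generate` endpoint, and a non-streaming reply whose text sits in a `response` field. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "example",  # the name passed to `ollama create`
    url: str = "http://localhost:11434/api/generate",  # assumed default address
) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request for Ollama."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of chunks
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "example") -> str:
    """Send the request and return the generated text.

    Assumption: the reply JSON carries the text under "response".
    """
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Usage (requires `ollama serve` and the model created from the Modelfile):
# print(generate("Расскажи мне про отличия C++ и Python"))
```

Unlike the raw `/completion` call to llama-server, this sketch passes the plain question without ChatML markers and leaves prompt templating to Ollama.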
Base model
t-tech/T-pro-it-1.0