Update Readme for model description

#1
by AnkitSatpute - opened
Files changed (1)
  1. README.md +17 -3
README.md CHANGED
@@ -1,3 +1,17 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - en
+ - de
+ ---
+
+ This model is trained for document separation of printed reviews from zbMATH Open.
+ We had old scanned volumes of documents dating back to the 1800s that we wanted to convert to a machine-processable LaTeX format. We first converted all scanned documents to LaTeX using
+ Mathpix and then trained an LLM to match the metadata of each document with its converted LaTeX (a single page contained many documents).
+
+
+ 1) Download LLaMA-Factory (I recommend this exact commit: https://github.com/hiyouga/LLaMA-Factory/tree/36039b0fe01c17ae30dba60e247d7ba8a1beb20a ; it is known to work, and I did not check newer versions).
+ 2) Save your dataset in the data folder and update dataset_info (an example dataset and dataset_info are attached).
+ 3) Upload the model you want.
+ 4) Run:
+ python3 -u LLaMA-Factory/src/train.py --stage sft --model_name_or_path louisbrulenaudet/Maxine-7B-0401-stock (the base model of this adapter, from Hugging Face) --adapter_name_or_path path_to_my_model --finetuning_type lora --template default --dataset_dir LLaMA-Factory/data --eval_dataset dataset_name --cutoff_len 10000 --max_samples 100000 --per_device_eval_batch_size 1 --predict_with_generate True --max_new_tokens 8000 --top_p 0.7 --temperature 0.95 --output_dir output_dir --do_predict True
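Step 2 above can be sketched in code. This is a minimal, hypothetical example of preparing a dataset for LLaMA-Factory: the dataset file name `zbmath_reviews.json`, the dataset key, and the example record contents are all made up for illustration; the record fields follow LLaMA-Factory's default alpaca-style format, and the actual attached dataset/dataset_info should be used as the authoritative reference.

```python
import json
from pathlib import Path

# Hypothetical sketch of step 2: write a dataset file into the
# LLaMA-Factory data folder and register it in dataset_info.json.
# All names and record contents below are illustrative placeholders.
data_dir = Path("LLaMA-Factory/data")
data_dir.mkdir(parents=True, exist_ok=True)

# One alpaca-style record: review metadata as input, the matching
# converted LaTeX span as output.
examples = [
    {
        "instruction": "Match the review metadata to its LaTeX text.",
        "input": "Author: J. Doe; Journal: Example J.; Year: 1885",
        "output": "\\textbf{Review.} Example converted LaTeX text.",
    }
]
(data_dir / "zbmath_reviews.json").write_text(
    json.dumps(examples, indent=2), encoding="utf-8"
)

# Register the file so that --eval_dataset zbmath_reviews resolves.
info_path = data_dir / "dataset_info.json"
info = json.loads(info_path.read_text()) if info_path.exists() else {}
info["zbmath_reviews"] = {"file_name": "zbmath_reviews.json"}
info_path.write_text(json.dumps(info, indent=2), encoding="utf-8")
```

With this in place, `dataset_name` in the command above would be replaced by the registered key (here `zbmath_reviews`).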