
	---
	license: mit
	library_name: peft
	tags:
	- trl
	- sft
	- generated_from_trainer
	base_model: microsoft/Phi-3-mini-4k-instruct
	datasets:
	- shujatoor/ner_instruct-chat
	model-index:
	- name: checkpoint_dir
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# phi3nedtuned-ner

	This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on the `shujatoor/ner_instruct-chat` dataset listed in the card metadata.
	It achieves the following results on the evaluation set:
	- Loss: 0.6568
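As a rough aid to reading that number: if the reported loss is the mean per-token cross-entropy in nats (an assumption; the card does not say how it was computed), it can be converted to a perplexity by exponentiating it. A minimal sketch:

```python
import math

# Evaluation loss reported in this card.
eval_loss = 0.6568

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ≈ {perplexity:.2f}")  # prints: perplexity ≈ 1.93
```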

	## For Inference
	```python
	import torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

	# Load the base model, then attach the fine-tuned PEFT adapter on top of it.
	model = AutoModelForCausalLM.from_pretrained(
	    "microsoft/Phi-3-mini-4k-instruct",
	    device_map="cuda",
	    torch_dtype="auto",
	    trust_remote_code=True,
	)
	model = PeftModel.from_pretrained(model, "shujatoor/phi3nedtuned-ner")


	torch.random.manual_seed(0)
	tokenizer = AutoTokenizer.from_pretrained("shujatoor/phi3nedtuned-ner")


	text = "Hasan Pharmacy Madina Market Mustafa Chowk.PCsiR Staff Society College Road, Lahore Drug Lic#441-A/AIT No.1023874 24/04/202422:18:03 M/s*CASH SALES-WALKING CUST Remarks: Ref.: Item Name Qty Price Total Advant Tab 16mg 28 37.50 1050.00 Kepra 500mg Tab 30 85.91 2577.30 Kabrokin 200mg 240 10.67 2560.80 Tab Myteka 10mg Tab 14 37.71 527.94 Cipocain Ear/drops 1 168.00 168.00 Medicam T/paste 1 240.00 240.00 100gm Total items:6 Gross Total : 7,124.04 Disc: 523.68 DR.HASAN Net Total. 6,600.00 (Computer Software developed by Abuzar Consultancy Ph 042-37426911-15)."
	qs = f'{text} What is the drug license number of the store??'
	print('Question:', qs, '\n')
	messages = [
	    # {"role": "system", "content": "Only output the answer, nothing else"},
	    {"role": "user", "content": qs},
	]

	pipe = pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	)

	generation_args = {
	    "max_new_tokens": 512,
	    "return_full_text": False,
	    # "temperature": 0.0,
	    "do_sample": False,
	}

	output = pipe(messages, **generation_args)

	print('Answer:', output[0]['generated_text'], '\n')

	"""
	expected answer:

	Answer: 441-A/AIT No.1023874

	"""

	```
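The example above builds its prompt by appending a question to the document text and sending the result as a single user message. A minimal sketch of that pattern as a reusable helper follows; `build_messages` and its parameters are hypothetical names introduced here for illustration, not part of the model or of any library.

```python
def build_messages(document_text, question, system_prompt=None):
    """Build a chat message list: the document text followed by the question.

    Hypothetical helper; it mirrors the prompt layout of the inference example
    (text, one space, question) and optionally prepends a system message.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": f"{document_text} {question}"})
    return messages


# Shortened stand-in for the receipt text used in the inference example.
example = build_messages(
    "Hasan Pharmacy ... Drug Lic#441-A/AIT No.1023874 ...",
    "What is the drug license number of the store??",
)
print(example)
```

The returned list can be passed to the pipeline in place of `messages`, for example `pipe(example, **generation_args)`.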

	## Intended uses & limitations

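	Named Entity Recognition (NER), posed as a question about a passage of text: the prompt contains the text followed by a question, and the model answers with the requested entity. The inference example above extracts a drug license number from a pharmacy receipt in this way.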

	## Training and evaluation data

	The card metadata lists `shujatoor/ner_instruct-chat` as the dataset. More information needed.

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 0
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.2
	- num_epochs: 1
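To make the scheduler settings concrete, here is a minimal pure-Python sketch of a cosine schedule with linear warmup, using the learning rate (0.0002) and warmup ratio (0.2) listed above. It follows the usual shape of such a schedule (linear ramp from zero, then a half-cosine decay to zero) and is not taken from the training script. The total number of optimizer steps is not stated in this card, so `TOTAL_STEPS = 1000` is a hypothetical value chosen only for illustration.

```python
import math


def lr_at_step(step, total_steps, base_lr=2e-4, warmup_ratio=0.2):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 1000  # hypothetical; the real step count is not given in the card

for step in (0, 100, 200, 600, 1000):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

With these assumed numbers, the rate climbs to its peak of 0.0002 at step 200 (20% of training), is back at half the peak by step 600, and reaches zero at the final step.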

	### Training results



	### Framework versions

	- PEFT 0.10.1.dev0
	- Transformers 4.41.0.dev0
	- PyTorch 2.2.1+cu121
	- Datasets 2.19.0
	- Tokenizers 0.19.1
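When reproducing results, it can help to compare a local environment against the versions listed above. The sketch below does this with the standard library's `importlib.metadata`. The distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names, and `compare_versions` is a hypothetical helper. A mismatch is not necessarily a problem, and the `.dev0` builds listed here may not be installable from PyPI.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by the usual PyPI distribution name.
EXPECTED = {
    "peft": "0.10.1.dev0",
    "transformers": "4.41.0.dev0",
    "torch": "2.2.1+cu121",
    "datasets": "2.19.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (wanted, installed_or_None, exact_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


report = compare_versions(EXPECTED)
for package, (wanted, installed, matches) in report.items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```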

	### License

	The model is licensed under the MIT license.