navodPeiris committed
Commit 2206f59 · verified · 1 Parent(s): f4e0e48

updated readme

Files changed (1)
  1. README.md +52 -7
README.md CHANGED
@@ -21,19 +21,64 @@ It achieves the following results on the evaluation set:
21
  - Loss: 0.0008
22
  - Accuracy: 1.0
23
 
24
- ## Model description
25
 
26
- More information needed
27
 
28
- ## Intended uses & limitations
29
 
30
- More information needed
31
 
32
- ## Training and evaluation data
33
 
34
- More information needed
35
 
36
- ## Training procedure
37
 
38
  ### Training hyperparameters
39
 
 
21
  - Loss: 0.0008
22
  - Accuracy: 1.0
23
 
24
+ ## Dataset Information
25
 
26
+ This model was fine-tuned to classify company documents.
27
 
28
+ Dataset used: [Company Documents Dataset](https://www.kaggle.com/datasets/navodpeiris/company-documents-dataset)
29
 
30
+ ## Dependencies
31
 
32
+ ```
33
+ pip install PyMuPDF
34
+ pip install transformers
35
+ pip install torch
36
+ pip install torchvision
37
+ pip install pytesseract
38
+ ```
39
 
40
+ - Set up Tesseract locally on your machine by following the [install instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html).
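Because `pytesseract` only wraps the Tesseract binary, it can help to confirm the binary is reachable before running the model. The following is a minimal sketch, assuming the executable is named `tesseract` and is expected on `PATH`; the helper name `tesseract_available` is hypothetical and not part of this repository.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found on PATH.")
    else:
        print("Tesseract not found; see the install instructions above.")
```

If the binary lives outside `PATH`, `pytesseract` can be pointed at it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.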
41
 
42
+ ## Model Usage
43
+
44
+ Use a file from this dataset to test the model: https://www.kaggle.com/datasets/navodpeiris/company-documents-dataset
45
+
46
+ ```
47
+ import os
48
+ from PIL import Image
49
+ from transformers import LayoutLMv2Processor, LayoutLMv2ForSequenceClassification
50
+ import fitz
51
+ import io
52
+
53
+ processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
54
+ model = LayoutLMv2ForSequenceClassification.from_pretrained("navodPeiris/layoutlmv2-document-classifier")
55
+
56
+ DATA_FOLDER = "data"
57
+ filename = "invoice.pdf"
58
+
59
+ file_location = os.path.join(DATA_FOLDER, filename)
60
+ doc = fitz.open(file_location)
61
+
62
+ page = doc.load_page(0)
63
+ pix = page.get_pixmap(dpi=200)
64
+
65
+ # Convert Pixmap to bytes
66
+ img_bytes = pix.tobytes("png")
67
+
68
+ # Load into PIL.Image
69
+ image = Image.open(io.BytesIO(img_bytes)).convert("RGB")
70
+ doc.close()
71
+
72
+ encoding = processor(image, return_tensors="pt", truncation=True, padding="max_length", max_length=512)
73
+
74
+ outputs = model(**encoding)
75
+ logits = outputs.logits
76
+
77
+ predicted_class_id = logits.argmax(dim=1).item()
78
+ classified_output = model.config.id2label[predicted_class_id]
79
+
80
+ print(f"Predicted class: {classified_output}")
81
+ ```
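The example above reports only the top class. To gauge how confident the prediction is, the logits can be passed through a softmax. The sketch below does this in plain Python on a list of floats; the helper name `rank_classes` and the labels in the usage example are hypothetical and are not the model's actual label set.

```python
import math


def rank_classes(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked]


if __name__ == "__main__":
    # Hypothetical logits and labels, for illustration only.
    example_logits = [2.0, 0.5, -1.0]
    example_labels = {0: "invoice", 1: "report", 2: "order"}
    for label, prob in rank_classes(example_logits, example_labels):
        print(f"{label}: {prob:.3f}")
```

With the variables from the example above, this could be called as `rank_classes(logits[0].detach().tolist(), model.config.id2label)`. In PyTorch itself, the same probabilities are available from `torch.softmax(logits, dim=1)`.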
82
 
83
  ### Training hyperparameters
84