---
tags:
- image-classification
- document-classification
- vision
library_name: transformers
pipeline_tag: image-classification
license: mit
---

# Document Classification Model

## Overview

This model classifies document images. It is a fine-tuned Document Image Transformer (DiT), a vision transformer pre-trained on document images.

## Model Details

* Architecture: Vision Transformer (DiT)
* Tasks: Document Classification
* Training Framework: 🤗 Transformers
* Base Model: microsoft/dit-large
* Training Dataset Size: 32786

## Training Parameters

* Batch Size: 256
* Learning Rate: 0.002
* Number of Epochs: 1
* Mixed Precision: BF16

## Usage

```python
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Load model and processor
processor = AutoImageProcessor.from_pretrained("jnmrr/ds3-img-classification")
model = AutoModelForImageClassification.from_pretrained("jnmrr/ds3-img-classification")

# Process an image (convert to RGB so grayscale or RGBA scans are handled)
image = Image.open("document.png").convert("RGB")
inputs = processor(image, return_tensors="pt")

# Make prediction (gradients are not needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
predicted_label = outputs.logits.argmax(-1).item()
```
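
The usage snippet yields a class index, not a label name or a confidence. The following is a minimal sketch of one way to turn raw logits into ranked `(label, probability)` pairs. The helper name `top_k_labels`, the example logits, and the label names are hypothetical and only for illustration; the actual label set of this model is not listed in this card.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for one image."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits and label names, for illustration only.
example_logits = [1.0, 3.0, 2.0]
example_labels = {0: "invoice", 1: "letter", 2: "form"}
for label, prob in top_k_labels(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as in the usage snippet, the helper could be called as `top_k_labels(outputs.logits[0].tolist(), model.config.id2label)`, assuming the checkpoint's config carries an `id2label` mapping.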