---
license: mit
base_model:
- Ultralytics/YOLOv8
pipeline_tag: object-detection
---

# Overview

This repository hosts a YOLOv8l model trained on the ArxivFormula (https://github.com/microsoft/ArxivFormula) dataset, which focuses on detecting mathematical expressions in scientific papers.

# Training Data

- Source: ArxivFormula (https://github.com/microsoft/ArxivFormula)
- Classes: 6 classes (InlineFormula, DisplayedFormulaLine, FormulaNumber, DisplayedFormulaBlock, Table, Figure)
- Pages: ~600,000 images of document pages

# Model

- YOLOv8l (https://github.com/ultralytics/ultralytics)
- epochs = 100
- imgsz = 640
- optimizer = 'AdamW'
- lr0 = 0.0001
- augment = True

# Usage

## Example Code

```python
from ultralytics import YOLO

# Sample images
img_list = ['sample1.png', 'sample2.png', 'sample3.png']

# Load the formula detection model
model = YOLO('arxivFormula_YOLOv8l.pt')

# Run inference; save=True writes annotated copies of the images to disk
results = model(source=img_list, save=True, show_labels=True, show_conf=True, show_boxes=True)
```
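The example above only saves annotated images. To use the detections in code, each returned `Results` object can be flattened into plain Python records. The following is a minimal sketch, not part of the original repository: the helper names `detections_to_records` and `detect_formulas` and the `min_conf` threshold are hypothetical, and it assumes the standard Ultralytics `Results` attributes (`boxes.xyxy`, `boxes.conf`, `boxes.cls`, `names`, `path`).

```python
from typing import Any, Dict, List


def detections_to_records(result: Any, min_conf: float = 0.25) -> List[Dict[str, Any]]:
    """Flatten one Ultralytics `Results` object into a list of plain dicts.

    `min_conf` is an illustrative threshold (an assumption, not a value
    recommended by the model authors).
    """
    boxes = result.boxes
    records = []
    for xyxy, conf, cls_id in zip(
        boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()
    ):
        if conf < min_conf:
            continue  # skip low-confidence detections
        records.append(
            {
                # result.names maps the integer class id to its label
                "label": result.names[int(cls_id)],
                "confidence": float(conf),
                # Pixel coordinates: [x_min, y_min, x_max, y_max]
                "bbox_xyxy": [float(v) for v in xyxy],
            }
        )
    return records


def detect_formulas(
    image_paths: List[str], weights: str = "arxivFormula_YOLOv8l.pt"
) -> Dict[str, List[Dict[str, Any]]]:
    """Run the model on `image_paths` and return detections keyed by image path."""
    # Imported lazily so the helper above can be used without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)
    results = model(source=image_paths)
    return {r.path: detections_to_records(r) for r in results}
```

Calling `detect_formulas(['sample1.png'])` would then return a dictionary whose values are lists of records with `label`, `confidence`, and `bbox_xyxy` keys, which can be filtered by label (for example, keeping only `InlineFormula`) before cropping or further processing.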