ArxivFormulaYOLOv8 / README.md
metadata
license: mit
base_model:
  - Ultralytics/YOLOv8
pipeline_tag: object-detection

Overview

This repository hosts a YOLOv8l model trained on the ArxivFormula (https://github.com/microsoft/ArxivFormula) dataset, which focuses on the detection of mathematical expressions in scientific papers.

Training Data:

  • Source: ArxivFormula (https://github.com/microsoft/ArxivFormula)
  • Classes: 6 (InlineFormula, DisplayedFormulaLine, FormulaNumber, DisplayedFormulaBlock, Table, Figure)
  • Pages: ~600,000 images of document pages

Model:

Usage

Example Code

from ultralytics import YOLO

# Sample images
img_list = ['sample1.png', 'sample2.png', 'sample3.png']

# Load the formula detection model
model = YOLO('arxivFormula_YOLOv8l.pt')

# Run detection; save=True writes annotated copies of the images to disk
results = model(source=img_list, save=True, show_labels=True, show_conf=True, show_boxes=True)
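
To work with the detections in code rather than only with the saved images, the `results` list from the example can be flattened into plain Python records. The following is a minimal sketch, not part of the original model card: `detections_to_records` is a hypothetical helper, and it assumes each result exposes `boxes.xyxy`, `boxes.conf`, `boxes.cls`, `names` and `path` as Ultralytics detection results do.

```python
from typing import Any, Dict, Iterable, List, Optional


def detections_to_records(
    results: Iterable[Any],
    keep_classes: Optional[Iterable[str]] = None,
    min_conf: float = 0.0,
) -> List[Dict[str, Any]]:
    """Flatten Ultralytics-style results into one dict per detected box.

    Hypothetical helper: keep_classes and min_conf are illustrative
    filters, not options of the model itself.
    """
    wanted = set(keep_classes) if keep_classes is not None else None
    records: List[Dict[str, Any]] = []
    for result in results:
        boxes = result.boxes
        if boxes is None:
            continue  # no box output for this result
        # xyxy, conf and cls are tensors; tolist() yields plain Python values.
        rows = zip(boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist())
        for xyxy, conf, cls_id in rows:
            label = result.names[int(cls_id)]  # class id to class name
            if conf < min_conf:
                continue
            if wanted is not None and label not in wanted:
                continue
            records.append(
                {
                    "image": result.path,
                    "label": label,
                    "confidence": conf,
                    "box_xyxy": xyxy,  # [x1, y1, x2, y2] in pixels
                }
            )
    return records
```

For example, `detections_to_records(results, keep_classes={"InlineFormula"}, min_conf=0.5)` would keep only confident inline formulas. The exact label strings stored in the checkpoint are an assumption here; `model.names` shows the mapping actually used.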