---
license: mit
language:
- en
pipeline_tag: image-to-text
datasets:
- katanaml-org/invoices-donut-data-v1
---
## Sparrow - Data extraction from documents with ML

This model is a Donut base model fine-tuned on invoice data. It aims to verify how well Donut performs on enterprise documents.

Mean accuracy on test set: 0.96

Inference:

![Inference Results](https://raw.githubusercontent.com/katanaml/sparrow/main/sparrow-ui/assets/inference_actual.png)

Training loss:

![Training Loss](https://raw.githubusercontent.com/katanaml/sparrow/main/sparrow-ui/assets/donut_training_loss.png)

Sparrow on [GitHub](https://github.com/katanaml/sparrow)

Sample invoice [docs](https://github.com/katanaml/sparrow/tree/main/sparrow-ui/docs/images) to use for inference (documents up to 500 were used for fine-tuning; use documents from 500 onward for inference)

Our website [KatanaML](https://www.katanaml.io)

On [Twitter](https://twitter.com/katana_ml)
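
To try the model on one of the sample invoices linked above, a minimal sketch using the Hugging Face `transformers` Donut classes could look like the following. Several details are assumptions rather than facts from this card: the Hub repository id is not stated here, so `model_id` has to be supplied by the caller; the default task prompt `<s_cord-v2>` is a guess based on common Donut fine-tuning setups and may differ for this model; and the function names `strip_task_prompt` and `extract_invoice` are hypothetical helpers written for illustration.

```python
import re


def strip_task_prompt(sequence: str, eos_token: str, pad_token: str) -> str:
    """Remove EOS/pad tokens and the leading task-prompt tag from decoded output."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Drop only the first <...> tag, which is the task prompt.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def extract_invoice(image_path: str, model_id: str, task_prompt: str = "<s_cord-v2>") -> dict:
    """Run Donut inference on one invoice image and return the parsed fields.

    ASSUMPTIONS: model_id must be the Hub repo id of this model (not given in
    the card), and the default task_prompt is a guess that may need changing.
    """
    # Heavy imports are kept local so strip_task_prompt works without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()

    # Prepare the image and the decoder prompt.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values.to(device),
            decoder_input_ids=decoder_input_ids.to(device),
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    # Decode the token sequence and convert Donut's tag format to a dict.
    decoded = processor.batch_decode(outputs.sequences)[0]
    cleaned = strip_task_prompt(
        decoded, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    return processor.token2json(cleaned)
```

Calling `extract_invoice("path/to/invoice.jpg", model_id="...")` with this model's Hub repository id and a local copy of a sample invoice should return a nested dictionary of extracted fields. The image path here is a placeholder, and running the function requires `torch`, `Pillow`, and `transformers` to be installed.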