---
title: README
emoji: π
colorFrom: yellow
colorTo: indigo
sdk: streamlit
pinned: false
---
Welcome to our space!
The [Unstructured.io](https://www.unstructured.io) team provides libraries with open-source components for pre-processing text documents
such as **PDFs**, **HTML** and **Word** documents. These components are packaged as *bricks* 🧱, which give
users the building blocks they need to build pipelines targeted at the documents they care
about. Bricks in the library fall into three categories:
- 🧩 ***Partitioning bricks*** that break raw documents down into standard, structured
  elements.
- 🧹 ***Cleaning bricks*** that remove unwanted text from documents, such as boilerplate and
  sentence fragments.
- 🎭 ***Staging bricks*** that format data for downstream tasks, such as ML inference
  and data labeling.
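The three categories can be pictured as stages of a small pipeline. The sketch below is a plain-Python illustration only: `Element`, `partition`, `clean`, and `stage` are hypothetical names invented for this example, not the `unstructured` library's API, and the title heuristic and boilerplate rule are toy assumptions.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Element:
    """Hypothetical stand-in for a structured document element."""
    kind: str  # "Title" or "NarrativeText" (illustrative labels)
    text: str


def partition(raw: str) -> List[Element]:
    """Partitioning step: split raw text into typed elements on blank lines."""
    elements = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        block = block.strip()
        if not block:
            continue
        # Toy heuristic (assumption): short blocks with no final period are titles.
        is_title = len(block.split()) <= 6 and not block.endswith(".")
        elements.append(Element("Title" if is_title else "NarrativeText", block))
    return elements


def clean(elements: List[Element]) -> List[Element]:
    """Cleaning step: normalize whitespace and drop boilerplate elements."""
    cleaned = []
    for el in elements:
        text = re.sub(r"\s+", " ", el.text).strip()
        # Toy boilerplate rule (assumption): discard copyright notices.
        if text.lower().startswith("copyright"):
            continue
        cleaned.append(Element(el.kind, text))
    return cleaned


def stage(elements: List[Element]) -> str:
    """Staging step: serialize elements to JSON for a downstream task."""
    return json.dumps([asdict(el) for el in elements])


if __name__ == "__main__":
    raw = "Annual Report\n\nRevenue   grew\nby 10 percent.\n\nCopyright 2023 Example Corp"
    print(stage(clean(partition(raw))))
```

Keeping each stage as a separate function mirrors the idea of bricks: a pipeline for a given document type is assembled by choosing which steps to chain together.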
In this Space, we explore different configurations of deep-learning models, each fine-tuned on datasets that contain a
specific document type and its corresponding annotations.
Main GitHub repository: [Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured)