---
language:
- ko
- en
---

# Document extract

## This model is a LayoutLMv2 base model

> To use this model, you must first preprocess your data with `LayoutLMv2Processor`.
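
A minimal sketch of that preprocessing and of the prediction step, not the author's exact code. The helper names `encode_page` and `predict_labels` are hypothetical. It assumes the processor was created with OCR disabled (`apply_ocr=False`) because the words come from an external OCR service, that the boxes are already scaled to the 0-1000 range LayoutLMv2 expects, that the model is loaded as a token-classification model, and that PyTorch is installed.

```python
def encode_page(processor, image, words, boxes):
    """Encode one page for LayoutLMv2.

    processor: a LayoutLMv2Processor with OCR disabled (assumption).
    image:     a PIL image of the document page.
    words:     list of strings taken from the external OCR result.
    boxes:     list of [x0, y0, x1, y1] boxes, one per word, scaled to 0-1000.
    """
    return processor(
        image,
        words,
        boxes=boxes,
        truncation=True,
        padding="max_length",
        return_tensors="pt",
    )


def predict_labels(model, encoding):
    """Return one predicted label name per token (not per word)."""
    import torch  # imported lazily so encode_page works without PyTorch

    with torch.no_grad():
        outputs = model(**encoding)
    # logits have shape (batch, sequence_length, num_labels); batch is 1 here
    pred_ids = outputs.logits.argmax(-1).squeeze(0).tolist()
    return [model.config.id2label[i] for i in pred_ids]
```

The processor and model would typically be loaded with `LayoutLMv2Processor.from_pretrained(...)` and `LayoutLMv2ForTokenClassification.from_pretrained(...)` using this repository's id. Because the tokenizer splits words into subword tokens, the predictions are per token and usually need to be mapped back to words.
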

## Process

1. I trained this model on Korean-language invoice document images.
2. Text was extracted from the images with the Naver Clova service.
3. A label (target) was assigned to each text box.
4. The image text, bounding box positions, and labels were combined.
5. The combined data was encoded with `LayoutLMv2Processor`.
6. The model runs prediction on the encoded data.

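
Steps 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the author's pipeline: the OCR item format (`text`, `box` in pixels, optional `label`), the label names, and the helpers `normalize_box` and `build_example` are all hypothetical. The one fixed requirement is that LayoutLMv2 expects bounding boxes scaled to the 0-1000 range.

```python
def normalize_box(box, width, height):
    """Scale a pixel box (x0, y0, x1, y1) to the 0-1000 range used by LayoutLMv2."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


def build_example(ocr_items, width, height, label2id, default_label="O"):
    """Combine OCR text, boxes, and labels into parallel lists.

    ocr_items: list of dicts with "text", "box" (pixels), and optional "label"
               (a hypothetical format; adapt it to the real OCR output).
    """
    words, boxes, word_labels = [], [], []
    for item in ocr_items:
        words.append(item["text"])
        boxes.append(normalize_box(item["box"], width, height))
        # Items without a label fall back to the default (outside) label
        word_labels.append(label2id[item.get("label", default_label)])
    return words, boxes, word_labels


# Example with made-up values
items = [
    {"text": "합계", "box": (50, 100, 150, 200), "label": "TOTAL"},
    {"text": "12,000", "box": (200, 100, 300, 200)},
]
words, boxes, word_labels = build_example(
    items, width=500, height=1000, label2id={"O": 0, "TOTAL": 1}
)
```

The three lists line up index by index, so they can be passed to the processor together as the words, `boxes`, and `word_labels` for one page.
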