---
language:
- ko
pipeline_tag: image-to-text
---

# **deplot_kr**

deplot_kr is a Korean image-to-data model based on Google's Pix2Struct architecture: given a chart image, it predicts the underlying data table in text form.
It was fine-tuned from [DePlot](https://huggingface.co/google/deplot) on a dataset of 300,000 Korean chart image-text pairs.

## **How to use**

Pass a chart image to the model, and it predicts the chart's data table as text.

```python
from PIL import Image
from transformers import (
    AutoTokenizer,
    Pix2StructForConditionalGeneration,
    Pix2StructImageProcessor,
    Pix2StructProcessor,
)

# Build the processor from a default image processor and the model's tokenizer.
image_processor = Pix2StructImageProcessor()
tokenizer = AutoTokenizer.from_pretrained("brainventures/deplot_kr")
processor = Pix2StructProcessor(image_processor=image_processor, tokenizer=tokenizer)

model = Pix2StructForConditionalGeneration.from_pretrained("brainventures/deplot_kr")

image_path = "IMAGE_PATH"  # replace with the path to your chart image
image = Image.open(image_path)

# The processor returns `flattened_patches` and `attention_mask` for the model.
inputs = processor(images=image, return_tensors="pt")
pred = model.generate(**inputs, max_length=1024)
print(processor.batch_decode(pred, skip_special_tokens=True)[0])
```

**Model Input Image**
![model_input_image](./sample.jpg)

**Model Output - Prediction**

대상:     
제목: 2011-2021 보건복지 분야 일자리의 <unk>증    
유형: 단일형 일반 세로 <unk>대형    
| 보건(천 명) | 복지(천 명)    
1분위 | 29.7 | 178.4    
2분위 | 70.8 | 97.3    
3분위 | 86.4 | 61.3    
4분위 | 28.2 | 16.0    
5분위 | 52.3 | 0.9    
     
     

### **Preprocessing**

According to [Liu et al.(2023)](https://arxiv.org/pdf/2212.10505.pdf)...     

- Markdown format
- `|` : separates cells (columns)
- `\n` : separates rows
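
The format above can be turned back into rows and cells with a few lines of Python. This is a minimal sketch, not part of the model's own tooling: the helper name `parse_linearized_table` is hypothetical, and the sample string is a shortened copy of the prediction shown earlier.

```python
def parse_linearized_table(text: str) -> list[list[str]]:
    """Split a linearized table into rows of stripped cell strings.

    Rows are separated by newlines and cells by '|', as described above.
    """
    rows = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines between or after rows
        rows.append([cell.strip() for cell in line.split("|")])
    return rows


# Shortened copy of the sample prediction (header row plus two data rows).
sample = "| 보건(천 명) | 복지(천 명)\n1분위 | 29.7 | 178.4\n2분위 | 70.8 | 97.3"
table = parse_linearized_table(sample)

for row in table:
    print(row)
```

The header row starts with `|`, so its first cell comes back as an empty string, which lines up with the row-label column. Lines without any `|`, such as the title line in the sample output, come back as single-cell rows and can be filtered out by length if only the numeric table is needed.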


### **Train**

The model was trained in a TPU environment.
- num_warmup_steps : 1,000
- num_training_steps : 40,000 
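
The card does not state which learning-rate schedule or peak learning rate was used. Purely as an illustration, the sketch below assumes linear warmup followed by linear decay and shows how the two step counts above would shape a learning-rate multiplier under that assumption. The function name `lr_multiplier` is hypothetical.

```python
NUM_WARMUP_STEPS = 1_000      # from the list above
NUM_TRAINING_STEPS = 40_000   # from the list above


def lr_multiplier(step: int) -> float:
    """Return a factor in [0, 1] to scale a peak learning rate at `step`.

    Assumption (not stated in the model card): linear warmup to 1.0,
    then linear decay to 0.0 at the final training step.
    """
    if step < NUM_WARMUP_STEPS:
        return step / max(1, NUM_WARMUP_STEPS)
    remaining = NUM_TRAINING_STEPS - step
    return max(0.0, remaining / max(1, NUM_TRAINING_STEPS - NUM_WARMUP_STEPS))


for step in (0, 500, 1_000, 20_500, 40_000):
    print(step, lr_multiplier(step))
```

If the actual run used a different schedule (for example cosine decay or a constant rate after warmup), only the second branch of the function would change.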

## **Evaluation Results**

This model achieves the following results:

| Metric | Score (%) |
|:---|---:|
| RNSS (Relative Number Set Similarity)| 99.5483 |
| RMS F1 (Relative Mapping Similarity)| 16.6401 |
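
To give a feel for what RNSS measures, here is a small sketch based on the definition in the DePlot paper as we understand it: numbers from the prediction and the target are matched one-to-one at minimum total relative distance, and the score is one minus that cost divided by the size of the larger set. It is not the evaluation code used for the table above. The function names are hypothetical, the matching is brute force (suitable only for tiny inputs), and the handling of a zero target is our own assumption.

```python
from itertools import permutations


def relative_distance(p: float, t: float) -> float:
    """Relative error of prediction p against target t, capped at 1."""
    if t == 0:
        # Assumption: a zero target counts as matched only by an exact zero.
        return 0.0 if p == 0 else 1.0
    return min(1.0, abs(p - t) / abs(t))


def rnss(pred: list[float], target: list[float]) -> float:
    """Sketch of Relative Number Set Similarity on two lists of numbers."""
    if not pred and not target:
        return 1.0
    if not pred or not target:
        return 0.0
    n, m = len(pred), len(target)
    best = float("inf")
    if n <= m:
        # Assign every predicted number to a distinct target number.
        for chosen in permutations(range(m), n):
            cost = sum(relative_distance(pred[i], target[j]) for i, j in enumerate(chosen))
            best = min(best, cost)
    else:
        # Assign every target number to a distinct predicted number.
        for chosen in permutations(range(n), m):
            cost = sum(relative_distance(pred[i], target[j]) for j, i in enumerate(chosen))
            best = min(best, cost)
    # In this sketch, unmatched numbers add no cost beyond the max(n, m) divisor.
    return 1.0 - best / max(n, m)


print(rnss([29.7, 178.4], [178.4, 29.7]))  # same numbers, different order
print(rnss([10.0, 20.0], [10.0, 40.0]))    # one number off by half
```

Because the score looks only at the set of numbers and ignores which row or column each one belongs to, a prediction with the right values in the wrong cells can still score highly. RMS F1 also takes the row and column keys into account, which may help explain the gap between the two scores above.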

## Contact

For questions and comments, please use the discussion tab or email [email protected]