liminghao1630 committed
Commit b5bae50 • Parent(s): 200cf24

Update code example
README.md CHANGED
@@ -23,7 +23,7 @@ You can use the raw model for optical character recognition (OCR) on single text
 Here is how to use this model in PyTorch:
 
 ```python
-from transformers import TrOCRProcessor, VisionEncoderDecoderModel
+from transformers import TrOCRProcessor, VisionEncoderDecoderModel
 from PIL import Image
 import requests
 import torch
@@ -32,13 +32,11 @@ import torch
 url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
 image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
 
-
-# processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-stage1')
-feature_extractor = AutoFeatureExtractor.from_pretrained('microsoft/trocr-small-stage1')
+processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-stage1')
 model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-stage1')
 
 # training
-pixel_values =
+pixel_values = processor(image, return_tensors="pt").pixel_values # Batch size 1
 decoder_input_ids = torch.tensor([[model.config.decoder.decoder_start_token_id]])
 outputs = model(pixel_values=pixel_values, decoder_input_ids=decoder_input_ids)
 ```
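Not part of the commit itself, but as a pointer for readers of the updated example: the snippet ends at a single forward pass with a hand-built `decoder_input_ids`. Below is a minimal sketch, assuming the same `processor`/`model` pair as in the example, of how a training loss could be computed by passing tokenized target text as `labels`, and how inference could be run with `generate` plus `batch_decode`. The target transcription string is a hypothetical placeholder, and since `microsoft/trocr-small-stage1` is a pre-trained (stage 1) checkpoint, generated text may not be meaningful without fine-tuning.

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests

url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-stage1')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-stage1')

pixel_values = processor(image, return_tensors="pt").pixel_values  # Batch size 1

# training: tokenize a target transcription and pass it as labels;
# the model shifts the labels internally to build decoder_input_ids
# ("industry" is a hypothetical placeholder target, not ground truth)
labels = processor.tokenizer("industry", return_tensors="pt").input_ids
outputs = model(pixel_values=pixel_values, labels=labels)
loss = outputs.loss

# inference: autoregressive generation followed by decoding to text
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```

When `labels` are supplied, `VisionEncoderDecoderModel` derives the shifted decoder inputs itself, so the manual `decoder_start_token_id` construction shown in the README example is not needed in that path.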