<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Pipelines for inference[[pipelines-for-inference]]
The [`pipeline`] makes it easy to use any model from the [Hub](https://huggingface.co/models) for inference on language, computer vision, audio, and multimodal tasks. Even if you have no experience with a particular modality or are unfamiliar with the code behind a model, you can still run inference with the [`pipeline`]! This tutorial covers the following.
* How to use a [`pipeline`] for inference
* How to use a specific tokenizer or model
* How to use a [`pipeline`] for language, computer vision, audio, and multimodal tasks
<Tip>
See the [`pipeline`] documentation for the complete list of supported tasks and available parameters.
</Tip>
## Pipeline usage[[pipeline-usage]]
While each task has its own [`pipeline`], it is generally simpler to use the general [`pipeline`] abstraction, which contains all the task-specific pipelines. The [`pipeline`] automatically loads a default model and a preprocessing class capable of inference for your task.
1. Start by creating a [`pipeline`] and specifying a task.
```py
>>> from transformers import pipeline
>>> generator = pipeline(task="automatic-speech-recognition")
```
2. Then pass your input to the [`pipeline`].
```py
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP LIVE UP THE TRUE MEANING OF ITS TREES'}
```
Not the result you expected? Check out the [most downloaded automatic speech recognition models](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads) on the Hub to see if you can get a better result.
Next, let's try [openai/whisper-large](https://huggingface.co/openai/whisper-large).
```py
>>> generator = pipeline(model="openai/whisper-large")
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
```
Much better!
The models on the Hub span many languages and specialized domains, so be sure to look for a model specialized in your language or field.
Without leaving your browser, you can check a model's output directly on the Hub and compare it with other models to see whether it fits your situation better or handles ambiguous inputs better.
If no model fits your situation, you can always [train](training) your own!
If you have several inputs, you can pass them as a list.
```py
generator(
    [
        "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac",
        "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac",
    ]
)
```
If you want to iterate over a whole dataset, or serve inference from a webserver, see the dedicated pages.
[Using pipelines on a dataset](#using-pipelines-on-a-dataset)
[Using pipelines for a webserver](./pipeline_webserver)
## Parameters[[parameters]]
[`pipeline`] supports many parameters. Some are task-specific, and some are general to all pipelines.
In general, you can specify parameters wherever you want.
```py
# `my_parameter` is a placeholder name, not a real pipeline argument.
generator = pipeline(model="openai/whisper-large", my_parameter=1)
out = generator(...)  # This will use `my_parameter=1`.
out = generator(..., my_parameter=2)  # This will override and use `my_parameter=2`.
out = generator(...)  # This will go back to using `my_parameter=1`.
```
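The precedence described here can be illustrated with a toy class. This is only a sketch of the idea, not how the library implements it: `ToyPipeline` and `my_parameter` are hypothetical names, and the class simply merges constructor defaults with call-time arguments.

```python
class ToyPipeline:
    """Toy stand-in for a pipeline; hypothetical, not part of transformers."""

    def __init__(self, **defaults):
        # Parameters given at construction time become the defaults.
        self.defaults = defaults

    def __call__(self, inputs, **overrides):
        # Call-time parameters win for this call only; defaults stay untouched.
        params = {**self.defaults, **overrides}
        return {"inputs": inputs, "params": params}


toy = ToyPipeline(my_parameter=1)
first = toy("some input")                   # uses my_parameter=1
second = toy("some input", my_parameter=2)  # overrides with my_parameter=2
third = toy("some input")                   # back to my_parameter=1
print(first["params"], second["params"], third["params"])
```

Because the override is merged into a fresh dictionary on every call, it never leaks into later calls.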
Let's look at 3 important parameters.
### Device[[device]]
If you specify a device with `device=n`, the pipeline automatically places the model on that device.
This works with both PyTorch and TensorFlow.
```py
generator = pipeline(model="openai/whisper-large", device=0)
```
If the model is too large for a single GPU, you can set `device_map="auto"` to let 🤗 [Accelerate](https://huggingface.co/docs/accelerate) automatically decide how to load and store the model weights.
```py
#!pip install accelerate
generator = pipeline(model="openai/whisper-large", device_map="auto")
```
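If the same script has to run on machines with and without a GPU, a small helper can choose the device index at runtime. This is a minimal sketch under two assumptions: PyTorch is the backend, and `-1` selects the CPU for the integer `device` argument. `pick_device` is a hypothetical helper name, not a library function.

```python
def pick_device() -> int:
    """Return 0 if a CUDA GPU is visible to PyTorch, otherwise -1 (CPU)."""
    try:
        import torch  # assumption: PyTorch backend
    except ImportError:
        # Without PyTorch installed we cannot probe for a GPU; fall back to CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(device)
```

The result can then be passed as `device=device` when creating the pipeline.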
### Batch size[[batch-size]]
By default, pipelines do not batch inference, for the reasons explained [here](https://huggingface.co/docs/transformers/main_classes/pipelines#pipeline-batching). In short, batching is not necessarily faster and can even be slower in some cases.
But if it suits your use case, you can use it like this.
```py
generator = pipeline(model="openai/whisper-large", device=0, batch_size=2)
audio_filenames = [f"audio_{i}.flac" for i in range(10)]
texts = generator(audio_filenames)
```
This runs the pipeline on the 10 provided audio files, passing them to the model in batches of 2 (on a GPU, where batching is more likely to help) without any extra code from you.
The output should always match what you would get without batching. Batching is only meant as one way to get more speed out of a pipeline.
Pipelines can also relieve some of the complexity of batching. Some inputs (a long audio file, for example) must be split into multiple parts before a model can process them; this is called [*chunk batching*](./main_classes/pipelines#pipeline-chunk-batching), and the pipeline does the splitting for you.
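Conceptually, `batch_size=2` groups the inputs as in the sketch below. This is plain Python for illustration only; `batched` is a hypothetical helper, and the pipeline's real batching (padding, collation, data loading) is more involved.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` items, in order."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


audio_filenames = [f"audio_{i}.flac" for i in range(10)]
batches = list(batched(audio_filenames, batch_size=2))
print(len(batches))  # 5
print(batches[0])    # ['audio_0.flac', 'audio_1.flac']
```

Flattening the batches gives back the original list in the original order, which is why batching should not change the output.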
### Task-specific parameters[[task-specific-parameters]]
Every task provides task-specific parameters that give you additional flexibility and options.
For example, the [`transformers.AutomaticSpeechRecognitionPipeline.__call__`] method has a `return_timestamps` parameter, which looks useful for subtitling videos.
```py
>>> # Not using whisper, as it cannot provide timestamps.
>>> generator = pipeline(model="facebook/wav2vec2-large-960h-lv60-self", return_timestamps="word")
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP AND LIVE OUT THE TRUE MEANING OF ITS CREED', 'chunks': [{'text': 'I', 'timestamp': (1.22, 1.24)}, {'text': 'HAVE', 'timestamp': (1.42, 1.58)}, {'text': 'A', 'timestamp': (1.66, 1.68)}, {'text': 'DREAM', 'timestamp': (1.76, 2.14)}, {'text': 'BUT', 'timestamp': (3.68, 3.8)}, {'text': 'ONE', 'timestamp': (3.94, 4.06)}, {'text': 'DAY', 'timestamp': (4.16, 4.3)}, {'text': 'THIS', 'timestamp': (6.36, 6.54)}, {'text': 'NATION', 'timestamp': (6.68, 7.1)}, {'text': 'WILL', 'timestamp': (7.32, 7.56)}, {'text': 'RISE', 'timestamp': (7.8, 8.26)}, {'text': 'UP', 'timestamp': (8.38, 8.48)}, {'text': 'AND', 'timestamp': (10.08, 10.18)}, {'text': 'LIVE', 'timestamp': (10.26, 10.48)}, {'text': 'OUT', 'timestamp': (10.58, 10.7)}, {'text': 'THE', 'timestamp': (10.82, 10.9)}, {'text': 'TRUE', 'timestamp': (10.98, 11.18)}, {'text': 'MEANING', 'timestamp': (11.26, 11.58)}, {'text': 'OF', 'timestamp': (11.66, 11.7)}, {'text': 'ITS', 'timestamp': (11.76, 11.88)}, {'text': 'CREED', 'timestamp': (12.0, 12.38)}]}
```
As you can see, the model not only inferred the text but also output when each word was spoken.
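To show how such timestamps could feed a subtitle file, here is a minimal sketch that converts `chunks` in the format above into SRT cues. The helper names `to_srt_time` and `chunks_to_srt` are hypothetical, and emitting one cue per word is a simplification; real subtitles would usually group words into phrases.

```python
def to_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks) -> str:
    """Turn pipeline-style chunks into numbered SRT cues, one cue per chunk."""
    cues = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        cues.append(f"{index}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{chunk['text']}\n")
    return "\n".join(cues)


# The first few chunks, copied from the output above.
chunks = [
    {"text": "I", "timestamp": (1.22, 1.24)},
    {"text": "HAVE", "timestamp": (1.42, 1.58)},
    {"text": "A", "timestamp": (1.66, 1.68)},
    {"text": "DREAM", "timestamp": (1.76, 2.14)},
]
srt = chunks_to_srt(chunks)
print(srt)
```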
Each task has many parameters available, so check the API reference of the task you want to see which ones you can adjust!
The [`~transformers.AutomaticSpeechRecognitionPipeline`] used so far also has a `chunk_length_s` parameter. It is useful for very long audio files that a model typically cannot handle on its own, for example when subtitling a movie or an hour-long video.
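The idea behind chunking a long recording can be sketched as overlapping time windows. This is a simplified illustration, not the pipeline's actual algorithm: `chunk_windows` and `overlap_s` are hypothetical names, and the real implementation handles strides and the merging of chunk outputs differently.

```python
from typing import List, Tuple


def chunk_windows(total_s: float, chunk_length_s: float, overlap_s: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start, end) windows in seconds that together cover [0, total_s].

    Consecutive windows overlap by `overlap_s` seconds.
    """
    if chunk_length_s <= overlap_s:
        raise ValueError("chunk_length_s must be larger than overlap_s")
    step = chunk_length_s - overlap_s
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


# A one-hour recording cut into 30-second windows that overlap by 5 seconds.
windows = chunk_windows(total_s=3600.0, chunk_length_s=30.0, overlap_s=5.0)
print(len(windows), windows[0], windows[-1])
```

The overlap gives each window some shared context with its neighbour, so words at a boundary are less likely to be cut in half.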
If you can't find a parameter that would help, feel free to [request it](https://github.com/huggingface/transformers/issues/new?assignees=&labels=feature&template=feature-request.yml)!
## Using pipelines on a dataset[[using-pipelines-on-a-dataset]]
Pipelines can also run inference on a large dataset. We recommend using an iterator for this.
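```py
from transformers import pipeline


def data():
    # Yield inputs one at a time instead of building a full list in memory.
    for i in range(1000):
        yield f"My example {i}"


pipe = pipeline(model="gpt2", device=0)
generated_characters = 0
for out in pipe(data()):
    # Text generation returns a list of candidates per input; take the first.
    generated_characters += len(out[0]["generated_text"])
```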
The iterator `data()` yields each item on demand. The pipeline automatically recognizes that the input is iterable and starts fetching new data while the existing data is still being processed on the GPU (it uses a [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) under the hood). This matters because you do not need to load the whole dataset into memory, and the GPU can be fed new work as fast as possible.
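The laziness of a generator can be checked without any model. The sketch below uses only the standard library; the `produced` list is a hypothetical addition that records when each item is actually created.

```python
from itertools import islice

produced = []


def data():
    for i in range(1000):
        produced.append(i)  # record the moment an item is actually created
        yield f"My example {i}"


# Consume only the first three items; the remaining 997 are never built.
first_three = list(islice(data(), 3))
print(first_three)
print(len(produced))
```

Only the items that are consumed ever exist in memory, which is what lets an iterator stand in for a dataset that would not fit in RAM.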
Since batching may be faster, it is also worth tuning the `batch_size` parameter.
The simplest way to iterate over a dataset is to load one with 🤗 [Datasets](https://github.com/huggingface/datasets/).
```py
# KeyDataset is a util that will just output the item we're interested in.
from datasets import load_dataset
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

pipe = pipeline(model="hf-internal-testing/tiny-random-wav2vec2", device=0)
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation[:10]")
# KeyDataset takes the dataset and the name of the column to read.
for out in pipe(KeyDataset(dataset, "audio")):
    print(out)
```
## Using pipelines for a webserver[[using-pipelines-for-a-webserver]]
<Tip>
Creating an inference engine is a complex topic that deserves its own page.
</Tip>
[Link](./pipeline_webserver)
## Vision pipeline[[vision-pipeline]]
Using a [`pipeline`] for vision tasks is practically identical.
Specify your task and pass your image to the classifier. The image can be a link or a local path. For example, what species of cat is shown below?

```py
>>> from transformers import pipeline
>>> vision_classifier = pipeline(model="google/vit-base-patch16-224")
>>> preds = vision_classifier(
... images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.4335, 'label': 'lynx, catamount'}, {'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}, {'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}, {'score': 0.0239, 'label': 'Egyptian cat'}, {'score': 0.0229, 'label': 'tiger cat'}]
```
### Text pipeline[[text-pipeline]]
Using a [`pipeline`] for NLP tasks is practically identical as well.
```py
>>> from transformers import pipeline
>>> # This model is a `zero-shot-classification` model.
>>> # It will classify text, except you are free to choose any label you might imagine
>>> classifier = pipeline(model="facebook/bart-large-mnli")
>>> classifier(
... "I have a problem with my iphone that needs to be resolved asap!!",
... candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"],
... )
{'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['urgent', 'phone', 'computer', 'not urgent', 'tablet'], 'scores': [0.504, 0.479, 0.013, 0.003, 0.002]}
```
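The returned `labels` and `scores` lists are aligned by position, so they are easy to post-process. The sketch below pairs them up and keeps the labels above a cutoff; the `threshold` value is a hypothetical choice for illustration, not a recommended setting.

```python
# Output copied from the example above.
result = {
    "sequence": "I have a problem with my iphone that needs to be resolved asap!!",
    "labels": ["urgent", "phone", "computer", "not urgent", "tablet"],
    "scores": [0.504, 0.479, 0.013, 0.003, 0.002],
}

# Pair each label with its score.
label_scores = dict(zip(result["labels"], result["scores"]))

threshold = 0.1  # hypothetical cutoff
selected = [label for label, score in label_scores.items() if score >= threshold]
print(selected)
```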
### Multimodal pipeline[[multimodal-pipeline]]
The [`pipeline`] supports more than one modality (translator's note: a modality is a form of data such as audio, video, or text). For example, a visual question answering (VQA) task uses both text and an image. Feel free to pass any image link and any question you want to ask about it. The image can be a URL or a local path.
For example, to ask for the invoice number in this [invoice image](https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png):
```py
>>> from transformers import pipeline
>>> vqa = pipeline(model="impira/layoutlm-document-qa")
>>> vqa(
... image="https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png",
... question="What is the invoice number?",
... )
[{'score': 0.42514941096305847, 'answer': 'us-001', 'start': 16, 'end': 16}]
```