File size: 1,652 Bytes
db5855f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# OpenVINO Tokenizers: Incorporate Text Processing Into OpenVINO Pipelines

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/eaidova/openvino_notebooks_binder.git/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fopenvinotoolkit%252Fopenvino_notebooks%26urlpath%3Dtree%252Fopenvino_notebooks%252Fnotebooks%2Fopenvino-tokenizers%2Fopenvino-tokenizers.ipynb)
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/openvino-tokenizers/openvino-tokenizers.ipynb)

<center><img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/51917466/047f9167-a4ef-4d3d-a33b-d124541f9e2c"></center>

OpenVINO Tokenizers is an OpenVINO extension and a Python library designed to streamline tokenizer conversion for seamless integration into your projects. It supports Python and C++ environments and is compatible with all major platforms: Linux, Windows, and MacOS.

## Notebook Contents
The tutorial consists of the following steps:
- Explain the basics of tokenization
- Install OpenVINO Tokenizers
- Convert tokenizer from HuggingFace Hub using CLI and Python API
- Create a Text Generation pipeline with OpenVINO tokenizer and detokenizer
- Combine an OpenVINO tokenizer with a classification model

## Installation Instructions

This is a self-contained example that relies solely on its own code.</br>
We recommend  running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to [Installation Guide](../../README.md).