---
language:
- fa
library_name: hezar
tags:
- feature-extraction
- hezar
pipeline_tag: feature-extraction
---

This is the original fastText word embedding model for Persian, taken from the [fastText crawl vectors](https://fasttext.cc/docs/en/crawl-vectors.html#models) page, loaded and converted with Gensim, and exported to a Hezar-compatible format.

For more information, see the [fastText documentation](https://fasttext.cc/docs/en/support.html).

To use this model in Hezar, install the library and then load the embedding as shown in the Python snippet:

```bash
pip install hezar
```

```python
from hezar.embeddings import Embedding

fasttext = Embedding.load("hezarai/fasttext-fa-300")

# Get embedding vector
vector = fasttext("هزار")

# Find the word that doesn't match with the rest
doesnt_match = fasttext.doesnt_match(["خانه", "اتاق", "ماشین"])

# Find the top-n most similar words to the given word
most_similar = fasttext.most_similar("هزار", top_n=5)

# Find the cosine similarity value between two words
similarity = fasttext.similarity("مهندس", "دکتر")
```
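
For readers unfamiliar with the cosine similarity mentioned in the snippet, here is a minimal, standalone sketch of the measure. It does not use Hezar and is not a description of Hezar's internals. The function `cosine_similarity` and the 3-dimensional `toy_vectors` are hypothetical and made up for illustration; the real model returns 300-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (illustration only, not taken from the model).
toy_vectors = {
    "engineer": [0.9, 0.1, 0.3],
    "doctor": [0.8, 0.2, 0.4],
    "car": [0.1, 0.9, 0.2],
}

# Vectors pointing in similar directions score closer to 1.
sim_close = cosine_similarity(toy_vectors["engineer"], toy_vectors["doctor"])
sim_far = cosine_similarity(toy_vectors["engineer"], toy_vectors["car"])

print(f"engineer vs doctor: {sim_close:.3f}")
print(f"engineer vs car:    {sim_far:.3f}")
```

The score ranges from -1 to 1 and depends only on the direction of the vectors, not their length, which is why it is a common choice for comparing word embeddings.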