first commit

Files changed (4)

README.md +12 -101
metadata.yaml +0 -24
train.csv +0 -0
validation.csv +0 -0

README.md CHANGED

@@ -1,107 +1,18 @@
 ---
-pipeline_tag: sentence-similarity
-license: apache-2.0
 tags:
-- sentence-transformers
-- feature-extraction
-- sentence-similarity
-- transformers
 ---
-# sentence-transformers/paraphrase-multilingual-mpnet-base-v2
-This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.
-## Usage (Sentence-Transformers)
-Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
-```
-pip install -U sentence-transformers
-```
-Then you can use the model like this:
-```python
-from sentence_transformers import SentenceTransformer
-sentences = ["This is an example sentence", "Each sentence is converted"]
-model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
-embeddings = model.encode(sentences)
-print(embeddings)
-```
-## Usage (HuggingFace Transformers)
-Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: First, you pass your input through the transformer model, then you have to apply the right pooling-operation on-top of the contextualized word embeddings.
-```python
-from transformers import AutoTokenizer, AutoModel
-import torch
-#Mean Pooling - Take attention mask into account for correct averaging
-def mean_pooling(model_output, attention_mask):
-    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
-    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
-    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
-# Sentences we want sentence embeddings for
-sentences = ['This is an example sentence', 'Each sentence is converted']
-# Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
-model = AutoModel.from_pretrained('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
-# Tokenize sentences
-encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
-# Compute token embeddings
-with torch.no_grad():
-    model_output = model(**encoded_input)
-# Perform pooling. In this case, max pooling.
-sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
-print("Sentence embeddings:")
-print(sentence_embeddings)
-```
-## Evaluation Results
-For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=sentence-transformers/paraphrase-multilingual-mpnet-base-v2)
-## Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
-  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
-)
-```
-## Citing & Authors
-This model was trained by [sentence-transformers](https://www.sbert.net/).
-If you find this model helpful, feel free to cite our publication [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084):
-```bibtex
-@inproceedings{reimers-2019-sentence-bert,
-    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
-    author = "Reimers, Nils and Gurevych, Iryna",
-    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
-    month = "11",
-    year = "2019",
-    publisher = "Association for Computational Linguistics",
-    url = "http://arxiv.org/abs/1908.10084",
-}
-```

 ---
+language:
+- en  # Example: en
+license: MIT  # Example: apache-2.0 or any license from https://hf.co/docs/hub/model-repos#list-of-license-identifiers
 tags:
+- text-generation
+datasets:
+- waiting-messages  # Example: common_voice. Use dataset id from https://hf.co/datasets
+widget:
+- text: 'List of funny waiting messages:'
+  example_title: 'Funny waiting messages'
 ---
+# Langame/gpt2-waiting
+This fine-tuned model can generate funny waiting messages.
+[Langame](https://langa.me) uses these within its platform 😛.

metadata.yaml DELETED

@@ -1,24 +0,0 @@
-language:
-  - "List of ISO 639-1 code for your language"
-  - lang1
-  - lang2
-thumbnail: "url to a thumbnail used in social sharing"
-tags:
-- tag1
-- tag2
-license: "any valid license identifier"
-datasets:
-- dataset1
-- dataset2
-metrics:
-- metric1
-- metric2
-widget:
-- text: "Is this review positive or negative? Review: Best cast iron skillet you will every buy."
-  example_title: "Sentiment analysis"
-- text: "Barack Obama nominated Hilary Clinton as his secretary of state on Monday. He chose her because she had ..."
-  example_title: "Coreference resolution"
-- text: "On a shelf, there are five books: a gray book, a red book, a purple book, a blue book, and a black book ..."
-  example_title: "Logic puzzles"
-- text: "The two men running to become New York City's next mayor will face off in their first debate Wednesday night ..."
-  example_title: "Reading comprehension"

train.csv ADDED

The diff for this file is too large to render.

validation.csv ADDED

The diff for this file is too large to render.