Chi Honolulu committed on
Commit
31eb429
·
1 Parent(s): b5047e6

Upload README.md

Files changed (1)
  1. README.md +45 -18
README.md CHANGED
@@ -3,25 +3,23 @@
  # Doc / guide: https://huggingface.co/docs/hub/model-cards
  license: mit
  language:
- - multilingual
  ---
  # Model Card for xlm-roberta-large-binary-cs-iib

  <!-- Provide a quick summary of what the model is/does. -->

- This model is fine-tuned for text classification of Supportive Interactions in Instant Messenger dialogs of Adolescents. The classification is binary and the model outputs probablities for labels> 0,1: Supportive Interactions present or not.

- ## Model Details

- ### Model Description
-
- Fine-tuned on the machine-translated version of a dataset of Instant Messenger dialogs of Adolescents originally in the Czech language.

  - **Developed by:** Anonymous
- - **Language(s):** multi-lingual
- - **Finetuned from:** xlm-roberta-large

- ### Model Sources

  <!-- Provide the basic links for the model. -->

@@ -32,12 +30,41 @@ Fine-tuned on the machine-translated version of a dataset of Instant Messenger d
  Here is how to use this model to classify a context-window of a dialogue:

  ```python
- from transformers import AutoTokenizer, AutoModel
- tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-large')
- model = AutoModelForSequenceClassification.from_pretrained("chi2024/xlm-roberta-large-binary-cs-iib")
- # prepare input
- utterances = "Hi, how are you?;I am fine, how about you?;Thanks for asking."
- encoded_input = tokenizer(text, return_tensors='pt')
- # forward pass
- output = model(**encoded_input)
- ```
 
  # Doc / guide: https://huggingface.co/docs/hub/model-cards
  license: mit
  language:
+ - cs
  ---
  # Model Card for xlm-roberta-large-binary-cs-iib

  <!-- Provide a quick summary of what the model is/does. -->

+ This model is fine-tuned for binary text classification of Supportive Interactions in Czech Instant Messenger dialogs of Adolescents.

+ ## Model Description

+ The model was fine-tuned on a Czech dataset of Instant Messenger dialogs of Adolescents. The classification is binary, and the model outputs probabilities for the labels {0, 1}: Supportive Interactions present or not.

  - **Developed by:** Anonymous
+ - **Language(s):** cs
+ - **Finetuned from:** xlm-roberta-large

+ ## Model Sources

  <!-- Provide the basic links for the model. -->

 
  Here is how to use this model to classify a context-window of a dialogue:

  ```python
+ import numpy as np
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ # Prepare input texts. This model is pretrained and fine-tuned for Czech.
+ test_texts = ['Utterance1;Utterance2;Utterance3']
+
+ # Load the model and tokenizer
+ model = AutoModelForSequenceClassification.from_pretrained(
+     'chi2024/xlm-roberta-large-binary-cs-iib', num_labels=2).to("cuda")
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     'chi2024/xlm-roberta-large-binary-cs-iib',
+     use_fast=False, truncation_side='left')
+ assert tokenizer.truncation_side == 'left'
+
+ # Define helper functions
+ def get_probs(text, tokenizer, model):
+     inputs = tokenizer(text, padding=True, truncation=True, max_length=256,
+                        return_tensors="pt").to("cuda")
+     outputs = model(**inputs)
+     return outputs[0].softmax(1)
+
+ def preds2class(probs, threshold=0.5):
+     pclasses = np.zeros(probs.shape)
+     pclasses[np.where(probs >= threshold)] = 1
+     return pclasses.argmax(-1)
+
+ def print_predictions(texts):
+     probabilities = [get_probs(
+         texts[i], tokenizer, model).cpu().detach().numpy()[0]
+         for i in range(len(texts))]
+     predicted_classes = preds2class(np.array(probabilities))
+     for c, p in zip(predicted_classes, probabilities):
+         print(f'{c}: {p}')
+
+ # Run the prediction
+ print_predictions(test_texts)
+ ```
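
The usage snippet added in this commit needs a GPU and the model weights. As a rough offline illustration of the two steps around the model call, the sketch below builds a context window by joining utterances with `;` (the separator used in `test_texts`) and turns a pair of logits into probabilities and a label. The helper names `build_context_window`, `softmax`, and `logits_to_label`, the window size of 3, the example dialogue, and the logits are all assumptions for illustration and are not part of the model card. The thresholding rule is a simplified stand-in for `preds2class`, not a copy of it.

```python
import numpy as np


def build_context_window(utterances, window_size=3):
    """Join the most recent utterances with ';' (window_size is an assumed value)."""
    return ';'.join(utterances[-window_size:])


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def logits_to_label(logits, threshold=0.5):
    """Map one pair of logits to (label, probabilities).

    Simplified rule (an assumption): label 1 if P(class 1) >= threshold, else 0.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    label = int(probs[1] >= threshold)
    return label, probs


# Hypothetical dialogue; only the last three utterances form the window.
dialogue = ['Ahoj', 'Jak je?', 'Fajn, a ty?', 'Taky fajn.']
window = build_context_window(dialogue)

# Made-up logits standing in for real model output.
label, probs = logits_to_label([-1.0, 2.0])

print(window)
print(label, probs)
```

With the real model, the logits would come from `outputs[0]` in `get_probs` rather than from a hard-coded pair.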