---
datasets:
- natural_instructions
- the_pile
- cot
- Muennighoff/P3
inference:
parameters:
max_new_tokens: 5
temperature: 1.0
top_k: 1
language:
- en
pipeline_tag: text-generation
widget:
-
example_title: "ADE Corpus V2"
text: |-
Label the sentence based on whether it is related to an adverse drug effect (ADE). Details are described below:
Drugs: Names of drugs and chemicals that include brand names, trivial names, abbreviations and systematic names were annotated. Mentions of drugs or chemicals should strictly be in a therapeutic context. This category does not include the names of metabolites, reaction byproducts, or hospital chemicals (e.g. surgical equipment disinfectants).
Adverse effect: Mentions of adverse effects include signs, symptoms, diseases, disorders, acquired abnormalities, deficiencies, organ damage or death that strictly occur as a consequence of drug intake.
Possible labels:
1. ADE-related
2. not ADE-related
Sentence: A challenge with clozapine was feasible and showed no clinical symptoms of eosinophilia.
Label: not ADE-related
Sentence: CONCLUSIONS: These results suggest that clozapine may cause TD; however, the prevalence is low and the severity is relatively mild, with no or mild self-reported discomfort.
Label: ADE-related
Sentence: Best-corrected visual acuity measurements were performed at every visit.
Label: not ADE-related
Sentence: These cases were considered unusual in light of the short delay of their onset after initiation of immunosuppressive therapy and their fulminant course: 3 of these patients died of PCP occurring during the first month of treatment with prednisone.
Label: ADE-related
Sentence: The INR should be monitored more frequently when bosentan is initiated, adjusted, or discontinued in patients taking warfarin.
Label: not ADE-related
Sentence: NEH must be considered in lupus patients receiving cytotoxic agents to avoid inappropriate use of corticosteroids or antibiotics in this self-limited condition.
Label:
-
example_title: Banking77
text: |-
The following is a banking customer service query. Classify the query into one of the 77 categories available.
Possible labels:
1. Refund_not_showing_up
2. activate_my_card
3. age_limit
4. apple_pay_or_google_pay
5. atm_support
6. automatic_top_up
7. balance_not_updated_after_bank_transfer
8. balance_not_updated_after_cheque_or_cash_deposit
9. beneficiary_not_allowed
10. cancel_transfer
11. card_about_to_expire
12. card_acceptance
13. card_arrival
14. card_delivery_estimate
15. card_linking
16. card_not_working
17. card_payment_fee_charged
18. card_payment_not_recognised
19. card_payment_wrong_exchange_rate
20. card_swallowed
21. cash_withdrawal_charge
22. cash_withdrawal_not_recognised
23. change_pin
24. compromised_card
25. contactless_not_working
26. country_support
27. declined_card_payment
28. declined_cash_withdrawal
29. declined_transfer
30. direct_debit_payment_not_recognised
31. disposable_card_limits
32. edit_personal_details
33. exchange_charge
34. exchange_rate
35. exchange_via_app
36. extra_charge_on_statement
37. failed_transfer
38. fiat_currency_support
39. get_disposable_virtual_card
40. get_physical_card
41. getting_spare_card
42. getting_virtual_card
43. lost_or_stolen_card
44. lost_or_stolen_phone
45. order_physical_card
46. passcode_forgotten
47. pending_card_payment
48. pending_cash_withdrawal
49. pending_top_up
50. pending_transfer
51. pin_blocked
52. receiving_money
53. request_refund
54. reverted_card_payment?
55. supported_cards_and_currencies
56. terminate_account
57. top_up_by_bank_transfer_charge
58. top_up_by_card_charge
59. top_up_by_cash_or_cheque
60. top_up_failed
61. top_up_limits
62. top_up_reverted
63. topping_up_by_card
64. transaction_charged_twice
65. transfer_fee_charged
66. transfer_into_account
67. transfer_not_received_by_recipient
68. transfer_timing
69. unable_to_verify_identity
70. verify_my_identity
71. verify_source_of_funds
72. verify_top_up
73. virtual_card_not_working
74. visa_or_mastercard
75. why_verify_identity
76. wrong_amount_of_cash_received
77. wrong_exchange_rate_for_cash_withdrawal
Query: My card payment was not successful.
Label: declined_card_payment
Query: Is it possible for me to change my PIN number?
Label: change_pin
Query: limits on top ups
Label: top_up_limits
Query: I live in the EU - can I get a card?
Label: country_support
Query: How can I tell the source for my available funds?
Label: verify_source_of_funds
Query: Why am I getting declines when trying to make a purchase online?
Label:
-
example_title: Overruling
text: |-
In law, an overruling sentence is a statement that nullifies a previous case decision as a precedent, by a constitutionally valid statute or a decision by the same or higher ranking court which establishes a different rule on the point of law involved. Label the sentence based on whether it is overruling or not.
Possible labels:
1. not overruling
2. overruling
Sentence: see mciver, 134 n.c.app. at 588, 518 s.e.2d at 526.
Label: not overruling
Sentence: to the extent that paprskar v. state, supra, applied the general test of waiver of constitutional rights set forth in johnson v. zerbst, supra, it is no longer viable.
Label: overruling
Sentence: narrowstep, 2010 wl 5422405, at *12.
Label: not overruling
Sentence: accordingly, to the extent of any conflict nemecek v. state, 621 s.w.2d 404 (tex.cr.app. 1980) is overruled.
Label: overruling
Sentence: the following facts are taken from the administrative record.
Label: not overruling
Sentence: see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003, 1004 (2016).
Label:
-
example_title: "Tweet Eval Hate"
text: |-
Label whether the following tweet contains hate speech against either immigrants or women. Hate Speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics.
Possible labels:
1. hate speech
2. not hate speech
Tweet: #TakeAKnee trump's a chump! @user is the real deal-helping so many on his journey through greatness! Even @user can't stand #IQ45-stands to CLEAN UP while he's in prison. One hot, lucky #Immigrant babe! #OHi12 #VoteBlue @user in the #Columbus area. #BlueWave baby!
Label: not hate speech
Tweet: HOW REFRESHING! In South Korea, there is no such thing as 'political correctness" when it comes to dealing with Muslim refugee wannabes via @user
Label: hate speech
Tweet: New to Twitter-- any men on here know what the process is to get #verified?
Label: not hate speech
Tweet: UK Pensioner Faces 350 Lashes In Saudi Arabia why does this country exist it does nothing for migrants picks on old men no help from anyone
Label: not hate speech
Tweet: RT @user Her:I don't get what u want outta this relationship Him:Well, I was only looking for a bj but u kept coming back
Label: not hate speech
Tweet: Dont worry @user you are and will always be the most hysterical woman.
Label:
---
<h1 style="font-size: 42px">TOGETHER RESEARCH</h1>
# Model Summary
We present GPT-JT, a fork of GPT-J-6B trained for 20,000 steps that outperforms most 100B+ parameter models at classification and improves performance on most tasks. GPT-JT was trained with a new decentralized algorithm over a 1G interconnect.
GPT-JT is a bidirectional dense model, trained with the UL2 objective on NI, P3, COT, and the Pile data.
**Please check out our demo: [TOMA-app](https://huggingface.co/spaces/togethercomputer/TOMA-app).**
# Quick Start
```python
from transformers import pipeline
pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')
pipe('''Please answer the following question:\n\nQuestion: Where is Zurich?\nAnswer:''')
```
Alternatively, load the tokenizer and model directly:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-JT-6B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1")
```
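The widget examples in this card's metadata share a common few-shot layout: an instruction, a numbered list of possible labels, a few labeled examples, and a final query ending in an empty `Label:`. The helper below is a minimal sketch of how such a prompt could be assembled. The function name `build_prompt` and its parameters are hypothetical and not part of the released code, and the exact whitespace the model expects is an assumption.

```python
def build_prompt(instruction, labels, examples, query, field="Sentence"):
    """Assemble a few-shot classification prompt.

    Hypothetical helper: it mirrors the layout of the widget examples in
    this card (instruction, numbered labels, labeled examples, open query).
    """
    lines = [instruction, "Possible labels:"]
    # Number the labels starting from 1, as in the widget examples.
    lines += [f"{i}. {label}" for i, label in enumerate(labels, start=1)]
    # Each demonstration is a text line followed by its label line.
    for text, label in examples:
        lines += [f"{field}: {text}", f"Label: {label}"]
    # The final query leaves the label empty for the model to complete.
    lines += [f"{field}: {query}", "Label:"]
    return "\n".join(lines)


prompt = build_prompt(
    instruction="Label the sentence based on whether it is overruling or not.",
    labels=["not overruling", "overruling"],
    examples=[
        ("the following facts are taken from the administrative record.",
         "not overruling"),
    ],
    query="see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003, 1004 (2016).",
)
print(prompt)
```

The resulting string can be passed to the `pipe` object from the first snippet. Since the labels are short, a small generation budget such as `max_new_tokens=5` (the value in this card's widget settings) is likely sufficient.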
# Training Data
We fine-tune [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6B) on NI, P3, COT, and the Pile data:
- [Natural-Instructions](https://github.com/allenai/natural-instructions)
- [P3](https://huggingface.co/datasets/Muennighoff/P3)
- [MMLU-COT](https://github.com/jasonwei20/flan-2/blob/main/mmlu-cot.json)
- [The Pile](https://huggingface.co/datasets/the_pile)
# Hyperparameters
We used AdamW with a learning rate of 1e-5 and a global batch size of 64, and trained for 20k steps.
We used mixed-precision training, where activations are in FP16 while the optimizer states are kept in FP32.
We used both data parallelism and pipeline parallelism to conduct training.
During training, we truncate each input sequence to 2048 tokens. For input sequences that contain fewer than 2048 tokens, we concatenate multiple sequences into one long sequence to improve data efficiency.
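The truncate-and-concatenate step described above can be sketched as follows. This is an illustrative sketch, not the training code used for GPT-JT: the function name `pack_sequences`, the optional separator token `sep_id`, and the choice to let a sequence spill across a chunk boundary are all assumptions.

```python
from typing import Iterable, Iterator, List, Optional


def pack_sequences(
    sequences: Iterable[List[int]],
    max_len: int = 2048,
    sep_id: Optional[int] = None,
) -> Iterator[List[int]]:
    """Truncate each token sequence to max_len, then pack them into chunks.

    Hypothetical helper: sequences are concatenated into one token stream
    and cut into chunks of exactly max_len tokens (the last may be shorter).
    """
    buffer: List[int] = []
    for seq in sequences:
        # Long sequences are truncated to the maximum length first.
        buffer.extend(seq[:max_len])
        # Assumption: optionally mark the boundary between sequences.
        if sep_id is not None:
            buffer.append(sep_id)
        # Emit full chunks as soon as enough tokens are available.
        while len(buffer) >= max_len:
            yield buffer[:max_len]
            buffer = buffer[max_len:]
    # Emit whatever is left as a final, possibly shorter, chunk.
    if buffer:
        yield buffer


# Toy example with max_len=4 instead of 2048.
packed = list(pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11]], max_len=4))
print(packed)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
```

Packing this way means almost every position in a batch carries a real token rather than padding, which is the data-efficiency gain the paragraph refers to.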
# Infrastructure
We used [the Together Research Computer](https://together.xyz/) to conduct training.