datasets:
- natural_instructions
- the_pile
- cot
- Muennighoff/P3
inference:
  parameters:
    max_new_tokens: 5
    temperature: 1
    top_k: 1
language:
- en
pipeline_tag: text-generation
widget:
- example_title: ADE Corpus V2
text: >-
Label the sentence based on whether it is related to an adverse drug
effect (ADE). Details are described below:
Drugs: Names of drugs and chemicals that include brand names, trivial
names, abbreviations and systematic names were annotated. Mentions of
drugs or chemicals should strictly be in a therapeutic context. This
category does not include the names of metabolites, reaction byproducts,
or hospital chemicals (e.g. surgical equipment disinfectants).
Adverse effect: Mentions of adverse effects include signs, symptoms,
diseases, disorders, acquired abnormalities, deficiencies, organ damage or
death that strictly occur as a consequence of drug intake.
Possible labels:
1. ADE-related
2. not ADE-related
Sentence: A challenge with clozapine was feasible and showed no clinical
symptoms of eosinophilia.
Label: not ADE-related
Sentence: CONCLUSIONS: These results suggest that clozapine may cause TD;
however, the prevalence is low and the severity is relatively mild, with
no or mild self-reported discomfort.
Label: ADE-related
Sentence: Best-corrected visual acuity measurements were performed at
every visit.
Label: not ADE-related
Sentence: These cases were considered unusual in light of the short delay
of their onset after initiation of immunosuppressive therapy and their
fulminant course: 3 of these patients died of PCP occurring during the
first month of treatment with prednisone.
Label: ADE-related
Sentence: The INR should be monitored more frequently when bosentan is
initiated, adjusted, or discontinued in patients taking warfarin.
Label: not ADE-related
Sentence: NEH must be considered in lupus patients receiving cytotoxic
agents to avoid inappropriate use of corticosteroids or antibiotics in
this self-limited condition.
Label:
- example_title: Banking77
text: >-
The following is a banking customer service query. Classify the query into
one of the 77 categories available.
Possible labels:
1. Refund_not_showing_up
2. activate_my_card
3. age_limit
4. apple_pay_or_google_pay
5. atm_support
6. automatic_top_up
7. balance_not_updated_after_bank_transfer
8. balance_not_updated_after_cheque_or_cash_deposit
9. beneficiary_not_allowed
10. cancel_transfer
11. card_about_to_expire
12. card_acceptance
13. card_arrival
14. card_delivery_estimate
15. card_linking
16. card_not_working
17. card_payment_fee_charged
18. card_payment_not_recognised
19. card_payment_wrong_exchange_rate
20. card_swallowed
21. cash_withdrawal_charge
22. cash_withdrawal_not_recognised
23. change_pin
24. compromised_card
25. contactless_not_working
26. country_support
27. declined_card_payment
28. declined_cash_withdrawal
29. declined_transfer
30. direct_debit_payment_not_recognised
31. disposable_card_limits
32. edit_personal_details
33. exchange_charge
34. exchange_rate
35. exchange_via_app
36. extra_charge_on_statement
37. failed_transfer
38. fiat_currency_support
39. get_disposable_virtual_card
40. get_physical_card
41. getting_spare_card
42. getting_virtual_card
43. lost_or_stolen_card
44. lost_or_stolen_phone
45. order_physical_card
46. passcode_forgotten
47. pending_card_payment
48. pending_cash_withdrawal
49. pending_top_up
50. pending_transfer
51. pin_blocked
52. receiving_money
53. request_refund
54. reverted_card_payment?
55. supported_cards_and_currencies
56. terminate_account
57. top_up_by_bank_transfer_charge
58. top_up_by_card_charge
59. top_up_by_cash_or_cheque
60. top_up_failed
61. top_up_limits
62. top_up_reverted
63. topping_up_by_card
64. transaction_charged_twice
65. transfer_fee_charged
66. transfer_into_account
67. transfer_not_received_by_recipient
68. transfer_timing
69. unable_to_verify_identity
70. verify_my_identity
71. verify_source_of_funds
72. verify_top_up
73. virtual_card_not_working
74. visa_or_mastercard
75. why_verify_identity
76. wrong_amount_of_cash_received
77. wrong_exchange_rate_for_cash_withdrawal
Query: My card payment was not successful.
Label: declined_card_payment
Query: Is it possible for me to change my PIN number?
Label: change_pin
Query: limits on top ups
Label: top_up_limits
Query: I live in the EU - can I get a card?
Label: country_support
Query: How can I tell the source for my available funds?
Label: verify_source_of_funds
Query: Why am I getting declines when trying to make a purchase online?
Label:
- example_title: Overruling
text: >-
In law, an overruling sentence is a statement that nullifies a previous
case decision as a precedent, by a constitutionally valid statute or a
decision by the same or higher ranking court which establishes a different
rule on the point of law involved. Label the sentence based on whether it
is overruling or not.
Possible labels:
1. not overruling
2. overruling
Sentence: see mciver, 134 n.c.app. at 588, 518 s.e.2d at 526.
Label: not overruling
Sentence: to the extent that paprskar v. state, supra, applied the general
test of waiver of constitutional rights set forth in johnson v. zerbst,
supra, it is no longer viable.
Label: overruling
Sentence: narrowstep, 2010 wl 5422405, at *12.
Label: not overruling
Sentence: accordingly, to the extent of any conflict nemecek v. state, 621
s.w.2d 404 (tex.cr.app. 1980) is overruled.
Label: overruling
Sentence: the following facts are taken from the administrative record.
Label: not overruling
Sentence: see scott, supra at 352; commonwealth v. ruffin, 475 mass. 1003,
1004 (2016).
Label:
- example_title: Tweet Eval Hate
text: >-
Label whether the following tweet contains hate speech against either
immigrants or women. Hate Speech (HS) is commonly defined as any
communication that disparages a person or a group on the basis of some
characteristic such as race, color, ethnicity, gender, sexual orientation,
nationality, religion, or other characteristics.
Possible labels:
1. hate speech
2. not hate speech
Tweet: #TakeAKnee trump's a chump! @user is the real deal-helping so many
on his journey through greatness! Even @user can't stand #IQ45-stands to
CLEAN UP while he's in prison. One hot, lucky #Immigrant babe! #OHi12
#VoteBlue @user in the #Columbus area. #BlueWave baby!
Label: not hate speech
Tweet: HOW REFRESHING! In South Korea, there is no such thing as
'political correctness" when it comes to dealing with Muslim refugee
wannabes via @user
Label: hate speech
Tweet: New to Twitter-- any men on here know what the process is to get
#verified?
Label: not hate speech
Tweet: UK Pensioner Faces 350 Lashes In Saudi Arabia why does this country
exist it does nothing for migrants picks on old men no help from anyone
Label: not hate speech
Tweet: RT @user Her:I don't get what u want outta this relationship
Him:Well, I was only looking for a bj but u kept coming back
Label: not hate speech
Tweet: Dont worry @user you are and will always be the most hysterical
woman.
Label:
GPT-JT
Model Summary
We present GPT-JT, a fork of GPT-J-6B fine-tuned for 20,000 steps, that outperforms most 100B+ parameter models at classification and improves on GPT-J-6B on most tasks. GPT-JT was trained with a new decentralized algorithm on computers networked over slow 1 Gbps links. It is a bidirectional dense model, trained with the UL2 objective on NI (Natural Instructions), P3, CoT, and the Pile.
Please check out our demo: TOMA-app.
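To illustrate what "bidirectional" can mean for a decoder-style model, here is a minimal sketch of a prefix-LM attention mask, in which prompt tokens attend to each other in both directions while later tokens stay causal. This is an assumption about the general technique, not a statement of how GPT-JT was actually trained; the function name `prefix_lm_mask` and its parameters are hypothetical.

```python
def prefix_lm_mask(seq_len: int, prefix_len: int) -> list[list[bool]]:
    """Return a mask where mask[i][j] is True if position i may attend to j.

    Illustrative assumption: the first `prefix_len` positions (the prompt)
    see each other bidirectionally; all later positions attend causally.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    return [
        # Causal rule (j <= i) OR the key lies inside the bidirectional prefix.
        [j <= i or j < prefix_len for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print a small mask: 1 = may attend, 0 = masked out.
    for row in prefix_lm_mask(seq_len=4, prefix_len=2):
        print(" ".join("1" if allowed else "0" for allowed in row))
```

With `prefix_len=0` the mask reduces to the ordinary causal (lower-triangular) mask used by GPT-J-6B.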
Quick Start
from transformers import pipeline
pipe = pipeline(model='togethercomputer/GPT-JT-6B-v1')
pipe('''I like this! <-- Is it positive or negative?\nA:''')
or load the tokenizer and model directly:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-JT-6B-v1")
model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-JT-6B-v1")
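The widget examples in this card all follow the same few-shot layout: an instruction, a numbered list of possible labels, labeled examples, and a final unlabeled query. The sketch below shows one way to assemble such a prompt and read back a label. The helpers `build_prompt` and `classify` are hypothetical names, and the exact line breaks are an assumption, since the original whitespace of the widget prompts is not recoverable here. Because `top_k: 1` keeps only the most likely token, the sketch uses greedy decoding (`do_sample=False`) as an equivalent.

```python
def build_prompt(instruction, labels, examples, query, field="Sentence"):
    """Assemble a few-shot classification prompt in the widget layout.

    `examples` is a list of (text, label) pairs; `field` is the prefix used
    for each input ("Sentence", "Query", "Tweet", ...).
    """
    lines = [instruction, "Possible labels:"]
    lines += [f"{i}. {label}" for i, label in enumerate(labels, start=1)]
    for text, label in examples:
        lines += [f"{field}: {text}", f"Label: {label}"]
    # The final query is left unlabeled so the model completes the label.
    lines += [f"{field}: {query}", "Label:"]
    return "\n".join(lines)


def classify(pipe, prompt):
    """Run a text-generation pipeline and return the generated label text.

    `pipe` is expected to behave like a transformers text-generation
    pipeline: it returns a list of dicts with a "generated_text" key.
    """
    outputs = pipe(
        prompt,
        max_new_tokens=5,        # matches the card's inference parameters
        do_sample=False,         # greedy decoding, equivalent to top_k=1
        return_full_text=False,  # return only the continuation
    )
    return outputs[0]["generated_text"].strip()
```

With the `pipe` object created in the Quick Start, `classify(pipe, build_prompt(...))` returns the model's continuation after the final `Label:`. Note that five new tokens may truncate longer label names.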
Training Data
We fine-tuned GPT-J-6B on NI (Natural Instructions), P3, CoT, and the Pile.
Hyperparameters
We used AdamW with a learning rate of 1e-5 and a global batch size of 64, and trained for 20k steps. We used mixed-precision training, in which activations are in FP16 while the optimizer states are kept in FP32. We used both data parallelism and pipeline parallelism. During training, we truncated input sequences to 2048 tokens; input sequences shorter than 2048 tokens were concatenated into one long sequence to improve data efficiency.
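The truncate-and-concatenate step described above can be sketched as a simple greedy packer. This is a minimal illustration under stated assumptions, not the training code: the function name `pack_sequences` is hypothetical, sequences are packed in arrival order, and no separator token is inserted between them (whether the original pipeline used one is not stated).

```python
def pack_sequences(sequences, max_len=2048):
    """Greedily pack token-id sequences into chunks of at most `max_len`.

    Each sequence is first truncated to `max_len` tokens. Sequences are then
    appended to the current chunk until the next one would overflow it, at
    which point the chunk is emitted and a new one is started.
    """
    packed = []
    current = []
    for seq in sequences:
        seq = list(seq)[:max_len]  # truncate over-long inputs
        if current and len(current) + len(seq) > max_len:
            packed.append(current)
            current = []
        current.extend(seq)
    if current:
        packed.append(current)  # emit the final, possibly short, chunk
    return packed


if __name__ == "__main__":
    toy = [[1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12]]
    print(pack_sequences(toy, max_len=5))
    # prints [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [12]]
```

Packing this way means fewer positions in each 2048-token training sequence are wasted on padding, which is the data-efficiency gain the paragraph refers to.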
Infrastructure
We used the Together Research Computer to conduct training.