---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
widget:
- text: ' '
- text: quantitative algorithmic hustle trading dot com
- text: cryptoart since early 2020 founder of ENCODE_graphics red_heart EARTH
- text: 'Chief Legal Officer krakenfx Not your lawyer Assumptions opinions prevarications
    and predictions are mine not my employers '
- text: 'Chief of Staff at Remilia Corporation remiliacorp333 Warlord Commander at
    YAYO Corporation YayoCorp THIS IS NOT A PROMISE OF EQUITY OR OWNERSHIP IN ANYTHING '
pipeline_tag: text-classification
inference: true
base_model: BAAI/bge-small-en-v1.5
model-index:
- name: SetFit with BAAI/bge-small-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.4891640866873065
      name: Accuracy
---
# SetFit with BAAI/bge-small-en-v1.5
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
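
To make step 1 more concrete, the sketch below shows one way labeled texts can be turned into contrastive pairs: two texts with the same label get a target similarity of 1.0, two texts with different labels get 0.0. The helper `build_pairs` is hypothetical and pairs every text with every other text, which is a simplification for illustration; it is not SetFit's own sampler. The example texts are taken from the label table in this card.

```python
import random
from itertools import combinations


def build_pairs(texts, labels, max_pairs=None, seed=42):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper: exhaustive pairing, not SetFit's sampling logic.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    # Shuffle deterministically so positives and negatives are interleaved.
    random.Random(seed).shuffle(pairs)
    return pairs if max_pairs is None else pairs[:max_pairs]


texts = [
    "Epic Games founder and CEO ",
    "CEO compoundfinance San Francisco USA",
    "research paradigm ",
    "Crypto Data Research 0x",
]
labels = ["EXECUTIVE", "EXECUTIVE", "RESEARCHER", "RESEARCHER"]

pairs = build_pairs(texts, labels)
num_positive = sum(1 for _, _, target in pairs if target == 1.0)
print(f"{len(pairs)} pairs, {num_positive} positive")
```

A loss such as `CosineSimilarityLoss` then pulls the embeddings of positive pairs together and pushes negative pairs apart; the classification head in step 2 is trained on the resulting embeddings.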
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 27 classes
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:------|:---------|
| NFT_ARTIST | <ul><li>'Onchain music cooprecordsxyz invinmusic cooprecsmusic Los Angeles'</li><li>'Artist etc Chicago'</li><li>'Where NFTs meet DeFi on SecretNetwork Are you Legendary nfts gamefi defi SCRT LGND Secret Network'</li></ul> |
| UNDETERMINED | <ul><li>'100 readerfunded writer All works free to republish bootleg use or cp All works coauthored by Tim Foley Patreon Australia'</li><li>'Pro skater father husband videogame character CEO philanthropist public skatepark advocate Old AF and still skating San Diego world at large'</li><li>'Sandy and blue '</li></ul> |
| DEVELOPER | <ul><li>'BUIDL HODL CELR CelerNetwork BNB bnb48club BTC ETC ATOM No gods no masters only hashes Shadowy Coder trolling by day coding by night Metalhead Cosmos'</li><li>'crosspost to Farcaster Bluesky Twitter Lens Threads decentralized social in one feed iOS '</li><li>'VP of Engineering at avara space nerd my astrobin profile Opinions strictly my own '</li></ul> |
| EXECUTIVE | <ul><li>'Epic Games founder and CEO '</li><li>'CEO compoundfinance San Francisco USA'</li><li>'DeFi FounderUnited_States bornraised Bahai Los Angeles'</li></ul> |
| INFLUENCER | <ul><li>'That rug really tied the room together weekly news channel on hiatus '</li><li>'Conecto con personas de habla hispana con perfil propio dentro de bitcoin y comparto su valor preguntasbtc para respuestas en 24h lunaticoingetalbycom 2515 4D1C 6C36 C024'</li><li>'sols 学习新事物 不带偏见的去看币圈 macau'</li></ul> |
| BUSINESS_DEVELOPER | <ul><li>'Chief of Growth fuel_network Ex0xMantle Ex0xPolygon Tea connoisseur Dog Lover Web3 Degen Views are my own Metaverse'</li><li>'Experience Bitcoin like never before '</li><li>'Bitcoin Blockchain bitcoinmining Bitpay Founded the Original WomenInBitcoin etc Los Angeles '</li></ul> |
| TRADER | <ul><li>'your greater fool goblin town'</li><li>'Commander in chiefing Periodically 1 ranked trader on ByBit Zaza City'</li><li>'Ethereum Maximalist Synthetix Spartan '</li></ul> |
| ONCHAIN_ANALYST | <ul><li>'technical and onchain analyst crypto stock real estate investor global head of news beincrypto spreading alpha United States'</li><li>'Cofounder reflexivityres acq by defitechglobal Using velodata New York NY'</li><li>'Cofounder ensuser research y2z_ventures Full time onchain moron ethereum'</li></ul> |
| RESEARCHER | <ul><li>'Lets skip witty repartee discuss fundamental questions Views are mine not GMUs or Virginias Books Fairfax VA'</li><li>'research paradigm '</li><li>'Crypto Data Research 0x'</li></ul> |
| INVESTOR | <ul><li>'peer to peer electronic cash enthusiast light__nh hethey '</li><li>'GoldLover '</li><li>'I enjoy business innovation lifelong learning to ChangeTheWorld to help others Entrepreneur interim CxO investor adventurer thinker doer NO DMs Global citizen'</li></ul> |
| SECURITY_AUDITOR | <ul><li>'security researcher nascentsecurity EVM Enthusiast Gas Optimizoor Puzzle Cracker Fan of all things Static Analysis Fuzzing Symbolic Execution '</li><li>'think bad do good cofounder openpathsec los angeles'</li><li>'Head of GTM CyfrinAudits Ex Lead Dev Rel AlchemyPlatform Created cyfrinupdraft and AlchemyLearn Making web3 mainstream Ethereum'</li></ul> |
| EDUCATOR | <ul><li>'Professor of Practice at Harvard Teaches Ec 10 some tweets might be educational Also Senior Fellow PIIE Was Chair of President Obamas CEA Cambridge MA'</li><li>'Jarrête des carrières Je vulgarise et décortique à vos côtés les nouvelles tokenomics et les influvoleurs crypto de notre époque Bitcoin'</li><li>' Bretton Woods NH'</li></ul> |
| LAWYER | <ul><li>'Author of Digital Money Demystified DickinsonLaw AdvantageEvans AtTechIntersect Crypto IP Law As seen on Coindesk TV Yahoo Finance Bloomberg CNBC Nomad Team '</li><li>'UVa Vanderbilt Law Your guide to other worlds Crypto l Metaverse Web3 Not legal or financial advice I am A lawyerjust not YOUR lawyer USA'</li><li>'The Crypto Lawyer Юрист Rechtsanwältin محامية Advising Entrepreneurs Investors and Governments on Bitcoin Crypto since 2016 Contributor Forbes UAE Switzerland '</li></ul> |
| ADVISOR | <ul><li>'Director of Government Relations at BlockchainAssn Author of the Token Taxonomy Act Former WarrenDavidson and Board of Advisors JoinSeedstarter Washington DC'</li><li>'Calculated Degen 2x cancer survivor 5x rug pull survivor Paper hands diamond wrist Building WumboLabs Advisoooor arcade_xyz '</li><li>'Doggfather Analytics Founderorange_square OrdData '</li></ul> |
| COMMUNITY_MANAGER | <ul><li>'Contributor to the Optimism Collective OP'</li><li>'ecosystem growth indexcoop music NFT enjoyer wavWRLD_ not financial advice typos are my owm '</li><li>'Founder KryptoSeoul ericaplanet Ericaverse Organizer buidl_asia eth_seoul_ Seoul Chapter Lead She__Fi Alum Stanford Ewha Where is Erica'</li></ul> |
| MARKETER | <ul><li>'Positivity Pusher CoFounder PurpleHorizons Future Tech Marketing Strategist Trend Spotter Storyteller Once a DJ Always a DJ Miami FL'</li><li>'elissa emm Head of Marketing at spruceid building decentralized identity Seattle WA'</li><li>'Marketing Superfluid_HQ Safaryclub solhotgirlclub She__Fi Cohort 9 Words in banklesshq PFP miladymaker 104 NFA Views are my own Brooklyn NY'</li></ul> |
| ANGEL_INVESTOR | <ul><li>'Developer entrepreneur angel investor crypto enthusiast '</li><li>'larp LawliettesLab angel uvocapital '</li><li>'cofounder jokerace_io ecodao_ write on web3 angel thecowfund berlin'</li></ul> |
| VENTURE_CAPITALIST | <ul><li>'visionary at core playful at the surface just launched GetCohosts WalkinEvents prev fabric_vc nothingnyc London UK New York USA'</li><li>'Crypto web3 Partner ColliderVC Standing on the shoulders of giants World State'</li><li>'startup investor and builder founder w_conviction before GP greylockVC accelerating AI adoption tech podcastchains'</li></ul> |
| NFT_COLLECTOR | <ul><li>'FINE BITCOIN GOODS Get in THE BANTER Scarce City'</li><li>'Cofounder RKOTax omega based spicymargeth '</li><li>'Like a shadow following the light Time is actually another dimension Nikennftyeth Niken32lens bcard id 275 Multiverse of Madness'</li></ul> |
| BLOGGER | <ul><li>'Reporter at Bloomberg business covering crypto blockchain companies Formerly CoinDesk DM open Opinions are my own New York USA'</li><li>'viamirror '</li><li>'senior writer NFT lead BanklessHQ '</li></ul> |
| METAVERSE_ENTHUSIAST | <ul><li>'SMOL by Treasure_DAO Smolverse'</li><li>'Epic SciFi MMO strategy game from Pixelmatic ExordiumHQ Take command of a fleet of spaceships and fight for humanity NOW Sol System'</li><li>'Time to post tweets and save lives Creative Director PlayShadowWar Where dreams come true'</li></ul> |
| FINANCIAL_ANALYST | <ul><li>'Editor of FTAlphaville Norwegian despite the Harry Potteresque name Author of TRILLIONS Views mine bla bla Oslo Norway'</li><li>'markets macro business anchor of 10am ET and ETF IQ on bloombergtv haverfordedu columbiajourn alum ktkaos on InstaThreads Opinions mine Midtown East Manhattan'</li><li>'Curious on how behavioral fallacies challenge financial markets and cryptos Always learning new things getting to know new people and having a bit of fun London England'</li></ul> |
| DATA_SCIENTIST | <ul><li>' '</li><li>'Data Wizardry variantfund Chicago IL'</li><li>'NLP ML StatArb Math Bowdoincollege Team Doobro_CN Prev first intern Bybit_Official Plucking a feather from every goose but follow no one absolute New York NY'</li></ul> |
| NODE_OPERATOR | <ul><li>'Founder ClayStack_HQ Building Liquid Staking long before they were called LSDs Running validator nodes at Vibing ClayClanDAO Metaverse'</li><li>'Restake ETH Never Worry about EigenLayer Caps EigenLayer'</li><li>'HonigdachsPod cohost Making Bitcoin green today netposmon Find me on nostr npub1cear2n95zcyze86s5hry2a0pdgs7euhnc0p7ewcq2284pp845t5szt8rhr '</li></ul> |
| SHITCOINER | <ul><li>'Eternity belongs to those who live in the present I tweet once per week when Im pooping Results in occasional shitposting '</li><li>'Lets hold hands and be enemies enemieswithbenefitseth '</li><li>'16th Chair of the Central Bank of Retards When I see chaos forming on the timeline I rush in to shitpost adding fuel to the fire Hyperbolic Time Chamber'</li></ul> |
| MINER | <ul><li>' bitcoin beyonder economic futurist metagame winner rose'</li><li>'Steady lads deploying more hashrate Hashrate merchant luxortechnology btc Miami'</li><li>'SVP foundryservices I am a miner like my father before me previously greenidge_GREE Bitcoin '</li></ul> |
| DATA_ANALYST | <ul><li>'Director of Research at proof_xyz Building charts that make NFTs a bit easier to understand '</li><li>'Shadowy mediocre Analyst tangent_xyz '</li><li>'Lead Analyst CryptoSlate Previously Saidler Bitcoin London'</li></ul> |
## Evaluation
### Metrics
| Label | Accuracy |
|:--------|:---------|
| **all** | 0.4892 |
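
Only the aggregate accuracy is reported above. With 27 classes of very different sizes, a per-label breakdown can show which classes drive that number. The sketch below is a minimal, library-free way to compute both; `accuracy_report` is a hypothetical helper, and the labels and predictions in the example are made up for illustration, not results from this model.

```python
from collections import defaultdict


def accuracy_report(y_true, y_pred):
    """Return (overall_accuracy, {label: accuracy_on_that_label})."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_label, predicted_label in zip(y_true, y_pred):
        total[true_label] += 1
        correct[true_label] += int(true_label == predicted_label)
    per_label = {label: correct[label] / total[label] for label in total}
    overall = sum(correct.values()) / len(y_true)
    return overall, per_label


# Made-up gold labels and predictions, for illustration only.
y_true = ["DEVELOPER", "DEVELOPER", "TRADER", "LAWYER"]
y_pred = ["DEVELOPER", "UNDETERMINED", "TRADER", "DEVELOPER"]

overall, per_label = accuracy_report(y_true, y_pred)
print(overall)
print(per_label)
```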
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("kasparas12/crypto_individual_infer_model_setfit")
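# Run inference on a single profile-bio string (taken from the widget examples in this card)
preds = model("quantitative algorithmic hustle trading dot com")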
```
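
Because overall accuracy is about 0.49 across 27 classes, it may be more useful to look at the few most probable labels rather than only the top prediction. The sketch below ranks labels by probability. `top_k_labels` is a hypothetical helper, and the class names and probabilities in the example are stand-in values. With the real model, the probabilities would presumably come from `model.predict_proba([text], as_numpy=True)[0]` and the class order from `model.model_head.classes_` (an assumption that holds for a scikit-learn `LogisticRegression` head).

```python
def top_k_labels(probabilities, class_names, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probabilities) != len(class_names):
        raise ValueError("need exactly one probability per class name")
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda item: item[1],
        reverse=True,
    )
    return [(name, float(probability)) for name, probability in ranked[:k]]


# Stand-in values for illustration; not output of this model.
class_names = ["DEVELOPER", "EXECUTIVE", "TRADER", "UNDETERMINED"]
probabilities = [0.15, 0.50, 0.05, 0.30]

top_two = top_k_labels(probabilities, class_names, k=2)
print(top_two)
```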
## Training Details
### Training Set Metrics
| Training set | Min | Mean | Max |
|:-------------|:----|:--------|:----|
| Word count | 2 | 13.6494 | 65 |

| Label | Training Sample Count |
|:---------------------|:----------------------|
| DEVELOPER | 702 |
| DATA_SCIENTIST | 34 |
| DATA_ANALYST | 8 |
| NODE_OPERATOR | 18 |
| MINER | 22 |
| SECURITY_AUDITOR | 129 |
| INVESTOR | 212 |
| ANGEL_INVESTOR | 84 |
| VENTURE_CAPITALIST | 467 |
| TRADER | 168 |
| SHITCOINER | 34 |
| BUSINESS_DEVELOPER | 306 |
| BUSINESS_ANALYST | 0 |
| COMMUNITY_MANAGER | 122 |
| MARKETER | 70 |
| FINANCIAL_ANALYST | 32 |
| ADVISOR | 79 |
| RESEARCHER | 227 |
| ONCHAIN_ANALYST | 29 |
| EXECUTIVE | 393 |
| INFLUENCER | 510 |
| LAWYER | 47 |
| BLOGGER | 55 |
| NFT_COLLECTOR | 174 |
| NFT_ARTIST | 312 |
| EDUCATOR | 134 |
| METAVERSE_ENTHUSIAST | 57 |
| UNDETERMINED | 740 |
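
The counts above are heavily imbalanced, which is worth keeping in mind when reading the accuracy figure. The sketch below copies the counts from the table and derives the total, the share of the largest class, and the labels with fewer than 30 training samples. The threshold of 30 is an arbitrary choice for illustration, and the class distribution of the test split is not stated in this card.

```python
# Training sample counts copied from the table above.
TRAIN_COUNTS = {
    "DEVELOPER": 702, "DATA_SCIENTIST": 34, "DATA_ANALYST": 8,
    "NODE_OPERATOR": 18, "MINER": 22, "SECURITY_AUDITOR": 129,
    "INVESTOR": 212, "ANGEL_INVESTOR": 84, "VENTURE_CAPITALIST": 467,
    "TRADER": 168, "SHITCOINER": 34, "BUSINESS_DEVELOPER": 306,
    "BUSINESS_ANALYST": 0, "COMMUNITY_MANAGER": 122, "MARKETER": 70,
    "FINANCIAL_ANALYST": 32, "ADVISOR": 79, "RESEARCHER": 227,
    "ONCHAIN_ANALYST": 29, "EXECUTIVE": 393, "INFLUENCER": 510,
    "LAWYER": 47, "BLOGGER": 55, "NFT_COLLECTOR": 174, "NFT_ARTIST": 312,
    "EDUCATOR": 134, "METAVERSE_ENTHUSIAST": 57, "UNDETERMINED": 740,
}

total = sum(TRAIN_COUNTS.values())
largest_label, largest_count = max(TRAIN_COUNTS.items(), key=lambda item: item[1])
majority_share = largest_count / total

# Labels with very little training data (threshold chosen for illustration).
scarce = sorted(label for label, count in TRAIN_COUNTS.items() if count < 30)

print(f"total training samples: {total}")
print(f"largest class: {largest_label} ({majority_share:.1%} of training data)")
print(f"labels with fewer than 30 samples: {scarce}")
```

`BUSINESS_ANALYST` has no training samples, which is consistent with the model exposing 27 classes even though 28 labels are listed.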
### Training Hyperparameters
- batch_size: (64, 64)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
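
For anyone trying to reproduce a similar run, the values above can be collected into keyword arguments for SetFit's `TrainingArguments`. This is a sketch under assumptions: the names `TRAINING_KWARGS` and `make_training_arguments` are hypothetical, and `loss`, `distance_metric` and `margin` are left at the library defaults on the assumption that those defaults match the values listed above (`CosineSimilarityLoss`, `cosine_distance`, 0.25) in the SetFit version used here.

```python
# Hyperparameters copied from the list above. Tuples hold the value for the
# embedding (contrastive) phase first and the classifier phase second.
TRAINING_KWARGS = {
    "batch_size": (64, 64),
    "num_epochs": (1, 1),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 20,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def make_training_arguments():
    """Build a setfit.TrainingArguments object (requires `pip install setfit`)."""
    # Imported lazily so the dictionary above can be inspected without setfit.
    from setfit import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)


print(sorted(TRAINING_KWARGS))
```

The resulting object would be passed as `args=` to `setfit.Trainer` together with a model and a labeled training dataset, neither of which is published with this card.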
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0011 | 1 | 0.2537 | - |
| 0.0562 | 50 | 0.2412 | - |
| 0.1124 | 100 | 0.2242 | - |
| 0.1685 | 150 | 0.2066 | - |
| 0.2247 | 200 | 0.1811 | - |
| 0.2809 | 250 | 0.205 | - |
| 0.3371 | 300 | 0.1789 | - |
| 0.3933 | 350 | 0.1831 | - |
| 0.4494 | 400 | 0.1829 | - |
| 0.5056 | 450 | 0.1506 | - |
| 0.5618 | 500 | 0.1474 | - |
| 0.6180 | 550 | 0.0989 | - |
| 0.6742 | 600 | 0.1094 | - |
| 0.7303 | 650 | 0.1316 | - |
| 0.7865 | 700 | 0.1207 | - |
| 0.8427 | 750 | 0.1262 | - |
| 0.8989 | 800 | 0.1229 | - |
| 0.9551 | 850 | 0.0989 | - |
| 0.0003 | 1 | 0.2061 | - |
| 0.0155 | 50 | 0.2073 | - |
| 0.0310 | 100 | 0.1844 | - |
| 0.0465 | 150 | 0.1891 | - |
| 0.0619 | 200 | 0.1975 | - |
| 0.0774 | 250 | 0.1772 | - |
| 0.0929 | 300 | 0.2304 | - |
| 0.1084 | 350 | 0.2085 | - |
| 0.1239 | 400 | 0.1851 | - |
| 0.1394 | 450 | 0.1463 | - |
| 0.1548 | 500 | 0.1216 | - |
| 0.1703 | 550 | 0.1648 | - |
| 0.1858 | 600 | 0.1359 | - |
| 0.2013 | 650 | 0.163 | - |
| 0.2168 | 700 | 0.1563 | - |
| 0.2323 | 750 | 0.2 | - |
| 0.2478 | 800 | 0.1425 | - |
| 0.2632 | 850 | 0.1614 | - |
| 0.2787 | 900 | 0.1881 | - |
| 0.2942 | 950 | 0.133 | - |
| 0.3097 | 1000 | 0.1348 | - |
| 0.3252 | 1050 | 0.1256 | - |
| 0.3407 | 1100 | 0.1065 | - |
| 0.3561 | 1150 | 0.0932 | - |
| 0.3716 | 1200 | 0.122 | - |
| 0.3871 | 1250 | 0.0969 | - |
| 0.4026 | 1300 | 0.1386 | - |
| 0.4181 | 1350 | 0.1116 | - |
| 0.4336 | 1400 | 0.0866 | - |
| 0.4491 | 1450 | 0.084 | - |
| 0.4645 | 1500 | 0.1073 | - |
| 0.4800 | 1550 | 0.1065 | - |
| 0.4955 | 1600 | 0.1063 | - |
| 0.5110 | 1650 | 0.1235 | - |
| 0.5265 | 1700 | 0.0918 | - |
| 0.5420 | 1750 | 0.078 | - |
| 0.5574 | 1800 | 0.1358 | - |
| 0.5729 | 1850 | 0.0664 | - |
| 0.5884 | 1900 | 0.1123 | - |
| 0.6039 | 1950 | 0.0996 | - |
| 0.6194 | 2000 | 0.0471 | - |
| 0.6349 | 2050 | 0.1068 | - |
| 0.6504 | 2100 | 0.0933 | - |
| 0.6658 | 2150 | 0.0836 | - |
| 0.6813 | 2200 | 0.0858 | - |
| 0.6968 | 2250 | 0.0421 | - |
| 0.7123 | 2300 | 0.08 | - |
| 0.7278 | 2350 | 0.0902 | - |
| 0.7433 | 2400 | 0.0949 | - |
| 0.7587 | 2450 | 0.116 | - |
| 0.7742 | 2500 | 0.0733 | - |
| 0.7897 | 2550 | 0.101 | - |
| 0.8052 | 2600 | 0.0709 | - |
| 0.8207 | 2650 | 0.079 | - |
| 0.8362 | 2700 | 0.0706 | - |
| 0.8517 | 2750 | 0.0338 | - |
| 0.8671 | 2800 | 0.0812 | - |
| 0.8826 | 2850 | 0.063 | - |
| 0.8981 | 2900 | 0.075 | - |
| 0.9136 | 2950 | 0.081 | - |
| 0.9291 | 3000 | 0.1264 | - |
| 0.9446 | 3050 | 0.0766 | - |
| 0.9600 | 3100 | 0.0873 | - |
| 0.9755 | 3150 | 0.0512 | - |
| 0.9910 | 3200 | 0.0816 | - |
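
The step counts in the second block of rows (steps 1 to 3200) can be cross-checked against the training set size and hyperparameters. The sketch below assumes that each training sample contributes `num_iterations` positive and `num_iterations` negative pairs, which is my reading of how SetFit builds pairs when `num_iterations` is set; `contrastive_steps` is a hypothetical helper. Under that assumption, 5165 samples (the sum of the per-label counts) give 3229 steps per epoch, which reproduces the logged epoch fractions. The first block of rows (steps 1 to 850) implies roughly 890 steps per epoch and so appears to come from a separate, shorter run whose training set size is not stated here.

```python
import math


def contrastive_steps(num_samples, num_iterations, batch_size):
    """Return (number_of_pairs, optimizer_steps_per_epoch).

    Assumes num_iterations positive and num_iterations negative pairs
    are generated per training sample.
    """
    num_pairs = 2 * num_iterations * num_samples
    return num_pairs, math.ceil(num_pairs / batch_size)


num_pairs, steps_per_epoch = contrastive_steps(
    num_samples=5165,   # sum of the per-label training sample counts
    num_iterations=20,  # from the hyperparameter list
    batch_size=64,      # embedding-phase batch size
)
print(num_pairs, steps_per_epoch)

# Epoch fractions implied for a few logged steps; compare with the table.
for step in (1, 50, 3200):
    print(step, round(step / steps_per_epoch, 4))
```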
### Framework Versions
- Python: 3.9.16
- SetFit: 1.0.3
- Sentence Transformers: 2.2.2
- Transformers: 4.21.3
- PyTorch: 1.12.1+cu116
- Datasets: 2.4.0
- Tokenizers: 0.12.1
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```