BwETA-IID-100M

BwETA (Boring's Experimental Transformer for Autoregression) is a small but feisty autoregressive model trained to predict the next token in a sequence. It might not be the best, but hey, it works!
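In case "autoregressive" sounds fancy: generation just means predicting the next token, appending it, and repeating. Here is a rough greedy-decoding sketch in Python, where next_token_logits() is a hypothetical stand-in for the model's forward pass (BwETA's actual API is shown under "How to Use" below):

import numpy as np

VOCAB_SIZE = 50257   # GPT-2 vocabulary size
MAX_WINDOW = 512     # BwETA's training context length

def next_token_logits(tokens):
    # Hypothetical stand-in for the model's forward pass; returns random
    # scores here just so the loop below runs end to end.
    return np.random.randn(VOCAB_SIZE)

def greedy_generate(prompt_tokens, steps=20):
    tokens = list(prompt_tokens)
    for _ in range(steps):
        logits = next_token_logits(tokens[-MAX_WINDOW:])  # clip to the window
        tokens.append(int(np.argmax(logits)))             # take the top token
    return tokens

print(greedy_generate([464, 2746], steps=5))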

Trained on determination, fueled by suffering, powered by free TPUs. 🔥

πŸ› οΈ Model Details:

  • Size: 100M parameters
  • Training Data: 8M sentences (sequence length: 512 tokens)
  • Max Window Size: 512 tokens (it can handle longer sequences, but it was trained on length 512)
  • Architecture: Transformer-based
  • Tokenizer: GPT-2 tokenizer (see the sketch right after this list)
  • Trainer: Custom-built because why not?
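Since the model uses the GPT-2 tokenizer with a 512-token window, preparing input looks roughly like this. A minimal sketch that assumes the stock "gpt2" vocabulary from Hugging Face transformers; BwETA may wrap this step for you.

from transformers import GPT2TokenizerFast

# Assumption: BwETA reuses the standard "gpt2" vocabulary.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

text = "BwETA is a small but feisty autoregressive model."
ids = tokenizer(text, truncation=True, max_length=512)["input_ids"]
print(len(ids), ids[:10])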

⚡ How to Use:

import BwETA  # use v0.12

# Load the model from Hugging Face
model = BwETA.load_hf("WICKED4950/BwETA-IID-100M")

# Load the model locally
model = BwETA.load_local(path)

# Save the model locally
model.save_pretrained(path)

# Generate text
model.custom_generate()  # (Will be changed to model.generate() in future updates)
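Putting the calls above together, a first run could look like the sketch below. The cache path is just an example, custom_generate() is called without arguments because its parameters aren't documented here, and it assumes the load functions return the model object.

import os
import BwETA  # use v0.12

LOCAL_PATH = "./bweta-iid-100m"  # example location, pick your own

# Download once from the Hub, then reuse the local copy on later runs.
if os.path.isdir(LOCAL_PATH):
    model = BwETA.load_local(LOCAL_PATH)
else:
    model = BwETA.load_hf("WICKED4950/BwETA-IID-100M")
    model.save_pretrained(LOCAL_PATH)

model.custom_generate()  # generate text (becomes model.generate() in future updates)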

📌 Notes:

  • This model is experimental and has basic functionalities.
  • If it breaks, don't cry, fix it (or let me know).
  • You can extend its functionalities in your own code.

📩 Contact Me

If something doesn't work or you just wanna chat about AI, hit me up on Instagram.

What's Next?

🚀 The future is uncertain... but it's going to be wild!

  • Possibly a 400M model: same architecture, but with more functionality.
  • Exploring new architectures & designing custom layers (because why not?).
  • Losing my sanity along the way? Most likely. But that's the fun part. 😆