# Triptuner Model

This model generates itineraries for locations in Sri Lanka's Central Province. It is a custom transformer-based language model that operates on character-level sequences.

## Usage

Because it uses a custom architecture, the Triptuner model cannot be used directly with Hugging Face's built-in Inference API. The instructions below show how to load and run it manually with PyTorch.

### Load and Use the Model with PyTorch

```python
import torch
import torch.nn as nn

# Define your custom model class
class BigramLanguageModel(nn.Module):
    # Include the complete definition of your BigramLanguageModel here;
    # a sketch consistent with the published hyperparameters follows this snippet.
    
    # Example method definitions:
    def __init__(self):
        super().__init__()
        # Define your model layers here as per the training setup
        # Example:
        # self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        # self.position_embedding_table = nn.Embedding(block_size, n_embd)
        # self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        # self.ln_f = nn.LayerNorm(n_embd)
        # self.lm_head = nn.Linear(n_embd, vocab_size)
        
    def forward(self, idx, targets=None):
        # Define the forward pass as per your model
        pass
    
    def generate(self, idx, max_new_tokens):
        # Implement the generate method for text generation
        pass

# Load the pretrained weights from the Hugging Face Hub
model = BigramLanguageModel()
model_url = "https://huggingface.co/yoonusajwardapiit/triptuner/resolve/main/pytorch_model.bin"
model_weights = torch.hub.load_state_dict_from_url(model_url, map_location=torch.device('cpu'), weights_only=True)
model.load_state_dict(model_weights)  # succeeds only if the class definition matches the checkpoint
model.eval()

# Define your character mappings (these must reproduce the exact vocabulary used in training)
chars = sorted(list(set("your_training_text_here")))  # Replace with the actual character set used in training
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}
encode = lambda s: [stoi[c] for c in s]
decode = lambda l: ''.join([itos[i] for i in l])

# Test the model with a sample prompt
prompt = "Hanthana"  # Replace with any relevant location or prompt
context = torch.tensor([encode(prompt)], dtype=torch.long)

# Generate text using the model
with torch.no_grad():
    generated = model.generate(context, max_new_tokens=250)  # Adjust the number of new tokens as needed

# Decode and print the generated text
generated_text = decode(generated[0].tolist())
print(generated_text)
```
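
The class above is only a skeleton. For reference, below is a minimal sketch of a definition consistent with the hyperparameters in the Model Architecture section (4 layers, embedding size 64, 4 heads, context length 32). The attention and feed-forward internals follow the standard decoder-only pattern for small character-level models and are assumptions, not the exact training code; the checkpoint will only load if your definition produces matching parameter names and shapes. Note that `chars` (and hence `vocab_size`) must be defined before instantiating the model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hyperparameters from the Model Architecture section below.
n_layer, n_embd, n_head, block_size = 4, 64, 4, 32
vocab_size = len(chars)  # requires chars (the training character set) to be defined first

class Head(nn.Module):
    """One head of causal self-attention."""
    def __init__(self, head_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q = self.key(x), self.query(x)
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5      # scaled dot-product
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf'))  # causal mask
        wei = F.softmax(wei, dim=-1)
        return wei @ self.value(x)

class MultiHeadAttention(nn.Module):
    def __init__(self, num_heads, head_size):
        super().__init__()
        self.heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)])
        self.proj = nn.Linear(n_embd, n_embd)

    def forward(self, x):
        return self.proj(torch.cat([h(x) for h in self.heads], dim=-1))

class FeedForward(nn.Module):
    def __init__(self, n_embd):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        return self.net(x)

class Block(nn.Module):
    """Transformer block: attention followed by a feed-forward MLP, with residuals."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.sa = MultiHeadAttention(n_head, n_embd // n_head)
        self.ffwd = FeedForward(n_embd)
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.sa(self.ln1(x))
        x = x + self.ffwd(self.ln2(x))
        return x

class BigramLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.position_embedding_table = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape
        tok_emb = self.token_embedding_table(idx)                     # (B, T, n_embd)
        pos_emb = self.position_embedding_table(torch.arange(T, device=idx.device))
        x = self.blocks(tok_emb + pos_emb)
        logits = self.lm_head(self.ln_f(x))                           # (B, T, vocab_size)
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(B * T, -1), targets.view(-1))
        return logits, loss

    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            idx_cond = idx[:, -block_size:]  # crop to the 32-character context window
            logits, _ = self(idx_cond)
            probs = F.softmax(logits[:, -1, :], dim=-1)  # distribution over the next char
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx
```

Note that `generate` crops the running context to `block_size` on every step: the position embedding table only covers 32 positions, so longer inputs would fail.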


## Training Data

The model was trained on a dataset containing information about various locations in Sri Lanka's Central Province.

## Model Architecture

- Number of Layers: 4
- Embedding Size: 64
- Number of Heads: 4
- Context Length: 32 tokens
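
If you are unsure whether your class definition matches the checkpoint, you can inspect the downloaded state dict directly; its parameter names and shapes reveal the layer structure. This sketch assumes `model_weights` was loaded as shown in the usage snippet above.

```python
# Inspect the checkpoint: parameter names and shapes reveal the architecture.
# Assumes model_weights was loaded as in the usage snippet above.
for name, tensor in model_weights.items():
    print(f"{name}: {tuple(tensor.shape)}")

# Total number of parameters stored in the checkpoint
total = sum(t.numel() for t in model_weights.values())
print(f"Total parameters: {total:,}")
```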

## License

MIT License