Sentence Transformer Quantized Model for Movie Recommendation on the MovieLens Dataset

This repository hosts a quantized version of a Sentence Transformer model, fine-tuned for movie recommendation on the MovieLens dataset. The weights have been converted to FP16 for efficient deployment without significant accuracy loss.

Model Details

Model Architecture: Sentence Transformer
Task: Movie Recommendation
Dataset: MovieLens
Quantization: Float16
Fine-tuning Framework: Hugging Face Transformers

Installation

pip install pandas torch sentence-transformers scikit-learn

Loading the Model

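from difflib import get_close_matches

import torch
from sentence_transformers import SentenceTransformer, util

# Load the model on the GPU when one is available.
# Note: this ID is the base checkpoint; to use the fine-tuned FP16 weights,
# pass the path to the quantized-model/ directory instead.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2', device=device)

# The function below expects two objects to exist already:
#   movie_subset     - pandas DataFrame with a "title" column
#   movie_embeddings - tensor with one embedding per row of movie_subset, in the same order

# Recommend Movies
def recommend_by_movie_name(movie_name, top_k=5):
    titles = movie_subset["title"].tolist()
    # Fuzzy-match the query against the known titles
    matches = get_close_matches(movie_name, titles, n=1, cutoff=0.6)

    if not matches:
        print(f"❌ Movie '{movie_name}' not found in dataset.")
        return

    matched_title = matches[0]
    # Use the row position (not the index label) so it lines up with
    # movie_embeddings and with .iloc below
    movie_index = titles.index(matched_title)

    query_embedding = movie_embeddings[movie_index]
    scores = util.pytorch_cos_sim(query_embedding, movie_embeddings)[0]
    # Request one extra hit because the best match is the query movie itself
    top_results = torch.topk(scores, k=min(top_k + 1, len(titles)))

    print(f"\n🎬 Recommendations for: {matched_title}")
    for score, idx_tensor in zip(top_results[0][1:], top_results[1][1:]):  # skip itself
        idx = idx_tensor.item()  # convert tensor to int
        title = movie_subset.iloc[idx]["title"]
        print(f"  {title} (Score: {score:.4f})")

# Pass the movie name
recommend_by_movie_name("Toy Story")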
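A note on the fuzzy title lookup in recommend_by_movie_name: difflib.get_close_matches compares strings case-sensitively, so a lowercase query can fall below the 0.6 cutoff even when the title is in the dataset. The sketch below illustrates this with a few made-up titles in the "Name (Year)" style and a hypothetical helper, find_title, that lowercases both sides before matching. Neither the sample titles nor the helper come from this repository.

```python
from difflib import get_close_matches

# Illustrative titles only; not read from the actual dataset files.
titles = ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"]


def find_title(query, titles, cutoff=0.6):
    """Case-insensitive lookup: match on lowercased strings, return the original title."""
    lowered = {title.lower(): title for title in titles}
    matches = get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


exact_case = get_close_matches("Toy Story", titles, n=1, cutoff=0.6)
lower_case = get_close_matches("toy story", titles, n=1, cutoff=0.6)

print(exact_case)                        # ['Toy Story (1995)']
print(lower_case)                        # []
print(find_title("toy story", titles))   # Toy Story (1995)
```

The lowercase query shares only 7 of 25 characters' worth of matching blocks with "Toy Story (1995)", giving a similarity of 0.56, just under the cutoff. Normalizing case is one way around this; lowering the cutoff is another, at the cost of more false matches.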

Fine-Tuning Details

Dataset

The dataset is the MovieLens dataset, sourced from Hugging Face. It contains 20,000 movies and their genres.
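The recommendation code in "Loading the Model" relies on a precomputed movie_embeddings tensor whose construction is not shown. A minimal sketch of one way to build it follows, assuming each movie is embedded as its title joined with its genres. The text format, the "genres" column name, and the helper names build_movie_texts and embed_movies are assumptions for illustration, not the repository's confirmed method.

```python
def build_movie_texts(titles, genres):
    """Join each title with its genres into one string per movie."""
    return [f"{title} | {genre}" for title, genre in zip(titles, genres)]


def embed_movies(encoder, titles, genres):
    """Encode one text per movie, keeping the input order.

    `encoder` is any object with a SentenceTransformer-style `encode` method.
    """
    texts = build_movie_texts(titles, genres)
    return encoder.encode(texts, convert_to_tensor=True)


# Hypothetical usage; "genres" is an assumed column name:
# movie_embeddings = embed_movies(
#     model,
#     movie_subset["title"].tolist(),
#     movie_subset["genres"].tolist(),
# )
```

Keeping the embeddings in the same order as the DataFrame rows matters, because the recommendation function maps similarity scores back to titles by row position.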

Training

Epochs: 2
warmup_steps: 100
show_progress_bar: True
Evaluation strategy: epoch

Quantization

Post-training quantization was applied with PyTorch's half() method, which converts the model weights to FP16 to reduce model size and inference time.
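To make the size reduction concrete, the sketch below measures parameter storage before and after calling half() on a small stand-in network. The network and the parameter_bytes helper are illustrative assumptions; the actual model in this repository is larger, but a SentenceTransformer is also a torch.nn.Module, so the same call applies to it.

```python
import torch
from torch import nn


def parameter_bytes(module: nn.Module) -> int:
    """Total number of bytes occupied by a module's parameters."""
    return sum(p.numel() * p.element_size() for p in module.parameters())


# Stand-in network used only to demonstrate the effect of half().
net = nn.Sequential(nn.Linear(384, 384), nn.ReLU(), nn.Linear(384, 384))

fp32_bytes = parameter_bytes(net)
net.half()  # converts floating-point parameters to FP16 in place
fp16_bytes = parameter_bytes(net)

print(f"FP32: {fp32_bytes} bytes, FP16: {fp16_bytes} bytes")
```

Each parameter drops from 4 bytes to 2, so storage is halved. Any speed-up depends on the hardware: GPUs with native FP16 support usually benefit, while FP16 on CPU can be slower or unsupported for some operations.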

Repository Structure

.
├── quantized-model/               # Contains the quantized model files
│   ├── config.json
│   ├── model.safetensors
│   ├── tokenizer_config.json
│   ├── modules.json
│   ├── special_tokens_map.json
│   ├── sentence_bert_config.json
│   ├── tokenizer.json
│   ├── config_sentence_transformers.json
│   └── vocab.txt
└── README.md                      # Model documentation

Limitations

The model is fine-tuned specifically for movie recommendation on the MovieLens dataset.
FP16 quantization may result in slight numerical instability in edge cases.
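One concrete form of the FP16 limitation is reduced precision: half-precision floats carry about three significant decimal digits, so nearby similarity scores can collapse to the same value. The sketch below uses only the standard library's struct module to round Python floats to FP16 and back. The to_fp16 helper is a hypothetical name, and whether scores are actually computed in FP16 depends on how the model is run.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Two distinct scores close to 1.0 become identical in FP16.
score_a = to_fp16(0.9996)
score_b = to_fp16(0.9997)
print(score_a, score_b)   # 0.99951171875 0.99951171875

# Simple decimal fractions pick up a visible rounding error.
print(to_fp16(0.1))       # 0.0999755859375

# Above 2048 the spacing between FP16 values is 2, so 2049 is not representable.
print(to_fp16(2049.0) == 2049.0)   # False
```

In practice this means two recommendations whose true scores differ by less than roughly 0.0005 may be ranked in arbitrary order.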

Contributing

Feel free to open issues or submit pull requests to improve the model or documentation.