SDSC6001 Project Implementation Code (Group 31)
This project implements a hybrid recommendation system that combines traditional rating-based, item-based collaborative filtering with image and description similarity. The main steps are as follows:
Load and preprocess data
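A minimal sketch of this step, assuming rating records arrive as (user_id, asin, rating) tuples: it pivots them into per-item rating vectors and computes item-to-item cosine similarity. It is written in plain Python so that it runs without dependencies. The toy ratings and the helper names (`build_item_vectors`, `cosine`, `item_similarity`) are illustrative assumptions, not the project's actual code, and unrated entries are treated as 0, which is one common convention.

```python
from collections import defaultdict
from math import sqrt

# Toy ratings (hypothetical data): (user_id, asin, rating)
ratings = [
    ("u1", "A", 5.0), ("u1", "B", 4.0),
    ("u2", "A", 4.0), ("u2", "B", 5.0), ("u2", "C", 1.0),
    ("u3", "B", 2.0), ("u3", "C", 5.0),
]


def build_item_vectors(rows):
    """Pivot (user_id, asin, rating) rows into {asin: {user_id: rating}}."""
    items = defaultdict(dict)
    for user_id, asin, rating in rows:
        items[asin][user_id] = rating
    return dict(items)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts (missing = 0)."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def item_similarity(rows):
    """Return {asin: {other_asin: cosine similarity}} for all item pairs."""
    vectors = build_item_vectors(rows)
    return {
        a: {b: cosine(vec_a, vec_b) for b, vec_b in vectors.items() if b != a}
        for a, vec_a in vectors.items()
    }


rating_sim = item_similarity(ratings)
print(rating_sim["A"])
```

On a real dataset, a pandas pivot table with a vectorized cosine similarity would be the more usual choice; the logic is the same.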
The dataset contains users (`user_id`), items (`asin`), and ratings (`rating`). We construct a user-item rating pivot table and compute item-to-item collaborative filtering similarity (e.g., cosine similarity).
Multimodal similarity search with Milvus
Assume item metadata includes image links (`imageURLHighRes`) and textual descriptions (`description`). We use the Milvus vector database to provide two types of similarity queries for each `asin`: image similarity and description similarity. These queries typically require:
- Extracting image and description feature vectors for the target `asin`
- Querying Milvus for the most similar items to the target vector
(The code below uses pseudocode for Milvus queries; in practice, implement with the Milvus Python SDK.)
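The two-step lookup above (fetch the target item's vector, then retrieve its nearest neighbours) can be sketched as follows. This is not Milvus code: an in-memory dictionary and a brute-force cosine search stand in for the vector database so that the example runs without a server. The feature vectors, the dictionary names, and the helper `search_similar` are all hypothetical.

```python
from math import sqrt

# Hypothetical, pre-computed feature vectors keyed by asin.
# In the real pipeline these would come from image and text encoders.
image_vectors = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0],
    "C": [0.0, 1.0, 0.0],
}
desc_vectors = {
    "A": [0.2, 0.8],
    "B": [0.9, 0.1],
    "C": [0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search_similar(index, target_asin, top_k=2):
    """Stand-in for a vector-database query: top_k (asin, score) pairs."""
    target = index[target_asin]  # step 1: look up the target's vector
    scored = [
        (asin, cosine(target, vec))  # step 2: score every other item
        for asin, vec in index.items()
        if asin != target_asin
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


image_neighbors = search_similar(image_vectors, "A")
desc_neighbors = search_similar(desc_vectors, "A")
print(image_neighbors)
print(desc_neighbors)
```

With Milvus, the vectors would instead be stored in a collection and the brute-force loop replaced by a search call from the Python SDK; the exact call and its parameters depend on the SDK version in use.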
Hybrid similarity: constructing a fusion function
For any two items, we have:
- Rating-based similarity (from collaborative filtering)
- Image similarity
- Description similarity
Define a fusion function, e.g., weighted sum:
hybrid_score = w_rating * rating_sim + w_image * image_sim + w_desc * desc_sim
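As a small sketch, the weighted sum can be wrapped in a function. The function signature is an assumption; the default weights are the example values quoted in this section (0.6, 0.2, 0.2) and would normally be replaced by tuned values.

```python
def hybrid_score(rating_sim, image_sim, desc_sim,
                 w_rating=0.6, w_image=0.2, w_desc=0.2):
    """Weighted sum of the three per-pair similarity scores."""
    return w_rating * rating_sim + w_image * image_sim + w_desc * desc_sim


# Example pair: 0.6*0.9 + 0.2*0.5 + 0.2*0.3 = 0.70 (up to float rounding)
score = hybrid_score(rating_sim=0.9, image_sim=0.5, desc_sim=0.3)
print(round(score, 2))
```

Keeping the weights summing to 1 keeps the fused score on the same scale as its inputs, which makes scores comparable across weight settings.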
The weights can be tuned on a validation set, e.g., `w_rating=0.6`, `w_image=0.2`, `w_desc=0.2`.
Generate recommendations for users
For a given user, first find all items they have rated. For each candidate item not yet rated, compute an aggregate hybrid similarity score with all items the user has rated, then rank candidates by score to produce the recommendation list.
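The ranking procedure above can be sketched as follows. The text does not fix how the per-item similarities are aggregated, so this example assumes a simple mean over the user's rated items; a rating-weighted mean or a maximum would be equally plausible choices. The similarity values, the user history, and the helper `recommend` are hypothetical.

```python
# Hypothetical, already-fused hybrid similarity scores between items.
hybrid_sim = {
    "A": {"B": 0.8, "C": 0.2, "D": 0.5},
    "B": {"A": 0.8, "C": 0.3, "D": 0.6},
    "C": {"A": 0.2, "B": 0.3, "D": 0.1},
    "D": {"A": 0.5, "B": 0.6, "C": 0.1},
}

# Items each user has already rated (hypothetical).
user_rated = {"u1": {"A", "B"}}


def recommend(user_id, user_rated, hybrid_sim, top_n=5):
    """Rank unrated items by mean hybrid similarity to the user's rated items."""
    rated = user_rated.get(user_id, set())
    scored = []
    for candidate in hybrid_sim:
        if candidate in rated:
            continue  # only recommend items the user has not rated yet
        sims = [hybrid_sim[candidate].get(item, 0.0) for item in rated]
        # Assumption: aggregate with a plain mean; 0.0 for users with no history.
        score = sum(sims) / len(sims) if sims else 0.0
        scored.append((candidate, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


recs = recommend("u1", user_rated, hybrid_sim)
print(recs)
```

For user `u1`, item `D` ranks ahead of `C` because it is more similar on average to the two items the user has rated.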