Amitz244 committed
Commit 2fe96d4 · verified · 1 Parent(s): 16936f2

Update README.md

Files changed (1)
  1. README.md +2 -14
README.md CHANGED
@@ -13,15 +13,7 @@ tags:
 ---
 # Don’t Judge Before You CLIP: Memorability Prediction Model
 
-This model is part of our paper:
-*"Don’t Judge Before You CLIP: A Unified Approach for Perceptual Tasks"*
-It was trained on the *LaMem dataset* to predict image memorability scores.
-
-## Model Overview
-
-Visual perceptual tasks, such as image memorability prediction, aim to estimate how humans perceive and interpret images. Unlike objective tasks (e.g., object recognition), these tasks rely on subjective human judgment, making labeled data scarce.
-
-Our approach leverages *CLIP* as a prior for perceptual tasks, inspired by cognitive research showing that CLIP correlates well with human judgment. This suggests that CLIP implicitly captures human biases, emotions, and preferences. We fine-tune CLIP minimally using LoRA and incorporate an MLP head to adapt it to each specific task.
+PerceptCLIP-Memorability is a model designed to predict image memorability (how likely an image is to be remembered). It is the official model from the paper "Don’t Judge Before You CLIP: A Unified Approach for Perceptual Tasks" (https://arxiv.org/abs/2503.13260). Our model combines LoRA adaptation of the CLIP visual encoder with an additional MLP head to achieve state-of-the-art results.
 
 ## Training Details
 
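The added overview describes LoRA adaptation of the CLIP visual encoder plus an MLP regression head. A minimal sketch of such an architecture, assuming the Hugging Face `transformers` and `peft` libraries; the model name, LoRA target modules, and head width are illustrative assumptions, while rank 16 and alpha 8 follow the values visible in the old checkpoint filename below:

```python
import torch
import torch.nn as nn
from transformers import CLIPVisionModel
from peft import LoraConfig, get_peft_model

class MemorabilityModel(nn.Module):
    """Sketch: LoRA-adapted CLIP visual encoder + MLP regression head."""

    def __init__(self, clip_name="openai/clip-vit-large-patch14"):
        super().__init__()
        backbone = CLIPVisionModel.from_pretrained(clip_name)
        # LoRA on the attention projections; r=16, alpha=8 as suggested by
        # the old checkpoint filename (target modules are an assumption).
        lora_cfg = LoraConfig(r=16, lora_alpha=8,
                              target_modules=["q_proj", "v_proj"])
        self.encoder = get_peft_model(backbone, lora_cfg)
        hidden = backbone.config.hidden_size
        # MLP head mapping the pooled image embedding to a single score.
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden // 2),
            nn.ReLU(),
            nn.Linear(hidden // 2, 1),
        )

    def forward(self, pixel_values):
        pooled = self.encoder(pixel_values=pixel_values).pooler_output
        return self.head(pooled)
```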
@@ -32,10 +24,6 @@ Our approach leverages *CLIP* as a prior for perceptual tasks, inspired by cogni
 - *Learning Rate*: 5e-05
 - *Batch Size*: 32
 
-## Performance
-
-The model was trained on the *LaMem dataset* and exhibits *state-of-the-art generalization* to the *THINGS memorability dataset*.
-For more models and results on the five common splits of LaMem, please refer to our paper. *We achieve state-of-the-art (SOTA) performance on the LaMem dataset based on both Spearman correlation and MSE metrics*.
 ## Usage
 
 To use the model for inference:
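For the Training Details above, a hedged sketch of one optimization step under the listed hyperparameters: the MSE loss is inferred from the `lossmse` tag in the old checkpoint filename, the optimizer choice and synthetic batch are assumptions, and `MemorabilityModel` refers to the sketch above, not the authors' code.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MemorabilityModel().to(device).train()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)  # LR from above
loss_fn = torch.nn.MSELoss()  # 'lossmse' per the old checkpoint filename

# Synthetic stand-in batch of 32 (real training uses LaMem images/scores).
pixel_values = torch.randn(32, 3, 224, 224, device=device)
scores = torch.rand(32, 1, device=device)

optimizer.zero_grad()
loss = loss_fn(model(pixel_values), scores)  # regress memorability scores
loss.backward()
optimizer.step()
```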
@@ -48,7 +36,7 @@ from PIL import Image
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
 # Load model
-model = torch.load("lamem_all_clip_Lora_16.0R_8.0alphaLora_32_batch_0.00005_lossmse_headmlp.pth").to(device).eval()
+model = torch.load("PercrptCLIP_Memorability.pth").to(device).eval()
 
 # Load an image
 image = Image.open("image_path.jpg").convert("RGB")
 
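The usage hunk shows only the load step. A hedged completion of the snippet, assuming standard CLIP preprocessing via `transformers`; the processor choice and output handling are assumptions, not the README's full code:

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model (a full pickled module, as in the README snippet)
model = torch.load("PercrptCLIP_Memorability.pth").to(device).eval()

# Load and preprocess an image (assumed: CLIP's standard preprocessing)
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")
image = Image.open("image_path.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

with torch.no_grad():
    score = model(pixel_values).item()  # single memorability prediction
print(f"Predicted memorability score: {score:.3f}")
```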