kashif (HF staff) committed · verified
Commit a213c02 · 1 Parent(s): 27e944c

Update README.md

Files changed (1)
  1. README.md +2 -4
README.md CHANGED

@@ -14,7 +14,7 @@ tags:

 <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/SmolVLM.png" width="800" height="auto" alt="Image description">

-# SmolVLM
+# SmolVLM-Instruct-DPO

 SmolVLM is a compact open multimodal model that accepts arbitrary sequences of image and text inputs to produce text outputs. Designed for efficiency, SmolVLM can answer questions about images, describe visual content, create stories grounded on multiple images, or function as a pure language model without visual inputs. Its lightweight architecture makes it suitable for on-device applications while maintaining strong performance on multimodal tasks.

@@ -65,7 +65,7 @@ SmolVLM is a compact open multimodal model that accepts arbitrary sequences of i

 <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information needed for further recommendations.

 ## How to Get Started with the Model

@@ -130,8 +130,6 @@ Use the code below to get started with the model.

 #### Summary

-
-
 ## Model Examination [optional]

 <!-- Relevant interpretability work for the model goes here -->
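The diff's context lines reference the card's "How to Get Started with the Model" section ("Use the code below to get started with the model"), but the code itself falls outside these hunks. For context, here is a minimal sketch of how a SmolVLM-style instruct checkpoint is typically loaded and run with transformers; the repo id, image path, and generation settings below are assumptions for illustration, not part of this commit:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Assumed repo id: this commit only renames the card heading to
# SmolVLM-Instruct-DPO, so the exact checkpoint name may differ.
model_id = "HuggingFaceTB/SmolVLM-Instruct"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Any local image works here; the path is a placeholder.
image = Image.open("example.jpg")

# Build a chat-style prompt pairing one image with one text question,
# then render it with the processor's chat template.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

# Generate and decode the model's text output.
generated_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```

The same pattern extends to the multi-image use cases the card describes: pass several `{"type": "image"}` entries in the message and a matching list of images to the processor.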