VictorSanh committed on
Commit b7e1543 · verified · 1 Parent(s): 791b16d

tips of memory gpu

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -213,6 +213,12 @@ print(generated_texts)
 
 # Model optimizations
 
+**Vision encoder efficiency**
+
+Given the high resolution supported, the vision part of the model can be memory-hungry depending on your configuration. If you are GPU-memory-constrained, you can:
+- **Deactivate image splitting.** To do so, add `do_image_splitting=False` when initializing the processor (`AutoProcessor.from_pretrained`). No changes are required on the model side. Note that only the SFT model has been trained with image splitting.
+- **Decrease the maximum image resolution.** To do so, add `size={"longest_edge": 448, "shortest_edge": 378}` when initializing the processor (`AutoProcessor.from_pretrained`). In particular, the `longest_edge` value can be adapted to fit your needs. We recommend using values that are multiples of 14. No changes are required on the model side.
+
 **Using Flash-attention 2 to speed up generation**
 
 <details><summary>Click to expand.</summary>
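
The two processor options added in this commit can be combined in one call. The sketch below is illustrative only: `processor_kwargs` and `load_processor` are hypothetical helper names, the checkpoint id is whatever model repository you are loading, and rounding `longest_edge` down to a multiple of 14 is one possible way to follow the README's recommendation, not something the README prescribes.

```python
from typing import Any, Dict, Optional


def processor_kwargs(
    split_images: bool = True,
    longest_edge: Optional[int] = None,
    shortest_edge: int = 378,  # value taken from the README example
) -> Dict[str, Any]:
    """Build keyword arguments for AutoProcessor.from_pretrained (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if not split_images:
        # Option 1 from the README: turn off image splitting.
        kwargs["do_image_splitting"] = False
    if longest_edge is not None:
        # Option 2 from the README: lower the maximum resolution.
        # Assumption: snap down to a multiple of 14, as the README recommends.
        snapped = (longest_edge // 14) * 14
        kwargs["size"] = {"longest_edge": snapped, "shortest_edge": shortest_edge}
    return kwargs


def load_processor(checkpoint: str, **options: Any):
    """Load the processor with the memory-saving options applied."""
    # Imported lazily so processor_kwargs can be used without transformers installed.
    from transformers import AutoProcessor

    return AutoProcessor.from_pretrained(checkpoint, **processor_kwargs(**options))
```

For example, `load_processor(checkpoint, split_images=False, longest_edge=448)` passes `do_image_splitting=False` and `size={"longest_edge": 448, "shortest_edge": 378}` to the processor; the model itself is loaded unchanged.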